Read the source code and review the basics via LinkedList

preface

This article is based on JDK1.8

ArrayList implements RandomAccess. ArrayList implements RandomAccess. ArrayList implements RandomAccess. Fulfill your vows today and start reading the LinkedList.

LinkedList is one of the collections that we use a lot, and for the average programmer,

List list1 = new ArrayList()
List list2 = newLinkedList (),Copy the code

How to choose?

ArrayList is based on an array, while LinkedList is based on a LinkedList, so the key is the difference between an array and a LinkedList.

Speaking of which, does it illustrate a very important truth, the foundation, the foundation, the foundation? If you want to become a real programmer, whether you are a professional programmer or a beginner, you need to work hard to build a solid foundation.

Anyway, arrayLists are based on arrays and are fast to find (by index), but slow to add and delete. LinkedList is based on linked lists and is fast to add and delete, but slow to find. But this is relative, and knowing these two things is not enough, so keep reading.

Class signature

public class LinkedList<E>
    extends AbstractSequentialList<E>
    implements List<E>, Deque<E>, Cloneable.java.io.Serializable
Copy the code

In view of many omissions in the previous article, here are some more explanations:

The generic

The collection class has been generic since version 1.5, that is, its primary purpose is compile-time checking to avoid adding improper types. Simple said.

List list1 = new LinkedList(); In this case, the default type is Object, which means that you can add elements of any type. When fetching elements, you need to cast them, which increases the probability of errors.

List

list2 = new LinkedList

(); Where the String on the right side of the equals sign can be omitted. At this point, checks are performed at compile time, adding non-string elements does not compile directly, and fetching does not require casting. Of course, type erasure of different periods is involved here, which is not the focus of this article. I will write about it later if necessary.

Because most of the time we use collections we want to store the same type, it’s important to use generics (declaring types up front). Here, too, is the idea that the sooner mistakes are exposed, the better.

The Serializable and Cloneable

Cloneable, Serializable interface, with clone and serialization capabilities.

Deque

The implementation of the Deque interface, which in turn inherits the Queue interface, also means that the LinkedList can be used as a Queue, implementing “first in, first out”.

The List and AbstractList

Abstract class AbstractList implements the List interface. ArrayList extends AbstractList from AbstractList and implements the List interface again. AbstractSequentialList is AbstractList. AbstractSequentialList is AbstractList. AbstractSequentialList is AbstractList.

See two answers on Stack Overflow:

A netizen said that he had asked the author who designed this class, and the author himself said that this was a flaw in the design at that time, which had been left over. (Of course, I personally think this is questionable).

The second user gave an example of the unintended consequences of using proxies without directly reimplementing the List interface. (this makes sense from a practical point of view, but considering that the collection class is already in jdk1.2 and the proxy class is in 1.3, the logic is somewhat questionable.)

My personal understanding:

When designing the collection class, Daishen takes into account the situation of future optimization.

Specifically, it’s about understanding the difference between an interface and an abstract class, especially prior to java8. The interface is a specification that makes it easy to plan the architecture. The abstract class is already partially implemented, but more to help us reduce code duplication. In other words, the abstract class here is a utility class that just happens to implement the List interface, and it is possible to replace the abstract class because of Java single inheritance.

In interface oriented programming, List List = new LinkedList(); If there is a better implementation of LinkedList in the future that no longer inherits the AbstractSequentialList abstract class, since the List interface itself is already implemented directly, the above code will be fine as long as the internal implementation is logical. Conversely, if you don’t implement List and remove the AbstractSequentialList abstract class inheritance, the above old code will not compile and will not be “backward compatible.”

RandomAccess interface (not implemented)

LinkedList does not implement the RandomAccess interface; it is implemented by ArrayList, which is put there for comparison purposes.

Note that the random access capability here refers to access by index, the E get(int Index) method defined by the List interface, and means that both ArrayList and LinkedList must implement this method.

Back to the essence of the question, why can array-based arrayLists be randomly accessed, but linkedLists are not?

Again, the basics: An array is a contiguous chunk of memory, with each element allocated a fixed size and easy to locate to a specified index. In addition to data, each node of the linked list also has a pointer to the next node. Memory allocation is not necessarily continuous. In order to know the value of a certain index, we can only start from the beginning (or from the end) one by one.

The RandomAccess interface is a markup interface without any methods. The only thing it does is use Instanceof to determine if an implementation set has random access capability.

List list1 = newLinkedList ();if (list1 instanceof RandomAccess) {
    / /...
}
Copy the code

The key to this problem is the difference between ArrayList and LinkedList in implementing the GET method on the List interface, which will be covered later.

variable

// The actual number of stored elements
transient int size = 0;

/ * * * points to head node, there is no point to head nodes in a node reference * * Invariant: (first = = null && last = = null) | | * (first. Prev = = null && first. The item! = null) */
transient Node<E> first;

/ * * * points to the end node and end node there is no point to the next node reference * Invariant: (first = = null && last = = null) | | * (last. Next = = null &&. Last item! = null) */
transient Node<E> last;

// Node type, containing stored elements and Pointers to the next and previous nodes
private static class Node<E> {
    E item;
    Node<E> next;
    Node<E> prev;

    Node(Node<E> prev, E element, Node<E> next) {
        this.item = element;
        this.next = next;
        this.prev = prev; }}Copy the code

Notice the node types here, and you can see that the LinkedList implementation is based on a two-way LinkedList. Why not just use a one-way list to link to the bottom? The most important reason is to search efficiency. As mentioned above, the search efficiency of linked list is relatively low. If it is a one-way linked list, when searching by index, no matter which position the index is in, it can only start from the beginning, which needs N times on average. For bi-linked lists, it takes N/2 times on average to determine whether the index is in the first half or the second half, and then to decide whether to start at the beginning or the last. Of course, the disadvantage of bidirectional linked list is that it needs more storage space, which reflects the idea of space for time from another aspect.

The above two variables, first and last, are essentially references to objects, the same as s in Student s=new Student(), except that first must point to the head of the list, and last must point to the end of the list, so that we can always iterate from the beginning or from the end.

The constructor

// Empty parameter construct
public LinkedList(a) {}// Construct the LinkedList by specifying the collection and call the addAll method
public LinkedList(Collection<? extends E> c) {
    this(a); addAll(c); }Copy the code

Commonly used method

There are many commonly used methods (so many that one graph cannot be cut off), which are mainly divided into two categories: List system and Deque system.

Methods under the List system:

I’m going to focus on two things here, add and get

add(E e)

Add an element to the end of the list, return true on success

// Add an element to the end of the list, return true on success
public boolean add(E e) {
    linkLast(e);
    return true;
}


void linkLast(E e) {
    //1. Copy a reference l to the tail node
    final Node<E> l = last;
    //2. Construct the element to be added as a node with prev pointing to the last node
    final Node<E> newNode = new Node<>(l, e, null);
    //3. Last points to the newly constructed node
    last = newNode
    //4. If the initial list is empty, point first to the new node
    if (l == null)
        first = newNode;
    //5. If the initial list is not empty, add next to the last element to point to the new node
    else
        l.next = newNode;
    
    // Number of stored elements +1
    size++;
    // Number of changes +1
    modCount++;
}
Copy the code

The key is the linkLast(E E) method, which initially adds elements to an empty list and initially adds elements to a non-empty list.

The knowledge involved here is very basic, or the basic operation of the linked list, but it is difficult to describe clearly in simple language, so draw a schematic diagram to show it.

LinkLast (E, E) method

The basic form of a bidirectional list

For the addition of an empty list

Corresponding linkLast(E E) method annotation 1, 2, 3, 4

An empty list with no nodes means that both first and last point to NULL

1. Copy a reference l to the tail node (blue)

The copied reference L essence also points to NULL

2. Construct the element to be added as a node newNode. Prev points to L, which is null

3. Last points to the newly constructed node (red)

4. Initially the list is empty, and first points to the new node

At this point, both first and last point to the only non-empty node, and of course the reference to newNode still exists, but it is meaningless.

For the addition of non-empty linked lists

Corresponding linkLast(E E) method notes 1, 2, 3, 5

1. Copy a reference l to the tail node (blue)

2. Construct the element to be added as a node newNode with prev pointing to the last node (blue)

3. Last points to the newly constructed node (red)

5. Add the last element to next to point to the new node (green)

At this point, the newNode and L references still exist, but are meaningless.

add(int index, E element)

Adds the element to the specified location

public void add(int index, E element) {
    checkPositionIndex(index);

    if (index == size)
        linkLast(element);
    else
        linkBefore(element, node(index));
}
Copy the code

Index >= 0 and index <= size;

If index==size, insert the linkLast method directly into the end of the list.

LinkBefore calls node(index).

node(index)

The function of node(index) is to return the node with the specified index. Here we use the knowledge mentioned above, first determine whether the index is in the first half or the second half, and then decide whether to start from the beginning or from the end.

Node<E> node(int index) {
    // assert isElementIndex(index);

    // If the index is in the first half, the index is traversed backwards from the beginning node
    if (index < (size >> 1)) {
        Node<E> x = first;
        for (int i = 0; i < index; i++)
            x = x.next;
        return x;
    } else {
        // If the index is in the back half, it is traversed from the end forward
        Node<E> x = last;
        for (int i = size - 1; i > index; i--)
            x = x.prev;
        returnx; }}Copy the code

linkBefore

Looking back at linkBefore, the parameters are the element to be inserted and the node at the specified position

void linkBefore(E e, Node<E> succ) {
    // assert succ ! = null;
    //1. Copy the previous node reference to the target location
    final Node<E> pred = succ.prev;
    //2. Construct a new node. Prev points to the previous node and next points to the original node
    final Node<E> newNode = new Node<>(pred, e, succ);
    //3. The prev of the original node points to the new node
    succ.prev = newNode;
    //4. If it is inserted into the header, first points to the new node
    if (pred == null)
        first = newNode;
    //5. Non-head node, next of the target node points to the new node
    else
        pred.next = newNode;
    
    
    size++;
    modCount++;
}
Copy the code

As can be seen from the above process, the key process lies in the linkBefore method, and we also draw a graph to show it:

Add to the header:

1. Copy the reference to the previous node pointing to the target location

Node<E> pred = succ.prev;
Copy the code

Essentially pointing to null

2. Construct a new node with prev pointing to the previous node and next pointing to the original node

Node<E> newNode = new Node<>(pred, e, succ);
Copy the code

3. The prev of the target node points to the node to be added

succ.prev = newNode;
Copy the code

4. First points to the new node

first = newNode;
Copy the code

Middle position add

As shown in the figure, assuming the specified add to the third node, i.e. index=2, there must be a disconnect between the second and third nodes.

1. Copy the previous node reference to the target location, that is, the second node

Node<E> pred = succ.prev;
Copy the code

2. Construct a new node. Prev points to the last node copied and next points to the node at the original target location

Node<E> newNode = new Node<>(pred, e, succ);
Copy the code

3. The prev of the target node points to the node to be added

succ.prev = newNode;
Copy the code

5. Next of the target node points to the new node

pred.next = newNode;
Copy the code

get(int index)

public E get(int index) {
    checkElementIndex(index);
    return node(index).item;
}
Copy the code

The get method is to obtain elements by index, and essentially call node(index). This method has been mentioned in the previous section. Although bidirectional linked lists improve efficiency to some extent, reducing N to N/2, the time complexity is still a constant times of N, so don’t use this method easily. Arraylists should be used when random access is required, and LinkedLists should be considered when traversal is required and when additions and deletions are required. This method is specified by the List interface, and is one of the reasons LinkedList doesn’t implement the RandomAccess interface.

Methods under the Deque system:

When we use LinkedList as a queue and stack, we mainly use the Deque approach.

If you look at it a little bit, you’ll see that many of these methods are essentially repetitive, like push(E, E) is actually calling addFirst(E),

AddFirst (e) also calls linkFirst(e) directly; Pop () calls removeFirst() directly;

Why bother with so many names for one method? This is because the LinkedList has different roles when viewed from different perspectives. It can be added anywhere, it can be removed anywhere.

You are advised to read the corresponding comments carefully.

As a queue

The basic feature of queues is “first in, first out”, which is equivalent to adding elements at the end of the list and removing elements at the head of the list.

The corresponding methods are offer(E E), peek(), poll()

public boolean offer(E e) {
    return add(e);
}


public boolean add(E e) {
    linkLast(e);
    return true;
}
Copy the code

You can see that the essence of the Offer method is to add an element to the end of the list, as described earlier in the linkLast(e) method.

/**
* Retrieves, but does not remove, the head (first element) of this list.
*
* @return the head of this list, or {@code null} if this list is empty
* @since1.5 * /
public E peek(a) {
    final Node<E> f = first;
    return (f == null)?null : f.item;
 }
Copy the code

The peek() method returns the first element in the queue, but does not remove the element. This means that peek gets the same element multiple times.

 /**
 * Retrieves and removes the head (first element) of this list.
 *
 * @return the head of this list, or {@code null} if this list is empty
 * @since1.5 * /
public E poll(a) {
    final Node<E> f = first;
    return (f == null)?null : unlinkFirst(f);
}
Copy the code

The poll() method returns the first element in the queue and removes it from the queue. So multiple polls get different elements.

Obviously the poll method is more consistent with the concept of a queue.

There is no detailed explanation of the deletion method here, because if you look closely at the previous add method, the deletion method is also very simple, just over the deleted element join pointer, so there is no need to waste space here. Might as well draw yourself, helpful to understand.

As a stack,

The basic feature of the stack is “first in, last out”, which is equivalent to adding elements to the head of the list and removing elements to the head of the list.

The corresponding methods are push(E E) and pop().

public void push(E e) {
    addFirst(e);
}

public void addFirst(E e) {
    linkFirst(e);
 }
Copy the code

Add (int index, e Element); push (int index, e element);

public E pop(a) {
    return removeFirst();
}
Copy the code

The pop() method returns and deletes the first element.

conclusion

This article focuses on the basics of LinkedList and reviews some of the basics, both Java-related and basic data structure knowledge, such as linked list-related operations. When you draw a picture for the first time, sometimes a picture is worth a thousand words. The biggest feeling I get when I write here is that the basics matter, it determines how far you can go.

I hope my article can bring you a little help!