List of the container

The List collection system is probably the most commonly used API in daily development, and is often used as a final interview question (JVM, collection, concurrency). The overall design of the collection code is also a combination of many programming ideas, which is a very good reference for programmers.

The bottom line

  • Basis: element add delete, container information;
  • Advanced: storage structure, capacity management;

The API system

  • ArrayList: maintain the array implementation, query fast;
  • Vector: Maintains an array implementation, thread-safe;
  • LinkedList: Maintain LinkedList implementation, add and delete fast;

Core features include initialization and loading, element management, auto scaling, array and linked list data structures. The underlying thread safe operation of Vector is based on ArrayList, while ArrayList and LinkedList are non-thread safe operations, so the natural efficiency is higher than that of Vector. This feature can be found by reading the source code.

A detailed ArrayList

1, array features

ArrayList is the concrete implementation class of the List interface in the collection system. The underlying Object array is maintained to load and manage the data:

private static final Object[] EMPTY_ELEMENTDATA = {};

When it comes to array structure, subconsciously, the index query is based on the corresponding element, so it is fast. If the element is deleted, it may cause a large number of elements to move, so it is less efficient than LinkedList.

Array storage mechanism:

Belongs to an array is compact continuous storage through the subscript index can random access and quickly find the corresponding element, because there are open up the mechanism of memory space in advance, so relative to save storage space, if the array once need expansion, is to open up a bigger chunk of memory space, and copy all the data in the past, the efficiency is very low.

2. Construction method

There are two main construction methods:

Arrayless constructor: Initializes an ArrayList, declaring an empty array.

public ArrayList() {
    this.elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA;
}

Argumentative constructor: Sets the length of the array if the capacity parameter is greater than 0.

public ArrayList(int initialCapacity) { if (initialCapacity > 0) { this.elementData = new Object[initialCapacity]; } else if (initialCapacity == 0) { this.elementData = EMPTY_ELEMENTDATA; } else { throw new IllegalArgumentException("Illegal Capacity: "+initialCapacity); }}

If the array length is not specified by a constructor, the default array length is used, and the array size is set during element addition.

private static final int DEFAULT_CAPACITY = 10;

3. Loading data

As we can see from the above analysis, arrays are limited-size, but an ArrayList can always load elements, with bound values, but usually not that many: ArrayList (array) {ArrayList (array);

private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

Exceeding this limit will throw an out-of-memory error.

Load element: determines whether the capacity is sufficient;

public boolean add(E e) {
    ensureCapacityInternal(size + 1);
    elementData[size++] = e;
    return true;
}

When the capacity is not enough, the expansion operation will be carried out. Here the key source code of quantification is posted:

private void grow(int minCapacity) {
    int oldCapacity = elementData.length;
    int newCapacity = oldCapacity + (oldCapacity >> 1);
    elementData = Arrays.copyOf(elementData, newCapacity);
}

Mechanism: Calculate the newCapacity (newCapacity=15), copy a new array, and set the capacity to newCapacity.

Specify location add: This method is rarely used. Again, look at two key lines of code.

public void add(int index, E element) {
    ensureCapacityInternal(size + 1);
    System.arraycopy(elementData, index,elementData,index+1,size-index);
    elementData[index] = element;
    size++;
}

Mechanism: Determine the size of an array, and then it is a very straightforward array copy operation. Here is a simple diagram:

As shown in the figure above, suppose that the element E is placed at index=1 and run as follows:

  • Gets the length of the array index to the end position;
  • Copy it to index+1;
  • Place the element in the original index position.

This process is just like queuing, if you insert one bit in the first place, that is, all the following back one bit, naturally inefficient, of course, this is not absolute, if you move the length of the array is small enough, or keep adding at the end of the array, the effect of efficiency will be reduced.

4. Remove data

See above data loading, the logic of the opposite again look, still look at a few key lines of source:

public E remove(int index) {
    E oldValue = elementData(index);
    int numMoved = size - index - 1;
    if (numMoved > 0) {
        System.arraycopy(elementData, index+1, elementData, index, numMoved);
    }
    elementData[--size] = null;
    return oldValue;
}

Mechanism: Logically, it is similar to the mechanism of adding elements, that is, copying the element after the added position to the position at which the index begins. This logic in the queue is like leaving the front one bit, and all the following queues move forward one bit.

The same is true for efficiency. If the first element in the set is removed, all subsequent elements have to be moved. The further back the removed element is, the effect of efficiency will be relatively reduced.

5. Capacity and quantity

In the source code of the collection, there are two key fields that need to be clarified:

  • Capacity: the capacity of a collection;
  • Size: number of elements in the container;

In general, the size of the container is based on the number of elements that can be loaded. This means that additional elements trigger the capacity expansion mechanism.

3. LinkedList details

1, linked list structure characteristics

The linked list structure is stored in a non-continuous and non-sequential physical unit, and the logical order between node elements is realized through the link order of Pointers in the linked list. A linked list consists of a series of nodes that can be dynamically generated at run time. The node consists of two parts: a data field that stores the data elements and a pointer field that stores the address of the next node.

The characteristics of description

  • Physical storage is disordered and discontinuous;
  • The linked list is composed of a number of nodes in a chain structure;
  • To form an orderly link structure at the logical level;
  • The first node does not refer to the address of the previous node;
  • The tail node has no address to point to the next node.

Linked list structure solves the defect that the number of elements in array storage needs to be known in advance. It can make full use of memory space and realize flexible dynamic memory management.

2. LinkedList structure

The underlying data storage structure of LinkedList is realized based on linked lists. First, let’s look at the description of nodes:

private static class Node<E> { E item; Node<E> next; Node<E> prev; Node(Node<E> prev, E element, Node<E> next) { this.item = element; this.next = next; this.prev = prev; }}

Define the static class Node in LinkedList to describe the structure of the Node: elements, front and back Pointers. Define three core variables in the LinkedList class:

transient int size = 0;
transient Node<E> first;
transient Node<E> last;

The size, the first node, and the description of these three variables are very clear in the source code comments:

The last node on the first node is null, the next node on the last node is null and the item is not null.

3. Element management

A major feature of LinkedList is the efficiency of adding and removing elements, according to the structure of the list to see the source code.

Add elements

As you can see from the source code, we actually call this method when we add an element, placing the newly added element last in the list:

void linkLast(E e) {
    final Node<E> l = last;
    final Node<E> newNode = new Node<>(l, e, null);
    last = newNode;
    if (l == null)
        first = newNode;
    else
        l.next = newNode;
    size++;
    modCount++;
}

Combined with the constructor of the Node class, the actual operation is shown below:

The core logic is that the new tail establishes a pointer relationship with the old tail and handles the first node variable.

Remove elements

To delete an element, we can either delete it by the element object or by the element index.

E unlink(Node<E> x) {
    final E element = x.item;
    final Node<E> next = x.next;
    final Node<E> prev = x.prev;
    if (prev == null) {
        first = next;
    } else {
        prev.next = next;
        x.prev = null;
    }
    if (next == null) {
        last = prev;
    } else {
        next.prev = prev;
        x.next = null;
    }
    x.item = null;
    size--;
    modCount++;
    return element;
}

Similar to the core logic of adding elements, it is a process of rebuilding the node pointer:

  • The two ifs determine whether the first node is deleted;
  • The next of the last node of the deleted node refers to the next node of the deleted node;
  • The prev of the next node of the deleted node points to the prev node of the deleted node;

Through the source analysis of add and delete method, it can be seen that adding and deleting elements in LinkedList does not involve large-scale element movement and affects very few nodes, so the efficiency is naturally much higher than that of ArrayList.

4. Query method

Based on a linked list structure instead of an array, it has a significant impact on the efficiency of the element query.

public E get(int index) { checkElementIndex(index); return node(index).item; } Node<E> node(int index) { if (index < (size >> 1)) { Node<E> x = first; for (int i = 0; i < index; i++) x = x.next; return x; } else { Node<E> x = last; for (int i = size - 1; i > index; i--) x = x.prev; return x; }}

This source code combined with the LinkedList structure, is really very strategic:

  • The first is the validity check of index;
  • And then determine whether the index is in the upper or lower half of the list;
  • If the upper half of the list: traverses sequentially from the first node;
  • If the last node is traversed in reverse order, the last node is traversed in reverse order.

As you can see from the source code above, searching for the middle elements in a LinkedList is less efficient as more traversal is required, so LinkedList is more suitable for searching for the first element.

Four, the source code address

Making address GitEE, https://github.com/cicadasmile/java-base-parent, https://gitee.com/cicadasmile/java-base-parent

Read labels

【 JAVA Foundation 】【 Design Patterns 】【 Structure and Algorithms 】【Linux System 】【 Database 】

[Distributed Architecture] [Micro Services] [Big Data Components] [SpringBoot Advanced] [Spring&Boot Foundation]

【 Data Analysis 】【 Technical Map 】【 Workplace 】