Java collection (2) an ArrayList and LinkList

The introduction

ArrayList<E> and LinkList<E> inherit from the List<E> interface. The List<E> interface is ordered, repeatable, and accessible by integer indexes. ArrayList<E> and LinkList<E> are quite different in implementation. ArrayList<E> is implemented through arrays, while LinkList<E> uses bidirectional lists. Studying them together will help you understand their differences more clearly.

Frame structure

LinkList<E> inherits the Deque<E> interface as well as the List<E> interface. LinkList<E> inherits the Deque<E> interface. Deque<E> is a dual-end queue interface, LinkList<E> because of the implementation of bidirectional linked list, so it can naturally achieve the characteristics of dual-end queue head and tail in and out.

The data structure

In the previous article, we said that one of the biggest reasons why there are so many implementation classes derived from a Collection<E> interface is that each implementation has a different data structure, which in turn leads to different usage scenarios for each Collection. The fundamental difference between ArrayList<E> and LinkList<E> lies in the data structure. Only by understanding their respective data structures can we understand their respective application scenarios more deeply. In the source code for ArrayList<E> there is an elementData variable that represents the data structure used by ArrayList<E> : arrays.

//The array buffer into which the elements of the ArrayList are stored.
transient Object[] elementData;
Copy the code

The elementData variable is the basis of the ArrayList<E> operations, all of which are implemented based on an array of Object type elementData. Arrays have the following characteristics:

Once the array size is initialized, the length is fixed.
The memory addresses between the elements of an array are contiguous.
Can store elements of only one data type.

There is a transient keyword worth noting, which signals that the current object does not need to be serialized. If you know serialization, skip the following introduction: What is serialization? Serialization is simply the process of persisting an object. The process of converting an object to a byte stream is called serialization, and an object must be converted to a byte stream in order to travel across the network. Correspondingly, the process of converting an object from a byte stream to an object is called deserialization. In Java, the only way to signal that an object can be serialized is to inherit the Serializable interface, which is an empty interface. Now that serialization is clear, let’s look at the transient keyword. In Java, the key declared as transient is ignored when serializing, but why ignore this object? If it’s ignored how does the object recover when deserialized? Let’s start by thinking about what objects need to be ignored when serializing? Serialization is a time-consuming and space-consuming process. In general, there are many intermediate variables or temporary variables in an object in addition to the variables that must be persisted. The purpose of declaring these variables is to facilitate the operation of this class.

import java.io.IOException;
import java.io.ObjectInputStream;

public class SerializableDateTime implements java.io.Serializable {

	private static final long serialVersionUID = -8291235042612920489L;

	private String date = "2011-11-11";

	private String time = "The 11:11";

    // Objects that do not need serialization
	private transient String dateTime;

	public void initDateTime(a) {
		dateTime = date + time;
	}

    // When deserialized, assign a value to dateTime
	private void readObject(ObjectInputStream inputStream) throws IOException, ClassNotFoundException { inputStream.defaultReadObject(); initDateTime(); }}Copy the code

The dateTime object in SerializableDateTime is assigned a value when called from the outside, but this object is not basic data and does not need to be serialized. During deserialization, the value can be returned by calling initDateTime. So you only need to serialize the Date and time objects. Marking the dateTime object as transient enables on-demand serialization. So why ignore elementData in ArrayList<E>? This is mainly because elementData contains not only all the useful elements, but also a lot of unused space that does not need to be serialized completely. To save space, only the part of elementData that contains the object is serialized. The space of the elementData object is restored during deserialization, which saves space and time for serialization.

// when serialized
private void writeObject(java.io.ObjectOutputStream s)
    throws java.io.IOException{
    // Write out element count, and any hidden stuff
    int expectedModCount = modCount;
    s.defaultWriteObject();

    // Write out size as capacity for behavioural compatibility with clone()
    s.writeInt(size);

    // Write out all elements in the proper order.
    // serialize elements with size. Size is the size of the actual stored element, not the size of the elementData element
    for (int i=0; i<size; i++) {
        s.writeObject(elementData[i]);
    }

    if(modCount ! = expectedModCount) {throw newConcurrentModificationException(); }}// called when deserializing
private void readObject(java.io.ObjectInputStream s)
    throws java.io.IOException, ClassNotFoundException {
    elementData = EMPTY_ELEMENTDATA;

    // Read in size, and any hidden stuff
    s.defaultReadObject();

    // Read in capacity
    s.readInt(); // ignored

    if (size > 0) {
        // be like clone(), allocate array based upon size not capacity
        // Restore the space of the elementData object
        ensureCapacityInternal(size);

        Object[] a = elementData;
        // Read in all elements in the proper order.
        for (int i=0; i<size; i++) {
            // Populate the contents of the elementData elementa[i] = s.readObject(); }}}Copy the code

This method of serialization and deserialization is very clever and can be used in our programming process to save space and time for serialization and deserialization.

The underlying implementation of LinkedList<E> uses a LinkedList as a data structure, and is bidirectional, with each element containing references to its previous and next elements:

// The first element of the list
transient Node<E> first;

// The last element of the list
transient Node<E> last;

// Inner class representation of the linked list
private static class Node<E> {
    // Current element
    E item;
    // Next element
    Node<E> next;
    // The previous element
    Node<E> prev;

    Node(Node<E> prev, E element, Node<E> next) {
        this.item = element;
        this.next = next;
        this.prev = prev; }}Copy the code

Linked list features:

The length is not fixed and can be increased or decreased at any time
Elements in a linked list can be contiguous or discontiguous in memory address, and most of the time, discontiguous.

The constructor

ArrayList<E> provides three constructors. The default constructor initializes an empty array and expands the array as elements are added, which affects the array’s performance to a certain extent. If you can predict the final array size in advance, you can use ArrayList(int initialCapacity) to initialize the array size. This reduces the performance penalty caused by capacity expansion.

public ArrayList(int initialCapacity) {
    if (initialCapacity > 0) {
        // Initialize the array size
        this.elementData = new Object[initialCapacity];
    } else if (initialCapacity == 0) {
        this.elementData = EMPTY_ELEMENTDATA;
    } else {
        throw new IllegalArgumentException("Illegal Capacity: "+ initialCapacity); }}public ArrayList(a) {
    // Initialize an empty array
    this.elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA;
}

public ArrayList(Collection<? extends E> c) {
    elementData = c.toArray();
    if((size = elementData.length) ! =0) {
        // c.toArray might (incorrectly) not return Object[] (see 6260652)
        if(elementData.getClass() ! = Object[].class) elementData = Arrays.copyOf(elementData, size, Object[].class); }else {
        // replace with empty array.
        this.elementData = EMPTY_ELEMENTDATA; }}Copy the code

LinkList<E> only provides two constructors, the default constructor is an empty function, because the use of linked list data structure does not need to initialize space, and does not need to expand, each time the need to add elements can be directly added, linked list is more reasonable than array in the maximum utilization of space. This does not mean that a linked list uses less space. Rather, each node of the list uses more space for the same element than an array because it stores references to the next node (a bidirectional list stores references to the next two nodes).

public LinkedList(a) {}public LinkedList(Collection<? extends E> c) {
    this(a); addAll(c); }Copy the code

Add elements

ArrayList<E> During the process of adding elements, you need to consider whether the array space is sufficient. If not, you need to expand the array.

//ArrayList
      
        adds elements to the end
      
public boolean add(E e) {
    // Check the capacity of the array. If the capacity is insufficient, call the grow(int minCapacity) method
    ensureCapacityInternal(size + 1);  
    elementData[size++] = e;
    return true;
}

private void ensureExplicitCapacity(int minCapacity) {
    modCount++;

    // overflow-conscious code
    if (minCapacity - elementData.length > 0)
        grow(minCapacity);
}

/ / capacity
private void grow(int minCapacity) {
    // overflow-conscious code
    int oldCapacity = elementData.length;
    // The new capacity is 1.5 times as large as the original capacity.
    int newCapacity = oldCapacity + (oldCapacity >> 1);
    if (newCapacity - minCapacity < 0)
        newCapacity = minCapacity;
    if (newCapacity - MAX_ARRAY_SIZE > 0)
        newCapacity = hugeCapacity(minCapacity);
    // minCapacity is usually close to size, so this is a win:
    // Copy all the data elements into the new array, calling system. arrayCopy internally to copy all the array elements
    elementData = Arrays.copyOf(elementData, newCapacity);
}
Copy the code

Don’t increase:Capacity:

As can be seen from the above, it is very convenient to add elements to the end without expansion, and the time complexity is O(1). In the case of expansion, all elements need to be copied to the new array every time, and the time complexity is O(n), which has certain performance loss. LinkedList

does not need to be expanded when adding elements due to the nature of linked lists, but LinkedList

requires a new Node each time to store elements.

//LinkedList
      
        add element to end
      
public boolean add(E e) {
    linkLast(e);
    return true;
}

void linkLast(E e) {
    final Node<E> l = last;
    //new a new list element and link to the end
    final Node<E> newNode = new Node<>(l, e, null);
    last = newNode;
    if (l == null)
        first = newNode;
    else
        l.next = newNode;
    size++;
    modCount++;
}
Copy the code

ensureCapacityInternal(size + 1);  // Increments modCount!!
System.arraycopy(elementData, index, elementData, index + 1,size - index);
elementData[index] = element;
size++;
Copy the code

}

<img src="http://images2017.cnblogs.com/blog/368583/201711/368583-20171130181801667-175597278.png" style="max-width: 770px"> LinkedList&lt; E> Add to the specified location first need to find the location of the element, and then add. Public void add(int index, E element) {checkPositionIndex(index);if(index == size) // Add the element directly to the end of linkLast(element);elseLinkBefore (Element, node(index)); } assert isElementIndex(index) {assert isElementIndex(index); // Specify that the index is less than half the number of elements and traversed from first and from last when greater than half the number of elementsif (index < (size >> 1)) {
        Node<E> x = first;
        for (int i = 0; i < index; i++)
            x = x.next;
        return x;
    } else {
        Node<E> x = last;
        for (int i = size - 1; i > index; i--)
            x = x.prev;
        returnx; }}Copy the code

Does this lookup of LinkedList<E> affect performance? How does ArrayList<E> compare to all elements after expansion and displacement insertion bits? Let’s do a simple test for three special cases of insertion in the head, tail, and middle position. Insert to the tail:

private static void addTailElementArrayList(int count) {

    long startTime = System.currentTimeMillis();
    List<Integer> list = new ArrayList<Integer>();
    for (int i = 0; i < count; i++) {
        list.add(i);
    }
    long endTime = System.currentTimeMillis();
    System.out.println("addTailElementArrayList time: " + (endTime - startTime));
}

private static void addTailElementLinkedList(int count) {

    long startTime = System.currentTimeMillis();
    List<Integer> list = new LinkedList<Integer>();
    for (int i = 0; i < count; i++) {
        list.add(i);
    }
    long endTime = System.currentTimeMillis();
    System.out.println("addTailElementLinkedList time: " + (endTime - startTime));
}
Copy the code

	100	1000	10000	100000
ArrayList	0	0	1	160
LinkList	0	0	1	110

Insert to head:

private static void addHeadElementArrayList(int count) {

    long startTime = System.currentTimeMillis();
    List<Integer> list = new ArrayList<Integer>();
    for (int i = 0; i < count; i++) {
        list.add(0, i);
    }
    long endTime = System.currentTimeMillis();
    System.out.println("addHeadElementArrayList time: " + (endTime - startTime));
}

private static void addHeadElementLinkedList(int count) {

    long startTime = System.currentTimeMillis();
    List<Integer> list = new LinkedList<Integer>();
    for (int i = 0; i < count; i++) {
        list.add(0, i);
    }
    long endTime = System.currentTimeMillis();
    System.out.println("addHeadElementLinkedList time: " + (endTime - startTime));
}
Copy the code

	100	1000	10000	100000
ArrayList	0	1	10	900
LinkList	0	1	1	6

To insert into the middle:

private static void addCenterIndexElementArrayList(int count) {

    long startTime = System.currentTimeMillis();
    List<Integer> list = new ArrayList<Integer>();
    for (int i = 0; i < count; i++) {
        list.add(list.size()>>1, i);
    }
    long endTime = System.currentTimeMillis();
    System.out.println("addCenterIndexElementArrayList time: " + (endTime - startTime));
}

private static void addCenterIndexElementLinkedList(int count) {

    long startTime = System.currentTimeMillis();
    List<Integer> list = new LinkedList<Integer>();
    for (int i = 0; i < count; i++) {
        list.add(list.size()>>1, i);
    }
    long endTime = System.currentTimeMillis();
    System.out.println("addCenterIndexElementLinkedList time: " + (endTime - startTime));
}
Copy the code

	100	1000	10000	100000
ArrayList	0	1	6	400
LinkList	0	3	80	10000

A few simple conclusions can be drawn from this:

When added to the end, there is no significant performance difference between ArrayList

and LinkedList

. Although ArrayList

needs to be expanded, LinkedList

also needs to new a Node object.
When inserted into the header, LinkedList

performs significantly better than ArrayList

because ArrayList

needs to move all elements back one place at a time, whereas LinkedList

only needs to change the first element at a time because it is a two-way list.
ArrayList

performs significantly better than LinkedList

when inserted into the middle position, because ArrayList

only needs to move half of the elements, while LinkedList

can only traverse from the beginning or the end due to the special nature of its bidirectional list lookup elements. Half of the elements are traversed each time, which takes a lot of time, and ArrayList

has a smaller performance cost in scaling up and moving elements than might be expected.

When choosing between ArrayList<E> and LinkedList<E>, we need to take full account of the usage scenario. LinkedList<E> is not necessarily better than ArrayList<E> in data insertion. On the contrary, ArrayList<E> is much better in many cases. It is not necessary to choose LinkedList<E> because there are many insert operations. Other factors such as the position of the insert elements should also be considered in the final decision.

Remove elements

ArrayList<E> Delete an element By iterating through an element to find an equal element and then using the index to delete it. After deletion, the element after the deleted element is moved forward.

public boolean remove(Object o) {
    if (o == null) {
        for (int index = 0; index < size; index++)
            if (elementData[index] == null) {
                fastRemove(index);
                return true; }}else {
        for (int index = 0; index < size; index++)
            // Find the index of equals and delete it
            if (o.equals(elementData[index])) {
                fastRemove(index);
                return true; }}return false;
}

private void fastRemove(int index) {
    modCount++;
    int numMoved = size - index - 1;
    if (numMoved > 0)
        // All deleted elements move forward
        System.arraycopy(elementData, index+1, elementData, index,
                            numMoved);
    elementData[--size] = null; // clear to let GC do its work
}
Copy the code

LinkedList<E> delete the equals element by going back through the list.

public boolean remove(Object o) {
    if (o == null) {
        for(Node<E> x = first; x ! =null; x = x.next) {
            if (x.item == null) {
                unlink(x);
                return true; }}}else {
        for(Node<E> x = first; x ! =null; x = x.next) {
            if (o.equals(x.item)) {
                unlink(x);
                return true; }}}return false;
}
Copy the code

Traverse elements

There is a more efficient way to iterate over elements with ArrayList<E>, which implements the RandomAccess interface to support fast access on behalf of ArrayList<E>. RandomAccess itself is an empty interface that is typically used to represent a class of characteristics, and RandomAccess represents the characteristics of an implementation class that is fast to access. ArrayList<E> implements fast access through indexes. This means that ArrayList<E> is faster to traverse through a for loop than through an Iterator or ListIterator. LinkedList<E> does not implement this excuse, so it is generally accessed through the Iterator.

Java collection (2) an ArrayList and LinkList

The introduction

Frame structure

The data structure

The constructor

Add elements

Remove elements

Traverse elements

Related Posts

How do I copy objects (deep and shallow copies) in Java? Recommended | Java Debug Notes

Docker basic use

2019 Java Interview Questions basics series 228 (4)