Java collection series source analysis article:

  • Deep TreeMap source code parsing (JDK1.8)
  • Deep LinkedHashMap source code parsing (JDK1.8)

ArrayList source analysis (JDK1.8)

There are two kinds of storage structure in data structure, namely: sequential storage structure, chain storage structure. In Java, classes that implement each of these four structures are:

  • Sequential storage structures: ArrayList, Stack
  • Linked storage structures: LinkedList, Queue

Only the source code for ArrayList, which is an array queue, is analyzed here. Compared to arrays in Java, its size can grow dynamically.

The characteristics of

  • The basicArrayListOften used to randomly access elements, but is slow to insert and remove elements in the middle of a List;
  • ArrayListOperations in are not thread-safe. Therefore, it is recommended to use it in a single threadArrayListIn multithreading, you can chooseVectororCopyOnWriteArrayList, recommended useCopyOnWriteArrayList

Inheritance relationships

As you can see above, ArrayList implements an abstract class with four interfaces. It inherits abstract class AbstracList and implements four interfaces: List, RandomAccess, Cloneable and Serializable

  • ArrayListInheritance inAbstractListTo achieve theList. It is an array queue, providing related add, delete, modify, traversal and other functions;
  • ArrayListTo achieve theRandmoAccessInterface, which provides random access;
  • ArrayListTo achieve theCloneableInterfaces, that is, override functionsclone(), can be cloned;
  • ArrayListTo achieve thejava.io.SerializableInterface, that meansArrayListSerialization is supported and can be transmitted by serialization.

Member variables

In the source code for ArrayList, the main member variables are as follows:

public class ArrayList<E> extends AbstractList<E> implements List<E>, RandomAccess.Cloneable.java.io.Serializable{...// The default array length
    private static final int DEFAULT_CAPACITY = 10;
    // An empty array by default
    private static final Object[] EMPTY_ELEMENTDATA = {};
    // The default empty array, unlike the above, is used in a different constructor
    private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};
    // The actual array used to store the data
    transient Object[] elementData;
    // Number of elements in the array
    private int size;
    // The maximum size of the array to allocate. Attempting to allocate a larger array may result in an OutOfMemoryError
    private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8; . }Copy the code

The underlying implementation of an ArrayList is an array, with a default size of 10.

The constructor

  • ArrayList() : Constructs an empty list with an initial capacity of 10
  • ArrayList(Collection
    c) : Constructs a list of specified elements
  • ArrayList(int initialCapcity) : Constructs an empty list with an initial capacity
// First, call ArrayList(10) to initialize an Object array of size 10 by default
public ArrayList(a) {
	this.elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA;
}
/ / the second
public ArrayList(int initialCapacity) {
    // If the user initialized size is greater than 0, create a new Objec array with the user initialized value size
	if (initialCapacity > 0) {
    	this.elementData = new Object[initialCapacity];
    } else if (initialCapacity == 0) {
        // equal to 0, the variable EMPTY_ELEMENTDATA is assigned, the default empty array
    	this.elementData = EMPTY_ELEMENTDATA;
    } else {
        < 0, an exception is thrown
    	throw new IllegalArgumentException("Illegal Capacity: "+ initialCapacity); }}// Group the container number and assign the value to the Object array
public ArrayList(Collection<? extends E> c) {
	elementData = c.toArray();
    if((size = elementData.length) ! =0) {
    	// c.toArray might (incorrectly) not return Object[] (see 6260652)
        // When c. toarray returns an array not of type Object, do the following conversion
        if(elementData.getClass() ! = Object[].class) elementData = Arrays.copyOf(elementData, size, Object[].class); }else {
    	// replace with empty array.
        this.elementData = EMPTY_ELEMENTDATA; }}Copy the code

Add, delete, change and check operation

For the basic operation of adding, deleting, changing and checking, only some important source code is given here for parsing.

Increase the operating

  • Add (E E) : Adds an element to the end of the list.
  • Add (int index, E Element) : Adds an element at the specified position
  • addAll( Collection
    c) : Adds a collection to the end of an element. The above return type is Boolean
  • EnsureCapcity (Int minCapcity) : Ensure that the list contains the minimum capacity of minCapcity

The default insertion of ArrayList is inserted at the end of the array, which is an O(1) operation when capacity expansion is not required, and O(n) operation when capacity expansion is required. Therefore, if you can judge the size of the data in advance, you can specify the size of the initialization array to avoid excessive capacity expansion. The code below looks at some of the first methods to increment elements:

// Step 1:
public boolean add(E e) {
    // Check the array size before adding elements
	ensureCapacityInternal(size + 1);  // Increments modCount!!
    elementData[size++] = e;
    return true;
}

private void ensureCapacityInternal(int minCapacity) {
	ensureExplicitCapacity(calculateCapacity(elementData, minCapacity));
}
// Step 2:
private void ensureExplicitCapacity(int minCapacity) {
	modCount++;

	// overflow-conscious code
    // If the added element is larger than the current array length, expand the array
    if (minCapacity - elementData.length > 0)
    	grow(minCapacity);
}
// Step 3: Expand capacity
private void grow(int minCapacity) {
	// overflow-conscious code
    int oldCapacity = elementData.length;
    // Increase the size of the array by half, or 1.5 times
    int newCapacity = oldCapacity + (oldCapacity >> 1);
    // newCapacity = minCapacity; MinCapacity Indicates the actual number of elements
    if (newCapacity - minCapacity < 0)
    	newCapacity = minCapacity;
    // If the expanded capacity exceeds the maximum variable MAX_ARRAY_SIZE
    if (newCapacity - MAX_ARRAY_SIZE > 0)
    	newCapacity = hugeCapacity(minCapacity);
    // minCapacity is usually close to size, so this is a win:
    // Copy the elements held by elementData to the new array object, and finally point the reference to elementData to the new array object
    // The original array object will be recycled after some time because it has no reference.
    elementData = Arrays.copyOf(elementData, newCapacity);
}

private static int hugeCapacity(int minCapacity) {
	if (minCapacity < 0) // overflow
    	throw new OutOfMemoryError();
    return (minCapacity > MAX_ARRAY_SIZE) ? Integer.MAX_VALUE : MAX_ARRAY_SIZE;
}
Copy the code

ArrayList also supports insertion at specified indexes. When inserting at the specified index, the specified index and the elements after it need to be pushed back one place, so it is an O(n) operation. The source code is as follows:

// Step 1:
public void add(int index, E element) {
    // Check if index is between 0 and size.
	rangeCheckForAdd(index);
	// See if elementData is long enough to expand
	ensureCapacityInternal(size + 1);  // Increments modCount!!
    // Move elementData one bit back from index
    System.arraycopy(elementData, index, elementData, index + 1, size - index);
    elementData[index] = element;
    size++;
}
// Step 2:
private void rangeCheckForAdd(int index) {
	if (index > size || index < 0)
    	throw new IndexOutOfBoundsException(outOfBoundsMsg(index));
}

// Step 3:
private void ensureCapacityInternal(int minCapacity) {
	ensureExplicitCapacity(calculateCapacity(elementData, minCapacity));
}

private void ensureExplicitCapacity(int minCapacity) {
	modCount++;

   	// overflow-conscious code
    if (minCapacity - elementData.length > 0)
    	grow(minCapacity);
}
Copy the code

Delete operation

  • Remove (Object O) : Removes the first element in the list with an O
  • Remove (int index) : deletes the element at the specified position in the list
  • RemoveAll (Collection
    c) : Delete all elements in the list containing C
  • RemoveIf (Predictcate
    filter) : Removes all elements of the given predicate in the list
  • RemoveRange (int from,int to) : Removes all elements from from to
  • Clear () : Clears all elements. The return type is void

When an ArrayList deletes an element, it needs to push all the elements after the deleted element forward one place, which is O(n) complexity. So ArrayList is not suitable for applications that require frequent insertion and deletion of elements at specified locations.

public E remove(int index) {
    // Step 1: If index >= size, throw an exception
	rangeCheck(index);

    modCount++;
    // Step 2: Get the value of the deleted element
    E oldValue = elementData(index);
	// Step 3: Move all elements after index forward one
    int numMoved = size - index - 1;
    if (numMoved > 0)
        System.arraycopy(elementData, index+1, elementData, index, numMoved);
    elementData[--size] = null; // clear to let GC do its work
	// Step 4: Return the value to be deleted
    return oldValue;
}
Copy the code

Let’s look at another implementation:

public boolean remove(Object o) {
	if (o == null) {
    	for (int index = 0; index < size; index++)
        	if (elementData[index] == null) {
            	fastRemove(index);
                eturn true; }}else {
    	for (int index = 0; index < size; index++)
        	if (o.equals(elementData[index])) {
            	fastRemove(index);
                return true; }}return false;
}

 /* * Private remove method that skips bounds checking and does not * return the value removed. */   
 private void fastRemove(int index) {
 	modCount++;
   	int numMoved = size - index - 1;
	if (numMoved > 0)
    	System.arraycopy(elementData, index+1, elementData, index, numMoved);
    elementData[--size] = null; // clear to let GC do its work
}
Copy the code

Change the operation

  • retainAll( Collection
    c) : keeps only the same elements as c in the list, equivalent to &
  • Set (int index,E Element) : Replaces the element in the index position with an element.
  • Size () : Returns the number of elements in this list
  • sort(Comparator
    c) : Sort by the specified sort rule
  • SubList (int from, int to) : Returns a list from from to
  • ToArray () : Converts a list to an array
  • TrimToSize () : Changes the capacity of the current instance to the current size of the list.

Set method

Make sure that the set position is smaller than the current array length (size) and greater than 0, get the specified position (index) element, then put the specified position (index) element, then put the specified position (index) element, then return the original position (oldValue) element.

// Step 1:
public E set(int index, E element) {
    // Check if index is greater than size and throw an exception if it is
	rangeCheck(index);

	E oldValue = elementData(index);
    // Overwrites the elements on index in ArrayList
    elementData[index] = element;
    // Returns the overwritten element
    return oldValue;
}
// Step 2;
private void rangeCheck(int index) {
	if (index >= size)
    	throw new IndexOutOfBoundsException(outOfBoundsMsg(index));
}
Copy the code

SubList method

We’re creating an internal SubList object in the ArrayList class. The first argument we pass in is this, which returns a partial view of the current list, pointing to the same place where the contents of the data are stored. If you modify the content returned by sublist, the original list will also change.

public List<E> subList(int fromIndex, int toIndex) {
	subListRangeCheck(fromIndex, toIndex, size);
    return new SubList(this.0, fromIndex, toIndex);
}
Copy the code

Use the subList method in ArrayList with caution

  • The subList result of ArrayList cannot be forcibly converted to ArrayList, otherwise ClassCastException will be thrown. The Java. Util. RandomAccessSubList always be cast to Java. Util. The ArrayList. SubList returns the inner subList class of ArrayList. SubList is not an ArrayList, but a view of ArrayList. All operations on subList sublists are reflected in the original list.
List<String> list = new ArrayList<>();
list.add("1");
list.add("1");
list.add("2");
ArrayList<String> strings =  (ArrayList)list.subList(0.1); Exception in thread"main" java.lang.ClassCastException: java.util.ArrayList$SubList cannot be cast to java.util.ArrayList  at com.nuih.List.ArrayListTest.main(ArrayListTest.java:29)
Copy the code
  • In subList scenario, highly pay attention to the modification of the original collection element number, can lead to traverse the child list, add, delete all can produce ConcurrentModificationException anomalies.
List<String> list = new ArrayList<>();
list.add("1");
list.add("1");
list.add("2");
List<String> subList =  list.subList(0.1);
// Add a value to the original List
list.add("10");
subList.add("11"); / / this line will be submitted to the Java. Util. ConcurrentModificationException
Copy the code

The trimToSize method

public void trimToSize(a) {
    // The number of changes is increased by 1
	modCount++;
    // If the current element is smaller than the array capacity, the free space (including null values) in elementData is removed
    // For example, if the array length is 10, only the first three elements have values and the rest are empty, then the array length is 3
    if (size < elementData.length) {
    	elementData = (size == 0)? EMPTY_ELEMENTDATA : Arrays.copyOf(elementData, size); }}Copy the code

ToArray method

First, the Object[] toArray() method. This method may be thrown Java. Lang. ClassCastException, if use directly down the method of transformation, change the set the ArrayList to specify the type of Array, the Array would throw the exception, and if you don’t down transformation into an Array of Array, If you cast each element down, you don’t throw this exception. Obviously, it’s inefficient and inconvenient to cast each element down in an array.

The second, T[] toArray(T[] a) method. This method can directly transform the Array converted from ArrayList into a whole downward transformation (transformation is actually implemented in the source code of this method), and it can be seen from the source code of this method that when the size of parameter A is insufficient, the internal call will be made to the array.copyof method, which internally creates a new Array return, Therefore, the common form of this method is as follows:

public Object[] toArray() {
	return Arrays.copyOf(elementData, size);
}

public <T> T[] toArray(T[] a) {
	if (a.length < size)
    	// Make a new array of a's runtime type, but my contents:
        return (T[]) Arrays.copyOf(elementData, size, a.getClass());
    System.arraycopy(elementData, 0, a, 0, size);
    if (a.length > size)
    	a[size] = null;
    return a;
}
Copy the code
// The first way (most commonly used)
Integer[] integer = arrayList.toArray(new Integer[0]);

// The second way (easy to understand)
Integer[] integer1 = new Integer[arrayList.size()];
arrayList.toArray(integer1);

// Throw an exception. Java does not support downward transitions
// Integer[] integer2 = new Integer[arrayList.size()];
// integer2 = arrayList.toArray();
Copy the code

Check the operating

  • Contains (Object O) : Returns true if contains element O
  • Get (int index) : Returns the element with the specified index
  • IndexOf (Object o) : returns the indexOf the first occurrence of the specified element in this list, or -1 if the list does not contain the element
  • LastindexOf (Object O): Returns the index of the last occurrence of the specified element in this list, or -1 if the list does not contain the element
  • IsEmpty () : returns true if the list isEmpty
  • Iterator () : Returns an iterator for the elements in the list
  • ListIterator () : List iterators that return lists (in proper order)
  • ListIterator (int index) : listIterator that returns a list from the appropriate position (in the correct order)

The get method

Returns the element at the specified location

public E get(int index) {
	rangeCheck(index);

    return elementData(index);
}
Copy the code

The contains method

Call indexOf to compare each element in the array and return true if the right element is found and false if no element is found.

public boolean contains(Object o) {
	return indexOf(o) >= 0;
}

public int indexOf(Object o) {
	if (o == null) {
    	for (int i = 0; i < size; i++)
        	if (elementData[i]==null)
            	return i;
    } else {
    	for (int i = 0; i < size; i++)
        	if (o.equals(elementData[i]))
            	return i;
    }
    return -1;
}
Copy the code

The iterator method

The interator method returns an inner class that can be called to the properties of the outer class because the inner class was created with an external this pointer by default.

public Iterator<E> iterator(a) {
	return new Itr();
}
Copy the code

In general, after calling the iterator, we will use the iterator traverses, used here next do traversal when there is a need to pay attention to, is to call next time, may trigger a ConcurrentModificationException, when the number of changes, This exception occurs when the number of changes is inconsistent with the expected number of changes when the iterator method is called.

@SuppressWarnings("unchecked")
public E next(a) {
	checkForComodification();
    int i = cursor;
    if (i >= size)
    	throw new NoSuchElementException();
    Object[] elementData = ArrayList.this.elementData;
    if (i >= elementData.length)
    	throw new ConcurrentModificationException();
    cursor = i + 1;
    return (E) elementData[lastRet = i];
}

final void checkForComodification(a) {
	if(modCount ! = expectedModCount)throw new ConcurrentModificationException();
}
Copy the code

The expectedModCount value is set when the user calls the Iterator method of the ArrayList, but after the user adds or removes the elements of the ArrayList, the modCount changes and the values are not equal. Would trigger a ConcurrentModificationException is unusual, this is in a multi-threaded use cases, common an exception.

traverse

ArrayList traversal in four ways

  • The first is a regular for loop that randomly iterates through index values.
// Random access
List<String> list = new ArrayList<>();
int size = list.size();
for (int i = 0; i < size; i++) {
	value = list.get(i);
}
Copy the code
  • The second is traversal by iterator. Iterator.
// iterator traversal
Iterator<String> iter = list.iterator();
while (iter.hasNext()) {
	value = iter.next();
}
Copy the code
  • The third, for-each.
// Enhance the for loop
for (String s : list) {
	value = s;
}
Copy the code
  • The fourth forEach + lambda loop is traversed
list.forEach(p -> {
	p.hashCode();
});
Copy the code

The performance comparison

Since there are four types of traversal, let’s look at which traversal efficiency. Let’s look at the time of each of the four loops through an experiment: test code

package com.nuih.List;

import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.UUID;

public class ArrayListTest {
    public static void main(String[] args) {
        // Data preheat
        List<String> testList = createTestList(10);
        testForEach(testList);
        testFor(testList);
        testRandFor(10, testList);
        
        List<Integer> integers = Arrays.asList(10.50.100.500.1000.10000.50000.100000.5000000.10000000.30000000);
        for(Integer i : integers) { testRand(i); }}private static void testRand(int size) {
        System.out.println("-- -- -- -- -- -- -- -- -- -- - the number of times:" + size + "-- -- -- -- -- -- -- -- -- -- -- --");
        List<String> list = createTestList(size);
        // Random access is traversed by index values.
        long time1 = System.nanoTime();
        testRandFor(size, list);
        long time2 = System.nanoTime();
        // Enhance the for loop
        testFor(list);
        long time3 = System.nanoTime();
        // iterator traversal
        testIterator(list);
        long time4 = System.nanoTime();
        // forEach + lambda
        testForEach(list);
        long time5 = System.nanoTime();
        System.out.println("Random access \t\t" + (time2 - time1) / 1000 + " ms");
        System.out.println("Enhanced for traversal \t\t" + (time3 - time2) / 1000 + " ms");
        System.out.println("Iterator traversal \t\t" + (time4 - time3) / 1000 + " ms");
        System.out.println("ForEach traverse t \ \ t" + (time5 - time4) / 1000 + " ms");
        System.out.println();
    }

    private static void testRandFor(int size, List<String> list) {
        for (int i = 0; i < size; i++) { list.get(i).hashCode(); }}private static void testFor(List<String> list) {
        for(String s : list) { s.hashCode(); }}private static void testIterator(List<String> list) {
        Iterator<String> iter = list.iterator();
        while(iter.hasNext()) { iter.next().hashCode(); }}private static void testForEach(List<String> list) {
        list.forEach(p -> {
            p.hashCode();
        });
    }

    public static List<String> createTestList(int size) {
        List<String> list = new ArrayList<>(size);
        for (int i = 0; i < size; i++) {
            list.add(UUID.randomUUID().toString());
        }
        returnlist; }}Copy the code

Test results:

Conclusion: If the amount of data is small, it seems that the four cycles take about the same time, but with the increase of the amount of data, the efficiency of Foreach is the best. But one of the odd things we can see from the above is that the first time through the loop forEach takes the longest time even though the amount of data is very small. But the time after that is normal. If you let go of the warm-up code in the test, the elapsed time of each run is normal.

-- -- -- -- -- -- -- -- -- -- - the number:10------------ Random access15Ms enhancementfortraverse8Ms iterator traversal6Ms forEach traversal65728Ms -----------50------------ Random access16Ms enhancementfortraverse21Ms iterator traversal13Ms forEach traversal10Ms -----------100------------ Random access7Ms enhancementfortraverse23Ms iterator traversal34Ms forEach traversal19Ms -----------500------------ Random access64Ms enhancementfortraverse47Ms iterator traversal39Ms forEach traversal105Ms -----------1000------------ Random access129Ms enhancementfortraverse99Ms iterator traversal81Ms forEach traversal58Ms -----------10000------------ Random access1748Ms enhancementfortraverse1921Ms iterator traversal1270Ms forEach traversal2212Ms -----------50000------------ Random access4013Ms enhancementfortraverse2739Ms iterator traversal3628Ms forEach traversal2368Ms -----------100000------------ Random access9874Ms enhancementfortraverse4500Ms iterator traversal5159Ms forEach traversal6232Ms -----------5000000------------ Random access215933Ms enhancementfortraverse27000Ms iterator traversal26586Ms forEach traversal22105Ms -----------10000000------------ Random access379588Ms enhancementfortraverse57104Ms iterator traversal42973Ms forEach traversal40539Ms -----------30000000------------ Random access1090531Ms enhancementfortraverse195013Ms iterator traversal185519Ms forEach traversal151925 ms
Copy the code

ArrayList deletes data

While there are four ways to traverse, there are only two ways to delete data correctly

  • The first deletes through an iterator. In this way, it is also recommended by the Ali Code Regulations.

Iterator<String> iter = list.iterator();        
while (iter.hasNext()) {
    iter.next().hashCode();
    iter.remove();
}
Copy the code
  • The second kind of reverse loop deletion
for(int i = list.size()-1; i>=0; i--) { list.remove(i); }Copy the code

Here’s another example of an incorrect delete operation

  • A normal for loop deletes elements in order, moving them to the left. Duplicate elements cannot be deleted
List<String> list = new ArrayList<>();
list.add("1");
list.add("1");
list.add("2");
for(int i=0; i<list.size(); i++){ list.remove(i); } System.out.println(String.join(",",list));
Copy the code

Result output: 1

  • Enhanced for loop to remove throws Java. Util. ConcurrentModificationException
List<String> list = new ArrayList<>();
list.add("1");
list.add("1");
list.add("2");
for (String s : list) {
	list.remove(s);
}        
System.out.println(String.join(",",list));
Copy the code

ArrayList ConcurrentModificationException

ConcurrentModificationException appeared in the use of ForEach traversal, iterators iterate through at the same time, to delete, add appear abnormal. Common ArrayList, HashMap can throw this kind of exception. If you are careless, you can easily make this error and lead to online accidents!

Under the following summary of the ArrayList some usage scenarios, to discuss whether can throw ConcurrentModificationException

For.. I traverse

This traversal means for(int I = 0; i

List<Integer> list = new Arraylist<>();
list.add(1);
list.add(2);
list.add(3);
for(int i = 0; i<list.size(); i++){ list.add(i); }Copy the code

Traverse the delete, will not have a ConcurrentModificationException, but must pay attention to the code, and prevent abnormal an array. The following form of code throws an array out of bounds exception.

List<Integer> list = new ArrayList<>();
list.add(1);
list.add(2);
list.add(3);
int length = list.size();
for(int i = 0; i<length; i++){ list.remove(i); }Copy the code

The ForEach traversal

ForEach traversal is For(Object O: List) traversal. ForEach loop is a JAVA syntactic sugar, which is equivalent to iterator loop traversal on the bytecode level. In this case, add elements will throw ConcurrentModificationException, and remove elements in most cases, will throw ConcurrentModificationException (small knowledge, if and only if remove small label for the size () – 2, The next-to-last element does not throw an exception. In this case, an exception will be thrown

List<Integer> list = new ArrayList<>();
list.add(1);
list.add(2);
list.add(3);
for (Integer i : list) {
	if(i == 1){ list.remove(i); }}Copy the code

You can modify the preceding statement. If I == 1 is changed to I == 2, no exception will be raised.

subList

In subList scenario, highly pay attention to the modification of the original collection element number, can lead to traverse the child list, add, delete all can produce ConcurrentModificationException anomalies.

List<String> list = new ArrayList<>();
list.add("1");
list.add("1");
list.add("2");
List<String> subList =  list.subList(0.1);
// Add a value to the original List
list.add("10");
subList.add("11"); / / this line will be submitted to the Java. Util. ConcurrentModificationException
Copy the code

How to avoid ConcurrentModificationException

  1. If you need to iterate over additions, it is best to create a temporary List that is the same as the old List, iterate over the old List, and then add elements to the temporary List
  2. Remove (iterator.remove()) instead of calling list.remove().

3. Be careful

conclusion

The underlying implementation of an ArrayList is an array, an ordered, repeatable container for holding elements. Suitable for scenarios where you need to find elements at a specified index. The complexity is O(n) when it is necessary to frequently insert, delete, or find specified elements.

  • When initializing a List, try to specify its capacity. (Minimize capacity expansion times)
  • When an ArrayList object is created using the no-argument constructor, the initial length of the array in the ArrayList object is zero (an empty array).
  • ArrayList’s expansion strategy is to increase the size of the current array by half each time (non-fixed allocation).
  • ArrayList expands by creating a new array and copying the data into the new array.

Some pictures from the network, copyright to the original author, delete.Copy the code