How can Java lists be traversed more efficiently?

Java has three ways to iterate:

  • Normal for loop traversal (for)
  • Enhanced for loop traversal (foreach)
  • Iterator loop

These three traversal methods are different.

Let’s start with a program to compare how long it takes different lists to traverse in different ways:

import java.util.ArrayList; import java.util.Iterator; import java.util.LinkedList; import java.util.List; /** * @author Zhangjianhua * @date 2021-05-09 16:08 * @description list for loop */ @suppresswarnings ("ALL") public class Main  { public static void main(String[] args) { testForeachTransform(); long arrayListForTime = arrayListForTime(); System.out.println(" Test the elapsed time of ArrayList traversal by for: "+ arrayListForTime); long arrayListForeachTime = arrayListForeachTime(); System.out.println(" Test how long ArrayList traverses through foreach: "+ arrayListForeachTime); long arrayListIteratorTime = arrayListIteratorTime(); System.out.println(" Test ArrayList traversing through iterator: "+ arrayListIteratorTime); long linkedListForTime = linkedListForTime(); System.out.println(" Test LinkedList elapsed time by traversing for: "+ linkedListForTime); long linkedListForeachTime = linkedListForeachTime(); System.out.println(" Test LinkedList traversal time by foreach: "+ linkedListForeachTime); long linkedListIteratorTime = linkedListIteratorTime(); System.out.println(" Test LinkedList traversing through iterator: "+ linkedListIteratorTime); // Based on the test results: // ArrayList traversing through the for interface is similar to traversing through the iterator. // LinkedList traversing through the iterator is much faster than traversing through the for interface. // In our application, we need to consider which implementation class uses the List interface. // Implement the RandomAccess interface to distinguish which implementation class of List // implement the RandomAccess interface to iterate through the data, // A List without RandomAccess traverses the data through the iterator. Private static void testForeachTransform() { System.out.println(" Test what foreach converts to "); // For a primitive array, foreach traversal is converted to a for traversal, because iterator is not implemented. Int [] Arrays = {1, 2, 3, 4, 5}; int[] arrays = {2, 3, 4, 5}; for (int i = 0; i < arrays.length; ++i) { } for (int array : arrays) { } List<Integer> list = new ArrayList<>(); list.add(1); list.add(2); list.add(3); list.add(4); list.add(5); for (int i = 0; i < list.size(); ++i) { } for (int element : Private static long arrayListForTime() {list <Integer> arrayList = new ArrayList<>(); for (int i = 0; i < 100000; i++) { arrayList.add(i); } long startTime = system.currentTimemillis (); For (int I = 0; i < arrayList.size(); i++) { arrayList.get(i); } long endTime = system.currentTimemillis (); Return endTime - startTime; } private static long arrayListForeachTime() {List<Integer> arrayList = new ArrayList<>(); for (int i = 0; i < 100000; i++) { arrayList.add(i); } long startTime = system.currentTimemillis (); For (int array: arrayList) {} long endTime = system.currentTimemillis (); Return endTime - startTime; } private static long arrayListIteratorTime() {List<Integer> arrayList = new ArrayList<>(); for (int i = 0; i < 100000; i++) { arrayList.add(i); } long startTime = system.currentTimemillis (); <Integer> iterator = arrayList.iterator(); while (iterator.hasNext()) { iterator.next(); } long endTime = system.currentTimemillis (); Return endTime - startTime; } private static long linkedListForTime() {List<Integer> LinkedList () = new LinkedList<>(); for (int i = 0; i < 100000; i++) { linkedList.add(i); } long startTime = system.currentTimemillis (); For (int I = 0; i < linkedList.size(); i++) { linkedList.get(i); } long endTime = system.currentTimemillis (); Return endTime - startTime; } private static long linkedListForeachTime() {List<Integer> linkedList = new LinkedList<>(); for (int i = 0; i < 100000; i++) { linkedList.add(i); } long startTime = system.currentTimemillis (); For (int array: linkedList) {} long endTime = system.currentTimemillis (); Return endTime - startTime; } private static long linkedListIteratorTime() {List<Integer> linkedList = new LinkedList<>(); for (int i = 0; i < 100000; i++) { linkedList.add(i); } long startTime = system.currentTimemillis (); <Integer> iterator = linkedList.iterator(); while (iterator.hasNext()) { iterator.next(); } long endTime = system.currentTimemillis (); Return endTime - startTime; }}

Here’s the result:

We can see that for the sequential ArrayList, there is little difference between the normal for loop, the enhanced for loop, and the iterator loop. It takes 2ms to traverse 100,000 times.

However, for LinkedList, the difference between the normal for loop and the enhanced for/iterator loop is very large. The normal for loop takes 4,390 ms to iterate 100,000 times, while the enhanced for/iterator loop takes only 3ms / 1ms to iterate 100,000 times.

Why is that?

Because the time complexity of a sequential list is O(n) in any of the three iterations. A chained list with a normal for loop takes O(n^2), but an order of magnitude away from a chained list with an enhanced for loop and an iterator loop takes O(n), so a chained list with a normal for loop is slow.

First, an enhanced for loop for a list is the equivalent of an iterator loop.

How do you see that?

You can look at the compiled class file and decompile it to see the list’s enhanced for loop being converted into an iterator loop. But let’s just say that if we use an enhanced for loop for a primitive array, we compile it to a normal for loop, not to an iterator loop, because the primitive array doesn’t define an iterator loop, so we convert it to a normal for loop.

Run the testForeachTransform method of the program above, and compare the transformation of the enhanced for loop with the list and the enhanced for loop with the array of basic types. Run the program and look at the compiled class file. You can see what code was converted to.

Since the enhanced for loop is converted to an iterator loop for lists, and lists can be sequential or chained, we analyze them in four ways:

  • A normal for loop for a sequential list
  • Enhanced for loop for a sequential list
  • A normal for loop for chained lists
  • An enhanced for loop for chained lists

Why does a chained list with an iterator loop take an exponential order of magnitude less time than a normal for loop? We can find out by looking at the source code.

The following hand to teach you how to look at the source code, how to find the answer to the traversal efficiency from the source code:

To prepare

Here to jdk1.8 as test, found out the test source code, address: https://github.com/zjhpure/so…

Open with idea, open the place shown in the figure.

1. A normal for loop for sequential lists

Find the arrayListForTime method as shown below:

As you can see from the code, the code between the start time and the end time is the elapsed time of the for loop, and each time the loop is traversed, the get method is called, so as long as you know the elapsed time of the GET method, you know the total elapsed time of the for loop. We first look at the source code of get method through the jump function of IDEA, but after jump, an interface is displayed, as shown in the figure below:

How to do? How do I find the source code for the get method?

You need to debug through the break point to find the source code for the get method. Hit a breakpoint at the get method and click the debug button. After the program runs, it will jump to the get method and stop. When debugging, execute Force Step Into and jump Into the source code of the GET method, you can see the source code of the get method, as shown in the figure below:

We can see that the source code for the get method looks like this:

If you encounter the interface when debugging the source code of the get method, continue to execute Force Step Into and jump Into the method. You can see the source code of the method. There will be a lot of operations to see the source code, will not write in detail how to find the source code method, also according to the above operation can be found.

Back in the source code, there are two parts, first executing the rangeCheck method, then executing the elementData method. The rangeCheck method of the first step is to check whether the input subscript exceeds the size of the array.

/** * Checks if the given index is in range. If not, throws an appropriate * runtime exception. This method does *not* check if the index is * negative: It is always used immediately prior to an array access, * which throws an ArrayIndexOutOfBoundsException if index is negative. */ private void rangeCheck(int index) { if (index  >= size) throw new IndexOutOfBoundsException(outOfBoundsMsg(index)); }

The second step of the elementData method, source code as follows:

// Positional Access Operations @SuppressWarnings("unchecked") E elementData(int index) { return (E) elementData[index];  }

Here is directly from the elementData array to take the element, because it is a sequential list, the bottom is to use the array to save the element, through the idea click jump elementData can see that elementData is an array, as shown in the following figure:

So in summary, for a normal for loop with a sequential list, the time for each get is O(1), and the time for going through n elements is O(n).

2. Enhanced for loops for sequential lists

For lists, the enhanced for loop is converted to an iterator loop. You can find the answer by looking at the class file compiled after Java runs. Let’s first find the testForeachTransform method, as shown below:

By clicking on the run program and looking at the compiled class file, we can see that the enhanced for loop is converted to a normal for loop for an array of basic types, and the enhanced for loop to an iterator loop for a list, as shown below:

So the enhanced for loop that parses a list is essentially an iterator loop that parses a list, and now the iterator loop that parses a sequential list

We find the arrayListIteratorTime method, as shown in the figure below:

Each iteration calls the next method, so knowing how long the next method takes tells you how long the iterator loop takes. Iterator () iterator () iterator () iterator () iterator ();

    /**
     * Returns an iterator over the elements in this list in proper sequence.
     *
     * <p>The returned iterator is <a href="#fail-fast"><i>fail-fast</i></a>.
     *
     * @return an iterator over the elements in this list in proper sequence
     */
    public Iterator<E> iterator() {
        return new Itr();
    }

In iterator, create an Itr object. Itr class is an inner class.

/** * An optimized version of AbstractList.Itr */ private class Itr implements Iterator<E> { int cursor; // index of next element to return int lastRet = -1; // index of last element returned; -1 if no such int expectedModCount = modCount; Itr() {} public boolean hasNext() { return cursor ! = size; } @SuppressWarnings("unchecked") public E next() { checkForComodification(); int i = cursor; if (i >= size) throw new NoSuchElementException(); Object[] elementData = ArrayList.this.elementData; if (i >= elementData.length) throw new ConcurrentModificationException(); cursor = i + 1; return (E) elementData[lastRet = i]; } public void remove() { if (lastRet < 0) throw new IllegalStateException(); checkForComodification(); try { ArrayList.this.remove(lastRet); cursor = lastRet; lastRet = -1; expectedModCount = modCount; } catch (IndexOutOfBoundsException ex) { throw new ConcurrentModificationException(); } } @Override @SuppressWarnings("unchecked") public void forEachRemaining(Consumer<? super E> consumer) { Objects.requireNonNull(consumer); final int size = ArrayList.this.size; int i = cursor; if (i >= size) { return; } final Object[] elementData = ArrayList.this.elementData; if (i >= elementData.length) { throw new ConcurrentModificationException(); } while (i ! = size && modCount == expectedModCount) { consumer.accept((E) elementData[i++]); } // update once at end of iteration to reduce heap write traffic cursor = i; lastRet = i - 1; checkForComodification(); } final void checkForComodification() { if (modCount ! = expectedModCount) throw new ConcurrentModificationException(); }}

As you can see, the Itr class has a next method and a hasNext method. The hasNext method is called in the condition of the while loop. As you can see from the source code, the hasNext method determines whether the cursor is equal to the size of the list.

Let’s analyze the next method. There are only three key sentences in the source code of the next method, as shown below:

ArrayList. This. ElementData is to obtain a list of the array, and then the cursor = I + 1, is the cursor plus one, the last element from an array elementData, return element.

So in summary, for the enhanced for loop (iterator loop) of the sequential list, the time for each next method is O(1), and the time for traversing n elements is O(n).

3. Normal for loop for chained lists

Find the linkedListForTime method as shown below:

Reading the source code in the same way, we can see that every time we walk through it, we call the get method, so as long as we know how much time the get method takes, we know how much time the for loop takes. Find the source code for the get method as follows:

/** * Returns the element at the specified position in this list. * * @param index index of the element to return * @return the element at the specified position in this list * @throws IndexOutOfBoundsException {@inheritDoc} */ public E  get(int index) { checkElementIndex(index); return node(index).item; }

Call the checkElementIndex method and return node(index).item. Let’s look at the source code for the checkElementIndex method as follows:

private void checkElementIndex(int index) { if (! isElementIndex(index)) throw new IndexOutOfBoundsException(outOfBoundsMsg(index)); }

The isElementIndex method is called, and the isElementIndex method is called.

    /**
     * Tells if the argument is the index of an existing element.
     */
    private boolean isElementIndex(int index) {
        return index >= 0 && index < size;
    }

The rangeCheck method is similar to the rangeCheck method for sequential lists. It is used to determine the validity of subscripts. Get = node(index).item;

/** * Returns the (non-null) Node at the specified element index. */ Node<E> node(int index) { // assert isElementIndex(index); if (index < (size >> 1)) { Node<E> x = first; for (int i = 0; i < index; i++) x = x.next; return x; } else { Node<E> x = last; for (int i = size - 1; i > index; i--) x = x.prev; return x; }}

Because it’s a chained list, the underlying list holds the elements. The whole process can be seen from the source of the node method, determines the subscript position is in the middle or in the middle on the left on the right, and if the middle on the left, then from the start node to search one by one, if it is in the middle on the right, then from the front end node to search one by one, so the chain list to find an element of time complexity is O (n).

From this we can see that a chained list is much less efficient at finding an element than a sequential list, because it takes O(1) to find an element, and a chained list takes O(n) to find an element.

Because there are n elements to traverse, the time complexity for a chained list to traverse n elements is O(n^2), and the normal for loop for a chained list is an order of magnitude slower than the normal for loop for a sequential list.

But the chained list does not search for each element n times to find, according to the source code here can be found, according to the subscript in the middle right or left to judge from the start node or from the end node forward, combined, the search time is 1,2,3,4… N / 2 n / 2… 4,3,2,1.

Total power :(1+n/2)n/2=3/4n^2, 3/4n^2, also on the order of n^2.

So in summary, for a normal for loop with a chained list, the time for each get is O(n), and the time for going through n elements is O(n^2).

4. Enhanced for loops for chained lists

For chained lists, the enhanced for loop is also converted to an iterator loop.

So the enhanced for loop that parses a list is essentially an iterator loop that parses a list, and now the iterator loop that parses a chained list

We find the linkedListIteratorTime method, as shown in the figure below:

Each iteration calls the next method, so knowing how long the next method takes tells you how long the iterator loop takes. Let’s look at the source code of the iterator method, as follows:

    /**
     * Returns an iterator over the elements in this list (in proper
     * sequence).<p>
     *
     * This implementation merely returns a list iterator over the list.
     *
     * @return an iterator over the elements in this list (in proper sequence)
     */
    public Iterator<E> iterator() {
        return listIterator();
    }

In the iterator method, call the listIterator method.

    /**
     * {@inheritDoc}
     *
     * <p>This implementation returns {@code listIterator(0)}.
     *
     * @see #listIterator(int)
     */
    public ListIterator<E> listIterator() {
        return listIterator(0);
    }

In the listIterator method, invoke method listIterator(0), which means start with the first element of the list.

    /**
     * Returns a list-iterator of the elements in this list (in proper
     * sequence), starting at the specified position in the list.
     * Obeys the general contract of {@code List.listIterator(int)}.<p>
     *
     * The list-iterator is <i>fail-fast</i>: if the list is structurally
     * modified at any time after the Iterator is created, in any way except
     * through the list-iterator's own {@code remove} or {@code add}
     * methods, the list-iterator will throw a
     * {@code ConcurrentModificationException}.  Thus, in the face of
     * concurrent modification, the iterator fails quickly and cleanly, rather
     * than risking arbitrary, non-deterministic behavior at an undetermined
     * time in the future.
     *
     * @param index index of the first element to be returned from the
     *              list-iterator (by a call to {@code next})
     * @return a ListIterator of the elements in this list (in proper
     *         sequence), starting at the specified position in the list
     * @throws IndexOutOfBoundsException {@inheritDoc}
     * @see List#listIterator(int)
     */
    public ListIterator<E> listIterator(int index) {
        checkPositionIndex(index);
        return new ListItr(index);
    }

In another listIterator method, call the checkPositionIndex method and create a ListItr object. ListItr class is an inner class.

private void checkPositionIndex(int index) { if (! isPositionIndex(index)) throw new IndexOutOfBoundsException(outOfBoundsMsg(index)); }

In the checkPositionIndex method, call the isPositionIndex method.

    /**
     * Tells if the argument is the index of a valid position for an
     * iterator or an add operation.
     */
    private boolean isPositionIndex(int index) {
        return index >= 0 && index <= size;
    }

The isPositionIndex method is used to determine whether it is the subscript of an element. It is similar to the rangeCheck method used to determine the validity of the subscript. Let’s look at the ListItr class source code, as follows:

private class ListItr implements ListIterator<E> { private Node<E> lastReturned; private Node<E> next; private int nextIndex; private int expectedModCount = modCount; ListItr(int index) { // assert isPositionIndex(index); next = (index == size) ? null : node(index); nextIndex = index; } public boolean hasNext() { return nextIndex < size; } public E next() { checkForComodification(); if (! hasNext()) throw new NoSuchElementException(); lastReturned = next; next = next.next; nextIndex++; return lastReturned.item; } public boolean hasPrevious() { return nextIndex > 0; } public E previous() { checkForComodification(); if (! hasPrevious()) throw new NoSuchElementException(); lastReturned = next = (next == null) ? last : next.prev; nextIndex--; return lastReturned.item; } public int nextIndex() { return nextIndex; } public int previousIndex() { return nextIndex - 1; } public void remove() { checkForComodification(); if (lastReturned == null) throw new IllegalStateException(); Node<E> lastNext = lastReturned.next; unlink(lastReturned); if (next == lastReturned) next = lastNext; else nextIndex--; lastReturned = null; expectedModCount++; } public void set(E e) { if (lastReturned == null) throw new IllegalStateException(); checkForComodification(); lastReturned.item = e; } public void add(E e) { checkForComodification(); lastReturned = null; if (next == null) linkLast(e); else linkBefore(e, next); nextIndex++; expectedModCount++; } public void forEachRemaining(Consumer<? super E> action) { Objects.requireNonNull(action); while (modCount == expectedModCount && nextIndex < size) { action.accept(next.item); lastReturned = next; next = next.next; nextIndex++; } checkForComodification(); } final void checkForComodification() { if (modCount ! = expectedModCount) throw new ConcurrentModificationException(); }}

As you can see, the ListItr class has a next method and a hasNext method. The hasNext method is called in the condition of the while loop. As you can see from the source code, the hasNext method is to determine whether the cursor nextIndex is equal to the size of the list. The while loop ends.

Let’s analyze the next method. From the source code of the next method, we can find that it is constantly traversing the list, as shown in the figure below:

So the iterator loop over n elements takes O(n), which is an order of magnitude faster than the normal for loop over N elements takes O(n^2).

Why is it suddenly an order of magnitude lower?

We carefully compared the source code can be seen above, in fact the iterator loop traverse just change the train of thought, for ordinary for loop chain list, each time to find an element should be from the beginning of the node list or the end node to the middle one by one to find, so look for an element, the time complexity is O (n). But let’s think about it, we’re going to go through all the elements, so can we not look for the element every time? Certainly, because each element holds the address of the next node, so you can change a way of thinking, starting from the start node traversal, find the next node through the node, to find and down through the next node to a node, until the end node, every time you don’t need to find an element need to look for a time.

Iterator loop traversal is the idea above. A simple change in thinking can reduce time by an order of magnitude.

For chained lists, the normal for loop takes O(n) time for each get, but the iterator loop takes O(1) time for each next, so you can dramatically reduce the time. The reason why each next method of a chained list is O(1) is that each next method moves the pointer back one bit, rather than looking through it like the get method of a normal for loop.

So in summary, for the enhanced for loop (iterator loop) of a chained list, the time for each next method is O(1), and the time for traversing n elements is O(n).

conclusion

According to the test results:

  • ArrayList traverses 100,000 pieces of data through an ordinary for loop, taking 2ms and taking O(n) time.
  • The ArrayList traverses 100,000 pieces of data through an enhanced for loop, taking 2ms with a time complexity of O(n).
  • ArrayList iterates through 100,000 pieces of data, taking 2ms and O(n) time.
  • LinkedList traverses 100,000 pieces of data through a normal for loop, which takes 4390ms and takes O(n^2) time, which is significantly slower.
  • LinkedList traverses 100,000 pieces of data through an enhanced for loop that takes 3ms and takes O(n) time.
  • LinkedList iterates through 100,000 pieces of data, taking 1ms and taking O(n) time.

Conclusion:

  • For sequential lists, use a normal for loop to traverse the data
  • For chained lists, an enhanced for loop is used to traverse the data

Public account: Pure programming (chunjie_tech)

Original link:https://mp.weixin.qq.com/s/A8…