This is my second article about getting started.

Break out of your comfort zone and review your Java knowledge regularly to ensure you don’t fall behind.

Systematic Relearning Java 2, let’s take a look at common questions in Java collections.

First question: why is there no longer a circular LinkedList underlying LinkedList in Java 1.7?

Prior to JDK 1.7 (for example, JDK1.6), LinkedList was a circular list implemented by headerEntry.

That is, an empty Entry is initialized to be used as a header, and then end to end to form a circular linked list, as shown in the figure below:

Every time an element is added or removed, it defaults to the end of the chain, before the header. Since traversal is in the next direction, doing something before the header is the same as doing something at the end of the list.

The corresponding code is as follows:

// from jdk1.6-linkedList
private Entry<E> addBefore(Eo) {
    Entry<E>newEntry = new Entry<E>(o,header,header.previous);
    newEntry.previous.next = newEntry;
    newEntry.next.previous = newEntry;
    size++;
    modCount++;
    return newEntry;
}
Copy the code

In JDK 1.7, the headerEntry circular list in JDK 1.6 was replaced with a non-circular list consisting of firstEntry and lastEntry.

The code for adding an element looks like this:

// from jdk1.7-linkedList
void linkLast(E e) {
        final Node<E> l = last;
        final Node<E> newNode =new Node<>(l, e,null);
        last = newNode;
        if (l ==null)
            first = newNode;
        else
            l.next = newNode;
        size++;
        modCount++;
}
Copy the code

Compared to 1.6, the code looks simpler and cleaner.

The comparison shows that:

  1. First/LAST has a clearer concept of chain head and chain tail;

  2. First/last saves a new headerEntry;

  3. Insert/delete operations at the head/tail, first /last is faster:

  • Insert/delete in the middle

    The pointer to index is changed at both ends of the list.

  • Insert/delete at both ends

    For a circular list, since it is connected end to end, we need to deal with Pointers at both ends; For acyclic lists, only one side of the process is first.previous/last.next, so in theory acyclic lists are more efficient.

    In general, two-end (header/tail) operations are the most common, so it is a wise choice for version 1.7 to replace them with acyclic lists.

The original reference links: blog.csdn.net/tiwerbao/ar…

Second question: Why does ArrayList implement RandomAccess?

When we look at the source code, ArrayList implements the RandomAccess interface, while LinkedList does not.

RandomAccess itself is an empty interface, so why implement it?

In fact, RandomAccess is a tag interface, and the official explanation is that as long as the List implements this interface, it can support fast RandomAccess.

And what is random access?

We know that Collections is a utility class for Collections. Take the binary search method in the Collections source code for example:

public static <T> int binarySearch(List<? extends Comparable<? super T>> list, T key) {
    if (list instanceof RandomAccess || list.size()<BINARYSEARCH_THRESHOLD)
         return Collections.indexedBinarySearch(list, key);
    else
         return Collections.iteratorBinarySearch(list, key);
}
Copy the code

In the binarySearch method, we first determine if the list is an instance of RandomAccess, if so, execute the indexedBinarySearch method, and if not, execute the iteratorBinarySearch method.

The difference between these two methods is that the former uses index traversal, which is more suitable for ArrayList; The latter uses iterators to iterate over and is more appropriate for LinkedList.

So using the RandomAccess interface as a marker, you can determine which search scheme is more efficient.

Why is the object array elementData in ArrayList a transient modifier when ArrayList is serializable?

First of all, we need to know the role of TRANSIENT:

Once a variable is transient, it is no longer part of the object’s persistence and its contents cannot be accessed after serialization.

The ArrayList is dynamically expanded, so the array of objects does not store data. If the array is serialized externally, the entire array of objects will be serialized.

So ArrayList implements its own private writeObject and readObject methods to optimize serialization and deserialization for better performance.

In a nutshell, we use transient to decorate the object array elementData to prevent it from being serialized by an external method.

Fourth question: In actual testing, it is found that LinkedList is not necessarily more efficient than ArrayList when adding elements. Why is this?

When you test the performance of ArrayList and LinkedList, you will find that the LinkedList is not necessarily more efficient than ArrayList when you manipulate ArrayList and LinkedList in different locations. So why is that?

Let’s start with the test results (in terms of elapsed time) :

In other words, LinkedList performs better than ArrayList only when new elements are added from the head of the collection, while LinkedList performs worse than ArrayList when new elements are added from the middle and tail of the collection.

Here’s why:

  1. Add elements from the collection header

If an ArrayList adds an element to the header, it copies and rearranges the data after the header, which is inefficient.

When adding an element to the LinkedList, the LinkedList will first loop to find the location where the element needs to be added. If the location is in the first half of the List, the LinkedList will find the location from front to back. If the position is in the second half, look from back to front. So LinkedList adding elements to the header is very efficient;

  1. Add elements from the middle of the collection

When an ArrayList adds elements to the middle of an array, some data also needs to be copied and rearranged, which is not very efficient.

LinkedList, which adds elements to the middle, is the least efficient way to add elements.

Because it is near the middle, the loop lookup before adding an element is the most efficient operation to traverse the element.

  1. Add an element from the end of the collection

Arraylists are very efficient at adding elements to the tail without copying and rearranging data;

LinkedList is less efficient than ArrayList because of the new object and the process of changing the pointer to the object.

Q5: In JDK 1.8 HashMap, the tree threshold is 8, and the chain threshold is 6. Why not keep the same?

Okay, speaking of people, let’s explain the concept.

As we know, the underlying data structure of a HashMap is an array containing linked lists or red-black trees.

When the list length is greater than 8, the list will be converted into a red-black tree, which is tree-ization;

If the number of elements in a red-black tree is less than or equal to 6, the tree will degenerate into a linked list.

Why doesn’t it degenerate less than 8? For example, it degenerates at 7, but it has to be less than or equal to 6. Right?

Imagine that if the threshold was 7, deleting an element would degrade the red-black tree to a linked list and adding an element would have to be tree-like. The back and forth transition structure would degrade performance, so the threshold would not be critical.

Q 6: Why is the length of a HashMap a power of 2?

The hash(key) method used in the source code to calculate the index of the array to be inserted will be calculated by (length-1) & hash, 2 to the power of -1 into the binary each bit is 1, numerically equivalent to length modulo.

In this way, the distribution of data in the array is relatively discrete, the probability of collision is small, and the query efficiency will be improved.

Q7: Why was the HashMap list changed to tail insertion in JDK 1.8?

As we know, when hash conflicts occur, elements are stored as linked lists, which in JDK 1.7 are inserted from the head and in JDK 1.8 from the tail.

In the head insertion method, the original order of the elements in the linked list will be changed during expansion, which will easily lead to the problem of the linked list ring in concurrent scenarios.

The tail insertion method, on the other hand, preserves the original order of the linked list elements during expansion, thus eliminating the problem of the linked list being looped.

Q: Why did we optimize the storage location calculation of HashMap array expansion in JDK 1.8?

The storage location of array expansion corresponding to the two versions is calculated as follows:

JDK 1.7: Shredding all the data, recalculating the hash and storing it, also known as rehash, is quite expensive;

JDK 1.8: resorts the elements of an array, either in their original location or in their original location + original capacity.

In addition to poor performance, the JDK 1.7 rehash operation under concurrent conditions can cause problems with circular linked lists between elements.

JDK 1.8 addressed this issue, but we still don’t recommend using HashMap in multithreaded Settings because there are other issues with using HashMap in multithreaded Settings, like data loss.

ConcurrentHashMap is recommended for concurrent problems.