HashMap is a very common container in Java. It was only used in the use stage before, and it was only half-understood the underlying design. It was not until you saw the source code that you realized its clever design and how it works. Before you understand the underlying principles, it is recommended that you study several common data structures to get a better understanding of HashMap.

The data structure

The data structure of the HashMap is
Array plus linked list, the default initial size is
16, the default load factor is
0.75When the actual Capacity (which is the size of each node, not the size of a non-empty array) exceeds the defined value of Capacity multiplied by the load factor, the HashMap will be executed once
resizeScaling operation to create a new array 2 times the size at the same time
rehashAll the data (the reason why rehash is needed instead of copying the previous array position is that the length of the array is used in the hash algorithm, and the length of the array has changed after expansion)

The initial size of the HashMap is 16, set to a power of 2 for bitwise operations, and try to set the size of the HashMap to a power of 2 when manually setting it

Lists Evolution and Degradation Mechanisms: The length of the lists in a HashMap evolves into a red-black tree when it is greater than 8, and degenerates into a linked list when it is less than 6. Now, why eight and six? That’s because in many times of experiments found that 8 times hash collision probability is very, very small, can set to 8 in the small probability incident, transform the data structure, optimize the subsequent query performance, degradation threshold is set to 6 in order to avoid the set to 8, or 7, because the hash collision of switching back and forth near the eight data structure.

The Hash algorithm

Each key will undergo a hash operation (get the key’s hashCode first, and move the hashCode unsigned 16 bits to the right [hashCode >>> 16], and perform XOR operation on the right-shifted value and hashCode (the same value in binary is 0, [hash = hashCode ^ hashCode >>> 16] [hash = hashCode ^ hashCode >>> 16] [hash = hashCode ^ hashCode >>> 16] Otherwise 0) [(length-1) & hash] gets.

The hash collision array in HashMap is inserted into a node using tail interpolation (JDK7 is header) that contains the key, hash values, value, and the pointer to the next node

Tips: << move left, right fill 0 >> move right, left initial bit is positive fill 0, negative fill 1 >>> unsigned move right, initial bit fill 0 For binary, initial bit 0 is positive and 1 is negative

Thread safety

HashMap is not thread-safe, as is the case in JDK7 and JDK8

JDK7

In JDK7, thread A inserts A pair of key-values at the position of an index in an array, and the pointer to A new Entry next points to the first Entry in the bucket. Thread B also inserts A pair of key-values. Since thread A has not completed the assignment, the next value of the new Entry created by thread B also points to the first Entry in the bucket and successfully adds it. Then Thread B finishes, and Thread A starts the assignment, and it will see that the data added by Thread B is overwritten after the execution of Thread A

(2) In the transfer expansion method, assume that A linked bucket has two adjacent entries, Entry1 and Entry2. The next pointer of Entry1 points to Entry2. When thread A is executing the transfer operation, and Entry1 is executed before the legend mark, If there is a hash collision between Entry1 and Entry2 after rehash, then the next pointer of Entry2 points to Entry1 in the enlarged HashMap due to the existence of header interpolation. If the next pointer of Entry1 points to Entry2, then the next pointer of Entry2 points to Entry1. If the next pointer of Entry2 points to Entry1, then the next pointer of Entry1 points to Entry1. Thread A cannot break out of resize

JDK8

In JDK8, the transfer method was removed and replaced by the resize method. Because of the existence of tail interpolation, there will be no endless loop, but there will be data coverage problems.


Thread A and Thread B are at the same time in the PUT value, and the hash value of their key is calculated to get the same index value. A hash collision occurs, but at this time, Thread A is suspended, and Thread B performs newNode operation. After Thread B finishes, Thread A continues to execute, which also performs newNode operation. The previous update of thread B is overwritten, and the data put of thread B disappears