HashMap is thread-unsafe, and operating on instance variables of type HashMap in an object in a multithreaded environment can cause all sorts of unexpected problems.

This article elaborates on several thread-safety issues with HashMap.

Note: The following is based on JDK1.8

Multithreaded PUT can cause element loss

1.1 The test code is as follows

Note: Just as sample code that might cause this problem, running it directly does not necessarily cause problems

public class ConcurrentIssueDemo1 { private static Map<String, String> map = new HashMap<>(); Public static void main(String[] args) {// Thread 1 => T1 new Thread(new Runnable() {@override public void run() {for (int  i = 0; i < 99999999; i++) { map.put("thread1_key" + i, "thread1_value" + i); } } }).start(); // Thread 2 => t2 new Thread(new Runnable() {@override public void run() {for (int I = 0; i < 99999999; i++) { map.put("thread2_key" + i, "thread2_value" + i); } } }).start(); }}Copy the code

1.2 Scenarios that trigger this problem

Take a look at the source code for the PUT method

public V put(K key, V value) { return putVal(hash(key), key, value, false, true); } final V putVal(int hash, K key, V value, boolean onlyIfAbsent, boolean evict) { Node<K,V>[] tab; Node<K,V> p; int n, I; / / initializes the hash table if ((TAB = table) = = null | | (n = TAB. Length) = = 0) n = (TAB = the resize ()). The length; // Compute the hash position in the hash table and assign the element at that position to p, If (p = TAB [I = (n-1) & hash]) == null) TAB [I] = newNode(hash, key, value, null); Else {// The current index element of the hash table already exists, append Node<K,V> e to this element; K k; if (p.hash == hash && ((k = p.key) == key || (key ! = null && key.equals(k)))) e = p; else if (p instanceof TreeNode) e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value); else { for (int binCount = 0; ; If ((e = p.ext) == null) {// #1 p.ext = newNode(hash, key, value, null); // #2 if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st treeifyBin(tab, hash); break; } if (e.hash == hash && ((k = e.key) == key || (key ! = null && key.equals(k)))) break; p = e; } } if (e ! = null) { // existing mapping for key V oldValue = e.value; if (! onlyIfAbsent || oldValue == null) e.value = value; afterNodeAccess(e); return oldValue; } } ++modCount; if (++size > threshold) resize(); afterNodeInsertion(evict); return null; }Copy the code

Assume the table state in the current HashMap is as follows:

Assume that T1 performs PUT (” key2 “, “value2″) and T2 performs PUT (” key3 “, “value3”), and the hash values of key2 and key3 are the same as those of key1 in the figure.

Under normal circumstances, after putting is complete, the table should be in one of the following two states

Now let’s look at the anomalies

Assume that thread 1 and thread 2 are now executing at #1 in the PUT source code and that the current table state is as follows

Then both threads execute if ((e = p.ext) == null) and come to line #2.

P ext = newNode(hash, key, value, null);

Then the table will look like this

P next = newNode(hash, key, value, null);

In this case, the table changes to the following state

So the key2 element is lost.

2 If PUT and GET are concurrent, GET may be null

Scenario: Thread 1 performs PUT, and rehash occurs because the number of elements exceeds threshold. Thread 2 performs GET, which may cause this problem.

Analysis is as follows:

Take a look at the resize method source code

The new capacity and threshold are calculated, a new hash table is created, and elements from the old hash table are rehashed into the new hash table

The key codes are #1 and #2

// Hash table TRANSIENT Node<K,V>[] table; Final Node<K,V>[] resize() {begin Node<K,V>[] oldTab = table; int oldCap = (oldTab == null) ? 0 : oldTab.length; int oldThr = threshold; int newCap, newThr = 0; if (oldCap > 0) { if (oldCap >= MAXIMUM_CAPACITY) { threshold = Integer.MAX_VALUE; return oldTab; } else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY && oldCap >= DEFAULT_INITIAL_CAPACITY) newThr = oldThr << 1; // double threshold } else if (oldThr > 0) // initial capacity was placed in threshold newCap = oldThr; else { // zero initial threshold signifies using defaults newCap = DEFAULT_INITIAL_CAPACITY; newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY); } if (newThr == 0) { float ft = (float)newCap * loadFactor; newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ? (int)ft : Integer.MAX_VALUE); } threshold = newThr; End @suppressWarnings ({"rawtypes","unchecked "}) Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap]; end @SuppressWarnings({"rawtypes","unchecked "}) Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap]; // #1 table = newTab; // #2 // rehash begin if (oldTab ! = null) { for (int j = 0; j < oldCap; ++j) { Node<K,V> e; if ((e = oldTab[j]) ! = null) { oldTab[j] = null; if (e.next == null) newTab[e.hash & (newCap - 1)] = e; else if (e instanceof TreeNode) ((TreeNode<K,V>)e).split(this, newTab, j, oldCap); else { // preserve order Node<K,V> loHead = null, loTail = null; Node<K,V> hiHead = null, hiTail = null; Node<K,V> next; do { next = e.next; if ((e.hash & oldCap) == 0) { if (loTail == null) loHead = e; else loTail.next = e; loTail = e; } else { if (hiTail == null) hiHead = e; else hiTail.next = e; hiTail = e; } } while ((e = next) ! = null); if (loTail ! = null) { loTail.next = null; newTab[j] = loHead; } if (hiTail ! = null) { hiTail.next = null; newTab[j + oldCap] = hiHead; } } } } } // rehash end return newTab; }Copy the code

At code #1, a new hash table is created with the newly calculated capacity, and #2 assigns the newly created empty hash table to the instance variable table.

Note that the instance variable table is empty.

So, if another thread executes a GET at this point, it will get out NULL.

3 In JDK7, the concurrent put of HashMap creates a circular linked list, resulting in an infinite loop during GET

This problem has been solved in JDK8.

3.1 Formation of circular linked list in JDK7

Occurs when multiple threads are concurrent with resize.

The relevant source code is as follows:

void resize(int newCapacity) { Entry[] oldTable = table; int oldCapacity = oldTable.length; if (oldCapacity == MAXIMUM_CAPACITY) { threshold = Integer.MAX_VALUE; return; } Entry[] newTable = new Entry[newCapacity]; transfer(newTable, initHashSeedAsNeeded(newCapacity)); table = newTable; threshold = (int)Math.min(newCapacity * loadFactor, MAXIMUM_CAPACITY + 1); ** * Transfers all entries from current table to newTable. */ / Transfers all entries from current table to newTable. Void Transfer (Entry[] newTable, Boolean rehash) {int newCapacity = newtable.length; For (Entry<K,V> e: table) {// The table variable is the old hash table while(null! = e) { // #1 Entry<K,V> next = e.next; if (rehash) { e.hash = null == e.key ? 0 : hash(e.key); Int I = indexFor(e.hash, newCapacity); // Use the hash value of the element to calculate the position of the element in the new hash table. // #2 e.next = newTable[I]; // #3 newTable[i] = e; // #4 e = next; }}}Copy the code

Assume that thread 1(T1) and thread 2(T2) resize at the same time. Before resize, the states of the two threads and the Hashmap are as follows

The table field in the HashMap object in the heap refers to the old hash table with two elements at index 7. Let’s take the rehash of these two elements as an example to see how the circular linked list is formed.

Thread 1 and thread 2 respectively create a hash table, which is represented by the newTable field.

PS: If it is too long to present the execution of each step in the form of graph, here is the state at the end of each cycle, which can be calculated step by step according to the code and the explanation of each step.

Step1: after t2 completes #1, the CPU executes t1, and t1 completes

You can calculate from the figure above, and the state is as follows

Use t2.e to represent local variable e in thread 2, t2.next likewise.

Step2: t2 continue downward to complete the cycle

Step3: t2 continue the next cycle

Step4: T2 continues the next cycle, and the cyclic linked list appears

3.2 The appearance of an infinite loop

The hashmap. get method is as follows:

public V get(Object key) { if (key == null) return getForNullKey(); Entry<K,V> entry = getEntry(key); return null == entry ? null : entry.getValue(); } final Entry<K,V> getEntry(Object key) { if (size == 0) { return null; } int hash = (key == null) ? 0 : hash(key); For(Entry<K,V> e = table[indexFor(hash, table.length)]; e ! = null; e = e.next) { Object k; / / assume that this condition has not established the if (e.h ash = = hash && ((k = e.k ey) = = key | | (key! = null && key.equals(k)))) return e; } return null; }Copy the code

The for loop e = e.next is never empty, so if you get a key that doesn’t exist in the list, you’ll end up with an infinite loop.