The key to improving performance of ConcurrentHashMap is concurrent expansion of multiple threads. Hash conflicts reduce retrieval efficiency, so expansion is a good way to reduce conflicts. However, if only one thread does the migration work, all the other read-write threads have to wait, which is inefficient. ConcurrentHashMap has two member variables table,nextTable. Table stores the current data, and nextTable is the target container for moving data when expansion occurs. As long as the slot being operated does not conflict with the slot being migrated, concurrent reads and writes during capacity expansion can be maximized.

Here’s a look at some of the core methods

public V get(Object key) { Node<K, V>[] tab; Node<K, V> e, p; int n, eh; K ek; Int h = spread(key.hashcode ()); If ((TAB = table)! = null && (n = tab.length) > 0 && (e = tabAt(tab, (n - 1) & h)) ! = null) {/ / first determine the hash to the same Because the hash of some elements in the same slot may be different if ((eh = e.h ash) = = h) {if (= e.k (ek ey) = = key | | (ek! = null && key.equals(ek))) return e.val; } else if (eh < 0) return (p = e.type (h, key)) return (p = e.type (h, key)); = null ? p.val : null; // Hash mismatch traverse the list while ((e = e.next)! = null) { if (e.hash == h && ((ek = e.key) == key || (ek ! = null && key.equals(ek)))) return e.val; } } return null; }Copy the code

You can observe that the get method is a lock-free method. What happens if another thread changes the list? And you can see from the implementation of put that we’re using tail insertion and let’s say that the new entry is B, and it’s going to mount to A. This happens to access a in two cases: 1)read()a -> put()b -> read()a.next can read the latest b node. 2) Read ()a -> read() A. next -> put() B Cannot read the latest B node

The hash value at (1) is negative. If entry-hash is negative, it indicates that the slot is being migrated. This will be explained later.

public boolean containsValue(Object value) { if (value == null) throw new NullPointerException(); Node<K, V>[] t; if ((t = table) ! = null) { Traverser<K, V> it = new Traverser<K, V>(t, t.length, 0, t.length); for (Node<K, V> p; (p = it.advance()) ! = null; ) { V v; if ((v = p.val) == value || (v ! = null && value.equals(v))) return true; } } return false; }Copy the code

Traverser The Traverser object is the base iterator of ConcurrentHashMap. Advance returns a Node each time it is called.

final Node<K, V> advance() { Node<K, V> e; If ((e = next)! = null) e = e.next; for (; ;) { Node<K, V>[] t; int i, n; // must use locals in checks // = null) return next = e; / / at this point is beyond the table scan range Namely traverse all data if (baseIndex > = baseLimit | | (t = TAB) = = null | | (n = t.l ength) < = (I = index) | | I <  0) return next = null; // Hash < 0 indicates that the entry is a tree structure or is being migrated. // tabAt(t, I), I = index have located the entry based on the latest index. If ((e = tabAt(t, I))! = null && e.hash < 0) { if (e instanceof ForwardingNode) { tab = ((ForwardingNode<K, V>) e).nextTable; e = null; pushState(t, i, n); continue; } else if (e instanceof TreeBin) e = ((TreeBin<K, V>) e).first; else e = null; } if (stack ! = null) recoverState(n); Else if ((index = I + baseSize) >= n) index = ++baseIndex; // visit upper slots if present } }Copy the code

As you can see, it is also a lock-free method, which is basically the same as GET, leaving the tree or migration logic aside.

Now, how is the data inserted

final V putVal(K key, V value, boolean onlyIfAbsent) { if (key == null || value == null) throw new NullPointerException(); Int hash = spread(key.hashcode ()); int binCount = 0; For (Node<K, V>[] TAB = table; ;) { Node<K, V> f; int n, i, fh; / / if the container has not been initialized to initialize the if (TAB = = null | | (n = TAB. Length) = = 0) TAB = initTable (); Else if ((f = tabAt(TAB, I = (n-1) & hash)) == null) {// Try to use cas. If the configuration succeeds, no subsequent processing is required. There are two scenarios in which the competition fails. If (casTabAt(TAB, I, null, new Node<K, V>(hash, key, value, null)) if (casTabAt(TAB, I, null, new Node<K, V>(hash, key, value, null))) break; Else if ((fh = f.hash) == MOVED) // Move logic later TAB = helpTransfer(TAB, F); V oldVal = null; Synchronized (f) {// synchronized (f) {// synchronized (f) {// synchronized (f) { If the table has been locked and the table itself has changed, try to relocate the slot and lock the head node again So now we're going to do hash = MOVED if (tabAt(TAB, If (fh >= 0) {// If (fh >= 0) {// If (fh >= 0) {// If (fh >= 0) {// If (fH >= 0) {// If (fH >= 0) { for (Node<K, V> e = f; ; ++binCount) { K ek; / / a hashMap operations from the if (e.h ash = = hash && (= e.k (ek ey) = = key | | (ek! = null && key.equals(ek)))) { oldVal = e.val; if (! onlyIfAbsent) e.val = value; break; } Node<K, V> pred = e; If ((e = e.next) == null) {next = new Node<K, V>(hash, key, value, null); break; Else if (f instanceof TreeBin) {Node<K, V> p; binCount = 2; If ((p = ((TreeBin<K, V>) f).puttreeval (hash, key, value))! = null) { oldVal = p.val; if (! onlyIfAbsent) p.val = value; }}}} // indicates that if (binCount! If (binCount >= TREEIFY_THRESHOLD) treeifyBin(TAB, I); // Return the old value if (oldVal! = null) return oldVal; break; }}} // Check whether the current table has enough data and expand the table. addCount(1L, binCount); return null; }Copy the code

Treification of linked lists

private final void treeifyBin(Node<K, V>[] tab, int index) { Node<K, V> b; int n, sc; if (tab ! If ((n = tab.length) < MIN_TREEIFY_CAPACITY) // The size of the array is doubled by default tryPresize(n << 1); else if ((b = tabAt(tab, index)) ! Synchronized (b) {if synchronized (b) {if synchronized (b) {if synchronized (b) {if synchronized (b) {if synchronized (b) {if synchronized (b) {if synchronized (b) {if synchronized (b) {if synchronized (b) {if synchronized (b) {if synchronized (b) (tabAt(tab, index) == b) { TreeNode<K, V> hd = null, tl = null; For (Node<K, V> e = b; e ! = null; e = e.next) { TreeNode<K, V> p = new TreeNode<K, V>(e.hash, e.key, e.val, null, null); if ((p.prev = tl) == null) hd = p; else tl.next = p; tl = p; Setabat (TAB, index, new TreeBin<K, V>(hd)) setTabAt(TAB, index, new TreeBin<K, V>(hd)); } } } } }Copy the code

Attempt to expand capacity

private final void tryPresize(int size) { int c = (size >= (MAXIMUM_CAPACITY >>> 1)) ? MAXIMUM_CAPACITY : tableSizeFor(size + (size >>> 1) + 1); int sc; While ((sc = sizeCtl) >= 0) {Node<K, V>[] TAB = table; int n; / / if the array has not been initialized Logic here like init the if (TAB = = null | | (n = TAB. Length) = = 0) {n = > c (sc)? sc : c; if (U.compareAndSwapInt(this, SIZECTL, sc, If (table == TAB) {@suppressWarnings ("unchecked") Node<K, V>[] nt = (Node<K, V> V>[]) new Node<?, ?>[n]; table = nt; sc = n - (n >>> 2); } } finally { sizeCtl = sc; }}} / / has been expanded to the size of the enough else if (c < = sc | | n > = MAXIMUM_CAPACITY) break; Else if (TAB == table) {if (TAB == table) {if (TAB == table) {if (TAB == table) {if (TAB == table) resizeStamp(n); While sc >= 0 if (sc < 0) {Node<K, V>[] nt; if ((sc >>> RESIZE_STAMP_SHIFT) ! = rs || sc == rs + 1 || sc == rs + MAX_RESIZERS || (nt = nextTable) == null || transferIndex <= 0) break; if (U.compareAndSwapInt(this, SIZECTL, sc, sc + 1)) transfer(tab, nt); Else if (U.compareAndSwapInt(this, SIZECTL, sc, SIZECTL) (rs << RESIZE_STAMP_SHIFT) + 2)) transfer(tab, null); }}}Copy the code

Map insert and query are not considered in the case of data migration. Let’s see how data migration is implemented

/** * Moves and/or copies the nodes in each bin to new table. See * above for explanation. ** @param TAB source */ private final void transfer(Node<K, V>[] TAB, Node<K, V>[] nextTab) { int n = tab.length, stride; If ((stride = (NCPU > 1)? (n >>> 3) / NCPU : n) < MIN_TRANSFER_STRIDE) stride = MIN_TRANSFER_STRIDE; // Subdivide range // Indicates that nextTable needs to be updated. By default, nextTable is expanded to twice the size. // Only one thread performs updates to nextTable null) { // initiating try { @SuppressWarnings("unchecked") Node<K, V>[] nt = (Node<K, V>[]) new Node<? ,? >[n << 1]; nextTab = nt; } catch (Throwable ex) { // try to cope with OOME sizeCtl = Integer.MAX_VALUE; return; } // Note that there is an additional field to maintain a new array, nextTable = nextTab; transferIndex = n; } int nextn = nextTab.length; ForwardingNode<K, V> FWD = new forwardingNode <K, V>(nextTab); // Indicates whether to update the slot range. Boolean advance = true; boolean finishing = false; Do you want to ensure sweep before research nextTab (right) what do you want to do to fully understand concurrentHashMap (right) what do you want to do to accurately analyze all thread collisions (right) The old table detected slots that need to be expanded by checking ForwardingNode and participated in the expansion. Now the ForwardingNode should be gradually set to all slots of the old table. // Each thread is responsible for moving only a few slots For (int I = 0, bound = 0; int I = 0, bound = 0; ;) { Node<K, V> f; int fh; While (advance) {nextBound represents the lower subscript limit of the slots carried by this thread in this cycle. NextIndex represents the upper subscript limit of the slots carried by this thread nextIndex, nextBound; / / -- > = I bound on behalf of a slot on the handling Go forward a / / finishing representative had complete data expands the scope does not need to update the slot So advance to false if (-- > = I bound | | finishing) advance = false; Else if ((nextIndex = transferIndex) <= 0) {I = -1; advance = false; } else if (U.compareAndSwapInt (this, transferIndex,) {if (U.compareAndSwapInt (this, transferIndex,)} nextIndex, NextBound = (nextIndex > stride? NextIndex - stride: NextBound ~nextIndex bound = nextBound; i = nextIndex - 1; advance = false; }} / / within the this time handling all slots have been moving to finish the if (I < 0 | | I > = n | | I + n > = nextn) {int sc; If (finishing) {nextTable = null; table = nextTab; sizeCtl = (n << 1) - (n >>> 1); return; If (U.compareAndSwapInt(this, SIZECTL, sc = SIZECTL, if (U.compareAndSwapInt(this, SIZECTL, sc = SIZECTL, Sc-1)) {// If ((sc-2)! = resizeStamp(n) << RESIZE_STAMP_SHIFT) return; Finishing = advance = true; // This is the last thread to migrate data. i = n; // Recheck before commit} else if ((f = tabAt(tab, i)) == null) advance = casTabAt(tab, i, null, fwd); // If the current slot has been moved by another thread // If the current slot has been moved by another thread // If the current slot has been moved by another thread // If the current slot has been moved by another thread // If the current slot has been moved by another thread // If the current slot has been moved by another thread // ((fh = f.hash) == MOVED) advance = true; // Already processed // The most common case is that there is data to be moved in this slot else {// There is also a lock on the header node, which creates lock contention with putValue and tree Synchronized (f) {if (tabAt(TAB, synchronized) {if (tabAt(TAB, synchronized); i) == f) { Node<K, V> ln, hn; If (fh >= 0) {int runBit = fh&n; Node<K, V> lastRun = f; For (Node<K, V> p = f.ext; p ! = null; p = p.next) { int b = p.hash & n; // Indicates that the hash of one node is inconsistent with that of the other node. If (b! = runBit) { runBit = b; // lastRun = p; }} if (runBit == 0) {ln = lastRun; hn = null; } else { hn = lastRun; ln = null; } for (Node<K, V> p = f; p ! = lastRun; p = p.next) { int ph = p.hash; K pk = p.key; V pv = p.val; If ((ph&n) == 0) ln = new Node<K, V>(ph, pk, pv, ln); else hn = new Node<K, V>(ph, pk, pv, hn); } // Insert two small lists into the new table setTabAt(nextTab, I, ln); setTabAt(nextTab, i + n, hn); // Since the capacity expansion has not been completed, the slot is marked as in capacity expansion so that the new PUT thread finds the mark corresponding to the slot and will assist in capacity expansion. // There is an implicit benefit here. // Insert data into a slot that has not yet been moved. However, once the thread that has been moved preempts the slot, it is marked and is not allowed to operate on the slot until the whole table has been expanded The hash thread does not wait until it sees the move flag and assists with the data migration setTabAt(TAB, I, FWD). // Continue to move to the next slot advance = true; Else if (f instanceof TreeBin) {TreeBin<K, V> t = (TreeBin<K, V>) f; TreeNode<K, V> lo = null, loTail = null; TreeNode<K, V> hi = null, hiTail = null; int lc = 0, hc = 0; for (Node<K, V> e = t.first; e ! = null; e = e.next) { int h = e.hash; TreeNode<K, V> p = new TreeNode<K, V> (h, e.key, e.val, null, null); if ((h & n) == 0) { if ((p.prev = loTail) == null) lo = p; else loTail.next = p; loTail = p; ++lc; } else { if ((p.prev = hiTail) == null) hi = p; else hiTail.next = p; hiTail = p; ++hc; Ln = (lc <= UNTREEIFY_THRESHOLD)? untreeify(lo) : (hc ! = 0)? new TreeBin<K, V>(lo) : t; hn = (hc <= UNTREEIFY_THRESHOLD) ? untreeify(hi) : (lc ! = 0)? new TreeBin<K, V>(hi) : t; setTabAt(nextTab, i, ln); setTabAt(nextTab, i + n, hn); setTabAt(tab, i, fwd); advance = true; } } } } } }Copy the code

As you can conclude from the above code, the SIZECTL values are changed every time a thread migrates data, and when all data is migrated, the old table is replaced with the new table because the table is volatile, so the changes can be observed. The other threads will spin and wait for the table to update after the data migration is complete

What do other threads do when they find a MOVE tag

final Node<K, V>[] helpTransfer(Node<K, V>[] tab, Node<K, V> f) { Node<K, V>[] nextTab; int sc; // The new array object if (TAB! = null && (f instanceof ForwardingNode) && (nextTab = ((ForwardingNode<K, V>) f).nextTable) ! = null) { int rs = resizeStamp(tab.length); While (nextTab == nextTable && table == TAB && // sizeCtl is a special negative number when the table is initialized, -1 (sc = sizeCtl) < 0) {// If ((sc >>> RESIZE_STAMP_SHIFT)! = rs || sc == rs + 1 || sc == rs + MAX_RESIZERS || transferIndex <= 0) break; If (U.compareAndSwapInt(this, SIZECTL, sc, sc +1)) {transfer(TAB, nextTab); break; } } return nextTab; } return table; }Copy the code

Once the all-important data migration logic has been parsed, some of the other methods are implemented much the same, first finding the target slot with the hash value, and then participating in the data migration if it is marked as MOVE. Lock the head node if it is a normal node. And then there’s normal insert, replace, delete.

Now let’s go back to what the get() request does when it finds a negative hash in the entry.find() method. (It turns out that TreeNode’s find method has a lock check step because red-black trees need locks to be balanced. Concurrency control is implemented by cas to modify the volatile int lockState property. Not expanding.)

What if the slot corresponding to get is in the process of data migration

Node<K, V> find(int h, Object k) { // loop to avoid arbitrarily deep recursion on forwarding nodes outer: for (Node<K, V>[] tab = nextTable; ;) { Node<K, V> e; int n; if (k == null || tab == null || (n = tab.length) == 0 || (e = tabAt(tab, (n - 1) & h)) == null) return null; for (; ;) { int eh; K ek; if ((eh = e.hash) == h && ((ek = e.key) == k || (ek ! = null && k.equals(ek)))) return e; If (e instanceof ForwardingNode) {TAB = ((ForwardingNode<K, V>) e).nexttable; continue outer; } else return e.type (h, k); } if ((e = e.next) == null) return null; }}}Copy the code

Slot Operation sequence of head nodes During capacity expansion, lock head nodes, complete data migration, and change them to MOVE. So when get is called and the MOVE flag is found, the data migration for the slot is complete, and you can query the data directly to the target container of the expansion. There is a problem with this table. If find is triggered at the same time, data cannot be retrieved.

So let’s go back to this method and see what happens when we discover that an entry is being migrated

final Node<K, V> advance() { Node<K, V> e; // next ! If ((e = next)! If ((e = next)! = null) e = e.next; for (; ;) { Node<K, V>[] t; int i, n; // must use locals in checks // = null) return next = e; / / at this point is beyond the table scan range Namely traverse all data if (baseIndex > = baseLimit | | (t = TAB) = = null | | (n = t.l ength) < = (I = index) | | I <  0) return next = null; // Hash < 0 indicates that the entry is a tree structure or is being migrated. // tabAt(t, I), I = index have located the entry based on the latest index. If ((e = tabAt(t, I))! = null && e.hash < 0) { if (e instanceof ForwardingNode) { tab = ((ForwardingNode<K, V>) e).nextTable; e = null; PushState (t, I, n); pushState(t, I, n); continue; } else if (e instanceof TreeBin) e = ((TreeBin<K, V>) e).first; else e = null; } if (stack ! = null) // Restore the stored property recoverState(n) with stack; Else if ((index = I + baseSize) >= n) index = ++baseIndex; // visit upper slots if present } }Copy the code

When an entry is found to be an entry in migration, the table,index, and N are stored in a stack structure. And since TAB is updated and t changes in the next round, we can read the slot that index corresponds to the migration array. In this case, e=tabAt(t, I) is used to obtain the entry. Because the stack is assigned. The recoveryState command restores the table and related Pointers. And then the next loop uses if (e! = null) return next = e; Iterate over the migrated data.

What does pushState/recoveryState do

private void pushState(Node<K, V>[] t, int i, Int n) {// Spare will be updated when pushState is flushed before recoveryState, but a get() request for only one Traverser object should be a TableStack that is not reentered; // reuse if possible if (s ! = null) spare = s.next; else s = new TableStack<K, V>(); // Store the current identity information in this structure. s.length = n; s.index = i; s.next = stack; stack = s; } private void recoverState(int n) { TableStack<K, V> s; int len; while ((s = stack) ! = null && (index += (len = s.length)) >= n) { n = len; // Restore the stack data to the current object index = s.index; tab = s.tab; s.tab = null; TableStack<K, V> next = s.next; s.next = spare; // save for reuse stack = next; spare = s; If (s == null && (index += baseSize) >= n) index = ++baseIndex; }Copy the code

The other traversal methods use this class as a template.