In the jdk7 ConcurrentHashMap

The data structure of ConcurrentHashMap is basically similar to that of HashMap, except that:

1. Internal synchronization mechanism (block lock) is added when data is written to ensure thread safety, and read operation is lockless operation;

2. During capacity expansion, the transfer of old data is performed concurrently, which improves capacity expansion efficiency.

Source code analysis

The ConcurrentHashMap constructor creates a segment. Minimum length is 2. (MIN_SEGMENT_TABLE_CAPACITY=2)

Segment locking is used in JDK7 to ensure thread safety. Segment inherits ReentrantLock.





Find the segment where the key is based on the hash value. Then, call the segment’s PUT method.



If the lock is successful, the logic of the put method continues. If the lock fails, the scanAndLockForPut method is entered.



In this method you are always trying to get the lock. In other words,In ConcurrentHashMap, locks are added to segments.

Create a new HashEntry and move the segment from the old hash table to the new hash table. Problems with circular links in hashMap.

private void rehash(HashEntry<K,V> node) { /* * Reclassify nodes in each list to new table. Because we * are using power-of-two expansion, the elements from * each bin must either stay at same index, or move with a * power of two offset. We eliminate unnecessary node * creation by catching cases where old nodes can be * reused because their next fields won't change. * Statistically, at the default threshold, only about * one-sixth of them need cloning when a table * doubles. The nodes they replace will be garbage * collectable  as soon as they are no longer referenced by * any reader thread that may be in the midst of * concurrently traversing table. Entry accesses use plain * array indexing because they are followed by volatile * table write. */ HashEntry<K,V>[] oldTable = table; int oldCapacity = oldTable.length; int newCapacity = oldCapacity << 1; threshold = (int)(newCapacity * loadFactor); HashEntry<K,V>[] newTable = (HashEntry<K,V>[]) new HashEntry[newCapacity]; int sizeMask = newCapacity - 1; for (int i = 0; i < oldCapacity ; i++) { HashEntry<K,V> e = oldTable[i]; if (e ! = null) { HashEntry<K,V> next = e.next; int idx = e.hash & sizeMask; if (next == null) // Single node on list newTable[idx] = e; else { // Reuse consecutive sequence at same slot HashEntry<K,V> lastRun = e; int lastIdx = idx; for (HashEntry<K,V> last = next; last ! = null; last = last.next) { int k = last.hash & sizeMask; if (k ! = lastIdx) { lastIdx = k; lastRun = last; } } newTable[lastIdx] = lastRun; // Clone remaining nodes for (HashEntry<K,V> p = e; p ! = lastRun; p = p.next) { V v = p.value; int h = p.hash; int k = h & sizeMask; HashEntry<K,V> n = newTable[k]; newTable[k] = new HashEntry<K,V>(h, p.key, v, n); } } } } int nodeIndex = node.hash & sizeMask; // add the new node node.setNext(newTable[nodeIndex]); newTable[nodeIndex] = node; table = newTable; }Copy the code

In the jdk8 ConcurrentHashMap

ConcurrentHashMap in JDK8 differs from JDK7 in that it uses CAS instead of segmentlocking.

Important member attributes

private static final int MAXIMUM_CAPACITY = 1 << 30; /** * The default initial table capacity. Must be a power of 2 * (i.e., at least 1) and at most MAXIMUM_CAPACITY. */ private static final int DEFAULT_CAPACITY = 16; /** * The largest possible (non-power of two) array size. * Needed by toArray and related methods. */ static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8; /** * The default concurrency level for this table. Unused but * defined for compatibility with previous versions of this class. */ private static final int DEFAULT_CONCURRENCY_LEVEL = 16; /** * The load factor for this table. Overrides of this value in * constructors affect only the initial table capacity. The * actual floating point value isn't normally used -- it is * simpler to use expressions such as {@code n - (n >>> 2)} for * the associated resizing threshold. Load factor: 75% by default. When the table usage reaches 75%, the tabel length is doubled to reduce hash collisions. %table.lengh */ private static final Float LOAD_FACTOR = 0.75f; /** * The bin count threshold for using a tree rather than list for a * bin. Bins are converted to trees when adding an element to a * bin with at least this many nodes. The value must be greater * than 2, and should be at least 8 to mesh with assumptions in * tree removal about conversion back to plain bins upon * shrinkage. The default is 8, and when the list length reaches 8, the structure is transformed into a red-black tree. */ static final int TREEIFY_THRESHOLD = 8; /** * The bin count threshold for untreeifying a (split) bin during a * resize operation. Should be less than TREEIFY_THRESHOLD, and at * most 6 to mesh with shrinkage detection under removal. By default, 6, the red-black tree turns into a linked list threshold. */ static final int UNTREEIFY_THRESHOLD = 6; /** * The smallest table capacity for which bins may be treeified. * (Otherwise the table is resized if too many nodes in a bin.) * The value should be at least 4 * TREEIFY_THRESHOLD to avoid * conflicts between resizing and treeification thresholds. The default value is 16. During table expansion, each thread must migrate at least number of table slots. */ static final int MIN_TREEIFY_CAPACITY = 64; /** * Minimum number of rebinnings per transfer step. Ranges are * subdivided to allow multiple resizer threads. This value * serves as a lower bound to avoid resizers encountering * excessive memory contention. The value should be at least * DEFAULT_CAPACITY. */ private static final int MIN_TRANSFER_STRIDE = 16; /** * The number of bits used for generation stamp in sizeCtl. * Must be at least 6 for 32bit arrays. */ private static int RESIZE_STAMP_BITS = 16; /** * The maximum number of threads that can help resize. * Must fit in 32 - RESIZE_STAMP_BITS bits. */ private static final int MAX_RESIZERS = (1 << (32 - RESIZE_STAMP_BITS)) - 1; /** * The bit shift for recording size stamp in sizeCtl. */ private static final int RESIZE_STAMP_SHIFT = 32 - RESIZE_STAMP_BITS; static final int MOVED = -1; // hash for forwarding nodes When node. hash is MOVED, the table is expanding static final int TREEBIN = -1; // Hash for roots of trees represents this element with red-black tree static final int RESERVED = -3; // hash for transient reservations static final int HASH_BITS = 0x7fffffff; // Usable bits of Normal node hash // Table is a temporary variable that migrates all elements to nextTable during migration. private transient volatile Node<K,V>[] nextTable; /** * Table initialization and resizing control. When negative, the * table is being initialized or resized: -1 for initialization, * else -(1 + the number of active resizing threads). Otherwise, * when table is null, holds the initial table size to use upon * creation, or 0 for default. After initialization, holds the * next element count value upon which to resize the table. 0: the table has not been initialized. -1: the table is being initialized. The value is smaller than -1. The actual value is resizeStamp(n)<<RESIZE_STAMP_SHIFT+2, indicating that the table is expanding and greater than 0: After initialization, indicates the maximum number of table elements. The default value is 0.75*n */ private TRANSIENT volatile int sizeCtl. /** * The next table index (plus one) to split while resizing. TransferIndex Indicates the index of the currently migrated element */ private TRANSIENT int transferIndex;Copy the code

ForwardingNode inherits Node, a special Node whose hashCode =MOVED indicates that the table is being expanded. During capacity expansion, if an element of the table is null, then that element is set to ForwardingNode. When the next thread inserts data into that element, check hashCode =MOVED to help capacity expansion.

The constructor



The constructor, similar to hashMap in JDK8, converts a capacity that is not a power of two.

Put () method

final V putVal(K key, V value, boolean onlyIfAbsent) { if (key == null || value == null) throw new NullPointerException(); int hash = spread(key.hashCode()); int binCount = 0; for (Node<K,V>[] tab = table;;) {// for loop CAS Node<K,V> f; int n, i, fh; If (TAB = = null | | (n = TAB. Length) = = 0) / / if the hash table is empty to initialize TAB = initTable (); Else if ((f = tabAt(TAB, I = (n-1) & hash)) == null) { If (casTabAt(TAB, I, null, new Node<K,V>(hash, key, value)) Null))) // the CAS operation is placed in the current hash table position break; // no lock when adding to empty bin } else if ((fh = f.hash) == MOVED) // TAB = helpTransfer(TAB, f); TAB = helpTransfer(TAB, f); Else {// The current hash table position is not empty V oldVal = null; Synchronized (f) {if (tabAt(TAB, I) == f) {if (fh >= 0) {binCount = 1; for (Node<K,V> e = f;; ++binCount) {// Loop to build a linked list; if (e.hash == hash && ((ek = e.key) == key || (ek ! = null && key.equals(ek)))) {oldVal = e.val; if (! onlyIfAbsent) e.val = value; break; } Node<K,V> pred = e; Next = new Node<K,V>(hash, key, value, null); break; } } } else if (f instanceof TreeBin) { Node<K,V> p; binCount = 2; if ((p = ((TreeBin<K,V>)f).putTreeVal(hash, key, value)) ! = null) { oldVal = p.val; if (! onlyIfAbsent) p.val = value; } } } } if (binCount ! If (binCount >= TREEIFY_THRESHOLD) treeifyBin(TAB, I); if (oldVal ! = null) return oldVal; break; } } } addCount(1L, binCount); return null; }Copy the code



The while loop CAS operation is also used to initialize the table operation to ensure security in multithreading.

Final Node<K,V>[] helpTransfer(Node<K,V>[] TAB, Node<K,V> f) {//table Node<K,V>[] nextTab; int sc; if (tab ! = null && (f instanceof ForwardingNode) && (nextTab = ((ForwardingNode<K,V>)f).nextTable) ! Int rs = resizeStamp(tab.length); While (nextTab = = nextTable && table = = TAB && (sc = sizeCtl) < 0) {/ / that / / capacity also judge whether sign changed | | expansion over the if ((sc > > > RESIZE_STAMP_SHIFT) ! = = = rs | | sc rs + 1 | | / / maximum help thread | | whether expansion transfer subscript in adjusting (expansion) sc = = rs + MAX_RESIZERS | | transferIndex < = 0) break; If (U.compareAndSwapInt(this, sizeCtl, sc, sc + 1)) {transfer(TAB, nextTab); transfer(TAB, nextTab); break; } } return nextTab; } return table; }Copy the code

We mainly did the following things:

  • Check whether capacity expansion is complete.
  • SizeCtrl = sizeCtrl+1, then call Transfer () for real expansion.
private final void transfer(Node<K,V>[] tab, Node<K,V>[] nextTab) { int n = tab.length, stride; if ((stride = (NCPU > 1) ? (n >>> 3)/NCPU: n) < MIN_TRANSFER_STRIDE) // subdivide range, each thread migrates at least 16 slots, if large, the maximum stride = MIN_TRANSFER_STRIDE; NextTab == null) {try {@suppressWarnings ("unchecked") Node<K,V>[] nt = (Node<K,V>[])new Node<? ,? >[n << 1]; NextTab = nt; } catch (Throwable ex) { // try to cope with OOME sizeCtl = Integer.MAX_VALUE; return; } nextTable = nextTab; transferIndex = n; } int nextn = nexttab.length; ForwardingNode<K,V> fwd = new ForwardingNode<K,V>(nextTab); // Whether to advance to the next cycle Boolean advance = true; Case of this method Boolean finishing = false; // to ensure sweep before research nextTab, complete state, if yes, end this method. for (int i = 0, bound = 0;;) { Node<K,V> f; int fh; While (advance) {// take the next cycle int nextIndex, nextBound; / / this thread processing range for the [bound, I), is also no processing is completed, then continue to process the if (-- > = I bound | | finishing) advance = false; Else if ((nextIndex = transferIndex) <= 0) {I = -1; advance = false; } // This condition changes the transferIndex value, Else if (U.compareAndSwapInt (this, TRANSFERINDEX, nextIndex, //nextBound) NextBound = (nextIndex > stride? NextIndex - stride: 0))) {nextBound = nextindex-1; }} the if (I < 0 | | I > = n | | I + n > = nextn) {/ / every migration thread can reach here int sc; if (finishing) {/ / migration nextTable = null; SizeCtl = (n << 1) - (n >>> 1); // Expand by 2n-0.5N = 1.50N, update the new capacity threshold return; If (U.compareAndSwapInt(this, SIZECTL, sc = SIZECTL, sc-1)) {if (U.compareAndSwapInt(this, SIZECTL, sc = SIZECTL, SC-1)) SizeCtl = sizeCtl - 1. These two objects are also split if ((sc-2)! = resizeStamp(n) << RESIZE_STAMP_SHIFT) return; finishing = advance = true; i = n; // recheck before commit}} Else if ((f = tabAt(TAB, I)) == null) advance = casTabAt(TAB, I, null, FWD); Else if ((fh = f.hash) == MOVED) // Other threads already worked, advance = true; // already processed, if we push through the next cycle, we'll still check if we're done with the I and the bound else {// We need to lock it so that we can't put any more values in it, and we'll lock it when we put data in it. Let's say the whole table is migrating, it hasn't migrated to this element yet, and another thread inserts data into this node, and it's migrated to here, Synchronized (f) {if (tabAt(TAB, I) == f) { If (fh >= 0) {int runBit = fh&n;//n <K,V> lastRun = f; For (Node<K,V> p = f.ext; p! = null; p = p.ext) {int b = p.hash & n; // If (b! = runBit) { runBit = b; lastRun = p; } } if (runBit == 0) { ln = lastRun; hn = null; } else {// for example, 1,16,32, if %16 is low, then it must be 0.  hn = lastRun; ln = null; } for (Node<K,V> p = f; p ! = lastRun; p = p.next) { int ph = p.hash; K pk = p.key; V pv = p.val; Ln = new Node<K,V>(ph, pk, pv, ln); // next = null; // next = null; Hn = new Node<K,V>(pH, pk, pv, hn);} setTabAt(nextTab, I, hn); SetTabAt (TAB, I, FWD); advance = true;} else if (f instanceof TreeBin) {// If (f instanceof TreeBin<K,V> t = (TreeBin<K,V>)f; TreeNode<K,V> lo = null, loTail = null; TreeNode<K,V> hi = null, hiTail = null; int lc = 0, hc = 0;  for (Node<K,V> e = t.first; e ! Int h = e.bash; TreeNode<K,V> p = new TreeNode<K,V> (h, e.key, e.val, null, null); If ((h & n) == 0) {// if (h & n) == 0) {// if (h & n) == 0) lo = p; Else loTail. Next = p; // Last p's next is this p loTail = p; } else {if ((p.prev = hiTail) == null) hi = p; }} ln = (lc <= UNTREEIFY_THRESHOLD)? Untreeify (lo) :// // New TreeBin<K,V>(lo) : Hn = (hc <= UNTREEIFY_THRESHOLD)? Untreeify (hi) : (lc!= 0)? New TreeBin<K,V>(hi) : t; setTabAt(nextTab, i, ln); setTabAt(nextTab, i + n, hn); setTabAt(tab, i, fwd); advance = true; } } } } } }Copy the code

There are two variables to know:

  • Advance: Indicates whether the next round element can be migrated.
  • Finishing: Whether all elements of the table are migrated.

Basically, the following things were done:

  • Stride determine the number of elements that the thread migrates in each turn, for example, enter a thread, determine to expand the table subscript elements between (a, B], the next thread expand (b, C]. This is also limited by 16 for b minus A or c minus b. That is, each thread expands at least 16 consecutive table elements. The subscript that marks the current transfer is stored in the transferIndex.
  • Check whether nextTab is initialized. If not, it is the first thread to initialize nextTab, and the size of the table is twice that of the previous thread.
  • Enter the while loop to find the range of table subscripts in (bound, I) for this migration. Note that the range is half-open and half-closed.
  • I -> bound traverses each element of the table from the largest to the smallest:
  1. If the element is empty, write ForwardingNode to that element and then check the next element. When another thread inserts data into the element and knows that the table is being migrated by another thread, helpTransfer is called in putVal to help the migration.
  2. If the element hash=MOVED, the table is in the process of being migrated. Skip it. There’s no reason to run around here.
  3. Otherwise, the element follows a linked list or a red-black tree, hash>0 indicates a linked list, and f instanceof TreeBin indicates a red-black tree.
  • List migration works as follows: Each node of the list is traversed. If f. Hash &n==0 of the node is true, the node is placed on I; otherwise, the node is placed on n+ I.

Before migration, the element is locked. When traversing the list, the lastRun variable is used here and the value of the last hash is retained. If f.hash ==0 for all nodes in the whole list, the following nodes will be considered to have the same value as long as the lastRun value is found in the second traversal, thus reducing the unnecessary F.hash and n values. After traversing all nodes, two linked lists are formed. Ln stores f.hash&n=0 and hn stores non-0. Ln is stored in the I element of nextTable and n+ I in n+ I.

The blue node represents: F.hash ==0, and the green node represents F.hash! = 0. The blue nodes are still in the range (0, n), and the green nodes are in the range (n, 2n-1).

  • Migrating a linked list works the same way as a red-black tree. In a red-black tree, we record the first of each red-black tree (which is not the smallest hash node) and the next of each node. Based on these two elements, we can access all elements of the red-black tree, which is also a linked list at this point. Red-black trees migrate in the same way as linked lists. The red-black tree is split into HN and LN according to the migration. According to the length of the linked list, whether the linked list is red-black tree structure or degenerate into a linked list is determined.

4. How to confirm that all table elements are migrated:

If (U.compareAndSwapInt(this, SIZECTL, sc = SIZECTL, sc-1)) {if (U.compareAndSwapInt(this, sc = SIZECTL, SC-1)) { SizeCtl = sizeCtl - 1. These two objects are also split if ((sc-2)! = resizeStamp(n) << RESIZE_STAMP_SHIFT) return; finishing = advance = true; i = n; // recheck before commit }Copy the code

SizeCtl = resizeStamp(n) << RESIZE_STAMP_SHIFT+2 for the first thread to start the migration, then each new thread to help the migration will be sizeCtl=sizeCtl+1, after the migration,sizeCtl-1, Therefore, as long as one thread is still in the migration state, then the sizeCtl> resizeStamp(n) << RESIZE_STAMP_SHIFT+2 will always be valid. When only the last thread completes the migration, both sides of the equation will be valid. SizeCtl =sizeCtl+1, and then when it’s done, it subtracts by 1, so it’s not equal. Notice here, with sizeCtl, it assigns values to SC before it subtracts by 1, so it’s sc.

conclusion

Table expansion is the process of migrating table elements to new tables. During the element migration, the process can be completed concurrently, speeding up the migration speed without blocking threads. After all elements are migrated, the old table is lost and the new table is used.