1. Introduction

Why use ConcurrentHashMap

For two reasons:

  1. Using HashMap in concurrent code can cause infinite loops (in JDK 1.7; in JDK 1.8 it can instead cause data loss)
  2. Hashtable is very inefficient, because it locks the whole table for every operation
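
A minimal sketch of the thread-safety difference in practice (the thread and key counts are arbitrary choices for the illustration): many threads can put into a ConcurrentHashMap concurrently without external locking, and no updates are lost.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

public class ConcurrentPutDemo {
    // 8 threads each insert 1000 distinct keys; returns the resulting size
    static int fill() {
        Map<Integer, Integer> map = new ConcurrentHashMap<>();
        int threads = 8, perThread = 1000;
        CountDownLatch done = new CountDownLatch(threads);
        for (int t = 0; t < threads; t++) {
            final int base = t * perThread;
            new Thread(() -> {
                for (int i = 0; i < perThread; i++)
                    map.put(base + i, i); // safe without external synchronization
                done.countDown();
            }).start();
        }
        try {
            done.await();
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println(fill()); // 8000: all distinct keys survive
    }
}
```

With a plain HashMap, the same experiment may lose entries or, in JDK 1.7, hang during a concurrent resize.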

2. ConcurrentHashMap structure

The ConcurrentHashMap structure has changed considerably between JDK 1.7 and 1.8, which will be covered later.

2.1 Structure in JDK 1.7

In JDK 1.7, ConcurrentHashMap consists of an array of Segments, each of which holds a HashEntry array. Segment-based locking is used for thread safety.

Segment extends ReentrantLock and acts as the lock in ConcurrentHashMap; HashEntry is used to store key-value pairs.

A ConcurrentHashMap contains an array of Segments, and each Segment contains a HashEntry array. The structure of a Segment is similar to that of a HashMap: an array of linked lists.

2.2 Structure in JDK 1.8

The JDK 1.8 implementation abandons the concept of segments and instead uses a Node array + linked list + red-black tree structure. Concurrency is controlled with synchronized and CAS, making it look like an optimized, thread-safe HashMap. You can still see the Segment class in JDK 1.8, but its attributes have been simplified and it exists only for compatibility with older versions.

3. The implementation

3.1 Implementation in JDK 1.7

3.1.1 initialization

ConcurrentHashMap initialization computes the Segment array size (ssize) with bit operations, derived from concurrencyLevel:

int sshift = 0;
int ssize = 1;
while (ssize < concurrencyLevel) {
    ++sshift;
    ssize <<= 1;
}

ssize is computed with a bit shift (ssize <<= 1), so the Segment array size is always a power of 2; the default size is 16.

The size of the HashEntry array under each Segment is likewise computed with bit operations and is represented by cap:

int cap = 1;
while (cap < c)
    cap <<= 1;

The HashEntry array size is also rounded up to a power of two (cap <<= 1); the minimum HashEntry array size is 2.
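
The two sizing loops above can be exercised directly. A sketch (the method names are mine, not the JDK's):

```java
public class SegmentSizing {
    // Smallest power of two >= concurrencyLevel, as in the JDK 1.7 constructor
    static int segmentSize(int concurrencyLevel) {
        int ssize = 1;
        while (ssize < concurrencyLevel)
            ssize <<= 1;
        return ssize;
    }

    // Smallest power of two >= c, the per-Segment HashEntry capacity
    static int entryCapacity(int c) {
        int cap = 1;
        while (cap < c)
            cap <<= 1;
        return cap;
    }

    public static void main(String[] args) {
        System.out.println(segmentSize(16)); // 16, the default
        System.out.println(segmentSize(17)); // 32: rounded up to the next power of two
        System.out.println(entryCapacity(5)); // 8
    }
}
```

Power-of-two sizes matter because they allow bucket indices to be computed with a cheap bitmask instead of a modulo.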

3.1.2 get operation

The Segment get operation is simple and efficient: the key is hashed once, the hash value locates the Segment, and the same hash then locates the element within it:

public V get(Object key) {
    int hash = hash(key.hashCode());
    return segmentFor(hash).get(key, hash);
}

get is efficient because the whole operation is lock-free; it locks and re-reads only when it reads a null value. This works because the shared variables it reads are declared volatile:

transient volatile int count;
volatile V value;

3.1.3 put operation

For data insertion, ConcurrentHashMap performs two hashes to locate where the data is stored. The Segment itself doubles as the lock:

static class Segment<K,V> extends ReentrantLock implements Serializable {
    // ... omitted
}

The put operation goes through two steps:

  1. Determine whether to expand the capacity
  2. Locate where the element was added and place it in the HashEntry array

The first hash locates the Segment; if that Segment has not been initialized, a CAS operation assigns it. A second hash then locates the corresponding HashEntry slot. When inserting at the given HashEntry position (by tail insertion), ReentrantLock's tryLock() attempts to acquire the lock; if it succeeds, the node is inserted directly at the corresponding position. If another thread already holds the Segment's lock, the current thread spins, calling tryLock() repeatedly; after a specified number of attempts it blocks and waits to be woken up.
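
The try-then-spin-then-block pattern described above can be sketched as follows; this is a simplified illustration with an arbitrary retry count, not the JDK's actual scanAndLockForPut:

```java
import java.util.concurrent.locks.ReentrantLock;

public class SpinThenBlockLock {
    static final int MAX_SCAN_RETRIES = 64; // arbitrary; the JDK derives it from the CPU count

    // Try the lock, spin a bounded number of times, then block until woken
    static void acquire(ReentrantLock lock) {
        int retries = 0;
        while (!lock.tryLock()) {
            if (++retries > MAX_SCAN_RETRIES) {
                lock.lock(); // give up spinning and park until the lock is free
                return;
            }
        }
    }

    public static void main(String[] args) {
        ReentrantLock segmentLock = new ReentrantLock();
        acquire(segmentLock);
        try {
            System.out.println(segmentLock.isHeldByCurrentThread()); // true
        } finally {
            segmentLock.unlock();
        }
    }
}
```

Spinning first avoids the cost of parking a thread when the lock is held only briefly, which is the common case for a short critical section like a bucket insert.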

3.1.4 size operation

The ConcurrentHashMap size is computed while other threads may be inserting data concurrently, so the computed size can differ from the actual size.

The solution ConcurrentHashMap adopts is to try twice to sum the counts of all Segments without locking. If any Segment's count changes during the process, it locks every Segment and sums again.

try {
    for (;;) {
        if (retries++ == RETRIES_BEFORE_LOCK) {
            for (int j = 0; j < segments.length; ++j)
                ensureSegment(j).lock(); // force creation
        }
        sum = 0L;
        size = 0;
        overflow = false;
        for (int j = 0; j < segments.length; ++j) {
            Segment<K,V> seg = segmentAt(segments, j);
            if (seg != null) {
                /* put, remove and clean all increment the Segment's
                 * modCount; the statistics rely on this variable */
                sum += seg.modCount;
                int c = seg.count;
                if (c < 0 || (size += c) < 0)
                    overflow = true;
            }
        }
        if (sum == last)
            break;
        last = sum;
    }
} finally {
    if (retries > RETRIES_BEFORE_LOCK) {
        for (int j = 0; j < segments.length; ++j)
            segmentAt(segments, j).unlock();
    }
}

3.2 Implementation in JDK 1.8

3.2.1 Basic properties and concepts

Take a look at the basic properties:

// Maximum size of the node array: 2^30 = 1073741824
private static final int MAXIMUM_CAPACITY = 1 << 30;
// Default initial capacity, must be a power of 2
private static final int DEFAULT_CAPACITY = 16;
// Largest possible array size, used by the toArray() related methods
static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;
// Concurrency level, legacy, kept for compatibility with previous versions
private static final int DEFAULT_CONCURRENCY_LEVEL = 16;
// Load factor
private static final float LOAD_FACTOR = 0.75f;
// Threshold for converting a list to a red-black tree: > 8 nodes treeifies
static final int TREEIFY_THRESHOLD = 8;
// Threshold for converting a red-black tree back to a list: <= 6 nodes untreeifies
static final int UNTREEIFY_THRESHOLD = 6;
// Minimum table capacity before a list may be treeified
static final int MIN_TREEIFY_CAPACITY = 64;
// Minimum number of buckets each resizing thread handles
private static final int MIN_TRANSFER_STRIDE = 16;
// Number of bits in sizeCtl used for the resize generation stamp
private static int RESIZE_STAMP_BITS = 16;
// 2^16 - 1, maximum number of threads that can help resize
private static final int MAX_RESIZERS = (1 << (32 - RESIZE_STAMP_BITS)) - 1;
// 32 - 16 = 16, the bit shift for the resize stamp recorded in sizeCtl
private static final int RESIZE_STAMP_SHIFT = 32 - RESIZE_STAMP_BITS;
// Hash value of a ForwardingNode
static final int MOVED = -1;
// Hash value of a TreeBin root node
static final int TREEBIN = -2;
// Hash value of a ReservationNode
static final int RESERVED = -3;
// Number of available processors
static final int NCPU = Runtime.getRuntime().availableProcessors();
// The array of buckets
transient volatile Node<K,V>[] table;
/* Negative: -1 means the table is being initialized, -N means N-1 threads
 * are expanding it. Zero: the table has not been initialized yet.
 * Positive: the initial capacity, or the next expansion threshold. */
private transient volatile int sizeCtl;

Important concepts:

  1. table: null by default, initialized lazily on the first insert; an array with a default size of 16 used to store Node data; its size is always a power of 2.
  2. nextTable: null by default; during expansion this array is twice the size of the original.
  3. Node: a data structure that stores keys, values, and the hash value of keys.
class Node<K,V> implements Map.Entry<K,V> {
    final int hash;
    final K key;
    volatile V val;
    volatile Node<K,V> next;
    // ... partially omitted
}

Value and next are both volatile to ensure concurrency visibility.

  4. ForwardingNode: a special Node with a hash value of -1 (MOVED) that stores a reference to nextTable.
final class ForwardingNode<K,V> extends Node<K,V> {
    final Node<K,V>[] nextTable;
    ForwardingNode(Node<K,V>[] tab) {
        super(MOVED, null, null, null);
        this.nextTable = tab;
    }
}

A ForwardingNode is used only while the table is being expanded. It is placed in the table as a placeholder to indicate that the current bucket is empty or has already been moved.

  5. TreeNode and TreeBin: TreeNode represents each node in a red-black tree. When the number of nodes in a bucket's list exceeds the threshold, the list becomes a red-black tree and each node becomes a TreeNode. Unlike HashMap, ConcurrentHashMap does not store TreeNodes directly in the bucket; it stores a TreeBin, which maintains the red-black tree internally, i.e. the TreeNodes live inside the TreeBin.

3.2.2 initialization

When ConcurrentHashMap is instantiated with an initial capacity argument, the table size is adjusted according to that argument. If the argument is 100, the size is eventually adjusted to 256, ensuring that the table size is always a power of 2.
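
The 100 → 256 adjustment can be verified with a sketch of the sizing math: the JDK 1.8 constructor feeds initialCapacity + initialCapacity/2 + 1 through a power-of-two rounding step (reproduced here in simplified form, without the overflow clamping of the real tableSizeFor):

```java
public class TableSizing {
    // Smallest power of two >= c (simplified form of the JDK's tableSizeFor)
    static int tableSizeFor(int c) {
        int n = c - 1;
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return n + 1;
    }

    // Mirrors the ConcurrentHashMap(int) constructor's sizing expression
    static int initialTableSize(int initialCapacity) {
        return tableSizeFor(initialCapacity + (initialCapacity >>> 1) + 1);
    }

    public static void main(String[] args) {
        System.out.println(initialTableSize(100)); // 256: 100 + 50 + 1 = 151, rounded up
    }
}
```

This explains why the result is 256 rather than 128: the requested capacity is first padded by half before rounding, so the default load factor does not immediately trigger a resize.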

The table is initialized

private final Node<K,V>[] initTable() {
    Node<K,V>[] tab; int sc;
    while ((tab = table) == null || tab.length == 0) {
        // sizeCtl < 0 means another thread won the CAS and is initializing;
        // the current thread only needs to yield its CPU slice
        if ((sc = sizeCtl) < 0)
            Thread.yield(); // lost initialization race; just spin
        else if (U.compareAndSwapInt(this, SIZECTL, sc, -1)) {
            try {
                if ((tab = table) == null || tab.length == 0) {
                    int n = (sc > 0) ? sc : DEFAULT_CAPACITY;
                    @SuppressWarnings("unchecked")
                    Node<K,V>[] nt = (Node<K,V>[]) new Node<?,?>[n];
                    table = tab = nt;
                    sc = n - (n >>> 2); // 0.75 * n, the next resize threshold
                }
            } finally {
                sizeCtl = sc;
            }
            break;
        }
    }
    return tab;
}

3.2.3 put operation

Assuming the table has already been initialized, put uses CAS + synchronized to implement concurrent insertions and updates:

final V putVal(K key, V value, boolean onlyIfAbsent) {
    if (key == null || value == null) throw new NullPointerException();
    int hash = spread(key.hashCode());
    int binCount = 0;
    for (Node<K,V>[] tab = table;;) {
        Node<K,V> f; int n, i, fh;
        if (tab == null || (n = tab.length) == 0)
            tab = initTable();
        else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {
            if (casTabAt(tab, i, null, new Node<K,V>(hash, key, value, null)))
                break;                   // no lock when adding to empty bin
        }
        else if ((fh = f.hash) == MOVED)
            tab = helpTransfer(tab, f);
        // ... omitted
    }
    addCount(1L, binCount);
    return null;
}

The hash algorithm

static final int spread(int h) {
    return (h ^ (h >>> 16)) & HASH_BITS;
}

With n the length of the table, the bucket index is computed as

int index = (n - 1) & hash

and the element f at that index is then read from the table.
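
A quick sketch of spread() followed by the mask step (HASH_BITS, 0x7fffffff in the JDK, drops the sign bit so the index is never negative):

```java
public class HashToIndex {
    static final int HASH_BITS = 0x7fffffff; // usable bits of a hash; drops the sign bit

    // Mixes the high 16 bits into the low 16 so small tables use more of the hash
    static int spread(int h) {
        return (h ^ (h >>> 16)) & HASH_BITS;
    }

    public static void main(String[] args) {
        int n = 16; // table size, always a power of two
        int h = spread("key".hashCode());
        int index = (n - 1) & h; // bitmask replaces the slower h % n
        System.out.println(index >= 0 && index < n); // true: always inside the table
    }
}
```

Because n is a power of two, (n - 1) is an all-ones mask, which is why the index computation is a single AND.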

Unsafe.getObjectVolatile reads directly from the specified memory location, ensuring the data read is always up to date.

If f is null, the element is inserted at that position in the table; the first attempt uses the Unsafe.compareAndSwapObject method to insert the Node.

If the CAS succeeds, the Node has been inserted, and the addCount(1L, binCount) method then checks whether the current capacity needs to be expanded.

If the CAS fails, another thread inserted a node there first, and the loop spins and retries inserting at that position.

If f's hash value is -1 (MOVED), f is a ForwardingNode, meaning other threads are expanding the table; the current thread joins in and helps with the expansion.

In the remaining cases, the new node is inserted at the appropriate position in a linked list or red-black tree. This part uses the synchronized built-in lock for concurrency control. The code is as follows:

synchronized (f) {
    if (tabAt(tab, i) == f) {
        if (fh >= 0) {
            binCount = 1;
            for (Node<K,V> e = f;; ++binCount) {
                K ek;
                if (e.hash == hash &&
                    ((ek = e.key) == key || (ek != null && key.equals(ek)))) {
                    oldVal = e.val;
                    if (!onlyIfAbsent)
                        e.val = value;
                    break;
                }
                Node<K,V> pred = e;
                if ((e = e.next) == null) {
                    pred.next = new Node<K,V>(hash, key, value, null);
                    break;
                }
            }
        }
        else if (f instanceof TreeBin) {
            Node<K,V> p;
            binCount = 2;
            if ((p = ((TreeBin<K,V>)f).putTreeVal(hash, key, value)) != null) {
                oldVal = p.val;
                if (!onlyIfAbsent)
                    p.val = value;
            }
        }
    }
}

Node f is locked with synchronized, and before inserting, tabAt(tab, i) == f is checked again to guard against modification by other threads.

If f.hash >= 0, f is the head of a linked list: the list is traversed, and if a node with the same key is found its value is updated; otherwise a new node is appended to the tail of the list. If f is a TreeBin node, f holds the root of a red-black tree: the tree is traversed and the node is updated or added. If the number of nodes in a bucket reaches binCount >= TREEIFY_THRESHOLD (8 by default), the list is converted into a red-black tree.

Table expansion

When the table capacity is insufficient (that is, the number of table elements reaches the capacity threshold sizeCtl), the table needs to be expanded.

The entire expansion is divided into two parts:

  1. Build a nextTable that is twice the size of table.
  2. Copy the data from table into nextTable.

For a single thread these two steps are simple to implement, but ConcurrentHashMap supports concurrent insertion, so expansion can also happen concurrently. The second step is designed to let multiple threads copy nodes in parallel, which naturally improves performance, but also raises the implementation complexity a notch.

Let’s take the first step, building nextTable. This process, of course, can only be initialized by a single thread, as follows:

private final void addCount(long x, int check) {
    // ... omitted
    if (check >= 0) {
        Node<K,V>[] tab, nt; int n, sc;
        while (s >= (long)(sc = sizeCtl) && (tab = table) != null &&
               (n = tab.length) < MAXIMUM_CAPACITY) {
            int rs = resizeStamp(n);
            if (sc < 0) {
                if ((sc >>> RESIZE_STAMP_SHIFT) != rs || sc == rs + 1 ||
                    sc == rs + MAX_RESIZERS || (nt = nextTable) == null ||
                    transferIndex <= 0)
                    break;
                if (U.compareAndSwapInt(this, SIZECTL, sc, sc + 1))
                    transfer(tab, nt);
            }
            else if (U.compareAndSwapInt(this, SIZECTL, sc,
                                         (rs << RESIZE_STAMP_SHIFT) + 2))
                transfer(tab, null);
            s = sumCount();
        }
    }
}

Unsafe.compareAndSwapInt changes the sizeCtl value to ensure that only one thread initializes nextTable. After expansion the array length is doubled, and the new threshold (sizeCtl) becomes 1.5 times the old capacity (0.75 × 2n).

The idea of moving a node from a table to a nextTable is to iterate and replicate.

First the number of buckets to traverse (i) is obtained, then a ForwardingNode instance (fwd) is initialized, and the tabAt method is used to fetch the element f at position i.

If f == null, fwd is placed at position i of the table via Unsafe.compareAndSwapObject, which handles the concurrent migration of nodes.

If f is the head of a linked list, two lists (their order may be partly reversed) are constructed and placed at positions i and i + n of nextTable, and fwd is written back to the old table with Unsafe.putObjectVolatile. If f is a TreeBin node, the tree is split the same way, checking whether untreeify is needed, and the two halves are placed at positions i and i + n of nextTable; fwd is again written to the original position with Unsafe.putObjectVolatile. Once all buckets have been processed, table is pointed at nextTable and sizeCtl is updated to 0.75 times the new array size.
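
The i / i + n split follows from the power-of-two sizing: when the table doubles from n to 2n, the only new bit in the index mask is bit n of the hash, so each node either stays at i or moves to i + n. A sketch (the helper names are mine):

```java
public class TransferSplit {
    // Bucket index in a table of (power-of-two) size n
    static int oldIndex(int hash, int n) {
        return hash & (n - 1);
    }

    // After doubling to 2n, only bit n of the hash is newly significant:
    // it decides whether a node stays at i (low list) or moves to i + n (high list)
    static int newIndex(int hash, int n) {
        int i = oldIndex(hash, n);
        return (hash & n) == 0 ? i : i + n;
    }

    public static void main(String[] args) {
        int n = 16;        // old table size
        int h1 = 0b00101;  // 5: bit 16 clear
        int h2 = 0b10101;  // 21: bit 16 set
        System.out.println(oldIndex(h1, n) == oldIndex(h2, n)); // true: same old bucket
        System.out.println(newIndex(h1, n)); // 5  -> stays in the low list
        System.out.println(newIndex(h2, n)); // 21 -> moves to i + n
    }
}
```

This is why no rehashing is needed during transfer: one bit test partitions every bucket into exactly two destination buckets.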

Red-black tree structure

Note: If the number of elements in the list exceeds the TREEIFY_THRESHOLD, which is 8 by default, the list is converted into a red-black tree to improve traversal query efficiency.

if (binCount != 0) {
    if (binCount >= TREEIFY_THRESHOLD)
        treeifyBin(tab, i);
    if (oldVal != null)
        return oldVal;
    break;
}

Next let’s look at how to construct the tree structure. The code looks like this:

private final void treeifyBin(Node<K,V>[] tab, int index) {
    Node<K,V> b; int n, sc;
    if (tab != null) {
        // If the table is still small, grow it instead of treeifying
        if ((n = tab.length) < MIN_TREEIFY_CAPACITY)
            tryPresize(n << 1);
        else if ((b = tabAt(tab, index)) != null && b.hash >= 0) {
            synchronized (b) {
                if (tabAt(tab, index) == b) {
                    TreeNode<K,V> hd = null, tl = null;
                    for (Node<K,V> e = b; e != null; e = e.next) {
                        TreeNode<K,V> p =
                            new TreeNode<K,V>(e.hash, e.key, e.val,
                                              null, null);
                        if ((p.prev = tl) == null)
                            hd = p;
                        else
                            tl.next = p;
                        tl = p;
                    }
                    setTabAt(tab, index, new TreeBin<K,V>(hd));
                }
            }
        }
    }
}

As you can see, the code block of the spanning tree node is synchronized. After entering the synchronized code block, verify whether the index element in the table has been modified.

  1. Generate a linked list of Treenodes with hd as the head Node according to the Node linked list at index position in table.
  2. Based on the head node hd, a TreeBin tree structure is generated, and the root of the tree is written into the table at the index position. The implementation is as follows:
TreeBin(TreeNode<K,V> b) {
    super(TREEBIN, null, null, null);
    this.first = b;
    TreeNode<K,V> r = null;
    for (TreeNode<K,V> x = b, next; x != null; x = next) {
        next = (TreeNode<K,V>) x.next;
        x.left = x.right = null;
        if (r == null) {
            x.parent = null;
            x.red = false;
            r = x;
        }
        else {
            K k = x.key;
            int h = x.hash;
            Class<?> kc = null;
            for (TreeNode<K,V> p = r;;) {
                int dir, ph;
                K pk = p.key;
                if ((ph = p.hash) > h)
                    dir = -1;
                else if (ph < h)
                    dir = 1;
                else if ((kc == null &&
                          (kc = comparableClassFor(k)) == null) ||
                         (dir = compareComparables(kc, k, pk)) == 0)
                    dir = tieBreakOrder(k, pk);
                TreeNode<K,V> xp = p;
                if ((p = (dir <= 0) ? p.left : p.right) == null) {
                    x.parent = xp;
                    if (dir <= 0)
                        xp.left = x;
                    else
                        xp.right = x;
                    r = balanceInsertion(r, x);
                    break;
                }
            }
        }
    }
    this.root = r;
    assert checkInvariants(root);
}

A red-black tree (a balanced binary search tree) is built based on the hash values of the Nodes.

3.2.4 get operation

Get is much simpler than put.

public V get(Object key) {
    Node<K,V>[] tab;
    Node<K,V> e, p;
    int n, eh;
    K ek;
    int h = spread(key.hashCode());
    if ((tab = table) != null && (n = tab.length) > 0 &&
        (e = tabAt(tab, (n - 1) & h)) != null) {
        if ((eh = e.hash) == h) {
            if ((ek = e.key) == key || (ek != null && key.equals(ek)))
                return e.val;
        }
        else if (eh < 0) // ForwardingNode or TreeBin: delegate to find()
            return (p = e.find(h, key)) != null ? p.val : null;
        while ((e = e.next) != null) {
            if (e.hash == h &&
                ((ek = e.key) == key || (ek != null && key.equals(ek))))
                return e.val;
        }
    }
    return null;
}
  1. Check whether the table is null. If so, return null.
  2. Compute the hash value of the key, obtain the Node in the specified position in the specified table, find the corresponding Node by traversing the linked list or tree structure, and return the value.
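
A minimal usage sketch of the read path:

```java
import java.util.concurrent.ConcurrentHashMap;

public class GetDemo {
    public static void main(String[] args) {
        ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();
        map.put("a", 1);
        System.out.println(map.get("a"));       // 1
        System.out.println(map.get("missing")); // null: absent keys return null
        // Unlike HashMap, null keys and values are rejected,
        // so a null return unambiguously means "absent"
    }
}
```

The rejection of null values is what makes the lock-free read path safe: get never has to distinguish "mapped to null" from "not yet visible".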

3.2.5 size operation

In JDK 1.8 the size is maintained via CAS on baseCount and a counterCell array; the final size is obtained from baseCount plus a traversal of the counterCell array. For details see: Analysis of the ConcurrentHashMap size method.
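
The baseCount + counterCell scheme is essentially what java.util.concurrent.atomic.LongAdder does: updates try a CAS on a base counter and fall back to a striped cell under contention, and size() sums everything. A simplified sketch (not the JDK's internal code; the cell count and indexing here are arbitrary):

```java
import java.util.concurrent.atomic.AtomicLong;

public class StripedCounter {
    private final AtomicLong baseCount = new AtomicLong();
    private final AtomicLong[] cells = new AtomicLong[4]; // cell count is arbitrary here

    StripedCounter() {
        for (int i = 0; i < cells.length; i++)
            cells[i] = new AtomicLong();
    }

    void add(long x) {
        // Fast path: CAS on baseCount; on contention, fall back to a striped cell
        long b = baseCount.get();
        if (!baseCount.compareAndSet(b, b + x)) {
            int idx = (int) (Thread.currentThread().getId() & (cells.length - 1));
            cells[idx].addAndGet(x);
        }
    }

    long sum() {
        long s = baseCount.get();
        for (AtomicLong c : cells)
            s += c.get(); // size = baseCount + all cells
        return s;
    }

    public static void main(String[] args) {
        StripedCounter c = new StripedCounter();
        for (int i = 0; i < 100; i++)
            c.add(1);
        System.out.println(c.sum()); // 100
    }
}
```

Striping spreads contention across several counters, at the cost of sum() being a momentary snapshot rather than an exact figure under concurrent updates.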

4. Why did JDK 1.8 drop segment locking

Many people wonder why Doug Lea made such a big change in JDK 1.8, and why using synchronized achieves better performance. The reasons are as follows:

  1. The lock granularity in JDK 1.8 is much finer. In JDK 1.7 the concurrency level of ConcurrentHashMap is basically fixed at construction time; in JDK 1.8 the concurrency level equals the array size, so every expansion doubles the available concurrency.
  2. The introduction of red-black trees and the linked-list optimizations make put and get more efficient under hash collisions.
  3. JVM support: ReentrantLock is, after all, API level, leaving little room for further performance optimization, whereas synchronized is supported directly by the JVM and can be optimized at runtime: lock coarsening, lock elimination, spin locks, and so on. This lets synchronized gain performance with each JDK update without any code changes.

5. Summary & References

Summary

ConcurrentHashMap’s data structure is similar to HashMap’s; the difference is that ConcurrentHashMap adds synchronization to control concurrency. From ReentrantLock + Segment + HashEntry in JDK 1.7 to synchronized + CAS + Node + red-black tree in JDK 1.8, the optimization is considerable.

References

  • The infinite loop problem caused by multi-threaded expansion in HashMap 1.8
  • ConcurrentHashMap underlying principles
  • ConcurrentHashMap in JDK 1.8
  • Analysis of the ConcurrentHashMap size method
  • The Art of Java Concurrent Programming
  • Why ConcurrentHashMap performance in JDK 1.8 is better than in JDK 1.7