
Introduction

ConcurrentHashMap is a thread-safe version of HashMap and internally uses an array + linked list + red-black tree structure to store elements. Compared with the equally thread-safe Hashtable, it is greatly improved in efficiency and other respects.

Before reading this article, if you are not familiar with HashMap, you can read this article: An in-depth look at the HashMap source code

Before that, here is a brief introduction to the various kinds of locks, so the concepts are familiar when they come up later.

Locks

synchronized

synchronized is a Java keyword. Internally it is implemented with monitor locks, and the lock state is recorded mainly in the object header.

synchronized has been heavily optimized in newer JDK versions, and at runtime it exists in three forms: biased locking, lightweight locking, and heavyweight locking.

Biased lock: when a piece of synchronized code is repeatedly accessed by the same thread, that thread automatically acquires the lock, reducing the cost of acquiring it.

Lightweight lock: when a biased lock is accessed by another thread, the biased lock is upgraded to a lightweight lock. The other thread spins to try to acquire the lock instead of blocking, which improves performance.

Heavyweight lock: when a thread spinning on a lightweight lock fails to acquire it after a certain number of attempts, the lock is upgraded to a heavyweight lock, which blocks other threads and reduces performance.
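
For reference, a minimal (hypothetical) example of using synchronized; which of the three lock states the JVM actually applies at runtime is transparent to the code:

public class SyncCounter {
    private int count;

    // Locks on the SyncCounter instance (this); the JVM decides whether a biased,
    // lightweight, or heavyweight lock is used under the hood.
    public synchronized void increment() {
        count++;
    }

    public synchronized int get() {
        return count;
    }
}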

CAS

CAS (Compare And Swap) is a form of optimistic locking. It assumes that concurrent operations on the same data will usually not conflict, so when updating it simply attempts the update and, if the attempt fails because the data has changed, keeps retrying.
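
A minimal sketch of the CAS retry pattern using AtomicInteger (a made-up counter, not code from ConcurrentHashMap):

import java.util.concurrent.atomic.AtomicInteger;

public class CasCounter {
    private final AtomicInteger value = new AtomicInteger();

    public int increment() {
        for (;;) {
            int oldValue = value.get();
            int newValue = oldValue + 1;
            // compareAndSet succeeds only if the value is still oldValue;
            // otherwise another thread changed it and we retry.
            if (value.compareAndSet(oldValue, newValue)) {
                return newValue;
            }
        }
    }
}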

volatile (not a lock)

In Java, when multiple threads access the same volatile variable and one thread modifies its value, the other threads immediately see the modified value.
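
A common illustration (a hypothetical stop flag, not code from the JDK) is shown below; note that volatile guarantees visibility of the write, but not atomicity of compound operations such as count++:

public class Worker implements Runnable {
    // A write to this flag by one thread is immediately visible to the others.
    private volatile boolean running = true;

    public void stop() {
        running = false;
    }

    @Override
    public void run() {
        while (running) {
            // do work
        }
    }
}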

Spin lock

With a spin lock, a thread trying to acquire the lock does not block, but keeps retrying in a loop. The advantage is that it avoids the context switches of blocking and waking threads, which improves performance; the disadvantage is that the spinning loop consumes CPU.
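
A minimal spin-lock sketch built on AtomicBoolean (illustrative only):

import java.util.concurrent.atomic.AtomicBoolean;

public class SpinLock {
    private final AtomicBoolean locked = new AtomicBoolean(false);

    public void lock() {
        // Busy-wait (spin) until the CAS succeeds; the thread never blocks,
        // but it burns CPU while it waits.
        while (!locked.compareAndSet(false, true)) {
            // spin
        }
    }

    public void unlock() {
        locked.set(false);
    }
}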

Segmented lock

Segment-based lock is a kind of lock design idea that reduces the granularity of locks. It is mainly used in ConcurrentHashMap to achieve efficient concurrent operations. When the operation does not need to update the entire array, only one item in the array is locked.
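
The idea can be sketched with a striped counter that locks only the slot being updated (a toy example, not how ConcurrentHashMap itself is implemented):

public class StripedCounter {
    private final long[] counts = new long[16];
    private final Object[] locks = new Object[16];

    public StripedCounter() {
        for (int i = 0; i < locks.length; i++) {
            locks[i] = new Object();
        }
    }

    public void add(int key, long delta) {
        int index = (key & 0x7fffffff) % counts.length;
        // Lock only the slot being updated, so threads touching
        // different slots do not block each other.
        synchronized (locks[index]) {
            counts[index] += delta;
        }
    }
}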

ReentrantLock

A reentrant lock means that a thread which already holds the lock can acquire it again without blocking. This helps avoid the thread deadlocking on itself. synchronized is also a reentrant lock.
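
A small (hypothetical) example of reentrancy with ReentrantLock:

import java.util.concurrent.locks.ReentrantLock;

public class ReentrantExample {
    private final ReentrantLock lock = new ReentrantLock();

    public void outer() {
        lock.lock();
        try {
            inner(); // the same thread acquires the lock again without deadlocking
        } finally {
            lock.unlock();
        }
    }

    public void inner() {
        lock.lock();
        try {
            // critical section
        } finally {
            lock.unlock();
        }
    }
}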

Main content

In Java, HashMap is not thread safe. What are the solutions if you want to safely manipulate a map in multiple threads?

  • The first is to use the thread-safe Hashtable class;
  • The second is to use the Collections.synchronizedMap method, which wraps the map with a synchronization lock;
  • The third is to use the ConcurrentHashMap class from the java.util.concurrent package (a quick sketch of the three options follows below).
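
As a minimal sketch, the three options look like this (the key and value types here are arbitrary placeholders):

import java.util.Collections;
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class SafeMaps {
    // Option 1: Hashtable, where almost every method is synchronized
    Map<String, String> hashtable = new Hashtable<>();

    // Option 2: Collections.synchronizedMap, which wraps a HashMap with a single mutex
    Map<String, String> synchronizedMap = Collections.synchronizedMap(new HashMap<>());

    // Option 3: ConcurrentHashMap, which uses fine-grained locking internally
    Map<String, String> concurrentMap = new ConcurrentHashMap<>();
}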

Hashtable is a thread-safe class that uses synchronized locks for almost all add, delete, and query methods.

As long as one thread is accessing or manipulating the object, all other threads can only block and wait for the lock to be released. Under heavy multi-threaded contention, performance is very poor, so Hashtable is not recommended.

Collections.synchronizedMap uses an internal mutex to make the wrapped map safe to use in multi-threaded scenarios, but it is essentially still a full-table lock on the HashMap!

Using the Collections.synchronizedMap method still performs very poorly under multi-threaded contention, so it is not recommended either!

The ConcurrentHashMap class adopts the idea of segmented locking. It splits the HashMap into several smaller arrays, each consisting of n HashEntry objects. Each of these smaller structures is called a Segment and inherits from ReentrantLock.

Of course, JDK1.7 and JDK1.8 implement ConcurrentHashMap quite differently!

In JDK1.8, when the length of a bucket's linked list in a ConcurrentHashMap exceeds 8, it is converted to a red-black tree structure.

ConcurrentHashMap in JDK1.8

The ConcurrentHashMap class in JDK1.8 drops the Segment lock and instead uses CAS + synchronized to ensure concurrency safety. The data structure is similar to that of HashMap in JDK1.8: both use an array + linked list (and when a list grows longer than 8, it is converted into a red-black tree).

synchronized in ConcurrentHashMap only locks the first node of the current linked list or red-black tree, so as long as hashes do not collide on the same bucket there is no contention, which makes it many times more efficient than the JDK1.7 implementation.

With that said, let’s take a look at the source implementation of ConcurrentHashMap.

ConcurrentHashMap in JDK1.7

What is Segment?

The Segment itself is equivalent to a HashMap object.

Like a HashMap, a Segment contains an array of HashEntry objects, each of which is both a key-value pair and the head node of a linked list.

The single Segment structure is as follows:

How many Segment objects like this are in a ConcurrentHashMap? There are 2^N of them, kept in an array named segments. Therefore, the structure of the ConcurrentHashMap is as follows:

ConcurrentHashMap is, so to speak, a two-level hash table: under the top-level hash table there are several sub hash tables.

This two-level structure is similar to horizontal sharding in a database.

Let’s look at the ConcurrentHashMap concurrent read and write cases.

Case1: Concurrent writes to different segments (can be executed concurrently)

Case2: write and read of the same Segment (can be executed concurrently)

Case3: concurrent writes to the same Segment

Writes to segments need to be locked, so concurrent writes to the same Segment are blocked.

This shows that each Segment in ConcurrentHashMap holds a lock. It reduces the granularity of locks while ensuring thread safety, making concurrent operations more efficient.

Next, let’s look at reading and writing ConcurrentHashMap.

The get method

  1. Hash the input key to obtain the hash value.

  2. Locate the Segment object using the hash value.

  3. Locate the bucket in the Segment's internal array using the hash value (see the addressing sketch after this list).
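
A hypothetical illustration of this two-level addressing (the shift and mask values mirror the JDK1.7 defaults, but the code is only a sketch, not the real source):

public class TwoLevelIndex {
    static final int SEGMENT_SHIFT = 28;  // use the top 4 bits of the hash
    static final int SEGMENT_MASK = 15;   // 16 segments by default

    // Pick the Segment from the high bits of the hash.
    static int segmentIndex(int hash) {
        return (hash >>> SEGMENT_SHIFT) & SEGMENT_MASK;
    }

    // Pick the bucket inside the Segment's HashEntry array.
    static int bucketIndex(int hash, int tableLength) {
        return hash & (tableLength - 1);
    }

    public static void main(String[] args) {
        int hash = "someKey".hashCode();
        System.out.println("segment = " + segmentIndex(hash)
                + ", bucket = " + bucketIndex(hash, 16));
    }
}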

Put method

  1. Hash the input key to obtain the hash value.

  2. Locate the Segment object using the hash value.

  3. Acquire the Segment's reentrant lock.

  4. Locate the bucket in the Segment's internal array using the hash value.

  5. Insert or overwrite the HashEntry object.

  6. Release the lock (a simplified sketch of this pattern follows after this list).
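
A much-simplified sketch of that lock-then-write pattern (MiniSegment is a made-up stand-in; the real JDK1.7 Segment keeps its own volatile HashEntry[] table rather than a HashMap):

import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

// Stand-in for a JDK1.7 Segment: it extends ReentrantLock and guards its own small table.
class MiniSegment<K, V> extends ReentrantLock {
    private final Map<K, V> table = new HashMap<>();

    V put(K key, V value) {
        lock();          // step 3: acquire the reentrant lock
        try {
            // steps 4-5: locate the bucket and insert or overwrite
            return table.put(key, value);
        } finally {
            unlock();    // step 6: release the lock
        }
    }
}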

ConcurrentHashMap in JDK1.8

While ConcurrentHashMap in JDK1.7 addresses the concurrency safety of HashMap, queries still have to traverse the collision list, which becomes slow when those lists grow long!

In JDK1.8, HashMap introduces the red-black tree design: when the length of a collision list exceeds 8, the list is transformed into a red-black tree structure. A red-black tree is a self-balancing binary search tree, which greatly improves query efficiency.

Compared with ConcurrentHashMap in JDK1.7, ConcurrentHashMap in JDK1.8 abandons the original Segment locking implementation and adopts CAS + synchronized to ensure the security of concurrency.

ConcurrentHashMap in JDK1.8 uses volatile for shared variables in the Node class, as in JDK1.7.

static class Node<K,V> implements Map.Entry<K,V> {
    final int hash;
    final K key;
    volatile V val;
    volatile Node<K,V> next;

    Node(int hash, K key, V val, Node<K,V> next) {
        this.hash = hash;
        this.key = key;
        this.val = val;
        this.next = next;
    }
    // ...
}


Put method

public V put(K key, V value) {
    return putVal(key, value, false);
}

final V putVal(K key, V value, boolean onlyIfAbsent) {
    // Neither key nor value can be null
    if (key == null || value == null) throw new NullPointerException();
    // Compute the hash value
    int hash = spread(key.hashCode());
    int binCount = 0;
    for (Node<K,V>[] tab = table;;) {
        Node<K,V> f; int n, i, fh;
        if (tab == null || (n = tab.length) == 0)
            // If tab is uninitialized or its length is 0, initialize the node array
            tab = initTable();
        else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {
            // The bucket is empty, so try to insert the new node with CAS
            // If the CAS fails (another thread got there first), enter the next loop and start again
            // If the CAS succeeds, break out of the loop and the process ends
            if (casTabAt(tab, i, null,
                         new Node<K,V>(hash, key, value, null)))
                break;                   // no lock when adding to empty bin
        }
        else if ((fh = f.hash) == MOVED)
            // If the hash of the first element in the bucket is MOVED, a resize is in progress,
            // so the current thread helps migrate the elements
            tab = helpTransfer(tab, f);
        else {
            // The bucket is not empty and is not being migrated, so lock its first node (segment-like lock)
            // and look for the element to be inserted in this bucket
            // If it exists, replace the value (onlyIfAbsent=false)
            // If not, insert at the end of the list or into the tree
            V oldVal = null;
            synchronized (f) {
                // Check again that the first element has not changed; if it has, enter the next loop and start over
                if (tabAt(tab, i) == f) {
                    // If the hash value of the first element is greater than or equal to 0 (not migrating, not a tree),
                    // the elements in this bucket are stored as a linked list
                    if (fh >= 0) {
                        // The element count for this bucket starts at 1
                        binCount = 1;
                        // Traverse the list, incrementing binCount by 1 on each step
                        for (Node<K,V> e = f;; ++binCount) {
                            K ek;
                            if (e.hash == hash &&
                                ((ek = e.key) == key ||
                                 (ek != null && key.equals(ek)))) {
                                // Found the element: assign the new value (when onlyIfAbsent=false)
                                // and exit the loop
                                oldVal = e.val;
                                if (!onlyIfAbsent)
                                    e.val = value;
                                break;
                            }
                            Node<K,V> pred = e;
                            if ((e = e.next) == null) {
                                // Reached the end of the list without finding the element,
                                // so insert it at the end and exit the loop
                                pred.next = new Node<K,V>(hash, key,
                                                          value, null);
                                break;
                            }
                        }
                    }
                    else if (f instanceof TreeBin) {
                        // The first element is a tree node
                        Node<K,V> p;
                        // Set binCount to 2
                        binCount = 2;
                        // Call the red-black tree insert method to insert the element
                        // Returns null on successful insertion
                        // Otherwise returns the node that was found
                        if ((p = ((TreeBin<K,V>)f).putTreeVal(hash, key,
                                                              value)) != null) {
                            // Found the element: assign the new value (when onlyIfAbsent=false)
                            oldVal = p.val;
                            if (!onlyIfAbsent)
                                p.val = value;
                        }
                    }
                }
            }
            // If binCount is not 0, the element was successfully inserted or found
            if (binCount != 0) {
                // If the number of list elements reaches 8, try to treeify the bucket
                // binCount is only 2 when the element is inserted into a tree, not the number of elements in the whole tree,
                // so treeification is not attempted repeatedly
                if (binCount >= TREEIFY_THRESHOLD)
                    treeifyBin(tab, i);
                // If the element to be inserted already existed, return its old value
                if (oldVal != null)
                    return oldVal;
                // Exit the outer loop, the process ends
                break;
            }
        }
    }
    // The element was inserted successfully, so increase the element count by 1
    addCount(1L, binCount);
    // A successful insert returns null
    return null;
}

When performing the put operation, the process is as follows:

  • First, check whether the key and value are null; if either is null, throw an exception;
  • Then check whether the container array is empty, and initialize it if it is;
  • Next, check whether the bucket that the element f maps to in the current array is empty (a first insert at that index); if so, insert the new node with CAS;
  • Then check whether f.hash == -1; if so, the current f is a ForwardingNode, which means another thread is expanding the capacity;
  • In the other cases, the new Node is inserted into the appropriate position of the linked list or red-black tree;
  • After the node is inserted, check whether the length of the linked list exceeds 8; if it does, the list is converted into a red-black tree structure;
  • Finally, after the insertion is complete, the resize check is performed.

InitTable initializes an array

private final Node<K,V>[] initTable() {
    Node<K,V>[] tab; int sc;
    while ((tab = table) == null || tab.length == 0) {
        if ((sc = sizeCtl) < 0)
            // If sizeCtl < 0, another thread is initializing or resizing, so yield the CPU
            Thread.yield(); // lost initialization race; just spin
        else if (U.compareAndSwapInt(this, SIZECTL, sc, -1)) {
            // If sizeCtl is atomically updated to -1, the current thread performs the initialization
            // If the atomic update fails, another thread started initializing first, so enter the next loop
            // If initialization is not finished by the next loop, sizeCtl < 0 and the branch above yields the CPU
            // If it is finished, table.length != 0 and the loop exits
            try {
                // Check again whether the table is empty to prevent ABA problems
                if ((tab = table) == null || tab.length == 0) {
                    // If sc is 0, use the default capacity 16
                    int n = (sc > 0) ? sc : DEFAULT_CAPACITY;
                    // Create the array
                    @SuppressWarnings("unchecked")
                    Node<K,V>[] nt = (Node<K,V>[])new Node<?,?>[n];
                    // Assign the new array to table
                    table = tab = nt;
                    // Set sc to 0.75 times the array length
                    // n - (n >>> 2) = n - n/4 = 0.75n
                    // The load factor and the resize threshold are hard-coded here,
                    // which is why there are no threshold and loadFactor fields
                    sc = n - (n >>> 2);
                }
            } finally {
                // Assign sc to sizeCtl, which now stores the resize threshold
                sizeCtl = sc;
            }
            break;
        }
    }
    return tab;
}
  • The CAS lock ensures that only one thread initializes the tab array;

  • After initialization, sizeCtl stores the resize threshold;

  • The threshold is 0.75 times the length of the tab array, i.e. how many elements the map can hold before it needs to resize.
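
For instance, with the default capacity of 16 the hard-coded computation yields a threshold of 12:

public class ThresholdDemo {
    public static void main(String[] args) {
        int n = 16;                      // DEFAULT_CAPACITY
        int threshold = n - (n >>> 2);   // 16 - 16/4 = 12, i.e. 0.75 * 16
        System.out.println(threshold);   // prints 12
    }
}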

HelpTransfer assists in capacity expansion

final Node<K,V>[] helpTransfer(Node<K,V>[] tab, Node<K,V> f) {
    Node<K,V>[] nextTab; int sc;
    // If the tab array is not empty, the first element of the current bucket is a ForwardingNode,
    // and its nextTable is not empty, then this bucket has already been migrated
    // (during migration the first element of an old bucket is set to a ForwardingNode
    //  whose nextTable points to the new array)
    if (tab != null && (f instanceof ForwardingNode) &&
        (nextTab = ((ForwardingNode<K,V>)f).nextTable) != null) {
        int rs = resizeStamp(tab.length);
        // sizeCtl < 0 means a resize is in progress
        while (nextTab == nextTable && table == tab &&
               (sc = sizeCtl) < 0) {
            if ((sc >>> RESIZE_STAMP_SHIFT) != rs || sc == rs + 1 ||
                sc == rs + MAX_RESIZERS || transferIndex <= 0)
                break;
            // Increase the number of helping threads by 1
            if (U.compareAndSwapInt(this, SIZECTL, sc, sc + 1)) {
                // The current thread helps migrate elements
                transfer(tab, nextTab);
                break;
            }
        }
        return nextTab;
    }
    return table;
}

The operation steps are as follows:

  • Step 1, check the tab array, the node f, and the node's nextTable;
  • Step 2, compute a resize stamp based on the length of the array;
  • Step 3, further verify that nextTab, tab, and sizeCtl have not been modified concurrently and that sizeCtl < 0, which means the resize is still in progress;
  • Step 4, check the sizeCtl exit conditions; if none of them is met, CAS sizeCtl + 1 to register the current thread as one more helper and join the migration.

AddCount Determines capacity expansion

private final void addCount(long x, int check) {
    CounterCell[] as; long b, s;
    // First try to add the delta to baseCount; if that fails, add it to a segmented CounterCell
    if ((as = counterCells) != null ||
        !U.compareAndSwapLong(this, BASECOUNT, b = baseCount, s = b + x)) {
        CounterCell a; long v; int m;
        boolean uncontended = true;
        if (as == null || (m = as.length - 1) < 0 ||
            (a = as[ThreadLocalRandom.getProbe() & m]) == null ||
            !(uncontended =
              U.compareAndSwapLong(a, CELLVALUE, v = a.value, v + x))) {
            fullAddCount(x, uncontended);
            return;
        }
        if (check <= 1)
            return;
        // Count the elements
        s = sumCount();
    }
    // Check whether a resize is needed; putVal passes the bin count as check
    if (check >= 0) {
        Node<K,V>[] tab, nt; int n, sc;
        // If the number of elements reaches the threshold, resize
        // Note that sizeCtl normally stores the resize threshold, i.e. 0.75 times the capacity
        while (s >= (long)(sc = sizeCtl) && (tab = table) != null &&
               (n = tab.length) < MAXIMUM_CAPACITY) {
            // rs is an identification stamp for this resize
            int rs = resizeStamp(n);
            if (sc < 0) {
                // sc < 0 means a resize is already in progress
                if ((sc >>> RESIZE_STAMP_SHIFT) != rs || sc == rs + 1 ||
                    sc == rs + MAX_RESIZERS || (nt = nextTable) == null ||
                    transferIndex <= 0)
                    // The resize has finished, exit the loop
                    break;
                // The resize is not finished, so the current thread joins the migration
                // and the helper thread count is increased by 1
                if (U.compareAndSwapInt(this, SIZECTL, sc, sc + 1))
                    transfer(tab, nt);
            }
            else if (U.compareAndSwapInt(this, SIZECTL, sc,
                                         (rs << RESIZE_STAMP_SHIFT) + 2))
                // Start the migration
                transfer(tab, null);
            // Recompute the element count
            s = sumCount();
        }
    }
}

The operation steps are as follows:

  • Step 1, update the baseCount value with the CAS method
  • Step 2: Check whether expansion is required. By default, check = 1.
  • Step 3: If the capacity expansion conditions are met, determine whether the capacity is being expanded. If yes, expand the capacity together.
  • Step 4, if the size is not expanded, change the sizeCtl value to negative and expand the size.

That is the whole flow of the put method. As you can see, CAS is used heavily inside it. CAS (compare and swap) takes three parameters: the target memory address, the expected old value, and the new value. Each time, the old value is compared with the value at the target memory address; if they are equal, the new value is written to that address, and if not, the loop retries until the operation succeeds!

The get method

public V get(Object key) {
    Node<K,V>[] tab; Node<K,V> e, p; int n, eh; K ek;
    // Compute the hash
    int h = spread(key.hashCode());
    // Check whether the array and the target bucket are non-empty
    if ((tab = table) != null && (n = tab.length) > 0 &&
        (e = tabAt(tab, (n - 1) & h)) != null) {
        // If the first element is the one being looked for, return it directly
        if ((eh = e.hash) == h) {
            if ((ek = e.key) == key || (ek != null && key.equals(ek)))
                return e.val;
        }
        else if (eh < 0)
            // If the hash is less than 0, the bucket is a tree or is being migrated
            // Use find to look up the element; the implementation differs per Node subclass
            return (p = e.find(h, key)) != null ? p.val : null;
        // Walk through the linked list looking for the element
        while ((e = e.next) != null) {
            if (e.hash == h &&
                ((ek = e.key) == key || (ek != null && key.equals(ek))))
                return e.val;
        }
    }
    return null;
}

The steps are as follows:

  • Step 1, check whether the array is non-empty, locate the bucket from the key's hash, and check whether that bucket is empty;
  • Step 2, check whether the first node of the bucket is the element being looked for;
  • Step 3, if the bucket is a red-black tree (or is being migrated), search through the node's find method;
  • Step 4, if it is a linked list, traverse the list and compare each node.

Remove method

public V remove(Object key) {
    // Call the replaceNode method
    return replaceNode(key, null, null);
}

final V replaceNode(Object key, V value, Object cv) {
    // Compute the hash
    int hash = spread(key.hashCode());
    // Loop over the array
    for (Node<K,V>[] tab = table;;) {
        Node<K,V> f; int n, i, fh;
        // Check the parameters
        if (tab == null || (n = tab.length) == 0 ||
            (f = tabAt(tab, i = (n - 1) & hash)) == null)
            break;
        else if ((fh = f.hash) == MOVED)
            // If a resize is in progress, help with the migration
            tab = helpTransfer(tab, f);
        else {
            V oldVal = null;
            // Flags whether the bucket was actually processed
            boolean validated = false;
            // Use the synchronized lock to remove the element safely under concurrency
            synchronized (f) {
                // Verify again that the first element of the bucket has not changed
                if (tabAt(tab, i) == f) {
                    if (fh >= 0) {
                        // fh >= 0 indicates a linked-list node
                        validated = true;
                        // Walk the list to find the target node
                        for (Node<K,V> e = f, pred = null;;) {
                            K ek;
                            if (e.hash == hash &&
                                ((ek = e.key) == key ||
                                 (ek != null && key.equals(ek)))) {
                                V ev = e.val;
                                if (cv == null || cv == ev ||
                                    (ev != null && cv.equals(ev))) {
                                    oldVal = ev;
                                    if (value != null)
                                        e.val = value;
                                    else if (pred != null)
                                        pred.next = e.next;
                                    else
                                        setTabAt(tab, i, e.next);
                                }
                                break;
                            }
                            pred = e;
                            // If the end of the list is reached without finding the element, exit the loop
                            if ((e = e.next) == null)
                                break;
                        }
                    }
                    else if (f instanceof TreeBin) {
                        // The bucket is a tree node
                        validated = true;
                        TreeBin<K,V> t = (TreeBin<K,V>)f;
                        TreeNode<K,V> r, p;
                        // Traverse the tree to find the target node
                        if ((r = t.root) != null &&
                            (p = r.findTreeNode(hash, key, null)) != null) {
                            V pv = p.val;
                            if (cv == null || cv == pv ||
                                (pv != null && cv.equals(pv))) {
                                oldVal = pv;
                                if (value != null)
                                    p.val = value;
                                else if (t.removeTreeNode(p))
                                    setTabAt(tab, i, untreeify(t.first));
                            }
                        }
                    }
                }
            }
            // If the bucket was processed, report whether the element was found
            if (validated) {
                // If an element was found, return its old value
                if (oldVal != null) {
                    // If the replacement value is null, decrease the element count by 1
                    if (value == null)
                        addCount(-1L, -1);
                    return oldVal;
                }
                break;
            }
        }
    }
    // Return null if no element was found
    return null;
}

The steps are as follows:

  • Step 1, locate the bucket in the array and validate the parameters;
  • Step 2, check whether another thread is resizing; if so, help with the migration first;
  • Step 3, use the synchronized lock to remove the element safely under concurrency;
  • Step 4, because check = -1 is passed to addCount, only the element count is decremented via CAS and no resize check is performed.

Raising a question

What problem does ConcurrentHashMap not solve?

Look at the following example:

private static final Map<Integer, Integer> map = new ConcurrentHashMap<>();

public void unsafeUpdate(Integer key, Integer value) {
    Integer oldValue = map.get(key);
    if (oldValue == null) {
        map.put(key, value);
    }
}

Is ConcurrentHashMap thread-safe if multiple threads call unsafeUpdate() at the same time?

The answer is no, because another thread may call put() after the get() and before the if check, so the subsequent put() would overwrite the element that thread just inserted.

So how do you fix that?

The answer is also simple: use the putIfAbsent() method, which atomically inserts the element only when the key is not already present, as follows:

public void safeUpdate(Integer key, Integer value) {
    map.putIfAbsent(key, value);
}

So what if oldValue is not compared to NULL above, but to a specific value such as 1?

Here it is:

public void unsafeUpdate(Integer key, Integer value) {
    Integer oldValue = map.get(key);
    if (oldValue == 1) {
        map.put(key, value);
    }
}

putIfAbsent() cannot be used in this case.

ConcurrentHashMap also provides another method called replace(K key, V oldValue, V newValue) to solve this problem.

replace(K key, V oldValue, V newValue) atomically replaces the value only when the key is currently mapped to oldValue; note that all three arguments must be non-null, otherwise a NullPointerException is thrown.

public void safeUpdate(Integer key, Integer value) {
    map.replace(key, 1, value);
}

So what if the if block does not simply call put(), but runs some other business logic before the put(), as in the following?

public void unsafeUpdate(Integer key, Integer value) {
    Integer oldValue = map.get(key);
    if (oldValue == 1) {
        System.out.println(System.currentTimeMillis());
        /**
         * Other business operations
         */
        System.out.println(System.currentTimeMillis());
        map.put(key, value);
    }
}

There is no way to use the methods provided by ConcurrentHashMap, so the business itself has to ensure thread-safety, such as the following:

public void safeUpdate(Integer key, Integer value) {
    synchronized (map) {
        Integer oldValue = map.get(key);
        if (oldValue == null) {
            System.out.println(System.currentTimeMillis());
            /**
             * Other business operations
             */
            System.out.println(System.currentTimeMillis());
            map.put(key, value);
        }
    }
}

This may not be very friendly, but at least it ensures that the business logic is correct.

Of course, if every access is wrapped in the same synchronized block like this, using ConcurrentHashMap adds little value, and a plain HashMap could be used instead.

We can’t just hear that ConcurrentHashMap is thread-safe and assume that it is thread-safe no matter what.

Conclusion

While HashMap is not safe to operate on in multi-threaded environments, Java provides the ConcurrentHashMap class under the java.util.concurrent package as a thread-safe alternative!

In JDK1.7, ConcurrentHashMap uses a segmented-lock strategy: the HashMap is split into an array of Segments, where each Segment can be viewed as a small HashMap. The difference is that Segment inherits from ReentrantLock, so an object lock is acquired per Segment during writes, ensuring safe concurrent operation in multi-threaded environments.

However, in JDK1.7, queries on a HashMap easily become slow when the collision lists grow long. Therefore, in JDK1.8, HashMap introduces the red-black tree feature: when the length of a collision list exceeds 8, HashMap converts the list into a red-black tree structure.

In JDK1.8, ConcurrentHashMap uses a storage structure similar to that of HashMap, but it no longer uses Segment locking; instead, CAS + synchronized on the bucket's head node is used to ensure concurrency safety, and the source code is considerably more complex than in JDK1.7.