Be serious about writing, not clickbait.

The article is available at Github.com/niumoo/Java… And program ape Alang’s blog, point attention, don’t get lost.

ConcurrentHashMap is a thread-safe HashMap that has been used a lot in recent years. So what is its storage structure and implementation principle?

1. ConcurrentHashMap 1.7

1. Storage structure

ConcurrnetHashMap consists of several segments, each of which is a hashMap-like structure. Therefore, each HashMap can be expanded internally. However, the number of segments cannot be changed once initialized. The default number of segments is 16. You can also consider ConcurrentHashMap to support up to 16 concurrent threads by default.

2. The initialization

The initialization process of ConcurrentHashMap is explored through the no-argument construction of ConcurrentHashMap.

    /** * Creates a new, empty map with a default initial capacity (16), * Load factor (0.75) and concurrencyLevel (16). */
    public ConcurrentHashMap(a) {
        this(DEFAULT_INITIAL_CAPACITY, DEFAULT_LOAD_FACTOR, DEFAULT_CONCURRENCY_LEVEL);
    }
Copy the code

The parameterless construct calls the parameterless construct, passing in the default values of the three parameters, whose values are.

    /** * Default initialization capacity */
    static final int DEFAULT_INITIAL_CAPACITY = 16;

    /** * Default load factor */
    static final float DEFAULT_LOAD_FACTOR = 0.75 f;

    /** * Default concurrency level */
    static final int DEFAULT_CONCURRENCY_LEVEL = 16;
Copy the code

Then look at the internal implementation logic of the parameterized constructor.

@SuppressWarnings("unchecked")
public ConcurrentHashMap(int initialCapacity,float loadFactor, int concurrencyLevel) {
    // Check parameters
    if(! (loadFactor >0) || initialCapacity < 0 || concurrencyLevel <= 0)
        throw new IllegalArgumentException();
    // Verify the concurrency level, greater than 1<<16, reset to 65536
    if (concurrencyLevel > MAX_SEGMENTS)
        concurrencyLevel = MAX_SEGMENTS;
    // Find power-of-two sizes best matching arguments
    PI to the power of 2
    int sshift = 0;
    int ssize = 1;
    // This loop finds the nearest power of 2 above concurrencyLevel
    while (ssize < concurrencyLevel) {
        ++sshift;
        ssize <<= 1;
    }
    // Record segment offset
    this.segmentShift = 32 - sshift;
    // Record the segment mask
    this.segmentMask = ssize - 1;
    // Set the capacity
    if (initialCapacity > MAXIMUM_CAPACITY)
        initialCapacity = MAXIMUM_CAPACITY;
    // c = capacity/ssize, default 16/16 = 1
    int c = initialCapacity / ssize;
    if (c * ssize < initialCapacity)
        ++c;
    int cap = MIN_SEGMENT_TABLE_CAPACITY;
    // The Segment's hashmap-like capacity is at least 2 or a multiple of 2
    while (cap < c)
        cap <<= 1;
    // create segments and segments[0]
    // Create Segment array and set segments[0]
    Segment<K,V> s0 = new Segment<K,V>(loadFactor, (int)(cap * loadFactor),
                         (HashEntry<K,V>[])new HashEntry[cap]);
    Segment<K,V>[] ss = (Segment<K,V>[])new Segment[ssize];
    UNSAFE.putOrderedObject(ss, SBASE, s0); // ordered write of segments[0]
    this.segments = ss;
}
Copy the code

To summarize the initialization logic for ConcurrnetHashMap in Java 7.

  1. Verify necessary parameters.
  2. ConcurrencyLevel specifies the concurrencyLevel. If the value is greater than the maximum value, the value is reset to the maximum value. The default value is 16.
  3. Find the nearest power of 2 above concurrencyLevel as the initial capacity size. The default is 16.
  4. Record the segmentShift offset, which is N in [capacity = 2 ^ N] and will be used later in the Put calculation. The default is 32-sshift = 28.
  5. Record the segmentMask, default is ssize-1 = 16-1 = 15.
  6. Initialize segments[0], default size is 2, load factor is 0.75, expansion threshold is 2*0.75=1.5, expansion is performed only when the second value is inserted.

3. put

Follow the initialization parameters above to see the source of the PUT method.

/**
 * Maps the specified key to the specified value in this table.
 * Neither the key nor the value can be null.
 *
 * <p> The value can be retrieved by calling the <tt>get</tt> method
 * with a key that is equal to the original key.
 *
 * @param key key with which the specified value is to be associated
 * @param value value to be associated with the specified key
 * @return the previous value associated with <tt>key</tt>, or
 *         <tt>null</tt> if there was no mapping for <tt>key</tt>
 * @throws NullPointerException if the specified key or value is null
 */
public V put(K key, V value) {
    Segment<K,V> s;
    if (value == null)
        throw new NullPointerException();
    int hash = hash(key);
    // Move the hash value unsigned 28 bits to the right (obtained at initialization) and then do an and with segmentMask=15
    // Add the upper 4 bits to the segmentMask (1111)
    int j = (hash >>> segmentShift) & segmentMask;
    if ((s = (Segment<K,V>)UNSAFE.getObject          // nonvolatile; recheck
         (segments, (j << SSHIFT) + SBASE)) == null) // in ensureSegment
        // If the Segment is null, initialize it
        s = ensureSegment(j);
    return s.put(key, hash, value, false);
}

/**
 * Returns the segment for the given index, creating it and
 * recording in segment table (via CAS) if not already present.
 *
 * @param k the index
 * @return the segment
 */
@SuppressWarnings("unchecked")
private Segment<K,V> ensureSegment(int k) {
    final Segment<K,V>[] ss = this.segments;
    long u = (k << SSHIFT) + SBASE; // raw offset
    Segment<K,V> seg;
    // Check whether the Segment at u is null
    if ((seg = (Segment<K,V>)UNSAFE.getObjectVolatile(ss, u)) == null) {
        Segment<K,V> proto = ss[0]; // use segment 0 as prototype
        // Get the initial length of HashEntry
      
        from segment 0
      ,v>
        int cap = proto.table.length;
        // get the loadFactor from the hash table in segment 0. The loadFactor is the same for all segments
        float lf = proto.loadFactor;
        // Calculate the expansion threshold
        int threshold = (int)(cap * lf);
        // Create a HashEntry array with cap capacity
        HashEntry<K,V>[] tab = (HashEntry<K,V>[])new HashEntry[cap];
        if ((seg = (Segment<K,V>)UNSAFE.getObjectVolatile(ss, u)) == null) { // recheck
            // Check if the Segment at u is null, because another thread may be manipulating it
            Segment<K,V> s = new Segment<K,V>(lf, threshold, tab);
            // Check whether the Segment at position u is null
            while ((seg = (Segment<K,V>)UNSAFE.getObjectVolatile(ss, u))
                   == null) {
                // The CAS assignment will only succeed once
                if (UNSAFE.compareAndSwapObject(ss, u, null, seg = s))
                    break; }}}return seg;
}
Copy the code

The above source code analyzes the process of putting a data in ConcurrentHashMap.

  1. Calculates the location of the key to put, and gets the Segment at the specified location.

  2. If the Segment at the specified location is empty, initialize the Segment.

    Initialize Segment flow:

    1. Check if the Segment of the computed location is null.
    2. To continue initialization for NULL, create a HashEntry array using the capacity and load factor of Segment[0].
    3. Check again if the Segment computed at the specified location is null.
    4. Initialize this Segment with the created HashEntry array.
    5. Spin determines whether the Segment computed at the specified location is null. CAS assigns the Segment to that location.
  3. Segment. Put Insert key,value value.

The operations to get and initialize segments were explored above. The Segment put method in the last line is not checked yet.

final V put(K key, int hash, V value, boolean onlyIfAbsent) {
    // Obtain the ReentrantLock exclusive lock. If the lock cannot be obtained, scanAndLockForPut can be obtained.
    HashEntry<K,V> node = tryLock() ? null : scanAndLockForPut(key, hash, value);
    V oldValue;
    try {
        HashEntry<K,V>[] tab = table;
        // Calculate the data location to put
        int index = (tab.length - 1) & hash;
        // CAS gets the index value
        HashEntry<K,V> first = entryAt(tab, index);
        for (HashEntry<K,V> e = first;;) {
            if(e ! =null) {
                // Check if the key already exists. If so, traverse the list to find the location and replace the value
                K k;
                if ((k = e.key) == key ||
                    (e.hash == hash && key.equals(k))) {
                    oldValue = e.value;
                    if(! onlyIfAbsent) { e.value = value; ++modCount; }break;
                }
                e = e.next;
            }
            else {
                // first has a value but index has a value.
                if(node ! =null)
                    node.setNext(first);
                else
                    node = new HashEntry<K,V>(hash, key, value, first);
                int c = count + 1;
                // If the capacity is greater than the threshold and smaller than the maximum capacity, expand the capacity
                if (c > threshold && tab.length < MAXIMUM_CAPACITY)
                    rehash(node);
                else
                    // The index position assigns node, which may be an element or the head of a linked list
                    setEntryAt(tab, index, node);
                ++modCount;
                count = c;
                oldValue = null;
                break; }}}finally {
        unlock();
    }
    return oldValue;
}
Copy the code

Because a Segment inherits a ReentrantLock, it is very easy to acquire locks from within the Segment. Put processes use this functionality.

  1. TryLock () obtains the lock, if not, proceed with scanAndLockForPut.

  2. Calculate the index position to put the put data into, and then get the HashEntry at that position.

  3. Iterate over put new elements, why iterate over them? Because the HashEntry you get here could be an empty element or a linked list already exists, you treat it differently.

    If the HashEntry at this location does not exist:

    1. If the current capacity is greater than the threshold or smaller than the maximum capacity, expand the capacity.
    2. Direct head insertion.

    If a HashEntry exists at this location:

    1. Check whether the current Key and hash values of the linked list elements match the Key and hash values to be put. If yes, replace the value
    2. If not, obtain the next node in the list until the value is replaced, or the end of the list does not have the same.
      1. If the current capacity is greater than the threshold or smaller than the maximum capacity, expand the capacity.
      2. Direct linked list head insert method.
  4. If the position to be inserted already exists, return the old value after the replacement, otherwise return NULL.

The scanAndLockForPut operation in the first step is not covered here; all this method does is spin tryLock() continuously to get the lock. Block with lock() to acquire the lock when the number of spins is greater than the specified number. Gets the HashEntry of the hash position along the table at spin.

private HashEntry<K,V> scanAndLockForPut(K key, int hash, V value) {
    HashEntry<K,V> first = entryForHash(this, hash);
    HashEntry<K,V> e = first;
    HashEntry<K,V> node = null;
    int retries = -1; // negative while locating node
    // Spin the lock
    while(! tryLock()) { HashEntry<K,V> f;// to recheck first below
        if (retries < 0) {
            if (e == null) {
                if (node == null) // speculatively create node
                    node = new HashEntry<K,V>(hash, key, value, null);
                retries = 0;
            }
            else if (key.equals(e.key))
                retries = 0;
            else
                e = e.next;
        }
        else if (++retries > MAX_SCAN_RETRIES) {
            // After a specified number of spins, block until the lock is acquired
            lock();
            break;
        }
        else if ((retries & 1) = =0 &&
                 (f = entryForHash(this, hash)) ! = first) { e = first = f;// re-traverse if entry changed
            retries = -1; }}return node;
}

Copy the code

4. The expansion rehash

The expansion of ConcurrentHashMap will only be doubled. When the data in the old array is moved to the new array, the position will either remain unchanged or change to index+ oldSize. The node in the parameter will be inserted into the specified position after expansion using the linked list header method.

private void rehash(HashEntry<K,V> node) {
    HashEntry<K,V>[] oldTable = table;
    / / old capacity
    int oldCapacity = oldTable.length;
    // New capacity, twice as large
    int newCapacity = oldCapacity << 1;
    // New expansion threshold
    threshold = (int)(newCapacity * loadFactor);
    // Create new array
    HashEntry<K,V>[] newTable = (HashEntry<K,V>[]) new HashEntry[newCapacity];
    // New mask, default 2 is 4 after expansion, -1 is 3, binary is 11.
    int sizeMask = newCapacity - 1;
    for (int i = 0; i < oldCapacity ; i++) {
        // Iterate over the old array
        HashEntry<K,V> e = oldTable[i];
        if(e ! =null) {
            HashEntry<K,V> next = e.next;
            // Calculate the new location, the new location can only be inconvenience or old location + old capacity.
            int idx = e.hash & sizeMask;
            if (next == null)   // Single node on list
                // If the current position is not yet a list, but an element, assign the value directly
                newTable[idx] = e;
            else { // Reuse consecutive sequence at same slot
                // If it is a linked list
                HashEntry<K,V> lastRun = e;
                int lastIdx = idx;
                // The new location can only be inconvenience or old location + old capacity.
                // The lastRun element positions are the same after the loop
                for(HashEntry<K,V> last = next; last ! =null; last = last.next) {
                    int k = last.hash & sizeMask;
                    if (k != lastIdx) {
                        lastIdx = k;
                        lastRun = last;
                    }
                }
                // the element position after lastRun is the same, directly as the list assigned to the new position.
                newTable[lastIdx] = lastRun;
                // Clone remaining nodes
                for(HashEntry<K,V> p = e; p ! = lastRun; p = p.next) {// Iterate over the rest of the elements, inserting the header to the specified position k.
                    V v = p.value;
                    int h = p.hash;
                    int k = h & sizeMask;
                    HashEntry<K,V> n = newTable[k];
                    newTable[k] = newHashEntry<K,V>(h, p.key, v, n); }}}}// Insert a new node
    int nodeIndex = node.hash & sizeMask; // add the new node
    node.setNext(newTable[nodeIndex]);
    newTable[nodeIndex] = node;
    table = newTable;
}
Copy the code

Some of you might be confused by the last two for loops, where the first for looks for a node where all the new positions of the next nodes are the same. And then assign this as a linked list to the new location. The second for loop inserts the remaining elements into the list at the specified location via header insertion. The reason for this may be based on probability statistics, students who have in-depth research can give their opinions.

5. get

It’s pretty easy from here, the GET method takes only two steps.

  1. Calculate the location of the key.
  2. Traverses the specified location to find the value of the same key.
public V get(Object key) {
    Segment<K,V> s; // manually integrate access methods to reduce overhead
    HashEntry<K,V>[] tab;
    int h = hash(key);
    long u = (((h >>> segmentShift) & segmentMask) << SSHIFT) + SBASE;
    // Calculate the location of the key
    if((s = (Segment<K,V>)UNSAFE.getObjectVolatile(segments, u)) ! =null&& (tab = s.table) ! =null) {
        for (HashEntry<K,V> e = (HashEntry<K,V>) UNSAFE.getObjectVolatile
                 (tab, ((long)(((tab.length - 1) & h)) << TSHIFT) + TBASE); e ! =null; e = e.next) {
            // If the list is linked, the value of the same key is found.
            K k;
            if ((k = e.key) == key || (e.hash == h && key.equals(k)))
                returne.value; }}return null;
}
Copy the code

2. ConcurrentHashMap 1.8

1. Storage structure

The ConcurrentHashMap of Java8 has changed a lot compared to that of Java7. Instead of the Segment array + HashEntry array + linked list, the ConcurrentHashMap of Java8 is Node array + linked list/red-black tree. When the collision chain expression reaches a certain length, the linked list is transformed into a red-black tree.

2. Initialize initTable

/** * Initializes table, using the size recorded in sizeCtl. */
private final Node<K,V>[] initTable() {
    Node<K,V>[] tab; int sc;
    while ((tab = table) == null || tab.length == 0{/ / if sizeCtl <0", indicating that the other thread successfully executes the CAS and is initializing the CAS.if ((sc = sizeCtl) < 0)
            // Allocate CPU usage
            Thread.yield(); // lost initialization race; just spin
        else if (U.compareAndSwapInt(this, SIZECTL, sc, -1)) {
            try {
                if ((tab = table) == null || tab.length == 0) {
                    int n = (sc > 0)? sc : DEFAULT_CAPACITY;@SuppressWarnings("unchecked")
                    Node<K,V>[] nt = (Node<K,V>[])newNode<? ,? >[n]; table = tab = nt; sc = n - (n >>>2); }}finally {
                sizeCtl = sc;
            }
            break; }}return tab;
}
Copy the code

From the source, you can see that the initialization of ConcurrentHashMap is done through spin and CAS operations. One thing to note inside is the variable sizeCtl, whose value determines the current initialization state.

  1. -1 Indicates that initialization is in progress
  2. -n Indicates that n-1 threads are expanding capacity
  3. Indicates the initialization size of the table if the table is not initialized
  4. Indicates the capacity of the table, if the table has been initialized.

3. put

Go through the put source code directly.

public V put(K key, V value) {
    return putVal(key, value, false);
}

/** Implementation for put and putIfAbsent */
final V putVal(K key, V value, boolean onlyIfAbsent) {
    // Key and value cannot be empty
    if (key == null || value == null) throw new NullPointerException();
    int hash = spread(key.hashCode());
    int binCount = 0;
    for (Node<K,V>[] tab = table;;) {
        // f = target position element
        Node<K,V> f; int n, i, fh;// fh is used to store the hash value of the element at the target position
        if (tab == null || (n = tab.length) == 0)
            // If the bucket is empty, initialize the bucket (spin +CAS)
            tab = initTable();
        else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {
            // If the bucket is empty and the CAS is not locked, then the CAS will break
            if (casTabAt(tab, i, null.new Node<K,V>(hash, key, value, null)))
                break;  // no lock when adding to empty bin
        }
        else if ((fh = f.hash) == MOVED)
            tab = helpTransfer(tab, f);
        else {
            V oldVal = null;
            // Use synchronized to join a node
            synchronized (f) {
                if (tabAt(tab, i) == f) {
                    // It is a linked list
                    if (fh >= 0) {
                        binCount = 1;
                        // Loop to add new or overwrite nodes
                        for (Node<K,V> e = f;; ++binCount) {
                            K ek;
                            if(e.hash == hash && ((ek = e.key) == key || (ek ! =null && key.equals(ek)))) {
                                oldVal = e.val;
                                if(! onlyIfAbsent) e.val = value;break;
                            }
                            Node<K,V> pred = e;
                            if ((e = e.next) == null) {
                                pred.next = new Node<K,V>(hash, key,
                                                          value, null);
                                break; }}}else if (f instanceof TreeBin) {
                        / / the red-black tree
                        Node<K,V> p;
                        binCount = 2;
                        if((p = ((TreeBin<K,V>)f).putTreeVal(hash, key, value)) ! =null) {
                            oldVal = p.val;
                            if(! onlyIfAbsent) p.val = value; }}}}if(binCount ! =0) {
                if (binCount >= TREEIFY_THRESHOLD)
                    treeifyBin(tab, i);
                if(oldVal ! =null)
                    return oldVal;
                break;
            }
        }
    }
    addCount(1L, binCount);
    return null;
}
Copy the code
  1. Calculate the Hashcode based on the key.

  2. Check whether initialization is required.

  3. That is, the Node located for the current key. If it is empty, data can be written to the current position. CAS is used to attempt to write data.

  4. If the current location hashCode == MOVED == -1, you need to expand the capacity.

  5. If none is met, data is written using a synchronized lock.

  6. If the number is greater than TREEIFY_THRESHOLD, it is converted to a red-black tree.

4. get

Get process is relatively simple, directly through the source code.

public V get(Object key) {
    Node<K,V>[] tab; Node<K,V> e, p; int n, eh; K ek;
    // Hash position of the key
    int h = spread(key.hashCode());
    if((tab = table) ! =null && (n = tab.length) > 0 &&
        (e = tabAt(tab, (n - 1) & h)) ! =null) {
        // The hash value of the header is the same if the element at the specified position exists
        if ((eh = e.hash) == h) {
            if((ek = e.key) == key || (ek ! =null && key.equals(ek)))
                // The hash value of key is equal, and the value of key is the same
                return e.val;
        }
        else if (eh < 0)
            // If the hash value of the header node is less than 0, it indicates expansion or red-black tree
            return(p = e.find(h, key)) ! =null ? p.val : null;
        while((e = e.next) ! =null) {
            // it is a linked list
            if(e.hash == h && ((ek = e.key) == key || (ek ! =null && key.equals(ek))))
                returne.val; }}return null;
}
Copy the code

To summarize the GET process:

  1. Calculates the position based on the hash value.
  2. Find the specified location, if the head node is to find, return its value.
  3. If the hash value of the head node is less than 0, it indicates expansion or red-black tree.
  4. If it’s a linked list, walk through it and find it.

Conclusion:

In general, ConcruuentHashMap changes a lot in Java8 compared to Java7,

3. Summary

In Java7, ConcruuentHashMap uses a segmented lock, which means that only one thread can operate on each Segment at a time. Each Segment is a hashmap-like array structure that can be expanded and its conflicts converted to a linked list. But the number of segments cannot be changed once initialized.

Synchronized locking CAS mechanism used by ConcruuentHashMap in Java8. The structure also evolved from the Segment array + HashEntry array + linked list in Java7 to the Node array + linked list/red-black tree. Node is a HashEntry like structure. When its conflicts reach a certain size, it will be transformed into a red-black tree, and when the number of conflicts is less than a certain number, it will return to the linked list.

Some students may have doubts about the performance of Synchronized. In fact, the performance of Synchronized lock is no longer a problem after the introduction of the lock upgrade strategy. Interested students can learn about the upgrade of Synchronized by themselves.

Hello world:) I’m Aaron, a tech tool guy on the front line. The article continues to update, can pay attention to the public number “program ape A Lang” grow together. This article has been received at Github.com/niumoo/Java… And “Unread Code,” lots of knowledge and a series of articles.