In the Map family, ConcurrentHashMap is specifically designed to solve concurrency problems, and its implementation logic is more complex than that of some other Map implementations. It’s hard to understand it just by reading the code. I have encountered this difficulty too, but I have made some surprising discoveries, just one of which I will mention today, and that is the expansion of ConcurrentHashMap, quite solidarity!

Like other members of its family, the ConcurrentHashMap constructor initializes only a few member variables and does not open up the corresponding storage space. The corresponding storage space is checked and initialized when the first element is put in. Let’s look at the logical framework of its PUT method:

public V put(K key, V value) {
    return putVal(key, value, false);
}
final V putVal(K key, V value, boolean onlyIfAbsent) {
    if (key == null || value == null) throw new NullPointerException();
    int hash = spread(key.hashCode());
    int binCount = 0;
    for (Node<K,V>[] tab = table;;) {
        if (tab == null || (n = tab.length) == 0)
            tab = initTable();
        else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {
            // Cas operation puts the head node
        }
        else if ((fh = f.hash) == MOVED)
            tab = helpTransfer(tab, f);
        else {
           // Operations related to putting value normally
        }
    }
    addCount(1L, binCount);
    return null;
}
Copy the code

I was kind enough to remove a lot of code, leaving only the branches of logic behind. As we can see, the code is roughly divided into four branches:

  • If the table is empty, call initTable() to initialize it
  • If the table is not empty but the ith element is null, use the CAS operation to place the head node
  • If the head node is not empty and the hash value is MOVED (MOVED is a constant and the value is -1), a method called helpTransfer is called
  • Finally, as usual, lock a new value

What I’m going to talk about in detail is this helpTransfer method. What does this guy mean, help = help? Why help? How can I help? This question, let’s start with the hash value called ‘MOVED’. To control concurrency, ConcurrentHashMap adds multiple tokens of special significance. In the source code, MOVED is commented as follows:

static final int MOVED     = -1; // hash for forwarding nodes
Copy the code

ForwardingNode is a subclass of Node. It is a special subclass that only appears during expansion. If any operation fetched a node of this type, the thread performing the operation immediately knew that the current Map was being expanded. At this point, it stops its operation and calls the helpTransfer method to help expand. ConcurrentHashMap uses this method to unite concurrent threads to perform capacity expansion. When I understand this place, I feel the thick gay love, and in front of my eyes, I see the scene of everyone stopping their work and watching the same screen to find bugs.

But goose, if you want to help, you can help? That’s not the reality. Let’s take a step-by-step look at how ConcurrentHashMap records the number of threads participating in the expansion and the size of the array at the time of the expansion. There is a short but important method called resizeStamp that is key to understanding how to determine whether or not the expansion is possible. Let’s look at the logic of this code:

static final int resizeStamp(int n) {
    return Integer.numberOfLeadingZeros(n) | (1 << (RESIZE_STAMP_BITS - 1));
}
Copy the code

The argument, n, is the length of the current array. This method simply does a few things. Or the first half of the operator, which fetches the number of zeros before the highest 1 in the binary representation of n, which contains only one 1 because n is a power of two. In the latter part, 1 moves to the left resize_stamp_bits-1 bit. RESIZE_STAMP_BITS is 16, which is a constant. Shift the 1 15 bits to the left, and you end up with a binary number starting with a 1 followed by 15 zeros. When it does or with the value of the first half, it actually gets the negative form of the value of the first half. For example, let’s say that n is 16.

16 binary form of (note here for 32-bit storage) 00000000000000000000000000010000 Integer. NumberOfLeadingZeros method to get to 1 in front of the Numbers of 0, count, there are 27. I'm going to write 27 in binary and then I'm going to write it in or. 00000000000000000000000000011011 00000000000000001000000000000000 | -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- 000000000000000001000000000011011Copy the code

Keeping this value in mind, let’s look at the source code that appears in addCount:

private final void addCount(long x, int check) {...int rs = resizeStamp(n);
        // Change the sizeCtl value with CAS when the current thread is the first thread to initiate capacity expansion
        else if (U.compareAndSwapInt(this, SIZECTL, sc, (rs << RESIZE_STAMP_SHIFT) + 2))
           transfer(tab, null); . }Copy the code

It’s also a nice omission of a lot of code, leaving only two key sentences. First of all, the resizeStamp method is also called, a value is obtained, and then the value is shifted and +2, the result is as follows:

RESIZE_STAMP_SHIFT = 32-resize_STAMp_bits. It can be simply calculated that in the method of resizeStamp, 1 has moved the RESIZE_STAMP_BITS bit to the left, and then moved the RESIZE_STAMP_SHIFT bit again. The end result is to move the result of the resizeStamp method to the left to the highest digit: After the 000000000000000001000000000011011 left after 100000000001101100000000000000000 plus 2, 100000000001101100000000000000010Copy the code

So, the value you see here can be divided into two parts, with the high 16 bits storing a value that is closely related to the size of the array before expansion. The lower 16 bits store the number of threads currently participating in capacity expansion +1. After this calculation is complete, the CAS operation is used to assign the result to the member variable sizeCtl.

You’ve probably seen it in a lot of places, that the high 16 bits store information about the size of the array, and that information is the number of high zeros in the binary representation of the size of the array. Why do you store that? It’s not hard to see why, because this ensures that 16 bits will always represent the size of the array. We know that the maximum array capacity can be the maximum int type, this value can not be stored in 16 bits, so the smart source code author thought of this way, directly put the high number of 0 in, this value can be one-to-one with the array capacity, and absolutely guarantee that 16 bits can be stored. In fact, at most, there are only 32 zeros. The reason why the entire result starts with 1 is explained in the source code comment, so that the value is negative to indicate that the array is being expanded. With the above matting, we can easily understand the layers of investigation to help expand the capacity before going through. (Note: the following code is from JDK12, the previous version has a bug since 1.8, the reason is me)

private transient volatile int sizeCtl;
final Node<K,V>[] helpTransfer(Node<K,V>[] tab, Node<K,V> f) {
    Node<K,V>[] nextTab; int sc;
    if(tab ! =null && (f instanceofForwardingNode) && (nextTab = ((ForwardingNode<K,V>)f).nextTable) ! =null) {
        int rs = resizeStamp(tab.length) << RESIZE_STAMP_SHIFT;
        while (nextTab == nextTable && table == tab &&
               (sc = sizeCtl) < 0) {
            if (sc == rs + MAX_RESIZERS || sc == rs + 1 ||
                transferIndex <= 0)
                break;
            if (U.compareAndSetInt(this, SIZECTL, sc, sc + 1)) {
                transfer(tab, nextTab);
                break; }}return nextTab;
    }
    return table;
}
Copy the code

Let’s look directly at the three criteria for the first if in the while loop, where sc is sizeCtl:

  • Sc == rs + 1. The low level of sc stores the number of threads participating in capacity expansion +1. When a thread joins, sc+1 is given, and when a thread exits, SC-1 is given. If this formula is true, it indicates that all current expansion threads have completed expansion and exit. There is no need to participate in the expansion.
  • Sc == RS + MAX_RESIZERS. As mentioned above, low values are used to store the number of threads that can participate in capacity expansion. MAX_RESIZERS = (1 << (32-resize_STAMp_bits)) -1, and the maximum value of low values is MAX_RESIZERS. If this formula is true, it indicates that the current capacity expansion thread has reached the maximum, and no more capacity expansion thread is required.
  • TransferIndex < = 0. This judgment is related to the buckets allocated by the expansion thread. This formula indicates that all the buckets in the old array have been allocated by the threads participating in expansion, and there is no need to add new threads for expansion.

These three judgments tell the thread asking for help that your organization is well appreciated, but that you really don’t need any more help now, so go back to your own business. If both threads pass, the current thread will try to add CAS to sc value +1, which means that I will add one more thread to help expand capacity.

At this point, the logic for a thread to join a capacity expansion operation ends, which can be summarized as: Sense capacity expansion -> try to join -> assist capacity expansion.

The capacity expansion operation of ConcurrentHashMap is also complicated, and many logic in this paper are closely related to it. For example, the transferIndex value of the three judgment conditions is maintained in the capacity expansion method. This paper only focuses on the logic of helping expansion. First, it is to understand the transformation and role of the variable sizeCtl in the expansion process. Second, it is to realize the unity of active and lively threads who actively participate in collective activities.

I cut the code, have been deleted, interested readers can find the source, on the basis of this article, read and understand the source code in detail, to harvest more knowledge points.