15 questions, which one can you stick to?

Approaching autumn recruitment, preparing for summer internship, I wish you every day progress hundred million little!Day11
This article summarizes interview questions related to ConcurrentHashMap, which will be updated daily
If you are not familiar with the ConcurrentHashMap source code, please refer to my previous blog: ConcurrentHashMap source code parsing article general directory
Volume? Volume on the right, Java is so volume volume only!

1. Describe the ConcurrentHashMap storage data structure.

The map structure of ConcurrentHashMap is the same as that of HashMap, which consists of array, linked list, and red-black tree.

ConcurrentHashMap stores data in the same unit as a HashMap, i.e., Node structure:

static class Node<K.V> implements Map.Entry<K.V> {
    / / hash value
    final int hash;
    // key
    final K key;
    // value
    volatile V val;
    // Rear-drive node
    volatileNode<K,V> next; . }Copy the code

The difference between ConcurrentHashMap and HashMap is that ConcurrentHashMap supports concurrent expansion and uses internal locking (spin locking + CAS + synchronized + segment locking) to ensure thread safety.

2. Can the ConcurrentHashMap load factor be specified?

The load factor of a normal HashMap can be modified, but ConcurrentHashMap cannot because its load factor is usedfinalKeyword modifier, the value is fixed0.75 ：

Load factor: indicates how full the hash table is ~ In ConcurrentHashMap, this attribute is fixed at 0.75 and cannot be modified ~
private static final float LOAD_FACTOR = 0.75 f;
Copy the code

3. Why is the node. hash field of a Node usually >=0?

Or, how many hash values can a Node have? What are the different scenarios?

ifNode.hash = -1, indicates that the current node is the **FWD(ForWardingNode) ** node (indicates the node that has been migrated).
ifNode.hash = -2, indicates that the current node is treed and isTreeBinObject,TreeBinObject proxies operate on red-black trees.
ifNode.hash > 0Is displayed, indicating that the current node is normalNodeNodes, which may be linked lists, or individualNode.

4. Can you briefly describe the sizeCtl fields in ConcurrentHashMap?

SizeCtr = Size Control, Size Control, sizeCtr = Size Control, Size Control, Size Control

(1) sizeCtl == 0;

izeCtl == 0, is the default value,Indicates that the default capacity is used when the hash list is actually initialized for the first time16Initialize.

SizeCtl == -1?

SizeCtl == -1: The hash list is being initialized. A thread is initializing the current hash table.
The hash table of ConcurrentHashMap is lazily initialized and must be initialized only once in concurrent conditions, so sizeCtl == -1 acts as an “identity lock” that prevents multiple threads from initializing the hash table.

SizeCtl is set to -1 in initTable() via CAS, indicating that a thread is initializing the hash table. Other threads cannot initialize the hash table again.

// SIZECTL: Expected value, initial 0
// This represents the current ConcurrentHashMap object
// SC specifies the sizeCtrl to be modified
// -1 changes sc to -1
U.compareAndSwapInt(this, SIZECTL, sc, -1);
Copy the code

If CAS modifs the thread with the successful sizeCtl = -1 operation, the logic to initialize the hash list can be followed. Threads that fail to modify CAS are constantly checked for spin until the hash list initialization is complete.
In this case, the Thread whose CAS fails will follow the following logic: the spin Thread will pass thread.yield (); Method to temporarily free up CPU resources to be used by hungrier threads. The goal is to reduce the spin drain on CPU performance.

SizeCtl > 0 after initializing hash table

sizeCtl > 0When,Indicates the threshold for triggering capacity expansion after initialization or expansion.
For instance,sizeCtl = 12When the number of data in the hash list> = 12“, the capacity expansion operation is triggered.

④ sizeCtl < 0 && sizeCtl! What happens when lambda is equal to negative 1?

sizeCtl < 0 && sizeCtl != -1When,Indicates that the hash list is being expanded.
-(1 + nThreads)“, indicating that n threads are being expanded together.
At this time,sizeCtlThe high16Who saidExpansion id stampLow,16 位Indicates the number of threads participating in concurrent capacity expansion:1 + nThreadThat is, the number of threads participating in concurrent capacity expansion isnA.

5. Would you please explain the function and calculation method of capacity expansion identification stamp?

According to the length of the old watchtab.lengthTo get the expansion unique identifier stamp.
Assuming that the capacity expansion is 16 -> 32, the function of the capacity expansion identifier is that under the condition of multi-threaded concurrent capacity expansion, all threads with the capacity expansion of 16 -> 32 can participate in the concurrent capacity expansion.

// Fixed value 16, which is related to capacity expansion. A capacity expansion identifier will be generated based on this attribute value during capacity expansion calculation
private static int RESIZE_STAMP_BITS = 16;

/** * when table array is expanded, a capacity expansion identifier is calculated. When simultaneous expansion is required, the current thread must get the capacity expansion identifier to participate in the expansion: */
static final int resizeStamp(int n) {
    RESIZE_STAMP_BITS: a fixed value of 16, which is related to capacity expansion. A capacity expansion identifier will be generated based on this value during capacity expansion calculation
    return Integer.numberOfLeadingZeros(n) | (1 << (RESIZE_STAMP_BITS - 1));
}
Copy the code

sizeCtl < 0 && sizeCtl != -1At this timesizeCtlThe high16Who saidExpansion id stampLow,16 位Indicates the number of threads participating in concurrent capacity expansion:1 + nThreadThat is, the number of threads participating in concurrent capacity expansion isnA.

6. How does ConcurrentHashMap ensure the safety of writing data threads?

This question essentially asks how adding data to ConcurrentHashMap is implemented to ensure thread-safety.

ConcurrentHashMap uses internal locking (spin locking + CAS + synchronized + segmental locking) to ensure thread safety.

The process for adding data is as follows:

① First, check whether the hash list has been initialized. If not, initialize the hash list first and then write the hash list.
(2) Before writing data to the bucket, check whether the bucket is empty. If the bucket is empty, the new data node is directly written to the bucket through the CAS. If CAS fails to write, it indicates that another thread has written data to the current bucket. The current thread fails to compete, returns to spin, and spins to wait.
If the current bucket is not empty, we need to determine the type of the current bucket header:
- ③ If the hash value of the header node in the bucket is- 1, indicates that the header of the current bucket bit isFWDNode, currently the hash list is in the process of expansion. The current thread needs to help with the expansion.
- ④ If conditions (2) and (3) are not met, the bucket may be a linked list or a proxy object in a red-black treeTreeBin. It’s used in this casesynchronizedLocks headers in the bucket to ensure thread-safe writes inside the bucket.

7. Describe the Hash addressing algorithm in ConcurrentHashMap

The addressing algorithm of ConcurrentHashMap is not very different from that of HashMap:

First, the HashCode value is obtained through the Key of the Node Node, and then the HashCode value is detour through the spread(int H) method to obtain the final Hash value.

/** * Compute Node hash algorithm: parameter h is the hash value * eg: * h binary --> 1100 0011 1010 0101 0001 1100 0001 1110 * (h >>> 16) --> 0000 0000 0000 1100 0011 1010 0101 * (h ^ (h) >>> 16)) --> 1100 0011 1010 0101 1101 1111 1011 1011 (h ^ (h >>> 16)) the purpose is to have the top 16 bits of H also participate in the addressing calculation, so that the resulting hash value is more scattered, Reduce the hash conflict ~ * -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- * HASH_BITS - > 0111, 1111 1111 1111 1111 1111 1111 1111 * (h ^ (h >>> 16)) --> 1100 0011 1010 0101 1101 1111 1011 1011 * (h ^ (h >>> 16)) & HASH_BITS --> 0100 0011 1010 0101 1101 1111 1011 1011 * Note: the result of (h ^ (h >>> 16)) is then &hash_bits, so that the hash value is always a positive */
static final int spread(int h) {
    // Move the original hash value xor ^ 16 bits to the right of the original hash value with & on HASH_BITS(0x7fffffff: binary is 31 ones)
    return (h ^ (h >>> 16)) & HASH_BITS;
}
Copy the code

After obtaining the final hash value, use the addressing formula index = (tab.length-1) & hash to obtain the bucket index.

How does ConcurrentHashMap count the amount of data in the current hash table?

ConcurrentHashMap counts the amount of stored data through the addCount(Long x, int Check) method, essentially using the LongAdder atomic class. (LongAdder source code analysis)

ConcurrentHashMap Why not ConcurrentHashMap why not AtomicLong hash table statistics? How about the amount of hash table data?

becauseAtomicLongAtomic class increment operations are implemented based on CAS. Implementation based on CAS leads to the problem that when a large number of threads are performing CAS operations at the same time, only one thread can successfully execute CAS operations, while all the other threads will fail and enter a spin state, which is itself a spin statewhile(true)Is very expensive system resources.

So how does LongAdder maintain high performance with high concurrency?

First take a look at the operation principle diagram of LongAdder :(the picture is from the interviewer asked me LongAdder, I was surprised…)

LongAdder adopts a “segmented” approach to reduce the frequency of CAS failure, typically swapping space for time:

LongAdderWe have a global variablevolatile long base;Value is passed when the concurrency is not highCASTo operate directlybaseValue, ifCASFailure is aimed atLongAdderIn theCell[]In the arrayCellforCASOperation to reduce the probability of failure.

For example, in the current class, base= 10, there are three threads adding 1 atomically to CAS. Thread 1 succeeds in executing base=11, and thread 2 and thread 3 fail in executing base=11, and then add 1 to Cell element in Cell[] array, which is also CAS operation. The value of each Cell in index=1 and index=2 is set to 1.

Sum = 11 + 1 + 1 = 13. The operation of LongAdder is completed. The flow chart is as follows:

(Picture is from the interviewer asked me LongAdder, I was surprised…)

9. What are the preprocessing tasks performed by the thread triggering the capacity expansion condition?

The first thing the thread that triggers the expansion condition needs to do is to change the sizeCtl field value through CAS, so that 16 bits higher is the unique identification stamp for expansion, and 16 bits lower is (number of threads participating in expansion + 1), indicating that there are threads performing expansion logic!
- Pay attention to: there are high16Bit expansion unique identifier is calculated according to the length of the current hash list, its highest bit is1. So what you end up withsizeCtlIt should be a negative number.
The current thread that triggered the capacity expansion condition then creates a new hash list, twice the size of the old one. And assign a reference to the new hash table to the map.nexttable field.
Since ConcurrentHashMap supports concurrent expansion by multiple threads, the expansion thread needs to know the progress of the data transfer from the old hash table to the new hash table. ConcurrentHashMap provides a transferIndex field to record the total migration progress of the old hash table data! The migration progress starts from the high bucket to the bucket level with subscript 0.

10. How to mark the bucket after migration in the old loose linked list?

The bucket that has been migrated from the old hashlist needs to be represented by a ForwardingNode object. This Node is special and is used to indicate that the data in the bucket has been migrated to the bucket in the new hashlist.

What does ForwardingNode do?

ForwardingNodeObject to query the target data into a new hash listfind()Methods.
When a thread happens to be looking for the target element in the old hash list, it happens to encounter the current bucket containing isFWDNode, then you can go throughFWDThe node’sfind()Method redirects to a new hash list to query the target element!

11. What if a hash table is in storage and a new write request comes in?

There are two cases to consider here:

If the bucket accessed by the current thread is not a FWD node according to the addressing algorithm (that is, the data in the current bucket has not been migrated) when writing. Check whether the bucket is empty. If the bucket is empty, data is written in CAS mode. If the bucket is not empty, it could be a linked list or TreeBin, and you need to lock the head of the bucket with the synchronized keyword.
If the current thread is writing, the bucket accessed by the addressing algorithm is the FWD node (that is, the data in the current bucket has been migrated).
- When the FWD node is encountered, it indicates that the hashtable is being expanded, and the current thread needs to join in to help the hashtable expand (helpTransfer(tab, f);Assist with capacity expansion to minimize the time spent on data migration for capacity expansion.
- After the current thread is added to assist expansion, ConcurrentHashMap is based on the globaltransferIndexField to assign migration tasks to the current thread (which needs to be responsible for migrating the bucket range of the old hash table). For example, let the current thread be responsible for migrating data from buckets [0-4] in the old hash table to the new hash table.
- If the current thread does not have the task to be responsible for migration, the thread exits to assist expansion, that is, the expansion ends. At this point, the current thread can continue to execute the logic of writing data!

12. How does the expansion worker maintain the lower 16 bits of sizeCtl during capacity expansion?

Each thread that performs a capacity expansion task (including assisting capacity expansion) is updated before it starts worksizeCtlThe low16Bit, that is, let low16 位 + 1A new thread is added to perform capacity expansion.
Each thread that performs capacity expansion will be assigned a migration task range. If the migration of the task range that the current thread is responsible for is completed and no migration task range is assigned, the current thread can quit to assist capacity expansion and update at this timesizeCtlThe low16Bit, that is, let low16 位 - 1, indicating that a thread exits concurrent capacity expansion.
ifsizeCtl 低 16 位- 1The value of1, it indicates that the current thread is the last thread to exit concurrent capacity expansion.

13. If the bucket list is upgraded to a red-black tree and a reader thread is currently accessing the tree, what should be done if a new writer thread requests?

The writer thread will block because the red black tree is special. New data may trigger the self-balancing of the red black tree, which will cause the structure of the tree to change and affect the reading result of the reader thread!

Data read and written in the red-black tree are mutually exclusive. The detailed principle is as follows:

We know that the red-black tree in ConcurrentHashMap is defined byTreeBinTo the agent,TreeBinIt has an Int insidestateField.
When reading data, the reader thread uses the CAS modestate 值 + 4(indicates that the read lock is added.) After the data is read, use the CAS mode to change the CAS valuestate 值 4 -.
If the writer thread writes data to the red-black tree, it checks firststateIs the value equal to0If it is0, indicating that no reader thread is retrieving data. In this case, data can be written directly, and the writer thread will use CAS to retrieve datastateField value is set to1(indicates a write lock is added).
If the writer thread checksstateValue is not0At this pointpark()Suspends the current thread and makes it wait to be woken up. When a writer thread is suspended, the writer thread first callsstateThe second bit of the value is set to 1 (binary is10), which in decimal form is 2, indicating that there is a writer thread waiting to be woken up.

Conversely, if there is a writer thread in the red-black tree performing a write operation, what happens if a new reader thread requests?

The TreeBin object retains a linked list structure inside and is designed for this situation. The new reader thread is allowed to access the data on the linked list without going through the red-black tree.

14. After suspending a waiting writer thread, when does it wake up to continue writing?

In the previous question, we analyzed that when the reader thread reads data, it will use CAS modestate 值 + 4(indicates that the read lock is added.) After the data is read, use the CAS mode to change the CAS valuestate 值 4 -.
whenstateValue minus4After that, the reader thread will check firststateValue is2If it is2, it indicates that there is a writer thread waiting to be woken up in the pending waitunsafe.unpark()Method to wake up the writer thread.

15. Can you describe the lastRun mechanism in ConcurrentHashMap?

ConcurrentHashMap expansion? What exactly is lastRun?

Summary of the interview question is also quite time-consuming, the article will be updated from time to time, sometimes more than a day to update a few, if you help to review and consolidate the knowledge point, please also support three, the follow-up will be a hundred million points of update!

In order to help more young white from zero to advanced Java engineers, from CSDN official side to get a set of “Java Engineer learning growth knowledge Map”, size870mm x 560mm, can be folded into a book size, interested friends can have a look, of course, anyway, the blogger’s articles are always free ~