Follow me for more interesting interview questions.

Written in the book before

Hello, everyone, it’s nice to meet you again.

Some articles written before, by the female ticket said obscure, smelly and long. A careful thought is really so back to the thing, now use a mobile phone to see technical articles is not a convenient figure, if the source code as long, why not directly to see the source code or books?

Therefore, a sudden whim, can write a code to introduce a knowledge point? Hence the title of this article, Understanding ConcurrentHashMap without code. (A little bit of clickbait here, but try to keep it simple and easy to understand, no complex code)

ConcurrentHashMap contains a lot of content, so this article is divided into the initial part. The first part mainly introduces the underlying storage structure, initialization process, PUT () process, and get() process.

1. Why ConcurrentHashMap

Let’s start with the first question,

1. Why ‘ConcurrentHashMap’?

Know before HashMap classmates know is thread-safe, but the JDk itself also provides Hashtable and Collections. SynchronizedMap (HashMap), why Doug Lea (the great god really cow force, He wrote the whole JUC package.) And the underlying implementation of such a complex ConcurrentHashMap?

In fact, from the perspective of performance, these two schemes basically lock read and write operations. When one thread is reading and writing elements, other threads must wait. For example, when one thread is inserting data, other threads cannot read and write.

More precise: they are using a global lock to synchronize concurrent access between different threads, the same point in time, there can be only one thread holds a lock, that is to say, at the same point in time, there can be only one thread can access to the container, although it ensure the safety of multithreading between concurrent access, but also lead to access to the container into a serialization.

2. How was JDk1.7 implemented before?

Answering this question needs to be understood from two perspectives, structure and thread safety.

1. What is the data structure?

Similar to 1.7 of HashMap, the underlying storage structure is array + linked list.

2. How to ensure thread safety?

The mechanism of segmenting lock is adopted to realize concurrent update operation.

The underlying class mainly borrows two inner classes: Segment and HashEntry.

  1. SegmentinheritanceReentrantLockUsed to act as a lock, eachSegmentObject guards several buckets for each hash mapping table. That is to say byhashTableTo lock an element, to lock a batch of elements, in fact, this is the coarsening of lock optimization.
  2. Each bucket is made up of severalHashEntryA linked list of objects
  3. HashEntryKey/value pairs used to encapsulate the mapping table;

The entire storage structure is as follows: a ConcurrentHashMap instance contains an array of Segment objects.

ConcurrentHashMap1.7 Storage structure

Each time an element is added or acquired, it locks a segment to ensure that only one thread is working on the segment at the same time.

3. JDk1.8 underlying principles?

Again, there are two ways to answer this question.

  1. How the underlying data structure is optimized to make the performance better.
  2. How is thread-safe and do you continue to use segments?

1. Underlying data structure

ConcurrentHashMap is stored in 1.8 as an array + linked list + red-black tree. The storage structure is shown as follows:

ConcurrentHashMap18 Storage structure

2. How to implement thread safety?

The implementation of JDK1.8 has abandoned the Segment Segment locking mechanism, using CAS+Synchronized to ensure the security of concurrent updates, and HashMap similar to the underlying array + linked list + red-black tree storage structure.

Some people may wonder why Synchronized is still used in 1.8. First of all, there are two points:

  1. Synchronizedin1.8It has been greatly optimized and its performance has been greatly improved.
  2. SynchronizedIt’s only used to lock the first node of a linked list or red-black tree, as long as there’s noneHashConflict, no concurrency problems, efficiency is increased by a factor of N.

4. ConcurrentHashMap Initialization process

Unlike other containers, ConcurrentHashMap is not initialized through the constructor. Instead, the constructor calculates how much capacity is required to initialize and assigns it to the global variable sizeCtl.

Why add a global variable sizeCtl with 'volatile'modified global variables, this also needs to start from the meaning of the keyword ** different values represent different meanings **. Negative value: Initializing or expanding. Among them -1: initializing. -n: Initializing or expanding N threads0: indicates that the hash table has not been initialized. The value indicates the size to be initialized or expanded next timeCopy the code

So the first time, you’re just telling the system, if you’re going to put an apple, you’re going to buy a plate with a radius of 10 centimeters.

Although the actual initialization procedure is postpositioned, it does not prevent us from analyzing the initialization process first.

The initialization process is as follows:

PNG indicates the TAB initialization process

SizeCtl defaults to 0 and will be a power of 2 if the instantiation argument is passed. Therefore, the Thread that performs the first PUT changes sizeCtl to -1 through CAS, and ensures that only one Thread can make the sizeCtl change. The other threads use Thread.yield() to give up CPU time slices to wait for table initialization to complete.

5. Add element (put()) ideas

The main steps are as follows:

(1) Compute the hash value corresponding to the key, which is similar to the high-order hash value in a HashMap

(2) Check whether the TAB is empty. If so, initialize the TAB

(3) Obtain the element f corresponding to the index at TAB. The underlying tabAt() method obtains the value through u.getobjectVolatile () and determines whether it is null.

1) If f is null, it indicates the first insert in the table

2) use Unsafe.com pareAndSwapObject method insert Node Node, if set success, whether you need go to expansion.

(4) If the hash value of f is -1, it indicates that f is a ForwardingNode node, indicating that other threads are expanding capacity. Expand capacity together

(5) This step indicates that a hash conflict has occurred, and the new Node is inserted into the appropriate position in a linked list or red-black tree way. This process uses the synchronous built-in lock to achieve concurrency

1) Lock the header

2) First use tabAt(TAB, I) == f to determine whether it is the same element to prevent f from being modified by other threads at the same time

3) if f hash >= 0, it indicates that f is the head node of the linked list structure. If f hash is found, change value; Otherwise add nodes to the end of the list

4) In the previous step, if the hash is the same and the key is the same, the value can be overwritten directly and the loop can be directly broken out

5) If the previous step is not satisfied, the nodes will be traversed in the future

(6) If the node is a tree node, insert the value as a tree

(7) If the number of nodes in the list is binCount >= TREEIFY_THRESHOLD(default is 8), then the list is transformed into a red-black tree structure

(8) Run addCount() to modify the number of elements and determine whether to expand

The overall put() method flow chart is as follows

Ps: You can compare the source code with this figure

Put () process

6. The get () process

The get() method is much simpler than the put() method. The main process is as follows:

ConcurrentHashMap the get () process

conclusion

ConcurrentHashMap is an implementation of a diverging column map table that has the greatest advantage of allowing fully concurrent reads while still performing well. SizeCtl, the global variable, is used to control container state, and CAS and synchronized are used to ensure thread safety.

Next time, I will introduce the calculation method of container size and the expansion mechanism of ConcurrentHashMap.

One day

I am Rhubarb, a programmer who only writes HelloWorld and intends to take apart every interview question in my normal time. Interview three minutes, usually do not relax.

I’ll see you next time.

Reference: https://programmer.ink/think/analysis-of-concurrent-hashmap-principle.html