A few digressions

Improving system throughput under high concurrency is a goal shared by all backend developers. Doug Lea, the author of Java's concurrency utilities, offers some reference answers in the design of Java 7's ConcurrentHashMap. This article walks through the ConcurrentHashMap source code and summarizes ten details that affect concurrent performance, covering common techniques such as spin locks and CAS, as well as less common tricks such as delayed writes to main memory and deliberately relaxed volatile semantics. I hope they help you in your own designs.

Because ConcurrentHashMap covers a lot of ground, and the Java 7 and Java 8 implementations differ substantially, some details would inevitably be lost in a single article. I have therefore decided to cover the two versions in two separate articles. To help readers build a systematic understanding, all three articles (including the earlier one on HashMap) follow the same overall structure.


Reading the book thin

Gu Gu, an author of the Alibaba Java Development Manual, is a big fan of ConcurrentHashMap's design, saying that the ConcurrentHashMap source code is excellent material for learning Java coding conventions, and recommending that students reread it from time to time, because there is always something new to take away. Before unfolding all the secrets hidden in ConcurrentHashMap, you should first have this picture in your mind:

For Java 7, this diagram fully illustrates the ConcurrentHashMap architecture:

  1. ConcurrentHashMap is a thread-safe Map implementation whose reads require no locking. By introducing Segments, writes can lock just one slice of the data rather than the whole map
  2. Because of the Segment layer, ConcurrentHashMap needs two hashes for both reads and writes, but these two hashes buy finer-grained locks, which means higher concurrency
  3. Within each Segment's bucket array, key-value pairs are still stored in buckets as linked lists, which is consistent with a HashMap

Reading the book thick

The overall architecture of Java 7's ConcurrentHashMap can be summarized in the couple of sentences above, and it should quickly become an image that sticks in your mind. Here are a few questions to entice you to keep reading and dig deeper:

  1. Which operations of ConcurrentHashMap need to be locked?
  2. How are lockless reads of ConcurrentHashMap implemented?
  3. What are the challenges of calling the size() method to get the size of a ConcurrentHashMap in multi-threaded scenarios? How does ConcurrentHashMap solve them?
  4. How does resizing work when Segments are involved?

In the last article we summarized the four most important aspects of a HashMap: initialization, addressing data by hash, storing data with put, and resizing. These are still the most important operations for ConcurrentHashMap, but because of its more complex data structure, simply calling size() to read the size of the whole map becomes surprisingly challenging, so we will also focus on Doug Lea's design of the size() method.

Initialization

public ConcurrentHashMap(int initialCapacity, float loadFactor, int concurrencyLevel) {
    if (!(loadFactor > 0) || initialCapacity < 0 || concurrencyLevel <= 0)
        throw new IllegalArgumentException();
    if (concurrencyLevel > MAX_SEGMENTS)
        concurrencyLevel = MAX_SEGMENTS;
    // Find power-of-two sizes best matching arguments
    int sshift = 0;
    int ssize = 1;
    while (ssize < concurrencyLevel) {
        ++sshift;
        ssize <<= 1;
    }
    // The high bits of the hash select the segment; the low bits select
    // the element within the segment's bucket array
    this.segmentShift = 32 - sshift;
    this.segmentMask = ssize - 1;
    if (initialCapacity > MAXIMUM_CAPACITY)
        initialCapacity = MAXIMUM_CAPACITY;
    int c = initialCapacity / ssize;
    if (c * ssize < initialCapacity)
        ++c;
    int cap = MIN_SEGMENT_TABLE_CAPACITY;
    while (cap < c)
        cap <<= 1;
    Segment<K,V> s0 =
        new Segment<K,V>(loadFactor, (int)(cap * loadFactor),
                         (HashEntry<K,V>[])new HashEntry[cap]);
    Segment<K,V>[] ss = (Segment<K,V>[])new Segment[ssize];
    UNSAFE.putOrderedObject(ss, SBASE, s0); // ordered write of segments[0]
    this.segments = ss;
}

The initialization method does three important things:

  1. The size of the segments array (ssize) is determined from the concurrencyLevel parameter, rounded up to the smallest power of two greater than or equal to concurrencyLevel
  2. The hash offset (segmentShift) used when locating an element's Segment within the segments array is computed
  3. The capacity of each Segment is determined by dividing initialCapacity by ssize (rounding up to a power of two), and segments[0] is initialized eagerly; the remaining elements of the segments array are initialized lazily during later put operations, using segments[0] as a prototype
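The sizing arithmetic from those three steps can be sketched on its own. The class and method names below are my own, for illustration only; the constants mirror the JDK 7 defaults (MIN_SEGMENT_TABLE_CAPACITY is 2):

```java
// Illustrative sketch of the constructor's sizing math (not JDK source).
public class SegmentSizing {
    // Smallest power of two >= concurrencyLevel, as in the constructor loop.
    static int segmentCount(int concurrencyLevel) {
        int ssize = 1;
        while (ssize < concurrencyLevel)
            ssize <<= 1;
        return ssize;
    }

    // Number of bits consumed by the segment index (sshift in the source).
    static int segmentShiftBits(int concurrencyLevel) {
        int sshift = 0, ssize = 1;
        while (ssize < concurrencyLevel) {
            ++sshift;
            ssize <<= 1;
        }
        return sshift;
    }

    // Per-segment bucket capacity: ceil(initialCapacity / ssize),
    // rounded up to a power of two, with a floor of 2.
    static int segmentCapacity(int initialCapacity, int ssize) {
        int c = initialCapacity / ssize;
        if (c * ssize < initialCapacity)
            ++c;
        int cap = 2;
        while (cap < c)
            cap <<= 1;
        return cap;
    }

    public static void main(String[] args) {
        int level = 16, initial = 100;
        int ssize = segmentCount(level);
        System.out.println(ssize);                           // 16 segments
        System.out.println(32 - segmentShiftBits(level));    // segmentShift = 28
        System.out.println(segmentCapacity(initial, ssize)); // 8 buckets each
    }
}
```

With the default concurrencyLevel of 16, a requested capacity of 100 is spread as ceil(100/16) = 7 entries per segment, rounded up to 8 buckets.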
static final class HashEntry<K,V> {
    final int hash;
    final K key;
    volatile V value;
    volatile HashEntry<K,V> next;

    HashEntry(int hash, K key, V value, HashEntry<K,V> next) {
        this.hash = hash;
        this.key = key;
        this.value = value;
        this.next = next;
    }

    final void setNext(HashEntry<K,V> n) {
        UNSAFE.putOrderedObject(this, nextOffset, n);
    }
}

HashEntry is the unit in which ConcurrentHashMap stores key-value pairs, and even this small class hides some of the details we are after.

Detail one:

The member variables value and next of HashEntry are volatile, which means every thread promptly sees changes that other threads make to these fields, so the latest values of these references can be read without locking.

Detail two:

UNSAFE.putOrderedObject is called in HashEntry's setNext method. This interface belongs to the sun.misc.Unsafe class, which is not part of the public J2SE API, and its effect is roughly the opposite of volatile: calling it delays flushing the write of a volatile field to main memory. So when does the write reach main memory?

There is a clause in the JMM that states:

Before a thread may unlock a variable, it must first synchronize that variable back to main memory (via the store and write operations)

We'll look at setNext in more detail later when we discuss the put method.

Hashing

Because of the Segment layer, both get (read) and put (write) need to hash twice. Recall that the system did one important thing during initialization:

  • Determines the hash address offset used when determining the location of elements in the segment array

Namely this line:

this.segmentShift = 32 - sshift;


We subtract sshift from 32 because an int hash value has 32 bits. With segmentShift in hand, how does ConcurrentHashMap perform the first hash?

public V put(K key, V value) {
    Segment<K,V> s;
    if (value == null)
        throw new NullPointerException();
    int hash = hash(key);
    // the variable j is the index into the segments array
    int j = (hash >>> segmentShift) & segmentMask;
    if ((s = (Segment<K,V>)UNSAFE.getObject          // nonvolatile; recheck
         (segments, (j << SSHIFT) + SBASE)) == null) //  in ensureSegment
        // if segments[j] is null, ensureSegment initializes it
        s = ensureSegment(j);
    return s.put(key, hash, value, false);
}

Take the put method as an example: the variable j is the index into the segments array. If the segments array size is 2 to the power n, then hash >>> segmentShift shifts the high n bits of the key's hash into the low positions, and the & with segmentMask keeps exactly those n bits, yielding the segment index.

The hash method is similar to the non-thread-safe HashMap, which I won’t go into detail here.
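The two-level index computation can be sketched with fixed example parameters. This is illustrative code with my own names, assuming 16 segments (so sshift is 4) and a segment table of 8 buckets; it is not JDK source:

```java
// Sketch of the two-level index computation (illustrative, not JDK source).
public class TwoLevelHash {
    static final int SSHIFT_BITS = 4;                        // 16 segments -> sshift = 4
    static final int SEGMENT_SHIFT = 32 - SSHIFT_BITS;       // 28
    static final int SEGMENT_MASK = (1 << SSHIFT_BITS) - 1;  // 0b1111

    // First hash: the high 4 bits of the hash pick the segment.
    static int segmentIndex(int hash) {
        return (hash >>> SEGMENT_SHIFT) & SEGMENT_MASK;
    }

    // Second hash: the low bits pick the bucket inside the segment's table.
    static int bucketIndex(int hash, int tableLength) {
        return (tableLength - 1) & hash;
    }

    public static void main(String[] args) {
        int hash = 0xA1B2C3D4;
        System.out.println(segmentIndex(hash));   // high nibble 0xA = 10
        System.out.println(bucketIndex(hash, 8)); // low 3 bits of 0xD4 = 4
    }
}
```

Because both the segments array and each bucket array have power-of-two lengths, the mask-and-shift form replaces a modulo and the two index spaces never overlap: the segment comes from the top of the hash, the bucket from the bottom.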

Detail three:

The author uses CAS to avoid locking when lazily initializing the segments array, and CAS guarantees that only one thread's initialization wins. Before finally resorting to CAS, the code rechecks the slot twice: the first check avoids redundantly creating the HashEntry array, and the second avoids redundantly constructing the Segment object. Every line here was considered in detail by the author.

private Segment<K,V> ensureSegment(int k) {
    final Segment<K,V>[] ss = this.segments;
    long u = (k << SSHIFT) + SBASE; // raw offset: the actual byte offset of segments[k]
    Segment<K,V> seg;
    if ((seg = (Segment<K,V>)UNSAFE.getObjectVolatile(ss, u)) == null) {
        Segment<K,V> proto = ss[0]; // use segment 0 as prototype
        int cap = proto.table.length;
        float lf = proto.loadFactor;
        int threshold = (int)(cap * lf);
        HashEntry<K,V>[] tab = (HashEntry<K,V>[])new HashEntry[cap];
        if ((seg = (Segment<K,V>)UNSAFE.getObjectVolatile(ss, u))
            == null) { // recheck before constructing the Segment
            Segment<K,V> s = new Segment<K,V>(lf, threshold, tab);
            while ((seg = (Segment<K,V>)UNSAFE.getObjectVolatile(ss, u))
                   == null) {
                if (UNSAFE.compareAndSwapObject(ss, u, null, seg = s))
                    break; // CAS guarantees only one thread initializes the slot
            }
        }
    }
    return seg;
}
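The same check / recheck / CAS shape can be sketched with the public AtomicReferenceArray API in place of sun.misc.Unsafe. This is a minimal sketch with my own names, not JDK code:

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

// Sketch of ensureSegment's check / recheck / CAS pattern using the public
// AtomicReferenceArray API instead of sun.misc.Unsafe (illustrative only).
public class LazySlotInit {
    final AtomicReferenceArray<Object> slots = new AtomicReferenceArray<>(16);

    Object ensureSlot(int k) {
        Object seg = slots.get(k);           // volatile read, like getObjectVolatile
        if (seg == null) {
            Object candidate = new Object(); // built outside the CAS, like the Segment
            while ((seg = slots.get(k)) == null) {
                // CAS guarantees only one thread's candidate is installed
                if (slots.compareAndSet(k, null, candidate)) {
                    seg = candidate;
                    break;
                }
            }
        }
        return seg;
    }

    public static void main(String[] args) {
        LazySlotInit m = new LazySlotInit();
        Object first = m.ensureSlot(3);
        // A second call must return the already-installed instance, not a new one.
        System.out.println(first == m.ensureSlot(3)); // true
    }
}
```

The losing thread's candidate object is simply discarded; no lock is ever taken, yet every caller observes exactly one instance per slot.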

Put method

final V put(K key, int hash, V value, boolean onlyIfAbsent) {
    HashEntry<K,V> node = tryLock() ? null :
        scanAndLockForPut(key, hash, value);
    V oldValue;
    try {
        HashEntry<K,V>[] tab = table;
        int index = (tab.length - 1) & hash;
        HashEntry<K,V> first = entryAt(tab, index);
        for (HashEntry<K,V> e = first;;) {
            if (e != null) {
                K k;
                // if an entry with the same key is found, replace its value
                if ((k = e.key) == key ||
                    (e.hash == hash && key.equals(k))) {
                    oldValue = e.value;
                    if (!onlyIfAbsent) {
                        e.value = value;
                        ++modCount;
                    }
                    break;
                }
                e = e.next;
            }
            else {
                if (node != null)
                    // node is not null: it was already created while spinning
                    node.setNext(first);
                else
                    // otherwise, create the new HashEntry here
                    node = new HashEntry<K,V>(hash, key, value, first);
                int c = count + 1;
                if (c > threshold && tab.length < MAXIMUM_CAPACITY)
                    rehash(node);
                else
                    setEntryAt(tab, index, node);
                ++modCount;
                count = c;
                oldValue = null;
                break;
            }
        }
    } finally {
        unlock();
    }
    return oldValue;
}

This code is a highlight of ConcurrentHashMap's design. In just about 40 lines, Doug Lea performs several conjuring tricks in the blink of an eye; his grasp of concurrency is remarkable. Let's slowly unpack the magic Doug Lea uses in this code.

Detail four:

CPU time is precious. If a thread is suspended because it cannot obtain a lock, efficiency inevitably drops: the thread must later be rescheduled, and the context switch is a large overhead. When the other threads holding the lock do so only briefly, spinning is a good choice. But there is a trade-off: if other threads hold the lock for a long time, it is better to suspend the thread and block waiting.

private HashEntry<K,V> scanAndLockForPut(K key, int hash, V value) {
    HashEntry<K,V> first = entryForHash(this, hash);
    HashEntry<K,V> e = first;
    HashEntry<K,V> node = null;
    int retries = -1; // negative while locating node
    while (!tryLock()) { // spin while waiting for the lock
        HashEntry<K,V> f; // to recheck first below
        if (retries < 0) {
            if (e == null) { // the bucket holds no entry for this key
                if (node == null) // speculatively create the new node
                    node = new HashEntry<K,V>(hash, key, value, null);
                retries = 0;
            }
            else if (key.equals(e.key))
                retries = 0;
            else // keep traversing the list
                e = e.next;
        }
        else if (++retries > MAX_SCAN_RETRIES) {
            lock(); // too many spins: block until the lock is available
            break;
        }
        else if ((retries & 1) == 0 &&
                 (f = entryForHash(this, hash)) != first) {
            // the head node changed: reset the count and keep spinning
            e = first = f;
            retries = -1;
        }
    }
    return node;
}

ConcurrentHashMap's strategy is to spin up to MAX_SCAN_RETRIES times; if the lock still has not been obtained, it calls lock() to block and wait, and if another thread replaces the head of the list in the meantime, the spin count is reset.
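Assuming a plain ReentrantLock stands in for the Segment's lock, the spin-then-block strategy can be sketched as follows. MAX_SPINS and all other names here are my own; this is not JDK code:

```java
import java.util.concurrent.locks.ReentrantLock;

// Sketch of the "spin a bounded number of times, then block" strategy
// applied by scanAndLockForPut (illustrative; MAX_SPINS is my own constant).
public class SpinThenBlock {
    static final int MAX_SPINS = 64;
    final ReentrantLock lock = new ReentrantLock();

    // Returns the number of failed spins before the lock was acquired.
    int acquire() {
        int spins = 0;
        while (!lock.tryLock()) {   // cheap attempt: never suspends the thread
            if (++spins > MAX_SPINS) {
                lock.lock();        // give up spinning: park until available
                return spins;
            }
        }
        return spins;
    }

    public static void main(String[] args) {
        SpinThenBlock s = new SpinThenBlock();
        System.out.println(s.acquire()); // uncontended: 0 spins
        s.lock.unlock();
    }
}
```

tryLock never parks the calling thread, so the loop burns a little CPU in exchange for avoiding a context switch; lock() is the fallback that accepts the suspension cost once the spin budget is exhausted.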

Detail five:

Remember: if you want to increase a system's concurrency at the coding level, one golden rule is to shrink the critical section. A small detail in scanAndLockForPut's design that caught my eye is that it pre-creates the HashEntry while spinning. The benefit is that the thread does not need to construct the HashEntry after acquiring the lock, which shortens the lock hold time and improves performance.
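The rule can be sketched in miniature: do the expensive work (here, just an allocation) before taking the lock, so the lock only covers the pointer update. All names below are my own; this is an illustrative sketch, not JDK code:

```java
import java.util.concurrent.locks.ReentrantLock;

// Sketch of the "do the expensive work before taking the lock" rule that
// scanAndLockForPut follows by pre-building the HashEntry while spinning.
public class SmallCriticalSection {
    final ReentrantLock lock = new ReentrantLock();
    String head; // stands in for the bucket's head entry

    void putFirst(String value) {
        // The allocation happens outside the critical section...
        String node = "node(" + value + ")";
        lock.lock();
        try {
            // ...so the lock only covers the single field update.
            head = node;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        SmallCriticalSection s = new SmallCriticalSection();
        s.putFirst("a");
        System.out.println(s.head); // node(a)
    }
}
```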

Detail six:

At the beginning of the put method, there is this unremarkable line of code:

HashEntry<K,V>[] tab = table;


This may seem like a simple assignment to a temporary variable, but there’s a lot going on. Let’s look at the table declaration:

transient volatile HashEntry<K,V>[] table;


The table variable is volatile, and the CPU does the following for volatile variables:

Cache sniffing

Each processor sniffs the data propagated on the bus to check whether its cached values are stale. When a processor finds that the memory address backing one of its cache lines has been modified, it marks that cache line invalid, and the next time it operates on that data it re-reads it from system memory into its cache.

Reads and writes of such variables are therefore more expensive than for ordinary variables, so assigning table to an ordinary local variable at the start of the put method is intended to avoid paying the volatile cost repeatedly. This raises another question: doesn't this weaken the semantics of table, so that other threads can't read the latest value? Don't worry, let's keep reading.
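The trick of paying the volatile-read cost only once can be sketched like this. The names are my own, illustrative only:

```java
// Sketch of copying a volatile field into a local variable so that a batch
// of accesses pays the volatile-read cost only once (not JDK code).
public class VolatileLocalCopy {
    volatile int[] table = new int[8];

    int sumAll() {
        int[] tab = table; // one volatile read; the loop reads a plain local
        int sum = 0;
        for (int v : tab)
            sum += v;
        return sum;
    }

    public static void main(String[] args) {
        VolatileLocalCopy c = new VolatileLocalCopy();
        c.table[0] = 5;
        c.table[7] = 7;
        System.out.println(c.sumAll()); // 12
    }
}
```

Note the copy is safe here because the array reference only changes on resize; any reader that happens to see the old reference simply observes the map as it was an instant earlier.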

Details seven:

Notice this method in the put method: entryAt():

static final <K,V> HashEntry<K,V> entryAt(HashEntry<K,V>[] tab, int i) {
    return (tab == null) ? null : (HashEntry<K,V>) UNSAFE.getObjectVolatile(tab, ((long)i << TSHIFT) + TBASE);
}


Under the hood this calls UNSAFE.getObjectVolatile, which reads the array element with volatile semantics, just like reading a volatile variable, so it returns the latest value of that slot. Reading tab[i] directly might not see the most recent head node. The observant reader may wonder whether Doug Lea has gone in a circle here, and why he didn't just use volatile semantics everywhere in the first place. Let's move on.
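The key point is that declaring the array reference volatile does not make its elements volatile. The public analogue of UNSAFE.getObjectVolatile on an array slot is AtomicReferenceArray.get (or a VarHandle getVolatile since Java 9). A minimal comparison, with my own names and not JDK map code:

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

// The array *reference* being volatile does not make its *elements* volatile.
// AtomicReferenceArray provides volatile per-slot reads and writes.
public class VolatileElementRead {
    volatile String[] plain = new String[4];    // element reads are plain reads
    final AtomicReferenceArray<String> vol =
        new AtomicReferenceArray<>(4);          // element reads are volatile

    public static void main(String[] args) {
        VolatileElementRead r = new VolatileElementRead();
        r.plain[1] = "stale?";  // other threads may not observe this promptly
        r.vol.set(1, "fresh");  // volatile write: promptly visible to all threads
        System.out.println(r.vol.get(1)); // fresh
    }
}
```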

Detail eight:

In the implementation of the put method, if no entry in the list has an equal key, the new entry is inserted at the head of the list and written into the array with this method:

static final <K,V> void setEntryAt(HashEntry<K,V>[] tab, int i, HashEntry<K,V> e) {
    UNSAFE.putOrderedObject(tab, ((long)i << TSHIFT) + TBASE, e);
}


Data written through putOrderedObject is not immediately visible to other threads; it only becomes visible when unlock is called at the end of the put method. Recall the JMM rule quoted earlier:

Before a thread may unlock a variable, it must first synchronize that variable back to main memory (via the store and write operations)

The benefit is twofold. The first is performance: there is no need to flush to main memory inside the locked critical section, so the lock is held for a shorter time. The second is data consistency: the data added by a put is not visible to other threads until the put's finally block has run. This is the key to ConcurrentHashMap's lock-free reads.

To recap: when ConcurrentHashMap needs to read shared state, it calls UNSAFE.getObjectVolatile to guarantee it sees the latest value; when it writes inside the locked region, it calls UNSAFE.putOrderedObject to delay the flush to main memory until the end of the put method, which both shrinks the critical section for better performance and ensures that other threads only ever read complete data.
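The public counterpart of UNSAFE.putOrderedObject is lazySet on the atomic classes: an ordered store that is not guaranteed to be promptly visible to other threads, but becomes visible once the writer releases a lock. A minimal sketch with my own names, not JDK map code:

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

// lazySet is the public counterpart of UNSAFE.putOrderedObject: an ordered
// store whose flush to other threads may be deferred, but which is guaranteed
// visible once the writer subsequently releases a lock (illustrative sketch).
public class OrderedWrite {
    final AtomicReferenceArray<String> table = new AtomicReferenceArray<>(8);

    void publish(int i, String value) {
        table.lazySet(i, value); // cheap ordered write inside a critical section
    }

    public static void main(String[] args) {
        OrderedWrite w = new OrderedWrite();
        w.publish(2, "entry");
        // In the single-threaded case the value is of course readable at once.
        System.out.println(w.table.get(2)); // entry
    }
}
```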

Detail nine:

When put really does need to insert a new entry at the head of a list, note the statement ConcurrentHashMap uses:

node.setNext(first);


Let’s take a look at the implementation of setNext:

final void setNext(HashEntry<K,V> n) {
    UNSAFE.putOrderedObject(this, nextOffset, n);
}


Because the next field is volatile, UNSAFE.putOrderedObject here again relaxes the volatile write semantics. The first consideration is performance. The second is semantic consistency: the put method itself reads with UNSAFE.getObjectVolatile, so it can still observe the latest data, while for the get method it is precisely the intent that other threads cannot read a half-constructed entry before put finishes.

Resizing: rehash

private void rehash(HashEntry<K,V> node) {
    HashEntry<K,V>[] oldTable = table;
    int oldCapacity = oldTable.length;
    int newCapacity = oldCapacity << 1;
    threshold = (int)(newCapacity * loadFactor);
    HashEntry<K,V>[] newTable =
        (HashEntry<K,V>[]) new HashEntry[newCapacity];
    int sizeMask = newCapacity - 1;
    for (int i = 0; i < oldCapacity; i++) {
        HashEntry<K,V> e = oldTable[i];
        if (e != null) {
            HashEntry<K,V> next = e.next;
            int idx = e.hash & sizeMask;
            if (next == null) // single node on list
                newTable[idx] = e;
            else {
                HashEntry<K,V> lastRun = e;
                int lastIdx = idx;
                // find the last run of nodes that all map to the same new bucket
                for (HashEntry<K,V> last = next;
                     last != null;
                     last = last.next) {
                    int k = last.hash & sizeMask;
                    if (k != lastIdx) {
                        lastIdx = k;
                        lastRun = last;
                    }
                }
                newTable[lastIdx] = lastRun;
                // clone the remaining nodes into the new bucket array
                for (HashEntry<K,V> p = e; p != lastRun; p = p.next) {
                    V v = p.value;
                    int h = p.hash;
                    int k = h & sizeMask;
                    HashEntry<K,V> n = newTable[k];
                    newTable[k] = new HashEntry<K,V>(h, p.key, v, n);
                }
            }
        }
    }
    int nodeIndex = node.hash & sizeMask; // add the new node
    node.setNext(newTable[nodeIndex]);
    newTable[nodeIndex] = node;
    table = newTable;
}

If you read our earlier analysis of HashMap's rehash, this code will feel familiar. As we analyzed last time, when the bucket array length is a power of two, the elements of one bucket before resizing are redistributed into at most two buckets afterwards: one keeps the old index (call it the old bucket) and the other is offset by the old capacity. The first for loop finds lastRun, the tail portion of the list whose nodes all map to the same new bucket, so that this whole suffix can be moved as-is without cloning. The second for loop is simpler: it clones the remaining entries into the new bucket array. When all the work is done, newTable is assigned to table.
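The redistribution invariant can be checked with a tiny sketch. The class and names below are my own, assuming an old capacity of 8 doubling to 16:

```java
// Sketch verifying the redistribution invariant: with power-of-two tables,
// every entry of old bucket i lands in new bucket i or i + oldCapacity.
public class RehashSplit {
    static int newIndex(int hash, int newCapacity) {
        return hash & (newCapacity - 1);
    }

    public static void main(String[] args) {
        int newCap = 16;
        // Sample hashes that all map to old bucket 3 (hash & 7 == 3).
        int[] hashes = {3, 11, 19, 27, 35};
        for (int h : hashes)
            System.out.println(h + " -> " + newIndex(h, newCap)); // 3 or 11 only
    }
}
```

Because the new mask just exposes one extra hash bit, that bit alone decides which of the two destination buckets each entry falls into; this is also why long tail runs often share a destination and make the lastRun optimization pay off.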

There is no explicit locking in the rehash method, but that does not mean it needs no lock: rehash is only ever called from put, which already holds the Segment's lock.

The size method

When we analyzed HashMap we skipped the size method, because in a single-threaded environment it can be answered with a single global counter. The same approach could of course be used in a multi-threaded scenario, but then every read of that global counter would fall back into the very "lock everything" trap we are trying to escape. So how does ConcurrentHashMap solve this problem?

public int size() {
    final Segment<K,V>[] segments = this.segments;
    int size;
    boolean overflow; // true if size overflows 32 bits
    long sum;         // sum of modCounts
    long last = 0L;   // previous sum
    int retries = -1; // first iteration isn't retry
    try {
        for (;;) {
            if (retries++ == RETRIES_BEFORE_LOCK) {
                for (int j = 0; j < segments.length; ++j)
                    ensureSegment(j).lock(); // force creation
            }
            sum = 0L;
            size = 0;
            overflow = false;
            for (int j = 0; j < segments.length; ++j) {
                Segment<K,V> seg = segmentAt(segments, j);
                if (seg != null) {
                    sum += seg.modCount;
                    int c = seg.count;
                    if (c < 0 || (size += c) < 0)
                        overflow = true;
                }
            }
            if (sum == last)
                break;
            last = sum;
        }
    } finally {
        if (retries > RETRIES_BEFORE_LOCK) {
            for (int j = 0; j < segments.length; ++j)
                segmentAt(segments, j).unlock();
        }
    }
    return overflow ? Integer.MAX_VALUE : size;
}

When we walked through the put method, we quietly ignored the small modCount member variable. It counts write operations within a Segment, since writes are what change the size of the ConcurrentHashMap.

Because reading the size of ConcurrentHashMap requires the latest per-segment values, the UNSAFE.getObjectVolatile method is used. This performs worse than reading an ordinary variable, but far better than taking a global lock.

static final <K,V> Segment<K,V> segmentAt(Segment<K,V>[] ss, int j) {
    long u = (j << SSHIFT) + SBASE; // the actual byte offset of ss[j]
    return ss == null ? null :
        (Segment<K,V>) UNSAFE.getObjectVolatile(ss, u);
}

Detail ten:

ConcurrentHashMap computes size() by summing the count of every Segment, and it trusts the result only if no data was written to the map in the meantime: it sums the modCounts over two consecutive passes, and if the two sums match, nothing was written in between. If write operations are frequent, locking becomes mandatory, and since that locks every element of the segments array, it is effectively a global lock. How does it decide that the map is being written frequently? By the number of lock-free retries: once the retries exceed the threshold RETRIES_BEFORE_LOCK, it switches to the globally locked path.
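The retry-until-stable idea can be sketched over plain arrays standing in for the segments. All names below are my own, for illustration only:

```java
// Sketch of size()'s "retry until two consecutive modCount sums agree"
// strategy, over plain arrays standing in for the segments (illustrative).
public class StableSizeRead {
    static final int RETRIES_BEFORE_LOCK = 2;

    // counts[j] / modCounts[j] stand in for Segment.count / Segment.modCount.
    static int size(int[] counts, int[] modCounts) {
        long last = -1L;
        for (int retries = 0; retries <= RETRIES_BEFORE_LOCK; retries++) {
            long sum = 0L;
            int size = 0;
            for (int j = 0; j < counts.length; j++) {
                sum += modCounts[j];
                size += counts[j];
            }
            if (sum == last)   // no writes between two passes: trust the result
                return size;
            last = sum;
        }
        return -1; // the real code would now lock every segment and recount
    }

    public static void main(String[] args) {
        int[] counts = {2, 3, 5};
        int[] modCounts = {7, 1, 4};
        System.out.println(size(counts, modCounts)); // 10, stable on 2nd pass
    }
}
```

With no concurrent writers the modCount sum is identical on the second pass, so the lock-free result (here 2 + 3 + 5 = 10) is returned without touching a single lock.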

Conclusion

After looking at these details in ConcurrentHashMap, let’s try to answer the question posed at the beginning of this article:

  1. Which operations of ConcurrentHashMap need to be locked? A: Only write operations need locking; reads do not
  2. How are the lock-free reads of ConcurrentHashMap implemented? A: First, value and next in HashEntry are volatile; second, UNSAFE methods are used during writes to defer the flush to main memory until unlock, which guarantees data consistency
  3. What are the challenges of calling the size() method to get the size of a ConcurrentHashMap in multi-threaded scenarios? How does ConcurrentHashMap solve them? A: ConcurrentHashMap decides whether a lock-free size() result can be trusted by checking whether any writes occurred between two consecutive passes; if writes are frequent, it degrades to a globally locked read
  4. How does resizing work when Segments exist? A: The size of the segments array is fixed during initialization and never changes afterwards; resizing happens independently inside each Segment, whose bucket array doubles while that Segment's lock is held

The original link: www.toutiao.com/i6955033271…
