I recently read some of the Java Map source code and was amazed at how well the JDK developers work. Their ingenious design ideas are well worth learning from and can fairly be described as best practice. However, most popular articles on the principles of Java Map focus on isolated “dots” rather than “lines”, let alone “networks”. Based on my own understanding, this article therefore classifies and summarizes the source code I have read around several core features of Map: automatic capacity expansion, initialization and lazy loading, hashing, bit operations, and concurrency, all explained in depth together with the source code. Hopefully you will learn something from it (this article uses HashMap from JDK 1.8 by default).

One Automatic capacity expansion

Minimum-usage principle: once the number of elements exceeds a certain threshold, the capacity is expanded automatically.

Expansion is implemented by the resize method and is triggered at the end of putVal: after the element has been written, the map checks whether an expansion is needed. When the element count after the insertion is greater than the calculated threshold, resize is executed.

The capacity is expanded with the bit operation << 1, i.e., it is doubled, and the new threshold newThr is likewise double the old threshold.
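The relevant lines in the JDK 1.8 source read roughly as follows (condensed excerpts from putVal and resize, shown here only as an illustration):

// In putVal: the expansion check happens after the element has been written.
if (++size > threshold)
    resize();

// In resize: capacity and threshold are both doubled with a left shift.
else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
         oldCap >= DEFAULT_INITIAL_CAPACITY)
    newThr = oldThr << 1; // double threshold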

During expansion there are three cases in total:

If a position in the hash bucket array holds only one element, i.e., there is no hash conflict, the element is copied directly to the corresponding position in the new hash bucket array.

When the node at a position in the hash bucket array is a tree node, the red-black tree expansion handling (splitting the tree) is performed.

When the node at a position in the hash bucket array is an ordinary node, the linked-list expansion handling is performed. In JDK 1.8, to avoid the dead-chain problem caused by concurrent expansion in earlier versions, high and low linked lists were introduced to assist the expansion.

In daily development you will come across bad cases such as:

HashMap hashMap = new HashMap(2);
hashMap.put("1", 1);
hashMap.put("2", 2);
hashMap.put("3", 3);

By the time the last element, 3, is put into the HashMap, the number of elements has exceeded the resize threshold of 2 * 0.75 = 1.5, so a resize is triggered. Code like this therefore resizes every time it runs, which is inefficient. In daily development it is important to estimate the size of a HashMap properly and make sure the resize threshold is greater than the number of elements to be stored, reducing the number of resizes.
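A common way to do this, shown below as a hedged sketch (the helper name is hypothetical; the formula mirrors what utility libraries such as Guava use for expected sizes), is to derive the initial capacity from the expected element count and the load factor:

import java.util.HashMap;
import java.util.Map;

public class PresizedMapExample {

    // Hypothetical helper: choose an initial capacity whose threshold
    // (capacity * 0.75) is not smaller than the expected element count.
    static <K, V> HashMap<K, V> newHashMapWithExpectedSize(int expectedSize) {
        int initialCapacity = (int) (expectedSize / 0.75f) + 1;
        return new HashMap<>(initialCapacity);
    }

    public static void main(String[] args) {
        // Room for 3 entries without triggering a resize while filling the map.
        Map<String, Integer> map = newHashMapWithExpectedSize(3);
        map.put("1", 1);
        map.put("2", 2);
        map.put("3", 3);
    }
}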

Two Initialization and lazy loading

Only the default load factor is set at construction time; no other initialization work is done. The real initialization happens on first use.

When a new HashMap is created, the hash bucket array is not initialized immediately. Instead, it is initialized by the resize() method the first time an element is put.

In resize(), the default initial capacity DEFAULT_INITIAL_CAPACITY is 16 and the resize threshold is 0.75 * 16 = 12, i.e., a resize will be performed once the hash bucket array holds 12 elements.

Finally, a Node array with a capacity of 16 is created and assigned to the member variable table (the hash bucket array), which completes the initialization of the HashMap.
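The corresponding branch of resize() in JDK 1.8 reads roughly as follows (excerpt; unrelated lines omitted):

else {               // zero initial threshold signifies using defaults
    newCap = DEFAULT_INITIAL_CAPACITY;
    newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
}
// ...
threshold = newThr;
Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
table = newTab;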

Three Hashing

The fact that the hash table is named after hashing is a good indication of how important hash calculation is to this data structure. In the implementation, however, the JDK does not use the hashCode returned by Object's native method directly as the final hash value, but processes it a second time.

Below are the methods HashMap and ConcurrentHashMap use to calculate the hash value. The core logic is the same: the key's hashCode is XORed with that hashCode shifted right by 16 bits. This XOR of the high 16 bits with the low 16 bits is called the perturbation function; its purpose is to mix the features of the high bits into the low bits and reduce the probability of hash conflicts.
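In the JDK 1.8 source the two methods read roughly as follows:

// HashMap
static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}

// ConcurrentHashMap
static final int spread(int h) {
    return (h ^ (h >>> 16)) & HASH_BITS;
}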

Here’s an example to understand what the perturbation function does:

hashCode(key1) = 0000 0000 0000 1111 0000 0000 0000 0010

hashCode(key2) = 0000 0000 0000 0000 0000 0000 0000 0010

If the HashMap has a capacity of 4, the hashCodes of key1 and key2 are bound to conflict without the perturbation function (their last two bits are the same, both 10).

After the perturbation function is applied, the last two bits of key1's hash differ from those of key2's, so the conflict above is avoided:

hashCode(key1) ^ (hashCode(key1) >>> 16)

0000 0000 0000 1111 0000 0000 0000 1101

hashCode(key2) ^ (hashCode(key2) >>> 16)

0000 0000 0000 0000 0000 0000 0000 0010

This gain grows as the HashMap capacity decreases. The article “An Introduction to Optimising a Hashing Strategy” randomly selected 352 strings with different hash values; with a HashMap capacity of 2^9, using the perturbation function reduced collisions by about 10%, which shows how necessary the perturbation function is.

In addition, after the perturbation function is applied in ConcurrentHashMap, the result is ANDed with HASH_BITS. HASH_BITS is 0x7fffffff, i.e., only the highest bit is 0, so the resulting hashCode is always non-negative. In ConcurrentHashMap, several special nodes, such as MOVED, TREEBIN and RESERVED, have predefined negative hash values; restricting the hashCode of ordinary nodes to non-negative values prevents conflicts with the hash values of these special nodes.
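The corresponding constants in the JDK 1.8 ConcurrentHashMap are:

static final int MOVED     = -1; // hash for forwarding nodes
static final int TREEBIN   = -2; // hash for roots of trees
static final int RESERVED  = -3; // hash for transient reservations
static final int HASH_BITS = 0x7fffffff; // usable bits of normal node hash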

1 Hash Conflicts

A hash operation maps arbitrary input values into a specified range, and with it comes the problem of hash conflicts. In the extreme case where all input elements are hashed into the same bucket, queries are no longer O(1) but O(n), equivalent to sequentially traversing a linear list. Hash conflicts are therefore one of the important factors affecting hashing performance. How can they be dealt with? There are two aspects: avoiding conflicts in the first place, and, when conflicts do occur, resolving them sensibly so that queries stay as efficient as possible. The former was introduced in the previous section: the perturbation function increases the randomness of the hashCode to avoid collisions. For the latter, HashMap provides two solutions: chaining with a linked list and the red-black tree.

Chaining with a linked list

Before JDK 1.8, HashMap resolved conflicts purely by chaining: when the bucket for the computed hash already contains elements but the keys differ, a linked list is formed in that bucket and the new element is linked in front of the existing ones. When querying a bucket that has conflicts, the elements on the conflict chain are iterated sequentially. The logic for deciding whether two keys are the same is shown below: first check whether the hash values are equal, then compare whether the key references or key values are equal.
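A minimal sketch of that check (the helper name sameKey is hypothetical; the condition mirrors the one used in JDK 1.8's getNode/putVal):

// Returns true when the node's key is considered the same key:
// equal hash first, then reference equality or equals().
static boolean sameKey(int hash, Object key, int nodeHash, Object nodeKey) {
    return nodeHash == hash &&
           (nodeKey == key || (key != null && key.equals(nodeKey)));
}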

(1) Dead chain

Prior to JDK 1.8, HashMap had a bug during concurrent resizing that produced a dead chain (a cycle in the linked list), which caused an endless loop when getting an element at that position and drove CPU utilization up. This also shows that HashMap is not suitable for high-concurrency scenarios, where ConcurrentHashMap from JUC should be preferred. Rather than sidestep the problem, however, the JDK developers, striving for excellence, chose to face it head-on and fix it: JDK 1.8 introduced the high and low linked lists (two lists, each tracked with a head and a tail pointer).

What are the high and low linked lists? During expansion the hash bucket array doubles in size. Taking a HashMap with a capacity of 8 as an example, the capacity is expanded from 8 to 16; positions [0, 7] are called the low positions and [8, 15] the high positions. The low list is tracked with loHead and loTail, the high list with hiHead and hiTail.

Resizing iterates over each element in the old bucket array:

If the element has no conflict, its index is recomputed against the new capacity and it is copied to the corresponding position in the new bucket array.

If there are conflicting elements, the high and low linked lists are used. e.hash & oldCap determines whether an element lands in the high or the low half after rehashing. For example, if the current element's hashCode is 0001 (higher bits ignored), the result is 0, meaning its index stays the same after expansion: 0001 & 1000 = 0000, so it remains in the low range [0, 7] and is linked onto the low list. If the current element's hashCode is 1001, the result is non-zero: 1001 & 1000 = 1000, so after expansion it falls into the high range, and its new index is exactly the old index (1) + the old array length (8) = 9; such elements are linked onto the high list. Finally, the head nodes of the high and low lists are placed at their designated positions in newTab, which completes the expansion. This implementation reduces the frequency of access to the shared resource newTab: the conflicting nodes are organized first and only then put into their positions in newTab. Before JDK 1.8, each element was placed into newTab one at a time as it was traversed, which is what caused the dead-chain problem under concurrent resizing; the high-low list approach avoids it, as the excerpt below shows.
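The corresponding part of resize() in JDK 1.8 reads roughly as follows (excerpt, comments added):

Node<K,V> loHead = null, loTail = null;   // list for indexes that stay put
Node<K,V> hiHead = null, hiTail = null;   // list for indexes moved by oldCap
Node<K,V> next;
do {
    next = e.next;
    if ((e.hash & oldCap) == 0) {         // bit is 0: index unchanged
        if (loTail == null) loHead = e;
        else                loTail.next = e;
        loTail = e;
    } else {                              // bit is 1: index = old index + oldCap
        if (hiTail == null) hiHead = e;
        else                hiTail.next = e;
        hiTail = e;
    }
} while ((e = next) != null);
if (loTail != null) { loTail.next = null; newTab[j] = loHead; }
if (hiTail != null) { hiTail.next = null; newTab[j + oldCap] = hiHead; }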

Red-black tree

In JDK 1.8, HashMap introduced the red-black tree to handle hash conflicts on top of chaining. Why a red-black tree rather than just a linked list? Although a linked list's insert performance is O(1), its query performance is O(n), which is unacceptable when there are too many conflicting elements. Therefore, in JDK 1.8, when the number of elements on a conflict chain reaches 8 and the length of the hash bucket array is at least 64, the red-black tree is used instead of the linked list to resolve the conflict, and the nodes are encapsulated as TreeNode instead of Node (TreeNode actually inherits from Node, taking advantage of polymorphism). This brings query performance to O(log n).
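These thresholds correspond to constants in the JDK 1.8 HashMap source:

static final int TREEIFY_THRESHOLD = 8;     // treeify a bin at this chain length
static final int UNTREEIFY_THRESHOLD = 6;   // convert back to a list when it shrinks to this
static final int MIN_TREEIFY_CAPACITY = 64; // minimum table size before treeifying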

A brief review of the red-black tree: it is a self-balancing binary search tree, as is the AVL tree. The core difference between the two is that the AVL tree is “perfectly balanced”, so inserting and deleting nodes can cost more than in a red-black tree, but queries perform better, which suits read-heavy, write-light scenarios. In a HashMap, however, reads and writes are roughly evenly matched, so choosing the red-black tree is a trade-off between read and write performance.

Four Bit operations

1 Determining the size of the hash bucket array

Find the smallest integer power of 2 greater than or equal to the given value.

tableSizeFor calculates the final size of the hash bucket array from the input cap, finding the smallest power of 2 greater than or equal to cap. At first glance this line-by-line bit manipulation is confusing; it is easier to explore its mystery by looking for a pattern.
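For reference, the JDK 1.8 implementation is:

static final int tableSizeFor(int cap) {
    int n = cap - 1;
    n |= n >>> 1;
    n |= n >>> 2;
    n |= n >>> 4;
    n |= n >>> 8;
    n |= n >>> 16;
    return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
}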

When cap is 3, the calculation process is as follows:

cap = 3, n = cap - 1 = 2
n |= n >>> 1    010 | 001 = 011    n = 3
n |= n >>> 2    011 | 000 = 011    n = 3
n |= n >>> 4    011 | 000 = 011    n = 3
...
n = n + 1 = 4

When cap is 5, the calculation process is as follows:

cap = 5, n = cap - 1 = 4
n |= n >>> 1    100 | 010 = 110    n = 6
n |= n >>> 2    110 | 001 = 111    n = 7
...
n = n + 1 = 8

So the point of the calculation is to find the smallest power of 2 greater than or equal to cap. The whole process locates the highest 1 bit in cap's binary representation and then copies that 1 downward to all the lower bits, doubling the step each time (shifting by 1, 2, 4, 8 and 16 in turn), so that every bit below the highest 1 becomes 1; finally 1 is added, which carries over into the next higher bit and yields a power of 2.

The change in the binary bits looks like this:

0100 1010  (original value)
0111 1111  (all bits below the highest 1 set to 1)
1000 0000  (+1, carrying into the next higher bit)

Taking the smallest power of 2 not less than the input cap as the final capacity can be understood as the minimum-usage principle: occupy as little space as possible. But why must the capacity be a power of 2? The answer is calculation and storage efficiency: the hash value of each element must land precisely within the given range of the hash bucket array. The algorithm used to determine the array index is hash & (n - 1), where n is the size of the hash bucket array. Since n is always a power of 2, n - 1 in binary always has the form 0000111111, i.e., consecutive 1s starting from the lowest bit, and an AND operation against it can map a hash onto any value in [0, n - 1]. For example, when n = 8, n - 1 is 0111 in binary; ANDing anything with 0111 lands in [0, 7], i.e., in one of the 8 given hash buckets, and storage utilization is 100%. As a counterexample, when n = 7, n - 1 is 0110 in binary; ANDing anything with 0110 can only produce 0, 2, 4 or 6 rather than the whole range [0, 6]. Buckets 1, 3 and 5 can never be used, so storage utilization drops below 60% and the probability of hash collisions rises.

2 ASHIFT offset calculation

Get the position of the most significant 1 bit of a given value. (Besides multiplication and division, shifts can also be used to keep only the high or the low bits: a right shift preserves the high bits, a left shift preserves the low bits.)

ABASE + ASHIFT in ConcurrentHashMap is used to calculate the actual memory address of an element of the hash bucket array. ASHIFT is 31 minus the number of leading zeros of scale, i.e., the position of scale's highest 1 bit, where scale is the size in bytes of each element of the hash bucket array Node[]. With ((long) i << ASHIFT) + ABASE, the starting memory address of the i-th element of the array can be obtained.
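The initialization in the JDK 1.8 ConcurrentHashMap static block reads roughly as follows (excerpt):

Class<?> ak = Node[].class;
ABASE = U.arrayBaseOffset(ak);
int scale = U.arrayIndexScale(ak);
if ((scale & (scale - 1)) != 0)
    throw new Error("data type scale not a power of two");
ASHIFT = 31 - Integer.numberOfLeadingZeros(scale);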

Let's look at how numberOfLeadingZeros, a static method of Integer, computes this, again by looking for a pattern.
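The JDK 1.8 implementation of Integer.numberOfLeadingZeros is:

public static int numberOfLeadingZeros(int i) {
    if (i == 0)
        return 32;
    int n = 1;
    if (i >>> 16 == 0) { n += 16; i <<= 16; }
    if (i >>> 24 == 0) { n +=  8; i <<=  8; }
    if (i >>> 28 == 0) { n +=  4; i <<=  4; }
    if (i >>> 30 == 0) { n +=  2; i <<=  2; }
    n -= i >>> 31;
    return n;
}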

Suppose i = 0000 0000 0000 0100 0000 0000 0000 0000, and n = 1.

i >>> 16 = 0000 0000 0000 0000 0000 0000 0000 0100, which is not 0

i >>> 24 = 0000 0000 0000 0000 0000 0000 0000 0000, which equals 0

A right shift of 24 bits giving 0 means that bits 24 through 31 are all 0, so n = 1 + 8 = 9. Since the high 8 bits are all 0 and that information has been recorded in n, they can be discarded, i.e., i <<= 8. At this point,

i = 0000 0100 0000 0000 0000 0000 0000 0000

Similarly, i >>> 28 is also equal to 0, which means the current top 4 bits (bits 28 through 31) are all 0, so n = 9 + 4 = 13, and the high 4 bits are discarded. At this point,

i = 0100 0000 0000 0000 0000 0000 0000 0000

Continuing,

i >>> 30 = 0000 0000 0000 0000 0000 0000 0000 0001, which is not 0
i >>> 31 = 0000 0000 0000 0000 0000 0000 0000 0000, which equals 0

So n equals 13, i.e., there are 13 leading zeros. The final n -= i >>> 31 checks whether the highest bit, bit 31, is 1: since n is initialized to 1, if the highest bit is 1 there are no leading zeros at all, and n = n - 1 = 0.

To sum up, the calculation uses the idea of binary search to locate the highest 1 bit in the binary representation. First check the high 16 bits: if they are 0, the highest 1 is in the low 16 bits; otherwise it is in the high 16 bits. This narrows the search range from 32 (31) bits down to 16 bits; the range is then halved again to check the high or low 8 bits, and so on.

The numbers of bits checked during the calculation are 16, 8, 4, 2 and 1, which add up to exactly 31. Why 31 rather than 32? Because i can only have 32 leading zeros when it is 0, and that case was already filtered out by the initial if condition. When the value is not zero, leading zeros can therefore only appear in the high 31 bits, and only those 31 bits need to be checked. Finally, the position of the most significant 1 bit is obtained by subtracting the number of leading zeros from the total number of bits. The code uses 31 - Integer.numberOfLeadingZeros(scale) rather than the total of 32 so that ((long) i << ASHIFT) + ABASE yields exactly the starting memory address of the i-th element of the hash bucket array, which is convenient for CAS and similar operations.

Five Concurrency

1 Pessimistic locking

Full table lock

Hashtable uses a full-table lock, which means all operations are locked and executed serially; the put method below, for example, is modified with the synchronized keyword. While this is thread-safe, it has a significant impact on computing performance in the era of multi-core processors, which has pushed Hashtable further and further off developers' radar.
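As a hedged excerpt of the JDK source (method body omitted), the method-level lock looks like this:

// java.util.Hashtable: every public operation is synchronized on the whole table.
public synchronized V put(K key, V value) {
    // ... the entire insertion runs while holding the table-wide lock ...
}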

Segmented lock

To address the overly coarse lock granularity of Hashtable, ConcurrentHashMap introduced a segmented locking mechanism prior to JDK 1.8. The overall storage structure splits the table into multiple Segments, and the original entry array (the hash bucket array discussed above) is mounted under each Segment. Each operation only needs to lock the Segment the element belongs to instead of the whole table, so the locking scope is smaller and concurrency improves.

2 Optimistic locking

Synchronized+CAS

Although segmented locking guaranteed thread safety and eased the performance problem caused by coarse lock granularity, it was not the ceiling for engineers pursuing ultimate performance. In JDK 1.8, ConcurrentHashMap therefore abandons segmented locking in favor of an optimistic locking implementation. The reasons for giving up segmented locking are as follows:

Using Segments increases the storage overhead of ConcurrentHashMap.

Concurrency performance deteriorates dramatically when a single segment becomes too large.

The JDK 1.8 implementation of ConcurrentHashMap abandons the Segment structure and uses a Node array similar to the one in HashMap.

Optimistic locking in ConcurrentHashMap is implemented with synchronized + CAS. Let's look at the put code.

When the bucket the put element maps to in the hash bucket array is empty, the element is written directly with CAS.
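The corresponding branch of putVal in JDK 1.8 reads roughly as follows (excerpt):

else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {
    if (casTabAt(tab, i, null,
                 new Node<K,V>(hash, key, value, null)))
        break;                   // no lock when adding to empty bin
}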

Two important operations are involved here, tabAt and casTabAt, and as you can see they rely on the Unsafe class. Unsafe is rarely seen in everyday development. Our general perception of Java is that it is safe: all operations go through the JVM and stay within safe, controlled limits. The Unsafe class breaks that boundary, giving Java the C-like ability to manipulate arbitrary memory addresses, which is a double-edged sword. The ASHIFT mentioned earlier is used to compute the starting memory address of the specified element, and the read and the CAS are performed via getObjectVolatile and compareAndSwapObject respectively.
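In the JDK 1.8 source the two methods read roughly as follows:

static final <K,V> Node<K,V> tabAt(Node<K,V>[] tab, int i) {
    return (Node<K,V>)U.getObjectVolatile(tab, ((long)i << ASHIFT) + ABASE);
}

static final <K,V> boolean casTabAt(Node<K,V>[] tab, int i,
                                    Node<K,V> c, Node<K,V> v) {
    return U.compareAndSwapObject(tab, ((long)i << ASHIFT) + ABASE, c, v);
}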

Why use getObjectVolatile rather than directly reading the element at the specified position of the bucket array? In the JVM memory model, each thread has its own working memory, i.e., the local variable table in its stack, which holds copies of values from main memory. As a result, thread 1 may update a shared resource and write it back to main memory while thread 2's working memory still holds the old value, producing dirty data. volatile in Java solves this problem and guarantees visibility: when any thread updates a variable modified with the volatile keyword, the copies of that variable held by other threads are invalidated and the latest value must be read from main memory. The Node array in ConcurrentHashMap is declared volatile to ensure visibility, but the elements inside the array are not. The Unsafe method is therefore used to read the data directly from main memory and make sure it is up to date.

Moving on with the logic of the put method: when the bucket the element maps to already contains elements and the map is not in the middle of an expansion, synchronized locks the first element f (the head node) at position i of the hash bucket array, followed by a double check, similar to the idea of the DCL singleton pattern. After the check passes, the elements on the current conflict chain are traversed and the appropriate position is chosen for the put. In addition, ConcurrentHashMap uses the same scheme as HashMap for resolving hash conflicts: linked list + red-black tree. Note that synchronized locks the head node only when a hash conflict occurs. It is in fact a finer-grained locking implementation than segmented locking, locking only a single hash bucket in this specific scenario and thus reducing the scope of the lock's influence.

The evolution of Java's Map solutions for concurrent scenarios can be summarized as moving from pessimistic to optimistic locking and from coarse-grained to fine-grained locking, which can also serve as a guideline for our daily concurrent programming.

3 Concurrent summation

CounterCell is a great tool for concurrent summation introduced in JDK 1.8; before that, the strategy was “try lock-free summation first” plus “lock and retry on conflict”. CounterCell's comment notes that it is adapted from LongAdder and Striped64. The summation takes baseCount as the initial value, traverses each cell in the counterCells array, and adds every cell's value to the total.

As a side note, the @sun.misc.Contended annotation was introduced in Java 8 to address false sharing of cache lines. What is false sharing? Briefly: because of the huge speed gap between the CPU and main memory, CPUs have multiple levels of cache (L1, L2 and L3). The storage unit of a cache is the cache line, whose size is a power of two between 32 and 256 bytes, most commonly 64 bytes. Variables smaller than 64 bytes may therefore share the same cache line, and invalidating one variable invalidates the other variables in that line, degrading performance; this is the false sharing problem. Since cache line sizes differ between CPUs, Java 8 uses this annotation to hide the differences and pad according to the actual cache line size, so that the annotated variable occupies a cache line by itself.
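The relevant JDK 1.8 source (the CounterCell class and the summation) reads roughly as follows:

// A padded cell for distributing counts, adapted from LongAdder and Striped64.
@sun.misc.Contended static final class CounterCell {
    volatile long value;
    CounterCell(long x) { value = x; }
}

final long sumCount() {
    CounterCell[] as = counterCells; CounterCell a;
    long sum = baseCount;             // start from the base count
    if (as != null) {
        for (int i = 0; i < as.length; ++i) {
            if ((a = as[i]) != null)
                sum += a.value;       // add each cell's value
        }
    }
    return sum;
}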

Counting is implemented by calling addCount whenever the number of elements in the map changes. The core logic is as follows:

When counterCells is not empty, or counterCells is empty but the CAS on baseCount fails, execution enters the subsequent counting logic; otherwise the CAS on baseCount succeeds and the method returns directly.

In the subsequent counting logic, the core counting method fullAddCount is called, but only when one of the following four conditions holds: 1. counterCells is null; 2. the length of counterCells is 0; 3. the counterCell at the corresponding position of counterCells is null; 4. the CAS update of the counterCell at the corresponding position fails. The semantics behind these conditions is that a concurrency conflict on the count has occurred or is occurring, and it should preferentially be resolved through CounterCell rather than by spinning on baseCount. If counterCells and the corresponding element are already initialized (condition 4), a CAS update is attempted first, and fullAddCount is called only if it fails. If counterCells or the corresponding element has not been initialized (conditions 1, 2, 3), fullAddCount is likewise called for the subsequent processing.
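The corresponding part of addCount in JDK 1.8 reads roughly as follows (excerpt):

private final void addCount(long x, int check) {
    CounterCell[] as; long b, s;
    if ((as = counterCells) != null ||
        !U.compareAndSwapLong(this, BASECOUNT, b = baseCount, s = b + x)) {
        CounterCell a; long v; int m;
        boolean uncontended = true;
        if (as == null || (m = as.length - 1) < 0 ||              // conditions 1 and 2
            (a = as[ThreadLocalRandom.getProbe() & m]) == null || // condition 3
            !(uncontended =
              U.compareAndSwapLong(a, CELLVALUE, v = a.value, v + x))) { // condition 4
            fullAddCount(x, uncontended);
            return;
        }
        if (check <= 1)
            return;
        s = sumCount();
    }
    // ... the rest decides whether a resize is needed ...
}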

ThreadLocalRandom.getProbe() is used here as the hash value to determine the cell index; it returns the value of the threadLocalRandomProbe field of the current Thread. When a hash conflict occurs, the advanceProbe method can be used to change the hash value. This is different from the hash logic in HashMap, where the hash value of the same key must stay the same across calculations so that the corresponding value can be located; even if two keys' hash values conflict, the hash cannot simply be changed, and the conflict can only be handled with a linked list or a red-black tree. In the counting scenario, however, there is no key-value relationship to maintain; we only need to find a suitable slot in counterCells to put the counting cell, and which slot it lands in has no effect on the final sum. So when a conflict occurs, we can simply switch to another hash value based on a random strategy to avoid the conflict.

Next, let's look at the core counting logic, fullAddCount. There is a lot of code; the core flow is an infinite loop containing three processing branches.

A: counterCells has been initialized, so we can try to update or create the CounterCell at the corresponding position.

B: counterCells has not been initialized and there is no conflict (the cellsBusy lock is acquired), so counterCells is initialized under the lock with an initial capacity of 2.

C: counterCells has not been initialized and there is a conflict, so CAS is used to update baseCount; baseCount is also added into the final result when summing. This is a fallback strategy: since counterCells is being locked by another thread, the current thread does not need to wait and simply tries to accumulate into baseCount.

The operations involved in branch A can be broken down as follows:

A1: The CounterCell at that position has not been created yet, so try to create it with lock + double check; if that fails, retry. The lock used here is cellsBusy, which guarantees that creating a CounterCell and placing it into counterCells happens serially, avoiding duplicate creation. It in fact uses the strategy of the DCL singleton pattern. This lock is required both for creating CounterCells and for expanding counterCells.

A2: The variable wasUncontended is passed in by the caller addCount and means that the previous CAS update of the cell failed; there is a conflict, so the hash value needs to be changed [A7] before retrying.

A3: The CounterCell at the corresponding position is not null, so update it directly with CAS.

A4: Conflict detection. If the counterCells reference is no longer equal to the reference held by the current thread, some other thread has changed counterCells (for example by expanding it); collide is set to false so that the next iteration retries against the fresh array rather than expanding. Capacity limit: the maximum capacity of counterCells is the smallest power of 2 greater than or equal to NCPU (the number of CPU cores of the machine). Once that limit is reached, the expansion branch will never be executed again, and collide is likewise set to false here. The meaning of this limit is that the true degree of parallelism is determined by the CPU cores: when counterCells has as many cells as there are cores, ideally there should be no conflict even if every core runs a different counting thread at the same time, since each thread can choose its own cell. If a conflict still occurs, it must be a hash-value problem, so the remedy is to recompute the hash value [A7] rather than to expand further. Trading time for space avoids unnecessary waste of storage, a very nice idea.

A5: Set the expansion flag collide to true; if contention continues, the expansion will be performed in the next iteration.

A6: Take the lock and expand counterCells, doubling its capacity each time.

A7: Change the hash value.

private final void fullAddCount(long x, boolean wasUncontended) {
    int h;
    // Initialize the probe if it has not been initialized yet.
    if ((h = ThreadLocalRandom.getProbe()) == 0) {
        ThreadLocalRandom.localInit();      // force initialization
        h = ThreadLocalRandom.getProbe();
        wasUncontended = true;
    }
    boolean collide = false;                // True if last slot nonempty
    for (;;) {
        CounterCell[] as; CounterCell a; int n; long v;
        // [A] counterCells has been initialized.
        if ((as = counterCells) != null && (n = as.length) > 0) {
            // [A1] The CounterCell at this position has not been created yet.
            if ((a = as[(n - 1) & h]) == null) {
                if (cellsBusy == 0) {       // Try to attach new Cell
                    CounterCell r = new CounterCell(x); // Optimistic create
                    // Double check cellsBusy, then take the lock via CAS.
                    if (cellsBusy == 0 &&
                        U.compareAndSwapInt(this, CELLSBUSY, 0, 1)) {
                        boolean created = false;
                        try {               // Recheck under lock
                            CounterCell[] rs; int m, j;
                            // Double check, then place the new CounterCell.
                            if ((rs = counterCells) != null &&
                                (m = rs.length) > 0 &&
                                rs[j = (m - 1) & h] == null) {
                                rs[j] = r;
                                created = true;
                            }
                        } finally {
                            // Unlock; a volatile write is enough since only the lock holder resets it.
                            cellsBusy = 0;
                        }
                        if (created)
                            break;
                        continue;           // Slot is now non-empty
                    }
                }
                collide = false;
            }
            // [A2] The previous CAS on this cell is already known to have failed.
            else if (!wasUncontended)       // CAS already known to fail
                wasUncontended = true;      // Continue after rehash
            // [A3] The CounterCell is not null: update it directly with CAS.
            else if (U.compareAndSwapLong(a, CELLVALUE, v = a.value, v + x))
                break;
            // [A4] Stale reference or capacity already >= NCPU: do not expand.
            else if (counterCells != as || n >= NCPU)
                collide = false;            // At max size or stale
            // [A5] Mark the conflict so that the next iteration may expand.
            else if (!collide)
                collide = true;
            // [A6] Lock and double the capacity of counterCells.
            else if (cellsBusy == 0 &&
                     U.compareAndSwapInt(this, CELLSBUSY, 0, 1)) {
                try {
                    if (counterCells == as) {// Expand table unless stale
                        CounterCell[] rs = new CounterCell[n << 1];
                        for (int i = 0; i < n; ++i)
                            rs[i] = as[i];
                        counterCells = rs;
                    }
                } finally {
                    cellsBusy = 0;
                }
                collide = false;
                continue;                   // Retry with expanded table
            }
            // [A7] Change the hash value.
            h = ThreadLocalRandom.advanceProbe(h);
        }
        // [B] counterCells has not been initialized and there is no conflict:
        // take the lock and initialize it with capacity 2.
        else if (cellsBusy == 0 && counterCells == as &&
                 U.compareAndSwapInt(this, CELLSBUSY, 0, 1)) {
            boolean init = false;
            try {                           // Initialize table
                if (counterCells == as) {
                    CounterCell[] rs = new CounterCell[2];
                    rs[h & 1] = new CounterCell(x);
                    counterCells = rs;
                    init = true;
                }
            } finally {
                cellsBusy = 0;
            }
            if (init)
                break;
        }
        // [C] counterCells is not initialized and there is a conflict:
        // fall back to accumulating into baseCount.
        else if (U.compareAndSwapLong(this, BASECOUNT, v = baseCount, v + x))
            break;                          // Fall back on using base
    }
}

CounterCell is cleverly designed, and behind it is in fact LongAdder from JDK 1.8. The core idea is to accumulate directly into baseCount under low concurrency, and otherwise to combine it with counterCells, hashing different threads onto different cells so that the resources being updated are isolated and conflicts are reduced as much as possible. Compared with the blind CAS spinning of AtomicLong, LongAdder reduces the number of CAS retries under high concurrency and improves computational efficiency.

Six Conclusion

This may be just the tip of the iceberg of the Java Map source code, but it covers most of the core features and most of the scenarios we meet in daily development. Reading source code is like reading a book: it feels like a close conversation with the author across a long span of time, figuring out what was on his mind and learning the ideas he wanted to pass on. Turning that information into knowledge and applying it is a painstaking process!