【MAP】 HashMap ConcurrentHashMap all-in-one

0 Article Structure

1.7 vs. 1.8 HashMap
ConcurrentHashMap 1.7 vs. 1.8

1 HashMap 1.7 1.8 (Array + linked list OR red-black tree)

A HashMap stores data based on the hashCode value of a key, and in most cases its value can be located directly, thus providing fast access, but the traversal order is uncertain. A HashMap allows at most one record to have a null key and multiple entries to have a NULL key. A HashMap is not thread-safe, meaning that more than one thread can write a HashMap at any one time, which can lead to data inconsistencies. If thread-safe is desired, the Collections synchronizedMap method can be used to make HashMap thread-safe, or ConcurrentHashMap can be used.

In general, the inside of a HashMap is an array, and each element in the array is a one-way list. In the figure above, each green entity is an instance of the nested class Entry, which contains four attributes: key, Value, Hash value, and Next for a one-way linked list.

Capacity: Indicates the current array capacity. The capacity is always 2^n and can be expanded. After the array capacity is expanded, the array size is twice that of the current one.
LoadFactor: loadFactor. The default value is 0.75.
Threshold: capacity expansion threshold, equal to capacity x loadFactor

Java8 makes some changes to HashMap, but the big difference is that it leverages red-black trees, so it consists of arrays, linked lists, and red-black trees.

Java7 HashMap can quickly locate the index of the array based on the hash value, but then we need to go down the list to find the index. The time complexity depends on the length of the list, O(n). To reduce this overhead, in Java8, when the list has more than eight elements, the list is converted to a red-black tree, reducing the time complexity of lookups at these locations to O(logN).

2 ConcurrentHashMap 1.7 1.8

Paragraph 1.7 Segment

ConcurrentHashMap has the same idea as HashMap, but is more complex because it supports concurrent operations. The entire ConcurrentHashMap consists of segments that stand for “part” or “Segment”, and are often described as segmented locks.

1.7 Thread Safety (Segment Inheritance ReentrantLock)

ConcurrentHashMap is an array of segments. A Segment is locked by inheritingReentrantLock. As long as each Segment is thread-safe, global thread-safety is achieved.

ConcurrencyLevel: parallel level, number of concurrent requests, and number of segments. The default value is 16, which means that ConcurrentHashMap has 16 Segments, so it is theoretically possible to support up to 16 concurrent writes at this point, as long as their operations are distributed over different Segments.
This value can be set to another value during initialization, but once initialized, it cannot be expanded.
Inside each Segment, each Segment is like a HashMap, but it is thread-safe, so it is a bit more difficult to handle.

Java1.8 has made major changes to ConcurrentHashMap, and Java8 has introduced red-black trees.

1. ConcurrentHashMap

ConcurrentHashMap: Retrieval operations (including GET) do not normally block and may therefore overlap with update operations (including PUT and remove). Similar to but unlike a HashMap, ConcurrentHashMap cannot hold null values, and neither key nor value can be considered null.
ConcurrentHashMap was introduced in JDK1.5 with java.util.concurrent. Prior to JDK8, ConcurrentHashMap was implemented based on Segment Segment locking. Synchronized and CAS instead. The Segment class still exists in ConcurrentHashMap in JDK1.8, and this class declaration exists for compatibility with previous versions of serialization.
ConcurrentHashMap in JDK1.8 no longer uses Segment Segment locking, but uses the header of the table array as a synchronized lock. Similar to HashMap in JDK1.8, for the same hashCode, when the number of nodes is less than 8, the Node storage structure is in the form of linked list, time complexity is O(N), when the number of nodes is more than 8, the red-black tree will be converted. The time complexity of access is O(long(N)).

4 ConcurrentHashMap 1.8 Interview details 2 (How is visibility implemented?)

Use volatile to ensure that changes in Node values are visible to other threads
Val and next in Node are both volatile. Changes to val or next are visible to other threads because volatile inserts read barriers before instructions are read, invalidating data in the cache and reloading data from main memory.
ConcurrentHashMap provides tabat-like reading of Table elements. In this case, Table elements are read volatile. The Unsafe class is used to ensure that other threads change the values of the array. It is available when the current thread gets.
static final <K,V> Node<K,V> tabAt(Node<K,V>[] tab, int i) { return (Node<K,V>)U.getObjectVolatile(tab, ((long)i << ASHIFT) + ABASE);
The equivalent is setTabAt, which writes elements to an array volatile so that the changes are visible to other threads.
static final <K,V> void setTabAt(Node<K,V>[] tab, int i, Node<K,V> v) {U.putObjectVolatile(tab, ((long)i << ASHIFT) + ABASE, v); }

ConcurrentHashMap 3 (How is security implemented?)

The head node of the table array is used as the lock of synchronized to ensure the security of write operation
Synchronized is a thread-safe lock that only one thread can acquire. Synchronized is a mutex lock that only one thread can acquire.

When the header is null, the CAS operation is used to ensure that data can be written correctly.

/** Implementation for put and putIfAbsent */ final V putVal(K key, V value, boolean onlyIfAbsent) { if (key == null || value == null) throw new NullPointerException(); int hash = spread(key.hashCode()); int binCount = 0; for (Node<K,V>[] tab = table;;) { Node<K,V> f; int n, i, fh; if (tab == null || (n = tab.length) == 0) tab = initTable(); else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) { if (casTabAt(tab, i, null, // cas insert head new Node<K,V>(hash, key, value, null))) break; // no lock when adding to empty bin } else if ((fh = f.hash) == MOVED) tab = helpTransfer(tab, f); else { V oldVal = null; synchronized (f) { if (tabAt(tab, i) == f) { if (fh >= 0) { binCount = 1; for (Node<K,V> e = f;; ++binCount) { K ek; if (e.hash == hash && ((ek = e.key) == key || (ek ! = null && key.equals(ek)))) { oldVal = e.val; if (! onlyIfAbsent) e.val = value; break; } Node<K,V> pred = e; if ((e = e.next) == null) { pred.next = new Node<K,V>(hash, key, value, null); break; } } } else if (f instanceof TreeBin) { Node<K,V> p; binCount = 2; if ((p = ((TreeBin<K,V>)f).putTreeVal(hash, key, value)) ! = null) { oldVal = p.val; if (! onlyIfAbsent) p.val = value; } } } } if (binCount ! = 0) { if (binCount >= TREEIFY_THRESHOLD) treeifyBin(tab, i); if (oldVal ! = null) return oldVal; break; } } } addCount(1L, binCount); return null; } static final <K,V> boolean casTabAt(Node<K,V>[] tab, int i, Node<K,V> c, Node<K,V> v) { return U.compareAndSwapObject(tab, ((long)i << ASHIFT) + ABASE, c, v); }Copy the code

When the CAS operation is performed, compareAndSwap compares the value of the memory location with the expected value. If it matches, the processor automatically updates the value to the new value. Otherwise, the processor does nothing.
AsTabAt is also implemented by calling the Unsafe class, the Unsafe compareAndSwapObject. If you track the Unsafe line carefully, you will find that the CMPXCHG CPU instruction, which is an atomic CPU instruction, is called. To ensure the consistency of data.

4) Why? Is fast? Where is the fastest way?

The older version of the segment lock protects multiple hash buckets, while the jdK8 version of the lock only protects one hash bucket. Due to the smaller granularity of the lock, concurrency of write operations is greatly improved.
[More expansion threads] Locks are required to protect capacity expansion. Therefore, in the old version, the maximum number of threads that can be expanded simultaneously is the number of segment locks. In JDK8, the maximum number of threads that can be simultaneously expanded is the number of hash buckets (the length of the table array). To prevent excessive expansion threads, ConcurrentHashMap specifies that the expansion thread migrates at least 16 Hash buckets at a time. Therefore, the maximum number of threads that can be expanded at the same time in JDK8 is: number of Hash buckets /16
[High concurrency is guaranteed during capacity expansion] The segment lock scope of the old version is too large, resulting in a serious decrease in write concurrency during capacity expansion. The new version uses fine-grained hash bucket locks to ensure concurrent write operations during capacity expansion.
ConcurrentHashMap, like HashMap, maintains an array of tables. The elements of the array are Node linked lists or red-black trees.
There are three important methods for table arrays.

Node<K,V> tabAt(Node<K,V>[] TAB, int I) u.getobjectVolatile (TAB, ((long) I << ASHIFT) + ABASE)
Written as volatile, SetTabAt (Node<K,V>[] TAB, int I, Node<K,V> V) u.putobjectVolatile (TAB, ((long) I << ASHIFT) + ABASE, V);
In the form of CAS, CasTabAt (Node<K,V>[] TAB, int I,Node<K,V> c, Node<K,V> V) U.compareAndSwapObject(TAB, ((long) I << ASHIFT) + ABASE, c, v);

7. ConcurrentHashMap

Let’s look at how the ConcurrentHashMap put method ensures thread-safety through CAS: CasTabAt (TAB, I,null,node1) is executed by thread 1. If TAB [I] is null,node1 is inserted. Thread 2 then executes casTabAt(TBA, I,null,node2), TAB [I] does not equal the expected value of null, insert failure. Thread two then goes back to the start of the for loop and retrieves TAB [I] as expected, repeating the logic above.
The get method also uses the volatile feature to achieve lockless reads.

The procedure for finding a value is as follows:

1. Locate the Hash bucket based on the key and obtain the header of the hash bucket using volatile reads of tabAt.

2. Iterate through the Node list using the volatile next property of the header Node

3. After finding the destination node, read the volatile val of the node

The above three operations are volatile reads, so the memory visibility of the value can be guaranteed without locking
Because the lock granularity is hash buckets, multiple PUT threads will block only if they request the same hash bucket. Put threads that request different hash buckets can be executed concurrently. Put thread. If the hash bucket is empty, the lock is inserted in the for loop +CAS mode.
Remove method, as shown: Next of the removed node still points to the next element. If a traversal thread is traversing the deleted node, the traversal thread can still access the next element through the next attribute. From the thread’s point of view, he does not perceive that the node has been deleted, indicating that ConcurrentHashMap provides a weakly consistent iterator.

7 Map 1.7 1.8 Summary

Comparison of thread safety and non-safety
Security: Volatile cas synchronize replaces the segment lock scheme with the default concurrency of 16 expanded to the number of array nodes.
Speed: Red-black tree O(logN)