1. Concurrent containers

2. Synchronized containers

Why do we need a concurrent container?

Hashtable's implementation basically marks methods such as put, get, and size as synchronized, which means every concurrent operation competes for the same lock. While one thread is inside a synchronized method, all other threads can only wait, which greatly reduces the efficiency of concurrent access.
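To make the cost concrete, here is a minimal sketch of that coarse-grained locking pattern (an illustrative class, not the real Hashtable source): every method synchronizes on the same monitor, so a get blocks a put even when they touch unrelated keys.

import java.util.HashMap;
import java.util.Map;

// Illustrative only: the Hashtable-style pattern of one lock for the whole table.
class CoarseLockedMap<K, V> {
    private final Map<K, V> backing = new HashMap<>();

    public synchronized V put(K key, V value) { return backing.put(key, value); }

    public synchronized V get(Object key) { return backing.get(key); }

    public synchronized int size() { return backing.size(); }
}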

From Java Concurrency in Practice:

Synchronized containers (e.g. Hashtable, Vector) serialize all access to the container state to make them thread-safe. The cost of this approach is severely reduced concurrency: when multiple threads compete for the container's lock, throughput drops sharply.

The synchronized wrapper (Collections.synchronizedMap) simply wraps the given Map in another Map whose methods all synchronize on a single mutex (this), so it is no real improvement either:

public static <K,V> Map<K,V> synchronizedMap(Map<K,V> m) {
    return new SynchronizedMap<>(m);
}

private static class SynchronizedMap<K,V> implements Map<K,V> {
    private final Map<K,V> m;   // backing Map
    final Object mutex;         // object on which to synchronize

    SynchronizedMap(Map<K,V> m) {
        this.m = Objects.requireNonNull(m);
        mutex = this;
    }

    public int size() {
        synchronized (mutex) { return m.size(); }
    }
    public boolean isEmpty() {
        synchronized (mutex) { return m.isEmpty(); }
    }
    public boolean containsKey(Object key) {
        synchronized (mutex) { return m.containsKey(key); }
    }
    public boolean containsValue(Object value) {
        synchronized (mutex) { return m.containsValue(value); }
    }
    public V get(Object key) {
        synchronized (mutex) { return m.get(key); }
    }
    public V put(K key, V value) {
        synchronized (mutex) { return m.put(key, value); }
    }
    public V remove(Object key) {
        synchronized (mutex) { return m.remove(key); }
    }
    // ...
}
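A short usage sketch of the wrapper (an illustrative example, not taken from the JDK docs verbatim): even with synchronizedMap, compound actions such as iteration still need client-side locking on the wrapper object itself, which is exactly the single mutex all other operations contend for.

import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class SynchronizedMapDemo {
    public static void main(String[] args) {
        Map<String, Integer> map = Collections.synchronizedMap(new HashMap<>());
        map.put("a", 1);
        map.put("b", 2);

        // Iteration is a compound action: it must hold the wrapper's lock,
        // otherwise a concurrent put/remove can break the iteration.
        synchronized (map) {
            for (Map.Entry<String, Integer> e : map.entrySet()) {
                System.out.println(e.getKey() + "=" + e.getValue());
            }
        }
    }
}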

3. ConcurrentHashMap

3-1. Implementation of ConcurrentHashMap in JDK 1.7 and earlier

Implementation method:

  • Segment locking: the map is internally divided into Segments, each holding a HashEntry array; as in HashMap, entries whose hashes collide are stored in a linked list.
  • HashEntry uses a volatile value field to ensure visibility and also relies on immutability of its other fields; low-level primitives such as Unsafe volatile access are used directly where they help performance, since many Unsafe operations are optimized into JVM intrinsics.
  • At construction time, the number of Segments is determined by the concurrencyLevel, which defaults to 16 and can be specified in the corresponding constructor. Note that Java requires it to be a power of two; if a non-power such as 15 is passed in, it is automatically rounded up to the next power of two, such as 16.
  • Like HashMap, ConcurrentHashMap re-hashes the key's hashCode to reduce collisions;
  • Unlike the fail-fast iterators of synchronized containers, which throw ConcurrentModificationException, the iterators returned by ConcurrentHashMap are weakly consistent and can tolerate concurrent modification;

About lock striping:

In some cases, the lock-splitting technique can be extended further to partition locking across a set of independent objects; this is called lock striping.

The JDK 1.7 ConcurrentHashMap implementation uses an array of 16 locks, each protecting 1/16 of the hash buckets, where the Nth hash bucket is protected by lock (N mod 16). This allows ConcurrentHashMap to support up to 16 concurrent writers, and the number of stripe locks can be increased if needed.

The drawback of lock striping is that acquiring multiple locks for exclusive access (for example, when the container needs to be resized) is more difficult and expensive than acquiring a single lock.
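To illustrate the idea, here is a minimal lock-striping sketch in the spirit of the StripedMap example from Java Concurrency in Practice (a simplified illustration, not the real ConcurrentHashMap code): bucket N is guarded by lock (N mod N_LOCKS), so writers that land on different stripes do not block each other.

// Hypothetical striped hash map: bucket n is guarded by lock (n mod N_LOCKS).
class StripedMap<K, V> {
    private static final int N_LOCKS = 16;

    private static class Node<K, V> {
        final K key; V value; Node<K, V> next;
        Node(K key, V value, Node<K, V> next) { this.key = key; this.value = value; this.next = next; }
    }

    private final Node<K, V>[] buckets;
    private final Object[] locks = new Object[N_LOCKS];

    @SuppressWarnings("unchecked")
    StripedMap(int numBuckets) {
        buckets = (Node<K, V>[]) new Node[numBuckets];
        for (int i = 0; i < N_LOCKS; i++) locks[i] = new Object();
    }

    private int hash(Object key) {
        return Math.abs(key.hashCode() % buckets.length);
    }

    public V get(Object key) {
        int h = hash(key);
        synchronized (locks[h % N_LOCKS]) {      // lock only this bucket's stripe
            for (Node<K, V> n = buckets[h]; n != null; n = n.next)
                if (n.key.equals(key)) return n.value;
        }
        return null;
    }

    public void put(K key, V value) {
        int h = hash(key);
        synchronized (locks[h % N_LOCKS]) {
            for (Node<K, V> n = buckets[h]; n != null; n = n.next) {
                if (n.key.equals(key)) { n.value = value; return; }
            }
            buckets[h] = new Node<>(key, value, buckets[h]);   // prepend new node
        }
    }
}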

Get method source code analysis

public V get(Object key) {
    // access methods are manually inlined to reduce overhead
    Segment<K,V> s;
    HashEntry<K,V>[] tab;
    int h = hash(key.hashCode());      // re-hash the key's hashCode
    long u = (((h >>> segmentShift) & segmentMask) << SSHIFT) + SBASE;
    // volatile read of the Segment
    if ((s = (Segment<K,V>) UNSAFE.getObjectVolatile(segments, u)) != null &&
        (tab = s.table) != null) {
        // ... traverse the bucket's linked list and return the matching value
    }
    return null;
}

Put method source code analysis

The key is re-hashed to reduce collisions, the corresponding Segment is fetched directly via an Unsafe call, and the thread-safe put is then delegated to that Segment:

public V put(K key, V value) {
    Segment<K,V> s;
    if (value == null)
        throw new NullPointerException();
    int hash = hash(key.hashCode());                // re-hash the key's hashCode
    int j = (hash >>> segmentShift) & segmentMask;  // locate the Segment
    if ((s = (Segment<K,V>) UNSAFE.getObject(segments,
            (j << SSHIFT) + SBASE)) == null)
        s = ensureSegment(j);
    return s.put(key, hash, value, false);
}

Core logic:

final V put(K key, int hash, V value, boolean onlyIfAbsent) {
    // scanAndLockForPut looks for a node with the same key while spinning;
    // either way, the lock is held before going any further
    HashEntry<K,V> node = tryLock() ? null : scanAndLockForPut(key, hash, value);
    V oldValue;
    try {
        HashEntry<K,V>[] tab = table;
        int index = (tab.length - 1) & hash;
        HashEntry<K,V> first = entryAt(tab, index);
        for (HashEntry<K,V> e = first;;) {
            if (e != null) {
                K k;
                // update the existing value ...
            } else {
                // place the HashEntry at the right position, rehash if the threshold is exceeded
                // ...
            }
        }
    } finally {
        unlock();
    }
    return oldValue;
}
  • ConcurrentHashMap relies on reentrant locks to ensure data consistency: Segment itself is an extension of ReentrantLock, so a Segment is locked while it is being modified concurrently (a simplified sketch of this tryLock pattern follows this list).
  • In the initial phase, scanAndLockForPut does a speculative scan to check whether the key is already present in the array, in order to decide between updating and inserting; you can see this in the comments in the code. This kind of repeated scanning and conflict detection is a common technique in ConcurrentHashMap.
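The sketch below illustrates the Segment-style locking pattern described above (a hypothetical MiniSegment class, not the JDK source): the segment extends ReentrantLock so the object is its own lock, put tries tryLock first and falls back to lock(), and the lock is always released in finally.

import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical segment: extends ReentrantLock the way Segment does,
// so the segment object itself is the lock guarding its own data.
class MiniSegment<K, V> extends ReentrantLock {
    private final Map<K, V> data = new HashMap<>();

    V put(K key, V value) {
        if (!tryLock()) {
            // the real Segment does preparatory work (scanAndLockForPut) while spinning;
            // this sketch simply blocks until the lock is available
            lock();
        }
        try {
            return data.put(key, value);
        } finally {
            unlock();
        }
    }

    V get(Object key) {
        // the real Segment reads without locking via volatile; this sketch just locks
        lock();
        try {
            return data.get(key);
        } finally {
            unlock();
        }
    }
}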

3-2. Implementation of ConcurrentHashMap in JDK 1.8 and later

  • Overall, the internal storage becomes very similar to HashMap's structure: a large bucket array whose entries hold so-called bins (linked lists, or trees once a bin grows large), with much finer-grained synchronization.
  • Segment is still defined internally, but only to preserve serialization compatibility; it no longer has any structural role.
  • Because Segments are no longer used, initialization is simplified to lazy-load, which avoids the up-front overhead and addresses a common complaint about older versions.
  • Data stores use volatile to ensure visibility.
  • Use operations such as CAS to perform lockless concurrent operations in specific scenarios.
  • Optimizing for extreme situations using low-level tools such as Unsafe and LongAdder-style counting (see the brief example after this list).
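As a brief, hedged illustration of the last two points (usage examples of standard java.util.concurrent classes, not ConcurrentHashMap internals): compareAndSet publishes a value only if the expected old value is still in place, and LongAdder spreads contended increments across cells, which is the same idea the 1.8 map applies to its size bookkeeping.

import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.atomic.LongAdder;

public class CasAndAdderDemo {
    public static void main(String[] args) throws InterruptedException {
        // CAS: publish a value only if the slot is still empty; retrying is up to the caller
        AtomicReference<String> slot = new AtomicReference<>();
        boolean installed = slot.compareAndSet(null, "first");
        System.out.println("installed=" + installed + ", value=" + slot.get());

        // LongAdder: a low-contention counter
        LongAdder counter = new LongAdder();
        Runnable task = () -> { for (int i = 0; i < 100_000; i++) counter.increment(); };
        Thread t1 = new Thread(task), t2 = new Thread(task);
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println("count=" + counter.sum());   // 200000
    }
}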

1. Internal data structures

The key is final because an entry's key never changes during its life cycle; val, meanwhile, is declared volatile to ensure visibility.

static class Node<K,V> implements Map.Entry<K,V> {
    final int hash;
    final K key;
    volatile V val;
    volatile Node<K,V> next;
}

2. put method source code analysis

Points of optimization:

  • Use CAS to get objects;

  • Use lazy-load to initialize the array.

/**
 * Table initialization and resizing control. When negative, the
 * table is being initialized or resized: -1 for initialization,
 * else -(1 + the number of active resizing threads). Otherwise,
 * when table is null, holds the initial table size to use upon
 * creation, or 0 for default. After initialization, holds the
 * next element count value upon which to resize the table.
 */
private transient volatile int sizeCtl;

/**
 * The default initial table capacity. Must be a power of 2
 * (i.e., at least 1) and at most MAXIMUM_CAPACITY.
 */
private static final int DEFAULT_CAPACITY = 16;

/**
 * The bin count threshold for using a tree rather than list for a
 * bin. Bins are converted to trees when adding an element to a
 * bin with at least this many nodes. The value must be greater
 * than 2, and should be at least 8 to mesh with assumptions in
 * tree removal about conversion back to plain bins upon
 * shrinkage.
 */
static final int TREEIFY_THRESHOLD = 8;

static final int HASH_BITS = 0x7fffffff;

public V put(K key, V value) {
    return putVal(key, value, false);
}

// Spread the higher bits of the hash downwards and force the sign bit to 0
static final int spread(int h) {
    return (h ^ (h >>> 16)) & HASH_BITS;
}

// CAS a bucket slot: install v at index i only if the slot still holds c
static final <K,V> boolean casTabAt(Node<K,V>[] tab, int i,
                                    Node<K,V> c, Node<K,V> v) {
    return U.compareAndSwapObject(tab, ((long)i << ASHIFT) + ABASE, c, v);
}

// Optimized with Unsafe: getObjectVolatile is used directly to avoid the
// overhead of indirect calls.
static final <K,V> Node<K,V> tabAt(Node<K,V>[] tab, int i) {
    return (Node<K,V>) U.getObjectVolatile(tab, ((long)i << ASHIFT) + ABASE);
}

/**
 * Use lazy-load to initialize the array.
 */
private final Node<K,V>[] initTable() {
    Node<K,V>[] tab; int sc;
    while ((tab = table) == null || tab.length == 0) {
        // sizeCtl < 0 means another thread is initializing or resizing;
        // on conflict, yield and spin-wait. sizeCtl is volatile, so the change is visible.
        if ((sc = sizeCtl) < 0)
            Thread.yield();
        else if (U.compareAndSwapInt(this, SIZECTL, sc, -1)) {
            // the real initialization logic
            try {
                if ((tab = table) == null || tab.length == 0) {
                    int n = (sc > 0) ? sc : DEFAULT_CAPACITY;
                    @SuppressWarnings("unchecked")
                    Node<K,V>[] nt = (Node<K,V>[]) new Node[n];
                    table = tab = nt;
                    sc = n - (n >>> 2);
                }
            } finally {
                sizeCtl = sc;
            }
            break;
        }
    }
    return tab;
}

final V putVal(K key, V value, boolean onlyIfAbsent) {
    if (key == null || value == null)       // a null key or value throws NPE
        throw new NullPointerException();
    int hash = spread(key.hashCode());
    int binCount = 0;
    for (Node<K,V>[] tab = table;;) {
        Node<K,V> f; int n, i, fh;
        if (tab == null || (n = tab.length) == 0)
            tab = initTable();
        else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {
            // the bucket is empty: install the new node with a lock-free CAS
            if (casTabAt(tab, i, null,
                         new Node<K,V>(hash, key, value, null)))
                break;
        }
        else if ((fh = f.hash) == MOVED)
            // a resize is in progress, help move buckets
            tab = helpTransfer(tab, f);
        else {
            V oldVal = null;
            // lock only the head node of this bin
            synchronized (f) {
                if (tabAt(tab, i) == f) {
                    // traverse the bin, update an existing node or append a new one
                    // ...
                }
            }
            if (binCount != 0) {
                if (binCount >= TREEIFY_THRESHOLD)
                    treeifyBin(tab, i);
                if (oldVal != null)
                    return oldVal;
                break;
            }
        }
    }
    addCount(1L, binCount);
    return null;
}

/**
 * Helps transfer if a resize is in progress.
 */
final Node<K,V>[] helpTransfer(Node<K,V>[] tab, Node<K,V> f) {
    Node<K,V>[] nextTab; int sc;
    if (tab != null && (f instanceof ForwardingNode) &&
        (nextTab = ((ForwardingNode<K,V>) f).nextTable) != null) {
        int rs = resizeStamp(tab.length);
        while (nextTab == nextTable && table == tab &&
               (sc = sizeCtl) < 0) {
            if ((sc >>> RESIZE_STAMP_SHIFT) != rs || sc == rs + 1 ||
                sc == rs + MAX_RESIZERS || transferIndex <= 0)
                break;
            if (U.compareAndSwapInt(this, SIZECTL, sc, sc + 1)) {
                transfer(tab, nextTab);
                break;
            }
        }
        return nextTab;
    }
    return table;
}

3-3. Why ConcurrentHashMap and Hashtable do not allow null keys or values

HashMap is not thread-safe and is normally used from a single thread (or with external synchronization), so if get(key) returns null you can call containsKey(key) afterwards to tell whether the key is mapped to null or simply absent.

ConcurrentHashMap and Hashtable are meant to be used concurrently, and the pair of calls get(key) followed by containsKey(key) is not an atomic compound action: another thread may modify the map between the two calls. There would therefore be no reliable way to tell whether a null return means the value is null or the key is absent, and containsValue has the same ambiguity. That is why null keys and values are rejected.
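A hedged illustration of that ambiguity (hypothetical helper method, not JDK code): the two reads below are separate operations, so another thread can insert or remove the key between them, which is exactly why ConcurrentHashMap refuses null values rather than letting callers attempt this check.

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class NullAmbiguityDemo {
    // If null values were allowed, this check would be needed, but it is not atomic:
    // another thread can remove(key) between the two calls, so the answer may be stale.
    static boolean isMappedToNull(ConcurrentMap<String, String> map, String key) {
        return map.get(key) == null && map.containsKey(key);   // two separate reads
    }

    public static void main(String[] args) {
        ConcurrentMap<String, String> map = new ConcurrentHashMap<>();
        map.put("k", "v");
        System.out.println(isMappedToNull(map, "k"));   // false
        // map.put("k", null) or map.put(null, "v") would throw NullPointerException
    }
}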

4. CopyOnWriteArrayList

4-1. What is copy-on-write?

  • The thread-safety of a copy-on-write container rests on the fact that, as long as an effectively immutable object is properly published, no further synchronization is required to access it.
  • On every modification, a new copy of the container's underlying array is created and republished; this is how mutation is achieved;
  • The container's iterator keeps a reference to the underlying array as it was when the iterator was created. Since that array will never be modified, the only synchronization needed is to make the array reference itself visible (it is declared volatile).
  • Thus, multiple threads can iterate over the container at the same time without interfering with each other or with the threads modifying the container;
  • The iterator returned by a copy-on-write container never throws ConcurrentModificationException; the elements it returns are exactly those present when the iterator was created, and later modifications have no effect on it;
  • The underlying array is copied on every modification, which incurs some overhead, so copy-on-write containers should be used only when iteration vastly outnumbers modification, i.e. in read-mostly scenarios (see the small example after this list).
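A small usage example of the snapshot semantics just described (illustrative only): the iterator keeps returning the elements that existed when it was created, an element added mid-iteration does not appear, and no ConcurrentModificationException is thrown.

import java.util.Iterator;
import java.util.concurrent.CopyOnWriteArrayList;

public class CopyOnWriteIterationDemo {
    public static void main(String[] args) {
        CopyOnWriteArrayList<String> list = new CopyOnWriteArrayList<>();
        list.add("a");
        list.add("b");

        Iterator<String> it = list.iterator();  // snapshot of ["a", "b"]
        list.add("c");                          // modifying is fine, no CME

        while (it.hasNext()) {
            System.out.println(it.next());      // prints only "a" and "b"
        }
        System.out.println(list);               // [a, b, c]
    }
}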

CopyOnWriteArrayList is a thread-safe variant of ArrayList that uses a ReentrantLock to keep writes thread-safe;

When CopyOnWriteArrayList adds an element, it makes a copy of the array, adds the element to the newly copied array, and then points the array to the new array.

add() first calls getArray() to obtain the current array, copies it into a new array with the element appended, and then calls setArray() to swap in the new array;

CopyOnWriteArrayList only fits scenarios with very few writes, and it tolerates transient inconsistency between reads and writes;

On every write the array is copied and then the array reference is reassigned; the array field is declared volatile to guarantee visibility across threads (a simplified sketch of this write path follows).
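A minimal sketch of that write path under the stated assumptions (a simplified MiniCopyOnWriteList, not the JDK source): a ReentrantLock serializes writers, the volatile array reference publishes each new copy to readers, and reads never lock.

import java.util.Arrays;
import java.util.concurrent.locks.ReentrantLock;

// Simplified copy-on-write list: illustrates the lock + copy + volatile-publish pattern.
class MiniCopyOnWriteList<E> {
    private final ReentrantLock lock = new ReentrantLock();
    private volatile Object[] array = new Object[0];   // volatile: readers see the latest copy

    public boolean add(E e) {
        lock.lock();                                   // writers are serialized
        try {
            Object[] old = array;                      // like getArray()
            Object[] copy = Arrays.copyOf(old, old.length + 1);
            copy[old.length] = e;                      // append to the copy
            array = copy;                              // like setArray(): publish the new array
            return true;
        } finally {
            lock.unlock();
        }
    }

    @SuppressWarnings("unchecked")
    public E get(int index) {
        return (E) array[index];                       // lock-free read of the current snapshot
    }

    public int size() {
        return array.length;
    }
}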