The underlying data structure of HashMap

Hash Underlying data structure

JDK7: array + linked list;

JDK8: array + linked list + red-black tree

Why use a red-black tree: When the number of nodes is less than a threshold, the list as a whole is faster than a red-black tree, and when the number of elements is greater than this threshold, the threshold is set to 8 in the HashMap.

However, it is not only the list length greater than 8 that determines whether to convert a red-black tree. There is another restriction that determines whether the array has more than 64 elements. The list can only be converted to a red-black tree if the number of elements in the list is greater than 8 and the array length is greater than or equal to 64.

How to implement the put method of HashMap in JDK8?

  1. Generate hashCode based on key
  2. Determines whether the array in the current HashMap object is empty and initializes the array if it is
  3. According to the logic and operation, calculate hashCode based on the array subscript I corresponding to the current array
  4. Determines whether the element (TAB [I]) at position I is empty

A. If null, encapsulate key and value as Node objects and assign values to TAB [I]

B. If not empty:

  • If the key passed in by the put method is equal to TAB [I].key, then the same key exists

  • If TAB [I]. Key = TAB [I]. Key

  1. If TAB [I] is TreeNode, then the ith position of the array is a red-black tree. Insert the key and value into the red-black tree, and check whether the same key exists in the red-black tree before inserting
  2. If the type of TAB [I] is not TreeNode, it means that the ith position of the array is a linked list. Then the linked list is traversed to find whether there is the same key, and the nodes in the linked list are counted during the traversal. When the traversal reaches the last Node, the key and value are encapsulated as nodes and inserted into the end of the linked list. At the same time, determine whether the number of linked list nodes before inserting new nodes is greater than or equal to 8. If so, change the linked list to a red-black tree.
  • If the same key is found in the previous steps, the onlyIfAbsent flag is used to determine whether the value needs to be updated, and oldValue is returned
  1. modCount++
  2. The number of elements in the HashMap is size plus 1
  3. If the size is larger than the capacity expansion threshold, expand the capacity

JDK8 HashMap get method implementation process

  1. Generate hashCode based on key

  2. If the array is empty, it returns empty

  3. If the array is not empty, use hashcode and the array length to calculate the array subscript I for the key

  4. If there is no element in the i-th position of the array, it returns null

  5. If the key of the element in the first bit of the array is equal to the key passed in by the get method, the element is returned and its value is obtained

  6. If not, the element is determined to have a next element, if not, return null

  7. If so, determine whether the element is a linked list node or a red-black tree node

    A. If it is a linked list, traverse the linked list

    B. If it is a red-black tree, traverse the red-black tree

  8. If the element is found, it returns the element; if it is not found, it returns nothing

Differences between HashMap in JDK7 and JDK8

  1. Red-black trees are used in JDK8
  2. * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * In JDK8, we need to count the number of nodes in the current list, so we need to use the tail interpolation method.
  3. The Hash algorithm in JDK7 is more complex than that in JDK8. The more complex the Hash algorithm is, the hashcode generated will be more hashed, so the elements in the HashMap will be more hashed. The more hashed the hashmap query performance will be better. The addition of red-black trees in JDK8 guarantees query performance, so you can simplify the Hash algorithm, because the more complex a Hash algorithm is, the more CPU it uses
  4. In the process of expansion, it is possible to rehash keys in JDK7 (rehash is related to Hash seeds), which is not available in JDK8
  5. In JDK8, the capacity expansion condition is different from that in JDK7. In JDK7, the capacity expansion condition is different from that in JDK8. In JDK7, the capacity expansion condition is different from that in JDK7
  6. There is also an API in JDK8: putIfAbsent(key,value)
  7. In JDK7, the element is moved one at a time. In JDK8, the element is moved one at a time. In JDK7, the element is moved one at a time

Second, the ConcurrentHashMap

How does ConcurrentHashMap ensure concurrency safety in JDK7?

The Unsafe operation +ReentrantLock+ segmentation idea is utilized. The broadening operation primarily uses:

  1. CompareAndSwapObject: Modifies the properties of an object in CAS mode
  2. PutOrderedObject: Safely assigns a value to a location in the array concurrently
  3. GetObjectVolatile: Concurrently and safely retrieves elements at a location in the array

The fragmentation idea is to increase the concurrency of ConcurrentHashMap. The higher the number of segments, the higher the maximum concurrency supported. The programmer can specify the concurrency through the concurrencyLevel parameter. The ConcurrentHashMap inner class Segment is used to represent a Segment.

Each Segment is a small HashMap. When a ConcurrentHashMap put method is called, the Segment put method is called. The Segment class inherited a ReentrantLock. When the Segment put method is called, the reentrant lock is first used to lock the Segment. After the lock is successfully added, the key and value to be inserted are inserted into the small HashMap and unlocked after the insert is complete.

The underlying principles of ConcurrentHashMap in JDK7

The ConcurrentHashMap layer is implemented with two nested arrays:

  1. The ConcurrentHashMap object has an attribute segments of type Segment[];
  2. The Segment object has an attribute table of type HashEntry[];

Segment[] = Segment[]; Segment[] = Segment[]; Segment[] = Segment[] Generate a Segment object at position j using a spin lock.

Then call the Segment object’s PUT method. The Segment put method is used to lock the Segment object, and then it calculates the array subscript I of the corresponding HashEntry[] based on the key. Then it encapsulates the key and value as HashEntry objects and places them in that location. This process is the same as the JDK7 HashMap put method, and then unlocks the Segment object. In the process of locking logic is more complex, first through the spin lock, if more than a certain number of times will be directly blocked and so on lock.

How does ConcurrentHashMap ensure concurrency safety in JDK8?

The Unsafe operation +synchronized keyword is used.

The Unsafe operation is still used in the same way as in JDK7, where the Unsafe operation is responsible for concurrently and safely modifying properties of objects or values in an array location.

Synchronized is mainly responsible for locking a certain location (the location is not empty) when it needs to be operated, such as inserting nodes into a linked list at a certain location and into a red-black tree at a certain location.

In JDK8 there is still the idea of segmenting the lock, except that the middle number of JDK7 can be controlled, whereas in JDK8 there is a lock for each position in the array.

  1. The corresponding array subscript I is computed according to the key. If there is no element at that location, the value is assigned to that location by means of spin.

  2. Synchronized locks if there are elements in that location

  3. After the lock is successful, determine the type of the element

    A. If it is a linked list node, add the node to the linked list

    B. If the node is a red-black tree, add the node to the red-black tree

  4. After the tree is added, determine whether to tree it

  5. AddCount means increment the number of elements in ConcurrentHashMap by 1, but this operation also needs to be concurrency safe, and once the increment is successful, it will continue to determine whether to expand or not, and if necessary, it will expand, so this method is important.

  6. Meanwhile, a thread that finds ConcurrentHashMap expanding during put will help expand it.

Differences between THE ConcurrentHashMap in JDK7 and JDK8

There are so many differences between the two… Includes both differences in the HashMap and other differences, such as:

  1. Section locking is no longer available in JDK8 and synchronized is used instead
  2. In JDK8, the capacity expansion performance is higher, support multi-threading at the same time, in fact, JDK7 also support multi-threading capacity expansion, because JDK7 is for each Segment, so it is also possible to multi-threading capacity expansion, but the performance is not as high as JDK8, because JDK8 for any thread can help capacity expansion
  3. In JDK8, we add CounterCell to count elements. In JDK8, we add CounterCell to count elements. In JDK7, we add CounterCell to count elements

Original address: juejin.cn/post/684490…

ConcurrentHashMap learning articles: juejin.cn/post/684490…