Implementation principles of HashMap

This article has participated in the third “topic writing” track at the Diggings Creators Camp. For more details, check out diggings program | Creators Camp.

preface

As a classic data structure in JDK, HashMap contains arrays, linked lists, red-black trees, sets and other knowledge points, which is why it is frequently asked in interviews. For Java beginners, the principle of HashMap is still very necessary to understand, here introduce the implementation of HashMap principle.

Analytical thinking

Let me explain how I’m reading.

Copy the usual methods separately
Take the different branches of logic and separate them out. Take the extreme method, take special data, Debug the statement in it
Comment the source code.

parsing

Before, I introduced a little bit of HashMap features, HashMap first exploration. I’m going to go deeper here.

The get method

Let’s start with the simplest one…

public V get(Object key) {
        Node<K,V> e;
        return (e = getNode(hash(key), key)) == null ? null : e.value;
}
Copy the code

I’m going to go from the array index to the Node
If the first Node in Node hits, it returns
If there is a conflict, the corresponding entry is searched through key.equals(k)

If it is a tree, then search through key.equals(k) in the tree, O(logn);
If it is a linked list, then search through key.equals(k) in the list, O(n).

// The hash value is hash(key),key final Node<K,V> getNode(int hash, Object key) {Node<K,V>[] TAB; Node<K,V> first, e; int n; K k; // Table is not empty and TAB [(n-1) & hash]! = null. if ((tab = table) ! = null && (n = tab.length) > 0 && (first = tab[(n - 1) & hash]) ! = null) {// Check whether the hash values of nodes are equal. If the key values are the same, then return. // Think about the situation where the if statement is not valid. if (first.hash == hash && // always check first node ((k = first.key) == key || (key ! = null && key.equals(k)))) return first; // The hash value of the first Node is not the same as the hash value of the key. if ((e = first.next) ! = null) {if (first instanceof TreeNode) // Select node from tree. return ((TreeNode<K,V>)first).getTreeNode(hash, key); Do {// Determine whether the hash value and the key value are equal until they are equal or at the end of the node. if (e.hash == hash && ((k = e.key) == key || (key ! = null && key.equals(k)))) return e; } while ((e = e.next) ! = null); } } return null; }Copy the code

Put method

This involves a little bit more logic in the middle, and the method needs to be looked at in different steps. Ideas:

Hash the key’s hashCode() and calculate the index;
If it doesn’t collide it goes straight into the bucket;
If they do, they will exist behind buckets in the form of a linked list.
- Replace old value if the node already exists
- If the collision causes the list to be too long (greater than or equal to TREEIFY_THRESHOLD), the list is converted to a red-black tree;
If the Node capacity exceeds the load factor*current capacity, resize the Node.

public V put(K key, V value) { return putVal(hash(key), key, value, false, true); } final V putVal(int hash, K key, V value, boolean onlyIfAbsent, boolean evict) { Node<K,V>[] tab; Node<K,V> p; int n, i; / / if the table is empty, recreate the table if ((TAB = table) = = null | | (n = TAB. Length) = = 0) n = (TAB = the resize ()). The length; // If TAB [(n-1) &hash] is empty, store the node at TAB [(n-1) &hash]. // newNode = new Node<>(hash, key, value, next); if ((p = tab[i = (n - 1) & hash]) == null) tab[i] = newNode(hash, key, value, null); Else {// When the TAB [(n-1)&hash] location already has a Node. Node<K,V> e; K k; / / if the existing Node as the key value you want to deposit with the / / e as the Node of the existence of the if (p.h ash = = hash && ((k = p.k ey) = = key | | (key! = null && key.equals(k)))) e = p; else if (p instanceof TreeNode) e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value); Else {// Insert the Node backward all the time. For (int binCount = 0; // If treeify_threshold-1 = 1; ; ++binCount) { if ((e = p.next) == null) { p.next = newNode(hash, key, value, null); if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st treeifyBin(tab, hash); break; } // Break out of the loop if the key values are the same. E = p.n ext, / / at this time already record e Node values if (e.h ash = = hash && ((k = e.k ey) = = key | | (key! = null && key.equals(k)))) break; p = e; } // If (e! = null) { // existing mapping for key V oldValue = e.value; if (! onlyIfAbsent || oldValue == null) e.value = value; afterNodeAccess(e); return oldValue; }} // Internal HashMap modification of the initial ++modCount; If (++size > threshold) resize(); // If (++size > threshold) resize(); // Insert methods, how to do afterNodeInsertion(evict); return null; }Copy the code

Generally, when there is no collision, it is relatively simple and the amount of data is small.

final V putVal(int hash, K key, V value, boolean onlyIfAbsent, boolean evict) { Node<K,V>[] tab; Node<K,V> p; int n, i; if ((tab = table) == null || (n = tab.length) == 0) n = (tab = resize()).length; // newNode = newNode <>(hash, key, value, next); // New Node<>(hash, key, value, next); if ((p = tab[i = (n - 1) & hash]) == null) tab[i] = newNode(hash, key, value, null); ++modCount; if (++size > threshold) resize(); afterNodeInsertion(evict); return null; }Copy the code

After the collision, there’s a treatment for the red-black tree, because the red-black tree has more information, so I’ll explain that separately next time. Here you can refer to the following, from the JDK source code to study the red-black tree. Let me explain the cycle of the impact.

Check whether the same key exists. If the same key exists, break out of the loop and overwrite the value of the key
If the same key does not exist, insert a new Node at the end of the list
- If the list node is too long, convert to a tree.

for (int binCount = 0; ; ++binCount) {// if the list is null, create a new node. If the list is too long, convert it to a tree. if ((e = p.next) == null) { p.next = newNode(hash, key, value, null); if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st treeifyBin(tab, hash); break; } // If the list exists in the same key, store the key value, i.e., e,(e = p.ext). if (e.hash == hash && ((k = e.key) == key || (key ! = null && key.equals(k)))) break; p = e; }Copy the code

The part of the red-black tree, we’ll parse it separately next time

The resize method

There are a lot of lines to go through here. So the first thing we’ll do is we’ll resize(). They are all executed when put is called.

Table == NULL

 if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;
Copy the code

When the number of key-value mappings is greater than the critical value.

if (++size > threshold)
            resize();
Copy the code

Resize specific method

final Node<K,V>[] resize() { Node<K,V>[] oldTab = table; // The threshold may be null or 2 to the n. Int oldCap = (oldTab == null)? 0 : oldTab.length; int oldThr = threshold; Int newCap, newThr = 0; int newCap, newThr = 0; // Resize the second time. if (oldCap > 0) { if (oldCap >= MAXIMUM_CAPACITY) { threshold = Integer.MAX_VALUE; return oldTab; } else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY && oldCap >= DEFAULT_INITIAL_CAPACITY) newThr = oldThr << 1; // double threshold} // The first time resize() is initialized. else if (oldThr > 0) // initial capacity was placed in threshold newCap = oldThr; else { // zero initial threshold signifies using defaults newCap = DEFAULT_INITIAL_CAPACITY; newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY); } // the first time resize() will enter if (newThr == 0) {// loadFactor * initial capacity float ft = (float)newCap * loadFactor; // Ensure that the critical value does not exceed the maximum value. newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ? (int)ft : Integer.MAX_VALUE); } // The actual initialization operation, new array newCap, threshold initialization. threshold = newThr; @SuppressWarnings({"rawtypes","unchecked"}) Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap]; table = newTab; // if (oldTab! = null) { for (int j = 0; j < oldCap; ++j) { Node<K,V> e; if ((e = oldTab[j]) ! = null) {// Re-calculate the hash oldTab[j] = null; if (e.next == null) newTab[e.hash & (newCap - 1)] = e; else if (e instanceof TreeNode) ((TreeNode<K,V>)e).split(this, newTab, j, oldCap); // If e.ext! Else {// Preserve order Node<K,V> loHead = null, loTail = null; Node<K,V> hiHead = null, hiTail = null; Node<K,V> next; do { next = e.next; If ((e.ash & oldCap) == 0) {if (loTail == null) loHead = e; else loTail.next = e; loTail = e; } else {if (hiTail == null) hiHead = e; else hiTail.next = e; hiTail = e; } } while ((e = next) ! = null); if (loTail ! = null) { loTail.next = null; newTab[j] = loHead; } if (hiTail ! = null) { hiTail.next = null; newTab[j + oldCap] = hiHead; } } } } } return newTab; }Copy the code

If it’s resize for the first time, we extract the statement that will be executed.

Initializing capacity
The initialization threshold determines when the number of table key pairs will resize() again

Final Node<K,V>[] resize() {final Node<K,V>[] oldTab = table; // oldCap = 0 and threshod = null int oldCap = (oldTab == null)? 0 : oldTab.length; int oldThr = threshold; int newCap, newThr = 0; If (oldCap > 0) {.... Else {// Zero initial threshold so using defaults newCap = DEFAULT_INITIAL_CAPACITY; newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY); } threshold = newThr; @SuppressWarnings({"rawtypes","unchecked"}) Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap]; table = newTab; return newTab; }Copy the code

The second and subsequent resize executes the process

final Node<K,V>[] resize() { Node<K,V>[] oldTab = table; // oldCap = (oldTab == null)? // oldCap = (oldTab == null)? 0 : oldTab.length; int oldThr = threshold; int newCap, newThr = 0; If (oldCap > 0) {// This if statement guarantees that the capacity of the HashMap does not exceed the upper limit. if (oldCap >= MAXIMUM_CAPACITY) { threshold = Integer.MAX_VALUE; return oldTab; } // Double the size of the table if the capacity limit is not exceeded after expansion. Else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY && oldCap >= DEFAULT_INITIAL_CAPACITY) newThr = oldThr << 1; // double threshold } threshold = newThr; @SuppressWarnings({"rawtypes","unchecked"}) Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap]; table = newTab; // How to handle the last table during the second expansion. if (oldTab ! For (int j = 0; int j = 0; j < oldCap; ++j) { Node<K,V> e; If ((e = oldTab[j])! = null) {oldTab[j] = null; If (e.ext == null) newTab[e.ash & (newcap-1)] = e; Else if (e instanceof TreeNode) ((TreeNode<K,V>)e).split(this, newTab, j, oldCap); Else {// Preserve order Node<K,V> loHead = null, loTail = null; Node<K,V> hiHead = null, hiTail = null; Node<K,V> next; do { next = e.next; //oldCap = 16 = 10000; //oldCap = multiple of 16; If ((e.hash & oldCap) == 0) {// first time if (loTail == null) loHead = e; // Calculate new next else lotail. next = e; loTail = e; } else {if (hiTail == null) hiHead = e; else hiTail.next = e; hiTail = e; } } while ((e = next) ! = null); if (loTail ! = null) { loTail.next = null; newTab[j] = loHead; } // Add the original offset if (hiTail! = null) { hiTail.next = null; newTab[j + oldCap] = hiHead; } } } } } return newTab; }Copy the code

Resize is an interesting place to write operations on linked lists with collisions, so let’s recap. When the index is reassigned, there is an operation to reconstruct the list.

Take a more dramatic example, and the reader will understand.

E.ash < 2, then e.ash &oldCap is equal to 0 and the index is less than the previous hash table size. That’s the same index.
If e.ash > 2, e.ash &old is not equal to 0, then its index is the index of the current table plus the newly expanded size.

This diagram shows what the linked list, which was originally mounted at position 1, looks like when the size of the hashMap table is expanded from 2 to 4.

Node<K,V> loHead = null, loTail = null; Node<K,V> hiHead = null, hiTail = null; Node<K,V> next; do { next = e.next; if ((e.hash & oldCap) == 0) { if (loTail == null) loHead = e; else loTail.next = e; loTail = e; } else { if (hiTail == null) hiHead = e; else hiTail.next = e; hiTail = e; } } while ((e = next) ! = null); if (loTail ! = null) { loTail.next = null; newTab[j] = loHead; } if (hiTail ! = null) { hiTail.next = null; newTab[j + oldCap] = hiHead; }Copy the code

The last

Space is limited, I only introduce the get method, put method, resize method of the specific principle, code paste some more.. .

In fact, for the realization of something to study the principle, the best way or personally see his source code, other people’s comments and principles can only be used as a reference. Be sure to do it by hand.

reference

A HashMap que
The Java 8 series reintroduces HashMap
Java HashMap working principle and implementation

mo4tech.com (Moment For Technology) is a global community with thousands techies from across the global hang out!Passionate technologists, be it gadget freaks, tech enthusiasts, coders, technopreneurs, or CIOs, you would find them all here.

Implementation principles of HashMap

directory

preface

Analytical thinking

parsing

The get method

Put method

The resize method

Resize specific method

The last

reference

Implementation principles of HashMap

directory

preface

Analytical thinking

parsing

The get method

Put method

The resize method

Resize specific method

The last

reference

Related Posts

Portfolio model

The volatile keyword is volatile

A reminder pops up at the bottom of the Snackbar