Tip: the source code address is at the end of this article — feel free to grab it.


Preface

Interview season has come around again. Are you still being caught out by the "simple HashMap" question in interviews? Still unable to implement a basic HashMap by hand? After reading this article, you won't let me down!


Tip: the following is the main content of this article; the examples are for reference only.

Introduction to HashMap

1. What is HashMap?

HashMap is a hash-table-based implementation of the Map interface. It provides all of the optional map operations and permits null values and a null key. (The HashMap class is roughly equivalent to Hashtable, except that it is unsynchronized and permits nulls.) This class makes no guarantees as to the order of the map; in particular, it does not guarantee that the order will remain constant over time. Assuming the hash function disperses the elements properly among the buckets, it provides stable performance for the basic operations (get and put). Iterating over a collection view requires time proportional to the "capacity" of the HashMap instance (the number of buckets) plus its size (the number of key-value mappings). So if iteration performance matters, don't set the initial capacity too high (or the load factor too low). — from Baidu Baike
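A minimal usage sketch of the behavior just described — one null key, null values, and overwrite on duplicate keys (the class name `NullKeyDemo` is mine, not from the article):

```java
import java.util.HashMap;
import java.util.Map;

public class NullKeyDemo {
    public static void main(String[] args) {
        Map<String, String> map = new HashMap<>();
        map.put(null, "null key is allowed");  // a single null key is permitted
        map.put("k", null);                    // null values are allowed too
        map.put("k", "overwritten");           // a duplicate key overwrites the old value
        System.out.println(map.get(null));     // null key is allowed
        System.out.println(map.get("k"));      // overwritten
    }
}
```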

2. Basic principles of HashMap

In JDK 1.8, HashMap is implemented as an array + linked list + red-black tree. I will explain each of these data structures in detail in the following sections.

1. Arrays

Features:

  1. An array is a collection of elements of the same data type.
  2. Array elements are stored in order, occupying consecutive locations in memory.
  3. An element is referenced by the array name plus its position in the array: a[0] is the first element of array a, a[1] the second, and so on.
  4. A subscript may be a constant, a variable, or an expression, but its value must be an integer.
  5. Subscripts form a contiguous range of integers; the minimum is the lower bound and the maximum is the upper bound. In Java the lower bound is always 0.
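The points above can be illustrated in a few lines of Java (a toy example of mine):

```java
public class ArrayDemo {
    public static void main(String[] args) {
        int[] a = {10, 20, 30};   // elements of one type, stored contiguously
        int i = 1 + 1;            // a subscript may be an expression, as long as it is an int
        System.out.println(a[0]); // 10 — the first element: the lower bound is 0
        System.out.println(a[i]); // 30
    }
}
```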

2. Singly linked lists

A singly linked list is a linked list made of nodes, each holding a pointer to its successor. Characteristics:

  1. Adding and deleting nodes is fast and convenient; unlike a sequential structure, no other elements need to be shifted.
  2. Lookup is slower than in an array: reaching an arbitrary element requires walking the list with a loop or recursion, so average query efficiency is lower than that of a sequential list.
  3. Traversal is one-way, from head to tail: a node can reach its successor but never its predecessor, so you can only move forward.

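The characteristics above can be sketched as a minimal singly linked list (a toy class of mine, not the article's code):

```java
// Minimal singly linked list: each node keeps a value and a pointer to its successor.
public class SinglyLinkedList<T> {
    private static class Node<T> {
        T value;
        Node<T> next;
        Node(T value) { this.value = value; }
    }

    private Node<T> head;

    // O(1): link the new node in front of the current head — nothing is shifted
    public void addFirst(T value) {
        Node<T> node = new Node<>(value);
        node.next = head;
        head = node;
    }

    // O(n): traversal can only move forward, node by node
    public boolean contains(T value) {
        for (Node<T> n = head; n != null; n = n.next)
            if (n.value.equals(value)) return true;
        return false;
    }
}
```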

3. Doubly linked lists

A doubly linked list is a linked list in which each node carries two pointers, one to its direct successor and one to its direct predecessor. Starting from any node, you can therefore easily reach both its predecessor and its successor. In practice we usually build doubly linked circular lists. Characteristics:

  1. Each node has two pointers, one to the previous node and one to the next.
  2. Both predecessor and successor are reachable, so traversal can move forward or backward.
  3. Adding and deleting nodes is more involved, and each node needs storage for one extra pointer.
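A minimal sketch of such a node structure (again a toy class of mine):

```java
// Doubly linked list sketch: two pointers per node, so traversal can go both ways,
// but every insertion or removal must maintain both links.
public class DoublyLinkedList<T> {
    static class Node<T> {
        T value;
        Node<T> prev, next;   // one extra pointer compared with a singly linked node
        Node(T value) { this.value = value; }
    }

    Node<T> head, tail;

    public void addLast(T value) {
        Node<T> node = new Node<>(value);
        if (tail == null) {
            head = tail = node;   // first node: both ends point at it
        } else {
            node.prev = tail;     // backward link
            tail.next = node;     // forward link
            tail = node;
        }
    }
}
```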

4. Red-black trees

A red-black tree is a specific kind of binary tree, a self-balancing variant of the binary search tree. The height difference between its left and right subtrees may be greater than 1, so a red-black tree is not a strictly balanced binary tree; but since every red-black tree is also a binary sort tree, lookups can use the ordinary binary-search-tree search algorithm.

Characteristics:

  1. Every node is either red or black.
  2. The root node must be black.
  3. Both children of a red node must be black — no two red nodes appear in a row on any path.
  4. Every path from any node down to each of its leaves contains the same number of black nodes.

These constraints enforce the key property of red-black trees: the longest path from the root to a leaf is no more than twice as long as the shortest such path, so the tree is roughly balanced. Because the worst-case time of operations such as insertion, deletion, and lookup is proportional to the height of the tree, this upper bound on the height makes red-black trees efficient in the worst case, unlike ordinary binary search trees.
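Property 4 above — equal black counts on every path — can be checked mechanically. Here is a small sketch on a hand-built node structure (the `Node` class and `blackHeight` helper are hypothetical, not JDK code):

```java
public class BlackHeightCheck {
    static class Node {
        boolean red;
        Node left, right;
        Node(boolean red, Node left, Node right) {
            this.red = red; this.left = left; this.right = right;
        }
    }

    // Returns the black height of the subtree, or -1 if two paths
    // disagree on the black count (i.e. property 4 is violated).
    static int blackHeight(Node n) {
        if (n == null) return 1;               // null leaves count as black
        int l = blackHeight(n.left);
        int r = blackHeight(n.right);
        if (l == -1 || r == -1 || l != r) return -1;
        return l + (n.red ? 0 : 1);
    }

    public static void main(String[] args) {
        // a black root with two red children: every path holds one black node
        Node root = new Node(false,
                new Node(true, null, null),
                new Node(true, null, null));
        System.out.println(blackHeight(root));  // 2
    }
}
```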

HashMap source code

Now let's look at the underlying source code of HashMap, starting with its storage structure.

1. PUT operation

```java
final V putVal(int hash, K key, V value, boolean onlyIfAbsent, boolean evict) {
    Node<K,V>[] tab; Node<K,V> p; int n, i;
    // 1. If the current table is empty, create it at the default size
    if ((tab = table) == null || (n = tab.length) == 0)
        n = (tab = resize()).length;
    // 2. Locate the bucket for this key
    if ((p = tab[i = (n - 1) & hash]) == null)
        // 3. Bucket is empty: place a new node there
        tab[i] = newNode(hash, key, value, null);
    else {
        // 4. Bucket already has a node: a collision
        Node<K,V> e; K k;
        // 5. Same hash and same key (by reference or equals): remember it for overwrite
        if (p.hash == hash &&
            ((k = p.key) == key || (key != null && key.equals(k))))
            e = p;
        // 6. Bucket holds a red-black tree: insert into the tree
        else if (p instanceof TreeNode)
            e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
        else {
            // 7. Bucket holds a linked list: walk it
            for (int binCount = 0; ; ++binCount) {
                if ((e = p.next) == null) {
                    // 8. Reached the tail: append a new node
                    p.next = newNode(hash, key, value, null);
                    // 9. List got too long: convert it to a red-black tree
                    if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                        treeifyBin(tab, hash);
                    break;
                }
                // 10. Found an equal key in the list: overwrite below
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    break;
                p = e;
            }
        }
        if (e != null) { // existing mapping for key
            V oldValue = e.value;
            // Replace the value unless onlyIfAbsent forbids it
            if (!onlyIfAbsent || oldValue == null)
                e.value = value;
            afterNodeAccess(e);
            return oldValue;
        }
    }
    ++modCount;
    // 11. Grow the table once size exceeds the threshold
    if (++size > threshold)
        resize();
    afterNodeInsertion(evict);
    return null;
}
```

1. Calculate the hashcode value of the key

2. If the hash table is empty, call resize() to initialize the hash table

3. If there is no collision, add the element directly to the hash table

4. If a collision has occurred (another node already occupies that bucket), make three judgments:

  1. If the first node has the same hash and an equal key, overwrite its value.
  2. If the bucket holds a red-black tree, call the tree insertion method.
  3. Otherwise traverse the linked list: overwrite the node whose hash and key match the inserted element, or append a new node at the tail (converting the list to a tree once it passes the treeify threshold).

5. If the capacity of the bucket exceeds the threshold, resize the bucket to expand capacity
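The steps above hinge on how a key is mapped to a bucket. A small sketch of the addressing — the hash-spreading formula is the JDK 8 one, while the demo class itself is mine:

```java
// Sketch of how putVal locates a bucket: the high bits of hashCode are XOR-ed
// down into the low bits ("spreading"), then (n - 1) & hash picks an index,
// which works like a fast modulo because n is a power of two.
public class BucketIndexDemo {
    static int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    public static void main(String[] args) {
        int n = 16;                        // default table capacity
        String key = "name";
        int index = (n - 1) & hash(key);   // equivalent to hash % n for power-of-two n
        System.out.println(index);         // always in [0, 15]
    }
}
```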

2. GET operation

```java
final Node<K,V> getNode(int hash, Object key) {
    Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
    if ((tab = table) != null && (n = tab.length) > 0 &&
        (first = tab[(n - 1) & hash]) != null) {
        if (first.hash == hash && // always check first node
            ((k = first.key) == key || (key != null && key.equals(k))))
            return first;
        if ((e = first.next) != null) {
            // Bucket is a red-black tree: search the tree
            if (first instanceof TreeNode)
                return ((TreeNode<K,V>)first).getTreeNode(hash, key);
            // Otherwise search the linked list
            do {
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    return e;
            } while ((e = e.next) != null);
        }
    }
    return null;
}

final TreeNode<K,V> getTreeNode(int h, Object k) {
    return ((parent != null) ? root() : this).find(h, k, null);
}

final TreeNode<K,V> find(int h, Object k, Class<?> kc) {
    TreeNode<K,V> p = this;
    do {
        int ph, dir; K pk;
        TreeNode<K,V> pl = p.left, pr = p.right, q;
        // Descend by hash value first
        if ((ph = p.hash) > h)
            p = pl;
        else if (ph < h)
            p = pr;
        // Hashes are equal: compare the keys
        else if ((pk = p.key) == k || (k != null && k.equals(pk)))
            return p;
        else if (pl == null)
            p = pr;
        else if (pr == null)
            p = pl;
        // If k is Comparable and compareTo(pk) is non-zero, pick a side
        else if ((kc != null ||
                  (kc = comparableClassFor(k)) != null) &&
                 (dir = compareComparables(kc, k, pk)) != 0)
            p = (dir < 0) ? pl : pr;
        // Last resort: search the right subtree recursively, else go left
        else if ((q = pr.find(h, k, kc)) != null)
            return q;
        else
            p = pl;
    } while (p != null);
    return null;
}
```
  1. Compute the hash value from the key's hashCode.
  2. If the first node of the located bucket matches, return it directly; otherwise search the red-black tree or walk the linked list in that bucket.
  3. When there is a hash conflict, equals() is used while traversing the list to find the matching node.

3. The RESIZE operation

When a HashMap expands, its length doubles, so each element either stays at the same index or moves by exactly the old capacity (a power of two).

```java
final Node<K,V>[] resize() {
    Node<K,V>[] oldTab = table;
    int oldCap = (oldTab == null) ? 0 : oldTab.length;
    int oldThr = threshold;
    int newCap, newThr = 0;
    /* 1. resize() called because size > threshold: oldCap > 0 means the
       original table exists; oldCap is its size and oldThr (threshold)
       is oldCap * load_factor. Double capacity and threshold. */
    if (oldCap > 0) {
        if (oldCap >= MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return oldTab;
        }
        else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                 oldCap >= DEFAULT_INITIAL_CAPACITY)
            newThr = oldThr << 1; // double threshold
    }
    /* 2. resize() called while the table is still empty but oldThr > 0:
       the map was created with HashMap(int initialCapacity, float loadFactor),
       HashMap(int initialCapacity), or HashMap(Map<? extends K, ? extends V> m),
       so oldTab is null, oldCap is 0, and oldThr holds the user-specified
       initial capacity. */
    else if (oldThr > 0) // initial capacity was placed in threshold
        newCap = oldThr;
    /* 3. resize() called while the table is empty and oldThr == 0:
       the no-arg HashMap() constructor was used, everything is at its
       default, so fall back to the default capacity and threshold. */
    else {               // zero initial threshold signifies using defaults
        newCap = DEFAULT_INITIAL_CAPACITY;
        newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
    }
    if (newThr == 0) {
        float ft = (float)newCap * loadFactor;
        newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                  (int)ft : Integer.MAX_VALUE);
    }
    threshold = newThr;
    Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
    table = newTab;
    if (oldTab != null) {
        // Move every bucket of the old table into the new one
        for (int j = 0; j < oldCap; ++j) {
            Node<K,V> e;
            if ((e = oldTab[j]) != null) {
                oldTab[j] = null;
                // Single node: recompute its index directly
                if (e.next == null)
                    newTab[e.hash & (newCap - 1)] = e;
                // Tree bucket: split the tree
                else if (e instanceof TreeNode)
                    ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                else { // preserve order
                    Node<K,V> loHead = null, loTail = null;
                    Node<K,V> hiHead = null, hiTail = null;
                    Node<K,V> next;
                    do {
                        next = e.next;
                        // (e.hash & oldCap) == 0: the entry stays at index j
                        if ((e.hash & oldCap) == 0) {
                            if (loTail == null)
                                loHead = e;
                            else
                                loTail.next = e;
                            loTail = e;
                        }
                        else { // otherwise it moves to index j + oldCap
                            if (hiTail == null)
                                hiHead = e;
                            else
                                hiTail.next = e;
                            hiTail = e;
                        }
                    } while ((e = next) != null);
                    if (loTail != null) {
                        loTail.next = null;
                        newTab[j] = loHead;
                    }
                    if (hiTail != null) {
                        hiTail.next = null;
                        // moved by oldCap
                        newTab[j + oldCap] = hiHead;
                    }
                }
            }
        }
    }
    return newTab;
}

// Split a tree bucket into lo and hi lists during resize
final void split(HashMap<K,V> map, Node<K,V>[] tab, int index, int bit) {
    TreeNode<K,V> b = this;
    // Relink into lo and hi lists, preserving order
    TreeNode<K,V> loHead = null, loTail = null;
    TreeNode<K,V> hiHead = null, hiTail = null;
    int lc = 0, hc = 0;
    for (TreeNode<K,V> e = b, next; e != null; e = next) {
        next = (TreeNode<K,V>)e.next;
        e.next = null;
        if ((e.hash & bit) == 0) {
            if ((e.prev = loTail) == null)
                loHead = e;
            else
                loTail.next = e;
            loTail = e;
            ++lc;
        }
        else {
            if ((e.prev = hiTail) == null)
                hiHead = e;
            else
                hiTail.next = e;
            hiTail = e;
            ++hc;
        }
    }
    // Untreeify short lists, re-treeify the rest
    if (loHead != null) {
        if (lc <= UNTREEIFY_THRESHOLD)
            tab[index] = loHead.untreeify(map);
        else {
            tab[index] = loHead;
            if (hiHead != null) // (else is already treeified)
                loHead.treeify(tab);
        }
    }
    if (hiHead != null) {
        if (hc <= UNTREEIFY_THRESHOLD)
            tab[index + bit] = hiHead.untreeify(map);
        else {
            tab[index + bit] = hiHead;
            if (loHead != null)
                hiHead.treeify(tab);
        }
    }
}
```
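The "stay at j or move to j + oldCap" rule in the resize code above can be checked with a couple of concrete hashes (the demo values are mine):

```java
// With newCap = oldCap * 2, the single bit (hash & oldCap) decides whether an
// entry keeps its index or shifts by exactly oldCap — no full rehash is needed.
public class ResizeSplitDemo {
    public static void main(String[] args) {
        int oldCap = 16;
        int hashA = 0b0_0101;  // bit 16 clear -> stays at the same index
        int hashB = 0b1_0101;  // bit 16 set   -> moves to index + oldCap

        // Before resize, both hashes land in the same bucket
        System.out.println((hashA & (oldCap - 1)) == (hashB & (oldCap - 1))); // true

        int newCap = oldCap << 1;
        System.out.println(hashA & (newCap - 1)); // 5  (unchanged)
        System.out.println(hashB & (newCap - 1)); // 21 (5 + oldCap)
    }
}
```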

Four, a simple handwritten HashMap

HyhMap interface

```java
package com.hyh.core.test.map;

/**
 * @author heyuhua
 * @create 2021/2/9 15:29
 */
public interface HyhMap<K, V> {

    /**
     * Put a key-value pair.
     */
    void put(K k, V v);

    /**
     * Get the value for a key.
     */
    V get(K k);

    /**
     * Number of mappings.
     */
    int size();

    /**
     * Entry interface.
     */
    interface Entry<K, V> {

        /**
         * getKey
         */
        K getKey();

        /**
         * getValue
         */
        V getValue();
    }
}
```

HyhHashMap implementation class

```java
package com.hyh.core.test.map;

import java.io.Serializable;

/**
 * @author heyuhua
 * @create 2021/2/9 15:31
 */
public class HyhHashMap<K, V> implements HyhMap<K, V>, Serializable {

    /** Default capacity */
    static final int DEFAULT_CAPACITY = 16;

    /** Resize threshold */
    int threshold;

    /** Key index */
    int keyIndex;

    /** Load factor */
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

    /** Bucket table */
    Node<K, V>[] table;

    /** Current size of the map */
    int size;

    @Override
    public void put(K key, V value) {
        Node<K, V> node;
        if (table == null) {
            table = resize();
            node = new Node<>(hash(key), key, value, null);
            table[keyIndex] = node;
            size++;
        } else {
            table = resize();
            // table is not null
            Node<K, V> n;
            // hash-conflict flag
            boolean hashConflict = false;
            for (int i = 0; i < table.length; i++) {
                n = table[i];
                if (n != null) {
                    if (n.hash == hash(key)) {
                        hashConflict = true;
                        // hashes are equal: walk the chain
                        while (n != null) {
                            if (n.key.equals(key)) {
                                // hash and key both equal: overwrite the value
                                n.value = value;
                                table[i] = n;
                                size++;
                            } else {
                                // same hash, different key: chain a new node in front
                                node = new Node<>(hash(key), key, value, null);
                                node.next = n;
                                table[i] = node;
                                size++;
                            }
                            n = n.next;
                        }
                    }
                }
            }
            if (!hashConflict) {
                // no conflict: put into the next free slot
                node = new Node<>(hash(key), key, value, null);
                table[++keyIndex] = node;
                size++;
            }
        }
    }

    @Override
    public V get(K key) {
        Node<K, V> node;
        return (node = getNode(key)) == null ? null : node.value;
    }

    /**
     * Find the node for a key.
     */
    final Node<K, V> getNode(Object key) {
        if (table != null) {
            for (int i = 0; i < table.length; i++) {
                Node<K, V> node = table[i];
                if (node != null) {
                    // hashes equal: walk the chain
                    if (node.hash == hash(key)) {
                        while (node != null) {
                            // hash and key both equal: found it
                            if (node.key.equals(key)) {
                                return node;
                            }
                            node = node.next;
                        }
                    }
                }
            }
        }
        return null;
    }

    /**
     * Resize (expand) the table.
     */
    final Node<K, V>[] resize() {
        Node<K, V>[] newTable;
        int newCapacity, oldCapacity;
        if (table == null) {
            keyIndex = 0;
            threshold = (int) (DEFAULT_CAPACITY * DEFAULT_LOAD_FACTOR);
            table = (Node<K, V>[]) new Node[DEFAULT_CAPACITY];
            newTable = table;
        } else {
            oldCapacity = table.length;
            if (table.length > threshold) {
                newCapacity = threshold *= 2;
                newTable = (Node<K, V>[]) new Node[newCapacity];
                // move the old entries into newTable
                int newIndex = 0;
                for (int i = 0; i < oldCapacity; i++) {
                    Node<K, V> node = table[i];
                    if (node != null) {
                        if (node.next == null)
                            newTable[newIndex] = node;
                        else {
                            Node<K, V> loHead = null, loTail = null,
                                       hiHead = null, hiTail = null, next;
                            do {
                                next = node.next;
                                if (node.hash == 0) {
                                    if (loTail == null) loHead = node;
                                    else loTail.next = node;
                                    loTail = node;
                                } else {
                                    if (hiTail == null) hiHead = node;
                                    else hiTail.next = node;
                                    hiTail = node;
                                }
                            } while ((node = next) != null);
                            if (loTail != null) {
                                loTail.next = null;
                                newTable[newIndex] = loHead;
                            }
                            if (hiTail != null) {
                                hiTail.next = null;
                                newTable[newIndex + oldCapacity] = hiHead;
                            }
                        }
                    }
                    newIndex++;
                }
            } else {
                newTable = table;
            }
        }
        return newTable;
    }

    /**
     * Compute the hash of a key.
     */
    static final int hash(Object key) {
        int h;
        return (key == null) ? 0 : Math.abs((h = key.hashCode()) ^ (h >>> 16));
    }

    @Override
    public int size() {
        return size;
    }

    /**
     * Linked-list node.
     */
    static class Node<K, V> implements HyhMap.Entry<K, V> {
        // hash value
        final int hash;
        // key
        final K key;
        // value
        V value;
        // next node in the chain
        Node<K, V> next;

        public Node(int hash, K key, V value, Node<K, V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }

        public final K getKey() {
            return key;
        }

        public final V getValue() {
            return value;
        }
    }
}
```

Unit testing

Let's test this HashMap that took half an hour to write. It has no red-black tree, so it will not perform as well as the JDK's HashMap.

```java
package com.hyh.core.test.map;

import org.junit.Test;

/**
 * HyhHashMap Test
 *
 * @author heyuhua
 * @create 2021/2/9 17:54
 */
public class HyhHashMapTest {

    /**
     * Common test
     */
    @Test
    public void test() {
        HyhHashMap<String, String> hyhHashMap = new HyhHashMap<>();
        hyhHashMap.put("name", "heyuhua");
        hyhHashMap.put("height", "180cm");
        hyhHashMap.put("name", "hyh");
        hyhHashMap.put("height", "180");
        System.out.println("name:" + hyhHashMap.get("name")
                + ",height:" + hyhHashMap.get("height"));
    }

    /**
     * Hash collision test (the two keys are chosen to collide)
     */
    @Test
    public void testHashConflict() {
        HyhHashMap<String, String> hyhHashMap = new HyhHashMap<>();
        hyhHashMap.put("hu Gong", "heyuhua1");
        hyhHashMap.put("zhen 齻", "heyuhua2");
        System.out.println("hu Gong:" + hyhHashMap.get("hu Gong")
                + ",zhen 齻:" + hyhHashMap.get("zhen 齻"));
    }
}
```

Let’s look at the results

Look at what ends up in the Map. Of course, this half-hour implementation has bugs — the code is for reference only, meant to give you a deeper understanding of how HashMap works.

Five, common HashMap interview questions

OK — after this quick look at the underlying source code, we have some insight into HashMap. Here are some common HashMap interview questions.

1. What are the features of HashMap?

  • Stores key-value pairs with fast access; allows null; keys cannot repeat — inserting a duplicate key overwrites the old value.
  • It is not thread-safe.
  • The underlying hash table does not guarantee any ordering.

2. The underlying principle of HashMap?

  • JDK 7 uses an array + linked list; JDK 8 uses an array + linked list + red-black tree.

3. How does HashMap Put work?

  • When we pass a key and value to put(), the key's hashCode() is computed and spread into a hash, which determines the bucket in the array where the Entry object is stored; collisions are then resolved within that bucket's list or tree.

4. How does HashMap Get work?

  • When fetching an object, get() computes the bucket location from the key's hash, finds the correct key-value pair via the key object's equals() method, and returns the value object.

5. HashMap expansion mechanism?

  • Expansion allocates a new array twice as long as the old one, then iterates over the old structure and rehashes all elements into the new one.

6. The default initialization length of a HashMap is 16, and why must it be a power of 2 every time it is automatically expanded or manually initialized?

  • To distribute data evenly and reduce hash collisions. The bucket index is computed bitwise, via (n - 1) & hash, which behaves like a uniform modulo only when the capacity is a power of two — any other size would increase collisions and waste array slots.

  • Even if the requested capacity is not a power of two, HashMap rounds it up, using shift-and-or operations, to the nearest power of two at or above the requested number.
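This shift-and-or rounding can be sketched like this (modeled on JDK 8's HashMap.tableSizeFor, minus the MAXIMUM_CAPACITY cap; the demo class is mine):

```java
public class TableSizeDemo {
    // Smear the highest set bit of (cap - 1) into all lower bits,
    // then +1 yields the next power of two at or above cap.
    static int tableSizeFor(int cap) {
        int n = cap - 1;
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : n + 1;
    }

    public static void main(String[] args) {
        System.out.println(tableSizeFor(13)); // 16
        System.out.println(tableSizeFor(16)); // 16 (already a power of two)
        System.out.println(tableSizeFor(17)); // 32
    }
}
```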

7. What if the size of the HashMap exceeds the capacity defined by the Load factor?

  • Once the threshold is exceeded, the array is expanded — generally to twice its original size — and the existing elements are rehashed into the new hash table.

The author remarks

The road ahead is long, and I'm glad to travel it with you. Thank you very much for your likes, favorites and comments. See you next time.

Source code: click here to view the source code.