As a qualified Java development engineer, source code is essential, and hashmap is the most important, compared to many students in the autumn recruitment interview will often be asked hashmap bar! Take a look at the magic of HashMap today!

The data structure

First we need to know the data structure of the HashMap: Before JDK1.7, it was array + linked list. Linked list exists because hashMap uses a hash algorithm for positioning. The hash algorithm will cause the hash value of different data to be the same, so it will be linked, and then iterate through next to find the specified value.

Base1.8: Red-black tree was introduced later, aiming to solve the efficiency problem caused by the excessively long linked list. If the length of the linked list is too long, the traversal time will change from O(1) to O(N), and the red-black tree can solve this problem.

As can be seen from the figure above, the length of the linked list will be treed into a red-black tree after reaching a certain length, which is 8. Why is it 8? This is because research shows that the probability of a list with a hash slot length of 8 is 6 in a million, which means that the probability of 8 hash collisions is low. As a side note, the value of the red-black tree list is 6, and to prevent hash collisions from hovering around 8, the result is always treed/chained.

Let’s look at some of the basic attributes in the HashMap source code

Static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; Static final int MAXIMUM_CAPACITY = 1 << 30; // Load factor, used to calculate the capacity expansion threshold. Default is 0.75 static final float DEFAULT_LOAD_FACTOR = 0.75f; Static final int TREEIFY_THRESHOLD = 8; static final int TREEIFY_THRESHOLD = 8; Static final int UNTREEIFY_THRESHOLD = 6; static final int UNTREEIFY_THRESHOLD = 6; // Minimum tree capacity threshold, when the capacity of the hash table reaches 64. // If the list does not reach the threshold, the list is not allowed to be converted to red black tree. static final int MIN_TREEIFY_CAPACITY = 64; <K,V>[] table; // K-v key pair TRANSIENT Set< map. Entry<K,V>> entrySet; // list transient int size; // Transient int modCount; // Array size, loadFactor* array size int threshold; // loadFactor final float loadFactor;Copy the code

Once you understand the basic properties and explanations, start looking at each method, starting with the constructors

Public HashMap(int initialCapacity, Float loadFactor) {// If (initialCapacity < 0) throw new IllegalArgumentException("Illegal Initial ") capacity: " + initialCapacity); InitialCapacity = MAXIMUM_CAPACITY; if (initialCapacity > MAXIMUM_CAPACITY) initialCapacity = MAXIMUM_CAPACITY; / / if the load factor < = 0 or null throws an exception if (loadFactor < = 0 | | Float. The isNaN (loadFactor)) throw new IllegalArgumentException (" Illegal load factor: " + loadFactor); // initialize this.loadFactor = loadFactor; // If the tableSizeFor method is invoked, the capacity expansion threshold becomes a power of 2. } static final int tableSizeFor(int cap) {int n = cap-1; n |= n >>> 1; n |= n >>> 2; n |= n >>> 4; n |= n >>> 8; n |= n >>> 16; return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1; Public HashMap(map <? extends K, ? Extends V> m) {this.loadFactor = DEFAULT_LOAD_FACTOR; putMapEntries(m, false); } public HashMap() {this.loadfactor = DEFAULT_LOAD_FACTOR; } public HashMap(int initialCapacity) {this(initialCapacity, DEFAULT_LOAD_FACTOR);} public HashMap(int initialCapacity, DEFAULT_LOAD_FACTOR); }Copy the code

You can see that whatever capacity we pass in is going to be the nearest power of two greater than that, so why do we do that?

  • In order to replace the efficiency of the modular operation, since the array size is the hash value %, the binary number of the power of 2 is only 1 and the rest of the binary number is 0. After -1, the first digit becomes 0 and the following digit becomes 1. And operations are the most efficient in operating systems.
  • Another case is to reduce hash conflicts. Compare the case when the capacity is 16/10. When the capacity is 16, the binary value of Lenth-1 is 01111,01001/01101/01100/01111 and the result is different after & operation. If the capacity is 10, the binary bit 01001 after -1 is the same as 01001/01101/01111, which increases hash conflict.
  • When it comes to reducing hash collisions, there is also a perturbation algorithm that you must know about, the core of which is xOR between the high 16 bits of the hash value and the low 16 bits. The reason for doing this is because we only evaluate the last 16 bits of the hash when positioning (Length-1) &hash, and further reduce hash collisions by including the top 16 bits.

Let’s take a look at the core methods of hashMap

The first is the way to add data

public V put(K key, V value) { return putVal(hash(key), key, value, false, true); } /** * @param hash key hash value * @param key,K * @param value, V * @param onlyIfAbsent if true, does not change the value of the existing key * @param evict * @return if the node already exists, Return the previous value, otherwise return NULL */ final V putVal(int hash, K key, V value, Boolean onlyIfAbsent, Boolean evict) {// TAB: Reference the current hash hash //p: hash element //n: array length // I: route addressing result Node<K,V>[] TAB; Node<K,V> p; int n, i; // If the array is empty or the array length =0, Super-popular if ((TAB = table) = = null | | (n = TAB. Length) = = 0) / / resize () method of judging method for initialization n = (TAB = the resize ()). The length; // I route addressing algorithm (array length -1) &hash, If ((p = TAB [I = (n-1) & hash]) == null) TAB [I] = newNode(hash, key, value, null); Else {//e is used to temporarily store a node element, k is a temporary key node < k,V> e; K k; / / said that the current bucket a hash value, and insert the key in the hash values are equal and the key values are equal or key is not null and the key, the key is equal to the current replacement happens if (p.h ash = = hash && ((k = p.k ey) = = key | | (key! = null && key.equals(k)))) e = p; Else if (p instanceof TreeNode) e = ((TreeNode<K,V>)p).puttreeval (this, TAB, hash, key, value); Else {for (int binCount = 0; ; ++binCount) {if ((e = p.ext) == null) {// Insert the end of the list p.ext = newNode(hash, key, value, null); If (binCount >= treeify_threshthreshold - 1) treeifyBin(TAB, hash); break; } / / if found the same hash traversal process and the key is equal to or of the key is not null and spread into the key is equal to directly jump out of the traversal of the if (e.h ash = = hash && ((k = e.k ey) = = key | | (key! = null && key.equals(k)))) break; p = e; }} // If (e! = null) {// existing mapping for key V oldValue = e.value; // If onlyIfAbsent is false, value is overwritten and the old value if (! onlyIfAbsent || oldValue == null) e.value = value; afterNodeAccess(e); return oldValue; }} // Maintain a record of the number of changes ++modCount; If (++size > threshold) resize(); afterNodeInsertion(evict); return null; }Copy the code

Get method, the method to get data

Public V get(Object key) {Node<K,V> e; return (e = getNode(hash(key), key)) == null ? null : e.value; } final Node<K,V> getNode(int hash, Object key) {// TAB: current hash //first: //e: temporary node element //n: array length node <K,V>[] TAB; Node<K,V> first, e; int n; K k; // If the current hash table is not empty, the table length is greater than 0, and the bucket corresponding to the key has a head node, the execution continues, otherwise null if ((TAB = table)! = null && (n = tab.length) > 0 && (first = tab[(n - 1) & hash]) ! If (first.hash == hash && // always check first) {// If the hash of the header is equal to the hash of the key and the key of the header is equal to the incoming key, or the key is equal to the header, you can return the header if (first.hash == hash && // always check first node ((k = first.key) == key || (key ! = null && key.equals(k)))) return first; // If ((e = first.next)! Return ((TreeNode<K,V>)first).getTreenode (hash, key); return ((TreeNode<K,V>)first). / / traverse the list do {/ / whether the node is equal to the key if (e.h ash = = hash && ((k = e.k ey) = = key | | (key! = null && key.equals(k)))) // Return the matching node e; } while ((e = e.next) ! = null); }} return null; }Copy the code

The method of inserting and obtaining data of HashMap will be analyzed temporarily, and the expansion method of HashMap will be further analyzed in the next chapter. Please look forward to the differences, please correct!