Why do interviewers love to ask about HashMap, and why do so many candidates get tripped up by it?

 

Huh? Don’t know how it works? You must be joking!

It’s just a hash plus a map: take the key’s hashCode, rehash it into an index in the array, and store the object there. And then the interviewer says, “OK, got it, go home and wait for our call!”

 

That kind of answer is not what interviewers want to hear. They want programmers with their own understanding, their own analysis, and their own ideas about the optimizations!

Dear guest, come, here is the 1.7 source code!

// JDK 1.7: key fields and helpers
public class HashMap<K,V> extends AbstractMap<K,V>
        implements Map<K,V>, Cloneable, Serializable {

    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // 16
    static final int MAXIMUM_CAPACITY = 1 << 30;        // 1073741824
    static final float DEFAULT_LOAD_FACTOR = 0.75f;     // load factor 0.75
    transient Entry<K,V>[] table; // Entry array; Entry is a singly-linked-list node
    transient int size;           // number of key-value mappings
    int threshold;                // capacity * load factor, e.g. 16 * 0.75 = 12

    void resize(int newCapacity) { ... } // expansion, shown below

    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    void createEntry(int hash, K key, V value, int bucketIndex) {
        Entry<K,V> e = table[bucketIndex];
        // head insertion: the new entry's next points at the old head,
        // so colliding entries chain off the same bucket
        table[bucketIndex] = new Entry<>(hash, key, value, e);
        size++;
    }
}

The default initial capacity of a HashMap is 1 << 4 = 16, and the maximum capacity is 1 << 30 = 1073741824. As you can see, 1.7 uses an array-plus-singly-linked-list structure, and Entry is a singly-linked-list node that implements Map.Entry. size is the number of entries in the HashMap, and threshold is the threshold that triggers expansion on insert. From the resize method it can be seen that 1.7 expands before the new data is inserted.

Hold on, here is how 1.7 expands and then inserts!

The initial capacity is 16. If the number of stored entries reaches the threshold (capacity * load factor = 12) and the target bucket already holds an entry, the table is expanded. Each expansion doubles the capacity, keeping it a power of two, which makes the bit trick work: length - 1 then has all of its low bits set to 1, so h & (length - 1) can reach every bucket and collisions are reduced. During transfer, int i = indexFor(e.hash, newCapacity) recomputes each index, i.e. hash & (length - 1), and the threshold is reset to newCapacity * loadFactor.
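As a small sketch of why the power-of-two capacity matters (the class name and the sample hash values below are made up for illustration):

```java
public class IndexForDemo {
    // same logic as 1.7's indexFor: mask off everything above the low bits
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        int capacity = 16; // power of two, so capacity - 1 = 0b1111
        // the mask keeps only the low 4 bits, equivalent to h % 16 but cheaper
        System.out.println(indexFor(0b10110101, capacity)); // prints 5
        System.out.println(indexFor(37, capacity));         // prints 5
    }
}
```

Both sample hashes collide in bucket 5, which is exactly the kind of collision the chained Entry list absorbs.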

void resize(int newCapacity) { // typically called as resize(2 * table.length) when size exceeds the threshold
    Entry[] oldTable = table;
    int oldCapacity = oldTable.length;
    if (oldCapacity == MAXIMUM_CAPACITY) { // already at the maximum: stop growing
        threshold = Integer.MAX_VALUE;
        return;
    }
    Entry[] newTable = new Entry[newCapacity];
    boolean oldAltHashing = useAltHashing;
    useAltHashing |= sun.misc.VM.isBooted() &&
            (newCapacity >= Holder.ALTERNATIVE_HASHING_THRESHOLD);
    boolean rehash = oldAltHashing ^ useAltHashing;
    transfer(newTable, rehash);
    table = newTable; // expansion complete
    threshold = (int) Math.min(newCapacity * loadFactor, MAXIMUM_CAPACITY + 1);
}

void transfer(Entry[] newTable, boolean rehash) {
    int newCapacity = newTable.length;
    for (Entry<K,V> e : table) {        // walk every bucket of the old table
        while (null != e) {
            Entry<K,V> next = e.next;
            if (rehash) {
                e.hash = null == e.key ? 0 : hash(e.key);
            }
            int i = indexFor(e.hash, newCapacity); // recompute the index
            e.next = newTable[i];  // head insertion into the new bucket
            newTable[i] = e;
            e = next;
        }
    }
}

1.8? That one is harder. Hang in there! Hang in there!

 

Dear guest, there is still 1.8!

// JDK 1.8: key fields
public class HashMap<K,V> extends AbstractMap<K,V>
        implements Map<K,V>, Cloneable, Serializable {

    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // 16
    static final int MAXIMUM_CAPACITY = 1 << 30;
    static final float DEFAULT_LOAD_FACTOR = 0.75f;
    // Bins are converted to trees when adding an element to a bin with at least this many nodes
    static final int TREEIFY_THRESHOLD = 8;
    // A tree is converted back to a linked list when it shrinks to this many nodes
    static final int UNTREEIFY_THRESHOLD = 6;
    // The smallest table capacity for which bins may be treeified
    static final int MIN_TREEIFY_CAPACITY = 64;

    static class Node<K,V> implements Map.Entry<K,V> { ... }                  // linked-list node
    static final class TreeNode<K,V> extends LinkedHashMap.Entry<K,V> { ... } // red-black-tree node

    transient Node<K,V>[] table;            // the hash table
    transient Set<Map.Entry<K,V>> entrySet; // entry view, used for iteration
    transient int size;                     // number of entries
    transient int modCount;                 // number of structural modifications
}

From the source we can see that the initial capacity is still 16 and the load factor still 0.75, but constants for the red-black tree have been added. 1.8 uses an array plus singly linked lists plus red-black trees, and still expands by powers of two.

The hash method:

static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}

If the key is null, the hash is 0. Otherwise the hashCode's high 16 bits are XORed into its low 16 bits (h ^ (h >>> 16)), so that the high bits also take part in the index calculation and collisions are reduced.
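A quick sketch of the effect (the class name and the two sample hashCodes are invented for illustration):

```java
public class HashSpreadDemo {
    // same perturbation as 1.8's hash(): fold the high 16 bits into the low 16
    static int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    public static void main(String[] args) {
        int n = 16;                   // table length
        int a = 0x10000, b = 0x20000; // differ only in the high bits
        // with raw hashCodes, both masks give bucket 0: a guaranteed collision
        System.out.println((a & (n - 1)) + ", " + (b & (n - 1))); // prints 0, 0
        // after spreading, the high bits break the tie
        System.out.println((hash(a) & (n - 1)) + ", " + (hash(b) & (n - 1))); // prints 1, 2
    }
}
```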

The put method for storing values:

public V put(K key, V value) {
    return putVal(hash(key), key, value, false, true);
}

final V putVal(int hash, K key, V value, boolean onlyIfAbsent, boolean evict) {
    Node<K,V>[] tab; Node<K,V> p; int n, i;
    // the table is allocated lazily: the first put triggers resize()
    if ((tab = table) == null || (n = tab.length) == 0)
        n = (tab = resize()).length;
    // (n - 1) & hash computes the index; if the bucket is empty, just place the node
    if ((p = tab[i = (n - 1) & hash]) == null)
        tab[i] = newNode(hash, key, value, null);
    else { // the bucket is occupied: a hash collision
        Node<K,V> e; K k;
        // case 1: the first node has the same hash and an equal key
        if (p.hash == hash &&
            ((k = p.key) == key || (key != null && key.equals(k))))
            e = p;
        // case 2: the bucket is already a red-black tree
        else if (p instanceof TreeNode)
            e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
        else { // case 3: walk the linked list
            for (int binCount = 0; ; ++binCount) {
                if ((e = p.next) == null) { // tail insertion at the end of the list
                    p.next = newNode(hash, key, value, null);
                    if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                        treeifyBin(tab, hash); // treeify once the bin holds 8 nodes
                    break;
                }
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    break; // an existing node has the same key: stop searching
                p = e;
            }
        }
        if (e != null) { // existing mapping for key: overwrite the value
            V oldValue = e.value;
            if (!onlyIfAbsent || oldValue == null)
                e.value = value;
            afterNodeAccess(e);
            return oldValue;
        }
    }
    ++modCount;
    if (++size > threshold) // 1.8 expands after the insertion
        resize();
    afterNodeInsertion(evict);
    return null;
}

If the target bucket is empty, the node is simply placed there. If the bucket is not empty, a hash collision has occurred and three cases need to be handled: (1) the first node has the same hash and an equal key, so its value is directly overwritten; (2) the bucket's nodes are TreeNode instances, i.e. a red-black tree (node count >= TREEIFY_THRESHOLD), so the node is inserted into the tree; (3) the bucket is a linked list (node count < TREEIFY_THRESHOLD), so the node is appended at the tail, and if the list then reaches TREEIFY_THRESHOLD (8) nodes it is converted into a red-black tree.

Note the if (++size > threshold) resize() at the end: in 1.8 the table is expanded after the new entry has been inserted, the opposite of 1.7. And once a bucket's list grows past 8 nodes (with the table capacity at least MIN_TREEIFY_CAPACITY), it is converted into a red-black tree, dropping lookup time in that bucket from O(n) to O(log n).
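The observable behavior of put can be checked with a tiny usage sketch (the keys and values are arbitrary):

```java
import java.util.HashMap;

public class PutDemo {
    public static void main(String[] args) {
        HashMap<String, Integer> map = new HashMap<>();
        System.out.println(map.put("a", 1));  // prints null: no previous mapping
        System.out.println(map.put("a", 2));  // prints 1: old value returned, then overwritten
        System.out.println(map.get("a"));     // prints 2
        System.out.println(map.put(null, 0)); // prints null: one null key is allowed (hash 0)
        System.out.println(map.size());       // prints 2
    }
}
```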

final Node<K,V>[] resize() {
    Node<K,V>[] oldTab = table;
    int oldCap = (oldTab == null) ? 0 : oldTab.length;
    int oldThr = threshold; // old table's threshold
    int newCap, newThr = 0;
    if (oldCap > 0) {
        if (oldCap >= MAXIMUM_CAPACITY) { // already at the maximum: stop growing
            threshold = Integer.MAX_VALUE;
            return oldTab;
        }
        // shift left by one: the new table is twice as large, if still below the maximum
        else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                 oldCap >= DEFAULT_INITIAL_CAPACITY)
            newThr = oldThr << 1; // double the threshold too
    }
    else if (oldThr > 0) // initial capacity was placed in threshold
        newCap = oldThr;
    else { // zero initial threshold: use the defaults
        newCap = DEFAULT_INITIAL_CAPACITY;                              // 16
        newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY); // 16 * 0.75 = 12
    }
    if (newThr == 0) { // no threshold yet, e.g. a constructor set a custom initial capacity
        float ft = (float)newCap * loadFactor;
        newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                  (int)ft : Integer.MAX_VALUE);
    }
    threshold = newThr; // commit the new threshold
    @SuppressWarnings({"rawtypes","unchecked"})
    Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
    table = newTab;
    if (oldTab != null) {
        for (int j = 0; j < oldCap; ++j) { // move every bucket to the new table
            Node<K,V> e;
            if ((e = oldTab[j]) != null) {
                oldTab[j] = null;
                if (e.next == null) // a single node: recompute its index directly
                    newTab[e.hash & (newCap - 1)] = e;
                else if (e instanceof TreeNode) // a tree bucket: split the tree
                    ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                else { // a linked list: split into a "low" and a "high" list, preserving order
                    Node<K,V> loHead = null, loTail = null;
                    Node<K,V> hiHead = null, hiTail = null;
                    Node<K,V> next;
                    do {
                        next = e.next;
                        if ((e.hash & oldCap) == 0) { // stays at index j
                            if (loTail == null)
                                loHead = e;
                            else
                                loTail.next = e;
                            loTail = e;
                        }
                        else { // moves to index j + oldCap
                            if (hiTail == null)
                                hiHead = e;
                            else
                                hiTail.next = e;
                            hiTail = e;
                        }
                    } while ((e = next) != null);
                    if (loTail != null) {
                        loTail.next = null;
                        newTab[j] = loHead;
                    }
                    if (hiTail != null) {
                        hiTail.next = null;
                        newTab[j + oldCap] = hiHead;
                    }
                }
            }
        }
    }
    return newTab; // the new hash table
}

In general, the capacity expansion process after data insertion is divided into the following steps:

(1) Check whether the current capacity has already reached the maximum; if so, do not expand and set the threshold to Integer.MAX_VALUE. (2) Otherwise, if the current capacity is below the maximum and at least the default initial capacity, double the capacity and double the threshold.

After expansion, the old table's data is transferred to the new one. For each bucket the new index is derived from the hash (hash & (length - 1), or equivalently the e.hash & oldCap test), and the bucket is moved according to whether it holds a single node, a linked list, or a red-black tree.
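A sketch of the (e.hash & oldCap) trick with made-up values (the class name and sample hashes are hypothetical):

```java
public class SplitDemo {
    public static void main(String[] args) {
        int oldCap = 16, j = 5; // old capacity and the bucket being split
        int h1 = 5;             // 0b00101: the bit at the oldCap position is 0
        int h2 = 21;            // 0b10101: the bit at the oldCap position is 1
        // both hashes were in bucket 5 of the old table
        System.out.println((h1 & (oldCap - 1)) + ", " + (h2 & (oldCap - 1))); // prints 5, 5
        int newCap = oldCap << 1; // doubled capacity: 32
        // the single bit (e.hash & oldCap) decides between index j and j + oldCap ...
        System.out.println((h1 & oldCap) == 0 ? j : j + oldCap); // prints 5
        System.out.println((h2 & oldCap) == 0 ? j : j + oldCap); // prints 21
        // ... which matches recomputing hash & (newCap - 1) from scratch
        System.out.println((h1 & (newCap - 1)) + ", " + (h2 & (newCap - 1))); // prints 5, 21
    }
}
```

So 1.8 never needs a full rehash on expansion: one bit of the existing hash tells it whether an entry stays put or moves exactly oldCap slots up.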

public V remove(Object key) {
    Node<K,V> e;
    return (e = removeNode(hash(key), key, null, false, true)) == null ?
        null : e.value;
}

final Node<K,V> removeNode(int hash, Object key, Object value,
                           boolean matchValue, boolean movable) {
    Node<K,V>[] tab; Node<K,V> p; int n, index;
    if ((tab = table) != null && (n = tab.length) > 0 &&
        (p = tab[index = (n - 1) & hash]) != null) {
        Node<K,V> node = null, e; K k; V v;
        // the first node of the bucket matches
        if (p.hash == hash &&
            ((k = p.key) == key || (key != null && key.equals(k))))
            node = p;
        else if ((e = p.next) != null) {
            if (p instanceof TreeNode) // search the red-black tree
                node = ((TreeNode<K,V>)p).getTreeNode(hash, key);
            else { // walk the linked list to find the node
                do {
                    if (e.hash == hash &&
                        ((k = e.key) == key ||
                         (key != null && key.equals(k)))) {
                        node = e;
                        break;
                    }
                    p = e;
                } while ((e = e.next) != null);
            }
        }
        // remove the node that was found
        if (node != null && (!matchValue || (v = node.value) == value ||
                             (value != null && value.equals(v)))) {
            if (node instanceof TreeNode) // tree node: remove from the tree
                ((TreeNode<K,V>)node).removeTreeNode(this, tab, movable);
            else if (node == p) // first node: the bucket now points at its next
                tab[index] = node.next;
            else // list node: unlink it
                p.next = node.next;
            ++modCount; // modification count + 1
            --size;
            afterNodeRemoval(node);
            return node;
        }
    }
    return null; // the node was not found
}

Differences and summary

In both versions the index is computed as hash & (length - 1). In JDK 1.7 the data structure is an array plus singly linked lists; insertion uses the head-insertion method, and expansion happens before inserting. In JDK 1.8 the data structure is an array plus singly linked lists plus red-black trees; insertion uses the tail-insertion method, and expansion happens after inserting. 1.8's red-black tree speeds up lookups in long buckets, and its tail insertion avoids the reversed-order infinite loop that 1.7's head insertion could produce during concurrent expansion.

What are the shortcomings of HashMap, and how can they be solved or optimized?

HashMap allows null keys and values, is not thread-safe, is unordered, and entries' storage positions change over time, i.e. whenever the table is expanded and rehashed. For thread safety, ConcurrentHashMap (or Collections.synchronizedMap) is generally used instead.

 

Why are wrapper classes like String and Integer suitable as HashMap keys?
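A hedged sketch of the usual answer: String and Integer are final and immutable, so a key's hashCode can never change after insertion (String even caches its hash). A deliberately mutable key class (MutableKey below is invented for illustration) shows what goes wrong otherwise:

```java
import java.util.HashMap;

// MutableKey is invented for illustration: mutating it after insertion
// changes its hashCode, so the stored entry can no longer be located
class MutableKey {
    int v;
    MutableKey(int v) { this.v = v; }
    @Override public boolean equals(Object o) {
        return o instanceof MutableKey && ((MutableKey) o).v == v;
    }
    @Override public int hashCode() { return v; }
}

public class MutableKeyDemo {
    public static void main(String[] args) {
        HashMap<MutableKey, String> map = new HashMap<>();
        MutableKey k = new MutableKey(1);
        map.put(k, "a");
        k.v = 2; // mutate the key in place
        System.out.println(map.get(k));                 // prints null: hashes to the wrong bucket
        System.out.println(map.get(new MutableKey(1))); // prints null: right bucket, equals now fails
    }
}
```

String and Integer never suffer this because their state, and therefore their hashCode, is fixed at construction.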

 

If the key in a HashMap is of Object type, what methods need to be implemented?
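A hedged sketch of the answer: an Object key needs hashCode() and equals() overridden consistently (and ideally should be immutable). The Point class below is invented for illustration:

```java
import java.util.HashMap;
import java.util.Objects;

// Point is invented for illustration: an immutable key that overrides
// both hashCode() and equals(), the contract HashMap relies on
final class Point {
    private final int x, y;
    Point(int x, int y) { this.x = x; this.y = y; }

    @Override public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Point)) return false;
        Point p = (Point) o;
        return x == p.x && y == p.y;
    }

    @Override public int hashCode() {
        return Objects.hash(x, y); // equal points must produce equal hashes
    }
}

public class ObjectKeyDemo {
    public static void main(String[] args) {
        HashMap<Point, String> map = new HashMap<>();
        map.put(new Point(1, 2), "a");
        // a logically equal key hashes to the same bucket and equals() confirms the match
        System.out.println(map.get(new Point(1, 2))); // prints a
    }
}
```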

 

Partly based on this article: www.jianshu.com/p/8324a3457… Really well written, much admired!

This is the first time I have seriously written an article. Please be gentle, since I did manage to write thousands of words like this. From now on I will write a tech-stack article every few days and sync it to GitHub! If there is anything bad or lacking, please point it out!

 

What the folks next door write is really just too good; it made me feel a bit outclassed, so that's it for now!