Why do interviewers love to ask about HashMap, and why do so many candidates stumble on it?

Huh? Me, not know how it works? You must be joking!

It’s just a hash plus a map: take the key’s hashCode, rehash it into an array index, and store the entry at that slot. Then the interviewer says, “OK, I get it. Go home and wait for our call!”

That is not the kind of answer interviewers want to hear. They want programmers with their own understanding, analysis, and ideas for optimization!

Guest officer, here comes the 1.7 source code!

/**
 * Inherits AbstractMap and implements the Map interface
 */
public class HashMap<K,V> extends AbstractMap<K,V>
    implements Map<K,V>, Cloneable, Serializable {

    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // 1 shifted left 4 bits = 16
    static final int MAXIMUM_CAPACITY = 1 << 30;        // 1 shifted left 30 bits = 1073741824
    static final float DEFAULT_LOAD_FACTOR = 0.75f;     // load factor 0.75
    transient Entry<K,V>[] table;  // Entry array; Entry is a singly linked list node
    transient int size;            // number of key-value mappings
    int threshold;                 // capacity * load factor, e.g. 16 * 0.75 = 12

    void resize(int newCapacity);  // expansion method (signature only; body shown below)

    static int indexFor(int h, int length) {
        return h & (length - 1);   // compute the bucket index
    }

    void createEntry(int hash, K key, V value, int bucketIndex) {
        Entry<K,V> e = table[bucketIndex];
        table[bucketIndex] = new Entry<>(hash, key, value, e); // head insertion resolves hash conflicts
        size++;
    }
}

The default initial capacity of a HashMap is 1 << 4 = 16, and the maximum capacity is 1 << 30 = 1073741824. JDK 1.7 uses an array of singly linked lists; Entry is a singly linked list node implementing Map.Entry. size is the number of entries in the map, and threshold is the size at which the next insert triggers expansion. As the resize method shows, in 1.7 expansion happens before the new entry is inserted.

Don’t worry — here is how 1.7 expands and then re-inserts!

The initial capacity is 16. When the number of entries exceeds the threshold (capacity * load factor = 12) and a hash collision occurs on insert, the table is expanded. Every capacity is a power of two, which makes the bit trick possible: the low bits of length - 1 are all 1s, so h & (length - 1) spreads keys across every bucket and reduces collisions. After expansion, int i = indexFor(e.hash, newCapacity) recomputes each index as hash & (length - 1), and the threshold is reset to newCapacity * loadFactor.
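A quick standalone sketch (the class name `IndexForDemo` is mine, not from the JDK) of why power-of-two lengths matter: with length = 16, `h & (length - 1)` masks off everything but the low four bits, which matches `h % length` for non-negative hashes and, unlike `%`, never produces a negative index:

```java
public class IndexForDemo {
    // Mirrors JDK 1.7's indexFor: correct only when length is a power of two
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        int length = 16; // default capacity; binary 10000, so length - 1 = 01111
        for (int h : new int[]{5, 21, 37, -3}) {
            System.out.println(h + " -> " + indexFor(h, length));
        }
        // 5, 21 and 37 all land in bucket 5 (21 % 16 == 5, 37 % 16 == 5),
        // and -3 maps to 13 instead of the negative value -3 % 16 would give
    }
}
```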

void resize(int newCapacity) { // called as resize(2 * table.length) when size exceeds the threshold
    Entry[] oldTable = table;            // the old hash table
    int oldCapacity = oldTable.length;   // old capacity
    if (oldCapacity == MAXIMUM_CAPACITY) { // already at the maximum: stop expanding
        threshold = Integer.MAX_VALUE;
        return;
    }

    Entry[] newTable = new Entry[newCapacity];
    boolean oldAltHashing = useAltHashing;
    useAltHashing |= sun.misc.VM.isBooted() &&
            (newCapacity >= Holder.ALTERNATIVE_HASHING_THRESHOLD);
    boolean rehash = oldAltHashing ^ useAltHashing;
    transfer(newTable, rehash);          // move the data to the new hash table
    table = newTable;                    // expansion complete
    threshold = (int)Math.min(newCapacity * loadFactor, MAXIMUM_CAPACITY + 1); // reset the threshold
}

void transfer(Entry[] newTable, boolean rehash) { // move entries from the old table to the new one
    // Walks each list in order and head-inserts into the new table,
    // which reverses the order of each list
    int newCapacity = newTable.length;
    for (Entry<K,V> e : table) {    // iterate over the buckets
        while (null != e) {
            Entry<K,V> next = e.next;
            if (rehash) {
                e.hash = null == e.key ? 0 : hash(e.key);
            }
            int i = indexFor(e.hash, newCapacity); // recompute the index
            e.next = newTable[i];
            newTable[i] = e;        // head insertion
            e = next;
        }
    }
}

1.8? That’s too hard. Hang in there! Hang in there!

Guest officer, there is still 1.8 to come!

public class HashMap<K,V> extends AbstractMap<K,V>
    implements Map<K,V>, Cloneable, Serializable {

    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4;
    static final int MAXIMUM_CAPACITY = 1 << 30;
    static final float DEFAULT_LOAD_FACTOR = 0.75f;
    // Bins are converted to trees when adding an element to a bin with at least this many nodes.
    static final int TREEIFY_THRESHOLD = 8;     // a bucket's list is converted to a red-black tree at this length
    static final int UNTREEIFY_THRESHOLD = 6;   // a red-black tree bucket shrinks back to a list below this length
    static final int MIN_TREEIFY_CAPACITY = 64; // treeification only happens once the table itself has at least this capacity; otherwise the table is resized instead
    static class Node<K,V> implements Map.Entry<K,V>;                   // list node (1.7's Entry, renamed)
    static final class TreeNode<K,V> extends LinkedHashMap.Entry<K,V>; // tree node
    transient Node<K,V>[] table;            // array of lists, some of which may become red-black trees
    transient Set<Map.Entry<K,V>> entrySet; // cached entry view, mainly used for iteration
    transient int size;                     // number of key-value mappings
    transient int modCount;                 // number of structural modifications to the map
}

As the source shows, the initial capacity is still 16 and the load factor still 0.75, but red-black trees now enter the picture. JDK 1.8 uses an array plus singly linked lists plus red-black trees, and the capacity still grows in powers of two.
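The `modCount` field above is what powers the fail-fast iterators: any structural modification during iteration is detected and aborts the loop. A small illustration (the class name `FailFastDemo` is my own):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.ConcurrentModificationException;

public class FailFastDemo {
    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        try {
            for (String k : map.keySet()) {
                map.remove("a"); // structural change bumps modCount mid-iteration
            }
        } catch (ConcurrentModificationException e) {
            // the iterator notices modCount != expectedModCount on the next step
            System.out.println("fail-fast: map was modified during iteration");
        }
    }
}
```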

The hash method:

static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}

If the key is null, its hash is 0. Otherwise the hashCode is XORed with its own high 16 bits, spreading the high bits down into the low bits so that they also take part in the index calculation.
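Why bother XORing in the high bits? Because indexing only uses the low bits of the hash. A sketch (the class name `HashSpreadDemo` and the sample hash codes are mine) of two hash codes that would collide under a plain mask but separate once spread:

```java
public class HashSpreadDemo {
    // Same perturbation as JDK 1.8's HashMap.hash()
    static int hash(int rawHashCode) {
        return rawHashCode ^ (rawHashCode >>> 16);
    }

    public static void main(String[] args) {
        // Two hash codes that differ only above bit 15...
        int a = 0x00010001, b = 0x00020001;
        int mask = 16 - 1; // default capacity 16
        // ...collide if we mask the raw value directly: both give index 1
        System.out.println((a & mask) + " vs " + (b & mask));
        // ...but the spread hash mixes the high bits down: indexes 0 and 3
        System.out.println((hash(a) & mask) + " vs " + (hash(b) & mask));
    }
}
```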

The put method:

public V put(K key, V value) {
    return putVal(hash(key), key, value, false, true);
}

final V putVal(int hash, K key, V value, boolean onlyIfAbsent, boolean evict) {
    Node<K,V>[] tab; Node<K,V> p; int n, i;
    // The table is lazily allocated: the first put triggers resize()
    if ((tab = table) == null || (n = tab.length) == 0)
        n = (tab = resize()).length;
    if ((p = tab[i = (n - 1) & hash]) == null) // (length - 1) & hash gives the index; is the bucket empty?
        tab[i] = newNode(hash, key, value, null); // empty bucket: place the node directly
    else { // the bucket is occupied: a hash conflict
        Node<K,V> e; K k;
        // Case 1: the first node has the same hash and an equal key; e = p records it
        if (p.hash == hash && ((k = p.key) == key || (key != null && key.equals(k))))
            e = p;
        // Case 2: the bucket holds a red-black tree; insert into the tree
        else if (p instanceof TreeNode)
            e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
        else { // Case 3: an ordinary linked list; walk it
            for (int binCount = 0; ; ++binCount) {
                if ((e = p.next) == null) { // reached the tail: append the new node
                    p.next = newNode(hash, key, value, null);
                    if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                        treeifyBin(tab, hash); // list has reached 8 nodes: try to treeify
                    break;
                }
                if (e.hash == hash && ((k = e.key) == key || (key != null && key.equals(k))))
                    break; // found an existing node with the same key
                p = e;
            }
        }
        if (e != null) { // the key already exists: overwrite the old value
            V oldValue = e.value;
            if (!onlyIfAbsent || oldValue == null)
                e.value = value;
            afterNodeAccess(e);
            return oldValue;
        }
    }
    ++modCount; // structural modification count goes up by one
    if (++size > threshold)
        resize(); // size grew by one; expand if it exceeds the threshold
    afterNodeInsertion(evict); // insert succeeded
    return null;
}

If the bucket is empty, the node is placed there directly. If the bucket is occupied, there is a conflict, handled in three cases: ① the first node has the same hash and an equal key, so its value is simply overwritten; ② the bucket is a red-black tree, so the node is inserted into the tree; ③ the bucket is a linked list, which is walked to the tail before appending. In case ③, if the append makes the list reach TREEIFY_THRESHOLD (8) nodes, the bucket is converted to a red-black tree.
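The `onlyIfAbsent` flag in `putVal` surfaces in the public API as `putIfAbsent`. A short usage sketch of the overwrite behavior described above:

```java
import java.util.HashMap;

public class PutDemo {
    public static void main(String[] args) {
        HashMap<String, Integer> map = new HashMap<>();
        System.out.println(map.put("a", 1));         // null: no previous mapping
        System.out.println(map.put("a", 2));         // 1: old value returned, then overwritten
        System.out.println(map.putIfAbsent("a", 9)); // 2: onlyIfAbsent path, no overwrite
        System.out.println(map.get("a"));            // 2
    }
}
```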

Note that the size check if (++size > threshold) runs after the insert, so unlike 1.7, in 1.8 expansion happens after the data goes in. And when a bucket's list grows to 8 nodes, it is converted to a red-black tree, dropping lookup within that bucket from O(n) to O(log n).

final Node<K,V>[] resize() { // expansion method
    Node<K,V>[] oldTab = table;                    // the old hash table
    int oldCap = (oldTab == null) ? 0 : oldTab.length;
    int oldThr = threshold;                        // old threshold
    int newCap, newThr = 0;
    if (oldCap > 0) {
        if (oldCap >= MAXIMUM_CAPACITY) {          // already at the maximum
            threshold = Integer.MAX_VALUE;         // threshold becomes the largest int
            return oldTab;                         // cannot expand further
        }
        // Shift left one bit to double the capacity, as long as the result stays
        // below the maximum and the old capacity is at least the default (16)
        else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                 oldCap >= DEFAULT_INITIAL_CAPACITY)
            newThr = oldThr << 1;                  // double the threshold too
    }
    else if (oldThr > 0) // initial capacity was placed in threshold
        newCap = oldThr;
    else {               // zero threshold: fall back to all defaults
        newCap = DEFAULT_INITIAL_CAPACITY;                              // 16
        newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY); // 16 * 0.75 = 12
    }
    if (newThr == 0) {   // a constructor-supplied capacity can leave the threshold unset
        float ft = (float)newCap * loadFactor;
        newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                  (int)ft : Integer.MAX_VALUE);
    }
    threshold = newThr;  // commit the new threshold
    @SuppressWarnings({"rawtypes","unchecked"})
        Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
    table = newTab;      // install the new table
    if (oldTab != null) {
        for (int j = 0; j < oldCap; ++j) {
            Node<K,V> e;
            if ((e = oldTab[j]) != null) {         // bucket j is non-empty
                oldTab[j] = null;
                if (e.next == null)                // a single node: just rehash it
                    newTab[e.hash & (newCap - 1)] = e;
                else if (e instanceof TreeNode)    // a tree: split it across the new table
                    ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                else { // a list: split it into "low" and "high" lists, preserving order
                    Node<K,V> loHead = null, loTail = null;
                    Node<K,V> hiHead = null, hiTail = null;
                    Node<K,V> next;
                    do {
                        next = e.next;
                        if ((e.hash & oldCap) == 0) { // this bit is 0: index unchanged
                            if (loTail == null)
                                loHead = e;
                            else
                                loTail.next = e;
                            loTail = e;
                        }
                        else {                        // bit is 1: index moves up by oldCap
                            if (hiTail == null)
                                hiHead = e;
                            else
                                hiTail.next = e;
                            hiTail = e;
                        }
                    } while ((e = next) != null);
                    if (loTail != null) {
                        loTail.next = null;
                        newTab[j] = loHead;
                    }
                    if (hiTail != null) {
                        hiTail.next = null;
                        newTab[j + oldCap] = hiHead;
                    }
                }
            }
        }
    }
    return newTab; // return the new hash table
}


In general, the capacity expansion process after data insertion is divided into the following steps:

① Check whether the current capacity has reached the maximum. If so, do not expand; just set the threshold to the largest integer. ② If the current capacity is below the maximum and at least the initial capacity, double both the capacity and the threshold.

After the new table is allocated, the old entries are transferred. Each node's new index comes out of hash & (length - 1): a lone node is rehashed directly, a red-black tree bucket is split across the new table, and a linked list bucket is divided into a low list (same index) and a high list (index + oldCap) while preserving node order.
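The lo/hi split rests on one bit: since newCap = oldCap * 2, a node's new index either equals its old index or equals old index + oldCap, depending on whether `hash & oldCap` is 0. A standalone sketch (the class name `ResizeSplitDemo` is mine) verifying the rule against a full recomputation:

```java
public class ResizeSplitDemo {
    public static void main(String[] args) {
        int oldCap = 16, newCap = 32;
        for (int hash : new int[]{5, 21, 37, 53}) {
            int oldIndex = hash & (oldCap - 1);
            // JDK 1.8 rule: bit at oldCap is 0 -> node stays; 1 -> moves up by oldCap
            int newIndex = ((hash & oldCap) == 0) ? oldIndex : oldIndex + oldCap;
            // Cross-check against recomputing the index from scratch
            System.out.println(hash + ": " + oldIndex + " -> " + newIndex
                    + " (full recompute: " + (hash & (newCap - 1)) + ")");
        }
        // All four hashes share old bucket 5; 21 and 53 move to bucket 21,
        // 5 and 37 stay at bucket 5 -- matching hash & (newCap - 1) every time
    }
}
```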

public V remove(Object key) {
    Node<K,V> e; // if a node was removed, return its value, otherwise null
    return (e = removeNode(hash(key), key, null, false, true)) == null ?
        null : e.value;
}

final Node<K,V> removeNode(int hash, Object key, Object value,
                           boolean matchValue, boolean movable) {
    Node<K,V>[] tab; Node<K,V> p; int n, index;
    if ((tab = table) != null && (n = tab.length) > 0 &&
        (p = tab[index = (n - 1) & hash]) != null) {
        Node<K,V> node = null, e; K k; V v;
        if (p.hash == hash && ((k = p.key) == key || (key != null && key.equals(k))))
            node = p; // the first node is the one to delete
        else if ((e = p.next) != null) { // otherwise search past the first node
            if (p instanceof TreeNode)   // red-black tree bucket
                node = ((TreeNode<K,V>)p).getTreeNode(hash, key);
            else { // linked list bucket: walk it to find the node
                do {
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k)))) {
                        node = e;
                        break;
                    }
                    p = e;
                } while ((e = e.next) != null);
            }
        }
        // node found (and, if matchValue is set, its value matches too)
        if (node != null && (!matchValue || (v = node.value) == value ||
                             (value != null && value.equals(v)))) {
            if (node instanceof TreeNode)  // remove from the red-black tree
                ((TreeNode<K,V>)node).removeTreeNode(this, tab, movable);
            else if (node == p)            // first node: the bucket now points to its successor
                tab[index] = node.next;
            else                           // list node: unlink it by rewiring next
                p.next = node.next;
            ++modCount; // structural modification count goes up by one
            --size;     // one fewer element
            afterNodeRemoval(node);
            return node;
        }
    }
    return null; // no matching node found
}

Differences and summary

Both versions compute the bucket index as hash & (length - 1). In JDK 1.7 the data structure is an array plus singly linked lists, and new nodes are head-inserted; in JDK 1.8 it is an array plus singly linked lists plus red-black trees, and new nodes are tail-inserted. Tail insertion in 1.8 avoids the list reversal that, under concurrent expansion in 1.7, could form a cycle and cause an infinite loop.

What are the shortcomings of HashMap, and how can they be solved or optimized?

A HashMap allows null keys and values, is not thread-safe, is unordered, and the storage position of entries changes over time (i.e. across resizes).
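Two of those points can be seen directly (the class name `HashMapCaveats` is my own). HashMap tolerates one null key (hashed to 0, i.e. bucket 0) and null values, while ConcurrentHashMap, the usual thread-safe replacement, rejects nulls outright:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class HashMapCaveats {
    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        map.put(null, 0);                  // one null key is allowed (hash is 0)
        map.put("k", null);                // null values are allowed too
        System.out.println(map.get(null)); // 0

        Map<String, Integer> safe = new ConcurrentHashMap<>();
        try {
            safe.put(null, 0);
        } catch (NullPointerException e) {
            System.out.println("ConcurrentHashMap forbids null keys");
        }
    }
}
```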

Why are String, Integer, and the other wrapper classes suitable as HashMap keys?

If the key of a HashMap is a custom Object type, what methods does it need to implement?
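The usual answer is `equals` and `hashCode`, implemented consistently with each other. A sketch (the `Point` class is hypothetical, written badly on purpose) of what goes wrong when a key's hashCode changes after insertion, and why immutable classes like String and Integer, whose hashCode never changes and is even cached, make reliable keys:

```java
import java.util.HashMap;
import java.util.Objects;

public class MutableKeyDemo {
    // A key whose hashCode depends on mutable state -- a bad idea, shown deliberately
    static class Point {
        int x;
        Point(int x) { this.x = x; }
        @Override public boolean equals(Object o) {
            return o instanceof Point && ((Point) o).x == x;
        }
        @Override public int hashCode() { return Objects.hash(x); }
    }

    public static void main(String[] args) {
        HashMap<Point, String> map = new HashMap<>();
        Point p = new Point(1);
        map.put(p, "here");
        p.x = 2;                        // mutating the key changes its hash...
        System.out.println(map.get(p)); // ...so the lookup probes the wrong bucket: prints null
    }
}
```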

Part of the content borrowed from the website
www.jianshu.com/p/8324a3457…

This is the first time I have written an article seriously, so please bear with me — it has run to several thousand words. From now on, I will write a tech-stack article every few days and sync it to GitHub! If anything is wrong or missing, please point it out!