1. Introduction

HashMap is the main implementation class of Map. It is not thread-safe, so it can be problematic in multi-threaded environments, but it is efficient; both keys and values may be null, and the underlying implementation is based on a hash algorithm.

This article analyzes the HashMap source code based on JDK 1.8. In JDK 1.7, HashMap's underlying structure was an array plus linked lists. Compared with JDK 1.7, JDK 1.8 introduced red-black trees to optimize long linked lists, whose lookups are inefficient, rewrote the resize method, and removed the alternative-hashing related methods to avoid recalculating key hashes.

2. Principle

The bottom layer of HashMap is implemented with the separate-chaining (zipper) hash algorithm: an array plus linked lists, with red-black trees added in JDK 1.8. When HashMap adds or deletes an element, it first obtains the bucket index from the element's hash value modulo the table length, and then locates the element in that bucket's linked list.
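As a minimal sketch (class and method names here are mine, not from the JDK), the index calculation described above works because the table length is always a power of two, so a bitwise AND with length - 1 equals the modulo:

```java
// Demonstrates that for a power-of-two table length n,
// (n - 1) & hash is equivalent to hash % n for non-negative hashes.
public class BucketIndexDemo {
    static int indexFor(int hash, int tableLength) {
        // tableLength is assumed to be a power of two, as in HashMap
        return (tableLength - 1) & hash;
    }

    public static void main(String[] args) {
        int n = 16;     // DEFAULT_INITIAL_CAPACITY
        int hash = 185;
        System.out.println(indexFor(hash, n)); // 9, same as 185 % 16
        System.out.println(185 % 16);          // 9
    }
}
```

The AND is a single cheap instruction, which is one reason HashMap keeps its capacity a power of two.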

When the number of nodes in the linked list at an index position of the HashMap array reaches 8 and the current array length is at least 64, all data at that index position is stored in a red-black tree instead.

After new HashMap() is instantiated, the transient Node<K,V>[] table array is not yet created. In JDK 1.8 the table is not allocated at construction time; the array is created when the put method is first called.

3. Source code analysis

Let’s start with some common variables in HashMap source code

  • DEFAULT_INITIAL_CAPACITY: the default capacity of the HashMap, 16
  • MAXIMUM_CAPACITY: the maximum supported capacity of the HashMap, 2^30
  • TREEIFY_THRESHOLD: if the length of a bucket's linked list exceeds this value (8), the bucket is converted to a red-black tree
  • UNTREEIFY_THRESHOLD: if the number of nodes stored in a bucket's red-black tree falls below this value (6), the tree is converted back to a linked list
  • MIN_TREEIFY_CAPACITY: the minimum hash table capacity (64) before buckets may be treeified
  • table: the array of buckets; its length is always a power of two
  • entrySet: a set view of the stored key-value pairs
  • size: the number of key-value pairs stored in the HashMap
  • modCount: the number of expansions and structural changes to the HashMap
  • threshold: the resize threshold, = capacity × load factor; if the size of the HashMap exceeds this value, the table is expanded
  • loadFactor: the load factor
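For context on how these fields are used, both put and get first run the key's hashCode through HashMap's hash method, which XORs the high 16 bits into the low 16 bits so that high bits also influence the (n - 1) & hash index. The sketch below is paraphrased from the JDK 1.8 source (the class name is mine):

```java
public class HashSpreadDemo {
    // Mirrors HashMap.hash(Object) in JDK 1.8: a null key hashes to 0
    // (bucket 0); otherwise the top 16 bits of hashCode() are XORed
    // into the bottom 16 bits to reduce collisions in small tables.
    static int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    public static void main(String[] args) {
        System.out.println(hash(null));    // 0
        System.out.println(hash("a"));     // 97 ("a".hashCode() is 97, high bits are 0)
    }
}
```

This is why a single null key is allowed: it always lands in bucket 0 without calling hashCode().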

3.1 Construction method

Let’s first look at the constructor of the HashMap

/** Sets the load factor to the default value of 0.75; the most commonly used constructor **/
public HashMap() {
    this.loadFactor = DEFAULT_LOAD_FACTOR; // all other fields defaulted
}

/** Delegates to the two-argument constructor below **/
public HashMap(int initialCapacity) {
    this(initialCapacity, DEFAULT_LOAD_FACTOR);
}


/** Constructs a HashMap with the given initial capacity and load factor **/
public HashMap(int initialCapacity, float loadFactor) {
    if (initialCapacity < 0)
        throw new IllegalArgumentException("Illegal initial capacity: " +
                                           initialCapacity);
    if (initialCapacity > MAXIMUM_CAPACITY)
        initialCapacity = MAXIMUM_CAPACITY;
    if (loadFactor <= 0 || Float.isNaN(loadFactor))
        throw new IllegalArgumentException("Illegal load factor: " +
                                           loadFactor);
    this.loadFactor = loadFactor;
    this.threshold = tableSizeFor(initialCapacity);
}


/** Copies the mappings from another Map into this HashMap's own storage structure **/
public HashMap(Map<? extends K, ? extends V> m) {
    this.loadFactor = DEFAULT_LOAD_FACTOR;
    putMapEntries(m, false);
}
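Note that the two-argument constructor stores tableSizeFor(initialCapacity) in threshold; tableSizeFor rounds the requested capacity up to the next power of two by smearing the highest set bit downwards. The sketch below is paraphrased from the JDK 1.8 source (the class name is mine):

```java
public class TableSizeForDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Mirrors HashMap.tableSizeFor in JDK 1.8: copies the highest set bit
    // of (cap - 1) into every lower bit position, then adds 1, yielding
    // the smallest power of two >= cap.
    static int tableSizeFor(int cap) {
        int n = cap - 1;
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    public static void main(String[] args) {
        System.out.println(tableSizeFor(10)); // 16
        System.out.println(tableSizeFor(16)); // 16 (already a power of two)
        System.out.println(tableSizeFor(17)); // 32
    }
}
```

The initial `cap - 1` is what makes an exact power of two map to itself rather than to the next one up.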


3.2 PUT Insert method

First, let’s take a look at the source code:

public V put(K key, V value) {
    return putVal(hash(key), key, value, false, true);
}


final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
               boolean evict) {
    Node<K,V>[] tab; Node<K,V> p; int n, i;
    // Initialize the bucket array table if this is the first put
    if ((tab = table) == null || (n = tab.length) == 0)
        n = (tab = resize()).length;
    // If the bucket holds no node reference yet, store the new key-value node
    // directly; (n - 1) & hash is equivalent to hash mod length and locates
    // the key's bucket in the array
    if ((p = tab[i = (n - 1) & hash]) == null)
        tab[i] = newNode(hash, key, value, null);
    else {
        Node<K,V> e; K k;
        // If the key and hash match the first node in the list, e points to that node
        if (p.hash == hash &&
            ((k = p.key) == key || (key != null && key.equals(k))))
            e = p;
        // If the node in the bucket is a tree node, insert via the red-black tree's put method
        else if (p instanceof TreeNode)
            e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
        else {
            // Walk the linked list, counting its length
            for (int binCount = 0; ; ++binCount) {
                // If the list does not contain the key, append a new node at the tail
                if ((e = p.next) == null) {
                    p.next = newNode(hash, key, value, null);
                    // If the list length reaches the treeify threshold, convert it to a red-black tree
                    if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                        treeifyBin(tab, hash);
                    break;
                }
                // Check whether the list already contains the key to be inserted
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    break;
                p = e;
            }
        }
        // The key-value pair to be inserted already exists in the HashMap
        if (e != null) { // existing mapping for key
            V oldValue = e.value;
            if (!onlyIfAbsent || oldValue == null)
                e.value = value;
            afterNodeAccess(e);
            return oldValue;
        }
    }
    ++modCount;
    // If the number of stored key-value pairs exceeds the resize threshold, expand the table
    if (++size > threshold)
        resize();
    afterNodeInsertion(evict);
    return null;
}

When the bucket array table is empty, the resize() method is called to initialize it. If the bucket at the computed index holds no node reference, the key-value pair is saved directly into the bucket.

If the bucket already holds data, and the first node's hash and key match those of the key being added, that node's value is replaced.

If the node in the bucket is a tree node, the red-black tree's put method is called directly to insert. Otherwise, the linked list is traversed and the key-value pair is appended to its tail. If the list length after insertion reaches TREEIFY_THRESHOLD, the list is converted to a red-black tree. Finally, if the updated size of the HashMap exceeds the threshold, the table is expanded.
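A quick usage check of the behavior described above: put returns the previous mapping (or null), and a null key is permitted:

```java
import java.util.HashMap;

public class PutDemo {
    public static void main(String[] args) {
        HashMap<String, Integer> map = new HashMap<>();
        System.out.println(map.put("a", 1));   // null: there was no previous mapping
        System.out.println(map.put("a", 2));   // 1: the old value is returned
        System.out.println(map.put(null, 3));  // null: a null key is allowed (bucket 0)
        System.out.println(map.get(null));     // 3
        System.out.println(map.size());        // 2: keys "a" and null
    }
}
```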

3.3 GET Search method

Source:


public V get(Object key) {
    Node<K,V> e;
    return (e = getNode(hash(key), key)) == null ? null : e.value;
}


final Node<K,V> getNode(int hash, Object key) {
    Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
    // Locate the bucket the key-value pair would reside in; it must not be empty
    if ((tab = table) != null && (n = tab.length) > 0 &&
        (first = tab[(n - 1) & hash]) != null) {
        // If the first node's hash and key match, return it directly
        if (first.hash == hash && // always check first node
            ((k = first.key) == key || (key != null && key.equals(k))))
            return first;
        if ((e = first.next) != null) {
            // If the first node is a TreeNode, use the red-black tree lookup
            if (first instanceof TreeNode)
                return ((TreeNode<K,V>)first).getTreeNode(hash, key);
            // Otherwise walk through the linked list to find the key
            do {
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    return e;
            } while ((e = e.next) != null);
        }
    }
    return null;
}


First check whether the hash and key of the first node in the bucket equal (via == or equals) the hash and key being searched for; if so, return that node directly. If the first node is a TreeNode, the red-black tree lookup method is called. Otherwise, the bucket's linked list is traversed.
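One practical consequence of null values being allowed (noted in the introduction) is that get returning null is ambiguous; containsKey distinguishes "mapped to null" from "absent":

```java
import java.util.HashMap;

public class GetDemo {
    public static void main(String[] args) {
        HashMap<String, Integer> map = new HashMap<>();
        map.put("a", null);
        // get() == null does not imply the key is absent
        System.out.println(map.get("a"));          // null (mapped to null)
        System.out.println(map.get("b"));          // null (not present)
        System.out.println(map.containsKey("a"));  // true
        System.out.println(map.containsKey("b"));  // false
    }
}
```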

3.4 REMOVE Deletion method

Source:

public V remove(Object key) {
    Node<K,V> e;
    return (e = removeNode(hash(key), key, null, false, true)) == null ?
        null : e.value;
}

final Node<K,V> removeNode(int hash, Object key, Object value,
                           boolean matchValue, boolean movable) {
    Node<K,V>[] tab; Node<K,V> p; int n, index;
    if ((tab = table) != null && (n = tab.length) > 0 &&
        (p = tab[index = (n - 1) & hash]) != null) {
        Node<K,V> node = null, e; K k; V v;
        if (p.hash == hash &&
            ((k = p.key) == key || (key != null && key.equals(k))))
            node = p;
        else if ((e = p.next) != null) {
            if (p instanceof TreeNode)
                node = ((TreeNode<K,V>)p).getTreeNode(hash, key);
            else {
                do {
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k)))) {
                        node = e;
                        break;
                    }
                    p = e;
                } while ((e = e.next) != null);
            }
        }
        // Delete the node and repair the linked list or red-black tree
        if (node != null && (!matchValue || (v = node.value) == value ||
                             (value != null && value.equals(v)))) {
            if (node instanceof TreeNode)
                ((TreeNode<K,V>)node).removeTreeNode(this, tab, movable);
            // Relink the list around the removed node
            else if (node == p)
                tab[index] = node.next;
            else
                p.next = node.next;
            ++modCount;
            --size;
            afterNodeRemoval(node);
            return node;
        }
    }
    return null;
}


Deletion works much like lookup, with the extra steps of removing the node and repairing the linked list or red-black tree, so it is not described further here.
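As with put, a quick usage check: remove returns the value that was removed, or null when nothing matched:

```java
import java.util.HashMap;

public class RemoveDemo {
    public static void main(String[] args) {
        HashMap<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        System.out.println(map.remove("a"));  // 1: the removed value
        System.out.println(map.remove("a"));  // null: nothing left to remove
        System.out.println(map.size());       // 0
    }
}
```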

3.5 Capacity expansion (resize) mechanism

The length of a HashMap is dynamic thanks to its expansion mechanism. The bucket array's length is always a power of two (2^N), and the threshold is bucket array length × load factor. When the number of entries exceeds the threshold, the table is expanded: the capacity doubles relative to the current bucket array length, the threshold doubles as well, and after expansion the key-value pairs are recalculated and moved to their proper positions.

Note: Why is the load factor 0.75 by default?

When the load factor is 1, there are many hash collisions, the underlying linked lists and red-black trees become deep, and query efficiency drops: time is sacrificed for space utilization. When the load factor is 0.5, hash conflicts are reduced, but a large amount of space is wasted. The compromise is 0.75: space utilization is relatively high while a considerable number of hash conflicts are avoided, keeping the underlying linked lists and red-black trees short.
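The threshold arithmetic described above, with the default values, works out as follows (a sketch; the class name is mine):

```java
public class ThresholdDemo {
    public static void main(String[] args) {
        float loadFactor = 0.75f; // DEFAULT_LOAD_FACTOR
        int capacity = 16;        // DEFAULT_INITIAL_CAPACITY
        int threshold = (int) (capacity * loadFactor);
        System.out.println(threshold);      // 12: inserting a 13th entry triggers resize
        // After one resize, both capacity and threshold double
        System.out.println(capacity << 1);  // 32
        System.out.println(threshold << 1); // 24
    }
}
```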

Source:


final Node<K,V>[] resize() {
    Node<K,V>[] oldTab = table;
    int oldCap = (oldTab == null) ? 0 : oldTab.length;
    int oldThr = threshold;
    int newCap, newThr = 0;
    if (oldCap > 0) {
        if (oldCap >= MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return oldTab;
        }
        else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                 oldCap >= DEFAULT_INITIAL_CAPACITY)
            newThr = oldThr << 1; // double threshold
    }
    else if (oldThr > 0) // initial capacity was placed in threshold
        newCap = oldThr;
    else {               // zero initial threshold signifies using defaults
        newCap = DEFAULT_INITIAL_CAPACITY;
        newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
    }
    if (newThr == 0) {
        float ft = (float)newCap * loadFactor;
        newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                  (int)ft : Integer.MAX_VALUE);
    }
    threshold = newThr;
    @SuppressWarnings({"rawtypes","unchecked"})
        Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
    table = newTab;
    if (oldTab != null) {
        for (int j = 0; j < oldCap; ++j) {
            Node<K,V> e;
            if ((e = oldTab[j]) != null) {
                oldTab[j] = null;
                if (e.next == null)
                    newTab[e.hash & (newCap - 1)] = e;
                else if (e instanceof TreeNode)
                    ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                else { // preserve order
                    Node<K,V> loHead = null, loTail = null;
                    Node<K,V> hiHead = null, hiTail = null;
                    Node<K,V> next;
                    do {
                        next = e.next;
                        if ((e.hash & oldCap) == 0) {
                            if (loTail == null)
                                loHead = e;
                            else
                                loTail.next = e;
                            loTail = e;
                        }
                        else {
                            if (hiTail == null)
                                hiHead = e;
                            else
                                hiTail.next = e;
                            hiTail = e;
                        }
                    } while ((e = next) != null);
                    if (loTail != null) {
                        loTail.next = null;
                        newTab[j] = loHead;
                    }
                    if (hiTail != null) {
                        hiTail.next = null;
                        newTab[j + oldCap] = hiHead;
                    }
                }
            }
        }
    }
    return newTab;
}
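The loHead/hiHead split in resize relies on the fact that, after the capacity doubles, an element's new index is either its old index j (when e.hash & oldCap == 0) or j + oldCap, so no full rehash is needed. A sketch of that arithmetic (the class name is mine):

```java
public class ResizeSplitDemo {
    public static void main(String[] args) {
        int oldCap = 16;
        int newCap = oldCap << 1;
        int h1 = 5;   // binary 00101: the oldCap bit is 0 -> index unchanged
        int h2 = 21;  // binary 10101: the oldCap bit is 1 -> index moves by oldCap
        System.out.println((h1 & (newCap - 1)) == (h1 & (oldCap - 1)));          // true
        System.out.println((h2 & (newCap - 1)) == (h2 & (oldCap - 1)) + oldCap); // true
        System.out.println((h1 & oldCap) == 0);  // true: goes to the "lo" list
        System.out.println((h2 & oldCap) == 0);  // false: goes to the "hi" list
    }
}
```

The new index depends only on the single extra bit that newCap - 1 exposes, which is exactly what (e.hash & oldCap) tests.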

3.6 Treeifying a linked list

A red-black tree is a self-balancing binary search tree.

Tree source code:

static final int TREEIFY_THRESHOLD = 8;

/** When the bucket array's capacity is smaller than this value, the array is
 * expanded first instead of being treeified */
static final int MIN_TREEIFY_CAPACITY = 64;

static final class TreeNode<K,V> extends LinkedHashMap.Entry<K,V> {
    TreeNode<K,V> parent;  // red-black tree links
    TreeNode<K,V> left;
    TreeNode<K,V> right;
    TreeNode<K,V> prev;    // needed to unlink next upon deletion
    boolean red;
    TreeNode(int hash, K key, V val, Node<K,V> next) {
        super(hash, key, val, next);
    }
}

/** Converts an ordinary node list to a tree node list */
final void treeifyBin(Node<K,V>[] tab, int hash) {
    int n, index; Node<K,V> e;
    // If the bucket array's capacity is smaller than MIN_TREEIFY_CAPACITY, resize instead
    if (tab == null || (n = tab.length) < MIN_TREEIFY_CAPACITY)
        resize();
    else if ((e = tab[index = (n - 1) & hash]) != null) {
        // hd for head, tl for tail
        TreeNode<K,V> hd = null, tl = null;
        do {
            // Replace each ordinary node with a tree node
            TreeNode<K,V> p = replacementTreeNode(e, null);
            if (tl == null)
                hd = p;
            else {
                p.prev = tl;
                tl.next = p;
            }
            tl = p;
        } while ((e = e.next) != null);  // the ordinary list is now a tree-node list
        if ((tab[index] = hd) != null)
            // Convert the tree-node list to a red-black tree
            hd.treeify(tab);
    }
}

TreeNode<K,V> replacementTreeNode(Node<K,V> p, Node<K,V> next) {
    return new TreeNode<>(p.hash, p.key, p.value, next);
}

A list is treeified only when its length reaches 8 and the bucket array's capacity is at least 64. When the bucket array is small, hash collisions are more likely and lists grow long; so while the capacity is small, the bucket array is expanded first to avoid unnecessary treeification.

In addition, if the bucket capacity is small, expansion is frequent, and during expansion a red-black tree has to be split and remapped. So turning a long list into a red-black tree while the bucket capacity is still small is a thankless task.

See the blog: HashMap source code in detail (JDK 1.8) – SegmentFault