
Why do interviews always ask about HashMaps

How it started

A senior student once told me that studying HashMap carefully would pay off: learn not just how to use it, but how it is implemented. Young and stubborn, I figured knowing how to use it was enough. Then an interview taught me a vivid lesson; if I could do it over, I'd pick Han Xin and grab the red buff first... As fresh graduates we have little project experience, so companies test us on fundamentals instead, and the internals of HashMap touch arrays, linked lists, and red-black trees all at once. That makes HashMap the hardest-hit area of any interview: it tests not only whether you know the basics, but how deeply you understand them.

Interview flashback

Interviewer: Would you like to introduce yourself first?
Me: I'm from King's Valley College, good at Java... A HashMap is built on an array, linked lists, and red-black trees, and it stores data as key-value pairs.
Interviewer: Good, understood. Do you have any questions for me? No? Then that's all for today; we'll notify you of the result.
Me: Nailed it, surely!
Three minutes later: "Thank you for your application. Your resume has been added to our talent pool; perhaps fate will bring us together again."
Me: Beautiful.

The underlying implementation of HashMap

Still didn't understand anything from that? Then let me walk you through it in detail.

Defined constants and their use

On to the code

public class HashMap<K,V> extends AbstractMap<K,V>
    implements Map<K,V>, Cloneable, Serializable {

    private static final long serialVersionUID = 362498820763181265L;
    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // default initial capacity (16)
    static final int MAXIMUM_CAPACITY = 1 << 30;        // maximum capacity
    static final float DEFAULT_LOAD_FACTOR = 0.75f;     // default load factor
    static final int TREEIFY_THRESHOLD = 8;             // list length at which a bin becomes a tree
    static final int UNTREEIFY_THRESHOLD = 6;           // tree size at which a bin reverts to a list
    static final int MIN_TREEIFY_CAPACITY = 64;         // minimum table capacity before treeifying
public abstract class AbstractMap<K,V> implements Map<K,V> {

HashMap extends AbstractMap, which in turn implements the Map interface (see, I'm not lazy, I did check the parent class).

Then there are these constants (long-winded, I know); the English names mostly speak for themselves, but here is what each one means:

- DEFAULT_INITIAL_CAPACITY: the default initial table size, 16.
- MAXIMUM_CAPACITY: the maximum capacity, 1073741824.
- DEFAULT_LOAD_FACTOR: the default load factor, 0.75.
- TREEIFY_THRESHOLD: when a bucket's linked list grows past 8 entries, it is converted to a red-black tree.
- UNTREEIFY_THRESHOLD: when a red-black tree shrinks below 6 nodes, it is converted back to a linked list.
- MIN_TREEIFY_CAPACITY: the minimum table capacity (64) before any bucket is treeified. This avoids unnecessary conversions when several key-value pairs happen to land in the same bucket while the hash table is still small.

I had no idea how big 1 << 30 was, so I checked: 1,073,741,824.

So what are these constants for? Let's move on.

 public HashMap() {
        this.loadFactor = DEFAULT_LOAD_FACTOR; // all other fields defaulted
    }

When we write new HashMap() with no arguments, the load factor is initialized to the default 0.75.

 public HashMap(int initialCapacity, float loadFactor) {
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal initial capacity: " +
                                               initialCapacity);
        if (initialCapacity > MAXIMUM_CAPACITY)
            initialCapacity = MAXIMUM_CAPACITY;
        if (loadFactor <= 0 || Float.isNaN(loadFactor))
            throw new IllegalArgumentException("Illegal load factor: " +
                                               loadFactor);
        this.loadFactor = loadFactor;
        this.threshold = tableSizeFor(initialCapacity);
    }

We can also pass parameters ourselves: an initial capacity and a load factor. An oversized capacity is not rejected; it is simply capped at MAXIMUM_CAPACITY. Note also that the constructor stores tableSizeFor(initialCapacity) into threshold: the requested capacity is rounded up to the next power of two.
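That rounding-up is done by tableSizeFor. Here is a minimal re-implementation mirroring the JDK 8 version (the real method is private to HashMap, so this standalone copy is just for playing with):

```java
public class TableSizeDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Mirrors JDK 8's HashMap.tableSizeFor: smallest power of two >= cap.
    // The shifts smear the highest set bit into every lower position,
    // producing an all-ones pattern; adding 1 yields the next power of two.
    static int tableSizeFor(int cap) {
        int n = cap - 1;
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    public static void main(String[] args) {
        System.out.println(tableSizeFor(10)); // 16
        System.out.println(tableSizeFor(16)); // 16
        System.out.println(tableSizeFor(17)); // 32
    }
}
```

So asking for a capacity of 10 actually gets you a table of 16.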

 static class Node<K,V> implements Map.Entry<K,V> {
        final int hash;       // the key's (perturbed) hash value
        final K key;          // key
        V value;              // value
        Node<K,V> next;       // pointer to the next node in the bucket

Under the hood, HashMap defines a Node class implementing the Map.Entry interface; each bucket's collision chain is a linked list built from these nodes.
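To make the chaining concrete, here is a simplified, self-contained sketch of such a node (the real JDK class also implements Map.Entry's full contract; this stripped-down version only shows how nodes link into a bucket chain):

```java
import java.util.Objects;

public class NodeDemo {
    // Simplified sketch of HashMap's internal Node
    static class Node<K, V> {
        final int hash;      // cached hash of the key
        final K key;
        V value;
        Node<K, V> next;     // next node in the same bucket (collision chain)

        Node(int hash, K key, V value, Node<K, V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    public static void main(String[] args) {
        // Two entries colliding into one bucket form a singly linked list
        Node<String, Integer> second = new Node<>(Objects.hashCode("b"), "b", 2, null);
        Node<String, Integer> first  = new Node<>(Objects.hashCode("a"), "a", 1, second);

        int length = 0;
        for (Node<String, Integer> n = first; n != null; n = n.next) length++;
        System.out.println(length); // 2
    }
}
```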

The put() method, explained

 public V put(K key, V value) {
        return putVal(hash(key), key, value, false, true);
    }

 final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        Node<K,V>[] tab; Node<K,V> p; int n, i;
        if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;
        if ((p = tab[i = (n - 1) & hash]) == null)
            tab[i] = newNode(hash, key, value, null);
        else {
            Node<K,V> e; K k;
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                e = p;
            else if (p instanceof TreeNode)
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else {
                for (int binCount = 0; ; ++binCount) {
                    if ((e = p.next) == null) {
                        p.next = newNode(hash, key, value, null);
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            treeifyBin(tab, hash);
                        break;
                    }
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        break;
                    p = e;
                }
            }
            if (e != null) { // existing mapping for key
                V oldValue = e.value;
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
            }
        }
        ++modCount;
        if (++size > threshold)
            resize();
        afterNodeInsertion(evict);
        return null;
    }

You can see that put() simply delegates to putVal(), passing the key through the hash() method first.

 static final int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

Some people ask: why not just use the key's own hash value directly instead of writing another method? Because this hash() is, in fact, a textbook hash function.

This hash function, also called a hash-optimizing or perturbation function, takes the original hash code, perturbs it (XORing the high 16 bits into the low 16 bits), and produces a new hash value that reduces the probability of hash collisions.
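Why does this help? Because small tables use only the low bits of the hash to pick a bucket, so two hash codes that differ only in their high bits would otherwise collide. A quick demonstration (the standalone hash method here just mirrors the JDK one shown above):

```java
public class HashDemo {
    // Mirrors JDK 8's HashMap.hash: XOR the high 16 bits into the low 16 bits
    static int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    public static void main(String[] args) {
        // Two hash codes that differ only in their high bits...
        int a = 0x10000;
        int b = 0x20000;
        int n = 16; // a small table: the index uses only the low 4 bits

        // ...would collide without perturbation:
        System.out.println((a & (n - 1)) == (b & (n - 1)));  // true

        // ...but land in different buckets after it:
        int pa = a ^ (a >>> 16);
        int pb = b ^ (b >>> 16);
        System.out.println((pa & (n - 1)) == (pb & (n - 1))); // false
    }
}
```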

Walking through putVal(): if the table is still empty, it is first initialized by resize(). Then the target bucket is checked: if it is empty, a new Node is created there directly. Otherwise, if the key already exists, its value is replaced and the old value is returned. There is one more judgment: if the bucket holds a tree, the entry is inserted following the tree structure; if it holds a linked list, the entry is appended at the tail after the last node.
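The "return the old value" branch above is observable from the public API, which makes for a quick sanity check:

```java
import java.util.HashMap;

public class PutDemo {
    public static void main(String[] args) {
        HashMap<String, Integer> map = new HashMap<>();

        // First insert: no existing mapping, so put returns null
        System.out.println(map.put("k", 1)); // null

        // Same key again: value replaced, old value returned
        System.out.println(map.put("k", 2)); // 1
        System.out.println(map.get("k"));    // 2
    }
}
```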


Expansion mechanism

 ++modCount;
        if (++size > threshold)
            resize();
        afterNodeInsertion(evict);
        return null;

Once the number of entries exceeds the threshold, the map expands. The threshold is the current capacity (the existing capacity, not the maximum capacity) multiplied by the load factor; in other words, a smaller load factor triggers expansion more frequently. Each expansion doubles the capacity, and every existing node is rehashed and redistributed into the new table. PS: the resize() source is too long to paste here, so go read it yourself (I bet fewer than five of you will).
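The threshold arithmetic is simple enough to check by hand. With the defaults, the 13th insert into a fresh HashMap is what triggers the first resize:

```java
public class ResizeDemo {
    public static void main(String[] args) {
        float loadFactor = 0.75f; // DEFAULT_LOAD_FACTOR
        int capacity = 16;        // DEFAULT_INITIAL_CAPACITY

        // threshold = capacity * load factor
        int threshold = (int) (capacity * loadFactor);
        System.out.println(threshold);    // 12: the 13th insert triggers a resize

        // resize doubles the capacity, and the threshold scales with it
        int newCapacity = capacity << 1;
        int newThreshold = (int) (newCapacity * loadFactor);
        System.out.println(newCapacity);  // 32
        System.out.println(newThreshold); // 24
    }
}
```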

Small experience sharing

Don't just memorize; understand the principle, or some questions will still stump you. For example, the interviewer asks: why is the capacity doubled on resize? Why not tripled? If you read the source carefully, the answer is right there.

 final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        Node<K,V>[] tab; Node<K,V> p; int n, i;
        if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;
        if ((p = tab[i = (n - 1) & hash]) == null)// Here it is
            tab[i] = newNode(hash, key, value, null);

The key is the line marked above: i = (n - 1) & hash computes the bucket index by ANDing the hash with capacity minus one. Because the capacity starts at a power of two and doubles on every resize, it always stays a power of two, so after subtracting 1 its binary representation is all 1s. That makes the AND pick up every low bit of the hash, spreading entries evenly across the table.
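Put differently: for a power-of-two n, (n - 1) & hash gives exactly the same result as hash % n for non-negative hashes, just with a single cheap AND instead of a division. A non-power-of-two size would break this, which is the answer to "why double, not triple":

```java
public class IndexDemo {
    public static void main(String[] args) {
        int n = 16; // power of two, so n - 1 = 0b1111 (all low bits set)

        // (n - 1) & hash matches hash % n for every non-negative hash
        for (int hash = 0; hash < 100; hash++) {
            if (((n - 1) & hash) != (hash % n)) throw new AssertionError();
        }
        System.out.println("all match");

        // With a non-power-of-two size like 15, the mask 14 = 0b1110
        // ignores the lowest hash bit, so some buckets become unreachable
        System.out.println((15 - 1) & 1); // 0, even though 1 % 15 == 1
    }
}
```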

So apologize to the JDK developers; the wisdom of those big shots is beyond our mortal comprehension.

When a bucket's linked list grows longer than 8, it is converted to a red-black tree; when a tree shrinks below 6 nodes, it is converted back to a linked list. Interviewer: why not set it to 7? Me: 6 and 8 are lucky numbers... The real reason: the average search length of a red-black tree with n nodes is log(n), so at length 8 it is log2(8) = 3; the average search length of a linked list is n/2, so at length 8 it is 8/2 = 4, which makes the conversion to a tree worthwhile. At length 6 or below, 6/2 = 3 is already fast, and building and maintaining the tree structure takes non-trivial time, so the list is kept.

As for why 7 sits unused in the middle: the gap prevents frequent conversions between list and tree. If both thresholds were 8, then a list exceeding 8 would become a tree and a tree dropping below 8 would become a list again; a HashMap that keeps inserting and deleting elements with a bucket hovering around 8 entries would constantly flip between the two structures, and efficiency would plummet.
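The search-cost comparison above is just arithmetic, so we can spell it out:

```java
public class TreeifyDemo {
    public static void main(String[] args) {
        // Average lookup cost in one bucket:
        // ~n/2 steps for a linked list, ~log2(n) steps for a red-black tree
        double listCostAt8 = 8 / 2.0;                   // 4.0
        double treeCostAt8 = Math.log(8) / Math.log(2); // ~3.0: tree wins, treeify at 8

        double listCostAt6 = 6 / 2.0;                   // 3.0: list is cheap again,
                                                        // so untreeify back at 6
        System.out.println(listCostAt8 > treeCostAt8);  // true
        System.out.println(listCostAt6);                // 3.0
    }
}
```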

Finally, I’d like to recommend a book to you

A lot of this is my own understanding; if I've got something wrong, feel free to slap me in the face with corrections. Class dismissed!