Today, I’m going to talk about one of the most cliched and often talked about issues. Search HashMap on the Internet and find various articles. Today, I’ll take you through HashMap line by line from the source code. But the focus is on the principle of itself, as for multithreading issues, JDK 1.7, 1.8 differences, will be a new article to discuss.

New HashMap<>() <>() <>() <>() <>() <>();

The working principle of

Structural components

put

  • The first is or the hash of the key of the inserted object which is perturbed to get a new hash value, simply for better hashing
  • According to the hash value, find the subscript of the value to be inserted, and then perform the corresponding operation on the linked list, insert at the end, hash conflict, and transform into a red-black tree
  • Expand capacity if necessary

get

When retrieving the value, it is easy to find the index of the bucket based on the hash of the key, and then traverse the linked list to find the value.

Note: The key object itself has a hash value, but when putting and getting, the hash algorithm perturbs it to generate a new hash value, but the hash value of the object itself remains the same.

Member variables

Must be the focus of the first said, the other have the basis can go to the source code to find learning, so more sense of achievement.

Inner classes – Node

HashMap consists of an array and a linked list, which is essentially an array of nodes

.
,v>

It doesn’t seem like there’s anything to talk about, but there’s a class, and there’s a class that has member variables in it.

Member variables

CapCity capacity is not a member variable. Note that since JDK 1.8, buckets are initialized not in the constructor, but in the first PUT, so it’s perfectly fine not to have capcity as a variable.

The constructor

There are two types of constructors, either to assign a value to a member variable or to copy a new Map

There are two functions: tableSizeFor() and putMapEntries(). If you are interested in this function, please look at the source code.

Put

hash()

Let’s look at the PUT source code

public V put(K key, V value) {
    return putVal(hash(key), key, value, false, true);
}

The first thing to run is a hash() function

Static final int hash(Object key) {//hash algorithm int h; return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16); }

Int h = key.hashCode() h is a 4-byte 32-bit number, then exclusive-or h >>> 16 and finally we get the new hash value.

H is a 32-bit number, so when it moves 16 bits to the right, the top bit (the first 16 bits) is 0, and x to the 0 is equal to x. So the whole point of the above sentence is to preserve the high, the low and the high exclusivity or, and what this does is it disrupts the low, and the low has some of the properties of the high.

Next we focus on putVal(), which is the first put and the other puts.

For the first time the put

PutVal () on the first put

final V putVal(int hash, K key, V value, boolean onlyIfAbsent, boolean evict) { Node<K,V>[] tab; Node<K,V> p; int n, i; if ((tab = table) == null || (n = tab.length) == 0) n = (tab = resize()).length; / /... }

When the first put the table must be empty, namely the if ((TAB = table) = = null | | (n = TAB. Length) = = 0) return to true, walk to resize () process.

How does resize work when you put it for the first time

As you can see from the source code, the actual table array is initialized when resize() in putVal() in the first put.

Normal put

First there is a hash that calculates which bucket the key should be placed under. N is the length of the bucket array and is always a power of two. This is reflected in binary where only one digit is 1 and everything else is 0. In this case, n minus 1 is all 1’s in the tail and all 0’s in the rest. (n-1) &hash preserves the lower part of the hash value, and sets the higher part to zero.

If n = 16, then n-1 = 15 = 00001111, which is equivalent to preserving only the last four bits of the hash.

capacity

You can see that there is also a resize() at the end of the above source code. It can be seen that resize() is not only useful for initialization, but also for scaling.

When the number of K-V > expansion threshold = Table array capacity * load factor (default is 0.75), the expansion starts, the table array becomes 2 times the original, and the old K-V is hashed in the new array.

I took the back apart, because one is because it’s clever, and two is because it’s too long and looks tired

In advance, the idea behind the JDK is to split a linked list into two parts for hashing. If you want to break it into two parts you have to do some sort of hash operation to get two results, and the computer naturally comes up with 0,1. One number and the other number get 0 and 1, so you can think of ampersand, and only one of the numbers has a 1, and everything else has a 0.

So the algorithm is the hash (notice, the hash before the perturbation) & newCap and then you only get zeros and ones.

Get

Get () is relatively simple compared to the previous insert and expansion functions

public V get(Object key) {
    Node<K,V> e;
    return (e = getNode(hash(key), key)) == null ? null : e.value;
}

You can see that you finally get to the getNode() function