Java Collections Basics: HashMap

1. Introduction to HashMap

HashMap is a commonly used data structure in development that stores data as key-value pairs. Let's take a look at HashMap's data structure and its underlying implementation. This article mainly covers the following aspects: the underlying data structure, the expansion mechanism, the addressing and hash algorithms, and the infinite-loop problem in JDK 1.7.

2. Underlying data structure

In JDK 1.7, a HashMap is composed of an array plus linked lists. Hashing is probabilistic and conflicts are inevitable; the linked lists are the product of hash conflicts. In extreme cases the conflicts are severe, a linked list grows too long, and lookup efficiency drops. JDK 1.8 therefore introduced red-black trees: a linked list is converted to a red-black tree if it becomes too long.
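
Below is a minimal sketch, modelled on but not copied from the JDK 8 source, of how the bucket array and the per-bucket linked list fit together; the class and field names are illustrative:

    // Simplified sketch of the HashMap bucket layout (illustrative, not the JDK source).
    // The table is an array of Node; colliding entries are chained via `next`,
    // and in JDK 8 a long chain may later be replaced by a red-black tree structure.
    class SimpleHashMap<K, V> {

        static class Node<K, V> {
            final int hash;      // cached hash of the key
            final K key;
            V value;
            Node<K, V> next;     // next node in the same bucket (the linked list)

            Node(int hash, K key, V value, Node<K, V> next) {
                this.hash = hash;
                this.key = key;
                this.value = value;
                this.next = next;
            }
        }

        Node<K, V>[] table;      // the bucket array; each slot is the head of a chain
    }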

3. Capacity expansion mechanism, addressing algorithm and Hash algorithm

Addressing algorithm and hash algorithm. put() delegates to putVal() with the hash computed from the key:

    public V put(K key, V value) {
        return putVal(hash(key), key, value, false, true);
    }

The hash value is computed by hash(key):

    static final int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

hash() takes the key's hashCode, shifts it right by 16 bits, and XORs the shifted value with the original, so the high 16 bits are mixed into the low 16 bits. This reduces hash collisions.

Why not just use hashCode() directly instead of XORing it with its high 16 bits to compute a new hash value? An int is 32 bits and can represent 2^32 values (-2^31 to 2^31 - 1), but the initial length of a HashMap is 16 (DEFAULT_INITIAL_CAPACITY), so only the low four bits of the hash take part in indexing and the higher bits have no effect. Hash codes such as 2^10, 2^20 and 2^30 would all produce the same index, and the hash table would not be evenly distributed. To reduce this kind of conflict, HashMap also perturbs the high part of the hashCode: it XORs the hashCode with its own high 16 bits and uses the result for addressing, so the high bits also influence the index.

Expansion mechanism: the main underlying structure of HashMap is an array, and an array must occupy a contiguous block of memory. When a certain length is reached, the expansion mechanism, also known as resize, is triggered. Two factors are involved:
● Capacity: the current length of the HashMap.
● LoadFactor: the load factor, 0.75f by default.
With a current capacity of 16, storing the 13th element (16 * 0.75 = 12) triggers a resize.
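
To make this concrete, here is a small illustrative demo (the perturb helper and class name are mine, not JDK API), assuming the usual hash & (capacity - 1) addressing HashMap uses for power-of-two capacities. It shows why the high bits matter and where the resize threshold of 12 comes from:

    // Illustrative demo of perturbation, bucket indexing and the resize threshold.
    public class AddressingDemo {

        // Mirrors HashMap.hash(): mix the high 16 bits of the hashCode into the low 16 bits.
        static int perturb(int h) {
            return h ^ (h >>> 16);
        }

        public static void main(String[] args) {
            int capacity = 16;                              // DEFAULT_INITIAL_CAPACITY
            float loadFactor = 0.75f;                       // DEFAULT_LOAD_FACTOR
            int threshold = (int) (capacity * loadFactor);  // 12: the 13th entry triggers a resize
            System.out.println("resize threshold = " + threshold);

            // These hash codes differ only above the low 4 bits, so with the raw
            // hashCode they all land in bucket 0; after perturbation they spread out.
            for (int hc : new int[]{1 << 16, 1 << 17, 1 << 18}) {
                int rawIndex   = hc & (capacity - 1);           // always 0: only the low 4 bits count
                int mixedIndex = perturb(hc) & (capacity - 1);  // 1, 2, 4: the high bits now matter
                System.out.println("raw index = " + rawIndex + ", perturbed index = " + mixedIndex);
            }
        }
    }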

When does a HashMap expand, and how does the expansion work?

  • Capacity expansion: create a new empty Entry array twice the length of the original one.
  • ReHash: iterate over the original Entry array and re-hash every entry into the new array.

We need to hash again because the array length has changed, and the bucket position is computed from that length.
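
A quick illustrative sketch (class and variable names are mine) of why positions change: with power-of-two capacities the index is hash & (capacity - 1), so doubling the capacity exposes one more bit of the hash, and in JDK 8 each node either keeps its old index or moves to old index + old capacity depending on that newly exposed bit:

    // Sketch: how an entry's index changes when a HashMap doubles from 16 to 32.
    public class ResizeIndexDemo {
        public static void main(String[] args) {
            int oldCap = 16;
            int newCap = oldCap << 1;                   // resize doubles the capacity

            int hash = 0b1_0101;                        // example perturbed hash = 21

            int oldIndex = hash & (oldCap - 1);         // 21 & 15 = 5
            int newIndex = hash & (newCap - 1);         // 21 & 31 = 21

            // JDK 8 uses the newly exposed bit to decide where the node goes:
            // if (hash & oldCap) == 0 it keeps its old index, otherwise it moves
            // to oldIndex + oldCap (here 5 + 16 = 21).
            boolean staysInPlace = (hash & oldCap) == 0;
            System.out.println("old=" + oldIndex + " new=" + newIndex + " stays=" + staysInPlace);
        }
    }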

Why is the load factor 0.75f by default?

If it is too large, the likelihood of hash conflicts increases; if it is too small, space is wasted. 0.75 is an empirical balance between the two.

Red-black tree

When the length of a linked list exceeds the threshold (8), the list is converted to a red-black tree, which greatly reduces lookup time. The value 8 was chosen through development and practical experience.
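
As a rough sketch of the condition, assuming the JDK 8 constant values (the class and method names are illustrative, and note that the real HashMap additionally requires the table to be at least MIN_TREEIFY_CAPACITY = 64 slots long before treeifying, a detail not mentioned above):

    // Illustrative sketch (not JDK source): when a bucket's chain becomes a tree.
    public class TreeifyDemo {
        static final int TREEIFY_THRESHOLD = 8;      // chain at least this long -> candidate for a tree
        static final int MIN_TREEIFY_CAPACITY = 64;  // but only if the table is this large;
                                                     // otherwise the map resizes instead

        static boolean shouldTreeify(int chainLength, int tableCapacity) {
            return chainLength >= TREEIFY_THRESHOLD && tableCapacity >= MIN_TREEIFY_CAPACITY;
        }

        public static void main(String[] args) {
            System.out.println(shouldTreeify(8, 16));   // false: table too small, resize instead
            System.out.println(shouldTreeify(8, 64));   // true: convert the chain to a red-black tree
        }
    }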

Why did JDK 1.8 switch from JDK 1.7's head insertion to tail insertion?

Head insertion changes the order of the list during expansion, while tail insertion keeps the elements in their original order, so the list cannot form a loop.

So a chain that was A -> B before expansion is still A -> B afterwards.

Operating on a HashMap from multiple threads in Java 7 can cause an infinite loop, because expansion reverses the order of each linked list and changes the reference relationships between the nodes being migrated. Under the same conditions Java 8 does not produce an infinite loop, because the list order is preserved after expansion and migration and the nodes keep their original reference relationships.
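
To make the head-insertion reversal concrete, here is a simplified sketch modelled on JDK 1.7's transfer step (not the actual source; names are abbreviated). Each node is re-linked at the head of its new bucket, which is what turns A -> B into B -> A and, under concurrent resizing, lets next pointers end up referencing each other:

    // Simplified sketch of JDK 1.7-style transfer during resize (illustrative only).
    public class Jdk7TransferSketch<K, V> {

        static class Node<K, V> {
            final int hash; final K key; V value; Node<K, V> next;
            Node(int hash, K key, V value, Node<K, V> next) {
                this.hash = hash; this.key = key; this.value = value; this.next = next;
            }
        }

        Node<K, V>[] table;   // the old bucket array

        // Head insertion: each node is re-linked in front of whatever already sits
        // in its new bucket, so a chain A -> B comes out as B -> A. With two threads
        // resizing at once, the reversed next pointers can end up referencing each
        // other, producing the infinite loop described above.
        void transfer(Node<K, V>[] newTable) {
            int newCapacity = newTable.length;
            for (Node<K, V> head : table) {
                Node<K, V> e = head;
                while (e != null) {
                    Node<K, V> next = e.next;            // remember the rest of the old chain
                    int i = e.hash & (newCapacity - 1);  // recompute the bucket index
                    e.next = newTable[i];                // head insertion into the new bucket...
                    newTable[i] = e;                     // ...which reverses the order
                    e = next;
                }
            }
        }
    }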