The following source code is JDK1.7

A HashMap overview

HashMap is an asynchronous implementation of the Map interface based on hash tables. Provides all optional mapping operations and allows null values and NULL keys. This class does not guarantee the order of the mapping.

Note that hashMaps are not synchronous.

Hash table

Hash table definition: a hash table is a data mapping structure that searches for values according to key codes. The structure searches for places to store values by mapping the locations of key codes.

For example, the most typical example is the dictionary. If you want to find the word “press” in the dictionary, you will usually search the pinyin index according to the Pinyin AN (of course, it can also be a partial index), and then find the position of TI in the dictionary, and get the first pinyin word “an “. This process is called keycode mapping, which is to find f(key) by key. Key is the keyword, f() is the hash function, and the result of the hash function is the hash value.

Hash conflict: then the problem is, we are looking for “press “, not” an “, but they are all the same pinyin. By using the keywords an “press” and “Ann” to map to the same dictionary page 4 position, this is a hash collision (also called a hash collision), expressed in the formula is key1! = key2, but f (key1) = f (key2).

Key is the value, and f(key) calculates the storage address in the array, so that two elements have the same address. At this point, the design of the hash function is crucial, a good hash function will try to ensure that the calculation is simple and the hash address distribution is uniform, but the array is a continuous fixed length of memory space, no matter how good the hash function can guarantee that the storage address will never conflict.

There are many solutions to hash conflicts: open addressing (conflict, find the next one), rehash function, chain address.

A HashMap uses an array and a linked list.

Implementation principle of HashMap

There are two basic data structures: arrays and Pointers, and a HashMap is implemented through these two data structures. It is a combination of arrays and linked lists.

 

As can be seen from the figure, the underlying HashMap is an array structure in which each item is a linked list. When a HashMap is created, an array is initialized.

The backbone of a HashMap is an array of entries.

 

Entry is a static inner class that contains key-values.

 

The overall structure of the HashMap store is as follows:

 

To put it simply, a HashMap consists of an array + a linked list. The array is the body of the HashMap. The linked list exists to resolve hash conflicts. If the array to be located contains a linked list, the add operation traverses the list and is then compared one by one through the key’s equals method, overwriting if it exists, adding if it doesn’t, and traversing the list for the find.

Therefore, for performance purposes, the fewer linked lists in a HashMap, the better performance.

HasmMap has several important fields:

 

 

 

 

 

Constructors for HashMap:

 

As you can see from the code above, no memory is allocated for the array TABLE in a regular constructor (except for one that takes map). Instead, the table array is actually built when a PUT operation is performed

 

InflateTable () inflateTable()

 

Heavyweight roles, hash functions enter:

 

IndexFor () is implemented as follows:

 

H &(length-1) Ensure that the obtained index must be within the range of the array. For example, if the capacity is 16, length-1=15, h=18, calculate as follows:

 

It is concluded that the index = 2.

Therefore, the final storage location is indeed the following flow:

 

Finally, take a look at the addEntry implementation:

 

As can be seen from the code of addEntry, when hash conflicts occur and the size is larger than the threshold value, array expansion is required. During expansion, a new array with twice the length of the previous one needs to be created. Finally, all elements in the current Entry array are passed, and the length of the new array after expansion is twice the length of the previous one. So capacity expansion is a relatively resource-intensive operation.

The get method is much simpler:

 

GetEntry ()

 

As can be seen, the implementation of the GET method is quite simple, and the process is as follows: Key (hashcode)–>hash–>indexFor–> final index position (table[I]);

In the getEntry method, e.hash==hash is necessary when iterating through the list after locating the array position. Consider the following scenario: If the key object passed overrides equals but does not override hashCode, and happens to be located at this array location, it may be equal if only using equals, but its hashCode does not match that of the current object. According to the convention of Object hashCode, the current Object cannot be returned, but null should be returned.

Overriding equals means overriding hashCode as well

Why should hashCode also be overridden when overriding equals? Here’s a quick example:

 

Actual output:

Results: the null

Now that we have a sense of how HashMap works, this result is not hard to understand. Key (hashcod1)–>hash–>indexFor–> final index position; When value is removed by key: Key (hashcode2)–>hash–>indexFor–> final index position, return null because hashcode1 and Hashcode2 are not equal. But it also determines whether the hash value of the entry is equal, as mentioned in the get method above.)

So, when overriding equals, care must be taken to override hashCode while ensuring that equals returns the same integer value for two objects that are equal, and equals returns the same hashCode for two objects that are not equal. It’s just that hash collisions can occur and should be avoided as much as possible.

A HashMap traversal

 

conclusion

The underlying HashMap treats key-values as a whole, which is the Entry object. At the bottom of HashMap, an Entry[] array is used to store all key-value pairs. When an Entry object needs to be stored, its position in the array is determined according to the Hash algorithm, and its storage position in the linked list of the position in the array is determined according to the Equals method. When an Entry needs to be retrieved, the hash algorithm is used to find its storage location in the array, and the equals method is used to retrieve the Entry from the linked list at that location.