Preface:

It’s the golden recruitment season. Are you all ready? The basic implementation principle of HashMap should be mastered by every interviewer. Only by truly mastering the internal implementation principle of HashMap can you avoid being in a panic when facing the interrogation of the interviewer. Only by experiencing the bombarding of the interviewer can you refine your knowledge. This article, combined with a rich graphic format, will help you understand the basics of HashMap in JDK7 and how it is implemented.

Introduction to HashMap

A HashMap introduction:

A HashMap is a hash table that stores a key-value mapping. HashMap extends AbstractMap and implements Map, Cloneable, java.io.Serializable interfaces. The implementation of HashMap is not synchronous, which means it is not thread-safe. Both key and value can be null. In addition, the mappings in a HashMap are not ordered. Instances of a HashMap have two parameters that affect its performance: “Initial capacity” and “load factor.” Capacity is the number of buckets in the hash table. The initial capacity is just the capacity of the hash table when it was created. The load factor is a measure of how full a hash table can be before its capacity automatically increases. When the number of entries in a hash table exceeds the product of the load factor and the current capacity, the hash table is rehash (that is, rebuild the internal data structure) so that the hash table has approximately twice the number of buckets. Typically, the default loading factor is 0.75, which is a trade-off between time and space costs. A high load factor reduces the space overhead, but it also increases the query cost (as reflected in most HashMap class operations, including GET and PUT). The number of entries needed in the map and its load factor should be taken into account when setting the initial capacity to minimize the number of rehash operations. If the initial capacity is greater than the maximum number of entries divided by the load factor, no rehash operation will occur.

HashMap inheritance:The relationship between HashMap and Map is shown as follows:

Constructor of the HashMap

HashMap has four constructors, as follows:

// Default constructor.
HashMap()
// Specifies the "capacity size" constructor
HashMap(int capacity)
// Specifies the "capacity size" and "load factor" constructors
HashMap(int capacity, float loadFactor)
// A constructor that contains "submap"
HashMap(Map<? extends K, ? extends V> map)
Copy the code

HashMap in JDK7

The basic storage structure of HashMap in JDK7 or JDK8 is array + linked list. This section mainly studies the underlying implementation of HashMap in JDK7. Its basic structure diagram is as follows:

In the figure above, the orange area on the left is the hash table, and the blue area on the right is the linked list. The element type in the linked list is Entry, which contains four attributes:

  • K key
  • V value
  • int hash
  • Entry next

So why the array + linked list storage structure? Put (“Key”, “Value”) is an array that stores the Key and Value as an Entry into a hash table. So how does it store an Entry object into an array? How do you determine where the current key and value should be stored in the array, in other words, how do you determine the index of the Entry object in the array? Normally, when we determine an array, we store the data one by one until the array is full, and then consider expanding the array. That’s not how a HashMap works. In Java and most object-oriented programming languages, each object has an integer variable called HashCode. The hashcode is an important identifier that identifies different objects. With this Hashcode, it is easy to determine the subscript index of an Entry object. In Java, it can be understood that hashCode is converted to an array subscript by modulo the length of the array. The basic formula is as follows:

int index = HashCode(key) % Array.length
Copy the code

In fact, in the JDK, hash functions do not take modulo directly. Instead, they take advantage of bitwise operations to improve performance, which we understand here as simple modulo operations. We know that by hashing the Key and modulating the length of the array we get the index of the current Entry in the array, so we can keep calling the PUT method of the HashMap and keep storing data into the array. However, there is a phenomenon, which is called “hash conflict”, in which the results of different keys may be exactly the same. Now that a hash conflict has occurred, how should the conflicting data be stored? In fact, hash conflict is an unavoidable fact, since it is unavoidable, then we should try to solve this problem, currently commonly used methods are mainly two, one is open addressing method, the other is linked list method. The principle of open addressing is relatively simple, which is to “look elsewhere” in the array and try to find the next empty position. The linked list method is not to find the next gap, but to continue to store in the current conflict place, and form a linked list with the existing data, stored in the form of a linked list. The storage form of HashMap is array + linked list. Linked list method is used to solve the problem of hash conflict. Read on for more details. In daily development, the constructor, PUT, and GET methods are the most commonly used methods for HashMap. The following is a detailed understanding of the implementation principle of HashMap from these three methods.

3. Flow chart of HashMap PUT and get methods

Here is a flowchart for storing data in the PUT method of a HashMap:

Here is a flow chart for getting data from HashMap:The GET process is drawn slightly more complex than normal, just to make the process clearer.

4. Common iteration methods of HashMap

In the actual development process, the iterative traversal of HashMap is also a common operation. The common ways of iterative traversal of HashMap are as follows:

  • Mode 1: iterator mode
Map<String, String> map = new HashMap<>(16);
Iterator<Map.Entry<String, String>> iterator = map.entrySet().iterator();
while (iterator.hasNext()) {
    Map.Entry<String, String> next = iterator.next();
    System.out.println(next.getKey() + ":" + next.getValue());
}
Copy the code
  • Method 2: Traverse the Set> method
Map<String, String> map = new HashMap<>(16);
for (Map.Entry<String, String> entry : map.entrySet()) {
    System.out.println(entry.getKey() + ":" + entry.getValue());
}
Copy the code
  • Mode 3: forEach mode (JDK8 feature, lambda)
Map<String, String> map = new HashMap<>(16);
map.forEach((key, value) -> System.out.println(key + ":" + value));
Copy the code
  • Mode 4: keySet mode
Map<String, String> map = new HashMap<>(16);
Iterator<String> keyIterator = map.keySet().iterator();
while (keyIterator.hasNext()) {
    String key = keyIterator.next();
    System.out.println(key + ":" + map.get(key));
}
Copy the code

Comparing these four methods, the first three are actually the same, which are all iterator traversal methods. If both key and value are to be used, the first three methods are recommended; if only key is to be used, the fourth method is recommended.

Five, the summary

This article mainly explains the specific implementation principle of HashMap in JDK7, I believe that readers will have a clear understanding of the implementation principle of HashMap in JDK7, whether JDK7 has been used again. But the fundamentals are worth learning! We will continue to explain the implementation of HashMap in JDK8 and compare JDK7 to help readers understand the similarities and differences between the two!

In addition, I have collated and collected more than 20 years of interview knowledge points from more than companies, and all kinds of Java core knowledge points have been collated into documents. There are some screenshots below. If you want information, you can get remarks below: Nuggets.