
After learning about HashMap's expansion mechanism, I was reminded of an earlier interview where the interviewer asked me whether I understood why the load factor is 0.75. I forget exactly how I answered at the time, but the answer was surely inaccurate, so I am writing this article to record it and deepen the impression. I hope it can help you too ~

Scope: everything below is based on JDK 1.8

1. The role of the load factor

The load factor determines the expansion mechanism: when the number of elements in the container reaches the threshold (capacity × load factor), the container is expanded. For example, if the current capacity of the container is 16, the threshold is **16 × 0.75 = 12**; when the number of elements stored in the container reaches 12, the container will be expanded
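As a minimal sketch of this arithmetic (the class and variable names here are mine, not from the JDK):

```java
// Minimal sketch of the resize-threshold arithmetic described above.
// In JDK 1.8, HashMap resizes once its size grows past capacity * loadFactor.
public class ThresholdDemo {
    public static void main(String[] args) {
        int capacity = 16;          // default initial capacity of HashMap
        float loadFactor = 0.75f;   // default load factor
        int threshold = (int) (capacity * loadFactor);
        System.out.println("resize threshold = " + threshold); // prints 12
    }
}
```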

2. The reason explained (key point)

When we think about a HashMap, the first thing to keep in mind is that it is just a data structure, and for any data structure the most important goals are saving time and saving space. The purpose of the load factor is precisely to balance the two. Why a balance? Let's consider two extreme cases

1. The load factor is 1.0

First, recall the underlying data structure of a HashMap. Entries are initially stored in an array of buckets; when a hash collision occurs, the colliding entries form a linked list on that bucket, and when the list reaches a certain length (8 in JDK 1.8), it is converted into a red-black tree. With a load factor of 1.0, expansion does not occur until every slot of the array is filled. This is a big problem, because hash collisions are unavoidable: long before the array is completely full, many buckets will already hold long lists or deep red-black trees, which is very detrimental to query efficiency. In this case, time is sacrificed to maximize space utilization
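To make the collision problem concrete, here is a small self-contained sketch (`CollidingKey` is a hypothetical class of mine, not part of the JDK) whose `hashCode` forces every key into the same bucket, so each lookup must walk that bucket's list or tree:

```java
import java.util.HashMap;
import java.util.Map;

// A key whose hashCode always collides, illustrating why many entries per
// bucket degrade lookups: every entry lands in the same bucket and must be
// chained (as a list, then a red-black tree once the chain grows past 8).
final class CollidingKey {
    private final int id;
    CollidingKey(int id) { this.id = id; }
    @Override public int hashCode() { return 1; }  // force collisions
    @Override public boolean equals(Object o) {
        return o instanceof CollidingKey && ((CollidingKey) o).id == id;
    }
}

public class CollisionDemo {
    public static void main(String[] args) {
        Map<CollidingKey, Integer> map = new HashMap<>();
        for (int i = 0; i < 20; i++) {
            map.put(new CollidingKey(i), i); // all 20 keys share one bucket
        }
        // Lookups still work, but each one walks the chain/tree in that bucket.
        System.out.println(map.get(new CollidingKey(7))); // prints 7
    }
}
```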

In short, when the load factor is too large, space utilization goes up, but time efficiency goes down

2. The load factor is 0.5

With a load factor of 0.5, expansion happens as soon as the array is half full. Since fewer slots are occupied, hash collisions are reduced, the underlying lists stay short (and the red-black trees shallow), and queries are more efficient. However, space utilization drops sharply: data that used to need 1 MB of table space now occupies 2 MB

Therefore, when the load factor is too small, time efficiency improves, but space utilization suffers.
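The tradeoff can be seen side by side in a short sketch (names are mine) comparing how many entries a 16-slot table accepts before resizing under each of the load factors discussed above:

```java
// Compare how many entries a table of capacity 16 holds before a resize,
// under the three load factors discussed in this article.
public class LoadFactorComparison {
    public static void main(String[] args) {
        int capacity = 16;
        for (float lf : new float[] {0.5f, 0.75f, 1.0f}) {
            int threshold = (int) (capacity * lf);
            System.out.printf("loadFactor=%.2f -> threshold = %d entries (%.0f%% of capacity)%n",
                    lf, threshold, lf * 100);
        }
    }
}
```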

3. The load factor is 0.75

From these extreme cases it should be easy to see why the default is 0.75: it is a tradeoff between time and space. Of course, the load factor of 0.75 is also explained in the HashMap source code itself. Knocking on the blackboard here to deepen the impression, the relevant part of the class comment reads:

> As a general rule, the default load factor (.75) offers a good tradeoff between time and space costs. Higher values decrease the space overhead but increase the lookup cost (reflected in most of the operations of the HashMap class, including get and put). The expected number of entries in the map and its load factor should be taken into account when setting its initial capacity, so as to minimize the number of rehash operations. If the initial capacity is greater than the maximum number of entries divided by the load factor, no rehash operations will ever occur.
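Following the sizing advice in that comment, here is a minimal sketch (class and variable names are mine) of pre-sizing a map so that no rehash occurs while filling it:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the sizing advice quoted above: choose an initial capacity of at
// least expectedEntries / loadFactor so no rehash occurs while filling the map.
public class PresizeDemo {
    public static void main(String[] args) {
        int expectedEntries = 1000;
        float loadFactor = 0.75f;
        // +1 compensates for the truncation of the integer cast.
        int initialCapacity = (int) (expectedEntries / loadFactor) + 1;
        Map<String, Integer> map = new HashMap<>(initialCapacity, loadFactor);
        for (int i = 0; i < expectedEntries; i++) {
            map.put("key" + i, i); // no resize is triggered by these puts
        }
        System.out.println("stored " + map.size() + " entries");
    }
}
```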

For those who are interested, I have translated the full class comment of HashMap below