(1) HashMap series: load factor 0.75

(2) HashMap series: Tree threshold 8, degradation threshold 6

(3) HashMap series: 2 power expansion

(4) HashMap series: Put elements (you will regret it forever!)

Red black tree series:

First, “Algorithm – Simple” N fork tree introduction

Two, “Algorithm – simple” red black tree rotation

Three, “Algorithm – Simple” red black tree insertion

4. Delete the red black tree of algorithm – Simple and Simple

One, foreword

In this article, we will focus on two numbers: 8 and 6, which represent the conversion of a linked list to a red-black tree, and the degradation of a red-black tree to a linked list.

HashMap vs. JDk1.7&jdk1.8

In JDK1.7, the data structure of HashMap is still continuous array + single linked list; In JDK1.8, this is changed to a sequential array + single-linked list <-> red-black tree structure.

Red black tree principle, algorithm and implementation, I have written before, you can directly go to see.

Here are some minor improvements to HashMap in JDK1.8:

  • The time complexity of query, insert and delete is linear, i.e. O(n).
  • The time complexity of query, insert and delete of red-black tree is O(logN).

Obviously, the performance of a HashMap deteriorates with linked lists when there are too many hash conflicts, but red-black trees balance this out nicely.

Three, eight and six

3.1 converting linked lists to red-black trees

In the source, when the number of elements in the list is equal to 8, the list will be transformed into a red-black tree.

* Because TreeNodes are about twice the size of regular nodes, we * use them only when bins contain enough nodes to warrant use * (see TREEIFY_THRESHOLD). And when they become too small (due to * removal or resizing) they are converted back to plain bins. In * usages with well-distributed user hashCodes, tree bins are * rarely used. Ideally, under random hashCodes, the frequency of * nodes in bins follows a Poisson distribution * (http://en.wikipedia.org/wiki/Poisson_distribution) With a * parameter of about 0.5 on average for the default resizing * threshold of 0.75, although with a large variance because of * resizing granularity. Ignoring variance, The expected * occurrences of list size k are (exp(-0.5) * POw (0.5, k) / * factorial(k)). The first values are: * * 0: 0.60653066 * 1: 0.30326533 * 2: 0.07581633 * 3: 0.01263606 * 4: 0.00157952 * 5: 0.00015795 * 6: 0.00001316 * 7: 0.00000094 * 8: 0.00000006 * More: less than 1 in ten millionCopy the code

There are a few things we need to focus on in the comments:

  • TreeNodes takes up twice as much memory as normal Nodes.
  • Ideally, when the random hash code and the load factor are 0.75, the frequency of the number in bucket follows poisson distribution.

Poisson distribution: P(X=k) = exp(-λ) * POw (λ, k)/k! When λ = 0.5, the probability is calculated as above! There is a 0.00000006 chance that the list length will reach 8, and then one in 10 million!

Therefore, when the number of linked list elements is 8, the list is converted to a red-black tree.

3.2 Red-black Tree degenerates into linked Lists (UnTreeify = 6)

  • If no degradation threshold is set, only 8 is used to tree and degrade:

Then 8 will become a critical value, sometimes tree-like, sometimes degenerate, which will greatly affect performance. Therefore, we need a degradation threshold lower than 8;

  • UNTREEIFY_THRESHOLD = 7

Again, it’s not much better than the above situation, only one element apart, and it’s still going back and forth between the list and the tree;

  • So why 6?

It is also stated in the source code that the degradation threshold is at most 6 considering memory (tree nodes have 2 times more memory than normal nodes, and avoid repeated conversions).

At the same time, we also need to pay attention to:

/**
 * The smallest table capacity for which bins may be treeified.
 * (Otherwise the table is resized if too many nodes in a bin.)
 * Should be at least 4 * TREEIFY_THRESHOLD to avoid conflicts
 * between resizing and treeification thresholds.
 */
static final int MIN_TREEIFY_CAPACITY = 64;
Copy the code

The above comments indicate that you should not consider turning a list into a red-black tree until you have at least 64 HashMap elements on the tree.

The default size of a red-black tree is 16, so until it is expanded to 64, it will be stored as a continuous array + linked list.