public class TreeMap<K,V>       extends AbstractMap<K,V>       implements NavigableMap<K,V>, Cloneable, java.io.Serializable
Copy the code

TreeMap inherits an interface, NavigableMap, that defines the difference between TreeMap and HashMap:

The key of a HashMap is unordered, and the key of a TreeMap is ordered

Interface NavigableMap

First take a look at NavigableMap’s signature

public interface NavigableMap<K,V> extends SortedMap<K,V>
Copy the code

NavigableMap inherits the SortedMap signature

SortedMap

public interface SortedMap<K,V> extends Map<K,V>
Copy the code

SortedMap, as its name suggests, is an ordered Map. This order usually refers to the natural ordering of keys provided by the Comparable interface, or it can be determined by specifying a Comparator when creating the SortedMap instance. The order of keys is shown when we iterate over a SortedMap instance with collection Views (also provided by entrySet, keySet, and values methods, as with HashMap). The distinction between Comparable and Comparator is explained here:

Comparable generally means the natural order of a class, such as defining a Student class whose Student number is sorted by default

A Comparator typically represents a special classification of a class in a specific situation that requires a custom sort. Let’s say we want to sort by the age of the Student class

Classes that insert keys into the SortedMap must all inherit the Comparable class (or specify a comparator) in order to determine how to compare(via k1.compareTo(k2) or comparator.compare(k1, k2)) two keys, otherwise, ClassCastException is raised on insertion. The order of keys in SortedMap should be consistent with equals. If k1.compareTo(k2) or comparator.compare(k1, k2) is true, k1.equals(k2) should also be true. After introducing SortedMap, let’s go back to our NavigableMap. NavigableMap is new in JDK1.6, which adds “navigation methods” to SortedMap to return the elements closest to the search target. For example, the following methods:

LowerEntry, which returns all elements smaller than the given map.entry

FloorEntry, returns all elements that are smaller or equal to the given Map.Entry

CeilingEntry returns all elements larger or equal to a given Map.entry

HigherEntry, which returns all elements larger than a given map.entry

Design Concept

Red — Black tree

TreeMap is based on a red-black tree, which is a binary search tree. Let’s recall some properties of binary search trees

Binary search tree

A binary search tree (BST) looks like a binary search tree.

! [](https://mmbiz.qpic.cn/mmbiz_jpg/FDmPiaXNnWFQfYWTb9lY1rNdibJ13aBNusnBd1vzlCJaYEgCy7rSJJcTl7fFgVvq6ZIS0cCDK7v1icfDDgDHi cAmgA/640?wx_fmt=jpeg&tp=webp&wxfrom=5&wx_lazy=1&wx_co=1)
Binary search tree

I’m sure you’re familiar with this diagram, but the key point is:

The value of the left subtree is less than the root node, and the value of the right subtree is greater than the root node.

The advantage of binary search trees is that each decision reduces the size of the problem by half, so if binary search trees are balanced, the time to find elements is log(n), which is the height of the tree. A serious problem comes to mind here. If a binary search tree reduces the size of the problem by half, wouldn’t a trinomial search tree reduce the size of the problem by two thirds, which is better? And so on, we could have a quadominal search tree, a five-ominal search tree… For the more general case:

N elements, K cross tree what is the most efficient way to search a tree for K? At K = 2?

K fork search tree

If you follow my above analysis, it is likely to fall into a misunderstanding, that is

Two comparisons are required for a binary search tree to reduce the problem size by two thirds (only one comparison is required for a binary search tree to reduce the problem size by half)

We can’t ignore both of these, for the more general case:

N elements. The average number of comparisons required to search a k-tree is K log n over K.

For extreme cases k = n, k-tree is transformed into a linear table, and the complexity is O(n). If the problem is solved mathematically, it is equivalent to:

When n is fixed, what is the minimum value of k log n over k?

K *log(n/k) can be converted to ln(n)*k/ln(k) according to the logarithm operation rules. Ln (n) is a constant, so it is equivalent to taking the minimum value of k/ln(k). This is a very simple question for those of you who have just learned advanced mathematics in your first year, so let’s go straight to the results

When k is equal to e, k/ln of k is the minimum.

The value of the natural number e is about 2.718, and you can see that this is basically the optimal solution of the binary tree. Do the following in the Nodejs REPL

function foo(k) {return k/Math.log(k); }> foo(2)2.8853900817779268> foo(3)2.730717679880512> foo(4)2.8853900817779268> foo(5)3.1066746727980594Copy the code

It looks like k = 3 is smaller than k = 2, which means that a trinomial search tree should be better than a binomial search tree, but why are binomial trees more popular? Here’s what I found on the almighty StackOverflow:

Cpus can now be optimized for binary logic code, which is split into multiple binary logic.

And that’s kind of why binary trees are so popular, because with one comparison, we can at most cut the size of the problem in half. Okay, this is a bit of a digression, so let’s go back to the red and black tree.

Red black tree property

Red-black tree example

One point to note:

Leaf nodes are NIL nodes in the figure above, which is not found in some textbooks in China, and we sometimes omit these NIL nodes when we draw the diagram, but we need to be clear that when we say leaf nodes, we are referring to these NIL nodes.

Red-black trees are guaranteed to be balanced by following five rules:

  1. The nodes of the tree are only red and black

  2. The root node is black

  3. The leaf nodes are black

  4. The byte point of the red node must be black

  5. There are the same number of black nodes in the path from any node to subsequent leaf nodes

If the above five conditions are satisfied, it is guaranteed that the longest path from the root node to the leaf node is no more than twice the shortest path from the root node to the leaf node. In fact, this is easy to understand, mainly using properties 4 and 5, here is a brief description:

Assume that the root node to the leaf nodes of the shortest path, black node number for B, then according to the properties of 5, the root node to leaf node in the longest path, black node number is B, the longest is every two middle of black nodes, there is a red node (that is, in the case of black and red), so the red node for a maximum of 1 B -. So that would prove the above conclusion.

Red-black tree operation

Red-black tree rotation example (NIL nodes not drawn)

About the red and black tree insert, delete, left rotation, right rotation these operations, I think it is best to achieve visualization, text expression is more complicated, I will not present the ugly here, can find more online, like v_July_v “teach you a thorough understanding of the red and black tree”. Here I recommend a SWF teaching video (the video is in English, don’t be afraid, the key is to look at the picture??) , about 7 minutes, you can refer to. There’s also a visualized interactive red-black tree, so you can go up and do it yourself, insert a few nodes, delete a few nodes, and see how it works with left and right.

Source analysis

Because I’m not going to talk about red-black tree operations here, there’s basically no source code here, because the most important algorithms are From CLR, and CLR is Cormen, Leiserson, Rivest, who wrote the introduction to algorithms, In other words, TreeMap algorithms are pseudo-code based on the introduction to algorithms. Because a red-black tree is a balanced binary search tree, the time complexity of put (including update operations), GET, and remove is log(n).

conclusion

So far, TreeMap and HashMap implementations have been introduced. As you can see, the different implementations of TreeMap and HashMap determine the different application scenarios.

TreeMap keys are ordered, and the time complexity of add, delete, change and query operations is O(log(n)). To ensure the balance of red-black trees, rotation is necessary

The keys of a HashMap are unordered, and the time complexity of add, delete, change, and check operations is O(1). In order to achieve dynamic expansion, resize will be performed if necessary.