thinking

There are many kinds of data structures of database index, such as hash index, balanced binary tree index, B tree index, B+ tree index and so on.

The most popular index is the B+ tree index. Have you ever wondered why the B+ tree index is the most popular index and why other indexes are not widely used?

It’s like, why is it that other people can make 20,000 to 30,000 dollars, but I can only make 10,000 dollars? Have you thought about that?

The hash index

Hash is a very familiar technique used in the old HashMap. Hash indexes are highly efficient and can be located at one time.

If Hash indexes are efficient, why use Hash indexes instead of b-tree indexes?

Everything has two sides, and so does Hash index. Although Hash index is efficient, it also brings many limitations and disadvantages due to its particularity, mainly including the following:

Reason 1:

Hash indexes cannot be queried using ranges

Hash indexes can only be used for “=”,”IN”, and “<=>” queries (note that <> and <=> are different operations), not for range queries such as WHERE price > 100.

Since a Hash index compares Hash values after a Hash operation, it can only be used for equivalence filtering, not for range-based filtering.

Reason two:

Hash indexes cannot be queried using partial index keys.

For a compound index, the Hash index calculates the Hash value by combining the combined index keys instead of calculating the Hash value separately.

Therefore, a Hash index cannot be used when a query is performed by the first key or keys of a composite index.

Three reasons:

Hash indexes cannot avoid table scans at any time.

A Hash index is a Hash table that stores the Hash value of the Hash result and the corresponding row pointer after the index key is Hash.

Because different index keys have the same Hash value, the query cannot be performed directly from the Hash index. Instead, the actual data in the table must be accessed for comparison and corresponding results can be obtained.

Hash index out out

Balanced binary tree index


Also known as AVL tree. In addition to the basic features of binary search tree, it also has a very important feature: its left and right subtrees are balanced binary trees.

And the absolute value (equilibrium factor) of the depth difference between the left and right subtrees does not exceed 1. That is, the balance factors for each node of the AVL tree can only be -1, 0, and 1 (left subtree height minus right subtree height).

Reasons for being eliminated

  • The tree height is too high. The higher the height is, the slower the search speed is

  • He supports range lookup, but he needs to do roundabout lookup again

Let’s say I want to find something greater than 5

In the first step, I first locate 5, and then search for other data greater than 5 in the tree according to binary tree rules. 6, 7, 8, 9, 10…

If there’s a lot of data greater than 5, it’s going to be slow.

B-tree indexes


As you can see, the biggest difference between a B tree and a binary tree is that it can store two values per node, which means that if the height of the tree is lower than the height of the binary tree, it can query faster. That’s his strength

So why didn’t he use it in the end, or because he had a roundabout query problem when he was looking in the range. Similarly, ordering by is inefficient, because the data in the tree has to be sorted manually.

Ultimate boss: B+ trees


It is an upgraded version of B number. Compared with B tree, the relationship between leaf nodes and non-leaf nodes is added in B+ tree.

Leaf nodes contain key and value. Key stores numbers ranging from 1 to 10, and value stores data storage addresses. Non-leaf nodes contain only key but not value.

All adjacent leaf nodes include non-leaf nodes, which are combined by linked list and sorted in a certain order, thus the range query efficiency is very high.

Let’s say we want to findMore than 5Data:

  • First of all, let’s go to position 5

  • And I’m just going to take everything after 5, because this is an ordered list, it’s already sorted

That’s why we use indexes when we sort order by.

If you don’t understand, please leave a message to me

Follow wechat public account: IT elder brother

Java actual combat project video tutorial: you can get 200G, 27 sets of actual combat project video tutorial

Reply: Java learning route, you can get the latest and most complete a learning roadmap

Re: Java ebooks, get 13 must-read books for top programmers

Java foundation, Java Web, JavaEE all tutorials, including Spring Boot, etc

Reply: Resume template, you can get 100 beautiful resumes