There is no point B tree in the heart, how can we understand the underlying principle of database index?

Binary Search Trees

A binary tree is a tree structure with at most two subtrees per node. Subtrees are usually called “Left subtrees” and “Right subtrees”. Binary trees are often used to implement binary search trees and binary heaps.

Binary trees have the following features:

Each node contains one element and n subtrees, where 0≤n≤2.
The left and right subtrees are ordered and cannot be reversed arbitrarily. The value of the left subtree is less than the parent, and the value of the right subtree is greater than the parent.

The concept is a bit boring. Suppose we now have a set of numbers [35 27 48 12 29 38 55] that are inserted into the structure of a number in sequence as follows:

Ok, so that’s a binary tree! We can see that, after a series of insertion operations, an unordered set of numbers has become an ordered structure, and the tree satisfies the properties of the two binary trees mentioned above!

But what if we insert the same set of numbers in ascending order, that is, in the order [12, 27, 29, 35, 38, 48, 55]?

Since the insertion is in ascending order, the newly inserted data is always larger than the existing node data, so each time the insertion is to the right of the node, resulting in the tree is severely biased!

In the worst-case scenario, a tree is reduced to a linear linked list, so the search efficiency is naturally low, and the advantages of the tree are not fully exploited.

In order to give greater play to the binary tree search efficiency, so that binary tree is no longer partial, maintain the balance of each branch, so there is a balance binary tree!

Balanced binary Trees (AVL Trees)

Balanced binary tree is a special binary tree, so it also meets the two properties mentioned above, and there is another property: the absolute value of the height difference between the left and right subtrees is not more than 1, and the left and right subtrees are a balanced binary tree.

[35 27 48 12 29 38 55] is already a balanced binary tree.

What if I insert a balanced binary tree in the order [12, 27, 29, 35, 38, 48, 55]?

Let’s look at insertion and balancing:

This tree always satisfies several properties of a balanced binary tree and remains balanced! So our tree doesn’t degenerate into a linear list!

When we need to find a number, we can follow the root of the tree all the way down to find, so the search efficiency and dichotomy search is the same!

** How many nodes can a balanced binary tree contain? ** This depends on the height of the tree. If the height of the tree is H, the maximum number of nodes in each layer is 2^(n-1), and the maximum number of nodes in the whole tree is 2^0+2^1+2^2+… H + 2 ^ (1).

By this calculation, the height of the 100W data tree is about 20, which means that in the worst case, it takes 20 lookups to find a single data in a balanced binary tree with 100W data.

If it is memory operation, efficiency is also very high! But the data in our database is basically stored on disk, so every node in the binary tree is read once on disk I/O, so if we find a piece of data, what if it takes 20 DISK I/O?

Then performance becomes a big problem! Can we compress the tree so that there are more nodes in each layer? Although I am short, I am fat…

B-Tree

B-Tree B-Tree B-Tree B-Tree

What are the features of b-tree? An M-level B-tree has the following characteristics:

Each node has at most m children.
In addition to root and leaf nodes, each node has at least m/2 (rounded up) children.
If the root is not a leaf, then the root has at least two children.
All the leaf nodes are in the same layer.
Each node contains k elements (keywords), where m/2 is less than or equal to k.
The elements (keywords) in each node are arranged from smallest to largest.
The value of the left node of each element (keyword) is less than or equal to that element (keyword). The value of the right node is greater than or equal to the element (keyword).

Is not the feeling like the mother-in-law opening mouth to ask you to betrothal gift, list a bunch of conditions, and each one makes you very meng force!

Insert a [0,1,2,3,4,5,6,7] array into a b-tree of order 3.

So, are you clear on the b-Tree features? In a binary tree, each node has only one element.

In a B-tree, however, each node may contain multiple elements, and non-leaf nodes have Pointers to their children to the left and right of the element.

If you need to find an element, what does the process look like? If we want to find the keyword 24 in the following b-tree, the process is as follows:

From this process we can see that the query efficiency of B-tree does not seem to be higher than that of balanced binary Tree. However, the query passes through a much smaller number of nodes, which means fewer disk I/OS, which is a significant performance improvement.

From the previous b-tree diagram, we can see that the elements are values like 1, 2, and 3.

However, the data in a database are all strips of data. If a database stores data in a B-tree data structure, how is the data stored?

Let’s look at the next picture:

In a normal B-tree, the elements are numbers. But in the figure above, we split the element into key-data, where key is the primary key of the data and data is the specific data.

So when we’re looking for a number, we can just go down the root, because it’s more efficient.

B+Tree

B+Tree is an optimization based on B-Tree, making it more suitable for implementing external storage index structure.

The structure of B+Tree is similar to that of B-tree, but it has several features of its own:

All non-leaf nodes store only keyword information.
All satellite data (concrete data) is stored in leaf nodes.
All leaf nodes contain information about all elements.
There is a chain pointer between all leaf nodes.

If the above b-tree graph becomes B+Tree, it should look like this:

If you look closely at the b-tree diagram, what’s the difference?

Non-leaf nodes already have only Key information, satisfying the first property above!
All leaf nodes have a Data area below them, which satisfies property # 2 above!
Data of non-leaf nodes can be found on leaf nodes, such as elements 4 and 8 of the root node can also be found on the bottom leaf node, which meets the property of point 3 above!
Note the arrows between the leaf nodes in the figure, which meet property # 4 above!

B-tree or B + Tree?

Before we talk about the choice between the two data structures in the database, we need to understand that the operating system reads data from disk to memory in the basic unit of disk blocks, the data in the same disk Block is read at once, rather than what is needed.

Even if only one byte is required, the disk reads a certain length of data backwards from this location into memory.

The theory behind this is the famous principle of locality in computer science: when a piece of data is used, nearby data is usually used immediately.

The prefetch length is an integer multiple of a Page. A page is a logical block of computer-managed memory. Hardware and operating systems often divide the main memory and disk storage areas into contiguous, equal-sized blocks, each of which is called a page (in many operating systems, pages are typically 4K in size).

B+Tree and B+Tree What are the pros and cons?

(1) A b-tree is a non-leaf node that stores specific data, so if you find a certain keyword, you can return it.

B+Tree has all the data in the leaf node, and every time we look up the leaf node, we get the leaf node. Therefore, among B-trees and B+ trees of the same height, B-tree searches for a keyword more efficiently.

② Since all the data of B+Tree are in leaf nodes and there are pointer connections between nodes, B+Tree only needs to find the keyword and traverse along the linked list when looking for data greater than or less than a certain keyword, and B-tree also needs to traverse the root node of the keyword to search.

(3) Each node of a B-tree stores primary key + actual data, while a non-leaf node of a B+Tree only stores keyword information. The size of each page is limited, so the same page can store less data of a B-tree than that of a B+Tree.

In this way, the b-tree depth increases for the same amount of data, which increases the disk I/O count and affects the query efficiency.

In view of the above comparison, so in the common relational database, is the choice of B+Tree data structure to store data!

Here we take MySQL InnoDB storage engine as an example to explain other principles similar to SQL Server, Oracle!

InnoDB engine data store

In the InnoDB storage engine, there is also the concept of pages, the default size of each page is 16K, that is, every time data is read is 4*4K size!

Suppose we now have a user table and we write data into it:

One thing to note here is that when a new row is inserted into a page, it is usually inserted after the current row or in the space left over from deleted rows to reduce data movement, so the data on a page is not completely ordered (more on page structure below).

But for the order of data access, in each record there is a pointer to the next record, so as to form a one-way ordered linked list, but here for the convenience of demonstration I am arranged in order!

Since the data is small enough to fit on one page, there is only one root, and the primary key and data are stored at the root (the number on the left represents the primary key, and the name and gender on the right represent the specific data).

Let’s say after we write 10 pieces of data, Page1 is full, what happens when we write new data?

Let’s move on to the picture below:

There is a friend called “Qin Shousheng” came, but Page1 has not put the data, at this time you need to split the Page, produce a new Page.

What is the flow like in InnoDB?

Create a new Page2, and then copy the contents of Page1 to Page2.
Generate a new Page3, “Qin Shousheng” data into Page3.
The original Page1 is still the root, but has become a page that does not store data, only stores the index, and has two children Page2, Page3.

There are two things to note here:

Why copy Page1 to Page2 instead of creating a new page as the root?

If you are recreating the root, the physical address stored in that root may change frequently, making it difficult to find.

And in InnoDB, the root node is preread into memory, so the physical address of the node is fixed better!

② There are 10 data in Page1, and then there are 11 data in Page1, and then there are 10 data in Page1, and then there are 11 data in Page1, and then there are 11 data in Page1.

Data on primary keys 1-5 will remain on the original page, data on primary keys 6-11 will be placed on the new page, and primary key 6 will be stored at the root.

If so, the new page space utilization is only 50% and results in more frequent page splitting.

So InnoDB optimizes this by putting new data into a newly created page without moving any records from the original page.

As data continues to be written, the tree grows and grows, as shown below:

Each time you add data, you fill up a page, and then create a new page to continue writing, there is actually an implicit condition, that is, the primary key increment!

Primary key autoincrement write when the newly inserted data will not affect the original page, insert high efficiency! And page utilization is high!

But if the primary key is unordered or random, then each insert may cause the original page to be frequently split, affecting the efficiency of the insert! Reduce page utilization! This is why primary key increment is recommended in InnoDB!

All the non-leaf nodes in this tree are primary keys, so what happens if a table has no primary keys? In InnoDB, if a table does not have a primary key, the default is to look for columns with a unique index. If there is no primary key, an invisible field is generated as the primary key!

If the user table is frequently inserted and deleted, it will lead to fragmentation of the data page, the space utilization of the page is low, but also lead to the tree change of “virtual high”, reduce the efficiency of the query! This can improve query efficiency by eliminating fragmentation through index reconstruction!

InnoDB engine data lookup

How do I find data when it’s inserted?

** Find the page where the data is located. The search process is the same as the search process for B+Tree. The search process is the same as the search process for B+Tree.
** Find specific data in the page. ** Read the leaf node data found in step 1 into memory, and then find the specific data through the method of block search.

This is the same as finding a Chinese character in the Xinhua dictionary. First, locate the page where the Pinyin of the Chinese character is located through the index of the dictionary, and then go to the specified page to find the specific Chinese character.

What strategy does InnoDB use to quickly find a primary key after locating a page? We need to start with the page structure.

The blue area on the left is called Page Directory. This area consists of multiple slots. It is a sparse index structure.

The data in the slot is orderly stored, so when we are looking for a piece of data, we can first find a general position in the slot through dichotomy.

The area on the right is the data area, and each data page contains multiple rows of data. Notice the two special row records at the top and bottom of the figure, Infimum and Supremum, which are two virtual row records.

The next Infimum record pointer points to Supremum when there is no other user data.

When user data is present, the next record of Infimum points to the smallest user record on the current page, and the next record of the largest user record on the current page points to Supremum, so that all row records on the entire page form a one-way linked list.

Row records are logically divided into blocks by Page Directory. The blocks are ordered, which means that the primary key of the largest row in the block pointed to by slot 4 is smaller than the primary key of the smallest row in the block pointed to by slot 8. But the row records inside the block are not necessarily ordered.

Each row has an n_owned field (shown in pink), which identifies how many pieces of data are in the block.

The n_owned value of the pseudo record Infimum is always 1, the n_owned value of the Supremum record is 1,8, and the n_owned value of other users is 4,8.

And only the largest record in each block has a value of N_owned, while all other user records have a value of n_owned.

So when we want to find the record with primary key 6, we first use dichotomy to find the corresponding slot in the sparse index, that is, the slot “8” in the Page Directory.

The slot “8” points to the largest record in the data block, and the data is a one-way linked list structure, so it cannot be searched backwards.

Therefore, it is necessary to find the previous slot, namely “4”, and then find the target record along the linked list through the pointer of the largest user record in “4”.

Clustered index & non-clustered index

If the user table above needs to create a non-clustered index with “user name”, how does this implement?

Take a look at the picture below:

The storage structure of the non-clustered index is the same as before, but the difference is that the data part of the leaf node is no longer the specific data, but the Key of the data clustered index.

Therefore, the process of searching through non-clustered index is to find the clustered index Key corresponding to the index Key, and then take the clustered index Key to the primary Key index tree to find the corresponding data, this process is called back table!

PS: these names are from the Internet, I hope you are not accidentally reading this article ~^_^

InnoDB vs. MyISAM engine

InnoDB engine is used as an example for storage and search. What is the difference between MyISAM and InnoDB in storage? Hold back, look at the picture:

The above image shows the storage structure of MyISAM primary key index. The differences we can see are:

The data area of the leaf node of the primary key index tree does not store the actual data, but the address of the data record.
Data is stored not in primary key order, but in write order.

That is, InnoDB data is physically stored in primary key order, while MyISAM data is physically stored in insert order.

MyISAM leaves do not store data, so the storage structure of the non-clustered index is similar to that of the clustered index. When using the non-clustered index to search for data, the non-clustered index tree can directly find the address of the data, without the need to return to the table, which is more efficient than InnoDB search!

Index optimization recommendations

You will often see many articles or books on the use of some index recommendations, such as:

A fuzzy query starting with % for like will invalidate the index.
The maximum number of indexes for a table is 5.
Use overwrite indexes whenever possible.
Try not to index columns with a lot of duplicate data.
.

Many here is not a list! After reading this article, can we analyze why these suggestions are made with doubt?

Why does a fuzzy query starting with % like invalidate the index? Why should there be no more than 5 indexes in a table?

Why is that? Why? Why?? Believe that you see here plus some of their own thinking should have the answer, right?