You probably know that MySQL does CRUD in memory, i.e. in a Buffer Pool. Then you also know that when there is no data in memory that MySQL needs, MySQL will read data into memory from Disk via IO operations. The units of reading are: data pages

The typical data page length looks like this

That’s right, real data is stored in the data pages, and the data pages are organized in memory in a two-way table! The following figure

In B+Tree, it is required that the primary key index is incremented, that is, if the primary key index is incremented, then all data in the right data page must be larger than the data in the left data page. But obviously the image above does not fit, so you need to split the page to adjust it to look like this.

[查 看 全 文] xmind: xmind docs.qq.com/doc/DVkNMeV

Now, if you recall, you must have heard before: MySQL’s B+Tree clustered index, only leaf nodes store real data, not non-leaf nodes store index data, and leaf nodes are connected by a two-way linked list

Yes, all the leaf nodes of B+Tree are the data pages in the figure above, and they are indeed linked by a two-way list!

Moving on, what happens if we retrieve the row with ID =7 if we just look at the bidirectional list linked by the data pages above?

Obviously we have to scan from scratch!

B+Tree requires primary keys to be incremented. And there is a page splitting mechanism to ensure that all data in the data page on the right is larger than the index value of the data page on the left. Why not do binary search?

A: Yes, it is possible to do binary lookup within a single data page, but the organization of the data pages is a linked list, so you can’t avoid traversing from scratch.

What about MySQL?

MySQL abstracts an index directory from a number of data pages

That makes it much easier to search through pages of data with this index! Directly have the ability of binary search!

And the table of contents actually exists in the data page, unlike the leaf node, which stores only the index information, whereas the leaf node stores the real data, right?

The birth of the index page means that the prototype of B+Tree has been born!

As the user continues to select, the number of pages in the buffer pool increases and the number of pages in the index increases. When the existing index size exceeds 16KB (the size of a data page), a new index page has to be created to store the new index information. Then the B+Tree slowly gets fatter and fatter.

You know, B+Tree is a variation of B trees, and B trees can be 2-3 trees, 2-3-4 numbers…. General term for m-order tree, when each node can store the maximum number of elements, the tree will grow tall.

It looks like this: