Q: How many levels does an InnoDB B+ tree index have in a real production environment? How many rows can I store?

About this question seems to have been seen frequently on the cow guest recently, the feeling has no meaning, probably mainly inspects is the understanding to the B+ index. Answer first:

A: It is usually two or three layers and can store about 20 million lines of data.

As mentioned earlier, pages are the smallest unit of InnoDB disk management. In the InnoDB storage engine, the default size of each page is 16KB. The pages are rows of records.

Assuming a row of data is 1K in size, a page can hold 16 rows of such data.

As we all know, the leaf node of B+ tree stores the real record, while the non-leaf node exists to find the leaf node where the corresponding record resides more quickly. Therefore, it can be simply understood as the non-leaf node stores the key value + pointer. It’s not exactly the offset of the page, but the pointer is better understood

By organizing tables by index, rows of data are stored in separate pages. As shown below:

Let’s say we want to find a row with a primary key of 20 from this B+ treeselect * from table where id = 20;

First, find the root node of the B+ tree, i.e. the page page_number = 10 of the stored non-leaf node. On this page, use binary search method and pointer to locate the row where id = 20 exists on the page page_number = 12. Then, again, binary search on this page can quickly locate the row with id = 20.

The page is the smallest unit of InnoDB disk management and can be used to store not only specific rows of data, but also keys and Pointers.

Back to the topic, let’s start with a simple one. Suppose that B+ tree has only two layers, namely a root node and several leaf nodes, as shown in the figure below:

How many rows of data can be stored in a B+ tree? How many rows of data can be stored in a non-leaf node of a B+ tree?

  • Number of Pointers to the root node * Number of rows per leaf node

Each leaf node to store the number of rows is deposited on the number of records per page, because the number of fields in the data table is different, here we don’t dig into leaf nodes of the storage structure, simple in accordance with the size of one row of data for 1 k to calculate (in fact, now many Internet business data record size is usually around 1 k), A page or leaf node can hold 16 rows of such data.

So how much data can the root (non-leaf) of a B+ number store?

We assume that the primary key is BigInt and the length is 8 bytes, and the pointer size is set to 6 bytes in InnoDB, making a total of 14 bytes.

For the sake of writing, we call a primary key + a pointer a cell, so that a page or a non-leaf node can hold 16384/14 =1170 such cells.

In other words, 1170 Pointers can be stored in a non-leaf node, which corresponds to 1170 leaf nodes. Therefore, for a B+ tree of height 2, 1170 (Pointers in a non-leaf node) * 16 (rows in a leaf node) = 18720 rows of data.

Of course, this analysis is not very precise, according to the definition of “MySQL Technology Insider: InnoDB Storage Engine”, InnoDB data page structure contains the following parts:

If you want to dig deeper, you can go to chapter 4.4 of the book, which I won’t analyze here.

OK, after analyzing B+ trees of height 2, let’s do the same for trees of height 3:

The root page (page10) can hold 1170 Pointers, and the second level of each page (page:11,12,13) can hold 1170 Pointers. Thus, a total of 1170 * 1170 Pointers can be stored, that is, there are 1170 * 1170 non-leaf nodes, so a total of 1170 * 1170 * 16 = 21902400 lines of records can be stored.

| flying veal 🎉 pay close attention to the public, get updates immediately

  • I am a postgraduate student in Southeast University and a summer intern in Java background development of Ctrip. I run a public account “Flying Veal” in my spare time, which was opened on 2020/12/29. Focus on sharing computer fundamentals (data structure + algorithm + computer network + database + operating system + Linux), Java technology stack and other related original technology good articles. The purpose of this public account is to let you can quickly grasp the key knowledge, targeted. Pay attention to the public number for the first time to get the article update, we progress together on the way to growth

  • And recommend personal maintenance of open source tutorial project: CS-Wiki (Gitee recommended project, has accumulated 1.8K + STAR), is committed to creating a perfect back-end knowledge system, in the road of technology to avoid detours, welcome friends to come to exchange learning ~ 😊

  • If you don’t have any outstanding projects, you can refer to the Gitee official recommended project of “Open Source Community System Echo” written by me, which has accumulated 1.1K + STAR so far. SpringBoot + MyBatis + Redis + Kafka + Elasticsearch + Spring Security +… And provide detailed development documents and supporting tutorials. Echo Echo Echo Echo Echo Echo Echo Echo Echo Echo Echo Echo Echo Echo