I don’t know since when, there is a MySQL single table data volume more than 20 million sharp decline in performance.

There is a saying circulating in China’s Internet technology circle that the performance of MySQL will be significantly reduced if the amount of data in a single table exceeds 20 million rows. In fact, the rumor is said to have originated at Baidu. When a DBA tested MySQL performance, he found that the performance of SQL operations deteriorates dramatically when the number of rows in a table reaches 20 million. Then it was said that Engineers from Baidu flowed to other companies in the industry, bringing this information with them, so a saying was spread in the industry.

Later, Alibaba’s Java Development Manual suggested that the number of rows in a single table should exceed 5 million or the capacity of a single table should exceed 2GB, and then it was recommended to carry out library sub-table. In this regard, there is ali’s golden iron law support, so, a lot of people design big data storage, will take this as the standard, the table operation.

Is there any theoretical basis to support this statement? First of all, this performance problem can be said to be composed of many factors, such as hardware and MYSQL itself parameter Settings and so on. Today we can talk about this in terms of the index structure of MySQL.

For those of you who have a little MYSQL background, you know that MYSQL uses B+ trees as its index structure.

A question?

How many rows can InnoDB store in a B+ tree? The short answer to that question is about 20 million. Why so many? Since this can be calculated, let’s start with the InnoDB index data structure and how the data is organized.

We all know that computers have a minimum unit of storage when they store data, just as the minimum unit of cash we use today is a dime. In computer, the minimum unit of disk storage data is sector, the size of a sector is 512 bytes, and file system (such as XFS/EXT4) his minimum unit is block, the size of a block is 4K, and for our InnoDB storage engine also has its own minimum storage unit — Page (Page), the size of a Page is 16K.

Here are a few illustrations to help you understand the minimum storage unit:

A file in the file system is only 1 byte in size, but it has to take up 4KB of disk space.

All innoDB data files (ibD files) are always multiples of 16384 (16K).

Disk sectors, file systems and InnoDB storage engines all have their own minimal storage units.

In MySQL our InnoDB page size is 16K by default, which can also be set by parameter

The data in a data table is stored in pages, so how many rows of data can be stored in a page? Assuming the size of a row of data is 1K, a page can hold 16 rows of such data.

If the database was stored only this way, finding the data would be a problem because we wouldn’t know which page the data was in, and it would be too slow to go through all the pages. So they came up with a way to organize the data in a B+ tree. As shown in the figure:

We first will be ordered by the data recorded by the primary key, respectively in different pages, in order to facilitate understanding we have here only 3 records are stored in a page, the actual situation can hold a lot of), in addition to storing data page, and store keys + pointer page, as shown in figure in page number = 3 page, the page to store the key value and a pointer to the data page, Such a page consists of N keys + Pointers. And of course it’s sorted. This form of data organization is called an index organization table. Now, if I want to find a piece of data, how do I look it up?

Select * from user where id=5;

Where ID is the primary key, and we’re going to go through this B+ tree, and we’re going to go to the root page, so how do you know where the root page of the user table is? Select * from table where id=5; select * from table where id=5; select * from table where id=5; select * from table where id=5; Similarly, we can find records with id=5 by binary query method:

Now that we know how InnoDB’s primary key index B+ tree organizes and queries data, let’s summarize:

1, InnoDB storage engine minimum storage unit is page, page can be used to store data can also be used to store key value + pointer, in B+ tree leaf node store data, non-leaf node store key value + pointer.

2. The index organization table determines which page the data is in through the binary search method of non-leaf nodes and Pointers, and then finds the required data in the data page;

So back to the question we started with, how many rows of data can a typical B+ tree hold?

Here, we first assume that the height of B+ tree is 2, that is, there is a root node and several leaf nodes, then the total number of records stored in this B+ tree is: the number of pointer of root node * the number of record rows of a single leaf node.

We have stated above that the number of records in a single leaf node (page) =16K/1K=16. (This assumes that the data size of a row of records is 1K, in fact, many Internet business data record size is usually around 1K).

So now we need to calculate how much the leaf node can store a pointer, in fact it is very good also, we assume that the primary key ID bigint type, length is 8 bytes, and size of the pointer is set to 6 bytes in the InnoDB source, so a total of 14 bytes, we how many such units, a page that can be stored actually means how many Pointers, That is 16384/14 = 1170. So we can figure out that a B+ tree of height 2 can hold 1,170 *16= 18,720 of these records.

Using the same principle we can calculate that a B+ tree of height 3 can hold: 1170117016= 21902,400 such records. Therefore, in InnoDB, the B+ tree height is generally 1-3 layers, which can satisfy tens of millions of data storage. A page lookup represents an IO when searching for data, so it usually takes only 1-3 IO operations to find data through primary key index queries.

How to get the height of InnoDB primary key index B+ tree?

Above we inferred that the height of B+ trees is usually 1-3, and now we can prove this conclusion from another side. InnoDB’s tablespace file uses page number 3 to represent the root page of the primary key index, and the page level of the B+ tree is stored at the root page offset of 64. If page level is 1, tree height is 2, page level is 2, then tree height is 3. Page level = 1; Now we’ll try to find this page level from the real world.

Before you do this, you can check the InnoDB metadata table that the page number on the root page of the primary key index is 3. You can also check this from the InnoDB Storage Engine book.

Execution Result:

It can be seen that the page number of the root page of the primary key index of customer table and LineItem table under database DBT3 is 3, while the page number of other secondary indexes is 4. The differences between secondary indexes and primary indexes are described in the MySQL related books.

SQL > create tablespace tablespace file;

Since the root page of the primary key index B+ tree starts at page 3 in the entire tablespace file, you can calculate its offset in the file: 16384*3=49152 (16384 is the page size).

In addition, according to InnoDB Storage Engine, the page level value is saved in the first 2 bytes of the root page offset 64 position, so we want the page level value in the entire file offset: 16384*3+64=49152+64=49216, in the first 2 bytes.

Next we use the hexdump tool to look at the data at the specified offset of the tablespace file:

Linetem table page level = 2, B+ tree height = page level+1=3;

Page level of region table is 0, B+ tree height is page level+1=1;

Customer table page level = 2, B+ tree height = page level =3;

The data volume of these three tables is as follows:

Conclusion:

The lineItem table has more than 6 million rows and a B+ tree height of 3, while the Customer table has only 150,000 rows and a B+ tree height of 3. It can be seen that although the data amount is quite different, the height of the two table trees is 3. In other words, the query efficiency of the two tables by index is not much different, because both only need to do three IO times. So if a table has 10 million rows, its B+ tree height will still be 3, and the query efficiency will still be the same.

The region table has only 5 rows of data, and of course its B+ tree height is 1.

Finally, review one interview question

Why do you use B+ trees for MySQL indexes instead of other trees? B tree?

For a more complex version of this problem, see this article;

His simple answer:

As a result, the number of Pointers that can be stored on the non-leaf nodes decreases (some data is also called fan out). If the number of Pointers is small, the height of the tree can only be increased to save a large amount of data, leading to more IO operations and low query performance.

summary

Starting from a problem, this paper introduces the principle and query mode of InnoDB index organization table step by step, and combined with the existing knowledge, answer this question, combined with practice to prove. Of course, in order to make the statement simple and easy to understand, some details are ignored in this paper. For example, it is impossible to store all the space in a page for data, and it also stores some small number of other fields, such as page level, index number, and so on. In addition, the fill factor of the page also causes that a page cannot be used to store all the data. For secondary index data access, you can refer to the MySQL related books. The main point is to combine primary key indexes for back table queries.

More exciting content, please pay attention to my public number “programmer Cow”, my personal blog site: www.kuya123.com

Annotation: this article content comes from the Internet, if there is infringement, please contact webmaster to delete