This is probably the most basic part of the MySQL principle, what data we have in MySQL looks like on disk. Isn’t all the data stored in clustered indexes, you might say? Unfortunately, you didn’t answer my question. I’ll ask you again, what does the cluster index look like on disk?

Just like Redis’ RDB file, which ends up on disk as a real dump. RDB file, MySQL is a bit of a puzzle, we only know how to get data through SQL, we don’t know how the data is stored in the end. Of course, understanding the underlying storage logic is not just about satisfying curiosity.

Its underlying storage mode will affect the storage of Data in the clustered index, and thus affect the performance of MySQL’s Data Manipulation Language (DML). Therefore, a clear understanding of the underlying storage logic can help users in some scenarios with extreme pursuit of performance. Help us optimize MySQL.

What the table looks like on disk

First of all, let’s not talk about technical terms like table space. Let’s build a table and look at the structure of the disk. First you need to find the MySQL data directory. If you are starting with Docker, this directory will look like this:

/data00/docker/volumes/ef876f70d5f5c95325c2a79689db79cc4d1cecb7d96e98901256bd49ca359287/_data

Create a new DB named test, and then add a new directory named test under _data. Create a new student table in the test directory and create two files in the test directory: student.frm and student.ibd.

As you can see, the macro representation of the final data on disk is actually very simple, just a few files, what index ah, page ah are ignored first.

What is in a.frm file? It contains metadata for each table, including the structure definition of the table. The.ibd file holds the table’s data and index.

I saw someone write.ibd as.idb… Although DB looks better, unfortunately it is not correct. If you remember the full name of IBD innoDB Data, you will not get the abbreviation wrong.

The ibD file named after the table has a technical term called a table space.

As the name implies can be understood as my table exclusive space

Understanding table space

If I had told you right up front that InnoDB has a concept called a table space, you probably wouldn’t understand it.

A table with its own data store file is called an exclusive tablespace. InnoDB also has its own system table space, in which all table data is stored in the same file.

When is data stored in the system tablespace and when is data stored in the private tablespace?

This can be determined by the MySQL configuration item innodb_file_per_table. When this configuration item is enabled, each table has its own tablespace. Instead, when this configuration item is turned off, the table data is stored in the system table space.

This configuration item is enabled by default. You can check the status of this variable in MySQL by running the SHOW VARIABLES LIKE ‘innodb_file_per_table’ command

Since MySQL uses exclusive tablespaces as the default setting, you should know that exclusive tablespaces perform better than system tablespaces.

For a system tablespace, there is usually only one file and all table data is stored in this file. Therefore, if the TRUNCATE operation is performed on a table, data scattered in various places in the file must be deleted. The TRUNCATE operation creates a lot of free fragmentation space in the file and does not reduce the size of the shared tablespace file IBDATA1.

If you don’t understand, you can imagine the mark-clean garbage collection algorithm in Java, which can cause a large amount of memory fragmentation during cleanup, which is not conducive to improving memory utilization later on

In the case of an exclusive table space, where the entire table’s data is stored in a single file from start to finish, it is clear who is easier to clean up and free up disk space than a shared table space. Therefore, TRUNCATE performance is better for exclusive table Spaces.

In addition, an exclusive table space can increase the maximum size limit of a single table. This may not be easy to understand. Why does an exclusive table space have this effect? You just need to remember this conclusion here, and we’ll prove it later when we get to pages.

Now that we understand the concept of a table space, we can move on to understanding how data is actually stored in a table space.

Go deep inside a tablespace file

In fact, I mentioned a long time ago when I talked about InnoDB’s memory architecture, that in InnoDB, the page is the smallest unit of data management. So we should start with the smallest part, but we’ve already written an article about pages, so WE won’t go into it here.

The tablespace consists of a stack of Pages **, each of which is of the same size. The default page size is 16K, although this size can be adjusted.

The page size can be adjusted by innodb_page_size based on actual service conditions. The page size can be 4K, 8K, 16K, 32K, or 64K

A bunch of pages put together are called Extents.

The size of each zone is fixed. When innodb_page_size is set to different Extents, the number of pages contained in each Extents and the size of the corresponding Extents varies, as shown in the figure below.

When innodb_page_size is 4K, 8K or 16K, its corresponding Extents size is 1M; When the page size is 32K, the area size is 2M. When the page size is 64K, the extent size is 4M.

MySQL 5.6 only supports 4K, 8K, and 16K. 32K and 64K were added after MySQL 5.7.6.

As the size of pages and extents changes, so does the number of pages that can be held in each extents. For example, when innodb_page_size is 16K, each section contains 64 pages; When it is 8K, each area contains 128 pages; When it is 4K, each zone contains 256 pages.

Pages of data comprise Segments, and Segments comprise Segments.

inlogicInnoDB’s table Spaces are made up of such segments. As the volume of data continues to grow and new space needs to be requested, InnoDB will first request 32 pages and then allocate an entire Extents. Even within a large Segment, four extents are allocated at once.

By default InnoDB allocates two segments to each index. One is used to store non-leaf nodes in the index, and the other is used to store leaf nodes.

Table space classification

There are two types of tablespaces: system tablespaces and private tablespaces. The next step is to look at the details of each tablespace classification.

System table space

When innodb_file_per_table is enabled (it is enabled by default), the system tablespace is only used to store Change Buffer data. When we turn it off, the system table space stores the data related to the table and index. Of course, prior to MySQL 8.0, the exclusive table space also contained Double Write buffers, but after MySQL 8.0.20 it was removed and stored in a separate file.

By default, there is only one data file called IBDatA1 in the system tablespace, although multiple files are allowed. All of these properties including file name and file size are specified by the innodb_datA_file_path configuration project. For example:

innodb_data_file_path=ibdata1:10M:autoextend
Copy the code

The file name of the system tablespace is ibDatA1 and the initial size is 10M. What the hell do you know about autoextend?

As mentioned earlier, the initial size is 10M, so as MySQL runs, the amount of data will slowly grow, and the data files must apply for more space to store data. If autoextend InnoDB is defined, it will automatically expand the data files and apply for 8M space each time. Of course, this 8M can also be configured through innodb_autoextend_INCREMENT.

Exclusive table space

In fact, the above has been introduced in the introduction of almost, here is a simple summary of good. When innodb_file_per_TABLE is enabled (it is now enabled by default), the data for each table is stored in its own separate data file.

Regular table space

A regular table space is a shared storage space, just like a system table space.

The Undo tablespace

The main store is Undo Logs, with which we can quickly roll back changes if a transaction fails. InnoDB creates two datafiles for the Undo tablespace by default, undo_001 and undo_002 by default if not specified.

Innodb_undo_directory can be used to specify the specific storage path for the two data files. Of course, if not specified, the Undo tablespace data files will be placed in InnoDB’s default data directory, usually /usr/local/mysql.

The initial size of the two Undo tablespace data files is determined by the page size of InnoDB before MySQL 8.0.23.

After MySQL 8.0.23, the initial size of Undo tablespace is 16M. As for the expansion of Undo tablespace, different versions have different processing methods.

Prior to MySQL 8.0.23, four Extends Extends was applied per page. According to the previous discussion, if the page size is 16 K, then the corresponding extent is 1M. In other words, 4 m is applied per page size. This has been mentioned above and will not be repeated here.

After MySQL 8.0.23, at least 16 M of space has to be expanded each time. Moreover, to prevent explosive growth of data volume, InnoDB makes a dynamic adjustment to the capacity expansion.

If the time difference between the current capacity expansion and the last capacity expansion is less than 0.1 second, the capacity expansion will double to 32 M. If the time difference for multiple capacity expansion is less than 0.1 second, the double operation accumulates until it reaches the upper limit of 256 MB.

So you might say, well, if there is a period of time when there is a large number of requests and the capacity is expanded to the maximum 256 MB, what about the subsequent requests? Do you still apply for 256 meters? That doesn’t make sense. So InnoDB decides that if the interval between capacity expansion is greater than 0.1 second, it will halve the capacity expansion until it reaches the minimum limit of 16 meters.

Temporary table space

Data in a temporary tablespace is, as the name implies, temporary.

You’re talking shit…

It is divided into two parts, which are:

  • Session Temporary tablespace
  • Global temporary table Spaces

The Session temporary tablespace stores temporary tables created by the user or optimizer. For each Session, InnoDB allocates up to two data files (table Spaces), one for temporary tables created by users and one for internal temporary tables created by the optimizer. When a Session expires, the allocated data files are truncated and placed into a data file pool.

In fact, this operation is no different from other pooling techniques. It is worth noting that the size of these files will not change after Truncate. This data file pool is created when the MySQL server starts up, and 10 files are thrown into it by default. The default size of each file is 5 pages.

The global temporary table space contains the Rollback Segment that has been changed to the temporary table. The initial size of the Rollback Segment is about 12 M, which is also created when the MySQL server starts.

This is the end of this blog, welcome to wechat search to follow [SH full stack notes], reply [queue] to get MQ learning materials, including basic concept analysis and RocketMQ detailed source code analysis, continue to update.

If you find this article helpful, please give it a thumbs up, a comment, a share and a comment.