Any technology has its underlying key basic technology, these key technology is very likely to be the key technology of other technologies, learning these underlying technology, you can learn everything, let you quickly master other technologies. How to store data on disk, how to use log files to ensure that data is not lost, and how to drop disk are key technologies not only for databases such as MySQL, but also for MQ message queues and other middleware.

InnoDB’s main disk files are divided into three large blocks: system table space, user table space, and redo log files and archive files. Files such as binlogs are maintained by MySQL Server layer, so they are not included in InnoDB disk files.

System and user tablespaces

The InnoDB system tablespace contains InnoDB data dictionaries (metadata and related objects) and storage areas for doublewrite Buffer,change buffer, and undo logs. The system table space also contains, by default, any table data and index data created by the user in the system table space. A system table space is a shared table space because it is shared by multiple tables

System table Spaces are composed of one or more data files. By default, a system data file named ibdata1 with an initial size of 10MB is created in the MySQL data directory. You can use innodb_datA_file_path to configure the size and number of data files.

The innodb_datA_file_path format is as follows:

innodb_data_file_path=datafile1[,datafile2]...
Copy the code

You can create a tablespace from multiple files and specify the attributes of the files:

innodb_data_file_path = /db/ibdata1:1000M; /dr2/db/ibdata2:1000M:autoextendCopy the code

The /db/ibdata1 and /dr2/db/ibdata2 files form the system tablespace. If the two files are on different disks, the load on the disks may be averaged, thus improving the overall performance of the database. The names of both files are followed by properties indicating that the ibDatA1 and ibdata2 files are 1000MB in size and autoextend when they run out of space.

When innodb_datA_FILe_path is set, data from tables based on the InnoDB storage engine will be recorded in the system tablespace. If innodb_file_per_table is set, The user can create a separate user table space for each table based on InnoDB storage engine. The user tablespace is named as follows: table name.ibd. In this way, the user does not have to store all the data in the default system table space, but the user table space stores only the table’s data, indexes, and insert buffer BITMAP information. The rest of the information is stored in the default table space.

The figure above shows how InnoDB storage engine stores files. FRM files are table structure definition files, which record the table structure definition of each table.

Redo log files and archive files

By default, there are two files named IB_logfile0 and ib_logFILe1 in the data directory of InnoDB storage engine. This is called InnoDB redo log fiEL. It records transaction logs for InnoDB storage engine. Redo log files come in handy when there is an error in InnoDB’s data store files. InnoDB storage engine can use redo log files to restore data to the correct state to ensure data correctness and integrity.

Each InnoDB storage engine has at least one redo logfile group (group), and each file group has at least two redo log files, such as the default IB_logFILE0 and ib_logFILe1. For higher reliability, you can set up multiple mirrored log groups and put different file groups on different disks to improve the high availability of redo logs.

Each redo log file is the same size in the log group and runs in a circular write mode. InnoDB storage engine writes to redo log file 1 first, switches to redo log file 2 when the file is full, and then switches to redo log file 1 when redo log file 2 is also full.

You can use innodb_log_file_size to set the size of redo log files, which can have a significant impact on InnoDB storage engine performance.

If the redo log file is set too large, data loss may take a long time to recover; On the other hand, if the setting is too small, the redo log file is too small, and dirty pages need to be flushed frequently to disk according to checkpoint check, resulting in performance jitter. See the section on redo logs and Checkpoint mechanisms in my previous article. InnoDB memory structure and features

Fall disk mechanism for redo logs

InnoDB follows Write Ahead redo log (WAL) and force-log-at-commit rules for flushing data files and log files, which ensure transaction persistence. WAL requires that logs in the memory be written to disks before data changes are written to disks. Force-log-at-commit Requires that when a transaction is committed, all logs generated must be flushed to the disk. If the database breaks down after the log is flushed successfully and the data in the buffer pool is flushed to the disk, the database can recover data from the logs after the restart.

As shown in the figure above, InnoDB writes changes in the buffer pool to the redo log buffer first and then to disk at the same time or when a transaction commits. This follows the force-log-at-commit principle. After redo logs are written to disks, change data in the buffer pool is written to disks based on the checkpoint mechanism, which complies with the WAL principle. In the checkpoint timing mechanism, it is determined that the redo log file is full. As mentioned earlier, if the redo log file is too small and is often full, the checkpoint frequently writes the changed data to disks, resulting in performance jitter.

When InnoDB writes data to disk, it may just be written to the file system cache without any real “bag-safe”. InnoDB’s Innodb_flush_LOG_at_trx_COMMIT property controls InnoDB’s behavior on each transaction commit. When the property value is 0, the redo log will not be written when the transaction commits, but wait for the main thread to write on time. When the value is 1, the redo log will be written to the file system cache and fsync of the file system will be called to write the data in the file system buffer to disk storage to ensure that data will not be lost. When the value is 2, the log file is also written to the file system cache when the transaction commits, but fsync is not called and the file system is left to decide when to write the cache to disk. The log flushing mechanism is shown in the following figure.

Innodb_flush_log_at_commit is a basic parameter for InnoDB performance tuning, which involves InnoDB write efficiency and data security. When the parameter value is 0, the write efficiency is the highest, but the data security is the lowest. If the parameter value is 1, the write efficiency is the lowest, but the data security is the highest. When the parameter value is 2, both levels are medium. It is generally recommended to set this property value to 1 for high data security and only to ensure transaction persistence.

Afterword.

We will also learn about binlog files and data file drop mechanism, as well as other knowledge about InnoDB transactions, please keep an eye on it.