Binlog, redo log, and undo log

Logs are an important part of a MySQL database. They record various kinds of state information while the database is running. MySQL logs mainly include the error log, query log, slow query log, transaction logs, and binary log. For development purposes, we focus on the binary log (binlog) and the transaction logs (the redo log and the undo log), which are covered in detail in the rest of this article.

binlog

The binlog records information about write operations (queries are excluded) performed by the database and is stored on disk in binary format. The binlog is MySQL's logical log and is written by the Server layer, so a MySQL database writes to the binlog regardless of which storage engine it uses.

Logical log: can be simply understood as a record of SQL statements.

Physical log: MySQL data is ultimately stored in data pages, and a physical log records changes to those data pages.

The binlog is written by appending. The max_binlog_size parameter sets the size of each binlog file; when a file reaches that size, a new file is created to hold subsequent log entries.
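The append-and-rotate behaviour can be sketched with a toy log. This is only an illustration, not how MySQL implements it: the class name AppendOnlyLog and the tiny 10-byte limit are assumptions made up for the example, and the sketch rotates as soon as the limit is reached.

```python
class AppendOnlyLog:
    """Toy append-only log that starts a new file once a size limit is reached."""

    def __init__(self, max_size):
        self.max_size = max_size   # stands in for max_binlog_size (bytes)
        self.files = [[]]          # each inner list models one log file
        self.current_size = 0      # bytes written to the current file

    def append(self, event: bytes):
        # Rotate before writing if the current file has reached the limit.
        if self.current_size >= self.max_size:
            self.files.append([])
            self.current_size = 0
        self.files[-1].append(event)
        self.current_size += len(event)


log = AppendOnlyLog(max_size=10)
for i in range(6):
    log.append(b"evt%02d" % i)     # every event is 5 bytes long

print([len(f) for f in log.files])  # [2, 2, 2]
```

Existing files are never modified; new events only ever go to the end of the newest file.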

Binlog usage scenarios

In practice, binlog can be used in two main scenarios: master-slave replication and data recovery.

Master-slave replication: the binlog is enabled on the master and sent to each slave; each slave replays the binlog, so that master and slave data stay consistent.
Data recovery: data is restored by using the mysqlbinlog tool.

Binlog flush timing

For the InnoDB storage engine, the binlog is recorded only when a transaction commits, and at that point the record is still in memory. So when is the binlog flushed to disk? MySQL controls the binlog flush timing with the sync_binlog parameter, whose value ranges from 0 to N:

0: MySQL does not force a flush; the operating system decides when to write the data to disk.
1: the binlog is written to disk on every commit.
N: the binlog is written to disk once every N transactions.
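The effect of the three kinds of value can be sketched by counting flushes. This is a simplified model, not MySQL code: the function name simulate_sync_binlog is hypothetical, and the value 0 is modelled as "never flush explicitly".

```python
def simulate_sync_binlog(num_commits, sync_binlog):
    """Return (fsync_calls, commits_not_yet_synced) after num_commits commits.

    sync_binlog == 0 is modelled as never calling fsync explicitly,
    leaving the timing entirely to the operating system.
    """
    if sync_binlog == 0:
        return 0, num_commits
    fsyncs = 0
    pending = 0                      # commits written but not yet synced
    for _ in range(num_commits):
        pending += 1
        if pending == sync_binlog:   # sync once every N commits
            fsyncs += 1
            pending = 0
    return fsyncs, pending


for value in (0, 1, 10):
    print(value, simulate_sync_binlog(105, value))
# 0 (0, 105)
# 1 (105, 0)
# 10 (10, 5)
```

The second number is what is at risk if the machine goes down at that moment: with 1 nothing is pending, while with N up to N-1 commits may not have reached disk.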

As you can see, the safest setting for sync_binlog is 1, which is also the default since MySQL 5.7.7. A larger value improves database performance, so in practice the value can be raised moderately, trading some consistency for better performance.

Binlog log format

There are three binlog formats: STATEMENT, ROW, and MIXED.

Before MySQL 5.7.7, the default format was STATEMENT; since MySQL 5.7.7, the default is ROW. The format is specified with binlog-format.

STATEMENT: statement-based replication (SBR). Every SQL statement that modifies data is recorded in the binlog. Advantages: changes to individual rows do not need to be recorded, which reduces the volume of binlog, saves IO, and improves performance. Disadvantages: in some cases it can lead to inconsistent data between master and slave, for example when executing sysdate(), sleep(), and so on.
ROW: row-based replication (RBR). It does not record the context of each SQL statement, only which rows were modified. Advantages: it avoids the cases where calls to stored procedures, functions, or triggers cannot be replicated correctly. Disadvantages: it generates a large amount of log; alter table in particular can make the log grow sharply.
MIXED: mixed-based replication (MBR), a hybrid of the STATEMENT and ROW modes. STATEMENT mode is normally used to write the binlog, and ROW mode is used for operations that STATEMENT mode cannot replicate.
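The inconsistency that statement-based replication can cause is easy to reproduce in a toy model. In this sketch each "server" is a dict with its own clock standing in for sysdate(); the helper names make_server and execute are made up for the example and have nothing to do with MySQL's real replication code.

```python
import itertools


def make_server(clock_start):
    """A toy server: a table plus a local clock that stands in for sysdate()."""
    ticks = itertools.count(clock_start)
    return {"rows": {}, "now": lambda: next(ticks)}


def execute(server, row_id):
    """Stand-in for: UPDATE t SET ts = sysdate() WHERE id = row_id."""
    server["rows"][row_id] = server["now"]()
    return server["rows"][row_id]


master = make_server(100)
stmt_slave = make_server(200)   # replays statements, using its own clock
row_slave = make_server(200)    # applies the row values produced on the master

statement_log = []              # STATEMENT style: what was executed
row_log = []                    # ROW style: what each row became

for row_id in (1, 2):
    value = execute(master, row_id)
    statement_log.append(row_id)
    row_log.append((row_id, value))

for row_id in statement_log:
    execute(stmt_slave, row_id)           # the clock is re-evaluated on the slave
for row_id, value in row_log:
    row_slave["rows"][row_id] = value     # the master's value is copied as-is

print(master["rows"] == stmt_slave["rows"])  # False
print(master["rows"] == row_slave["rows"])   # True
```

The statement log is shorter (one entry per statement, however many rows it touches), which is the space advantage of STATEMENT; the row log is larger but reproduces the master's data exactly.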

redo log

Why the redo log is needed

As is well known, one of the four properties of a transaction is durability: once a transaction commits successfully, its changes to the database are permanently preserved and cannot revert to the previous state for any reason. So how does MySQL guarantee durability? The simplest approach would be to flush every data page modified by the transaction to disk at each commit. But doing so has serious performance problems, in two main ways:

InnoDB interacts with disk in units of pages, and a transaction may change only a few bytes within a data page; flushing the whole page to disk in that case wastes resources.
A transaction may modify multiple data pages that are not physically contiguous, so flushing them uses random IO, which performs poorly.

So MySQL designed the redo log to record only which changes a transaction made to which data pages. This neatly solves the performance problem: the log is relatively small and is written with sequential IO.
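A rough back-of-the-envelope comparison shows the size of the saving. The 16 KiB page size is InnoDB's default; the 64-byte redo record size is a made-up figure for illustration only, since real redo records vary in size.

```python
PAGE_SIZE = 16 * 1024    # InnoDB's default page size in bytes
REDO_RECORD_SIZE = 64    # hypothetical size of one redo record, illustration only


def bytes_written(pages_touched):
    """Compare flushing whole pages with appending one redo record per page."""
    full_page_flush = pages_touched * PAGE_SIZE
    redo_append = pages_touched * REDO_RECORD_SIZE
    return full_page_flush, redo_append


full, redo = bytes_written(pages_touched=3)
print(full, redo, full // redo)  # 49152 192 256
```

Under these assumed numbers the redo log writes a few hundred bytes in one sequential append, where flushing the pages would write tens of kilobytes to three scattered locations.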

Basic concepts of the redo log

The redo log consists of two parts: the redo log buffer in memory and the redo log file on disk. Each DML statement MySQL executes first writes its record to the redo log buffer, and at some later point multiple records are written to the redo log file together. This technique of writing the log first and the data to disk afterwards is the write-ahead logging (WAL) that is often mentioned in MySQL.

In a computer operating system, a buffer in user space generally cannot be written directly to disk; the data must pass through a buffer in kernel space (the OS buffer). So writing the redo log buffer to the redo log file actually means writing to the OS buffer first and then flushing it to the redo log file with the fsync() system call.

MySQL supports three timings for writing the redo log buffer to the redo log file, configured with the innodb_flush_log_at_trx_commit parameter:

Value	Meaning
0 (delayed write)	On commit, the log in the `redo log buffer` is not written to the `os buffer`. Instead, once per second it is written to the `os buffer` and `fsync()` is called to write it to the `redo log file`. In other words, with 0 the data reaches disk (approximately) once per second, so a system crash can lose about 1 second of data.
1 (write and flush on commit)	On every commit, the log in the `redo log buffer` is written to the `os buffer` and `fsync()` is called to flush it to the `redo log file`. No data is lost even if the system crashes, but I/O performance is poor because every commit writes to disk.
2 (write on commit, delayed flush)	Every commit only writes to the `os buffer`; `fsync()` is then called once per second to flush the `os buffer` to the `redo log file`.
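The three settings can be compared with a small model of the three places a redo record can live. This is a sketch under simplifying assumptions, not InnoDB's implementation: the class RedoLogSim and its methods are hypothetical, tick() stands for the roughly once-per-second background flush, and a crash is modelled either as only the MySQL process dying ("mysqld") or as the whole machine going down ("os").

```python
class RedoLogSim:
    """Toy model of redo log buffer -> OS buffer -> redo log file."""

    def __init__(self, flush_policy):
        self.flush_policy = flush_policy  # stands in for innodb_flush_log_at_trx_commit
        self.log_buffer = []              # redo log buffer (process memory)
        self.os_buffer = []               # OS buffer (kernel memory)
        self.disk = []                    # redo log file

    def _write(self):
        self.os_buffer.extend(self.log_buffer)
        self.log_buffer.clear()

    def _fsync(self):
        self.disk.extend(self.os_buffer)
        self.os_buffer.clear()

    def commit(self, record):
        self.log_buffer.append(record)
        if self.flush_policy == 1:
            self._write()
            self._fsync()
        elif self.flush_policy == 2:
            self._write()
        # policy 0: nothing is written at commit time

    def tick(self):
        """Background flush that runs about once per second."""
        self._write()
        self._fsync()

    def surviving(self, crash):
        """Records that survive a crash of the given kind."""
        if crash == "mysqld":             # the OS is alive and still holds its buffer
            return self.disk + self.os_buffer
        return list(self.disk)            # machine down: only the file survives


for policy in (0, 1, 2):
    sim = RedoLogSim(policy)
    sim.commit("t1")
    sim.tick()
    sim.commit("t2")                      # crash happens before the next tick
    print(policy, sim.surviving("mysqld"), sim.surviving("os"))
# 0 ['t1'] ['t1']
# 1 ['t1', 't2'] ['t1', 't2']
# 2 ['t1', 't2'] ['t1']
```

In this model, setting 2 sits between the other two: a committed transaction is lost only if the operating system itself goes down before the next flush.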

Redo log format

As mentioned earlier, the redo log records changes to data pages, and these change records do not all need to be kept. The redo log is therefore implemented as a fixed-size log written in a loop: when writing reaches the end, it wraps around to the beginning and continues.

At the same time, InnoDB has both redo log records and data pages that need to be flushed to disk, and the redo log exists to reduce how often data pages must be flushed. The redo log tracks two positions: write pos is the LSN up to which the redo log has been written, and check point is the LSN up to which the corresponding data page changes have been flushed to disk. Going forward from write pos to check point (wrapping around) is the free part of the redo log, available for new records. From check point to write pos are the data page change records that have not yet been flushed to disk. When write pos catches up with check point, check point is first pushed forward to free up space, and only then is new log written.

When InnoDB starts, it always performs recovery, regardless of whether it was last shut down normally or abnormally. Because the redo log records physical changes to data pages, recovery is much faster than with a logical log such as the binlog. On restart, InnoDB first checks the LSN of the data pages on disk; if a data page's LSN is smaller than the LSN in the log, recovery starts from the checkpoint. There is one more case: the crash happened in the middle of a checkpoint flush, and the data pages had been flushed further than the log pages. The LSN recorded in the data page is then greater than the LSN in the log, and the part that is ahead of the log is not redone, because it represents work that has already been done.
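The interplay between write pos and check point can be sketched as a ring buffer. This is a minimal model, not InnoDB's data structure: the class CircularRedoLog is hypothetical, positions are counted in whole records rather than bytes, and advancing the checkpoint only pretends that the matching dirty pages were flushed.

```python
class CircularRedoLog:
    """Toy fixed-size redo log written in a loop."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = [None] * capacity
        self.write_pos = 0    # records written so far (monotonic, like an LSN)
        self.checkpoint = 0   # records whose page changes are already on disk

    def free_slots(self):
        # Everything between checkpoint and write_pos is still needed.
        return self.capacity - (self.write_pos - self.checkpoint)

    def advance_checkpoint(self, n=1):
        """Pretend the dirty pages for the n oldest records were flushed."""
        self.checkpoint = min(self.checkpoint + n, self.write_pos)

    def append(self, record):
        if self.free_slots() == 0:
            # write pos has caught up with check point: free space first
            self.advance_checkpoint()
        self.slots[self.write_pos % self.capacity] = record
        self.write_pos += 1


log = CircularRedoLog(capacity=4)
for i in range(6):
    log.append(f"r{i}")

print(log.slots)                      # ['r4', 'r5', 'r2', 'r3']
print(log.checkpoint, log.write_pos)  # 2 6
```

The records r0 and r1 were overwritten only after the checkpoint had moved past them, which is the invariant that keeps overwriting safe.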
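The LSN comparison during recovery can be illustrated with a simplified replay loop. This is only a sketch of the idea: the function recover and the record layout (lsn, page_id, new_data) are assumptions for the example, and real InnoDB recovery is considerably more involved.

```python
def recover(pages, redo_log, checkpoint_lsn):
    """Replay redo records after the checkpoint, skipping changes a page already has.

    pages    : {page_id: {"lsn": int, "data": str}}, the pages as found on disk
    redo_log : list of (lsn, page_id, new_data) in ascending LSN order
    Returns the list of LSNs that were actually re-applied.
    """
    applied = []
    for lsn, page_id, new_data in redo_log:
        if lsn <= checkpoint_lsn:
            continue              # covered by the checkpoint, already on disk
        page = pages[page_id]
        if page["lsn"] >= lsn:
            continue              # the page already contains this change
        page["data"] = new_data
        page["lsn"] = lsn
        applied.append(lsn)
    return applied


pages = {
    "A": {"lsn": 10, "data": "a10"},   # behind the log: needs redo
    "B": {"lsn": 13, "data": "b13"},   # was flushed before the crash
}
redo_log = [
    (10, "A", "a10"),
    (11, "A", "a11"),
    (12, "B", "b12"),
    (13, "B", "b13"),
    (14, "A", "a14"),
]

print(recover(pages, redo_log, checkpoint_lsn=10))  # [11, 14]
print(pages["A"], pages["B"])
# {'lsn': 14, 'data': 'a14'} {'lsn': 13, 'data': 'b13'}
```

Page B is left untouched because its LSN shows that the logged changes are already in place, so redoing them would be wasted work.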

Differences between the redo log and the binlog

	redo log	binlog
File size	The size of the `redo log` is fixed.	The size of each `binlog` file is set with the `max_binlog_size` parameter.
Implementation	The `redo log` is implemented by the `InnoDB` engine layer; not every engine has one.	The `binlog` is implemented by the `Server` layer; every engine can use the `binlog`.
Write mode	Written in a loop: when the end is reached, writing wraps around to the beginning.	Appended: when a file exceeds the configured size, subsequent log entries go to a new file.
Use case	The `redo log` is used for crash recovery (crash-safe).	The `binlog` is used for master-slave replication and data recovery.

The differences above show that the binlog is only used for archiving, so relying on the binlog alone does not provide crash-safe capability. The redo log alone is not enough either, because it is specific to InnoDB and its records are overwritten once they have been flushed to disk. Both the binlog and the redo log therefore need to be recorded to ensure that no data is lost when the database crashes and restarts.

undo log

One of the four properties of database transactions is atomicity: a series of operations on the database either all succeed or all fail, with no possibility of partial success. Atomicity is implemented underneath by the undo log, which records logical changes to data. For example, each INSERT statement has a corresponding DELETE undo record, and each UPDATE statement has a corresponding reverse UPDATE undo record, so that when an error occurs the data can be rolled back to its state before the transaction. The undo log is also the key to implementing MVCC (multi-version concurrency control). This article only mentions it and does not go into further detail.
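The idea of recording the inverse of each change can be shown with a toy transaction over a dict. This is a minimal sketch, not InnoDB's undo format: the class Transaction and its tuple-based undo records are made up for the example.

```python
class Transaction:
    """Toy transaction that records an inverse operation for every change."""

    def __init__(self, table):
        self.table = table    # dict: primary key -> value
        self.undo_log = []    # inverse operations, oldest first

    def insert(self, key, value):
        self.table[key] = value
        self.undo_log.append(("delete", key, None))   # INSERT is undone by DELETE

    def update(self, key, value):
        old = self.table[key]
        self.table[key] = value
        self.undo_log.append(("update", key, old))    # UPDATE is undone by a reverse UPDATE

    def rollback(self):
        # Undo in reverse order so later changes are reverted first.
        while self.undo_log:
            op, key, old = self.undo_log.pop()
            if op == "delete":
                del self.table[key]
            else:
                self.table[key] = old


table = {1: "a"}
tx = Transaction(table)
tx.insert(2, "b")
tx.update(1, "z")
tx.update(2, "y")
tx.rollback()
print(table)  # {1: 'a'}
```

Replaying the undo records newest-first matters: the update to row 2 has to be reverted before the row's insert is undone by deleting it.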
