MySQL crash Recovery Process based on Redo Log and Undo Log

I talked about MySQL’s Buffer Pool in the previous article “InnoDB basics”. As a quick note, the Buffer Pool is a very core component of MySQL’s memory structure. You can think of it as a black box.

Update data flow under the black box

When querying data, the Buffer Pool will be queried first. If the Buffer Pool does not exist, the storage engine loads the data from the disk into the Buffer Pool and returns the data to the client. Similarly, when we update some data, if the data does not exist in the Buffer Pool, the data will be loaded first and then the data in the memory will be modified. The modified data is then flushed to disk.

This process seems to have no problem, but in fact, it does not speak martial virtue. What if we modify the Buffer Pool data successfully, but MySQL hangs before we can flush the data to disk? According to the logic in the figure above, the updated data only exists in the Buffer Pool. If MySQL crashes at this time, this part of data will be permanently lost.

Also, I get an error halfway through the update and want to roll back to the previous version. What should I do? Even data persistence guarantee, transaction rollback can not talk about crash recovery?

Redo Log & Undo Log

And by the fact that MySQL is capable of crash recovery, MySQL must be doing something nasty. That’s right, those are the two other key features we’ll cover next: Redo Log and Undo Log.

These two types of logs belong to InnoDB storage engine logs, and MySQL Server Binlog is not the same dimension of logs.

The Redo Log records the state of the transaction ‘after’. The Redo Log records the updated ‘after’ values
The Undo Log records the data status of the transaction “before”. It records the value of the update “before”

So there’s a clear difference between the two types of logs, and I’ll use a more mundane example to explain them.

Redo logs are like when you type a long command on the command line, hit Enter to execute, and an error is reported. All we need to do is press ↑ again to get the previous command and run it again.

Git reset –hard $lastCommitId — Git reset –hard $lastCommitId — Git reset –hard $lastCommitId

Implement the update process after logging

With the Redo Log and Undo Log, let’s refine the above diagram.

First, update data will still determine whether the data exists in the Buffer Pool, if not, load. We mentioned the rollback problem above. Before updating the Buffer Pool, we need to write the state of the data before the transaction started to the Undo Log. Assuming an error occurs halfway through the update, we can use Undo Log to roll back and forth until the transaction begins.

The executor then updates the Buffer Pool and writes the latest data status to the Redo Log Buffer. Because a transaction may involve multiple reads and writes, writing to a Buffer in groups is much more efficient than writing to a disk file in batches.

Why not create an Undo Log Buffer for Undo Log? So let’s assume that there is a Buffer in InnoDB that writes the state of the data before the transaction starts into the Undo Log Buffer and then starts updating the data.

If some of the updated data has been flushed back to disk, but the transaction is not successful and you need to roll back, you find that the Undo Log has disappeared with the process exit, and there is no way to use Undo Log to rollback.

What if MySQL hangs just after updating memory? The Redo Log Buffer may not even be written, and if it is, it is not flushed to disk, and the Redo Log is lost.

It doesn’t matter, because the transaction didn’t succeed due to an unexpected downtime, and since the transaction didn’t succeed, it needs to be rolled back. After MySQL restarts, it reads the Redo Log file on disk and loads its state into the Buffer Pool. The state recovered from the disk Redo Log file is the same as the state before the transaction started, so there is no impact. The Redo Log and Binlog are flushed to disk after the transaction is committed.

There are still problems in the process

You might think it’s perfect at this point, but it’s not. Let’s say MySQL suddenly goes down after we flush the Redo Log to disk, and the binlog has not been written. At this point, the Redo Log status is inconsistent with the Binlog status. The Redo Log returned to A row in the Buffer Pool with field A of 3, but any database listening to its Binlog read data with field A of 2.

Even if Redo logs and binlogs are written to files, the physical machine or VM on which MySQL is running is down, the logs are still lost. The OS now writes changes to the OS Cache before you write to a file to improve efficiency. The OS Cache is then flushed to disk according to the policy (which is influenced by the parameters you configured).

2PC-based consistency assurance

One of the key issues you can see from this is that you must ensure that the Redo Log and the Binlog are either present or not. MySQL is implemented using **2PC (Two-Phase Commit Protocol) **.

2PC is a two-phase Commit protocol that ensures the consistency of distributed transaction data. It splits the Commit of distributed transactions into two phases, Prepare and Commit/Rollback.

Just before starting a fight between the two boxers, the referee will check the status of the two fighters in the middle, similar to asking are you ready? After confirmation, the referee will say Fight.

The referee asks the contestant about his condition. After receiving a “yes” answer, the referee announces the start of the match, which corresponds to the second stage Commit. However, if one of the players is not ready, the referee will announce the suspension of the match, which corresponds to the failure of the first stage and the status of the second stage will change to Rollback. The referee corresponds to the Coordinator in 2PC, and the player corresponds to the Participant.

Let’s take a look at the process using a diagram.

During the Prepare phase, the Redo Log was written to a file, flushed to disk, recorded the ID of the internal XA transaction, and set the Redo Log status to Prepare. After the Redo Log is successfully written, the Binlog is flushed to disk and the XA transaction ID is recorded.

During the Commit phase, a Commit flag is written to the Redo Log on disk to indicate a transaction Commit. The executor then calls the storage engine’s interface to commit the transaction. That’s the whole process.

Verify availability of the 2PC mechanism

This is how 2PC submits Redo logs and binlogs. If an exception occurs during this time, does 2PC guarantee data consistency?

Suppose the Redo Log is successfully brushed, but MySQL hangs before the Binlog is brushed. After the Redo Log is restarted, the Redo Log does not have a Commit flag. The Redo Log is found based on the recorded XA transaction and rolled back.

If the Redo Log is successfully flushed, and the Binlog is successfully flushed, but the MySQL database hangs before the Redo Log is changed from Prepare to Commit. However, the Binlog queried by XID has been successfully flushed to disk.

MySQL commits the Redo Log transaction even though the Redo Log does not have a Commit flag. Because the Binlog, once written, can be consumed from the library or by any consumer that consumes the Binlog. If MySQL does not commit a transaction at this time, data inconsistencies may occur. And the Redo Log and the Binlog are actually Ready at the data level, except for the flag bit.

That’s all for this blog post. If you found it helpful, please give it a thumbs up, a comment, a share and a comment.

Welcome to wechat search to follow [SH full stack Notes] and check out more related articles

MySQL crash Recovery Process based on Redo Log and Undo Log

Update data flow under the black box

Redo Log & Undo Log

Implement the update process after logging

There are still problems in the process

2PC-based consistency assurance

Verify availability of the 2PC mechanism

Related Posts

Spring Cloud Gateway FilteringWebHandler for request processing

The Linux IO model is not that difficult

End the difficulty of installing ElasticSearch, Kibana, Logstash for beginners