preface

In this section, we introduce the redo log — its internal structure and how it is written to disk — along with some of the most common problems in MySQL transactions. We also introduce MySQL’s transaction isolation levels and their characteristics.

An overview of the redo log

Redo logs were first mentioned when we covered the overall InnoDB storage engine architecture. In this section, we continue with the internals of the redo log: its storage structure, when redo log data is flushed to disk, and how many redo log files there are.

Why redo logs?

The redo log is the key component that allows data to be recovered after a MySQL crash. Because redo records are appended sequentially, writing them is far cheaper than randomly flushing modified data pages to disk.

The process of writing redo log data to disk

The contents of the Redo log file

The redo log stores the following contents: tablespace number + data page number + offset + changed bytes + specific values.

  1. MLOG_1BYTE: indicates that 1 byte was changed
  2. MLOG_2BYTE: indicates that 2 bytes were changed
  3. And so on for 4 bytes (MLOG_4BYTE) and 8 bytes (MLOG_8BYTE)
  4. If a larger range was changed: MLOG_WRITE_STRING

So a redo log is in the following format:

Log type (such as MLOG_1BYTE), tablespace ID, data page number, offset in the data page, specific modified data
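The record format above can be sketched as a simple Python structure. This is a toy illustration of the five fields, not InnoDB's actual compact binary encoding, and the field names are my own:

```python
from collections import namedtuple

# Hypothetical field names; real InnoDB uses a compact binary encoding.
RedoRecord = namedtuple(
    "RedoRecord", ["log_type", "space_id", "page_no", "offset", "data"]
)

# "Change 1 byte at offset 100 of page 5 in tablespace 0 to 0x2A"
rec = RedoRecord("MLOG_1BYTE", 0, 5, 100, b"\x2a")
print(rec.log_type, rec.page_no)  # MLOG_1BYTE 5
```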

Here is the corresponding data structure diagram:

Redo log blocks store redo log records

Redo logs are not written to the log file one record at a time. Instead, they are managed in blocks, each 512 bytes in size: a 12-byte block header, a 496-byte body, and a 4-byte block trailer. The 12-byte header contains “a 4-byte block number, the 2-byte length of the data written, the 2-byte offset of the first log record in the block, and a 4-byte checkpoint number”.
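As a quick sanity check, the sizes above add up exactly to one block — a minimal sketch:

```python
BLOCK_SIZE = 512   # each redo log block
HEADER_SIZE = 12   # 4B block no + 2B data length + 2B first-record offset + 4B checkpoint no
TRAILER_SIZE = 4   # block trailer
BODY_SIZE = BLOCK_SIZE - HEADER_SIZE - TRAILER_SIZE

print(BODY_SIZE)  # 496 bytes of payload per block
```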

How redo log records are stored

Redo log records are not stored one per line; they are appended into consecutive 512-byte blocks, and this happens inside the redo log buffer. Records are written into the body of the current block, and when a block fills up (or another flush condition is met), a background thread writes it out to the redo log file, into that block’s corresponding position in the file on disk.

The redo log buffer stores redo log blocks

Like data pages, redo log data passes through an in-memory cache before reaching disk: the redo log buffer. At MySQL startup, InnoDB allocates a contiguous chunk of memory for the buffer and divides it into redo log blocks.

The innodb_log_buffer_size parameter specifies the size of the redo log buffer; the default is 16MB. Since each block is only 512 bytes, this is usually plenty, and a large redo entry can span multiple blocks.
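To get a feel for the numbers, here is a small back-of-the-envelope calculation of how many 512-byte blocks fit in the default 16MB buffer:

```python
BUFFER_SIZE = 16 * 1024 * 1024  # innodb_log_buffer_size default: 16MB
BLOCK_SIZE = 512                # one redo log block

n_blocks = BUFFER_SIZE // BLOCK_SIZE
print(n_blocks)  # 32768 blocks fit in the default buffer
```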

When are Redo logs written to disk

There are two things we need to know about this:

  1. How many Redo log files are there?
  2. When does the Redo log block flush data to disk?

The timing of the flush to disk

MySQL flushes the redo log buffer to disk when any of the following conditions is triggered:

  1. The redo log buffer is more than half full
  2. A transaction commits — the redo log blocks containing that transaction’s records are flushed to disk. This is precisely how the redo log guarantees that committed transactions survive a crash
  3. A background thread flushes the buffer to disk once a second
  4. MySQL is shut down — all remaining redo logs are flushed to disk

One important thing to keep in mind here: “a transaction commit does not succeed until its redo log has been flushed to disk; only then is the transaction durable.”
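The four flush triggers above can be condensed into one toy decision function. This is only a mnemonic for the rules described in this article, not InnoDB’s actual flushing code:

```python
def should_flush(used_bytes, buffer_size, committing, seconds_since_flush, shutting_down):
    """Toy decision mirroring the four flush triggers described above."""
    return (
        used_bytes > buffer_size / 2    # 1. buffer more than half full
        or committing                   # 2. a transaction is committing
        or seconds_since_flush >= 1.0   # 3. background thread ticks once a second
        or shutting_down                # 4. MySQL is shutting down
    )

MB = 1024 * 1024
print(should_flush(9 * MB, 16 * MB, False, 0.2, False))  # True: over half full
print(should_flush(1 * MB, 16 * MB, False, 0.2, False))  # False: no trigger fired
```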

Where do the redo log files live? Running `SHOW VARIABLES LIKE 'datadir'` shows MySQL’s data directory; by default, `innodb_log_group_home_dir` points to that directory, and the redo log files are stored there.

How many redo log files are there?

**innodb_log_file_size specifies the size of each redo log file; the default is 48MB**. **innodb_log_files_in_group specifies the number of log files; the default is 2**, for 96MB in total. That may seem small, but it is enough to hold millions of redo records.

The files are named ib_logfile0, ib_logfile1, and so on. InnoDB writes them in a circular fashion: it fills ib_logfile0 first, then ib_logfile1; when the last file is full, writing just goes back to the first one.

If you do not want to use the default size, you can adjust the above two parameters, such as 96MB per redo log file, up to 100 redo log files, etc.
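The circular write pattern can be sketched as simple modular arithmetic — a toy model, assuming the defaults of two 48MB files:

```python
def current_file(total_bytes_written, file_size=48 * 1024 * 1024, n_files=2):
    """Which ib_logfile receives the next write, given bytes written so far (toy model)."""
    return (total_bytes_written // file_size) % n_files

MB = 1024 * 1024
print(current_file(0))         # 0 -> ib_logfile0
print(current_file(50 * MB))   # 1 -> ib_logfile1
print(current_file(100 * MB))  # 0 -> wrapped back to ib_logfile0
```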

The undo log performs rollback alongside the redo log

To be able to roll back, the undo log records the inverse of each operation:

  • A DELETE requires re-inserting the row that was deleted
  • An UPDATE requires writing the old values back
  • An INSERT requires deleting the newly added row

The final undo log structure is as follows:
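While the on-disk undo record format is beyond a quick sketch, the inverse-operation idea above can be illustrated in Python (a toy mapping, not InnoDB’s actual structure):

```python
def undo_operation(op, row):
    """Return the compensating operation for rolling back `op` on `row` (toy model)."""
    if op == "INSERT":
        return ("DELETE", row)        # new rows are deleted on rollback
    if op == "DELETE":
        return ("INSERT", row)        # deleted rows are re-inserted
    if op == "UPDATE":
        old, new = row
        return ("UPDATE", (new, old))  # old values are written back
    raise ValueError(f"unknown operation: {op}")

print(undo_operation("DELETE", {"id": 1}))  # ('INSERT', {'id': 1})
```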

Transaction concurrency problems

When multiple transactions execute concurrently, they may update the same row in a cached page at the same time, or one transaction may update a row while another queries it. How are these conflicts handled? Do we need locks?

Dirty write, dirty read, non-repeatable read, phantom read (key)

1. Dirty write

Suppose transaction A updates a row, transaction B then updates the same row, and afterwards A rolls back. The rollback restores the value from before A’s update, wiping out B’s change along with it: “the result of B’s operation is lost.” In essence, B updated data that A had touched but not committed, so A’s rollback can destroy B’s result at any time.

2. Dirty read

In the simplest terms, a transaction reads a value that another transaction has not yet committed. If that other transaction rolls back, the value the reader saw never really existed, which causes business-logic problems.

In short, both dirty writes and dirty reads happen because one transaction updates or reads data that another transaction has modified but not yet committed. Since that other transaction may roll back at any time, your update can be wiped out, or the data you read can turn out never to have existed. These are the two bad scenarios of dirty writes and dirty reads.
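A tiny single-row simulation makes the dirty-read scenario concrete. This is purely an illustration of the visibility rule, not how InnoDB implements versioning:

```python
class Row:
    """Toy single-row store illustrating dirty reads (not InnoDB's real mechanism)."""
    def __init__(self, value):
        self.committed = value
        self.uncommitted = None  # pending write from an open transaction

    def write(self, value):      # a transaction updates but does not commit
        self.uncommitted = value

    def rollback(self):          # the writing transaction rolls back
        self.uncommitted = None

    def read(self, isolation):
        if isolation == "READ UNCOMMITTED" and self.uncommitted is not None:
            return self.uncommitted  # dirty read: sees the pending value
        return self.committed        # only sees committed data

row = Row(100)
row.write(999)                           # transaction A updates, not yet committed
print(row.read("READ UNCOMMITTED"))      # 999  <- dirty read
print(row.read("READ COMMITTED"))        # 100  <- protected
row.rollback()                           # A rolls back; 999 never existed
print(row.read("READ UNCOMMITTED"))      # 100
```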

3. Non-repeatable read

Transaction A reads a value; at first there is no interference.

Then transaction B updates that value and commits, so when A reads again, the value has changed.

Then transaction C comes along, updates the value and commits, and A’s next read is different yet again.

Non-repeatable reads are not necessarily a serious problem — whether repeated reads within one transaction should see a changing value or a stable one is a design decision left to the database designer.

There are two options:

  • Once transaction A has read a value, the committed changes of B and C become visible to A’s later reads — in other words, reads are not repeatable
  • Or transaction A sees the same value on every read within the transaction — this is repeatable read

4. Phantom read

A phantom read happens when a transaction runs the same SQL query several times and later executions return rows that earlier ones did not — rows newly inserted and committed by other transactions, appearing like phantoms.

“What’s the difference between a non-repeatable read and a phantom read?”

(1) A non-repeatable read involves existing rows whose values were changed by other transactions.

Solution: row-level locks. Lock the row, releasing the lock only after transaction A has finished all of its reads; only then may other transactions change that row.

(2) A phantom read involves rows added or removed by other transactions — INSERT and DELETE operations.

Solution: table-level locks. Lock the entire table, releasing the lock only after transaction A has finished all of its reads; only then may other transactions add rows.

With these anomalies in mind, the transaction isolation levels are much easier to understand.

MySQL has four isolation levels:

  • Read uncommitted
  • Read Committed
  • Repeatable read
  • Serializable

Isolation level features:

  1. Read uncommitted: prevents dirty writes — two transactions cannot update the same row at the same time — but nothing else; dirty reads, non-repeatable reads, and phantom reads can all occur
  2. Read committed: prevents dirty writes and dirty reads — a transaction cannot read another transaction’s uncommitted data. However, repeated reads may still see a value keep changing as other transactions commit
  3. Repeatable read: prevents dirty writes, dirty reads, and non-repeatable reads. No matter how many times the same transaction reads a value, it does not see other transactions’ committed changes. However, rows inserted or deleted by other transactions may still appear
  4. Serializable: fully serialized — only one transaction at a time may operate on the data
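The list above can be summarized as a small lookup table. Note this follows the classic textbook classification used in this article; in practice, InnoDB’s repeatable read also avoids many phantom reads via MVCC and next-key locking:

```python
# Which anomalies each isolation level prevents (True = prevented),
# per the classic classification described above.
PREVENTS = {
    "READ UNCOMMITTED": {"dirty write": True, "dirty read": False, "non-repeatable read": False, "phantom read": False},
    "READ COMMITTED":   {"dirty write": True, "dirty read": True,  "non-repeatable read": False, "phantom read": False},
    "REPEATABLE READ":  {"dirty write": True, "dirty read": True,  "non-repeatable read": True,  "phantom read": False},
    "SERIALIZABLE":     {"dirty write": True, "dirty read": True,  "non-repeatable read": True,  "phantom read": True},
}

print(PREVENTS["READ COMMITTED"]["dirty read"])        # True
print(PREVENTS["REPEATABLE READ"]["phantom read"])     # False
```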

conclusion

This section delves into the internals of the redo log, briefly describes how the undo log works with the redo log to perform rollback, and touches on the ACID properties of transactions. Finally, we review MySQL’s four isolation levels and the problems each one addresses.

Final thoughts

As you can see, the redo log’s internal structure is somewhat complex, but for everyday use you only need the basics covered here. To go further, you will need to read the source code, then summarize and digest it yourself.