“This is the 21st day of my participation in the Gwen Challenge in November. See details: The Last Gwen Challenge in 2021”

I don’t know if you understand MySQL MVCC mechanism, this is an important one in the underlying principle of MySQL, it can greatly improve the concurrency performance of MySQL database. MVCC is widely used in database technologies such as Oracle and PostgreSQL. In this article we will take you to understand the MySQL MVCC mechanism implementation principle.

What is MVCC?

Multi-version Concurrency Control (MVCC) is a key design for improving the Concurrency performance of MySQL databases.

When I/O requests occur on the same row, locks are used to ensure data consistency. In case of read/write conflicts, the MVCC can let the data read through the snapshot instead of the current read, and the snapshot read does not need to be locked.

In the previous article, we also introduced the locking mechanism in MySQL. If you are not familiar with the previous article, you can browse through the previous article.

The business of the InnoDB

MySQL MVCC is supported in the InnoDB storage engine. The most important and special aspect of InnoDB is transactions, so we need to understand some of the design related to transactions.

  • Row-level locks InnoDB offers row-level locks, row-level locks will undoubtedly make lock granularity more fine, but too much data, the high concurrency scenarios, the same time will produce large amounts of lock, therefore, InnoDB lock for the effective optimization of the space, also makes it in under the condition of high concurrency, also won’t because that too much the same time, due to run out of memory.

    • Exclusive lock
    • The Shared lock.
  • Isolation level

    • READ_UNCOMMITTED: dirty reads
    • READ_COMMITTED: reads commit
    • REPEATABLE_READ: repeat read
    • SERIALIZABLE: serialization
  • redo log

    Redo log saves SQL statements to a specified log file. When Mysql executes recovery, redo log operations are performed. When a client executes an SQL statement, the redo log is first written to the log buffer. When the client executes the COMMIT command, the contents of the log buffer are flushed to disk as required. The redo log exists on disk as a separate file, that is, InnoDB log file.

  • undo log

    In contrast to the redo log, undo logs are used for rollback. The undo log writes the original rows of the row affected by the transaction to the Undo buffer and flusher the contents of the undo buffer to disk at an appropriate time. An undo buffer is a circular buffer like a redo buffer, but when the redo buffer is full, the undo buffer is flushed to disk. Unlike redo logs, there is no single undo log file on disk. All undo logs are stored in the primary IBD data file, even if the client has set one data file per table.

Line update process

InnoDB implements three hidden fields for each row:

  • The ID of the hidden
  • 6-byte transaction ID (DB_TRX_ID)
  • 7-byte rollback pointer (DB_ROLL_PTR)

Line update process

  1. Add a new item of data to the database. This item of data has three hidden fields. Only ID has a value

  2. T1 modifies this data, starts transaction, and records read_view

    • An exclusive lock locks the row of data
    • Record the redo log
    • Write the row to the undo log
    • Write the modified value to the data, fill in the transaction Id, and fill in the rollback pointer according to the undo log record location
  3. T2 modifies this data, starts the transaction, and records read_view

    • An exclusive lock locks the row of data
    • Record the redo log
    • Write the row to the undo log
    • Write the modified value to the data, fill in the transaction Id, and join the two undo log records using the rollback pointer (version chain)
  4. Transaction commit, record read_view

    • Normal to submit
    • If the rollback is triggered, you need to find the undo log corresponding to the rollback pointer to roll back the undo log

Note:

  • InnoDB has the Purge thread, which is responsible for querying and cleaning up invalid undo logs.
  • The above procedure describes the process of an UPDATE transaction. The original data does not exist at the time of the INSERT, so the INSERT can be discarded during the rollback

Basic features of MVCC

  • Each row of data has a version that is updated each time the data is updated.
  • When modifying, copy the current version and modify at will, without interference between each transaction.
  • The version number is compared when saving, and the original record is overwritten if the transaction is successfully committed. If the rollback fails, the copied data is discarded.

How InnoDB implements MVCC?

MVCC is built on top of undo log.

The data recorded in undo log is the multiple versions in MVCC.

Form a version chain by rolling back the pointer.

The transaction ID is used to find the record on the read-view

The read – the view record:

  • M_ids: indicates the list of active transaction ids
  • Min_trx_id: the minimum transaction ID of an active transaction
  • Max_trx_id: indicates the maximum transaction ID that has been created
  • Creator_trx_id: indicates the current transaction ID

Version chain alignment rules:

  1. If trx_id < min_trx_id, this version was generated by a committed transaction, and this data is visible;

  2. If trx_id > max_trx_id, this version is generated by a future transaction and is definitely not visible.

  3. If min_trx_id <= trx_id <= max_trx_id, there are two cases

    • If the trx_ID of a row is in m_IDS array, it indicates that the version of the row was generated by a transaction that has not yet committed and is not visible. The current transaction is visible.

    • If the trx_ID of the row is not in the m_IDS array, it indicates that the version was generated by a committed transaction

MySQL InnoDB implements MVCC, which is the isolation level of read committed and repeatable reads. Based on optimistic locking theory, it compares transaction ID and read-view records to judge whether the analysis data is visible, so that most of its read operations do not need locking, thus improving concurrency performance.

However, InnoDB still needs to add an exclusive lock when writing data.

To sum up, this is to improve concurrency performance by replacing pessimistic locks with optimistic locks. This is CALLED MVCC.