Have you mastered the core knowledge of MySQL database?


Recommended reading:

  • Understand all basic Java knowledge questions
  • Understand all the Java collection interview questions
  • Understand all the computer network interview questions
  • Database index frequent interview questions

Database transactions

What are database transactions?

A database transaction is a sequence of operations that access, and possibly modify, various data items. These operations either all execute or none of them do: a transaction is an indivisible unit of work. A transaction consists of all the database operations performed between its beginning and its end.
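
For example, a classic bank transfer must debit one account and credit another as a single unit. A minimal sketch, assuming a hypothetical account(id, balance) table:

    -- assumes a hypothetical table: account(id INT PRIMARY KEY, balance DECIMAL)
    START TRANSACTION;
    UPDATE account SET balance = balance - 100 WHERE id = 1;  -- debit
    UPDATE account SET balance = balance + 100 WHERE id = 2;  -- credit
    COMMIT;    -- both updates become permanent together

    -- If anything fails before COMMIT, undo everything:
    -- ROLLBACK;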

What are the four characteristics of transactions?

  • Atomicity: the operations contained in a transaction either all succeed, or all fail and are rolled back.
  • Consistency: the database is in a consistent state both before and after the transaction executes.
  • Isolation: changes made by one transaction are not visible to other transactions until it finally commits.
  • Durability: once a transaction commits, its changes are stored permanently in the database.

Database concurrency consistency issues

When multiple transactions are executed concurrently, the following problems may occur:

  • Dirty read: transaction A updates data but has not yet committed; transaction B reads the updated data; then transaction A rolls back, so the data transaction B read is dirty.
  • Non-repeatable read: transaction A reads the same data multiple times, while transaction B updates and commits it in between; as a result, transaction A's repeated reads return inconsistent values.
  • Phantom read: after transaction A reads a range of data, transaction B inserts several rows into that range; when transaction A reads again, it finds extra rows that are inconsistent with what it read before.
  • Lost update: transactions A and B both modify the same data; A modifies it first, B modifies it later, and B's modification overwrites A's.

Non-repeatable reads and phantom reads look similar, but the main difference is that a non-repeatable read sees inconsistency because existing data has been updated, while a phantom read sees inconsistency mainly because rows have been added or removed.
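
As an illustration, here is a sketch of a non-repeatable read at the read committed level, again using the hypothetical account table:

    -- Session A:
    SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
    START TRANSACTION;
    SELECT balance FROM account WHERE id = 1;   -- returns 100

    -- Session B commits an update in the meantime:
    UPDATE account SET balance = 200 WHERE id = 1;

    -- Session A, still inside the same transaction:
    SELECT balance FROM account WHERE id = 1;   -- now returns 200: a non-repeatable read
    COMMIT;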

What are the isolation levels for the database?

  • Read uncommitted: changes made by a transaction are visible to other transactions even before the transaction commits.
  • Read committed: a transaction's changes become visible to other transactions only after it commits.
  • Repeatable read: data read repeatedly within the same transaction is consistent.
  • Serializable: requires a locking implementation that forces transactions to execute serially.

The isolation levels solve dirty reads, non-repeatable reads, and phantom reads to different degrees, as shown below:

Isolation level     Dirty read     Non-repeatable read     Phantom read
Read uncommitted    Allowed        Allowed                 Allowed
Read committed      Not allowed    Allowed                 Allowed
Repeatable read     Not allowed    Not allowed             Allowed
Serializable        Not allowed    Not allowed             Not allowed

MySQL’s default isolation level is repeatable read.
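
You can check and change the level per session. A sketch; the variable name transaction_isolation is the MySQL 8.0 spelling (older versions use tx_isolation):

    SELECT @@transaction_isolation;   -- REPEATABLE-READ by default
    SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
    SELECT @@transaction_isolation;   -- now READ-COMMITTED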

How are isolation levels achieved?

Transaction isolation is implemented mainly through locking and MVCC (multi-version concurrency control): read committed and repeatable read can be implemented with MVCC, while serializable is implemented with locking.

What is MVCC?

Multi-Version Concurrency Control (MVCC) is a method of concurrency control, used mainly to improve the concurrency performance of databases.

You should first understand current and snapshot reads before learning about MVCC.

  • Current read: reads the latest version of the data and locks the record being read, ensuring that other transactions cannot modify it concurrently.
  • Snapshot read: a read that takes no locks. MVCC serves reads from snapshots to avoid the performance cost of locking, as the sketch below shows.
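
In MySQL terms the two kinds of read look roughly like this (a sketch; FOR SHARE is the MySQL 8.0 spelling of LOCK IN SHARE MODE):

    -- Snapshot read: a plain SELECT, served by MVCC, no locks taken
    SELECT * FROM account WHERE id = 1;

    -- Current reads: read the latest version and lock the record
    SELECT * FROM account WHERE id = 1 LOCK IN SHARE MODE;   -- S lock
    SELECT * FROM account WHERE id = 1 FOR UPDATE;           -- X lock
    -- UPDATE, DELETE, and INSERT also perform current reads internally
    UPDATE account SET balance = balance + 1 WHERE id = 1;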

From these two read types it can be seen that the role of MVCC is to resolve the database's read-write conflicts without locking; it addresses dirty reads, phantom reads, and non-repeatable reads, but it cannot solve the lost update problem.

How MVCC is implemented:

  • Version numbers

    System version number: an auto-incrementing ID that increases every time a new transaction starts.

    Transaction version number: the system version number at the moment the transaction starts. It can be used to determine the chronological order of transactions.

  • Hidden columns in row records

    DB_ROW_ID (6 bytes): an implicit auto-increment ID. If the table does not specify a clustered index, InnoDB uses this hidden ID to create one.

    DB_TRX_ID (6 bytes): the ID of the most recent modifying transaction, i.e. the transaction that created the record or last modified it.

    DB_ROLL_PTR (7 bytes): a rollback pointer to the previous version of this record.

    (Diagram of a row with these hidden columns omitted; the concrete field values are elided.)

  • The undo log

    The snapshots used by MVCC are stored in the undo log, which chains all the snapshots of a row together, one version at a time, through their rollback pointers. (Diagram omitted.)

Consider a simple example. Suppose a record is first inserted: its DB_TRX_ID records the inserting transaction and its rollback pointer is still empty. (Diagram omitted.)

Now a transaction comes along and modifies the record's age field; the old version is copied into the undo log, and the row's rollback pointer points at it.

Then another transaction, transaction 2, changes the record's gender, and the previous version is likewise pushed into the undo log.

From the above analysis, it can be seen that every time a transaction modifies the same record, the old versions are chained into a linked list in the undo log, with the most recent old version at the head of the list.

InnoDB’s workflow at the repeatable read isolation level:

  • SELECT

    For a row to appear in the query result, both of the following conditions must be met:

    1. The creation version number of the row snapshot must be less than or equal to the version number of the current transaction. This ensures that the snapshot the current transaction reads either existed before the transaction started, or was inserted or modified by the transaction itself.
    2. The deletion version number of the row snapshot must be undefined or greater than the version number of the current transaction. If the deletion version number is less than or equal to the current transaction's version number, the row snapshot has already been deleted and must not be read.
  • INSERT

    Use the current system version number as the creation version number of the data row snapshot.

  • DELETE

    Use the current system version number as the deleted version number of the data row snapshot.

  • UPDATE

    Save the current system version number as the deletion version number of the row snapshot before the update, and as the creation version number of the new row snapshot written after the update. In effect, an UPDATE deletes the old row snapshot and inserts a new one.

To sum up, MVCC resolves concurrent read-write conflicts without locking, and it can implement the two isolation levels read committed and repeatable read.
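
A quick two-session sketch of snapshot reads under repeatable read, using the hypothetical account table from earlier:

    -- Session A (REPEATABLE READ):
    START TRANSACTION;
    SELECT balance FROM account WHERE id = 1;              -- snapshot read: 100

    -- Session B commits an update:
    UPDATE account SET balance = 200 WHERE id = 1;

    -- Session A, same transaction:
    SELECT balance FROM account WHERE id = 1;              -- still 100: snapshot read via MVCC
    SELECT balance FROM account WHERE id = 1 FOR UPDATE;   -- 200: current read, locks the row
    COMMIT;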

Database locks

What are database locks?

When concurrent transactions run against a database, the mechanism that ensures orderly access to the data is called the locking mechanism.

How do locks relate to the database's isolation levels?

Isolation level     Locking implementation
Read uncommitted    Always read the latest data; no locks are taken
Read committed      Shared locks are taken when data is read and released as soon as the read finishes
Repeatable read     Shared locks are taken when data is read and held until the transaction ends
Serializable        The entire key range is locked, and the lock is held until the transaction ends

What are the types of database locks?

MySQL locks can be divided into three types according to the lock granularity:

MySQL lock type     Overhead    Locking speed    Deadlocks    Granularity    Concurrency
Table-level lock    Low         Fast             No           Coarse         Low
Row-level lock      High        Slow             Yes          Fine           High
Page-level lock     Medium      Medium           Yes          Medium         Medium

MyISAM uses table locking by default and InnoDB uses row locking by default.

By how they restrict access, locks can be divided into shared locks and exclusive locks:

  • Shared lock: also called a read lock, abbreviated S lock. When a transaction places an S lock on a data object, it can read the object but not update it, and while the lock is held, other transactions can only place S locks on the object, not X locks.
  • Exclusive lock: also called a write lock, abbreviated X lock. When a transaction places an X lock on a data object, it can both read and update the object, and while the lock is held, no other transaction can place any lock, S or X, on the object.

Their compatibility is as follows:

            S lock        X lock
S lock      Compatible    Conflict
X lock      Conflict      Conflict
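
A sketch of this compatibility in practice, using the hypothetical account table from earlier:

    -- Session A:
    START TRANSACTION;
    SELECT * FROM account WHERE id = 1 LOCK IN SHARE MODE;  -- takes an S lock on the row

    -- Session B:
    SELECT * FROM account WHERE id = 1 LOCK IN SHARE MODE;  -- succeeds: S is compatible with S
    SELECT * FROM account WHERE id = 1 FOR UPDATE;          -- blocks: X conflicts with S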

What are the InnoDB engine's row lock modes and how are they implemented?

Row lock mode

When row locks and table locks coexist, a transaction that wants to place an X lock on a table must check whether any other transaction has locked the table or any row in it. Checking every row of the table one by one would be very inefficient, so to solve this problem InnoDB implements multi-granularity locking with two kinds of internally used intention locks, both of which are table-level locks.

  • Intention shared lock: abbreviated IS lock. Before a transaction places an S lock on a row, it must first obtain an IS lock on the table.
  • Intention exclusive lock: abbreviated IX lock. Before a transaction places an X lock on a row, it must first obtain an IX lock on the table.

With intention locks, a transaction that wants to place an X lock on a table only needs to check whether any other transaction holds an X/IX/S/IS lock on the table, rather than checking every row.

The compatibility of these locks is as follows:

        X             IX            S             IS
X       Conflict      Conflict      Conflict      Conflict
IX      Conflict      Compatible    Conflict      Compatible
S       Conflict      Conflict      Compatible    Compatible
IS      Conflict      Compatible    Compatible    Compatible

InnoDB implements row locks by locking index entries. If the data has no index, InnoDB locks records through the hidden clustered index.

There are three types of InnoDB row locks:

  • Record lock: locks a single index entry.
  • Gap lock: locks the "gap" between index entries, or the gap before the first entry or after the last entry.
  • Next-key lock: a combination of the two; it locks the record together with the gap in front of it.

Note on InnoDB row locks: if data is retrieved without going through an index, InnoDB locks all records in the table, which has the same effect as a table lock.

MVCC alone does not solve the phantom read problem; at the repeatable read isolation level, phantom reads are prevented by combining MVCC with next-key locks.
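
For example, under repeatable read a locking range query takes next-key locks that also block inserts into the covered gaps. A sketch, assuming a hypothetical user table with an index on age:

    -- Session A (REPEATABLE READ):
    START TRANSACTION;
    SELECT * FROM user WHERE age BETWEEN 20 AND 30 FOR UPDATE;
    -- next-key locks cover the matching index records and the gaps between them

    -- Session B:
    INSERT INTO user (age) VALUES (25);   -- blocks until Session A commits: no phantom rows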

What are optimistic locking and pessimistic locking in databases, and how are they implemented?

Optimistic locking: the system assumes that most data updates will not conflict, so the database checks for conflicts only when the update is committed; if a conflict is found, the update fails.

Optimistic locking is usually implemented with a version number column or a CAS (compare-and-swap) algorithm.
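
A common version-number sketch, assuming the hypothetical account table carries an extra version column:

    -- Read the row together with its version
    SELECT balance, version FROM account WHERE id = 1;   -- e.g. balance = 100, version = 5

    -- Update only if no one else changed the row in the meantime
    UPDATE account
    SET balance = 200, version = version + 1
    WHERE id = 1 AND version = 5;
    -- If the statement reports 0 affected rows, another transaction won: retry or fail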

Pessimistic locking: assumes concurrency conflicts will happen and blocks every operation that might violate data integrity. In other words, every time data is retrieved it is assumed that someone else will modify it, so the data is locked as it is read.

Pessimistic locking is implemented with the database's own locking mechanism, for example by appending FOR UPDATE to the query.
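
For example (a sketch):

    START TRANSACTION;
    SELECT balance FROM account WHERE id = 1 FOR UPDATE;  -- X lock: others must wait
    UPDATE account SET balance = balance - 100 WHERE id = 1;
    COMMIT;   -- the lock is released here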

What is a deadlock? How to avoid it?

A deadlock is a standstill caused by two or more processes competing for resources or waiting on each other during execution. In MySQL, MyISAM always acquires all the locks it needs at once; it either gets them all or waits, so deadlocks cannot occur. In the InnoDB storage engine, except for transactions consisting of a single SQL statement, locks are acquired gradually, so deadlocks are possible.
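
A minimal sketch of how such a deadlock arises when two transactions lock rows in opposite order:

    -- Session A:
    START TRANSACTION;
    UPDATE account SET balance = 0 WHERE id = 1;   -- locks row 1

    -- Session B:
    START TRANSACTION;
    UPDATE account SET balance = 0 WHERE id = 2;   -- locks row 2
    UPDATE account SET balance = 0 WHERE id = 1;   -- waits for Session A

    -- Session A:
    UPDATE account SET balance = 0 WHERE id = 2;   -- waits for Session B: deadlock
    -- InnoDB detects the cycle and rolls back one transaction with
    -- ERROR 1213: Deadlock found when trying to get lock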

How MySQL avoids deadlocks and lock conflicts:

  • If multiple tables are accessed concurrently by different programs, try to access the tables in the same order.

  • When the program processes data in batches, if the data is sorted, try to ensure that each thread processes records in the same order.

  • In a transaction, if you need to update a record, request an exclusive lock of sufficient strength directly instead of requesting a shared lock first; by the time you upgrade to an exclusive lock, other transactions may already hold a shared lock on the same record, causing lock conflicts or even deadlocks.

  • Try to use a lower isolation level.

  • Try to use indexes to access data so that locking is more precise, reducing the chance of lock conflicts.

  • Choose a reasonable transaction size; small transactions are less likely to run into lock conflicts.

  • Try to access data with equal conditions to avoid the impact of next-key locks on concurrent inserts.

  • Do not request a lock level beyond what is actually needed, and avoid explicit locking in queries unless necessary.

  • For certain transactions, table locks can be used to speed up processing or reduce the probability of deadlocks.
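
When a deadlock does occur anyway, MySQL can show the most recent one and lets you bound how long a transaction waits for a row lock:

    SHOW ENGINE INNODB STATUS;                    -- see the LATEST DETECTED DEADLOCK section
    SET SESSION innodb_lock_wait_timeout = 10;    -- give up waiting after 10 seconds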