Isolation level

Isolation levels are best understood alongside the data consistency problems that concurrent transactions can cause, which are covered in the next part.

The four transaction isolation levels

When multiple transactions execute concurrently, problems such as dirty reads, non-repeatable reads, and phantom reads may occur. These are all data consistency problems caused by the simultaneous execution of transactions, and transaction isolation exists to solve them.

Different isolation levels solve different consistency problems under concurrent execution, and they also affect concurrency and database performance: the higher the isolation level, the stronger and stricter the isolation, but the lower the concurrency and efficiency, and the worse the performance.

Therefore, in actual database use, these trade-offs are weighed and an appropriate isolation level is chosen.

The SQL standard has four transaction isolation levels:

Read Uncommitted

  • Read uncommitted: before a transaction commits, its changes can already be seen by other transactions.

In practice, very few databases use this isolation level. In some databases, such as PostgreSQL, setting this level has no effect (it behaves as Read Committed). Read Uncommitted permits dirty reads, non-repeatable reads, phantom reads, lost updates, and essentially every possible concurrency issue.
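The dirty read allowed by this level can be sketched with a small toy model (an illustration only, not how a real RDBMS is implemented): one "transaction" writes an uncommitted value, and a second one reads it before the first commits or rolls back.

```python
# Toy model of a dirty read under READ UNCOMMITTED.
class Database:
    def __init__(self):
        self.committed = {"balance": 100}   # durable, committed state
        self.uncommitted = {}               # in-flight changes of a writer

    def write(self, key, value):
        # Transaction A updates a row but has not committed yet.
        self.uncommitted[key] = value

    def read_uncommitted(self, key):
        # Transaction B sees A's in-flight change: a dirty read.
        return self.uncommitted.get(key, self.committed[key])

    def read_committed(self, key):
        # Transaction B sees only committed data: no dirty read.
        return self.committed[key]

    def rollback(self):
        self.uncommitted.clear()

db = Database()
db.write("balance", 0)                  # A sets balance to 0, not committed
dirty = db.read_uncommitted("balance")  # B reads 0 (dirty)
clean = db.read_committed("balance")    # B reads 100
db.rollback()                           # A aborts: the 0 never "existed"
print(dirty, clean)  # 0 100
```

After the rollback, the dirty value B observed corresponds to a state that never officially existed, which is exactly why this level is rarely used.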

Read Committed

  • Read committed: a transaction's changes become visible to other transactions only after it commits; before that, they are not visible.

With the exception of MySQL/MariaDB, Read Committed (RC) is the default isolation level for most mainstream databases.

Repeatable Read

  • Repeatable read: the data seen during a transaction's execution is always the same as the data seen when the transaction started. At this level, uncommitted changes made by other transactions are of course also invisible.

This is the default isolation level of the MySQL/MariaDB InnoDB engine. RR can still produce phantom reads, so MySQL introduced gap locking to solve that problem, at the cost of reduced concurrency and a greater tendency to deadlock.
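The difference between RC and RR can be sketched with a start-of-transaction snapshot (a simplification of how snapshot-based RR works, not a real engine): under RC a re-read sees other transactions' commits, under RR it does not.

```python
# Toy contrast between READ COMMITTED and REPEATABLE READ: RR reads
# from a snapshot taken when the transaction started, so repeated
# reads always agree.
import copy

class Store:
    def __init__(self):
        self.data = {"stock": 10}

    def begin_rr(self):
        # Snapshot of committed data at transaction start.
        return copy.deepcopy(self.data)

store = Store()
snapshot = store.begin_rr()    # transaction T1 starts under RR

store.data["stock"] = 3        # another transaction commits a change

rc_read = store.data["stock"]  # READ COMMITTED: sees 3 (non-repeatable)
rr_read = snapshot["stock"]    # REPEATABLE READ: still sees 10
print(rc_read, rr_read)  # 3 10
```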

Serializable

  • Serializable: a transaction's operations on data are completely invisible to other transactions (that is, completely isolated), equivalent to the effect of the transactions being executed sequentially, one after another.

Serializable is the highest isolation level. Serializability is usually implemented with locks: write locks are applied to writes, read locks to reads, and range locks when a SELECT uses a range condition. When a lock conflict occurs, the later transaction must wait until the earlier transaction finishes and releases its lock before continuing.
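The effect of lock-based serialization can be shown with a small sketch (plain Python threads standing in for transactions): each "transaction" holds an exclusive lock for its entire read-modify-write, so the two updates execute one after the other, as if serial, and neither update is lost.

```python
# Sketch of lock-based serialization of two concurrent withdrawals.
import threading
import time

balance = 100
lock = threading.Lock()

def withdraw(amount):
    global balance
    with lock:                      # exclusive lock held to end of "transaction"
        current = balance           # read
        time.sleep(0.01)            # widen the race window deliberately
        balance = current - amount  # write

t1 = threading.Thread(target=withdraw, args=(30,))
t2 = threading.Thread(target=withdraw, args=(20,))
t1.start(); t2.start()
t1.join(); t2.join()
print(balance)  # 50 -- without the lock, one update could be lost
```

Without the lock, both threads could read 100 and one decrement would overwrite the other (a lost update), which is exactly the anomaly serializable execution rules out.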

Brief summary of isolation levels

Isolation levels are usually implemented by locking, which sacrifices some degree of concurrency. Higher isolation levels increase the system's locking overhead and the likelihood of deadlocks; with lower isolation levels, developers must take care to avoid errors that are hard to detect.

A higher isolation level subsumes the guarantees of the lower levels, so a transaction that requests a given isolation level may legally be run at a higher one.

The Oracle database does not support READ UNCOMMITTED and REPEATABLE READ isolation levels.

READ UNCOMMITTED is also called browse access. READ COMMITTED is also called cursor stability. REPEATABLE READ may still permit phantom reads. SERIALIZABLE is also called isolation.

The default transaction isolation level in the SQL and SQL2 (SQL-92) standards is SERIALIZABLE.

The isolation level describes the degree to which data being updated (or read) by one transaction is visible to other transactions.

About Snapshot Isolation

The snapshot isolation level is an isolation level provided and implemented by the RDBMS itself. It works by taking a snapshot of the current data and performing read and modify operations against that snapshot. Under multi-version concurrency control, this scheme is often used to reduce the blocking caused by locks.

PostgreSQL implements the RR (repeatable read) isolation level with data snapshots. If READ_COMMITTED_SNAPSHOT is enabled in SQL Server, snapshots are also used at the read committed level; that is, row versions are used to maintain concurrent access between transactions and reduce blocking. In SQL Server, the SNAPSHOT isolation level can additionally be set and used explicitly.

In snapshot isolation, the data read by any statement in a transaction will be a transaction-consistent version of the data that existed at the start of the transaction. Data changes made by other transactions after the current transaction started are not visible to statements executed in the current transaction.

A SNAPSHOT transaction does not request locks when reading data, so its reads do not block other transactions' writes, and other transactions' writes do not block its reads. However, writing data in a SNAPSHOT transaction does prevent other (SNAPSHOT) transactions from writing the same data. Snapshot isolation is therefore less prone to conflicts and provides good concurrency for transaction execution.

Even when other isolation levels are implemented with snapshots, the SNAPSHOT isolation level proper is still very different from them. True snapshot isolation uses the versions of the data as of the start of the transaction throughout, avoiding dirty reads, lost updates, non-repeatable reads, and phantom reads; the other isolation levels implemented via the snapshot scheme still exhibit their corresponding concurrency issues.
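The write-write conflict handling described below is often called "first committer wins". A minimal sketch (a toy version check, not SQL Server's actual implementation): a transaction's write commits only if the row has not been modified since that transaction's snapshot was taken.

```python
# Toy first-committer-wins check used by snapshot isolation.
class SnapshotStore:
    def __init__(self):
        self.version = {"row1": 1}     # committed version number per row

    def begin(self):
        # Remember the row versions visible to this transaction.
        return dict(self.version)

    def commit_write(self, snapshot, row):
        if self.version[row] != snapshot[row]:
            raise RuntimeError("update conflict: retry the transaction")
        self.version[row] += 1         # install the new committed version

store = SnapshotStore()
t1 = store.begin()
t2 = store.begin()

store.commit_write(t1, "row1")         # T1 commits first: succeeds
try:
    store.commit_write(t2, "row1")     # T2 wrote the same row: aborted
    outcome = "committed"
except RuntimeError:
    outcome = "aborted"
print(outcome)  # aborted
```

This is the behavior behind the SQL Server "update conflict" error quoted below: the second writer is aborted and asked to retry.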

However, regardless of the isolation level, when multiple transactions write the same data, conflicts can occur and execution may block or abort.

For example, at the snapshot isolation level (SQL Server), two transactions that modify the same data will report the following error and abort execution.

Snapshot isolation transaction aborted due to update conflict. You cannot use snapshot isolation to access table 'dbo.StudentMarks' directly or indirectly in database 'AdventureWorks2016' to update, delete, or insert the row that has been modified or deleted by another transaction. Retry the transaction or change the isolation level for the update/delete statement.

Snapshot isolation levels in SQL Server:

The snapshot isolation level has been available since SQL Server 2005.

In snapshot isolation, the row versions updated by each transaction are maintained in TempDB. Once a transaction has begun, rows inserted, updated, or deleted in the table by other transactions are ignored by it; it continues to see its snapshot.

Transaction implementation mechanism

The implementation of transactions is relatively complex; it is achieved through several mechanisms and schemes working together to provide the guarantees.

For example, transaction atomicity is implemented through the rollback mechanism; transaction isolation and concurrency are guaranteed with the help of the locking mechanism and MVCC (multi-version concurrency control); transaction durability is implemented with WAL, checkpoints, crash recovery, and redo or rollback. Consistency is achieved both through constraints (integrity constraints such as unique indexes, foreign keys, CHECK, and NOT NULL) and through the atomicity, isolation, and durability mechanisms that keep data from being lost or corrupted, giving the database system a stable state.

Rollback mechanism

The rollback mechanism is the database's ability to recover to the state at the beginning of a transaction when the transaction fails. It is the core of atomicity and an important property for scheduling concurrent transactions.
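One common way to implement rollback is an undo log. A minimal sketch (a toy in-memory version, not a real engine's undo segment): before each in-place change, the old value is recorded, so an aborted transaction can be undone in reverse order to restore the state at transaction start.

```python
# Minimal undo-log sketch of transaction rollback.
data = {"a": 1, "b": 2}
undo_log = []

def tx_write(key, value):
    undo_log.append((key, data[key]))  # record the before-image first
    data[key] = value                  # then apply the change in place

tx_write("a", 10)
tx_write("b", 20)

# The transaction fails: replay the undo log backwards to roll back.
for key, old in reversed(undo_log):
    data[key] = old
undo_log.clear()

print(data)  # {'a': 1, 'b': 2}
```

Replaying in reverse order matters when the same key is written more than once: only the oldest before-image should survive.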

Locking

The locking (or blocking) mechanism is the most common way to implement concurrency control and transaction isolation.

Under the locking mechanism, a transaction sends a request to the system to lock a data object (such as a table or a record) before operating on it. After locking, the transaction has a degree of control over the data object, and other transactions cannot update it until the lock is released.

What kind of control a transaction has after locking depends on the type of lock. For example, there are exclusive locks (X locks), shared locks (S locks), intention locks, gap locks, and so on; classified by scope, there are also table-level locks and row-level locks. Together they ensure that transactions operating on data concurrently are isolated and do not interfere with each other.

A lock makes access to a data record mutually exclusive: while one transaction is accessing the record, no other transaction can modify it. This is one of the ways isolation (mutual exclusion) is guaranteed.
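The interaction between shared and exclusive locks is usually described by a compatibility matrix. A minimal sketch (just the S/X pair; real engines add intention and gap locks to this table): many readers may hold an S lock together, but an X lock excludes everything else.

```python
# Sketch of the shared/exclusive lock compatibility matrix.
def compatible(held, requested):
    # S is compatible only with S; X is compatible with nothing.
    return held == "S" and requested == "S"

print(compatible("S", "S"))  # True  -- two readers coexist
print(compatible("S", "X"))  # False -- a writer waits for readers
print(compatible("X", "S"))  # False -- readers wait for the writer
```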

Details about locks are described in the following sections.

MVCC

Multi-Version Concurrency Control (MVCC) is a concurrency control method used to implement concurrent access to databases. Its goals are similar to those of the locking mechanism, but it offers higher concurrency and less transaction blocking. MVCC never blocks read operations; only when multiple write operations update the same record does a blocking transaction conflict arise.

The MVCC protocol ensures that each transaction sees only data consistent with the snapshot taken when the transaction started; each transaction sees exactly one snapshot, which contains only data committed before the transaction started. This snapshot is not necessarily equal to the current state of the data.

The key to MVCC is efficiently maintaining multiple versions of each row, where each version represents the latest committed state of the data as of the start of some transaction.

MVCC also implements transaction isolation and concurrency control. Its purpose is to improve the concurrency of database access by preventing reads from blocking writes and writes from blocking reads. A read returns the latest version of the row visible to it; a write creates a new, isolated copy of the row for the update.

MVCC is described in detail in the following sections.

MVCC blocks only when multiple transactions write the same data (the same row of records) at the same time. Under the standard two-phase locking protocol (a growing phase in which locks are acquired and a shrinking phase in which they are released), both read and write operations can block, because there is only one version of each database object and both reads and writes must acquire the corresponding lock before accessing any data.
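The versioning described above can be sketched in a few lines (a toy model with integer commit ids standing in for transaction ids/timestamps, not a real engine's implementation): each write appends a new version, and a reader sees the newest version committed at or before its own snapshot, so reads never block.

```python
# Minimal sketch of MVCC row versioning with commit-id visibility.
versions = {"row1": [(1, "alice")]}    # row -> [(commit_id, value), ...]

def write(row, value, commit_id):
    versions[row].append((commit_id, value))   # never overwrites in place

def read(row, snapshot_id):
    # A reader sees the newest version committed at or before its snapshot.
    visible = [v for cid, v in versions[row] if cid <= snapshot_id]
    return visible[-1]

write("row1", "bob", commit_id=5)   # committed after an old reader began

print(read("row1", snapshot_id=3))  # alice -- old snapshot ignores commit 5
print(read("row1", snapshot_id=7))  # bob   -- new snapshot sees it
```

Because the writer appended a new version instead of overwriting, the old reader's view stayed stable without taking any lock.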

WAL

Write-Ahead Logging (WAL) is the standard way to implement transaction logging. The core of the WAL mechanism is to ensure that all changes are recorded in the transaction log before any change is made to the data itself, guaranteeing that committed updates are not lost. In other words, the log is written to disk first, and the data pages can be flushed to disk later.

The purpose of WAL is to ensure the integrity and correctness of the log, thereby providing transaction durability, supporting database crash recovery, and preventing data loss; it also improves database performance by replacing random data writes with sequential log writes.
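The "log first, data later" ordering can be sketched as follows (an in-memory toy; a real WAL also fsyncs the log and tags records with LSNs): because the log record is durable before the data page is touched, a crash between the two steps can be repaired by replaying the log.

```python
# Write-ahead logging sketch: log record first, data page second.
log = []            # the durable transaction log
pages = {"x": 1}    # data pages "on disk"

def commit(key, value):
    log.append((key, value))   # 1) the log record hits disk first
    # -- a crash here loses nothing: replay() will redo the change --
    pages[key] = value         # 2) the data page may be flushed later

def replay():
    # Crash recovery: reapply every logged change (an idempotent redo).
    for key, value in log:
        pages[key] = value

commit("x", 42)
pages["x"] = 0      # simulate a stale/lost page write after a crash
replay()
print(pages["x"])  # 42 -- the committed update survives
```

Redo must be idempotent: replaying the same record twice leaves the page in the same state, which is what makes recovery safe to restart.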

WAL logs do not necessarily record only logical transactions; they may also record data directly (for direct data recovery and the like), as with MySQL's redo log. This can vary from RDBMS to RDBMS and from version to version.

Checkpoint

A checkpoint is a marked position in the transaction log.

The basic process of a checkpoint is as follows: when a checkpoint is created, all dirty memory data pages up to that point are flushed to disk, and a checkpoint record is written to the WAL transaction log. The checkpoint mechanism thereby ensures that all data changes before the checkpoint have been written to the database files.

In the event of a database failure, the recovery process begins at the last checkpoint, and the transaction logs after that checkpoint (the redo log) are replayed to restore the database to a consistent state.

From the perspective of crash recovery, WAL logs from before the checkpoint are no longer needed, so the log files can be recycled. Different databases handle this differently: for example, MySQL's binlog is not overwritten cyclically but archived, to support replication or replay-based recovery (the binlog has no concept of a checkpoint, while the redo log does use checkpoints).

The checkpoint mechanism is used to:

  • Ensure database consistency: dirty pages in the buffer are written to disk so that memory and disk agree.
  • Let WAL logs expire as early as possible, enabling reuse and saving disk space; without checkpoints, WAL logs would never become obsolete.
  • Shorten recovery time: without a checkpoint, all WAL transaction logs would have to be replayed when the database system restarts, making recovery very long.

A checkpoint indicates the position before which logs can be cleared, the position up to which data has been written to the database files, and the position from which the redo log is replayed during crash recovery.
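These three roles can be sketched together (a toy model where the checkpoint is simply an index into an in-memory log; real systems record an LSN in a control file): flushing all dirty pages up to a point means recovery only replays records after that point, and the earlier log becomes reclaimable.

```python
# Checkpoint sketch: flush up to a log position, recover from it.
log = [("a", 1), ("b", 2), ("a", 3), ("b", 4)]
disk = {}
checkpoint = 0      # index into the log; everything before it is on disk

def do_checkpoint():
    global checkpoint
    for key, value in log[checkpoint:]:
        disk[key] = value       # flush dirty pages up to this point
    checkpoint = len(log)       # record the checkpoint position

def recover():
    # Crash recovery: redo only the log written after the checkpoint.
    for key, value in log[checkpoint:]:
        disk[key] = value

do_checkpoint()                 # after this, log[:checkpoint] is reclaimable
log.append(("a", 5))            # more activity after the checkpoint
recover()                       # replays just the one new record

print(disk)  # {'a': 5, 'b': 4}
```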

Crash Recovery

Crash Recovery is also described as being Crash-Safe. Crash-safety ensures that committed records are not lost even if the database restarts unexpectedly.

There are many reasons for an abnormal termination of a database system; usually the database service process has been forcibly terminated. Possible causes include: termination by the OOM Killer (Out-Of-Memory Killer) when memory is insufficient, an OS crash, server shutdown, server restart, server hardware failure, and so on.

The Crash Recovery mechanism ensures that the database can be recovered to a consistent state when the database service terminates abnormally. The goal of fault/disaster/exception recovery, and the design goal of a database system, is that as long as the data and logs on disk are not lost, the database loses no data.

The meanings of no data loss are as follows:

  • The database instance or service can be started again.
  • After the database is restarted, committed transactions still exist.
  • After recovery, the database is in a consistent state, with no data corruption.

Crash Recovery usually requires multiple files and mechanisms working together. For example, MySQL may use the redo log, undo log, binlog, and data files; PostgreSQL may use WAL log files, control files (containing checkpoint information and the like), transaction status files, and data files.

Database integrity

Database integrity constraints are an important way to ensure transaction consistency: modifications and operations on the database must satisfy certain constraints, keeping the database state stable.

More on this in the Introduction to Database integrity section.