1. WiredTiger

A storage engine manages how data is stored on disk and in memory.

1) Checking the MongoDB storage engine

Since MongoDB 3.2, WiredTiger is the default storage engine. The engine in use can be checked from the mongo shell:

db.serverStatus().storageEngine

2) Principle analysis

WiredTiger uses a B-tree to manage data. The structure of the B-tree is shown below.

1. MongoDB did not originally expose multi-document transactions (they arrived in MongoDB 4.0), but the WiredTiger storage engine implements transactions with ACID guarantees through transaction snapshots and concurrency control.

2. WiredTiger uses copy-on-write to manage modifications (insert, update, delete). Modifications are first held in the cache. At each checkpoint, WiredTiger writes all modified B-tree pages to persistent storage, which produces a new root page.

3. Most databases use disk-based indexes; in MongoDB, each B-tree node is one page. During a lookup, B-tree nodes are read from disk into memory, and written back to disk, one page at a time.

4. MongoDB is a document-level data store with a simple structure, so a lookup needs few I/O operations.
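The copy-on-write behavior in point 2 can be sketched with path copying on an immutable tree. This is only an illustration under simplifying assumptions: a binary search tree stands in for WiredTiger's B-tree pages, and `cowInsert`, `cowGet`, and the node shape are hypothetical names, not WiredTiger's API.

```javascript
// Hypothetical sketch of copy-on-write via path copying.
// A binary search tree stands in for the B-tree pages.
function cowInsert(node, key, value) {
  if (node === null) {
    return { key, value, left: null, right: null };
  }
  if (key === node.key) {
    // Copy the node with the new value; the old node is left untouched.
    return { ...node, value };
  }
  if (key < node.key) {
    return { ...node, left: cowInsert(node.left, key, value) };
  }
  return { ...node, right: cowInsert(node.right, key, value) };
}

function cowGet(node, key) {
  while (node !== null) {
    if (key === node.key) return node.value;
    node = key < node.key ? node.left : node.right;
  }
  return undefined;
}

let root = null;
root = cowInsert(root, 10, "a");
root = cowInsert(root, 5, "b");
root = cowInsert(root, 15, "c");

// A "checkpoint" simply remembers the root that was current at that moment.
const checkpointRoot = root;

// A later update copies only the path to the changed node and yields a new root.
root = cowInsert(root, 5, "b2");
```

After the update, `checkpointRoot` still reads the old value for key 5, `root` reads the new one, and the unchanged subtree (key 15) is shared by both roots rather than copied.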

In MySQL, the root and branch nodes of the B+ tree do not store data; only the leaf nodes do.

1. In a B+ tree, each node stores keys and pointers to its child nodes.

2. It differs from a B-tree in two obvious ways: non-leaf nodes do not hold the data region, and key ranges are left-closed intervals.

MySQL is a relational database, where range queries are common; MongoDB is a document-level store and relies on them less.

MySQL uses a B+ tree, while MongoDB uses a B-tree:

1. In a B+ tree, non-leaf nodes hold no data, so each node can store more keys.

2. Nodes are loaded into memory one page at a time. Because non-leaf nodes hold no data, a full scan only needs to read the leaf nodes.

3. The leaf nodes of a B+ tree are chained together in a linked list and stored in key order, which makes range queries efficient.

4. MongoDB is designed to minimize disk I/O to improve performance. A B-tree stores keys and data in the same node, so a lookup can stop as soon as the key is found; this is why MongoDB uses a B-tree.
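The contrast in the list above can be made concrete with a small sketch. The node shapes and the names `btreeFind` and `bplusRange` are hypothetical, and the trees are hand-built rather than balanced by insertion; the point is only to show where the data lives in each structure.

```javascript
// B-tree style: keys and their data live in the same node,
// so a lookup can stop as soon as the key is found.
const btreeRoot = {
  keys: [20],
  values: [{ name: "doc20" }],
  children: [
    { keys: [10], values: [{ name: "doc10" }], children: [] },
    { keys: [30], values: [{ name: "doc30" }], children: [] },
  ],
};

function btreeFind(node, key, depth = 1) {
  let i = 0;
  while (i < node.keys.length && key > node.keys[i]) i++;
  if (i < node.keys.length && node.keys[i] === key) {
    return { value: node.values[i], nodesVisited: depth };
  }
  if (node.children.length === 0) return null; // reached a leaf without a match
  return btreeFind(node.children[i], key, depth + 1);
}

// B+ tree style: data only in the leaves, which are chained in key order,
// so a range query walks the linked list of leaves.
const leafC = { keys: [30, 40], values: ["r30", "r40"], next: null };
const leafB = { keys: [20, 25], values: ["r20", "r25"], next: leafC };
const leafA = { keys: [5, 10], values: ["r5", "r10"], next: leafB };

// For brevity the scan starts at the first leaf; a real B+ tree would
// first descend from the root to the leaf that contains `lo`.
function bplusRange(firstLeaf, lo, hi) {
  const out = [];
  for (let leaf = firstLeaf; leaf !== null; leaf = leaf.next) {
    for (let i = 0; i < leaf.keys.length; i++) {
      if (leaf.keys[i] > hi) return out;
      if (leaf.keys[i] >= lo) out.push(leaf.values[i]);
    }
  }
  return out;
}
```

In this sketch, `btreeFind(btreeRoot, 20)` finds its document after visiting a single node, while `bplusRange(leafA, 10, 30)` collects the range by following the `next` pointers between leaves.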


2. WiredTiger transaction implementation

MongoDB can persist data to files on disk, or keep data only in memory.

Storage engines differ in their storage mechanisms, locking techniques, indexing techniques, and so on, and therefore perform differently. WT implements transactions using three key techniques: snapshots, MVCC, and the redo log.

Wt_transaction defines the global transaction object, which includes the transaction ID. MVCC (multi-version concurrency control) is built on a linked list hanging off the value of each key-value pair; each list entry mainly stores a transaction ID and the value written by that transaction.

1) Snapshot

In the WT engine, a snapshot_object consists of the smallest executing transaction snap_min, the largest transaction snap_max, and snap_array, the sequence of all write transactions still in progress within the interval [snap_min, snap_max]. If, as in the figure above, a snapshot is taken at time T6, transaction T6 can see changes in two ranges: changes made by all transactions smaller than T1, that is [0, T1), and changes made by transactions in [snap_min, snap_max] that have already committed, such as T2. In other words, changes made by any transaction in snap_array, or by any transaction whose ID is greater than snap_max, are not visible to T6. If T1 commits after the snapshot is created, T6 still cannot see T1's changes. This is the basic principle behind snapshot isolation.
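The visibility rule described above can be written as a short check. This is a sketch under stated assumptions: transaction IDs are plain integers, the field names `snapMin`, `snapMax`, and `snapArray` mirror the text rather than WiredTiger's source, and the set of running transactions (1, 4, 5) is made up for the example.

```javascript
// Hypothetical snapshot shape following the description in the text.
function makeSnapshot(runningTxnIds, maxTxnId) {
  const snapArray = [...runningTxnIds].sort((a, b) => a - b);
  return {
    // With no running transactions, everything up to maxTxnId is visible.
    snapMin: snapArray.length > 0 ? snapArray[0] : maxTxnId + 1,
    snapMax: maxTxnId,
    snapArray,
  };
}

function isVisible(snapshot, txnId) {
  if (txnId < snapshot.snapMin) return true;  // finished before the oldest running txn
  if (txnId > snapshot.snapMax) return false; // started after the snapshot was taken
  return !snapshot.snapArray.includes(txnId); // in range: visible unless still running
}

// Snapshot taken by T6 while T1, T4 and T5 are still running (assumed for the example).
const snapT6 = makeSnapshot([5, 1, 4], 5);
```

With this snapshot, `isVisible(snapT6, 2)` is true because T2 has already committed, while T1, T4, T5, and anything later than T5 stay invisible, even if T1 commits afterwards, since the snapshot itself never changes.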

2) MVCC

What is a concurrency control method?

If one user reads data from the database while another writes it, the reader may see incomplete data. Databases have proposed many ways to solve this problem; these are called concurrency control methods.

One solution is locking, as in MySQL: all readers wait until the writers finish, which is inefficient. MongoDB uses MVCC, a different approach to concurrency control.

After each write, the modification is wrapped in the wT_MVCC data format and added at the head of the linked list (the red arrow in the figure). Each read then starts from the list head and uses the transaction ID of each version, together with the snapshot of the read, to decide whether that version is readable. If it is not, the read moves toward the tail of the MVCC list until it finds a readable version.
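A minimal sketch of this version chain follows. The names `mvccWrite` and `mvccRead` and the node shape are hypothetical, and visibility is passed in as a predicate so the example stays independent of any particular snapshot format.

```javascript
// Hypothetical MVCC version chain: the newest version sits at the head.
function mvccWrite(head, txnId, value) {
  // Prepend the new version; older versions remain reachable through `next`.
  return { txnId, value, next: head };
}

function mvccRead(head, canSee) {
  // Walk from the head toward the tail until a visible version is found.
  for (let version = head; version !== null; version = version.next) {
    if (canSee(version.txnId)) return version.value;
  }
  return undefined; // no version is visible to this reader
}

let chain = null;
chain = mvccWrite(chain, 1, "v1");
chain = mvccWrite(chain, 3, "v3");
chain = mvccWrite(chain, 5, "v5");

// A reader whose snapshot only covers transactions up to ID 3 (assumed).
const seenByOldReader = mvccRead(chain, (txnId) => txnId <= 3);
```

Here the reader skips the version written by transaction 5 and gets `"v3"`, without ever blocking the writer.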

3) Redo log

A checkpoint is performed every 60 seconds, or when the log file reaches 2 GB, to persist the current data and generate a new snapshot.
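The trigger condition can be expressed as a simple predicate. The function and constant names are hypothetical, and treating 2 GB as 2 × 1024³ bytes is an assumption made for this sketch.

```javascript
// Thresholds taken from the text; the byte interpretation of "2 GB" is assumed.
const CHECKPOINT_INTERVAL_MS = 60 * 1000;
const CHECKPOINT_LOG_BYTES = 2 * 1024 ** 3;

// Returns true when either threshold has been reached since the last checkpoint.
function shouldCheckpoint(msSinceLastCheckpoint, logBytesSinceLastCheckpoint) {
  return (
    msSinceLastCheckpoint >= CHECKPOINT_INTERVAL_MS ||
    logBytesSinceLastCheckpoint >= CHECKPOINT_LOG_BYTES
  );
}
```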

4) Transaction implementation

The general flow:

1. Create a transaction object and begin the transaction.

2. If a conflict occurs or the transaction fails to execute, roll it back.

3. If execution completes, commit it.

When a transaction starts, the WiredTiger storage engine creates the transaction object and adds it to the global transaction manager. During execution, a read operation requires no extra work, because reads do not need to be rolled back or committed. The detailed flow is shown below.
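The begin, rollback, and commit steps can be sketched as a toy transaction manager. Every name here (`TxnManager`, `begin`, `write`, and so on) is hypothetical and the conflict rule is deliberately simplistic; it is not WiredTiger's real API or conflict detection.

```javascript
// Toy transaction manager; hypothetical names, simplified conflict rule.
class TxnManager {
  constructor() {
    this.nextId = 1;
    this.active = new Map(); // global table of running transactions
    this.store = new Map();  // committed key -> value
  }

  begin() {
    const txn = { id: this.nextId++, writes: new Map(), state: "running" };
    this.active.set(txn.id, txn); // register with the global manager
    return txn;
  }

  read(txn, key) {
    // Reads record nothing, so there is nothing to commit or roll back.
    return txn.writes.has(key) ? txn.writes.get(key) : this.store.get(key);
  }

  write(txn, key, value) {
    // Conflict: another running transaction has an uncommitted write to this key.
    for (const other of this.active.values()) {
      if (other !== txn && other.writes.has(key)) {
        this.rollback(txn);
        throw new Error(`write conflict on ${key}`);
      }
    }
    txn.writes.set(key, value);
  }

  commit(txn) {
    for (const [key, value] of txn.writes) this.store.set(key, value);
    txn.state = "committed";
    this.active.delete(txn.id);
  }

  rollback(txn) {
    txn.writes.clear(); // discard buffered changes
    txn.state = "rolled back";
    this.active.delete(txn.id);
  }
}

const mgr = new TxnManager();
const t1 = mgr.begin();
mgr.write(t1, "a", 1);
const t2 = mgr.begin();
let conflictSeen = false;
try {
  mgr.write(t2, "a", 2); // conflicts with t1's uncommitted write
} catch (err) {
  conflictSeen = true;   // t2 has already been rolled back
}
mgr.commit(t1);
```

In this run, the second writer is rolled back on conflict and the first commits, leaving `"a"` mapped to `1` and no transactions registered as active.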
