“This is the first day of my participation in the first Wenwen Challenge 2022.First challenge in 2022”. Hello everyone, I am Wang Lao shi, Mysql storage engine is often encountered in practical applications and interviews, so how should we choose storage should be invited and what storage engine? Today we will talk about mysql storage engine ~


Storage engines

As you can see from the architecture diagram, one of the most important features that distinguishes MySQL database from other databases is its plug-in table storage engine. The MySQL plug-in storage engine architecture provides a series of standard management and service support. These standards have nothing to do with the storage engine itself and may be necessary for every database system, such as SQL analyzers and optimizers. The storage engine is the implementation of the underlying physical structure and actual file reading and writing. Each storage engine developer can develop as they wish. It is important to note that the storage engine is table based, not database based.

The advantage of plug-in storage engines is that each storage engine has its own characteristics and can create different storage engine tables according to specific applications. Due to the open source nature of MySQL database, users can write their own storage engine according to MySQL’s predefined storage engine interface. If users are not satisfied with the performance or function of a storage engine, they can modify the source code to get the features they want. This is the convenience and power of open source.

Due to the open source feature of MySQL database, storage engine can be divided into MySQL official storage engine and third-party storage engine. Some third-party storage engines are powerful, such as the InnoDB storage engine (originally acquired by Oracle), which is widely used. Even MySQL database OLTP(Online Transaction Processing) application is the most widely used storage engine.

MySQL official engine introduction

InnoDB storage engine

InnoDB is MySQL’s default transactional engine and the most important and widely used storage engine. It is designed to handle a large number of short-lived transactions, which are mostly committed normally and rarely rolled back. InnoDB’s performance and automatic crash recovery features make it popular for non-transactional storage requirements as well. Unless there is a very specific reason to use another storage engine, the InnoDB engine should be preferred. InnoDB is also a very good value if you want to learn about storage engines

The object that spends the most time learning in depth will definitely have a much higher return than the average time spent learning each storage engine. So the InnoDB engine will also be the focus of our study.

MylSAM storage engine

In MySQL 5.1 and earlier, MyISAM is the default storage engine. MyISAM offers a number of features, including full-text indexing, compression, spatial functions (GIS), and so on, but MyISAM does not support transaction and row-level locking, and has an unquestionable defect of not being able to safely recover from crashes. Although the MyISAM engine does not support transactions and does not support secure recovery after crashes, it is by no means without merit. For read-only data, or if the table is small enough to tolerate a repair operation, you can still use MyISAM (but don’t default to MyISAM; instead, use InnoDB by default). But MyISAM locks the entire table, not rows. Shared locks are added to all tables to be read while exclusive locks are added to all tables to be read. MyISAM is prone to typical performance issues due to table locking.

Mrg_MylSAM

Merge storage engine is a combination of MyIsam, that is, it merges multiple MyIsam engine tables, but it does not have internal data, the real data is still in MyIsam engine tables, but can directly query, delete and update operations.

Archive the engine

The Archive storage engine only supports INSERT and SELECT operations, and indexes were not supported prior to MySQL 5.1. The Archive engine caches all writes and uses Zlib to compress inserted rows, so it has less disk I/O than MyISAM tables. But each SELECT query requires a full table scan. Therefore, the Archive table is suitable for log and data collection applications, which often require full table scanning for data analysis. It can also be used in situations where faster INSERT operations are required. The Archive engine is not a transactional engine, but a simple engine optimized for high-speed insertion and compression.

Blackhole engine

The Blackhole engine does not implement any storage mechanism, and it will discard all inserted data without saving anything. But the server logs Blackhole tables, so it can be used to copy data to a standby repository, or simply to log. This special storage engine can be useful for special replication frameworks and log auditing. However, this engine has many problems with the way it is used, so it is not recommended.

CSV engine

The CSV engine can use ordinary CSV files (comma-separated value files) as MySQL tables

But this type of table does not support indexes. The CSV engine can copy files in and out of the database while it is running. Excel data can be stored as CSV files, and then copied to the MySQL data directory, can be opened in MySQL for use. Similarly, if you write data to a CSV engine table, other external programs can immediately read CSV data from the table’s data files. The CSV engine can therefore be very useful as a mechanism for data exchange.

The Federated engine

The Federated engine is a proxy that accesses other MySQL servers. It creates a client connection to the remote MySQL server, transfers queries to the remote server for execution, and extracts or sends the required data. Originally designed to compete with similar features in enterprise databases such as Microsoft SQL Server and Oracle, the storage engine was more of a marketing move. Although the lead

Engines seem to offer a good flexibility across servers, but often cause problems, so they are disabled by default.

The Memory engine

Memory tables (formerly known as HEAP tables) can be useful if you need to access data quickly and if the data can’t be modified or lost after a restart. Memory tables are at least an order of magnitude faster than MyISAM tables because each memory-based storage engine table actually corresponds to a disk file. The file name is the same as the table name, and the file type is FRM. Only the table structure is stored in this file. Its data files are stored in memory, which facilitates fast data processing and improves the efficiency of the entire table without disk I/O. So the structure of the Memory table remains after a restart, but the data is lost.

Memroy tables can work well in many scenarios:

Used to lookup or map tables, such as those that map zip codes to state names. Used to cache the results of periodically aggregated data.

Used to store intermediate data generated during data analysis.

Memory tables support Hash indexes, so lookups are very fast. While Memory tables are very fast, they are no substitute for traditional disk-based tables. Memroy tables are table-level locks, so concurrent write performance is low. It does not support BLOB or TEXT columns, and the length of each row is fixed, so even if a VARCHAR column is specified, the actual storage is converted to CHAR, which can result in a partial memory waste.

NDB cluster engine

A combination of a MySQL server, an NDB clustered storage engine, and a distributed, share-nothing, Dr, high-availability NDB database is called a MySQL Cluster.

Third party engines worth knowing

Percona’s XtraDB storage engine

An improved version of the InnoDB engine, already included in Percona Server and MariaDB, focuses on performance, scalability and operational flexibility. XtraDB can be used as a complete alternative to InnoDB. It can even read and write InnoDB data files and support all InnoDB queries.

TokuDB engine

A new index data structure called Fractal Trees is used. This structure is cache-free, so performance does not degrade even if its size exceeds memory, and there are no memory life cycle and fragmentation issues. TokuDB is a Big Data storage engine because it has a high compression ratio and can create a large number of indexes on a large amount of Data. Now the engine has also been acquired by Percona.

Tips: Fractal tree is a write-optimized disk index data structure. In general, the write performance of fractal tree (Insert/Update/Delete) is good, and it can ensure that the read performance of fractal tree is similar to that of B+ tree. According to the test results, the write performance of TokuDB fractal tree is better than that of InnoDB B+ tree, and the read performance is slightly lower than that of InnoDB B+ tree. The core idea of fractal tree is to make full use of the principle of data locality by using the MessageBuffer buffer update operation of nodes to transform random write into sequential write, which greatly improves the efficiency of random write.

Infobright

MySQL is row-oriented by default, each row of data is stored together, and server queries are processed in behavior units. Column-oriented storage is more efficient when processing large amounts of data. For example, HBASE is column-oriented storage.

Infobright is the best known column-oriented storage engine. The engine works well with very large data volumes (tens of terabytes). Infobright is designed for data analytics and data warehousing applications. The data is highly compacted and sorted by blocks, with a set of metadata for each block. When processing a query, access to metadata can decide to skip the block, or perhaps even only the metadata is needed to satisfy the requirements of the query. The engine does not support indexes, but at such a large data level, even indexes are hardly useful, and the block structure is a quasi-index. Infobright needs to customize the MySQL server because some places need to be modified to accommodate column-oriented storage. If queries cannot be executed in column-oriented mode at the storage tier, they need to be converted to row by row at the server tier, which can be slow. Infobright is available in both community and business editions.

other

For graph operation, full text retrieval, MySQL under the corresponding storage engine, we can consult.

Four,Choose the right engine

With so many storage engines, how do we choose? In most cases InnoDB is the right choice, so InnoDB is the default storage engine in MySQL 5.5. How to choose a storage engine can be summed up in a simple sentence: “Unless you need some features that InnoDB does not have and there is no other way to replace them, you should choose InnoDB engine first.” For example, only MyISAM in MySQL supports geospatial search.

Of course, if you don’t need InnoDB’s features and the features of other engines can better meet your needs, you can also consider other storage engines. For example, if you don’t care about scalability and concurrency, or data loss after a crash, but are sensitive to InnoDB taking up too much space, MyISAM is a good choice.

It is recommended not to mix storage engines unless absolutely necessary, as this can lead to a host of complex problems, as well as potential bugs and boundary issues. The interaction between the storage engine layer and the server layer is already complex, let alone mixing multiple storage engines. At the very least, mixed storage presents some difficulties with consistent backup and server parameter configuration.

Five, table engine conversion

There are several ways to convert the storage engine of a table to another engine. Each method has its advantages and disadvantages. Three methods are commonly used

ALTER TABLE

The easiest way to change a TABLE from one engine to another is to use the ALTER TABLE statement. Change mytable engine to InnoDB:

mysql> ALTER TABLE mytable ENGINE = InnoDB;
Copy the code

The above syntax can be applied to any storage engine. However, it takes a long time to execute. In implementation, MySQL copies data from the original table to a new table in rows, which may consume all the SYSTEM’S I/O capacity during replication, and the original table is locked with a read. So be very careful when doing this on busy tables.

If the storage engine of the table is converted, all features associated with the original engine will be lost.

Export and import

You can also use the mysqldump tool to export the data to a file and then change the storage engine options for the CREATE TABLE statement in the file. Change the TABLE names as well, because the same TABLE name cannot exist in the same database, even if they use different storage engines.

CREATE and SELECT

Create a table for a new storage engine, then use INSERT… SELECT syntax to derive data:

mysql>CREATE TABLE innodb_table LIKE myisam_table;

mysql>ALTER TABLE innodb_table ENGINE=InnoDB;

mysql>INSERT INTO innodb_table SELECT * FROM myisam_table;
Copy the code

If there is a large amount of data, consider batching and performing a transaction commit operation for each piece of data.

Six, check mineMySQLThe engine of

1. Check the storage engines supported by MySQL

MariaDB [(none)]>  show engines;
Copy the code

2. Check the default storage engine of MySQL

MariaDB [(none)]>  show variables like '%storage_engine%';
Copy the code

Seven, My I S A M 和I nn o D BTo compare

\