In my previous article we looked at how to use logical backups of MySQL and did a simple logical backup recovery example. In this article we will look at some physical backups of MySQL.

A, what kind of backup is database physics class backup

Before we look at the physical backup of MySQL, we need to know what is the database physical backup? Since it is a physical backup, it must correspond to the physical object of the database. Just as a logical backup is a backup of a database logical object based on our business logic, a physical backup of a database is a backup of a database physical object.

Physical objects of a database consist of physical data files, log files, and configuration files. In the MySQL database, in addition to some log files and system table data files shared by the MySQL system, each storage engine also has its own different physical objects. In the following sections, we will list the physical objects (physical files) corresponding to several common storage engines. In order to know which files need to be backed up and which files do not need to be backed up when various storage engines do physical backup.

MySQL physical backup file required

①MyISAM storage engine

All data for MyISAM storage engine is stored in the ‘datadir’ directory specified in MySQL configuration. In fact, regardless of whether we are using MyISAM storage engine or any other storage engine, each database will have a folder under the “datadir” directory (the same goes for mysql, the database that contains system information). There are three files for each MyISAM storage engine table in each database: the “.frm “file that records table structure metadata, the”.myd “file that stores table data, and the”.myi “file that stores index data. Since MyISAM is a non-transactional storage engine, it does not have its own log files. So MyISAM storage engine physical backup, in addition to backing up the MySQL system’s common physical files, only the above three files can be backed up.

②Innodb storage engine

Innodb storage engine is a transactional storage engine and may store data in a different location than MyISAM storage engine, depending on how Innodb is configured. Innodb_data_home_dir, innodb_data_file_path, and innodb_log_group_home_dir. And another parameter that determines how Innodb’s table Spaces are stored: “Innodb_file_per_table”. The first three parameters specify where data and log files should be stored, and the last parameter determines whether Innodb stores data in a shared or exclusive tablespace.

If we use the shared table space storage method, Innodb needs to backup all data files set by “innodb_datA_home_dir” and “innodb_data_file_path” parameters. All Innodb storage engine table “. FRM “files under the corresponding database directory in” Datadir “;

If we use the exclusive tablespace, we need to back up all the files in the datadir database directory. Idb files, which store the data of the engine table in the exclusive tablespace. There is a possibility that if we are using an exclusive tablespace, why would we want to back up data files that are “only used” by the shared tablespace? This is a common misconception of many people, who assume that all Innodb information is stored in the “. Ibd “file under the” datadir “database directory. Ibd files only store our table data. Innodb is a transactional storage engine, and it needs undo and redo data, regardless of whether Innodb uses a shared or exclusive tablespace to store data. Undo and other metadata related to transactions are stored in data files specified by “innodb_data_home_dir” and “innodb_data_file_path”. So for Innodb’s physical backup to be valid, the data files set by “innodb_data_home_DIR” and “Innodb_data_file_path” parameters must be backed up in any case.

In addition to the data files mentioned above, Innodb has its own log files that store redo and transaction information in the location specified by the “Innodb_log_group_home_dir” parameter. So for Innodb physical backup to work, we also need to back up all log files in the location set by the “innodb_log_group_home_DIR” parameter.

③NDB Cluster storage engine

NDB Cluster Storage engine (also known as MySQL Cluster) physical backup files need to be backed up in the following three types:

  1. Metadata (Metadata) : contains all database and table definition information;
  2. Table data (Table Records) : a file that holds the actual data;
  3. Transaction log data (Transaction Log) : Maintains transaction consistency and integrity, as well as transaction information needed during recovery.

The above three physical files need to be backed up to form a complete and effective backup set, whether through cold shutdown backup, online backup tools provided by NDB Cluster, or third-party backup software. Of course, related configuration files, especially the configuration information on the management node, also need to be backed up.

Common physical backup methods of storage engines

Different storage engines need to back up different physical objects (files), and each storage engine has different requirements on data file consistency. Therefore, different storage engines use different backup methods for physical backup. Of course, if we want to do a cold backup (a backup after the database has been shut down), all we need to do is simply copy all the data files and log files to the desired location of the backup set, regardless of the storage engine. Since the cold backup method is simple and easy to implement, I will not elaborate here.

In our real application environment, there are very few situations where we can stop for daily backups, and we can only do database backups when the database is in service. This is also known as a thermal physical backup. Below we describe the most common online (hot) physical backup methods for each storage engine.

①MyISAM storage engine

MyISAM does not support transaction redo and undo logging, and does not require data consistency. Therefore, MyISAM does not support physical backup of storage engine files. Just copy MyISAM’s physical file. However, although MyISAM storage engine does not have transaction support and is not as strict as Innodb storage engine on consistency of data files, MyISAM storage engine does have consistency requirements between data files and index files of the same table. When the MyISAM storage engine finds that the data file of a table is inconsistent with the index file, it marks the table as unavailable and asks you to fix it. Of course, it is usually easier to fix. But even if the database storage engine itself is not very strict about data file consistency, does our application allow data inconsistency? I think the answer is definitely no, so we must at least ensure that the data of the database is at a certain point in time when the backup, which requires us to make MyISAM storage engine stop writing operation when the backup of the physical file of MyISAM database, only provide read service. The essence of this is to lock database tables to prevent write operations.

MySQL provides its own utility, mysqlHotCopy, which is specifically used to back up the MyISAM storage engine. However, if you have a non-transactional storage engine other than MyISAM, you can also use mysqlHotCopy to perform backup tasks by setting the appropriate parameters or tweaking the backup script. The basic usage is as follows:

mysqlhotcopy db_name[./table_regex/] [new_db_name | directory]
Copy the code

As you can see from the basic usage above, mysqlHotCopy can back up an entire database, specify a specific table, and use regular expressions to match certain table names for specific tables. The backup result is the corresponding physical files containing all the specified tables in the folder of the specified database.

Mysqlhotcopy is a perl program that locks the table, executes FLUSH TABLES, fsyncs the closed table, and fsyncs the fsync data. Then run the os-level copy (cp) command to copy all physical files of the tables or databases to the specified location of the backup set.

In addition, we can manually lock the database by logging in, and then copy relevant files to perform hot physical backup through the command of the operating system. Before the file copy is completed, we cannot exit the locked session (because exiting will automatically unlock), as follows:

root@localhost : test 08:36:35> FLUSH TABLES WITH READ LOCK;
Query OK, 0 rows affected (0.00 sec)
Copy the code

Do not exit mysql and do the following backup on a new terminal:

mysql@sky:/data/mysql/mydata$ cp -R test /tmp/backup/test
mysql@sky:/data/mysql/mydata$ ls -l /tmp/backup/
total 4
drwxr-xr-x 2 mysql mysql 4096 2008- 10- 19 21:57 test
mysql@sky:/data/mysql/mydata$ ls -l /tmp/backup/test
total 39268
-rw-r----- 1 mysql mysql 8658 2008-10-19 21:57 hotcopy_his.frm
-rw-r----- 1 mysql mysql 36 2008-10-19 21:57 hotcopy_his.MYD
-rw-r----- 1 mysql mysql 1024 2008-10-19 21:57 hotcopy_his.MYI
-rw-r----- 1 mysql mysql 8586 2008-10-19 21:57 memo_test.frm. .-rw-rw---- 1 mysql mysql 8554 2008-10-19 22:01 test_csv.frm
-rw-rw---- 1 mysql mysql 0 2008-10-19 22:01 test_csv.MYD
-rw-rw---- 1 mysql mysql 1024 2008-10-19 22:01 test_csv.MYI
-rw-r----- 1 mysql mysql 8638 2008-10-19 21:57 test_myisam.frm
-rw-r----- 1 mysql mysql 20999600 2008-10-19 21:57 test_myisam.MYD
-rw-r----- 1 mysql mysql 10792960 2008-10-19 21:57 test_myisam.MYI
-rw-r----- 1 mysql mysql 8638 2008-10-19 21:57 test_outfile.frm
-rw-r----- 1 mysql mysql 2400 2008-10-19 21:57 test_outfile.MYD
-rw-r----- 1 mysql mysql 1024 2008-10-19 21:57 test_outfile.MYI. .Copy the code

It is then unlocked in the previous session where the lock command was executed

root@localhost : test 10:00:57> unlock tables;
Query OK, 0 rows affected (0.00 sec)
Copy the code

This completes a physical backup, and as you can see from the file list, there are tables for the CSV storage engine in the backup.

②Innodb storage engine

Innodb’s online (hot) physical backup is much more complex than MyISAM’s because it is a transactional storage engine with redo logs and associated undo information and is much more consistent and complete than MyISAM’s. In general, it is difficult to simply complete through a few manual commands, mostly through special Innodb online physical backup software to complete.

The developer of Innodb storage engine (Innobase Company) developed a commercial backup software named IBBackup, which specially realizes the online physical backup function of Innodb storage engine data. The software can back up tables in the database using Innodb storage engine while MySQL is running online, but only tables using Innodb storage engine.

Since this software is not open source and free, I seldom use it. It is mainly a trial version downloaded for trial use, so I will not introduce it in detail here. Readers can obtain the detailed user manual from the official website of Innobase company for trial use

③NDB Cluster storage engine

The NDB Cluster storage engine is also a transactional storage engine. Like Innodb, it has redo logs. The NDB Cluter storage engine provides its own backup functionality, which can be implemented with related commands. Of course, the shutdown cold standby method is also effective.

The steps for online backup are as follows:

  1. Connect to the management server;
  2. Run the START BACKUP command on the management node.
  3. After a backup command is issued on the management node, the management node notifies all data nodes to start the backup and reports back the notification results.
  4. The management node generates a backup number to uniquely locate the backup set generated by the backup before notifying the backup instruction. When each data node receives the backup instruction, it starts the backup operation.
  5. After all data nodes are backed up, the management node sends a message indicating that the data backup is complete to the client.

Since the backup command of NDB Cluster is initiated from the management node and will not be returned after the backup is completed, there is no way to interrupt the backup process directly by “Ctrl + C” or other means. So the NDB Cluster provides the following commands to interrupt the current backup operation:

  1. Logging In to the Management Node
  2. perform"ABORT BACKUP backup_id"Backup_id in the command is the backup number generated when the backup command is initiated.
  3. The admin node confirms the abandonment request with the message “backup_id of abandonment indication”. Note that no actual response is received from the data node to the request.
  4. Then the management node sends the command to all data nodes to interrupt the backup. After all data nodes interrupt the backup and delete the generated backup files, the management node returns “Backup_id abandoned due to * * *”. At this point, the interrupted backup operation is complete.

After using the backup command of the NDB Cluster storage engine, the three files mentioned above are stored on the nodes that participate in the backup, and they are stored in three different files, similar to the following:

  • Backup-backup_id.node_id. CTL contains the control file of related control information and metadata. Each node stores the same table definition (for all tables in the Cluster) in its own file.

  • Backup-backup_id -n.node_id. Data: The data BACKUP file is divided into different fragments to save. During the BACKUP process, different nodes save the fragments generated by different BACKUP data. The final verification information is also included at the end of the backup fragment file to ensure that the backup can be restored correctly.

  • Backup-backup_id.node_id. log: The transaction log BACKUP file contains only the information about committed transactions and only the transactions on the tables that are saved in the BACKUP. The log information is also different at different stages because the log information is only recorded for the data contained in each node.

In the preceding naming rules for backup files, backup_id refers to the backup number. Different backup sets have different backup numbers. Node_id indicates which data node the backup file belongs to, and n indicates the segment number.

Physical backup and restoration methods commonly used by storage engines

As with the previous logical backups, there is no point in just having a backup, and you need to be able to restore it effectively. The biggest advantage of physical backup over logical backup is that it is faster to copy physical files. You can copy backup files to the location to be restored and then perform simple restoration.

①MyISAM storage engine

MyISAM storage engine is also easy to recover from physical backups due to its features.

If the recovery is done by cold shutdown or by locking the backup set after the write operation in the running state, it is only necessary to overwrite the existing files by copying the corresponding data files directly to the corresponding location through the operating system copy command.

If the online hot backup is performed by using mysqlHotCopy software, and the backup information is recorded in the corresponding tables in the database, the recovery operation may need to be combined with the backup table information.

②Innodb storage engine

For cold backup, Innodb storage engine does the same for recovery as other storage engines, just copy the set files (including data files and log files) to the corresponding directory. However, if you use other backup software, perform backup based on the requirements of the backup software. For example, backup through ibBackup also needs to be restored through ibBackup. For specific restoration methods, refer to the user manual of ibBackup, which is not detailed here.

③NDB Cluster storage engine

For cold backup, the recovery method is the same as that of other storage engines. However, it is important to note that only a few files in the backup set must be restored to the corresponding data nodes during recovery. Otherwise, the recovery process cannot be completed correctly.

To generate a backup set using the backup commands provided by the NDB Cluster, use the dedicated backup and restoration software ndb_restore. The ndb_Restore software will read backup related control information from the backup set, and the ndb_Restore software must do this separately on separate data nodes. So how many times does ndb_restore need to be run? Also, the first time you restore using ndb_Restore, you must restore metadata, that is, all the databases and tables will be rebuilt.