Why does the tablespace change when data is deleted?

More like a paragraph: without a cold biting, how fragrant plum blossom, learning is boring, please adhere to! It takes about 20 minutes to read this article!

Hi, everybody. We’ve seen a little bit about why MySQL sometimes does a little bit of shaking when it’s querying data. And analysis of the dirty page brush strategy and linkage mechanism. Delete from table name, table size is still not smaller!

Table structured storage

Innodb tables contain two main sections. One part is the definition of the table structure and the other part is the data. Before MySQL8.0, table structures were stored in files with the.fm suffix. MySQL 8.0 already allows table structure definitions to be placed in system tables. Because table structure definitions take up very little space, we will focus on table data today. The following address is where the data in the data table is stored.

C: \ Program Files \ MySQL \ MySQL Server 5.7 \ dataCopy the code

innodb_file_per_table

This is a MySQL parameter that controls whether the table data is stored in a shared tablespace or in a separate file.

This parameter is set to OFF to indicate that the table data is placed in the system shared table space, that is, with the data dictionary;
Setting this parameter to ON means that each InnoDB table data is stored in a file with an.ibd suffix.

I suggest keeping it in a file for easy management. And the impact on performance is not significant. As of MySQL 5.6.6, the default value is ON. When you don’t need the table, drop table will drop the file. In a shared tablespace, the space is not reclaimed even if the table is dropped.

And that’s where the rest of our discussion starts. Please make it clear here!

Data Deletion Process

You can use the drop table command to reclaim the tablespace when you drop the entire table. However, the more common deletion scenario we encounter is the deletion of rows, where we encounter the problem at the beginning of our article: the data in the table is deleted, but the table space is not reclaimed.

To get to the bottom of this problem, start with the data deletion process.

Deleting data does not directly delete data on disk I/OS. The data is also deleted by means of deletion tags. You can think of it as breaking references. But breaking a reference here is more than just breaking. After disconnecting, it does not free up space, but reuses the space the next time data is inserted, thus reducing the cost of opening up space.

It says reuse space! There is a rule for reuse space. As shown in the figure below, the space will be demultiplexed only when the deleted data is in the range of 300-700 and the ID of the inserted value is also in the range of 300-700. Otherwise, the space will not be multiplexed.

Here’s how you can reuse any space. Any data inserted will only reuse the current space if the user deletes the entire page of data.

Delete from user where file size =4KCopy the code

To sum up: deleting data simply marks the current location asreusableBut the disk size doesn’t change. That is, a tablespace cannot be reclaimed through the delete command. These can be reused, but the unused space looks like “empty space”.

In fact, it is not only deletion that causes voids, but also insertion of data. This is very bad.

With the help of Mr. Dinky’s drawing. If the data is inserted in index ascending order, the index is compact. However, if the data is inserted randomly, the data pages of the index can be split.

Suppose page A is already full and I want to insert another row of data.

As you can see, when page A is full, insert A new page with ID 550, then you have to request A new page B to store the data. After page splitting is complete, A void is left at the end of Page A (note: in fact, more than one record location may be void).

In addition, updating a value on an index can be interpreted as deleting an old value and inserting a new value. Understandably, this, too, creates a void.

That is to say, after a large number of additions and deletions to the table, is likely to be empty. So, if you can get rid of these holes, you can shrink the table space.

And rebuild the table, you can achieve this purpose.

Reconstruction of table

You can create A new table A and B with the same structure. Insert rows from table A into table B in ascending order of primary key ids.

Table B is A newly created table, so the primary key index of table A will not exist in table B. Obviously, the primary key index of Table B is more compact and the data page utilization is higher. If table B is used as A temporary table, after the operation of importing data from table A to table B is completed, table B replaces A with TABLE A. In effect, it shrinks the space of table A.

Alternatively, the alter table A engine=InnoDB command can be used to rebuild the table. Before MySQL 5.5, this command is similar to what we described above, except that you don’t need to create temporary table B yourself. MySQL automatically saves data, swaps table names, and deletes old tables.

The most time-consuming part of rebuilding A table is not to create A new table, but to transfer the data from table A to table B, there will be A data insert process. This process is very time consuming. Table A cannot be added, modified or deleted during the insert. Otherwise, the data will be lost or scrambled.

So it is safe to say that the rebuilt tables before MySQL5.5 were not online(). What is online?

Online and Inplace

Inplace way

This is the way native MySQL 5.5 and the Innodb_plugin provide it. An Inplace is a table that does not copy a temporary table. Compared with Copy Table mode, this is more efficient. The original table is also readable, but not writable.

The Online mode

This is the approach provided in MySQL 5.6 above, and it is the approach we will highlight today. In either Copy Table or Inplace mode, the original Table can only be read, not written. There are significant limitations to applications, so in the latest version of MySQL, InnoDB supports so-called Online DDL. Compared to the above two methods, DDL can be read as well as written in online mode, which is a great improvement for DBAs.

The core of online is that you can go online through the row log during rebuilding. The diagram below. In figure 3, figure 4

Copy Table mode

This is one of the earliest ways InnoDB supports it. As the name implies, this is implemented by means of a temporary table copy. Create a temporary table with a new structure, copy all data from the original table to the temporary table, and Rename. In this mode, the original table is readable and not writable. But it uses up twice as much storage.

Analysis of the

Let’s first introduce the process of online

Create A temporary file, scan all data pages of table A primary key;
Generate A B+ tree from the records of table A in the data page and store it in A temporary file;
During the generation of temporary files, all operations on A are recorded in A row log file corresponding to the state2 in the figure.
After the temporary file is generated, the operations in the log file are applied to the temporary file to obtain A data file logically identical to table A, corresponding to the state3 in the figure.
Replace table A’s data files with temporary files.

Figure 3. DDL of lock table

Figure 4 Online DDL

As you can see, unlike the procedure in Figure 3, this scheme allows adding, deleting, and modifying table A while rebuilding the table due to the presence of log file recording and replay operations. This is where the Online DDL name comes from.

The MDL write lock is also required during insert R7 in Figure 4. There’s nothing special about this routine operation. The really interesting part is this. The write lock degenerates into a read lock before the data is actually copied. Write locks or degenerate to read locks?

Why does alter degrade online? To get Online, the MDL read lock does not block add, delete, or change operations.

Here I’m sure there will be some questions. Read locks why not just unlock them? To protect yourself, forbid other threads to DDL the table at the same time.

Let’s continue with the process. TMP in Figure 3 is a temporary table that was created automatically by the Server layer. The tmp_file in Figure 4 is created by InnoDB, and the whole process is done in InnoDB. In contrast to the Server in Figure 3, it does not move data to temporary tables and is an “in-place” operation, hence the name “inplace”.

So if you have 1TB of memory, you’re dealing with a 2TB table. Can you use the INplace DDL? The answer is no, because inplace takes up memory, if the dump is larger than its own space. It will fail.

Let’s use SQL to show the difference between Figure 3 and Figure 4

alter table t engine=innodb,ALGORITHM=inplace; Alter table t engine=innodb,ALGORITHM=copy; Figure 3Copy the code

When ALGORITHM=copy is used, it means that the table is forcibly copied, and the corresponding process is the operation process in Figure 3.

The difference between

Inplace blocks add, delete, or change operations and is non-online.
If the DDL process is Online, it must be inplace.
Inplace’s DDL may not be Online. As of MySQL 8.0, this is the case with adding FULLTEXT index and SPATIAL index.
The Online DDL option is considered for use during peak business periods, but MySQL 5.5 and earlier this command blocks DML

The ending summarizes

It Outlines the methods for shrinking table Spaces. Two implementations of rebuilding tables. Alter table: alter table: alter table: alter table: alter table: alter table: alter table: alter table: alter table: alter table: alter table:

Wechat public number [Huan little growth path] have any do not understand the place can discuss learning together