This is the third day of my participation in the August More text Challenge. For details, see:August is more challenging

InnoDB tables

InnoDB table consists of two parts: table structure definition and data structure definition. Table structure definition occupies very little space, mainly discussing table data before mySQL 8.0

Why simply dropping table data does not achieve tablespace reclamation

Innodb_file_per_table is used to store table data

  1. Parameter OFF-table data is in the system shared tablespace – along with the data dictionary
  2. Parameter ON [default after version 5.6.6] – Each InnoDB table data is stored in a file with an.ibd suffix

A form stored as a single file is easier to manage – if the table is not needed, drop the file – Reclaim table space -> Share table space is not reclaimed even if the table has been deleted

Data Deletion Process

InnoDB’s data is stored in pages and the reuse of pages is not the same as the reuse of records. The reuse of records is limited to the data that meets the range conditions. The reuse of data pages can be reused anywhere. The other data page is marked as reusable

A condition that causes a data hole

  1. Delete command – data for entire table is deleted – data pages are marked as reusable – File size on disk unchanged – delete command cannot reclaim table space
  2. Random data insertion – Index of the data page split
  3. Updating a value on an index – equivalent to deleting an old value and inserting a new one

A large number of additions, deletions and changes to the table, there may be holes – remove holes to shrink table space -> rebuild the table

Reconstruction of table

Alter table A engine=InnoDB alter table A engine=InnoDB ALTER table A engine=InnoDB alter table A engine=InnoDB

Online DDL Process of rebuilding tables

Log file recording and replay operations – Table A can be added, deleted, or modified during table RECONSTRUCTION

  1. Create A temporary file – scan all data pages for table A’s primary key;
  2. Records in table A in the data page generate B+ tree – store to temporary file;
  3. During temporary file generation – Record all operations on A in A row log
  4. After temporary file generation – Apply the operations from the log file to the temporary file, resulting in A data file that is the same as table A logically
  5. Temporary files replace data files for table A

Temporary files – Save data page – Log files – Save operations – Apply to temporary files – Replace table A

Table A can be added, deleted, or modified during table RECONSTRUCTION due to log file recording and replay operations

DDL – when the ALTER statement is started – needs to acquire MDL write lock – degenerates to read lock before actual copy – in order to implement Online MDL read lock does not block add, delete or modify operation – do not lock is to protect themselves – prevents other threads from DDL to this table simultaneously

Large tables – most time-consuming process in Online DDL – copying data to temporary tables – adding, deleting and modifying operations are acceptable during execution – very short lock time relative to overall DDL process – for business – can be considered Online

The above rebuild methods will scan the original table data and build temporary files – large tables consume IO and CPU resources – online service – carefully control the operation time is safer – Github open source GH-OST

Online and inplace

Storage location of exported table data – tmp_table – Temporary table – Storage location of reconstructed table created on the Server layer – tmp_file-temporary file – InnoDB created internally – The whole DDL process is completed inside InnoDB – For server layer – not moving data to temporary table – “inplace” operation – inplace

Alter table t engine=InnoDB = ALTER table t engine=InnoDB,ALGORITHM=inplace; If ALGORITHM=copy means forcibly copying the table

The similarities and differences between Inplace and Online

  1. The DDL process is Online -> must be inplace
  2. Inplace DDL -> may not be Online

The difference between optimize table, ANALYZE table, and ALTER table

  1. Analyze table T – Not reconstruct table – re-statistics table index information – no data modification – MDL read lock is added to procedure
  2. The optimize table t is equal to comprehension + analyze