Binary tree indexes can lead to fragmentation, which in turn affects database performance. Fragmented index storage has poor performance or is not ordered on disk. Using binary tree indexes to overfind page nodes inherently requires random disk access, so random access is a feature of binary tree indexes, not an exception. However, query performance is still better if the page nodes are physically ordered and tightly stored. If this is not the case, we call it fragmentation, where the speed of a range query or full table scan is exponentially reduced, especially for overwriting index queries.

The data store of the data table may also be fragmented. However, data storage fragmentation is more complex than index fragmentation, and there are three types of data fragmentation:

  • Row fragmentation: Row fragmentation occurs on different slices where the same row of data resides in different physical storage. Row fragmentation directly reduces the query performance of single row data.
  • Interrow fragmentation: Interrow fragmentation occurs when logically ordered paging or when data rows are not stored in order on disk. This can affect range queries for full table scans or clustered indexes — query performance in such cases often depends on whether the data stored on disk is in order.
  • Empty storage space fragmentation: Empty storage space fragmentation occurs when there is a lot of free space in a data page. This causes the database server to read a lot of discarded data that is not needed.

All three of these things happen to MyISAM tables, but InnoDB does not happen to small rows — the storage engine moves the data around and writes it to a separate data shard.

To address data fragmentation, you can allow the OPTIMIZE TABLE or exported data to be re-imported, which works for most storage engines. For example, the MyISAM engine reconstructs the index through a sorting algorithm to defragment it in order. In the old version of InnoDB, there was no efficient way to defragment indexes, but in the new version InnoDB can delete and rebuild indexes “online” instead of rebuilding the entire table to achieve defragmentation.

For storage engines that don’t support the OPTIMIZE TABLE command, you can rebuild the entire TABLE with an ALTER TABLE command that doesn’t affect the TABLE as follows:

ALTER TABLE <The data table name> ENGINE=<The original engine>;
Copy the code

In Percona Server, if expand_fast_index_creation is enabled, rebuilding tables in this way will defragment InnoDB tables and indexes. In the standard MySQL version, this only rebuilds the table (that is, aggregates index fragments). Data tables can be rebuilt to achieve defragmentation by dropping and rebuilding indexes to mimic the functionality of Percona Server.

Don’t rely on guesswork to determine whether you need to defragment the table, but measure to find the fragmented table. Percona XtraNackup has a — STATS option that makes it run in non-backup mode. In this mode, index and table statistics are printed, including data and the amount of space left to store pages. This is the degree to which data fragmentation can be detected. It’s also important to consider whether the data is already in a good stable state — your defragmentation could disrupt it, causing future updates to trigger paging and regrouping, which can affect performance until it reaches a stable state again.