This is the 19th day of my participation in the Gwen Challenge in November. Check out the details: The last Gwen Challenge in 2021

MySQL database optimization usually has the following four dimensions: hardware, system configuration, database table structure, SQL and index.

  • From the optimization cost: hardware > System configuration > database table structure >SQL and index.
  • From the optimization effect: hardware < system configuration < database table structure

To optimize MySQL, we need to know how the MySQL components work together and how MySQL optimizes and executes queries.

MySQL Logical Architecture

The overall logical architecture of MySQL is divided into three layers. The client layer is the top layer, which is not unique to MySQL. Functions such as connection processing, authentication and security are all handled in this layer.

Most of MySQL’s core services are in the middle layer, including query parsing, analysis, optimization, caching, and built-in functions (such as timing, math, encryption, etc.). All cross-storage engine functionality is also implemented in this layer: stored procedures, triggers, views, and so on.

At the lowest level is the storage engine, which is responsible for data storage and extraction in MySQL. Like file systems under Linux, each storage engine has its advantages and disadvantages. The service layer in the middle communicates with storage engines through apis that mask the differences between storage engines.

MySQL > select * from ‘MySQL’;

  1. The client sends a query to the server.

  2. The server first checks the query cache, and if a hit is made, it immediately returns the result stored in the cache. Otherwise, move on to the next level.

  3. The server performs SQL parsing. Preprocessing, and then the optimizer generates the corresponding execution plan.

  4. MySQL invokes the storage engine API to execute the query based on the execution plan generated by the optimizer.

  5. Returns the result to the client.

Note about query caching:

Before parsing a query, if the cache is turned on, MySQL checks first to see if the query matches the data in the query cache.

If there is no hit, the next stage of processing is entered.

If the query cache is hit, the user’s permissions are checked.

If the permissions are ok,MySQL skips the other stages and returns the data directly to the client.

We will optimize MySQL from hardware, system configuration, table design, SQL statements, and more.

Hardware level optimization

  • CPU: Select a CPU with multiple cores and a high CPU frequency.
  • Memory: Select a larger memory.
  • Disk: Select a higher rotational speed, RAID, or disk array card, or use SSDS if conditions permit.
  • Network environment: It is deployed on a LOCAL area network (LAN) as far as possible. Optical cables and 1000/10 MBIT/s networks are used to provide the network. Dual network cables are used to provide redundancy and multi-port binding monitoring.

System Configuration Optimization

Optimized the operating system configuration

  • Use 64-bit operating systems for better use of large memory.
  • Optimize kernel parameters.
  • Increase the file descriptor limit.
  • File system selection: XFS,JFS,EXT3/EXT4 (the choice of file system is important to ensure data security).

Mysql software Optimization

  • Mysql replication is enabled to implement read/write separation and load balancing. The read load is distributed to multiple secondary servers to improve the processing capability of the servers.
  • Use the recommended GA version (official release version) to improve performance.
  • Use partition new function to split big data.

Mysql configuration optimization

Note: Once global parameters are set, resources are pre-occupied with server startup.

  • Parameter wait_TIME_OUT: Specifies the timeout time for thread connections. Try not to set it to a large value. 10 seconds is recommended.
  • The thread_concurrency parameter: the number of concurrent threads concurrency connections (removed in mysql 5.7.2). In InnoDB, we can set parametersinnodb_thread_concurrencyParameter limits the number of threads.

Innodb_thread_concurrency usage recommendations

There are also some suggestions for using Innodb_thread_concurrency.

If the number of concurrent user threads in a workload is less than 64, it is recommended to set innodb_thread_concurrency=0.

If the workload is always heavy or even occasionally peaks, it is recommended to set innodb_thread_concurrency=128 first and then lower this parameter, 96, 80, 64, etc., until you find the number of threads that provide the best performance. For example, suppose the system typically has 40 to 50 users. But regularly the number increases to 60,70, or even 200. You’ll find that performance is stable at 80 concurrent user Settings, and drops below that. In this case, it is recommended to set the Innodb_thread_concurrency parameter to 80 to avoid impacting performance.

  • Read_buffer_size parameter: buffer size reserved for query during full table scan, determined according to select_scan.
  • Parameter tmp_table_size: The setting of temporary memory tables, which if exceeded will be converted to disk tables, based on parameter (created_tmp_disk_tables).

Mysql table design optimization

Storage engine selection

  • Myisam: suitable for small concurrency, read more write less, and can be very good use of index, and SQL statement is relatively simple applications, such as data warehouse.
  • Innodb: suitable for large concurrent access, write operations, foreign keys, transactions and other requirements of the application, large system memory.

Naming rules

  • Adopt most development language naming conventions, such as MyAdress
  • Most open source naming conventions, such as my_address, are usually underlined.
  • Avoid arbitrary names. It’s better to know them by their names.

Field type selection

Select an appropriate field type based on your requirements and keep the field type as small as possible to meet your requirements. Assign only the minimum number of characters you need, and don’t be too generous.

Reason: Smaller field types and smaller character counts will take up less memory, less disk space, less disk IO, and less bandwidth.

Code selection

  • Single-byte – latin1
  • Multi-byte – UTf8 (3 bytes for Chinese characters and 1 byte for English letters)
  • If it contains Chinese characters, it is best to use utF8 to avoid garbled characters.

The primary key to select

Note: The primary key design here is mainly for the INNODB engine.

  • Can uniquely represent a row.
  • Explicitly defines the primary key of an increment field of numeric type. This field can be used only as the primary key.
  • MySQL primary keys should be single column to improve efficiency of join and filter operations.
  • SMALLINT; BIGINT; SMALLINT; BIGINT;
  • Do not update or modify the primary key field to prevent data storage fragmentation and I/O performance degradation.
  • MySQL primary key should not contain dynamically changing data, such as timestamp, create time column, modify time column, etc.
  • MySQL primary keys should be generated automatically.
  • The primary key fields are placed first in the table.

In general, we recommend using a numeric primary key and using the auto_INCREMENT attribute to make it grow automatically.

Mysql statement level optimization

  • Read statements with poor performance, count rows in InnoDB, suggest another table, use myISam, do statistics regularly. Generally applicable to the case where statistical data is not required to be too accurate.
  • Try not to do calculations in the database.
  • Avoid negative queries and % prefix fuzzy queries.
  • Do not perform operations or use functions on indexed columns.
  • Do not use in production applicationsselect * fromQuery data in the form of. Query only the columns you need to use.
  • During query, use limit to reduce the number of returned rows, reducing data transmission time and bandwidth waste.
  • The WHERE clause avoids using functions on query columns as much as possible, because no index is needed to use functions on query columns.
  • Avoid implicit type conversions, such as’ ‘for characters and’ ‘for numbers.
  • All SQL keywords are capitalized to form a good habit of avoiding the waste of system resources caused by repeated COMPILATION of SQL statements.
  • When querying a join table, remember to put small result sets first and follow the principle that small result sets drive large result sets.
  • Enable slow queries and regularly optimize SQL statements in slow queries with Explain.
  • Break up largedeleteorinsertStatements.

conclusion

From the above, MYSQL is mainly optimized from the following aspects:

  • Table design: Reasonable storage engines, field types, normal vs. inverse normal
  • Features: suitable index, cache, partition sub-table.
  • Architecture: master/slave replication, read/write separation, load balancing.
  • Reasonable SQL: test and compare the query efficiency of different SQL with the same function, and write efficient SQL based on past experience.