Preface

As a developer, you wrestle with databases every day, and in the Internet industry MySQL is used especially heavily. Because MySQL is so important and so widespread, it is a staple of interviews. More precisely, questions about hands-on practice and the underlying internals are what set you apart from competing candidates. So what do you need to prepare for MySQL? Look at the diagram below.

After seeing the diagram, some readers are probably grumbling: how can anyone cover this much? Don't worry. Just as we used to mark the key points before a final exam, an Alibaba P8 engineer has marked the MySQL key points for you, so let's go through them.

What is SQL?

Structured Query Language (SQL) is a database query language.

Purpose: Used to access data, query, update, and manage relational database systems.

What is MySQL?

MySQL is a relational database management system developed by the Swedish company MySQL AB and now owned by Oracle. It is one of the most popular relational database management systems (RDBMS) and one of the best RDBMS choices for web applications. It is commonly used in Java enterprise development because it is open source, free, and easy to scale.

What are the three normal forms of a database?

  • First normal form: every column is atomic and cannot be split further.

  • Second normal form: on top of first normal form, every non-key column depends on the whole primary key, not on just part of it.

  • Third normal form: on top of second normal form, every non-key column depends only on the primary key and not on other non-key columns.

  • When designing a database structure, try to follow the three normal forms. If you do not, there must be a good reason, such as performance. In practice, we often compromise on database design for the sake of performance.

Which MySQL tables store privileges?

The MySQL server controls user access to databases through privilege tables. They are stored in the mysql database and initialized by the mysql_install_db script. The privilege tables are user, db, tables_priv, columns_priv, and host:

  • user: records the accounts that are allowed to connect to the server. Privileges in this table are global.

  • db: records each account's privileges on each database.

  • tables_priv: records table-level privileges.

  • columns_priv: records column-level privileges.

  • host: works with the db table to control database-level privileges more finely for a given host. This table is not affected by GRANT and REVOKE statements.

How many binlog formats does MySQL have? What's the difference?

There are three formats: statement, row, and mixed.

  • Statement: every SQL statement that modifies data is written to the binlog. Row-level changes are not recorded, which reduces binlog volume, saves I/O, and improves performance. Because the result of a statement depends on its context, that context has to be saved along with it, and some statements, such as those that use certain functions, cannot be recorded and replicated correctly.

  • Row: the context of the SQL statement is not recorded; only the modified rows are saved, so the unit of logging is the change to each row. Practically every change can be captured, but operations that touch many rows (such as ALTER TABLE) generate a very large volume of log data.

  • Mixed: a compromise. Ordinary operations are logged as statements, and row format is used where statement format cannot be.

In addition, newer versions of MySQL have optimized the row format: when the table structure changes, the statement itself is logged instead of every row change.

What data types does MySQL have?

  • 1. Integer types, including TINYINT, SMALLINT, MEDIUMINT, INT, and BIGINT, which occupy 1, 2, 3, 4, and 8 bytes respectively. Any integer type can take the UNSIGNED attribute, which restricts it to non-negative values. A display width can also be specified, for example INT(11). In most scenarios the width is meaningless: it does not limit the legal range of values, only the number of characters displayed, and it matters only in combination with UNSIGNED ZEROFILL. Example: with type INT(5) UNSIGNED ZEROFILL, if the user inserts 12, the value is displayed as 00012.

  • 2. Real number types, including FLOAT, DOUBLE, and DECIMAL. DECIMAL can store integers larger than BIGINT as well as exact decimals. FLOAT and DOUBLE have a limited range and use standard floating-point arithmetic, so their values are approximate. FLOAT and DOUBLE are more efficient in computation than DECIMAL, which you can think of as being processed like a string.

  • 3. String types, including VARCHAR, CHAR, TEXT, and BLOB. VARCHAR stores variable-length strings and saves space compared with a fixed-length type. It uses 1 or 2 extra bytes to store the string length: 1 byte if the column's maximum length is 255 bytes or less, otherwise 2 bytes. If the content exceeds the declared length, it is truncated (or rejected with an error when strict SQL mode is enabled). CHAR is fixed length and allocates space according to the declared length; values are padded with spaces as needed for comparison. CHAR suits very short strings or values that all have nearly the same length. Content beyond the declared length is handled the same way as for VARCHAR.

  • 4. Enumeration type (ENUM), which stores one value from a predefined set of distinct values. ENUM can sometimes replace an ordinary string type. Its storage is very compact: the value list is condensed to one or two bytes, and values are stored internally as integers. Avoid using numbers as ENUM constants because they are easily confused with the internal integers. Sorting is by the internally stored integer, not by the string.

  • 5. Date and time types. Prefer TIMESTAMP, which is more space efficient than DATETIME. Storing a timestamp in an integer column is usually inconvenient to work with. If you need to store microseconds, you can use BIGINT. With these points in mind, this interview question becomes much easier to answer.
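
The integer sizes and the ZEROFILL example above can be checked with a little arithmetic. The following is a minimal Python sketch; the helper names `int_range` and `zerofill` are made up for illustration and only mimic the behaviour described in the list.

```python
# Byte widths of the MySQL integer types listed above.
INT_BYTES = {"TINYINT": 1, "SMALLINT": 2, "MEDIUMINT": 3, "INT": 4, "BIGINT": 8}

def int_range(type_name, unsigned=False):
    """Return the (min, max) values an integer type of that width can hold."""
    bits = INT_BYTES[type_name] * 8
    if unsigned:
        return 0, 2 ** bits - 1
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1

def zerofill(value, width):
    """Mimic how a ZEROFILL column with the given display width shows a value."""
    return str(value).zfill(width)

print(int_range("TINYINT"))             # (-128, 127)
print(int_range("INT", unsigned=True))  # (0, 4294967295)
print(zerofill(12, 5))                  # 00012
print(zerofill(123456, 5))              # 123456: the width never truncates
```

The last line shows why the display width does not limit the legal range: a value wider than the declared width is still shown in full.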

What is the difference between MyISAM indexes and InnoDB indexes?

  • InnoDB's primary key index is a clustered index; MyISAM indexes are non-clustered.

  • InnoDB's primary key index is very efficient because its leaf nodes store the row data itself.

  • The leaf nodes of a MyISAM index store the address of the row data, so a second lookup is needed to fetch the row.

  • The leaf nodes of InnoDB's non-primary-key (secondary) indexes store the primary key together with the indexed column values, so queries that can be served by a covering index are very efficient.

Four features of the InnoDB engine

  • Insert buffer

  • Double write

  • Adaptive Hash index (AHI)

  • Read-ahead

What is an index?

  • Indexes are special files (indexes on InnoDB tables are part of the tablespace) that contain pointers to all the records in the table.

  • An index is a data structure. A database index is a sorted data structure in the database management system that helps query and update table data quickly. Indexes are usually implemented with B-trees and their variant, B+ trees.

  • Put simply, an index is like a table of contents. To make a book easy to search, its contents are indexed to form a catalog. An index is also a file, and it occupies physical space.

What are the advantages and disadvantages of indexes?

  • Advantages: indexes greatly speed up data retrieval, which is the main reason for creating them. With indexes available, the query optimizer can choose better execution plans and improve system performance.

  • Disadvantages: in terms of time, creating and maintaining indexes takes time. Indexes must be maintained whenever rows are added, deleted, or modified, which slows down inserts, updates, and deletes. In terms of space, indexes occupy physical storage.

What are the types of indexes?

  • Primary key index: values must be unique and cannot be NULL, and a table can have only one primary key.

  • Unique index: values must be unique, but NULL is allowed. A table can have several unique indexes. ALTER TABLE table_name ADD UNIQUE (column); creates a unique index. ALTER TABLE table_name ADD UNIQUE (column1,column2); creates a unique composite index.

  • Normal index: the basic index type, with no uniqueness restriction; NULL is allowed. ALTER TABLE table_name ADD INDEX index_name (column); creates a normal index. ALTER TABLE table_name ADD INDEX index_name(column1, column2, column3); creates a composite index.

  • Full-text index: a key technology used by search engines today. ALTER TABLE table_name ADD FULLTEXT (column); creates a full-text index.

Index data structure (B-tree, hash)

The data structure of an index depends on the storage engine's implementation. Indexes used in MySQL include hash indexes and B+ tree indexes; the default index of the commonly used InnoDB storage engine is a B+ tree index. A hash index is backed by a hash table, so when the overwhelming majority of queries are single-record lookups, a hash index gives the fastest query performance. In most other scenarios, the BTree index is the recommended choice.

  • 1. B-tree index

MySQL fetches data through storage engines, and almost 90% of users use InnoDB. By implementation, InnoDB currently has only two index types: BTREE (B-tree) indexes and HASH indexes. The B-tree index is the most frequently used index type in MySQL, and almost all storage engines support it. In InnoDB, the index that MySQL reports as BTREE is in fact implemented as a B+ tree.

  • 2. B+ tree

A node with n subtrees contains n keys. Internal nodes do not store data, only index entries that lead toward the data.

The leaf nodes together contain all the keys, along with pointers to the records that hold those keys, and the leaves are linked to one another in ascending key order.

All non-leaf nodes can be regarded as the index part: each contains only the largest (or smallest) key of each of its subtrees.

In a B+ tree, data objects are inserted and deleted only at the leaf nodes.

A B+ tree has two head pointers: one to the root of the tree and one to the leaf node with the smallest key.

  • 3. Hash index

Briefly, a hash index is implemented much like the hash table data structure. When MySQL uses a hash index, a hash algorithm (common ones include direct addressing, the mid-square method, the folding method, the division-remainder method, and the random-number method) converts the column value into a fixed-length hash value, which is stored in the hash table together with a pointer to the row. If a hash collision occurs (two different keys have the same hash value), the entries are stored in a linked list under the corresponding hash key. Of course, this is only a simplified description.
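
The hash-table description above can be sketched in a few lines of Python. This is a toy model only, not how any MySQL engine is implemented: the class name `HashIndex`, the bucket count, and the use of Python's built-in `hash` with the division-remainder method are all assumptions made for illustration.

```python
class HashIndex:
    """Toy hash index: buckets of (key, row_pointer) pairs with chaining."""

    def __init__(self, bucket_count=8):
        self.buckets = [[] for _ in range(bucket_count)]

    def _slot(self, key):
        # Division-remainder method: hash value modulo the bucket count.
        return hash(key) % len(self.buckets)

    def insert(self, key, row_pointer):
        # Colliding keys simply share the same chain (a Python list here).
        self.buckets[self._slot(key)].append((key, row_pointer))

    def lookup(self, key):
        # Only equality lookups are possible: scan one chain for exact matches.
        return [ptr for k, ptr in self.buckets[self._slot(key)] if k == key]


index = HashIndex(bucket_count=2)   # tiny table so that collisions are certain
for row_id, name in enumerate(["ann", "bob", "cat", "ann"]):
    index.insert(name, row_id)

print(index.lookup("ann"))   # [0, 3]
print(index.lookup("zed"))   # []
```

Because the buckets are not ordered by key, a structure like this cannot answer range conditions such as `name > 'b'` without scanning every bucket.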

The fundamentals of indexing

Indexes are used to quickly find records that have specific values. Without an index, a query generally has to scan the entire table.

The principle of an index is simple: it turns unordered data into an ordered lookup.

  • Sort the contents of the indexed columns.

  • Build an inverted list from the sorted result.

  • Attach the chain of data addresses to each entry of the inverted list.

  • At query time, find the entry in the inverted list first, then follow its address chain to fetch the actual rows.

What are the indexing algorithms?

Index algorithms include the BTree algorithm and the Hash algorithm.

  • 1. The BTree algorithm

BTree is the most common MySQL index algorithm and the default. It can be used with the =, >, >=, <, <=, and BETWEEN comparison operators, and also with the LIKE operator, as long as the pattern is a constant that does not begin with a wildcard.

  • 2. The Hash algorithm

Hash indexes can be used only for equality comparisons, such as the = and <=> (NULL-safe equal) operators. Because a hash index locates the data in a single step, unlike a BTree index, which has to walk from the root node through branch nodes to a leaf page with several I/O accesses, its retrieval efficiency for such lookups is much higher than that of a BTree index.

Principles of index design?

  • Columns suitable for indexing are those that appear in WHERE clauses or are named in join conditions.

  • Columns with low cardinality index poorly, and there is no need to index them.

  • Use short indexes. When indexing long string columns, specify a prefix length to save a lot of index space.

  • Don't over-index. Indexes require additional disk space and reduce write performance. When table contents are modified, the index is updated or even rebuilt, and the more indexed columns there are, the longer this takes. So keep only the indexes that actually help queries.

Principles for index creation

  • Leftmost prefix matching: a very important principle for composite indexes. MySQL keeps matching index columns to the right until it meets a range condition (>, <, BETWEEN, LIKE) and then stops. For example, with a = 1 and b = 2 and c > 3 and d = 4, an index on (a,b,c,d) cannot use d, while an index on (a,b,d,c) can use all four columns, and a, b, and d may appear in any order.

  • Frequently updated columns are not suitable for indexing.

  • Columns that cannot distinguish rows effectively are not suitable as index columns (gender, for example, has at most three values: male, female, and unknown, so its selectivity is too low).

  • Extend existing indexes where possible instead of creating new ones. For example, if a table already has an index on a and you want an index on (a,b), you only need to modify the original index.

  • Columns that define foreign keys must be indexed.

  • Do not index columns that rarely appear in queries or that contain many duplicate values.

  • Do not index columns whose data type is text, image, or bit.

What should I pay attention to when creating an index?

  • NOT NULL columns: declare columns NOT NULL unless you really need to store NULL. Columns that contain NULL are hard for MySQL to optimize because they complicate indexes, index statistics, and comparisons. Replace NULL with 0, a special value, or an empty string.

  • Put columns whose values are highly dispersed (many distinct values) first in a composite index. You can check this by counting distinct values, for example with COUNT(DISTINCT column): the larger the result, the more unique values the column has and the higher its dispersion.

  • The smaller the indexed column, the better: database data is stored in pages, so the more entries fit in one page, the more data a single I/O operation retrieves and the more efficient the lookup.
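
The dispersion check mentioned above can be tried out as follows. This is a minimal sketch that runs the SQL through Python's built-in sqlite3 module as a stand-in for MySQL; the `users` table, its columns, and the `selectivity` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, gender TEXT, email TEXT)")
rows = [(i, "M" if i % 2 else "F", f"user{i}@example.com") for i in range(1, 101)]
conn.executemany("INSERT INTO users VALUES (?, ?, ?)", rows)

def selectivity(column):
    """Ratio of distinct values to total rows; closer to 1 means more dispersed."""
    # The column name comes from a fixed list below, so formatting it in is safe here.
    distinct, total = conn.execute(
        f"SELECT COUNT(DISTINCT {column}), COUNT(*) FROM users"
    ).fetchone()
    return distinct / total

print(selectivity("gender"))  # 0.02
print(selectivity("email"))   # 1.0
```

Under this measure, `email` would go before `gender` in a composite index, and `gender` alone would be a poor index candidate.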

Does using indexed queries necessarily improve query performance?

In general, querying data through an index is faster than a full table scan. But we must also be aware of the costs.

Indexes need storage space and ongoing maintenance: the index itself is modified whenever a row is added to or removed from the table or an indexed column is changed. This means each INSERT, DELETE, and UPDATE costs an extra 4 or 5 disk I/Os. Because indexes require extra storage and processing, unnecessary indexes can slow down response times, and using an index does not always improve query performance. An index range scan is suitable in two situations:

  • Range-based retrieval, where a typical query returns less than 30% of the rows in the table.

  • Retrieval based on a non-unique index.

How do I delete data at the million level or above?

About indexes: an index carries extra maintenance cost. Because index data is stored separately, every insert, update, and delete also has to update the index, and these extra operations consume additional I/O and slow down writes. According to the official MySQL manual, the time needed to delete rows grows in proportion to the number of indexes on the table. So to delete millions of rows:

  • First drop the indexes (this takes a little over three minutes).

  • Then delete the unwanted data (this takes less than two minutes).

  • Finally recreate the indexes. There is less data by then, so index creation is also quick, about ten minutes.

This is definitely much faster than deleting directly with the indexes in place. Worse still, if a direct delete is interrupted, the whole delete is rolled back.
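
The drop, delete, and recreate sequence described above can be sketched as follows. The example runs against Python's built-in sqlite3 module as a stand-in for MySQL, on a small hypothetical `logs` table, so it shows the order of operations only and says nothing about timings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (id INTEGER PRIMARY KEY, level TEXT, created INTEGER)")
conn.execute("CREATE INDEX idx_logs_level ON logs (level)")
conn.executemany(
    "INSERT INTO logs (level, created) VALUES (?, ?)",
    [("debug" if i % 10 else "error", i) for i in range(10000)],
)

# 1. Drop the secondary index so the delete does not have to maintain it.
conn.execute("DROP INDEX idx_logs_level")
# 2. Delete the unwanted rows.
deleted = conn.execute("DELETE FROM logs WHERE level = 'debug'").rowcount
# 3. Rebuild the index on the much smaller table.
conn.execute("CREATE INDEX idx_logs_level ON logs (level)")
conn.commit()

remaining = conn.execute("SELECT COUNT(*) FROM logs").fetchone()[0]
print(deleted, remaining)  # 9000 1000
```

On a live system the table has no secondary index between steps 1 and 3, so queries that rely on it will be slow during that window.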

What is the leftmost prefix principle? What is the leftmost matching principle?

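As the name implies, the leftmost column takes priority. When creating a multi-column index, put the column used most frequently in WHERE clauses on the far left, according to business needs.

MySQL matches the columns of a composite index from left to right and stops at the first range condition (>, <, BETWEEN, LIKE). For example, with the conditions a = 1 and b = 2 and c > 3 and d = 4, an index created as (a,b,c,d) cannot use d, whereas an index created as (a,b,d,c) can use all four columns, and a, b, and d may be written in any order.

= and IN conditions can appear in any order. For example, a = 1 and b = 2 and c = 3 can use an (a,b,c) index whatever order the conditions are written in, because MySQL's query optimizer rearranges them into a form the index can use.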
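
The leftmost matching rule can be expressed as a small function. This is a deliberately simplified model written in Python, not the optimizer's real logic; the function name `usable_index_columns` and the `'eq'` / `'range'` labels are invented for illustration.

```python
def usable_index_columns(index_columns, predicates):
    """Return the index columns a query can use under the leftmost rule.

    index_columns: column names in index order, e.g. ("a", "b", "c", "d").
    predicates: maps a column to "eq" (=, IN) or "range" (>, <, BETWEEN, LIKE).
    """
    used = []
    for column in index_columns:
        kind = predicates.get(column)
        if kind is None:
            break                 # gap in the prefix: matching stops here
        used.append(column)
        if kind == "range":
            break                 # a range condition ends the matching
    return used

# a = 1 and b = 2 and c > 3 and d = 4; a dict has no meaningful order,
# which mirrors the fact that equality conditions may be written in any order.
query = {"d": "eq", "b": "eq", "a": "eq", "c": "range"}

print(usable_index_columns(("a", "b", "c", "d"), query))  # ['a', 'b', 'c']
print(usable_index_columns(("a", "b", "d", "c"), query))  # ['a', 'b', 'd', 'c']
print(usable_index_columns(("a", "b", "c"), {"b": "eq"}))  # []
```

The last call shows the other half of the rule: a query that skips the leftmost column cannot use the index at all in this model.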

B tree and B+ tree

In a B-tree, both internal nodes and leaf nodes can store keys and values. In a B+ tree, internal nodes hold only keys and no values, and the leaf nodes hold both keys and values.

The leaf nodes of a B+ tree are linked together in a chain, whereas the leaf nodes of a B-tree are independent of one another.

Benefits of using B-trees

Because a B-tree can store keys and values in its internal nodes, placing frequently accessed data near the root greatly improves the efficiency of hot-data queries. This makes B-trees more efficient when the same data is queried repeatedly.

Benefits of using B+ trees

Since the internal nodes of a B+ tree store only keys and no values, a single read fits more keys into a memory page, which narrows the search range faster. The leaf nodes of a B+ tree are linked in a chain, so a full traversal needs only O(logN) time to find the smallest leaf and then an O(N) sequential walk along the chain. A B-tree, by contrast, has to traverse every level of the tree, which causes more page swaps in memory and therefore takes more time.
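
The benefit of linked leaves can be illustrated with a small Python sketch. It is not a full B+ tree: a single sorted list of separator keys stands in for the internal nodes, keys are assumed unique, and the names `Leaf`, `build_leaves`, and `range_scan` are hypothetical.

```python
from bisect import bisect_right

class Leaf:
    def __init__(self, keys):
        self.keys = keys      # sorted keys stored in this leaf
        self.next = None      # link to the next leaf in key order

def build_leaves(sorted_keys, leaf_size=3):
    """Split sorted keys into linked leaves; return (leaves, separator keys)."""
    leaves = [Leaf(sorted_keys[i:i + leaf_size])
              for i in range(0, len(sorted_keys), leaf_size)]
    for left, right in zip(leaves, leaves[1:]):
        left.next = right
    # The "internal node" holds keys only: the first key of every leaf but the first.
    separators = [leaf.keys[0] for leaf in leaves[1:]]
    return leaves, separators

def range_scan(leaves, separators, low, high):
    """Return all keys in [low, high]: locate one leaf, then follow the links."""
    leaf = leaves[bisect_right(separators, low)]  # stand-in for the root-to-leaf descent
    result = []
    while leaf is not None:
        for key in leaf.keys:
            if key > high:
                return result
            if key >= low:
                result.append(key)
        leaf = leaf.next
    return result

leaves, separators = build_leaves(list(range(1, 11)))
print(range_scan(leaves, separators, 4, 8))  # [4, 5, 6, 7, 8]
```

After the single lookup that finds the starting leaf, the scan never goes back up the tree; it only follows `next` links.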

What is a clustered index? When to use clustered and non-clustered indexes?

Clustered index: the data is stored together with the index, so finding the index entry means finding the data.

Non-clustered index: the data is stored separately from the index, and the index leaf nodes point to the corresponding rows. MyISAM uses key_buffer to cache the index in memory. When it needs to access data through an index, it searches the index in memory and then uses the result to find the row on disk. This is why lookups are slow when the index is not in the key buffer.

Does a non-clustered index always require a lookup back to the table?

Not necessarily. It depends on whether every column the query needs is covered by the index. If the index contains all of them, there is no need to go back to the table.

A simple example: suppose there is an index on the age column of the employee table. For the query select age from employee where age < 20, the leaf nodes of the index already contain age, so no lookup back to the table is performed.
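
Whether a query is covered by an index can be seen in its execution plan. The sketch below uses Python's built-in sqlite3 module and SQLite's EXPLAIN QUERY PLAN as a stand-in for MySQL's EXPLAIN; the table layout and index name are assumptions, and the exact plan wording varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.execute("CREATE INDEX idx_employee_age ON employee (age)")

def plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row is a readable description.
    return " | ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

# Only age is requested, and age is in the index: no table lookup is needed.
covered = plan("SELECT age FROM employee WHERE age < 20")
# name is not in the index, so each matching entry needs a lookup in the table.
not_covered = plan("SELECT name, age FROM employee WHERE age < 20")

print(covered)
print(not_covered)
```

In SQLite the first plan mentions a covering index and the second does not. In MySQL the corresponding signal is "Using index" in the Extra column of EXPLAIN.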

What is a composite index? Why does column order matter in a composite index?

MySQL can build one index over several columns at the same time; this is called a composite (joint) index. To hit a composite index, the query has to use the columns in the order in which they were defined when the index was created; otherwise the index cannot be used.

The reason is that MySQL needs the index to be ordered. Suppose a composite index is created on (name, age, school): entries are sorted by name first, then by age when names are equal, and then by school when ages are also equal.

At query time, the index is strictly ordered only by name, so the name column must first be used in an equality condition. Among the rows that match, the entries are strictly ordered by age, so the age column can then be used for the index search, and so on. Therefore, pay attention to column order when building a composite index. In general, put columns that are queried frequently or that have high selectivity first. Further adjustments can be made case by case, depending on the specific queries or table structure.

What are database transactions?

A transaction is an indivisible sequence of database operations and the basic unit of concurrency control. Its execution must take the database from one consistent state to another. A transaction is a logical group of operations that either all execute or none execute.

What are the four properties of ACID?

  • Atomicity: a transaction is the smallest unit of execution and cannot be split. Atomicity ensures that its actions either all complete or have no effect at all.

  • Consistency: data is consistent before and after a transaction executes, and multiple transactions reading the same data get the same result.

  • Isolation: under concurrent access, one user's transaction is not disturbed by other transactions; concurrent transactions are independent of each other.

  • Durability: once a transaction is committed, its changes to the data are permanent and must not be lost even if the database fails.
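
Atomicity is the easiest of the four properties to demonstrate. The following minimal sketch uses Python's built-in sqlite3 module as a stand-in for MySQL; the `account` table, the CHECK constraint, and the `transfer` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

def transfer(src, dst, amount):
    """Move money between accounts; both updates succeed or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst))
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False  # the debit broke the CHECK constraint; the credit was undone too

def balances():
    return dict(conn.execute("SELECT name, balance FROM account"))

print(transfer("alice", "bob", 30), balances())   # True {'alice': 70, 'bob': 80}
print(transfer("alice", "bob", 500), balances())  # False {'alice': 70, 'bob': 80}
```

In the failed transfer the credit to bob had already been executed when the debit was rejected, yet bob's balance is unchanged afterwards because the whole transaction was rolled back.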

What is a dirty read? A phantom read? A non-repeatable read?

Dirty read: one transaction has updated a piece of data and another transaction reads it. The first transaction then rolls back for some reason, so the data read by the second transaction is incorrect.

Non-repeatable read: the data returned by two queries within the same transaction is inconsistent, possibly because another transaction updated the data between the two queries.

Phantom read: the number of rows returned by two queries within the same transaction is inconsistent. For example, one transaction queries some rows while another transaction inserts new rows; on its next query, the first transaction finds rows that were not there before.

What is the isolation level of a transaction? What is the default isolation level for MySQL?

To achieve the four transaction properties, the database defines four transaction isolation levels, from low to high: Read uncommitted, Read committed, Repeatable read, and Serializable. Each level in turn solves one more of the problems of dirty reads, non-repeatable reads, and phantom reads.

The SQL standard defines four isolation levels:

  • READ-UNCOMMITTED: the lowest isolation level. It allows uncommitted data changes to be read, which may lead to dirty reads, phantom reads, or non-repeatable reads.

  • READ-COMMITTED: allows reading data that concurrent transactions have already committed. This prevents dirty reads, but phantom reads and non-repeatable reads can still occur.

  • REPEATABLE-READ: multiple reads of the same field return consistent results unless the data is modified by the transaction itself. This prevents dirty reads and non-repeatable reads, but phantom reads are still possible.

  • SERIALIZABLE: the highest isolation level, fully compliant with ACID isolation. All transactions are executed one after another, so they cannot interfere with each other at all. This level prevents dirty reads, non-repeatable reads, and phantom reads.

MySQL uses the REPEATABLE-READ isolation level by default. Oracle uses the READ-COMMITTED isolation level by default.

The relationship between isolation levels and locks

  • At the Read Uncommitted level, reads do not take shared locks, so they do not conflict with exclusive locks on data being modified.

  • At the Read Committed level, reads take shared locks but release them as soon as the statement finishes.

  • At the Repeatable Read level, reads take shared locks and hold them until the transaction ends; the shared locks are released only after the transaction completes.

  • SERIALIZABLE is the most restrictive isolation level because it locks the entire key range and holds the locks until the transaction completes.

What are the database locks by lock granularity?

  • Row-level lock: the most fine-grained lock in MySQL. It locks only the row being operated on, which greatly reduces conflicts between database operations. Its granularity is the smallest, but its locking overhead is the largest. Row-level locks are divided into shared locks and exclusive locks. Characteristics: high overhead and slow locking; deadlocks can occur; the granularity is the smallest, the probability of lock conflicts is the lowest, and concurrency is the highest.

  • Table-level lock: the lock with the coarsest granularity in MySQL. It locks the entire table being operated on. It is simple to implement, consumes few resources, and is supported by most MySQL engines, including the commonly used MyISAM and InnoDB. Table-level locks are divided into shared read locks (shared locks) and exclusive write locks (exclusive locks). Characteristics: low overhead and fast locking; no deadlocks; the granularity is large, the probability of lock conflicts is the highest, and concurrency is the lowest.

  • Page-level lock: a lock whose granularity lies between row-level and table-level locks. Table-level locking is fast but causes many conflicts, while row-level locking causes few conflicts but is slow. Page-level locking is the compromise: it locks a group of adjacent records at a time.

What locks does MySQL have?

By lock type, there are shared locks and exclusive locks. Shared lock: also known as a read lock. When a user wants to read data, a shared lock is placed on it, and several shared locks can be held at the same time. Exclusive lock: also known as a write lock. When a user wants to write data, an exclusive lock is placed on it. Only one exclusive lock can be held at a time, and it is mutually exclusive with every other exclusive lock and shared lock.

What are the three lock algorithms for InnoDB storage engine?

  • Record lock: a lock on a single row record.

  • Gap lock: a lock on a range that excludes the record itself.

  • Next-key lock: a record lock plus a gap lock; it locks a range and includes the record itself.
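
To make the three algorithms concrete, the sketch below lists the intervals that next-key locks can cover for a given set of index values. It is plain Python arithmetic on a hypothetical index containing 10, 11, 13, and 20, not a call into MySQL; the helper names are invented.

```python
import math

def next_key_ranges(index_values):
    """List the intervals next-key locks can cover as (low, high, right_closed)."""
    keys = sorted(set(index_values))
    bounds = [-math.inf] + keys + [math.inf]
    ranges = []
    for low, high in zip(bounds, bounds[1:]):
        # Each interval is the gap before a record plus the record itself,
        # so it is open on the left and closed on the right. The final
        # interval above the largest key has no record to close it.
        ranges.append((low, high, high != math.inf))
    return ranges

def fmt(low, high, right_closed):
    return f"({low}, {high}{']' if right_closed else ')'}"

for interval in next_key_ranges([10, 11, 13, 20]):
    print(fmt(*interval))
# (-inf, 10]
# (10, 11]
# (11, 13]
# (13, 20]
# (20, inf)
```

In each interval the open part is what a gap lock covers and the closing value is what a record lock covers.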

What is a deadlock? How to solve it?

A deadlock is a vicious cycle in which two or more transactions each hold resources the others need and each request a lock on resources the others hold.

Common ways to deal with deadlocks:

  • If different programs access several tables concurrently, agree to access the tables in the same order; this greatly reduces the chance of deadlock.

  • Within a transaction, lock all the resources it needs in one go wherever possible, to reduce the probability of deadlock.

  • For parts of the business that are very prone to deadlock, try escalating the lock granularity and use table-level locking to reduce the probability of deadlock.

If the business logic cannot handle these situations well, distributed transaction locks or optimistic locks can be used.
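
The first suggestion, always acquiring resources in the same order, can be sketched with ordinary Python threads. In-process locks stand in for row locks here, and the `Account` class and `transfer` function are hypothetical.

```python
import threading

class Account:
    def __init__(self, account_id, balance):
        self.id = account_id
        self.balance = balance
        self.lock = threading.Lock()

def transfer(src, dst, amount):
    # Always lock the account with the smaller id first, whatever the
    # transfer direction, so two opposite transfers cannot wait on each other.
    first, second = sorted((src, dst), key=lambda account: account.id)
    with first.lock, second.lock:
        src.balance -= amount
        dst.balance += amount

a, b = Account(1, 1000), Account(2, 1000)
workers = [
    threading.Thread(target=lambda: [transfer(a, b, 1) for _ in range(500)]),
    threading.Thread(target=lambda: [transfer(b, a, 1) for _ in range(500)]),
]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()

print(a.balance, b.balance)  # 1000 1000
```

If each transfer locked `src` first and `dst` second instead, the two threads could each hold one lock and wait forever for the other.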

What are optimistic and pessimistic locks for databases? How do you do that?

The task of concurrency control in a database management system (DBMS) is to ensure that, when multiple transactions access the same data at the same time, the isolation and consistency of the transactions and the consistency of the database are not broken. Optimistic concurrency control (optimistic locking) and pessimistic concurrency control (pessimistic locking) are the main techniques used.

Pessimistic locking: assume that concurrency conflicts will occur and block every operation that might violate data integrity. The data is locked as soon as it is queried and stays locked until the transaction commits. Implementation: use the locking mechanism of the database.

Optimistic locking: assume that concurrency conflicts will not occur and check for data integrity violations only at commit time. The check is performed when the data is modified, by means of a version. Implementation: optimistic locking generally uses a version number mechanism or the CAS algorithm.
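
The version number mechanism can be sketched as follows, using Python's built-in sqlite3 module as a stand-in for MySQL. The `item` table, its `version` column, and the helper functions are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, stock INTEGER, version INTEGER)")
conn.execute("INSERT INTO item VALUES (1, 10, 0)")
conn.commit()

def read_item(item_id):
    return conn.execute(
        "SELECT stock, version FROM item WHERE id = ?", (item_id,)).fetchone()

def update_stock(item_id, new_stock, expected_version):
    """Write only if nobody has changed the row since it was read."""
    cursor = conn.execute(
        "UPDATE item SET stock = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_stock, item_id, expected_version),
    )
    conn.commit()
    return cursor.rowcount == 1   # 0 rows updated means the version was stale

stock_a, version_a = read_item(1)   # first writer reads version 0
stock_b, version_b = read_item(1)   # second writer reads the same version

print(update_stock(1, stock_a - 1, version_a))  # True: first writer wins
print(update_stock(1, stock_b - 1, version_b))  # False: version is now 1
```

The losing writer typically re-reads the row and retries, or reports the conflict to the user.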

Large table data query, how to optimize?

  • Optimize the schema, the SQL statements, and the indexes.

  • Second, add a cache such as memcached or redis.

  • Use master/slave replication with read/write separation.

  • Vertical splitting: split a large system into several smaller systems according to how tightly the modules are coupled, that is, build a distributed system.

  • Horizontal splitting: for tables with a large data volume. This step is the most troublesome and the best test of technical skill. Choose a reasonable sharding key; for good query efficiency the table structure also has to change, with some redundancy added, and the application has to change as well. SQL should carry the sharding key whenever possible so that a query is routed to a specific table instead of scanning all tables.

How to handle deep pagination?

Deep pagination is usually handled from two directions.

At the database level, which is where most attention goes (although the gains are limited), there is room for optimization. A query like select * from table where age > 20 limit 1000000,10 loads 1,000,000 rows and then throws almost all of them away to return 10, so it is naturally slow. It can be rewritten as select * from table where id in (select id from table where age > 20 limit 1000000,10). This still walks past a million rows, but it is fast because of the covering index: every column the subquery needs is in the index. (MySQL does not allow LIMIT directly inside an IN subquery, so in practice the subquery is wrapped in a derived table or joined.) If the ids are continuous, select * from table where id > 1000000 limit 10 also works.

At the requirements level, the main thing is to avoid such features in the first place (jumping directly to a specific page millions of rows in). Allow only page-by-page browsing or navigation along a given path, which is predictable and cacheable, and guard against ID leaks and sustained malicious access.
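
The difference between skipping rows with an offset and seeking by id can be sketched like this, using Python's built-in sqlite3 module as a stand-in for MySQL. The `person` table and both helper functions are hypothetical, and the table is far too small to show the speed difference; the point is only that both approaches return the same page.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, age INTEGER)")
conn.executemany(
    "INSERT INTO person (id, age) VALUES (?, ?)",
    [(i, 18 + i % 50) for i in range(1, 1001)],
)

def page_by_offset(offset, size):
    # The database still has to walk past `offset` matching rows.
    return conn.execute(
        "SELECT id FROM person WHERE age > 20 ORDER BY id LIMIT ? OFFSET ?",
        (size, offset),
    ).fetchall()

def page_after(last_id, size):
    # Seek straight to the first id after the previous page.
    return conn.execute(
        "SELECT id FROM person WHERE age > 20 AND id > ? ORDER BY id LIMIT ?",
        (last_id, size),
    ).fetchall()

first = page_after(0, 10)
second = page_after(first[-1][0], 10)
print(second == page_by_offset(10, 10))  # True
```

Seeking by the last seen id works without continuous ids, but it only supports moving to the next page, which matches the page-by-page restriction suggested above.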

Why try to have a primary key?

A primary key guarantees the uniqueness of each row in the table. Even if the business itself does not need one, you are advised to add an auto-increment ID column as the primary key. With a primary key in place, later deletes can be faster and the range of rows an operation touches can be kept safe.

Should the primary key be an auto-increment ID or a UUID?

An auto-increment ID is recommended over a UUID.

In the InnoDB storage engine, the primary key index is the clustered index: the leaf nodes of the primary key B+ tree store the primary key together with all the row data, in key order. With an auto-increment ID, new rows are simply appended at the end. With a UUID, the value of each new ID relative to the existing ones is unpredictable, so inserts land in the middle of existing data, which causes a lot of data movement and page fragmentation and in turn degrades insert performance.

In general, with large data volumes, auto-increment primary keys perform better.

As for the primary key being the clustered index: if a table has no primary key, InnoDB chooses a unique key as the clustered index, and if there is no unique key either, it generates a hidden primary key.
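
The effect of key order on inserts can be imitated with a sorted Python list. This is only an analogy for the ordered leaf level of a clustered index, not a model of InnoDB pages; `count_mid_inserts` is a made-up helper, and the random UUIDs are generated from a seeded generator so the run is repeatable.

```python
import random
import uuid
from bisect import bisect_right

def count_mid_inserts(keys):
    """Insert keys into a sorted list; count inserts that are not at the end."""
    ordered, mid_inserts = [], 0
    for key in keys:
        position = bisect_right(ordered, key)
        if position != len(ordered):
            mid_inserts += 1          # existing entries must shift to make room
        ordered.insert(position, key)
    return mid_inserts

rng = random.Random(42)               # fixed seed, an arbitrary choice
auto_ids = list(range(1, 1001))
uuid_ids = [str(uuid.UUID(int=rng.getrandbits(128))) for _ in range(1000)]

print(count_mid_inserts(auto_ids))    # 0: every insert is an append
print(count_mid_inserts(uuid_ids))    # nearly every insert lands mid-list
```

Each mid-list insert corresponds, in the analogy, to making room inside an already filled region of the index rather than appending to its end.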

Why is the field required to be not NULL?

NULL values take up extra space and cause many unexpected mismatches in program logic.

If you want to store user password hashes, what fields should be used for storage?

Fixed-length strings such as password hashes, salts, and user ID numbers should be stored in CHAR rather than VARCHAR, which saves space and improves retrieval efficiency.
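
The reason CHAR fits password hashes is that a given hash function always produces output of the same length. A minimal Python sketch, assuming PBKDF2 with SHA-256 from the standard library and an arbitrary iteration count; the `hash_password` helper is hypothetical:

```python
import hashlib
import os

def hash_password(password, salt, iterations=100_000):
    """Derive a key with PBKDF2-HMAC-SHA256 and return it as hex text."""
    derived = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return derived.hex()   # 32 bytes, so always 64 hex characters

salt = os.urandom(16)
for password in ["a", "correct horse battery staple"]:
    print(len(hash_password(password, salt)))   # 64 both times
```

Under these assumptions the hash column could be declared CHAR(64), and a 16-byte salt stored as hex would fit CHAR(32).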

Database structure optimization?

A good database design often achieves twice the result with half the effort where performance is concerned.

You need to consider data redundancy, the speed of queries and updates, and whether the data types of the fields are reasonable.

Split a table with many fields into several tables: if some fields of a wide table are used infrequently, move them into a new table. When a table holds a large amount of data, the presence of rarely used fields slows it down.

Add intermediate tables: for tables that are frequently queried together, create an intermediate table to improve query efficiency. Insert the data that the join query needs into the intermediate table, and then change the original join query into a query against the intermediate table.

Add redundant fields: table design should follow normalization theory as far as possible and keep redundant fields to a minimum, so that the design stays clean and elegant. However, adding redundant fields where reasonable can improve query speed. The more normalized the tables are, the more relationships there are between them, the more joins queries need, and the worse the performance.

What to do when MySQL CPU usage surges to 500%

When CPU usage surges to 500%, first run the operating system's top command to check whether mysqld is the process consuming it. If it is not, find the processes with high CPU usage and deal with them.

If mysqld is the cause, run SHOW PROCESSLIST to see the sessions running in it. Find the SQL that consumes the most resources and check whether its execution plan is accurate, whether an index is missing, or whether there is simply too much data.

In general, kill those threads (and watch whether CPU usage drops), then make the appropriate adjustments (such as adding indexes, changing the SQL, or changing memory parameters) and run the SQL again.

It is also possible that no single SQL statement consumes many resources, but a large number of sessions connect all at once and cause the CPU to spike. In this case, work with the application team to analyze why the number of connections surged, and then adjust accordingly, for example by limiting the number of connections.

What is the role of master-slave replication?

  • If the primary database has a problem, you can switch to the secondary database.

  • Read/write separation can be performed at the database level.

  • Daily backups can be made on a slave database.

What problems does MySQL master-slave replication solve?

  • Data distribution: Start or stop replication at will and distribute data backups across geographic locations

  • Load balancing: Reduce the stress on a single server

  • High availability and failover: Helps applications avoid single points of failure

  • Upgrade testing: you can use a higher version of MySQL as the slave library

How does MySQL master-slave replication work?

  • The master library records data changes in its binary log.

  • The slave library copies the master library's binary log events into its own relay log.

  • The slave library reads the events from the relay log and replays them against its own data.
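
The three steps above can be modelled in a few lines of Python. This is a toy simulation with in-memory lists standing in for the binary log and relay log; the `Master` and `Slave` classes are hypothetical and ignore threads, networking, and log positions.

```python
class Master:
    def __init__(self):
        self.data, self.binlog = {}, []

    def write(self, key, value):
        self.data[key] = value
        self.binlog.append((key, value))   # step 1: record the change in the binary log

class Slave:
    def __init__(self):
        self.data, self.relay_log, self.applied = {}, [], 0

    def fetch(self, master):
        # Step 2: copy binary log events not seen yet into the relay log.
        self.relay_log.extend(master.binlog[len(self.relay_log):])

    def replay(self):
        # Step 3: apply relay log events to the slave's own data.
        for key, value in self.relay_log[self.applied:]:
            self.data[key] = value
        self.applied = len(self.relay_log)

master, slave = Master(), Slave()
master.write("a", 1)
master.write("b", 2)
slave.fetch(master)
slave.replay()
master.write("a", 3)                # not replicated yet: the slave lags behind

print(slave.data)    # {'a': 1, 'b': 2}
print(master.data)   # {'a': 3, 'b': 2}
```

The gap between the two dictionaries at the end is replication lag: the slave catches up only after the next fetch and replay.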