1. Index

When it comes to database optimization, the first thing most interviewees mention is creating an index. Dig a little deeper, though, and it often turns out they do not understand indexes well enough, so let's talk about database indexes.

Pros and cons of indexes

Advantages: an index reduces the amount of data the server needs to scan, which improves query speed.

Disadvantages:

1. Creating and maintaining indexes takes time, and that time grows as the volume of data grows.

2. Indexes also take up storage space, on top of the table data, which is itself subject to a maximum size limit.

3. When rows in the table are inserted, deleted, or modified, the index has to be maintained dynamically as well, which slows down writes.

**Summary:** Indexes can improve database performance, but given the trade-offs above they can be counterproductive if used carelessly, so we need to understand indexes in order to use them well.

2. Index — Data structure

Hash table:

1. A hash index is suitable for equality lookups, not range lookups.

Ordered array:

1. An ordered array performs very well for both equality and range queries, but inserting into the middle means shifting the elements after it, so it is only suitable for static storage engines.
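The difference between the two structures can be sketched in a few lines. This is only an illustration in Python, not how a storage engine implements either one; the sample rows and the function names are made up for the example.

```python
from bisect import bisect_left, bisect_right

# Hypothetical (id, name) rows used only for this illustration.
rows = [(17, "Han"), (3, "LiLei"), (42, "Lucy"), (8, "Lily")]

# Hash table: one probe answers an equality lookup,
# but the keys have no order, so a range query must visit every entry.
hash_index = dict(rows)

def hash_equal(key):
    return hash_index.get(key)

def hash_range(low, high):
    return sorted((k, v) for k, v in hash_index.items() if low <= k <= high)

# Ordered array: binary search finds both ends of a range,
# and everything in between is already contiguous and sorted.
sorted_rows = sorted(rows)
sorted_keys = [k for k, _ in sorted_rows]

def array_range(low, high):
    start = bisect_left(sorted_keys, low)
    end = bisect_right(sorted_keys, high)
    return sorted_rows[start:end]
```

Both range functions return the same rows; the point is that `hash_range` has to touch every key, while `array_range` only touches the matching slice.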

Binary tree:

1. The tree is too tall, so fetching a single record requires too many disk accesses.

2. A disk page is 4K, but a balanced binary tree node holds only one key, so the tree cannot make good use of page-sized disk reads.

B tree:

1. A disk block can hold multiple entries, so the tree is shorter and makes effective use of disk characteristics.

2. Data and index entries are stored together in the same node.

B + tree:

1. Unlike a B-tree, data is stored only in the leaf nodes. Each non-leaf disk block can therefore hold more keys, which reduces tree height and the number of disk accesses.

2. All the data of a B+ tree lives in the leaf nodes, and the leaf nodes are linked by bidirectional pointers, so a range query can walk along adjacent leaves, which reduces the time complexity.
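To put rough numbers on the tree-height argument, here is a back-of-the-envelope sketch. The sizes are assumptions chosen for illustration: a 16 KB page (the InnoDB default, rather than the 4K disk page mentioned above), an 8-byte key, a 6-byte child pointer, and rows of about 1 KB.

```python
# Assumed sizes, for illustration only.
PAGE_SIZE = 16 * 1024   # bytes per page
KEY_BYTES = 8           # e.g. a BIGINT primary key
POINTER_BYTES = 6       # child page pointer
ROW_BYTES = 1024        # average row size

FANOUT = PAGE_SIZE // (KEY_BYTES + POINTER_BYTES)  # keys per non-leaf page
ROWS_PER_LEAF = PAGE_SIZE // ROW_BYTES             # rows per leaf page

def binary_tree_height(n_keys):
    # Smallest h with 2**h - 1 >= n_keys, i.e. a perfectly balanced tree.
    return n_keys.bit_length()

def bplus_tree_height(n_rows, fanout=FANOUT, rows_per_leaf=ROWS_PER_LEAF):
    # Count levels from the leaves up until a single root page remains.
    nodes = -(-n_rows // rows_per_leaf)  # ceiling division: number of leaf pages
    height = 1
    while nodes > 1:
        nodes = -(-nodes // fanout)
        height += 1
    return height

print(FANOUT, ROWS_PER_LEAF)             # 1170 16
print(binary_tree_height(20_000_000))    # 25
print(bplus_tree_height(20_000_000))     # 3
```

Under these assumptions, 20 million rows need about 25 levels in a balanced binary tree but only 3 levels in a B+ tree, which is the gap in disk accesses described above.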

Let's summarize the characteristics of B+ trees:

1. The keys are ordered, which makes lookups and range queries convenient.

2. Each node stores a whole block of entries, which reduces tree height and therefore the number of disk reads (addressing cost). It also takes advantage of the disk's block storage and prefetch features to improve I/O performance.

3. Leaf nodes are connected by bidirectional pointers, which improves range-query performance.

4. The upper index levels work much like the layers of a skip list, which makes data retrieval more efficient.
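The range-scan behaviour of linked leaves can be sketched as follows. This is a simplified model rather than a real B+ tree: the non-leaf levels are replaced by a binary search over each leaf's largest key, the leaves are linked in one direction only, and `Leaf`, `build_leaves`, and `range_scan` are names invented for the example.

```python
from bisect import bisect_left

class Leaf:
    def __init__(self, keys):
        self.keys = keys   # sorted keys held by this leaf
        self.next = None   # pointer to the right-hand sibling leaf

def build_leaves(sorted_keys, per_leaf):
    leaves = [Leaf(sorted_keys[i:i + per_leaf])
              for i in range(0, len(sorted_keys), per_leaf)]
    for left, right in zip(leaves, leaves[1:]):
        left.next = right
    return leaves

def range_scan(leaves, low, high):
    # Stand-in for the root-to-leaf descent: find the first leaf
    # whose largest key is >= low.
    max_keys = [leaf.keys[-1] for leaf in leaves]
    idx = bisect_left(max_keys, low)
    if idx == len(leaves):
        return []
    result = []
    leaf = leaves[idx]
    # Follow sibling pointers instead of going back up the tree.
    while leaf is not None:
        for key in leaf.keys:
            if key > high:
                return result
            if key >= low:
                result.append(key)
        leaf = leaf.next
    return result

leaves = build_leaves(list(range(0, 100, 5)), per_leaf=4)
print(range_scan(leaves, 12, 41))  # [15, 20, 25, 30, 35, 40]
```

Only one lookup is needed to find the starting leaf; the rest of the range is read by moving from leaf to leaf in key order.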

3. Revisiting index failure scenarios in light of the data structure

Principle:

1. Leftmost-prefix matching principle.

2. Cost-based (I/O, CPU, memory) considerations: if going through the index would cost more, the optimizer abandons the index.

For example, if the rows matched through the index account for 80% of the table, a full table scan is more efficient.

3. Sequential disk reads and writes are faster than random ones, and the disk has a prefetch feature.

4. Understanding database concepts such as back-to-table lookups, covering indexes, and index condition pushdown helps in writing efficient SQL.

5. Accessing as few resources as possible is one of the important principles of database design. When using a database and designing tables, aim to minimize resource consumption.

6. Reducing index size lets more of the index fit in memory and reduces disk reads and writes.
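The cost-based principle can be illustrated with a toy model. The per-row costs below are invented numbers, not MySQL's actual cost constants; they only encode the idea that each row fetched through a secondary index costs a random lookup, while a full table scan reads rows sequentially and cheaply.

```python
def cheaper_plan(total_rows, matched_rows,
                 scan_cost_per_row=0.2, lookup_cost_per_row=1.0):
    # Hypothetical costs: a sequentially scanned row is assumed to be
    # five times cheaper than a row reached by a random index lookup.
    full_scan_cost = total_rows * scan_cost_per_row
    index_cost = matched_rows * lookup_cost_per_row
    return "index" if index_cost < full_scan_cost else "full scan"

print(cheaper_plan(1_000_000, 1_000))    # index
print(cheaper_plan(1_000_000, 800_000))  # full scan
```

With these assumed costs the crossover sits at 20% of the table; the real threshold depends on the optimizer's own statistics and cost settings.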

Scenarios:

1. Do not apply a function to an indexed column. Compare:

    EXPLAIN SELECT * FROM employees WHERE name = 'LiLei';
    EXPLAIN SELECT * FROM employees WHERE left(name,3) = 'LiLei';

Cause: the index is not used in the second statement because the B+ tree stores the original column values, which cannot be compared once a function has been applied to them.

2. The storage engine cannot use the columns to the right of a range condition in the composite index (index_name_age_position):

    EXPLAIN SELECT * FROM employees WHERE name= 'LiLei' AND age = 22 AND position ='manager';
    EXPLAIN SELECT * FROM employees WHERE name= 'LiLei' AND age > 22 AND position ='manager';

Cause: the index is matched by prefix, so only the leading part can be matched. The second statement can use only the name and age fields.

3. Use covering indexes whenever possible, that is, queries that access only the index because the index columns contain all the queried columns. Compare:

    EXPLAIN SELECT name,age FROM employees WHERE name= 'LiLei' AND age = 23 AND position ='manager';
    EXPLAIN SELECT * FROM employees WHERE name= 'LiLei' AND age = 23 AND position ='manager';

4. MySQL cannot use the index for not-equal conditions (!= or <>).

Cause: an inequality query matches a huge amount of data. Going through the index means a high addressing frequency, whereas a full table scan that ignores the index benefits from efficient sequential reads.

5. IS NULL and IS NOT NULL conditions may also skip the index.

Cause: the reasoning given is that null values are not placed on the B+ tree. Note that InnoDB does index NULL values, so whether the index is used here comes down to the optimizer's cost estimate.

6. A LIKE pattern that starts with a wildcard ('%ABC...') cannot use the index.

7. Comparing a string column with an unquoted number triggers an implicit type conversion, so the index is not used. Compare:

    EXPLAIN SELECT * FROM employees WHERE name = '1000';
    EXPLAIN SELECT * FROM employees WHERE name = 1000;

8. OR: if neither side of the OR has an index, or only one side has one, the index is not used.

Cause: the conditions are in an OR relationship, so even if one side used its index, the side without an index would still need a full table scan, and the two result sets would then have to be combined into a union. That costs more than going directly to a single full table scan.

9. IN: avoid it where you can. If you really cannot, carefully evaluate the number of elements in the IN list.

Cause: addressing cost > full table scan cost.

10. ORDER BY: when sorting on multiple columns, use the same sort direction for all of them, either all ascending or all descending.

Cause: a composite index is an ordered concatenation of multiple fields. If the sort directions are inconsistent, the index order cannot be used for the sort.

11. Database encoding: use utf8mb4 instead of utf8.

Cause: the encoding of the database differs from that of Java. MySQL's utf8 uses at most 3 bytes per character and cannot store emoticons that occupy 4 bytes, while utf8mb4 uses up to 4 bytes. Therefore, utf8mb4 is recommended.
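The leftmost-prefix rule behind the composite index (index_name_age_position) examples can be modelled in a few lines. This is a simplified sketch, not the optimizer's real logic: it assumes the index columns are ordered name, age, position (inferred from the index name), it ignores index condition pushdown, and `usable_index_columns` is a hypothetical helper.

```python
def usable_index_columns(index_columns, predicates):
    """Return the leading index columns a query can use for lookup.

    predicates maps a column name to "eq" (equality) or "range" (>, <, BETWEEN).
    """
    used = []
    for column in index_columns:
        op = predicates.get(column)
        if op is None:
            break            # gap in the prefix: stop matching here
        used.append(column)
        if op == "range":
            break            # columns right of a range condition are not used
    return used

# Assumed column order of index_name_age_position.
INDEX = ("name", "age", "position")

print(usable_index_columns(INDEX, {"name": "eq", "age": "eq", "position": "eq"}))
# ['name', 'age', 'position']
print(usable_index_columns(INDEX, {"name": "eq", "age": "range", "position": "eq"}))
# ['name', 'age']
print(usable_index_columns(INDEX, {"age": "eq", "position": "eq"}))
# []
```

The second call mirrors the `age > 22` query: matching stops at the range condition, so `position` is not used for the lookup. The third call shows that skipping the leading column `name` leaves nothing to match.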