Hello everyone, I am Two Black. Here is a set of resources on software testing:

  • Software testing related tools
  • Software testing practice set
  • Deep automated testing
  • Python Learning Guide
  • Interview questions and resume templates

Follow my WeChat official account [programmer two black] to get them for free!

Communication group: 642830685

Contents

Why learn MySQL?

What is an index?

Why have an index?

1. Where is MySQL data stored?

2. When a query is slow, where is the usual bottleneck?

3. When reading data from disk, do you read only as much as you need?

4. Where is the index stored?

5. What information does an index entry need to store? Which field values?

6. What data structures can store data in this format?

7. The MySQL index is not stored in the format described above. Why?

MySQL index data structures

3.1 Hash table

3.2 Trees

Why use a B+ tree?

Create index

5.1 Clustered and non-clustered indexes

5.2 Back to the table

5.3 Covering index

5.4 Leftmost match

5.5 Index pushdown

Why learn MySQL?

Every day we use all kinds of websites and apps, such as WeChat, QQ, Douyin, Toutiao, and Tencent News. All of them carry a huge amount of information, and that information has to be stored somewhere. Where? In a database.

So if we want to develop a website or an app, databases are a technology we must master. Commonly used databases include MySQL, Oracle, SQL Server, and DB2.

Of these, Oracle is generally regarded as the strongest in performance, and its support services are thorough, but it is also very expensive. Financial companies have high requirements for database stability, so they generally choose Oracle.

MySQL is free to use, while the others are currently paid products. MySQL is also the most widely used database at Internet companies, its documentation is extensive, and its community is very active, so we mainly study MySQL.

Due to space constraints, this article covers only MySQL indexes.

What is an index?

An index is like the lookup table at the front of a dictionary: we usually go to that table first, find the keyword or letter, and then jump to the right page, which is a lot faster than paging through the whole dictionary.

Why have an index?

The same applies when we use a MySQL database: with an index that works like the dictionary's lookup table, queries are much faster.

2.1 Questions:

1. Where is MySQL data stored?

On disk.

2. When a query is slow, where is the usual bottleneck?

I/O.

3. When reading data from disk, do you read only as much as you need?

No. The disk reads ahead (prefetch).

Locality principle: data and programs tend to cluster, and data that was accessed recently is likely to be accessed again (temporal locality), as is data stored near it (spatial locality).

Disk prefetch: data moves between memory and disk in a minimum logical unit called a page. The page size is usually set by the operating system, typically 4 KB or 8 KB, and each read fetches a whole multiple of the page size.

The InnoDB storage engine reads 16 KB of data at a time.
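
To make the page idea concrete, here is a minimal Python sketch. It assumes a 16 KB page (the InnoDB figure above) and uses a hypothetical helper, `pages_read`, that only does arithmetic and never touches a real disk.

```python
PAGE_SIZE = 16 * 1024  # assumed page size in bytes (16 KB, the InnoDB figure above)

def pages_read(offset, length, page_size=PAGE_SIZE):
    """Return how many whole pages must be read to cover `length` bytes at `offset`."""
    if length <= 0:
        return 0
    first_page = offset // page_size
    last_page = (offset + length - 1) // page_size
    return last_page - first_page + 1

# Asking for 100 bytes still costs one full page ...
small = pages_read(offset=0, length=100)
# ... and a request that straddles a page boundary costs two.
straddle = pages_read(offset=PAGE_SIZE - 10, length=100)
print(small, straddle)
```

So even a tiny request is paid for in whole pages, which is why the number of pages touched matters more than the number of bytes asked for.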

4. Where is the index stored?

On disk. When data is queried, the index is loaded into memory first.

5. What information does an index entry need to store? Which field values?

Key: the value of the indexed column in the actual data row

File address: the file that holds the row

Offset: the position of the row within that file

6. What data structures can store data in this format?

Key-value structures:

Hash table, tree (binary tree, red-black tree, AVL tree, B tree, B+ tree)

7. The MySQL index is not stored in the format described above. Why?

OLAP (online analytical processing): analyzes massive historical data to support decision making. Typical system: a data warehouse such as Hive.

OLTP (online transaction processing): must return results in a very short time. Typical system: a relational database such as MySQL or Oracle.

MySQL index data structures

3.1 Hash table:

A hash table (an array of buckets with linked lists, like HashMap) is not well suited to indexing:

1. Hash collisions make the data distribute unevenly, so lookups degrade into long linear scans of a bucket and waste time

2. Range queries are not supported: a range query has to check the entries one by one

3. It places high demands on memory space

Advantage: equality queries are very fast
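
The trade-off can be sketched in a few lines of Python. This is only an illustration: a plain `dict` stands in for a hash index, and the keys and row labels are made up.

```python
# A plain dict stands in for a hash index: key -> row id (hypothetical data).
hash_index = {age: f"row-{age}" for age in range(1, 1001)}

def equality_lookup(index, key):
    """One hash probe: no ordering is needed."""
    return index.get(key)

def range_lookup(index, low, high):
    """A hash table keeps no key order, so every entry must be checked."""
    examined = 0
    hits = []
    for key, row in index.items():
        examined += 1
        if low <= key <= high:
            hits.append(row)
    return hits, examined

row = equality_lookup(hash_index, 42)
hits, examined = range_lookup(hash_index, 10, 19)
print(row, len(hits), examined)
```

The equality lookup touches one entry, while the range lookup has to examine all 1000 entries to find 10 matches.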

Is there a hash index in MySQL?

1. The MEMORY storage engine uses hash indexes

2. InnoDB supports an adaptive hash index

CREATE TABLE test (id INT PRIMARY KEY, name VARCHAR(30)) ENGINE = InnoDB; -- ENGINE can also be MEMORY or MyISAM; InnoDB is the default since MySQL 5.5

3.2 Trees:

There are many tree data structures. The common ones are the binary tree, BST, AVL tree, red-black tree, B tree, and B+ tree.

① Binary tree: unordered insertion

A plain binary tree gives us the tree structure, but its inserts are unordered, so to find a value you still have to walk through the nodes one by one.

② BST (binary search tree): data is inserted in order, so every key in the left subtree is smaller than the root and every key in the right subtree is larger. This allows binary search, which improves efficiency.

To query data, binary search quickly narrows the range and reduces the time complexity. But if keys are inserted in ascending or descending order, every new node hangs off the same side of the previous one.

The binary search tree then degenerates into a linked list, and the lookup time becomes O(n) again.
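
The degeneration is easy to reproduce. The sketch below is a minimal, unbalanced BST written for illustration (the `Node`, `insert`, `height`, and `build` names are not from any library); it inserts the same 200 keys once in ascending order and once shuffled, then compares tree heights.

```python
import random

class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def insert(root, key):
    """Iterative BST insert with no rebalancing; returns the root."""
    if root is None:
        return Node(key)
    node = root
    while True:
        if key < node.key:
            if node.left is None:
                node.left = Node(key)
                return root
            node = node.left
        else:
            if node.right is None:
                node.right = Node(key)
                return root
            node = node.right

def height(root):
    """Number of levels, counted level by level without recursion."""
    levels = 0
    frontier = [root] if root is not None else []
    while frontier:
        levels += 1
        frontier = [child for node in frontier
                    for child in (node.left, node.right) if child is not None]
    return levels

def build(keys):
    root = None
    for key in keys:
        root = insert(root, key)
    return root

keys = list(range(1, 201))
sorted_height = height(build(keys))      # ascending inserts: one long chain
random.Random(0).shuffle(keys)
shuffled_height = height(build(keys))    # same keys, random order
print(sorted_height, shuffled_height)
```

With ascending inserts the height equals the number of keys, so a lookup walks a chain; the shuffled build is far shallower.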

③ AVL tree: to solve the problem above, the AVL tree uses left and right rotations so that the heights of the shortest and tallest subtrees never differ by more than 1.

With sequential inserts, rotations are performed automatically to keep the tree balanced, but the better query performance is paid for with slower inserts. When a workload inserts a lot of data and queries very little, the rotations on insert cost a lot of time.

④ Red-black tree (suited to roughly equal numbers of reads and writes): it also balances the tree with left and right rotations, plus recoloring, and only requires that the longest path is no more than twice the shortest one.

Query performance and insert performance are roughly balanced. However, as more data is inserted the tree keeps getting deeper, and a deeper tree means more I/O operations, which hurts read efficiency.

⑤ B tree: to stop the tree from getting deeper as more data is inserted, the B tree turns the ordered binary tree into an ordered multi-way tree.

Select * from table where id=14;

  1. Step 1: load disk block 1 into memory, find 14 < 16, and follow the address to disk block 2

  2. Step 2: load disk block 2 into memory, find 14 > 11, and follow the address to disk block 7

  3. Step 3: load disk block 7 into memory, find 14 = 14, read the data, and finish

Think: is the B tree perfect?

Problem 1: the B tree does not support fast range queries. After finding one boundary of the range, we have to go back up to the root and search again, possibly traversing from the root many times before reaching the other boundary, so query efficiency drops.

Problem 2: if the nodes store whole row records, rows get bigger as the number of columns grows. A page then holds fewer records, the tree gets taller, and the number of disk I/Os increases.

Think 2: how many records can a three-level B tree store?

A: assuming one record is 1 KB and the InnoDB storage engine reads 16 KB at a time, three levels hold 16 × 16 × 16 = 4096 records. In real development a table usually holds far more than 4096 rows. Should we keep adding levels? That would increase the number of I/Os.
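
The arithmetic above can be checked with a short Python sketch. It follows the same simplified model as the text (every node holds 16 records and fans out 16 ways); the function names and the one-million-row figure are illustrative assumptions.

```python
def btree_capacity(records_per_node, levels):
    """Records in a full tree under the simple model used above:
    every node holds `records_per_node` records and fans out by the same amount."""
    return records_per_node ** levels

def levels_needed(total_records, records_per_node):
    """Smallest number of levels whose capacity covers `total_records`."""
    levels = 1
    while btree_capacity(records_per_node, levels) < total_records:
        levels += 1
    return levels

PAGE_KB, ROW_KB = 16, 1           # figures taken from the text
per_node = PAGE_KB // ROW_KB      # 16 records per page
print(btree_capacity(per_node, 3))         # 4096
print(levels_needed(1_000_000, per_node))  # 5
```

Under this model a table with a million rows already needs five levels, and each extra level is one more page read per lookup.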

Why use a B+ tree?

How is table data actually stored? As a key plus the complete data row. To handle that well, the B tree is reworked into the B+ tree.

The B+ tree improves on the B tree by placing all data in the leaf nodes. The leaf nodes are connected by bidirectional pointers, so the bottom level forms an ordered doubly linked list. For example, take select * from table where id between 11 and 35:

  1. Step 1: load disk block 1 into memory, find 11 < 28, and follow the address to disk block 2
  2. Step 2: load disk block 2 into memory, find 10 < 11 < 17, and follow the address to disk block 5
  3. Step 3: load disk block 5 into memory, find 11 = 11, and read the data
  4. Step 4: keep reading to the right along the linked leaf nodes until 35 = 35 is found, which yields all the data between 11 and 35

As you can see, a range query like this is much faster than on a B tree.
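
The steps above can be sketched in Python as follows. This is a simplified model, not InnoDB's implementation: only the leaf level is built, a sorted list of first keys stands in for the non-leaf nodes, the leaves are linked in one direction only, and the ids 1 to 60 and the leaf size of 4 are made up.

```python
from bisect import bisect_right

class Leaf:
    """One leaf page: sorted keys plus a pointer to the next leaf."""
    def __init__(self, keys):
        self.keys = keys
        self.next = None

def build_leaves(sorted_keys, per_leaf):
    """Chunk sorted keys into leaves and link them left to right."""
    leaves = [Leaf(sorted_keys[i:i + per_leaf])
              for i in range(0, len(sorted_keys), per_leaf)]
    for left, right in zip(leaves, leaves[1:]):
        left.next = right
    return leaves

def range_scan(leaves, low, high):
    """Locate the starting leaf once, then follow `next` pointers until past `high`."""
    first_keys = [leaf.keys[0] for leaf in leaves]  # stands in for the non-leaf nodes
    leaf = leaves[max(bisect_right(first_keys, low) - 1, 0)]
    result, visited = [], 0
    while leaf is not None:
        visited += 1
        for key in leaf.keys:
            if key > high:
                return result, visited
            if key >= low:
                result.append(key)
        leaf = leaf.next
    return result, visited

keys = list(range(1, 61))                 # hypothetical ids 1..60
leaves = build_leaves(keys, per_leaf=4)
found, leaves_visited = range_scan(leaves, 11, 35)
print(found[0], found[-1], len(found), leaves_visited)
```

The tree is descended only once; after that the scan moves sideways from leaf to leaf and never returns to the root.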

How do B trees and B+ trees compare? In a B+ tree:

  • Data is stored only in the leaf nodes

  • Non-leaf nodes store no data, only keys

  • Each node holds more keys. This lowers the height of the tree and splits the key space into more ranges; the more ranges per node, the faster the query
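
To see how much the extra keys per node help, here is a rough capacity estimate in Python. The sizes are assumptions for illustration only (an 8-byte key, a 6-byte child pointer, 1 KB rows, and no page header overhead), so the result is an order-of-magnitude figure rather than an exact InnoDB number.

```python
PAGE_BYTES = 16 * 1024   # 16 KB page, the InnoDB figure used earlier
KEY_BYTES = 8            # assumption: 8-byte integer key
POINTER_BYTES = 6        # assumption: size of a child-page pointer
ROW_BYTES = 1024         # assumption: 1 KB per row, as in the B tree estimate

def bplus_capacity(levels):
    """Rows reachable when non-leaf levels hold only key + pointer and leaves hold rows."""
    fanout = PAGE_BYTES // (KEY_BYTES + POINTER_BYTES)
    rows_per_leaf = PAGE_BYTES // ROW_BYTES
    return fanout ** (levels - 1) * rows_per_leaf

print(PAGE_BYTES // (KEY_BYTES + POINTER_BYTES))  # 1170 keys per non-leaf page
print(bplus_capacity(3))                          # 21902400 rows in three levels
```

Under these assumptions three levels reach roughly 20 million rows, compared with 4096 when every node carries whole rows.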

Question: should an index be created on an int or a varchar column?

A: It depends, but keep the key as small as possible, because smaller keys mean more keys per page.

Create index

InnoDB: data and index are stored together in one file, the .ibd file. MyISAM: the index is stored in the .MYI file and the data in the .MYD file.

5.1 Clustered and non-clustered indexes

InnoDB:

  1. A table can have only one clustered index, but many non-clustered indexes
  2. Data inserted into InnoDB must be organized under an index key
  3. That key is the primary key; if there is no primary key, a unique key; and if there is no unique key either, an internally generated 6-byte row ID

MyISAM: non-clustered indexes only

MySQL, InnoDB: the B+ tree stores the index and the data together, so finding the index entry also gives you the row data

MySQL, MyISAM: the B+ tree stores the index together with the address of the data, so finding the index entry gives you an address, and the row is then read from that address

5.2 Back to the table

Next, I'll create an example table to demonstrate:

CREATE TABLE user_test (
  id INT PRIMARY KEY AUTO_INCREMENT, -- primary key
  uname VARCHAR(20),
  age INT,
  gender VARCHAR(10),
  KEY `idx_uname` (`uname`)
) ENGINE = InnoDB;
INSERT INTO user_test VALUES (1, 'zhang3', 18, 'male');
INSERT INTO user_test VALUES (NULL, 'ma', 19, 'ma');
INSERT INTO user_test VALUES (NULL, 'user_test', 18, 'male');
INSERT INTO user_test VALUES (NULL, '王老7', 22, 'male');
INSERT INTO user_test VALUES (NULL, 'user_test', 16, 'female');
INSERT INTO user_test VALUES (NULL, '10', 26, '10');

SELECT * FROM user_test WHERE uname = 'user_test';

The query first uses uname to find the id, then uses the id to fetch the full row. It walks two B+ trees, and that second walk is what "going back to the table" means: after an ordinary index yields the clustered index key, that key is used to fetch the data from the clustered index. This costs extra time, so in day-to-day work we should minimize the number of trips back to the table.
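
The two lookups can be modeled with two Python dictionaries. This is a toy illustration: dictionaries stand in for the two B+ trees, and the sample rows (including the names li4 and wang5) are made up.

```python
# Clustered index: primary key -> full row (hypothetical sample rows).
clustered = {
    1: {"id": 1, "uname": "zhang3", "age": 18},
    2: {"id": 2, "uname": "li4", "age": 19},
    3: {"id": 3, "uname": "wang5", "age": 18},
}
# Secondary index on uname: uname -> primary key only.
idx_uname = {row["uname"]: pk for pk, row in clustered.items()}

def select_star_by_uname(uname):
    """Models SELECT * ... WHERE uname = ? and counts the index trees visited."""
    lookups = 1
    pk = idx_uname.get(uname)     # first tree: the secondary index
    if pk is None:
        return None, lookups
    lookups += 1
    row = clustered[pk]           # second tree: back to the clustered index
    return row, lookups

row, lookups = select_star_by_uname("li4")
print(row["age"], lookups)
```

Every matching row costs a second lookup in the clustered index, which is the cost the text says to minimize.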

5.3 Covering index

SELECT id, uname FROM user_test WHERE uname = 'j3';

The idx_uname index already contains both uname and the primary key id, so this query can be answered from the index alone without going back to the table. That is a covering index.
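
A small Python model shows when the second lookup can be skipped. As before, dictionaries stand in for the B+ trees, and the rows and the `query` helper are hypothetical.

```python
# Clustered index: primary key -> full row (hypothetical rows).
clustered = {
    1: {"id": 1, "uname": "zhang3", "age": 18},
    2: {"id": 2, "uname": "j3", "age": 20},
}
# Secondary index on uname: uname -> primary key.
idx_uname = {row["uname"]: pk for pk, row in clustered.items()}

def query(columns, uname):
    """Return (result, number_of_index_trees_visited)."""
    pk = idx_uname.get(uname)
    if pk is None:
        return None, 1
    index_entry = {"uname": uname, "id": pk}   # what the secondary index holds
    if set(columns) <= set(index_entry):
        # Covered: every requested column is already in the index entry.
        return {c: index_entry[c] for c in columns}, 1
    row = clustered[pk]                        # not covered: back to the table
    return {c: row[c] for c in columns}, 2

covered = query(["id", "uname"], "j3")
not_covered = query(["id", "uname", "age"], "j3")
print(covered, not_covered)
```

Asking only for id and uname is answered from one tree; adding age forces the trip back to the clustered index.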

5.4 Leftmost match

Before talking about leftmost matching, a few terms: primary key (usually one column) -> composite primary key (more than one column) -> composite index (an index that contains multiple columns)

Assume a composite index on (name, age):

SELECT * FROM table WHERE name = ? AND age = ?; -- uses the index
SELECT * FROM table WHERE name = ?; -- uses the index (leftmost column present)
SELECT * FROM table WHERE age = ?; -- cannot use the index for a lookup (leftmost column missing)
SELECT * FROM table WHERE age = ? AND name = ?; -- uses the index: MySQL's internal optimizer reorders the conditions
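
Why the leftmost column matters becomes clear if a composite index on (name, age) is pictured as a list of tuples sorted by name first and age second. The Python sketch below uses made-up entries and illustrative function names.

```python
from bisect import bisect_left

# A composite index on (name, age): entries sorted by name, then by age.
index = sorted([("ann", 30), ("bob", 19), ("bob", 25), ("cat", 19), ("dan", 40)])

def by_name(name):
    """Leftmost column known: binary search jumps to the first match."""
    i = bisect_left(index, (name,))
    out = []
    while i < len(index) and index[i][0] == name:
        out.append(index[i])
        i += 1
    return out

def by_name_and_age(name, age):
    """Both columns known: binary search on the full key."""
    i = bisect_left(index, (name, age))
    if i < len(index) and index[i] == (name, age):
        return [index[i]]
    return []

def by_age(age):
    """Leftmost column missing: ages are not globally sorted, so scan everything."""
    return [entry for entry in index if entry[1] == age]

ages_in_index_order = [age for _, age in index]
print(by_name("bob"), by_name_and_age("bob", 25), by_age(19))
print(ages_in_index_order)
```

The ages come out as 30, 19, 25, 19, 40: they are ordered only within each name, so a condition on age alone has nothing to binary search on.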

5.5 Index pushdown

Index condition pushdown is enabled by default in MySQL 5.7 (it was introduced in MySQL 5.6). An example, assuming a composite index on (name, age):

SELECT * FROM table WHERE name = ? AND age = ?;

MySQL is layered: client (for example JDBC), server layer, storage engine (where the data is stored).

Without index pushdown: the storage engine returns the rows that match name, and the server layer then filters them by age.

With index pushdown: the storage engine filters on both name and age and returns only the matching rows.

Analysis: the benefit of index pushdown is fewer rows passed up. Say 50 rows match name and only 10 of them also match age. Without pushdown, the server receives 50 rows and filters them down to 10; with pushdown, the storage engine does the filtering and hands over just the 10.
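
The 50-versus-10 example can be simulated in Python. This is a toy model of the two layers, not MySQL's actual code path: the data is made up so that 50 rows match the name and 10 of those also match the age, and the function names are illustrative.

```python
# Index entries on (name, age) plus the primary key; 50 rows are named "tom"
# and 10 of them have age 20 (made-up data matching the 50/10 example).
index_entries = [("tom", 20 if i < 10 else 30, i) for i in range(50)]
table = {pk: {"id": pk, "name": n, "age": a} for n, a, pk in index_entries}

def without_pushdown(name, age):
    # The engine matches on name only and hands every such row to the server.
    handed_over = [table[pk] for n, _, pk in index_entries if n == name]
    result = [r for r in handed_over if r["age"] == age]  # server-side filter
    return result, len(handed_over)

def with_pushdown(name, age):
    # The engine checks age inside the index before fetching the row.
    handed_over = [table[pk] for n, a, pk in index_entries
                   if n == name and a == age]
    return handed_over, len(handed_over)

r1, fetched1 = without_pushdown("tom", 20)
r2, fetched2 = with_pushdown("tom", 20)
print(len(r1), fetched1, len(r2), fetched2)
```

Both paths return the same 10 rows, but without pushdown 50 rows cross from the storage engine to the server layer, and with pushdown only 10 do.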
