Mysql supports only one join algorithm: nested-loop join, but nested-loop join has three variants:

(Note: Please refer to the public id: InsideMySQL)

Principle:

1. Simple Nested – Loop Join:

In the figure below, R is the driver table and S is the matching table. It can be seen that R1, R2 and…… are extracted from R respectively , Rn to match the left and right columns of THE S table, and then merge the data, the S table is accessed rn times, which costs a lot to the database



2.Index nested-loop Join:

This requires an index on the undriven table (matching table S), which can be used to reduce comparisons and speed up queries.

In the query, the driver table (R) will search according to the index of the associated field, find the corresponding value on the index, and then query back to the table, that is, only when the index is matched, the query back to the table will be carried out.

If the associated key of an undriven table (S) is a primary key, the performance is very high. If the associated key of an undriven table (S) is a primary key, the performance is slower than that of an index (s) that is a primary key.



3. Block Nested – Loop Join:

If there is an index, the second method is selected for join, but if the join column does not have an index, Block nested-loop join is used.

We can see that there is a join buffer buffer in the middle, which is to cache all join-related columns of the driver table into the join buffer first, and then perform batch matching with the matching table. The first multiple comparison is merged into one, which reduces the access frequency of the non-driver table (S).

By default, join_BUFFer_size =256K. MySQL caches all required columns into the join buffer during lookup, including select columns, not just associated columns. A SQL with N JOIN associations allocates n-1 JOIN buffers at execution time.




Example:

Let’s say I have two tables A and B

A structure: comments_id bigInt(20) P for_comments_if mediumInt (9) product_id int(11) order_id int(11)Copy the code

Id int(11) p comments_id bigInt(20) product_id int(11)...Copy the code

Where the association for B has comments_id, so there is an index.

1. Join:

SELECT * FROM a gc
JOIN b gcf ON gc.comments_id=gcf.comments_id
WHERE gc.comments_id =2056Copy the code

Index nested-loop Join is used to filter the primary key of the driven table A and get one, then seek the Index of the non-driven table B and expect to get one data.

Indexes are not used in the following cases:

SELECT * FROM a gc
JOIN b gcf ON gc.order_id=gcf.product_idCopy the code

Block nested-loop Join is used. If table B has little data, it acts as the driver table and caches the data required by table B into the Join buffer to scan table A in batches

2. Left to join:

SELECT * FROM a gc
LEFT JOIN b gcf ON gc.comments_id=gcf.comments_idCopy the code

Index nested-loop Join is used, because there is no filtering condition, a table is selected as the driver table to Join, and indexes of non-driven tables are associated.

If you add conditions

SELECT * FROM b gcf
LEFT JOIN a gc ON gc.comments_id=gcf.comments_id
WHERE gcf.comments_id =2056Copy the code

One is filtered from the driver table to match the non-driver table.


Left JOIN: saves the left table data, and displays null if the right table does not have relevant data

Fight Join: the right table is saved, and null is displayed if the left table has no related data

Inner join: partial primary and secondary tables, resulting in the smallest set of two tables matching on conditions