0 index

  • The execution order of the JOIN statement

  • INNER/LEFT/RIGHT/FULL JOIN

  • The difference between ON and WHERE

1 overview

A complete SQL statement is broken into clauses that generate a virtual table (VT), but only the last virtual table is returned as a result. With this in mind, let’s try to understand the execution process of a JOIN query and answer some common questions.

If the execution results of different JOIN before have no idea, can combine this article: https://www.codeproject.com/Articles/33052/Visual-Representation-of-SQL-Joins to look down.

2 JOIN execution order

Here is the general structure of a JOIN query:


     
  1. SELECT <row_list>

  2.  FROM <left_table>

  3.    <inner|left|right> JOIN <right_table>

  4.      ON <join condition>

  5.        WHERE <where_condition>

Copy the code

It is executed in the following order (the first FROM clause is always executed in an SQL statement) :

  • FROM: Perform the Cartesian product on the left and right tables to produce the first table VT1. The number of rows is n*m (n is the number of rows in the left table and m is the number of rows in the right table

  • ON: Filters VT1 line by line based ON the conditions of ON and inserts the result into VT2

  • Select * from vt2 where LEFT OUTER JOIN(LEFT OUTER JOIN); select * from vt3 where LEFT OUTER JOIN(LEFT OUTER JOIN); The same is true if the RIGHT JOIN is specified. If an INNER JOIN is specified, external rows are not added, and the above insert process is ignored, vt2=vt3.

  • WHERE: The rows that meet the conditions are output to VT4

  • SELECT: fetch the specified field from VT4 to vt5

Here is an example to illustrate the above process (this is not a good practice, but just to illustrate the join syntax).

3, for example,

Create a user information table:


     
  1. CREATE TABLE `user_info` (

  2.  `userid` int(11) NOT NULL,

  3.  `name` varchar(255) NOT NULL,

  4.  UNIQUE `userid` (`userid`)

  5. ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4

Copy the code

Create another user balance table:


     
  1. CREATE TABLE `user_account` (

  2.  `userid` int(11) NOT NULL,

  3.  `money` bigint(20) NOT NULL,

  4. UNIQUE `userid` (`userid`)

  5. ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4

Copy the code

Import some random data:


     
  1. select * from user_info;

  2. +--------+------+

  3. | userid | name |

  4. +--------+------+

  5. |   1001 | x    |

  6. |   1002 | y    |

  7. |   1003 | z    |

  8. |   1004 | a    |

  9. |   1005 | b    |

  10. |   1006 | c    |

  11. |   1007 | d    |

  12. |   1008 | e    |

  13. +--------+------+

  14. 8 rows in set (0.00 sec)

  15. select * from user_account;

  16. +--------+-------+

  17. | userid | money |

  18. +--------+-------+

  19. | | | 22 1001

  20. | | | 30 1002

  21. | | | 8 1003

  22. | 1009 | |

  23. +--------+-------+

  24. 4 rows in set (0.00 sec)

Copy the code

A total of 8 users have user names and 4 users have balances in their accounts.

Select * from user where username = ‘1003’ and username = ‘1003’


     
  1. SELECT i.name, a.money

  2.  FROM user_info as i

  3.    LEFT JOIN user_account as a

  4.      ON i.userid = a.userid

  5.        WHERE a.userid = 1003;

Copy the code
Step 1: Execute the FROM clause to perform the Cartesian product on the two tables

Cartesian product operation will return after the combination of two pieces of all rows in the table and left the userinfo table for eight, right table useraccount has four lines, the generated virtual table vt1 is 8 * 4 = 32 rows:


     
  1. SELECT * FROM user_info as i LEFT JOIN user_account as a ON 1;

  2. +--------+------+--------+-------+

  3. | userid | name | userid | money |

  4. +--------+------+--------+-------+

  5. |   1001 | x    |   1001 |    22 |

  6. |   1002 | y    |   1001 |    22 |

  7. |   1003 | z    |   1001 |    22 |

  8. |   1004 | a    |   1001 |    22 |

  9. |   1005 | b    |   1001 |    22 |

  10. |   1006 | c    |   1001 |    22 |

  11. |   1007 | d    |   1001 |    22 |

  12. |   1008 | e    |   1001 |    22 |

  13. |   1001 | x    |   1002 |    30 |

  14. |   1002 | y    |   1002 |    30 |

  15. |   1003 | z    |   1002 |    30 |

  16. |   1004 | a    |   1002 |    30 |

  17. |   1005 | b    |   1002 |    30 |

  18. |   1006 | c    |   1002 |    30 |

  19. |   1007 | d    |   1002 |    30 |

  20. |   1008 | e    |   1002 |    30 |

  21. |   1001 | x    |   1003 |     8 |

  22. |   1002 | y    |   1003 |     8 |

  23. |   1003 | z    |   1003 |     8 |

  24. |   1004 | a    |   1003 |     8 |

  25. |   1005 | b    |   1003 |     8 |

  26. |   1006 | c    |   1003 |     8 |

  27. |   1007 | d    |   1003 |     8 |

  28. |   1008 | e    |   1003 |     8 |

  29. |   1001 | x    |   1009 |    11 |

  30. |   1002 | y    |   1009 |    11 |

  31. |   1003 | z    |   1009 |    11 |

  32. |   1004 | a    |   1009 |    11 |

  33. |   1005 | b    |   1009 |    11 |

  34. |   1006 | c    |   1009 |    11 |

  35. |   1007 | d    |   1009 |    11 |

  36. |   1008 | e    |   1009 |    11 |

  37. +--------+------+--------+-------+

  38. 32 rows in set (0.00 sec)

Copy the code
Step 2: Execute the ON clause to filter out the rows that do not meet the criteria

ON i.userid = a.userid Vt2 is filtered as follows:


     
  1. +--------+------+--------+-------+

  2. | userid | name | userid | money |

  3. +--------+------+--------+-------+

  4. |   1001 | x    |   1001 |    22 |

  5. |   1002 | y    |   1002 |    30 |

  6. |   1003 | z    |   1003 |     8 |

  7. +--------+------+--------+-------+

Copy the code
Step 3: JOIN adds external rows

LEFT JOIN inserts the rows of the LEFT table that are not present in VT2 into vt2 and fills the remaining fields of each row with NULL, the same as RIGHT JOIN.

In this example, a LEFT JOIN is used, so generate table vt3 will be added to the remaining rows of the LEFT table user_info:


     
  1. +--------+------+--------+-------+

  2. | userid | name | userid | money |

  3. +--------+------+--------+-------+

  4. |   1001 | x    |   1001 |    22 |

  5. |   1002 | y    |   1002 |    30 |

  6. |   1003 | z    |   1003 |     8 |

  7. |   1004 | a    |   NULL |  NULL |

  8. |   1005 | b    |   NULL |  NULL |

  9. |   1006 | c    |   NULL |  NULL |

  10. |   1007 | d    |   NULL |  NULL |

  11. |   1008 | e    |   NULL |  NULL |

  12. +--------+------+--------+-------+

Copy the code
Step 4: WHERE condition filtering

WHERE a.user ID = 1003 create table vt4:


     
  1. +--------+------+--------+-------+

  2. | userid | name | userid | money |

  3. +--------+------+--------+-------+

  4. |   1003 | z    |   1003 |     8 |

  5. +--------+------+--------+-------+

Copy the code
Step 5: SELECT

SELECT i.name., a.money to generate vt5:


     
  1. +------+-------+

  2. | name | money |

  3. +------+-------+

  4. | z    |     8 |

  5. +------+-------+

Copy the code

The virtual table VT5 is returned to the client as the final result.

Now that we have covered the process of union tables, let’s look at the differences between common joins.

4 INNER/LEFT/RIGHT/FULL JOIN

  • INNER JOIN… ON… : returns all rows that match the left and right tables (because only step 2 ON filtering is performed above, step 3 does not add external rows)

  • LEFT JOIN… ON… : returns all rows in the left table. If there is no matching row in the right table, the right table column is set to NULL in the new table

  • RIGHT JOIN… ON… : returns all rows of the right table. If there is no matching row in the left table, the left table column is set to NULL in the new table

INNER JOIN

Select * from vt3 where LEFT JOIN is replaced by INNER JOIN; select * from vt3 where LEFT JOIN is replaced by INNER JOIN;

                        
     
  1. +--------+------+--------+-------+

  2. | userid | name | userid | money |

  3. +--------+------+--------+-------+

  4. | 1001 | x | 1001 | 22 |

  5. | 1002 | y | 1002 | 30 |

  6. | 1003 | z | 1003 | 8 |

  7. +--------+------+--------+-------+

Copy the code
RIGHT JOIN

If LEFT JOIN is replaced by RIGHT JOIN, the generated table vt3 is as follows:


     
  1. +--------+------+--------+-------+

  2. | userid | name | userid | money |

  3. +--------+------+--------+-------+

  4. |   1001 | x    |   1001 |    22 |

  5. |   1002 | y    |   1002 |    30 |

  6. |   1003 | z    |   1003 |     8 |

  7. |   NULL | NULL |   1009 |    11 |

  8. +--------+------+--------+-------+

Copy the code

Because userId =1009 exists in userAccount and userInfo does not exist in userInfo, insert the following line in step 3:


     
  1. |   NULL | NULL |   1009 |    11 |

Copy the code
FULL JOIN

LEFT JOIN + UNION + RIGHT JOIN; LEFT JOIN + UNION + RIGHT JOIN;


     
  1. SELECT *

  2.  FROM user_info as i

  3.    RIGHT JOIN user_account as a

  4.      ON a.userid=i.userid

  5. union

  6. SELECT *

  7.  FROM user_info as i

  8.    LEFT JOIN user_account as a

  9.      ON a.userid=i.userid;

Copy the code

It returns the following result:


     
  1. +--------+------+--------+-------+

  2. | userid | name | userid | money |

  3. +--------+------+--------+-------+

  4. |   1001 | x    |   1001 |    22 |

  5. |   1002 | y    |   1002 |    30 |

  6. |   1003 | z    |   1003 |     8 |

  7. |   NULL | NULL |   1009 |    11 |

  8. |   1004 | a    |   NULL |  NULL |

  9. |   1005 | b    |   NULL |  NULL |

  10. |   1006 | c    |   NULL |  NULL |

  11. |   1007 | d    |   NULL |  NULL |

  12. |   1008 | e    |   NULL |  NULL |

  13. +--------+------+--------+-------+

Copy the code

Ps: In fact, we can see that there is no difference between LEFT JOIN and RIGHT JOIN semantically. The difference between the two results depends on the order in which the LEFT and RIGHT tables are placed.

RIGHT JOIN works analogously to LEFT JOIN. To keep code portable across databases, it is recommended that you use LEFT JOIN instead of RIGHT JOIN.

So when you’re debating whether to use LEFT JOIN or RIGHT JOIN, use only LEFT JOIN when possible.

5 the difference between ON and WHERE

The difference between ON and WHERE is easy to understand now that the order of JOIN execution is clear.

For example:


     
  1. SELECT *

  2.  FROM user_info as i

  3.    LEFT JOIN user_account as a

  4.      ON i.userid = a.userid and i.userid = 1003;

Copy the code

     
  1. SELECT *

  2.  FROM user_info as i

  3.    LEFT JOIN user_account as a

  4.      ON i.userid = a.userid where i.userid = 1003;

Copy the code

In the first case, after executing the second step ON clause, select the rows that meet i.usserid = A.usserid and I.usserid = 1003 to generate table VT2, and then execute the third step JOIN clause to add the external rows to the virtual table to generate vt3, which is the final result:

                                
     
  1. vt2 :

  2. +--------+------+--------+-------+

  3. | userid | name | userid | money |

  4. +--------+------+--------+-------+

  5. | 1003 | z | 1003 | 8 |

  6. +--------+------+--------+-------+

  7. vt3 :

  8. +--------+------+--------+-------+

  9. | userid | name | userid | money |

  10. +--------+------+--------+-------+

  11. | 1001 | x | NULL | NULL |

  12. | 1002 | y | NULL | NULL |

  13. | 1003 | z | 1003 | 8 |

  14. | 1004 | a | NULL | NULL |

  15. | 1005 | b | NULL | NULL |

  16. | 1006 | c | NULL | NULL |

  17. | 1007 | d | NULL | NULL |

  18. | 1008 | e | NULL | NULL |

  19. +--------+------+--------+-------+

Copy the code

In the second case, after executing the second step ON clause, LEFT JOIN selects the rows that meet i.userid = A. user ID and generates table vt2. Add the external row generation table vt3 by executing the third step JOIN clause. Then execute the fourth step WHERE clause, then filter vt3 table to generate vt4, obtain the final result:

                                    
     
  1. vt2 :

  2. +--------+------+--------+-------+

  3. | userid | name | userid | money |

  4. +--------+------+--------+-------+

  5. | 1001 | x | 1001 | 22 |

  6. | 1002 | y | 1002 | 30 |

  7. | 1003 | z | 1003 | 8 |

  8. +--------+------+--------+-------+

  9. vt3 :

  10. +--------+------+--------+-------+

  11. | userid | name | userid | money |

  12. +--------+------+--------+-------+

  13. | 1001 | x | 1001 | 22 |

  14. | 1002 | y | 1002 | 30 |

  15. | 1003 | z | 1003 | 8 |

  16. | 1004 | a | NULL | NULL |

  17. | 1005 | b | NULL | NULL |

  18. | 1006 | c | NULL | NULL |

  19. | 1007 | d | NULL | NULL |

  20. | 1008 | e | NULL | NULL |

  21. +--------+------+--------+-------+

  22. vt4 :

  23. +--------+------+--------+-------+

  24. | userid | name | userid | money |

  25. +--------+------+--------+-------+

  26. | 1003 | z | 1003 | 8 |

  27. +--------+------+--------+-------+

Copy the code

If you replace the LEFT JOIN in the above example with an INNER JOIN, the result is the same whether you place the conditional filter in ON or WHERE, because the INNER JOIN does not perform the third step to add the external row.


     
  1. SELECT *

  2.  FROM user_info as i

  3.    INNER JOIN user_account as a

  4.      ON i.userid = a.userid and i.userid = 1003;

Copy the code

     
  1. SELECT *

  2.  FROM user_info as i

  3.    INNER JOIN user_account as a

  4.      ON i.userid = a.userid where i.userid = 1003;

Copy the code

The return result is:


     
  1. +--------+------+--------+-------+

  2. | userid | name | userid | money |

  3. +--------+------+--------+-------+

  4. |   1003 | z    |   1003 |     8 |

  5. +--------+------+--------+-------+

Copy the code

The resources

  • MySQL Technology Insider: SQL Programming

  • SQL Joins – W3Schools

  • SQL – What is the difference between “INNER JOIN” and “OUTER JOIN”?

  • MySQL :: MySQL 8.0 Reference Manual :: 13.2.10.2 JOIN Syntax

  • Visual Representation of SQL Joins

  • Join (SQL) – Wikipedia


Welcome to follow SegmentFault wechat official account 🙂