Note: MySQL in this article, unless otherwise noted, refers to the open source community edition of MySQL.

Huawei cloud database before the new version released, will face a series of rigorous testing rules, in addition to the requirements of all the test cases by MySQL, still need to pass by huawei millions of richer, more press close to the user business scenario test cases to construct network test, in order to fully validate the stability of the new version whether meet the user’s classic scene.

It was during this rigorous validation process that we discovered a potential Bug in MySQL.

The Bug description

Test environment: Test MySQL 8.0.22, MySQL 8.0.26, and Huawei Cloud GaussDB(for MySQL) respectively based on the same test case and data set. Test statement:

select  
  subq_0.c2 as c0
from 
  (select  
        ref_6.C_STATE asc0, 
        case whenref_6.C_PHONE is not NULL then ref_5.C_ID else ref_5.C_ID end
           asc1, 
        floor(
          ref_3.c_id)as c2
      from 
       sqltester.t0_hash_partition_p1_view as ref_0
             right join sqltester.t4 as ref_1
             on (EXISTS (
                 select  
                     ref_1.c_middle as c0
                   from 
                     sqltester.t1 as ref_2
                   where ((false) 
                       and ((true) 
                         or (true))) 
                     or (false)
                   ))
            innerjoin sqltester.t0_range_key_subpartition_sub_view as ref_3
            on(EXISTS (
               select  
                   ref_0.c_credit as c0, 
                   ref_1.c_street_1 as c1, 
                   ref_4.c_credit_lim as c2, 
                   ref_3.c_credit as c3
                 from 
                   sqltester.t0_hash_partition_p1 as ref_4
                 where true
                 ))
          left joinsqltester.t10 as ref_5
            innerjoin sqltester.t11 as ref_6
            on(true)
          on (((pi() isnot NULL)) 
             and (false))
      where (((ref_5.C_D_ID isnot NULL) 
            or(ref_3.c_middle is not NULL)) 
      )) as subq_0
where (EXISTS (
          select  
             subq_0.c0 as c0, 
             pi() as c1, 
             ref_11.c_street_1 as c2, 
             ref_11.c_discount as c3, 
             pi() as c4
           from 
             sqltester.t0_partition_sub_view_mixed_001 as ref_11))
group by 1
order by 1;
Copy the code

Result: As shown in the following figure, MySQL 8.0.22, MySQL8.0.26 and Huawei cloud GaussDB(for MySQL) return inconsistent results, indicating a Bug, as shown in red.

Bug analysis

First determine which execution result is correct.The current execution plan for this statement is a Hash Join, which was introduced in MySQL8.0, suggesting that the open source version may have problems. Next, we will verify both the mature version of MySQL and the non-mysql database.Verification process:

  • The result is the same as GaussDB(for MySQL) but different from MySQL 8.0.
  • The PostgreSQL verification result is the same as MySQL 5.6 and GaussDB(for MySQL), but different from MySQL 8.0 or later.

This confirms that there are problems with MySQL 8.0 and later. So what causes this Bug? 1. Simplify the query for later analysis. After repeated verification, the query is simplified as follows:

SELECT count(*) FROM (SELECT 1 FROM sqltester.t4 AS ref_1 INNER JOIN sqltester.t4 AS ref_3 ON (EXISTS (SELECT 1 FROMsqltester.t4 AS ref_4 WHERE TRUE )) LEFT JOIN sqltester.t10 AS ref_5 ON (FALSE) WHERE (((ref_5.C_D_ID IS NOT NULL) OR (ref_3.c_middle IS NOT NULL))))AS subq_0 The execution plan IS AS follows: -> Aggregate: count(0) (cost=2.75 rows=0) -> Filter: ((ref_5.c_d_id is not null) or(ref_3.c_middle is NULL) (cost=2.75 rows=0) -> Inner Hash JOIN (no condition) (cost=2.75 Rows =0) -> Index scan on ref_3 using ndx_C_middle (cost=0.13 rows=50) -> Hash -> Inner Hash join (no condition) (cost=1.50 rows=0) -> Index scan on ref_1 using ndx_c_id (cost=6.25 rows=50) -> Hash -> Left Hash join (no condition) (cost = 0.25 rows = 0) - > Limit: 1 row(s) (cost=312.50 rows=1) ->Index scan on ref_4 using ndx_c_id (cost=312.50 rows=50) -> Hash -> Zero Rows Impossible (filter) (cost = 0.00.. 0.00 rows = 0)Copy the code

As you can see from the execution plan above, ref_5 is optimized by the optimizer to Zero Rows, and ref_5 is the inner table of the Left Hash Join. The inner table, which is the Left Join, needs to generate NULL rows to Join the outer table if the inner table has no record of a match condition (Impossible condition here, which means the Join condition is always False). MySQL 8.0.22 MySQL 8.0.22 MySQL 8.0.22 MySQL 8.0.22 MySQL 8.0.22 MySQL 8.0.22

SELECT count(*) FROM (SELECT 1 FROM sqltester.t4 AS ref_1 INNER JOIN sqltester.t4 AS ref_3 ON (EXISTS (SELECT 1 FROM sqltester.t4 AS ref_4 WHERE TRUE )) LEFT JOIN sqltester.t10 AS ref_5 ON (FALSE) WHERE (((ref_5.C_D_ID IS NOT NULL) or(ref_3.c_middle IS NOT NULL))))AS subq_0; + + + + | count (*) | | 2500 | + + 1 row in the set (0.00 SEC)Copy the code

Ref_3.c_middleis NULL (ref_3.C_MIDDLEis NULL) Now the Where condition contains only one condition (ref_5.c_d_idis NOT NULL), which requires the current query to filter out all join records that ref_5 does NOT match. The SQL statement and execution result are as follows:

SELECT count(*) FROM (SELECT 1 FROM sqltester.t4 AS ref_1 INNER JOIN sqltester.t4 AS ref_3 ON (EXISTS (SELECT 1 FROM sqltester.t4 AS ref_4 WHERE TRUE )) LEFT JOIN sqltester.t10 AS ref_5 ON (FALSE) WHERE (((ref_5.C_D_ID IS NOT NULL))))assubq_0; + + + + | count (*) | | 2500 | + + 1 row in the set (0.01 SEC)Copy the code

By comparing the statements and execution results before and after the modification, you can see that the execution result is NOT related to the condition (ref_3.C_middle is NULL) but only to the condition (ref_5.c_d_id is NOT NULL). Normally for ref_5 tables, ref_5 is optimized to Zero Rows because of the Impossible condition. So if only the condition (ref_5.c_d_id IS NOT NULL) IS left, the normal result would be an empty set (count returns 0). But now the open source version’s result set is not, which again shows that the open source version has a problem. For a Left Join, if Join conditions do not match, the inner table needs to be set to NULL rows to Join the outer surfaces. The execution plan is ZeroRows, which means MySQL 8.0 is using ZeroRowsIterator. Actuator needs to call ZeroRowsIterator: : SetNullRowFlag to set Nullflag. 4. Run GDB to check whether the Settings are correct:

Breakpoint 1, ZeroRowsIterator::SetNullRowFlag(this=0x7f92a413d510, is_null_row=false) at /mywork/mysql-sql/sql/basic_row_iterators.h:398 398 assert(m_child_iterator ! = nullptr); (gdb) n 399 m_child_iterator->SetNullRowFlag(is_null_row); (gdb) s std::unique_ptr<RowIterator,Destroy_only<RowIterator> >::operator-> (this=0x7f92a413d520) The at/opt/Simon/Taurus/mysql - root/SRC/tools/GCC - 9.3.0 / include/c + + / 9.3.0 / bits/unique_ptr. H: 355 355 returnget (); (gdb) fin Run till exit from #0 std::unique_ptr<RowIterator,Destroy_only<RowIterator> >::operator-> ( This = 0 x7f92a413d520) at/opt/Simon/Taurus/mysql - root/SRC/tools/GCC - 9.3.0 / include/c + + / 9.3.0 / bits/unique_ptr. H: 355 ZeroRowsIterator::SetNullRowFlag (this=0x7f92a413d510,is_null_row=false) at/home/simon/mywork/mysql-sql/sql/basic_row_iterators.h:399 399 m_child_iterator->SetNullRowFlag(is_null_row); Value returned is $1 = (RowIterator *) 0x7f92a413d4d0 (gdb) s TableRowIterator::SetNullRowFlag (this=0x7f92a413d4d0,is_null_row=false) at/home/simon/mywork/mysql-sql/sql/records.cc:229 229 if(is_null_row) { (gdb) n 232 m_table->reset_null_row(); (gdb) 234 }Copy the code

From the point of view of the above the GDB, breakpoint using ZeroRowsIterator: : SetNullRowFlag table Nullflag set to False. The following GDB information also proves this. Can be determined, the cause of this Bug is: ZeroRowsIterator: : SetNullRowFlag set to False here is not correct. Because if the ZeroRowsIterator: : SetNullRowFlag set to False, it will lead to the table for ZeroRows Left Join generated within the table not NULL result set.

How to solve

**** Now that the above Bug analysis is very clear, it is relatively easy to fix. Just need to ZeroRowsIterator: : SetNullRowFlag always set to True. Because ZeroRowIterator can only produce two results, either an empty set or a NULL row as an outer joined inner table. MySQL 8.0.26 MySQL 8.0.26 MySQL 8.0.26

You can see from the returned results that the query was correct, which means the problem was fixed. To ensure the reliability of huawei cloud GaussDB, you must pass multiple rounds of rigorous test cases before releasing each product. After discovering problems, the Huawei cloud database team carefully determines and analyzes the problems, fixes bugs, and solves problems in the first place to ensure data security and accuracy of service results. Huawei cloud database team gathers more than 50% database kernel experts in the industry to ensure customer business security in real time with professional technology and help enterprise business security on the cloud!