There are two main sources of good or bad SQL:

  • 1. At the database level: it depends on how the optimizer accesses and processes the data
  • 2, from the business aspect: this SQL is not a good SQL in the business

We take Oracle 11g as an example for analysis.

I. Access mode of data

[No index]

If a table is not indexed, the optimizer will use different data access methods depending on how Oracle accesses the table. Here are two examples:

  • 1. Parallel access
  • 2. Multi-data block access

[Indexed]

In the case of index building, there will be different data access methods, mainly including the following five:

  • 1, Index unique scan
  • Index range scan
  • Index full scan
  • Index fast Full Scan
  • Index skip scan

Second, data processing methods

There are several data access methods listed above. In fact, operations such as sorting ORDER by, grouping group by, and counting count are all used in our daily development. However, in addition to these basic operation methods, we generally also conduct join processing for tables. For connection, there are several cases:

  • 1. Nested loop join

  • 2, Hash join

  • 3, Sort merge join

Next, we use test cases to verify the above three join data processing methods, and test SQL with the following example:

Drop table nestedLoopTest1; drop table nestedLoopTest1; drop table nestedLoopTest2 ; -- Create table nestedLoopTest create table nestedLoopTest1 (id NUMBER(11)); commit; create table nestedLoopTest2 ( id NUMBER(11) ); commit; BEGIN FOR I IN 0.. 100 LOOP INSERT INTO nestedLoopTest1(id) VALUES(i); END LOOP; END; commit; BEGIN FOR i IN 0.. 100 LOOP INSERT INTO nestedLoopTest2(id) VALUES(i); END LOOP; END; commit; - 1.hashSelect * from t2 where id (select * from t2 where id (select * from t2 where ID (select * from T2 where ID)))Copy the code
  • 1.hash join(Hash join)

We continue with the following SQL:

select t1.* from nestedLoopTest1 t1,nestedLoopTest2 t2 where t1.id=t2.id;
Copy the code

Table nestedLoopTest1 and table nestedLoopTest2 are not indexed, so we can see the following execution plan:

The execution steps are shown in the figure above. We can see that both tables are scanned in full and then Hash Join is performed again. The principle of Hash Join will be introduced separately later. Hash Join loads small tables into memory and then associates large tables with small tables

  • 2,nested loop join(Internal nested loop connection)

We continue with the following SQL:

create index nestedLoopTest2index on nestedLoopTest2(id);
select t1.* from nestedLoopTest1 t1,nestedLoopTest2 t2 where t1.id=t2.id;
Copy the code

NestedLoopTest2index (nestedLoopTest2index); nestedLoopTest2index (nestedLoopTest2index); Because one record in table T1 may correspond to multiple records in table T2. TABLE ACCESS BY INDEX ROWID (ROWID, ROWID, ROWID

For loop nested connections, we can imagine two for loops nested.

In addition, when both tables are indexed, we continue to execute the following SQL:

create index nestedLoopTest1index on nestedLoopTest1(id);
select t1.* from nestedLoopTest1 t1,nestedLoopTest2 t2 where t1.id=t2.id;
Copy the code

Table T1 is no longer a full table scan, but a full index scan.

  • 3,sort merge join(Merge sort join)

The basic principle of this link mode is to judge whether the original table is sorted. If not, the associated fields are sorted. Determine whether the associated table is sorted, if not sorted, then sort, and finally merge the two sorted tables.

Oracle Execution plan

To understand how Oracle data is accessed and processed, we need to look at the execution plan, but the execution plan only tells us so much.

We login sqlPlus, take a simple SQL for illustration, a brief description of how we should understand the execution plan:

select * from emp;

Emp table is a built-in oracle employee table. Right-click Explain Plan or press F5 to view the execution Plan, as shown below:

The most important column in the execution plan is the Description column, which lists the execution steps of the SQL. This column has the following viewing rules:

  • 1. Under different levels, the step closer to the right is carried out first;
  • 2. In the case of the same level, the higher the result, the earlier the execution;

SQL > select * from emP where TABLE ACCESS is FULL; It doesn’t help to have an index, because we’re just trying to extract the whole table, and it doesn’t make sense to have an index, which also tells us that tables with indexes are not necessarily efficient, as we’ll talk about later.

Cost is a metric used by the Oracle optimizer to measure how expensive the SQL execution is, such as how many CPU resources it consumes.

Here are some examples of SQL accessed through indexes

Index unique scan

Empno is the primary key of the EMP table, and it is a unique index.

select * from emp where empno=7782

The oracle optimizer will perform a UNIQUE INDEX SCAN. After the INDEX is scanned, the value of the INDEX is 7782. Then the Oracle optimizer must fetch the corresponding block from the data file. TABLE ACCESS BY INDEX ROWID is used to query data in the TABLE ACCESS BY INDEX ROWID.

Actually go to fetch the data access to the data block of this step is sometimes not, that is when you just want to take the number empno rather than *, execution plan won’t go to fetch the data in the data file, step 2 is not had, because oracle directly from the index scan to return index directly after this value to go, don’t need to get the data, We don’t need to check as follows:

select empno from emp where empno=7782

In fact, we usually do not write such SQL, haha ~~~

Index range scan

Suppose we execute the following SQL statement:

select job from emp where empno>7782

Its implementation plan is as follows:

According to the execution plan, the execution mode of this type of SQL statement is index range scan.

Index Full scan

A full index scan can be used to determine the result of a full index scan, as in the following SQL statement:

select count(*) from emp

We counted the number of all data in the whole table and directly read the number of index data blocks. In Step 2, we summed up the demerits in step 1 and returned a total number.

Index Fast Full Scan

Let’s copy and rename a table with the following statement:

create table emp1 as (select * from emp); truncate table emp1; Insert 1,000,000 data BEGIN FOR I IN 0.. 1000000 LOOP INSERT INTO EMP1(EMPNO,ENAME) VALUES( I,CONCAT('TBL',I));
END LOOP;
END;
Copy the code

SQL > select index fast full scan, index fast full scan, index fast full scan, index fast full scan

Index Fast Full Scan differs from Index Full Scan in that the former can read multiple data blocks at a time, similar to parallel, while the latter reads data in serial.

SQL statements that use this type of execution are typically those where the result of the execution can be determined directly by the index. For example, we execute the following SQL:

Index Skip Scan

Index Skip scan is an index scan method provided after Oracle 9I. It is mainly used to solve the disadvantage that ACCESS TABLE FULL scan is adopted by default when non-leading column query is used for WHERE conditions in composite indexes. However, there are some limitations to using this feature, including the following:

  • 1. Less unique values in the leading column of the composite index (with many duplicate values)
  • 2. The database uses CBO optimizer, and the tables and indexes are analyzed
  • 3. There is no combined index leading column in the query condition

Next, we mainly verify two questions:

The corresponding SQL statement for the test is as follows:

Drop table student; commit; Create table Student (stuno NUMBER(11), stuname VARCHAR2(20), schoolno NUMBER(11), age NUMBER(3)); commit; Create index stucombIndex on student(stuname,schoolno); commit; Stuname (stuname) select * from student t (stunno) select * from student t (stunno) select * from student t (stunnowheret.schoolno=100; BEGIN FOR I IN 0.. 1000000 LOOP INSERT INTO student(stuno,stuname,schoolno) VALUES( i,'TBL',i); END LOOP; END; commit; Update student t update student tset t.stuname=concat('s',t.stuno) wheremod(t.stuno,10000)=0; commit; select count(*),count(distinct stuname) from student; Analyze table student compute Statisticsfor table for all columns forall indexes; Index skip scan select * from studentwhere schoolno = 1000;
Copy the code

1. In composite indexes, when non-leading columns are used for queries, the optimizer usesACCESS TABLE FULLA full table scan

SQL > alter table scan (); SQL > alter table scan ();

2. Verify index skip scan

And when do we create composite indexes? Consider the following scenarios:

  • 1. When a single – condition query is performed, more data is returned
  • 2. When the conditions are met, less data is returned

If and only if condition 1 and condition 2 are true, we can create a composite index at this point. For example, in the employee table, age=28 is a very large number of people, and role=Java programmer is a very large number of people. But age=28 and role=Java programmers return very little data!

Where XXX = XXX and xxx1=xxx1

6. Application scenarios

1, Oracle 10 million level large table paging query

Traditional Oracle pagination uses the following structure:

select *
from (
select fundacco,rownum rowno from
tbl_20191231
where rownum <= #{end}) b
where
b.rowno > #{start}
Copy the code

At that time, as start becomes larger and larger, the amount of data traversed by the outer layer sub-query will be more and more, and the actual production verification will be very slow. The amount of data is 500W, and 250 items per page. When START is larger than 200W, the average time is 1-2s.

How do you optimize it? We cannot optimize at the Oracle SQL level, but we can create a unique index by adding a new column, Rownos, with monotonically increasing values. We then run the following SQL query very quickly, averaging around 20ms.

select * from tbl_20191231 t where t.rownos > #{start} and t.rownos <= #{end} 

Copy the code

It’s not hard to see that we’re just using Oracle’s Index range scan feature. The implementation plan is as follows: