I recently came across a discussion of the SQL generated by partition by and the indexes it uses. Since I did not know much about the topic, I decided to spend some time learning it, and this article is the result.

Setting up environment and preparing sample data

This article uses the PostgreSQL database, version 12.8.

Create table

There are three ways to connect to the database:

1. Connect to the PostgreSQL database using psql. psql is an interactive terminal program provided by PostgreSQL; with it you can perform many operations, such as executing SQL statements and managing database objects. Check the version number with (select version();).

2. Use pgAdmin to connect to the PostgreSQL database

The second way to connect to the database is to use the pgAdmin GUI application. Using the pgAdmin GUI application, you can interact with the PostgreSQL database server through an intuitive user interface.

3. Connect to the PostgreSQL database through another management tool, such as Navicat or DBeaver.

After connecting to the default database postgres, create the course table and insert the sample data.

CREATE TABLE public.course (
	id int8 NOT NULL GENERATED BY DEFAULT AS IDENTITY,
	language_id int8 NOT NULL,
	"name" varchar(100) NOT NULL,
	"level" int4 NULL,
	created_date timestamp NULL,
	"version" varchar(50) NULL,
	total_lessons int4 NULL,
	country varchar(100) NOT NULL,
	CONSTRAINT course_pkey PRIMARY KEY (id)
);

Sample data can be created in a CSV file and imported into a table.

Next, we use group by and partition by, respectively, to group the results by the specified column, and use aggregate functions such as avg(), min(), and max() to calculate the required values.

Comparing group by with partition by

For demonstration purposes, we want to find the following values in the course table (they have no practical significance):

  • The minimum language ID per country;
  • The maximum language ID per country;
  • The average language ID per country;

Group by

The group by execution syntax is as follows:

SELECT expression, aggregate function ()
FROM tables
WHERE conditions
GROUP BY expression

Next, write our SQL statement:

select
	country ,
	min(language_id) as minLanguageId,
	avg(language_id) as avgLanguageId,
	max(language_id) as maxLanguageId
from
	course
group by
	country ;

The result is shown in the following figure. After grouping, there are only 10 rows of data, one per country. The query output is unordered by default.

If we want to add the language_id column to the query result, the query looks like this:

select
	country ,
	language_id ,
	min(language_id) as minLanguageId,
	avg(language_id) as avgLanguageId,
	max(language_id) as maxLanguageId
from
	course
group by
	country ;

After executing this query, we receive an error message. In SQL, a column may appear in the SELECT list only if it is also used in the GROUP BY clause; any column in the SELECT clause that is not part of the GROUP BY clause is not allowed (aggregate functions such as count, sum, min, and so on aside).

ERROR: column "course.language_id" must appear in the GROUP BY clause or be used in an aggregate function

At this point, we can consider using Partition by to solve this problem.

In addition, when we use order by, the same error is reported if the ordering column does not appear in the group by clause.

select
	country ,
	min(language_id) as minLanguageId,
	avg(language_id) as avgLanguageId,
	max(language_id) as maxLanguageId
from
	course
group by
	country
order by level asc;

The corrected statement is as follows, and the execution result contains 41 rows.

select
	country ,
	level,
	min(language_id) as minLanguageId,
	avg(language_id) as avgLanguageId,
	max(language_id) as maxLanguageId
from
	course
group by
	country,level
order by level asc;

Partition by

The execution statement is:

select
	country ,
	language_id ,
	min(language_id) over(partition by country) as minLanguageId,
	avg(language_id) over(partition by country) as avgLanguageId,
	max(language_id) over(partition by country) as maxLanguageId
from
	course;

The following figure shows the query results. By default, the results are in ascending order of the partition column.

Note that here we can add columns that do not belong to the partition by clause.

Let's try the order by syntax here too. There are two ways to write it, with different meanings.

Method 1:

select
	country ,
	language_id ,
	level ,
	min(language_id) over(partition by country order by level asc) as minLanguageId,
	avg(language_id) over(partition by country order by level asc) as avgLanguageId,
	max(language_id) over(partition by country order by level asc) as maxLanguageId
from
	course;

The execution result is as follows:

OVER(PARTITION BY ... ORDER BY ...) here effectively groups by country and level and then sorts by level: with an ORDER BY in the window, the default frame runs from the start of the partition up to the current row's peers, so the aggregates become running values per (country, level).

Method 2:

select
	country ,
	language_id ,
	level ,
	min(language_id) over(partition by country) as minLanguageId,
	avg(language_id) over(partition by country) as avgLanguageId,
	max(language_id) over(partition by country) as maxLanguageId
from
	course
order by country,level asc;

The execution result is as follows:

This method partitions only by country, but sorts the final result set by country and level; as the results show, the aggregate values differ from those of Method 1.

Differences

The differences between partition by and group by are as follows:

1. Group by returns one record per group, so the number of rows returned equals the number of groups; partition by returns all records in the table.

2. Group by collapses each group into a single row; partition by, on the other hand, gives every record in the same partition an aggregate column with the same value.

3. The select list of a group by query can contain only the grouped columns and aggregate functions; with partition by, any column can be returned.

4. Group by groups strictly by the fields in the group by clause; a window definition can contain both a partition by clause and an order by clause.

5. A group by query is unordered by default; a partition by query is in ascending order of the partition field by default.

Partition by usage

The format for this function is as follows:

SELECT *, aggregate function over(partition by expression)
FROM tables

Over keywords:

  • The over keyword turns a function into a window function rather than an aggregate function. The SQL standard allows all aggregate functions to be used as window functions, and the over keyword distinguishes the two usages.
  • Options are often added in parentheses after the over keyword to change the scope of the window in which the aggregation is performed.
  • If the option in parentheses after the over keyword is empty, the windowing function aggregates all the rows in the result set.

There are many window functions, and we will demonstrate them one by one.

Unlike the group by clause, the partition by clause creates partitions independently of the result set, used only for the aggregate computation, and partitions created by different window functions do not affect each other.

Multiple windowing functions can be used simultaneously in the same SELECT statement without interfering with each other.
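As a quick illustration (a sketch against the course table above, not from the original article), two window functions with different partition clauses can coexist in one query, each computing over its own partitioning:

```sql
-- each window function below defines its own independent
-- partitioning of the same result set
select
	country,
	level,
	language_id,
	avg(language_id) over(partition by country) as avgByCountry,
	count(*) over(partition by country, level) as cntByCountryLevel
from
	course;
```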

row_number()

select
	country ,
	language_id ,
	row_number() over(partition by country) as rn,
	row_number() over() as rownum
from
	course;

The following information is displayed:

row_number() over(partition by country) numbers the rows within each partition, while row_number() over() numbers all records in the result set.

rank()

select
	country ,
	language_id ,
	level,
	rank() over(partition by country order by level) as rn,
	row_number() over() as rownum
from
	course;

The execution result is as follows:

Within the same country, rows with the same level are ranked equally (tied), so subsequent rn values skip ahead.

dense_rank()

select
	country ,
	language_id ,
	level,
	dense_rank () over(partition by country order by level) as rn,
	row_number() over() as rownum
from
	course;

The result is as follows:

Unlike rank(), dense_rank() leaves no gaps after ties: rows with the same level share a rank, and the next distinct level gets the next consecutive rank.

Other functions

  • count(): counts the specified column within each partition; every record in the same partition gets the same count.

  • max(): gets the maximum value of the specified column within each partition.

  • min(): gets the minimum value of the specified column within each partition.

  • sum(): gets the sum of the specified column within each partition.

  • avg(): gets the average value of the specified column within each partition.

  • first_value(): gets the first value of the specified column within each partition.

  • last_value(): gets the last value of the specified column within each partition.
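One caveat worth showing (a sketch against the course table, not from the original article): with an ORDER BY in the window, the default frame ends at the current row's peers, so last_value() does not see the whole partition unless the frame is widened explicitly:

```sql
select
	country,
	level,
	first_value(language_id) over(partition by country order by level) as firstId,
	-- widen the frame so last_value() really covers the whole partition
	last_value(language_id) over(
		partition by country order by level
		rows between unbounded preceding and unbounded following
	) as lastId
from
	course;
```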

lag() returns, within each partition defined by the partition by clause, the value of the specified column from the previous row; for the first row of a partition the result is null.

select
	country ,
	language_id ,
	level,
	lag (language_id) over(partition by country order by level) as languageIdSum,
	row_number() over() as rownum
from
	course;

The result is as follows:

Note that although the statement above both partitions (by country) and orders (by level), only the country field acts as the partition boundary for lag(); the ordering merely determines which row counts as "previous". Compare with the result of the following SQL statement.

select
	country ,
	language_id ,
	level,
	lag (language_id) over(partition by country, level) as languageIdSum,
	row_number() over() as rownum
from
	course;

lag() takes three parameters. The first is the column name (mandatory); the second is the offset, which defaults to 1; the third is the default value returned when the offset falls outside the window, which defaults to null.

The following is an example:

select
	country ,
	language_id ,
	level,
	lag (language_id,2,'9999') over(partition by country order by level) as languageIdSum,
	row_number() over() as rownum
from
	course;

The execution result is as follows:

lead() is the opposite of lag(): it takes the value from the next row, and for the last row in a partition the result is null.
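A minimal sketch (not from the original article) mirroring the earlier lag() query:

```sql
select
	country,
	language_id,
	level,
	-- value of language_id from the next row within the country partition;
	-- null for the last row of each partition
	lead(language_id) over(partition by country order by level) as nextLanguageId
from
	course;
```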

That covers the common usage of partition by. While writing these queries, I also wondered whether they hit any index. You may be familiar with MySQL's explain command; PostgreSQL likewise provides an explain command to view the execution plan, but the output is quite different. So let's look at the explain command in PostgreSQL.

Explain command

In MySQL, we use the explain command to view the execution plan and confirm whether an SQL statement hits an index, in order to troubleshoot slow SQL.

PostgreSQL also supports the explain command, which serves the same purpose, but its output differs from MySQL's and the execution plan content is simpler.

A simple example

explain select * from product p;
                          QUERY PLAN
-------------------------------------------------------------
 Seq Scan on product p  (cost=0.00..629.27 rows=5527 width=648)
(1 row)

Because this query has no WHERE clause, it must scan all the rows of the table, so the planner has chosen to use a simple sequential scan plan.

Analysis of output results:

  • Seq Scan on product p has two parts: Seq Scan means a full table scan (sequential scan), which is the slowest kind of query when the data volume is large; in that case you should consider optimizing the table structure or the SQL. product is the table being queried.
  • cost=0.00..629.27 rows=5527 width=648 gives, respectively, the plan node's estimated startup cost, estimated total cost, estimated row count, and average row width (in bytes).

    • cost consists of two numbers, 0.00 and 629.27. The first is the estimated startup cost, the cost required to obtain the first row; with a WHERE condition this value may be non-zero. The second is the estimated total cost of returning all the data. The calculation of this value is traceable, and I'll come back to it below.
    • rows is the estimated number of rows returned; rows=5527 in the example means 5527 rows are expected.
    • width is the average row width, 648 bytes in this example.

To see how the cost value of 629.27 is calculated, first execute the following statement:

postgres=# SELECT relpages, reltuples FROM pg_class WHERE relname = 'product';
 relpages | reltuples
----------+-----------
      574 |    5527.0
(1 row)

According to the results, the product table has 574 disk pages and 5527 rows. The estimated cost is (disk pages read * seq_page_cost) + (rows scanned * cpu_tuple_cost). By default, seq_page_cost is 1.0 and cpu_tuple_cost is 0.01, so the estimated cost is (574 * 1.0) + (5527 * 0.01) = 629.27.
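As a sanity check (a sketch; the result depends on your configuration), the same figure can be computed directly from the planner settings:

```sql
-- reproduce the planner's estimate from pg_class and the cost parameters
SELECT relpages * current_setting('seq_page_cost')::numeric
     + reltuples::numeric * current_setting('cpu_tuple_cost')::numeric AS estimated_cost
FROM pg_class
WHERE relname = 'product';
-- with the defaults this yields 574 * 1.0 + 5527 * 0.01 = 629.27
```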

Syntax

ANALYZE

The ANALYZE option actually executes the SQL, so you get the real execution plan of the command. Because the statement really ran, you can see how long each step of the execution plan took, as well as the number of rows it actually returned.

explain(ANALYZE) select * from product p where p.id =30243;
                                QUERY PLAN
--------------------------------------------------------------------------------
 Index Scan using product_pkey on product p  (cost=0.28..8.30 rows=1 width=648) (actual time=0.015..0.016 rows=1 loops=1)
   Index Cond: (id = 30243)
 Planning Time: 0.060 ms
 Execution Time: 0.034 ms

Actual time values are measured in milliseconds of real time, while cost estimates are expressed in arbitrary units, so they are likely to differ. The actual row count may also differ from the estimated rows in cost. loops=1 indicates that the node was executed only once.

Planning Time is the time spent generating the query plan.

Execution Time is the actual SQL execution time, excluding the time spent generating the query plan.

BUFFERS

When the BUFFERS option is TRUE, information about buffer usage is shown; the default is FALSE, and it can only be used together with the ANALYZE option. The buffer information covers shared blocks (regular table and index blocks), local blocks (temporary table and index blocks), and temp blocks (short-term data used in sorts or hashes), reporting for each the number of blocks hit, read, dirtied, and written.

explain(ANALYZE,BUFFERS) select * from product p where p.id =30243;
 Index Scan using product_pkey on product p  (cost=0.28..8.30 rows=1 width=648) (actual time=0.030..0.031 rows=1 loops=1)
   Index Cond: (id = 30243)
   Buffers: shared hit=6
 Planning Time: 0.311 ms
 Execution Time: 0.031 ms

Indexes

PostgreSQL provides several index types: B-tree, Hash, GiST, SP-GiST, GIN, and BRIN. Each index type is better suited to certain query types because each uses a different index structure. By default, the CREATE INDEX command creates a B-tree index, which fits most cases.

B-tree is suited to equality and range queries over data that can be stored in sort order. In particular, the PostgreSQL query planner considers a B-tree index whenever an indexed column is compared using one of the <, <=, =, >=, > operators. Constructs equivalent to combinations of these operators, such as BETWEEN and IN, can also be implemented with a B-tree index search. Likewise, an IS NULL or IS NOT NULL condition on an indexed column can be used with a B-tree index.
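For instance (a sketch against the course table; the planner's actual choice still depends on statistics), all of the following predicates are candidates for a B-tree index scan on the primary key column id:

```sql
-- each WHERE clause below is B-tree-indexable on the id column
select * from course where id = 42;
select * from course where id between 10 and 20;
select * from course where id in (1, 2, 3);
select * from course where id is not null;
```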

PostgreSQL provides five indexes: unique index, primary key index, multi-attribute index, partial index, and expression index.

  • Unique index: used to enforce the uniqueness of a column value or a combination of column values. Currently, only B-tree indexes can be declared unique. When an index is declared unique, no two table rows may have the same index value; null values are not considered equal to each other. Note: there is no need to manually create an index on a unique column; doing so would simply duplicate the automatically created index.

  • Primary key index: A unique index is automatically created for the primary key column to implement the primary key constraint. A primary key index is a special type of unique index.

  • Multi-attribute index: a composite (multicolumn) index defined on multiple columns. Currently, only the B-tree, GiST, GIN, and BRIN index types support multicolumn indexes, and up to 32 columns can be specified (this limit can be changed in the source file pg_config_manual.h, but changing it requires recompiling PostgreSQL).

  • Partial index: An index built on a subset of a table defined by an expression (an expression is a predicate of a partial index) that contains only those tuples of the table that satisfy this predicate.

    create index stu_name_idx on student(name) where id>1 and id<255;
  • Expression index: An index column is not necessarily a column of the underlying table, but can also be a function or scalar expression computed from one or more columns of the table. This feature is useful for quickly retrieving the contents of a table based on computed results. Let’s say we index the lowercase result of a field:

    CREATE INDEX test1_lower_col1_idx ON test1 (lower(col1));

    When querying, write the indexed expression on the left of the = exactly as it appears in the index definition, and keep any computation on the constant side, to avoid missing the index.
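To illustrate (a sketch using the test1 table above), the first query can use the expression index while the second cannot, because its left-hand expression does not match the indexed one:

```sql
-- matches the indexed expression lower(col1): the index can be used
select * from test1 where lower(col1) = 'abc';

-- does not match the indexed expression: falls back to a sequential scan
select * from test1 where upper(col1) = 'ABC';
```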

When a table is scanned, the user usually cannot dictate whether a sequential scan or an index scan is chosen (some databases offer hints, but PostgreSQL does not); the path selection is done by the database's query optimizer module.

Scan types

Sequential scan

Sequential scan, also known as full table scan, is the most basic scanning mode. Its algorithm complexity is O(N), where N represents the number of tuples in the entire table, that is, all tuples in a table will be read. If the amount of data is large, this query mode is the slowest, and you need to optimize it by building indexes.

explain(ANALYZE,BUFFERS) select * from product p where p.id =30243;
                          QUERY PLAN
-------------------------------------------------------------
 Seq Scan on product p  (cost=0.00..643.09 rows=1 width=648) (actual time=0.012..0.916 rows=1 loops=1)
   Filter: (id = 30243)
   Rows Removed by Filter: 5526
   Buffers: shared hit=574
 Planning Time: … ms
 Execution Time: 0.937 ms
  • Filter: (id = 30243) indicates a filter operation on the Seq Scan node: every row is checked during the full table scan, with the filter condition id = 30243.
  • Rows Removed by Filter: 5526 shows how many rows were discarded by the filter.
  • Buffers: shared hit=574 means 574 blocks were hit in the shared buffer cache; this belongs to the Seq Scan node's buffer information and is displayed only when the BUFFERS option of the EXPLAIN command is enabled.

Index Scan

explain(ANALYZE,BUFFERS) select * from product p where p.id =30243;
                                QUERY PLAN
--------------------------------------------------------------------------------
 Index Scan using product_pkey on product p  (cost=0.28..8.30 rows=1 width=648) (actual time=0.017..0.018 rows=1 loops=1)
   Index Cond: (id = 30243)
   Buffers: shared hit=3
 Planning Time: 0.066 ms
 Execution Time: 0.035 ms

Index Scan indicates an Index Scan. Index Cond: (ID = 30243) Indicates the condition of the Index scan.

As you can see, with the same condition on the same table, the scan is much faster when the index is used, and Buffers: shared hit=3 shows that far fewer blocks are read than with the sequential scan.

Index Only Scan

An Index Only Scan is a covering-index scan: every column the query returns is already contained in the index, so there is no need to go back to the table.

explain(ANALYZE,BUFFERS) select p.id from product p where p.id =30243;
 Index Only Scan using product_pkey on product p  (cost=0.28..8.30 rows=1 width=8) (actual time=0.033..0.034 rows=1 loops=1)
   Index Cond: (id = 30243)
   Heap Fetches: 1
   Buffers: shared hit=4
 Planning Time: … ms
 Execution Time: 0.048 ms

Note: an index scan is not always performed just because an index exists. Even if the column in the WHERE condition is indexed, the planner may decide that a full table scan is cheaper and run that instead, as the following example shows:

explain(ANALYZE,BUFFERS) select * from course c where c.id =6039;
 Seq Scan on course c  (cost=0.00..6.89 rows=1 width=69) (actual time=0.007..0.023 rows=1 loops=1)
   Filter: (id = 6039)
   Rows Removed by Filter: 150
   Buffers: shared hit=5
 Planning Time: 0.130 ms
 Execution Time: 0.038 ms

Bitmap scanning

Bitmap scanning is classified into BitmapIndex Scan and BitmapHeap Scan.

BitmapIndex Scan is similar to Index Scan in that it is an index-based Scan, but the BitmapIndex Scan node returns a bitmap instead of a tuple each time it is executed, where each bitmap represents a scanned chunk of data. BitmapHeap Scan generally acts as the parent node of BitmapIndex Scan and converts the bitmap returned by BitmapIndex Scan to the corresponding tuple. The biggest advantage of this method is that random reads of Index Scan are converted to physical reads of data blocks, which greatly improves Scan performance when there is a large amount of data.

The bitmap scan path is traversed depth-first, as shown in the following figure. BitmapIndex Scan builds a bitmap in memory of the rows or blocks that satisfy the condition; once the index scan finishes, BitmapHeap Scan uses the bitmap to read the corresponding data from the table's data file. If two indexes are used, the bitmaps they produce can first be combined into one by AND or OR calculations, and then the data is read from the table's data file.

The BitmapAnd operation (two index conditions joined by AND) and the BitmapOr operation (two index conditions joined by OR) expand the original "flat", single-scan index path into a "tree-shaped", multi-scan bitmap path. The leaf nodes of this tree must be IndexPath nodes, but here an IndexPath is responsible for producing a bitmap rather than scanning the heap table; the heap table scan is left to BitmapHeapPath.

In PostgreSQL, scheduled execution uses a top-down approach. It first performs the Bitmap Heap Scan node, but the node doesn’t have any data yet, so it asks its children (in this case, Bitmap Index Scan). Once the bitmap index scan returns the bitmap, the bitmap heap scan continues processing.

Let’s start with a simple example:

explain(ANALYZE,BUFFERS) select * from product p where p.id <3243;
                                QUERY PLAN
------------------------------------------------------------------------------
 Bitmap Heap Scan on product p  (cost=22.79..534.94 rows=324 width=648) (actual time=0.075..0.293 rows=324 loops=1)
   Recheck Cond: (id < 3243)
   Heap Blocks: exact=99
   Buffers: shared hit=105
   ->  Bitmap Index Scan on product_pkey  (cost=0.00..22.71 rows=324 width=0) (actual time=0.023..0.023 rows=324 loops=1)
         Index Cond: (id < 3243)
         Buffers: shared hit=6
 Planning Time: 0.311 ms
 Execution Time: 0.311 ms

As you can see from the plan, the first node is a Bitmap Heap Scan, and its child is a Bitmap Index Scan. The steps are as follows:

  1. A bitmap heap scan requests a tuple from a bitmap index scan.
  2. The bitmap index scan scans the index based on the condition (id < 3243) in almost the same way as a normal index scan. But instead of returning the TIDs corresponding to the heap data (a TID consists of a page number and an offset within that page), it adds those TIDs to a bitmap. (The PostgreSQL data storage structure will be covered in a dedicated article later.)
  3. The Bitmap Heap Scan then reads the Bitmap to get the Heap data corresponding to the page number and offset stored. It then checks visibility, eligibility, and so on, and returns a tuple based on the results of all these checks.

In addition, it can also be seen from the actual time of Bitmap Heap Scan and BitmapIndex Scan that the cost of Bitmap Heap Scan includes Bitmap index Scan.

Let’s look at some interesting cases.

Let’s add another WHERE condition to the above statement.

explain select * from product p where p.id <3243 and "type" ='FREE';
                                QUERY PLAN
------------------------------------------------------------------------------
 Bitmap Heap Scan on product p  (cost=22.71..535.67 rows=5 width=648)
   Recheck Cond: (id < 3243)
   Filter: ((type)::text = 'FREE'::text)
   ->  Bitmap Index Scan on product_pkey  (cost=0.00..22.71 rows=324 width=0)
         Index Cond: (id < 3243)
(5 rows)

The new condition "type" = 'FREE' reduces the estimated output rows, but not the cost, because we still have to visit the same set of rows. Note that the type clause cannot be used as an index condition, since the index is built only on the id column; it is applied as a filter on the rows retrieved via the index. The cost therefore actually increases slightly to reflect this extra checking.

If we add the limit condition.

explain select * from product p where p.id <3243 limit 100;
                                QUERY PLAN
------------------------------------------------------------------------------
 Limit  (cost=0.28..168.20 rows=100 width=648)
   ->  Index Scan using product_pkey on product p  (cost=0.28..544.33 rows=324 width=648)
         Index Cond: (id < 3243)
(3 rows)

Although the statement above uses an index scan, changing the limit to, say, 300 brings back the bitmap scan. The exact limit value at which the chosen scan type changes has to be tested and depends on the environment configuration.

Filter

Whether a full table scan or an index scan, conditional filtering may be used, that is, adding where conditions.

explain select * from product p where "type" ='FREE';
                         QUERY PLAN
--------------------------------------------------------------
 Seq Scan on product p  (cost=0.00..643.09 rows=89 width=648)
   Filter: ((type)::text = 'FREE'::text)
(2 rows)

If the column of the WHERE condition has an index, it is possible to perform a full table scan with conditional filtering, a bitmap scan, and an index scan.

explain select * from product p where p.id <32243;
                         QUERY PLAN
--------------------------------------------------------------
 Seq Scan on product p  (cost=0.00..643.09 rows=5527 width=648)
   Filter: (id < 32243)
(2 rows)

explain select * from product p where p.id <3243;
                                QUERY PLAN
------------------------------------------------------------------------------
 Bitmap Heap Scan on product p  (cost=22.79..534.94 rows=324 width=648)
   Recheck Cond: (id < 3243)
   ->  Bitmap Index Scan on product_pkey  (cost=0.00..22.71 rows=324 width=0)
         Index Cond: (id < 3243)
(4 rows)

explain select * from product p where p.id <362;
                                QUERY PLAN
--------------------------------------------------------------------------------
 Index Scan using product_pkey on product p  (cost=0.28..6.27 rows=1 width=648)
   Index Cond: (id < 362)
(2 rows)

Conclusion

  • In most cases, an Index Scan is faster than a Seq Scan. However, if the result set makes up a large fraction of the table, an Index Scan is not as fast as a direct full table scan, because it scans the index first and then reads the table data.
  • If the result set is a small fraction of the table but the number of tuples is large, a Bitmap Index Scan may perform better than an Index Scan.
  • If the result set can be covered entirely by the index, an Index Only Scan usually gives the best performance, since it reads only the index without touching the table data. However, if the visibility map (VM) file has not been generated, its performance may be worse than an Index Scan.

Extension

Why is bitmap scanning faster than index scanning when retrieving a larger percentage of PostgreSQL tables?

A normal index scan retrieves one tuple pointer from the index at a time and immediately accesses that tuple in the table. A bitmap scan retrieves all the tuple pointers from the index at once, sorts them using an in-memory "bitmap" data structure, and then visits the table tuples in physical tuple-position order. This increases the locality of references to the table, at the cost of managing the bitmap data structure and of no longer retrieving the rows in index order. If you want to sort by the indexed column, the planner therefore selects a plain index scan.

PostgreSQL stores data in pages, and each row is called a tuple. How does an index locate a row record? An index entry has two parts: (key=xxx, TID=(block=xxx, offset=xxx)). key is the actual indexed value, and the TID is the pointer to the data row: block is the page number, and offset is the position of the row's line pointer within that page, which in turn points to the real tuple in the data page.
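You can inspect this pointer yourself (a quick sketch; the exact values depend on your data layout): every row exposes its TID through the system column ctid, shown as (block, offset):

```sql
-- ctid is the (page number, line pointer offset) of each row's current location
select ctid, id from product limit 3;
```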

For example, the heap pointer for the first index entry i1 might be {blkno-5, offset=20}, for i2 {blkno-1, offset=30}, for i3 {blkno-8, offset=40}, and so on. As shown in the figure below, with a normal index scan the index itself is ordered, but the block numbers that successive index entries point to are neither consecutive nor ordered.

The bitmap produced by a bitmap index scan contains this TID data. Although the bitmap needs extra space to store the TIDs, the stored TIDs are sorted, i.e. {blkno-1, offset=30}, {blkno-5, offset=20}, {blkno-8, offset=40}, and so on. Once the block list is sorted, the subsequent heap scan uses only sequential I/O, and we know sequential I/O is better than random I/O.

For example, let's slightly modify the bitmap scan statement above and see whether a bitmap scan is still used.

explain(ANALYZE,BUFFERS) select * from product p where p.id <3243 order by p.id;
 Index Scan using product_pkey on product p  (cost=0.28..544.33 rows=324 width=648) (actual time=0.008..0.300 rows=324 loops=1)
   Index Cond: (id < 3243)
   Buffers: shared hit=369
 Planning Time: 0.08 ms
 Execution Time: 0.391 ms

As you can see, if you want to sort by ID, it is better to take advantage of the orderliness of the index, so it is cheaper to use index scanning here.

References

PostgreSQL12 index

PostgreSQL books link

Bitmap scanning in PostgreSQL

Why is bitmap scanning faster than index scanning when retrieving a larger percentage of PostgreSQL tables?