In a PostgreSql database, tables with btree indexes are added. The data insertion speed is very slow on a basis of 180 million data. The insertion performance deteriorates rapidly as the height of the index tree becomes higher and higher.

This article uses PostgreSql table partitioning to optimize table partitioning.

Compiled by Visual C++ build 1914, 64-bit

Delete primary tables, child tables, and sequences

DROP TABLE IF EXISTS test_table CASCADE ;
DROP SEQUENCE IF EXISTS test_table_seq;
Copy the code

Create a primary table and sequence. There is no need to add indexes to the primary table. The primary table does not store actual data

CREATE SEQUENCE test_table_seq MINVALUE 1 MAXVALUE 999999999999;
CREATE TABLE test_table (
  "id" int8 NOT NULL DEFAULT nextval('test_table_seq'::regclass),
  "data_id" numeric,
  "logtime" numeric,
  "data_value" numeric
);
Copy the code

Logtime is a millisecond timestamp. The table is partitioned according to logTime, converted to month, and divided into different tables by month

Create the primary table trigger function

CREATE OR REPLACE FUNCTION test_table_trigger()
  RETURNS TRIGGER AS $$
Declare
	now_data VARCHAR;
	test_table_name VARCHAR;
	sql_text VARCHAR;
BEGIN
  now_data = substring(cast(to_timestamp(NEW.logtime/1000) as varchar),1.10);
	now_data = REPLACE(now_data, The '-'.'_');
	test_table_name = 'test_table_' || SUBSTRING(now_data,7); 
	sql_text = 'INSERT INTO ' || test_table_name || ' VALUES ( $1.* )';
	EXECUTE sql_text USING NEW;
  RETURN NULL;
END;
$$
LANGUAGE plpgsql;
Copy the code

The primary table sets the trigger

CREATE TRIGGER test_table_trigger
BEFORE INSERT ON test_table
FOR EACH ROW EXECUTE PROCEDURE test_table_trigger();
Copy the code

Create sub-table create function, batch generate all sub-tables by year, add check to the sub-table to check the logTime range and inherit the primary table, then add index to the sub-table

Create or replace function creat_test_table(table_year integer) RETURNS BOOLEAN as
$BODY$
Declare
	test_table_name VARCHAR;
	test_table_index_name VARCHAR;
	start_time_text VARCHAR;
	end_time numeric;
	start_time numeric;
	time1 VARCHAR;
	time2 VARCHAR;
	sql_text VARCHAR;
	row_result RECORD;
BEGIN
	start_time_text = table_year || '- 01-01';
	start_time = EXTRACT(epoch FROM CAST(start_time_text AS TIMESTAMP)) * 1000 - 28800000.;
  time1 = table_year || '- 02-01';
	time2 = (table_year+1) || '- 01-01';
  sql_text = 'select EXTRACT(epoch FROM cast(generate_series as TIMESTAMP))*1000-28800000 as bb from generate_series(to_date('||quote_literal(time1)||', '||quote_literal('yyyy-mm-dd')||'),to_date('||quote_literal(time2)||', '||quote_literal('yyyy-mm-dd')||'), '||quote_literal('1 month')||' ::interval)';
	FOR row_result IN EXECUTE sql_text LOOP
		end_time = row_result.bb;
		test_table_name = 'test_table_'||SUBSTRING(replace(to_char(to_timestamp(start_time/1000), 'yyyy-mm-dd'), The '-'.'_'),0.8);
		raise notice 'creat table %',test_table_name;
		IF (select count(*) from pg_class where relname = test_table_name) ! =0 THEN
			CONTINUE;
		end if;	
		test_table_index_name = test_table_name || '_idx';
		sql_text = 'CREATE Table '||test_table_name||' (CHECK(logtime>= ' || start_time || ' and logtime< ' ||end_time || ')) INHERITS(test_table); ';
		EXECUTE sql_text;
		sql_text = 'CREATE UNIQUE INDEX '||test_table_index_name||' ON ' ||test_table_name||' USING btree ( "data_id" ASC , "logtime" ASC ); ';
		EXECUTE sql_text;
		start_time = end_time;
	END LOOP;
	RETURN TRUE;
END
$BODY$
LANGUAGE 'plpgsql' VOLATILE;
Copy the code

Enable constraint exclusion and maintain partitions

set constraint_exclusion = on;
set enable_partition_pruning = on;
Copy the code

Run the subtable creation function to create all subtables for 2021

SELECT creat_test_table(2021);
Copy the code

Thirteen tables and two functions appear after execution

Execute insert statement to test 1630857600000 as 2021-09-06 00:00:00

INSERT INTO test_table(data_id, logtime, data_value) VALUES (1.1630857600000.1);
Copy the code

This record would normally be inserted into the test_TABLE_2021_09 table

Verify test_table_2021_09 table

SELECT * FROM test_table_2021_09;
Copy the code

The table partition can be inserted properly

Since we set the trigger, the where clause containing the logTime condition should be optimized by the optimizer at query time

Let’s start validating the query statement by writing a query statement that contains the logTime condition in the WHERE clause to see its query plan

EXPLAIN SELECT * FROM test_table where logtime = 1630857600000;
Copy the code

Knowing that the query has been optimized, he matches which subtables to retrieve data from, rather than scanning all the subtables

Compare a query plan that scans all subtables. The query statement does not contain the logTime condition and therefore cannot be optimized by the optimizer

EXPLAIN SELECT * FROM test_table;
Copy the code

A PG database table partition belongs to a physical table and cannot be logically separated. You only need to perform operations on the main table to manage all data

Matters needing attention:

  1. When a conditional statement contains logtime, do not write an expression such as logtime/1000>1630857600000. This may not be optimized by the optimizer, and all sub-tables will be scanned during the query. When the number of tables increases, the query efficiency will be seriously affected.
  2. Do not write complex expressions in check to create child tables, preferably in the simplest format (field name conditional value), which contains calculations, or other expressions that will make optimizer optimization ineffective and thus scan all child tables during query, affecting query efficiency.
  3. There is no need to add indexes to the primary table because there is no data in the primary table.
  4. Checks in the primary table are inherited from the quilt table.
  5. You can create a child table by adding logic to the trigger that checks for the existence of the table name that you want to insert into the child table. Or write stored procedures that generate child tables uniformly at a fixed time.