ClickHouse: MergeTree

“This is the fifth day of my participation in the November Gwen Challenge. See details of the event: The Last Gwen Challenge 2021”.

The most powerful table engine in ClickHouse is the MergeTree (MergeTree) engine and others in the family (*MergeTree), which supports indexing and partitioning and is comparable to innodb to Mysql.

MergeTree

Build table statements

create table order_table
(
    id           UInt32,
    item_id       String,
    total_amount Decimal(16.2),
    create_time  Datetime

) engine = MergeTree
      partition by toYYYYMMDD(create_time) primary key (id)
order by (id,item_id);
Copy the code

Insert data

insert into order_table
values (101.'i_001'.1000.00.'the 2020-06-01 12:00:00'),
       (102.'i_002'.2000.00.'the 2020-06-01 11:00:00'),
       (102.'i_004'.2500.00.'the 2020-06-01 12:00:00'),
       (102.'i_002'.2000.00.'the 2020-06-01 13:00:00'),
       (102.'i_002'.12000.00.'the 2020-06-01 13:00:00'),
       (102.'i_002'.600.00.'the 2020-06-02 12:00:00');
Copy the code

Parameters,

MergeTree has many more parameters (most of them are default values), but three parameters are more important and cover a lot of MergeTree concepts.

Partition by (Optional)

role

The purpose of partitioning is to reduce the scan scope and optimize the query speed. If you do not fill in a partition, only one partition will be used.

Partition directory

MergeTree consists of column files, index files, and table definition files, but if partitioned, these files will be stored in different partitioned directories.

parallel

After partitioning, ClickHouse processes query statistics that involve cross-partitions in parallel on a partition basis.

Data write and partition merge

A write to any batch of data will create a temporary partition that will not be incorporated into any existing partition. At some point (10-15 minutes later) ClickHouse automatically merges the data from the temporary partition into the existing partition (you can’t wait to execute it manually with Optimize).

optimize table xxxx final;
Copy the code

For example,

Insert again:

insert into order_table
values (101.'i_001'.1000.00.'the 2020-06-01 12:00:00'),
       (102.'i_002'.2000.00.'the 2020-06-01 11:00:00'),
       (102.'i_004'.2500.00.'the 2020-06-01 12:00:00'),
       (102.'i_002'.2000.00.'the 2020-06-01 13:00:00'),
       (102.'i_002'.12000.00.'the 2020-06-01 13:00:00'),
       (102.'i_002'.600.00.'the 2020-06-02 12:00:00');
Copy the code

The view data is not incorporated into any partitions

Manual Optimize

optimize table order_table final;
Copy the code

The query again

Primary Key primary key (Optional)

The primary key in ClickHouse, unlike other databases, provides only a primary index of the data, but is not the only constraint. This means that data with the same primary key can exist.
The primary key is set based on where conditions in the query statement.
Based on the conditions, corresponding index Granularity can be located by binary search on primary keys, avoiding full table scan.
Index granularity: Refers to the interval of data corresponding to two adjacent indexes in a sparse index. The default MergeTree in ClickHouse is 8192. It is not recommended to change this value unless the column has a large number of duplicate values, such as one different data in tens of thousands of rows in a partition.

Sparse index:

The advantage of sparse indexes is that you can locate more data with less index data, at the cost of locating only the first row of index granularity, and then scanning bit by bit.

Order by(Required)

Order by sets the order in which fields the data in a partition is stored in order.
Order BY is the only mandatory item in MergeTree, even more important than the primary key, because when the user does not set the primary key, a lot of processing is done by the order BY field (such as de-replay and summary).
Requirements: The primary key must be a prefix field to the ORDER BY field. If the order by field is (ID,item_id) then the primary key must be either ID or (id,item_id)

Hop index (secondary index)

Set allow_experimental_datA_SKIpping_indices =1; set allow_experimental_datA_skipping_indices =1; 2, create test table

create table order_table2
(
    id           UInt32,
    item_id       String,
    total_amount Decimal(16.2),
    create_time  Datetime,
    INDEX        a total_amount TYPE minmax GRANULARITY 5
) engine = MergeTree
      partition by toYYYYMMDD(create_time) primary key (id)
order by (id, item_id);
Copy the code

GRANULARITY N is the GRANULARITY of the second-level index to the first-level index GRANULARITY.

3. Insert data:

insert into order_table2
values (101.'s_001'.1000.00.'the 2020-06-01 12:00:00'),
       (102.'s_002'.2000.00.'the 2020-06-01 11:00:00'),
       (102.'s_004'.2500.00.'the 2020-06-01 12:00:00'),
       (102.'s_002'.2000.00.'the 2020-06-01 13:00:00'),
       (102.'s_002'.12000.00.'the 2020-06-01 13:00:00'),
       (102.'s_002'.600.00.'the 2020-06-02 12:00:00');
Copy the code

Using the following statement, you can see that secondary indexes can work for queries that do not have primary key fields.

clickhouse-client --send_logs_level=trace <<< 'select * from order_table2 where total_amount > toDecimal32(900., 2)';
Copy the code

Result observation:

2021.11.18 14:53:14.362092 [6221] {50C5A092-6853-47c7-bfc0-23ebed4fed39} <Debug> default.order_table2 (SelectExecutor): Index 'a' has dropped 0 granules. 2021.11.18 14:53:14.362223 [6221] {50C5A092-6853-47c7-bfc0-23ebed4fed39} <Debug> default.order_table2 (SelectExecutor): Index `a` has dropped 1 granules.Copy the code

Data TTL

TTL stands for Time To Live, and MergeTree provides the ability To manage the life cycle of data or columns.

Column level TTL

Create a table

create table order_table3
(
    id           UInt32,
    item_id       String,
    total_amount Decimal(16.2) TTL create_time+ interval 10 SECOND,
    create_time  Datetime
) engine = MergeTree
      partition by toYYYYMMDD(create_time) primary key (id)
order by (id, item_id);
Copy the code

Insert data (note: change according to actual time)

insert into order_table3
values (106.'s_001'.1000.00.'the 2020-06-12 22:52:30'),
       (107.'s_002'.2000.00.'the 2020-06-12 22:52:30'),
       (110.'s_003'.600.00.'the 2020-06-13 12:00:00');
Copy the code

Manually merge. After the effect expires, the specified field data returns to 0

Table level TTL

Data will be lost 10 seconds after create_time

alter table t_order_mt3 MODIFY TTL create_time + INTERVAL 10 SECOND;
Copy the code

The judgment field must be of the Date or Datetime type. You are advised to use the Date field of the partition.

Time cycle that can be used:

SECOND
MINUTE
HOUR
DAY
WEEK
MONTH
QUARTER
YEAR