**Summary:** This article is a translation of Altinity’s series of technical articles on ClickHouse. ClickHouse is an open source analytical engine for Online Analytical Processing (OLAP) that is widely adopted at home and abroad for its excellent query performance, petabyte-scale data capacity, and simple architecture. This series covers ClickHouse in detail.

Preface

This article is a translation of Altinity’s technical series on ClickHouse. ClickHouse is an open source analytical engine for Online Analytical Processing (OLAP) that is widely adopted at home and abroad for its excellent query performance, petabyte-scale data capacity, and simple architecture.

Building on its open source ClickHouse optimization work, the Alibaba Cloud EMR-OLAP team provides a hosted cloud service for the open source OLAP analysis engine ClickHouse. EMR ClickHouse is fully compatible with the open source version and adds cloud product features such as rapid cluster deployment, cluster management, scale-out and scale-in, and alarm monitoring, while also optimizing ClickHouse read and write performance and improving its integration with other EMR components. Visit help.aliyun.com/document\_d… to learn more.

He Yuan (Jing Hang), Senior Product Expert, Alibaba Cloud Computing Platform Business Division

(Photo by Altinity)

Handling real-time updates in ClickHouse

Contents

  • A short history of ClickHouse updates
  • Use cases
  • Materializing updates
  • Conclusion
  • Follow-up

In OLAP databases, mutable data is generally unwelcome, and ClickHouse is no exception. Like some other OLAP products, it did not even support updates at first. Updates were added later, but like many other features, they were added “the ClickHouse way.”

Even now, ClickHouse updates are asynchronous, which makes them difficult to use in interactive applications. Still, in many use cases users need to modify existing data and expect to see the effect immediately. Can ClickHouse do that? Sure it can.

A short history of ClickHouse updates

Back in 2016, the ClickHouse team published an article titled “How to Update Data in ClickHouse.” At the time ClickHouse did not support data modification; updates could only be simulated with special insert structures, and data could only be dropped by partition.

To meet GDPR requirements, the ClickHouse team delivered UPDATE and DELETE in 2018, and the follow-up article on updates and deletes remains one of the most read on the Altinity blog. This asynchronous, non-atomic update is implemented as an ALTER TABLE UPDATE statement and can potentially rewrite large amounts of data. It is useful for batch operations and infrequent updates where immediate results are not required. “Normal” SQL updates appear on the roadmap year after year but are still not implemented in ClickHouse, so if we need real-time update behavior, we have to resort to other techniques. Let’s consider a practical use case and compare different ways of implementing it in ClickHouse.
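For orientation, those asynchronous mutations take the form of ALTER TABLE statements. A minimal sketch, assuming a hypothetical `events` table (not the alerts table used below):

```sql
-- Asynchronous, non-atomic mutations: affected data parts are rewritten
-- in the background, so the effect is not immediately visible.
ALTER TABLE events UPDATE status = 'done' WHERE event_id = 42;
ALTER TABLE events DELETE WHERE created_at < '2019-01-01 00:00:00';

-- Mutation progress can be inspected in a system table:
SELECT command, is_done FROM system.mutations WHERE table = 'events';
```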

Use cases

Consider a system that generates various kinds of alarms. From time to time, users or machine learning algorithms query the database to review and acknowledge new alarms. The acknowledge operation needs to modify the alarm record in the database, and once acknowledged, the alarm should disappear from the user’s view. This looks like an OLTP operation and seems alien to ClickHouse.

Since we cannot use updates, we have to insert a modified record instead. Once two records are in the database, we need an efficient way to get the latest one. For this we will try three different approaches:

  • ReplacingMergeTree
  • Aggregate functions
  • AggregatingMergeTree

ReplacingMergeTree

Let’s start by creating a table to store alarms.

CREATE TABLE alerts(
  tenant_id     UInt32,
  alert_id      String,
  timestamp     DateTime Codec(Delta, LZ4),
  alert_data    String,
  acked         UInt8 DEFAULT 0,
  ack_time      DateTime DEFAULT toDateTime(0),
  ack_user      LowCardinality(String) DEFAULT ''
)
ENGINE = ReplacingMergeTree(ack_time)
PARTITION BY tuple()
ORDER BY (tenant_id, timestamp, alert_id);

For simplicity, all alarm-specific columns are packed into a single generic “alert_data” column, but you can imagine an alarm containing dozens or even hundreds of columns. Also, in our example “alert_id” is a random string.

Note the ReplacingMergeTree engine. ReplacingMergeTree is a special table engine that replaces rows by primary key (the ORDER BY clause): a new version of a row with the same key value replaces the old one. In our use case, row “newness” is determined by the “ack_time” column. Replacement happens during background merges; it does not happen immediately and is not guaranteed to happen at all, so consistency of query results is a concern. However, ClickHouse has special syntax for working with such tables, which we will use in the queries below.
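The replacement semantics can be seen on a throwaway table (a minimal sketch; the `kv` table and its values are hypothetical, and the exact moment of the background merge is not guaranteed):

```sql
CREATE TABLE kv (k UInt32, v String, version DateTime)
ENGINE = ReplacingMergeTree(version)
ORDER BY k;

-- Two versions of the same key, the second one being newer:
INSERT INTO kv VALUES (1, 'old', toDateTime('2020-01-01 00:00:00'));
INSERT INTO kv VALUES (1, 'new', toDateTime('2020-01-02 00:00:00'));

SELECT * FROM kv;        -- may still return both rows until a merge happens
SELECT * FROM kv FINAL;  -- merges at query time and returns only the 'new' row
```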

Before running the query, let’s populate the table with some data. We generate 10 million alerts for 1000 tenants:

INSERT INTO alerts(tenant_id, alert_id, timestamp, alert_data)
SELECT
  toUInt32(rand(1)%1000+1) AS tenant_id,
  randomPrintableASCII(64) as alert_id,
  toDateTime('2020-01-01 00:00:00') + rand(2)%(3600*24*30) as timestamp,
  randomPrintableASCII(1024) as alert_data
FROM numbers(10000000);

Next, we acknowledge 99% of the alarms, supplying new values for the “acked”, “ack_user”, and “ack_time” columns. Instead of updating, we just insert new rows.

INSERT INTO alerts (tenant_id, alert_id, timestamp, alert_data, acked, ack_user, ack_time)
SELECT tenant_id, alert_id, timestamp, alert_data,
  1 AS acked,
  concat('user', toString(rand()%1000)) AS ack_user,
  now() AS ack_time
FROM alerts
WHERE cityHash64(alert_id) % 99 != 0;
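In a live application, acknowledging a single alert would be just one small insert along the same lines (a sketch; the alert_id placeholder and user name are made up, and alert_data is re-selected from the table here, though the application could equally carry it over itself):

```sql
INSERT INTO alerts (tenant_id, alert_id, timestamp, alert_data, acked, ack_user, ack_time)
SELECT tenant_id, alert_id, timestamp, alert_data,
    1 AS acked, 'user451' AS ack_user, now() AS ack_time
FROM alerts
WHERE tenant_id = 451 AND alert_id = '<some-alert-id>';
```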

If we were to query the table now, we would see the following:

SELECT count() FROM alerts

┌──count()─┐
│ 19898060 │
└──────────┘

1 rows in set. Elapsed: 0.008 sec.

The table clearly contains both acknowledged and unacknowledged rows, so the replacement has not happened yet. To see the “real” data, we have to add the FINAL keyword.

SELECT count() FROM alerts FINAL

┌──count()─┐
│ 10000000 │
└──────────┘

1 rows in set. Elapsed: 3.692 sec. Processed 19.90 million rows, 1.71 GB (5.39 million rows/s., 463.39 MB/s.)

Now the count is correct, but look how the query time has grown! With FINAL, ClickHouse has to scan all rows and merge them by primary key at query time. We get the right answer, but at a significant cost. Let’s see if it gets any better if we filter for unacknowledged rows only.

SELECT count() FROM alerts FINAL WHERE NOT acked

┌─count()─┐
│  101940 │
└─────────┘

1 rows in set. Elapsed: 3.571 sec. Processed 19.07 million rows, 1.64 GB (5.34 million rows/s., 459.38 MB/s.)

Although the count is significantly reduced, the query time and the amount of data processed stay the same. Filtering does not help speed up the query, and as the table grows, the cost only increases. This approach does not scale.

**Note:** For readability, all queries and their timings are displayed as if run in clickhouse-client. In fact we ran every query several times to make sure the results were consistent, and also verified them with the “clickhouse-benchmark” utility.

Well, querying the whole table does not help. Can ReplacingMergeTree still work for our use case? Let’s pick a random tenant_id and select all unacknowledged records — imagine a user looking at a monitoring view. I like Ray Bradbury, so I’ll take 451. Since “alert_data” values are just randomly generated, we compute a checksum to confirm that the results of the different approaches are identical:

SELECT count(), sum(cityHash64(*)) AS data
FROM alerts FINAL
WHERE (tenant_id = 451) AND (NOT acked)

┌─count()─┬─────────────────data─┐
│      90 │ 18441617166277032220 │
└─────────┴──────────────────────┘

1 rows in set. Elapsed: 0.278 sec. Processed 106.50 thousand rows, 119.52 MB (383.45 thousand rows/s., 430.33 MB/s.)

Much faster! The query took only 278 milliseconds to find all the unacknowledged data. Why is it fast this time? The difference is in the filter condition: “tenant_id” is part of the primary key, so ClickHouse can filter the data before FINAL. In this case, ReplacingMergeTree becomes efficient.

Let’s also try a user filter and query the number of alarms acknowledged by a particular user. The cardinality of the column is the same — we have 1000 users, so we try ‘user451’.

SELECT count() FROM alerts FINAL
WHERE (ack_user = 'user451') AND acked

┌─count()─┐
│    9725 │
└─────────┘

1 rows in set. Processed 19.04 million rows.

This is very slow because no index is used: ClickHouse scanned all 19.04 million rows. Note that we cannot add “ack_user” to the index, because that would break the ReplacingMergeTree semantics. However, we can pull off a neat trick with PREWHERE:

SELECT count() FROM alerts FINAL
PREWHERE (ack_user = 'user451') AND acked

┌─count()─┐
│    9725 │
└─────────┘

1 rows in set.

PREWHERE is a clever mechanism that lets ClickHouse apply a filter in a different way. ClickHouse is usually smart enough to move conditions to PREWHERE automatically, so users do not need to care — but it did not happen this time. Fortunately, we checked.

Aggregate functions

ClickHouse is known for its wide variety of aggregate functions — more than 100 in recent versions. Combined with nine aggregate function combinators (see clickhouse.tech/docs/en/que…), they give enormous flexibility. For this use case we only need three of them: “argMax”, “max”, and “any”.
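As a reminder of what these three functions do, here is a quick sketch on inline throwaway data (the sample rows are only for illustration):

```sql
SELECT
    argMax(v, t) AS latest_v,  -- value of v on the row where t is maximal
    max(t)       AS latest_t,  -- the maximal t itself
    any(v)       AS some_v     -- an arbitrary v, with no comparison overhead
FROM values('v String, t UInt32', ('first', 1), ('second', 2));
-- latest_v = 'second', latest_t = 2
```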

The same query for tenant 451 can be expressed with the “argMax” aggregate function, as follows:

SELECT count(), sum(cityHash64(*)) data
FROM (
  SELECT tenant_id, alert_id, timestamp,
    argMax(alert_data, ack_time) alert_data,
    argMax(acked, ack_time) acked,
    max(ack_time) ack_time_,
    argMax(ack_user, ack_time) ack_user
  FROM alerts
  GROUP BY tenant_id, alert_id, timestamp
)
WHERE tenant_id = 451 AND NOT acked;

┌─count()─┬─────────────────data─┐
│      90 │ 18441617166277032220 │
└─────────┴──────────────────────┘

1 rows in set. Processed 73.73 thousand rows, 82.74 MB.

Same result and same row count, but four times better performance! That is the efficiency of ClickHouse aggregation. The downside is that the query becomes more complex — but we can simplify it.

Please note that when the alarm is confirmed, we only update the following 3 columns:

  • acked: 0 => 1
  • ack_time: 0 => now()
  • ack_user: '' => 'user1'

In all three cases the column value only increases! So we can use “max” instead of the slightly bulkier “argMax”. And since we never change “alert_data”, no real aggregation is needed for that column at all. ClickHouse has a handy “any” aggregate function for this: it picks any value without extra overhead:

SELECT count(), sum(cityHash64(*)) data
FROM (
  SELECT tenant_id, alert_id, timestamp,
    any(alert_data) alert_data,
    max(acked) acked,
    max(ack_time) ack_time,
    max(ack_user) ack_user
  FROM alerts
  GROUP BY tenant_id, alert_id, timestamp
)
WHERE tenant_id = 451 AND NOT acked;

┌─count()─┬─────────────────data─┐
│      90 │ 18441617166277032220 │
└─────────┴──────────────────────┘

1 rows in set. Elapsed: 0.055 sec. Processed 73.73 thousand rows, 82.74 MB (1.34 million rows/s., 1.50 GB/s.)

The query is simpler and even a bit faster! The reason is that with the “any” function, ClickHouse does not need to compute “max” over the “alert_data” column!

AggregatingMergeTree

AggregatingMergeTree is one of ClickHouse’s most powerful features. Combined with materialized views, it enables real-time data aggregation. Since the previous approach already used aggregate functions, can we do even better with AggregatingMergeTree? As it turns out, not really.

We only update one row at a time, so each group has just two rows to aggregate — AggregatingMergeTree is not the best fit for that scenario. But we have one trick up our sleeve. An alarm is always inserted in the unacknowledged state first and later transitions to the acknowledged state, and after acknowledgement only three columns change. Can we save disk space and improve performance by not duplicating the other columns?

Let’s create a table that performs the aggregation with the “max” aggregate function. Instead of “max” we could also use “any”, but then the columns would have to be Nullable, because “any” picks the first non-NULL value.
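For illustration, the “any” variant of the table would look roughly like this (a sketch only; this table is not used in the measurements below):

```sql
CREATE TABLE alerts_amt_any (
  tenant_id     UInt32,
  alert_id      String,
  timestamp     DateTime Codec(Delta, LZ4),
  -- Nullable, so that 'any' can skip the NULLs inserted by acknowledge rows:
  alert_data    SimpleAggregateFunction(any, Nullable(String)),
  acked         SimpleAggregateFunction(max, UInt8),
  ack_time      SimpleAggregateFunction(max, DateTime),
  ack_user      SimpleAggregateFunction(max, LowCardinality(String))
)
ENGINE = AggregatingMergeTree()
ORDER BY (tenant_id, timestamp, alert_id);
```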

DROP TABLE alerts_amt_max;

CREATE TABLE alerts_amt_max (
  tenant_id     UInt32,
  alert_id      String,
  timestamp     DateTime Codec(Delta, LZ4),
  alert_data    SimpleAggregateFunction(max, String),
  acked         SimpleAggregateFunction(max, UInt8),
  ack_time      SimpleAggregateFunction(max, DateTime),
  ack_user      SimpleAggregateFunction(max, LowCardinality(String))
)
Engine = AggregatingMergeTree()
ORDER BY (tenant_id, timestamp, alert_id);

Since the raw data is random, we will populate the new table with the existing data in alerts. We will insert two times as before, one for unconfirmed alarms and the other for confirmed alarms:

INSERT INTO alerts_amt_max SELECT * FROM alerts WHERE NOT acked;

INSERT INTO alerts_amt_max 
SELECT tenant_id, alert_id, timestamp,
  '' as alert_data, 
  acked, ack_time, ack_user 
FROM alerts WHERE acked;

Note that for acknowledged events we insert an empty string instead of “alert_data”. We know the data does not change, so we only need to store it once! The aggregate function fills in the gap. In a real application we could skip all the invariant columns and let them take their default values.
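Taking that one step further, the acknowledge insert could list only the key columns and the three changed columns, leaving “alert_data” to its default empty value (a sketch; the alert_id and timestamp are placeholders):

```sql
-- 'alert_data' is omitted and defaults to ''; the aggregation later
-- picks up the real value from the original unacknowledged row.
INSERT INTO alerts_amt_max (tenant_id, alert_id, timestamp, acked, ack_time, ack_user)
VALUES (451, '<some-alert-id>', toDateTime('2020-01-15 12:00:00'), 1, now(), 'user451');
```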

Once we have the data, we first check the size of the data:

SELECT table, sum(rows) AS r,
  sum(data_compressed_bytes) AS c,
  sum(data_uncompressed_bytes) AS uc,
  uc / c AS ratio
FROM system.parts
WHERE active AND (database = 'last_state')
GROUP BY table

┌─table──────────┬────────r─┬───────────c─┬──────────uc─┬──────────────ratio─┐
│ alerts         │ 19039439 │ 20926009562 │ 21049307710 │ 1.0058921003373666 │
│ alerts_amt_max │ 19039439 │ 10723636061 │ 10902048178 │ 1.0166372782501314 │
└────────────────┴──────────┴─────────────┴─────────────┴────────────────────┘

Well, because of the random strings there is very little compression. However, since we do not have to store “alert_data” twice, the aggregated table is about half the size of the non-aggregated one.

Now let’s try the aggregate table query:

SELECT count(), sum(cityHash64(*)) data
FROM (
  SELECT tenant_id, alert_id, timestamp,
    max(alert_data) alert_data,
    max(acked) acked,
    max(ack_time) ack_time,
    max(ack_user) ack_user
  FROM alerts_amt_max
  GROUP BY tenant_id, alert_id, timestamp
)
WHERE tenant_id = 451 AND NOT acked;

┌─count()─┬─────────────────data─┐
│      90 │ 18441617166277032220 │
└─────────┴──────────────────────┘

1 rows in set. Elapsed: 0.036 sec. Processed 73.73 thousand rows, 40.75 MB (2.04 million rows/s., 1.13 GB/s.)

Thanks to AggregatingMergeTree, we process less data (40 MB instead of 82 MB) and the query is even more efficient.

Materializing updates

ClickHouse does its best to merge data in the background, removing duplicate rows and performing aggregation. Sometimes, however, it makes sense to force a merge — for example, to free up disk space. This can be done with the OPTIMIZE FINAL statement. OPTIMIZE is a slow and expensive operation, so it cannot be run frequently. Let’s see what effect it has on query performance.

OPTIMIZE TABLE alerts FINAL

Ok. 0 rows in set.

OPTIMIZE TABLE alerts_amt_max FINAL

Ok. 0 rows in set.

After OPTIMIZE FINAL, both tables have the same number of rows and the same data.

┌─table──────────┬────────r─┬───────────c─┬──────────uc─┬────────────ratio─┐
│ alerts         │ 10000000 │ 10616223201 │ 10859490300 │ 1.02291465565429 │
│ alerts_amt_max │ 10000000 │ 10616223201 │ 10859490300 │ 1.02291465565429 │
└────────────────┴──────────┴─────────────┴─────────────┴──────────────────┘

Once the data is fully merged, the performance differences between the approaches become much less pronounced.

Conclusion

ClickHouse provides a rich toolset for handling real-time updates: ReplacingMergeTree, CollapsingMergeTree (not covered in this article), AggregatingMergeTree, and aggregate functions. All of these approaches share three traits:

  • Data is “modified” by inserting a new version. Inserts in ClickHouse are extremely fast.
  • There are efficient ways to emulate update semantics similar to those of OLTP databases.
  • However, the actual changes do not happen immediately.

Which approach to choose depends on the application’s use case. ReplacingMergeTree is straightforward and the most convenient for users, but it suits only small and medium-sized tables, or data that is always queried by primary key. Aggregate functions offer more flexibility and performance but require substantial query rewriting. Finally, AggregatingMergeTree saves storage by keeping only the modified columns. These are all great tools for ClickHouse DB designers, to be applied according to their specific needs.

Follow-up

Now that you have seen how to handle real-time updates in ClickHouse, this series has more to come:

  • Use the new TTL Move to store the data in the appropriate place
  • Use the Join in the ClickHouse materialized view
  • ClickHouse aggregate function and aggregate state
  • Nested data structures in ClickHouse

Original link

This article is the original content of Aliyun and shall not be reproduced without permission.