This is my 88th original article.

I attended ClickHouse's second online meetup yesterday. Most of the talks were given by Russian engineers speaking English with a heavy Russian accent, which I found hard to follow. I had hoped to pick up a few tricks, but my listening skills were not up to it and I understood very little.

Fortunately, I had already read about some of the topics; otherwise yesterday's session would have been a waste.

A rising star in the OLAP world

Data processing is still divided into OLTP and OLAP.

OLTP (online transaction processing) is optimized for high concurrency, high availability, and accuracy across all kinds of create, read, update, and delete operations. The problems it deals with are therefore how to handle those operations under high concurrency, how to prevent dirty reads and dirty writes, and how to guarantee data consistency.

OLAP is optimized for fast data processing and fast reads. There are generally two approaches. The first is to pre-compute aggregates for every combination of dimensions and store them in a cube, so that results can be looked up directly at analysis time. The second is to store the data in a structured (relational) form, optimize that storage in various ways, and compute the results at query time. The second approach is ROLAP (Relational OLAP), and ClickHouse is the classic example.
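To make the difference between the two approaches concrete, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions: the table, the dimension names, and the helpers `build_cube` and `query_on_the_fly` are all made up for this example and have nothing to do with how ClickHouse or any cube engine is actually implemented.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical fact rows: (region, product, sales)
ROWS = [
    ("north", "book", 10),
    ("north", "pen", 5),
    ("south", "book", 7),
    ("south", "pen", 3),
]
DIMENSIONS = ("region", "product")


def build_cube(rows):
    """Cube approach: pre-aggregate sales for every subset of dimensions."""
    cube = defaultdict(int)
    for region, product, sales in rows:
        values = {"region": region, "product": product}
        for size in range(len(DIMENSIONS) + 1):
            for dims in combinations(DIMENSIONS, size):
                key = tuple((d, values[d]) for d in dims)
                cube[key] += sales
    return dict(cube)


def query_on_the_fly(rows, **filters):
    """ROLAP approach: scan the stored rows and aggregate at query time."""
    total = 0
    for region, product, sales in rows:
        values = {"region": region, "product": product}
        if all(values[d] == v for d, v in filters.items()):
            total += sales
    return total


cube = build_cube(ROWS)
print(cube[(("region", "north"),)])            # 15, a plain dictionary lookup
print(query_on_the_fly(ROWS, region="north"))  # 15, computed by scanning rows
```

The cube answers instantly but has to store one entry per combination of dimension values, which grows quickly as dimensions are added. The on-the-fly version stores only the raw rows and pays the cost at query time, which is why ROLAP lives or dies by how fast the engine can scan.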

How popular is ClickHouse? In the big data world, this CK is far more attractive than the CK that sells underwear! Here are two pictures to give you a feel for it:

Note the last one: no experience requirement, 20-40K!

The characteristics of ClickHouse

I used to reject ROLAP because it was too slow. ROLAP computes everything at query time, and the old routine was basically to generate one huge, complex SQL statement and throw it at the database. How could that not be slow?

But ClickHouse is different: its most notable feature is speed. It hardly seems possible!

The image above is from ClickHouse.

Every benchmark picks its own metrics, but this gap is remarkably wide, isn't it? An engineer from Yandex, the company that created ClickHouse, explained, somewhat to my disappointment, that there is no single great algorithm or trick behind it, just step-by-step optimization from the hardware up. Surprised?

Another feature of ClickHouse is that it stands alone and does not depend on any other component. This seems to be a trend; Doris is another example. Kylin, by contrast, depends on HBase, and that kind of dependency can cause all sorts of component version conflicts. Just think about it!

When ClickHouse runs a query, it uses as much of the server's resources as it can, not just memory. Even a simple lookup can eat more than 50% of the CPU!

In addition, CK has the following features:

  • Petabyte-scale data processing
  • Column-oriented storage
  • Excellent data compression
  • Multi-core parallel processing
  • Distributed processing across multiple servers
  • SQL support (some of the syntax is unusual)
  • Vectorized execution engine
  • Real-time data updates
  • High-throughput writes
  • Approximate calculation
  • Few dependencies, very easy to get started
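Two of the items above, column-oriented storage and data compression, are easier to see with a toy example. The sketch below is plain Python with a made-up table and hypothetical helpers (`to_columns`, `run_length_encode`). Run-length encoding is used only as a simple stand-in for compression; it is not a claim about the codecs ClickHouse actually uses.

```python
# Hypothetical table with three columns: user_id, country, amount.
ROW_STORE = [
    (1, "CN", 10.0),
    (2, "CN", 20.0),
    (3, "CN", 5.0),
    (4, "RU", 7.5),
    (5, "RU", 2.5),
]


def to_columns(rows):
    """Turn a list of row tuples into one list per column."""
    user_ids, countries, amounts = zip(*rows)
    return {
        "user_id": list(user_ids),
        "country": list(countries),
        "amount": list(amounts),
    }


def run_length_encode(values):
    """Collapse runs of equal neighbouring values into (value, count) pairs."""
    encoded = []
    for value in values:
        if encoded and encoded[-1][0] == value:
            encoded[-1][1] += 1
        else:
            encoded.append([value, 1])
    return [tuple(pair) for pair in encoded]


columns = to_columns(ROW_STORE)

# An aggregate over one column only has to read that column.
total = sum(columns["amount"])

# Values in one column tend to repeat, so they compress well.
compressed = run_length_encode(columns["country"])

print(total)       # 45.0
print(compressed)  # [('CN', 3), ('RU', 2)]
```

In a row layout, summing `amount` would mean reading every field of every row. In a column layout the query reads a single list, and similar values sitting next to each other are what make good compression possible.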

As for not supporting transactions, deletes, updates, and so on: that is simply not what OLAP needs, OK? A data warehouse does occasionally have to change data, but why would it need to support that so well?

ClickHouse applications and support

ClickHouse has a Chinese-language community.

Beyond basic ROLAP queries, ClickHouse can be combined with other techniques to cover a variety of use cases.

Suning combines CK with bitmaps for tag-based audience selection and user profiling.

Tencent uses CK for accurate, real-time online OLAP analysis of games.

ByteDance uses CK to build its data middle platform.
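The idea behind bitmap-based audience selection can be sketched in a few lines of Python. This is a rough illustration only, not Suning's implementation and not ClickHouse code: the tag names, user IDs, and the helpers `make_bitmap` and `bitmap_to_ids` are invented, and a Python integer stands in for a real bitmap structure.

```python
def make_bitmap(user_ids):
    """Build a bitmap in which bit N is set when user N carries the tag."""
    bitmap = 0
    for uid in user_ids:
        bitmap |= 1 << uid
    return bitmap


def bitmap_to_ids(bitmap):
    """List the user IDs whose bits are set, in ascending order."""
    ids = []
    uid = 0
    while bitmap:
        if bitmap & 1:
            ids.append(uid)
        bitmap >>= 1
        uid += 1
    return ids


# Hypothetical tags, each stored as one bitmap of user IDs.
TAGS = {
    "female": make_bitmap([1, 2, 5, 8]),
    "bought_phone": make_bitmap([2, 3, 5, 9]),
    "vip": make_bitmap([5, 8, 9]),
}

# Audience: female AND bought a phone AND NOT vip.
audience = TAGS["female"] & TAGS["bought_phone"] & ~TAGS["vip"]

print(bitmap_to_ids(audience))   # [2]
print(bin(audience).count("1"))  # 1, the audience size
```

Each tag costs one bit per user, and combining tags is a handful of bitwise operations rather than a join, which is why this pattern suits selecting audiences by tag over very large user bases.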

With a topic this hot, are you sure you don't want to learn more? I have collected ClickHouse materials for you: reply "CK" to get them.
