
Preface

Hello everyone, I’m Yunqi!

I happened to come across this in-depth interpretation of OLAP on Zhihu, which gives a detailed analysis of technology evolution, product selection, and execution optimization, so I am sharing it with you here.

The full text is about 10,000 words and takes roughly 30 minutes to read.

If you find it a bit long, why not bookmark it first and come back to it later?

Article | WenZheng lake

Source | Zhihu

What types of OLAP data warehouses are there?

Divided by data volume

Anything can be classified in many ways depending on the perspective, and the same goes for data warehouse products. For example, we can divide them into different types by data volume, as shown in the following figure:

The focus of this series of articles is real-time analytical data warehouses at the scale of hundreds of millions to roughly ten billion rows, such as Cloudera’s Impala, Facebook’s Presto, and Pivotal’s GreenPlum. If the data volume exceeds tens of billions of rows, an offline data warehouse is generally chosen, such as Hive or Spark (SparkSQL 3.0 seems to improve performance significantly). For small data volumes, even for analytical applications, you can simply use a common relational database such as MySQL.

Divided by modeling type

Next, we will focus on analytical data warehouses with medium data volumes, that is, on OLAP systems. According to the introduction of OLAP on Wikipedia, OLAP can be divided into MOLAP, ROLAP, and HOLAP according to the modeling method. The following sections introduce the advantages and disadvantages of each.

1. MOLAP

This is probably the most traditional kind of data warehouse. When Edgar F. Codd proposed the concept of OLAP in 1993, he was referring to MOLAP data warehouses. M stands for Multidimensional. Most MOLAP products precompute, from the raw data, all the results a user might need and store them in optimized multidimensional array storage, which you can think of as the “data cube” mentioned in the previous article.

Because all possible results have been computed and persisted, no complex calculation is needed at query time, and the array layout allows efficient, index-free data access, so every user query can be answered quickly and with stable latency. These result sets are highly structured and can be compressed/encoded to reduce their storage footprint.
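To make the precomputation idea concrete, here is a minimal sketch, not tied to any particular MOLAP product, that precomputes SUM(sales) for every combination of two made-up dimensions so that queries become dictionary lookups:

```python
from itertools import combinations
from collections import defaultdict

# Toy fact table: (region, product, sales) -- hypothetical data
facts = [
    ("north", "phone", 10),
    ("north", "laptop", 30),
    ("south", "phone", 25),
    ("south", "laptop", 15),
]
dimensions = ("region", "product")

# Precompute SUM(sales) for every subset of dimensions (the "cube").
cube = {}
for k in range(len(dimensions) + 1):
    for dims in combinations(range(len(dimensions)), k):
        agg = defaultdict(int)
        for row in facts:
            key = tuple(row[i] for i in dims)
            agg[key] += row[2]
        cube[dims] = dict(agg)

# A query such as "total sales for region = north" is now a lookup,
# not a scan over the raw facts.
print(cube[(0,)][("north",)])   # 40
print(cube[()][()])             # 80 (grand total)
```

The cost, as described below, is that every extra dimension multiplies the number of precomputed combinations.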

But high performance does not come for free. First, MOLAP requires precomputation, which can take a lot of time. It is obviously inefficient to recompute everything after each batch of incremental data is written, so supporting incremental computation on only the new data is important. Second, if business requirements change and new queries fall outside the predefined model, the existing MOLAP instance can do nothing but re-model and recompute.

Therefore, MOLAP is suitable for scenarios with fixed business requirements and large data volumes. Among open-source software, Kylin, developed by eBay and donated to the Apache Foundation, is one such OLAP engine, supporting sub-second queries on ten-billion-row datasets.


Its architecture diagram intuitively reflects the Cube-based precomputation model (the BUILD process), as shown below:

2. ROLAP

In contrast to MOLAP, ROLAP performs calculations directly on the fact and dimension tables that make up the multidimensional data model, without precomputation. R stands for Relational. Obviously, this approach is more scalable than MOLAP: after incremental data is imported there is nothing to recompute, and when users have new query requirements they only need to write the right SQL statement to obtain the required results.

However, ROLAP’s shortcoming is equally obvious. Especially with huge data volumes, the time needed to obtain a result after the user submits a SQL query is not really predictable: it may take seconds, or it may take tens of minutes or even hours. In essence, ROLAP spreads MOLAP’s precomputation cost over every user query, which inevitably affects the user’s query experience.

Of course, whether ROLAP’s performance is acceptable depends on the type of SQL the user runs, the size of the data, and the user’s expectations. Relatively simple SQL, such as the queries in TPC-H, responds quickly; complex SQL, such as the analytical and mining-style queries in TPC-DS, can take several minutes.

Compared with MOLAP, ROLAP has a lower barrier to entry. After building a star or snowflake model, creating fact and dimension tables with the corresponding schemas, and importing the data, users only need to be able to write SQL that expresses their requirements to get the desired results. This is clearly more convenient than building “data cubes”.

Some analyses suggest that although ROLAP cannot match MOLAP in performance, its flexibility and scalability mean that ROLAP has several times as many users as MOLAP.

3. HOLAP

MOLAP and ROLAP each have their own strengths and weaknesses, and those strengths and weaknesses are largely opposite. It would be better if they could complement each other, and HOLAP appeared for exactly this purpose. H stands for Hybrid. The idea is simple and direct: for frequent, stable, but time-consuming SQL queries, speed them up with precomputation; for queries that are faster, less frequent, or newly arising, operate directly on the fact and dimension tables with SQL, as ROLAP does.

There seems to be no open-source OLAP system of this type, though some big-data or Internet vendors, such as HULU, have similar in-house offerings. HOLAP may well be developed further and used on a larger scale in the future.

4. HTAP

From another perspective, HTAP can also be counted as a kind of OLAP system; it can be seen as ROLAP extended with transaction (OLTP) capabilities. Recent developments show that some cloud vendors have compromised on HTAP, weakening T (Transaction) into S (Serving) and evolving toward HSAP. This article does not expand further on HTAP/HSAP; more material can be looked up independently.

There are many mainstream OLAP data warehouse systems covering the types described above. Below is the data analytics market ranking released by Gartner in 2019:

It is clear that traditional commercial vendors and closed-source cloud service vendors occupy most of the market. Most of these systems I have heard of but not studied. As a database/data warehouse developer at an Internet company, I will focus in the rest of this article on open-source OLAP systems (SQL on Hadoop) based on the Hadoop ecosystem.

What are the common open source ROLAP products?

The open-source ROLAP systems widely used in production environments fall into two main categories: wide-table models and multi-table combination models (the star or snowflake models mentioned above).

Wide table model

The wide table model can provide better query performance than the multi-table combination model, but its shortcoming is that the supported SQL operations are relatively limited; for example, support for complex operations such as Join is weak or absent.

Druid supports larger data sizes, provides pre-aggregation capabilities, and achieves better query performance through inverted indexes and bitmap indexes. It is widely used in ad analytics, monitoring and alerting, and other time-series applications. ClickHouse has a simple deployment architecture, is easy to use, stores detail-level data, and delivers strong query performance thanks to vectorization, pruning, and other optimizations. Both offer good data freshness and are widely used in Internet companies.

In addition to Druid and ClickHouse, ElasticSearch and Solr can also be grouped under the wide table model, although their system architectures are quite different; the two are generally called search engines. They improve query performance through inverted indexes and the Scatter-Gather computation model, and work well for search-style queries. However, when the data volume is large or the query is of the scan-and-aggregate type, query performance suffers greatly.

Multi-table combination model

ROLAP systems built on star or snowflake modeling are among the most common, including GreenPlum, Presto, and Impala, all of which are based on the MPP architecture. Systems using this model and architecture support large data volumes, scale well, are flexible and easy to use, and support a wide variety of SQL.

Compared with other kinds of ROLAP and with MOLAP, such systems have no particular performance advantage, and their data freshness is relatively mediocre. General-purpose systems tend to be harder to implement and optimize than special-purpose systems, because they must consider more scenarios and support more query types. Special-purpose systems only need to optimize for the specific scenario they serve, which reduces their relative complexity.

For ROLAP systems, especially those based on star or snowflake models, minimizing query response time is crucial; it is the core competitiveness of such a system. This is the topic of the next section.

What techniques are used to optimize ROLAP system performance?

ROLAP systems currently used in production implement most of the performance optimization techniques in this area, including the MPP architecture, cost-based query optimization (CBO), vectorized execution engines, dynamic code generation, storage and access efficiency optimizations, and other CPU- and memory-related compute-layer optimizations. They are introduced one by one below.

What is the MPP architecture?

First, let’s talk about system architecture. This is the first fork in the road when designing an OLAP system. The architectures used in production environments today include systems built on the traditional MapReduce framework with a SQL layer assembled on top, the mainstream MPP-based systems, and other non-MPP systems.

1. MR architecture and its limitations

In the Hadoop ecosystem, Hive was the first to provide a SQL query service on top of the MapReduce framework.

However, the MR framework has obvious limitations, for example:

  • Each MapReduce job is independent of the others, and Hadoop does not know which MapReduce job comes next.
  • The output of each step is persisted to local disk or HDFS.

The first problem makes cross-job optimization impossible, and the second means that data exchange between MR jobs requires a large amount of I/O. Both problems severely hurt execution efficiency and lead to poor performance.

2. Advantages and disadvantages of MPP

MPP is short for Massively Parallel Processing, a large-scale parallel computing framework. MPP queries are much faster than those on the MR architecture, usually returning results in seconds or even milliseconds, which is why many latency-sensitive systems, such as OLAP systems, adopt the MPP architecture.

Impala is used as an example to introduce the MPP system architecture.

The Impala architecture diagram above shows the Impala components and the execution flow of a query; a minimal scatter-gather sketch follows the numbered steps below.

  1. The user sends query SQL to an Impala node using an Impala client or a UI tool such as impala-shell or Beeline. The Impala node that receives the SQL acts as the Coordinator node and parses the SQL.
  2. First, a single-node execution plan is generated. The plan is then distributed, for example by parallelizing joins, aggregations, and so on across the Impala Executor nodes. The execution plan is divided into multiple plan fragments (PFs), and each PF consists of one or more operators.
  3. The optimized PFs of the execution plan are then delivered to the corresponding Executor nodes. Multiple execution nodes process the task in parallel, shortening the time required for the whole task.
  4. Data is scanned from HDFS/HBase and processed layer by layer, for example shuffling and joining data across nodes.
  5. When a node finishes its task, it sends the output to the Coordinator node.
  6. The Coordinator node collects data from each execution node, performs the final processing, and returns the desired result set to the user.
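As an illustration only (not Impala’s actual code), here is a minimal scatter-gather sketch in Python: a “coordinator” hands hypothetical table fragments to “executor” workers that aggregate in parallel, then merges the partial results:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical table pre-split into fragments, as if each fragment
# lived on a different executor node.
fragments = [
    [("a", 1), ("b", 2), ("a", 3)],
    [("b", 4), ("c", 5)],
    [("a", 6), ("c", 7)],
]

def executor_aggregate(rows):
    """Plan fragment executed on one node: partial SUM(value) GROUP BY key."""
    partial = {}
    for key, value in rows:
        partial[key] = partial.get(key, 0) + value
    return partial

def coordinator(fragments):
    """Coordinator: scatter fragments to executors, then merge partial results."""
    with ThreadPoolExecutor(max_workers=len(fragments)) as pool:
        partials = list(pool.map(executor_aggregate, fragments))
    merged = {}
    for partial in partials:
        for key, value in partial.items():
            merged[key] = merged.get(key, 0) + value
    return merged

print(coordinator(fragments))  # {'a': 10, 'b': 6, 'c': 12}
```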

3. The MPP architecture outperforms MR for the following reasons:

  • Data exchanged between PFs (that is, intermediate processing results) stays in memory buffers without being spilled to disk (assuming memory is sufficient);
  • Processing between operators and between PFs is pipelined: there is no need to wait for an operator or PF to finish completely before the next one starts, and the upstream/downstream relationships and data exchange are determined explicitly in advance.

In this way, CPU resources are fully utilized and I/O consumption is reduced. But there are always two sides to the coin, and MPP is not perfect. Its main problems include:

  • Keeping intermediate results in memory is good in the normal case but bad in abnormal cases: if a node goes down, the intermediate results must be recomputed, which delays task completion.
  • Scalability is not as good as the MR architecture; in other words, performance no longer improves linearly once the number of MPP nodes grows beyond a certain scale. One reason is the “bucket effect”: the performance bottleneck is the slowest node. Another is that the larger the cluster, the more frequent node failures and bad disks become; as the failure rate rises, so does the probability of SQL retries.

Based on the above analysis, MPP suits jobs that do not take too long to execute, say no more than a few hours: the longer a job runs, the more likely it is to fail.

4. Other non-MPP architectures

Given the limitations of the MR framework, Hive and Spark have been optimized in ways other than the MPP architecture, including Hive on Tez and SparkSQL, both based on DAGs (Directed Acyclic Graphs).

Different architectures have different advantages and disadvantages. It is important to find the scenarios that are suitable for them and optimize them properly to make full use of their advantages.

What is cost-based query optimization?

Having a suitable system architecture does not by itself bring benefits; as the saying goes, a good horse still needs a good saddle, and the execution plan is just as decisive for the final performance of the system. Execution plans and their optimization, as I understand them, come from the field of relational databases. This is a deep subject in its own right; here is only a brief introduction.

The distributed architecture allows the execution plan to be parallelized across nodes; by splitting tasks into finer-grained units and turning serial execution into parallel execution, the execution time can be greatly shortened. In addition, there are two other important optimization approaches: traditional rule-based optimization and the more advanced cost-based optimization.

Rule-based optimization

Broadly speaking, rule-based optimization (RBO) optimizes the SQL statements issued by users without any additional information, mainly by rewriting the SQL, for example changing the order in which SQL clauses are executed. Common optimizations include predicate pushdown, field filtering pushdown (column pruning), constant folding, index selection, Join optimization, and so on.

Predicate pushdown is the most common example: filter conditions are pushed down to the storage layer. Take MySQL as an example: when the MySQL Server layer fetches InnoDB table data, it pushes the WHERE condition down to the InnoDB storage engine, InnoDB filters on the WHERE condition, and only qualifying rows are returned. When data is partitioned, predicate pushdown is even more effective.
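A minimal, engine-agnostic sketch of the idea, with a made-up table and filter: instead of returning every row and filtering afterwards, the predicate is handed down to the scan, so far fewer rows cross the storage/compute boundary:

```python
# Hypothetical storage layer: 1 million rows of (id, c2)
table = [(i, i % 100) for i in range(1_000_000)]

def scan_without_pushdown():
    # Storage returns everything; the server layer filters afterwards.
    rows = list(table)                      # all 1,000,000 rows cross the boundary
    return [r for r in rows if r[1] > 97]

def scan_with_pushdown(predicate):
    # The predicate is evaluated inside the "storage engine",
    # so only qualifying rows are returned upward.
    return [r for r in table if predicate(r)]

assert scan_without_pushdown() == scan_with_pushdown(lambda r: r[1] > 97)
# Both return ~20,000 rows, but with pushdown only those rows are
# materialized and transferred, not the full table.
```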

Field filtering pushdown works similarly: in columnar storage, only the data of the required columns is read; in row storage, a covering index can be chosen so that the query is answered from the index alone, which is also a case of index selection optimization.

Constant or function folding is also a common optimization: constant expressions in the SQL statement (addition, subtraction, multiplication, division, rounding, and so on) are evaluated once during the plan optimization stage rather than at execution time.
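As a toy illustration (not any real optimizer’s code), constant sub-expressions in an expression tree can be folded once at plan time so they are not re-evaluated for every row:

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}

def fold(expr):
    """Fold constant sub-expressions in a tree of ('op', left, right) /
    ('col', name) / ('const', value) nodes."""
    if expr[0] in ("col", "const"):
        return expr
    op, left, right = expr[0], fold(expr[1]), fold(expr[2])
    if left[0] == "const" and right[0] == "const":
        return ("const", OPS[op](left[1], right[1]))
    return (op, left, right)

# WHERE price * (2 + 3) > ...   becomes   price * 5
expr = ("*", ("col", "price"), ("+", ("const", 2), ("const", 3)))
print(fold(expr))  # ('*', ('col', 'price'), ('const', 5))
```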

There are many methods for Join optimization; rule-based optimization here mainly refers to the choice of Join implementation. The most naive Join implementation reads every record of the two participating tables and compares each pair against the Join condition. The most common optimization is the Hash Join, which is obviously far more efficient; don’t take it for granted, though, since MySQL did not have it until version 8.0. In addition, Join ordering and Join merging can also be decided and selected directly from the SQL itself.
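A minimal hash join sketch on toy tables (not any engine’s implementation): build a hash table on the smaller side, then probe it while scanning the larger side, instead of comparing every pair of rows:

```python
from collections import defaultdict

# Toy tables: orders(order_id, user_id) joins users(user_id, name)
users = [(1, "alice"), (2, "bob")]                      # small build side
orders = [(100, 1), (101, 2), (102, 1), (103, 3)]       # large probe side

def hash_join(build_rows, build_key, probe_rows, probe_key):
    # Build phase: hash table keyed on the join column of the small table.
    ht = defaultdict(list)
    for row in build_rows:
        ht[row[build_key]].append(row)
    # Probe phase: one hash lookup per row of the large table.
    for row in probe_rows:
        for match in ht.get(row[probe_key], []):
            yield row + match

print(list(hash_join(users, 0, orders, 1)))
# [(100, 1, 1, 'alice'), (101, 2, 2, 'bob'), (102, 1, 1, 'alice')]
```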

Cost-based optimization

The rule-based optimizer is simple and easy to implement: it uses a built-in set of rules to decide how to execute the query plan. Cost-based optimization (CBO) takes the opposite approach.

The implementation of CBO relies on detailed and reliable statistics, such as maximum and minimum values, averages, distinct-value counts, record counts, column sums, table size and partition information, and metadata such as column histograms.

One of the main uses of CBO is to decide, in a Join scenario, how each Join is executed and in what order the Joins are performed. The Joins discussed here are Hash Joins.

Join execution mode

Hash Join can be divided into broadcast and partition modes based on the sizes of the build table and the probe table.

Broadcast mode suits joining a large table with a small table: in a parallel Join, the small table is broadcast to every execution node that holds a partition of the large table, the Joins are performed locally, and the results are returned and merged.

Partition mode is the most general and suits joins between two large tables or cases where the table sizes are unknown. Both tables are hash-partitioned on the join key, and each pair of partitions is joined separately.

Obviously, the key to deciding is whether there is some way to obtain the number of records in each table; if the storage layer keeps record counts, they can be read directly from the metadata.

If both tables in the Join are large but at least one has a WHERE filter, then before deciding on partition mode we can further estimate how many records satisfy the condition. Here partitioned physical storage helps: from each partition’s maximum and minimum values and record counts we can estimate the total number of records after filtering. A more accurate approach, of course, is a column histogram, which gives a direct, intuitive view of the distribution of values.
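Here is a toy sketch of how such a decision might look; the partition statistics, estimation rule, and broadcast threshold are all made up for illustration. Per-partition min/max values of the filter column discard partitions that cannot match, the surviving row counts are summed, and the estimate decides between broadcast and partition mode:

```python
# Hypothetical per-partition metadata for the filter column "amount":
# (row_count, min_amount, max_amount)
partition_stats = [
    (2_000_000, 0, 49),
    (1_500_000, 50, 99),
    (3_000_000, 100, 199),
]

def estimate_rows(stats, lo, hi):
    """Rough upper bound on rows satisfying lo <= amount <= hi:
    keep only partitions whose [min, max] range overlaps [lo, hi]."""
    return sum(rows for rows, pmin, pmax in stats if pmax >= lo and pmin <= hi)

def choose_join_mode(build_side_rows, broadcast_threshold=1_000_000):
    return "broadcast" if build_side_rows <= broadcast_threshold else "partition"

est = estimate_rows(partition_stats, lo=150, hi=199)   # only the third partition overlaps
print(est)                                             # 3000000
print(choose_join_mode(est))                           # partition
```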

If none of the above statistics is available, another CBO technique is to dynamically sample records to decide which Join mode to use.

The Join order

If a query’s SQL contains multiple Join operations, the order in which the Joins are performed has a significant impact on performance. This, too, is an area thoroughly studied by database researchers.

A good CBO should be able to automatically choose, based on the characteristics of the SQL statement, whether to execute the joins as a left-deep tree (LDT, left in the figure) or a bushy tree (BYT, right in the figure).

Neither Join order is inherently better; the key is the data being joined, that is, the characteristics of the Join columns.

For LDT, if each Join can filter out a large amount of data, it is obviously better in terms of resource consumption. For some systems where every column is indexed, LDT is better than BYT.

Generally speaking, BYT is the more efficient shape: by turning a serial chain of Joins into fewer levels of parallel Joins, it exploits the advantages of the MPP architecture and produces results sooner, so it is often used in multi-table ROLAP scenarios.

Why do we need a vectorized execution engine? How does it relate to dynamic code generation?

The query execution engine is a core component of a database: it turns the query plan into a physical plan and evaluates it to return results. The execution engine has a significant impact on system performance. A comparison of Impala and Hive shows that Hive is slower than Impala even on some simple queries (TPC-H Query 1), mainly because Hive runs CPU-bound with only about 20% disk I/O utilization, whereas Impala reaches at least 85% I/O utilization.

What accounts for such a big difference? First, a brief description of the Volcano-style execution engine.

The volcano model and its disadvantages

The earliest query execution engines were Volcano-style, also known as the iterator model or one-tuple-at-a-time model. In this model, the query plan is a DAG of operators, each of which implements three functions: open, next, and close. open allocates resources, such as memory or file handles; close releases them; and next recursively calls the child operator’s next method to produce one tuple (the physical representation of a row).

Take select sum(C1) from T1 where C2 > 15 as an example: the aggregation operator’s next calls the filter operator’s next, and so on, recursing all the way down to the leaf Scan operator, whose next returns one tuple from the file.
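A minimal Python rendering of that iterator chain, with toy data and made-up class names:

```python
class Scan:
    """Leaf operator: returns one (c1, c2) tuple per next() call."""
    def __init__(self, rows):
        self.it = iter(rows)
    def next(self):
        return next(self.it, None)

class Filter:
    """Pulls tuples from its child and passes through those with c2 > 15."""
    def __init__(self, child):
        self.child = child
    def next(self):
        while (t := self.child.next()) is not None:
            if t[1] > 15:
                return t
        return None

class SumAgg:
    """Drains its child one tuple at a time and sums c1."""
    def __init__(self, child):
        self.child = child
    def next(self):
        total = 0
        while (t := self.child.next()) is not None:
            total += t[0]
        return total

rows = [(1, 10), (2, 20), (3, 30), (4, 5)]
plan = SumAgg(Filter(Scan(rows)))
print(plan.next())  # 5  (rows with c2 > 15 are (2, 20) and (3, 30))
```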

The Volcano model’s main disadvantages are:

  • Many virtual function calls: the Volcano model’s next method is usually implemented as a virtual function. A virtual function call requires a lookup in the virtual function table and is an indirect jump, which can cause a CPU branch misprediction costing a dozen or so cycles. Because the Volcano model calls next once per tuple, the function-call overhead becomes very expensive.
  • Type boxing: for an expression such as a + 2 * b, variables of different data types must be interpreted; in Java, for instance, primitive variables (such as int) must be boxed as Objects, and the implementation for the specific type must then be dispatched.
  • Poor CPU cache utilization: next returns only one tuple at a time, and tuples are usually stored row-wise. If only the first column is needed but a whole row is pulled into the CPU cache each time, cache misses result.
  • Branch mispredictions: modern CPUs have parallel pipelines, but conditional branches break that parallelism, for example checks on the data type (string or int), or checks on whether a column can be skipped because of filter conditions on other fields.
  • Mismatched CPU and I/O speeds: a row of data is read from disk and then handed to the CPU through many calls, so much of the time the CPU simply waits for data and sits idle.

The above description points to the basic approach to a solution: the problems fall into two broad categories, addressed by the vectorized execution engine and the dynamic code generation techniques described below.

Vectorization execution engine

Vectorized execution presupposes columnar storage. The main idea is to read a batch of column values from disk at a time and organize them as an array; each next call then processes the column array in a loop. This greatly reduces the number of next calls and improves CPU utilization. Because the data is laid out contiguously, CPU hardware features such as SIMD can be exploited and the data can be loaded into the CPU cache, raising the cache hit ratio. With columnar storage and a vectorized execution engine working together, query execution speed takes a huge leap forward.
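For contrast, here is the same toy query in a vectorized style (a sketch using NumPy; the batch size and data are made up): each iteration processes a whole column batch instead of a single tuple:

```python
import numpy as np

c1 = np.array([1, 2, 3, 4])
c2 = np.array([10, 20, 30, 5])

BATCH = 2  # real engines typically use batches of thousands of values

def scan_batches():
    for i in range(0, len(c1), BATCH):
        yield c1[i:i + BATCH], c2[i:i + BATCH]

total = 0
for c1_batch, c2_batch in scan_batches():
    mask = c2_batch > 15           # whole-batch comparison (SIMD-friendly)
    total += int(c1_batch[mask].sum())
print(total)  # 5
```

In a real engine the comparison and sum over each batch compile down to tight loops or SIMD instructions, which is where the speedup comes from.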

Dynamic code generation

Vectorization reduces CPU wait time, improves the CPU cache hit ratio, and, by reducing the number of next calls, alleviates the cost of virtual function calls. Dynamic code generation goes further and eliminates the virtual-function-call problem.

Instead of interpreting a single generic code path, dynamic code generation emits code specialized for the query in the target execution language and operates on primitive types directly. The generated code eliminates the branches caused by data-type checks and can use hardware instructions to further speed up loop processing.
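As a rough analogy in Python (real engines emit JVM bytecode or LLVM IR, not Python source): instead of interpreting a generic expression tree row by row, specialized source is generated once, compiled, and then applied to every row:

```python
def compile_predicate(column_index, threshold):
    # Generate specialized source for "row[column_index] > threshold"
    # once, instead of walking an expression tree for every row.
    src = f"def pred(row):\n    return row[{column_index}] > {threshold}\n"
    namespace = {}
    exec(compile(src, "<generated>", "exec"), namespace)
    return namespace["pred"]

pred = compile_predicate(column_index=1, threshold=15)
rows = [(1, 10), (2, 20), (3, 30), (4, 5)]
print(sum(r[0] for r in rows if pred(r)))  # 5
```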

JVM-based systems such as SparkSQL and Presto can use reflection to generate code, while the C++-based Impala uses LLVM to generate intermediate code; the C++ route is relatively more efficient.

Vectorization and dynamic code generation techniques often work together to achieve better results.

What are the storage space and access efficiency optimization methods?

There are many ways to optimize the storage and I/O modules; here we consider them in the context of the Hadoop ecosystem, although many of the techniques are general rather than Hadoop-specific. In OLAP scenarios, the most basic and effective storage optimization is columnar storage, and the measures discussed below all build on it.

Data compression and encoding

Data compression is a common optimization in the storage field. It reduces the space data occupies on disk at a controllable CPU cost, which saves money and also reduces the overhead of I/O and of data transfer across threads and nodes. The main compression algorithms in use today include Zlib, Snappy, and LZ4. An algorithm with a higher compression ratio is not necessarily better: its compression and decompression speed is usually slower, so a trade-off between CPU and I/O must be made based on the hardware configuration and application scenario.

Data encoding can be understood as lightweight compression, including run-length encoding (RLE) and dictionary encoding.

The figure above, from the Presto paper, shows the use of RLE and dictionary encoding. RLE suits columns made up of long runs of repeated values; in page 0, for example, the returnflag of all 6 rows is “F”. Dictionary encoding works well on low-cardinality columns, such as columns containing only a handful of distinct strings, and dictionaries can be shared across pages, exploiting the value locality of a column within the same table.

Compared with data compression, data encoding sometimes allows aggregation queries to be answered without decoding the data at all. For example, suppose C1 is a character column in table T1, and the RLE algorithm encodes 16 values of C1, “aaaaaabbccccaaaa”, as 6a2b4c4a, where 6a means six consecutive “a” characters. A query such as select count(*) from T1 where C1 = ‘a’ can then be answered directly from the encoded runs (6 + 4 = 10) without decompressing anything.
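A toy sketch of that example: the column is run-length encoded once, and the count is computed by scanning the runs rather than the decoded values:

```python
from itertools import groupby

def rle_encode(values):
    """'aaaaaabbccccaaaa' -> [(6, 'a'), (2, 'b'), (4, 'c'), (4, 'a')]"""
    return [(len(list(group)), value) for value, group in groupby(values)]

def count_equal(runs, target):
    """Answer count(*) where C1 = target directly on the encoded runs."""
    return sum(length for length, value in runs if value == target)

runs = rle_encode("aaaaaabbccccaaaa")
print(runs)                    # [(6, 'a'), (2, 'b'), (4, 'c'), (4, 'a')]
print(count_equal(runs, "a"))  # 10
```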

In columnar storage, data compression and encoding are far more effective than in row storage.

Fine-grained data storage

So-called fine-grained data storage means providing as much metadata as possible so that unnecessary data scanning and computation can be avoided. Common methods include, but are not limited to, the following:

  • Data partitioning: table data can be distributed across multiple storage nodes by hash or range and stored with multiple replicas, which improves disaster recovery and data migration efficiency. In addition, partitions that cannot satisfy the WHERE conditions can be filtered out quickly at query time, without reading the data itself to check.
  • Row groups: similar in spirit to partitioning, Parquet and ORC files, commonly used on Hadoop, also divide table data into row groups, and the records within each row group are stored column by column. The columnar layout improves OLAP query efficiency, while queries that need many columns of a row can still be served.
  • Local indexes: indexes built on data partitions or row groups to improve query efficiency. As shown in the figure below, an ORC file keeps Index Data at the head of each row group, storing metadata such as minimum and maximum values; based on this information it can quickly decide whether the row group needs to be scanned at all (see the sketch after this list). Some OLAP systems enrich this metadata further, for example with inverted indexes or B+ tree indexes over the row group’s records, to improve scanning and query efficiency even more.

  • Rich metadata: in addition to minimum and maximum values, metadata such as averages, distinct-value counts, record counts, column sums, table size and partition information, and column histograms is also maintained.
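A minimal sketch of the min/max (“zone map”) idea; the row-group layout and statistics are invented for illustration. Before scanning a row group, its stored min/max for the filtered column decides whether it can be skipped entirely:

```python
# Hypothetical row groups of a column "price", each with stored min/max metadata.
row_groups = [
    {"min": 1,   "max": 90,  "values": [1, 15, 42, 90]},
    {"min": 100, "max": 180, "values": [100, 120, 180]},
    {"min": 200, "max": 260, "values": [200, 230, 260]},
]

def scan_where_price_gt(threshold):
    hits, groups_scanned = [], 0
    for rg in row_groups:
        if rg["max"] <= threshold:
            continue                     # whole row group skipped via metadata
        groups_scanned += 1
        hits.extend(v for v in rg["values"] if v > threshold)
    return hits, groups_scanned

values, scanned = scan_where_price_gt(150)
print(values)   # [180, 200, 230, 260]
print(scanned)  # 2 of 3 row groups actually read
```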

Local data access

Reading and writing data locally is a common optimization, and it is available on Hadoop.

Normally, to read data from HDFS, a client first obtains the DataNode locations from the NameNode and then reads the required data from the DataNodes.

For an OLAP system such as Impala, HDFS access can be optimized so that HDFS file data on local disks is read directly. This HDFS feature is called “Short Circuit Local Reads”, and its related configuration items (in hdfs-site.xml) are as follows:

    <property>
      <name>dfs.client.read.shortcircuit</name>
      <value>true</value>
    </property>
    <property>
      <name>dfs.domain.socket.path</name>
      <value>/var/lib/hadoop-hdfs/dn_socket</value>
    </property>

Here, dfs.client.read.shortcircuit is the switch that enables this feature, and dfs.domain.socket.path is the path of the local socket used for communication between the DataNode and the DFSClient.

Runtime data filtering

This is an advanced feature of a few OLAP systems, such as Impala’s Runtime Filter (RF) and SparkSQL 3.0’s Dynamic Partition Pruning. The Bloom filter (BF) or filter condition built from the driving table is applied during the data scan of the driven table, greatly reducing the amount of data that must be scanned and returned. Each is outlined with a diagram below and will be covered in more detail in later analyses of specific OLAP systems.

The figure above visually illustrates the implementation of the Impala Runtime Filter; the process is as follows (a simplified code sketch follows the steps):

  1. The SCAN operations for both tables are issued at the same time. The left table is a large table and the right table a small one (relatively speaking; they may be of the same order of magnitude), but the left table waits for a period of time (1s by default), so the SCAN of the right table effectively runs first.
  2. The scan results of the right table are distributed to the different Join nodes by hashing on the join key, and the Join nodes build the hash table and the RF.
  3. Once all input from the right table has been consumed, the Join nodes finish building the RF and send it to the Coordinator node (for a Broadcast Join, the RF is sent directly to the left table’s Scan nodes).
  4. The Coordinator node merges the RFs from the different Join nodes, that is, it merges the Bloom filters. The merged Bloom filter is the global RF, which is distributed to every Scan node of the left table.
  5. The left table waits a certain amount of time (1s by default) before starting its data scan, in order to give the RF as much time as possible to arrive; if the RF arrives after that moment, it is still applied from then on.
  6. The left table’s scan output, filtered by the RF, is also delivered to the Join nodes by hash, and the Join nodes perform the probe (apply) operation to complete the whole Join.
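A simplified sketch of the runtime-filter idea, using a hand-rolled Bloom filter rather than Impala’s implementation: the build side’s join keys are inserted into the filter, and the probe-side scan uses it to drop rows that cannot possibly join:

```python
import hashlib

class BloomFilter:
    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits, self.num_hashes = num_bits, num_hashes
        self.bits = 0

    def _positions(self, key):
        for i in range(self.num_hashes):
            digest = hashlib.md5(f"{i}:{key}".encode()).hexdigest()
            yield int(digest, 16) % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        return all((self.bits >> pos) & 1 for pos in self._positions(key))

# Build side (small "right" table): join keys that survived its own filters.
build_keys = [3, 7, 42]
rf = BloomFilter()
for k in build_keys:
    rf.add(k)

# Probe side (large "left" table): apply the runtime filter during the scan,
# so most non-matching rows never reach the join operator.
left_rows = [(i, f"row{i}") for i in range(1000)]
survivors = [row for row in left_rows if rf.might_contain(row[0])]
print(len(survivors))  # close to 3; false positives are possible but no misses
```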

SparkSQL figure 1 (the official figure has an error: the right side should be Scan Date)

SparkSQL figure 2

The two figures above illustrate dynamic partition pruning in SparkSQL 3.0: the scan result of the right table (the hash table of the Date table after its filter) is broadcast to the Join node of the left table, and during the scan of the left table the right table’s hash table is used to filter out non-matching data.

Are there other ways to optimize?

Another extremely important area is cluster resource management and scheduling. Hadoop uses YARN for resource scheduling, which brings great convenience but is not well suited to OLAP systems with high performance requirements.

For example, starting the AppMaster and requesting containers take considerable time, especially the former, and containers are supplied intermittently, which badly hurts timeliness.

Current optimizations mainly include keeping the AppMaster resident after startup, container reuse, and so on. Having resources already in place when they are needed allows queries to start immediately instead of waiting.

Finally, a summary

The purpose of this article is to explain my understanding of databases and OLAP systems. I adopted the Q&A format because these are exactly the questions I took to Google or internal resources, or asked experts in the field directly.

Given my limited knowledge, mistakes are inevitable; corrections after reading are very welcome.

I am “Qi Yunqi”, a big data developer who loves technology and can write poetry. Welcome to follow me!
