Hologres (Chinese name) interactive analysis is ali cloud from the research on the number of one-stop real-time warehouse, the cloud native system combines real-time service and analysis of large data, fully compatible with PostgreSQL deal with large data ecological seamless get through, can use the same set of data architecture also supports real-time written real-time query and real-time offline federal analysis. Its emergence simplifies the architecture of business, provides real-time decision-making ability for business, and enables big data to play a greater role in business value. From the birth of Ali Group to the commercialization on the cloud, with the development of the business and the evolution of technology, Hologres is also continuously optimizing the competitiveness of its core technology. In order to make you better understand Hologres, we plan to continue to launch a series of revealing the underlying technical principles of Hologres, from high-performance storage engine to efficient query engine. High throughput write to high QPS query, all-round interpretation of Hologres, please continue to pay attention!

Highlights of previous seasons:

  • VLDB 2020 paper”Alibaba Hologres: A cloud-Native Service for Hybrid Serving/Analytical Processing
  • Hologres Revealed: First Public! Alibaba cloud native real-time data warehouse core technology revealed
  • Hologres Revealed: The first cloud native Hologres storage engine revealed
  • Hologres Reveals: Hologres Efficient Distributed Query Engine

In this issue, we’ll explain the technical principles of Hologres’ high-performance native accelerated query, MaxCompute.

MaxCompute (formerly known as ODPS) is now dedicated to the storage and calculation of structured data in batches, as the data scale has grown to the level of terabytes, petabytes, and ebs that the traditional software industry cannot support. It is a fast and fully managed EB data warehouse solution that provides mass data warehouse solutions and analysis modeling services.

Hologres seamlessly integrates with MAXCOMPUTE in offline big data scenarios, enabling accelerated query of MAXCOMPUTE without data import and export, fully compatible access to various MAXCOMPUTE file formats, and interactive analysis of PB level of offline data at the millisecond level. SQE (S Query Engine), the implementer behind Hologres, realizes Native access to MaxCompute through SQE, and then combines with the processing of Hologres high-performance distributed execution Engine HQE to achieve the ultimate performance.

The Hologres accelerated query MaxCompute has the following main advantages:

  • High Performance: Direct accelerated querying of MAXCompute data with sub-second response query performance, direct ad-hoc querying in OLAP scenarios for most reporting and other analysis scenarios.
  • Low Cost: Over the years, MaxCompute has been developed to allow users to store large amounts of data on MaxCompute. One store can be accessed directly without redundancy. On the other hand, users can simply migrate data for a subset of high-performance scenarios to SSDs, and data for analysis scenarios such as reports can be stored in MaxCompute to further reduce costs.
  • More efficient: Native access to MAXCompute enables high performance and fully compatible access to a variety of MAXCompute file formats, as well as complex tables such as Hash/Range Clustered Table, without migrating or importing data, reducing user cost.

    Introduction to SQE architecture

The overall architecture of SQE is shown in the figure above, and you can see that the overall architecture is also very simple. MaxCompute data is stored in Pangu. When HologRes executes a Query to speed up the Query for MaxCompute data, on the HologRes side:

  • The Hologres Frontend requests meta-information from the SQE Master via the RPC.
  • Hologres Blackhole makes a request to the SQE Executor via RPC for specific data related information.
  • SQE consists of processes with two roles:

    • SQE Master handles Meta-related requests and is primarily responsible for retrieving tables, partitioning metadata, authentication, and file sharding.
    • SQE Executor, as the core of SQE, is responsible for reading data requests, involving Block Cache, read-ahead fetching, UDF processing, expression push-down processing, index processing, Metric, Meter and other functions.

    MaxCompute Appearance Engine Core Technology Innovations

    The SQE-based architecture enables high performance and accelerated querying of MaxCompute data based on the following technological innovations:

In combination with the distributed nature of MaxCompute, Hologres abstracts a distributed surface to support access to MaxCompute distributed data. It is now possible to access MAXCOMPUTE distributed pangu files across clusters and compute cluster reads as close as MAXCOMPUTE.

SQE and MAXCOMPUTE can achieve metadata and Data real-time access, Import Foreign Schema command is supported, and SQE and MAXCOMPUTE can achieve metadata and Data real-time access. Automatically synchronize MAXCOMPUTE metadata to the HologRes facade, enabling the facade to be automatically created and the structure to be automatically updated.

3) Support UDF/ Expression Push Down SQE supports UDF/ Expression Push Down to realize user-defined UDF calculation; Pushing the expression down reduces the overhead of useless data transfers and further improves performance.

In Hologres V0.10 and above, Hologres has updated the execution engine to use asynchronous readers for more efficient asynchronous reads. Asynchronous PREFETCH is also supported to further reduce read latency; In addition, Hologres supports a series of optimization techniques such as IO merge, LazyRead, Lazy Decoding, etc., to reduce the latency of IO on the whole query to bring the maximum performance.

SQE also uses BlockCache to store frequently used and recently used data in memory, which reduces unnecessary IO and speeds up read performance. Within the same node, the same accessed data is shared in a Block Cache through the consistent Hash implementation. For example, in the SCAN scenario, the performance can be improved by more than 2 times, greatly improving the query performance.

6) The traditional process model and other architectures require the dynamic and real-time creation of processes and other scheduling operations, which brings a large scheduling overhead. SQE uses the resident process mode to avoid unnecessary scheduling overhead, and it can also greatly improve the Block Cache hit ratio and effective utilization. Network Shuffle is required to provide a fast and stable fault-tolerant mechanism. Because of a Network Shuffle, both sender and receiver processes must be alive to complete a data Shuffle. Similarly, if the Retry of Network Shuffle is carried out in the traditional way of falling disk, although the stability can be guaranteed, it may lead to large performance overhead due to disk IO during the Retry process. In order to solve this problem, we optimize the phased scheduling to solve the fast and stable fault tolerance problem.

MaxCompute appearance engine upgrades to HQE

As mentioned above, we used SQE to accelerate the query of MaxCompute appearance. SQE can perform very well, but there is a layer of RPC interaction in the middle when interacting with Hologres and there is a bottleneck in the network when there is a large amount of data.

Therefore, based on the existing capabilities of Hologres, we have optimized the execution engine in Hologres V0.10 and above to support direct reading of the MAXCOMPUTE table by the Hologres HQE query engine, which has further improved performance by more than 30% compared to SQE reading.

This is mainly due to the following aspects:

1) The interaction between the RPC between SQE and Hologres is saved, which is equivalent to saving the serialization and deserialization of data once, and the performance is further improved. 2) The Block Cache of Hologres can be reused, so that the second query does not need to access the storage, so that the data can be accessed directly from the memory to avoid the storage of IO, so as to better accelerate the query. 3) The existing Filter pushdown capability can be reused to reduce the amount of data to be processed. 4) Implement read-ahead and Cache in the underlying IO layer to further speed up SCAN performance.

The following is the performance data of a customer’s actual online business query:

E2E running time SQL Num SQE query performance (average response) HQE query performance (average response) Performance improvement
2-10s 547 4956 ms 2609 ms + 47.34%
10-30s 207 16757 ms 5457 ms + 67.43%
More than 30 s 63 78686 ms 12666 ms + 83.90%
Total 817 13631 ms 4106 ms + 69.87%

Execute 817 SQL with an overall performance improvement of 70%, including more than 80% improvement in long Query.

Note: The optimization has been launched in Hologres V0.10, please see the documentation for use.

MaxCompute speeds up scenario selection

There are two ways to speed up the query of MAXCOMPUTE in Hologres:

1) Creating outer (data is still stored in MAXCOMPUTE) gives a 2-5 times performance improvement over querying in MAXCOMPUTE 2) Importing inner tables gives about a 10-100 times performance improvement over outer

Foreign Data Wrappers, in PostgreSQL, use the external access interface to access Data stored externally. It is recommended that you use a more convenient Import Foreign Schema method to create the facade, which can better simplify the synchronization of metadata without paying attention to field type mapping, etc.

The direct facade union approach actually takes advantage of the query engine’s optimization power to improve efficiency, but does not take advantage of the indexing power of Hologres. Therefore, when the outer surface is transferred to the inner table, the index structure of the inner table can be specified according to the query mode. Through these index capabilities, the query performance can be higher. This is the reason why external import inner table, inner table performance is better, can give full play to the database index optimization ability.

At present, the main comparison between the two methods is as follows:

Scenario/Dimension performance Storage costs The amount of data The index convenience
Within the Hologres table Very good high
(SSD) PB level can be supported Can support bitmap, cluster and other indexes You need to import data
Hologres appearance good low
(HDD) A single query is limited to 200GB Only ODPS indexes are supported There is no need to migrate or import data

As can be seen from the above comparison:

  • If you have a large amount of data and have high requirements for performance (such as within 100ms, etc.), are sensitive to query delay, and have SLA requirements for query, it is suggested that you import the data into the table in Hologres for query access.
  • For temporary exploratory analysis or internal business that is not sensitive to latency, you can use the MaxCompute surface approach to reduce data movement.
  • In addition to the above scenarios, you can choose the appropriate usage scenario according to the specific business situation.

The composite relationship between MaxCompute and Hologres

I’ve described a number of scenarios where the Hologres appearance query engine speeds up querying MaxCompute, but not all types of queries are suitable for execution on the Hologres appearance engine.

Hologres is a synchronous query engine designed for interactive analysis scenarios, targeting big data in and small data out scenarios, typically used in ad-serving and Analytics scenarios. MaxCompute is an asynchronous data processing engine designed for massive data processing scenarios. It is designed for big data in and big data out scenarios, typically used in ETL scenarios. In the ETL scenario, jobs are delivered asynchronously, the IO interface is optimized for SCAN, the computing process requires redundant design of nodes to support high availability, and the need to compute state drop to automatically retry in the event of failure. These are all capabilities that Hologres does not have. Together, MaxCompute+Hologres provides a one-stop data processing + service experience that reduces data isolation and redundancy and provides a solution architecture for big data warehouses that supports real-time, offline, integrated development experiences.

conclusion

Hologres is deeply integrated with MaxCompute through SQE, making full use of the advantages of Hologres and MaxCompute, aiming at the ultimate performance, to directly accelerate the query of MaxCompute data, making it more convenient and efficient for users to conduct interactive analysis, and at the same time reducing the great analysis cost. Realize the integration of offline storehouse service.

About the author: Wang Qi (Hui-Qing), a technical expert at Alibaba, is engaged in the research and development of interactive analysis engine Hologres.

In the future, we will successively launch the series of unveiling the underlying technical principles of Hologres. The specific planning is as follows. Please keep your attention!