
Preface

Recently, the global IT research and advisory firm Forrester released its report “The Forrester Wave™: Cloud Data Warehouse, Q4 2018”, and Alibaba's analytic database, AnalyticDB, was selected for inclusion. AnalyticDB is a PB-level real-time cloud data warehouse developed in-house by Alibaba. It is fully compatible with the MySQL protocol and the SQL:2003 syntax standard, and it supports real-time multidimensional analysis and business exploration over trillions of records with millisecond-level response. It helps customers move data analysis and value discovery from traditional offline analysis to next-generation online, real-time analysis. This article takes a closer look at the core product capabilities and the customer value behind AnalyticDB's selection.

Core competency 1: Fast and real-time

AnalyticDB can run real-time multidimensional analysis over trillions of records almost instantly, so the value in the data is discovered quickly. For complex SQL queries, AnalyticDB is 10 times faster than traditional relational databases. In addition, AnalyticDB can scale out rapidly to thousands of nodes, which further improves query response time. Three modules combine to give it a performance lead over competitors:

Intelligent SQL optimizer: For complex SQL queries, the AnalyticDB SQL optimizer applies a variety of query rewrites, chooses the best join order based on statistics, and supports CTE merging. For high-concurrency, low-latency queries, it provides an intelligent plan cache that stores plans for similar SQL patterns, which avoids the cost of optimizing the same pattern repeatedly.
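The plan cache idea described above can be illustrated with a minimal sketch. It is not AnalyticDB's implementation: the names `sql_pattern`, `PlanCache`, and `toy_optimizer` are hypothetical, and literals are masked with a simple regular expression, whereas a real optimizer would work on a parsed query.

```python
import re

# Hypothetical sketch: mask string and numeric literals so that queries
# differing only in constants share one cache key (one "SQL pattern").
_LITERAL = re.compile(r"'(?:[^']|'')*'|\b\d+(?:\.\d+)?\b")


def sql_pattern(sql: str) -> str:
    """Replace literals with '?' and normalize whitespace and case."""
    masked = _LITERAL.sub("?", sql)
    return " ".join(masked.split()).lower()


class PlanCache:
    """Caches one plan per SQL pattern and counts hits and misses."""

    def __init__(self, optimize):
        self._optimize = optimize  # assumed to be the expensive planning step
        self._plans = {}
        self.hits = 0
        self.misses = 0

    def plan_for(self, sql: str):
        key = sql_pattern(sql)
        if key in self._plans:
            self.hits += 1
        else:
            self.misses += 1
            self._plans[key] = self._optimize(key)
        return self._plans[key]


def toy_optimizer(pattern: str) -> str:
    # Stand-in for real query optimization.
    return f"PLAN({pattern})"


cache = PlanCache(toy_optimizer)
cache.plan_for("SELECT * FROM orders WHERE id = 42")
cache.plan_for("select * from orders where id = 7")             # same pattern
cache.plan_for("SELECT * FROM orders WHERE city = 'Hangzhou'")  # new pattern
```

In this sketch the second query reuses the plan built for the first, so only two of the three calls pay the optimization cost.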

Xihe computing engine: In 2017, AnalyticDB was comprehensively upgraded to the new-generation Xihe distributed computing engine. The engine adopts an MPP architecture overall, supports the DAG computing model, and uses runtime compilation (JIT) technologies such as LLVM within each node, which more than doubles performance. Inside the Xihe engine's computing units, each data analysis task is split into fine-grained pieces. A time-slice round-robin scheduling mechanism is built into the engine, which keeps tasks running stably under high concurrency.
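As a rough illustration of time-slice round-robin scheduling over fine-grained work units, consider the sketch below. It assumes tasks can be modeled as Python generators that yield after each small unit of work; `make_task` and `round_robin` are hypothetical names and do not reflect the engine's actual scheduler.

```python
from collections import deque


def make_task(name, units):
    """A task modeled as a generator that yields after each small work unit."""
    for i in range(units):
        yield f"{name}:{i}"


def round_robin(tasks, quantum):
    """Run each task for at most `quantum` units per turn, then rotate."""
    queue = deque(tasks)
    trace = []
    while queue:
        task = queue.popleft()
        for _ in range(quantum):
            try:
                trace.append(next(task))
            except StopIteration:
                break  # task finished; do not requeue it
        else:
            queue.append(task)  # quantum used up; the task may have more work
    return trace


trace = round_robin([make_task("long", 5), make_task("short", 2)], quantum=2)
```

Because the long task gives up the processor after every two units, the short task completes without waiting for the long one to finish, which is the property that matters under high concurrency.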

Xuanwu storage engine: AnalyticDB supports hybrid row-column storage. As data is written in real time, the engine intelligently builds several kinds of indexes suited to the different data types, including B+ tree, interval, inverted, and bitmap indexes. It also improves on traditional index algorithms by introducing techniques such as dynamic filtering and late materialization, which greatly reduce I/O, deliver high-performance point and range lookups, and support join analysis over billions of records.
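To make the bitmap index and late materialization ideas concrete, here is a small sketch under stated assumptions. It uses Python integers as bitsets over an in-memory toy table; `BitmapIndex`, `row_ids`, and the sample data are hypothetical and are not part of AnalyticDB.

```python
from collections import defaultdict


class BitmapIndex:
    """Maps each distinct column value to an int used as a bitset of row ids."""

    def __init__(self, values):
        self.bitmaps = defaultdict(int)
        for row_id, value in enumerate(values):
            self.bitmaps[value] |= 1 << row_id

    def rows_equal(self, value):
        # .get avoids creating an entry for values that never occur.
        return self.bitmaps.get(value, 0)


def row_ids(bitmap):
    """Expand a bitset into the sorted list of row ids it contains."""
    ids = []
    row_id = 0
    while bitmap:
        if bitmap & 1:
            ids.append(row_id)
        bitmap >>= 1
        row_id += 1
    return ids


# Column-oriented toy table (hypothetical data).
city = ["hz", "bj", "hz", "sh", "hz"]
status = ["paid", "paid", "open", "paid", "paid"]
amount = [10, 20, 30, 40, 50]

city_idx = BitmapIndex(city)
status_idx = BitmapIndex(status)

# Evaluate both predicates on the indexes first (a bitwise AND), and only
# then read the `amount` column for the surviving rows (late materialization).
matches = city_idx.rows_equal("hz") & status_idx.rows_equal("paid")
selected = row_ids(matches)
total = sum(amount[i] for i in selected)
```

The `amount` column is touched only for the two rows that pass both filters, which is the way this kind of design cuts I/O on wide tables.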

AnalyticDB has a fully distributed architecture, which lets the database scale ECU nodes dynamically and linearly up to thousands of nodes. By scaling horizontally, users can substantially improve SQL query response time and the concurrency of SQL processing. AnalyticDB is built on Alibaba Cloud's Apsara (Feitian) system. It adopts a layered, decoupled architecture that places analytical computation, data writing, and index construction on separate nodes. Each type of node runs in multi-active mode to achieve high availability. Data is stored on the Pangu distributed file system, which provides high reliability and high-performance read/write I/O, so the architecture as a whole achieves elastic scaling and high availability. Every layer of the AnalyticDB architecture is designed with scale-out fully in mind.



Core competency 3: High-concurrency real-time writes and updates

Because both the front-end access layer and the write nodes support large-scale dynamic expansion, customers can raise write capacity from a minimum of 100,000 TPS to more than 10 million TPS by adding nodes horizontally. Data becomes visible once it is written in real time, and the end-to-end delay from writing to analysis is kept within seconds.

A single table supports up to PB-level data and ten trillion records. Traditional data warehouses usually load data offline and lack real-time, high-concurrency write capability. Because AnalyticDB can write massive volumes of data in real time, its analysis is highly timely, which makes it a core solution for enterprises moving from offline computation to real-time computation.

Core competency 4: Flexibility

AnalyticDB uses a fully distributed design for the front-end access layer, the elastic computing layer, and the data storage layer, with no single point of failure anywhere. In addition, separating storage from compute brings great flexibility. On the cloud, customers can adjust the number of nodes at any time and can also scale instance specifications up or down dynamically. AnalyticDB supports flexible switching between large-storage SATA instances and high-performance SSD instances.

For example, an enterprise can move from 8 high-performance C4 instances to 12 high-performance C8 instances, from 12 C8 back to 8 C4, or even from 2 high-performance C8 nodes to 4 large-storage SATA S2N nodes. This gives enterprises real flexibility in controlling costs.

Core competency 5: Easy to use

As a cloud-hosted, PB-level SQL data warehouse, AnalyticDB is highly compatible with the MySQL protocol and SQL:2003, so it can be used easily through standard SQL, common BI tools, and ETL tool platforms. Together with Alibaba Cloud Data Transmission Service (DTS) and the data visualization products DataV and QuickBI, an enterprise can build its real-time data warehouse through simple drag-and-drop configuration. AnalyticDB aims to lower the barrier for enterprises to build real-time data operations.

Solve data construction efficiency and performance problems for enterprises

4PX is a leading cross-border e-commerce logistics service provider. After years of building an offline data platform, the 4PX information technology team needed to build a PB-level real-time data platform in a short time to support digital operations. After evaluating a range of solutions and weighing cost and construction efficiency, 4PX Information Technology chose AnalyticDB to build its real-time data platform. In a very short time, it used the DTS + AnalyticDB + DataV/QuickBI suite to complete the initial infrastructure of the 4PX real-time data warehouse through simple, quick drag-and-drop configuration.



Wuhe Camera is a popular smart photo app. It has many kinds of user and app data that must be reported for real-time analysis, so that operators can evaluate the effect of campaigns and developers can analyze the app, continuously improving user experience and app quality. The total data volume is about 10 billion records, which need to be stored and updated in real time. The customer first used MySQL and then switched to MongoDB, which solved the real-time write problem but left analysis very slow. After adopting a large-storage AnalyticDB instance, business data is written directly to AnalyticDB. This not only solves the real-time, high-concurrency write problem but also cuts complex analysis time from more than 40 minutes to seconds, with peak QPS above 1,800.

Typical industry customers – they are also using AnalyticDB

Looking to the future: Building up more innovative capabilities and a richer ecosystem

As Alibaba's next-generation PB-level real-time data warehouse, AnalyticDB carries the mission of real-time data value analysis for customers both inside the group and on the cloud. The report shows that enterprise big data services as a whole are entering the cloud data warehouse (CDW) stage, where elasticity, ease of use, and self-service have become the mainstream trend. AnalyticDB will continue to broaden and deepen its surrounding ecosystem in areas such as ease of use, data channels, task management, and visualization. It has also been building up core capabilities for the future and has made progress in stages:

  1. AnalyticDB used GPU acceleration for the first time during the “Double 11 Global Carnival”. While greatly reducing computing cost, it helped merchants worldwide move their data analysis from the offline era to the online era, taking PB-level data from T+1 computation to second-level real-time analysis.
  2. Vector analysis for the first time supported combined real-time analysis of face recognition, algorithmic recommendation, and structured data in new retail scenarios such as Intime and Hema. It connected online and offline membership systems at millisecond latency and supported real-time, digitalized offline interaction and marketing.

AnalyticDB was born to bring the value of data online. As a real-time cloud data warehouse platform, AnalyticDB hopes to offer the most advanced next-generation real-time data warehouse capabilities to all enterprises, helping them transform and accelerate the exploration of data value and bring it online. Slow queries? Use AnalyticDB!