Abstract: Enterprises choose to split their online business databases, or choose different database types to meet their business needs. Service data is scattered in various places. How to collect and query these data conveniently has become a big problem for users

background

As business complexity increases and data size increases, more and more companies are choosing to split their online business databases vertically or horizontally, or even choose different database types to meet their business needs. SQL queries that could have been implemented in the same database instance now need to be completed across multiple database instances. Service data is scattered in various places. How to collect and query these data conveniently has become a big problem for users.

Traditional solutions to this kind of problem require users to gather data from all instances to the same place in advance, and then perform offline query analysis. To this end, users need to maintain data migration links, purchase machine resources to store the collected data, and pay a lot of resources and operation and maintenance costs. In addition, data migration also means data delay. The newly generated online business data can only be analyzed “for a while” or even “for a day”, which cannot meet the real-time demand.

In order to solve the problem of timely query across database instances, Ali Cloud DMS (data management) launched the cross-instance query service.

What is cross-instance query service

Cross-instance query service provides timely associated query service for online heterogeneous data sources in different environments. Regardless of whether the database is MySQL, SQLServer, PostgreSQL or Redis, and regardless of which ali Cloud region the database instance is deployed in, the associated query between these database instances can be realized only through a SINGLE SQL without data collection.

In addition, database instances can also be deployed in different resource environments. In addition to RDS, we also support self-built databases on ECS, self-built databases with public IP, self-built databases on user’s local IDC, and even databases deployed in other cloud vendors.

features

Timely query online data

At present, most data analysis solutions need to export the data of OLTP database to offline data system for analysis, but this solution is difficult to meet the requirements of real-time, and there are risks of data loss when exporting data to offline system.

DMS cross-instance query service, without user migration task, directly write a SQL, can realize the direct correlation analysis of multiple online databases. Data synchronization is not required, reducing the complexity of the service architecture and saving users’ budgets and o&M costs for holding offline computing resources.

DBLink

For those familiar with Oracle, we can create a DBLink on the currently logged Oracle database to point to another remote Oracle database table. In the cross-instance query service, we redefined the concept of DBLink, which is a virtual connection to any database instance of the user and an alias of the database instance. For example, for MySQL, DBLink corresponds to IP /port one-to-one. With DBLink, SQL access to any data source can be achieved.

Support for multiple relational databases

Currently, it supports MySQL, SQLServer, PostgreSQL and other relational databases.

Supports SQL access to NoSQL

In addition to relational databases, cross-instance queries also support SQL access to NoSQL databases such as Redis. Because SQL syntax is supported, associated queries between RDBMS and NoSQL can also be implemented. Yes, you read that right, a single SQL can implement associated query between MySQL and Redis.

Supports cross-geographic and hybrid cloud queries

When an enterprise reaches a certain stage of development and the number of users and services keeps increasing, the original capacity of a single room cannot meet the requirements of service development. In combination with factors such as DISASTER recovery (Dr) and high availability (HA), the enterprise usually chooses cross-region deployment, also known as unitary deployment. At the same time, many enterprises also need to expand their business overseas and provide better experience for foreign users through local deployment. The problem with this kind of horizontal split is how to make a unified summary and associated query for the global business data.

With the DMS cross-instance query service, no matter which region of Ali Cloud your database instance is deployed in, you can query data of all regions in a unified manner without cross-region data migration.

In addition to Ali Cloud RDS, we also support various databases that users deploy on Ali Cloud ECS. Moreover, if your database is deployed in the local IDC room, or even other cloud vendors, cross-instance query services can be used to achieve cross-instance associated query in these hybrid cloud scenarios.

Cross-instance data import and export

Insert into b select * from a; As you know, this SQL statement can export the data from table A to table B, but if table A and table B are not on the same database instance, this SQL statement does nothing.

The emergence of cross-instance query service breaks the boundary of data import and export between instances. It can export data from a table of one MySQL instance to a table of another MySQL instance. You can also export the results of associated queries from SQLServer tables and PostgreSQL tables to MySQL instance tables.

Compatible with standard SQL

Cross-instance queries can be implemented using standard SQL statements. Cross-instance query services are also highly compatible with MySQL, supporting the MySQL protocol, as well as a variety of common functions and syntax. You can connect to the cross-instance query service through the JDBC/ODBC driver; You can also use various MySQL GUI tools to manage various data sources; Of course, you can also use it directly on the DMS cross-instance query console.

Serverless architecture

Cross-instance query is a serverless online database associated query service. Users do not need to pre-purchase computing resources, no maintenance resources, no operation and upgrade costs, anytime, anywhere.

High performance low latency

The underlying cross-instance query service is based on the powerful MPP computing engine, which continuously optimizes SQL queries, including pushdown, join algorithm, execution plan cache, Meta cache, local scheduling, connection pooling and other technologies. Currently, single-table queries and multi-table associative queries across instances can be performed in milliseconds.

The technical architecture

Users can directly use the MySQL JDBC driver to connect to the cross-instance query service in the application program. Of course, we also provide a console page that can be executed by typing SQL directly.

Application scenarios

Vertically split cross-database queries

Some electricity company will original members, orders, goods and other data are stored in a database instance, but business development rapidly, traffic speed growth, lead to database capacity and performance bottlenecks, so users decide to vertical resolution of architecture, the vertical members, goods, order data split to three database instance. At this time, the business needs to show the sales order quantity of a certain category of goods, and the query originally in the same database should be changed into the query across two database instances. How to carry out associated query in business?

The first approach that comes to mind is to refactor the existing business code, query the data from the two databases separately, and then join the business code. The problem is that if this solution is adopted, there are so many queries in the business that it is extremely difficult to split and operate. There is no very efficient method for cross-library join operation, which requires iterative query from various business libraries, and the query efficiency will also have a certain impact.

What we found was that users encountered a typical cross-instance query problem. At present,Ali Cloud DMS cross-instance query serviceThe ability to query SQL across multiple database instances is already supported, and users can solve the above problems with a single SQL. It can not only meet the core demand of “cross-library Join”, but also greatly simplify users’ technical solutions.

Horizontal split cross-database queries

A hotel has corresponding stores in multiple cities, and its database will be deployed separately in each city. There are demands for global data query in multiple cities in business. Similarly, now more and more Internet industry began to introduce the unitary architecture, in each city will be deployed separately machine room and database, multi-unit data summary query demand is also increasingly strong.

To meet the cross-cell and cross-region database query requirements on the cloud, the cross-instance query service Bridges the barriers between regions. Users can meet these requirements with a single SQL query.

Associated query of heterogeneous database

A company is migrating business data from SQLServer to MySQL for cost and future scalability. During this period, there must be some business subsystems still on SQLServer, and some other business subsystems have all been migrated to MySQL. At this time, the joint query between the two subsystems can be realized by the cross-instance query service of ali cloud. In addition, during migration, you can also verify data consistency between SQLServer and MySQL by querying services across instances.

Associated query in hybrid cloud scenarios

A game company, due to various reasons, at the same time keep ali Cloud, Tencent, UCloud, AWS and other environment of the database instance, at the same time in their own self-built IDC also deployed part of the database. The data of the business is so scattered, it is only to count the number of online users of the current game, and they should go to each environment to query again and do summary again. With the help of Aliyun cross-instance query service, a SINGLE SQL can realize the associated query between cloud vendors and IDC.

summary

Aliyun DMS (Data management) cross-instance query service not only covers the scenario of associated query of heterogeneous data sources, but also solves the problem of cross-region and cross-cloud database associated query. Not only that, we have greatly optimized query performance so that most queries can be completed in milliseconds. Users can implement cross-query across instances through standard SQL without data aggregation.