Over the past 15 years, Ant Financial has reshaped payments and changed lives, providing technology-backed services to more than 1.2 billion people around the world. At the 2019 Hangzhou Computing Conference, Ant Financial shared the technology it has accumulated over those 15 years, as well as its future-oriented fintech innovations. We have compiled some of the best talks and will continue to publish them on the "Ant Financial Technology" public account; this article is one of them.

For more than a decade, Ant Financial has been working on one problem: using technology to reshape financial services. Solving it involves two technical directions. The first is moving money from one account to another — how to handle the scale, security, and availability problems that arise along the way. Our answer is a highly available, multi-site disaster-tolerant distributed architecture. The second, with the arrival of the new era of digital finance, is how to make more and better use of data to drive the business, that is, data intelligence technology. This article shares some of Ant's progress in data intelligence, along with our thinking.

First, let's look at the requirements of financial data intelligence and how it differs from traditional big data:

  • Real-time requirements are high: real-time data is growing at more than twice the rate of other data, and more and more decisions are made online, rather than computed offline and then deployed;
  • Computing scenarios are complex and diverse: what used to be simple aggregation has gradually evolved into decisions driven by rules, graphs, machine learning, and more, so the forms of computation keep multiplying;
  • Data links are long, and R&D and debugging efficiency is low: full-link data development passes through a dozen systems end to end, which poses a great challenge to overall data engineering;
  • Computing and storage must be highly available, including cross-city disaster recovery and highly reliable computing services;
  • Data security, regulatory compliance, and risk control demand strict data security and privacy protection, especially compliance at the regulatory level.

Over the past decade, computing technology has evolved from batch computing for large-scale data warehouses, to real-time and streaming computing, to interactive analysis. Each step solves some problems while creating new challenges. For example, multiple computing modes mean duplicated R&D effort; multiple systems mean multiple copies of data and higher storage costs; and differing disaster-recovery and data-security requirements add complexity.

To address the problems of computing diversity, we need a more open computing architecture.

Ant Financial open computing architecture

Building one system to solve every problem is a natural instinct for engineers, but the difficulty lies in defining that system's boundary. We believe computing is so closely tied to the business itself that changing business needs will keep demanding new computing patterns. So our practice is an open computing architecture: unified at certain layers, while accommodating different computing models.

The first is a unified storage layer, which brings the various storage systems together so that data can be shared. Storage can then be custom-optimized for each engine's computing requirements, and internal data can be reclaimed automatically.

The second is unified data security standards: unified metadata management and access on top of the unified storage, connected data lineage, a unified authentication and data-access permission system, and unified data security levels and privacy protection.

The third is a unified programming model, based on standard SQL with extensions. Business developers program against data abstracted from the layers below: when programming in a data-oriented way, they need not care whether the mode underneath is interactive analysis or something else, nor how the data is stored. This improves efficiency when developing data pipelines and writing business logic. We have done a lot of work in this area, and the goal is to shrink SQL-based development by two orders of magnitude, from tens of thousands of lines of code to hundreds.
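As a minimal illustration of data-oriented programming against standard SQL (the table and data here are hypothetical, and the in-memory SQLite engine merely stands in for whatever engine sits below the unified layer):

```python
import sqlite3

# Hypothetical payments table; under a unified programming model the same
# SQL would run unchanged whether the engine beneath is batch, streaming,
# or interactive.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (account TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO payments VALUES (?, ?)",
    [("a", 10.0), ("a", 5.0), ("b", 7.5)],
)

# One declarative statement replaces the imperative read/group/sum code
# that would otherwise be rewritten for every engine.
rows = conn.execute(
    "SELECT account, SUM(amount) FROM payments "
    "GROUP BY account ORDER BY account"
).fetchall()
print(rows)  # [('a', 15.0), ('b', 7.5)]
```

The code-size reduction the article describes comes from exactly this shift: the developer states *what* result is wanted, and each engine decides *how* to produce it.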

The result is an architecture that can be extended with new technologies.

AI engine under open computing architecture

AI computing is an important capability in open architecture, and we need to build more flexible and intelligent AI engines.

At present, most companies' AI systems follow roughly this architecture: a data warehouse or cluster does data cleaning and preprocessing; a table is then extracted and, together with annotations, trained on a model platform; the trained model is finally deployed online for prediction. Because the whole process crosses multiple systems, the data may in fact be stored in multiple copies, and moving models around takes so much time that true real-time performance is hard to achieve. Users often have to work across multiple platforms and components to meet their needs.

An open architecture lets you plug in an AI engine, and we have done work at both the SQL layer and the deep learning engine layer. With SQLFlow, you describe your application's requirements in SQL, and the underlying machine learning tasks are generated directly from that SQL to train the model.
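A toy sketch of the idea: SQLFlow extends SQL with a training clause, and the compiler splits a statement into the data query (handed to the SQL engine) and the training spec (compiled into an ML job). The splitter below is purely illustrative — the real SQLFlow compiler is far more complete — though the statement follows the `TO TRAIN` syntax documented at sqlflow.org:

```python
import re

# A SQLFlow-style extended statement: standard SELECT plus a training clause.
stmt = (
    "SELECT * FROM iris.train "
    "TO TRAIN DNNClassifier "
    "WITH model.hidden_units = [10, 20] "
    "LABEL class INTO my_dnn_model"
)

def split_extended_sql(sql):
    """Split an extended statement into the data query (sent to the SQL
    engine) and the training clause (turned into a machine learning task)."""
    m = re.search(r"\bTO TRAIN\b", sql)
    if not m:
        return sql.strip(), None          # plain SQL, no training step
    return sql[:m.start()].strip(), sql[m.start():].strip()

query, train_clause = split_extended_sql(stmt)
print(query)         # SELECT * FROM iris.train
print(train_clause)  # TO TRAIN DNNClassifier WITH ... INTO my_dnn_model
```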

ElasticDL, which we open-sourced on September 11th, is an elastic-scheduling AI engine based on TensorFlow. It keeps AI training running efficiently even when resources are scarce or failures occur, and it makes training much easier by training Keras models directly from the command line. With these tools, we hope to simplify the training and use of AI.
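The fault-tolerance idea behind elastic scheduling can be simulated in a few lines. This is a toy model of the pattern, not ElasticDL's actual implementation: a master hands out minibatch tasks, and tasks held by a preempted worker are simply re-queued instead of failing the whole job.

```python
from collections import deque

tasks = deque(range(8))        # 8 minibatch task ids awaiting training
in_flight = {}                 # worker -> task currently being processed
done = []

def dispatch(worker):
    if tasks:
        in_flight[worker] = tasks.popleft()

def finish(worker):
    done.append(in_flight.pop(worker))

def preempt(worker):
    # Elastic scheduling: the lost task goes back on the queue,
    # and training continues with whatever workers remain.
    tasks.append(in_flight.pop(worker))

dispatch("w0"); dispatch("w1")
finish("w0")                   # w0 completes its task
preempt("w1")                  # w1 is reclaimed; its task is re-queued
while tasks:                   # remaining work is drained by w0 alone
    dispatch("w0"); finish("w0")

print(sorted(done))  # [0, 1, 2, 3, 4, 5, 6, 7] -- nothing was lost
```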

To learn more about SQLFlow and ElasticDL, visit their open-source home pages, sqlflow.org and elasticdl.org.

In an open architecture there is in fact no need to swap out engines wholesale. The general pattern is that when a new engine or tool becomes available, it can be plugged in and used directly, and then optimized as it is used.

Financial-grade graph computing under open computing architecture

Financial scenarios rely heavily on graph data, so we need strong graph computing capability. How, then, can an open computing architecture support graph computing?

The figure above shows how Ant's graph computing developed. Four years ago we started with graph data products; then came an iterative computing engine for offline full-graph computation, then an engine fusing streaming and graph computation, then a high-speed graph cache, and now a one-stop graph platform that pulls together everything graph-related.

The first is GeaBase, a financial-grade distributed graph database. The problem it solves is providing strongly consistent, high-capacity storage for large volumes of interrelated graph data. Its biggest difference from many existing graph databases is that they gather all the data to one place to compute — the simplest approach, but one that creates performance bottlenecks. GeaBase instead pushes the computation down to the workers to achieve distributed high performance. It also lets users choose the consistency level their business requires.
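GeaBase's internals are not public, so the following is only an illustrative sketch of the general principle of pushing computation to the data: each "worker" owns a partition of vertices and answers neighbor queries locally, instead of shipping all edges to one coordinator.

```python
# Hypothetical edge list and partitioning scheme, for illustration only.
EDGES = [(1, 2), (1, 3), (2, 3), (4, 1)]
NUM_WORKERS = 2

def owner(vertex):
    return vertex % NUM_WORKERS     # simple hash partitioning

# Each worker stores only the out-edges of the vertices it owns.
partitions = {w: {} for w in range(NUM_WORKERS)}
for src, dst in EDGES:
    partitions[owner(src)].setdefault(src, []).append(dst)

def neighbors(vertex):
    # The query is routed to the owning worker and evaluated there,
    # so no global data collection is needed.
    return partitions[owner(vertex)].get(vertex, [])

print(neighbors(1))  # [2, 3]
print(neighbors(4))  # [1]
```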

Next is large-scale graph computation, which uses an adaptive threshold partitioning strategy to reduce resource usage. Much graph computation requires loading the graph into memory and then iterating, and for very large graphs the memory demand is extremely high, so we made optimizations to lower resource consumption. We also flexibly support more graph algorithms, as well as very large-scale, efficient graph relationship mining, which has landed in internal risk-control scenarios.
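The article does not detail the adaptive threshold strategy, so this sketch only illustrates the general idea behind threshold-based partitioning: instead of loading the whole graph at once, edges are split into chunks that each fit under a memory budget, and the iteration processes one chunk at a time.

```python
def partition_edges(edges, max_edges_in_memory):
    """Split an edge stream into chunks no larger than the memory budget."""
    chunk, chunks = [], []
    for edge in edges:
        chunk.append(edge)
        if len(chunk) >= max_edges_in_memory:   # budget reached: cut here
            chunks.append(chunk)
            chunk = []
    if chunk:
        chunks.append(chunk)
    return chunks

# A toy chain graph of 10 edges, partitioned under a budget of 4.
edges = [(i, i + 1) for i in range(10)]
chunks = partition_edges(edges, max_edges_in_memory=4)
print([len(c) for c in chunks])  # [4, 4, 2]
```

An adaptive variant would tune the budget per iteration from observed memory pressure rather than fixing it up front.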

Then there is online stream-graph fusion: Ant developed the industry's first real-time multi-mode fusion computing framework. The motivation is that in many business situations, data keeps arriving while a great deal of graph computation must run at the same time, with results emitted as soon as they are computed. This is still a relatively cutting-edge topic in the industry; we can run many layers of computation simultaneously over a large number of large graphs.
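A minimal sketch of the stream-plus-graph pattern (not Ant's framework — names and structure here are invented for illustration): each incoming event updates the graph immediately, and a bounded-depth traversal runs in the same step, so the answer reflects the event that just arrived.

```python
from collections import defaultdict

graph = defaultdict(set)            # adjacency sets, updated incrementally

def reachable(start, max_hops):
    """Vertices reachable from `start` within `max_hops` edges."""
    seen, frontier = {start}, {start}
    for _ in range(max_hops):
        frontier = {n for v in frontier for n in graph[v]} - seen
        seen |= frontier
    return seen - {start}

def on_event(src, dst, max_hops=2):
    graph[src].add(dst)                 # 1. incremental graph update
    return reachable(src, max_hops)     # 2. graph computation in the same step

on_event("a", "b")
print(on_event("b", "c"))          # {'c'}
print(on_event("a", "d"))          # the edge just added is already visible
```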

Driven by the strong demand for graph computation, we built a high-performance graph cache, whose key techniques are a collision-free hash function and compression of the graph data structure. You can see the effect in the figure below: we can compress the data to as little as one-fifth of its original size, with performance 2-5 times that of the best comparable products in the industry.
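One standard way to compress a graph structure — whether the ant graph cache uses exactly this layout is not stated in the article — is the CSR (compressed sparse row) form: two flat arrays replace per-vertex containers, removing pointer and container overhead.

```python
# Toy adjacency map with vertex ids 0..3, for illustration.
adjacency = {0: [1, 2], 1: [2], 2: [], 3: [0]}

# CSR layout: `targets[offsets[v]:offsets[v+1]]` is v's neighbor list.
offsets, targets = [0], []
for v in range(len(adjacency)):
    targets.extend(adjacency[v])
    offsets.append(len(targets))

def neighbors(v):
    return targets[offsets[v]:offsets[v + 1]]

print(offsets)       # [0, 2, 3, 3, 4]
print(targets)       # [1, 2, 2, 0]
print(neighbors(0))  # [1, 2]
```

The collision-free (perfect) hash mentioned in the text would play the complementary role of mapping arbitrary vertex keys onto the dense 0..n-1 ids that CSR indexing requires.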

With so many systems, the problem we encountered was that a single scenario required development against multiple engines. So we built a one-stop platform, AntGraph, to streamline the whole process from development and debugging to production. We are putting all access behind one Graph SQL, with some additional research of our own — it is debatable whether SQL is the best fit for graphs, but SQL's descriptive power plus some extensions can express what we want.

Having built up these graph computing capabilities, we now have multiple graph engines, and to improve the user experience we unify the upper layer with a SQL-style language. This gives our open computing architecture the power of graph computing.

Converged computing under open computing architecture

After all this development, the open computing architecture contains a large number of computing engines. The upper layer is unified, but that alone is often not the optimal choice. Once we understand the existing computing models well, can we optimize them further? In many cases what users want is computing that combines multiple models — sometimes streaming plus graphs, sometimes streaming plus machine learning plus something else — and the answer we arrived at is a converged computing engine.

At the bottom, converged computing is built on Ray, a new-generation computing engine developed by Ant Financial together with UC Berkeley. Converged computing handles complex scenarios with a single engine, improves efficiency through dynamic computation and state sharing, and unifies development, runtime, and disaster recovery.

Converged computing has landed in several scenarios at Ant, including:

  • Dynamic graph inference, streaming + graph computation: completes 6 layers of iterative queries within 1 second; used for real-time anti-cash-out and fraud detection;
  • Financial online decision-making, streaming + distributed query + online serving: from data production to distributed query within one second; used for financial network monitoring, institution and channel routing, etc.;
  • Online machine learning, streaming + distributed machine learning: second-level latency from data sample to model update; used for intelligent marketing, real-time recommendation, traffic control, etc.
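The online machine learning pattern in the last bullet can be sketched with a toy linear model trained one sample at a time — each arriving event triggers an immediate SGD step, so the model lags the data by an update rather than a batch retraining cycle. This illustrates the pattern only; it is not Ant's implementation.

```python
# Model y ≈ w*x + b, updated online with one SGD step per streamed sample.
w, b, lr = 0.0, 0.0, 0.1

def on_sample(x, y):
    """One stochastic gradient step on squared error for a single event."""
    global w, b
    err = (w * x + b) - y
    w -= lr * err * x
    b -= lr * err

# Simulated event stream drawn from y = 2x; the model converges
# sample by sample, with no offline retraining phase.
for _ in range(200):
    for x, y in [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]:
        on_sample(x, y)

print(round(w, 2), round(b, 2))  # 2.0 0.0
```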

Converged computing does not replace the other engines; rather, it complements them in suitable scenarios. As the above shows, this architecture can accommodate computing engines of various types and functions, which is what the word "open" means here. If a new engine appears in the future, or the business has new data requirements, the corresponding engine can be plugged straight in.

Finally, here is Ant Financial's overall vision for the future of data intelligence. We hope that in the future storage will be interconnected, every engine can be plugged in and converged, and the upper layer will offer a standard data access mode. When all of this comes together, we call it the Big Data Base. We believe that after more than a decade of development, big data will evolve to the next stage, where creating, reading, updating, and deleting data is as simple as with a database.

At another level, Big Data Base also means that machine learning, graph computing, and all kinds of future computing engines can be used easily within one system. Many components of this open computing architecture have already been open-sourced. We are still building this large system; we will share more details in the future, and we hope you will join us in pushing financial data intelligence to the next stage.
