At the “Financial Intelligence” special session of the Computing Conference, held in Cloud Town, Hangzhou, on September 27, AI expert Professor Song Le described how machine learning tailored to finance is developed and applied at Ant Financial. Professor Song Le is a researcher in Ant Financial’s Artificial Intelligence Department, and a tenured associate professor and deputy director of the Machine Learning Center at Georgia Tech. He is also a board member of the International Association for Machine Learning and an area chair for several top international conferences.

Machine learning has permeated Ant Financial’s business scenarios and drives the development of its various businesses. At the conference, Professor Song Le gave a detailed introduction to machine learning with financial characteristics, focusing on three technologies: a deep learning system for massive graph data, an automatic machine learning system, and a multi-agent adversarial reinforcement learning system.

The following is a transcript of the speech:

Deep learning system for massive graph data

What sets financial scenarios apart from the rest of the Internet is that their data forms a vast financial network in which money moves between individuals. Many types of nodes take part in this flow of capital: role nodes such as users, merchants and companies, virtual nodes such as accounts, device nodes such as Wi-Fi facilities and terminals, and physical nodes such as locations. The relationships between these nodes and the types of information they exchange also differ, so together they form a huge graph. Modeling this graph with machine learning and discovering the useful information in it is a very complex process.

Anyone familiar with machine learning knows that modeling data first requires a vector representation of it. Models such as logistic regression and decision trees, for example, take vectors as input. Graph data, however, does not come as vectors. It is heterogeneous and irregular: each node has a different number of neighbors and a different pattern of connections. A platform is therefore needed that transforms graph data into vector representations, on top of which a variety of machine learning models can then be built. The figure below shows a general graph vector representation framework.

Once the representation of graph data has been learned, it can be used in many applications, such as recommendation and decision-making, as well as in some generative models. A deep learning model that is currently popular in academia is the graph convolutional neural network. It builds a deep neural network by propagation over the graph: each node and edge passes information to its neighbors through parameterized neural network functions.
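The propagation idea can be illustrated with a minimal sketch of a single graph-convolution-style layer. It assumes NumPy, a tiny undirected toy graph, and simple mean aggregation over each node's neighborhood; the function name `gcn_layer` and all the data are hypothetical, and this is not the model used at Ant Financial.

```python
import numpy as np

def gcn_layer(adj, features, weight):
    """One propagation step: average each node's neighborhood
    (including the node itself), apply a shared linear map, then ReLU."""
    n = adj.shape[0]
    adj_self = adj + np.eye(n)                  # add self-loops
    degree = adj_self.sum(axis=1, keepdims=True)
    norm_adj = adj_self / degree                # row-normalize: mean over neighbors
    return np.maximum(norm_adj @ features @ weight, 0.0)

# Toy graph with edges 0-1 and 1-2 (assumed data).
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]], dtype=float)
features = np.array([[1.0, 0.0],
                     [0.0, 1.0],
                     [1.0, 1.0]])

rng = np.random.default_rng(0)
weight = rng.normal(size=(2, 4))                # shared parameters for every node

embeddings = gcn_layer(adj, features, weight)
print(embeddings.shape)  # (3, 4)
```

Because the same `weight` matrix is applied at every node, the layer works for any number of neighbors, which is what lets irregular graph data be turned into fixed-length vectors.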

Ant Financial’s graph data is huge and complex, involving tens of billions of nodes and hundreds of billions of edges. Machine learning on data at this scale needs a well-designed system architecture and platform. The first questions are how to store the graph data so that it supports fast queries and fast inference, and how to organize the data at the logical level, whether it is a social network, a money transfer network, or a media network. On top of the storage and logical organization, we need general-purpose operators, including graph sampling, random walks and message propagation. Building on these operators, we can implement a variety of graph deep learning models, including representation learning models based on both unsupervised and supervised learning. Once representations are learned, machine learning can be used to predict node and edge types, temporal behavior, and multiple other targets. Based on these predictive models, we can support all kinds of upper-layer financial businesses through offline or online scoring.
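As an illustration of one such operator, here is a minimal single-machine sketch of a uniform random walk. The adjacency dictionary, node names and the function `random_walk` are assumptions made for the example; the platform's distributed implementation is not described in the talk.

```python
import random

def random_walk(neighbors, start, length, rng):
    """Walk up to `length` nodes, moving to a uniformly chosen neighbor each step."""
    walk = [start]
    while len(walk) < length:
        candidates = neighbors.get(walk[-1], [])
        if not candidates:          # dead end: stop early
            break
        walk.append(rng.choice(candidates))
    return walk

# Toy user-merchant transfer graph (assumed data).
neighbors = {
    "user_a": ["merchant_x"],
    "merchant_x": ["user_a", "user_b"],
    "user_b": ["merchant_x"],
}

rng = random.Random(42)
walks = [random_walk(neighbors, node, 5, rng) for node in neighbors]
for walk in walks:
    print(" -> ".join(walk))
```

Sequences like these can be treated as "sentences" and fed to a Word2Vec-style trainer to obtain node embeddings without labels.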

Given the massive scale of Ant Financial’s business data, several technical difficulties remain beyond the platform architecture design described above. With tens of billions of nodes and hundreds of billions of edges, we need to consider how to quickly query nodes on the graph and extract the subgraphs around them. Our graph storage systems GraphFlat and PHStore are used for this. On top of them, machine learning algorithms such as random sampling algorithms need to be designed so that graph data becomes sparse or dense matrix operations suitable for distributed computation on GPUs and CPUs.
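A minimal sketch of this step, assuming an in-memory adjacency dictionary rather than GraphFlat or PHStore (whose APIs are not described here): sample a bounded neighborhood around a seed node, then pack it into a dense matrix. The names `sample_subgraph` and `to_dense_adjacency` and the `fanout` parameter are hypothetical.

```python
import random
import numpy as np

def sample_subgraph(neighbors, seed, hops, fanout, rng):
    """Collect nodes within `hops` of `seed`, keeping at most `fanout`
    randomly chosen neighbors per expanded node."""
    nodes, seen, edges = [seed], {seed}, []
    frontier = [seed]
    for _ in range(hops):
        next_frontier = []
        for node in frontier:
            candidates = neighbors.get(node, [])
            picked = rng.sample(candidates, min(fanout, len(candidates)))
            for nbr in picked:
                edges.append((node, nbr))
                if nbr not in seen:
                    seen.add(nbr)
                    nodes.append(nbr)
                    next_frontier.append(nbr)
        frontier = next_frontier
    return nodes, edges

def to_dense_adjacency(nodes, edges):
    """Turn the sampled subgraph into a symmetric dense 0/1 matrix."""
    index = {node: i for i, node in enumerate(nodes)}
    adj = np.zeros((len(nodes), len(nodes)))
    for src, dst in edges:
        adj[index[src], index[dst]] = 1.0
        adj[index[dst], index[src]] = 1.0
    return adj

# Toy graph (assumed data).
neighbors = {0: [1, 2, 3], 1: [0, 4], 2: [0], 3: [0, 5], 4: [1], 5: [3]}
nodes, edges = sample_subgraph(neighbors, seed=0, hops=2, fanout=2,
                               rng=random.Random(0))
adj = to_dense_adjacency(nodes, edges)
print(nodes, adj.shape)
```

Capping the fan-out bounds the size of every sampled subgraph, so batches of them fit into fixed-size matrix operations regardless of how many neighbors a node has in the full graph.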

In addition, supporting a variety of financial scenarios requires modeling networks with different structures. The networks in financial scenarios may be homogeneous networks without attributes; heterogeneous networks, which are common in risk control, such as the fund transfer relationships between users and merchants; or networks with attributes. Different types of network relationships call for different algorithms and models, but whatever the type of network, a single platform can be used for modeling. So far we have built a library of algorithms for multiple types of graph data. It includes:

  • XGrep, for attribute-free networks, which can train on billions of nodes, billions of edges and hundreds of billions of samples, and comes with a distributed random walk framework and a distributed Word2Vec training framework;
  • GeniePath, for attributed networks, a graph neural network with adaptive depth and breadth whose performance leads the industry;
  • HeGNN and IGNN, for heterogeneous networks, which use a hierarchical attention mechanism that provides interpretability for financial use cases and can automatically learn the rich semantics of heterogeneous information;
  • KGNN, for knowledge graphs, which supports knowledge graph representation learning by combining a graph neural network with a graph model.

In many cases financial scenarios require interpretability, so we need to explain the trained deep model and find out which edges or nodes in the network affect the decisions of the whole risk control system. To this end we have defined a variety of models. GeniePath can automatically search the neighbors of a node in the deep learning network to see which of them influence the risk assessment of the current node. HeGNN and IGNN can take into account the influence of different levels of the network, including relatively coarse high-level influence, as well as the influence of different dimensions of the network.
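To show how attention weights can expose which neighbors matter, here is a generic dot-product attention sketch in NumPy. It is not GeniePath, HeGNN or IGNN, whose internals are not given in the talk; the function `neighbor_attention` and the feature vectors are assumptions for illustration.

```python
import numpy as np

def neighbor_attention(target, neighbor_feats):
    """Score each neighbor against the target node, softmax the scores,
    and return both the weights and the weighted neighbor summary."""
    scores = neighbor_feats @ target            # dot-product relevance
    scores = scores - scores.max()              # numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    aggregated = weights @ neighbor_feats
    return weights, aggregated

# Assumed toy features for one node and its three neighbors.
target = np.array([1.0, 0.0])
neighbor_feats = np.array([[2.0, 0.0],
                           [0.0, 2.0],
                           [1.0, 1.0]])

weights, aggregated = neighbor_attention(target, neighbor_feats)
most_influential = int(np.argmax(weights))
print(weights.round(3), most_influential)
```

The weights sum to one, so they can be read directly as each neighbor's share of influence on the node's representation, which is the kind of signal an analyst can inspect when reviewing a risk decision.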

To sum up, a highly available graph deep learning platform requires a logically layered architecture, and each module within it involves many technical points, spanning systems engineering, high-performance computing, and models and algorithms.

Here are two concrete deployment examples. First, we use graph deep learning in a marketing scenario: from the purchase histories of users and merchants we predict sensitivity to red envelope amounts, which enables personalized pricing of the red envelopes that merchants hand out, helps merchants allocate red envelope amounts better and improves the efficiency of marketing funds. This method reduced marketing costs by 8%. Second, we have applied graph deep learning and knowledge graphs to enterprise credit, increasing credit lines by tens of billions. The graph deep learning platform also has various applications in payment, lending, insurance, wealth management and other scenarios.

In Internet finance, graph neural networks are a very useful new technology, and they are one of the technical directions that Ant Financial is developing vigorously.

Automatic machine learning system

Within Ant Financial’s application scenarios, all kinds of machine learning algorithms are trained every day, producing tens of thousands of models, yet the time and energy that algorithm engineers can spend on training models is limited. Because the products are complex, engineers must not only choose the deep learning network structure and other hyperparameters, but may also need to add business knowledge along the way. As data volumes grow, models must be adjusted within a relatively short time, so the time requirements for model training keep getting stricter. Traditional manual tuning can hardly meet our efficiency requirements for model updates and iteration.

To solve this problem, we built the AutoML computing platform, an automated modeling tool that lets the platform and algorithm engineers work together to accelerate the modeling and optimization of machine learning models. On top of the underlying infrastructure we implemented a number of algorithms for automated feature engineering, hyperparameter search, network structure search, and meta-learning, all of which reduce the cost of developing new models.
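Hyperparameter search, one of the capabilities listed above, can be sketched in its simplest form as random search. The search space, the stand-in objective and the function names below are assumptions for illustration; the talk does not say which search algorithms the AutoML platform uses.

```python
import random

def random_search(objective, space, trials, rng):
    """Sample `trials` configurations uniformly and keep the best-scoring one."""
    best_cfg, best_score = None, float("-inf")
    for _ in range(trials):
        cfg = {name: rng.choice(values) for name, values in space.items()}
        score = objective(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score

def toy_objective(cfg):
    """Stand-in for a validation metric; a real run would train and evaluate a model."""
    return -abs(cfg["learning_rate"] - 0.01) * 10 - abs(cfg["layers"] - 3)

# Hypothetical search space.
space = {
    "learning_rate": [0.001, 0.01, 0.1],
    "layers": [1, 2, 3, 4],
}

best_cfg, best_score = random_search(toy_objective, space, trials=200,
                                     rng=random.Random(0))
print(best_cfg, best_score)
```

Replacing `toy_objective` with a function that trains a model and returns its validation score turns the same loop into a basic automated tuning job.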

One specific case is Autonet, a deep neural network algorithm widely used in the company’s recommendation scenarios. The basic idea is to automatically assemble small deep neural network modules that have proven successful in the past into a new network structure, in order to find a more efficient model. This both constructs the DNN structure automatically and improves the final modeling result. With the same resources, producing a model takes roughly the same time as designing one manually. It has also worked well in a new-user acquisition scenario, where the sell-through rate increased by 14%.

AutoML has also been deployed in a variety of other scenarios, all of which use the automatic machine learning platform’s network structure search, hyperparameter search, meta-learning, and some of its end-to-end solutions. Machine learning models for all kinds of business scenarios can be optimized through this platform to improve efficiency. Some scenarios that span business units even use transfer learning to speed up the machine learning modeling process.

Multi-agent adversarial reinforcement learning system

The two points above show how our horizontal technologies support machine learning models for various scenarios from the ground up. Next, we introduce the application and implementation of the multi-agent reinforcement learning system at Ant Financial. In real financial scenarios the nodes involved are not static. Users and merchants, for example, compete, cooperate and confront each other as they interact. In these scenarios we therefore need to combine adversarial learning with multi-agent reinforcement learning. Anti-fraud and financial payment scenarios, for example, can be modeled with multi-agent reinforcement learning.

Many traditional reinforcement learning methods, however, assume that a simulator exists and then interact with it continuously to optimize the policy, as when playing Go. In financial scenarios the simulator in multi-agent reinforcement learning is not a static object. It may be a person or an institution, so there is no clear, readily available simulator, and we do not know the reward function or loss function behind each agent’s behavior. It is therefore not feasible to apply traditional reinforcement learning directly to financial scenarios. A simulator and its reward function must first be learned from existing financial data or user behavior data, and reinforcement learning can then be carried out on that basis.

We have therefore built a multi-agent reinforcement learning platform. It uses imitation learning to learn the behavioral characteristics of users and their reward functions, and various kinds of machine learning are then performed on that basis.
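The simplest form of imitation learning is behavior cloning, sketched below by counting logged state-action pairs. This only illustrates learning a user behavior model from data; it does not recover a reward function, and the platform's actual method is not specified in the talk. The log entries and the function `fit_behavior_model` are hypothetical.

```python
from collections import Counter, defaultdict

def fit_behavior_model(logged_steps):
    """Estimate P(action | state) from logged (state, action) pairs."""
    counts = defaultdict(Counter)
    for state, action in logged_steps:
        counts[state][action] += 1
    return {
        state: {action: n / sum(actions.values()) for action, n in actions.items()}
        for state, actions in counts.items()
    }

# Assumed behavior log: (page the user is on, what the user did).
logs = [
    ("home", "click"),
    ("home", "click"),
    ("home", "skip"),
    ("checkout", "pay"),
]

policy = fit_behavior_model(logs)
print(policy["home"])
```

A learned model of this kind can stand in for the missing simulator: a recommendation policy is tried against the estimated user behavior instead of against live users.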

This is a concrete application of multi-agent reinforcement learning in a recommendation system. In many cases, once a user logs into a system, the system observes, analyzes and makes recommendations to that user over a long period. A good recommendation system does not model each user action as an independent prediction problem but treats the whole interaction as a reinforcement learning problem. In this way we can optimize for the user’s long-term reward and preferences rather than for short-term recommendations, so that the user stays interested in the recommended content over the long run and value is created.
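The difference between short-term and long-term optimization can be made concrete with the discounted return used in reinforcement learning. The two reward sequences and the discount factor below are invented for illustration only.

```python
def discounted_return(rewards, gamma):
    """Sum of rewards where a reward t steps ahead is weighted by gamma ** t."""
    total = 0.0
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total

# Assumed per-visit engagement for two recommendation strategies.
clickbait = [1.0, 0.0, 0.0, 0.0]   # strong first click, then the user loses interest
useful = [0.5, 0.5, 0.5, 0.5]      # weaker first click, but the user keeps coming back

gamma = 0.9
clickbait_value = discounted_return(clickbait, gamma)
useful_value = discounted_return(useful, gamma)
print(clickbait_value, round(useful_value, 4))
```

A model that predicts only the next click prefers the first strategy (1.0 versus 0.5 on the first step), while the long-term objective prefers the second.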

This is our ICML 2019 paper. We combine reinforcement learning theory with adversarial learning, and use this method to learn users’ click behavior and the reward function behind it. With a user behavior model and a reward model in hand, large-scale reinforcement learning can be carried out.

Reinforcement learning has various applications at Ant Financial, and we are still exploring them. We welcome more exchanges between academia and industry so that we can innovate together and move the field forward.