Tencent officially announced the open-sourcing of Plato, its high-performance graph computing framework, on April 14. It is the company's fifth major open source release in just one week.

Compared with other graph computing frameworks worldwide, Plato can handle computation on very large graphs with billions of nodes, shortening computing time from days to minutes. Its overall performance is ahead of other mainstream distributed computing frameworks, and it removes the resource bottleneck of jobs that often needed hundreds of servers: the same computation can now be completed with as few as ten servers.

Yu Donghai, head of Tencent's Plato team, said, “Plato already supports many core businesses within Tencent, including WeChat. In particular, it supports all kinds of computation on Tencent’s super-large-scale social network graph data, solving a problem that other existing frameworks could not: completing the computation within limited resources and time. Plato has not only created great business value for Tencent; once open source, it will also continue to drive the joint development of graph computing technology and the industry, and accelerate innovation.”

The “graph” in graph computing does not refer to ordinary images or photos, but to an abstract data structure that represents the relationships between objects. Graph computing is the process of expressing and solving problems using a graph as the data model. It can fuse data of different sources and types into a single graph for analysis, yielding results that would be difficult to find by analyzing each source independently. As a result, graph computing has become a crucial data analysis and mining tool in social networks, recommendation systems, network security, text retrieval, biomedicine and other fields.
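
To make the idea of fusing sources into one graph concrete, here is a minimal Python sketch. It is an illustration only and has nothing to do with Plato's actual API; the function names and the sample data (`friendships`, `payments`) are hypothetical.

```python
from collections import defaultdict

def build_graph(*edge_lists):
    """Merge several edge lists into one undirected adjacency map."""
    graph = defaultdict(set)
    for edges in edge_lists:
        for u, v in edges:
            graph[u].add(v)
            graph[v].add(u)
    return graph

def reachable(graph, start):
    """Return every node reachable from start (iterative depth-first search)."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return seen

# Hypothetical records from two independent data sources.
friendships = [("alice", "bob"), ("bob", "carol")]
payments = [("carol", "dave")]

# Fusing both sources into one graph exposes a link between alice and dave
# that neither source shows on its own.
graph = build_graph(friendships, payments)
linked = reachable(graph, "alice")
```

Analyzed separately, neither data set connects "alice" to "dave"; the connection only appears once both sets of edges live in the same graph.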

Plato is a high-performance graph computing framework developed in-house by TGraph, Tencent's internal graph computing team, which pooled resources from across the company. It is named in honor of Plato, the ancient Greek philosopher and mathematician. Tencent Cloud's big data team is currently packaging Plato as a product that will be open to all developers.

Plato’s computing performance is extremely strong: 1 to 2 orders of magnitude higher than Spark GraphX, a leading graph computing framework on the market. It reduces the running time of algorithms from days to minutes, a performance improvement of dozens of times, and marks graph computing's entry into the minute-level era. Another major advantage is memory consumption: Plato uses far less memory than mainstream graph computing frameworks, 1 to 2 orders of magnitude less than Spark GraphX. A small or medium-sized cluster of about 10 servers is therefore enough to complete computation on super-large graphs, where hundreds of servers were previously required. Resource pressure and computing costs are greatly reduced.

Plato currently provides two core capabilities: offline graph computation at Tencent's data scale and graph representation learning at Tencent's data scale. Plato also runs natively on resource scheduling platforms such as Kubernetes and YARN, and provides a variety of interfaces to mainstream file systems, giving developers a friendlier operating environment.

In terms of architecture, the core of the Plato framework is an adaptive graph computing engine. It offers several computing modes that developers can choose from according to the type of graph algorithm, including an adaptive computing mode, a shared-memory computing mode and a pipeline computing mode. In addition, well-designed interfaces make it possible to plug in new computing and communication modes.



Figure: Plato overall architecture diagram

On top of the computing engine, Plato provides multi-level interfaces to algorithm designers and specific businesses: from the low-level API, to a library of graph algorithms, to “solutions” (graph toolsets tailored to a particular business). Using these application-layer interfaces and tools, Plato can also combine offline results with other machine learning algorithms to support the various businesses at the top.

It is worth mentioning that many algorithms in Plato’s algorithm library, such as graph features, node centrality metrics, connected components and community detection, have already been open-sourced, and more will follow.
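
For readers unfamiliar with these algorithm families, the following single-machine Python sketch shows two of them: connected components (via union-find) and a simple node centrality metric (normalized degree). It is not taken from Plato's library, which is a distributed framework; the function names and the toy graph are assumptions made for illustration.

```python
from collections import Counter

def connected_components(nodes, edges):
    """Group nodes of an undirected graph into components using union-find."""
    parent = {n: n for n in nodes}

    def find(x):
        # Walk up to the root, halving the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), set()).add(n)
    return list(groups.values())

def degree_centrality(nodes, edges):
    """Degree of each node divided by the maximum possible degree (n - 1)."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    scale = max(len(nodes) - 1, 1)
    return {n: degree[n] / scale for n in nodes}

# Toy graph: two separate groups of nodes.
nodes = ["a", "b", "c", "d", "e"]
edges = [("a", "b"), ("b", "c"), ("d", "e")]

components = connected_components(nodes, edges)
centrality = degree_centrality(nodes, edges)
```

On this toy input the nodes split into two components, and "b" has the highest degree centrality because it touches two of the four other nodes.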

Plato’s high performance, scalability and pluggable design give it a promising future in social networking, recommendation systems, biomedicine and other fields. Examples include regularly ranking the influence of web pages to improve users’ search experience, analyzing the structure of huge social networks to recommend services to users accurately, and using subgraph matching to understand interactions between proteins in order to develop more effective medicines.
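
Ranking the influence of web pages is commonly done with PageRank. The sketch below is a plain, single-machine power-iteration version in Python, included only to show the kind of computation involved; it is not Plato's implementation, and the `web` link data and parameter defaults are assumptions.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict mapping page -> list of linked pages."""
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}

    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                # Split this page's rank evenly among the pages it links to.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical link structure of a tiny web site.
web = {
    "home": ["docs", "blog"],
    "docs": ["home"],
    "blog": ["home"],
    "orphan": ["home"],
}

ranks = pagerank(web)
top = max(ranks, key=ranks.get)
```

Here "home" ends up ranked highest because every other page links to it. At the scale the article describes, the same iteration has to be partitioned across a cluster, which is the problem frameworks such as Plato address.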

Since the “930” organizational restructuring last year, open source collaboration has become one of the key strategies of Tencent’s technology development and has driven a steady stream of open source releases. Just last week, at the Techo developer conference, Tencent officially announced the open-sourcing of four key projects: TubeMQ, Tencent Kona JDK, TBase and TKEStack. With Plato, Tencent has made another big move in the open source space. Tencent now hosts 89 open source projects on GitHub, with more than 1,000 contributors and more than 260,000 stars, ranking first on GitHub’s global company contribution list.

Plato open source repository: https://github.com/tencent/plato