Author: Duo Xiaodong (alias Yishan), senior technical expert at Ant Financial, focusing on cloud computing technology and products. Core member of the Apache Kylin founding team, core member of the Ant Financial Cloud PaaS founding team, Antstack network product leader, and core member of the SOFAMesh founding team.

Project repository: https://github.com/alipay/sofa-mosn

This article is based on part of the talk "Ant Financial Service Mesh Data Plane SOFAMosn Deep Dive", which the author shared at Service Mesh Meetup #2 in Beijing. The complete content can be seen in the replay linked at the end of the article.

Preface

Today's sharing is a deep dive into SOFAMosn, the data plane of Ant Financial's Service Mesh.

Following Xiao Jian's talk at Service Mesh Meetup #1 in Hangzhou, "Service Mesh Exploration Under Large-scale Microservice Architecture", this session focuses on our thinking and exploration around the data plane as it lands at Ant.

Background

Last time, Xiao Jian introduced the technology selection behind SOFAMesh, as well as some trade-offs between open source and in-house development.

So let's first look at why Ant chose Service Mesh.

The key points can be summarized in four aspects:

First, Ant is fully embracing microservices and cloud native. Whether for SOFA5 or the landing of Sigma, the Kubernetes-compatible container platform, Service Mesh is an indispensable component.

Second, much of the operation of Ant's systems is based on traffic scheduling at the service level, such as scheduling traffic between logical zones in the LDC architecture, or elastic scaling, which is in essence scheduling traffic between heterogeneous data centers. Together with blue-green releases by logical zone, data-center disaster recovery and so on, all of this calls for traffic scheduling that is more robust, more flexible and more scalable.

In addition, because of Ant's financial nature, we have stricter requirements for service authentication, such as adopting national cryptographic standards, managing certificates in encryption cards, and hardware encryption/decryption. These require not only a higher security level but also the capacity to carry heavy traffic. At the same time, we have seen the accelerated development of the Zero Trust network architecture, which is in line with our goals.

Finally, Ant's internal technology stack is diverse, and the cost of integrating multiple languages remains high.

For example, the common requirements for non-SOFA languages to communicate with SOFA services, such as understanding the configuration center logic, SOFARPC serialization, and LDC routing rules when deployed in production, can all be solved by sinking these capabilities into the Mesh layer.

Those of you familiar with SOFAMesh should know that Ant chose to build its own data plane in Golang. In making this decision, we weighed future technology direction, cross-team R&D efficiency, Ant's existing technology and operations systems, and other factors. Meanwhile, through investigation and verification, we found the performance of the Golang version acceptable.

Next, I will introduce the Mesh data plane developed by Ant and UC, which we call SOFAMosn.

The overall structure of SOFAMesh

The diagram shown here is based on Istio's architecture, in which we use SOFAMosn in place of Envoy as the data plane, adding some improvements explored in Ant's practice.

Version 0.1.0 of SOFAMosn supports the core capabilities of the xDS V0.4 API, with an emphasis on the SOFARPC protocol, and is used in Ant's production environment. It also supports the basic functions of HTTP/1.1 and HTTP/2.0, but these are not yet used in production.

Core design ideas of SOFAMosn

First, the data flow processed by SOFAMosn as a proxy is divided into four layers. In the inbound direction, data passes in turn through the network I/O layer, the binary protocol processing layer, the protocol stream processing layer, and the routing/forwarding layer. The outbound process is essentially the reverse.

With this layering in mind, the specific responsibilities of each layer are as follows (a minimal Go interface sketch follows the list):

  • The I/O layer encapsulates I/O reads and writes and provides an extensible I/O event subscription mechanism

  • The PROTOCOL layer provides the ability to serialize/deserialize data based on different protocols

  • The STREAM layer presents a protocol-agnostic view to the layers above, is responsible for the stream life cycle, manages request streams in Client/Server mode, and provides a pooling mechanism for client streams

  • The PROXY layer provides routing, load balancing and other capabilities that move streams between the front end and the back end. The diagram clearly shows the flow of a single one-way request.
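To make the layering concrete, here is a minimal, hypothetical Go interface sketch of how the four layers can be decoupled from one another. The names and signatures are illustrative only and are not SOFAMosn's actual APIs.

```go
// A minimal sketch of the four layers as Go interfaces, each layer depending
// only on the abstraction of the layer next to it.
package layers

import "net"

// IOLayer reads raw bytes from a connection and publishes I/O events.
type IOLayer interface {
	OnReadable(conn net.Conn, buf []byte)
	OnWritable(conn net.Conn)
}

// Protocol serializes and deserializes frames of a specific wire protocol.
type Protocol interface {
	Decode(data []byte) (frame interface{}, consumed int, err error)
	Encode(frame interface{}) ([]byte, error)
}

// Stream gives upper layers a protocol-agnostic request/response view.
type Stream interface {
	AppendHeaders(headers map[string]string, endStream bool) error
	AppendData(data []byte, endStream bool) error
}

// Proxy routes a stream to an upstream cluster and picks a host in it.
type Proxy interface {
	Route(headers map[string]string) (clusterName string)
	PickHost(clusterName string) (addr string, err error)
}
```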

Now that we understand the layered design and forwarding, let's look at the threading model. In the threading model of version 0.1.0, the I/O goroutines of each connection come in pairs: the read goroutine handles reading, the event mechanism, and the codec logic, then hands the data up to the stream layer, where specific stream events are processed by an independent pool of resident worker goroutines.

In 0.2.0 we will add multi-core optimization: read goroutines will no longer be responsible for codec logic, which will instead be handled by a codec worker pool. As a direction for further development, we will borrow SEDA's ideas, abstracting each phase of the forwarding process into a stage and processing it through a task queue, a worker goroutine pool and a controller. From an implementation point of view, Golang lets us build the SEDA mechanism with simpler components.
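The following is a hypothetical Go sketch of a SEDA-style stage: a bounded task queue drained by a pool of resident worker goroutines, with back-pressure coming naturally from the bounded queue. It is an illustration of the idea, not SOFAMosn's actual implementation; names such as Stage and Task are made up.

```go
// A minimal SEDA-style stage: a bounded queue plus resident workers.
package seda

import "sync"

type Task func()

type Stage struct {
	queue chan Task
	wg    sync.WaitGroup
}

// NewStage creates a stage with a bounded queue and `workers` resident goroutines.
func NewStage(queueSize, workers int) *Stage {
	s := &Stage{queue: make(chan Task, queueSize)}
	s.wg.Add(workers)
	for i := 0; i < workers; i++ {
		go func() {
			defer s.wg.Done()
			for task := range s.queue { // drain tasks until the queue is closed
				task()
			}
		}()
	}
	return s
}

// Submit enqueues a task; it blocks when the queue is full, which gives
// natural back-pressure to the previous stage.
func (s *Stage) Submit(t Task) { s.queue <- t }

// Close stops accepting tasks and waits for in-flight tasks to finish.
func (s *Stage) Close() {
	close(s.queue)
	s.wg.Wait()
}
```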

SOFAMosn module division

In addition to the four core modules just introduced, there are other modules, such as the routing module, which resolves the routing destination of a request, and the back-end management module, which manages the lifecycle and health of back-end hosts. The blue boxes are the functional modules covered by SOFAMosn 0.1.0, while the red dotted boxes are topics we plan to implement or experiment with; you are welcome to join us in building them.

In conclusion, modularity and layered decoupling are the original intent of the SOFAMosn design. Programmability, an event-driven mechanism, scalability, and high throughput are also important design considerations.

SOFAMosn core capabilities

After introducing some of the architectural design ideas, let’s take a look at the core capabilities of SOFAMosn 0.1.0.

In terms of core network capabilities, we have encapsulated and abstracted the I/O processing into programmable interfaces, optimized for performance and usable independently. On top of that, SOFAMosn provides a built-in TCP proxy. SOFAMosn also supports TLS connection encryption, currently using the Golang implementation; a Golang TLS performance experiment is described in a later section. SOFAMosn supports TProxy mode with transparent forwarding via iptables, as well as smooth reload and smooth upgrade.
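As an illustration of what a TCP proxy does at its core, here is a minimal Go sketch that accepts a downstream connection, dials a fixed upstream and copies bytes in both directions. The addresses are examples only and this is not SOFAMosn's actual TCP proxy code.

```go
// A minimal TCP proxy sketch: accept downstream, dial upstream, copy bytes.
// Errors from io.Copy are ignored for brevity.
package main

import (
	"io"
	"log"
	"net"
)

func main() {
	ln, err := net.Listen("tcp", ":15001") // listen address is an example
	if err != nil {
		log.Fatal(err)
	}
	for {
		down, err := ln.Accept()
		if err != nil {
			log.Println("accept:", err)
			continue
		}
		go func(down net.Conn) {
			defer down.Close()
			up, err := net.Dial("tcp", "127.0.0.1:8080") // upstream address is an example
			if err != nil {
				log.Println("dial upstream:", err)
				return
			}
			defer up.Close()
			go io.Copy(up, down) // downstream -> upstream
			io.Copy(down, up)    // upstream -> downstream
		}(down)
	}
}
```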

In terms of multi-protocol support, SOFAMosn 0.1.0 focuses on SOFARPC, which is already used in Ant's production environment. It also supports the basic functions of HTTP/1.1 and HTTP/2.0, using the open source FastHTTP implementation for HTTP/1.1 and Golang's own HTTP/2 implementation. Since both FastHTTP and the HTTP/2 package have built-in I/O and connection pooling, support for these two protocols is currently separate from SOFAMosn's overall design and has not yet been optimized for performance; we will consider incorporating them into the SOFAMosn framework in future iterations. In addition, we are working on Dubbo and HSF support, which will be available in a later release. Meanwhile, the currently supported SOFARPC, HTTP/1.1 and HTTP/2.0 all support Mesh TLS connection encryption.

In terms of core routing, version 0.1.0 of SOFAMosn supports virtual host matching, route matching, and subset routing/load balancing.
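The following is a minimal, hypothetical Go sketch of that matching order: first the virtual host by domain, then the route by path prefix, with a subset selector carried as metadata. Types and fields are illustrative, not SOFAMosn's.

```go
// A minimal routing sketch: virtual host match -> route match -> subset.
package route

import "strings"

type Route struct {
	PathPrefix string
	Cluster    string
	Subset     map[string]string // e.g. {"version": "v1"} narrows the hosts of Cluster
}

type VirtualHost struct {
	Domains []string
	Routes  []Route
}

// Match picks the first route whose domain and path prefix both match.
func Match(vhosts []VirtualHost, host, path string) (Route, bool) {
	for _, vh := range vhosts {
		for _, d := range vh.Domains {
			if d == "*" || d == host {
				for _, r := range vh.Routes {
					if strings.HasPrefix(path, r.PathPrefix) {
						return r, true
					}
				}
			}
		}
	}
	return Route{}, false
}
```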

Back-end management supports basic load balancing algorithms and active health checks.
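As a sketch of these two capabilities, the hypothetical Go code below combines a round-robin pick over healthy hosts with an active TCP health-check loop. It is illustrative only and much simpler than SOFAMosn's back-end manager; the typed atomics require Go 1.19+.

```go
// A minimal back-end manager sketch: round-robin over healthy hosts plus
// an active TCP health check loop.
package cluster

import (
	"net"
	"sync/atomic"
	"time"
)

// Host is one back-end endpoint; healthy is flipped by the health checker.
type Host struct {
	Addr    string
	healthy atomic.Bool
}

// Cluster holds the hosts of one upstream service.
type Cluster struct {
	hosts []*Host
	next  atomic.Uint64
}

// Pick returns the next healthy host in round-robin order, or nil if none.
func (c *Cluster) Pick() *Host {
	n := uint64(len(c.hosts))
	if n == 0 {
		return nil
	}
	for i := uint64(0); i < n; i++ {
		h := c.hosts[c.next.Add(1)%n]
		if h.healthy.Load() {
			return h
		}
	}
	return nil
}

// HealthCheck actively probes every host by opening a short-lived TCP
// connection at the given interval.
func (c *Cluster) HealthCheck(interval time.Duration) {
	for range time.Tick(interval) {
		for _, h := range c.hosts {
			conn, err := net.DialTimeout("tcp", h.Addr, time.Second)
			h.healthy.Store(err == nil)
			if conn != nil {
				conn.Close()
			}
		}
	}
}
```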

Beyond the core features, SOFAMosn offers some highlights based on our experience with landing it in production.

First, SOFAMosn supports X-Protocol, a more lightweight way to integrate custom RPC protocols. For relatively simple scenarios that do not require unpacking, RPC data is forwarded as a TCP or HTTP/2.0 payload; all routing and load-balancing policies that do not require unpacking are supported.

At the same time, we plan to add codec extension points to X-Protocol to support scenarios that do require unpacking. In terms of smooth upgrade, in addition to the classical approach of transferring the listener FD and waiting at the protocol layer, SOFAMosn supports protocol-independent migration of existing connections. SOFAMosn also supports specifying or updating the front-end and back-end communication protocols during deployment and upgrade.
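To illustrate what a codec extension point for X-Protocol could look like, here is a hypothetical Go sketch in which a private protocol only declares how to split frames and extract the stream ID, and registers itself by name. This is a sketch of the idea, not SOFAMosn's actual plugin API.

```go
// A hypothetical X-Protocol codec extension point and registry.
package xprotocol

// codecs maps a protocol name to its codec. A real registry would need
// locking or init-time-only registration; omitted here for brevity.
var codecs = map[string]Codec{}

// Codec is the minimal contract a private RPC protocol implements.
type Codec interface {
	// SplitFrame returns the length of the first complete frame in data,
	// or 0 if more bytes are needed.
	SplitFrame(data []byte) int
	// GetStreamID extracts the request/stream ID used to correlate
	// responses with requests.
	GetStreamID(frame []byte) uint64
}

// Register makes a codec available under the given protocol name.
func Register(name string, c Codec) { codecs[name] = c }

// Get looks up a registered codec, e.g. Get("my-rpc").
func Get(name string) (Codec, bool) {
	c, ok := codecs[name]
	return c, ok
}
```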

In terms of Istio integration, SOFAMosn 0.1.0 supports the core functions of fully dynamic configuration through the Istio 0.8 Pilot V0.4 API and xDS over ADS, which will be further completed in later versions. SOFAMosn also supports a static configuration model.

Beyond this feature support, SOFAMosn provides extension points at the network layer, the protocol processing layer and the TCP-based private protocol layer, so that custom business logic can be integrated elegantly. In the process of landing inside Ant, our internal SOFAMosn builds on the open source version and implements Ant's internal business through these extension mechanisms, which has proven a feasible approach in real projects.

Performance

Now that we've covered the core features, let's move on to another issue that gets a lot of attention: performance.

In version 0.1.0 of SOFAMosn, we focused on optimizing single-core forwarding performance in Sidecar mode within the overall SOFAMosn framework, namely single-core TCP and SOFARPC forwarding.

Let's start by sharing some of the techniques and experiences we used to optimize the single-core scenario, mainly covering exclusive core binding, memory, I/O, and scheduling.

First, core binding. With P=1 and the process exclusively bound to a core, both system call efficiency and cache locality improve, and overall throughput increases by about 30%. Second, memory optimization. We adopted a slab-style recycling mechanism to improve reuse and reduce memory copies.
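The hypothetical Go sketch below shows the shape of these two optimizations: setting P=1 via GOMAXPROCS (the exclusive OS-level core binding itself happens outside the process, e.g. via taskset or cgroups) and recycling buffers through slab-style size classes built on sync.Pool. The sizes and names are illustrative, not SOFAMosn's.

```go
// P=1 plus slab-style buffer recycling, as a minimal sketch.
package main

import (
	"runtime"
	"sync"
)

// Size classes, smallest to largest, in the spirit of a slab allocator.
// The concrete sizes here are illustrative.
var classes = []int{1 << 10, 4 << 10, 16 << 10, 64 << 10}

var pools = func() []*sync.Pool {
	ps := make([]*sync.Pool, len(classes))
	for i, size := range classes {
		size := size // capture per-class size for the New function
		ps[i] = &sync.Pool{New: func() interface{} { return make([]byte, size) }}
	}
	return ps
}()

// GetBuffer returns a reusable buffer with at least n usable bytes.
func GetBuffer(n int) []byte {
	for i, size := range classes {
		if n <= size {
			return pools[i].Get().([]byte)[:n]
		}
	}
	return make([]byte, n) // larger than any class: allocate directly
}

// PutBuffer hands a buffer back to its size class for reuse.
func PutBuffer(b []byte) {
	for i, size := range classes {
		if cap(b) == size {
			pools[i].Put(b[:size])
			return
		}
	}
}

func main() {
	// P=1: the single-core scenario described above. Exclusive OS-level core
	// binding (e.g. taskset or cgroups) is configured outside the process.
	runtime.GOMAXPROCS(1)

	buf := GetBuffer(2048)
	defer PutBuffer(buf)
}
```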

At the same time, memory allocation should take the affinity of the Golang memory model into account, minimizing allocations in the arena area. Finally, as we all know, Golang's GC takes some getting used to, and there are many details to pay attention to in order to minimize the pressure of GC ScanObjects.

On the I/O side, Golang's I/O model is synchronous, so on the read path you should read as much as possible per call and avoid the impact of frequently calling SetReadDeadline, which had a measurable effect on throughput in our load tests. On the write path, an appropriate amount of buffering is needed: for example, if multiple workers drive an I/O goroutine into frequent small system writes, throughput also drops. Another point to be aware of is avoiding unbalanced read and write frequencies in multi-goroutine scenarios, which is a potential cause of overall throughput degradation.
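As a sketch of the write-side buffering idea, the hypothetical Go code below lets worker goroutines send payloads to a channel while a single writer goroutine drains and batches them, so that many small logical writes turn into one write system call. Error handling is omitted for brevity and the buffer size is an example.

```go
// Batching writes from many workers through one buffered writer goroutine.
package iowrite

import (
	"bufio"
	"net"
)

// StartWriter drains a channel fed by multiple worker goroutines and writes
// to the connection through a single buffered writer.
func StartWriter(conn net.Conn, out <-chan []byte) {
	bw := bufio.NewWriterSize(conn, 64<<10) // 64 KiB buffer; size is an example
	for data := range out {
		bw.Write(data)
		// Opportunistically drain whatever is already queued, then flush once,
		// so a batch of logical writes becomes a single write syscall.
	drain:
		for {
			select {
			case more, ok := <-out:
				if !ok {
					break drain
				}
				bw.Write(more)
			default:
				break drain
			}
		}
		bw.Flush()
	}
	bw.Flush()
}
```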

In addition, a large number of reads or writes leads to a large number of system calls, which increases the cost of Golang runtime scheduling: goroutine scheduling is triggered first, which takes time; runtime scheduling itself is not as responsive as OS thread scheduling, adding further cost; and the OS system calls themselves can be expensive, all of which degrades performance.

Here I will share some real cases we ran into during performance optimization; besides I/O, the focus is on scheduling and balance.

First, we use goroutine pooling to avoid the runtime.morestack problem: a freshly spawned goroutine starts with a small stack that must grow on the hot path, while resident goroutines whose stacks have already grown can simply be reused. Second, in the single-core scenario, we need to watch whether a G is starved, which wastes resources.

After walking through some of the optimization process, let's look at the results so far: single-core TCP forwarding performance and single-core SOFARPC forwarding performance. As you can see, in the single-core TCP forwarding scenario the difference between SOFAMosn 0.1.0 and Envoy 1.7 is manageable, and we will continue to close the gap in future releases.

Beyond the TLS support mentioned earlier, let's look at some performance exploration. First, the test scenario: in this scenario we found that the performance of Golang's native implementation of the ECDHE algorithm was lower than Nginx (using OpenSSL) but higher than Golang with BoringSSL. Through load testing of specific algorithms and protocols, together with code research, we reached the following conclusion: for the ECDHE-P256 cipher suite, Golang's native implementation performs very well and can be trusted for use.
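For reference, the hypothetical Go sketch below pins a TLS listener to the P-256 curve and ECDHE cipher suites discussed above; the certificate paths and listen address are examples only, and the accept loop is omitted.

```go
// Pinning a Go TLS listener to ECDHE with the P-256 curve.
package main

import (
	"crypto/tls"
	"log"
)

func main() {
	// Certificate paths are examples only.
	cert, err := tls.LoadX509KeyPair("server.crt", "server.key")
	if err != nil {
		log.Fatal(err)
	}
	cfg := &tls.Config{
		Certificates: []tls.Certificate{cert},
		MinVersion:   tls.VersionTLS12,
		// Prefer the P-256 curve: Go's implementation of it is constant-time
		// and assembly-optimized, which is why it benchmarks well.
		CurvePreferences: []tls.CurveID{tls.CurveP256},
		CipherSuites: []uint16{
			tls.TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,
			tls.TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,
		},
	}
	ln, err := tls.Listen("tcp", ":8443", cfg) // listen address is an example
	if err != nil {
		log.Fatal(err)
	}
	defer ln.Close()
	// The accept loop is omitted; hand each accepted connection to the proxy.
}
```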

Beyond these optimization points, we will continue with performance, multi-core and memory optimization in subsequent versions, and use user-space and kernel-space acceleration techniques to improve SOFAMosn's forwarding performance. For TLS encryption and decryption, we will try offload acceleration based on local accelerator cards and a Keyless architecture, techniques we have already applied in Ant's network products.

RoadMap

Finally, I'd like to introduce SOFAMesh's RoadMap (see our official account for details):

In the first week of August we will release SOFAMesh 0.1.0, which focuses on proxy core capabilities, the core features of the xDS V0.4 API, and SOFARPC and other communication protocols.

At the end of August, we will release version 0.2.0. On top of continuously improving the core capabilities, we will improve the functionality and scalability of X-Protocol to support private RPC protocol extensions, support the Dubbo/HSF communication protocols and access to a ZooKeeper-based service registry, and focus on enhancing HTTP/2.0 functionality and performance. We will also provide a K8S operator that allows SOFAMesh to access K8S resources.

In addition to functional enhancements, we will continue to optimize performance, focusing on multi-core performance and overall memory usage. We will also keep pushing forward code quality, testing, and other foundational work.

At the end of September we will release 0.3.0, which focuses on Mixer integration, providing the Quota and Report functions. Circuit breaking and rate limiting capabilities will also be available in September.

SOFAMosn is still an early version, and we will continue to supplement, improve and optimize it. We welcome anyone interested in the open source community to join us in building the open source version of SOFAMesh.

Supplement

This article is based on part of the author's sharing at Service Mesh Meetup #2; the slides and video of the talk can be viewed at https://www.itdks.com/eventlist/detail/2455


You are welcome to join us in building SOFAStack: https://github.com/alipay