Introduction: At the end of 2020, Alibaba Cloud put forward the concept of the “trinity”: a unified technology system spanning “self-developed technology”, “open source projects”, and “commercial products”, intended to maximize the value of the technology. Dubbo 3.0 is the first solution promoted under the trinity architecture, and expectations for it within the group are high. It integrates the characteristics of the internal HSF framework and inherits its core strengths of high performance and high availability. We expect to use it to solve internal adoption problems and unify the technology stack. This article describes how Dubbo 3.0 has evolved and how it helps users benefit from cloud native technology.

The author | much cloud

The trinity

At the end of 2020, Alibaba Cloud put forward the concept of the “trinity”: a unified technology system spanning “self-developed technology”, “open source projects”, and “commercial products”, intended to maximize the value of the technology.

Alibaba Group’s internal HSF framework has been tested by the Double 11 traffic peaks for many years, and high performance and high availability have become its core strengths. Dubbo, for its part, is one of the most popular service governance frameworks in China and abroad, and its openness as an open source project needs little introduction.

Dubbo 3.0 is the first solution promoted under the trinity architecture within the group. It integrates the characteristics of the internal HSF framework and inherits its core strengths of high performance and high availability. We expect to use it to solve internal adoption problems and unify the technology stack. It is already deployed at large scale in Kaola, and it will be rolled out to many core scenarios and carry the complex business workloads of 618 and Double 11.

The benefits of Dubbo 3.0

Before getting into the details of what changes in Dubbo 3.0, let’s look at two benefits of upgrading to it.

First, Dubbo 3.0 focuses on improving performance and stability in large-scale clusters. By optimizing how data is stored, it reduces resource consumption on each machine, which in turn keeps the cluster stable as a very large cluster scales out horizontally. Dubbo 3.0 also introduces the concept of a flexible cluster, which protects and improves end-to-end reliability and resource utilization in heterogeneous systems.

Second, Dubbo 3.0 is the milestone at which Dubbo fully embraces cloud native. Dubbo has a large user base in China and abroad, and as the cloud native era arrives, these users have a growing need to move to the cloud. Dubbo 3.0 will provide a complete set of solutions, migration paths, and best practices to help enterprises make the transition to cloud native and benefit from it.

1. Business benefits

So from a business application perspective, what are the specific benefits of upgrading to Dubbo 3.0?

First, in terms of performance and resource utilization, Dubbo 3.0 can significantly improve resource utilization by reducing the additional resource consumption caused by the framework.

On a single machine, Dubbo 3.0 can reduce memory usage by about 50%. At the cluster level, Dubbo 3.0 can support clusters with millions of instances, which lays the foundation for future large-scale business expansion. Dubbo 3.0 also supports the Reactive Streams communication model, which can significantly increase overall throughput in some business scenarios.

Second, Dubbo 3.0 opens up more possibilities for business architecture upgrades. The most intuitive is the upgrade of communication protocols, which brings more choices to the business architecture.

Dubbo’s original protocol constrains, to some extent, how microservices can be accessed. For example, mobile clients and front-end services must go through protocol conversion at the gateway layer to reach Dubbo back-end services. As another example, Dubbo supports only request-response communication, so scenarios that need streaming or reverse communication are poorly supported.

Finally, Dubbo 3.0 gives the business side an end-to-end solution for upgrading to cloud native. Whether the change is passive, forced by an upgrade of the underlying infrastructure, or active, driven by pain points in the service itself, Dubbo 3.0 provides a cloud native solution that helps products move to cloud native quickly.

Dubbo 3.0 overview

Now that we know what the benefits of upgrading to Dubbo 3.0 are, let’s take a look at some of the specific changes to Dubbo 3.0.

1. Support for a new service discovery model. Dubbo 3.0 starts from the application model, adapts to the mainstream cloud native design model, and optimizes its storage structure, which avoids interoperability problems at the model level. The new model organizes data far more compactly, which improves both performance and cluster scalability.

2. A next-generation RPC protocol: Triple. Triple is a new open protocol designed on top of HTTP/2 and fully compatible with the gRPC protocol. Because it is based on HTTP/2, it is gateway-friendly and passes easily through network infrastructure. Because it is fully compatible with gRPC, it has a natural advantage in multi-language interoperability.

3. Unified governance rules. These rules are designed for cloud native traffic governance and cover traditional SDK deployment, Service Mesh deployment, VM deployment, and Container deployment. Governing every scenario with one set of rules greatly reduces the cost of traffic governance and makes global traffic governance possible in heterogeneous systems.

4. Solutions for joining a Service Mesh. For Mesh scenarios, Dubbo 3.0 proposes two access modes. The first is the Thin SDK mode, whose deployment model matches the mainstream Service Mesh deployment. Dubbo itself is slimmed down: governance functions that duplicate those of the Mesh are removed, and only the core RPC capability is kept. The second is the Proxyless mode, in which Dubbo takes over the Sidecar’s responsibilities, communicates with the control plane directly, and applies cloud native traffic governance based on Dubbo 3.0’s unified governance rules.

Application-level service registration and discovery

Application-level service discovery model

In fact, the prototype of the application-level service discovery model was first proposed in Dubbo 2.7.6. After a period of iterations, a relatively stable model was finally formed in Dubbo 3.0.

In Dubbo 2.7 and earlier, applications register and discover services at interface granularity only. Each interface has its own data in the registry, and each machine registers under it its own metadata or interface-level configuration, such as serialization, data center, unit, and timeout settings.

When a provider restarts or is released, its registrations change independently for each interface. For example, if a gateway application depends on 30 interfaces of an upstream application, each release of the upstream application pushes machine online and offline changes to 30 separate address lists.

Treating the interface as the first-class unit of registration and discovery is one of the earliest approaches to SOA and microservice splitting. It lets a single service on a single node change independently and dynamically. As businesses grow, however, a single application depends on more and more services, and each provider runs on more and more machines for business or capacity reasons. Overall, the number of service addresses a client depends on grows rapidly. At this point it is worth optimizing the registration and discovery process at the design level.

Note two features here:

  • Now that monolithic applications have largely been split into microservice applications, large-scale splitting and recombination of services is no longer a pain point. Most interfaces are provided by a single application or by a fixed small set of applications.
  • Much of the information in the URLs that identify addresses is highly redundant, such as timeout and serialization settings. These settings change very rarely, yet they appear in every URL.

Based on these two observations, we propose application-level registration and discovery, in which the application is the basic unit of registration and discovery. The main difference is this: under interface-level registration, an application that provides 100 interfaces must register 100 nodes in the registry, and if the application runs on 100 machines, each release means 10,000 virtual node changes for its clients. Application-level registration and discovery needs only one node, and each release changes only 100 virtual nodes. For applications that depend on many services and run on many machines, the amount of data shrinks by a factor of tens to hundreds, and memory usage falls by at least half.
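The arithmetic in the paragraph above can be checked with a few lines of Python. This is only an illustration of the counting argument, not Dubbo code. The function names are made up for this sketch, and the figures (100 interfaces, 100 machines) are the ones used in the text.

```python
def interface_level_nodes(interfaces: int, machines: int) -> int:
    """Interface-level model: one virtual node per (interface, machine) pair."""
    return interfaces * machines


def application_level_nodes(machines: int) -> int:
    """Application-level model: one virtual node per machine, however many interfaces."""
    return machines


interfaces, machines = 100, 100
before = interface_level_nodes(interfaces, machines)  # changes seen per release today
after = application_level_nodes(machines)             # changes seen per release after the switch
reduction = before // after

print(f"interface level: {before}, application level: {after}, factor: {reduction}")
```

The reduction factor equals the number of interfaces the application exposes, which is why applications with many interfaces benefit the most.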

However, a technical solution has to be designed not only for functional correctness but also for upgrading existing services. The foundation of the move to application-level registration and discovery is therefore functional parity with interface-level registration and discovery. Whether or not a client has been upgraded, and whether or not application-level registration and discovery is enabled, business calls must continue to work correctly.

To provide this assurance, we designed a new component, the Metadata Center, to manage two pieces of data:

  • Interface-to-application mapping: the mapping between interfaces and applications is reported to and queried from the metadata center. Clients use it to decide whether to enable application-level discovery, which avoids changes to business code.
  • Application-level metadata snapshot: different interfaces of the same application may use different configuration, so the application-level scheme introduces the concept of a metadata snapshot. Each application generates a metadata snapshot at every release, containing the metadata of the current version and the configuration of every interface the application provides. Only the snapshot ID is stored in the URL, which preserves the ability to change dynamically and reduces memory pressure on the data store by an order of magnitude.
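The snapshot idea can be sketched as follows. This is a minimal illustration under stated assumptions, not Dubbo's actual implementation. The hashing scheme, the `metadata_center` dictionary, the interface names, and the instance fields are all hypothetical. The point is only that each address carries a short snapshot ID, while the full per-interface configuration is stored once per release.

```python
import hashlib
import json


def snapshot_revision(interface_configs: dict) -> str:
    """Derive a stable snapshot ID from the full per-interface configuration (assumed scheme)."""
    canonical = json.dumps(interface_configs, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


# Hypothetical configuration for one release of an application.
configs = {
    "com.example.OrderService": {"timeout": 3000, "serialization": "hessian2"},
    "com.example.UserService": {"timeout": 1000, "serialization": "hessian2"},
}

revision = snapshot_revision(configs)

# Stand-in for the metadata center: the snapshot is stored once per release.
metadata_center = {revision: configs}

# Each registered instance carries only the snapshot ID, not the configuration itself.
instances = [
    {"host": f"10.0.0.{i}", "port": 20880, "revision": revision} for i in range(1, 4)
]


def lookup_timeout(instance: dict, interface: str) -> int:
    """Resolve interface-level configuration through the snapshot ID."""
    return metadata_center[instance["revision"]][interface]["timeout"]


print(lookup_timeout(instances[0], "com.example.OrderService"))
```

When a release changes any interface setting, the derived ID changes too, so clients can tell that they need to fetch a new snapshot.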

Finally, because this new service discovery model closely resembles the service discovery models of Spring Cloud, Service Mesh, and other architectures, Dubbo nodes and nodes in those architectures can discover each other at the registry level.

Dubbo 3.0: cloud native, endorsed by Alibaba, easy to use

Dubbo 3.0 is positioned to be the best microservices framework for the cloud native era. Several trends are visible today: K8s has become the de facto standard for resource scheduling, Mesh is becoming mainstream, and cluster sizes are growing rapidly. These trends place higher demands on Dubbo.

1. How to deploy and call Dubbo services more conveniently on K8s is a problem that must be solved. A unified protocol and data exchange format is a prerequisite for solving it.
2. The spread of Mesh raises several issues, namely how native Dubbo and Mesh Dubbo can coexist and how multi-language scenarios can be supported.
3. Growth in scale challenges the entire Dubbo architecture. Components such as the registry, as well as clients, must handle more data and more calls.

Providing more efficient services while remaining stable is the most important part of Dubbo’s evolution.

These cloud native challenges have driven Dubbo toward its next generation: a new protocol, support for K8s infrastructure, multi-language support, and support for larger scale.

1. Next generation RPC protocol

The most basic capability of an RPC framework is to invoke services across processes and to compose services into chains and networks. The RPC protocol is the central carrier of that capability.

Because the RPC protocol is tightly coupled with business data, its design and implementation also directly shape parts of the business architecture, such as the interaction between terminal devices and the back end, the adoption of multiple languages in a microservice architecture, and the data transmission model between services.

The Dubbo 2 protocol provides the core semantics of RPC, including the protocol header, flag bits, request IDs, and request/response data. In the cloud native era, however, it faces two main challenges. The first is ecosystem interoperability: a binary protocol is hard for users to inspect and understand directly. The second is that it is unfriendly to Mesh and other gateway components, which must parse the whole protocol to obtain the invocation metadata they need, such as RPC context. This is a problem for both performance and ease of use.

For a service framework such as Dubbo, providing remote communication is the most important job. Practice has shown that the design and implementation of the old Dubbo 2 RPC protocol limits the evolution of business architecture in exactly the areas listed above.

To support existing functions and solve existing problems, the next-generation protocol must have the following features:

  • The protocol must solve cross-language interoperability. Both the traditional multi-language, multi-SDK model and the cross-language Mesh model need a more general and extensible data transmission format.
  • In addition to the Request/Response model, the protocol should support streaming and bidirectional communication.
  • For performance, the request ID mechanism is retained to avoid the cost of head-of-line blocking.
  • The protocol should be easy to extend, including but not limited to Tracing and Monitoring support, and should be recognizable by devices at every layer, which also makes it easier for users to understand.
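The request ID requirement above can be illustrated with a small sketch. This is not Dubbo code, and the `PendingCalls` class and its method names are invented for this example. It shows how tagging each call with an ID lets a response complete as soon as it arrives, even when an earlier request on the same connection is still outstanding.

```python
from concurrent.futures import Future


class PendingCalls:
    """Tracks in-flight calls on one connection by request ID (illustrative only)."""

    def __init__(self) -> None:
        self._next_id = 0
        self._pending: dict[int, Future] = {}

    def send(self, payload: str) -> tuple[int, Future]:
        """Register a call and return its ID plus a future for the response."""
        self._next_id += 1
        future: Future = Future()
        self._pending[self._next_id] = future
        return self._next_id, future

    def on_response(self, request_id: int, result: str) -> None:
        """Complete whichever call the response belongs to, in any order."""
        self._pending.pop(request_id).set_result(result)


pending = PendingCalls()
id_a, fut_a = pending.send("slow call")
id_b, fut_b = pending.send("fast call")

# The response to the second request arrives first and is not held up by the first.
pending.on_response(id_b, "fast result")
b_done_first = fut_b.done() and not fut_a.done()

pending.on_response(id_a, "slow result")
print(b_done_first, fut_a.result(), fut_b.result())
```

Without IDs, responses would have to be matched to requests by order, so one slow call would delay every call queued behind it.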

Given these requirements, the combination of HTTP/2 and Protobuf is the best fit. That combination naturally brings the gRPC protocol to mind. The relationship between the new-generation protocol and gRPC is as follows:

1. The new Dubbo protocol is an extension of the gRPC protocol, which ensures that the two can interoperate and share an ecosystem.

2. Building on the first point, the new Dubbo protocol supports Dubbo service governance more natively and offers greater flexibility.

3. For serialization, since most applications do not yet use Protobuf, the new protocol provides thorough serialization support: existing serialization schemes are adapted smoothly, and migration to Protobuf is made easy.

4. For the request model, the new protocol will support Reactive, which is not available in the gRPC protocol.
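Since the new protocol is described as compatible with gRPC, a sketch of gRPC's message framing may help make the HTTP/2 plus Protobuf combination concrete. In gRPC, each message in an HTTP/2 data stream is prefixed by a 1-byte compression flag and a 4-byte big-endian length. The function names below are invented for this sketch, the payloads are arbitrary bytes standing in for serialized Protobuf messages, and nothing here is taken from Dubbo's implementation.

```python
import struct


def frame_message(payload: bytes, compressed: bool = False) -> bytes:
    """Prefix a serialized message with gRPC's 5-byte header: flag + big-endian length."""
    return struct.pack(">BI", 1 if compressed else 0, len(payload)) + payload


def unframe_messages(data: bytes) -> list[tuple[bool, bytes]]:
    """Split a byte stream into (compressed, payload) pairs."""
    messages = []
    offset = 0
    while offset < len(data):
        flag, length = struct.unpack_from(">BI", data, offset)
        offset += 5
        messages.append((bool(flag), data[offset:offset + length]))
        offset += length
    return messages


# Two messages on one stream, as in a streaming call; the payloads are placeholders.
stream = frame_message(b"first") + frame_message(b"second")
decoded = unframe_messages(stream)
print(decoded)
```

Because every message carries its own length, several messages can follow one another on the same stream, which is what makes streaming and bidirectional calls possible on top of request-response.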

2. Service Mesh

To bring Dubbo into the Service Mesh ecosystem, we reviewed a number of schemes and settled on the two Mesh forms best suited to Dubbo 3.0. One is the classic Sidecar-based Service Mesh. The other is the Proxyless Mesh, which has no Sidecar.

The Sidecar Mesh deployment matches the mainstream Service Mesh deployment. Here Dubbo 3.0 focuses on making the upgrade fully transparent to business applications, not only at the programming level but also by making Dubbo 3.0 lightweight, using the Triple protocol, and so on, to minimize overhead and maintenance cost across the whole call path. This is also known as the Thin SDK solution. It is “thin” because every unneeded component is removed.

The Proxyless Mesh deployment is the other Mesh form planned for Dubbo 3.0. Its goal is for the traditional SDK to interact with the control plane directly, without starting a Sidecar.

Consider the following scenarios in which Proxyless Mesh can be deployed:

  • The business side wants to move to a Mesh solution but cannot accept the performance cost of Sidecar traffic interception, which is common in core business scenarios.
  • The team wants to reduce Sidecar O&M costs and system complexity.
  • Legacy systems upgrade slowly, migration takes a long time, and multiple deployment architectures coexist for a long time.
  • The environment mixes several deployment modes, such as VM and Container deployment, and several application types, such as Thin SDK and Proxyless. For example, performance-sensitive applications are deployed in Proxyless mode and peripheral applications in Thin SDK mode, with a unified control plane scheduling the multiple data planes.

With these two forms, Dubbo offers a Mesh solution for different business scenarios, migration stages, and levels of infrastructure support, and all of them can be governed through a unified control plane.

Future deployment

1. Deploy on K8s

The figure above shows the expected future deployment of Dubbo 3.0 on Kubernetes. In terms of the service discovery model, Dubbo 3.0 will interoperate with Kubernetes native Services, so that services can call each other without a separately deployed registry.

2. Deploy on Istio

The figure above shows the expected future deployment of Dubbo 3.0 on Istio, using a hybrid of the Thin SDK and Proxyless modes. In the figure, Pod 1 and Pod 3 are Proxyless: the Dubbo service sends data traffic directly. Pod 2 is deployed in Thin SDK mode: outgoing traffic is intercepted and forwarded by the Sidecar.

Flexibility enhancement plan

Cloud native has brought a major shift toward technology standardization. Making applications easier to create and run on the cloud, with the ability to scale elastically, is the core goal of every cloud native infrastructure component. With the elasticity of cloud native technology, an application can scale out to a large number of machines in a very short time to meet business needs.

For example, to handle a midnight flash sale or an unexpected event, an application often needs to add thousands or even tens of thousands of machines to meet user demand. Scaling out at that size brings its own problems: cluster nodes fail more often, and many external factors affect service capacity, so nodes no longer have equal capacity. These are the problems of large-scale cluster deployment in cloud native scenarios.

Dubbo expects to solve these problems with a flexible cluster scheduling mechanism. The mechanism addresses two things. First, when nodes fail, the distributed service stays stable and avoids problems such as avalanches. Second, large-scale applications can run in an optimal state and deliver high throughput and performance.

  • From the perspective of a single service, Dubbo’s goal is a service that does not collapse under load: when the number of requests is very high, it selectively rejects some requests to protect the overall correctness and timeliness of the business.
  • From the perspective of the distributed system, the overall performance loss caused by complex topology and uneven node performance should be kept as small as possible. The flexible scheduling mechanism allocates traffic dynamically, so that a heterogeneous system distributes requests according to each node’s actual service capacity at runtime and achieves the best overall performance.
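The two ideas above, selective rejection on a single node and capacity-aware routing across nodes, can be sketched in a few lines. This is a simplified illustration and not Dubbo's scheduling algorithm. The in-flight limit, the capacity numbers, and all names are assumptions made for this example.

```python
import random


class SheddingServer:
    """Rejects new requests once a fixed number are in flight (illustrative limit)."""

    def __init__(self, max_in_flight: int) -> None:
        self.max_in_flight = max_in_flight
        self.in_flight = 0

    def try_accept(self) -> bool:
        """Accept the request if there is headroom, otherwise reject it immediately."""
        if self.in_flight >= self.max_in_flight:
            return False
        self.in_flight += 1
        return True

    def finish(self) -> None:
        """Mark one accepted request as completed."""
        self.in_flight -= 1


def pick_node(capacities: dict[str, float], rng: random.Random) -> str:
    """Choose a node with probability proportional to its measured capacity."""
    nodes = list(capacities)
    weights = [capacities[name] for name in nodes]
    return rng.choices(nodes, weights=weights, k=1)[0]


# Single-service view: the third concurrent request is rejected rather than queued.
server = SheddingServer(max_in_flight=2)
accepted = [server.try_accept() for _ in range(3)]

# Distributed view: a node with nine times the capacity receives most of the traffic.
rng = random.Random(42)
capacities = {"big": 9.0, "small": 1.0}
picks = [pick_node(capacities, rng) for _ in range(1000)]
print(accepted, picks.count("big") > picks.count("small"))
```

In a real system the capacities would be measured at runtime and updated continuously, which is what lets traffic follow each node's actual service capacity.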

The Dubbo 3.0 roadmap

Apache Dubbo 3.0.0 was officially released in June. It is a milestone release since the project was donated to Apache, and it marks the point at which Apache Dubbo fully embraces cloud native.

Apache Dubbo 3.1, planned for November 2021, will bring the implementation and practice of deploying Apache Dubbo in Mesh scenarios.

Apache Dubbo 3.2, planned for March 2022, will introduce a new intelligent traffic scheduling mechanism for large-scale application deployments to improve system stability and resource utilization.

Finally, Apache Dubbo 3.0 has been integrated with Alibaba Group’s internal RPC framework. We expect to use it to solve internal adoption problems and unify the technology stack. In the future, Apache Dubbo 3.0 will be deployed across Alibaba Group at large scale and carry the complex business scenarios of 618 and Double 11.

This article is original Alibaba Cloud content and may not be reproduced without permission.