How to design a Dubbo RPC framework from 0 to 1

Designing an RPC framework from scratch tests two things:

  • Your understanding of the underlying principles of an RPC framework.
  • Your overall ability to design an RPC framework as a system.

RPC and the RPC framework

1. RPC (Remote Procedure Call)

RPC, that is, remote procedure call, mainly solves the problem of remote communication: the caller can invoke a service on another machine without having to understand the underlying network communication mechanism.

2. The RPC framework

The RPC framework is responsible for hiding the underlying transport protocol (TCP or UDP), the serialization method, and the communication details.

In practice, callers do not need to care about the underlying communication details or the invocation process, which leaves the business side free to focus on implementing business code.

Well-known RPC frameworks include Alibaba's HSF and Dubbo (open source).

How Dubbo developed

1. Small business scale

For example, an early application was a single Java WAR package: all functions were packaged together and deployed on one server. Calling an interface was convenient, and no distributed scenario was involved.

2. Bigger business

As the business grows rapidly, there are more and more services and more and more subsystems. For example, Taobao has a trading system, a commodity system, a user system, an evaluation system, and so on; hundreds of systems appear.

Although the systems become more complex, the business code remains coupled. For example, Taobao's earliest project, Denali, contained the code of all business systems and took a long time to package and deploy.

In addition, because every business line develops quickly and all the business code is coupled together, an online problem means urgently rolling back code, pulling branches, and merging large amounts of code. This process is extremely painful.

At this point, technology has become the bottleneck of the business, and the business systems need to be split apart and deployed independently.

3. Emergence of Dubbo and HSF

Once applications are split and deployed separately, a new problem arises: the applications need an efficient way to communicate with each other, which means distributed remote calls.

Taobao therefore split Denali into systems such as UM (UserManager) and SM (ShopManager), dozens of projects in all.

Then the interfaces involved in these calls were split by service unit: UIC (user center service), SIC (store center service), and so on. Each service unit is deployed as its own cluster and exposes its functionality as services.

This is where the RPC framework comes in: Alibaba uses HSF internally, and Dubbo is its open-source RPC framework.

The core design of RPC framework

As mentioned earlier, the core goal of RPC is to solve the problem of invoking services across a distributed system.

The body of knowledge involved is large: it requires a solid grasp of communication, remote invocation, messaging mechanisms, and so on, as well as a clear understanding of the theory and of how it is implemented at the hardware level, the operating system level, and in the chosen language.

1. Three core roles of RPC framework

1) Service Provider (Server)

Provides back-end services to the outside and registers its own service information with the registry.

2) Registry

Used by servers to register remote services and by clients to discover them.

Currently a registry is usually built on an open-source framework such as ZooKeeper, Eureka, Consul, or etcd.

For example, Alibaba's Dubbo commonly uses ZooKeeper as its registry.

3) Service Consumer (Client)

Gets the registration information for the remote service from the registry and then makes the remote procedure call.

2. RPC remote call procedure

1) The service caller (client) calls the service as if it were a local invocation.

2) After receiving the call, the client stub assembles the method and parameters into a message body that can be transmitted over the network. In Java, this is serialization.

3) The client stub finds the service address and sends the message to the server over the network.

4) The server stub decodes the message after receiving it. In Java, this is deserialization.

5) The server stub invokes the local service according to the decoding result.

6) The local service executes its processing logic.

7) The local service returns the result to the server stub.

8) The server stub packages the returned result into a message. In Java, this is serialization.

9) The server stub sends the packaged message over the network to the consumer.

10) The client stub receives the message and decodes it. In Java, this is deserialization.

11) The service caller (client) gets the final result.

The goal of the RPC framework is to encapsulate steps 2 through 10.
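The steps above can be sketched in a single process. This is only a minimal illustration, not how Dubbo implements its stubs: the names GreetingService, RpcStubSketch, Request, clientStub and serverStub are all hypothetical, JDK serialization stands in for a real wire format, and the network hop in steps 3 and 9 is replaced by a direct method call.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

// Hypothetical service contract shared by consumer and provider.
interface GreetingService {
    String greet(String name);
}

final class RpcStubSketch {

    // Message body assembled by the client stub (step 2).
    static final class Request implements Serializable {
        private static final long serialVersionUID = 1L;
        final String methodName;
        final Class<?>[] parameterTypes;
        final Object[] arguments;

        Request(String methodName, Class<?>[] parameterTypes, Object[] arguments) {
            this.methodName = methodName;
            this.parameterTypes = parameterTypes;
            this.arguments = arguments;
        }
    }

    static byte[] serialize(Object value) throws Exception {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        return bytes.toByteArray();
    }

    static Object deserialize(byte[] data) throws Exception {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    // Server stub (steps 4 to 8): decode, invoke the local service, encode the result.
    static <T> byte[] serverStub(Class<T> contract, T localService, byte[] requestBytes)
            throws Exception {
        Request request = (Request) deserialize(requestBytes);
        Method method = contract.getMethod(request.methodName, request.parameterTypes);
        Object result = method.invoke(localService, request.arguments);
        return serialize(result);
    }

    // Client stub (steps 2, 3 and 10): a dynamic proxy that looks like a local object.
    static <T> T clientStub(Class<T> contract, T provider) {
        InvocationHandler handler = (proxy, method, args) -> {
            byte[] requestBytes =
                    serialize(new Request(method.getName(), method.getParameterTypes(), args));
            // Assumption: this direct call stands in for the network round trip.
            byte[] responseBytes = serverStub(contract, provider, requestBytes);
            return deserialize(responseBytes);
        };
        return contract.cast(Proxy.newProxyInstance(
                contract.getClassLoader(), new Class<?>[] {contract}, handler));
    }
}
```

A caller would obtain a stub with RpcStubSketch.clientStub(GreetingService.class, name -> "Hello, " + name) and then call greet on it exactly like a local object, which is the point of steps 1 and 11.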

Technologies involved in an RPC framework

1. Establish communication

First, the communication problem has to be solved, mainly by establishing a TCP connection between the client and the server. All the data exchanged by a remote procedure call is transmitted over this connection.
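As a rough illustration of such a connection, the sketch below uses plain blocking sockets from the JDK on the loopback interface. The class and method names (TcpChannelSketch, serveOnce, call, roundTrip) and the "echo:" reply are made up for the example; a real framework would keep connections open and reuse them rather than opening one per call.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

final class TcpChannelSketch {

    // Provider side: accept one connection and answer a single request.
    static void serveOnce(ServerSocket server) {
        try (Socket socket = server.accept();
             DataInputStream in = new DataInputStream(socket.getInputStream());
             DataOutputStream out = new DataOutputStream(socket.getOutputStream())) {
            String request = in.readUTF();
            out.writeUTF("echo:" + request);
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Consumer side: connect, send the request, wait for the reply.
    static String call(InetAddress host, int port, String request) {
        try (Socket socket = new Socket(host, port);
             DataOutputStream out = new DataOutputStream(socket.getOutputStream());
             DataInputStream in = new DataInputStream(socket.getInputStream())) {
            out.writeUTF(request);
            out.flush();
            return in.readUTF();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Demo: start a provider on an ephemeral loopback port and call it once.
    static String roundTrip(String request) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread provider = new Thread(() -> serveOnce(server));
            provider.setDaemon(true);
            provider.start();
            return call(server.getInetAddress(), server.getLocalPort(), request);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```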

2. Service addressing

1) Service registration

A service must first be registered with the registry. The registry stores the service's IP, port, and invocation details (protocol, serialization method), and so on. With ZooKeeper, registering a service actually means creating a znode in ZooKeeper that stores the service information mentioned above.
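To make the idea concrete, here is a small in-memory model of a registry whose keys imitate a znode tree. It does not use the ZooKeeper client at all, and the path layout (/services/<service>/providers/<ip>:<port>), the metadata format, and the class name InMemoryRegistry are assumptions chosen for illustration only.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;

final class InMemoryRegistry {
    // Node path -> node data, imitating a tree of nodes that each hold metadata.
    private final Map<String, String> nodes = new ConcurrentHashMap<>();

    // Hypothetical path layout: /services/<service>/providers/<ip>:<port>
    static String path(String service, String ip, int port) {
        return "/services/" + service + "/providers/" + ip + ":" + port;
    }

    // Registration: create one node per provider and store how to call it.
    void register(String service, String ip, int port, String protocol, String serialization) {
        nodes.put(path(service, ip, port),
                "protocol=" + protocol + "&serialization=" + serialization);
    }

    // Read back the metadata stored for one provider, or null if it is not registered.
    String metadata(String service, String ip, int port) {
        return nodes.get(path(service, ip, port));
    }

    // List the "ip:port" children registered under one service.
    Set<String> providers(String service) {
        String prefix = "/services/" + service + "/providers/";
        Set<String> result = new TreeSet<>();
        for (String nodePath : nodes.keySet()) {
            if (nodePath.startsWith(prefix)) {
                result.add(nodePath.substring(prefix.length()));
            }
        }
        return result;
    }
}
```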

2) Service discovery

When a service consumer invokes a service for the first time, it looks up the IP address list of that service in the registry and caches it locally for later use. On subsequent calls the consumer no longer queries the registry; instead, it uses a load-balancing algorithm to pick one provider from the cached IP list and calls that server directly.
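The lookup-once-then-cache behavior, combined with a simple load-balancing rule, might look like the sketch below. Round-robin is used here only because it is easy to follow; the text does not say which algorithm to use. The class name ServiceDiscoverySketch and the registryLookup function are hypothetical stand-ins for a real registry client.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

final class ServiceDiscoverySketch {
    // Assumption: the registry is modeled as a function from service name to addresses.
    private final Function<String, List<String>> registryLookup;
    private final Map<String, List<String>> cache = new ConcurrentHashMap<>();
    private final AtomicInteger counter = new AtomicInteger();

    ServiceDiscoverySketch(Function<String, List<String>> registryLookup) {
        this.registryLookup = registryLookup;
    }

    // Pick one provider address for the service.
    String select(String service) {
        // Only the first call for a service reaches the registry; later calls hit the cache.
        List<String> addresses = cache.computeIfAbsent(
                service, name -> new ArrayList<>(registryLookup.apply(name)));
        if (addresses.isEmpty()) {
            throw new IllegalStateException("no provider for " + service);
        }
        // Round-robin: floorMod keeps the index valid even if the counter overflows.
        int index = Math.floorMod(counter.getAndIncrement(), addresses.size());
        return addresses.get(index);
    }
}
```

A cache like this goes stale when providers come and go, so a real consumer also needs some way to refresh it when the provider list changes.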

3) Registry service

Reliable addressing (primarily service discovery) is a cornerstone of any RPC implementation. ZooKeeper, for example, can provide the registry service.

  • After startup, the service provider registers its machine IP, port, and service list with the registry.
  • When it starts, the service consumer obtains the provider address list from the registry, which it uses to implement soft load balancing and failover.
  • The provider must send heartbeats to the registry periodically. If the registry receives no heartbeat from a provider for a period of time, it considers that provider to have stopped serving and removes the corresponding service entry from the registry.
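The heartbeat rule in the last bullet can be modeled with a timestamp per provider. This is a simplified sketch under stated assumptions: HeartbeatRegistry, its timeout parameter, and the injected clock are hypothetical, and a real registry would run the eviction on a schedule rather than on demand.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

final class HeartbeatRegistry {
    private final Map<String, Long> lastHeartbeat = new ConcurrentHashMap<>();
    private final long timeoutMillis;
    private final LongSupplier clock; // injected so the behavior can be tested without sleeping

    HeartbeatRegistry(long timeoutMillis, LongSupplier clock) {
        this.timeoutMillis = timeoutMillis;
        this.clock = clock;
    }

    // Called by a provider on registration and on every periodic heartbeat.
    void heartbeat(String providerAddress) {
        lastHeartbeat.put(providerAddress, clock.getAsLong());
    }

    // Remove every provider whose last heartbeat is older than the timeout.
    List<String> evictExpired() {
        long now = clock.getAsLong();
        List<String> evicted = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = lastHeartbeat.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (now - entry.getValue() > timeoutMillis) {
                String address = entry.getKey();
                it.remove();
                evicted.add(address);
            }
        }
        return evicted;
    }

    // Providers currently considered alive, in sorted order.
    Set<String> aliveProviders() {
        return new TreeSet<>(lastHeartbeat.keySet());
    }
}
```

In production code the clock would typically be System::currentTimeMillis.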

3. Network transmission

Which protocol is used to transmit the data, and how should the data be serialized and deserialized?
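One common answer to both questions is a length-prefixed frame around a serialized payload. The sketch below assumes a 4-byte big-endian length header followed by a JDK-serialized body; this is an illustrative format, not Dubbo's actual wire protocol, and FrameCodec is a hypothetical name.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;

final class FrameCodec {

    // Encode: serialize the payload, then prepend its length as a 4-byte header.
    static byte[] encode(Serializable payload) {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(body)) {
            out.writeObject(payload);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] bytes = body.toByteArray();
        return ByteBuffer.allocate(4 + bytes.length).putInt(bytes.length).put(bytes).array();
    }

    // Decode: read the header, check that it matches the body size, then deserialize.
    static Object decode(byte[] frame) {
        ByteBuffer buffer = ByteBuffer.wrap(frame);
        int length = buffer.getInt();
        if (length != buffer.remaining()) {
            throw new IllegalArgumentException("frame length header does not match body size");
        }
        byte[] bytes = new byte[length];
        buffer.get(bytes);
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The length header is what lets a receiver on a TCP stream know where one message ends and the next begins.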

4. NIO communication

At present, many RPC frameworks are built directly on Netty, such as Alibaba's HSF, Dubbo, and Hadoop Avro. Netty is the recommended choice for the underlying communication framework.

5. Service invocation

For example, after machine B performs the local call (through a proxy) and obtains the return value, that value still has to be sent back to machine A. It is serialized in the same way and sent back to machine A over the network as binary data. When machine A receives the return value, it deserializes it again.

In short, implementing an RPC call is not difficult. The difficult part is implementing a high-performance, reliable RPC framework. To go further, study the Dubbo source code and see how Dubbo solves these problems.

About the author: mikechen has more than ten years of BAT architecture experience and is a senior technical expert who previously worked at Alibaba, Taobao, and Baidu.