Dubbo itself is not complicated, and the official documentation is clear and detailed. Interviews usually do not go very deep on Dubbo: the questions range from its layering to its working principle, load balancing strategies, fault tolerance mechanisms, and SPI mechanism. The biggest question is usually how you would design an RPC framework yourself, and if you understand Dubbo's layers and how it works, you already have most of the answer.

How is Dubbo layered?

At a high level, Dubbo is divided into three layers: Business, RPC, and Remoting. The Business layer is the business logic layer, where we provide our own interfaces, implementations, and configuration. The RPC layer is the real core of an RPC call; it encapsulates the whole call process, including load balancing, cluster fault tolerance, and proxying. The Remoting layer encapsulates the network transport protocols and data conversion.

At a finer level, there are the 10 layers shown in the figure. Each layer depends on the one below it, from top to bottom. Apart from the business logic layer, the layers are extension points built on the SPI mechanism.

Can you explain how Dubbo works?

  1. When the services start, the provider and the consumer connect to the registry based on their configuration: the provider registers its service and the consumer subscribes to it.
  2. The registry returns the provider information to the consumer based on its subscription, and the consumer caches that information locally. If the information changes, the registry pushes the update to the consumer.
  3. The consumer generates a proxy object and selects a provider based on the load balancing policy. The number and duration of interface invocations are reported to the Monitor periodically.
  4. Once it has the proxy object, the consumer makes interface calls through it.
  5. After receiving a request, the provider deserializes the data and then invokes the concrete interface implementation through its proxy.
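
The register, subscribe, and push steps above can be sketched with a small in-memory registry. This is only an illustration under stated assumptions: MiniRegistry and its methods are hypothetical names, addresses are plain strings, and a real registry (ZooKeeper or Nacos, for example) runs as a separate process and pushes changes over the network.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hypothetical in-memory registry, for illustration only.
class MiniRegistry {
    private final Map<String, List<String>> providers = new ConcurrentHashMap<>();
    private final Map<String, List<Consumer<List<String>>>> listeners = new ConcurrentHashMap<>();

    // Provider side: register an address, then push the new list to subscribers.
    void register(String service, String address) {
        providers.computeIfAbsent(service, k -> new CopyOnWriteArrayList<>()).add(address);
        for (Consumer<List<String>> listener
                : listeners.getOrDefault(service, Collections.emptyList())) {
            listener.accept(snapshot(service));
        }
    }

    // Consumer side: subscribe and immediately receive the current provider list.
    void subscribe(String service, Consumer<List<String>> listener) {
        listeners.computeIfAbsent(service, k -> new CopyOnWriteArrayList<>()).add(listener);
        listener.accept(snapshot(service));
    }

    // Copy of the current provider addresses, so callers can cache it safely.
    private List<String> snapshot(String service) {
        return new ArrayList<>(providers.getOrDefault(service, Collections.emptyList()));
    }
}
```

A consumer would pass a listener that replaces its local cache each time it is called, which mirrors step 2: the cache is filled on subscription and refreshed whenever a provider registers.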

Why communicate through proxy objects?

The main purpose is to make the interface call transparent: the proxy encapsulates the call details so that users can call remote methods just like local ones. The proxy is also a convenient place to implement other strategies, such as:

1. Choosing a provider according to the load balancing policy

2. Handling call failures, timeouts, degradation, and fault tolerance

3. Filtering, such as adding caches or mock data

4. Collecting interface call statistics
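
As a rough illustration of a transparent proxy, the sketch below uses the JDK's java.lang.reflect.Proxy. GreetingService, RpcProxySketch, and the fake "remote:" response are hypothetical names invented for this example, and the sketch assumes the interface methods return String. A real framework would select a provider, serialize the request, and send it over the network inside the handler, and Dubbo's own proxy generation may differ.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical service interface, used only for this illustration.
interface GreetingService {
    String greet(String name);
}

class RpcProxySketch {
    // Simple call statistics gathered inside the proxy.
    static final AtomicInteger CALLS = new AtomicInteger();

    @SuppressWarnings("unchecked")
    static <T> T createProxy(Class<T> iface) {
        InvocationHandler handler = (proxy, method, args) -> {
            if (method.getDeclaringClass() == Object.class) {
                // toString/hashCode/equals are not remote calls; answer them locally.
                return method.invoke(iface, args);
            }
            CALLS.incrementAndGet();
            // Assumption: a real handler would pick a provider, serialize the
            // request, send it over the network, and decode the response here.
            return "remote:" + method.getName() + Arrays.toString(args);
        };
        return (T) Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] {iface}, handler);
    }
}
```

The caller only sees the interface: it obtains a GreetingService from createProxy and calls greet as if it were a local object, while load balancing, fault tolerance, and statistics can all be added inside the handler.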

What is the process of service exposure?

  1. At container startup, Dubbo tag parsers are created to parse the Dubbo tags into ServiceConfig. Once the container has been created, the ContextRefreshedEvent callback is triggered and service exposure begins.
  2. An Invoker is obtained from ProxyFactory; it wraps the object that will execute the method together with the corresponding URL.
  3. Through the DubboProtocol implementation, the wrapped Invoker is converted into an Exporter, and the server is started to listen on the port.
  4. Finally, RegistryProtocol saves the mapping between the URL and the Invoker and registers the service with the registry.

What is the flow of service references?

After the service is exposed, the client references the service and then invokes it.

  1. First, the client subscribes to the service from the registry based on its configuration file.

  2. Then DubboProtocol uses the provider address and interface information obtained from the subscription to connect to the server, start the client, and create the Invoker.

  3. Once the Invoker is created, a proxy object is generated for the service interface. This proxy is what the client uses to invoke the provider remotely, and the service reference is complete.

What are the load balancing policies?

  1. Weighted random: Suppose we have a set of servers = [A, B, C] with weights = [5, 3, 2], so the total weight is 10. Lay these weights out along a one-dimensional axis: the interval [0, 5) belongs to server A, [5, 8) to server B, and [8, 10) to server C. A random number in [0, 10) is then generated, and the server whose interval contains it is selected.
  2. Least active: Each service provider has an active count, which starts at 0. The count is increased by one for each request received and decreased by one when the request completes. After the service has been running for a while, the better-performing providers process requests faster, so their active counts drop faster and they get priority for new requests.
  3. Consistent hash: Each provider's Invoker, together with its virtual nodes, is hashed and projected onto the ring [0, 2^32 - 1]. At query time, the key is digested with MD5 and then hashed onto the same ring, and the Invoker of the first node whose hash is greater than or equal to the key's hash is selected, wrapping around to the first node on the ring if there is none.

Image from the official Dubbo website

  4. Weighted round robin: For example, if the weights of servers A, B, and C are 5:2:1, then out of every eight requests server A receives five, server B receives two, and server C receives one.
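
Three of the policies above can be sketched in a few lines each. This is an illustration rather than Dubbo's actual code: the class names are hypothetical, servers are represented by plain strings, the round robin uses the smooth weighted variant (an assumption about which variant to show), and the consistent hash takes the low 32 bits of an MD5 digest as the ring position.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class WeightedRandomSketch {
    // Maps an offset in [0, totalWeight) to the server whose interval contains it.
    static String pickByOffset(String[] servers, int[] weights, int offset) {
        if (offset < 0) {
            throw new IllegalArgumentException("offset must not be negative");
        }
        for (int i = 0; i < servers.length; i++) {
            if (offset < weights[i]) {
                return servers[i];
            }
            offset -= weights[i];
        }
        throw new IllegalArgumentException("offset must be below the total weight");
    }
}

class SmoothWeightedRoundRobin {
    private final String[] servers;
    private final int[] weights;
    private final int[] current; // running score per server, starts at 0
    private int total;

    SmoothWeightedRoundRobin(String[] servers, int[] weights) {
        this.servers = servers.clone();
        this.weights = weights.clone();
        this.current = new int[weights.length];
        for (int w : weights) {
            total += w;
        }
    }

    // Add each weight to its score, pick the highest score, then charge the
    // winner the total weight so the other servers catch up on later calls.
    synchronized String next() {
        int best = 0;
        for (int i = 0; i < servers.length; i++) {
            current[i] += weights[i];
            if (current[i] > current[best]) {
                best = i;
            }
        }
        current[best] -= total;
        return servers[best];
    }
}

class ConsistentHashRing {
    private final TreeMap<Long, String> ring = new TreeMap<>();

    ConsistentHashRing(List<String> invokers, int virtualNodes) {
        if (invokers.isEmpty() || virtualNodes < 1) {
            throw new IllegalArgumentException("need at least one invoker and one virtual node");
        }
        for (String invoker : invokers) {
            for (int i = 0; i < virtualNodes; i++) {
                ring.put(hash(invoker + "#" + i), invoker);
            }
        }
    }

    // First node at or after the key's position, wrapping around to the start.
    String select(String key) {
        Map.Entry<Long, String> entry = ring.ceilingEntry(hash(key));
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }

    // MD5 digest reduced to a position in [0, 2^32 - 1].
    static long hash(String s) {
        try {
            byte[] d = MessageDigest.getInstance("MD5").digest(s.getBytes(StandardCharsets.UTF_8));
            return ((long) (d[3] & 0xFF) << 24) | ((long) (d[2] & 0xFF) << 16)
                    | ((long) (d[1] & 0xFF) << 8) | (d[0] & 0xFF);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 should be available on every Java platform", e);
        }
    }
}
```

With servers [A, B, C] and weights [5, 3, 2], offsets 0 to 4 map to A, 5 to 7 to B, and 8 to 9 to C; in practice the offset would come from a random source such as ThreadLocalRandom.current().nextInt(10). With weights 5:2:1, eight consecutive calls to next() return A, B, A, A, C, A, B, A, so the heavier server is spread across the cycle rather than hit five times in a row.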

What are the cluster fault tolerant methods?

  1. Failover Cluster (automatic failover): Dubbo's default fault tolerance scheme. When a call fails, it automatically switches to another available node. The number of retries and the interval can be set when the service is referenced.
  2. Failback Cluster (automatic recovery): After an invocation fails, the failure is logged and the call information is recorded, and an empty result is returned to the consumer. The failed invocation is retried every 5 seconds by a scheduled task.
  3. Failfast Cluster (fail fast): The call is made only once, and an exception is thrown immediately if it fails.
  4. Failsafe Cluster (fail safe): If an exception occurs during the invocation, it is only logged and not thrown, and an empty result is returned.
  5. Forking Cluster (parallel invocation): Multiple threads are created from a thread pool and several providers are invoked concurrently. The results are put into a blocking queue, and the call returns as soon as one provider returns successfully.
  6. Broadcast Cluster (broadcast): Each provider is invoked one by one. If any provider reports an error, an exception is thrown after all the invocations have finished.
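
The difference between the first few strategies is mostly about what happens to the exception. The sketch below shows failover, failfast, and failsafe under simplifying assumptions: ClusterSketch is a hypothetical class, each provider is modeled as a Supplier that either returns a value or throws, and failover simply tries the providers in list order instead of consulting a load balancer.

```java
import java.util.List;
import java.util.function.Supplier;

// Hypothetical helper, for illustration only; not Dubbo's Cluster API.
class ClusterSketch {
    // Failover: on failure, move on to the next provider until the retries run out.
    static <T> T failover(List<Supplier<T>> providers, int retries) {
        RuntimeException last = null;
        int attempts = Math.min(retries + 1, providers.size());
        for (int i = 0; i < attempts; i++) {
            try {
                return providers.get(i).get();
            } catch (RuntimeException e) {
                last = e; // remember the failure and try the next provider
            }
        }
        throw new IllegalStateException("all " + attempts + " attempts failed", last);
    }

    // Failfast: call once and let any exception propagate immediately.
    static <T> T failfast(Supplier<T> provider) {
        return provider.get();
    }

    // Failsafe: log the exception, swallow it, and return an empty result.
    static <T> T failsafe(Supplier<T> provider) {
        try {
            return provider.get();
        } catch (RuntimeException e) {
            System.err.println("ignored failure: " + e.getMessage());
            return null;
        }
    }
}
```

Failover suits idempotent reads, where a retry on another node is harmless, while failfast is the safer choice for non-idempotent writes, and failsafe fits side tasks such as audit logging where a failure should not affect the caller.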

Do you understand the Dubbo SPI mechanism?

SPI stands for Service Provider Interface and is a service discovery mechanism. Essentially, the fully qualified name of an interface's implementation class is configured in a file, and a service loader reads the configuration file and loads the implementation class. In this way, the implementation of an interface can be replaced dynamically at runtime.

Dubbo implements many of its extension points through SPI. Instead of using Java's native SPI mechanism, Dubbo uses its own enhanced and improved version.

SPI is used widely in Dubbo, including protocol extensions, cluster extensions, routing extensions, serialization extensions, and so on.

To use it, add a configuration file under the META-INF/dubbo directory:

key=com.xxx.value

Dubbo’s ExtensionLoader then loads the implementation class that corresponds to the specified key. The advantage is that implementations are loaded on demand rather than all at once, which improves performance.
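
The key-to-class lookup with on-demand loading can be imitated with a small loader. This is a simplified sketch, not Dubbo's ExtensionLoader: MiniExtensionLoader is a hypothetical class, the configuration is passed in as a string instead of being read from META-INF/dubbo, and each implementation is assumed to have a public no-argument constructor.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Map;
import java.util.Properties;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical loader, for illustration only.
class MiniExtensionLoader<T> {
    private final Class<T> type;
    private final Properties names = new Properties();
    private final Map<String, T> cache = new ConcurrentHashMap<>();

    // config holds lines of the form key=fully.qualified.ClassName
    MiniExtensionLoader(Class<T> type, String config) {
        this.type = type;
        try {
            names.load(new StringReader(config));
        } catch (IOException e) {
            throw new IllegalArgumentException("unreadable extension config", e);
        }
    }

    // Instantiates the implementation on first use only, then reuses it.
    T getExtension(String name) {
        return cache.computeIfAbsent(name, key -> {
            String className = names.getProperty(key);
            if (className == null) {
                throw new IllegalArgumentException("no extension named " + key);
            }
            try {
                Object instance = Class.forName(className).getDeclaredConstructor().newInstance();
                return type.cast(instance);
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("cannot create " + className, e);
            }
        });
    }
}
```

Java's native ServiceLoader works through every implementation listed for an interface as it is iterated, whereas a keyed lookup like this one only creates the implementation that was asked for. That is the on-demand behavior described above.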

What if you were to implement an RPC framework?

  1. First, you need a service registry so that consumers and providers can register and subscribe to services.
  2. You need a load balancing mechanism to decide which provider the consumer calls, along with fault tolerance and retry mechanisms.
  3. You need a communication protocol and the tooling to go with it: decide on the protocol first (HTTP or RMI, for example), then choose the framework and tools for communication based on that protocol. Serialization of the transmitted data also has to be considered.
  4. Beyond the basics, things like monitoring, a configuration management console, and logging are additional optimizations to consider.

So, essentially, once you are familiar with one or two RPC frameworks, it is easy to work out how to implement one yourself.
