What is Dubbo?

Dubbo is a lightweight, high-performance Java RPC framework.

It provides three core capabilities: interface-oriented remote method invocation, intelligent fault tolerance and load balancing, and automatic service registration and discovery. In short, it offers a high-performance, transparent RPC remote invocation solution together with an SOA service governance solution.

What is RPC?

Remote Procedure Call (RPC) is a protocol for requesting a service from a program on a remote computer over the network without having to understand the underlying network technology. For example, two services A and B are deployed on two different machines. What if service A wants to invoke a method in service B? Using HTTP requests is certainly possible, but it may be slower and some optimizations are hard to do well. RPC was invented to solve this problem.

What is the principle of RPC?



  1. The service consumer (client) invokes the service as if it were a local call.
  2. After receiving the call, the client stub assembles the method and parameters into a message body that can be transmitted over the network.
  3. The client stub finds the service address and sends the message to the server.
  4. The server stub decodes the message after receiving it.
  5. The server stub invokes the local service based on the decoding result.
  6. The local service executes and returns the result to the server stub.
  7. The server stub packages the returned result into a message and sends it to the consumer.
  8. The client stub receives the message and decodes it.
  9. The service consumer gets the final result.
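
The nine steps above can be sketched in a few lines of Java. This is only an illustration under simplifying assumptions: `GreetingService`, `Request`, `Transport`, `clientStub` and `serverStub` are hypothetical names, the "network" is an in-process function call, and the message is passed as an object rather than serialized to bytes.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

public class RpcStubSketch {
    /** Hypothetical service interface shared by consumer and provider. */
    public interface GreetingService {
        String greet(String name);
    }

    /** Message body built by the client stub (step 2). */
    static final class Request {
        final String methodName;
        final Object[] args;

        Request(String methodName, Object[] args) {
            this.methodName = methodName;
            this.args = args;
        }
    }

    /** Stand-in for the network: carries a request to the server and returns the result. */
    interface Transport {
        Object send(Request request) throws Exception;
    }

    /** Client stub: turns a local-looking call into a message and sends it (steps 1-3 and 8-9). */
    @SuppressWarnings("unchecked")
    static <T> T clientStub(Class<T> iface, Transport transport) {
        return (T) Proxy.newProxyInstance(
                iface.getClassLoader(),
                new Class<?>[] {iface},
                (proxy, method, args) ->
                        transport.send(new Request(method.getName(), args == null ? new Object[0] : args)));
    }

    /** Server stub: decodes the message and invokes the local service (steps 4-7). */
    static <T> Object serverStub(Class<T> iface, T localService, Request request) throws Exception {
        for (Method m : iface.getMethods()) {
            if (m.getName().equals(request.methodName)
                    && m.getParameterCount() == request.args.length) {
                return m.invoke(localService, request.args);
            }
        }
        throw new NoSuchMethodException(request.methodName);
    }

    public static void main(String[] args) {
        GreetingService provider = name -> "Hello, " + name;
        GreetingService consumer = clientStub(
                GreetingService.class,
                request -> serverStub(GreetingService.class, provider, request));
        System.out.println(consumer.greet("Dubbo")); // prints "Hello, Dubbo"
    }
}
```

The consumer only ever sees the `GreetingService` interface; everything between the proxy and the provider is what a real framework replaces with serialization and a network connection.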

What problem does RPC solve?

It makes calls between different services in a distributed or microservice system as simple as local calls.

If you have HTTP, why use RPC for service calls?

RPC is a concept and a design intended to solve the problem of invocation between different services. It usually includes a transport protocol and a serialization protocol.

HTTP, by contrast, is a protocol. An RPC framework can use HTTP as its transport protocol or use TCP directly. Different protocols are generally chosen to suit different scenarios.

Transport protocols include HTTP/2, used by the well-known [gRPC](https://grpc.io), and TCP with custom packets, as used by Dubbo.

Serialization protocols include text-based encodings such as XML and JSON, and binary encodings such as protobuf and binpack.

Why use RPC over a custom TCP protocol for communication between back-end processes?

(Misconception) Compared with a custom TCP protocol, HTTP costs more to establish and tear down connections.

In fact, HTTP supports connection pooling and reuse: a number of connections are established and kept open, so connections are not frequently created and destroyed.

Second, HTTP can also encode content with a binary serialization protocol such as protobuf, so the biggest difference lies in the transport protocol.

Packets of the commonly used HTTP/1.1 protocol carry too much information that is of no use to the call. Even if the body uses a binary encoding, the message metadata (the key-value pairs of the header) is text-encoded and takes up a lot of bytes. In the example packet from the original figure, the valid bytes account for only about 30% of the packet, which means about 70% of the bytes are spent transmitting metadata. Of course, the actual message content may be longer than in that example, but the share taken by headers is still significant.
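
To get a feel for the header overhead, the share of payload bytes in a request can be computed directly. The request below is made up for illustration (the path, host and 32-byte body are assumptions, not measured data), and `payloadRatio` is a hypothetical helper.

```java
import java.nio.charset.StandardCharsets;

public class HeaderOverheadSketch {
    /** Fraction of the bytes on the wire that belong to the body rather than the text header. */
    static double payloadRatio(String headerText, byte[] body) {
        int headerBytes = headerText.getBytes(StandardCharsets.US_ASCII).length;
        return (double) body.length / (headerBytes + body.length);
    }

    public static void main(String[] args) {
        // Made-up HTTP/1.1 request, for illustration only.
        String header = "POST /demo.GreetingService/greet HTTP/1.1\r\n"
                + "Host: provider.example.com\r\n"
                + "Content-Type: application/x-protobuf\r\n"
                + "Content-Length: 32\r\n"
                + "Accept: */*\r\n"
                + "\r\n";
        byte[] body = new byte[32]; // pretend this is a 32-byte binary payload
        System.out.printf("payload share: %.0f%%%n", 100 * payloadRatio(header, body));
    }
}
```

The smaller the body, the more the text header dominates, which is why a compact binary header pays off for small, frequent calls.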




HTTP is like Mandarin; RPC is like the private slang of a gang.

The advantage of Mandarin is that everyone can understand it and everyone can speak it.

The advantage of slang is that it can be terser, more confidential, and more customizable. The disadvantage is that the "speaking" side (the client) must understand it too, and once everyone speaks a given slang, it is hard to change.


Why Dubbo?

I think the main reasons for using Dubbo are the following four features:

Load balancing


Service invocation link generation

As the system grows, there are more and more services and the dependencies between them become complicated. It can even become unclear which application has to start before which, and even architects cannot fully describe how the applications relate to each other. Dubbo can help us figure out how services call each other.

Service access pressure and duration statistics, resource scheduling, and management

Manage cluster capacity in real time based on access pressure to improve cluster utilization.

Service degradation

A standby service is invoked when a service fails.
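
The idea of degradation can be sketched as a wrapper that falls back to a standby implementation when the primary call fails. This is a generic illustration, not Dubbo's own mechanism; `callWithFallback` is a hypothetical helper.

```java
import java.util.function.Supplier;

public class DegradeSketch {
    /** Calls the primary service and switches to the standby one if the primary throws. */
    static <T> T callWithFallback(Supplier<T> primary, Supplier<T> standby) {
        try {
            return primary.get();
        } catch (RuntimeException e) {
            return standby.get();
        }
    }

    public static void main(String[] args) {
        Supplier<String> failing = () -> {
            throw new IllegalStateException("provider unavailable");
        };
        Supplier<String> standby = () -> "cached default";
        System.out.println(callWithFallback(failing, standby)); // prints "cached default"
    }
}
```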

Dubbo architecture

Architecture diagram of Dubbo



  • Provider: the service provider that exposes the service
  • Consumer: the service consumer that invokes the remote service
  • Registry: the registry where services are registered and discovered
  • Monitor: the monitoring center that counts service invocations and their duration
  • Container: the container in which the service runs

Call relationship description:

  1. The service container is responsible for starting, loading, and running the service provider.
  2. At startup, service providers register their services with the registry.
  3. At startup, service consumers subscribe to the registry for the services they need.
  4. The registry returns a list of service provider addresses to the consumer. If the list changes, the registry pushes the changed data to the consumer over a long-lived connection.
  5. The service consumer selects one provider from the address list using a soft load balancing algorithm. If the call fails, it selects another provider and tries again.
  6. Service consumers and providers accumulate the number of calls and the call durations in memory and send the statistics to the monitoring center once a minute.
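
Steps 2 to 4 (register, subscribe, push on change) can be sketched with an in-memory registry. This is a toy model under stated assumptions: `RegistrySketch` and its methods are hypothetical, the "push" is a direct callback instead of a long-lived connection, and the service name and addresses are made up.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

public class RegistrySketch {
    private final Map<String, List<String>> providers = new ConcurrentHashMap<>();
    private final Map<String, List<Consumer<List<String>>>> listeners = new ConcurrentHashMap<>();

    /** Step 2: a provider registers its address under a service name. */
    public void register(String service, String address) {
        providers.computeIfAbsent(service, k -> new CopyOnWriteArrayList<>()).add(address);
        push(service);
    }

    /** A provider goes away; subscribers are told about the new list. */
    public void unregister(String service, String address) {
        List<String> addresses = providers.get(service);
        if (addresses != null && addresses.remove(address)) {
            push(service);
        }
    }

    /** Steps 3 and 4: a consumer subscribes and immediately receives the current list. */
    public void subscribe(String service, Consumer<List<String>> listener) {
        listeners.computeIfAbsent(service, k -> new CopyOnWriteArrayList<>()).add(listener);
        listener.accept(snapshot(service));
    }

    private List<String> snapshot(String service) {
        return new ArrayList<>(providers.getOrDefault(service, Collections.emptyList()));
    }

    /** Step 4: push the changed address list to every subscriber of the service. */
    private void push(String service) {
        for (Consumer<List<String>> listener : listeners.getOrDefault(service, Collections.emptyList())) {
            listener.accept(snapshot(service));
        }
    }

    public static void main(String[] args) {
        RegistrySketch registry = new RegistrySketch();
        List<String> localCache = new ArrayList<>(); // the consumer's own copy of the address list
        registry.register("demo.GreetingService", "10.0.0.1:20880");
        registry.subscribe("demo.GreetingService", addresses -> {
            localCache.clear();
            localCache.addAll(addresses);
        });
        registry.register("demo.GreetingService", "10.0.0.2:20880");
        System.out.println(localCache); // prints [10.0.0.1:20880, 10.0.0.2:20880]
    }
}
```

Because the consumer keeps its own copy of the list, it can keep calling known providers even when the registry itself is unreachable.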

Dubbo’s load balancing strategy

What is load balancing?

Suppose a service in our system receives heavy traffic, so we deploy it on multiple servers. When a client initiates a request, any of these servers can handle it, so choosing the right server becomes critical. If a single server ended up handling every request for the service, deploying it on multiple servers would be pointless. Load balancing exists to keep one server from absorbing all the requests, which easily leads to downtime and crashes. The term itself already conveys the idea.

Random LoadBalance (default, weight-based random load balancing)

Random: the selection probability is set according to weight.

The random strategy first checks whether all invokers have the same weight. If they do, the case is simple: use random.nextInt(length) to generate an index at random and select the corresponding Invoker. If no weights are set for the service providers in Dubbo Admin, all invokers have the same weight, which defaults to 100. If the weights differ, the random probability has to be set according to the weights.

The algorithm looks like this: Suppose there are four Invokers.

Invoker   Weight
A         10
B         20
C         20
D         30

The total weight of A, B, C and D is 10 + 20 + 20 + 30 = 80. Distribute the 80 numbers in the following figure:

0         10                  30                  50                            80
|----A----|---------B---------|---------C---------|--------------D--------------|
               ^15                   ^37              ^54

In the figure above there are four regions whose lengths are the weights of A, B, C and D. Use random.nextInt(10 + 20 + 20 + 30) to pick one of the 80 numbers at random, and then determine which region the number falls in. For example, if the random number is 37, it falls in region C, so Invoker C is selected. Likewise, 15 falls in region B and 54 in region D.
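
The region lookup can be sketched as a running subtraction over the weights. This is a minimal illustration of the idea rather than Dubbo's source code; `select` and `selectRandom` are hypothetical names, and invokers are represented by their array index.

```java
import java.util.Random;

public class WeightedRandomSketch {
    /** Returns the index of the invoker whose region contains the given offset. */
    static int select(int[] weights, int offset) {
        for (int i = 0; i < weights.length; i++) {
            offset -= weights[i];
            if (offset < 0) {
                return i;
            }
        }
        throw new IllegalArgumentException("offset must be smaller than the total weight");
    }

    /** Draws an offset in [0, totalWeight) and maps it to an invoker index. */
    static int selectRandom(int[] weights, Random random) {
        int total = 0;
        for (int w : weights) {
            total += w;
        }
        return select(weights, random.nextInt(total));
    }

    public static void main(String[] args) {
        String[] names = {"A", "B", "C", "D"};
        int[] weights = {10, 20, 20, 30};
        System.out.println(names[select(weights, 37)]); // prints "C"
        System.out.println(names[select(weights, 15)]); // prints "B"
        System.out.println(names[select(weights, 54)]); // prints "D"
    }
}
```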


RoundRobin LoadBalance (not recommended, weighted round-robin load balancing)

Round-robin: the polling ratio is set according to the weights after reducing them by their common divisor.

Round-robin load balancing calls all providers in turn. Like the random strategy, it also has the concept of weights. The round-robin algorithm distributes RPC calls exactly in the proportions we set, whether there are few calls or many.

However, round-robin also has a drawback: requests pile up on slow providers. For example, if the second machine is slow but has not gone down, every request routed to it gets stuck there, and over time more and more requests end up stuck on that machine.
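
One common way to implement weighted round-robin is the "smooth" variant sketched below: every provider's running score grows by its weight each round, the highest score wins, and the winner's score is reduced by the total weight. This is an illustration of the technique, not necessarily what a given Dubbo version does; the class name and the weights 5, 1, 1 are assumptions.

```java
public class WeightedRoundRobinSketch {
    private final String[] names;
    private final int[] weights;
    private final int[] current; // running score per provider
    private final int totalWeight;

    public WeightedRoundRobinSketch(String[] names, int[] weights) {
        this.names = names;
        this.weights = weights;
        this.current = new int[weights.length];
        int total = 0;
        for (int w : weights) {
            total += w;
        }
        this.totalWeight = total;
    }

    /** Picks the next provider; ties go to the provider listed first. */
    public synchronized String next() {
        int best = 0;
        for (int i = 0; i < weights.length; i++) {
            current[i] += weights[i];
            if (current[i] > current[best]) {
                best = i;
            }
        }
        current[best] -= totalWeight;
        return names[best];
    }

    public static void main(String[] args) {
        WeightedRoundRobinSketch lb =
                new WeightedRoundRobinSketch(new String[] {"A", "B", "C"}, new int[] {5, 1, 1});
        StringBuilder order = new StringBuilder();
        for (int i = 0; i < 7; i++) {
            order.append(lb.next());
        }
        System.out.println(order); // prints "AABACAA"
    }
}
```

Over any seven consecutive calls A is chosen five times and B and C once each, which matches the 5:1:1 weights exactly. Note that the algorithm ignores how busy each provider is, which is the weakness described above.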

LeastActive LoadBalance (least active calls)

The goal is for slower machines to receive fewer requests.

The provider with the fewest active calls is chosen, and among providers with the same active count one is picked at random. The active count is the difference between the counts taken before and after a call, that is, the number of calls that have started but not yet finished.

This makes slower providers receive fewer requests, because a slower provider shows a larger difference between the two counts.

Each service maintains an active-count counter. When machine A starts processing a request, its counter increases by 1, and it stays there while A is still processing. When processing completes, the counter decreases by 1. Machine B receives a request and finishes it quickly. The active counts of A and B are then 1 and 0. When a new request arrives, machine B is chosen to execute it because it has the smallest active count, so the slower machine A receives fewer requests.

If only one Invoker has the minimum active count, that Invoker is returned directly. If several invokers share the minimum, their weights differ, and totalWeight is greater than 0, a random weight is generated in the range [0, totalWeight) and the Invoker is selected according to that random weight.
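
The selection rule can be sketched as follows. It is a simplified illustration, not Dubbo's source: `Invoker` here is a hypothetical class holding a name, a weight and an active counter, and the sketch applies the weighted draw whenever the total weight is positive (with equal weights this reduces to a uniform pick).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;
import java.util.concurrent.atomic.AtomicInteger;

public class LeastActiveSketch {
    /** Hypothetical provider handle: the caller increments active before a call and decrements it after. */
    static final class Invoker {
        final String name;
        final int weight;
        final AtomicInteger active = new AtomicInteger();

        Invoker(String name, int weight) {
            this.name = name;
            this.weight = weight;
        }
    }

    static Invoker select(List<Invoker> invokers, Random random) {
        int least = Integer.MAX_VALUE;
        List<Invoker> candidates = new ArrayList<>();
        for (Invoker invoker : invokers) {
            int active = invoker.active.get();
            if (active < least) {          // new minimum: restart the candidate list
                least = active;
                candidates.clear();
                candidates.add(invoker);
            } else if (active == least) {  // same minimum: one more candidate
                candidates.add(invoker);
            }
        }
        if (candidates.size() == 1) {
            return candidates.get(0);
        }
        int totalWeight = 0;
        for (Invoker candidate : candidates) {
            totalWeight += candidate.weight;
        }
        if (totalWeight > 0) {
            int offset = random.nextInt(totalWeight); // in [0, totalWeight)
            for (Invoker candidate : candidates) {
                offset -= candidate.weight;
                if (offset < 0) {
                    return candidate;
                }
            }
        }
        return candidates.get(random.nextInt(candidates.size()));
    }

    public static void main(String[] args) {
        Invoker a = new Invoker("A", 100);
        Invoker b = new Invoker("B", 100);
        a.active.incrementAndGet(); // A is still busy with one request
        System.out.println(select(Arrays.asList(a, b), new Random()).name); // prints "B"
    }
}
```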

ConsistentHash LoadBalance (consistent hashing)

Requests with the same parameters are always sent to the same provider. (If you do not want random load balancing and want a given kind of request to always land on the same node, use this consistent hash strategy.)

When a provider goes down, the requests originally sent to it are spread across the other providers by means of virtual nodes, without causing drastic changes.
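
A hash ring with virtual nodes can be sketched with a sorted map. This is a generic illustration under assumptions: the class and method names are hypothetical, FNV-1a is used as a simple stand-in hash, and 160 virtual nodes per provider is just an example value.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ConsistentHashSketch {
    private final TreeMap<Integer, String> ring = new TreeMap<>();
    private final int virtualNodes;

    public ConsistentHashSketch(List<String> providers, int virtualNodes) {
        this.virtualNodes = virtualNodes;
        for (String provider : providers) {
            add(provider);
        }
    }

    /** Places the provider on the ring once per virtual node. */
    public void add(String provider) {
        for (int i = 0; i < virtualNodes; i++) {
            ring.put(hash(provider + "#" + i), provider);
        }
    }

    /** Removes only the ring positions that still belong to this provider. */
    public void remove(String provider) {
        for (int i = 0; i < virtualNodes; i++) {
            ring.remove(hash(provider + "#" + i), provider);
        }
    }

    /** Walks clockwise from the key's hash to the next virtual node, wrapping around at the end. */
    public String select(String requestKey) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("no providers available");
        }
        Map.Entry<Integer, String> entry = ring.ceilingEntry(hash(requestKey));
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }

    /** 32-bit FNV-1a, used here only as a simple, deterministic hash. */
    static int hash(String s) {
        int h = 0x811c9dc5;
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            h ^= (b & 0xff);
            h *= 16777619;
        }
        return h;
    }

    public static void main(String[] args) {
        ConsistentHashSketch ring =
                new ConsistentHashSketch(Arrays.asList("10.0.0.1", "10.0.0.2", "10.0.0.3"), 160);
        String first = ring.select("userId=42");
        System.out.println(first.equals(ring.select("userId=42"))); // prints "true"
    }
}
```

Removing a provider only reassigns the keys that used to land on its virtual nodes; keys that mapped to the surviving providers stay where they were.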

ZooKeeper outage and direct connection in Dubbo

If the ZooKeeper registry goes down, the service consumer can still call the provider's services for a period of time. It does so by using its local cache of provider addresses, which is just one aspect of Dubbo's robustness.

Dubbo robustness

  1. If the monitoring center goes down, the system keeps working; only some sampling data is lost.
  2. If the database goes down, the registry can still serve service list queries from its cache, but it cannot register new services.
  3. If one node of a registry peer cluster fails, clients automatically switch to another node.
  4. If all registries go down, service providers and service consumers can still communicate through their local caches.
  5. Service providers are stateless. If any one provider goes down, the service is not affected.
  6. If all service providers go down, the service consumer application can no longer use the service and keeps reconnecting indefinitely, waiting for a provider to recover.

Dubbo service exposure

After Spring has instantiated the beans, it publishes the ContextRefreshedEvent in the last step of refreshing the container. This notifies the ServiceBean class, which implements ApplicationListener, and its onApplicationEvent method is called back. In this method Dubbo calls the export method of ServiceConfig, the parent class of ServiceBean, which actually publishes the service (asynchronously or synchronously).

Dubbo protocols

dubbo

A single long-lived connection with NIO asynchronous communication. It suits service calls with high concurrency and small payloads, where consumers far outnumber providers. Transport is TCP, asynchronous, with Hessian serialization.

rmi

Implemented with the JDK's standard RMI protocol. Parameters and return values must implement the Serializable interface and are serialized with Java's standard serialization mechanism. It uses multiple blocking short connections over TCP with synchronous transmission, handles mixed packet sizes, suits roughly equal numbers of consumers and providers, and can transfer files. It fits ordinary remote service calls and interoperation with RMI. Java serialization is exposed to security vulnerabilities when the application depends on older versions of the commons-collections package.

webservice

A remote invocation protocol based on WebService, implemented by integrating CXF, which provides interoperability with native WebService. It uses multiple short connections over HTTP with synchronous transmission, and suits system integration and cross-language calls.

http

A remote invocation protocol based on HTTP form submission, implemented with Spring's HttpInvoker. It uses multiple short connections over HTTP, handles parameters of mixed sizes, and suits cases with more providers than consumers where the service must be callable both by applications and by JavaScript in the browser.

hessian

Integrates the Hessian service. It communicates over HTTP and exposes services through a Servlet, with Jetty embedded in Dubbo as the default server implementation, and it interoperates with native Hessian services. It uses multiple short connections, synchronous HTTP transmission and Hessian serialization. It suits large incoming parameters and cases with more providers than consumers, puts more pressure on the provider, and can transfer files.

memcache

An RPC protocol implemented on top of memcached.

redis

An RPC protocol implemented on top of Redis.


Are calls between Dubbo services blocking?





Old thread pool model:

  1. The business thread makes a request and gets a Future instance.
  2. The business thread then calls future.get and blocks, waiting for the business result to return.
  3. When the business data is returned, it is deserialized by a separate consumer-side thread pool, and future.set is called to store the deserialized business result.
  4. The business thread returns the result directly.

ThreadlessExecutor model:

  1. The business thread makes a request and gets a Future instance.
  2. Before calling future.get(), the business thread calls ThreadlessExecutor.wait(), which makes it wait on a blocking queue until an element is added to the queue.
  3. When the business data is returned, a Runnable task is generated and placed in the ThreadlessExecutor queue.
  4. The business thread takes the task out and executes it on its own thread: it deserializes the business data and sets it on the Future.
  5. The business thread returns the result directly.

Compared with the old thread pool model, the business thread itself is responsible for watching for and parsing the returned result, which eliminates the extra overhead of the consumer-side thread pool.
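
The second model can be sketched with a blocking queue and a CompletableFuture. This is a simplified illustration, not Dubbo's ThreadlessExecutor: `ThreadlessSketch`, `submit`, `waitAndRun` and `decode` are hypothetical names, the I/O thread is simulated with a plain Thread, and "deserialization" is just trimming a string.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;

public class ThreadlessSketch {
    /** Queue on which the business thread waits for the response task. */
    private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();

    /** Called by the I/O thread: hand over a task instead of decoding on the I/O side. */
    void submit(Runnable task) {
        queue.add(task);
    }

    /** Called by the business thread: block until a task arrives, then run it on this thread. */
    void waitAndRun() {
        try {
            queue.take().run();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the response", e);
        }
    }

    /** Stand-in for deserializing the response bytes. */
    static String decode(String raw) {
        return raw.trim();
    }

    static String call(String rawResponse) {
        ThreadlessSketch executor = new ThreadlessSketch();
        CompletableFuture<String> future = new CompletableFuture<>();
        // Simulated I/O thread: when the raw response arrives, it only enqueues a task.
        Thread ioThread = new Thread(() ->
                executor.submit(() -> future.complete(decode(rawResponse))));
        ioThread.start();
        executor.waitAndRun(); // the business thread decodes and completes the future itself
        return future.join();  // already completed, returns immediately
    }

    public static void main(String[] args) {
        System.out.println(call("  pong  ")); // prints "pong"
    }
}
```

No extra consumer-side pool is involved: the only thread that touches the decoded result is the one that asked for it.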


What are the differences between Dubbo and Spring Cloud?

Dubbo is a product of the SOA era. It focuses on service invocation, traffic distribution, traffic monitoring, and circuit breaking.

Spring Cloud was born in the era of microservice architecture. It considers all aspects of microservice governance and builds on the strengths of Spring and Spring Boot.

The two frameworks had different goals from the start: Dubbo is positioned as service governance, while Spring Cloud builds an ecosystem.

At the bottom layer, Dubbo uses an NIO framework such as Netty and transmits over TCP. Combined with Hessian serialization, this completes the RPC communication.

Spring Cloud makes remote calls through REST interfaces over HTTP. Relatively speaking, HTTP requests have larger packets and take up more bandwidth. However, REST is more flexible than RPC: the service provider and the caller depend only on a contract, with no strong dependency at the code level. This is a better fit for a microservice environment that emphasizes rapid evolution.