This is WHY Technology’s 30th original article.

This is probably the first article on the web to address one of the improvements in Dubbo’s 2.7.5 milestone release: client-side threading model optimization.

A word of dissuasion first: this article contains 8,190 words and 54 images. You should have a good understanding of Dubbo before reading; the content is fairly hardcore, so please read it carefully.


It doesn’t matter if you can’t follow everything; I really worked hard on this, so please help me out by scrolling to the end and giving it a like.

This article’s table of contents:

Section 1: The official release

This section introduces the topic of this article, client-side threading model optimization, starting from the official article “Dubbo Releases Milestone Version, Performance Increases by 30%”.

Section 2: The introduction on the official website

Before introducing the optimized consumer-side threading model, this section briefly describes what Dubbo’s threading model is. I found the official documentation’s coverage of this part very brief, so I supplemented it with code.

Section 3: Problems with the threading model prior to version 2.7.5

Through a series of issues, this section analyzes the problem: when a consumer application must consume a large number of services under heavy concurrent traffic (a typical gateway scenario), it often allocates an excessive number of threads.

Section 4: What is ThreadlessExecutor

This section gives a brief introduction to the new version’s solution: ThreadlessExecutor.

Section 5: Reproducing the scenario

Reproducing the scenario was a hassle due to my limited environment, but I found a good case in issue#890, so I borrowed it here.

Section 6: Comparison of old and new threading models

This section serves as a guide, comparing the invocation flow of the old and new threading models and the key code changes from 2.7.4.1 to 2.7.5.

Section 7: Dubbo version introduction

Dubbo currently has two major release lines: 2.6.x and 2.7.x.

The official release

On January 9, 2020, the Alibaba middleware team published an article entitled “Dubbo Releases Milestone Version, Performance Increases by 30%”:


The article says this is a landmark version of Dubbo.

After reading it, I realized that it is indeed a landmark release, one that shows the strong vitality and positive spirit of exploration in Dubbo’s turbulent life.

The strong vitality is reflected in the flood of community feedback after the new version’s release, some of it praise and some of it ridicule.

The spirit of exploration is reflected in Dubbo’s exploration of multilingualism and protocol penetration.

That article lists 9 major changes. This article covers only one change in version 2.7.5: the optimized consumer-side threading model.

Most of the source code in this article is from version 2.7.5, with version 2.7.4.1 used for comparison.

Introduction on the official website

Before introducing the optimized consumer-side threading model, let’s briefly describe what Dubbo’s threading model is.

Dubbo’s official documentation is a very good introductory learning resource; many knowledge points are written up in great detail.

Unfortunately, the threading model section is not up to that standard: just a few words, and a diagram that leaves much to be desired:


The official illustration shows no thread “pool” at all, nor the synchronous-to-asynchronous call chain; even the sending and receiving of a remote request and its response are not clearly depicted in this figure.

So I will give a brief introduction combining the official documentation with the 2.7.5 source code. As you read the source, you will find:

On the client side, in addition to the user threads, there is a thread pool named DubboClientHandler-ip:port, whose default implementation is the cached thread pool.


Line 93 in the figure above means that when the client does not specify a threadpool, the cached implementation is used.
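For reference, the relevant code is roughly the following, a sketch reconstructed from my reading of the 2.7.x-era org.apache.dubbo.remoting.transport.AbstractClient (the exact location, line number, and constant names may differ slightly):

protected static ChannelHandler wrapChannelHandler(URL url, ChannelHandler handler) {
    // give the client pool its default thread name (DubboClientHandler)
    url = ExecutorUtil.setThreadName(url, CLIENT_THREAD_POOL_NAME);
    // the "line 93" in question: if the client did not specify a threadpool,
    // fall back to "cached" (DEFAULT_CLIENT_THREADPOOL)
    url = url.addParameterIfAbsent(THREADPOOL_KEY, DEFAULT_CLIENT_THREADPOOL);
    return ChannelHandlers.wrap(handler, url);
}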

The setThreadName method above sets the thread name:

org.apache.dubbo.common.utils.ExecutorUtil#setThreadName
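The method body is short; here is a sketch abridged from the 2.7.x source (details may vary slightly):

public static URL setThreadName(URL url, String defaultName) {
    String name = url.getParameter(THREAD_NAME_KEY);
    if (StringUtils.isEmpty(name)) {
        // no explicit threadname configured: fall back to defaultName-ip:port
        name = defaultName + "-" + url.getAddress();
        url = url.addParameter(THREAD_NAME_KEY, name);
    }
    return url;
}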


You can clearly see that if no thread name is specified, the default is DubboClientHandler-ip:port.

On the server side, in addition to the boss threads and worker threads (IO threads), there is a thread pool named DubboServerHandler-ip:port, whose default implementation is the fixed thread pool.


The thread pool can be configured in dubbo.xml as follows:

<dubbo:protocol name="dubbo" threadpool="xxx"/>

The xxx above can be fixed, cached, limited, or eager, where fixed is the default implementation. And since ThreadPool is an SPI, you can also extend it yourself:
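For illustration, the SPI interface declares fixed as its default, and a custom extension only takes a few lines. The extension class name, package, and pool sizing below are my own inventions, not part of Dubbo:

// the SPI interface (org.apache.dubbo.common.threadpool.ThreadPool):
@SPI("fixed")
public interface ThreadPool {
    @Adaptive({THREADPOOL_KEY})
    Executor getExecutor(URL url);
}

// a minimal custom extension (illustrative):
public class MyThreadPool implements ThreadPool {
    @Override
    public Executor getExecutor(URL url) {
        // a bounded pool with named daemon threads
        return new ThreadPoolExecutor(50, 100, 60, TimeUnit.SECONDS,
                new LinkedBlockingQueue<Runnable>(1024),
                new NamedThreadFactory("MyDubboPool", true));
    }
}

// registered in META-INF/dubbo/org.apache.dubbo.common.threadpool.ThreadPool:
//   mypool=com.example.MyThreadPool
// and enabled with: <dubbo:protocol name="dubbo" threadpool="mypool"/>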


So, based on the latest version 2.7.5, this red box at the bottom of the official document is misleading:


From the SPI interface, fixed is indeed the default.

But since the client adds one line of code (line 93) before initializing the thread pool, the default implementation on the client is cached, while on the server it is fixed.

I have also looked at earlier versions: at least as far back as 2.6.0 (I did not check further), the default implementation of the client thread pool has been cached.

There is no problem with the description of the Dispatcher section:


The Dispatcher part is an important point in the threading model, as discussed later.

Here is a slightly more detailed diagram of the threading model before version 2.7.5, for your reference:


Image source: https://github.com/apache/dubbo/issues/890

Problems with the threading model prior to 2.7.5

Since the threading model was improved, what problems did the previous one have?

In the article “Dubbo Releases Milestone Version, Performance Increases by 30%”, it is described like this:

For Dubbo applications before version 2.7.5, especially consumer applications, when faced with heavy traffic scenarios that must consume a large number of services with high concurrency (such as gateway scenarios), the consumer side often ends up allocating an excessive number of threads.

At the same time, the article gives a link to an issue:

https://github.com/apache/dubbo/issues/2013

In this section, I will follow issue#2013 to give you an overview of the problems with the threading model prior to Dubbo 2.7.5. Specifically, the problems with the client-side threading model:


First, analyzing issue#1932, Jaskey said that in some cases too many threads were created, so the process ran into OOM problems.

After analyzing the problem, he found the root cause: the client used a cached thread pool (that is, the client threadpool implementation was cached), which does not limit the number of threads.

Next, let’s go to issue#1932 and see what it says:

https://github.com/apache/dubbo/issues/1932


You can see that issue#1932 was also raised by Jaskey, and it expresses essentially the same idea: why, when I set actives=20, are there over 10,000 threads named DubboClientHandler on the client side in the BLOCKED state? Is this a bug?

For this issue, let me give the answer first: it is not a bug!

Let’s first look at what actives=20 means:


As the official site explains, actives=20 means a maximum of 20 concurrent invocations per method per service consumer.

That is, for a method the server provides, the client allows at most 20 requests to be in flight at the same time. But the client’s threadpool implementation is cached, and each received response is handed off to that cached pool for processing. So when there are many slow requests, the number of client threads far exceeds 20.
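For reference, actives is typically configured on the reference side like this (the interface name here is illustrative):

<dubbo:reference interface="com.example.DemoService" actives="20" />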

The actives configuration was also explained in my earlier article “The Minimum Active Number Algorithm for Dubbo Load Balancing”. It works together with ActiveLimitFilter; it defaults to 0, meaning unlimited, and when actives>0 the ActiveLimitFilter automatically takes effect. Since it is not the focus of this article, I will not elaborate here; read that earlier article if you are interested.

Following issue#2013, we can see this question mentioned by issue#1896:


Question 1: as I explained earlier, the first half of his guess is right and the second half is wrong. No more on that.

The main point here is question 2: with more service providers, more thread pools are maintained on the consumer side. As a result, even though the service providers have plenty of capacity, the consumer side pays a huge thread cost. He is asking for the same thing as issue#4467 below: a shared thread pool.

Scrolling down, we come to issue#4467 and issue#5490:


In issue#4467, CodingSinger asked: why does Dubbo create a thread pool for every connection?


Looking at Dubbo 2.7.4.1, we can see that a thread pool is indeed created for each connection, in the WrappedChannelHandler constructor:
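Here is an abridged sketch of that 2.7.4.1 constructor, reconstructed from memory of the source (the DataStore bookkeeping is trimmed):

public WrappedChannelHandler(ChannelHandler handler, URL url) {
    this.handler = handler;
    this.url = url;
    // every WrappedChannelHandler instance -- that is, every connection --
    // builds its own executor through the ThreadPool SPI
    executor = (ExecutorService) ExtensionLoader.getExtensionLoader(ThreadPool.class)
            .getAdaptiveExtension().getExecutor(url);
    // ... the pool is then registered in the DataStore (omitted)
}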


What is issue#4467 really getting at?

It is questioning connection-level thread isolation: a client should use a shared thread pool even if it has multiple connections.

Personally, I don’t think thread isolation belongs here either. Thread isolation suits methods that are particularly important, particularly slow, or very different in function. Dubbo’s client, however, treats all connections of a method equally (even with the connections parameter configured), which does not fit the thread-isolation scenario.

Then chickenlj replied to the issue on July 24, 2019:


The existing design is that the provider side shares one thread pool by default, while the consumer side uses one thread pool per connection.

He also said that the consumer-side thread pool was being optimized at the time.

The implication is that he, too, felt the existing consumer-side threading model had room for improvement.

Who is chickenlj?

Liu Jun, GitHub ID chickenlj, is an Apache Dubbo PMC member and a core maintainer of the project, who witnessed the whole journey of Dubbo from its open-source reboot to its graduation from the Apache incubator. He currently works on Alibaba Cloud’s cloud-native application platform team, participating in service framework and microservice work, mainly driving Dubbo’s open-source, cloud-native evolution.


He is the author of the official release article quoted at the beginning, so his words carry a lot of weight.

I heard him share this on Dubbo Developer Day:


If you are interested in his talk, reply 1026 in the backend of my official account to get the speaker’s slides and the recording link.

Version 2.7.5 introduces the ThreadlessExecutor mechanism to optimize and enhance the client threading model.


What is ThreadlessExecutor?


From the comment on the class, we can learn the following:

The most important difference between this Executor and any normal Executor is that this Executor does not manage any threads.

Tasks submitted to this Executor via the execute(Runnable) method are not scheduled to a specific thread, whereas a normal executor always does that scheduling.

These tasks are stored in a blocking queue and are only actually executed when a thread calls the waitAndDrain() method. Simply put, the thread that executes the task is exactly the thread that calls waitAndDrain().

The waitAndDrain() method is as follows:
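A sketch of it, abridged from the 2.7.5-era class (logging and error handling trimmed):

public void waitAndDrain() throws InterruptedException {
    // block until the response task arrives, then run it on THIS thread
    Runnable runnable = queue.take();
    synchronized (lock) {
        waiting = false;
        runnable.run();
    }
    // drain anything else queued in the meantime
    runnable = queue.poll();
    while (runnable != null) {
        runnable.run();
        runnable = queue.poll();
    }
}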


The execute(Runnable) method is as follows:
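Again as an abridged sketch:

@Override
public void execute(Runnable runnable) {
    synchronized (lock) {
        if (!waiting) {
            // the caller has stopped waiting: fall back to the shared pool
            sharedExecutor.execute(runnable);
        } else {
            // hand the task to the waiting business thread via the queue
            queue.add(runnable);
        }
    }
}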


You can also see that it holds an executor named sharedExecutor; as the name suggests, this is where thread-pool sharing comes in.

Reproducing the scenario

So much for the problems of the pre-2.7.5 threading model. How can we reproduce them?

My environment is limited, so reproducing the scenario is a bit tricky, but I found a good case in issue#890 that I can borrow:


Based on his description, I drew the following mind map:


That is the scenario where corethreads is greater than 0. But under the old threading model, even with corethreads at 0, a huge number of consumer-side threads can appear when the providers the consumer depends on are slow to respond and request concurrency is high. You can check this against the map above.

Comparison of old and new threading models

As the introduction said, this update mainly enhances the client threading model, so we will focus on the consumer-side thread pool model before and after version 2.7.5.

The old threading model

The old thread pool model looks like this, note the line colors:


1. The business thread makes a request and gets a Future instance.

2. The business thread calls future.get() and blocks, waiting for the result to return.

3. When the response data arrives, it is handed to the independent consumer-side thread pool for deserialization and other processing, and future.set() is called to fill in the deserialized result.

4. The business thread wakes up and returns the result.

New threading model

The new thread pool model looks like this, note the line color:


1. The business thread makes a request and gets a Future instance.

2. Before calling future.get(), the business thread calls ThreadlessExecutor.wait(), which makes it wait on a blocking queue until an element is added to the queue.

3. When the response data arrives, a Runnable task is generated and placed into the ThreadlessExecutor’s queue.

4. The business thread takes the task from the queue, deserializes the response on its own thread, and sets it on the Future.

5. The business thread returns the result.

As you can see, compared with the old model, the new threading model lets the business thread itself monitor and parse the returned result, eliminating the extra overhead of a separate consumer-side thread pool.

Code comparison

Let’s compare the 2.7.4.1 and 2.7.5 code to illustrate the above changes.

Note that the code changes involved are extensive, so I can only serve as a guide here; to understand the changes in detail you will need to read the source code yourself.

The first step in both versions is the same: the business thread makes a request and gets an instance of the Future.

However, the implementation code is different. In version 2.7.4.1, the following code is shown:


The request method circled above ends up here, and you can see that it does return a Future instance:


The newFuture method is as follows:


At the same time, it can be seen from the source code that after obtaining the Future instance, the subscribeTo method is called as follows:


Java 8’s CompletableFuture is used here to implement asynchronous programming.

In version 2.7.5, however, the code looks like this:


The request method now takes an executor parameter, and its implementation class is ThreadlessExecutor.

Next, as in previous versions, we get a DefaultFuture object using the newFuture method:


Compare this with the newFuture method in version 2.7.4.1 and you will see that this is where they differ greatly: both return a Future, but what the Future carries inside is different.
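To make the difference concrete, here are the two versions side by side, sketched from my reading of DefaultFuture in each release (details may vary slightly):

// 2.7.4.1:
public static DefaultFuture newFuture(Channel channel, Request request, int timeout) {
    final DefaultFuture future = new DefaultFuture(channel, request, timeout);
    timeoutCheck(future);
    return future;
}

// 2.7.5: an executor now travels with the future
public static DefaultFuture newFuture(Channel channel, Request request,
                                      int timeout, ExecutorService executor) {
    final DefaultFuture future = new DefaultFuture(channel, request, timeout);
    future.setExecutor(executor);
    // a ThreadlessExecutor must know which future it is waiting for
    if (executor instanceof ThreadlessExecutor) {
        ((ThreadlessExecutor) executor).setWaitingFuture(future);
    }
    timeoutCheck(future);
    return future;
}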

Putting the two versions side by side makes it clear at a glance:


Step 2: The business thread then calls future.get() and blocks, waiting for the result to return.

Dubbo makes synchronous calls by default; I explained the difference between synchronous and asynchronous calls in my earlier article on Dubbo 2.7’s asynchrony feature:


To find the place where asynchronous turns into synchronous, let’s first look at 2.7.4.1:


The source of asyncResult.get() boils down to CompletableFuture.get():


In version 2.7.5, the corresponding place has changed:


The change is in the AsyncRpcResult#get method.

In version 2.7.5, the source code for this method is:
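An abridged sketch of AsyncRpcResult#get in 2.7.5, with the new logic marked ①:

@Override
public Result get() throws InterruptedException, ExecutionException {
    // ① if the bound executor is a ThreadlessExecutor, park the business thread
    //    here; it will execute the response callback itself once it is queued
    if (executor != null && executor instanceof ThreadlessExecutor) {
        ThreadlessExecutor threadlessExecutor = (ThreadlessExecutor) executor;
        threadlessExecutor.waitAndDrain();
    }
    // same as 2.7.4.1: block on the underlying CompletableFuture
    return responseFuture.get();
}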


The CompletableFuture.get() call is the same as in version 2.7.4.1, but there is additional logic, labeled ①. That code is exactly what the new threading model diagram showed earlier, outlined in red:


Before calling future.get(), the business thread calls ThreadlessExecutor.wait() (the logic at ①), which makes it wait on a blocking queue until an element is added to the queue.

Then compare the two places:

First: the WrappedChannelHandler mentioned earlier. You can see that its constructor changed a lot in 2.7.5:
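In 2.7.5 the constructor shrinks to roughly this (sketch):

public WrappedChannelHandler(ChannelHandler handler, URL url) {
    this.handler = handler;
    this.url = url;
    // no per-connection pool any more; executors are looked up on demand
    // via getExecutorService() / getPreferredExecutorService()
}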


Second: the Dispatcher mentioned earlier really deserves a separate article to explain clearly; here I will just point it out:


AllChannelHandler is the default policy, as the following code proves:


Let’s look at the place labeled ② first. It looks like a big change, but it is really just decoupling and encapsulating the code. The sendFeedback method is as follows, and it is the same as the code labeled ② in version 2.7.4.1:


So let’s focus on the places marked ①, where the way the executor is obtained has changed:

In 2.7.4.1 the method is getExecutorService(); in 2.7.5 it is getPreferredExecutorService().

The code is as follows:
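Here is a hedged sketch of getPreferredExecutorService, abridged from the 2.7.5-era WrappedChannelHandler:

public ExecutorService getPreferredExecutorService(Object msg) {
    if (msg instanceof Response) {
        Response response = (Response) msg;
        DefaultFuture responseFuture = DefaultFuture.getFuture(response.getId());
        // a typical corner case: the response arrives after the timeout side
        // has already removed the future, so use the shared executor
        if (responseFuture == null) {
            return getSharedExecutorService();
        }
        // prefer the executor bound to the future -- for a synchronous call
        // this is the ThreadlessExecutor the business thread is waiting on
        ExecutorService executor = responseFuture.getExecutor();
        if (executor == null || executor.isShutdown()) {
            executor = getSharedExecutorService();
        }
        return executor;
    }
    return getSharedExecutorService();
}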


The main comment on the getPreferredExecutorService method reads:

Currently, this method is mainly customized to facilitate the thread model on consumer side.
1. Use ThreadlessExecutor, aka., delegate callback directly to the thread initiating the call.   
2. Use shared executor to execute the callback.

In plain terms: this method exists for the consumer-side threading model. If the future carries a ThreadlessExecutor, the callback is delegated directly to the thread that initiated the call; otherwise the shared executor runs the callback.

Whisper: I really didn’t know how to translate “aka” in that comment. The hip-hop “AKA”? As in “Hi, I’m GEM, aka your uncle”? It just means “also known as”.


Well, that’s all for the walkthrough. I suspect not many readers have made it this far. Again, the code changes involved are extensive and I can only serve as a guide; to understand them in detail you will need to read the source code yourself. I hope you will build your own demo, run it, and compare the differences between the two versions.

Dubbo version introduction

With this update covered, let’s take a look at Dubbo’s current major versions.

According to Liu Jun, Dubbo community mainly maintains 2.6.x and 2.7.x versions at present, among which:

2.6.x mainly receives bugfixes and a small number of enhancements, and thus offers fully guaranteed stability.

2.7.x, as the community’s main development version, keeps receiving a large number of new features and optimizations, which also brings some stability challenges.

To make upgrading easier for users, the table below summarizes each Dubbo version, including key features, stability, and compatibility, evaluating every version from multiple angles to help users complete their upgrade assessment:



As you can see, the community’s verdict on the latest version, 2.7.5, is that large-scale production use is not recommended.

Meanwhile, looking at Dubbo’s recent issues, there is plenty of ridicule aimed at version 2.7.5.

But I still think 2.7.5 is a big step in Dubbo’s evolution: the first shot in Dubbo’s overall embrace of cloud-native microservices, an exploration of multi-language support, and the implementation of HTTP/2 protocol support along with Protobuf integration.

Open-source projects are maintained by everyone. We know Dubbo is not a perfect framework, but we also know that the engineers behind it know it is not perfect and still haven’t given up; they are working hard to make it perfect. As users, we can tease a little less and encourage a little more. Only then can we proudly say we have contributed a little something to the open-source world, and believe its future will be better.

Hats off to open source, hats off to open source engineers.

Anyway, awesome.

One last word

If you find anything wrong, please leave a comment pointing it out, and I will correct it.

Thank you for reading. I insist on writing original content, and I warmly welcome and appreciate your attention.

That’s all.

Welcome to follow the official account [WHY Technology]. I share technical topics here, focusing on Java and taking responsibility for every line of code. Occasionally I also talk about life, or write book and film reviews. May you and I make progress together.

Public account – WHY Technology