
Official resources

nacos.io/zh-cn/docs/…

Think with questions

  • How does client long polling affect response time?
  • Why does the client get an immediate response after configuration information is changed?
  • Why is the client timeout set to 30s?

With these questions in mind, let’s look at the server-side code.

Configuration center of Nacos

  • Dynamic configuration management is one of the three major features of Nacos. Through the dynamic configuration service, you can manage the configuration of all applications and services in all environments centrally and dynamically.
  • With a dynamic configuration center, updated configuration takes effect without redeploying applications or services, which greatly improves the operations and maintenance (O&M) capability of the system.

Dynamic configuration

Let’s look at how Nacos manages configuration and applies configuration changes dynamically in a simple, elegant, and efficient way, starting with how its dynamic configuration feature works.

Dynamic configuration mechanisms for clients

The Nacos client maintains a long polling task to check whether the configuration information of the server has changed. If it has, the client will get the groupKey of the changed configuration item and obtain the latest value based on the groupKey.

If the polling interval is too long, the client may not pick up server-side changes in time. If it is too short, the frequent requests put an unnecessary burden on the server.

A better approach is for the client to send requests at a suitable interval while the server proactively pushes the change result to the client whenever the configuration changes during that interval. This way the client senses configuration changes in real time and the pressure on the server is reduced.

Client long polling

On the client side, long polling is driven by the checkUpdateDataIds method in LongPollingRunnable, which checks whether the server configuration has changed. It ends up calling the following method:

HTTP request operations

The client fetches the result from the server with an HTTP POST request and sets a timeout of 30s. When nothing changes, the client waits a little over 29.5s for the server to respond. After receiving the result, the client performs its follow-up operations and, once they are all complete, schedules itself again in the finally block.
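The loop described above can be sketched roughly as follows. This is not the Nacos client source: the `ConfigServer` interface, the `rounds` limit, and the class name are hypothetical stand-ins so that the example runs without a network, and only the 30s timeout and the "call itself again in finally" structure come from the text.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class LongPollingClientSketch {
    /** Hypothetical stand-in for the HTTP POST that asks which groupKeys changed. */
    interface ConfigServer {
        List<String> checkUpdate(List<String> groupKeys, long timeoutMs) throws Exception;
    }

    static final long CLIENT_TIMEOUT_MS = 30_000L; // client timeout from the article

    final List<String> refreshed = new CopyOnWriteArrayList<>();
    private final ConfigServer server;
    private final List<String> watchedKeys;
    private final AtomicInteger remainingRounds; // assumption: bounded so the demo terminates
    private final ExecutorService executor = Executors.newSingleThreadExecutor();
    private final CountDownLatch finished = new CountDownLatch(1);

    LongPollingClientSketch(ConfigServer server, List<String> watchedKeys, int rounds) {
        this.server = server;
        this.watchedKeys = watchedKeys;
        this.remainingRounds = new AtomicInteger(rounds);
    }

    void start() {
        executor.execute(this::pollOnce);
    }

    private void pollOnce() {
        try {
            // Blocks until the server answers: immediately on change, otherwise near the timeout.
            List<String> changed = server.checkUpdate(watchedKeys, CLIENT_TIMEOUT_MS);
            // A real client would now fetch the latest value for each changed groupKey.
            refreshed.addAll(changed);
        } catch (Exception e) {
            // A real client would log the failure and back off before retrying.
        } finally {
            // Re-submit the same task so the polling continues.
            if (remainingRounds.decrementAndGet() > 0) {
                executor.execute(this::pollOnce);
            } else {
                executor.shutdown();
                finished.countDown();
            }
        }
    }

    /** Waits for the bounded demo loop to end; returns false on timeout or interruption. */
    boolean awaitFinished(long timeoutMs) {
        try {
            return finished.await(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Re-submitting from finally means a failed request does not stop the loop, which matches the behavior the paragraph describes.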

Long polling execution logic

The client makes a request to the server and it takes at least 29.5 seconds to get the result, assuming, of course, that the configuration has not changed. How long does it take for the request to return if the client’s configuration changes during long polling?

Case 1: no modified data was obtained when the return is triggered

Case 2: the return is triggered as soon as modified data is obtained

The server-side controller

The long polling endpoint is /v1/cs/configs/listener, handled in the ConfigController class (com.alibaba.nacos.config.server.controller.ConfigController.java), as shown in the figure below:

The server exposes an HTTP service through Spring MVC. After converting the parameters in the HttpServletRequest, it hands the request to an object called inner for execution. The inner object is an instance of the ConfigServletInner class, com.alibaba.nacos.config.server.controller.ConfigServletInner.java

This method is a polling interface that supports short polling as well as long polling. Next, step into the addLongPollingClient method of longPollingService, as shown below:

com.alibaba.nacos.config.server.service.LongPollingService.java

In essence, this method registers the client’s long polling request: the server wraps the request in a task called ClientLongPolling and hands that task to the scheduler for execution.

When the server reads the timeout submitted by the client, it subtracts 500ms. The server therefore works with a timeout 500ms shorter than the client’s, that is, 29.5s. This is the first clue to the questions raised at the start.

PS: The timeout is not always 29.5s. When isFixedPolling() is true, the timeout is a fixed interval instead.
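The arithmetic can be written out as a small sketch. The method name, its parameters, and the 10s fixed interval in the demo are assumptions made for illustration; only the 500ms subtraction and the isFixedPolling() branch come from the text, and the actual Nacos method may apply further bounds not shown here.

```java
class ServerTimeoutSketch {
    static final long DELAY_MS = 500L; // margin the server subtracts, per the article

    /**
     * Returns the time the server holds a long polling request.
     * fixedIntervalMs is a hypothetical configuration value used only when fixedPolling is true.
     */
    static long serverTimeoutMs(long clientTimeoutMs, boolean fixedPolling, long fixedIntervalMs) {
        if (fixedPolling) {
            return fixedIntervalMs;
        }
        return clientTimeoutMs - DELAY_MS;
    }

    public static void main(String[] args) {
        System.out.println(serverTimeoutMs(30_000L, false, 10_000L)); // prints 29500
        System.out.println(serverTimeoutMs(30_000L, true, 10_000L));  // prints 10000
    }
}
```

The 500ms margin leaves the server time to write its response before the client’s own 30s timeout expires.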

Let’s take a look at what the server wrapped ClientLongPolling task does, as shown below:

com.alibaba.nacos.config.server.service.LongPollingService.ClientLongPolling.java

After ClientLongPolling is submitted to scheduler for execution, the actual execution can be divided into the following four steps:

  1. Create a scheduled task with a delay of 29.5s.
  2. Add the ClientLongPolling instance itself to allSubs.
  3. When the delay expires, first remove the ClientLongPolling instance from allSubs.
  4. Check whether the groupKeys requested by the client have changed on the server, write the result to the response, and return it to the client.

allSubs is a ConcurrentLinkedQueue to which each ClientLongPolling instance adds itself.
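The four steps can be sketched with standard java.util.concurrent types. This is an illustration under assumptions, not the Nacos source: `HeldRequest` stands in for ClientLongPolling, a CompletableFuture stands in for the HTTP response, and the step 4 check is passed in as a hypothetical `Supplier`.

```java
import java.util.List;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

class HeldRequestSketch {
    // Queue of requests currently being held, as described in the article.
    static final Queue<HeldRequest> allSubs = new ConcurrentLinkedQueue<>();

    // Daemon thread so the demo JVM can exit without an explicit shutdown.
    static final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "long-polling-sketch");
            t.setDaemon(true);
            return t;
        });

    /** Hypothetical stand-in for ClientLongPolling. */
    static class HeldRequest implements Runnable {
        final long timeoutMs;
        final Supplier<List<String>> changedKeysCheck; // assumption: the step 4 check
        final CompletableFuture<List<String>> response = new CompletableFuture<>();

        HeldRequest(long timeoutMs, Supplier<List<String>> changedKeysCheck) {
            this.timeoutMs = timeoutMs;
            this.changedKeysCheck = changedKeysCheck;
        }

        @Override
        public void run() {
            // Step 1: schedule the task that fires when the hold time is over.
            scheduler.schedule(() -> {
                // Step 3: stop holding this request.
                allSubs.remove(this);
                // Step 4: check for changes and write the result to the response.
                response.complete(changedKeysCheck.get());
            }, timeoutMs, TimeUnit.MILLISECONDS);
            // Step 2: register this request so a data change can find it.
            allSubs.add(this);
        }
    }

    public static void main(String[] args) {
        HeldRequest request = new HeldRequest(200L, List::of);
        request.run();
        System.out.println("held: " + allSubs.contains(request));
        System.out.println("changed keys after timeout: " + request.response.join());
    }
}
```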

Scheduling tasks

The server checks each groupKey submitted by the client. If the MD5 value the client sent for a groupKey does not match the latest value on the server, that configuration item has changed, so the server adds the groupKey to a changedGroupKeys list. Finally, changedGroupKeys is returned to the client, which only needs to read the keys from it.
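A minimal sketch of that comparison follows. The two maps, the method names, and the sample groupKey strings are hypothetical; the real server reads MD5 values from its own cache rather than hashing content on each request.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class ChangedKeysSketch {
    /** Hex-encoded MD5 of the given content. */
    static String md5Hex(String content) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(content.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder();
            for (byte b : digest) {
                sb.append(String.format("%02x", b & 0xff));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is unavailable", e);
        }
    }

    /**
     * clientMd5: groupKey -> MD5 the client last saw (hypothetical request shape).
     * serverContent: groupKey -> current content on the server (hypothetical store).
     */
    static List<String> changedGroupKeys(Map<String, String> clientMd5,
                                         Map<String, String> serverContent) {
        List<String> changed = new ArrayList<>();
        for (Map.Entry<String, String> entry : clientMd5.entrySet()) {
            String content = serverContent.get(entry.getKey());
            String latestMd5 = content == null ? "" : md5Hex(content);
            // A mismatch means the client's copy is stale.
            if (!latestMd5.equals(entry.getValue())) {
                changed.add(entry.getKey());
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        Map<String, String> client = Map.of("app.yaml+DEFAULT_GROUP", md5Hex("port=8080"));
        Map<String, String> server = Map.of("app.yaml+DEFAULT_GROUP", "port=9090");
        System.out.println(changedGroupKeys(client, server)); // prints [app.yaml+DEFAULT_GROUP]
    }
}
```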

The server data changes

ClientLongPolling does nothing else on the server until the scheduling delay expires, so the allSubs queue must serve some purpose in the meantime.

During client long polling, when the configuration is changed, the client gets an immediate response.

Server data change interface

Configuration is published with a POST request to /v1/cs/configs. The publishConfig method in ConfigController is shown below:

After modifying the configuration, the server first updates the value of the configuration to the persistence layer, and then fires a ConfigDataChangeEvent event with the fireEvent method:

com.alibaba.nacos.config.server.utils.event.EventDispatcher.java

The fireEvent method triggers the onEvent method of an AbstractEventListener. The listeners are stored in an object called listeners.

AbstractEventListener objects are added to listeners through the addEventListener method. To learn which AbstractEventListener receives the onEvent callback, find where addEventListener is called.

Next, look at the AbstractEventListener class and the classes that extend it:

com.alibaba.nacos.config.server.utils.event.EventDispatcher.AbstractEventListener.java

LongPollingService appears among the subclasses of AbstractEventListener. When we update a configuration item from the Dashboard, it is the onEvent method of LongPollingService that actually gets called.
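The dispatch chain, from fireEvent to the registered listener’s onEvent, can be sketched like this. The class bodies are simplified assumptions that borrow the names from the text; the real EventDispatcher may also filter listeners by event type and register them differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class EventDispatchSketch {
    /** Simplified event carrying only the changed groupKey (assumption). */
    static class ConfigDataChangeEvent {
        final String groupKey;

        ConfigDataChangeEvent(String groupKey) {
            this.groupKey = groupKey;
        }
    }

    static abstract class AbstractEventListener {
        abstract void onEvent(ConfigDataChangeEvent event);
    }

    // Registry of listeners; thread-safe for concurrent registration and dispatch.
    static final List<AbstractEventListener> listeners = new CopyOnWriteArrayList<>();

    static void addEventListener(AbstractEventListener listener) {
        listeners.add(listener);
    }

    /** Calls onEvent on every registered listener. */
    static void fireEvent(ConfigDataChangeEvent event) {
        for (AbstractEventListener listener : listeners) {
            listener.onEvent(event);
        }
    }

    public static void main(String[] args) {
        List<String> notified = new ArrayList<>();
        // Stand-in for LongPollingService, which reacts to configuration changes.
        addEventListener(new AbstractEventListener() {
            @Override
            void onEvent(ConfigDataChangeEvent event) {
                notified.add(event.groupKey);
            }
        });
        fireEvent(new ConfigDataChangeEvent("app.yaml+DEFAULT_GROUP"));
        System.out.println(notified); // prints [app.yaml+DEFAULT_GROUP]
    }
}
```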

Back in LongPollingService, take a look at the onEvent method as shown below:

com.alibaba.nacos.config.server.service.LongPollingService.DataChangeTask.java

When the onEvent method of LongPollingService is triggered, it executes a task called DataChangeTask, whose job is to inform clients that the server data has changed. Let’s step into DataChangeTask and look at the code, as shown below:

Traverse the allSubs queue

The allSubs queue holds the request tasks of all clients, so DataChangeTask must find the ClientLongPolling tasks whose groupKey matches the configuration item that just changed.

Write response data to the client

Once the matching ClientLongPolling task is found, the server only needs to write the changed groupKey to the response object through that task, and one data change “push” is complete.

If the DataChangeTask task completes the “push” of data, it needs to cancel the scheduled task waiting to be executed. In this way, it prevents the scheduled task from writing the response data after the push operation finishes writing the response data.
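The push and the cancellation can be sketched together. As before, this is an assumption-laden illustration rather than the Nacos code: `HeldRequest` stands in for ClientLongPolling, `onDataChange` for DataChangeTask, and a CompletableFuture for the HTTP response.

```java
import java.util.Iterator;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

class DataChangePushSketch {
    static final Queue<HeldRequest> allSubs = new ConcurrentLinkedQueue<>();

    // Daemon thread so the demo JVM can exit without an explicit shutdown.
    static final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "data-change-sketch");
            t.setDaemon(true);
            return t;
        });

    /** Hypothetical stand-in for ClientLongPolling. */
    static class HeldRequest {
        final String groupKey;
        final CompletableFuture<String> response = new CompletableFuture<>();
        volatile ScheduledFuture<?> timeoutTask;

        HeldRequest(String groupKey) {
            this.groupKey = groupKey;
        }

        /** Holds the request: schedules the timeout task, then joins allSubs. */
        void hold(long timeoutMs) {
            timeoutTask = scheduler.schedule(() -> {
                allSubs.remove(this);
                sendResponse(""); // nothing changed during the hold
            }, timeoutMs, TimeUnit.MILLISECONDS);
            allSubs.add(this);
        }

        /** Cancels the pending timeout task, then writes the response. */
        void sendResponse(String changedGroupKey) {
            if (timeoutTask != null) {
                timeoutTask.cancel(false);
            }
            // complete() only takes effect once, so a late second write is ignored.
            response.complete(changedGroupKey);
        }
    }

    /** Stand-in for DataChangeTask: answers every held request for the changed key. */
    static void onDataChange(String changedGroupKey) {
        for (Iterator<HeldRequest> it = allSubs.iterator(); it.hasNext();) {
            HeldRequest held = it.next();
            if (held.groupKey.equals(changedGroupKey)) {
                it.remove();
                held.sendResponse(changedGroupKey);
            }
        }
    }

    public static void main(String[] args) {
        HeldRequest request = new HeldRequest("app.yaml+DEFAULT_GROUP");
        request.hold(29_500L);
        onDataChange("app.yaml+DEFAULT_GROUP");
        System.out.println("pushed: " + request.response.join());
        System.out.println("timeout cancelled: " + request.timeoutTask.isCancelled());
    }
}
```

Cancelling inside sendResponse keeps the rule in one place: whichever path answers first, the other can no longer write to the same response.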

As you can see from the sendResponse method, this is exactly what happens:

HTTP requests are stateless by nature, so there is no need to set a very long timeout, and doing so would only waste resources.

At the same time, the server wraps the request in a scheduled task. While it waits, the task depends on DataChangeTask to trigger the response actively. If DataChangeTask has not fired by the time the delay expires, the scheduled task checks for data changes itself and writes the result of that check to the response object, as shown below:

Conclusion:

  1. The Nacos client requests changed data from the server in a loop, with the timeout set to 30s. When the configuration changes, the response returns immediately; otherwise, the server holds the request for about 29.5s before responding.
  2. The Nacos client can sense changes in the server configuration in real time.
  3. Real-time awareness combines client pull with server “push”. The push needs quotation marks because the client and server still communicate over plain HTTP; it only feels like a push because the server writes the changed data to the held HTTP response before the timeout expires.