The question

While contributing to Nacos, I was asked by many people: "I took the service offline in the Nacos console, so why can it still be called? That does not match the second-level offline feature that Nacos officially advertises." On closer inspection, the instances that kept serving traffic after going offline had one thing in common: they all used Ribbon, the load-balancing component. This article therefore looks at the problem from two angles: how Nacos implements second-level online/offline switching, and how Ribbon's instance update mechanism delays awareness of those changes.

Nacos second-level online/offline

@CanDistro
@RequestMapping(value = "", method = RequestMethod.PUT)
public String update(HttpServletRequest request) throws Exception {
	String serviceName = WebUtils.required(request, CommonParams.SERVICE_NAME);
	String namespaceId = WebUtils.optional(request, CommonParams.NAMESPACE_ID, Constants.DEFAULT_NAMESPACE_ID);

	String agent = request.getHeader("Client-Version");
	if (StringUtils.isBlank(agent)) {
		agent = request.getHeader("User-Agent");
	}

	ClientInfo clientInfo = new ClientInfo(agent);

	if (clientInfo.type == ClientInfo.ClientType.JAVA &&
		clientInfo.version.compareTo(VersionUtil.parseVersion("1.0.0")) >= 0) {
		serviceManager.updateInstance(namespaceId, serviceName, parseInstance(request));
	} else {
		serviceManager.registerInstance(namespaceId, serviceName, parseInstance(request));
	}
	return "ok";
}

The parseInstance(request) method extracts the instance information from the request. The underlying updateInstance method is as follows:

public void updateInstance(String namespaceId, String serviceName, Instance instance) throws NacosException {

	Service service = getService(namespaceId, serviceName);

	if (service == null) {
		throw new NacosException(NacosException.INVALID_PARAM, "service not found, namespace: " + namespaceId + ", service: " + serviceName);
	}

	if (!service.allIPs().contains(instance)) {
		throw new NacosException(NacosException.INVALID_PARAM, "instance not exist: " + instance);
	}

	addInstance(namespaceId, serviceName, instance.isEphemeral(), instance);
}

public void addInstance(String namespaceId, String serviceName, boolean ephemeral, Instance... ips) throws NacosException {

	String key = KeyBuilder.buildInstanceListKey(namespaceId, serviceName, ephemeral);

	Service service = getService(namespaceId, serviceName);

	List<Instance> instanceList = addIpAddresses(service, ephemeral, ips);

	Instances instances = new Instances();
	instances.setInstanceList(instanceList);

	consistencyService.put(key, instances);
}

From here on, the flow is the same as registering a service instance on the Nacos server side, which an earlier post covered. So when an instance is taken offline in the Nacos console, the instance data held by the Nacos naming server is updated immediately.
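
For illustration, taking an instance offline can be thought of as a PUT request to the instance endpoint handled by the update method above, with the instance marked as disabled. The sketch below only builds such a URL and does not send it. The /nacos/v1/ns/instance path and the enabled parameter are assumptions based on the Nacos 1.x Open API, so check them against your server version; the server address, service name, and instance address are made up.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class OfflineRequestSketch {

    /**
     * Builds the URL of a PUT request that marks one instance as disabled.
     * Path and parameter names are assumptions (Nacos 1.x Open API style).
     */
    static String buildOfflineUrl(String serverAddr, String serviceName, String ip, int port) {
        return "http://" + serverAddr + "/nacos/v1/ns/instance"
                + "?serviceName=" + URLEncoder.encode(serviceName, StandardCharsets.UTF_8)
                + "&ip=" + URLEncoder.encode(ip, StandardCharsets.UTF_8)
                + "&port=" + port
                + "&enabled=false";
    }

    public static void main(String[] args) {
        // Hypothetical server address, service name, and instance address.
        System.out.println(buildOfflineUrl("127.0.0.1:8848", "demo-service", "10.0.0.1", 8080));
    }
}
```

Because the request goes through updateInstance and then consistencyService.put, the server-side view changes as soon as the request is processed; nothing on the server waits for a timer.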

Ribbon instance update mechanism

Let's start with the code that pulls the Ribbon instance list from Nacos:

public class NacosServerList extends AbstractServerList<NacosServer> {

	private NacosDiscoveryProperties discoveryProperties;

	private String serviceId;

	public NacosServerList(NacosDiscoveryProperties discoveryProperties) {
		this.discoveryProperties = discoveryProperties;
	}

	@Override
	public List<NacosServer> getInitialListOfServers() {
		return getServers();
	}

	@Override
	public List<NacosServer> getUpdatedListOfServers() {
		return getServers();
	}

	private List<NacosServer> getServers() {
		try {
			List<Instance> instances = discoveryProperties.namingServiceInstance()
					.selectInstances(serviceId, true);
			return instancesToServerList(instances);
		}
		catch (Exception e) {
			throw new IllegalStateException(
					"Can not get service instances from nacos, serviceId=" + serviceId, e);
		}
	}

	private List<NacosServer> instancesToServerList(List<Instance> instances) {
		List<NacosServer> result = new ArrayList<>();
		if (null == instances) {
			return result;
		}
		for (Instance instance : instances) {
			result.add(new NacosServer(instance));
		}

		return result;
	}

	public String getServiceId() {
		return serviceId;
	}

	@Override
	public void initWithNiwsConfig(IClientConfig iClientConfig) {
		this.serviceId = iClientConfig.getClientName();
	}
}

NacosServerList extends AbstractServerList, so where does this AbstractServerList end up being used? Tracing the code shows that it is ultimately picked up by the DynamicServerListLoadBalancer class:

protected final ServerListUpdater.UpdateAction updateAction = new ServerListUpdater.UpdateAction() {
        @Override
        public void doUpdate() {
            updateListOfServers();
        }
};

public DynamicServerListLoadBalancer(IClientConfig clientConfig) {
	initWithNiwsConfig(clientConfig);
}
    
@Override
public void initWithNiwsConfig(IClientConfig clientConfig) {
	try {
		super.initWithNiwsConfig(clientConfig);
		String niwsServerListClassName = clientConfig.getPropertyAsString( CommonClientConfigKey.NIWSServerListClassName, DefaultClientConfigImpl.DEFAULT_SEVER_LIST_CLASS);
		ServerList<T> niwsServerListImpl = (ServerList<T>) ClientFactory
                    .instantiateInstanceWithClientConfig(niwsServerListClassName, clientConfig);
    // Get the implementation classes for all the ServerList interfaces
		this.serverListImpl = niwsServerListImpl;

    // Get Filter(Filter the pulled Servers list)
		if (niwsServerListImpl instanceof AbstractServerList) {
			AbstractServerListFilter<T> niwsFilter = ((AbstractServerList) niwsServerListImpl)
                        .getFilterImpl(clientConfig);
			niwsFilter.setLoadBalancerStats(getLoadBalancerStats());
			this.filter = niwsFilter;
		}

    // Get The ServerListUpdater object implementation class name
		String serverListUpdaterClassName = clientConfig.getPropertyAsString( CommonClientConfigKey.ServerListUpdaterClassName, DefaultClientConfigImpl.DEFAULT_SERVER_LIST_UPDATER_CLASS);

		// Instantiate the ServerListUpdater; the default implementation is PollingServerListUpdater
		this.serverListUpdater = (ServerListUpdater) ClientFactory.instantiateInstanceWithClientConfig(serverListUpdaterClassName, clientConfig);

    // Initialize or reset
		restOfInit(clientConfig);
	} catch (Exception e) {
		throw new RuntimeException(
                    "Exception while initializing NIWSDiscoveryLoadBalancer:"
                            + clientConfig.getClientName()
                            + ", niwsClientConfig:" + clientConfig, e);
	}
}

void restOfInit(IClientConfig clientConfig) {
	boolean primeConnection = this.isEnablePrimingConnections();
	// turn this off to avoid duplicated asynchronous priming done in BaseLoadBalancer.setServerList()
	this.setEnablePrimingConnections(false);
  // Enable the scheduled task. This task is to periodically refresh the instance information cache
	enableAndInitLearnNewServersFeature();

	// Pull the instance list once right away, without waiting for the first scheduled run
	updateListOfServers();
	if (primeConnection && this.getPrimeConnections() != null) {
		this.getPrimeConnections().primeConnections(getReachableServers());
	}
	this.setEnablePrimingConnections(primeConnection);
	LOGGER.info("DynamicServerListLoadBalancer for client {} initialized: {}", clientConfig.getClientName(), this.toString());
}

// Update the instance information cache
@VisibleForTesting
public void updateListOfServers() {
	List<T> servers = new ArrayList<T>();
	if (serverListImpl != null) {
    // Call the method that pulls the new instance information
		servers = serverListImpl.getUpdatedListOfServers();
		LOGGER.debug("List of Servers for {} obtained from Discovery client: {}", getIdentifier(), servers);

		// Apply the filter to the pulled server list
		if (filter != null) {
			servers = filter.getFilteredListOfServers(servers);
			LOGGER.debug("Filtered List of Servers for {} obtained from Discovery client: {}", getIdentifier(), servers);
		}
	}
	// Update the instance list
	updateAllServerList(servers);
}

Now let's look at what enableAndInitLearnNewServersFeature() ultimately calls:

@Override
public synchronized void start(final UpdateAction updateAction) {
	if (isActive.compareAndSet(false, true)) {
		final Runnable wrapperRunnable = new Runnable() {
			@Override
			public void run() {
				if (!isActive.get()) {
					if (scheduledFuture != null) {
						scheduledFuture.cancel(true);
					}
					return;
				}
				try {
					// The UpdateAction here wraps the updateListOfServers method of DynamicServerListLoadBalancer
					updateAction.doUpdate();
					lastUpdated = System.currentTimeMillis();
				} catch (Exception e) {
					logger.warn("Failed one update cycle", e);
				}
			}
		};

		// The default task execution interval is 30s
		scheduledFuture = getRefreshExecutor().scheduleWithFixedDelay(
                    wrapperRunnable,
                    initialDelayMs,
                    refreshIntervalMs,
                    TimeUnit.MILLISECONDS);
	} else {
		logger.info("Already active, no-op");
	}
}

So the picture is clear: Nacos does switch instances online and offline within seconds, but in Spring Cloud the load-balancing component Ribbon refreshes its instance list through a scheduled task. If that task happened to run one second before you took the instance offline, Ribbon keeps using its stale list and only senses the change at the next run, after the rest of refreshIntervalMs (30 seconds by default) has elapsed.
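
The effect can be reproduced with a small model that needs neither Nacos nor Ribbon. In the sketch below, Registry and PollingClient are hypothetical stand-ins for the naming server and for the client-side cache refreshed by the scheduled task; poll() is called by hand instead of by a timer so that the behavior is deterministic. It is a simplified illustration of the stale window, not the actual Ribbon implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class StaleCacheSketch {

    /** Hypothetical stand-in for the naming server: changes take effect at once. */
    static class Registry {
        private final List<String> enabled = new CopyOnWriteArrayList<>();

        void online(String addr) { enabled.add(addr); }

        void offline(String addr) { enabled.remove(addr); }

        List<String> selectEnabled() { return new ArrayList<>(enabled); }
    }

    /** Hypothetical stand-in for the client side: a local copy that changes only on poll(). */
    static class PollingClient {
        private final Registry registry;
        private List<String> cached = new ArrayList<>();

        PollingClient(Registry registry) { this.registry = registry; }

        void poll() { cached = registry.selectEnabled(); }

        List<String> servers() { return cached; }
    }

    /** Time left until the next poll, given the time elapsed since the last one. */
    static long staleWindowMs(long refreshIntervalMs, long elapsedSinceLastPollMs) {
        return Math.max(0, refreshIntervalMs - elapsedSinceLastPollMs);
    }

    public static void main(String[] args) {
        Registry registry = new Registry();
        registry.online("10.0.0.1:8080");          // made-up instance address

        PollingClient client = new PollingClient(registry);
        client.poll();                              // scheduled refresh runs
        registry.offline("10.0.0.1:8080");          // instance taken offline shortly after

        System.out.println("registry view: " + registry.selectEnabled());
        System.out.println("client view before next poll: " + client.servers());
        System.out.println("stale for up to " + staleWindowMs(30_000, 1_000) + " ms");

        client.poll();                              // next scheduled refresh
        System.out.println("client view after next poll: " + client.servers());
    }
}
```

Between the two poll() calls the registry is already empty while the client still routes to the offline instance, which is exactly the gap users observe.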