Learning doesn't have to be so utilitarian. Let me walk you through the source code from a higher vantage point ~

In this article, we will analyze the local caching and failover capabilities of Nacos through source code, involving the core classes ServiceInfoHolder and FailoverReactor.

ServiceInfoHolder function overview

The ServiceInfoHolder class, as the name suggests, holds service information. It has appeared several times in previous articles: for example, its processServiceInfo method is invoked each time the client retrieves new service information from the registry, and handles it locally by updating the cached service, publishing events, updating local files, and so on.

In addition to the above functions, the instantiation of this class also includes initialization of the local cache directory and failover initialization. Let’s take a look at each of them.

ServiceInfo’s local memory cache

ServiceInfo represents a registered service, including the service name, group name, cluster information, instance list, and last update time. In other words, the information the client gets from the registry is held locally as ServiceInfo objects.

The ServiceInfoHolder class in turn holds ServiceInfo, which is stored in a ConcurrentMap:

public class ServiceInfoHolder implements Closeable {
    private final ConcurrentMap<String, ServiceInfo> serviceInfoMap;
}

This is the first level of caching of service registration information by the Nacos client. When we analyzed the processServiceInfo method earlier, we saw that the information in serviceInfoMap is updated first when the service information changes.

public ServiceInfo processServiceInfo(ServiceInfo serviceInfo) {
    // ....
    // Cache the service information
    serviceInfoMap.put(serviceInfo.getKey(), serviceInfo);
    // Check whether the registered instance information has changed
    boolean changed = isChangedServiceInfo(oldService, serviceInfo);
    if (StringUtils.isBlank(serviceInfo.getJsonFromServer())) {
        serviceInfo.setJsonFromServer(JacksonUtils.toJson(serviceInfo));
    }
    // ....
}

Using serviceInfoMap is simple: put the latest data into it when instances change, and get the data by key when it is needed.
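The put-on-change, get-by-key pattern can be sketched as follows. This is a minimal illustration, not the Nacos class: InMemoryServiceCache, update, and lookup are hypothetical names, and a plain list of addresses stands in for ServiceInfo.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical stand-in for the in-memory cache; not the Nacos class itself.
class InMemoryServiceCache {
    // key -> instance addresses (the real map stores ServiceInfo objects)
    private final ConcurrentMap<String, List<String>> serviceInfoMap = new ConcurrentHashMap<>(16);

    // Called when fresh data arrives: the newest snapshot replaces the old one.
    // Returns the previous snapshot, or null if there was none.
    List<String> update(String key, List<String> hosts) {
        return serviceInfoMap.put(key, hosts);
    }

    // Called when a caller needs the instances of a service.
    List<String> lookup(String key) {
        return serviceInfoMap.get(key);
    }

    public static void main(String[] args) {
        InMemoryServiceCache cache = new InMemoryServiceCache();
        cache.update("DEFAULT_GROUP@@user-provider", Arrays.asList("1.1.1.1:800"));
        cache.update("DEFAULT_GROUP@@user-provider", Arrays.asList("1.1.1.1:800", "1.1.1.2:800"));
        System.out.println(cache.lookup("DEFAULT_GROUP@@user-provider"));
    }
}
```

Because each update replaces the whole entry, readers always see one complete snapshot per key rather than a partially updated list.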

serviceInfoMap is initialized in the ServiceInfoHolder constructor as an empty ConcurrentMap by default. However, if the client is configured to read from the cache files at startup, the map is loaded from the local cache instead.

// Whether to read information from the cache directory at startup. The default is false.
if (isLoadCacheAtStart(properties)) {
    this.serviceInfoMap = new ConcurrentHashMap<String, ServiceInfo>(DiskCache.read(this.cacheDir));
} else {
    this.serviceInfoMap = new ConcurrentHashMap<String, ServiceInfo>(16);
}

The local cache directory is involved here. In the processServiceInfo method, when the service instance changes, you will see ServiceInfo written to this directory using the DiskCache#write method.

if (changed) {
    NAMING_LOGGER.info("current ips:(" + serviceInfo.ipCount() + ") service: " + serviceInfo.getKey() + " -> "
            + JacksonUtils.toJson(serviceInfo.getHosts()));
    // Publish an instance change event, which is pushed to subscribers for execution
    NotifyCenter.publishEvent(new InstancesChangeEvent(serviceInfo.getName(), serviceInfo.getGroupName(),
            serviceInfo.getClusters(), serviceInfo.getHosts()));
    // Write the service information to a local file
    DiskCache.write(serviceInfo, cacheDir);
}

Let’s talk about local cache directories.

Local cache directory

The local cache directory is a field of ServiceInfoHolder. It specifies the root directory for both the local cache and failover.

private String cacheDir;

In the ServiceInfoHolder constructor, the first call is to generate the cache directory:

public ServiceInfoHolder(String namespace, Properties properties) {
    // Generate the cache directory: the default is ${user.home}/nacos/naming/public.
    // The root directory can be customized through the JM.SNAPSHOT.PATH system property.
    initCacheDir(namespace, properties);
    // ...
}

The default cache directory is ${user.home}/nacos/naming/public. You can customize the root directory by setting the JM.SNAPSHOT.PATH system property, for example with System.setProperty.
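The directory rule can be sketched roughly as below. This is an illustration under assumptions, not the body of initCacheDir: CacheDirSketch and resolveCacheDir are hypothetical names, the property name is assumed to be the upper-case JM.SNAPSHOT.PATH, and the layout root/nacos/naming/namespace is inferred from the default path quoted above.

```java
import java.io.File;

// Hypothetical helper mirroring the directory rule described above.
class CacheDirSketch {
    // Assumption: the custom root is read from this system property.
    static final String SNAPSHOT_PATH_PROPERTY = "JM.SNAPSHOT.PATH";

    // Pure function so the rule is easy to test: a blank custom root falls back to user.home.
    static String resolveCacheDir(String customRoot, String userHome, String namespace) {
        String root = (customRoot == null || customRoot.trim().isEmpty()) ? userHome : customRoot;
        return root + File.separator + "nacos" + File.separator + "naming" + File.separator + namespace;
    }

    // Convenience overload that reads the actual system properties.
    static String resolveCacheDir(String namespace) {
        return resolveCacheDir(System.getProperty(SNAPSHOT_PATH_PROPERTY),
                System.getProperty("user.home"), namespace);
    }

    public static void main(String[] args) {
        System.out.println(resolveCacheDir("public"));
    }
}
```

Since system properties are read when the holder is constructed, a custom root would have to be set before the naming client is created.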

After the directory is initialized, failover information is also stored in the directory.

Failover

The ServiceInfoHolder constructor also instantiates a FailoverReactor, which is stored as a member variable of ServiceInfoHolder. FailoverReactor is responsible for handling failover.

this.failoverReactor = new FailoverReactor(this, cacheDir);

Here this is the current object of ServiceInfoHolder, meaning that the two hold references to each other.

Let’s look at the FailoverReactor constructor:

public FailoverReactor(ServiceInfoHolder serviceInfoHolder, String cacheDir) {
    // Hold a reference to ServiceInfoHolder
    this.serviceInfoHolder = serviceInfoHolder;
    // Build the failover root directory: ${user.home}/nacos/naming/public/failover
    this.failoverDir = cacheDir + FAILOVER_DIR;
    // Initialize the executorService
    this.executorService = new ScheduledThreadPoolExecutor(1, new ThreadFactory() {
        @Override
        public Thread newThread(Runnable r) {
            Thread thread = new Thread(r);
            // Run as a daemon thread
            thread.setDaemon(true);
            thread.setName("com.alibaba.nacos.naming.failover");
            return thread;
        }
    });
    // Other initialization, including starting multiple scheduled tasks via executorService
    this.init();
}

The FailoverReactor constructor shows essentially everything the class does:

  • Holds a reference to ServiceInfoHolder;
  • Builds the failover root directory: ${user.home}/nacos/naming/public/failover, where public may instead be a custom namespace;
  • Initializes the executorService;
  • Calls the init method, which starts multiple scheduled tasks through the executorService.

Init method execution

The init method enables three scheduled tasks:

  • SwitchRefresher: starts immediately and then runs every 5 seconds.
  • DiskFileWriter: starts after an initial delay of 30 minutes and then runs every 24 hours.
  • A startup backup: runs once after a 10-second delay and invokes DiskFileWriter only if the failover directory contains no files.
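The scheduling pattern behind these tasks can be sketched with a single-threaded ScheduledThreadPoolExecutor. This is a simplified illustration, not the actual init method: FailoverSchedulingSketch and the placeholder task bodies are hypothetical, and the third task is assumed here to be a one-off run delayed by 10 seconds.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of the scheduling pattern; task bodies are placeholders.
class FailoverSchedulingSketch {
    final CountDownLatch firstSwitchCheck = new CountDownLatch(1);
    final ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(1, r -> {
        Thread thread = new Thread(r);
        thread.setDaemon(true); // daemon threads do not keep the JVM alive
        thread.setName("failover-sketch");
        return thread;
    });

    void init() {
        Runnable switchRefresher = firstSwitchCheck::countDown; // placeholder: re-read the switch file
        Runnable diskFileWriter = () -> { /* placeholder: back up the cache to disk */ };
        // Switch check: starts immediately, then 5 seconds after each run finishes.
        executor.scheduleWithFixedDelay(switchRefresher, 0L, 5000L, TimeUnit.MILLISECONDS);
        // Periodic backup: first run after 30 minutes, then every 24 hours.
        executor.scheduleWithFixedDelay(diskFileWriter, 30L, 24L * 60L, TimeUnit.MINUTES);
        // Startup backup (assumed one-off): a single run after 10 seconds.
        executor.schedule(diskFileWriter, 10000L, TimeUnit.MILLISECONDS);
    }

    // Waits until the first switch check has run, or the timeout elapses.
    boolean awaitFirstSwitchCheck(long timeoutMillis) {
        try {
            return firstSwitchCheck.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        FailoverSchedulingSketch sketch = new FailoverSchedulingSketch();
        sketch.init();
        System.out.println("switch check ran: " + sketch.awaitFirstSwitchCheck(2000L));
        sketch.executor.shutdownNow();
    }
}
```

With a pool size of 1, the tasks never run concurrently with each other, so a slow backup simply delays the next switch check rather than racing with it.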

SwitchRefresher and DiskFileWriter are inner classes of FailoverReactor. Let’s first look at the implementation of DiskFileWriter, which the last two tasks rely on.

class DiskFileWriter extends TimerTask {

    @Override
    public void run() {
        Map<String, ServiceInfo> map = serviceInfoHolder.getServiceInfoMap();
        for (Map.Entry<String, ServiceInfo> entry : map.entrySet()) {
            ServiceInfo serviceInfo = entry.getValue();
            if (StringUtils.equals(serviceInfo.getKey(), UtilAndComs.ALL_IPS)
                    || StringUtils.equals(serviceInfo.getName(), UtilAndComs.ENV_LIST_KEY)
                    || StringUtils.equals(serviceInfo.getName(), UtilAndComs.ENV_CONFIGS)
                    || StringUtils.equals(serviceInfo.getName(), UtilAndComs.VIP_CLIENT_FILE)
                    || StringUtils.equals(serviceInfo.getName(), UtilAndComs.ALL_HOSTS)) {
                continue;
            }
            // Write the cached content to a disk file
            DiskCache.write(serviceInfo, failoverDir);
        }
    }
}

The logic is simple: get the ServiceInfo objects cached in ServiceInfoHolder, check whether each one should be written to disk, and if so, write it to the failover directory built earlier: ${user.home}/nacos/naming/public/failover. The difference between the second and third scheduled tasks is that the third performs a check first and only writes when the failover directory contains no files.
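The backup loop can be sketched as follows, as an illustration only. FailoverBackupSketch, backup, and the RESERVED_KEY placeholder are hypothetical; raw strings stand in for serialized ServiceInfo, and URL-encoding the key for the file name is an assumption suggested by names such as DEFAULT_GROUP%40%40nacos.test.1.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of "copy the in-memory cache into the failover directory".
class FailoverBackupSketch {
    // Placeholder for the reserved entries that are skipped (not the real constants).
    static final Set<String> RESERVED = Collections.singleton("RESERVED_KEY");

    // Writes one file per cache entry and returns how many files were written.
    static int backup(Map<String, String> cache, Path failoverDir) {
        try {
            Files.createDirectories(failoverDir);
            int written = 0;
            for (Map.Entry<String, String> entry : cache.entrySet()) {
                if (RESERVED.contains(entry.getKey())) {
                    continue; // reserved entries are never backed up
                }
                // Assumption: the file name is the URL-encoded key.
                String fileName = URLEncoder.encode(entry.getKey(), StandardCharsets.UTF_8.name());
                Files.write(failoverDir.resolve(fileName), entry.getValue().getBytes(StandardCharsets.UTF_8));
                written++;
            }
            return written;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> cache = new HashMap<>();
        cache.put("DEFAULT_GROUP@@nacos.test.1", "{\"name\":\"DEFAULT_GROUP@@nacos.test.1\"}");
        Path dir = new java.io.File(System.getProperty("java.io.tmpdir"), "failover-backup-demo").toPath();
        System.out.println("files written: " + backup(cache, dir));
    }
}
```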

Finally, the core implementation of SwitchRefresher is as follows:

File switchFile = new File(failoverDir + UtilAndComs.FAILOVER_SWITCH);
// If the file does not exist, exit
if (!switchFile.exists()) {
    switchParams.put("failover-mode", "false");
    NAMING_LOGGER.debug("failover switch is not found, " + switchFile.getName());
    return;
}
long modified = switchFile.lastModified();
if (lastModifiedMillis < modified) {
    lastModifiedMillis = modified;
    // Get the content of the failover switch file
    String failover = ConcurrentDiskUtil.getFileContent(failoverDir + UtilAndComs.FAILOVER_SWITCH,
            Charset.defaultCharset().toString());
    if (!StringUtils.isEmpty(failover)) {
        String[] lines = failover.split(DiskCache.getLineSeparator());
        for (String line : lines) {
            String line1 = line.trim();
            if (IS_FAILOVER_MODE.equals(line1)) {
                // 1: failover is enabled
                switchParams.put(FAILOVER_MODE_PARAM, Boolean.TRUE.toString());
                NAMING_LOGGER.info("failover-mode is on");
                new FailoverFileReader().run();
            } else if (NO_FAILOVER_MODE.equals(line1)) {
                // 0: failover is disabled
                switchParams.put(FAILOVER_MODE_PARAM, Boolean.FALSE.toString());
                NAMING_LOGGER.info("failover-mode is off");
            }
        }
    } else {
        switchParams.put(FAILOVER_MODE_PARAM, Boolean.FALSE.toString());
    }
}

The logic of the above code is as follows:

  • If the failover switch file does not exist, failover mode is set to false and the method returns. The switch file is named “00-00---000-VIPSRV_FAILOVER_SWITCH-000---00-00”.
  • Otherwise, the file's modification time is compared with the last one seen, and the contents are read only if the file has been modified.
  • The switch file stores a 0 or 1 flag: 0 means failover is disabled, and 1 means it is enabled.
  • When failover is enabled, FailoverFileReader is executed.
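The switch logic in the list above can be sketched as a small class. This is a simplified illustration: FailoverSwitchSketch, parse, and refresh are hypothetical names, a boolean field stands in for the switchParams map, and the step that triggers FailoverFileReader is omitted.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hypothetical sketch of a file-based on/off switch that is re-read only when modified.
class FailoverSwitchSketch {
    private long lastModifiedMillis;
    private boolean failoverOn;

    // "1" turns failover on, "0" turns it off, other lines are ignored; empty content means off.
    static boolean parse(String content, boolean current) {
        if (content == null || content.isEmpty()) {
            return false;
        }
        boolean result = current;
        for (String line : content.split("\r?\n")) {
            String trimmed = line.trim();
            if ("1".equals(trimmed)) {
                result = true;
            } else if ("0".equals(trimmed)) {
                result = false;
            }
        }
        return result;
    }

    // Returns the current switch state after checking the file once.
    boolean refresh(File switchFile) {
        if (!switchFile.exists()) {
            failoverOn = false; // no switch file: failover is off
            return false;
        }
        long modified = switchFile.lastModified();
        if (lastModifiedMillis < modified) {
            lastModifiedMillis = modified;
            try {
                String content = new String(Files.readAllBytes(switchFile.toPath()), StandardCharsets.UTF_8);
                failoverOn = parse(content, failoverOn);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return failoverOn;
    }

    public static void main(String[] args) {
        System.out.println(parse("1", false));
        System.out.println(new FailoverSwitchSketch().refresh(new File("no-such-switch-file")));
    }
}
```

Comparing modification times keeps the 5-second polling cheap: the file content is read and parsed only after someone has actually edited the switch.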

FailoverFileReader, as the name implies, reads the failover files. It reads the ServiceInfo files stored in the failover directory, converts their contents to ServiceInfo objects, and stores them all in the serviceMap field of FailoverReactor.

The following is an example of a failover directory file:

(base) appledeMacBook-Pro-2:failover apple$ ls
DEFAULT_GROUP%40%40nacos.test.1
DEFAULT_GROUP%40%40user-provider@@DEFAULT
DEFAULT_GROUP%40%40user-service-consumer@@DEFAULT
DEFAULT_GROUP%40%40user-service-provider
DEFAULT_GROUP%40%40user-service-provider@@DEFAULT

The file format is as follows:

{
    "hosts": [
        {
            "ip": "1.1.1.1",
            "port": 800,
            "valid": true,
            "healthy": true,
            "marked": false,
            "instanceId": "1.1.1.1#800#DEFAULT#DEFAULT_GROUP@@nacos.test.1",
            "metadata": {
                "netType": "external",
                "version": "2.0"
            },
            "enabled": true,
            "weight": 2,
            "clusterName": "DEFAULT",
            "serviceName": "DEFAULT_GROUP@@nacos.test.1",
            "ephemeral": true
        }
    ],
    "dom": "DEFAULT_GROUP@@nacos.test.1",
    "name": "DEFAULT_GROUP@@nacos.test.1",
    "cacheMillis": 10000,
    "lastRefTime": 1617001291656,
    "checksum": "969c531798aedb72f87ac686dfea2569",
    "useSpecifiedURL": false,
    "clusters": "",
    "env": "",
    "metadata": {}
}

Let’s look at the core business implementation:

for (File file : files) {
    if (!file.isFile()) {
        continue;
    }
    // Skip the failover switch file
    if (file.getName().equals(UtilAndComs.FAILOVER_SWITCH)) {
        continue;
    }
    ServiceInfo dom = new ServiceInfo(file.getName());
    try {
        String dataString = ConcurrentDiskUtil.getFileContent(file, Charset.defaultCharset().toString());
        reader = new BufferedReader(new StringReader(dataString));
        String json;
        if ((json = reader.readLine()) != null) {
            try {
                dom = JacksonUtils.toObj(json, ServiceInfo.class);
            } catch (Exception e) {
                NAMING_LOGGER.error("[NA] error while parsing cached dom : " + json, e);
            }
        }
    } catch (Exception e) {
        NAMING_LOGGER.error("[NA] failed to read cache for dom: " + file.getName(), e);
    } finally {
        try {
            if (reader != null) {
                reader.close();
            }
        } catch (Exception e) {
            //ignore
        }
    }
    // ... Read into the cache
    if (!CollectionUtils.isEmpty(dom.getHosts())) {
        domMap.put(dom.getKey(), dom);
    }
}

The basic code flow is as follows:

  • Iterate over all entries in the failover directory;
  • If an entry is not a regular file, skip it;
  • If the file is the failover switch file, skip it;
  • Read the JSON content of the file and convert it into a ServiceInfo object;
  • If the ServiceInfo object has a non-empty host list, put it into domMap.

When the for loop completes, if domMap is not empty, assign it to serviceMap:

if (domMap.size() > 0) {
    serviceMap = domMap;
}
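The read-then-swap flow can be sketched as below. This is an illustration under assumptions: FailoverSnapshotReaderSketch, SWITCH_FILE, reload, and write are hypothetical names, the raw first line of each file stands in for the parsed ServiceInfo (so no JSON library is needed), and the file name is used as the map key.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: load failover files into a fresh map, then swap it in.
class FailoverSnapshotReaderSketch {
    static final String SWITCH_FILE = "failover-switch"; // placeholder name, not the real constant
    private volatile Map<String, String> serviceMap = new HashMap<>();

    void reload(File failoverDir) {
        File[] files = failoverDir.listFiles();
        if (files == null) {
            return; // not a directory, or not readable
        }
        Map<String, String> domMap = new HashMap<>();
        for (File file : files) {
            if (!file.isFile() || file.getName().equals(SWITCH_FILE)) {
                continue; // skip sub-directories and the switch file
            }
            try (BufferedReader reader = Files.newBufferedReader(file.toPath(), StandardCharsets.UTF_8)) {
                String json = reader.readLine(); // the snapshot is expected on the first line
                if (json != null && !json.trim().isEmpty()) {
                    domMap.put(file.getName(), json);
                }
            } catch (IOException e) {
                System.err.println("failed to read " + file.getName() + ": " + e);
            }
        }
        // Replace the old snapshot only if something was loaded.
        if (!domMap.isEmpty()) {
            serviceMap = domMap;
        }
    }

    Map<String, String> snapshot() {
        return serviceMap;
    }

    // Small helper used by the demo to create files.
    static void write(File file, String content) {
        try {
            Files.write(file.toPath(), content.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        File dir = new File(System.getProperty("java.io.tmpdir"), "failover-read-demo");
        dir.mkdirs();
        write(new File(dir, "svc-a"), "{\"name\":\"svc-a\"}");
        FailoverSnapshotReaderSketch reader = new FailoverSnapshotReaderSketch();
        reader.reload(dir);
        System.out.println(reader.snapshot().keySet());
    }
}
```

Building a separate map and assigning it in one step means readers never observe a half-loaded snapshot, and an empty or unreadable directory leaves the previous snapshot in place.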

So where is this serviceMap used? As we saw when discussing how instances are obtained, the usual entry point is the getServiceInfo method:

public ServiceInfo getServiceInfo(final String serviceName, final String groupName, final String clusters) {
    NAMING_LOGGER.debug("failover-mode: " + failoverReactor.isFailoverSwitch());
    String groupedServiceName = NamingUtils.getGroupedName(serviceName, groupName);
    String key = ServiceInfo.getKey(groupedServiceName, clusters);
    if (failoverReactor.isFailoverSwitch()) {
        return failoverReactor.getService(key);
    }
    return serviceInfoMap.get(key);
}

That is, if failover is enabled, the failoverReactor#getService method is called first, which gets ServiceInfo from serviceMap.

public ServiceInfo getService(String key) {
    ServiceInfo serviceInfo = serviceMap.get(key);

    if (serviceInfo == null) {
        serviceInfo = new ServiceInfo();
        serviceInfo.setName(key);
    }

    return serviceInfo;
}
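Putting the two lookups together, the switch-then-fallback behaviour can be sketched as follows. FailoverLookupSketch and its methods are hypothetical names, lists of addresses stand in for ServiceInfo, and an empty list plays the role of the empty ServiceInfo returned on a failover miss.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of choosing between the live cache and the failover snapshot.
class FailoverLookupSketch {
    private final Map<String, List<String>> serviceInfoMap = new ConcurrentHashMap<>();
    private final Map<String, List<String>> failoverMap = new ConcurrentHashMap<>();
    private volatile boolean failoverSwitch;

    void putLive(String key, List<String> hosts) {
        serviceInfoMap.put(key, hosts);
    }

    void putFailover(String key, List<String> hosts) {
        failoverMap.put(key, hosts);
    }

    void setFailoverSwitch(boolean on) {
        failoverSwitch = on;
    }

    List<String> getInstances(String key) {
        if (failoverSwitch) {
            // Failover on: serve from the disk snapshot, never null (empty placeholder on a miss).
            List<String> hosts = failoverMap.get(key);
            return hosts != null ? hosts : Collections.<String>emptyList();
        }
        // Failover off: serve from the live in-memory cache (may be null if unknown).
        return serviceInfoMap.get(key);
    }

    public static void main(String[] args) {
        FailoverLookupSketch lookup = new FailoverLookupSketch();
        lookup.putLive("svc", Arrays.asList("1.1.1.1:800"));
        lookup.putFailover("svc", Arrays.asList("2.2.2.2:800"));
        System.out.println(lookup.getInstances("svc"));
        lookup.setFailoverSwitch(true);
        System.out.println(lookup.getInstances("svc"));
    }
}
```

Note that while the switch is on, the live cache is bypassed entirely, so a service missing from the snapshot yields an empty result even if the in-memory map has instances for it.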

At this point, the analysis of the Nacos client failover process is completed.

Summary

This article covered how the Nacos client implements local caching and failover. Local caching has two aspects: first, the instance information obtained from the registry is cached in memory as a Map, which makes lookups convenient; second, it is periodically saved to disk files in case of emergency.

Failover also has two aspects: first, the failover switch is controlled by a marker file; second, once failover is enabled, service instance information is served from the files that are periodically backed up to the failover directory.

About the blogger: Author of the technology book SpringBoot Inside Technology, loves to delve into technology and writes technical articles.
