Preface

Some time ago I was helping my project team restructure our system. The application was split into multiple services, and some of them were deployed as clusters. As the architecture evolved, it was natural to introduce ELK + Filebeat for log collection. However, when viewing logs in Kibana, the lack of a TraceID made it hard for developers to filter out the logs belonging to a given request, and hard to follow how the application called its downstream services, which cost a lot of time. After troubleshooting issues this way several times, I could no longer bear spending that much time on each one, so I proposed the change to my supervisor.

This article mainly records my research, the problems I ran into, and the concrete implementation of the TraceID link-tracing retrofit for the project. The work also deepened my understanding of distributed service tracing, which I have written about here as well.

Main contents of this article:

  • Preliminary implementation
  • TraceID loss in asynchronous threads (thread pools)
  • Dubbo RPC link tracing
  • HTTP service link tracing
  • A look at the Spring Cloud Sleuth implementation
  • Summary

I. Preliminary implementation

The basic idea is to use SLF4J's MDC plus a Spring interceptor to generate a TraceID and put it into the MDC when an external request comes in.

MDC

Here is a brief introduction to MDC.

MDC (Mapped Diagnostic Context) is a feature provided by Log4j and Logback to facilitate logging under multithreaded conditions. MDC can be thought of as a Map bound to the current thread, to which you can add key-value pairs. The content of the MDC can be accessed by any code executing on the same thread, and child threads inherit the MDC content of their parent thread when they are created. To record logs, you only need to read the required information from the MDC; the application saves it there at an appropriate time. For a web application, this data is usually saved at the very beginning of request processing.

To put it simply, MDC is an InheritableThreadLocal provided by the logging framework: in project code you put key/value pairs into it, and at print time the framework reads them back from the ThreadLocal and writes them into the log. The detailed principle is not covered in this article; have a look at the MDC adapter implementations in Log4j and Logback if you are interested.
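As a quick illustration, here is a minimal sketch of using MDC directly (the class name, key and value are made up for the example):

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.slf4j.MDC;

public class MdcDemo {

    private static final Logger log = LoggerFactory.getLogger(MdcDemo.class);

    public static void main(String[] args) {
        // put a key/value pair into the current thread's diagnostic context
        MDC.put("traceId", "demo-trace-id");
        // with %X{traceId} in the log pattern, this line carries the value automatically
        log.info("order created");
        // the value can also be read back in code on the same thread
        log.info("current traceId = {}", MDC.get("traceId"));
        // clean up when the unit of work ends so reused threads do not leak stale values
        MDC.clear();
    }
}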

Implementation

  1. Define a custom Spring interceptor TraceInterceptor

/**
 * @author Richard_yyf
 */
public class TraceInterceptor extends HandlerInterceptorAdapter {

    @Override
    public boolean preHandle(HttpServletRequest request, HttpServletResponse response, Object handler) throws Exception {
        // Clear the MDC first
        MDC.clear();

        ThreadMdcUtil.setTraceIdIfAbsent();

        // The following logic...
        return true;
    }
}
  2. Register the interceptor
/**
 * @author Richard_yyf
 */
@Configuration
public class WebMvcConfig implements WebMvcConfigurer {

    @Override
    public void addInterceptors(InterceptorRegistry registry) {
        registry.addInterceptor(traceInterceptor())
                .addPathPatterns("/**")
                .order(0);
    }

    @Bean
    public TraceInterceptor traceInterceptor() {
        return new TraceInterceptor();
    }
}

ThreadMdcUtil is a utility class I wrote myself that wraps some operations on the TraceID:

public class ThreadMdcUtil {

    public static String createTraceId() {
        String uuid = UUID.randomUUID().toString();
        return DigestUtils.md5Hex(uuid).substring(8, 24);
    }

    public static void setTraceIdIfAbsent() {
        if (MDC.get(TRACE_ID) == null) {
            MDC.put(TRACE_ID, createTraceId());
        }
    }

    // Some methods are omitted here; they will be shown later
}

DigestUtils comes from a third-party dependency:

<dependency>
    <groupId>commons-codec</groupId>
    <artifactId>commons-codec</artifactId>
    <version>***</version>
</dependency>

TRACE_ID is placed in the Constant class for easy reference:

public class Constant {
    ...
    public static final String TRACE_ID = "traceId";
    ...
}
  3. Modify the output pattern in the log configuration file to add the TraceID field

    Value: %X{traceId}
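For example, with Logback the pattern might look like the following (a sketch; the appender name and the rest of the pattern are just placeholders, %X{traceId} is the only part that matters):

<appender name="CONSOLE" class="ch.qos.logback.core.ConsoleAppender">
    <encoder>
        <!-- %X{traceId} reads the value from the MDC; it prints as empty when absent -->
        <pattern>%d{yyyy-MM-dd HH:mm:ss.SSS} [%thread] [%X{traceId}] %-5level %logger{36} - %msg%n</pattern>
    </encoder>
</appender>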

Result

After following the above steps, your Web application will print a log with TraceId when it receives a request.

II. TraceID loss in thread pools

The previous solution implements only our most basic requirement. If you actually use it, you will find that asynchronous task threads do not carry the TraceID.

A mature application inevitably uses many thread pools: the @Async thread pool for asynchronous calls, thread pools defined by the application itself, and so on.

As mentioned earlier, MDC is implemented with InheritableThreadLocal: when a child thread is created, the inheritableThreadLocals of the parent thread are copied over. But in a thread pool, threads are reused rather than newly created, so the MDC content cannot be passed in this way.
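A minimal sketch of the symptom (assuming the Constant class from section I is available): once a worker thread already exists in the pool, MDC values set on the request thread afterwards are invisible to it.

import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import org.slf4j.MDC;

public class MdcLossDemo {

    public static void main(String[] args) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(1, 1, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<>());
        pool.prestartAllCoreThreads();           // the worker thread is created here, before the "request"

        MDC.put(Constant.TRACE_ID, "abc123");    // the request thread sets its traceId
        pool.execute(() ->
                // prints "traceId in pool = null": the reused worker has its own, empty MDC
                System.out.println("traceId in pool = " + MDC.get(Constant.TRACE_ID)));

        pool.shutdown();
    }
}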

So we have to take a roundabout route: since threads are reused, the natural idea is to do some "dirty" work when a task is submitted to the thread pool, so that the MDC content gets passed along.

Transformation

Here’s the code:

/**
 * @author Richard_yyf
 */
public class ThreadMdcUtil {

    public static String createTraceId() {
        String uuid = UUID.randomUUID().toString();
        return DigestUtils.md5Hex(uuid).substring(8, 24);
    }

    public static void setTraceIdIfAbsent() {
        if (MDC.get(TRACE_ID) == null) {
            MDC.put(TRACE_ID, createTraceId());
        }
    }

    public static void setTraceId() {
        MDC.put(TRACE_ID, createTraceId());
    }

    public static void setTraceId(String traceId) {
        MDC.put(TRACE_ID, traceId);
    }

    public static <T> Callable<T> wrap(final Callable<T> callable, final Map<String, String> context) {
        return () -> {
            if (context == null) {
                MDC.clear();
            } else {
                MDC.setContextMap(context);
            }
            setTraceIdIfAbsent();
            try {
                return callable.call();
            } finally {
                MDC.clear();
            }
        };
    }

    public static Runnable wrap(final Runnable runnable, final Map<String, String> context) {
        return () -> {
            if (context == null) {
                MDC.clear();
            } else {
                MDC.setContextMap(context);
            }
            setTraceIdIfAbsent();
            try {
                runnable.run();
            } finally {
                MDC.clear();
            }
        };
    }
}

Then wrap your own extension of ThreadPoolExecutor:

/**
 * @author Richard_yyf
 */
public class ThreadPoolExecutorMdcWrapper extends ThreadPoolExecutor {

    public ThreadPoolExecutorMdcWrapper(int corePoolSize, int maximumPoolSize, long keepAliveTime, TimeUnit unit,
                                        BlockingQueue<Runnable> workQueue) {
        super(corePoolSize, maximumPoolSize, keepAliveTime, unit, workQueue);
    }

    public ThreadPoolExecutorMdcWrapper(int corePoolSize, int maximumPoolSize, long keepAliveTime, TimeUnit unit,
                                        BlockingQueue<Runnable> workQueue, ThreadFactory threadFactory) {
        super(corePoolSize, maximumPoolSize, keepAliveTime, unit, workQueue, threadFactory);
    }

    public ThreadPoolExecutorMdcWrapper(int corePoolSize, int maximumPoolSize, long keepAliveTime, TimeUnit unit,
                                        BlockingQueue<Runnable> workQueue, RejectedExecutionHandler handler) {
        super(corePoolSize, maximumPoolSize, keepAliveTime, unit, workQueue, handler);
    }

    public ThreadPoolExecutorMdcWrapper(int corePoolSize, int maximumPoolSize, long keepAliveTime, TimeUnit unit,
                                        BlockingQueue<Runnable> workQueue, ThreadFactory threadFactory,
                                        RejectedExecutionHandler handler) {
        super(corePoolSize, maximumPoolSize, keepAliveTime, unit, workQueue, threadFactory, handler);
    }

    @Override
    public void execute(Runnable task) {
        super.execute(ThreadMdcUtil.wrap(task, MDC.getCopyOfContextMap()));
    }

    @Override
    public <T> Future<T> submit(Runnable task, T result) {
        return super.submit(ThreadMdcUtil.wrap(task, MDC.getCopyOfContextMap()), result);
    }

    @Override
    public <T> Future<T> submit(Callable<T> task) {
        return super.submit(ThreadMdcUtil.wrap(task, MDC.getCopyOfContextMap()));
    }

    @Override
    public Future<?> submit(Runnable task) {
        return super.submit(ThreadMdcUtil.wrap(task, MDC.getCopyOfContextMap()));
    }
}

Usage

Wherever you previously wrote executor = new ThreadPoolExecutor(...), simply change it to executor = new ThreadPoolExecutorMdcWrapper(...); everything else stays the same.
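A sketch of the swap (the pool parameters below are made-up values):

        // before:
        // ExecutorService executor = new ThreadPoolExecutor(
        //         4, 8, 60L, TimeUnit.SECONDS, new LinkedBlockingQueue<>(100));

        // after: same parameters, but every submitted task now carries the submitter's MDC
        ExecutorService executor = new ThreadPoolExecutorMdcWrapper(
                4, 8, 60L, TimeUnit.SECONDS, new LinkedBlockingQueue<>(100));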

For example, if you are using Spring's @Async asynchronous methods, declare the wrapper when configuring the thread pool:

@SpringBootApplication
public class Application {

    public static void main(String[] args) {
        SpringApplication.run(Application.class, args);
    }

    @EnableAsync
    @Configuration
    class TaskPoolConfig {

        @Bean("taskExecutor")
        public Executor taskExecutor() {
            // core 10, max 20, keep-alive 60s, queue capacity 200 (same values as before)
            return new ThreadPoolExecutorMdcWrapper(10, 20, 60, TimeUnit.SECONDS,
                    new LinkedBlockingQueue<>(200));
        }
    }
}

Result

By following these steps, your asynchronous tasks will print logs carrying the TraceID of the originating request.
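As a quick check, here is a sketch of an @Async method using the taskExecutor bean above (the service class and log message are made up for the example):

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.scheduling.annotation.Async;
import org.springframework.stereotype.Service;

@Service
public class NotifyService {

    private static final Logger log = LoggerFactory.getLogger(NotifyService.class);

    @Async("taskExecutor")   // runs on the MDC-aware wrapper configured above
    public void notifyDownstream(String orderId) {
        // printed on a pool thread, yet %X{traceId} still shows the originating request's traceId
        log.info("notifying downstream system for order {}", orderId);
    }
}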

III. Dubbo RPC link tracing

Our project team mainly uses Dubbo as the microservice framework. We want the upstream service's TraceID to be passed along with each service invocation so that the whole link can be traced.

Dubbo provides a mechanism for this: the consumer's TraceID can be set and passed along through the Dubbo RpcContext plus a Dubbo Filter.

See the official website for explanations of these two concepts.

Dubbo RPC

Dubbo Filter

Here I’ll just show you the code and extension point configuration.

Dubbo Filter for Consumer

Consumer application:

/**
 * @author Richard_yyf
 */
@Activate(group = {Constants.CONSUMER})
public class ConsumerRpcTraceFilter implements Filter {

    @Override
    public Result invoke(Invoker<?> invoker, Invocation invocation) throws RpcException {
        // If the MDC context has a traceId, pass it to the provider as-is
        String traceId = MDC.get(TRACE_ID);
        if (StringUtils.isNotEmpty(traceId)) {
            RpcContext.getContext().setAttachment(TRACE_ID, traceId);
        }
        return invoker.invoke(invocation);
    }
}

SPI configuration:

In the resources directory, create a META-INF/dubbo/com.alibaba.dubbo.rpc.Filter file.

consumerRpcTraceFilter=com.xxx.xxx.filter.ConsumerRpcTraceFilter

Dubbo Filter for Provider

Service provider application:

/**
 * @author Richard_yyf
 */
@Activate(group = {Constants.PROVIDER})
public class ProviderRpcTraceFilter implements Filter {

    @Override
    public Result invoke(Invoker<?> invoker, Invocation invocation) throws RpcException {
        // Receive the traceId passed by the consumer
        String traceId = RpcContext.getContext().getAttachment(TRACE_ID);
        if (StringUtils.isBlank(traceId)) {
            traceId = ThreadMdcUtil.createTraceId();
        }

        // Set the traceId for logging
        ThreadMdcUtil.setTraceId(traceId);

        // TODO: if this service calls the next service in turn, the attachment needs to be set again for downstream
        // RpcContext.getContext().setAttachment("trace_id", traceId);

        try {
            return invoker.invoke(invocation);
        } finally {
            // Remove the MDC attribute after the call completes
            MDC.remove(TRACE_ID);
        }
    }
}

SPI configuration:

providerRpcTraceFilter=com.xxx.xxx.filter.ProviderRpcTraceFilter

IV. HTTP service link tracing

In addition to the Dubbo RPC approach, calls between microservices are also commonly made over HTTP REST. In this scenario, the upstream service adds the TraceID to the HTTP header when making the HTTP call.

Take the commonly used Spring RestTemplate as an example: use a client interceptor to add the TraceID to the HTTP header.

        RestTemplate restTemplate = new RestTemplate();

        List<ClientHttpRequestInterceptor> list = new ArrayList<>();
        list.add((request, body, execution) -> {
            String traceId = MDC.get(TRACE_ID);
            if (StringUtils.isNotEmpty(traceId)) {
                request.getHeaders().add(TRACE_ID, traceId);
            }
            return execution.execute(request, body);
        });

        restTemplate.setInterceptors(list);
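In a real project this interceptor is usually registered once on a shared RestTemplate bean, so every outgoing call carries the header automatically. A sketch, assuming the Constant.TRACE_ID key from section I (the configuration class name is made up):

import java.util.Collections;
import org.apache.commons.lang3.StringUtils;
import org.slf4j.MDC;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.http.client.ClientHttpRequestInterceptor;
import org.springframework.web.client.RestTemplate;

@Configuration
public class RestTemplateConfig {

    @Bean
    public RestTemplate restTemplate() {
        // same interceptor as above, wired once so all callers share it
        ClientHttpRequestInterceptor traceHeaderInterceptor = (request, body, execution) -> {
            String traceId = MDC.get(Constant.TRACE_ID);
            if (StringUtils.isNotEmpty(traceId)) {
                request.getHeaders().add(Constant.TRACE_ID, traceId);
            }
            return execution.execute(request, body);
        };

        RestTemplate restTemplate = new RestTemplate();
        restTemplate.setInterceptors(Collections.singletonList(traceHeaderInterceptor));
        return restTemplate;
    }
}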

The downstream service, since it exposes an HTTP interface, can retrieve the TraceID by adding an interceptor of its own.

public class TraceInterceptor extends HandlerInterceptorAdapter {

    @Override
    public boolean preHandle(HttpServletRequest request, HttpServletResponse response, Object handler) {
        MDC.clear();
        String traceId = request.getHeader(TRACE_ID);
        if (StringUtils.isEmpty(traceId)) {
            ThreadMdcUtil.setTraceId();
        } else {
            MDC.put(TRACE_ID, traceId);
        }
        return true;
    }
}

V. A look at the Spring Cloud Sleuth implementation

Through the above steps, we have built a fairly basic service tracing solution by ourselves.

Spring Cloud, as a one-stop microservice development framework, provides Spring Cloud Sleuth as the distributed tracing solution in its technology stack. I want to discuss it briefly here.

Sleuth is a mature tracing solution whose design is based on Google's Dapper paper, and some of its terminology comes from that paper. For the TraceID propagation problems discussed above, some of the ideas in our simple solution are actually also reflected in Sleuth.

The first is log correlation: Sleuth adds the SpanID and TraceID to the SLF4J MDC, so the printed logs carry the corresponding identifiers.

For TraceID loss across thread pools, our approach was to wrap the task at submission time; Sleuth solves asynchronous TraceID propagation for Hystrix by implementing the HystrixConcurrencyStrategy interface. When Hystrix actually executes a command, it calls the wrapCallable method of HystrixConcurrencyStrategy, and by implementing that interface the TraceID is carried inside the wrapped Callable (see SleuthHystrixConcurrencyStrategy for details).

For Dubbo RPC calls and HTTP service calls, we used Dubbo's RpcContext + Filter and the HTTP header + an interceptor: the TraceID is passed through the extension points and context mechanisms provided by the protocol or framework itself. Spring Cloud Sleuth applies similar solutions when tracing components such as @Async, RestTemplate, Zuul and Feign. For example, tracing of RestTemplate relies on the same Spring client interceptor mechanism as above (see TraceRestTemplateInterceptor).

Comparing our simple solution with Spring Cloud Sleuth shows that the basic ideas behind log tracing are similar across these technical solutions.

Of course, Spring Cloud Sleuth is implemented based on Dapper and provides a relatively mature architecture for tracing calls in a distributed system. After integrating Zipkin and the spring-cloud-sleuth-zipkin dependency, it can form a complete distributed service tracing system with data collection, data storage and data display.
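For reference, integrating it typically amounts to adding dependencies along these lines (a sketch only; versions are managed by the Spring Cloud BOM, and the exact starters depend on your Spring Cloud release):

<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-sleuth</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-sleuth-zipkin</artifactId>
</dependency>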

With Sleuth you can know exactly which services a request passes through and how long each service takes to process it, which makes it easy to sort out the call relationships between microservices. In addition, Sleuth can help us with:

  • Latency analysis: with Sleuth it is easy to see how long each sampled request takes and which service invocations are slow;
  • Error visualization: exceptions not caught by the program can be seen in the integrated Zipkin UI;
  • Link optimization: services that are invoked frequently can be identified and optimized.

PS: Spring Cloud Sleuth 2.0 officially supports Dubbo; the idea there is also to extend a Dubbo Filter.

VI. Summary

Why didn't we introduce Sleuth + Zipkin as the solution? Because the call links in our system are not complicated (usually there is only one layer of calls), we did not want to add extra third-party components and preferred a simple solution.

That's the end of this article. Implementing a simple log tracing solution for microservice calls is not particularly difficult. What matters is the way of approaching the problem and learning from the excellent technical solutions that already exist on the market.

If this article helped you, I hope you will give it a thumbs up. That is the biggest motivation for me.