The background,

The most common method for troubleshooting system faults is to view system logs. In distributed systems, traceId is used as the link log tracing.

<property name="pattern" value="%d{yyyy-MM-dd HH:mm:ss.SSS} [${applicationName},%X{X-B3-TraceId:-}] [%thread] %-5level [%uid] [%C:%L] %msg%n"/>
Copy the code

The values of traceId and X-B3-TracEID are the same in the MDC. However, because MDC uses ThreadLocal internally, only the local thread is valid, and values in child threads and downstream service MDC are lost. Therefore, the main difficulty of the scheme is to solve the problem of value transfer, which mainly includes the following parts:

  • How do asynchronous cases (thread pools) pass traceId to child threads
  • How is an asynchronous service call (RPC) delivered

This paper will focus on the above two problems, to provide solutions.

Second, solutions

1. Extend thread pools

1.1 toThreadLocalThreadPoolExecutorFor extension, the implementation of the parent threadMDCContent is copied to child threads.

public class LogThreadLocalThreadPoolExecutor extends ThreadLocalThreadPoolExecutor {...@Override
    public void execute(Runnable command) {
        super.execute(new MDCRunnable() {
            @Override
            protected void innerRun(a) { command.run(); }}); }public static abstract class MDCRunnable implements Runnable {

        private Map<String, String> copyOfContextMap = MDC.getCopyOfContextMap();

        @Override
        public void run(a) {
            if(copyOfContextMap ! =null) {
                MDC.setContextMap(copyOfContextMap);
            }
            try {
                innerRun();
            } finally {
                if(MapUtils.isNotEmpty(copyOfContextMap)) { MDC.clear(); }}}protected abstract void innerRun(a); }}Copy the code

1.2 Alibaba TTL Transmission traceId Problem

In this project, we use TtlRunnable to package the first layer. Because the MDC implementation of Logback uses ThreadLocal to transmit child threads, we need to rewrite ali’s TransmittableThreadLocal. Note: While InheritableThreadLocal can be passed to child threads, the TransmittableThreadLocal provides a solution for thread reuse in the thread pool, passing ThreadLocal from the current thread to other threads in the thread pool.

The requirement scenario for ThreadLocal is the latent requirement scenario for TTL, and the TTL target scenario is if your business needs to “pass ThreadLocal using components that cache threads such as thread pools”

public class ThreadLocalThreadPoolExecutor extends ThreadPoolExecutor {...@Override
    public void execute(Runnable command) {
        super.execute(TtlRunnable.get(command)); }}Copy the code

If the LogbackMDCAdapter has any issues, replace ThreadLocalMap with TransmittableThreadLocal. Note: The package must be org.slf4j

Initialize the TtlMDCAdapter instance and replace the Adapter object in MDC. Auto-inject configuration spring.factories

org.springframework.context.ApplicationContextInitializer=\
  com.moka.hcm.common.log.config.TtlMDCAdapterInitializer
Copy the code
public class TtlMDCAdapterInitializer implements ApplicationContextInitializer<ConfigurableApplicationContext> {
    @Override
    public void initialize(ConfigurableApplicationContext applicationContext) {
        // Load the TtlMDCAdapter instanceTtlMDCAdapter.getInstance(); }}Copy the code

2. Asynchronous Feign calls

When other services are invoked through feign in the thread pool, hystrix sends requests through the thread pool, but ThreadLocal in MDC cannot pass child thread variables, causing inconsistent TraceIds.

2021-12-21 19:20:58.385 [50c8a6e474e8d858] [http-nio-8080-exec-5] INFO  
2021-12-21 19:20:58.388 [50c8a6e474e8d858] [http-nio-8080-exec-5] INFO  
2021-12-21 19:21:05.195 [50c8a6e474e8d858] [pool-15-thread-1] INFO  
2021-12-21 19:21:05.219 [f85da8ad351a0ff3] [hystrix-9] INFO  
2021-12-21 19:21:05.319 [50c8a6e474e8d858] [http-nio-8080-exec-5] INFO 
Copy the code

Solutions:

public class ThreadLocalHystrixConcurrencyStrategy extends HystrixConcurrencyStrategy {
    /** * Hystrix thread pool passes the MDC context */
    @Override
    public <T> Callable<T> wrapCallable(Callable<T> callable) {
        Map<String, String> map = MDC.getCopyOfContextMap();
        return() - > {try {
                MDC.setContextMap(map);
                return callable.call();
            } finally{ MDC.clear(); }}; }}Copy the code

Take a look at the modified effect:

2021-12-23 14:52:32.714 [205932514ea19D87] [HTTP-NIO-8080-exec-1] INFO 2021-12-23 14:52:32.850 [205932514ea19d87] INFO 2021-12-23 14:52:32.850 [205932514ea19d87] [hystrix-2] INFO 2021-12-23 14:52:32.890 [205932514ea19d87] [HTTP-NIO-8080-exec-1] INFO 2021-12-23 14:52:32.892 [205932514ea19d87] [http-nio-8080-exec-1] INFO 2021-12-23 14:52:32.937 [205932514ea19d87] [HTTP-nio-8080-exec-1] INFO 2021-12-23 14:52:32.987 [205932514ea19D87] [pool-15-thread-2] INFO 2021-12-23 14:52:33.036 [205932514ea19d87] INFO 2021-12-23 14:52:33.036 [205932514ea19d87] [hystrix-3] INFO 2021-12-23 14:52:33.153 [205932514ea19d87] [HTTP-NIO-8080-exec-1] INFOCopy the code

3,Slf4jCurrentTraceContextuseInheritableThreadLocalThe problem

[InheritableThreadLocal] [InheritableThreadLocal] [InheritableThreadLocal] [InheritableThreadLocal] [InheritableThreadLocal] [InheritableThreadLocal] [InheritableThreadLocal] [InheritableThreadLocal]

In the thread pool state, when the thread is created for the first time, the inheritableThreadLocals data will be copied from the parent thread. Therefore, the variables set by the parent thread will be retrieved when the task starts. It does not trigger data operations on the inheritableThreadLocals in the parent thread, so even if a new traceId is set in the main thread, it does not take effect. The asynchronous thread traceId uses the old traceId.

So how to solve this problem? Solutions are as follows:

Create a custom DefaultTraceContext with three types of thread copies (ThreadLocal, InheritableThreadLocal, TransmittableThreadLocal). This project uses the TransmittableThreadLocal

/**
 * DefaultTraceContext
 *
 * @author weiyilin
 * @dateThe 2022-01-29 16:12:43 * /
public class DefaultTraceContext extends CurrentTraceContext {
 
    static final ThreadLocal<TraceContext> DEFAULT = new ThreadLocal<>();
    static final InheritableThreadLocal<TraceContext> INHERITABLE = new InheritableThreadLocal<>();
    static final TransmittableThreadLocal<TraceContext> TRANSMITTABLE = new TransmittableThreadLocal<>();
 
 
    final ThreadLocal<TraceContext> local;
 
    /** * Uses a non-inheritable static thread local */
    public static CurrentTraceContext create(a) {
        return new DefaultTraceContext(DEFAULT);
    }
 
    /**
     * Uses an inheritable static thread local which allows arbitrary calls to {@link
     * Thread#start()} to automatically inherit this context. This feature is available as it is was
     * the default in Brave 3, because some users couldn't control threads in their applications.
     *
     * <p>This can be a problem in scenarios such as thread pool expansion, leading to data being
     * recorded in the wrong span, or spans with the wrong parent. If you are impacted by this,
     * switch to {@link #create()}.
     */
    public static CurrentTraceContext inheritable(a) {
        return new DefaultTraceContext(INHERITABLE);
    }
 
    public static CurrentTraceContext transmittable(a) {
        return new DefaultTraceContext(TRANSMITTABLE);
    }
 
    DefaultTraceContext(ThreadLocal<TraceContext> local) {
        if (local == null) {
            throw new NullPointerException("local == null");
        }
        this.local = local;
    }
 
    @Override
    public TraceContext get(a) {
        return local.get();
    }
 
    @Override
    public CurrentTraceContext.Scope newScope(@Nullable TraceContext currentSpan) {
        final TraceContext previous = local.get();
        local.set(currentSpan);
        class DefaultCurrentTraceContextScope implements CurrentTraceContext.Scope {
            @Override
            public void close(a) { local.set(previous); }}return newDefaultCurrentTraceContextScope(); }}Copy the code

Inject bean objects

/**
 * TraceConfiguration
 *
 * @author weiyilin
 * @dateThe 2022-01-29 16:10:16 * /
@Configuration
public class TraceConfiguration {
 
    @Bean
    public CurrentTraceContext slf4jSpanLogger(a) {
        returnSlf4jCurrentTraceContext.create(DefaultTraceContext.transmittable()); }}Copy the code

Reference documentation

  • transmittable-thread-local