A Java probe is a tool for observing or changing the behavior of your code at run time. It hooks into the running application without the application code being aware of it.

Distributed call-chain tracing can be implemented in two ways: intrusively, by changing the application code, or non-intrusively. The non-intrusive approach is built on Java probes.

Code written in any programming language that runs on the Java virtual machine shares a common intermediate format: the class file format. A probe dynamically modifies class bytecode to insert additional behavior, which makes it possible to collect an application's call behavior without touching its source code.

This is made possible by the Instrumentation interface provided by Java SE 6. A Java Agent application (a Java probe) built on Instrumentation can modify class bytecode at run time, either by replacing the bytecode of a class before the class is loaded or by reloading the class after it has been loaded.

Being able to modify class bytecode at run time is not, by itself, enough to be called a “probe.” With a Java Agent built on Instrumentation, you only need to add the JVM option “-javaagent” to the application's startup command and point it at the Java Agent JAR. The project does not need to declare that JAR as a dependency, yet probes can be inserted into every corner of the application code. Loading the agent with a class loader different from the application's isolates the two environments and gives the impression that the Java Agent is attached to the application from outside.
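
As a rough illustration of the mechanism just described, here is a minimal sketch of an agent entry point. The class names DemoAgent and PassThroughTransformer are hypothetical. In a real agent JAR the entry class would normally be declared public and be named in the Premain-Class attribute of the JAR's MANIFEST.MF.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;

// Hypothetical agent entry class (kept package-private here only to keep the sketch in one file).
final class DemoAgent {
    /** Invoked by the JVM before the application's main method when -javaagent is used. */
    public static void premain(String agentArgs, Instrumentation instrumentation) {
        instrumentation.addTransformer(new PassThroughTransformer());
    }
}

// A transformer that inspects every class being loaded but changes nothing.
final class PassThroughTransformer implements ClassFileTransformer {
    @Override
    public byte[] transform(ClassLoader loader, String className, Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain, byte[] classfileBuffer) {
        // className uses '/' as separator, e.g. "java/lang/String".
        // A real probe would rewrite classfileBuffer here and return the new bytes.
        return null; // null tells the JVM to keep the original bytecode
    }
}
```

Such an agent would be attached with a command of the form java -javaagent:/path/to/agent.jar -jar app.jar, where both paths are placeholders.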

Instrumentation is hard to master: you need to understand Java's class loading mechanism and bytecode, and a small slip can produce all kinds of strange exceptions. Of the many pitfalls I ran into while implementing Java probes, a large share were related to class loading, and those are what I share in this article.

Classes loaded by a parent class loader cannot reference classes loaded by a child class loader

Classes loaded by a parent class loader cannot reference classes loaded by a child class loader; if they try, a NoClassDefFoundError is thrown.

What does this mean in practice? It is also a common interview question.

The java.* classes provided by the JDK are loaded by the bootstrap class loader.

What if, in the Java Agent, we modify classes under the java package and insert code that calls Logback to write a log?

The Logback classes bundled in the Java Agent package are loaded by AppClassLoader (the application class loader, also known as the system class loader), whereas the bootstrap class loader, which loads the classes under the java package, sits above AppClassLoader in the parent chain.

Suppose we insert code that calls Logback into a class in the java package. When that class is used, the JVM resolves the Logback class it references through the bootstrap class loader: it first checks whether the bootstrap class loader has already loaded the class and, if not, tries to load it. The bootstrap class loader cannot load classes from the Logback package, and it does not ask a child class loader. No class loader ever asks its children whether they can load a class, even if a child has already loaded it. The result is a NoClassDefFoundError.
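
The visibility rule can be checked with a few lines of plain Java. This is only a sketch: VisibilityDemo is a hypothetical class name, and passing null to Class.forName stands for the bootstrap class loader. An explicit lookup like this fails with ClassNotFoundException; when the same failure happens while the JVM resolves a reference inside instrumented code, it surfaces as NoClassDefFoundError.

```java
final class VisibilityDemo {
    /** Returns true if the given loader (null means the bootstrap loader) can find the class. */
    static boolean visibleFrom(ClassLoader loader, String className) {
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // JDK core classes report a null loader, which represents the bootstrap class loader.
        System.out.println("String loader: " + String.class.getClassLoader());
        // The bootstrap loader can see java.lang.String ...
        System.out.println(visibleFrom(null, "java.lang.String"));
        // ... but it never asks a child loader, so an application class is invisible to it.
        System.out.println(visibleFrom(null, VisibilityDemo.class.getName()));
    }
}
```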

So how do we avoid the NoClassDefFoundError when we have to modify a class in the java package and, from that class, call a class written in our project, a class provided by a third-party JAR, or a class written in the Java Agent package?

When I ran into this problem I searched online extensively but, unfortunately, found nothing. Then it occurred to me that I had the Arthas source code on my computer, so I might as well see how Arthas solves it.

Arthas is an open-source Java diagnostic tool from Alibaba that is well suited to troubleshooting problems in production.

The solution below follows Alibaba's open-source Arthas.

First, define a class (Spy) that receives the events reported by the instrumented code:

public final class Spy {
    public static void before(String className, String methodName, String descriptor, Object[] params) {}

    public static void complete(Object returnValueOrThrowable, String className, String methodName, String descriptor) {}
}
  • before: reported before the method executes;
  • complete: reported before the method returns or throws an exception. When the method throws an exception, the first argument is the exception; otherwise, the first argument is the return value.
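
To make the two reporting points concrete, the sketch below writes out by hand what an instrumented method roughly behaves like. It is an illustration only: RecordingSpy is a stand-in that merely records events, and OrderService with its add method is a hypothetical business class. A real probe would generate the equivalent bytecode rather than source code.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for Spy that records events so the call order can be observed.
final class RecordingSpy {
    static final List<String> EVENTS = new ArrayList<>();

    static void before(String className, String methodName, String descriptor, Object[] params) {
        EVENTS.add("before " + className + "#" + methodName + descriptor);
    }

    static void complete(Object returnValueOrThrowable, String className, String methodName, String descriptor) {
        EVENTS.add("complete " + className + "#" + methodName + descriptor + " -> " + returnValueOrThrowable);
    }
}

// Hypothetical business class, written the way it would behave after instrumentation.
final class OrderService {
    int add(int a, int b) {
        RecordingSpy.before("OrderService", "add", "(II)I", new Object[]{a, b});
        int result;
        try {
            result = a + b; // the original method body
        } catch (RuntimeException | Error t) {
            // Exceptional exit: report the exception as the first argument, then rethrow.
            RecordingSpy.complete(t, "OrderService", "add", "(II)I");
            throw t;
        }
        // Normal exit: report the return value as the first argument.
        RecordingSpy.complete(result, "OrderService", "add", "(II)I");
        return result;
    }
}
```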

Put Spy in a separate JAR. In the premain or agentmain method, call Instrumentation#appendToBootstrapClassLoaderSearch to add Spy's JAR to the bootstrap class loader's search path, so that Spy is loaded by the bootstrap class loader. See the code below.

// agent-spy.jar
String agentSpyJar = jarPath[1];
File spyJarFile = new File(agentSpyJar);
instrumentation.appendToBootstrapClassLoaderSearch(new JarFile(spyJarFile));

Print the class loader inside the Spy class. If the output is null, the Spy class was loaded by the bootstrap class loader.

public final class Spy {
    static {
        System.out.println("Spy class loader is " + Spy.class.getClassLoader());
    }
    // ...
}

Finally, inject the reporting methods into Spy and have Spy invoke them through reflection. The complete code for the Spy class is shown below.

import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

public final class Spy {

    // Reporting methods, injected by the agent after Spy has been loaded
    public static Method beforMethod;
    public static Method completeMethod;

    public static void before(String className, String methodName, String descriptor, Object[] params) {
        if (beforMethod != null) {
            try {
                beforMethod.invoke(null, className, methodName, descriptor, params);
            } catch (IllegalAccessException | InvocationTargetException e) {
                // ignored: a reporting failure should not break the instrumented method
            }
        }
    }

    public static void complete(Object returnValueOrThrowable, String className, String methodName, String descriptor) {
        if (completeMethod != null) {
            try {
                completeMethod.invoke(null, returnValueOrThrowable, className, methodName, descriptor);
            } catch (IllegalAccessException | InvocationTargetException e) {
                // ignored: a reporting failure should not break the instrumented method
            }
        }
    }
}

Calling through reflection has a performance cost, especially since every method on the call chain needs two reflective calls to the reporting methods.

I may not have understood Arthas's design completely, but I have tried this approach and it works.
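
The injection step can be sketched as follows. This is a simplified, self-contained illustration rather than the Arthas code: BridgeSpy plays the role of Spy, while CallReporter and SpyWiring are hypothetical names for the agent-side reporter and the code that wires it in. In a real agent the reporter lives in a different class loader than Spy, so its class and methods need to be public.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Simplified Spy: knows nothing about the reporter class, only holds a Method handle.
final class BridgeSpy {
    static volatile Method beforeMethod;

    static void before(String className, String methodName, String descriptor, Object[] params) {
        Method target = beforeMethod;
        if (target == null) {
            return; // nothing injected yet, so reporting is a no-op
        }
        try {
            target.invoke(null, className, methodName, descriptor, params);
        } catch (IllegalAccessException | InvocationTargetException e) {
            // swallowed so that a reporting failure cannot break the caller
        }
    }
}

// Hypothetical reporter on the agent side; here it just records what it receives.
final class CallReporter {
    static final List<String> CALLS = new ArrayList<>();

    public static void onBefore(String className, String methodName, String descriptor, Object[] params) {
        CALLS.add(className + "#" + methodName + descriptor + " args=" + params.length);
    }
}

// Looks up the reporting method once and injects it into the Spy.
final class SpyWiring {
    static void install() {
        try {
            BridgeSpy.beforeMethod = CallReporter.class.getMethod(
                    "onBefore", String.class, String.class, String.class, Object[].class);
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException("reporter method not found", e);
        }
    }
}
```

Looking the Method up once and caching it in a static field keeps the per-call cost down to the reflective invoke itself.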

Isolate the Agent from the application environment

Why implement isolation?

Isolation prevents the Agent from contaminating the application itself. It also means that, when developing the Java Agent, you do not need to consider whether the JARs it imports conflict with the JARs imported by the target application.

What happens when a Java Agent meets a Spring Boot application?

After a Spring Boot application is packaged, attaching an Agent at startup may produce a glaring NoClassDefFoundError. This does not happen when testing in IDEA; it appears only after packaging, because the Agent and the packaged Spring Boot application are then loaded by different class loaders.

A NoClassDefFoundError may occur when the Agent calls code of the monitored Spring Boot application, or calls the API of a third-party JAR that the Agent depends on and that the Spring Boot application also happens to import.

The Agent jar package is loaded by the AppClassLoader class loader (system class loader).

In IDEA, the project's class files and third-party libraries are loaded by AppClassLoader, and the JAR specified with -javaagent is also loaded by AppClassLoader, so the problem does not show up when testing in IDEA.

After the Spring Boot application is packaged, the entry point of the JVM process is no longer the main method we wrote but the launcher class that Spring Boot adds to the JAR. Spring Boot uses a custom LaunchedURLClassLoader to load the application's classes and the third-party JARs nested in the package. The parent of LaunchedURLClassLoader is AppClassLoader.

In other words, once the Spring Boot application is packaged, the class loader that loads the classes in the Java Agent package is the parent of the class loader that Spring Boot uses, so the Agent's classes cannot see the application's classes.

How is isolation implemented?

Instead of using the AppClassLoader, use a custom class loader to load agent packages.

Following Alibaba's open-source Arthas, we write a custom URLClassLoader to load the agent package and the third-party JARs that the Agent depends on.

Because the class that contains the premain or agentmain method is loaded by the JVM with AppClassLoader, the Agent must be split into two JARs. Core functionality goes into the agent-core package, and the class that contains premain or agentmain goes into the agent-boot package. Inside premain or agentmain, agent-core is loaded with the custom URLClassLoader.

Step 1:

Define a custom class loader, OnionClassLoader, that extends URLClassLoader, as shown in the following code:

import java.net.URL;
import java.net.URLClassLoader;

public class OnionClassLoader extends URLClassLoader {

    public OnionClassLoader(URL[] urls) {
        // Use the parent of the system class loader as parent, bypassing AppClassLoader
        super(urls, ClassLoader.getSystemClassLoader().getParent());
    }

    @Override
    protected synchronized Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        final Class<?> loadedClass = findLoadedClass(name);
        if (loadedClass != null) {
            return loadedClass;
        }
        // Load JDK classes through the parent first instead of searching the agent JARs for them
        if (name != null && (name.startsWith("sun.") || name.startsWith("java."))) {
            return super.loadClass(name, resolve);
        }
        try {
            // Look in the agent's own JARs before delegating to the parent
            Class<?> aClass = findClass(name);
            if (resolve) {
                resolveClass(aClass);
            }
            return aClass;
        } catch (Exception e) {
            // ignore and fall back to the parent below
        }
        return super.loadClass(name, resolve);
    }
}

Specify the parent class loader of OnionClassLoader as the parent class loader of AppClassLoader in the constructor.

ClassLoader.getSystemClassLoader() returns the system class loader (AppClassLoader).
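
The effect of choosing that parent can be observed with a plain URLClassLoader. The following is a minimal sketch under the assumption that it runs as an ordinary class-path program; IsolationDemo and its helper methods are hypothetical names, and an empty URL array is used in place of the agent-core JAR.

```java
import java.net.URL;
import java.net.URLClassLoader;

final class IsolationDemo {
    /** Builds a loader whose parent skips AppClassLoader, like the OnionClassLoader constructor. */
    static URLClassLoader newIsolatedLoader(URL[] urls) {
        return new URLClassLoader(urls, ClassLoader.getSystemClassLoader().getParent());
    }

    /** Returns true if the loader can load the named class. */
    static boolean canLoad(ClassLoader loader, String className) {
        try {
            loader.loadClass(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        try (URLClassLoader isolated = newIsolatedLoader(new URL[0])) {
            // JDK classes are still reachable through the parent chain.
            System.out.println(canLoad(isolated, "java.lang.String"));
            // Application classes are not, because AppClassLoader is not in the chain.
            System.out.println(canLoad(isolated, IsolationDemo.class.getName()));
        }
    }
}
```

Because application classes are invisible to such a loader, the agent's dependencies cannot clash with the application's copies of the same libraries.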

Step 2:

Load agent-core with OnionClassLoader in the premain or agentmain method.

// 1
File agentJarFile = new File(agentJar);
final ClassLoader agentLoader = new OnionClassLoader(new URL[]{agentJarFile.toURI().toURL()});
// 2
Class<?> transFormer = agentLoader.loadClass("com.msyc.agent.core.OnionClassFileTransformer");
// 3
Constructor<?> constructor = transFormer.getConstructor(String.class);
Object instance = constructor.newInstance(opsParams);
// 4
instrumentation.addTransformer((ClassFileTransformer) instance);
  • 1. Construct an OnionClassLoader from the absolute path of the agent-core JAR;
  • 2. Load the ClassFileTransformer implementation from the agent-core JAR;
  • 3. Create the ClassFileTransformer instance through reflection;
  • 4. Register the ClassFileTransformer with Instrumentation.

OnionClassFileTransformer depends on other classes in the agent-core package, so those classes are naturally loaded by OnionClassLoader as well, and so are the third-party JARs that agent-core depends on.

Adapt to the WebMVC framework

The difficulty in generating distributed call-chain logs lies in instrumenting the methods and in linking the method call logs together.

There are many ways to link distributed call-chain logs; I use the simplest one: ID + time.

For the same thread in the same process, method calls can be linked by the trace ID, and the method call logs can be sorted by the time of the trace point and the value of an accumulator.

Across processes, the logs of different applications can be linked by passing the trace ID along, and sorted by log time.
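
A minimal sketch of the ID + time idea follows. The CallLog fields and the CallLogOrdering helper are hypothetical names chosen for illustration; the article does not show the author's actual log format. Records are filtered by the shared ID, then ordered by time, with the accumulator value breaking ties between calls logged in the same millisecond.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// One method-call log record (hypothetical layout).
final class CallLog {
    final String traceId;       // ID shared by all calls of one request
    final long timestampMillis; // time at which the call was logged
    final long sequence;        // accumulator value, increases with every logged call
    final String method;

    CallLog(String traceId, long timestampMillis, long sequence, String method) {
        this.traceId = traceId;
        this.timestampMillis = timestampMillis;
        this.sequence = sequence;
        this.method = method;
    }
}

final class CallLogOrdering {
    // Order by time first; the accumulator breaks ties within the same millisecond.
    static final Comparator<CallLog> BY_TIME_THEN_SEQUENCE =
            Comparator.comparingLong((CallLog log) -> log.timestampMillis)
                      .thenComparingLong(log -> log.sequence);

    /** Returns the logs that belong to one trace, in call order. */
    static List<CallLog> chain(List<CallLog> logs, String traceId) {
        List<CallLog> result = new ArrayList<>();
        for (CallLog log : logs) {
            if (log.traceId.equals(traceId)) {
                result.add(log);
            }
        }
        result.sort(BY_TIME_THEN_SEQUENCE);
        return result;
    }
}
```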

For example, the purpose of adapting to the WebMVC framework is to obtain, from the request header, the trace ID (transaction ID) passed by the caller. We instrument the DispatcherServlet#doDispatch method and read the “s-tid” request header from the HttpServletRequest parameter. s-tid is a custom request header used to pass the trace ID.
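
Reading the header on the agent side could look roughly like the sketch below. It makes several assumptions that are not in the original: the request is accessed through reflection because an isolated agent typically cannot link against the servlet API, a fresh ID is generated when the header is absent, and TraceIdResolver and FakeRequest are hypothetical names (FakeRequest only stands in for HttpServletRequest).

```java
import java.lang.reflect.Method;
import java.util.UUID;

final class TraceIdResolver {
    static final String TRACE_HEADER = "s-tid";

    /** Reads the s-tid header from a request-like object, or creates a new ID if none is present. */
    static String resolve(Object request) {
        try {
            Method getHeader = request.getClass().getMethod("getHeader", String.class);
            Object value = getHeader.invoke(request, TRACE_HEADER);
            if (value instanceof String && !((String) value).isEmpty()) {
                return (String) value; // continue the caller's call chain
            }
        } catch (ReflectiveOperationException e) {
            // Not a request-like object or not accessible: treat it as "no header".
        }
        return UUID.randomUUID().toString(); // assumption: start a new call chain
    }
}

// Stand-in for HttpServletRequest, exposing only the method used above.
final class FakeRequest {
    private final String traceId;

    FakeRequest(String traceId) {
        this.traceId = traceId;
    }

    public String getHeader(String name) {
        return "s-tid".equals(name) ? traceId : null;
    }
}
```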

I ran into the same problem when adapting to both WebMVC and OpenFeign. For example, when adapting to WebMVC and rewriting the doDispatch method of DispatcherServlet, the ASM framework threw java.lang.TypeNotPresentException.

  • java.lang.TypeNotPresentException: thrown when an application tries to access a type using a string that represents the type's name, but no definition of a type with that name can be found.

The reason is that when ASM rewrites the DispatcherServlet class, it loads the classes referenced by symbol using Class.forName, and throws a TypeNotPresentException if the target class cannot be loaded.

By default, ASM uses its own class loader, that is, the loader that loaded the ASM classes, to load the classes that the class being rewritten depends on. ASM is loaded by the same class loader that loads the agent-core package, whereas DispatcherServlet is loaded by Spring Boot's LaunchedURLClassLoader.

Fortunately, the ClassFileTransformer#transform method receives the class loader used to load the current class:

public class OnionClassFileTransformer implements ClassFileTransformer {

    @Override
    public byte[] transform(ClassLoader loader, String className, Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain, byte[] classfileBuffer) {
        // ... rewrite classfileBuffer here; returning null leaves the class unchanged
        return null;
    }

}
  • If the class being rewritten is DispatcherServlet, the first argument of the transform method is the class loader that is loading the DispatcherServlet class.

We simply tell ASM to load the classes that DispatcherServlet depends on with the class loader passed to ClassFileTransformer#transform.

ClassWriter classWriter = new ClassWriter(ClassWriter.COMPUTE_MAXS | ClassWriter.COMPUTE_FRAMES) {
    @Override
    protected ClassLoader getClassLoader() {
        return loader;
    }
};

As the code shows, we override the getClassLoader method of ASM's ClassWriter class and return the class loader passed to ClassFileTransformer#transform.

Conclusion

  • 1. When implementing your own Java probe, remember one thing: classes loaded by a parent class loader cannot reference classes loaded by a child class loader;
  • 2. Loading the Agent with a custom class loader isolates it from the application and prevents the Agent from contaminating the application;
  • 3. When rewriting class bytecode in ClassFileTransformer#transform, remember to override the getClassLoader method of ASM's ClassWriter class and return the class loader passed to ClassFileTransformer#transform.