Preface

In everyday coding we write source files with a .java suffix. This code is meant for application developers to read and understand, but the JVM cannot execute it directly; to run on the JVM it must first be converted into a form the JVM understands

That conversion is compilation: .java source files are compiled into .class files, which the JVM can recognize and run

Class loading means loading the .class bytecode file into memory, verifying, transforming and initializing the data in it, and finally producing type information that the virtual machine can use directly. This is the class loading mechanism of the virtual machine.

Understanding the Java class loading mechanism helps you resolve many runtime loading problems quickly and get to their root cause, which makes it a sharp troubleshooting tool. So let’s start today’s topic: “A hard-core dissection of the JVM class loading process”

The JVM startup process

Before we talk about class loading, let’s take a look at the startup process of the JVM. Class loading is just one step in the startup process of the JVM. Let’s take a look at where class loading is in the startup process

Write test code

We’ll write a Math class with a main method that creates an object of class Math using the new keyword and calls the compute method of the object in main:


public class Math {
    public int compute() {
        int a = 1;
        int b = 2;
        int c = (a + b) * 10;
        return c;
    }

    public static void main(String[] args) {
        Math math = new Math();
        math.compute();
    }
}

Start test code

Under the hood, starting the program from an IDE runs the Math class’s main function with the java command, so launching from the IDE and launching with java are essentially the same. Let’s see how the Math class runs overall.

JVM startup process source analysis

This section analyzes the HotSpot source code for JVM startup and for class loading. If you are interested, read it carefully. If it feels dry or is too much to take in at once, you can skip it: the flow chart later in the article also conveys the whole JVM startup and class loading process

Find the startup portal

jdk/src/java.base/share/native/launcher/main.c

This is the startup entry of the java command. Set a breakpoint at the main() function to start tracing JVM startup

int main(int argc, char **argv){
    int margc;
    char** margv;
    const jboolean const_javaw = JNI_FALSE;
    margc = argc;
    margv = argv;
    // At the end of the main function, JLI_Launch is called
    return JLI_Launch(margc, margv,
                   sizeof(const_jargs) / sizeof(char *), const_jargs,
                   sizeof(const_appclasspath) / sizeof(char *), const_appclasspath,
                   FULL_VERSION,
                   DOT_VERSION,
                   (const_progname != NULL) ? const_progname : *margv,
                   (const_launcher != NULL) ? const_launcher : *margv,
                   (const_jargs != NULL) ? JNI_TRUE : JNI_FALSE,
                   const_cpwildcard, const_javaw, const_ergo_class);
}

Preparation for initialization

The JLI_Launch method is in jdk/src/java.base/share/native/libjli/java.c

// called by main() of main.c
int JLI_Launch(int argc, char ** argv,              /* main argc, argc */
        int jargc, const char** jargv,          /* java args */
        ......) {
  // Some code that prepares data before initialization is omitted; it is not the focus of the overall flow.
  // Ready to initialize the JVM
  return JVMInit(&ifn, threadStackSize, argc, argv, mode, what, ret);
}

The JVMInit method is in jdk/src/java.base/unix/native/libjli/java_md_solinux.c

int JVMInit(InvocationFunctions* ifn, jlong threadStackSize,
        int argc, char **argv,
        int mode, char *what, int ret)
{
    // The rest of the body is omitted for now; it is not part of the main flow.
    // Call ContinueInNewThread
    return ContinueInNewThread(ifn, threadStackSize, argc, argv, mode, what, ret);
}

The ContinueInNewThread method is in jdk/src/java.base/share/native/libjli/java.c

ContinueInNewThread(InvocationFunctions* ifn, jlong threadStackSize,
                    int argc, char **argv,
                    int mode, char *what, int ret)
{
   // Set the thread stack size
   if (threadStackSize == 0) {
      struct JDK1_1InitArgs args1_1;
      memset((void*)&args1_1, 0, sizeof(args1_1));
      args1_1.version = JNI_VERSION_1_1;
      ifn->GetDefaultJavaVMInitArgs(&args1_1);  /* ignore return value */
      if (args1_1.javaStackSize > 0) {
         threadStackSize = args1_1.javaStackSize;
      }
    }

    // Create a new thread to create the JVM, calling JavaMain
    { /* Create a new thread to create JVM and invoke main method */
      JavaMainArgs args;
      int rslt;

      args.argc = argc;
      ......
      args.ifn = *ifn;

      // Call ContinueInNewThread0, passing the JavaMain function pointer and args, the arguments that function needs
      rslt = ContinueInNewThread0(JavaMain, threadStackSize, (void*)&args);
      ......
    }
}

Find the main function and execute it

The ContinueInNewThread0 method is in jdk/src/java.base/unix/native/libjli/java_md_solinux.c

int ContinueInNewThread0(int (JNICALL *continuation)(void *), jlong stack_size, void * args) {
    int rslt;
    #ifndef __solaris__
    ......
    // Create a thread
    if (pthread_create(&tid, &attr, (void* (*) (void*))continuation, (void*)args) == 0) {
      void * tmp;
      pthread_join(tid, &tmp);
      rslt = (int)(intptr_t)tmp;
    } else {
      // Call the JavaMain method directly
      // The first parameter, int (JNICALL *continuation)(void *), receives a pointer to the JavaMain function,
      // so the continuation call below invokes JavaMain
      rslt = continuation(args);
    }
    // Subsequent code omitted
    return rslt;
}
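To make the create-thread, join, fall-back pattern above easier to see, here is a rough Java analogue. It is only a sketch: the class ContinueInNewThreadSketch and its method are hypothetical names made up for illustration, and the real launcher does this in C with pthread_create and pthread_join as shown above.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntSupplier;

/** Hypothetical Java analogue of ContinueInNewThread0; this is not code from the JDK launcher. */
final class ContinueInNewThreadSketch {

    /** Runs the continuation on a new thread and returns the value it produced. */
    static int continueInNewThread(IntSupplier continuation, long stackSize) {
        AtomicInteger result = new AtomicInteger();
        // A stackSize of 0 makes this constructor behave like the one without a stack size.
        Thread worker = new Thread(null, () -> result.set(continuation.getAsInt()),
                                   "java-main-sketch", stackSize);
        try {
            worker.start();              // plays the role of pthread_create
        } catch (OutOfMemoryError e) {
            // Plays the role of the else branch: no new thread, so run in the current one.
            return continuation.getAsInt();
        }
        try {
            worker.join();               // plays the role of pthread_join
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the continuation", e);
        }
        return result.get();
    }

    public static void main(String[] args) {
        int rslt = continueInNewThread(() -> 7, 0);
        System.out.println("continuation returned " + rslt);
    }
}
```

The point of the pattern is that the calling thread does nothing but wait: all the real work (JavaMain in the launcher) happens on the new thread, whose stack size can be chosen independently of the primordial thread.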

The JavaMain method is in jdk/src/java.base/share/native/libjli/java.c

int JNICALL JavaMain(void * _args) {
    JavaMainArgs *args = (JavaMainArgs *)_args;
    ......
    InvocationFunctions ifn = args->ifn;
    start = CounterGet();

    // InitializeJVM initializes the JVM: it calls the CreateJavaVM() function pointer in the InvocationFunctions structure,
    // which assigns the correct values to the JavaVM and JNIEnv objects
    // LoadJavaVM() points that function pointer at the JNI_CreateJavaVM() function in the libjvm.so dynamic link library
    if (!InitializeJVM(&vm, &env, &ifn)) {
        JLI_ReportErrorMessage(JVM_ERROR1);
        exit(1);
    }
    ......
    // Load the Math class
    mainClass = LoadMainClass(env, mode, what);
    appClass = GetApplicationClass(env);
    // Get the main method of the Math class
    mainID = (*env)->GetStaticMethodID(env, mainClass, "main",
                                       "([Ljava/lang/String;)V");
    // Call main() through the CallStaticVoidMethod() function defined in JNIEnv
    // JavaCalls::call() is eventually called to execute the main() method of the Math class.
    // JavaCalls::call() is a very important function; it will be covered later in the article on the method execution engine.
    (*env)->CallStaticVoidMethod(env, mainClass, mainID, mainArgs);

    // End
    LEAVE();
}

In the JavaMain function the main method is called; after it returns, LEAVE is called, which completes the whole start -> end life cycle

The general process can be roughly divided into the following steps:

  • Prepare to initialize the JVM: set up the data needed for initialization, then call the JavaMain function
  • Initialize the JVM in the JavaMain function
  • Load the class that contains the main method
  • Get the main method
  • Call the main method, which ends up in the JavaCalls::call() function
  • Finish with the remaining operations, such as destroying the JVM
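The middle three steps (load the class, get main, call main) can be imitated at the Java level with reflection. The following is a minimal sketch under stated assumptions, not the launcher's implementation: LauncherSketch and Calculator are hypothetical names, Calculator stands in for the article's Math class, and the sketch uses its own class loader where the real launcher uses the system class loader.

```java
import java.lang.reflect.Method;

/** Hypothetical Java-level analogue of the load / look up / call steps in JavaMain. */
final class LauncherSketch {

    /** Stand-in for the article's Math class; the name Calculator is made up for this sketch. */
    public static final class Calculator {
        public static int lastResult = -1;

        public int compute() {
            int a = 1;
            int b = 2;
            return (a + b) * 10;
        }

        public static void main(String[] args) {
            lastResult = new Calculator().compute();
        }
    }

    /** Loads the named class, finds its main(String[]) method and invokes it. */
    static void launch(String mainClassName, String... args) {
        try {
            // Counterpart of LoadMainClass: load the class by name without initializing it yet.
            Class<?> mainClass = Class.forName(mainClassName, false,
                                               LauncherSketch.class.getClassLoader());
            // Counterpart of GetStaticMethodID(env, mainClass, "main", "([Ljava/lang/String;)V").
            Method main = mainClass.getMethod("main", String[].class);
            // Counterpart of CallStaticVoidMethod: main is static, so the receiver is null.
            main.invoke(null, (Object) args);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("could not launch " + mainClassName, e);
        }
    }

    public static void main(String[] args) {
        launch(Calculator.class.getName());
        System.out.println("compute() returned " + Calculator.lastResult);
    }
}
```

The cast to Object in main.invoke matters: without it the String[] would be spread as the varargs list instead of being passed as the single String[] parameter of main.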

As mentioned above, class loading is just one step in the JVM startup process. Loading the class that contains the main method is class loading, so we use this step to dive into today’s topic: an in-depth look at the JVM class loading process

Class loading process source code analysis

After the above analysis, we enter the LoadMainClass method

// Load the Math class
mainClass = LoadMainClass(env, mode, what); 

LoadMainClass method

static jclass LoadMainClass(JNIEnv *env, int mode, char *name){
    // Get the LauncherHelper class
    jclass cls = GetLauncherHelperClass(env);
    // Get the checkAndLoadMain method of LauncherHelper
    NULL_CHECK0(mid = (*env)->GetStaticMethodID(env, cls,
                "checkAndLoadMain",
                "(ZILjava/lang/String;)Ljava/lang/Class;"));
    NULL_CHECK0(str = NewPlatformString(env, name));
    // Load the Math class with checkAndLoadMain
    NULL_CHECK0(result = (*env)->CallStaticObjectMethod(env, cls, mid,
                                                        USE_STDERR, mode, str));
    return (jclass)result;
}

Loading the LauncherHelper class

If we look at the GetLauncherHelperClass method, we’ll see that if helperClass already exists, we’ll return it. If it doesn’t, we’ll call FindBootStrapClass

jclass GetLauncherHelperClass(JNIEnv *env)
{
    if (helperClass == NULL) {
        NULL_CHECK0(helperClass = FindBootStrapClass(env,
                "sun/launcher/LauncherHelper"));
    }
    return helperClass;
}

FindBootStrapClass method

jclass FindBootStrapClass(JNIEnv *env, const char* classname)
{
   if (findBootClass == NULL) {
       // Look up the JVM_FindClassFromBootLoader function defined in jvm.cpp
       findBootClass = (FindClassFromBootLoader_t *)dlsym(RTLD_DEFAULT,
          "JVM_FindClassFromBootLoader");
   }
   // Call the JVM_FindClassFromBootLoader function
   return findBootClass(env, classname);
}

The JVM_FindClassFromBootLoader method is in hotspot/src/share/vm/prims/jvm.cpp

JVM_ENTRY(jclass, JVM_FindClassFromBootLoader(JNIEnv* env,
                                              const char* name))
  // Call SystemDictionary to parse the class to load it
  Klass* k = SystemDictionary::resolve_or_null(h_name, CHECK_NULL);
  return (jclass) JNIHandles::make_local(env, k->java_mirror());
JVM_END

The resolve_or_null method is in hotspot/src/share/vm/classfile/systemDictionary.cpp

Klass* SystemDictionary::resolve_or_null(Symbol* class_name, ...) {
   // Go here
    return resolve_instance_class_or_null(class_name, class_loader, protection_domain, THREAD);
}

The resolve_instance_class_or_null method is in hotspot/src/share/vm/classfile/systemDictionary.cpp

Klass* SystemDictionary::resolve_instance_class_or_null(Symbol* name, ...) { 
      // Do actual loading
      k = load_instance_class(name, class_loader, THREAD);
}                                                      

The load_instance_class method is where the load actually happens: the class is loaded through ClassLoader

instanceKlassHandle SystemDictionary::load_instance_class(Symbol* class_name, Handle class_loader, TRAPS) {
    if (k.is_null()) {
      // Use VM class loader
      k = ClassLoader::load_class(class_name, search_only_bootloader_append, CHECK_(nh));
    }
}

The ClassLoader::load_class method is in hotspot/src/share/vm/classfile/classLoader.cpp

instanceKlassHandle ClassLoader::load_class(Symbol* name, bool search_append_only, TRAPS) {
  // Create a bytecode file stream
  stream = search_module_entries(_exploded_entries, class_name, file_name, CHECK_NULL);
  // Each class loader has a corresponding ClassLoaderData structure, which internally maintains a linked list of the classes that loader has loaded.
  // the_null_class_loader_data() is the one that belongs to the bootstrap class loader
  ClassLoaderData* loader_data = ClassLoaderData::the_null_class_loader_data();
  // Parse the Java bytecode file stream
  instanceKlassHandle result = KlassFactory::create_from_stream(stream, name, ...);
}

ClassFileParser is eventually called to parse the Java bytecode file stream

instanceKlassHandle KlassFactory::create_from_stream(ClassFileStream* stream,Symbol*name, ...) {
   // Invoke class file parsing
   ClassFileParser parser(stream, name, loader_data, protection_domain, host_klass, cp_patches,
                          ClassFileParser::BROADCAST, // publicity level
                          CHECK_NULL);
   // Create the InstanceKlass and save the result
   instanceKlassHandle result = parser.create_instance_klass(old_stream != stream, CHECK_NULL);
   return result;
}

Call LauncherHelper’s checkAndLoadMain to load the Math class

Going back to the original LoadMainClass method: once LauncherHelper is loaded, the JVM looks up LauncherHelper’s checkAndLoadMain method and executes it to load the Math class

LoadMainClass method

static jclass LoadMainClass(JNIEnv *env, int mode, char *name){
    // Get the LauncherHelper class
    jclass cls = GetLauncherHelperClass(env);
    // Get the checkAndLoadMain method of LauncherHelper
    NULL_CHECK0(mid = (*env)->GetStaticMethodID(env, cls,
                "checkAndLoadMain",
                "(ZILjava/lang/String;)Ljava/lang/Class;"));
    NULL_CHECK0(str = NewPlatformString(env, name));
    // Load the Math class with checkAndLoadMain
    NULL_CHECK0(result = (*env)->CallStaticObjectMethod(env, cls, mid,
                                                        USE_STDERR, mode, str));
    return (jclass)result;
}

The LauncherHelper checkAndLoadMain method, as found in JDK 11

    @SuppressWarnings("fallthrough")
    public static Class<?> checkAndLoadMain(boolean printToStderr,
                                            int mode,
                                            String what) {

        // Omit unnecessary code...
        Class<?> mainClass = null;
        switch (mode) {
            // The breakpoint shows mode=1, so loadMainClass is used
            case LM_MODULE: case LM_SOURCE:
                mainClass = loadModuleMainClass(what);
                break;
            default:
                mainClass = loadMainClass(mode, what);
                break;
        }

        // Omit unnecessary code...
        return mainClass;
    }

The loadMainClass method obtains the Java-level class loader and loads the class through Class.forName

    private static Class<?> loadMainClass(int mode, String what) {
        String cn;
        switch (mode) {
            case LM_CLASS:
                // mode=1: what is the name of the main class itself, for example what = Hello
                cn = what;
                break;
            case LM_JAR:
                cn = getMainClassFromJar(what);
                break;
            default:
                // should never happen
                throw new InternalError("" + mode + ": Unknown launch mode");
        }

        // load the main class
        cn = cn.replace('/', '.');
        Class<?> mainClass = null;
        // The ClassLoader returned in this step is AppClassLoader
        ClassLoader scl = ClassLoader.getSystemClassLoader();
        try {
            try {
                // Class.forName performs security checks and then calls forName0 in Class.c
                mainClass = Class.forName(cn, false, scl);
            } catch (NoClassDefFoundError | ClassNotFoundException cnfe) {
                // Omit unnecessary code...
            }
        } catch (LinkageError le) {
            abort(le, "java.launcher.cls.error6", cn,
                    le.getClass().getName() + ":" + le.getLocalizedMessage());
        }
        return mainClass;
    }
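Note the false passed as the second argument of Class.forName above: the class is loaded but not yet initialized. The following sketch illustrates the difference between the two. ForNameDemo and Target are hypothetical names made up for this example, and the sketch uses its own class loader rather than the system class loader.

```java
import java.util.ArrayList;
import java.util.List;

final class ForNameDemo {
    /** Records what happened, in order. */
    static final List<String> EVENTS = new ArrayList<>();

    /** The static initializer runs when the class is initialized, not when it is merely loaded. */
    static final class Target {
        static {
            EVENTS.add("initialized");
        }
    }

    static List<String> loadThenInitialize() {
        ClassLoader loader = ForNameDemo.class.getClassLoader();
        String binaryName = ForNameDemo.class.getName() + "$Target";
        try {
            // initialize = false: load the class but do not run its static initializer.
            Class.forName(binaryName, false, loader);
            EVENTS.add("loaded");
            // initialize = true: the static initializer runs now (only the first time).
            Class.forName(binaryName, true, loader);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        return new ArrayList<>(EVENTS);
    }

    public static void main(String[] args) {
        System.out.println(loadThenInitialize());
    }
}
```

On the first call the recorded order is "loaded" followed by "initialized", which shows that the first forName call did not trigger the static initializer.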

The ClassLoader.getSystemClassLoader method

public static ClassLoader getSystemClassLoader() {
        // Return the class loader that matches the VM initialization level. The levels are defined in VM.java:
        // 1. JAVA_LANG_SYSTEM_INITED = 1: initialization of the lang library is complete
        // 2. MODULE_SYSTEM_INITED = 2: module system initialization is complete
        // 3. SYSTEM_LOADER_INITIALIZING = 3
        // 4. SYSTEM_BOOTED = 4: the system is completely started
        // The JVM is already initialized when the Math class is loaded, so the initialization level is 4
        // scl is a ClassLoader assigned in initSystemClassLoader, which is called during HotSpot startup, so scl is not null here; it is the AppClassLoader
        switch (VM.initLevel()) {
            case 0:
            case 1:
            case 2:
                // the system class loader is the built-in app class loader during startup
                return getBuiltinAppClassLoader();
            case 3:
                String msg = "getSystemClassLoader cannot be called during the system class loader instantiation";
                throw new IllegalStateException(msg);
            default:
                // system fully initialized
                assert VM.isBooted() && scl != null;
                SecurityManager sm = System.getSecurityManager();
                if (sm != null) {
                    checkClassLoaderPermission(scl, Reflection.getCallerClass());
                }
                return scl;
        }
}

// initSystemClassLoader method
static synchronized ClassLoader initSystemClassLoader() {
        if (VM.initLevel() != 3) {
            throw new InternalError("system class loader cannot be set at initLevel " +
                                    VM.initLevel());
        }

        // detect recursive initialization
        if (scl != null) {
            throw new IllegalStateException("recursive invocation");
        }

        // Call getBuiltinAppClassLoader
        ClassLoader builtinLoader = getBuiltinAppClassLoader();
        // Omit unnecessary code...
}

// getBuiltinAppClassLoader method
static ClassLoader getBuiltinAppClassLoader() {
        return ClassLoaders.appClassLoader();
}
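From application code, the loader returned by getSystemClassLoader and its parents can be inspected with public API only. The sketch below walks the parent chain; LoaderChainDemo is a hypothetical name, and the "bootstrap (null)" label is just this sketch's way of marking the bootstrap loader, which Java code sees as null.

```java
import java.util.ArrayList;
import java.util.List;

final class LoaderChainDemo {

    /** Walks getParent() from the given loader up to the bootstrap loader, which Java represents as null. */
    static List<String> chainFrom(ClassLoader loader) {
        List<String> chain = new ArrayList<>();
        for (ClassLoader current = loader; current != null; current = current.getParent()) {
            chain.add(current.getClass().getName());
        }
        chain.add("bootstrap (null)");
        return chain;
    }

    public static void main(String[] args) {
        for (String name : chainFrom(ClassLoader.getSystemClassLoader())) {
            System.out.println(name);
        }
    }
}
```

On a JDK 9 or later runtime the first entry is typically the class of the built-in application class loader that ClassLoaders.appClassLoader() returns, followed by the platform class loader; the exact names depend on the JDK in use.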

After the Java-level class loader has been obtained, Class.forName(cn, false, scl) is called

public static Class<?> forName(String name, boolean initialize,
                                   ClassLoader loader)
        throws ClassNotFoundException {
            Class<?> caller = null;
            // Omit unnecessary code...
            // If the loader passed in is null (a request for the bootstrap class loader), check that the caller is allowed to use it
            if (loader == null) {
                ClassLoader ccl = ClassLoader.getClassLoader(caller);
                if (ccl != null) {
                    sm.checkPermission(
                        SecurityConstants.GET_CLASSLOADER_PERMISSION);
                }
            }
            return forName0(name, initialize, loader, caller);
    }

Loading the Math class through SystemDictionary::resolve_or_null again

The forName0 method takes us back toward HotSpot; it is in jdk/src/java.base/share/native/libjava/Class.c

JNIEXPORT jclass JNICALL
Java_java_lang_Class_forName0(JNIEnv *env, jclass this, jstring classname,
                              jboolean initialize, jobject loader, jclass caller){
  cls = JVM_FindClassFromCaller(env, clname, initialize, loader, caller);
}                             

The JVM_FindClassFromCaller method is in hotspot/src/share/vm/prims/jvm.cpp

JVM_ENTRY(jclass, JVM_FindClassFromCaller(JNIEnv* env, const char* name,
                                          jboolean init, jobject loader,
                                          jclass caller)){
jclass result = find_class_from_class_loader(env, h_name, init, h_loader,
                                               h_prot, false, THREAD);
}

find_class_from_class_loader method

jclass find_class_from_class_loader(JNIEnv* env, Symbol* name, jboolean init, Handle loader, Handle protection_domain, jboolean throwError, TRAPS) {
  // Load the Math class
  Klass* klass = SystemDictionary::resolve_or_fail(name, loader, protection_domain, throwError != 0, CHECK_NULL);
  return (jclass) JNIHandles::make_local(env, klass_handle->java_mirror());
}

SystemDictionary::resolve_or_fail

Klass* SystemDictionary::resolve_or_fail(Symbol* class_name, bool throw_error, TRAPS)
{
  return resolve_or_fail(class_name, Handle(), Handle(), throw_error, THREAD);
}

Klass* SystemDictionary::resolve_or_fail(Symbol* class_name, Handle class_loader, Handle protection_domain, bool throw_error, TRAPS) {
  // Call resolve_or_null, the same function analyzed above
  Klass* klass = resolve_or_null(class_name, class_loader, protection_domain, THREAD);
  ......
  return klass;
}

From here the path goes through resolve_or_null, which was already analyzed above when the LauncherHelper class was loaded, so it is not repeated

Get the main method of the Math class and call it

  // Get the main method of the Math class
  mainID = (*env)->GetStaticMethodID(env, mainClass, "main",
                                     "([Ljava/lang/String;)V");
  // Call main() through the CallStaticVoidMethod() function defined in JNIEnv
  // JavaCalls::call() is eventually called to execute the main() method of the Math class.
  // JavaCalls::call() is a very important function; it will be covered later in the article on the method execution engine.
  (*env)->CallStaticVoidMethod(env, mainClass, mainID, mainArgs);

Well, the dry source-reading stage is over. Let’s put the whole process into a diagram to reinforce the picture:

Class overall running flow chart

  1. Run the java classload.Math command to run the bytecode file
  2. Running this command actually invokes the java executable (java.exe on Windows); the main function in main.c is the entry point of the program
  3. While the Java virtual machine is being created, the LauncherHelper class is loaded. The source code of the LauncherHelper class is covered in the next article
  4. After the Java virtual machine is created, the C++ code makes many calls into it to start the program. One of the classes involved in startup is sun.launcher.LauncherHelper, and starting it brings up the Java-level class loaders
  5. The real Java bytecode files, such as the Math class, are loaded through the Java-level class loader
  6. After the bytecode file is loaded, the C++ code obtains the method ID of the main function and calls it directly
  7. After the program finishes running, the JVM is destroyed through LEAVE

The specific flow of class loading

Going from a .class file to information the virtual machine can use directly takes five steps:

Load >> Verify >> Prepare >> Resolve >> Initialize

Here’s a breakdown of these five steps

Loading

Loading means that the class loader reads the class bytecode from one of various sources into memory as a stream of bytes. During loading, the Java virtual machine needs to do three things:

  1. Get the binary byte stream that defines a class by the class’s fully qualified name.
  2. Transform the static storage structure represented by this byte stream into the runtime data structure of the method area.
  3. Generate a java.lang.Class object representing the class in the heap, as the access point to the class’s data in the method area.

There is one point to note here: the source of the bytecode

Bytecode source

The Java Virtual Machine Specification is not particularly specific about these three requirements, which leaves a lot of flexibility for virtual machine implementations and Java applications. For example, the rule “get the binary byte stream that defines a class by its fully qualified name” does not say that the byte stream must come from a Class file; in fact it does not say where or how to get it at all.

This gives the developer a great degree of flexibility, for example:

  • Loading .class files compiled to a local path, the most common source
  • Reading from ZIP packages, which is common and became the basis for the later JAR, EAR, and WAR formats
  • Fetching from the network; the most typical application of this scenario is the Web Applet.
  • Generating at run time; the most common use is dynamic proxies. java.lang.reflect.Proxy uses ProxyGenerator.generateProxyClass() to generate the binary byte stream of a proxy class named “*$Proxy” for specific interfaces
  • Generating from other files; the typical scenario is JSP, where a JSP file is turned into the corresponding Class file.
  • Reading from a database. This is relatively rare; for example, some middleware servers (such as SAP Netweaver) can install programs into a database to distribute program code across a cluster.
  • Reading from encrypted files. This is a typical protection against decompiling Class files: the Class file is decrypted at load time to keep the program logic from being snooped on.
After loading

After the loading phase completes, the binary byte stream from outside the Java virtual machine is stored in the method area in the format the virtual machine defines. That storage format is entirely implementation-defined; the Java Virtual Machine Specification does not prescribe a specific data structure for this area.

Once the type data is in place in the method area, an object of the java.lang.Class class is instantiated in Java heap memory. It serves as the external interface through which the program accesses the type data in the method area.
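Reflection is the everyday way a program uses that java.lang.Class object as its window onto the type data. A minimal sketch, in which ClassObjectDemo and Calculator are hypothetical names made up for illustration:

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

final class ClassObjectDemo {

    /** Hypothetical class whose type data we inspect. */
    static final class Calculator {
        public int compute() {
            return (1 + 2) * 10;
        }
    }

    /** Lists the public methods declared by a type, read through its Class object. */
    static List<String> publicMethodNames(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Method method : type.getDeclaredMethods()) {
            if (Modifier.isPublic(method.getModifiers())) {
                names.add(method.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Every instance leads back to the same Class object for its type.
        System.out.println(new Calculator().getClass() == Calculator.class);
        System.out.println(publicMethodNames(Calculator.class));
    }
}
```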

Points to note during loading

The loading phase is interleaved with parts of the verification phase (some of the file format checks, for example): verification may begin before loading has finished, but the start times of the two phases still keep a fixed order

Verification

The purpose of the verification phase is to ensure that the information in the byte stream of the Class file complies with all the constraints of the Java Virtual Machine Specification, and that running it as code will not compromise the security of the virtual machine.

The verification phase is very important. How rigorous it is directly determines whether the Java virtual machine can withstand attacks from malicious code. In terms of both code size and execution cost, verification accounts for a considerable share of the class loading work.

Overall, the verification phase performs roughly four checks: file format verification, metadata verification, bytecode verification and symbolic reference verification

File format verification

Verify that the byte stream complies with the Class file format specification and can be processed by the current version of the virtual machine.

For example, if we open Math.class directly without decompiling it, we can see that the file begins with cafe babe. This magic number marks the file as a bytecode file, and it is followed by the minor and major version numbers and so on. If we change this information arbitrarily, the JVM will no longer recognize the file. So the first step is to verify that the content of the bytecode conforms to the JVM specification
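The first few of these checks are simple enough to sketch by hand. The code below reads the magic number and the version fields from the first eight bytes of a class file. It is only an illustration of the idea, not how HotSpot implements the check, and ClassFileHeaderCheck is a hypothetical name.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

final class ClassFileHeaderCheck {
    /** The magic number every class file starts with. */
    static final int MAGIC = 0xCAFEBABE;

    /** Returns true if the bytes start with the class file magic number. */
    static boolean hasMagic(byte[] classBytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(classBytes))) {
            return in.readInt() == MAGIC;
        } catch (IOException e) {
            return false; // fewer than four bytes available
        }
    }

    /** Returns major_version, which follows the magic number and minor_version. */
    static int majorVersion(byte[] classBytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(classBytes))) {
            if (in.readInt() != MAGIC) {
                throw new IllegalArgumentException("bad magic number");
            }
            in.readUnsignedShort();          // minor_version
            return in.readUnsignedShort();   // major_version
        } catch (IOException e) {
            throw new IllegalArgumentException("truncated class file header", e);
        }
    }

    public static void main(String[] args) {
        // Hand-written header: magic, minor version 0, major version 52.
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 52};
        System.out.println(hasMagic(header) + " " + majorVersion(header));
    }
}
```

A virtual machine compares the major version against the range it supports, which is how a class compiled for a newer release gets rejected by an older JVM.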

Metadata verification (class metadata information)

The main purpose of this phase is to check the class’s metadata to ensure that nothing in it contradicts the Java Language Specification. The checks in this phase include:

  • Whether this class has a parent class (every class except java.lang.Object should have one)
  • Whether the parent of this class is a class that must not be inherited from (a class modified by final)
  • If the class is not abstract, whether it implements all the methods required by its parent class or interfaces
  • Whether a field or method in the class conflicts with the parent class (for example, overriding a final field of the parent class, or an overload that breaks the rules, such as methods with identical parameters but different return types).

Bytecode verification

After the data types in the metadata have been checked in the previous phase, this phase verifies and analyzes the method bodies of the class (the Code attribute in the Class file) to ensure that the methods of the verified class will not endanger the virtual machine when they run

This is the most complex phase of the whole verification process. Its main purpose is to determine, through data flow analysis and control flow analysis, that the program semantics are legal and logical.

  • Ensure that the data types on the operand stack and the instruction sequence always work together. For example, it must not happen that an int is pushed onto the operand stack but is then loaded into the local variable table as a long when used.
  • Ensure that no jump instruction jumps to a bytecode instruction outside the method body.
  • Ensure that type conversions in the method body are always valid

Because data flow analysis and control flow analysis are so complex, the Java virtual machine design team introduced a joint optimization between the javac compiler and the Java virtual machine after JDK 6: as much of the verification work as possible is moved into the javac compiler, so that the bytecode verification phase does not take too much execution time.

This is done by adding a new attribute named StackMapTable to the attribute table of the method body’s Code attribute. It describes the state the local variable table and the operand stack should be in at the start of every basic block of the method body

During bytecode validation, the JVM does not need to programmatically derive the validity of these states, just check whether the records in the StackMapTable attribute are valid.

This changes the type derivation of bytecode validation to type checking, saving a lot of validation time.

Symbolic reference verification

This verification happens when the JVM converts symbolic references into direct references, which takes place in the resolution phase

Symbolic reference verification can be seen as checking that the information outside the class itself (the various symbolic references in the constant pool) matches. In plain terms: whether the external classes, methods, fields and other resources the class depends on are missing or inaccessible

  • Whether a class can be found for the fully qualified name described by a string in a symbolic reference.
  • Whether the specified class contains a field or method that matches the simple name and descriptor given in the symbolic reference.
  • Whether the accessibility (private, protected, public, package) of the classes, fields and methods in symbolic references allows access from the current class.

The main purpose of symbolic reference verification is to ensure that resolution can proceed normally. If verification fails, the Java virtual machine throws an exception that is a subclass of java.lang.IncompatibleClassChangeError, such as java.lang.IllegalAccessError, java.lang.NoSuchFieldError or java.lang.NoSuchMethodError
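The class hierarchy named above can be confirmed with a few lines of plain Java. This is a small sketch in which ResolutionErrors is a hypothetical name; it only inspects the exception types and does not provoke a real resolution failure.

```java
final class ResolutionErrors {

    /** Returns true if the given throwable type is IncompatibleClassChangeError or one of its subclasses. */
    static boolean isIncompatibleClassChange(Class<? extends Throwable> type) {
        return IncompatibleClassChangeError.class.isAssignableFrom(type);
    }

    public static void main(String[] args) {
        System.out.println(isIncompatibleClassChange(IllegalAccessError.class));
        System.out.println(isIncompatibleClassChange(NoSuchFieldError.class));
        System.out.println(isIncompatibleClassChange(NoSuchMethodError.class));
        // ClassNotFoundException is a checked exception and is not part of this family.
        System.out.println(isIncompatibleClassChange(ClassNotFoundException.class));
    }
}
```

In practice this means that a single catch of IncompatibleClassChangeError (or of its parent LinkageError) covers all three of the errors listed, which is useful when diagnosing mismatched library versions on the class path.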

Preparation

We add two static variables to our Math class. In the preparation phase, memory is allocated for these static variables and they are set to their default values rather than the values written in the source: 0 for int, false for boolean and so on, and null for reference types.

public class Math {

    /** Assigned 0 in the preparation phase of class loading */
    private static int zero = 1;
    /** Assigned null in the preparation phase of class loading */
    private static Math math = new Math();

}

Two things to note about the preparation phase:

  1. Memory is allocated only for class variables at this point, not for instance variables; instance variables are allocated in the Java heap together with the object when it is instantiated

  2. After the preparation phase the zero variable has the initial value 0 rather than 1, because no Java method has been executed yet. The putstatic instruction that assigns 1 to zero is placed in the class constructor method <clinit>() when the program is compiled, so the assignment of 1 to zero is not executed until the initialization phase of the class.
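The default value set during preparation can actually be observed from Java code if something reads the static field before its assignment in <clinit>() has run. The sketch below arranges that by creating an instance in an earlier static initializer. PreparationDemo and Holder are hypothetical names made up for this example.

```java
final class PreparationDemo {

    static final class Holder {
        // Textually first, so <clinit>() runs this constructor before `zero = 1` is executed.
        static final Holder INSTANCE = new Holder();
        static int zero = 1;

        final int zeroSeenByConstructor;

        private Holder() {
            // At this point zero still holds the default value given to it in the preparation phase.
            zeroSeenByConstructor = zero;
        }
    }

    /** The value of zero that the constructor saw while the class was still initializing. */
    static int zeroSeenDuringInit() {
        return Holder.INSTANCE.zeroSeenByConstructor;
    }

    /** The value of zero once initialization has finished. */
    static int zeroAfterInit() {
        return Holder.zero;
    }

    public static void main(String[] args) {
        System.out.println(zeroSeenDuringInit() + " then " + zeroAfterInit());
    }
}
```

The constructor sees 0 and later reads see 1, which matches the description above: preparation zeroes the field, and the assignment of 1 only happens when <clinit>() reaches that statement.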

A special case

If the field attribute table of a class field contains a ConstantValue attribute, the variable is initialized in the preparation phase to the value that attribute specifies. Suppose the definition of the class variable zero above is changed to:

    /** Assigned 1 in the preparation phase, because the field has a ConstantValue attribute */
    private static final int zero = 1;

Why class variables get initial values in the preparation phase

In the case of instance variables, each object has its own copy of the variable in heap memory, so the value of an instance variable depends only on its object. The value of a class variable does not depend on any particular object of the class: it is simply the last value assigned to it, and all objects of the class share the same piece of memory.
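A small sketch of that difference, using a made-up class SharedStateDemo: the static field created has one copy for the whole class, while every object carries its own id.

```java
class SharedStateDemo {
    // Class variable: a single copy shared by every object of the class.
    static int created = 0;

    // Instance variable: each object has its own copy on the heap.
    int id;

    SharedStateDemo() {
        created++;
        id = created;
    }

    public static void main(String[] args) {
        SharedStateDemo first = new SharedStateDemo();
        SharedStateDemo second = new SharedStateDemo();
        // The ids differ per object, while created holds the last value written.
        System.out.println(first.id + " " + second.id + " " + created);
    }
}
```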

Resolution

Resolution (the “parsing” step of class loading) is the process of replacing symbolic references in the constant pool with direct references. There are two important questions here:

  1. What is a symbolic reference?
  2. What is a direct reference?

Let’s talk about symbolic references and direct references

In the JVM, method names, class names, modifiers, return types, and so on are all symbols. These symbols are constants, stored in the constant pool. At the same time, the classes, variables, and code blocks that these symbols name are stored in regions of memory, and each region has a corresponding memory address. Those memory addresses are the direct references, and the resolution step replaces each “symbol” with a “memory address”.
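As a rough analogy in plain Java (this is not what HotSpot does internally), the java.lang.invoke API can turn a class name, a method name, and a descriptor into a handle that can be called directly. The class SymbolicLookupDemo and its helpers resolve and canResolve are hypothetical names for this sketch.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class SymbolicLookupDemo {

    // Turns (owner class, method name, descriptor) into a callable handle.
    static MethodHandle resolve(String owner, String name, String descriptor)
            throws ReflectiveOperationException {
        Class<?> ownerClass = Class.forName(owner.replace('/', '.'));
        MethodType type = MethodType.fromMethodDescriptorString(
                descriptor, SymbolicLookupDemo.class.getClassLoader());
        return MethodHandles.publicLookup().findVirtual(ownerClass, name, type);
    }

    // Reports whether the "symbols" can be turned into a handle at all.
    static boolean canResolve(String owner, String name, String descriptor) {
        try {
            resolve(owner, name, descriptor);
            return true;
        } catch (ReflectiveOperationException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Throwable {
        MethodHandle println =
                resolve("java/io/PrintStream", "println", "(Ljava/lang/String;)V");
        println.invoke(System.out, "Hello World");
        // A name that does not exist cannot be resolved.
        System.out.println(canResolve("java/io/PrintStream", "noSuchMethod", "()V"));
    }
}
```

The failed lookup plays the same role as the NoSuchMethodError that the JVM throws when a symbolic reference cannot be resolved.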

Symbolic reference

Let’s take Test.class as an example and see how symbolic references work

package classload;

public class Test {

    public static void main(String[] args) {
        Test test = new Test();
        test.say();
    }

    private void say() {
        System.out.println("Hello World");
    }
}

Let’s look at the bytecode of Test.class (by running javap -v on it)

public class classload.Test
  minor version: 0
  major version: 52
  flags: ACC_PUBLIC, ACC_SUPER
Constant pool:
   #1 = Methodref          #8.#24         // java/lang/Object."<init>":()V
   #2 = Class              #25            // classload/Test
   #3 = Methodref          #2.#24         // classload/Test."<init>":()V
   #4 = Methodref          #2.#26         // classload/Test.say:()V
   #5 = Fieldref           #27.#28        // java/lang/System.out:Ljava/io/PrintStream;
   #6 = String             #29            // Hello World
   #7 = Methodref          #30.#31        // java/io/PrintStream.println:(Ljava/lang/String;)V
   #8 = Class              #32            // java/lang/Object
   #9 = Utf8               <init>
  #10 = Utf8               ()V
  #11 = Utf8               Code
  #12 = Utf8               LineNumberTable
  #13 = Utf8               LocalVariableTable
  #14 = Utf8               this
  #15 = Utf8               Lclassload/Test;
  #16 = Utf8               main
  #17 = Utf8               ([Ljava/lang/String;)V
  #18 = Utf8               args
  #19 = Utf8               [Ljava/lang/String;
  #20 = Utf8               test
  #21 = Utf8               say
  #22 = Utf8               SourceFile
  #23 = Utf8               Test.java
  #24 = NameAndType        #9:#10         // "<init>":()V
  #25 = Utf8               classload/Test
  #26 = NameAndType        #21:#10        // say:()V
  #27 = Class              #33            // java/lang/System
  #28 = NameAndType        #34:#35        // out:Ljava/io/PrintStream;
  #29 = Utf8               Hello World
  #30 = Class              #36            // java/io/PrintStream
  #31 = NameAndType        #37:#38        // println:(Ljava/lang/String;)V
  #32 = Utf8               java/lang/Object
  #33 = Utf8               java/lang/System
  #34 = Utf8               out
  #35 = Utf8               Ljava/io/PrintStream;
  #36 = Utf8               java/io/PrintStream
  #37 = Utf8               println
  #38 = Utf8               (Ljava/lang/String;)V
{
  public classload.Test();
    descriptor: ()V
    flags: ACC_PUBLIC
    Code:
      stack=1, locals=1, args_size=1
         0: aload_0
         1: invokespecial #1                  // Method java/lang/Object."<init>":()V
         4: return
      LineNumberTable:
        line 3: 0
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
            0       5     0  this   Lclassload/Test;

  public static void main(java.lang.String[]);
    descriptor: ([Ljava/lang/String;)V
    flags: ACC_PUBLIC, ACC_STATIC
    Code:
      stack=2, locals=2, args_size=1
         0: new           #2                  // class classload/Test
         3: dup
         4: invokespecial #3                  // Method "<init>":()V
         7: astore_1
         8: aload_1
         9: invokespecial #4                  // Method say:()V
        12: return
      LineNumberTable:
        line 7: 0
        line 8: 8
        line 9: 12
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
            0      13     0  args   [Ljava/lang/String;
            8       5     1  test   Lclassload/Test;
}
SourceFile: "Test.java"

The Constant pool section is the constant pool of the class file, and it holds all kinds of symbols

The JVM uses these symbols for instance creation, variable passing, and method calls. Take new Test as an example:

In the main method, the first instruction is new, and its comment tells us that the class being created is classload/Test. Let’s see which constant #2 points to

We can see that #2 is a Class entry, and it points to #25. Let’s follow it to #25

We can see that #25 is a Utf8 entry that holds the class name classload/Test. So by following the symbols in the constant pool, the JVM can find out step by step what is being created. Method calls work the same way: after compilation, the method name, the descriptor such as ()V, the class name, and so on have all become symbols stored in the constant pool
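The lookup we just did by hand can be written as a toy program. This is only a model: the map below is a hand-copied subset of the javap listing, and the names ConstantPoolWalkDemo, POOL, and follow are made up for the sketch. A real class file parser works on the binary format, not on strings like these.

```java
import java.util.HashMap;
import java.util.Map;

class ConstantPoolWalkDemo {
    // index -> {tag, value}; a few entries copied from the listing above
    static final Map<Integer, String[]> POOL = new HashMap<>();

    static {
        POOL.put(2, new String[] {"Class", "#25"});
        POOL.put(8, new String[] {"Class", "#32"});
        POOL.put(25, new String[] {"Utf8", "classload/Test"});
        POOL.put(32, new String[] {"Utf8", "java/lang/Object"});
    }

    // Follows "#n" links until a Utf8 entry is reached, then returns its text.
    static String follow(int index) {
        String[] entry = POOL.get(index);
        while (!"Utf8".equals(entry[0])) {
            entry = POOL.get(Integer.parseInt(entry[1].substring(1)));
        }
        return entry[1];
    }

    public static void main(String[] args) {
        System.out.println(follow(2)); // classload/Test
        System.out.println(follow(8)); // java/lang/Object
    }
}
```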

Resolution and dynamic linking

Until now, the compiled symbols have been sitting in the constant pool of the class file, and that constant pool is static. Once the class is loaded, the entries in the constant pool have corresponding memory addresses, and the constant pool becomes the runtime constant pool. Note that what HotSpot moved into the Java heap in JDK 1.7 is the string constant pool; the runtime constant pool itself still belongs to the method area.

Some symbolic references are converted to direct references during class loading or the first time they are used. This conversion is called static resolution. The rest are converted to direct references while the program is running, which is called dynamic linking

Summary of the resolution step

So during class loading, resolution is static linking. It applies to static methods (such as the main method) and other methods that cannot be overridden: once such a method has been loaded and given memory, its memory address does not change, so the symbolic reference can be replaced with the memory address directly during class loading.

But suppose our Test class has multiple subclasses. Because of polymorphism, a non-static method may have several different implementations, so the target cannot be known at compile time or at load time. The JVM has to wait until the program is actually running to find the specific implementation and its memory address, and only then replace the symbolic reference with a direct reference
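Here is a minimal sketch of the two cases, with hypothetical classes Base and Child standing in for Test and its subclasses. The static method is fixed by the class named in the call, while the instance method depends on the object that arrives at run time.

```java
class DispatchDemo {
    static class Base {
        static String kind() {
            return "Base.kind";
        }

        String say() {
            return "Base.say";
        }
    }

    static class Child extends Base {
        // Hides Base.kind(); static methods are never overridden.
        static String kind() {
            return "Child.kind";
        }

        @Override
        String say() {
            return "Child.say";
        }
    }

    // Which say() runs is only known once the actual object arrives.
    static String callSay(Base target) {
        return target.say();
    }

    public static void main(String[] args) {
        System.out.println(Base.kind());           // Base.kind
        System.out.println(callSay(new Base()));   // Base.say
        System.out.println(callSay(new Child()));  // Child.say
    }
}
```

The same call site in callSay reaches two different implementations, which is why its target cannot be fixed once and for all when the class is loaded.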

Initialization

The initialization phase is the last step in the class loading process. It is not until the initialization phase that the Java virtual machine actually starts executing the Java program code written in the class, handing control over to the application.

The initialization phase is the process of executing the class constructor method <clinit>(). The <clinit>() method is not written directly by programmers in Java code; it is generated automatically by the javac compiler

The compiler produces <clinit>() by automatically collecting all class variable assignments and the statements in static blocks (static {} blocks) and merging them. The order of collection is determined by the order in which the statements appear in the source file. A static block can only access variables defined before it. A variable defined after the static block can be assigned in the block, but not accessed

public class Test {
    static {
        // Assigning to the variable compiles fine
        i = 0;
        // The compiler reports "illegal forward reference" here
        System.out.print(i);
    }
    static int i = 1;
}
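The merge order is easy to check with a version that does compile. In this sketch (ClinitOrderDemo and its fields are made-up names), the static blocks and the variable initializers run strictly from top to bottom.

```java
class ClinitOrderDemo {
    static {
        k = 5;              // legal: assigning a variable declared further down
    }
    static int k = 1;       // runs after the block above, so k ends up as 1

    static int i = 1;
    static {
        i = 2;
    }
    static int j = i * 10;  // i is 2 at this point, so j becomes 20
    static {
        i = 3;
    }

    public static void main(String[] args) {
        System.out.println("k = " + k); // 1
        System.out.println("i = " + i); // 3
        System.out.println("j = " + j); // 20
    }
}
```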

The order of initialization

Unlike the class’s constructor (the instance constructor <init>() method), the <clinit>() method does not require an explicit call to the parent class constructor. The Java virtual machine guarantees that the parent class’s <clinit>() method completes execution before the subclass’s <clinit>() method executes.

So the first <clinit>() method to be executed in the Java virtual machine must be that of java.lang.Object
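A short sketch of the parent-first rule, using hypothetical nested classes Parent and Sub: by the time Sub copies a into b, the parent’s static block has already changed a to 2.

```java
class ParentFirstDemo {
    static class Parent {
        static int a = 1;

        static {
            a = 2;
        }
    }

    static class Sub extends Parent {
        // Parent.<clinit>() has already finished, so a is 2 here.
        static int b = a;
    }

    public static void main(String[] args) {
        System.out.println(Sub.b); // 2
    }
}
```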

Initialization considerations

The <clinit>() method is not required for a class or interface. If a class has no static blocks and no assignments to class variables, the compiler does not need to generate a <clinit>() method for it.

Static blocks cannot be used in an interface, but an interface can still have assignments that initialize its variables, so a <clinit>() method is generated for interfaces just as it is for classes. Unlike with classes, however, executing an interface’s <clinit>() method does not require executing the parent interface’s <clinit>() method first, because the parent interface is initialized only when a variable defined in it is used. In addition, the implementation class of an interface does not execute the interface’s <clinit>() method when it is initialized.
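The last point can be observed with a field whose initializer has a side effect. Everything here is hypothetical naming (InterfaceInitDemo, Marker, Impl, note), and the sketch assumes the interface declares no default methods.

```java
import java.util.ArrayList;
import java.util.List;

class InterfaceInitDemo {
    static final List<String> LOG = new ArrayList<>();

    // Records who was initialized and returns a value for the field.
    static Object note(String who) {
        LOG.add(who);
        return who;
    }

    interface Marker {
        // Not a compile-time constant, so it is assigned in Marker's <clinit>().
        Object TOKEN = note("Marker");
    }

    static class Impl implements Marker {
        static {
            note("Impl");
        }
    }

    static List<String> run() {
        new Impl();                  // initializes Impl, but not Marker
        LOG.add("touch");
        Object token = Marker.TOKEN; // first use of a Marker field initializes Marker
        return LOG;
    }

    public static void main(String[] args) {
        System.out.println(run()); // [Impl, touch, Marker]
    }
}
```

The "Marker" entry only appears after the field is read, even though an Impl object was created earlier.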

Synchronization during initialization

The JVM must ensure that a class’s <clinit>() method is correctly locked and synchronized in a multithreaded environment. If multiple threads try to initialize a class at the same time, only one thread executes the <clinit>() method, and all the other threads block until that thread finishes executing it.
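A quick way to convince yourself is to let several threads touch the same uninitialized class and count how often its static block runs. The names ClinitLockDemo, Slow, and RUNS are made up, and the 200 ms sleep is just an arbitrary delay to give the other threads time to arrive.

```java
import java.util.concurrent.atomic.AtomicInteger;

class ClinitLockDemo {
    static final AtomicInteger RUNS = new AtomicInteger();

    static class Slow {
        static {
            RUNS.incrementAndGet();
            try {
                Thread.sleep(200); // keep <clinit>() busy while other threads wait
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }

        static void touch() {
        }
    }

    // Starts several threads that all trigger initialization of Slow.
    static int initializeFromManyThreads() {
        Thread[] threads = new Thread[4];
        for (int n = 0; n < threads.length; n++) {
            threads[n] = new Thread(() -> Slow.touch());
            threads[n].start();
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return RUNS.get();
    }

    public static void main(String[] args) {
        System.out.println(initializeFromManyThreads()); // 1
    }
}
```

This also hints at a practical risk: a long-running static block keeps every other thread that needs the class waiting.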

Summary

OK, that’s all for this article. Let’s recap:

  • First, we looked at why class loading is needed
  • Next, using the HotSpot source code, we analyzed how a class runs from start to finish, and saw where class loading sits in that complete process
  • Finally, we analyzed the details of class loading, going step by step through the five steps Load >> Verify >> Prepare >> Resolve >> Initialize to see what each one does and what it is responsible for
