The virtual machine loads the data that describes a class from the Class file into memory, verifies it, converts and resolves it, initializes it, and finally produces a Java type that the virtual machine can use directly. This is the class loading mechanism of the virtual machine. In Java, loading, linking, and initialization all happen at run time, and this dynamic loading and dynamic linking is what makes Java a dynamically extensible language.

When a class is loaded

The life cycle of a class, from the moment it is loaded into virtual machine memory until it is unloaded, consists of seven stages. Verification, preparation, and resolution are collectively called linking.

  • Loading
  • Verification
  • Preparation
  • Resolution
  • Initialization
  • Using
  • Unloading

The seven stages begin in the order listed above. Loading, verification, preparation, initialization, and unloading must start in this order, but resolution need not: in some cases it can start after initialization, in order to support the runtime binding (also called dynamic or late binding) of the Java language.

The Java Virtual Machine specification strictly states that there are five and only five situations in which a class must be initialized immediately (loading, verification, and preparation naturally have to begin before that). The five cases are:

  1. When one of the four bytecode instructions new, getstatic, putstatic, or invokestatic is encountered and the class has not been initialized, its initialization must be triggered first.
  2. When a method of the java.lang.reflect package is used to make a reflective call on a class that has not been initialized, its initialization must be triggered first.
  3. When a class is initialized and its parent class has not been initialized yet, the initialization of the parent class must be triggered first.
  4. When the virtual machine starts, the user specifies a main class to execute (the class containing the main() method), and the virtual machine initializes this class first.
  5. When the dynamic language support of JDK 1.7 is used, if the final resolution result of a java.lang.invoke.MethodHandle instance is a REF_getStatic, REF_putStatic, or REF_invokeStatic method handle and the class corresponding to that handle has not been initialized, its initialization must be triggered first.
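Cases 1 and 3 can be observed with a minimal sketch like the one below. The class names (ActiveReferenceDemo, Base, Derived, Util) and the LOG list are hypothetical and exist only for illustration; each static block records when its class is initialized.

```java
import java.util.ArrayList;
import java.util.List;

public class ActiveReferenceDemo {
    // Records the order in which the nested classes are initialized.
    static final List<String> LOG = new ArrayList<>();

    static class Base {
        static { LOG.add("Base"); }
    }

    static class Derived extends Base {
        static { LOG.add("Derived"); }
    }

    static class Util {
        static { LOG.add("Util"); }
        static int twice(int x) { return 2 * x; }
    }

    static List<String> run() {
        new Derived();   // "new" triggers Derived, whose parent Base is initialized first
        Util.twice(21);  // "invokestatic" triggers Util
        return new ArrayList<>(LOG);
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Each class is initialized at most once, so calling run() again does not add further entries to the log.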

The behavior in these five scenarios is called making an active reference to a class. All other ways of referring to a class do not trigger initialization and are called passive references. Examples of passive references:

  • Referencing a static field of a parent class through a subclass initializes only the parent class that declares the field, not the subclass;
  • Referencing a class through an array definition does not trigger initialization of that class;
  • A compile-time constant is copied into the constant pool of the calling class at compile time. The calling class therefore holds no direct reference to the class that defines the constant, and using the constant does not trigger initialization of the defining class.
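The three passive references above can be sketched as follows. All names here (PassiveReferenceDemo, Parent, Child, Constants, Element) are hypothetical; the only class expected to be initialized is Parent, because it actually declares the static field that is read.

```java
import java.util.ArrayList;
import java.util.List;

public class PassiveReferenceDemo {
    // Records which nested classes get initialized.
    static final List<String> LOG = new ArrayList<>();

    static class Parent {
        static int value = 123;
        static { LOG.add("Parent init"); }
    }

    static class Child extends Parent {
        static { LOG.add("Child init"); }
    }

    static class Constants {
        static final String GREETING = "hello"; // compile-time constant
        static { LOG.add("Constants init"); }
    }

    static class Element {
        static { LOG.add("Element init"); }
    }

    static List<String> run() {
        int v = Child.value;               // initializes Parent only, not Child
        Element[] array = new Element[10]; // creates an array type, Element is not initialized
        String s = Constants.GREETING;     // constant was inlined by the compiler
        return new ArrayList<>(LOG);
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```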

Interfaces differ from classes in the third of the five scenarios above. When a class is initialized, all of its parent classes must already have been initialized. When an interface is initialized, its parent interfaces do not all have to be initialized; a parent interface is initialized only when it is actually used (for example, when a constant defined in it is referenced).

The process of class loading

The class loading process consists of five stages: loading, verification, preparation, resolution, and initialization.

Loading

“Loading” is only one stage of the “class loading” process. During the loading stage, the virtual machine needs to do three things:

  1. Obtain the binary byte stream that defines a class by its fully qualified name;
  2. Convert the static storage structure represented by the byte stream into a runtime data structure in the method area;
  3. Generate in memory a java.lang.Class object that represents the class and serves as the access point to the class’s data in the method area.
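That loading produces a java.lang.Class object without initializing the class can be sketched with the three-argument Class.forName, whose second parameter controls whether initialization is triggered. The names LoadWithoutInitDemo, Target, and the initialized flag are hypothetical.

```java
public class LoadWithoutInitDemo {
    // Set to true by Target's static initializer.
    static boolean initialized = false;

    static class Target {
        static { initialized = true; }
    }

    /** Returns { flag after loading only, flag after initialization }. */
    static boolean[] run() {
        ClassLoader loader = LoadWithoutInitDemo.class.getClassLoader();
        String name = LoadWithoutInitDemo.class.getName() + "$Target";
        try {
            Class.forName(name, false, loader); // load the class, do not initialize it
            boolean afterLoad = initialized;
            Class.forName(name, true, loader);  // now trigger initialization
            boolean afterInit = initialized;
            return new boolean[] { afterLoad, afterInit };
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println("after load: " + r[0] + ", after init: " + r[1]);
    }
}
```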

Verification

Verification is the first step of the linking phase. Its purpose is to ensure that the information contained in the byte stream of the Class file meets the requirements of the current virtual machine and does not compromise the security of the virtual machine itself. Overall, verification can be roughly divided into four steps: file format verification, metadata verification, bytecode verification, and symbolic reference verification.

File format verification

This step verifies that the byte stream conforms to the Class file format specification and can be processed by the current version of the virtual machine. Its main purpose is to ensure that the input byte stream can be parsed correctly and stored in the method area, and that its format describes a Java type. Possible verification points:

  • Whether the file starts with the magic number 0xCAFEBABE;
  • Whether the major and minor version numbers are within the range that the current virtual machine can process;
  • Whether the constant pool contains constants of an unsupported type;
  • Whether any index that points to a constant refers to a nonexistent constant or to a constant of the wrong type;
  • Whether any CONSTANT_Utf8_info constant contains data that does not conform to the UTF-8 encoding;
  • Whether any part of the Class file, or the file itself, has been truncated or has extra information appended.
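The first two checks in the list can be sketched as follows. This is only an illustration of the idea, not how any virtual machine implements it: the class name ClassFileHeaderCheck, the method checkHeader, and the maxSupportedMajor parameter are hypothetical, and a real virtual machine reports such problems with errors like ClassFormatError or UnsupportedClassVersionError rather than IllegalArgumentException.

```java
import java.nio.ByteBuffer;

public class ClassFileHeaderCheck {
    static final int MAGIC = 0xCAFEBABE;

    /**
     * Checks the first 8 bytes of a class file and returns "major.minor".
     * maxSupportedMajor is a hypothetical stand-in for the highest version
     * the "current virtual machine" accepts.
     */
    static String checkHeader(byte[] classBytes, int maxSupportedMajor) {
        if (classBytes.length < 8) {
            throw new IllegalArgumentException("truncated class file header");
        }
        // ByteBuffer is big-endian by default, like the Class file format.
        ByteBuffer buf = ByteBuffer.wrap(classBytes);
        int magic = buf.getInt();
        if (magic != MAGIC) {
            throw new IllegalArgumentException("bad magic number: 0x" + Integer.toHexString(magic));
        }
        int minor = buf.getShort() & 0xFFFF; // u2 minor_version
        int major = buf.getShort() & 0xFFFF; // u2 major_version
        if (major > maxSupportedMajor) {
            throw new IllegalArgumentException("unsupported class file version " + major + "." + minor);
        }
        return major + "." + minor;
    }

    public static void main(String[] args) {
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 52};
        System.out.println(checkHeader(header, 52));
    }
}
```

Major version 52 corresponds to Java 8 class files.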

Metadata verification

This step performs semantic analysis of the information described by the bytecode to ensure that it meets the requirements of the Java language specification. Possible verification points:

  • Whether the class has a parent class (every class except java.lang.Object must have one);
  • Whether the class’s parent is a class that must not be inherited from (a class declared final);
  • If the class is not abstract, whether it implements all the methods required by its parent class or interfaces;
  • Whether the fields and methods of the class conflict with the parent class (for example, a final field of the parent is overridden, or a method overload breaks the rules, such as identical parameter lists with different return types).

Bytecode verification

Bytecode verification is the most complex step of the whole verification process. Its main purpose is to determine, through data flow and control flow analysis, that the program semantics are legal and logical. Possible verification points:

  • Ensure that the data types on the operand stack always match the instruction sequence at every point. For example, it must never happen that an int is placed on the operand stack but is then stored into the local variable table as a long.
  • Ensure that jump instructions do not jump to bytecode instructions outside the method body.
  • Ensure that type conversions in the method body are valid. For example, it is safe to assign a subclass object to a variable of the parent class type, but it is dangerous and illegal to assign a parent class object to a variable of a subclass type, or to a variable of a completely unrelated type with no inheritance relationship.

Symbolic reference verification

This verification takes place when the virtual machine converts symbolic references into direct references, which happens during resolution, the third phase of linking. Its main purpose is to ensure that resolution can proceed normally. If symbolic reference verification fails, a subclass of java.lang.IncompatibleClassChangeError is thrown, such as java.lang.IllegalAccessError, java.lang.NoSuchFieldError, or java.lang.NoSuchMethodError. Possible verification points:

  • Whether the class named by the fully qualified name string in the symbolic reference can be found;
  • Whether the specified class contains a field or method that matches the simple name and the field or method descriptor;
  • Whether the class, field, or method in the symbolic reference is accessible to the current class.
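The relationship between the errors named above can be checked with a small sketch. The class and method names (LinkageErrorHierarchyDemo, allAreIncompatibleClassChange) are hypothetical; the error classes themselves are part of java.lang.

```java
public class LinkageErrorHierarchyDemo {
    /** Returns true if every listed error is a subclass of IncompatibleClassChangeError. */
    static boolean allAreIncompatibleClassChange() {
        Class<?>[] errors = {
            IllegalAccessError.class,
            NoSuchFieldError.class,
            NoSuchMethodError.class
        };
        for (Class<?> error : errors) {
            if (!IncompatibleClassChangeError.class.isAssignableFrom(error)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(allAreIncompatibleClassChange());
    }
}
```

In practice this means that a single catch of IncompatibleClassChangeError covers all three failures.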

Preparation

The preparation phase formally allocates memory for class variables and sets their initial values; the memory used by these variables is allocated in the method area. Only class variables (static fields) are allocated in this phase. Instance variables are allocated later, together with the object, in the Java heap when the object is instantiated. The initial value set in the preparation phase is “normally” the zero value of the data type; the exception is a static final field with a compile-time constant value, which is set to that constant during preparation.
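The zero value assigned during preparation can be made visible by reading a class variable before its initializer has run. In this minimal sketch (the names PreparationDemo, seenDuringInit, value, and peek are hypothetical), the first initializer calls a method that reads a field declared after it.

```java
public class PreparationDemo {
    // Initializers run in source order, so peek() executes
    // before "value" has been assigned 123.
    static int seenDuringInit = peek();
    static int value = 123;

    static int peek() {
        return value; // still holds the zero value set during preparation
    }

    public static void main(String[] args) {
        System.out.println(seenDuringInit + " " + value);
    }
}
```

The assignment of 123 is part of initialization, not preparation, which is why peek() observes 0.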

Resolution

The resolution phase is the process in which the virtual machine replaces symbolic references in the constant pool with direct references.

  • Symbolic reference: a set of symbols that describes the referenced target. The symbols can be literals of any form, as long as they can be used to locate the target unambiguously. Symbolic references are independent of the memory layout of the virtual machine implementation, and the referenced target is not necessarily loaded into memory.
  • Direct reference: a pointer directly to the target, a relative offset, or a handle that can locate the target indirectly. Direct references depend on the memory layout of the virtual machine implementation, so the same symbolic reference generally translates to different direct references on different virtual machine instances. If a direct reference exists, its target must already be in memory.

Resolution applies mainly to seven kinds of symbolic references: class or interface, field, class method, interface method, method type, method handle, and call site specifier.

Initialization

The initialization phase is the process of executing the class constructor method <clinit>(). Some characteristics and details of how <clinit>() executes may affect the behavior of a program:

  • The <clinit>() method is generated by the compiler, which automatically collects all class variable assignments and the statements in static blocks and merges them. The order of collection is determined by the order in which the statements appear in the source file. A static block can read only variables defined before it; variables defined after it can be assigned in the static block but cannot be read there.
  • Unlike a class’s instance constructor, the <clinit>() method does not need to call the parent’s <clinit>() explicitly. The virtual machine guarantees that the parent’s <clinit>() method has finished before the child’s <clinit>() method is executed. Therefore, the first <clinit>() method executed in the virtual machine is always that of java.lang.Object.
  • Since the parent’s <clinit>() method executes first, static blocks defined in the parent take precedence over variable assignments in the child.
  • The <clinit>() method is not mandatory for a class or interface. The compiler may omit it for a class that has no static blocks and no assignments to class variables.
  • An interface cannot use static blocks, but it can still contain assignments that initialize variables, so a <clinit>() method is generated for an interface just as for a class. Unlike a class, however, executing an interface’s <clinit>() method does not require the parent interface’s <clinit>() method to be executed first. The parent interface is initialized only when a variable defined in the parent interface is used. Likewise, a class that implements the interface does not execute the interface’s <clinit>() method when it is initialized.
  • The virtual machine ensures that the <clinit>() method of a class is properly locked and synchronized in a multi-threaded environment. If multiple threads try to initialize a class at the same time, only one thread executes the <clinit>() method, and all other threads block until that thread finishes. A long-running operation in a class’s <clinit>() method can therefore cause multiple threads to block.
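The ordering rules in the first three points can be sketched as follows. The names (ClinitOrderDemo, Parent, Sub, Forward) are hypothetical and used only for illustration.

```java
public class ClinitOrderDemo {
    static class Parent {
        static int a = 1;
        static { a = 2; } // runs after the assignment above, in source order
    }

    static class Sub extends Parent {
        // Parent's <clinit>() has already finished here, so a is 2.
        static int b = a;
    }

    static class Forward {
        static {
            i = 0; // assigning a variable declared later is allowed
            // Reading i here by its simple name would be rejected by the
            // compiler as an illegal forward reference.
        }
        static int i = 1; // this later assignment wins
    }

    /** Returns { Sub.b, Forward.i }. */
    static int[] run() {
        return new int[] { Sub.b, Forward.i };
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println(r[0] + " " + r[1]);
    }
}
```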

Class loader

The designers of the virtual machine placed the loading-stage action of “getting the binary byte stream that describes a class by its fully qualified name” outside the Java Virtual Machine, so that the application itself can decide how to obtain the classes it needs. The code module that implements this action is called a “class loader”.

The uniqueness of any class within the Java Virtual Machine is established jointly by the class itself and the class loader that loads it. Each class loader has its own class namespace. Comparing two classes for equality is meaningful only if they were loaded by the same class loader. Otherwise, even if two classes come from the same Class file and are loaded by the same virtual machine, they are not equal as long as they were loaded by different class loaders.
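The namespace rule can be demonstrated by defining the same bytes in two different class loaders. To stay self-contained, this sketch builds a minimal class file in memory instead of reading one from disk; the names ClassIdentityDemo, ByteLoader, minimalClassFile, and the class name demo.Sample are all hypothetical, and the hand-built bytes describe an empty public class that extends java.lang.Object.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ClassIdentityDemo {
    /** Builds the bytes of an empty public class (no fields, no methods). */
    static byte[] minimalClassFile(String internalName) {
        byte[] thisName = internalName.getBytes(StandardCharsets.US_ASCII);
        byte[] superName = "java/lang/Object".getBytes(StandardCharsets.US_ASCII);
        ByteBuffer buf = ByteBuffer.allocate(64 + thisName.length + superName.length);
        buf.putInt(0xCAFEBABE);                  // magic
        buf.putShort((short) 0);                 // minor_version
        buf.putShort((short) 52);                // major_version (Java 8)
        buf.putShort((short) 5);                 // constant_pool_count (entries + 1)
        buf.put((byte) 7).putShort((short) 2);   // #1 CONSTANT_Class -> #2
        buf.put((byte) 1).putShort((short) thisName.length).put(thisName);   // #2 Utf8
        buf.put((byte) 7).putShort((short) 4);   // #3 CONSTANT_Class -> #4
        buf.put((byte) 1).putShort((short) superName.length).put(superName); // #4 Utf8
        buf.putShort((short) 0x0021);            // ACC_PUBLIC | ACC_SUPER
        buf.putShort((short) 1);                 // this_class
        buf.putShort((short) 3);                 // super_class
        buf.putShort((short) 0);                 // interfaces_count
        buf.putShort((short) 0);                 // fields_count
        buf.putShort((short) 0);                 // methods_count
        buf.putShort((short) 0);                 // attributes_count
        return Arrays.copyOf(buf.array(), buf.position());
    }

    /** Exposes the protected defineClass method for the demo. */
    static class ByteLoader extends ClassLoader {
        Class<?> define(String binaryName, byte[] bytes) {
            return defineClass(binaryName, bytes, 0, bytes.length);
        }
    }

    /** Returns { names are equal, Class objects are identical }. */
    static boolean[] run() {
        byte[] bytes = minimalClassFile("demo/Sample");
        Class<?> first = new ByteLoader().define("demo.Sample", bytes);
        Class<?> second = new ByteLoader().define("demo.Sample", bytes);
        return new boolean[] { first.getName().equals(second.getName()), first == second };
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println("same name: " + r[0] + ", same class: " + r[1]);
    }
}
```

Both Class objects report the same name, yet they are distinct types to the virtual machine, so checks such as equals(), isInstance(), and instanceof treat them as unrelated.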

Parent delegation model

From the perspective of the Java Virtual Machine, there are only two kinds of class loaders. One is the bootstrap class loader, which is implemented in C++ and is part of the virtual machine itself. The other kind comprises all other class loaders, which are implemented in Java, exist independently of the virtual machine, and all inherit from the abstract class java.lang.ClassLoader.

From a developer’s perspective, the vast majority of Java programs use the following three system-provided class loaders:

  • Bootstrap class loader: loads into virtual machine memory the class libraries that are stored in the <JAVA_HOME>/lib directory, or in the path specified by the -Xbootclasspath parameter, and that the virtual machine recognizes by file name (for example, rt.jar; a library whose name is not recognized is not loaded even if it is placed in the lib directory). The bootstrap class loader cannot be referenced directly by a Java program. When writing a custom class loader that needs to delegate a load request to the bootstrap class loader, simply use null in its place.
  • Extension class loader: implemented by sun.misc.Launcher$ExtClassLoader, it is responsible for loading all class libraries in the <JAVA_HOME>/lib/ext directory or in the paths specified by the java.ext.dirs system property. Developers can use the extension class loader directly.
  • Application class loader: implemented by sun.misc.Launcher$AppClassLoader. Because it is the return value of the getSystemClassLoader() method of ClassLoader, it is also commonly called the system class loader. It is responsible for loading the class libraries specified on the user’s classpath. Developers can use this class loader directly. If the application has not defined a class loader of its own, this is generally the default class loader of the application.
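The chain of system-provided loaders can be inspected at run time with a minimal sketch like the one below (the names LoaderChainDemo and loaderChain are hypothetical). The output depends on the JDK: on the JDK versions this article describes, the chain consists of the application and extension class loaders, while on JDK 9 and later a platform class loader appears in place of the extension class loader.

```java
import java.util.ArrayList;
import java.util.List;

public class LoaderChainDemo {
    /** Walks from the system class loader up to the bootstrap loader. */
    static List<String> loaderChain() {
        List<String> chain = new ArrayList<>();
        for (ClassLoader loader = ClassLoader.getSystemClassLoader();
             loader != null;
             loader = loader.getParent()) {
            chain.add(loader.getClass().getName());
        }
        // The bootstrap class loader is not a Java object; it is represented by null.
        chain.add("bootstrap (null)");
        return chain;
    }

    public static void main(String[] args) {
        for (String name : loaderChain()) {
            System.out.println(name);
        }
        // Core classes are loaded by the bootstrap loader, so this prints null.
        System.out.println(String.class.getClassLoader());
    }
}
```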

These class loaders generally form a hierarchy: custom class loaders sit below the application class loader, which sits below the extension class loader, which in turn sits below the bootstrap class loader at the top.

This hierarchical relationship between class loaders is called the parent delegation model. The model requires every class loader except the top-level bootstrap class loader to have a parent class loader.

The parent delegation model works as follows. When a class loader receives a class loading request, it does not try to load the class itself first. Instead, it delegates the request to its parent class loader, and the same happens at every level, so every load request eventually reaches the top-level bootstrap class loader. Only when the parent class loader reports that it cannot complete the request (it did not find the class in its search scope) does the child loader try to load the class itself.
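The delegation process can be observed with a class loader that records when it is asked to find a class itself. In this sketch the names DelegationDemo and LoggingLoader are hypothetical. java.lang.ClassLoader.loadClass already implements the delegation, and findClass is the hook it calls only after the parent has failed.

```java
import java.util.ArrayList;
import java.util.List;

public class DelegationDemo {
    static class LoggingLoader extends ClassLoader {
        final List<String> findClassCalls = new ArrayList<>();

        LoggingLoader(ClassLoader parent) {
            super(parent);
        }

        // Called by loadClass only when the parent chain could not load the class.
        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            findClassCalls.add(name);
            throw new ClassNotFoundException(name);
        }
    }

    /** Returns the names for which the child loader had to try loading itself. */
    static List<String> run() {
        LoggingLoader loader = new LoggingLoader(DelegationDemo.class.getClassLoader());
        try {
            // Resolved by the parent chain (ultimately the bootstrap loader).
            loader.loadClass("java.lang.String");
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        try {
            // No parent can find this hypothetical class, so findClass is invoked.
            loader.loadClass("no.such.Clazz");
        } catch (ClassNotFoundException expected) {
            // Expected: neither the parents nor this loader know the class.
        }
        return loader.findClassCalls;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Overriding findClass rather than loadClass keeps the delegation logic intact, which is why custom loaders are usually written this way.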

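One obvious benefit of organizing class loaders with the parent delegation model is that Java classes acquire a priority hierarchy along with their class loaders. For example, java.lang.Object is always loaded by the bootstrap class loader no matter which loader receives the request, so it is the same class in every class loader environment of the program.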