Preface

In the previous article we gave a rough introduction to the runtime data area of the Java Virtual Machine and explained how it is divided. Today we start with class loading and take a closer look at the format in which data is stored in the runtime data area.

Compilation

A .java file is compiled into a .class file through the following steps: .java -> Lexical Analyzer -> Token Stream -> Parser -> Syntax Tree/Abstract Syntax Tree -> Semantic Analyzer -> Annotated Abstract Syntax Tree -> Bytecode Generator -> .class file. We will not analyze this process in detail, since it involves compiler theory and is fairly complex. What we need to analyze is this: what kind of file exactly is a Class file?

The Class file

In Java, each Class file contains a single class or interface, and each Class file is a stream of 8-bit bytes. All 16-bit, 32-bit, and 64-bit quantities are built by reading two, four, and eight consecutive 8-bit bytes, respectively, in big-endian order (high byte first).

According to the Java Virtual Machine specification, the Class file format stores data in a pseudo-structure similar to a C struct. There are only two kinds of data in a Class file: unsigned numbers and tables. Note that a Class file contains no alignment or padding; all the data is packed tightly, in order.

  • Unsigned numbers are the basic data types: u1, u2, u4, and u8 stand for 1 byte, 2 bytes, 4 bytes, and 8 bytes. (On the Java SE platform, these types can be read with the readUnsignedByte, readUnsignedShort, and readInt methods of the java.io.DataInput interface.)
  • A table consists of zero or more variable-sized items and is used in several Class file structures. The Class file as a whole is essentially one table.
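As a minimal sketch of how these unsigned quantities can be read through java.io.DataInput, the following decodes a u1, a u2, and a u4 from a hand-written byte array. The class name UnsignedReadDemo and the sample bytes are made up for illustration. Note that readInt returns a signed int, so the u4 is masked into a long here.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class UnsignedReadDemo {
    // Reads a u1, a u2 and a u4 (in that order) from the given bytes.
    static long[] readU1U2U4(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            long u1 = in.readUnsignedByte();        // 1 byte, range 0..255
            long u2 = in.readUnsignedShort();       // 2 bytes, high byte first
            long u4 = in.readInt() & 0xFFFFFFFFL;   // 4 bytes; the mask removes the sign
            return new long[] {u1, u2, u4};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] sample = {(byte) 0xFF, 0x00, 0x34,
                         (byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE};
        long[] v = readU1U2U4(sample);
        // Prints: 255 52 cafebabe
        System.out.println(v[0] + " " + v[1] + " " + Long.toHexString(v[2]));
    }
}
```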

Class file structure

A Class file consists roughly of the following structure:

    ClassFile {
        u4             magic;                                // Magic number
        u2             minor_version;                        // Minor version number
        u2             major_version;                        // Major version number
        u2             constant_pool_count;
        cp_info        constant_pool[constant_pool_count-1]; // Constant pool information
        u2             access_flags;                         // Access flags
        u2             this_class;                           // Class index
        u2             super_class;                          // Parent class index
        u2             interfaces_count;                     // Number of interfaces (2 bytes, so a class has up to 65535 interfaces)
        u2             interfaces[interfaces_count];         // Interface indexes
        u2             fields_count;
        field_info     fields[fields_count];
        u2             methods_count;
        method_info    methods[methods_count];               // Method set
        u2             attributes_count;
        attribute_info attributes[attributes_count];         // Attribute set
    }

This article will not explain every item of the structure one by one. That would be tedious and take up a lot of space. It is enough to keep the overall picture in mind and look up the details when you need them. A few commonly used parts of the structure, such as the magic number, are still worth understanding.

Class file example

Let's start with an arbitrary sample file, TestClassFormat.java:

    package com.zwx.jvm;

    public class TestClassFormat {
        public static void main(String[] args) {
            System.out.println("Hello JVM");
        }
    }

Compiling it produces a TestClassFormat.class file, which can be opened in a hexadecimal viewer to inspect its bytes.



Because the Java virtual machine only recognizes Class files, it has to check the format of every Class file strictly.

The magic number

Each Class file begins with a 4-byte magic number (u4), CA FE BA BE ("cafe babe"), which marks the file as a Class file.

Major and minor version numbers

The four bytes after the magic number hold minor_version and major_version, the minor and major version numbers. You may well have run into an exception like the following:

    java.lang.UnsupportedClassVersionError: com/zwx/demo : Unsupported major.minor version 52.0
        at java.lang.ClassLoader.defineClass1(Native Method)
        at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)

This exception indicates that the major version number is not supported. Class file major versions start at 45: JDK 1.0 corresponds to major version 45, and JDK 8 corresponds to major version 52. In the sample Class file, the major version number (bytes 7 and 8) is 00 34, which is 52 in decimal. The class was compiled with JDK 1.8 but I ran it on JDK 1.6, which is why the error appears: a newer JDK is backward compatible with older Class files, but an older JDK cannot run Class files of a newer version.
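To see the magic number and the version fields in a real Class file, the header can be read directly. The sketch below is an illustration only: the class name ClassHeaderDemo and the helper readHeader are hypothetical, and it reads the Class file of java.lang.String from the running JDK, so the version it prints depends on the JDK you run it with.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class ClassHeaderDemo {
    // Returns {magic, minor_version, major_version} read from the class file of `type`.
    static long[] readHeader(Class<?> type) {
        String binaryName = type.getName();
        // Resource name relative to the class's own package, e.g. "String.class".
        String resource = binaryName.substring(binaryName.lastIndexOf('.') + 1) + ".class";
        try (InputStream raw = type.getResourceAsStream(resource)) {
            if (raw == null) {
                throw new IllegalStateException("class file not found for " + binaryName);
            }
            DataInputStream in = new DataInputStream(raw);
            long magic = in.readInt() & 0xFFFFFFFFL;   // u4, bytes 1-4
            int minor = in.readUnsignedShort();        // u2, bytes 5-6
            int major = in.readUnsignedShort();        // u2, bytes 7-8
            return new long[] {magic, minor, major};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[] header = readHeader(String.class);
        System.out.println("magic   = 0x" + Long.toHexString(header[0]).toUpperCase());
        System.out.println("version = " + header[2] + "." + header[1]);
    }
}
```

On a JDK 8 runtime the major version printed for String should be 52, matching the 00 34 bytes discussed above. Newer JDKs print a larger number.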

Other

The remaining items, such as the constant pool count and entries, the access flags (public and so on), and the other fields, follow one another in the order laid out in the Class file structure and are checked against its rules in the same way.

Class loading mechanism

Once the .java files have been compiled, the Class files need to be loaded into memory and their data placed into the different areas of the runtime data area.

From being loaded into memory to being unloaded when it is no longer used, a class goes through five steps (seven stages):

  • 1. Loading
  • 2. Linking: (a) Verification, (b) Preparation, (c) Resolution
  • 3. Initialization
  • 4. Using
  • 5. Unloading

Loading

Loading means finding the binary form of a class or interface from its fully qualified name and storing it in the runtime data area as the Java Virtual Machine specification requires.

Loading a class involves three things:

  • 1. Get the binary byte stream that defines the class from its fully qualified name.
  • 2. Convert the static storage structure represented by this byte stream into the runtime data structure of the method area.
  • 3. Generate a java.lang.Class object representing this class in the Java heap, as the access point to the data in the method area.

The virtual machine specification does not say where the byte stream in step 1 comes from or how to get it, so there are many different implementations. Here are some common ones:

  • 1. The most ordinary way: read a locally compiled .class file.
  • 2. Read it from an archive such as a zip, jar, or war file.
  • 3. Obtain it over the network.
  • 4. Generate the Class file dynamically, for example through a dynamic proxy.
  • 5. Read it from a database.

Loading a class or interface requires a class loader, and a well-behaved class loader should have two properties:

  • 1. Given the same class name, it should always return the same Class object.
  • 2. If class loader L1 delegates the loading of a class C to another class loader L2, then L1 and L2 should return the same Class object for any type T that appears as: (1) the direct superclass or a direct superinterface of C; (2) the type of a field in C; (3) the type of a parameter of a method or constructor in C; (4) the return type of a method in C.

Java has more than one class loader, and the same class loaded by different class loaders yields Class objects that are not equal. So how does Java guarantee the two points above? Through the parent delegation model, which prevents a malicious class from being loaded in place of a core one and so helps keep Java secure.

Parent delegation model

The parent delegation model works in a very simple way. When a class loader receives a request to load a class, it does not try to load the class itself first. It hands the request to its parent loader, which does the same, until the request reaches the top-level class loader. Only when a parent reports that it cannot load the class does the child loader try to load it itself.
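The delegation order can be observed from the outside. In the sketch below, RecordingLoader is a hypothetical loader whose findClass only records the names that reach it. Because java.lang.ClassLoader.loadClass asks the parent before calling findClass, a request for java.lang.String never reaches it, while a request for a class nobody can find does. The names RecordingLoader, DelegationDemo, and no.such.Type are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

class RecordingLoader extends ClassLoader {
    // Names that reached findClass, i.e. names every parent failed to load.
    final List<String> asked = new ArrayList<>();

    RecordingLoader() {
        super(ClassLoader.getSystemClassLoader());
    }

    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        asked.add(name);
        throw new ClassNotFoundException(name);   // this sketch cannot load anything itself
    }
}

class DelegationDemo {
    // Returns the loaded class, or null if no loader in the chain could find it.
    static Class<?> tryLoad(ClassLoader loader, String name) {
        try {
            return loader.loadClass(name);
        } catch (ClassNotFoundException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        RecordingLoader loader = new RecordingLoader();
        System.out.println(tryLoad(loader, "java.lang.String") == String.class); // true
        System.out.println(loader.asked);                                        // []
        System.out.println(tryLoad(loader, "no.such.Type"));                     // null
        System.out.println(loader.asked);                                        // [no.such.Type]
    }
}
```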



In the parent delegation model pictured above, careful readers will notice that the top-level loader is drawn with a dotted line. The top-level loader is special: it has no parent loader and, in terms of implementation, no child loader either. It stands on its own. In terms of class inheritance, the Extension ClassLoader and the Application ClassLoader are related in that both extend URLClassLoader. The Bootstrap ClassLoader has no subclasses, but logically the Extension ClassLoader still hands every request it receives to the Bootstrap ClassLoader first.

  • Bootstrap ClassLoader: loads the class libraries in $JAVA_HOME\lib, or in the path given by -Xbootclasspath, that the virtual machine recognizes by jar name (such as rt.jar). The bootstrap class loader is controlled directly by the Java virtual machine, and developers cannot use it directly.
  • Extension ClassLoader: loads the classes under $JAVA_HOME\lib\ext, or all libraries in the path given by the java.ext.dirs system property (System.getProperty("java.ext.dirs")). Developers can use this class loader directly.
  • Application ClassLoader: loads the libraries on the user class path (CLASSPATH). Developers can use this class loader directly, and it is the one normally used when the application does not define a class loader of its own.
  • Custom class loaders: if necessary, you can define your own class loader by subclassing java.lang.ClassLoader. In practice we usually extend URLClassLoader and override what is needed.
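The chain of loaders can be printed by following getParent() upward. This is a small sketch with a hypothetical class name, LoaderChainDemo. The loader class names it prints depend on the JDK: JDK 8 shows the application and extension loaders described above, while JDK 9 and later replace the extension loader with a platform class loader. In both cases the bootstrap loader is represented by null.

```java
import java.util.ArrayList;
import java.util.List;

class LoaderChainDemo {
    // Walks from `loader` up through getParent() and describes each loader found.
    static List<String> chain(ClassLoader loader) {
        List<String> names = new ArrayList<>();
        for (ClassLoader current = loader; current != null; current = current.getParent()) {
            names.add(current.getClass().getName());
        }
        names.add("bootstrap (represented as null)");
        return names;
    }

    public static void main(String[] args) {
        for (String name : chain(LoaderChainDemo.class.getClassLoader())) {
            System.out.println(name);
        }
        // Core classes are loaded by the bootstrap loader, so this prints null.
        System.out.println(String.class.getClassLoader());
    }
}
```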

Breaking the parent delegation model

The parent delegation model is not a mandatory constraint but a recommended way of loading classes. Most loaders follow it, but some do not: JNDI, JDBC, and other SPI-related mechanisms do not fully comply with the parent delegation model.

One of the easiest ways to break the parent delegation model is to extend the ClassLoader class and override the loadClass method, because the delegation logic is written in loadClass().
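As a hedged sketch of that approach, the loader below overrides loadClass so that one chosen class is defined by the loader itself instead of being delegated. All names here (Payload, ChildFirstLoader, ChildFirstDemo) are hypothetical, and the sketch assumes the compiled Payload.class can be read as a resource from the parent loader. It also shows that the same class loaded by two different loaders yields two unequal Class objects.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical class used only as something to load twice.
class Payload { }

// Loads one named class itself instead of asking its parent first.
class ChildFirstLoader extends ClassLoader {
    private final String target;

    ChildFirstLoader(String target, ClassLoader parent) {
        super(parent);
        this.target = target;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        if (!name.equals(target)) {
            return super.loadClass(name, resolve);   // everything else: normal delegation
        }
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                byte[] bytes = readClassBytes(name);
                c = defineClass(name, bytes, 0, bytes.length);   // defined by this loader
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }

    // Reads the raw class file through the parent loader's resources.
    private byte[] readClassBytes(String name) throws ClassNotFoundException {
        String resource = name.replace('.', '/') + ".class";
        try (InputStream in = getParent().getResourceAsStream(resource)) {
            if (in == null) {
                throw new ClassNotFoundException(name);
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            for (int n; (n = in.read(buffer)) != -1; ) {
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new ClassNotFoundException(name, e);
        }
    }
}

class ChildFirstDemo {
    // Loads Payload a second time through a child-first loader.
    static Class<?> loadIsolated() {
        String name = Payload.class.getName();
        try {
            return new ChildFirstLoader(name, Payload.class.getClassLoader()).loadClass(name);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Class<?> isolated = loadIsolated();
        System.out.println(isolated.getName().equals(Payload.class.getName())); // true
        System.out.println(isolated == Payload.class);                          // false
    }
}
```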

Common exceptions

If something goes wrong during loading, one of the following errors (all subclasses of LinkageError) may be thrown:

  • ClassCircularityError: a class or interface extends or implements itself.
  • ClassFormatError: the binary format of the class or interface is incorrect.
  • NoClassDefFoundError: no class or interface can be found for the given fully qualified name.
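ClassFormatError is easy to provoke on purpose. The sketch below hands eight bytes that do not start with the CA FE BA BE magic number to defineClass through a hypothetical BadBytesLoader. The class names and the byte values are made up for the example.

```java
class BadBytesLoader extends ClassLoader {
    // Exposes the protected defineClass so arbitrary bytes can be handed to the JVM.
    Class<?> define(String name, byte[] bytes) {
        return defineClass(name, bytes, 0, bytes.length);
    }
}

class ClassFormatErrorDemo {
    // Tries to define a class from bytes that are not a class file.
    static String defineGarbage() {
        byte[] notAClassFile = {0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07};
        try {
            new BadBytesLoader().define("Garbage", notAClassFile);
            return "defined";
        } catch (ClassFormatError e) {
            return e.getClass().getName();
        }
    }

    public static void main(String[] args) {
        System.out.println(defineGarbage());   // java.lang.ClassFormatError
    }
}
```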

ClassNotFoundException and NoClassDefFoundError

There is also ClassNotFoundException, which you may run into often. It looks similar to NoClassDefFoundError, but as the names indicate, ClassNotFoundException derives from Exception while NoClassDefFoundError derives from Error.

  • ClassNotFoundException: thrown when the JVM tries to load a class by name and cannot find its bytecode. It usually occurs with explicit loading, in three main scenarios: (1) calling Class.forName(); (2) calling findSystemClass(); (3) calling loadClass(). The usual fix is to check whether the class file exists on the classpath.
  • NoClassDefFoundError: usually occurs with implicit loading, which the JVM triggers when code uses the new keyword, references a field of a class, extends a class or implements an interface, or names a class as a method parameter. It is thrown when the class cannot be found at that point. Fix: make sure every referenced class is on the current classpath.
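The explicit-loading case can be sketched in a few lines. MissingClassDemo and the name no.such.Type are hypothetical. The point is that Class.forName reports a missing class with the checked ClassNotFoundException, which the caller must handle.

```java
class MissingClassDemo {
    // Explicitly loads a class by name and reports what happened.
    static String explicitLoad(String name) {
        try {
            Class.forName(name);
            return "loaded";
        } catch (ClassNotFoundException e) {
            return "ClassNotFoundException";
        }
    }

    public static void main(String[] args) {
        System.out.println(explicitLoad("java.lang.String"));   // loaded
        System.out.println(explicitLoad("no.such.Type"));       // ClassNotFoundException
    }
}
```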

Linking

Linking is the process of taking the binary form of a class or interface type and combining it into the runtime state of the Java virtual machine so that it can be executed. Linking consists of three steps: verification, preparation, and resolution.

Note: Because linking involves allocating new data structures, it may throw an OutOfMemoryError.

Verification

This step is easy to understand: the format of a class must be checked when it is loaded, otherwise anything could be placed directly in memory and Java's security could not be guaranteed. The main checks are:

  • 1. File format verification: for example, whether the file starts with the magic number and whether the version number is one this JDK supports.
  • 2. Metadata verification: for example, whether the fields in the class are legal, whether the class has a parent class, and whether that parent class is legal.
  • 3. Bytecode verification: mainly determines whether the semantics and control flow of the program are logically valid.

If verification fails, a VerifyError (a subclass of LinkageError) is thrown.

Preparation

Preparation is the stage where memory allocation really begins: the static fields (class variables and constants) of the class or interface are created and initialized to their default values. Here are the default values for the common types:

    Data type         Default value
    int               0
    long              0L
    short             (short) 0
    float             0.0f
    double            0.0d
    char              '\u0000'
    byte              (byte) 0
    boolean           false
    Reference types   null

Note that some fields are assigned their real value already in the preparation phase, namely when the value is a constant that already exists in the constant pool. For example:

static final int i = 100;

A field declared static final like this is given its initial value (100) directly, not the default value.
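The difference between the default value and the assigned value can be observed with a small experiment. In this sketch (the class and field names are invented for illustration), static initializers run in textual order, so the first two fields are computed before value has been assigned. The ordinary static field still shows its default 0, while the static final constant already reads as 100. Part of the reason is that javac inlines compile-time constants at the places where they are used.

```java
class PreparationDemo {
    // Both of these run before the assignment to `value` below.
    static int valueSeenEarly = peekValue();
    static int constantSeenEarly = peekConstant();

    static int value = 100;            // assigned during initialization
    static final int CONSTANT = 100;   // compile-time constant

    static int peekValue() {
        return value;
    }

    static int peekConstant() {
        return CONSTANT;
    }

    public static void main(String[] args) {
        System.out.println(valueSeenEarly);      // 0
        System.out.println(constantSeenEarly);   // 100
        System.out.println(value);               // 100
    }
}
```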

Resolution

Resolution is the process of replacing the symbolic references in the constant pool with direct references. A symbolic reference must be resolved before it can be used, and its correctness is checked during resolution.

Note: Because Java supports dynamic binding, some references cannot know which object they point to until they are actually used, so resolution may also take place after initialization.

Common exceptions

The following errors may occur during resolution:

  • IllegalAccessError: an access violation, for example calling a method or accessing a field that was declared private.
  • InstantiationError: thrown when a symbolic reference is resolved and turns out to point to an interface or abstract class, so that no object can be instantiated.
  • NoSuchFieldError: a symbolic reference to a particular field of a class or interface was encountered, but the class or interface contains no field of that name.
  • NoSuchMethodError: a symbolic reference to a particular method of a class or interface was encountered, but the class or interface contains no method with that signature.

Symbolic reference

A symbolic reference is a set of symbols that describes the target being referenced. The symbols can be any form of literal, as long as they identify the target unambiguously. For example, in String s = xx, xx is a symbol: as long as xx can be located through the symbol, it can be resolved to the value of the variable s.

Direct reference

A direct reference is a pointer straight to the target, a relative offset, or a handle through which the target can be located indirectly. The direct references that different virtual machines produce for the same symbolic reference are generally different. Once a direct reference exists, its target must already be present in memory. That is why this step comes after preparation: preparation allocates the memory, and resolution is in effect a matter of matching up addresses.

Initialization

At this stage the real assignments are performed: the default values assigned earlier are replaced with the true initial values, and the class constructor method <clinit>() is executed.

So when does a class need to be initialized? What is the initialization order of the parent and child classes?

Initialization sequence

The Java Virtual Machine specification lists five situations in which a class must be initialized immediately. An action that triggers initialization is called an active reference. Any reference outside the following five cases does not trigger initialization and is called a passive reference.

  • 1. When the virtual machine starts, it first initializes the main class we tell it to execute (that is, the class containing the main method).
  • 2. When an object is instantiated with the new keyword, when a static field of a class is read or set (except static final constants whose values are already in the constant pool), and when a static method of a class is called.
  • 3. When a class is initialized and its parent class has not been initialized yet, initialization of the parent class is triggered first.
  • 4. When a reflective call is made on a class that has not been initialized.
  • 5. JDK 1.7 added dynamic language support: if the final resolution result of a java.lang.invoke.MethodHandle instance is a REF_getStatic, REF_putStatic, or REF_invokeStatic method handle, and the class that the handle corresponds to has not been initialized, its initialization is triggered first.

Note: Interfaces differ slightly on point 3. When an interface is initialized, its parent interfaces do not all have to be initialized. A parent interface is initialized only when it is actually used (for example, when a constant defined in it is referenced).
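The reflection case (point 4) can be illustrated with Class.forName, which takes an optional flag that controls initialization. In this sketch, InitLog, Lazy, and ReflectionInitDemo are hypothetical names. The static block of Lazy records when initialization happens, and the counts are taken after a class literal, after a load-only Class.forName, and after a normal Class.forName.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class InitLog {
    static final List<String> EVENTS = new ArrayList<>();
}

class Lazy {
    static {
        InitLog.EVENTS.add("Lazy initialized");
    }
}

class ReflectionInitDemo {
    // Returns how many initializations were logged after each of three steps.
    static int[] run() {
        try {
            String name = Lazy.class.getName();   // class literal: passive, no initialization
            int afterLiteral = InitLog.EVENTS.size();

            // initialize = false: the class is loaded but not initialized
            Class.forName(name, false, ReflectionInitDemo.class.getClassLoader());
            int afterLoadOnly = InitLog.EVENTS.size();

            Class.forName(name);                  // the one-argument form initializes the class
            int afterForName = InitLog.EVENTS.size();

            return new int[] {afterLiteral, afterLoadOnly, afterForName};
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run()));   // [0, 0, 1] on the first call
    }
}
```

A class is initialized at most once per class loader, so calling run() a second time in the same JVM would report 1 at every step.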

Initialization in practice

Let’s look at some initialization examples:

    package com.zwx.jvm;

    public class TestInit1 {
        public static void main(String[] args) {
            System.out.println(new SubClass());            // A
            // System.out.println(SubClass.value);         // B
            // System.out.println(SubClass.finalValue);    // C
            // System.out.println(SubClass.s1);            // D
            // SubClass[] arr = new SubClass[5];           // E: an array does not trigger initialization
        }
    }

    class SuperClass {
        static {
            System.out.println("Init SuperClass");
        }
        static int value = 100;
        final static int finalValue = 200;
        final static String s1 = "Hello JVM";
    }

    class SubClass extends SuperClass {
        static {
            System.out.println("Init SubClass");
        }
    }

  • 1. The output of statement A is:

    Init SuperClass
    Init SubClass
    com.zwx.jvm.SubClass@xxxxxx

The new keyword triggers initialization of SubClass (active reference case 2), and initializing a subclass first initializes its parent class (active reference case 3).

  • 2. Statement B outputs:

    Init SuperClass
    100

Reading a static field of a class is active reference case 2. Although the field is accessed through the subclass, it is defined in the parent class, so only the parent class is initialized: accessing a static field initializes only the class that declares it.

  • 3. Statements C and D print only the constant values, with no "Init" lines:

    200
    Hello JVM

Static constants declared final are stored in the constant pool, so their values are assigned directly during the preparation phase of linking and the class does not need to be initialized.

  • 4. Statement E outputs nothing, because constructing an array and constructing an object are done by different bytecode instructions. The array is created by a separate anewarray instruction, which does not initialize the element class.

Using

After the five stages above, the class is completely loaded into memory and ready to be used in our code.

Unloading

When a class is no longer in use, it can be garbage collected. Garbage collection will be covered in later articles of this JVM series.

Conclusion

This article introduced the class loading mechanism of the Java virtual machine. Together with the previous article on the runtime data area, it should give you an overall understanding of how class loading works. In the next article we will go down to the bytecode level and analyze method invocation, method overloading, and method overriding in more detail.

**Follow me, and let's learn and improve together.**

Previous article:

Deep analysis of the causes of Java virtual machine heap and stack and OutOfMemory exceptions