Introduction to the JVM

JVM: the Java Virtual Machine, a virtual computer that executes a set of virtual machine instructions (bytecode)

The running flow of a Java file we write: the source file is compiled into a .class file, which the JVM then runs

The JVM's job is to convert .class files into machine code that the computer can recognize directly

The Java virtual machine hides the differences between the underlying hardware and instructions of different operating systems at the software level

This enables the cross-platform nature of Java

How exactly does the JVM handle .class files?

Before the JVM can execute a .class file, the file has to be loaded by the class-loading subsystem

The class-loading subsystem acts like a porter, moving all the .class files into the JVM

The data then enters the runtime data area

Finally, the bytecode execution engine executes the class, converting the bytecode into machine code that the machine can recognize directly

JVM class loading mechanism

A class goes through seven phases from the moment it is loaded into virtual machine memory to the moment it is unloaded: loading, verification, preparation, resolution, initialization, using, and unloading. Verification, preparation, and resolution are collectively referred to as linking

Loading:

During the loading phase, the Java virtual machine does three main things

  1. Obtains the binary byte stream that defines a class by its fully qualified name
  2. Transforms the static storage structure represented by this byte stream into the runtime data structure of the method area
  3. Generates a java.lang.Class object representing the class in memory, as the access point to the various data of the class in the method area
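As a small sketch of the third step: once a class has been loaded, its java.lang.Class object is the entry point to the class's metadata. The class name ClassObjectDemo and the helper lookup are made up for this illustration.

```java
class ClassObjectDemo {
    // Looks a class up by its fully qualified name; returns null if it cannot be found.
    static Class<?> lookup(String fullyQualifiedName) {
        try {
            return Class.forName(fullyQualifiedName);
        } catch (ClassNotFoundException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        Class<?> byName = lookup("java.lang.String");
        // Looking the name up again yields the very same Class object as the literal.
        System.out.println(byName == String.class);
        // The Class object exposes the metadata kept for the loaded class.
        System.out.println(byName.getName());
        System.out.println(byName.getDeclaredMethods().length + " declared methods");
    }
}
```

The number of declared methods depends on the JDK version, so it is printed rather than assumed.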

Linking:

  1. Verification: Ensures that the class being loaded conforms to the JVM specification and is safe, so that its methods cannot do anything at runtime that harms the virtual machine
  2. Preparation: Allocates memory in the method area for static variables and sets their initial values
  3. Resolution: The process by which the virtual machine replaces symbolic references in the constant pool with direct references

Verification is the first step of the linking phase. It ensures that the byte stream in the Class file meets the requirements of the current virtual machine and does not compromise the security of the virtual machine itself.

On the whole, the verification phase performs checks in roughly the following four stages

  1. File format verification: Ensures that the input byte stream can be parsed correctly and stored in the method area

  2. Metadata verification: Ensures that no metadata violates the Java language specification

  3. Bytecode verification: Uses data-flow and control-flow analysis to determine that the program semantics are legal and logical

  4. Symbolic reference verification: Ensures that symbolic references can be converted to direct references

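In the preparation phase, memory is allocated for class variables (static variables) and their initial values are set. The initial value here is normally the zero value of the type, not the value assigned in the source code, which is applied later during initialization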
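The effect of the preparation phase can be observed with a small experiment. In this sketch (the names PreparationDemo, Holder, early, late, and peek are invented for the example), a static initializer reads another static variable before that variable's own initializer has run, so it sees the zero value set during preparation.

```java
class PreparationDemo {
    static class Holder {
        // Runs first during initialization: peek() reads `late` before the
        // initializer of `late` has run, so it sees the prepared zero value.
        static int early = peek();
        static int late = 42;

        static int peek() {
            return late;
        }
    }

    static int early() {
        return Holder.early;
    }

    static int late() {
        return Holder.late;
    }

    public static void main(String[] args) {
        System.out.println("early = " + early()); // early = 0
        System.out.println("late  = " + late());  // late  = 42
    }
}
```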

The resolution phase is the process by which the virtual machine replaces symbolic references in the constant pool with direct references

Initialization: it is not until this stage that the Java virtual machine actually starts executing the Java code written in the class

The initialization phase is the process of executing the class constructor <clinit>() method, which is generated automatically by the javac compiler

We need to understand the details of how the <clinit>() method is produced, because they can affect the behavior of the program

The <clinit>() method is generated by the compiler, which collects the assignments to all class variables in the class and the statements in static code blocks, in the order they appear in the source file, and merges them.

A static block can only read variables defined before it. Variables defined after the static block can be assigned in it but not read; if they are read, the compiler reports an illegal forward reference

The <clinit>() method is not mandatory for a class or interface. If a class has no static blocks and no assignments to class variables, the compiler may not generate a <clinit>() method for it
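The ordering and forward-reference rules for static initialization described above can be sketched as follows. The class names ClinitOrderDemo and Counter are invented for the example; the commented-out line is the one the compiler would reject.

```java
class ClinitOrderDemo {
    static class Counter {
        static int value = 1;      // (1) runs first

        static {
            value = 2;             // (2) overwrites the assignment above
            later = 10;            // assigning to a variable declared below is allowed
            // System.out.println(later); // would not compile: illegal forward reference
        }

        static int later = 5;      // (3) runs last and overwrites the 10 assigned in the block
    }

    static int value() {
        return Counter.value;
    }

    static int later() {
        return Counter.later;
    }

    public static void main(String[] args) {
        System.out.println("value = " + value()); // value = 2
        System.out.println("later = " + later()); // later = 5
    }
}
```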

Unloading: the GC removes classes that are no longer needed from memory

Class loader

Class loaders are also arranged by priority. A load request starts at the bottom and is passed upward; from top to bottom the loaders are

  1. Bootstrap ClassLoader: the bootstrap class loader, which loads jar packages with specific names in the JDK's \lib directory
  2. Extension ClassLoader: the extension class loader, which loads jar packages in the \lib\ext directory
  3. Application ClassLoader: the application class loader, which loads jar packages under the specified classpath
  4. Custom ClassLoader: a user-defined class loader

The hierarchical relationship between the class loaders is called the parent delegation model of class loaders

The parent delegation model requires every class loader to have its own parent class loader, except for the top-level bootstrap class loader
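The hierarchy can be inspected at runtime by walking the getParent() chain. This is a minimal sketch (LoaderChainDemo and chain are names made up here). Note that the Java API represents the bootstrap loader as null, and that on JDK 9 and later the loader between the application loader and the bootstrap loader is the platform class loader rather than the extension class loader.

```java
import java.util.ArrayList;
import java.util.List;

class LoaderChainDemo {
    // Walks from the given loader up through its parents.
    // The bootstrap loader is represented by null in the Java API.
    static List<String> chain(ClassLoader start) {
        List<String> names = new ArrayList<>();
        for (ClassLoader cl = start; cl != null; cl = cl.getParent()) {
            names.add(cl.getClass().getName());
        }
        names.add("bootstrap (reported as null)");
        return names;
    }

    public static void main(String[] args) {
        // Prints the application loader first and the bootstrap loader last.
        chain(ClassLoader.getSystemClassLoader()).forEach(System.out::println);
        // Core classes such as String are loaded by the bootstrap loader.
        System.out.println(String.class.getClassLoader()); // null
    }
}
```

The concrete loader class names in the output differ between JDK versions, so they are not listed here.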

How the parent delegation model works: when a class loader receives a class loading request, it does not try to load the class itself at first. Instead, it delegates the request to its parent class loader. This happens at every level, so every load request is eventually passed up to the top-level bootstrap class loader. Only when the parent loader reports that it cannot complete the request (it did not find the class in its search scope) does the child loader attempt to load the class itself.
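The delegation order can be made visible with a loader that records when it is asked to find a class itself. This is only a sketch: RecordingLoader is a hypothetical class that never defines anything, and it relies on the default loadClass() of java.lang.ClassLoader, which asks the parent first and calls findClass() only if the parent fails.

```java
import java.util.ArrayList;
import java.util.List;

class DelegationDemo {
    // A hypothetical loader that records every request that reaches findClass().
    static class RecordingLoader extends ClassLoader {
        final List<String> findCalls = new ArrayList<>();

        RecordingLoader(ClassLoader parent) {
            super(parent);
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            findCalls.add(name);                    // only reached after the parents gave up
            throw new ClassNotFoundException(name); // this sketch never defines a class
        }
    }

    // Returns the class names for which the child loader had to search itself.
    static List<String> run() {
        RecordingLoader loader = new RecordingLoader(ClassLoader.getSystemClassLoader());
        for (String name : new String[] {"java.lang.Object", "no.such.Type"}) {
            try {
                loader.loadClass(name);
            } catch (ClassNotFoundException e) {
                // expected for no.such.Type: neither the parents nor this loader know it
            }
        }
        return loader.findCalls;
    }

    public static void main(String[] args) {
        System.out.println(run()); // [no.such.Type]
    }
}
```

java.lang.Object is resolved by the parents, so findClass() is never called for it; only the unknown name falls through to the child.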

The advantage of this is that no matter which loader is asked to load the java.lang.Object class, the request is ultimately delegated to the Bootstrap ClassLoader. This ensures that different class loaders end up with the same class

Comparing two classes for equality is meaningful only if both were loaded by the same class loader. Otherwise, even if they come from the same class file and are loaded by the same VM, the two classes are necessarily different as long as their class loaders differ

Breaking the parent delegation model:

The parent delegation model has been “broken” on a large scale three times

  1. The parent delegation model was introduced in JDK 1.2, whereas class loaders and the abstract class java.lang.ClassLoader have existed since JDK 1.0. Before JDK 1.2, the only reason for a user to extend ClassLoader was to override the loadClass() method. JDK 1.2 added a new protected method, findClass(). The parent delegation logic is implemented in loadClass(), so since JDK 1.2 users are discouraged from overriding loadClass() and should instead put their own class loading logic in findClass(), which ensures that newly written class loaders comply with the parent delegation rule.
  2. Basic classes cannot call back into user-supplied code through their own class loader (the more basic a class is, the higher the loader that loads it, while user-supplied code sits on the classpath). To solve this, Java introduced the thread context class loader. This loader can be set with the setContextClassLoader() method of java.lang.Thread. If it is not set when a thread is created, the thread inherits one from its parent thread; if it has not been set anywhere in the application, it is the application class loader by default. In this way a parent class loader asks a child class loader to perform the class loading, which effectively punches through the hierarchy of the parent delegation model and uses class loaders in the reverse direction

Why use the thread context class loader when ClassLoader.getSystemClassLoader() can clearly obtain the application class loader? Because the context class loader is not always the application class loader; sometimes it is a custom class loader

  3. The key to OSGi's modular hot deployment is its custom class loader mechanism. Each program module (Bundle) has its own class loader. When a Bundle needs to be replaced, it is replaced together with its class loader, which achieves hot replacement of the code. In the OSGi environment, class loaders have evolved from the tree structure of the parent delegation model into a more complex network structure
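The inheritance rule for the thread context class loader mentioned in the second case can be checked with a short sketch. ContextLoaderDemo, newCustomLoader, and loaderSeenByChild are names invented for this example; the custom loader is an empty anonymous subclass that stands in for a real one.

```java
class ContextLoaderDemo {
    // A stand-in for a custom class loader; it adds no behavior of its own.
    static ClassLoader newCustomLoader() {
        return new ClassLoader(ClassLoader.getSystemClassLoader()) { };
    }

    // Returns the context class loader seen by a thread that was created
    // while `custom` was the creating thread's context class loader.
    static ClassLoader loaderSeenByChild(ClassLoader custom) {
        Thread current = Thread.currentThread();
        ClassLoader original = current.getContextClassLoader();
        ClassLoader[] seen = new ClassLoader[1];
        try {
            current.setContextClassLoader(custom);
            // The child does not set a loader, so it inherits the creator's.
            Thread child = new Thread(() -> seen[0] = Thread.currentThread().getContextClassLoader());
            child.start();
            child.join();
        } catch (InterruptedException e) {
            current.interrupt();
        } finally {
            current.setContextClassLoader(original); // leave the calling thread as we found it
        }
        return seen[0];
    }

    public static void main(String[] args) {
        ClassLoader custom = newCustomLoader();
        System.out.println(loaderSeenByChild(custom) == custom); // true
    }
}
```

This also shows why code should not assume the context class loader is the application class loader: any code that ran earlier on the creating thread may have replaced it.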

Runtime data area

During the execution of a Java program, the Java virtual machine divides the memory it manages into several different data areas

Program counter

It can be seen as a line number indicator for the bytecode being executed by the current thread. If the thread is executing a Java method, this counter records the address of the virtual machine bytecode instruction being executed

In the conceptual model of the virtual machine, the bytecode execution engine works by changing the value of this counter to select the next bytecode instruction to execute. Basic functions such as branching, looping, jumping, exception handling, and thread resumption all rely on this counter

Because JVM multithreading is implemented by threads taking turns on the processor's execution time, at any given moment a processor executes instructions from only one thread. So that a thread can resume at the correct execution position after a switch, each thread needs its own program counter; the counters of different threads are stored independently and do not affect each other

The virtual machine stack

The virtual machine stack is private to each thread and has the same life cycle as the thread

The virtual machine stack describes the memory model of Java method execution. Each method invocation creates a stack frame that stores the local variable table, operand stack, dynamic link, method exit, and other information. The process of each method from invocation to completion corresponds to one stack frame being pushed onto and popped off the virtual machine stack

Take a thread whose main method calls a compute method as an example

Each thread has a private stack space. When the thread runs main, a stack frame is created for main and placed at the bottom of the stack; when main calls compute, a stack frame is created for compute and pushed on top of it

When compute finishes executing, its frame is popped off the stack

The method called first gets its stack frame first, and the method called later finishes first, which matches the first-in, last-out principle of a stack
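The main and compute scenario above can be reproduced with a sketch like the following. The body of compute and the helper topFrames are assumptions made for this illustration; the stack trace of a freshly created Throwable is used to peek at the frames currently on the thread's stack.

```java
import java.util.ArrayList;
import java.util.List;

class StackFrameDemo {
    // Returns the method names of the innermost frames, top of the stack first.
    static List<String> topFrames(int count) {
        StackTraceElement[] trace = new Throwable().getStackTrace();
        List<String> names = new ArrayList<>();
        // trace[0] is topFrames itself, so start from its caller.
        for (int i = 1; i < trace.length && names.size() < count; i++) {
            names.add(trace[i].getMethodName());
        }
        return names;
    }

    static List<String> compute() {
        int a = 1;
        int b = 2;
        int c = (a + b) * 10;          // a, b, and c live in compute's own stack frame
        System.out.println("c = " + c);
        return topFrames(2);           // compute's frame sits on top of its caller's frame
    }

    public static void main(String[] args) {
        System.out.println(compute()); // [compute, main]
    }
}
```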

Two error conditions are defined for this memory region:

  1. If the stack depth requested by the thread is greater than the depth allowed by the virtual machine, a StackOverflowError is thrown
  2. If the Java virtual machine stack cannot obtain enough memory when it tries to expand, an OutOfMemoryError is thrown
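The first condition is easy to trigger with unbounded recursion. A minimal sketch, with the names StackDepthDemo, recurse, and measure chosen for the example:

```java
class StackDepthDemo {
    static int depth;

    static void recurse() {
        depth++;
        recurse(); // no termination condition: every call pushes another stack frame
    }

    // Recurses until the thread's stack is exhausted and reports how deep it got.
    static int measure() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // By the time control arrives here the frames have been popped again.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("overflowed after " + measure() + " nested calls");
    }
}
```

The depth reached is not fixed; it depends on the frame size and on the thread stack size, which on HotSpot can be changed with the -Xss option.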

Local variable table: used to store method parameters and local variables defined within a method

Operand stack: When a method starts executing, its operand stack is empty. During execution, bytecode instructions write values to and read values from the operand stack, that is, they perform push and pop operations
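To make the interplay of program counter, local variable table, and operand stack concrete, here is a toy interpreter. It is not how a real JVM is implemented: the opcodes are hypothetical and only loosely modeled on instructions such as iconst, iload, istore, iadd, imul, and ireturn.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class OperandStackDemo {
    // Hypothetical opcodes for this sketch, not real JVM opcode values.
    static final int PUSH = 0, LOAD = 1, STORE = 2, ADD = 3, MUL = 4, RET = 5;

    // Interprets a flat int[] program: an opcode, followed by an operand where needed.
    static int run(int[] code, int localCount) {
        int[] locals = new int[localCount];         // local variable table
        Deque<Integer> stack = new ArrayDeque<>();  // operand stack, empty at method entry
        int pc = 0;                                 // program counter
        while (true) {
            int op = code[pc++];
            switch (op) {
                case PUSH:  stack.push(code[pc++]); break;
                case LOAD:  stack.push(locals[code[pc++]]); break;
                case STORE: locals[code[pc++]] = stack.pop(); break;
                case ADD:   stack.push(stack.pop() + stack.pop()); break;
                case MUL:   stack.push(stack.pop() * stack.pop()); break;
                case RET:   return stack.pop();
                default:    throw new IllegalStateException("unknown opcode " + op);
            }
        }
    }

    // Equivalent of: int a = 1; int b = 2; return (a + b) * 10;
    static int compute() {
        int[] code = {
            PUSH, 1, STORE, 0,
            PUSH, 2, STORE, 1,
            LOAD, 0, LOAD, 1, ADD,
            PUSH, 10, MUL,
            RET
        };
        return run(code, 2);
    }

    public static void main(String[] args) {
        System.out.println(compute()); // 30
    }
}
```

Every instruction either pushes onto or pops from the operand stack, and the interpreter chooses the next instruction purely by advancing pc.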

Dynamic linking: Each stack frame contains a reference to the method it belongs to in the runtime constant pool. This reference is held to support dynamic linking during method invocation. The constant pool of a Class file contains a large number of symbolic references, and the method invocation instructions in bytecode take symbolic references to methods in the constant pool as arguments. Some of these symbolic references are converted to direct references during class loading or the first time they are used, which is called static linking. The rest are converted to direct references each time at run time, which is called dynamic linking

With the direct reference to a method in hand, the current thread's stack frame can locate the bytecode of that method through the constant pool and execute it using the operand stack

Method exit: Once a method starts executing, there are only two ways to exit it.

  1. The execution engine encounters a return bytecode instruction, and any return value is passed to the upper-level caller. This is referred to as normal completion

  2. An exception is encountered during the execution of the method and is not handled within the method body. This is referred to as exceptional completion

Regardless of how a method exits, execution has to return to the location where the method was called before the program can continue. When a method returns, it may need information stored in the stack frame to restore the execution state of its caller. In general, when a method exits normally, the value of the caller's PC counter can be used as the return address, and this counter value is likely to be stored in the stack frame. When a method exits with an exception, the return address is determined by the exception handler table, and this information is generally not stored in the stack frame

Native method stack

The native method stack plays a role very similar to that of the virtual machine stack. The difference is that the virtual machine stack serves the execution of Java methods (bytecode), while the native method stack serves the native methods used by the virtual machine

The heap

The Java heap is shared by all threads and is created when the virtual machine is started. The sole purpose of this memory area is to hold object instances, and almost all object instances are allocated memory here

The heap is also the main area managed by the garbage collector, so it is often referred to as the “GC heap”. The Java heap can be subdivided into the young generation and the old generation; in finer detail, the young generation consists of the Eden space, the From Survivor space, and the To Survivor space

When we create an object, it is first placed in memory allocated from Eden. When Eden is full, a Minor GC is triggered and the surviving objects are moved to a Survivor space. On each later Minor GC, the surviving objects in Eden and the From Survivor space are copied to the To Survivor space, and the From and To roles are then swapped, which ensures that one Survivor space is always empty. Objects that are still alive after multiple Minor GCs are moved to the old generation

The old generation stores long-lived objects. When it fills up, a Full GC is triggered, which is the GC people most often hear about; during it all application threads are paused until the collection completes. For applications that need fast responses, Full GCs should be minimized to avoid response timeouts

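Minor GC / Young GC: garbage collection that takes place in the young generation

Major GC / Full GC: strictly speaking, a Major GC collects the old generation, while a Full GC collects the whole heap (young and old generations) and the method area; the two terms are often used interchangeably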
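Which generations and collectors a running JVM actually uses can be listed through the standard java.lang.management API. This is a sketch for exploration only (HeapPoolsDemo and heapPoolCount are names made up here); the pool and collector names it prints depend on the JVM and on the garbage collector selected, so no particular output is assumed.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;

class HeapPoolsDemo {
    // Counts the memory pools that the running JVM reports as part of the heap.
    static long heapPoolCount() {
        return ManagementFactory.getMemoryPoolMXBeans().stream()
                .filter(pool -> pool.getType() == MemoryType.HEAP)
                .count();
    }

    public static void main(String[] args) {
        // Heap pools typically correspond to Eden, Survivor, and old generation spaces.
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            System.out.println(pool.getType() + "  " + pool.getName());
        }
        // Each collector bean reports how many collections it has performed so far.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount() + " collections");
        }
    }
}
```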

The program counter, virtual machine stack, and native method stack are three areas that are created and destroyed with their thread. Their memory allocation and reclamation are deterministic, and the memory is reclaimed naturally when the thread ends, so there is no need to worry about garbage collection for them

The Java heap and the method area are different: they are shared by all threads, and their memory allocation and reclamation are dynamic. The garbage collector therefore focuses on the heap and the method area

Methods to determine whether an object is still alive:

  1. Reference counting: A reference counter is added to each object. Each time the object is referenced somewhere, the counter is incremented, and each time a reference becomes invalid, it is decremented. When the counter reaches 0, the object is considered dead.

  2. Reachability analysis: Start from a set of objects called GC Roots and search downward along references for objects that are directly or indirectly reachable; objects that cannot be reached from any GC Root are considered dead
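The difference between the two methods shows up with circular references. The sketch below models heap objects as plain Node instances (a hypothetical class, not a JVM structure) and marks everything reachable from a root. Nodes c and d reference each other, so a reference counter on each would stay at 1, yet neither is reachable from the root.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

class ReachabilityDemo {
    // A hypothetical heap object that only holds references to other objects.
    static class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }
    }

    // Marks everything reachable from the roots with a simple graph traversal.
    static Set<Node> reachable(List<Node> roots) {
        Set<Node> marked = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node n = pending.pop();
            if (marked.add(n)) {          // visit each object only once
                pending.addAll(n.refs);
            }
        }
        return marked;
    }

    // Builds a small object graph and returns the sorted names of the live objects.
    static List<String> liveNames() {
        Node a = new Node("a"), b = new Node("b"), c = new Node("c"), d = new Node("d");
        a.refs.add(b);   // root -> a -> b
        c.refs.add(d);   // c and d reference each other ...
        d.refs.add(c);   // ... but nothing reachable from the root points to them
        List<String> names = new ArrayList<>();
        for (Node n : reachable(Collections.singletonList(a))) {
            names.add(n.name);
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        System.out.println(liveNames()); // [a, b]
    }
}
```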

Garbage collection algorithm:

  1. Mark-sweep algorithm: The algorithm has two phases, “mark” and “sweep”. First all objects that need to be reclaimed are marked, and after marking they are reclaimed together. Both marking and sweeping are relatively inefficient, and sweeping leaves memory fragmented

  2. Mark-compact algorithm: The marking phase is the same as in mark-sweep, but instead of reclaiming the dead objects in place, the next step moves all surviving objects toward one end and then frees all memory beyond the boundary

  3. Copying algorithm: Memory is divided into two blocks of equal size. Only one block is used at a time; when it is full, the surviving objects are copied to the other block and the used block is cleared

  4. Generational collection algorithm: The core idea is to divide memory into regions according to the life cycles of objects. In general, the GC heap is divided into the old generation and the young generation. In the old generation only a small number of objects need to be reclaimed in each collection, while in the young generation a large amount of garbage needs to be reclaimed in each collection, so different algorithms can be chosen for different regions. The young generation uses the copying algorithm, while the old generation uses the mark-compact algorithm
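The fragmentation difference between mark-sweep and mark-compact can be seen on a toy heap. In this sketch the heap is just an int array in which a positive number is an object id and 0 is free space, and the marking result is given as a boolean array; all names are invented for the illustration and real collectors work on raw memory, not arrays.

```java
import java.util.Arrays;

class SweepVsCompactDemo {
    // Mark-sweep: free dead objects in place; survivors do not move.
    static int[] sweep(int[] heap, boolean[] live) {
        int[] result = heap.clone();
        for (int i = 0; i < result.length; i++) {
            if (result[i] != 0 && !live[result[i]]) {
                result[i] = 0;
            }
        }
        return result;
    }

    // Mark-compact: slide survivors toward the start; everything past them is free.
    static int[] compact(int[] heap, boolean[] live) {
        int[] result = new int[heap.length];
        int next = 0;
        for (int slot : heap) {
            if (slot != 0 && live[slot]) {
                result[next++] = slot;
            }
        }
        return result;
    }

    // Length of the longest run of contiguous free slots.
    static int largestFreeBlock(int[] heap) {
        int best = 0;
        int run = 0;
        for (int slot : heap) {
            run = (slot == 0) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        int[] heap = {1, 2, 3, 4, 5, 6};
        // live[id] is the marking result; index 0 is unused. Objects 1, 3, 5 survive.
        boolean[] live = {false, true, false, true, false, true, false};

        int[] swept = sweep(heap, live);
        int[] compacted = compact(heap, live);
        System.out.println(Arrays.toString(swept) + " largest free block: " + largestFreeBlock(swept));
        System.out.println(Arrays.toString(compacted) + " largest free block: " + largestFreeBlock(compacted));
    }
}
```

Both results contain the same three free slots, but after sweeping they are scattered one by one, whereas after compacting they form a single block that can hold a larger object.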

Method area (metaspace)

Like the heap, it is shared by all threads and is mainly used to store information about classes that have been loaded by the JVM, constants, static variables, code compiled by the just-in-time compiler, and so on

Bytecode execution engine

The execution engine is one of the core parts of the Java virtual machine. A “virtual machine” is a concept defined relative to a “physical machine”. Both can execute code; the difference is that a physical machine's execution engine is built directly on the processor, hardware, instruction set, and operating system, while a virtual machine's execution engine is implemented in software. It can therefore define its own instruction set and execution engine architecture, and can execute instruction set formats that the hardware does not support directly.

A stack frame is the data structure that supports method invocation and method execution in the virtual machine. It is the element of the virtual machine stack in the runtime data area. A stack frame stores a method's local variable table, operand stack, dynamic link, and method return address. The process of each method from invocation to completion corresponds to one stack frame being pushed onto and popped off the virtual machine stack.

For more on the bytecode execution engine, see the book “Understanding the Java Virtual Machine in Depth.”