As a Java developer, understanding the architecture of the JVM is also a must. When it comes to Java, the first thing that comes to mind is the Java programming language, but Java is in fact a technology made up of four parts: the Java programming language, the Java class file format, the Java Virtual Machine (JVM), and the Java Application Programming Interface (Java API). Their relationship is shown below:

The runtime environment represents the Java platform. Developers write Java code (.java files), which is compiled into bytecode (.class files) and then loaded into memory. Once the bytecode is in the virtual machine, it is either interpreted and executed by the interpreter or, optionally, converted into machine code by the just-in-time (JIT) compiler.

The Java platform is made up of the Java Virtual Machine and the Java Application Programming Interface. The Java language is the gateway to the platform: programs written and compiled in the Java language can run on it. The structure of the platform is shown below:

In this structure, the Java Virtual Machine (JVM) sits at the core and makes programs independent of the underlying operating system and hardware. Below it is the porting interface, which consists of two parts, the adapter and the Java operating system; the platform-dependent part is called the adapter. The JVM is implemented on specific platforms and operating systems through this porting interface. On top of the JVM are Java’s base and extension class libraries and their APIs. Applications and applets written against the Java APIs can run on any Java platform regardless of the underlying system, because the JVM separates the program from the operating system. This is how Java achieves platform independence.

The JVM has one specific job in its life cycle: to run Java programs. When a Java program starts, an instance of the JVM is created; when the program ends, the instance disappears. Let’s take a closer look at the JVM in terms of its architecture and how it works.

· Every JVM has two mechanisms:

① Class loading subsystem: loads classes or interfaces with the appropriate names

② Execution engine: executes the instructions contained in the loaded classes or interfaces

· Each JVM contains: the method area, the Java heap, the Java stack, the native method stack, the program counter, and other hidden registers

In my opinion, these are the most important parts of learning the JVM:

  • The entire process of compiling and executing Java code

  • JVM memory management and garbage collection mechanisms

###2, the entire process of Java code compilation and execution

As mentioned earlier, the entire process of compiling and executing Java code looks like this: the developer writes Java code (.java files), which is compiled into bytecode (.class files) and then loaded into memory. Once the bytecode is in the virtual machine, it is either interpreted and executed by the interpreter or, optionally, converted into machine code by the just-in-time (JIT) compiler.

(1) Java code compilation is done by the Java source compiler, which turns Java code into JVM bytecode (.class files). The flow chart is as follows:

(2) Java bytecode execution is carried out by the JVM execution engine, as shown in the following flow chart:

The entire process of compiling and executing Java code involves the following three important mechanisms:

  • Java source code compilation mechanism

  • Class loading mechanism

  • Class execution mechanism

(1) Java source code compilation mechanism

Java source compilation consists of the following three processes:

① Parsing and entering symbols into the symbol table

② Annotation processing

③ Semantic analysis and generation of class files

The flow chart is as follows:

The resulting class file consists of the following sections:

① Structure information: the class file format version number, and the number and size of each section

② Metadata: information corresponding to declarations and constants in the Java source code. It contains the declaration of the class, its superclass, and its implemented interfaces, the field and method declarations, and the constant pool

③ Method information: information corresponding to the statements and expressions in the Java source code. It contains the bytecode, the exception handler table, the sizes of the operand stack and local variable area, the operand stack type records, and debug symbol information
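
The structure information at the start of a class file can be inspected directly. The following is a minimal sketch, assuming the class file of a top-level class is reachable as a resource next to the class (true for java.lang.Object on common JDKs); the class name ClassFileHeaderDemo and the helper readHeader are hypothetical names chosen for illustration.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class ClassFileHeaderDemo {

    /**
     * Reads the first fields of a class file and returns
     * {magic, minorVersion, majorVersion, constantPoolCount}.
     * Assumes a top-level class whose .class file is available as a resource.
     */
    static int[] readHeader(Class<?> type) {
        String resource = type.getSimpleName() + ".class";
        try (InputStream raw = type.getResourceAsStream(resource)) {
            if (raw == null) {
                throw new IllegalStateException("class file not found: " + resource);
            }
            DataInputStream in = new DataInputStream(raw);
            int magic = in.readInt();                // always 0xCAFEBABE
            int minor = in.readUnsignedShort();      // minor format version
            int major = in.readUnsignedShort();      // major format version
            int constantPoolCount = in.readUnsignedShort();
            return new int[] {magic, minor, major, constantPoolCount};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] h = readHeader(Object.class);
        System.out.printf("magic=0x%08X minor=%d major=%d constant_pool_count=%d%n",
                h[0], h[1], h[2], h[3]);
    }
}
```

The version numbers printed depend on the JDK in use, so no particular output is assumed here beyond the fixed magic number.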

(2) Class loading mechanism

JVM class loading is done by ClassLoader and its subclasses. The class hierarchy and loading order are shown in the following figure:

(1) Bootstrap ClassLoader

Loads the classes in $JAVA_HOME/jre/lib/rt.jar. It is implemented in C++ and is not a subclass of ClassLoader.

(2) Extension ClassLoader

Loads the JAR packages that provide extended functionality for the Java platform, including jre/lib/ext/*.jar under $JAVA_HOME and the JAR packages in the directories specified by -Djava.ext.dirs.

(3) App ClassLoader

Loads the classes in the JAR packages and directories specified on the classpath.

(4) Custom ClassLoader

A ClassLoader customized by an application for its own needs. For example, Tomcat and JBoss implement their own ClassLoaders in accordance with the J2EE specification.
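
The hierarchy can be observed at run time by walking from a class’s loader up through its parents. This is a small sketch rather than a definitive tool; the class name LoaderChainDemo and the helper loaderChain are hypothetical. Note that the concrete loader class names differ between JDK versions (for instance, newer JDKs use a platform class loader where older ones used the extension class loader), so the printed names are not fixed.

```java
import java.util.ArrayList;
import java.util.List;

class LoaderChainDemo {

    /** Returns the loader class names from the given class's loader up to the bootstrap loader. */
    static List<String> loaderChain(Class<?> type) {
        List<String> chain = new ArrayList<>();
        ClassLoader loader = type.getClassLoader();
        while (loader != null) {
            chain.add(loader.getClass().getName());
            loader = loader.getParent();
        }
        // The bootstrap loader is not a ClassLoader object; Java code sees it as null.
        chain.add("bootstrap (null)");
        return chain;
    }

    public static void main(String[] args) {
        System.out.println("application class: " + loaderChain(LoaderChainDemo.class));
        System.out.println("java.lang.String:  " + loaderChain(String.class));
    }
}
```

Core classes such as java.lang.String report a null loader, which matches the statement that the Bootstrap ClassLoader is not a ClassLoader subclass.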

During loading, the JVM first checks whether the class has already been loaded. The check proceeds bottom-up, from the Custom ClassLoader to the Bootstrap ClassLoader; as soon as any ClassLoader in the chain has loaded the class, it is considered loaded, which ensures that the class is loaded only once across these ClassLoaders. The loading itself proceeds top-down: the upper layers try to load the class first, layer by layer.
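
The check-then-delegate order can be sketched with a custom loader that spells out the steps. This is an illustrative simplification, not the JDK’s actual implementation; DelegationDemo, DelegatingLoader, and loadOrNull are hypothetical names, and the sketch assumes the parent loader is non-null (here the system class loader).

```java
class DelegationDemo {

    /** A loader that makes the delegation steps explicit. Assumes a non-null parent. */
    static class DelegatingLoader extends ClassLoader {
        DelegatingLoader(ClassLoader parent) {
            super(parent);
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            synchronized (getClassLoadingLock(name)) {
                // Step 1: has this loader already loaded the class?
                Class<?> c = findLoadedClass(name);
                if (c == null) {
                    try {
                        // Step 2: let the upper layers try first.
                        c = getParent().loadClass(name);
                    } catch (ClassNotFoundException e) {
                        // Step 3: only then try to find the class ourselves.
                        // The inherited findClass always throws ClassNotFoundException.
                        c = findClass(name);
                    }
                }
                if (resolve) {
                    resolveClass(c);
                }
                return c;
            }
        }
    }

    /** Loads a class through a fresh DelegatingLoader, or returns null if nobody can find it. */
    static Class<?> loadOrNull(String name) {
        try {
            return new DelegatingLoader(ClassLoader.getSystemClassLoader()).loadClass(name);
        } catch (ClassNotFoundException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(loadOrNull("java.lang.String") == String.class);
        System.out.println(loadOrNull("no.such.Type"));
    }
}
```

Because the request is handed upward first, a class such as java.lang.String obtained through the custom loader is the very same Class object the rest of the program uses.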

(3) Class execution mechanism

The JVM is a stack-based virtual machine. It allocates a stack for each newly created thread; in other words, a Java program executes through operations on the stack. The stack holds the state of the thread in units of frames. The JVM performs only two operations on the stack: pushing and popping frames.

When the JVM executes the bytecode of a class and a thread is created, a program counter (PC) and a stack are created for that thread. The program counter stores the offset, within the current method, of the next instruction to be executed. The stack consists of stack frames, one for each invocation of each method. A stack frame has two parts: the local variable area and the operand stack. The local variable area holds the method’s local variables and parameters, and the operand stack holds intermediate results produced while the method executes. The stack structure is shown in the figure below:
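
The one-frame-per-invocation rule can be made visible from Java code. The following is a rough sketch under the assumption that the length of a stack trace reflects the number of frames on the current thread; StackFrameDemo, frameCount, and nested are hypothetical names. The comments on add show the bytecode javac typically emits for it, to illustrate the local variable area and operand stack.

```java
class StackFrameDemo {

    /** Number of frames currently on this thread's stack, including this method's own frame. */
    static int frameCount() {
        return new Throwable().getStackTrace().length;
    }

    /** Calls itself the given number of times, then reports the stack depth at the bottom. */
    static int nested(int levels) {
        return levels == 0 ? frameCount() : nested(levels - 1);
    }

    /**
     * Typical bytecode for this method:
     *   iload_0   // push local variable 0 (a) onto the operand stack
     *   iload_1   // push local variable 1 (b)
     *   iadd      // pop both, push a + b
     *   istore_2  // pop the sum into local variable 2 (c)
     *   iload_2   // push c
     *   ireturn   // return the top of the operand stack
     */
    static int add(int a, int b) {
        int c = a + b;
        return c;
    }

    public static void main(String[] args) {
        System.out.println("extra frames for 3 nested calls: " + (nested(3) - nested(0)));
        System.out.println("add(2, 3) = " + add(2, 3));
    }
}
```

Each additional invocation of nested pushes exactly one more frame, so the difference between the two measurements equals the number of extra calls.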

###3, JVM memory management and garbage collection mechanisms

The JVM memory structure is divided into the method area, the stack, the heap, and the native method stack (JNI), as shown in the following diagram:

(1) Heap memory

Memory for all objects created with new is allocated on the heap, and its size can be controlled with -Xmx and -Xms.

At the operating system level, heap allocation in a native program such as one written in C++ generally works as follows. The system keeps a linked list of free memory blocks. When it receives a request from a program, it walks the list looking for the first node whose space is larger than the requested size, removes that node from the free list, and gives its space to the program. On most systems the size of the allocation is recorded at the start of the block, so that a delete statement in the code can free the memory correctly. Because the block found is rarely exactly the requested size, the system automatically returns the surplus to the free list. Memory allocated this way with new is generally slower to obtain and prone to fragmentation, but it is the most convenient to use. On Windows, memory can also be obtained with VirtualAlloc, which reserves a region directly in the address space of the process rather than on the heap or the stack. This is the least convenient method, but also the fastest and most flexible.

The heap grows toward higher addresses and is a non-contiguous area of memory: because the system tracks free memory in a linked list, the blocks are naturally non-contiguous, and the list is traversed from low addresses to high addresses. The size of the heap is limited by the virtual memory available on the computer, so the heap is more flexible and can be larger than the stack.
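
The effect of the heap size options can be checked from inside a running program. A minimal sketch, assuming only the standard Runtime API; HeapInfoDemo and heapSnapshot are hypothetical names, and the numbers printed depend entirely on the machine and the options passed to the JVM.

```java
class HeapInfoDemo {

    /** Returns {maxMemory, totalMemory, freeMemory} in bytes for the current JVM. */
    static long[] heapSnapshot() {
        Runtime rt = Runtime.getRuntime();
        return new long[] {
            rt.maxMemory(),   // upper bound the heap may grow to (related to -Xmx)
            rt.totalMemory(), // heap currently reserved by the JVM (starts near -Xms)
            rt.freeMemory()   // unused part of the currently reserved heap
        };
    }

    public static void main(String[] args) {
        long[] h = heapSnapshot();
        long mb = 1024 * 1024;
        System.out.println("max:   " + h[0] / mb + " MB");
        System.out.println("total: " + h[1] / mb + " MB");
        System.out.println("free:  " + h[2] / mb + " MB");
    }
}
```

Running it once with default settings and once with, for example, a smaller -Xmx value is a quick way to confirm that the option took effect.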

(2) Stack memory

On Windows, a native stack is a data structure that grows toward lower addresses and is a contiguous memory area. Its size is fixed (a constant determined at compile time), so less space can be obtained from the stack than from the heap. As long as the remaining stack space is larger than the requested space, the system provides the memory; otherwise, an exception is reported indicating stack overflow. Stack memory is allocated automatically by the system, which is fast, but the programmer has no control over it.
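
In Java, running out of stack space surfaces as a StackOverflowError. The sketch below provokes one with unbounded recursion and reports how deep the calls got; StackDepthDemo and measureMaxDepth are hypothetical names, and the depth reached varies with the JVM, the platform, and the thread stack size (which HotSpot lets you adjust with -Xss).

```java
class StackDepthDemo {

    private static int depth;

    /** Recurses without a base case; every call pushes one more frame. */
    private static void recurse() {
        depth++;
        recurse();
    }

    /** Recurses until the thread's stack is exhausted and returns the depth reached. */
    static int measureMaxDepth() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError expected) {
            // The stack unwinds to here; the frames pushed by recurse() are gone again.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("frames pushed before overflow: " + measureMaxDepth());
    }
}
```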

Variables of primitive types are allocated directly in stack space. A method’s formal parameters are also allocated in stack space and are reclaimed when the method call completes. A value of a reference type must be created with new: the reference is allocated in stack space, and the object’s instance variables are allocated in heap space. A formal parameter of a reference type likewise occupies a slot in stack space that refers to the object in heap space, and the slot is reclaimed when the method call completes. When a local variable is created with new, space is allocated in both the stack and the heap; when the variable’s life cycle ends, the stack space is reclaimed immediately, while the heap space waits for GC collection. Literal arguments passed in a method call are allocated in stack space and reclaimed after the call completes. String constants and static variables are allocated in the data section, and the object referred to by this is allocated in heap space. For an array, the array reference (its name) is allocated in stack space and the array’s actual contents in heap space.
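
The difference between a value held in a stack slot and an object held on the heap shows up when arguments are passed to a method. A small sketch with hypothetical names (StackVsHeapDemo, modify):

```java
class StackVsHeapDemo {

    /** Receives a copy of a primitive and a copy of a reference to a heap array. */
    static void modify(int value, int[] array) {
        value = 99;     // changes only this frame's local copy of the primitive
        array[0] = 99;  // follows the reference and changes the array on the heap
    }

    public static void main(String[] args) {
        int number = 1;        // primitive local: lives in main's frame
        int[] numbers = {1};   // reference in main's frame, array contents on the heap

        modify(number, numbers);

        System.out.println(number);     // prints 1
        System.out.println(numbers[0]); // prints 99
    }
}
```

When modify returns, its frame and both parameter slots are discarded, but the change to the array persists because the array itself was never on the stack.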

(3) Native method stack (JNI calls in Java)

It supports the execution of native methods by storing the state of each native method call. A JVM implementation is not required to support a native method interface, and may omit it entirely. Sun implemented the Java Native Interface (JNI) for portability, but it is possible to design other native interfaces to replace Sun’s JNI. Doing so can be complicated to design and implement, however, and you need to ensure that the garbage collector does not free objects that are still in use by native methods.

(4) Method area

It holds method code (compiled Java code) and symbol tables, along with information about loaded classes: static variables, constants declared final, fields, and methods. In HotSpot versions before Java 8, the JVM uses the Permanent Generation to store the method area, and its size can be set with -XX:PermSize and -XX:MaxPermSize.

Garbage collection mechanism

The heap contains all objects created by the application, and the JVM has the instructions new, newarray, anewarray, and multianewarray for creating them. However, there are no instructions for releasing memory comparable to delete in C++ or free in C. In Java, all memory is released by the GC. Besides reclaiming memory, another important job of the GC is memory compaction. Other languages have similar implementations. Compared with C++, this is not only more convenient but also safer. Of course, it also has disadvantages, the main one being its performance cost.
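
Reachability, which is what decides whether the GC may reclaim an object, can be illustrated with a weak reference. This is only a sketch with hypothetical names (GcDemo, strongSurvivesGc); System.gc() is merely a request, so whether the weakly referenced object in main is actually cleared is not guaranteed and may differ between runs and JVMs.

```java
import java.lang.ref.WeakReference;

class GcDemo {

    // A static field keeps the object strongly reachable.
    static Object kept;

    /** A strongly reachable object is never collected, whatever the GC does. */
    static boolean strongSurvivesGc() {
        kept = new Object();
        WeakReference<Object> ref = new WeakReference<>(kept);
        System.gc(); // a hint to the JVM, not a command
        return ref.get() == kept;
    }

    public static void main(String[] args) {
        System.out.println("strongly referenced object survived: " + strongSurvivesGc());

        // This object is reachable only through the weak reference,
        // so the GC is allowed (but not obliged) to reclaim it.
        WeakReference<Object> weakOnly = new WeakReference<>(new Object());
        System.gc();
        System.out.println("weakly referenced object cleared: " + (weakOnly.get() == null));
    }
}
```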

###4, Java virtual machine running process example

Each part of the virtual machine has been described in detail above. The following example analyzes how it runs.

The virtual machine starts by calling the main method of a specified class, passing main an array of string arguments. This causes the specified class to be loaded, the other types it uses to be linked, and the class to be initialized. For example, for the program:

After compiling, type java HelloApp run virtual machine on the command line.

The Java virtual machine is started by calling the main method of HelloApp, passing main an array of three strings: “run”, “virtual”, and “machine”. Now let’s outline the steps that the virtual machine might take when executing HelloApp.
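
A minimal sketch of what such a HelloApp could look like, assuming it does nothing more than echo its arguments. The body of the class and the describe helper are hypothetical and added only for illustration; only the class name and the command line come from the text.

```java
class HelloApp {

    /** Builds a one-line summary of the arguments passed to main. */
    static String describe(String[] args) {
        return "HelloApp received " + args.length + " argument(s): " + String.join(" ", args);
    }

    public static void main(String[] args) {
        // With "java HelloApp run virtual machine", args is {"run", "virtual", "machine"}.
        System.out.println(describe(args));
    }
}
```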

The virtual machine tries to execute the main method of class HelloApp and finds that the class has not been loaded; that is, the virtual machine does not currently contain a binary representation of the class. It therefore uses a ClassLoader to try to find such a binary representation. If this fails, an exception is thrown. After the class is loaded and before main is called, class HelloApp must be linked with other types and then initialized. Linking consists of three phases: verification, preparation, and resolution. Verification checks the symbols and semantics of the loaded main class. Preparation creates the static fields of the class or interface and initializes them to their standard default values. Resolution checks the main class’s symbolic references to other classes or interfaces; it is optional at this step. Initialization of a class executes the static initializers declared in the class and the initializers of its static fields. Before a class can be initialized, its parent class must be initialized. The whole process is as follows:
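
The preparation and initialization rules described above can be observed with two small classes. This is a sketch with hypothetical names (InitOrderDemo, Parent, Child, initOrder); it records the order in which the static initializers run and the default value a static field holds before its class’s initializer assigns to it.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class InitOrderDemo {

    // Written to only by the static initializers below.
    static final List<String> LOG = new ArrayList<>();

    static class Parent {
        static {
            LOG.add("Parent static initializer");
        }
    }

    static class Child extends Parent {
        static int counter; // preparation gives this the default value 0

        static {
            LOG.add("Child static initializer, counter=" + counter);
            counter = 42;
        }
    }

    /** Triggers initialization of Child (and therefore Parent) and returns the recorded order. */
    static List<String> initOrder() {
        new Child(); // first active use of Child
        return Collections.unmodifiableList(LOG);
    }

    public static void main(String[] args) {
        for (String entry : initOrder()) {
            System.out.println(entry);
        }
        System.out.println("counter after initialization: " + Child.counter);
    }
}
```

The parent’s initializer is logged first, and the child sees counter as 0 before assigning 42, which matches the order of preparation followed by initialization.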