The JVM Runtime Data Area (Part 1)

This is the sixth day of my participation in Gwen Challenge

Our JVM series has been on pause for a few days, and readers have been flooding Ah Q with private messages asking for the follow-up, so here it is. Now that you have had a good look at the class-loading subsystem in the last article, let's demystify the runtime data area.

Run-time data area overview

Memory is a very important system resource. It is the intermediate warehouse and bridge between the hard disk and the CPU, and it carries the real-time operation of the operating system and applications. The JVM memory layout defines how Java requests, allocates, and manages memory while a program runs, ensuring that the JVM runs efficiently and stably. Different JVMs differ somewhat in how they divide and manage memory. Below is the classic HotSpot memory layout: In the figure, CodeCache is classified as part of the metaspace in the official JVM documentation, while Alibaba's official documentation lists it separately. The difference does not affect our study of it.

While executing a Java program, the Java virtual machine divides the data involved into different memory areas for management. Together these areas form the runtime data area of the Java virtual machine. As shown in the figure above, the runtime data area is divided into the PC register, the method area, the heap, the native method stack, and the virtual machine stack. The metaspace mentioned above is the concrete implementation of the method area. Some readers will probably ask: isn't there direct memory as well? In fact, direct memory is not part of the runtime data area, nor is it an area defined in the Java Virtual Machine specification. Its size is not limited by the Java heap size. It is off-heap memory allocated directly through native libraries, and it is used frequently. Because it avoids copying data back and forth between the Java heap and the native heap, it improves efficiency.
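
To make the idea of direct memory a little more concrete, here is a minimal sketch using java.nio.ByteBuffer: allocate returns a buffer backed by an array in the Java heap, while allocateDirect asks for a buffer outside the heap. The class name DirectMemoryDemo, the helper allocateBoth, and the buffer size are illustrative choices, not something from the original article.

```java
import java.nio.ByteBuffer;

class DirectMemoryDemo {
    // Allocates one heap buffer and one direct (off-heap) buffer of the same size.
    static ByteBuffer[] allocateBoth(int bytes) {
        ByteBuffer onHeap = ByteBuffer.allocate(bytes);        // backed by a byte[] in the Java heap
        ByteBuffer offHeap = ByteBuffer.allocateDirect(bytes); // backed by memory outside the Java heap
        return new ByteBuffer[] { onHeap, offHeap };
    }

    public static void main(String[] args) {
        ByteBuffer[] buffers = allocateBoth(1024);
        System.out.println("heap buffer is direct?   " + buffers[0].isDirect());
        System.out.println("direct buffer is direct? " + buffers[1].isDirect());

        // Both kinds of buffer are used through the same API.
        buffers[1].putInt(0, 42);
        System.out.println("value read back: " + buffers[1].getInt(0));
    }
}
```

On HotSpot, the total amount of direct memory can be capped with the -XX:MaxDirectMemorySize option, which is separate from the heap size set by -Xmx.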

The red areas, the method area and the heap, are shared between threads: they are created when the VM starts and destroyed when the VM exits. The blue areas are owned by each thread individually: they correspond to threads one-to-one and are created and destroyed as the thread starts and ends. In the HotSpot JVM, each Java thread maps directly to a native operating-system thread: when a Java thread is ready to execute, a native thread is created at the same time, and when the Java thread terminates, the native thread is reclaimed. The operating system is responsible for scheduling all threads onto any available CPU, and once the native thread has been initialized successfully, it calls the Java thread's run() method.
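
The split between shared and thread-private areas can be sketched in a few lines. In this hypothetical example (the class SharedVsPrivateDemo and its method names are made up for illustration), an AtomicInteger object lives in the heap and is updated by both threads, while the variable local lives in each thread's own stack frame and is never seen by the other thread.

```java
import java.util.concurrent.atomic.AtomicInteger;

class SharedVsPrivateDemo {
    // Returns {shared total, local count of thread 0, local count of thread 1}.
    static int[] runTwoThreads(int iterations) {
        AtomicInteger shared = new AtomicInteger(); // the object lives in the heap, visible to both threads
        int[] localResults = new int[2];
        Thread[] threads = new Thread[2];

        for (int t = 0; t < 2; t++) {
            final int id = t;
            threads[t] = new Thread(() -> {
                int local = 0; // lives in this thread's own stack frame
                for (int i = 0; i < iterations; i++) {
                    local++;
                    shared.incrementAndGet();
                }
                localResults[id] = local;
            });
        }
        for (Thread th : threads) th.start();
        try {
            for (Thread th : threads) th.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new int[] { shared.get(), localResults[0], localResults[1] };
    }

    public static void main(String[] args) {
        int[] r = runTwoThreads(1000);
        System.out.println("shared=" + r[0] + " local0=" + r[1] + " local1=" + r[2]);
    }
}
```

Each thread counts to 1000 in its own local variable, while the shared counter sees the increments of both threads.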

We can take a look at the official documentation for the Runtime class:

Every Java application has a single instance of class Runtime that allows the application to interface with the environment in which the application is running. The current runtime can be obtained from the getRuntime method.
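
As a small illustration of the Runtime class, the sketch below obtains the single Runtime instance through getRuntime and reads a few values from it. The class name RuntimeInfoDemo and the helper usedMemory are illustrative, and the numbers printed depend on the machine and JVM settings.

```java
class RuntimeInfoDemo {
    // Memory currently in use by the JVM heap, in bytes (a rough figure).
    static long usedMemory() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        System.out.println("available processors: " + rt.availableProcessors());
        System.out.println("max memory (bytes):   " + rt.maxMemory());
        System.out.println("total memory (bytes): " + rt.totalMemory());
        System.out.println("free memory (bytes):  " + rt.freeMemory());
        System.out.println("used memory (bytes):  " + usedMemory());
    }
}
```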

If you still don't have a general picture of the runtime data area, here is a quick analogy. As shown in the image above, a cook is cooking. If we compare the cook to our virtual machine executing code, the cook is the execution engine we will cover later, and the tools and ingredients behind the cook are the runtime data area. While writing this article, Ah Q found there was quite a lot to cover, so it is split into two parts. This article first introduces the thread-private areas: the PC register, the native method stack, and the virtual machine stack.

PC register (program counter)

The register here is not a physical register in the broad sense, but an abstract simulation of one. It is more appropriately called the program counter (or instruction counter).

Introduction

A Java virtual machine can support multiple threads of execution at once, and each Java virtual machine thread has its own PC register, private to that thread. The PC register is created when the thread is created and dies when the thread ends. Because the program counter only records an instruction address, it takes up very little memory, which makes it the fastest storage area, and it is the only area for which the Java Virtual Machine specification does not define any OutOfMemoryError condition. At any given time, each Java virtual machine thread is executing the code of a single method, the current method of that thread. If the method currently being executed is not native, the PC register contains the address of the Java virtual machine instruction currently being executed. If the thread is currently executing a native method, the value of the PC register is undefined.

Role

The function of the PC register is to store the address of the next instruction, that is, the instruction code about to be executed, which the execution engine reads and hands to the CPU for execution. It is the indicator of program control flow: basic functions such as branching, looping, jumping, exception handling, and thread resumption all depend on this counter. We can think of the PC register as a line-number indicator for the bytecode executed by the current thread, or as a cursor that tells the program which instruction to execute next. Next, we use an example to demonstrate its position and role.

Example: as shown in the figure, the PC register stores the “instruction address” that points to an “operation instruction”. If the instruction address stored in the PC register is “5”, the execution engine fetches the corresponding operation instruction and then does two things. First, it operates on the local variable table and the operand stack to store and load data and to perform arithmetic such as addition and subtraction. Second, it translates the operation instruction into machine instructions the CPU can recognize, which the CPU finally executes. The bytecode interpreter then changes the value in the PC register to “6”, and so on.

Analysis of interview questions

(1) Why use the PC register to record the execution address of the current thread?

JVM multithreading is implemented with a CPU time-slice rotation algorithm, in which threads take turns receiving processor execution time. A thread may therefore be suspended in the middle of execution because its time slice has run out, while another thread receives a time slice and starts executing. When the suspended thread gets a time slice again and wants to continue from where it was suspended, it must know the position it last reached. The PC register is what records the bytecode position each thread has reached. If the virtual machine were single-threaded, there would be no need for a program counter to record the position of each thread.

(2) Why is the PC register set to be thread private?

Because JVM multithreading is implemented by threads taking turns to receive processor execution time, at any given moment a single processor executes instructions from only one thread. Therefore, the best way to accurately record the address of the bytecode instruction each thread is executing is to give every thread its own PC register. In this way, the counters of different threads do not affect each other and are stored independently.
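
The two answers above can be illustrated with a toy simulation. This is only a sketch in plain Java, not how HotSpot schedules threads: the SimThread class, the string "instructions", and the one-instruction time slice are all assumptions made for the example. The point it shows is that each simulated thread resumes at the right place only because it keeps its own pc.

```java
import java.util.ArrayList;
import java.util.List;

class PcPerThreadDemo {
    // A simulated thread: its own program counter plus the "instructions" it runs.
    static final class SimThread {
        final String name;
        final String[] instructions;
        int pc = 0; // private to this simulated thread

        SimThread(String name, String... instructions) {
            this.name = name;
            this.instructions = instructions;
        }

        boolean finished() {
            return pc >= instructions.length;
        }
    }

    // Round-robin scheduler: each time slice executes exactly one instruction.
    static List<String> schedule(SimThread... threads) {
        List<String> log = new ArrayList<>();
        boolean anyRan = true;
        while (anyRan) {
            anyRan = false;
            for (SimThread t : threads) {
                if (t.finished()) continue;
                // Resume exactly where this thread stopped, using its saved pc.
                log.add(t.name + "@" + t.pc + ":" + t.instructions[t.pc]);
                t.pc++;
                anyRan = true;
            }
        }
        return log;
    }

    public static void main(String[] args) {
        SimThread a = new SimThread("A", "a0", "a1", "a2");
        SimThread b = new SimThread("B", "b0", "b1");
        schedule(a, b).forEach(System.out::println);
    }
}
```

If both simulated threads shared one pc, thread B would overwrite the position thread A had reached, and A could not continue correctly after its next time slice.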

The virtual machine stack

Introduction to the stack

As we explained earlier in this JVM series, because Java is cross-platform, its instructions are designed around a stack, which follows the “first in, last out; last in, first out” principle. The advantages of this design are that it is cross-platform, has a small instruction set, and is easier for compilers to implement.

Here we make a simple distinction between “stack” and “heap”. The stack is the unit of run time: it deals with program execution, that is, how the program executes and how it processes data. The heap is the unit of storage: it deals with data storage, that is, how and where data is stored. A simple example: if you are repairing a car, the process of repairing it can be regarded as the stack at work, while the car parts laid out one by one can be regarded as heap storage.

Introduction to the virtual machine stack

The Java virtual machine stack was also known as the Java stack in its early days. Each thread creates a virtual machine stack when the thread is created, so the virtual machine stack is thread-private, and its lifetime ends when the thread terminates. The JVM performs only two operations directly on the virtual machine stack, push and pop, which makes it second only to the program counter in access speed and a fast, efficient way to allocate storage. There is no garbage collection for the virtual machine stack, but because its size can be either dynamic or fixed, it can run into stack overflow or out-of-memory problems:

Stack overflow: with a fixed stack size, the stack size of each thread can be chosen independently when the thread is created. If a thread needs a stack larger than the maximum capacity the virtual machine stack allows, the virtual machine throws a StackOverflowError.
Out of memory: if the stack can be expanded dynamically but not enough memory can be obtained when it tries to expand, or if there is not enough memory to create the stack for a new thread, the virtual machine throws an OutOfMemoryError.

The size of the stack directly determines the maximum depth of method calls. We can configure the stack size with the -Xss option (the option name is case-sensitive), appending k or K for KB, m or M for MB, and g or G for GB, for example: -Xss1m.
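
The relationship between stack size and call depth can be observed with a small experiment. The sketch below is a hedged illustration: StackDepthDemo and measureDepth are made-up names, and it relies on the Thread constructor that accepts a stackSize argument, which the Javadoc describes as only a hint that some platforms may ignore. The depths reached differ between machines and JVM versions, so no concrete numbers are shown.

```java
class StackDepthDemo {
    // Each call adds one stack frame; this never returns normally.
    private static void recurse(int[] depth) {
        depth[0]++;
        recurse(depth);
    }

    // Recurses on a new thread until its stack is exhausted and reports the depth reached.
    static int measureDepth(long stackSizeBytes) {
        int[] depth = {0};
        Thread probe = new Thread(null, () -> {
            try {
                recurse(depth);
            } catch (StackOverflowError e) {
                // Expected: no room left for another stack frame.
            }
        }, "depth-probe", stackSizeBytes);
        probe.start();
        try {
            probe.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return depth[0];
    }

    public static void main(String[] args) {
        System.out.println("depth with 256 KB requested: " + measureDepth(256 * 1024));
        System.out.println("depth with 2 MB requested:   " + measureDepth(2 * 1024 * 1024));
    }
}
```

Starting the JVM with a different -Xss value, for example java -Xss1m StackDepthDemo, changes the default stack size for threads that do not request one explicitly.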

Stack frame operation principle

The virtual machine stack manages the execution of Java programs: it holds the local variables of methods (the 8 primitive data types and object references) and partial results, and it takes part in method invocation and return. What the virtual machine stack stores is stack frames (Stack Frame), and each stack frame corresponds to one method being executed by the thread. A stack frame is a block of memory, a data set that holds the various pieces of data used while a method executes. In an active thread, only one stack frame is active at any point in time: only the frame of the currently executing method (the frame at the top of the stack) is valid. This frame is called the current stack frame (Current Frame), the method corresponding to it is the current method (Current Method), and the class that defines this method is the current class (Current Class). All bytecode instructions run by the execution engine operate only on the current stack frame.

The execution process is shown as follows: when the program starts to execute, method 1 is pushed onto the stack as stack frame 1, which is now the current stack frame. Method 1 then calls method 2, which is pushed as stack frame 2 and becomes the current stack frame, and so on. When method 4 has become stack frame 4 and finishes executing, it returns its result to stack frame 3. The virtual machine then discards stack frame 4 by popping it off the stack, and stack frame 3 becomes the current stack frame again. This continues until method 1 completes and stack frame 1 is popped, at which point the virtual machine stack is reclaimed.
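
The push order described above can be observed from inside a running program. In this minimal sketch (the class FrameOrderDemo and the methods method1 to method3 are made up for the example), a Throwable is created in the innermost method, and its stack trace lists the frames of the current thread with the current frame first.

```java
import java.util.ArrayList;
import java.util.List;

class FrameOrderDemo {
    static List<String> method1() {
        return method2(); // pushes a frame for method2 on top of method1's frame
    }

    static List<String> method2() {
        return method3(); // pushes a frame for method3 on top of method2's frame
    }

    static List<String> method3() {
        // Snapshot of this thread's frames, top of the stack first.
        StackTraceElement[] frames = new Throwable().getStackTrace();
        List<String> names = new ArrayList<>();
        for (int i = 0; i < 3 && i < frames.length; i++) {
            names.add(frames[i].getMethodName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(method1()); // prints [method3, method2, method1]
    }
}
```

The frame of method3 is on top because it is the current frame. Once it returns, its frame is popped and method2's frame becomes the current frame again.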

A Java method can return in two ways. One is a normal return using a return instruction (this includes methods with the void return type). The other is throwing an exception (meaning an unhandled exception; if it is caught with try…catch, it counts as the first kind). Either way, the stack frame is popped off the stack. Stack frames belonging to different threads are not allowed to reference each other, that is, a stack frame cannot reference a stack frame of another thread.

As shown, a stack frame is made up of the local variable table, the operand stack, dynamic linking, the method return address, and some additional information. Let's take a look at each one.

Local Variables

The local variable table, also known as the local variable array, is essentially a numeric array used to store the method's parameters and the local variables defined in the method body (including primitive types, object references, and the returnAddress type). The virtual machine uses the local variable table to pass arguments on method invocation. Because the local variable table is built on the thread's virtual machine stack, it is thread-private data, so there are no data safety issues. In addition, the size of a stack frame is mainly affected by its local variable table. The capacity the local variable table needs is determined at compile time and stored in the max_locals item (maximum local variables) of the method's Code attribute. Therefore, the amount of memory allocated to a stack frame is not affected by the program's run-time data and depends only on the specific virtual machine implementation. In general, if the virtual machine stack size is fixed, the larger the local variable table, the larger the stack frame, and the fewer nested method calls fit, that is, the shallower the stack. A few bytecode diagrams illustrate the contents of the local variable table:

The data in the local variable table is valid only within the current method. During method execution, the virtual machine uses the local variable table to pass argument values to the parameter list. When the method call ends, the local variable table is destroyed along with the method's stack frame.

Slot

Parameter values are stored starting at index 0 of the local variable array, and the array ends at index (array length - 1). The basic storage unit of the local variable table is the Slot (variable slot). When an instance method is called, its parameters and the local variables defined in the method body are copied, in order, into the slots of the local variable table. The JVM assigns each slot an access index, through which the local variable value stored in the table can be accessed.

A type of 32 bits or fewer occupies one slot. This covers the returnAddress type, reference types, int, and float, as well as byte, short, and char, which are converted to int before being stored, and boolean, which is also stored as an int (0 for false, non-zero for true). The 64-bit types (long and double) occupy two slots. To access a 64-bit local variable in the local variable table, use the first of its two indexes.

If the current frame was created by a constructor or an instance method, the reference to the object, this, is stored in the slot at index 0, and the remaining parameters follow in parameter-list order. The this variable does not exist in the local variable table of a static method, so the main method above has no this variable.

In addition, slots can be reused: if a local variable goes out of scope, new local variables declared after that point are likely to reuse its slot, thereby saving resources.
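
The slot rules above can be read off a small example. The class SlotLayoutDemo below is a made-up illustration. The slot numbers in the comments describe the layout javac typically produces, and they are an assumption worth checking on your own JDK by compiling with javac -g and inspecting the LocalVariableTable printed by javap -v.

```java
class SlotLayoutDemo {
    // Instance method: slot 0 holds `this`.
    long instanceSum(int a, long b, double c) {
        // slot 0: this | slot 1: a | slots 2-3: b (long) | slots 4-5: c (double)
        long total = a + b + (long) c; // slots 6-7: total (long takes two slots)
        return total;
    }

    // Static method: there is no `this`, so parameters start at slot 0.
    static int staticSum(int a, int b) {
        // slot 0: a | slot 1: b
        int sum = a;          // slot 2: sum
        {
            int temp = b * 2; // slot 3: temp, in scope only inside this block
            sum += temp;
        }
        int bonus = 1;        // typically reuses slot 3, because temp is out of scope
        return sum + bonus;
    }

    public static void main(String[] args) {
        System.out.println(new SlotLayoutDemo().instanceSum(1, 2L, 3.9)); // prints 6
        System.out.println(staticSum(2, 3));                             // prints 9
    }
}
```

With this layout, instanceSum would need 8 local variable slots and staticSum only 4, even though staticSum declares five variables in total.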

Variables can be divided into member variables and local variables according to their position in a class. Member variables are also divided into class variables and instance variables.

Member variables are given default values before they are used. Class variables receive their default values during the preparation stage of class loading and are explicitly assigned during the initialization stage.
Instance variables are allocated in heap space when the object is created and are given default values.
Local variables are not given default values, so they must be explicitly assigned before use, or compilation fails.
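
The three rules above can be checked with a short example. The class DefaultValuesDemo and its field names are hypothetical. The commented-out lines show the local-variable case, which the compiler rejects.

```java
class DefaultValuesDemo {
    static int classCounter; // class variable: gets the default value 0

    int instanceNumber;      // instance variables: get default values when the object is created
    boolean instanceFlag;
    String instanceText;

    static String describe() {
        DefaultValuesDemo d = new DefaultValuesDemo();

        // Local variables have no default value:
        // int unassigned;
        // System.out.println(unassigned); // does not compile: variable might not have been initialized

        return classCounter + "," + d.instanceNumber + "," + d.instanceFlag + "," + d.instanceText;
    }

    public static void main(String[] args) {
        System.out.println(describe()); // prints 0,0,false,null
    }
}
```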

Variables in the local variable table are also important garbage collection roots: any object referenced directly or indirectly from the local variable table will not be collected.

Operand Stack

The operand stack is also called the expression stack. During the execution of a method, data is written to or taken from the stack according to the bytecode instructions, that is, pushed and popped. The operand stack is mainly used to hold the intermediate results of a computation and to serve as temporary storage for variables during the computation. Each operand stack has a definite stack depth for storing values. The maximum depth required is determined at compile time and stored in the max_stack item of the method's Code attribute, much like the local variable table above. The elements on the stack can be of any Java data type: a 32-bit type takes one unit of stack depth and a 64-bit type takes two. The data types of the elements on the operand stack must strictly match the sequence of bytecode instructions. This is checked by the compiler at compile time and checked again by data-flow analysis in the verification stage of class loading. The interpreting execution engine of the Java virtual machine is a stack-based execution engine, where the stack in question is the operand stack. After all that theory you may be a little confused, so Ah Q made an animated diagram to illustrate how the PC register, the local variable table, and the operand stack behave while bytecode instructions execute:

  public void test() {
      byte i = 15;
      int j = 8;
      int k = i + j;
  }

The size of the local variable table and operand stack is determined at compile time:

The first instruction to be executed is at address 0. At this point the operand stack is empty and no local variable has been assigned yet.
When the first instruction, bipush, executes, the operand 15 is pushed onto the operand stack, and the PC register is set to the address of the next instruction, 2.
When the instruction at address 2 executes, the value is popped from the operand stack and stored at index 1 of the local variable table (because the method is an instance method, index 0 holds this). The value in the PC register becomes 3.
The second and third steps then repeat: 8 is pushed onto the operand stack, then popped and stored in the local variable table. The value in the PC register changes 3 -> 5 -> 6.
When the instructions at addresses 6, 7, and 8 execute, the values at indexes 1 and 2 of the local variable table are loaded back onto the operand stack and added by iadd, and the result is stored on the operand stack. The value in the PC register changes 6 -> 7 -> 8 -> 9.
The instruction istore_3 pops the value from the operand stack and stores it at index 3 of the local variable table. The return instruction then executes and the method ends.
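
The steps above can be replayed with a toy interpreter. This is a teaching sketch, not how HotSpot's interpreter is implemented. The class TinyInterpreter and its fields are made up, and only the handful of opcodes used by test() are supported. The opcode values and the byte offsets of the instructions match what javac emits for this method.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class TinyInterpreter {
    // Real JVM opcode values for the few instructions used by test().
    static final int BIPUSH = 0x10, ILOAD_1 = 0x1b, ILOAD_2 = 0x1c,
            ISTORE_1 = 0x3c, ISTORE_2 = 0x3d, ISTORE_3 = 0x3e,
            IADD = 0x60, RETURN = 0xb1;

    final int[] locals = new int[4];                     // slot 0 would hold `this`
    final Deque<Integer> operands = new ArrayDeque<>();  // the operand stack
    final List<Integer> pcTrace = new ArrayList<>();     // every value the pc takes

    // Bytecode of test(): byte i = 15; int j = 8; int k = i + j;
    static int[] testMethodCode() {
        return new int[] {
            BIPUSH, 15,   // 0: push 15
            ISTORE_1,     // 2: i = 15
            BIPUSH, 8,    // 3: push 8
            ISTORE_2,     // 5: j = 8
            ILOAD_1,      // 6: push i
            ILOAD_2,      // 7: push j
            IADD,         // 8: pop both, push the sum
            ISTORE_3,     // 9: k = sum
            RETURN        // 10: end of method
        };
    }

    void run(int[] code) {
        int pc = 0;
        while (true) {
            pcTrace.add(pc);
            switch (code[pc]) {
                case BIPUSH:   operands.push(code[pc + 1]); pc += 2; break; // one-byte operand follows
                case ISTORE_1: locals[1] = operands.pop(); pc++; break;
                case ISTORE_2: locals[2] = operands.pop(); pc++; break;
                case ISTORE_3: locals[3] = operands.pop(); pc++; break;
                case ILOAD_1:  operands.push(locals[1]); pc++; break;
                case ILOAD_2:  operands.push(locals[2]); pc++; break;
                case IADD:     operands.push(operands.pop() + operands.pop()); pc++; break;
                case RETURN:   return;
                default: throw new IllegalStateException("unsupported opcode " + code[pc]);
            }
        }
    }

    public static void main(String[] args) {
        TinyInterpreter vm = new TinyInterpreter();
        vm.run(testMethodCode());
        System.out.println("pc trace: " + vm.pcTrace);   // [0, 2, 3, 5, 6, 7, 8, 9, 10]
        System.out.println("k (slot 3) = " + vm.locals[3]); // 23
    }
}
```

The pc jumps from 0 to 2 and from 3 to 5 because bipush occupies two bytes: the opcode and its one-byte operand.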

If the called method has a return value, that value is pushed onto the operand stack of the current stack frame (the caller's frame), and the PC register is updated to the next bytecode instruction to be executed.

Top-of-stack caching: the elements at the top of the stack are all cached in physical CPU registers, which reduces the number of memory reads and writes and improves the efficiency of the execution engine.

Dynamic Linking

When a Java source file is compiled into a bytecode file, all variable and method references are kept as symbolic references in the class file's constant pool. When the bytecode file is loaded into the virtual machine, some of its data, such as type information, field information, and method information, is put into the method area, and the constant pool of the bytecode file is put into the runtime constant pool in the method area. Each stack frame contains a reference to the runtime constant pool for the method the frame belongs to, and this reference is there to support dynamic linking of the current method's code. Dynamic linking is the process of converting symbolic references into direct references, which takes place in the resolution stage of linking during class loading.

Why do bytecode files need a constant pool? Bytecode needs supporting data, and this data is usually too large to be stored directly in the bytecode. Instead, symbolic references pointing to the data are kept in the constant pool of the bytecode file, so the bytecode only has to refer to the constant pool, and the corresponding data can be found and used at run time through dynamic linking.

Method Return Address

The method return address holds the value of the PC register of the method's caller. As we know, a method can end in one of two ways: by completing normally, or by exiting abnormally because of an unhandled exception. Either way, after the method exits, control returns to the place where the method was called so that the program can continue. When a method returns, some information may need to be kept in the stack frame to help restore the execution state of the calling method. When a method exits normally, the value of the caller's PC register serves as the return address, that is, the address of the instruction following the one that invoked the method. When a method exits because of an exception, the return address is determined by the exception table, and this information is generally not stored in the stack frame. In essence, a method exit is the process of popping the current stack frame. At this point, the local variable table and operand stack of the calling method must be restored, the return value pushed onto the operand stack of the caller's stack frame, the PC register set, and so on, so that the caller can continue to execute.

Depending on how the method completes, the exit is either a normal completion exit or an abnormal completion exit:

Normal completion exit: which return bytecode instruction is used depends on the type of the return value: ireturn (boolean, byte, char, short, and int), lreturn (long), freturn (float), dreturn (double), areturn (reference types), and return (void methods, instance initialization methods, and class and interface initialization methods).
Abnormal completion exit: if an exception occurs while a method is executing and it is not handled within the method, that is, no matching exception handler is found in the method's exception table, the method exits. The exception table stores the handlers to use when an exception is thrown during the execution of the method, so that the code handling an exception can be found easily when one occurs.

The essential difference between the two is that an abnormal completion exit does not return any value to the caller.
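
Both kinds of exit can be seen in a few lines. In this sketch (CompletionDemo, divide, and call are made-up names), divide completes normally when the divisor is non-zero and exits abnormally with an ArithmeticException otherwise, because it contains no handler of its own.

```java
class CompletionDemo {
    // Normal completion: the int result goes back to the caller via ireturn.
    // Abnormal completion: dividing by zero throws, and no value is returned.
    static int divide(int a, int b) {
        return a / b;
    }

    static String call(int a, int b) {
        try {
            int result = divide(a, b); // the return value lands on this frame's operand stack
            return "normal:" + result;
        } catch (ArithmeticException e) {
            // divide's frame has already been popped, and no value was handed back.
            return "abnormal:" + e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(call(6, 3)); // prints normal:2
        System.out.println(call(1, 0)); // prints abnormal:ArithmeticException
    }
}
```

Because call has a matching handler in its own exception table, the exception stops there, and call itself completes normally in both cases.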

Some additional information

Stack frames may also carry additional information specific to the Java virtual machine implementation, such as information that supports program debugging.

Native method stack

To talk about the native method stack, let's first introduce native methods.

Native Method

First of all, native methods are not part of the runtime data area. Their location is shown below:

Simply put, a native method is a Java interface for calling non-Java code: a method declared in Java whose implementation is written in a non-Java language. The purpose of the native interface is to bring programs in different programming languages together for use from Java. Its original intent was to integrate C/C++ programs. The native modifier can be combined with other Java method modifiers, but not with abstract.
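
Native methods are easy to spot with reflection. The sketch below checks whether a public method carries the native modifier. NativeMethodDemo and isNative are illustrative names, and the results assume an OpenJDK-based JDK, where Object.hashCode is declared native while String.length is ordinary Java code.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

class NativeMethodDemo {
    // Reports whether the given public method is declared with the `native` modifier.
    static boolean isNative(Class<?> type, String name, Class<?>... parameterTypes) {
        try {
            Method method = type.getMethod(name, parameterTypes);
            return Modifier.isNative(method.getModifiers());
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Object.hashCode native? " + isNative(Object.class, "hashCode"));
        System.out.println("String.length native?   " + isNative(String.class, "length"));
    }
}
```

A native method has no body in Java source. Its declaration ends with a semicolon, just like an abstract method, which is one reason the two modifiers cannot be combined.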

Why use native methods?

Interaction with environments outside Java: sometimes Java applications need to interact with environments outside of Java.
Interaction with the operating system: with native methods, Java can implement the interaction between the JRE and the underlying system.
Sun's Java: Sun's interpreter is implemented in C, which lets it interact with the outside world like any ordinary C program.

Native Method Stack

The native method stack manages calls to native methods and is thread-private. It can be implemented with either a fixed or a dynamically expandable memory size, and it behaves like the virtual machine stack with respect to stack overflow and out-of-memory errors. Concretely, native methods are registered in the native method stack, and the native method libraries are loaded when the execution engine executes them.

When a thread calls a native method, it enters a new world that is no longer restricted by the virtual machine, and it has the same permissions as the virtual machine:

A native method can access the runtime data area inside the virtual machine through the native method interface.
It can use the registers of the native processor directly.
It can allocate any amount of memory directly from the native memory heap.

That's all for today. If you are interested, you can follow the official account (GZH) “AH Q said code”! You can also add Ah Q as a friend at Qingqing-4132. Ah Q is looking forward to hearing from you!
