The Java Virtual Machine Stack

Demo code for the JVM learning path series: github.com/mtcarpenter…

Overview of the virtual machine stack

Because Java is designed to be cross-platform, its instruction set is stack-based. Different platforms have different CPU architectures, so the instructions cannot be designed around registers. The advantages of this design are portability, a small instruction set, and a compiler that is easy to implement. The disadvantage is lower performance, because the same operation takes more instructions. Many Java developers have only a coarse-grained picture of JVM memory: the Java heap and the Java stack. Why is that? Because the stack is the unit of execution at run time, and the heap is the unit of storage.

The stack handles program execution: how the program runs and how it processes data.
The heap handles data storage: how and where data is kept.

What is a Java virtual machine stack

The Java Virtual Machine Stack, also known as the Java stack, is created for each thread when the thread is created. It holds stack frames, one for each Java method invocation.
- It is thread-private

The life cycle

The lifetime of the stack is the same as that of its thread: when the thread terminates, its virtual machine stack is destroyed.

Role

The stack manages the execution of a Java program. It holds the local variables and partial results of methods, and takes part in method invocation and return.

Local variables, as opposed to member variables (fields)
Primitive-type variables vs. reference-type variables (classes, arrays, interfaces)

The characteristics of the stack

The stack is a fast and efficient way to allocate storage, second only to the program counter in access speed.
The JVM performs only two operations directly on the Java stack:
- When a method starts executing, its frame is pushed onto the stack
- When the method finishes, its frame is popped off the stack
The stack has no garbage collection problem, but it can overflow.
- GC: no; OOM: yes

What exceptions are encountered during development?

Possible exceptions in the stack

The Java Virtual Machine specification allows the Java stack to be either fixed in size or dynamically expandable.
- With fixed-size stacks, the stack size of each thread can be chosen independently when the thread is created. If a thread requires more stack capacity than the maximum allowed, the Java virtual machine throws a StackOverflowError.
- With dynamically expandable stacks, the Java virtual machine throws an OutOfMemoryError if it cannot obtain enough memory when trying to expand a stack, or if there is not enough memory to create the stack for a new thread.

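/**
 * Recurses without a base case until the stack overflows.
 * Stack size -Xss256k: count reached 2209
 */
public class StackErrorTest {
    private static int count = 1;

    public static void main(String[] args) {
        System.out.println(count++);
        main(args); // each call pushes another frame until StackOverflowError
    }
}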

When I ran this code on Windows 10 with an i7 CPU, the stack overflowed when the call depth reached 9788. The -Xss option sets the maximum stack size of a thread, and the stack size directly determines the maximum depth of method calls.
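
The link between stack size and call depth can also be probed from inside a program. The following is a minimal sketch, not a benchmark: the class name StackDepthProbe and its methods are made up for illustration, and it relies on the Thread constructor that accepts a stackSize argument, which the JDK documents as a hint that some platforms may ignore.

```java
// Minimal sketch: count how many frames fit before StackOverflowError is thrown.
class StackDepthProbe {
    private int depth = 0;

    private void recurse() {
        depth++;
        recurse(); // each call pushes one more stack frame
    }

    /** Runs the recursion on a new thread whose stack size is requested in bytes. */
    static int maxDepth(long stackSizeBytes) {
        StackDepthProbe probe = new StackDepthProbe();
        Thread t = new Thread(null, () -> {
            try {
                probe.recurse();
            } catch (StackOverflowError e) {
                // expected: this thread's stack is exhausted
            }
        }, "probe", stackSizeBytes);
        t.start();
        try {
            t.join(); // join makes the final depth visible to the caller
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return probe.depth;
    }

    public static void main(String[] args) {
        System.out.println("256 KB stack: depth " + maxDepth(256 * 1024));
        System.out.println("1 MB stack:   depth " + maxDepth(1024 * 1024));
    }
}
```

The exact numbers vary with the JVM, the platform, and the size of each frame, so treat them only as a rough comparison.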

The storage unit of a stack

Each thread has its own stack, and the data in the stack exists in the form of stack frames.
Each method being executed on the thread corresponds to one stack frame.
A stack frame is a block of memory: a data set that holds the various pieces of information needed while the method executes.

What is stored in the stack?


Basic OOP concepts: classes and objects. The basic members of a class are fields (also called properties or attributes) and methods.

The only two operations the JVM performs directly on the Java stack are pushing and popping stack frames, following the "first in, last out" (or "last in, first out", LIFO) principle.
In an active thread, only one stack frame is active at any moment: the frame of the currently executing method, at the top of the stack. This frame is called the current stack frame, the method it belongs to is the current method, and the class that defines that method is the current class.
All bytecode instructions run by the execution engine operate only on the current stack frame.
If the current method calls another method, a new stack frame is created and pushed onto the top of the stack, where it becomes the new current frame.

A simple code test

/**
 * @author shkstart
 * @create 2020 4:11 PM
 *
 * A method can end in two ways:
 * (1) normally, represented by return
 * (2) abnormally, by throwing an exception that is not caught or handled
 */
public class StackFrameTest {
    public static void main(String[] args) {
        try {
            StackFrameTest test = new StackFrameTest();
            test.method1();
        } catch (Exception e) {
            e.printStackTrace();
        }
        System.out.println("main() ends normally");
    }

    public void method1() {
        System.out.println("method1() starts executing...");
        method2();
        System.out.println("method1() finished executing...");
        // System.out.println(10 / 0);
        // return; // can be omitted
    }

    public int method2() {
        System.out.println("method2() starts executing...");
        int i = 10;
        int m = (int) method3();
        System.out.println("method2() is about to finish...");
        return i + m;
    }

    public double method3() {
        System.out.println("method3() starts executing...");
        double j = 20.0;
        System.out.println("method3() is about to finish...");
        return j;
    }
}

The output is:

method1() starts executing...
method2() starts executing...
method3() starts executing...
method3() is about to finish...
method2() is about to finish...
method1() finished executing...
main() ends normally

The output shows the first-in, last-out order of the stack. You can inspect the frames with the debugger in IntelliJ IDEA.
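
The frames can also be listed from code instead of a debugger. This is a small sketch that assumes Java 9 or later, where the StackWalker API is available. The class FrameWalkDemo and its method names are invented for the example.

```java
import java.util.List;
import java.util.stream.Collectors;

class FrameWalkDemo {
    static List<String> outer()  { return middle(); }
    static List<String> middle() { return inner(); }

    static List<String> inner() {
        // Walk from the current (top) frame downward and keep the first three method names.
        return StackWalker.getInstance().walk(frames ->
                frames.limit(3)
                      .map(StackWalker.StackFrame::getMethodName)
                      .collect(Collectors.toList()));
    }

    public static void main(String[] args) {
        System.out.println(outer());
    }
}
```

The list starts with inner, the current frame, followed by middle and outer: the most recently pushed frame comes first.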

Stack operation principle

Stack frames in different threads are not allowed to reference each other: a frame cannot refer to a frame that belongs to another thread.
If the current method was called by another method, then when it returns, the current frame passes its result back to the previous frame. The virtual machine then discards the current frame, and the previous frame becomes the current frame again.
A Java method can return in two ways: normally, through a return instruction, or by throwing an exception. Either way, its stack frame is popped.

Each stack frame stores:

Local variable table (Local Variables)
Operand stack (Operand Stack, also called the expression stack)
Dynamic linking (Dynamic Linking, a reference to the method in the runtime constant pool)
Method return address (Return Address, which defines how the method exits normally or abnormally)
Some additional information

The stacks of concurrent threads are private: each thread has its own stack, and each stack contains many stack frames. The size of a stack frame is determined mainly by its local variable table and operand stack.

Local variable table

The local variable table (Local Variables) is also called the local variable array.
It is defined as a numeric array, used mainly to store the method parameters and the local variables declared in the method body. The data types it holds include the primitive types, object references, and the returnAddress type.
Because the local variable table lives on the thread's stack, it is private to the thread, so it has no thread-safety problems.
The size of the local variable table is determined at compile time and stored in the max_locals (maximum local variables) item of the method's Code attribute. It does not change while the method executes.
The number of nested calls a method can make is limited by the stack size. In general, the larger the stack, the deeper the calls can nest. The more parameters and local variables a method has, the larger its local variable table and therefore its stack frame. Each call then takes more stack space, so fewer nested calls fit.
Variables in the local variable table are valid only during the current method invocation. The virtual machine uses the local variable table to pass argument values to the parameter list when a method executes. When the invocation ends, the local variable table is destroyed together with the stack frame.

Understanding slots

Parameter values are stored starting at index 0 of the local variable array, and the array ends at index length - 1.
The basic storage unit of the local variable table is the slot (variable slot). The table stores the eight primitive types known at compile time, reference types, and the returnAddress type.
In the local variable table, a type of 32 bits or fewer (including returnAddress) occupies one slot, and a 64-bit type (long or double) occupies two slots.
- byte, short, and char are converted to int before being stored. boolean is also converted to int, where 0 means false and non-zero means true.
- long and double occupy two slots each.
The JVM assigns an index to each slot in the local variable table, and a local variable's value is accessed through that index.
When an instance method is called, its parameters and the local variables declared in its body are copied into the slots of the local variable table in order.
To access a 64-bit local variable (a long or a double), use the index of the first of its two slots.
If the current frame was created by a constructor or an instance method, the reference to this is stored in the slot at index 0, and the remaining arguments follow in parameter-list order.
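
The slot rules above can be written down as a small calculation. The sketch below is illustrative only: SlotCounter and its methods are hypothetical helpers, not a JVM API, and they count only the slots taken by the receiver and the parameters.

```java
class SlotCounter {
    /** Slots used by one value: long and double take two, everything else takes one. */
    static int slotsOf(Class<?> type) {
        return (type == long.class || type == double.class) ? 2 : 1;
    }

    /** Slots taken by the receiver (if any) plus the parameters, in declaration order. */
    static int parameterSlots(boolean isStatic, Class<?>... parameterTypes) {
        int slots = isStatic ? 0 : 1; // slot 0 holds this for instance methods and constructors
        for (Class<?> type : parameterTypes) {
            slots += slotsOf(type);
        }
        return slots;
    }

    public static void main(String[] args) {
        // Hypothetical instance method: void m(int a, long b, double c, String d)
        System.out.println(parameterSlots(false, int.class, long.class, double.class, String.class)); // 7
        // The same parameters on a static method: no slot for this
        System.out.println(parameterSlots(true, int.class, long.class, double.class, String.class));  // 6
    }
}
```

For the instance method, the slots are this (0), a (1), b (2-3), c (4-5), and d (6), which gives seven in total.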

Slot reuse

The slots of the local variable table in a stack frame can be reused. Once a local variable goes out of scope, a new local variable declared after that scope is likely to reuse its slot, which saves resources.

public class SlotTest {
    public void localVar1() {
        int a = 0;
        System.out.println(a);
        int b = 0;
    }

    public void localVar2() {
        {
            int a = 0;
            System.out.println(a);
        }
        // a is out of scope here, so b reuses the slot that a occupied
        int b = 0;
    }
}

Static variables versus local variables

Classification of variables:

By data type: primitive types and reference types
By where they are declared in the class: member variables (class variables and instance variables) and local variables
- Class variables: given default values during the preparation stage of linking, then explicitly assigned during the initialization phase (including assignments in static initializer blocks)
- Instance variables: allocated in the heap when the object is created, and given default values
- Local variables: must be explicitly assigned before use, otherwise compilation fails
After the slots for the parameter list have been allocated, the remaining slots are allocated according to the order and scope of the variables declared in the method body.
Class variables have two chances to be initialized. The first is the preparation phase, when the system sets each class variable to its zero value. The second is the initialization phase, when the variable receives the initial value the programmer defined in the code.
Local variables are different: the local variable table has no system initialization, so once a local variable is declared it must be initialized explicitly before it can be used.
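
A short example makes the difference visible. The class InitDemo and its members are made up for illustration.

```java
class InitDemo {
    static int classCounter;  // class variable: set to 0 in the preparation phase
    static String label;      // class variable of reference type: defaults to null
    int instanceCounter;      // instance variable: set to 0 when the object is allocated

    static int sumWithLocal() {
        int local = 5;        // a local variable must be assigned before it is read
        // int unassigned;
        // return unassigned; // would not compile: the variable might not have been initialized
        return classCounter + local;
    }

    public static void main(String[] args) {
        System.out.println(classCounter);                   // 0
        System.out.println(label);                          // null
        System.out.println(new InitDemo().instanceCounter); // 0
        System.out.println(sumWithLocal());                 // 5
    }
}
```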

Additional notes

Of all the parts of a stack frame, the local variable table is the one most closely tied to performance tuning. The virtual machine uses it to pass arguments when a method executes.
Variables in the local variable table are also important garbage collection roots: an object is not collected as long as it is referenced, directly or indirectly, from a local variable table.

The operand stack

Each stack frame contains, in addition to the local variable table, a last-in-first-out operand stack (Operand Stack), also known as the expression stack (Expression Stack).
While a method executes, data is written to and read from the operand stack according to the bytecode instructions, that is, pushed and popped.
- Some bytecode instructions push values onto the operand stack. Others pop operands off the stack, use them, and push the result back
- Examples include copy, swap, and sum operations

The testAddOperation() method traced later in this section is compiled into such bytecode.

The operand stack is used mainly to hold intermediate results of a computation and to serve as temporary storage for variables during the computation.
The operand stack is the work area of the JVM execution engine. When a method starts executing, a new stack frame is created, and the operand stack of that frame is empty.
- Empty does not mean unsized: the stack is backed by an array whose length is fixed once it is created
Each operand stack has a definite maximum depth for storing values. That depth is determined at compile time and stored in the max_stack item of the method's Code attribute.
An element of the operand stack can be of any Java data type.
- A 32-bit type occupies one unit of stack depth
- A 64-bit type occupies two units of stack depth
The operand stack is not accessed by index. Data can be accessed only through the standard push and pop operations.
If the called method has a return value, that value is pushed onto the operand stack of the current frame, and the PC register is updated to the next bytecode instruction to execute.
The data types of the elements on the operand stack must match the sequence of bytecode instructions exactly. The compiler checks this at compile time, and it is checked again by data flow analysis in the verification stage of class loading.
When we say that the interpreter of the Java virtual machine is a stack-based execution engine, the stack in question is the operand stack.

Code tracking

We trace the following code through the operand stack:

    public void testAddOperation() {
        byte i = 15;
        int j = 8;
        int k = i + j;
    }

Decompile the class file with the javap command: javap -v ClassName.class

  public void testAddOperation();
    descriptor: ()V
    flags: ACC_PUBLIC
    Code:
      stack=2, locals=4, args_size=1
         0: bipush        15
         2: istore_1
         3: bipush        8
         5: istore_2
         6: iload_1
         7: iload_2
         8: iadd
         9: istore_3
        10: return


byte, short, char, and boolean are all stored internally as int. As the bytecode above shows, the operands 15 and 8 are both pushed with bipush and added with iadd. The i prefix stands for int, so iadd is an addition of int values.

The execution proceeds as follows:

When the first instruction executes, the PC register holds 0, the address of that instruction, and bipush pushes the operand 15 onto the operand stack. The PC register then advances to the next instruction, istore_1, which pops the top of the operand stack and stores it in slot 1 of the local variable table. The local variable table now holds one more element.

Why doesn't the local variable table start at 0? It does start at 0, but slot 0 holds the this reference, so it is left out of the walkthrough.

The PC register then advances again. The operand 8 is pushed in the same way and stored in slot 2 of the local variable table by istore_2. Next, iload_1 and iload_2 load the two values from the local variable table back onto the operand stack. iadd pops both, adds them, and pushes the sum, which istore_3 stores in slot 3 of the local variable table. Finally the PC register points to offset 10, the return instruction, and the method exits.
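
The walkthrough can be replayed with a toy model of a frame. This is only a simulation for illustration, not how HotSpot is implemented: the class TinyFrame is invented here, it handles int values only, and slot 0 (which would hold this) is simply left unused.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class TinyFrame {
    final int[] locals = new int[4];                     // locals=4 in the javap output
    final Deque<Integer> operands = new ArrayDeque<>();  // operand stack (stack=2 in the real method)

    void bipush(int value) { operands.push(value); }
    void istore(int slot)  { locals[slot] = operands.pop(); }
    void iload(int slot)   { operands.push(locals[slot]); }
    void iadd() {
        int right = operands.pop();
        int left = operands.pop();
        operands.push(left + right);
    }

    /** Replays the bytecode of testAddOperation() step by step. */
    static TinyFrame runTestAddOperation() {
        TinyFrame frame = new TinyFrame();
        frame.bipush(15); //  0: bipush 15
        frame.istore(1);  //  2: istore_1
        frame.bipush(8);  //  3: bipush 8
        frame.istore(2);  //  5: istore_2
        frame.iload(1);   //  6: iload_1
        frame.iload(2);   //  7: iload_2
        frame.iadd();     //  8: iadd
        frame.istore(3);  //  9: istore_3
        return frame;     // 10: return
    }

    public static void main(String[] args) {
        TinyFrame frame = runTestAddOperation();
        System.out.println("k = " + frame.locals[3]); // k = 23
    }
}
```

At the end the operand stack is empty again and slot 3 holds 23, which matches the trace above.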

Top-of-stack caching

As mentioned earlier, the zero-address instructions of a stack-based virtual machine are more compact, but completing an operation takes more push and pop instructions. That means more instruction dispatches and more memory reads and writes. Because operands are stored in memory, frequent memory access inevitably slows execution. To address this, the designers of the HotSpot JVM proposed top-of-stack caching (ToS, Top-of-Stack Caching): the elements at the top of the stack are cached in physical CPU registers, which reduces memory reads and writes and improves the efficiency of the execution engine.

Registers: fewer instructions and faster execution.

Dynamic linking

Dynamic linking (Dynamic Linking), the method return address, and the additional information are sometimes grouped together as the frame data area.

Each stack frame contains a reference to the method it belongs to in the runtime constant pool. The reference exists so that the code of the current method can be dynamically linked, for example by an invokedynamic instruction.
When a Java source file is compiled into a bytecode file, all variable and method references are stored as symbolic references in the constant pool of the class file.

For example, when one method calls another, the call is expressed as a symbolic reference to the target method in the constant pool. Dynamic linking converts these symbolic references into direct references to the methods being called.

Why do we need a runtime constant pool?
- The same constant or method may be used in different methods, so storing a single copy saves space
- The constant pool provides the symbols and constants that instructions refer to
Method invocation: resolution and dispatch

In the JVM, converting a symbolic reference into a direct reference to the invoked method is tied to the method binding mechanism.

Linking

Static linking

When a bytecode file is loaded into the JVM, if the target method is known at compile time and stays the same at run time, converting the symbolic reference of the invoked method into a direct reference is called static linking.

Dynamic linking

If the target method cannot be determined at compile time, its symbolic reference can only be converted into a direct reference while the program runs. Because this conversion is dynamic, it is called dynamic linking.

Binding mechanism

The corresponding binding mechanisms are early binding (Early Binding) and late binding (Late Binding). Binding is the process of replacing a symbolic reference to a field, method, or class with a direct reference, and it happens only once.

Early binding

In early binding, the target method is known at compile time and stays the same at run time, so the method can be bound to the type it belongs to. Because it is clear exactly which method is the target, the symbolic reference can be converted into a direct reference by static linking.

Late binding

If the target method cannot be determined at compile time and can only be bound at run time according to the actual type, the binding is called late binding.
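
The two kinds of binding can be contrasted in a few lines. In this sketch (all names are invented for the example), the call to a static overload is chosen from the declared type at compile time, while the call to an overridden instance method is chosen from the actual object at run time.

```java
class BindingDemo {
    static class Animal {
        String speak() { return "..."; }
    }

    static class Dog extends Animal {
        @Override
        String speak() { return "woof"; }
    }

    // Two static overloads: the compiler picks one from the declared type of the argument.
    static String describe(Animal a) { return "describe(Animal)"; }
    static String describe(Dog d)    { return "describe(Dog)"; }

    static String earlyBound() {
        Animal a = new Dog();
        return describe(a); // bound at compile time to describe(Animal)
    }

    static String lateBound() {
        Animal a = new Dog();
        return a.speak();   // bound at run time to Dog.speak()
    }

    public static void main(String[] args) {
        System.out.println(earlyBound()); // describe(Animal)
        System.out.println(lateBound());  // woof
    }
}
```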

The history of early and late binding

More and more high-level languages today are object-oriented in the way Java is. Although they differ in syntax and style, they all support the object-oriented features of encapsulation, inheritance, and polymorphism. Because these languages are polymorphic, they naturally have both kinds of binding: early and late. Every ordinary method in Java behaves like a virtual function, the equivalent of a C++ function explicitly declared with the virtual keyword. If you do not want a method in a Java program to behave like a virtual function, mark it with the final keyword.

Virtual methods and non-virtual methods

If the version of a method to invoke is determined at compile time and cannot change at run time, the method is called a non-virtual method.
Static methods, private methods, final methods, instance constructors, and superclass methods are all non-virtual.
All other methods are called virtual methods.

Preconditions for polymorphism through subclass objects:
- A class inheritance relationship
- Method overriding

The virtual machine provides the following method invocation instructions:

Ordinary invocation instructions:

invokestatic: invokes static methods. The unique method version is determined in the resolution phase
invokespecial: invokes <init> methods, private methods, and superclass methods. The unique method version is determined in the resolution phase
invokevirtual: invokes all virtual methods
invokeinterface: invokes interface methods

Dynamic invocation instruction:

invokedynamic: dynamically resolves the method to invoke, then executes it
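
The sketch below places one call of each kind in a single class. The names are invented, and the instruction named in each comment is what a typical javac is expected to emit for that call; the actual bytecode can differ between compiler versions, so check it with javap -v.

```java
import java.util.function.Supplier;

class InvokeKinds {
    interface Greeter { String greet(); }

    static class Base implements Greeter {
        public String greet() { return "base"; }
    }

    static class Derived extends Base {
        @Override
        public String greet() {
            return super.greet() + "+derived";    // superclass call: invokespecial
        }
    }

    static String helper() { return "static"; }

    static String run() {
        Derived d = new Derived();                // constructor <init>: invokespecial
        Greeter g = d;
        Supplier<String> lambda = () -> "lambda"; // lambda creation: invokedynamic (Java 8+)
        String a = helper();                      // invokestatic
        String b = d.greet();                     // invokevirtual
        String c = g.greet();                     // invokeinterface
        String e = lambda.get();                  // invokeinterface on Supplier
        return String.join(",", a, b, c, e);
    }

    public static void main(String[] args) {
        System.out.println(run()); // static,base+derived,base+derived,lambda
    }
}
```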

The first four instructions are hard-wired inside the virtual machine, and the method is selected without user intervention, whereas the invokedynamic instruction lets the user determine the method version. Methods invoked by invokestatic and invokespecial are non-virtual. The rest, except those marked final, are virtual.

The invokedynamic instruction

The JVM bytecode instruction set stayed relatively stable until Java 7 added the invokedynamic instruction, an improvement made to support dynamically typed languages.
Java 7, however, offered no way to generate invokedynamic instructions directly from Java source. They had to be produced with a low-level bytecode tool such as ASM. Only with the arrival of lambda expressions in Java 8 did Java source code compile directly to invokedynamic instructions.
The dynamic language support added in Java 7 is essentially a change to the Java Virtual Machine specification, not to the rules of the Java language. It is a relatively complex area. The added form of method invocation benefits most directly the compilers of dynamic languages that run on the Java platform.
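
Java 7 also introduced the java.lang.invoke package alongside the new instruction. The sketch below uses a MethodHandle from that package to look up and call a method at run time. It is a related API rather than the invokedynamic instruction itself, and the class MethodHandleDemo is invented for the example.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class MethodHandleDemo {
    /** Looks up String.length() at run time and invokes it through a method handle. */
    static int lengthViaHandle(String text) {
        try {
            MethodHandle length = MethodHandles.lookup()
                    .findVirtual(String.class, "length", MethodType.methodType(int.class));
            return (int) length.invoke(text);
        } catch (Throwable t) {
            // MethodHandle.invoke is declared to throw Throwable
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(lengthViaHandle("hello")); // 5
    }
}
```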

Dynamically typed and statically typed languages differ in whether types are checked at compile time or at run time: checking at compile time makes a language statically typed, and checking at run time makes it dynamically typed. Put more plainly, a statically typed language checks the type information of the variable itself, while a dynamically typed language checks the type information of the variable's value. In a dynamic language, variables carry no type information. Only values do, which is an important feature of dynamic languages.

Java: String info = "mogu blog";
JS: var name = "shkstart"; var name = 10; (checked only at run time)

The nature of method overriding

The nature of method overriding in the Java language:

1. Find the actual type of the object referenced by the first element at the top of the operand stack, and call it C.
2. If a method is found in type C whose descriptor and simple name both match those in the constant pool, check its access permission. If the check passes, return a direct reference to the method and end the search. If it fails, throw a java.lang.IllegalAccessError.
3. Otherwise, repeat the search and check of step 2 on each superclass of C, from the bottom of the inheritance hierarchy upward.
4. If no suitable method is found, throw a java.lang.AbstractMethodError.

About IllegalAccessError

The program attempted to access or modify a field, or to call a method, that it does not have access to. Normally this is caught as a compile-time error. If the error occurs at run time, it means a class has changed in an incompatible way.
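
The search described in this section can be imitated with reflection. This is a simplified sketch, not the JVM's own algorithm: the class OverrideLookup is invented, it matches only no-argument methods by name, and it skips the access check of step 2.

```java
import java.lang.reflect.Method;

class OverrideLookup {
    static class Animal { public String speak() { return "..."; } }
    static class Dog extends Animal { @Override public String speak() { return "woof"; } }
    static class Puppy extends Dog { }

    /** Walks from the actual class C upward and returns the first class that declares the method. */
    static Class<?> resolve(Object receiver, String name) {
        for (Class<?> c = receiver.getClass(); c != null; c = c.getSuperclass()) {
            for (Method m : c.getDeclaredMethods()) {
                if (m.getName().equals(name) && m.getParameterCount() == 0) {
                    return c; // a match in C or in one of its superclasses
                }
            }
        }
        throw new AbstractMethodError(name); // no suitable method found
    }

    public static void main(String[] args) {
        System.out.println(resolve(new Puppy(), "speak").getSimpleName());    // Dog
        System.out.println(resolve(new Puppy(), "toString").getSimpleName()); // Object
    }
}
```

Puppy declares no speak() of its own, so the search moves up to Dog. For toString() it continues all the way to Object.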

Method invocation: the virtual method table

Dynamic dispatch is used very frequently in object-oriented programming. Searching the class's method metadata for the right target on every dynamic dispatch could hurt execution efficiency. To improve performance, the JVM builds a virtual method table in the method area of the class (non-virtual methods do not appear in it) and uses an index into the table instead of a search.
Each class has a virtual method table that holds the actual entry point of each method.

When is the virtual method table created?

The virtual method table is created and initialized during the linking phase of class loading. Once the initial values of the class's variables have been prepared, the JVM also initializes the method table of that class. If a class overrides a method, its table entry points to the class's own implementation. Otherwise the entry points to the inherited implementation, for example the one in Object.
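
The idea can be modeled with a map. This is a conceptual sketch only: a real virtual method table is an indexed array of entry points inside the JVM, and the class VTableSketch, its methods, and the string labels are all invented for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class VTableSketch {
    /** A toy table for Object, mapping a method name to the implementation that would run. */
    static Map<String, String> objectTable() {
        Map<String, String> table = new LinkedHashMap<>();
        table.put("toString", "Object.toString");
        table.put("hashCode", "Object.hashCode");
        return table;
    }

    /** Builds a subclass table: copy the parent's entries, then replace or add the overridden ones. */
    static Map<String, String> derive(Map<String, String> parent, String owner, String... declared) {
        Map<String, String> table = new LinkedHashMap<>(parent);
        for (String method : declared) {
            table.put(method, owner + "." + method);
        }
        return table;
    }

    public static void main(String[] args) {
        Map<String, String> dog = derive(objectTable(), "Dog", "toString", "speak");
        System.out.println(dog);
    }
}
```

In the Dog table, toString points to Dog's own implementation, hashCode still points to Object's, and speak is a new entry.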

Method return address

The method return address holds the value of the PC register of the method's caller.
A method can end in two ways:
- Normal completion
- An unhandled exception, which is an abnormal exit
Either way, after the method exits, control returns to the place where the method was called. When a method exits normally, the caller's PC register value serves as the return address, that is, the address of the instruction that follows the call. When a method exits because of an exception, the return address is determined through the exception table, and this information is generally not stored in the stack frame.
Once a method starts executing, there are only two ways it can exit:
When the execution engine encounters any of the bytecode return instructions, the return value is passed to the method's caller. This is called normal completion.
- Which return instruction is used after a normal invocation depends on the actual data type of the method's return value.
- The return instructions include ireturn (for boolean, byte, char, short, and int return values), lreturn (long), freturn (float), dreturn (double), and areturn (references). There is also the return instruction, used by methods declared void, instance initializers, and class and interface initializers.
If an exception occurs while the method executes and it is not handled inside the method, that is, no matching handler is found in the method's exception table, the method exits. This is called abnormal completion.
- The handlers for exceptions thrown during execution are recorded in an exception table, which makes it easy to find the handling code when an exception occurs.

In essence, exiting a method means popping the current stack frame. At that point the caller's local variable table and operand stack are restored, the return value (if any) is pushed onto the operand stack of the caller's frame, and the PC register is set so that the caller continues executing.
The difference between normal completion and abnormal completion is that a method that exits through an exception returns no value to its caller.
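
Both kinds of exit can be observed from the caller's side. In this small sketch (CompletionDemo and its methods are invented names), the same method either hands a value back or exits through an exception that the caller handles.

```java
class CompletionDemo {
    static int divide(int a, int b) {
        return a / b; // normal completion: the int result goes back to the caller
    }

    static String callDivide(int a, int b) {
        try {
            int result = divide(a, b);
            return "returned " + result;
        } catch (ArithmeticException e) {
            // divide() completed abnormally: its frame was popped and no value came back
            return "threw " + e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(callDivide(6, 3)); // returned 2
        System.out.println(callDivide(1, 0)); // threw ArithmeticException
    }
}
```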

Some additional information

A stack frame may also carry additional information specific to the Java virtual machine implementation, for example information that supports debugging.

Stack-related interview questions

Give an example of a stack overflow. (StackOverflowError)
- Set the stack size with -Xss
Can adjusting the stack size guarantee that no overflow occurs?
- No, it cannot guarantee that
Is more stack memory always better?
- No. It lowers the chance of overflow for a while, but it takes space away from other threads, because total memory is limited.
Does garbage collection involve the virtual machine stack?
- No
Are local variables defined in a method thread-safe?
- It depends on the case

/**
 * Are local variables defined in methods thread-safe? It depends.
 *
 * What is thread-safe?
 * If only one thread can operate on the data, it is thread-safe.
 * If more than one thread operates on the data, the data is shared.
 * Without synchronization, shared data has thread-safety problems.
 */
public class StringBuilderTest {

    int num = 10;

    // s1 is thread-safe: it is created and used only inside this method
    public static void method1() {
        // StringBuilder itself is not thread-safe
        StringBuilder s1 = new StringBuilder();
        s1.append("a");
        s1.append("b");
        // ...
    }

    // sBuilder is not thread-safe: it comes from outside and may be shared
    public static void method2(StringBuilder sBuilder) {
        sBuilder.append("a");
        sBuilder.append("b");
        // ...
    }

    // s1 is not thread-safe: it is returned and may be shared by callers
    public static StringBuilder method3() {
        StringBuilder s1 = new StringBuilder();
        s1.append("a");
        s1.append("b");
        return s1;
    }

    // s1 is thread-safe: only the String escapes, and s1 dies inside the method
    public static String method4() {
        StringBuilder s1 = new StringBuilder();
        s1.append("a");
        s1.append("b");
        return s1.toString();
    }

    public static void main(String[] args) {
        StringBuilder s = new StringBuilder();

        // the new thread and the main thread append to the same builder
        new Thread(() -> {
            s.append("a");
            s.append("b");
        }).start();

        method2(s);
    }
}
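
For contrast with the shared builder in main above, the following sketch keeps each StringBuilder confined to the frame that created it, in the style of method4. The class ConfinedBuilderDemo and the thread count are arbitrary choices for illustration.

```java
class ConfinedBuilderDemo {
    static String build() {
        StringBuilder sb = new StringBuilder(); // reachable only from this frame
        for (int i = 0; i < 1000; i++) {
            sb.append("ab");
        }
        return sb.toString();                   // only the immutable String escapes
    }

    /** Runs build() on several threads and checks that every thread got the same result. */
    static boolean allThreadsAgree(int threadCount) {
        String expected = build();
        boolean[] ok = new boolean[threadCount];
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            final int id = i;
            threads[i] = new Thread(() -> ok[id] = build().equals(expected));
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        for (boolean result : ok) {
            if (!result) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(allThreadsAgree(8)); // true
    }
}
```

Each thread works on its own builder inside its own stack frame, so no synchronization is needed.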

To sum up, an object is thread-safe if it is created and dies inside the method and is never returned to the outside. Otherwise it is not thread-safe.

Do errors and GC occur in the run-time data areas?

Runtime data area	Error?	GC?
Program counter	no	no
Virtual machine stack	yes	no
Native method stack	yes	no
Method area	yes (OOM)	yes
Heap	yes	yes
Conclusion

To briefly review this chapter: what is the Java virtual machine stack? Each thread creates a virtual machine stack when it is created, and the stack holds one stack frame for each Java method invocation. The stack is a fast and efficient way to allocate storage, second only to the program counter in access speed. Stack frames are pushed and popped on a "first in, last out" (or "last in, first out") basis. Stack frames in different threads are not allowed to reference each other. With top-of-stack caching, the JVM caches the elements at the top of the stack in physical CPU registers, which reduces memory reads and writes and improves the efficiency of the execution engine. As for the method return address, a method can end in two ways: normal completion, or an abnormal exit caused by an unhandled exception.

Feel free to follow the public account Shanma carpenter. I am Xiao Chun, a Java back-end developer who also does a little front-end work, and I share technical articles on a regular basis. If this article helped you, please follow, like, and share. See you next time!