What is a virtual machine stack?

A stack, unlike an array or a linked list, is an abstract data structure that can be implemented on top of either one, so it is relatively easy to implement. Its operations are also simple. There are mainly two: push (onto the stack) and pop (off the stack). This simplicity is one reason the JVM adopted it.
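As a minimal sketch of the point above, a stack can be layered over a plain array. The class and method names (IntArrayStack, push, pop) are illustrative assumptions, and the stack only handles int values to keep it short:

```java
import java.util.Arrays;
import java.util.NoSuchElementException;

// Minimal array-backed stack of ints; names are illustrative only.
class IntArrayStack {
    private int[] items = new int[4];
    private int size = 0;

    // push: place a value on top, growing the backing array when it is full
    void push(int value) {
        if (size == items.length) {
            items = Arrays.copyOf(items, size * 2);
        }
        items[size++] = value;
    }

    // pop: remove and return the most recently pushed value
    int pop() {
        if (size == 0) {
            throw new NoSuchElementException("stack is empty");
        }
        return items[--size];
    }

    boolean isEmpty() {
        return size == 0;
    }

    public static void main(String[] args) {
        IntArrayStack stack = new IntArrayStack();
        stack.push(1);
        stack.push(2);
        stack.push(3);
        System.out.println(stack.pop()); // 3 (last in, first out)
        System.out.println(stack.pop()); // 2
    }
}
```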

  • Why does the JVM choose the stack?

CPU architectures differ from one another, and a register-based instruction set is tightly coupled to the CPU it targets.

A stack-based instruction set is small, which makes the compiler easier to implement and makes it easier to stay cross-platform. The disadvantage is equally clear: the same operation usually takes more instructions, so performance is generally worse than with a register-based design. For the sake of portability, however, the JVM chose the stack.

  • The characteristics of the stack

The virtual machine stack manages the running of Java programs. It holds the local variables of methods (primitive values and object references) and partial results, and it takes part in method invocation and return. The virtual machine stack is implemented on top of arrays and is second only to the PC register in access speed. Because its only operations are push and pop, the stack has no garbage collection problem, but it can still run out of space (a stack has a maximum depth), which shows up as a StackOverflowError or an OutOfMemoryError.

  • What’s the difference between the stack and the heap?

The stack is the unit of execution, whereas the heap is the unit of storage.

The stack deals with how the program executes and how it processes data, while the heap deals with how and where the data is stored.

Each thread gets its own virtual machine stack when it is created. The stack holds stack frames, one for each method invocation. Heap memory is shared by all threads.

What is stack frame?

The stack frame of the method that is currently executing is called the current stack frame. All bytecode instructions run by the execution engine operate only on the current stack frame.

A stack frame is a block of memory, a data set that holds various data during the execution of a method.

If the current method calls another method, a stack frame for the new method is created and pushed onto the stack, where it becomes the new current stack frame.

A Java method can complete in two ways. One is normal completion through a return instruction; the other is abrupt completion by throwing an exception that the method does not handle. Either way, the stack frame is popped.
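The push and pop of frames can be observed indirectly. The sketch below assumes that the length of the current thread's stack trace tracks the number of live frames, which holds on common JVMs but is not guaranteed by the API. The class and method names are hypothetical.

```java
// Illustrative sketch: watch frames come and go through the stack-trace depth.
class FrameDepthDemo {
    // Number of frames reported for the current thread (includes this helper).
    static int depth() {
        return Thread.currentThread().getStackTrace().length;
    }

    // Called from another method, so one more frame is live while it runs.
    static int inner() {
        return depth();
    }

    // Completes abruptly: its frame is popped by the exception, not by a return.
    static void failing() {
        throw new IllegalStateException("abrupt completion");
    }

    public static void main(String[] args) {
        int before = depth();
        int inside = inner();
        try {
            failing();
        } catch (IllegalStateException e) {
            // Execution is back in main's frame; failing()'s frame is gone.
        }
        int after = depth();
        System.out.println("extra frames inside inner(): " + (inside - before));
        System.out.println("depth restored after exception: " + (after == before));
    }
}
```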

The internal structure of a stack frame

A stack frame holds five kinds of data: the local variable table, the operand stack, the dynamic link, the method return address, and some additional information.

The method return address, the dynamic link, and the additional information are together also known as the frame data area.

  • Local variable table

Under the hood, the local variable table is an array of numeric slots, used mainly to store the method's parameters and the local variables defined in the method body. The values it can hold are primitive types, object references, and values of the returnAddress type. Every stack frame on every thread's stack has its own local variable table. It is valid only for the current method invocation and is destroyed together with the stack frame when the call ends, so it raises no thread-safety problem.

The size of the local variable table is determined at compile time and does not change at run time. It is recorded in the class file as the max_locals value of the method's Code attribute.

In the local variable table, types of up to 32 bits (including returnAddress) occupy one slot, and 64-bit types occupy two. byte, short, and char are converted to int before being stored. boolean is also converted to int, with 0 meaning false and non-zero meaning true. long and double occupy two slots each.

The slot is the basic storage unit of the local variable table. If the current frame was created by a constructor or an instance method, the reference to this is placed in the slot at index 0. Slots are reusable: once a local variable goes out of scope, its slot can be reused by a later variable to save space.
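The slot rules above can be turned into a small calculation. This is only a sketch: SlotCounter and its sample methods are hypothetical, and it counts parameter slots only. The real table size also covers locals declared in the method body and can be inspected with javap -v.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Sketch: how many local variable slots do a method's parameters occupy?
class SlotCounter {
    static int parameterSlots(Method m) {
        // Instance methods receive `this` in slot 0; static methods do not.
        int slots = Modifier.isStatic(m.getModifiers()) ? 0 : 1;
        for (Class<?> type : m.getParameterTypes()) {
            // long and double take two slots; every other type takes one.
            slots += (type == long.class || type == double.class) ? 2 : 1;
        }
        return slots;
    }

    // Convenience lookup on this class; wraps the checked reflection exception.
    static int slotsOf(String name, Class<?>... params) {
        try {
            return parameterSlots(SlotCounter.class.getDeclaredMethod(name, params));
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Hypothetical sample methods used only for the demonstration.
    static void staticSample(int a, long b, double c) { }
    void instanceSample(byte a, Object b, long c) { }

    public static void main(String[] args) {
        // 1 (int) + 2 (long) + 2 (double) = 5
        System.out.println(slotsOf("staticSample", int.class, long.class, double.class));
        // 1 (this) + 1 (byte) + 1 (reference) + 2 (long) = 5
        System.out.println(slotsOf("instanceSample", byte.class, Object.class, long.class));
    }
}
```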

The maximum depth of nested method calls is determined by the stack size. A method with many parameters and local variables has a larger local variable table, so each of its stack frames takes more space and fewer nested calls fit on the stack.

The difference between local variables and member variables:

All variables must be initialized before use, but how that happens differs by kind:

Member (instance) variables: when the object is created, space for its instance variables is allocated on the heap and default values are assigned.
Class variables: default values are assigned during the preparation phase of linking, and explicit values are assigned during class initialization (static code blocks).
Local variables: an explicit assignment must be made before use; otherwise, compilation fails.
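A short sketch of these three cases, using a hypothetical class named VariableInitDemo:

```java
// Sketch of how the three kinds of variables get their first value.
class VariableInitDemo {
    static int classCounter;   // class variable: default 0 from the preparation phase
    static String label;       // class variable: default null, then set below
    int instanceCounter;       // instance variable: default 0 when the object is allocated
    Object reference;          // instance variable: default null

    static {
        label = "initialized"; // explicit value assigned during class initialization
    }

    static int localExample() {
        int local;             // local variable: no default value
        // System.out.println(local); // would not compile: not definitely assigned
        local = 42;            // explicit assignment is required before use
        return local;
    }

    public static void main(String[] args) {
        VariableInitDemo demo = new VariableInitDemo();
        System.out.println(classCounter);         // 0
        System.out.println(label);                // initialized
        System.out.println(demo.instanceCounter); // 0
        System.out.println(demo.reference);       // null
        System.out.println(localExample());       // 42
    }
}
```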

The part of the stack frame most relevant to performance tuning is the local variable table. During method execution, the virtual machine uses the local variable table to pass arguments to the method. The local variable table is also an important set of garbage collection roots: any object referenced directly or indirectly from a local variable is not collected.

  • Operand stack (expression stack)

The operand stack is implemented with an array. The JVM's interpreter is a stack-based execution engine, and the stack in question is the operand stack.

The operand stack is the workspace of the JVM's execution engine. It mainly stores intermediate results of a computation and serves as temporary storage for variables while the computation is in progress. When a method starts executing, a new stack frame is created, and its operand stack is initially empty.

Each operand stack has a definite maximum depth for storing values. This depth is determined at compile time and stored as the max_stack value of the method's Code attribute.

On the operand stack, a 32-bit value takes one unit of depth and a 64-bit value takes two. If the called method has a return value, that value is pushed onto the operand stack of the caller's stack frame, and the PC register is updated to the next bytecode instruction to execute.
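To make the interplay of local variable table and operand stack concrete, here is a toy model of a frame that evaluates c = a + b. It is a deliberately simplified sketch, not how any real JVM is implemented. The opcode names mirror real bytecode mnemonics, but TinyFrame and its interpreter loop are assumptions made for illustration.

```java
// Toy model of a frame: a local variable table plus an operand stack.
class TinyFrame {
    enum Op { ILOAD_0, ILOAD_1, IADD, ISTORE_2 }

    final int[] locals = new int[3];        // plays the role of max_locals = 3
    final int[] operandStack = new int[2];  // plays the role of max_stack = 2
    int top = 0;                            // number of values currently on the stack

    void execute(Op... code) {
        for (Op op : code) {                // the loop position stands in for the PC register
            switch (op) {
                case ILOAD_0:
                    operandStack[top++] = locals[0];   // push local 0
                    break;
                case ILOAD_1:
                    operandStack[top++] = locals[1];   // push local 1
                    break;
                case IADD: {
                    int right = operandStack[--top];   // pop both operands
                    int left = operandStack[--top];
                    operandStack[top++] = left + right; // push the intermediate result
                    break;
                }
                case ISTORE_2:
                    locals[2] = operandStack[--top];   // pop the result into local 2
                    break;
            }
        }
    }

    static int add(int a, int b) {
        TinyFrame frame = new TinyFrame();
        frame.locals[0] = a;                // arguments arrive in the local variable table
        frame.locals[1] = b;
        frame.execute(Op.ILOAD_0, Op.ILOAD_1, Op.IADD, Op.ISTORE_2);
        return frame.locals[2];
    }

    public static void main(String[] args) {
        System.out.println(add(2, 3)); // 5
    }
}
```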

To speed up operand stack access, the JVM also uses top-of-stack caching:

The elements at the top of the operand stack are cached in CPU registers, which reduces the number of memory reads and writes and improves execution efficiency.

The difference between i++ and ++i also shows up on the operand stack. With i++, the current value of i is pushed onto the stack first and the local variable is incremented afterwards, so the rest of the expression sees the old value. With ++i, the local variable is incremented first and the new value is then pushed, so the rest of the expression sees the incremented value.
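The effect is easy to check in plain Java. The bytecode sequences in the comments are what javac typically emits for int locals; they are included as an assumption you can verify with javap -c. IncrementDemo is a hypothetical class name.

```java
// Sketch: observable difference between postfix and prefix increment.
class IncrementDemo {
    // Returns {j, i} after j = i++
    static int[] postfix() {
        int i = 5;
        int j = i++;   // typically: iload i (push 5), iinc i 1, istore j
        return new int[] { j, i };
    }

    // Returns {j, i} after j = ++i
    static int[] prefix() {
        int i = 5;
        int j = ++i;   // typically: iinc i 1, iload i (push 6), istore j
        return new int[] { j, i };
    }

    // The classic puzzle: the old value is pushed, i is incremented,
    // then the pushed old value is stored back into i.
    static int selfAssign() {
        int i = 5;
        i = i++;
        return i;
    }

    public static void main(String[] args) {
        System.out.println(postfix()[0] + " " + postfix()[1]); // 5 6
        System.out.println(prefix()[0] + " " + prefix()[1]);   // 6 6
        System.out.println(selfAssign());                      // 5
    }
}
```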

  • Dynamic link

A bytecode file defines a constant pool. Each stack frame holds a reference into the runtime constant pool for the type that the current method belongs to, and this reference is the dynamic link.

When a source file is compiled into a bytecode file, all variable and method references are stored as symbolic references in the class file's constant pool. Dynamic linking translates these symbolic references into direct references to the methods being called.

  • Method return address

The method return address stores the PC register value of the caller of the current method.

A method can end either normally or abruptly, and in both cases execution continues in the caller after the method exits. When a method exits normally, the caller's PC register value serves as the return address, that is, the address of the instruction following the one that invoked the method. When a method exits because of an exception, the place to continue is determined by the exception table, and this information is generally not stored in the stack frame. A method that terminates with an exception does not return any value to its caller.

The exception handlers of a method are recorded in an exception table, which makes it easy to find the handling code when an exception occurs. If an exception is thrown and no matching handler is found in the method's exception table, the method terminates abruptly.
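Both exit paths can be seen in a small example. The names below are hypothetical. The try/catch in parseOrDefault is the kind of construct that the compiler records as an exception table entry, while parse has no handler and therefore lets the exception pop its frame.

```java
// Sketch: normal return versus abrupt completion.
class ExceptionReturnDemo {
    // No handler here: a NumberFormatException propagates to the caller,
    // and this frame is popped without producing a return value.
    static int parse(String text) {
        return Integer.parseInt(text);
    }

    // The try/catch gives this method a handler covering the call to parse().
    static int parseOrDefault(String text, int fallback) {
        try {
            return parse(text);      // normal completion: value goes back to the caller
        } catch (NumberFormatException e) {
            return fallback;         // handler found for the exception thrown below us
        }
    }

    public static void main(String[] args) {
        System.out.println(parseOrDefault("12", -1));   // 12
        System.out.println(parseOrDefault("oops", -1)); // -1
    }
}
```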

Common return instructions are:

Instruction: return type
ireturn: boolean, byte, char, short, int
lreturn: long
freturn: float
dreturn: double
areturn: reference types
return: void methods, instance initialization methods, class and interface initialization methods

  • Additional information

A stack frame may also carry additional information that is specific to the Java virtual machine implementation, for example information that supports program debugging.

How stack frames relate to method calls

Before a method can be called, the call has to be bound to a target method.

  • Method binding

There are two types of method binding, early binding and late binding. A binding is the process by which a symbolic reference to a field, method, or class is replaced with a direct reference, which occurs only once.

Early binding: If the target method is known at compile time and does not change at run time, the call can be bound to the method at that point. A call from a subclass to a superclass method via super has a known target, so it is early bound. Methods marked final cannot be overridden, so they are early bound as well.

Late binding: If the target method cannot be determined at compile time, the call can only be bound at run time according to the actual type of the object. A call through an interface or an abstract class cannot know which concrete subclass will handle it, so it is late bound.

Early bindings correspond to static links and late bindings to dynamic links.

  • The way methods are linked

Static linking: When a bytecode file is loaded into the JVM, if the target method is known at compile time and does not change at run time, the conversion of the symbolic reference into a direct reference is called static linking.

Dynamic linking: If the target method cannot be determined at compile time, the symbolic reference can only be converted into a direct reference while the program is running. This is called dynamic linking.

Only after linking is complete can the JVM execution engine invoke the specific method through an invocation instruction.

Method invocation instructions:

invokestatic: Invokes static methods. The unique method version is determined in the resolution phase.

invokespecial: Invokes <init> methods (instance constructors), private methods, and superclass methods. The unique method version is determined in the resolution phase.

invokevirtual: Invokes all virtual methods.

invokeinterface: Invokes interface methods.

invokedynamic: Dynamically resolves the method to be invoked and then executes it. This instruction was introduced in Java 7 to support dynamically typed languages on the JVM, and since Java 8 it is also used to implement lambda expressions.

Statically typed languages check types at compile time; dynamically typed languages check them at run time. In a statically typed language the type information belongs to the variable itself. In a dynamically typed language the variable has no type information, and only the value it holds does.

For the first four instructions, the dispatch logic is fixed inside the virtual machine and the choice of method cannot be influenced by the programmer, whereas invokedynamic lets user code decide which method version is called. invokestatic and invokespecial invoke non-virtual methods, and the other instructions are used for virtual methods. Note that an instruction used for virtual methods does not necessarily call a virtual method: a final method is invoked with invokevirtual but is still non-virtual. invokestatic and invokespecial always call non-virtual methods.
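The following sketch maps ordinary Java call sites to the instructions listed above. The comments describe what javac typically emits and are an assumption rather than a guarantee; javap -c on the compiled class shows the actual instructions. All class and method names are hypothetical.

```java
import java.util.function.Supplier;

// Sketch: which invocation instruction a call site typically compiles to.
class InvokeDemo {
    interface Greeter {
        String greet();
    }

    static class Base implements Greeter {
        public String greet() {
            return "base";
        }
    }

    static class Derived extends Base {
        @Override
        public String greet() {
            return super.greet() + "+derived";   // invokespecial (superclass method)
        }
    }

    static String helper() {
        return "static";
    }

    static String run() {
        Derived d = new Derived();               // invokespecial Derived.<init>
        Base b = d;
        Greeter g = d;
        Supplier<String> s = () -> "lambda";     // invokedynamic creates the Supplier
        return helper()                          // invokestatic
                + "," + b.greet()                // invokevirtual
                + "," + g.greet()                // invokeinterface
                + "," + s.get();                 // invokeinterface on Supplier
    }

    public static void main(String[] args) {
        System.out.println(run()); // static,base+derived,base+derived,lambda
    }
}
```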

  • What is a (non-) virtual method?

Non-virtual methods: A method is non-virtual if the version to call is determined at compile time and cannot change at run time. Static methods, private methods, final methods, instance constructors, and superclass methods called via super are all non-virtual.

Virtual methods: All other methods are virtual methods.

  • Virtual method table

A virtual method table is a cache for virtual method lookup. Each virtual method call may otherwise have to repeat the lookup steps of dynamic dispatch. To improve performance, the JVM creates a virtual method table for each class in the method area and uses it as a cache. The virtual method table is created and initialized during the linking phase of class loading: once the initial values of the class's variables have been prepared, the virtual method table is initialized as well.

  • The nature of method overriding

1. Take the first element at the top of the operand stack, find the actual type of the object it refers to, and call that type C.

2. If C contains a method whose descriptor and simple name both match the one in the constant pool, perform the access permission check. If the check passes, return a direct reference to that method and end the lookup; if it fails, throw an IllegalAccessError.

3. Otherwise, repeat step 2 for each superclass of C in turn, moving up the inheritance hierarchy from bottom to top.

4. If no suitable method is found, throw an AbstractMethodError.

IllegalAccessError: Access violations are normally caught at compile time. If this error occurs at run time, an incompatible change has been made to a class.
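The lookup steps above explain the behavior of the following sketch (hypothetical class names). Puppy declares no sound() method of its own, so the search starts at the actual type Puppy, fails there, and succeeds one level up in Dog, regardless of the static type Animal at the call site.

```java
// Sketch: dispatch starts at the receiver's actual type and walks up the hierarchy.
class DispatchDemo {
    static class Animal {
        String sound() {
            return "...";
        }
    }

    static class Dog extends Animal {
        @Override
        String sound() {
            return "woof";
        }
    }

    static class Puppy extends Dog {
        // Does not override sound(); the lookup continues in Dog.
    }

    // The static type of the parameter is Animal; the actual type decides the target.
    static String speak(Animal a) {
        return a.sound();
    }

    public static void main(String[] args) {
        System.out.println(speak(new Animal())); // ...
        System.out.println(speak(new Dog()));    // woof
        System.out.println(speak(new Puppy()));  // woof
    }
}
```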

Do native methods also have stacks?

Native methods have a corresponding stack as well, called the native method stack.

The native method stack is also thread-private. Its size can be fixed or dynamically expandable, and it fails with the same memory errors as the virtual machine stack.

The virtual machine stack manages the invocation of Java methods, and the native method stack manages the invocation of native methods.

When a thread calls a native method, it enters a world that is no longer constrained by the virtual machine and has the same privileges as the virtual machine itself. A native method can access the runtime data areas inside the virtual machine through the native method interface, use the processor's registers directly, and allocate any amount of memory from the native heap.

The JVM specification does not prescribe the language, implementation, or data structure of the native method stack, and not every JVM supports native methods. In the HotSpot virtual machine, the native method stack and the virtual machine stack are combined into one.

  • Native method interface corresponding to the native method stack

A method marked with the native modifier is a native method. Its implementation is not written in Java but in another language, typically C or C++.

During development you sometimes need to interact with something outside the Java environment, and the JDK does not provide an interface that meets the need. This is the main reason native methods exist. In addition, operating systems are largely written in C, so interaction with the operating system can be implemented through native methods, and some JDK features rely on functions provided by the underlying operating system.
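Whether a given JDK method is native can be checked with reflection. The sketch below uses a hypothetical helper class, NativeCheck. It assumes an OpenJDK-based runtime, where Object.hashCode is declared native; other implementations may differ.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Sketch: ask reflection whether a method carries the native modifier.
class NativeCheck {
    static boolean isNative(Class<?> owner, String name, Class<?>... params) {
        try {
            Method m = owner.getDeclaredMethod(name, params);
            return Modifier.isNative(m.getModifiers());
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // On OpenJDK, Object.hashCode is implemented outside Java.
        System.out.println("Object.hashCode native? " + isNative(Object.class, "hashCode"));
        // String.length is an ordinary Java method.
        System.out.println("String.length native?  " + isNative(String.class, "length"));
    }
}
```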

Common exceptions and setting the stack size

The JVM specification allows the size of the Java stack to be either fixed or dynamic.

Fixed size: With a fixed-size Java virtual machine stack, the stack size of each thread can be chosen independently when the thread is created. A StackOverflowError is thrown if a thread needs a deeper stack than the maximum the virtual machine allows.

Dynamic expansion: If the Java virtual machine stack can expand dynamically, an OutOfMemoryError is thrown when an expansion cannot obtain enough memory, or when there is not enough memory to create the stack for a new thread.
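A StackOverflowError is easy to provoke with unbounded recursion. The sketch below (hypothetical class StackDepthProbe) counts how many frames fit before the error is thrown. The number depends on the JVM, the stack size, and the frame size, so no particular value should be expected.

```java
// Sketch: count how deep recursion gets before the stack overflows.
class StackDepthProbe {
    private int depth = 0;

    private void recurse() {
        depth++;
        recurse();   // unbounded recursion: every call pushes another frame
    }

    static int maxDepth() {
        StackDepthProbe probe = new StackDepthProbe();
        try {
            probe.recurse();
        } catch (StackOverflowError e) {
            // The stack was exhausted; the frames have been popped back to here.
        }
        return probe.depth;
    }

    public static void main(String[] args) {
        System.out.println("frames before overflow: " + maxDepth());
    }
}
```

Running it with different -Xss values is one way to see how the stack size bounds the call depth.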

The stack size is set with the -Xss option. The suffixes are k for KB, m for MB, and g for GB; without a suffix, the value is in bytes.

-Xss1024
-Xss256k
-Xss1m

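A stack size can also be requested for an individual thread when it is created, through the following Thread constructor. The virtual machine is free to treat the stackSize value as a hint.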

public Thread(ThreadGroup group, Runnable target, String name,  
              long stackSize) {  
    init(group, target, name, stackSize);  
}  
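A usage sketch for this constructor follows. ThreadStackSizeDemo and its methods are hypothetical names. Because the stackSize argument may be ignored on some platforms, the example only reports the depth reached and does not assume that a larger request yields a deeper stack.

```java
// Sketch: request a per-thread stack size and measure the recursion depth reached.
class ThreadStackSizeDemo {
    private static void recurse(int[] depth) {
        depth[0]++;
        recurse(depth);
    }

    static int depthWithStack(long stackSizeBytes) {
        int[] depth = new int[1];
        Runnable task = () -> {
            try {
                recurse(depth);
            } catch (StackOverflowError e) {
                // Expected: this thread's stack is exhausted.
            }
        };
        // null thread group: the new thread joins the current thread's group.
        Thread t = new Thread(null, task, "stack-probe", stackSizeBytes);
        t.start();
        try {
            t.join();   // join() makes the worker's writes to depth visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return depth[0];
    }

    public static void main(String[] args) {
        System.out.println("512 KB requested: " + depthWithStack(512L * 1024));
        System.out.println("4 MB requested:   " + depthWithStack(4L * 1024 * 1024));
    }
}
```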

  • Why use the PC register to store bytecode instruction addresses?

The CPU keeps switching between threads, and when it switches back to a thread it needs to know where to resume.

The JVM bytecode interpreter needs the value of the PC register to determine what bytecode instructions to execute next.

Frequently asked interview questions

  • Stack exception scenario? : //todo
  • Can adjusting the stack size guarantee that no overflow occurs? No. For a faulty program (for example, unbounded recursion), a larger stack only delays the overflow. For methods that need a bounded amount of stack space, a large enough stack can avoid the overflow.
  • Is more stack memory always better? Not necessarily. A larger stack helps an individual thread with deep calls, but every thread gets its own stack, so it can waste a lot of memory from the point of view of overall system resources.
  • Does garbage collection involve the virtual machine stack? The stack itself is not garbage collected: frames are only pushed and popped. Its local variables do, however, act as GC roots.
  • Are local variables defined in a method thread-safe? The stack frame is private to its thread, so a local variable that never leaves the method is thread-safe. An object that a local variable refers to can still be shared if it is passed in from outside or returned to the caller.
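The last answer can be illustrated with two small methods (hypothetical names). In the first, the StringBuilder lives and dies inside one frame. In the second, the object comes from the caller, so the method alone cannot guarantee that no other thread is using it.

```java
// Sketch: a confined local object versus one that is shared with the caller.
class LocalSafetyDemo {
    // Safe: sb is created, used, and discarded inside a single frame;
    // only the immutable String result leaves the method.
    static String buildLocal() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 3; i++) {
            sb.append(i);
        }
        return sb.toString();
    }

    // Not automatically safe: the StringBuilder is supplied by the caller,
    // so other threads may hold a reference to the same object.
    static void appendTo(StringBuilder shared) {
        for (int i = 0; i < 3; i++) {
            shared.append(i);
        }
    }

    public static void main(String[] args) {
        System.out.println(buildLocal()); // 012
        StringBuilder sb = new StringBuilder("x");
        appendTo(sb);
        System.out.println(sb);           // x012
    }
}
```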