001. Bytecode operation mechanism

1. A simple example

As in the previous example, the Java code is:

public class Hello {
    public static void main(String[] args) {
        System.out.println("Hello, World");
    }
}

After compiling it into a class file, run xxd Hello.class to view the binary contents:

00000000: cafe babe 0000 0034 001d 0a00 0600 0f09  .......4........
00000010: 0010 0011 0800 120a 0013 0014 0700 1507  ................
00000020: 0016 0100 063c 696e 6974 3e01 0003 2829  .....<init>...()
00000030: 5601 0004 436f 6465 0100 0f4c 696e 654e  V...Code...LineN
00000040: 756d 6265 7254 6162 6c65 0100 046d 6169  umberTable...mai
00000050: 6e01 0016 285b 4c6a 6176 612f 6c61 6e67  n...([Ljava/lang
00000060: 2f53 7472 696e 673b 2956 0100 0a53 6f75  /String;)V...Sou
00000070: 7263 6546 696c 6501 000a 4865 6c6c 6f2e  rceFile...Hello.
00000080: 6a61 7661 0c00 0700 0807 0017 0c00 1800  java............
00000090: 1901 000c 4865 6c6c 6f2c 2057 6f72 6c64  ....Hello, World
000000a0: 0700 1a0c 001b 001c 0100 0548 656c 6c6f  ...........Hello
000000b0: 0100 106a 6176 612f 6c61 6e67 2f4f 626a  ...java/lang/Obj
000000c0: 6563 7401 0010 6a61 7661 2f6c 616e 672f  ect...java/lang/
000000d0: 5379 7374 656d 0100 036f 7574 0100 154c  System...out...L
000000e0: 6a61 7661 2f69 6f2f 5072 696e 7453 7472  java/io/PrintStr
000000f0: 6561 6d3b 0100 136a 6176 612f 696f 2f50  eam;...java/io/P
00000100: 7269 6e74 5374 7265 616d 0100 0770 7269  rintStream...pri
00000110: 6e74 6c6e 0100 1528 4c6a 6176 612f 6c61  ntln...(Ljava/la
00000120: 6e67 2f53 7472 696e 673b 2956 0021 0005  ng/String;)V.!..
00000130: 0006 0000 0000 0002 0001 0007 0008 0001  ................
00000140: 0009 0000 001d 0001 0001 0000 0005 2ab7  ..............*.
00000150: 0001 b100 0000 0100 0a00 0000 0600 0100  ................
00000160: 0000 0100 0900 0b00 0c00 0100 0900 0000  ................
00000170: 2500 0200 0100 0000 09b2 0002 1203 b600  %...............
00000180: 04b1 0000 0001 000a 0000 000a 0002 0000  ................
00000190: 0003 0008 0004 0001 000d 0000 0002 000e  ................
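
The first ten bytes of the dump (cafe babe 0000 0034 001d) can be read as the class file header: a magic number, a minor and major version, and the constant pool count. Below is a minimal sketch of that decoding, not a full class file parser. The class name ClassHeader and the method parse are hypothetical names chosen for illustration, and the bytes are hard-coded from the dump rather than read from disk.

```java
import java.nio.ByteBuffer;

public class ClassHeader {
    // First 10 bytes of the dump: cafe babe 0000 0034 001d
    static final byte[] HEADER = {
        (byte) 0xca, (byte) 0xfe, (byte) 0xba, (byte) 0xbe,
        0x00, 0x00, 0x00, 0x34, 0x00, 0x1d
    };

    /** Returns {magic, minorVersion, majorVersion, constantPoolCount}. */
    static long[] parse(byte[] bytes) {
        // ByteBuffer is big-endian by default, which matches the class file format.
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        long magic = buf.getInt() & 0xFFFFFFFFL; // u4, kept unsigned in a long
        int minor = buf.getShort() & 0xFFFF;     // u2
        int major = buf.getShort() & 0xFFFF;     // u2
        int cpCount = buf.getShort() & 0xFFFF;   // u2
        return new long[] {magic, minor, major, cpCount};
    }

    public static void main(String[] args) {
        long[] h = parse(HEADER);
        // prints "magic=0xCAFEBABE minor=0 major=52 constant_pool_count=29"
        System.out.printf("magic=0x%X minor=%d major=%d constant_pool_count=%d%n",
                h[0], h[1], h[2], h[3]);
    }
}
```

Major version 52 corresponds to Java 8, and a constant pool count of 29 means the pool has entries #1 through #28.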

Running javap -c -s Hello then produces the following bytecode:

Compiled from "Hello.java"
public class Hello {
  public Hello();
    descriptor: ()V
    Code:
       0: aload_0
       1: invokespecial #1                  // Method java/lang/Object."<init>":()V
       4: return

  public static void main(java.lang.String[]);
    descriptor: ([Ljava/lang/String;)V
    Code:
       0: getstatic     #2                  // Field java/lang/System.out:Ljava/io/PrintStream;
       3: ldc           #3                  // String Hello, World
       5: invokevirtual #4                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
       8: return
}
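
The javap listing and the hex dump describe the same bytes. Near the end of the dump, the sequences 2a b7 00 01 b1 and b2 00 02 12 03 b6 00 04 b1 are the code of the constructor and of main. The sketch below decodes them by hand. It is an illustration only: MiniDisassembler is a hypothetical name, and it knows just the six opcodes that occur in Hello.class.

```java
import java.util.ArrayList;
import java.util.List;

public class MiniDisassembler {
    // Code bytes copied from the hex dump (constructor, then main).
    static final int[] INIT_CODE = {0x2a, 0xb7, 0x00, 0x01, 0xb1};
    static final int[] MAIN_CODE = {0xb2, 0x00, 0x02, 0x12, 0x03, 0xb6, 0x00, 0x04, 0xb1};

    /** Decodes only the six opcodes that appear in Hello.class. */
    static List<String> disassemble(int[] code) {
        List<String> out = new ArrayList<>();
        int pc = 0;
        while (pc < code.length) {
            int op = code[pc];
            switch (op) {
                case 0x2a: out.add(pc + ": aload_0"); pc += 1; break;
                case 0xb1: out.add(pc + ": return"); pc += 1; break;
                // ldc takes a one-byte constant pool index
                case 0x12: out.add(pc + ": ldc #" + code[pc + 1]); pc += 2; break;
                // the remaining three take a two-byte constant pool index
                case 0xb2: out.add(pc + ": getstatic #" + u2(code, pc + 1)); pc += 3; break;
                case 0xb6: out.add(pc + ": invokevirtual #" + u2(code, pc + 1)); pc += 3; break;
                case 0xb7: out.add(pc + ": invokespecial #" + u2(code, pc + 1)); pc += 3; break;
                default:
                    throw new IllegalArgumentException(
                            "unsupported opcode 0x" + Integer.toHexString(op));
            }
        }
        return out;
    }

    /** Reads a big-endian unsigned 16-bit operand. */
    static int u2(int[] code, int i) {
        return (code[i] << 8) | code[i + 1];
    }

    public static void main(String[] args) {
        disassemble(INIT_CODE).forEach(System.out::println);
        disassemble(MAIN_CODE).forEach(System.out::println);
    }
}
```

The offsets it prints (0, 3, 5, 8 for main) match the ones in the javap output, because each offset is simply the byte position of the opcode within the method's code.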

2. Stack-based execution engine

Virtual machines fall into two kinds by implementation: stack-based and register-based.

Both approaches have their pros and cons; the JVM is a stack-based implementation.

Let’s look at a simple method:

public void f(int a,int b){
  int c = a + b;
}

First, each method invocation corresponds to something called a stack frame, which can be thought of as a special-purpose data structure.

What does this data structure hold? It contains three things:

  1. A reference to the runtime constant pool
  2. Local variable table (think of it as an array of fixed length)
  3. Operand stack (think of a LIFO stack)

That may seem abstract, so let’s take an example.

The method body above compiles to:

0: iload_1 // push a onto the operand stack
1: iload_2 // push b onto the operand stack
2: iadd    // Take the top two values off the stack, add them, and put the result back on the top of the stack
3: istore_3 // Put the top of the stack into slot 3 in the local variable table

Let’s step through the method body. When iload_1 executes, a is pushed onto the operand stack, and when iload_2 executes, b is pushed onto the operand stack.

When iadd executes, the instruction itself needs two operands, so it takes them from the operand stack. This is called popping.

iadd then pushes the result back onto the top of the stack. The following istore_3 instruction pops that result and stores it in slot 3 of the local variable table.

Slots for local variables are always allocated in advance. For int c = a + b; the compiler treats c as a local variable that needs a slot in the local variable table.

So executing bytecode is essentially a continual process of loading values from the local variable table onto the operand stack and storing results back.
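
The walk-through above can be sketched as a toy simulation of a stack frame. This is not how a real JVM is implemented: FrameDemo and runF are hypothetical names, and the local variable table is simplified to an int array, so slot 0 (which would hold this for an instance method) is left unused.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class FrameDemo {
    /** Simulates iload_1, iload_2, iadd, istore_3 for f(a, b); returns the local variable table. */
    static int[] runF(int a, int b) {
        int[] locals = new int[4];          // slot 0 = this (unused here), 1 = a, 2 = b, 3 = c
        Deque<Integer> stack = new ArrayDeque<>(); // operand stack
        locals[1] = a;                      // arguments are placed in the table before the body runs
        locals[2] = b;

        stack.push(locals[1]);              // 0: iload_1
        stack.push(locals[2]);              // 1: iload_2
        int right = stack.pop();            // 2: iadd pops two values...
        int left = stack.pop();
        stack.push(left + right);           //    ...and pushes their sum
        locals[3] = stack.pop();            // 3: istore_3
        return locals;
    }

    public static void main(String[] args) {
        int[] locals = runF(2, 3);
        System.out.println("c = " + locals[3]); // prints "c = 5"
    }
}
```

Note that the table size (four slots) is fixed before any instruction runs, which mirrors the point that slots for local variables are allocated in advance.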

3. Execution process

// Compiled from "Hello.java"
1. public class Hello {
2.  public Hello();
3.    descriptor: ()V
4.    Code:
5.       0: aload_0
6.       1: invokespecial #1                  // Method java/lang/Object."<init>":()V
7.       4: return
8. 
9.  public static void main(java.lang.String[]);
10.    descriptor: ([Ljava/lang/String;)V
11.    Code:
12.       0: getstatic     #2                  // Field java/lang/System.out:Ljava/io/PrintStream;
13.       3: ldc           #3                  // String Hello, World
14.       5: invokevirtual #4                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
15.       8: return
16.}

Lines 2 to 7: the compiler automatically added a default constructor to the Hello class, even though the source declares none.

Line 5: aload_0 is one of the aload_x family of opcodes, which load an object reference from the local variable table onto the operand stack; x is the index of the local variable slot being read. What does the 0 refer to here? Non-static methods receive this as an implicit first argument in slot 0, so aload_0 pushes this onto the operand stack.

Line 6: invokespecial #1. invokespecial is used to call instance initialization methods, private methods, and superclass methods. #1 refers to entry 1 of the constant pool, here the method reference java/lang/Object."<init>":()V, that is, the constructor of the superclass Object.

Line 7: return belongs to the family of opcodes ireturn, lreturn, freturn, dreturn, areturn, and return. The prefix gives the type of the returned value: i for int, l for long, f for float, d for double, and a for an object reference. A return without a type prefix returns void.
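
One way to see the whole return family is to compile a class with one method per return type and inspect it with javap -c. The class below is a hypothetical example written for that purpose; the comments name the return opcode javac emits for each method.

```java
public class ReturnKinds {
    static int    i() { return 1; }      // ends with ireturn
    static long   l() { return 2L; }     // ends with lreturn
    static float  f() { return 3.0f; }   // ends with freturn
    static double d() { return 4.0; }    // ends with dreturn
    static String a() { return "ref"; }  // ends with areturn
    static void   v() { }                // ends with return

    public static void main(String[] args) {
        v();
        System.out.println(i() + " " + l() + " " + f() + " " + d() + " " + a());
    }
}
```

There is no separate opcode for boolean, byte, char, or short results; methods returning those types also use ireturn.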

That’s it for the default constructor. Next, let’s look at the main method in lines 9 through 15.

Line 12: getstatic #2. getstatic reads a static field of the specified class and pushes its value onto the operand stack. #2 refers to entry 2 of the constant pool, here java/lang/System.out:Ljava/io/PrintStream;, that is, the static field out of the java.lang.System class (of type PrintStream).

Line 13: ldc #3. ldc pushes a constant from the runtime constant pool onto the operand stack. #3 refers to entry 3 of the constant pool, the string Hello, World.

Line 14: invokevirtual #4. invokevirtual calls an instance method of an object. #4 refers to the method reference PrintStream.println(String). The call pops the top two elements off the operand stack: the string argument and the PrintStream reference it is invoked on.

Line 15: return returns void
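
The four instructions of main can be replayed against an explicit operand stack. The sketch below is a simulation for illustration only, not what the JVM does internally: MainTrace and run are hypothetical names, and the PrintStream is passed in as a parameter (standing in for the value getstatic would read from System.out) so the output can be redirected.

```java
import java.io.PrintStream;
import java.util.ArrayDeque;
import java.util.Deque;

public class MainTrace {
    /** Replays getstatic / ldc / invokevirtual / return; returns the final operand stack depth. */
    static int run(PrintStream out) {
        Deque<Object> stack = new ArrayDeque<>();

        stack.push(out);                          // 0: getstatic #2 (System.out in the real program)
        stack.push("Hello, World");               // 3: ldc #3
        String arg = (String) stack.pop();        // 5: invokevirtual #4 pops the argument...
        PrintStream receiver = (PrintStream) stack.pop(); // ...and then the receiver
        receiver.println(arg);                    //    println returns void, so nothing is pushed
        return stack.size();                      // 8: return, with an empty operand stack
    }

    public static void main(String[] args) {
        run(System.out);
    }
}
```

The receiver is pushed first and the argument last, which is why the argument comes off the stack first when the call is made.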