I. Java architecture

Description of Java Conceptual Diagram

We often talk about JVM tuning. What is the relationship between the JVM and the JDK? This is Java basics.

The structure diagram of the JDK is an important picture to understand. From it, we can see that the JDK mainly contains two parts:

Part 1: Java Tools (Tools&Tool APIs)

Such as java, javac, javap, and so on. All the commands we use are here.

Part 2: Java Runtime Environment (JRE), which is the heart of Java.

The JRE defines the core libraries required by the Java runtime, such as the lang and util packages, Math, Collections, and so on. Another important part is the JVM (Java Virtual Machine), the cyan part at the bottom of the diagram. It is also part of the JRE, that is, part of the Java runtime environment. Here’s a closer look:

  • At the bottom is the Java Virtual Machine (JVM).
  • Common base libraries: lang and util. They provide common Math, Collections, Regular Expressions, Logging, Reflection, and so on.
  • Other extension libraries: Beans, Security, Serialization, Networking, JNI, Date and Time, Input/Output, etc.
  • Integration libraries: JDBC database connection, JNDI, Scripting, etc.
  • User Interface Toolkits.
  • Deployment tools: Deployment, etc.

As you can see, the JVM is the lowest level of the entire JDK. The JVM is part of the JDK.

II. The cross-platform features of the Java language

1. How is the Java language cross-platform?

Cross-platform means that a programmer writes one set of code that runs on Windows, on Linux, and on a Mac. We know that the instructions a machine ultimately runs are binary instructions. The same code might produce the binary instructions 0101 on Windows, but 1001 on Linux and 1011 on a Mac. So normally, to run the same code on a different platform, you have to change the code for that platform. With Java you don’t, so how does this cross-platform feature work?

The reason is that with the JDK we compile the program into bytecode (.class files) rather than platform-specific machine code, and leave it to the JVM, which is part of the JRE, to run it. The JDK we download differs from platform to platform: a Windows JDK for Windows, a Linux JDK for Linux, a Mac JDK for Mac. The JVM on each platform has an implementation specific to that platform, and it is this implementation that makes Java cross-platform.
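As a small illustration of the point above, the same compiled class can ask whichever JVM it is running on to describe itself. This is only a sketch: the class name PlatformInfo is made up for this example, while the three system property keys are standard ones.

```java
class PlatformInfo {
    // Builds a one-line description of the JVM and operating system running this class.
    static String describe() {
        return System.getProperty("java.vm.name")
                + " on " + System.getProperty("os.name")
                + " (" + System.getProperty("os.arch") + ")";
    }

    public static void main(String[] args) {
        // The same PlatformInfo.class prints a different line on each platform,
        // because a different platform-specific JVM is executing it.
        System.out.println(describe());
    }
}
```

Compile it once with javac, copy the .class file to another operating system, and run it there: the bytecode is unchanged, only the JVM underneath differs.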

2. Stretch your thinking

From the above analysis, we know that cross-platform is possible because the JVM encapsulates change. We often talk about JVM tuning, but are tuning parameters generic across platforms? Obviously not. The JVM on each platform is implemented specifically for that platform.

The part that encapsulates change is the JVM in the JDK. What is the overall structure of the JVM? Let’s look at the next section.

III. JVM overall structure and memory model

1. The JVM consists of three parts:

  • Class loading subsystem
  • Runtime data area (memory model)
  • Bytecode execution engine

The class loading subsystem, implemented in C++ in the HotSpot JVM, loads classes into the virtual machine. This is the class loader analyzed earlier: it loads classes using the parent delegation mechanism and places the loaded classes into the JVM.

The bytecode execution engine then reads the data from the virtual machine and executes it. The bytecode execution engine is also implemented in C++. We focus on the runtime data area.

2. Composition of the runtime data area

The runtime data area mainly consists of five parts: heap, stack, local method stack (native method stack), method area, and program counter.
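To get a rough feel for these areas on a real JVM, we can ask the standard java.lang.management API which memory pools it reports. This is a sketch, not a map of the five parts: the class name MemoryAreas is made up here, pool names vary with the JVM and garbage collector in use, and thread stacks and program counters are not reported as pools.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.List;

class MemoryAreas {
    // Lists the memory pools the running JVM reports, tagged as heap or non-heap.
    static List<String> pools() {
        List<String> result = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            String kind = pool.getType() == MemoryType.HEAP ? "heap" : "non-heap";
            result.add(kind + ": " + pool.getName());
        }
        return result;
    }

    public static void main(String[] args) {
        pools().forEach(System.out::println);
    }
}
```

On a HotSpot JVM the non-heap entries typically include the area that holds class metadata, which is where the method area discussed later lives.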

3. The three parts of the JVM work closely together

Let’s take a look at how the classloading subsystem, runtime data area, and bytecode execution engine work together when an application is running.

Let’s take an example:

package com.lxl.jvm;

public class Math {
    public static int initData = 666;
    public static User user = new User();

    public int compute() {
        int a = 1;
        int b = 2;
        int c = (a + b) * 10;
        return c;
    }

    public static void main(String[] args) {
        Math math = new Math();
        math.compute();
    }
}

What happens when we execute main?

Step 1: The class loading subsystem loads Math.class and places it in memory. This is the class loading process explored in the previous blog post.

Step 2: The loaded bytecode is organized in memory, in the runtime data area. This part is the most extensive and is the focus of our study; we’ll cover each area in more detail later.

Step 3: The code in the JVM’s memory is executed by the bytecode execution engine, which is also implemented in C++.

The core part here is the second one, the runtime data area (memory model), which is what we will tune later.

Let’s look at the memory areas in more detail.

This is the Java memory area. What does it do? It is where data is kept: different kinds of data are placed in different memory areas.

IV. Stack

The stack is used to store local variables.

4.1. The stack space

Again, using the Math example, when the program runs, a thread is created, and when a thread is created, a small space is allocated within the large stack space to hold variables from the current thread

public static void main(String[] args) {
    Math math = new Math();
    math.compute();
}

For example, for this code to run, it will first allocate a small space within the large stack space. The local variable math is stored in the allocated small space.

Here we call the math.compute() method. Let’s look at the internal implementation of the compute method:

public int compute() {
    int a = 1;
    int b = 2;
    int c = (a + b) * 10;
    return c;
}

We have local variables a, b, and c. Where do we put these local variables? Also in the small stack space allocated above.

The effect is shown above: a small area of stack space is allocated for the local variables of the methods in the Math class.

What if there’s another thread? Once again, we allocate a small space in the stack space to store the variables inside the new thread
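To make the per-thread stack space concrete, here is a minimal sketch in which two threads run the same method at the same time. The class name ThreadStacks and the parameter passed to compute are assumptions made for this example. Each thread keeps its own a, b, and c on its own stack, so the two calls cannot disturb each other.

```java
class ThreadStacks {
    // Each call gets its own frame, so a, b and c are private to the calling thread.
    static int compute(int a) {
        int b = 2;
        int c = (a + b) * 10;
        return c;
    }

    // Runs compute() on two threads at once and collects both results.
    static int[] runInTwoThreads() {
        int[] results = new int[2];   // the array itself is shared and lives on the heap
        Thread t1 = new Thread(() -> results[0] = compute(1));
        Thread t2 = new Thread(() -> results[1] = compute(5));
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return results;
    }

    public static void main(String[] args) {
        int[] r = runInTwoThreads();
        System.out.println(r[0] + " " + r[1]); // prints "30 70"
    }
}
```

Only the results array is shared between the threads; everything declared inside compute exists once per call.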

Are the variables in main and the variables in compute() stored together? How are they kept apart? This brings us to the concept of stack frames.

4.2. The stack frame

1. What is a stack frame?

package com.lxl.jvm;

public class Math {
    public static int initData = 666;
    public static User user = new User();

    public int compute() {
        int a = 1;
        int b = 2;
        int c = (a + b) * 10;
        return c;
    }

    public static void main(String[] args) {
        Math math = new Math();
        math.compute();
    }
}

Looking at this code again: when main runs, a thread is started and a small stack space is allocated for it. Within that space, the main method is then allocated its own area, called a stack frame.

When the program runs to compute(), it calls compute(), which allocates another stack frame space for the compute() method.

2. Why are different methods in a thread placed in different stack frames?

On the one hand, local variables in different methods are not accessible to each other. For example, a, b, and c in compute() cannot be accessed in main. Stack frames provide this isolation.

On the other hand, it makes cleanup convenient. When a method finishes and returns its value, its stack frame is popped and the variables in it are discarded with it.

Math has two methods. When main runs, it is placed in a stack frame that holds only the local variables of main. When compute() runs, another stack frame is created, which holds only the local variables of compute().

Different methods get different memory spaces, which makes it easier to manage the local variables of each method and to reclaim their memory.

3. Stack algorithm in the Java memory model

We have learned about the stack data structure, which is first in, last out. Is the stack in our memory model the same as the stack in the algorithm? Is there a connection?

Stacks in the Java memory model use the same stack algorithm: first in, last out. Take this code, for example:

package com.lxl.jvm;

public class Math {
    public static int initData = 666;
    public static User user = new User();

    public int compute() {
        int a = 1;
        int b = 2;
        int c = (a + b) * 10;
        return c;
    }

    public int add() {
        int a = 1;
        int b = 2;
        int c = a + b;
        return c;
    }

    public static void main(String[] args) {
        Math math = new Math();
        math.compute();
        math.add(); // Note that add() is called only after compute() has returned
    }
}

What does the thread stack look like as this runs?

  1. The first method pushed onto the stack is main, which is allocated a stack frame in the thread stack.
  2. compute is called inside main, so a stack frame is created for compute. compute is pushed later than main but finishes first; when it finishes, its local variables are reclaimed and its frame is popped off the stack.
  3. Next, the add method is called and allocated a stack frame. When add completes, its frame is popped off the stack.
  4. Finally, main finishes and its frame is popped last. This confirms the first in, last out order: the method pushed last finishes first. It also matches the logic of program execution.
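The push and pop order described in the steps above can be observed from inside a program by counting the frames on the current thread’s stack. This is a rough sketch: the class FrameDepth and its helper depth() are made up for this example, and the absolute numbers depend on how the program is launched, so only the differences between them are meaningful.

```java
class FrameDepth {
    // Number of frames currently on this thread's stack (including the helper's own frames).
    static int depth() {
        return Thread.currentThread().getStackTrace().length;
    }

    static int compute() {
        return depth();   // measured while compute's frame is on the stack
    }

    static int add() {
        return depth();   // measured while add's frame is on the stack
    }

    // Returns {depth in caller, depth inside compute, depth inside add, depth in caller afterwards}.
    static int[] trace() {
        int before = depth();
        int inCompute = compute();
        int inAdd = add();
        int after = depth();
        return new int[] {before, inCompute, inAdd, after};
    }

    public static void main(String[] args) {
        int[] d = trace();
        System.out.println("caller=" + d[0] + " compute=" + d[1]
                + " add=" + d[2] + " caller again=" + d[3]);
    }
}
```

Inside compute and inside add the stack is one frame deeper than in the caller, and after each call returns it is back to the caller’s depth: the frame pushed last is popped first.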

4.3 Internal structure of stack frames

As mentioned above, each method has a corresponding stack frame space when running. What is the internal structure of stack frame space?

There are many parts inside the stack frame, and we will focus on the following four parts:

1. Local variable table
2. Operand stack
3. Dynamic linking
4. Method exit

4.3.1 Local variable table: stores local variables

The local variable table, as the name suggests, is used to store local variables.

4.3.2 Operand stack

So what do operand stacks, dynamic links, method exits do? Let’s use an example to illustrate the operand stack

So how do these four parts work?

We analyze this by following the execution of the code.

We want to look at the disassembled bytecode, which is generated using the javap command.

What does the javap command do? You can check the javap help documentation.

We use javap -c and javap -v:

javap -c: disassembles the code
javap -v: prints additional information; much more verbose than javap -c

The following command disassembles Math.class. We redirect the output to a file:

javap -c Math.class > Math.txt

Open the Math.txt file; its content is as follows. This is Java bytecode disassembled into JVM instructions.

Compiled from "Math.java"
public class com.lxl.jvm.Math {
  public static int initData;
​
  public static com.lxl.jvm.User user;
​
  public com.lxl.jvm.Math();
    Code:
       0: aload_0
       1: invokespecial #1                  // Method java/lang/Object."<init>":()V
       4: return
​
  public int compute();
    Code:
       0: iconst_1
       1: istore_1
       2: iconst_2
       3: istore_2
       4: iload_1
       5: iload_2
       6: iadd
       7: bipush        10
       9: imul
      10: istore_3
      11: iload_3
      12: ireturn
​
  public static void main(java.lang.String[]);
    Code:
       0: new           #2                  // class com/lxl/jvm/Math
       3: dup
       4: invokespecial #3                  // Method "<init>":()V
       7: astore_1
       8: aload_1
       9: invokevirtual #4                  // Method compute:()I
      12: pop
      13: return
​
  static {};
    Code:
       0: sipush        666
       3: putstatic     #5                  // Field initData:I
       6: new           #6                  // class com/lxl/jvm/User
       9: dup
      10: invokespecial #7                  // Method com/lxl/jvm/User."<init>":()V
      13: putstatic     #8                  // Field user:Lcom/lxl/jvm/User;
      16: return
}

This is the disassembled bytecode produced by javap.

To understand it, we need the JVM instruction reference. It doesn’t matter if we don’t know the instructions yet: refer to the end of the article (www.cnblogs.com/ITPower/p/1)… and look up each instruction as we encounter it.

Let’s take the compute() method as an example of how this method is handled on the stack

Source code:

public int compute() {
    int a = 1;
    int b = 2;
    int c = (a + b) * 10;
    return c;
}

JVM instructions:

public int compute();
  Code:
     0: iconst_1
     1: istore_1
     2: iconst_2
     3: istore_2
     4: iload_1
     5: iload_2
     6: iadd
     7: bipush        10
     9: imul
    10: istore_3
    11: iload_3
    12: ireturn

What do these JVM instructions mean? Let’s look them up in the instruction reference.

0: iconst_1 pushes the int constant 1 onto the operand stack –> the 1 in int a = 1; is pushed onto the operand stack first.

1: istore_1 stores an int into local variable 1 –> the variable a in int a = 1; is stored in the local variable table.

Note: the 1 in istore_1 is not the value of the variable; it is an index into the local variable table. The manual lists instructions for local variables 0, 1, 2, and 3.

Index 0 is this, index 1 means the second slot of the local variable table, and index 2 means the third slot.

For the compute() method, 0 is this, 1 is local variable a, 2 is local variable b, and 3 is local variable c.

So for int a = 1;, a is placed in the second slot of the local variable table, and then the 1 is popped from the operand stack and assigned to a.

2: iconst_2 pushes the int constant 2 onto the operand stack –> the 2 in int b = 2; is pushed onto the operand stack.

3: istore_2 stores an int into local variable 2 –> the local variable b is stored in the third slot of the local variable table, and then the 2 is popped from the operand stack and assigned to b.

4: iload_1 loads an int from local variable 1.

To better understand iload_1, let’s look at the program counter.

Program counter

The program counter is an integral part of the JVM.

The program counter is private to each thread. It stores the position (loosely, the line number) of the instruction that is about to be executed. In the disassembled JVM code, each instruction has a position such as 0, 1, 2, or 3, which we can think of as an identifier. A program counter can simply be understood as keeping track of these numbers. They are offsets into the method’s bytecode, which correspond to addresses in memory.

When the bytecode execution engine reaches position 4, it executes 4: iload_1, and we can simply say that the position recorded by the program counter is 4. The code of Math.class is placed in the method area and executed by the bytecode execution engine, which updates the program counter after each instruction so that it points to the next one.

Why does the Java virtual machine have program counters?

Because of multithreading. When a thread is executing and another thread preempts the CPU, the first thread is suspended. When thread 2 has had its turn, thread 1 resumes. So where had thread 1 got to before it was suspended? The program counter recorded that for us.

4: iload_1 loads an int from local variable 1 –> takes the int a from the second slot of the local variable table and pushes it onto the operand stack. The program counter now points to 4.

5: iload_2 loads an int from local variable 2 –> takes the int b from the third slot of the local variable table and pushes it onto the operand stack. The program counter now points to 5.

6: iadd performs an int addition –> pops the two values 1 and 2 from the operand stack and adds them. The addition is done in the CPU, and the result 3 is pushed back onto the operand stack. The program counter now points to 6.

7: bipush 10 pushes an 8-bit signed integer onto the operand stack –> the multiplier 10 is pushed onto the operand stack.

We find that this position is 7, but the next one is 9, so where did 8 go? The numbers 0, 1, 2, 3… are byte offsets within the method’s bytecode. The multiplier 10 also takes up space: bipush occupies position 7 and its operand 10 occupies position 8.

9: imul performs an int multiplication –> like iadd, it first pops 3 and 10 from the operand stack, computes in the CPU, and pushes the result 30 back onto the operand stack.

The multiplication itself is computed in the CPU’s registers; what we are discussing here is what is kept in memory.

10: istore_3 stores an int into local variable 3 –> c is placed in the local variable table, and the 30 on the operand stack is popped and assigned to c.

11: iload_3 loads an int from local variable 3 –> the value 30 in the fourth slot of the local variable table is pushed onto the operand stack.

12: ireturn returns an int from the method –> the final result c is returned.

We have now seen how the variables in this method move between the operand stack and the local variable table. You should now understand operand stacks and local variable tables.

Summary: What is an operand stack?

During computation, the constants 1, 2, and 10 also need some memory space, so where do they live? They are stored on the operand stack.

The operand stack is a temporary working space in memory used during execution.
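The whole walkthrough of compute() can be replayed by hand with a tiny simulation. The sketch below is a teaching aid only, not how a real JVM is implemented: the class name ComputeFrameSketch is made up, the local variable table is a plain int array, the operand stack is an ArrayDeque, and the program counter is an int holding the bytecode offset.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class ComputeFrameSketch {
    // Hand-simulates the bytecode of compute() with a local variable table,
    // an operand stack and a program counter.
    static int run() {
        int[] locals = new int[4];                 // slot 0 = this (unused here), 1 = a, 2 = b, 3 = c
        Deque<Integer> stack = new ArrayDeque<>(); // operand stack
        int pc = 0;                                // offset of the instruction to execute next
        while (true) {
            switch (pc) {
                case 0:  stack.push(1);                         pc = 1;  break; // iconst_1
                case 1:  locals[1] = stack.pop();               pc = 2;  break; // istore_1
                case 2:  stack.push(2);                         pc = 3;  break; // iconst_2
                case 3:  locals[2] = stack.pop();               pc = 4;  break; // istore_2
                case 4:  stack.push(locals[1]);                 pc = 5;  break; // iload_1
                case 5:  stack.push(locals[2]);                 pc = 6;  break; // iload_2
                case 6:  stack.push(stack.pop() + stack.pop()); pc = 7;  break; // iadd
                case 7:  stack.push(10);                        pc = 9;  break; // bipush 10 (offsets 7 and 8)
                case 9:  stack.push(stack.pop() * stack.pop()); pc = 10; break; // imul
                case 10: locals[3] = stack.pop();               pc = 11; break; // istore_3
                case 11: stack.push(locals[3]);                 pc = 12; break; // iload_3
                case 12: return stack.pop();                                    // ireturn
                default: throw new IllegalStateException("unexpected pc " + pc);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 30
    }
}
```

Note how the program counter jumps from 7 to 9: offset 8 holds the operand of bipush rather than an instruction of its own.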

4.3.3 Dynamic linking

Dynamic linking was discussed before; refer to the article www.cnblogs.com/ITPower/p/1… and search for: dynamic links

Static linking is done when the class is loaded. It usually applies to static constants, static methods, and so on: because there is only one copy of them in memory, they are linked up front for performance.

Dynamic linking, on the other hand, is done only at the point of use, as with the compute method: linking happens only when math.compute() is executed.
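One visible consequence of linking at the point of use is that the target of an instance method call is chosen while the program runs, based on the actual object. The sketch below illustrates only that effect; the classes Shape and Circle and the method names are made up for this example and do not show the JVM’s internal linking data.

```java
class DispatchSketch {
    static class Shape {
        // Instance method: which version runs is decided at run time.
        String name() {
            return "shape";
        }

        // Static method: the target is fixed by the class named in the call.
        static String kind() {
            return "Shape.kind";
        }
    }

    static class Circle extends Shape {
        @Override
        String name() {
            return "circle";
        }
    }

    // The declared type is Shape, but the call goes to the object's actual class.
    static String describe(Shape s) {
        return s.name();
    }

    public static void main(String[] args) {
        System.out.println(describe(new Shape()));  // prints "shape"
        System.out.println(describe(new Circle())); // prints "circle"
        System.out.println(Shape.kind());           // prints "Shape.kind"
    }
}
```

The single call site s.name() inside describe reaches two different methods, which is only possible because the call is resolved when it executes.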

4.3.4 Method exit

After compute() finishes, we need to return to the math.compute() call in main. How does it return, and which line of code should execute after returning? Before entering compute(), a record is kept in the method exit of how to return and where to return to. The method exit records the information needed to return from a method.

V. The relation between heap and stack

We have looked at the stack frame of compute(); main() also has a stack frame. Overall it works the same way, but one thing needs explaining: the local variable table. Take a look at the following code:

public static void main(String[] args) {
    Math math = new Math();
    math.compute();
}

What’s the difference between the local variables in main and those in compute()? math in the main method refers to an object, and we know that objects are usually created on the heap. The variable math itself is in the local variable table, recording the address of the new Math object on the heap.

Just to be clear, math doesn’t store the object’s content; it stores the address of the instance object.

So the relationship between stack and heap becomes clear: if a method creates many objects with new, those objects are created on the heap, and the stack holds the memory addresses of the objects created on the heap.
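A short sketch shows that a local variable holds only a reference. The class Counter and the method names are made up for this example. Two local variables, and a parameter in another stack frame, all point at the same object on the heap, so a change made through one is visible through the others.

```java
class ReferenceSketch {
    static class Counter {
        int value;   // stored inside the object, on the heap
    }

    static void increment(Counter c) {
        // c is a local variable in this frame holding a copy of the reference
        c.value++;
    }

    static int demo() {
        Counter first = new Counter();   // object on the heap, reference in the local variable table
        Counter second = first;          // a second reference to the same object
        increment(second);               // a third reference, in increment's frame
        return first.value;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 1
    }
}
```

When increment returns, its frame and its copy of the reference are gone, but the Counter object stays on the heap for as long as something still refers to it.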

VI. Method area

We can print a more detailed version of the disassembled JVM code using the javap -v Math.class > Math.txt command.

The difference between this output and the output of javap -c is the Constant pool. Where does this constant pool go? Into the method area. Once the class is loaded, the constant pool seen here becomes the runtime constant pool. There are other constant pools as well, such as the pools for the wrapper objects of the primitive types, the string constant pool, and so on.

The main thing to understand here is the runtime constant pool, which is placed in the method area.

What are the main elements of the method area?

Constants + static variables + class meta-information (i.e. the code information of the class)

The Math class has static variables:

public static int initData = 666;
public static User user = new User();

They are in the method area. The object created by new User() is placed on the heap, where memory is allocated for it, while the user variable is placed in the method area and points to that memory on the heap.

The relationship between the heap and the method area is that reference variables in the method area point to the addresses of objects created on the heap.

Class meta-information: everything defined in the class is class meta-information, which is also placed in the method area.
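Class-level information can be read back from a running program through reflection. The sketch below is only an illustration: the class name MethodAreaSketch is made up, and it mirrors the initData field and compute method of the article’s Math class without being that class.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class MethodAreaSketch {
    static int initData = 666;   // static variable: one copy for the whole class

    int compute() {
        return (1 + 2) * 10;
    }

    // Reads the class meta-information (here, the declared method names) via reflection.
    static List<String> declaredMethodNames() {
        List<String> names = new ArrayList<>();
        for (Method m : MethodAreaSketch.class.getDeclaredMethods()) {
            names.add(m.getName());
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        System.out.println(initData);              // prints 666
        System.out.println(declaredMethodNames()); // includes "compute" and "main"
    }
}
```

No instance of MethodAreaSketch is ever created here, yet the static variable and the list of methods are available, because they belong to the class rather than to any object.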

VII. Local method stack (native method stack)

Native methods are methods implemented in C or C++ code. Their declarations carry the native keyword.

Such as:

new Thread().start();

Here start() calls a native method internally.

Native method stack: native methods also need memory space at run time, and it is provided by the native method stack.

Each thread is allocated a thread stack, a native method stack, and a program counter. The main thread, for example, contains a thread stack, a native method stack, and a program counter.
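Whether a method is native can be checked through reflection, because native is an ordinary modifier on the method declaration. In this sketch the class NativeSketch and both of its methods are hypothetical; the native method is only declared and never called, since no native library implements it.

```java
import java.lang.reflect.Modifier;

class NativeSketch {
    // Hypothetical native method: declared only. Calling it would fail with
    // UnsatisfiedLinkError because no native library provides an implementation.
    static native void hypotheticalNativeCall();

    // An ordinary method implemented in Java, for comparison.
    static void ordinaryJavaMethod() {
    }

    // Reports whether the named no-argument method of this class is declared native.
    static boolean isNative(String methodName) {
        try {
            return Modifier.isNative(
                    NativeSketch.class.getDeclaredMethod(methodName).getModifiers());
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isNative("hypotheticalNativeCall")); // prints true
        System.out.println(isNative("ordinaryJavaMethod"));     // prints false
    }
}
```

Opening the JDK source of Thread in an IDE and following start() leads to a declaration of the same shape: a method marked native with no Java body.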