Introduction

  • As we all know, programming languages generally fall into two categories: compiled languages and interpreted languages. So what is the difference between the two? And what is the difference between a compiler and an interpreter?

  • The answer concerns both startup efficiency and runtime efficiency. A Java program is initially executed by the interpreter. When the virtual machine notices that a particular method or loop body runs especially frequently, it marks that code as "hot code". To improve the execution efficiency of this code, the virtual machine compiles it at run time into machine code for the local platform and applies various levels of optimization.

Compilers and interpreters

  • The Java compiler (javac), which compiles Java source programs into intermediate bytecode files, is the most basic development tool.
  • The Java interpreter is a program that directly translates and executes code line by line. The interpreter does not translate the entire program at once; it acts like a "middleman": it translates one line, runs it, translates the next line, runs it, and so on. This is why interpreted execution is slow.

  1. When the program first needs to start and execute, the interpreter can go to work immediately, translating and running the code line by line, albeit inefficiently.
  2. The JIT compiler comes into play once a method or loop body has been called many times: more and more of the code is compiled into native machine code (cached in memory), after which it runs far more efficiently. This is where the "intelligent" compiler, the JIT compiler, comes in; a minimal sketch of this interplay follows.
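A minimal sketch of that interplay (the class and method names are invented for illustration; -XX:+PrintCompilation is a real HotSpot flag that logs each method as it is compiled):

```java
// HotLoopDemo.java — run with: java -XX:+PrintCompilation HotLoopDemo
// After enough calls, a log line for HotLoopDemo::sum should appear,
// meaning the method switched from interpreted to JIT-compiled execution.
public class HotLoopDemo {
    // A small, pure method: a typical candidate for "hot code".
    static long sum(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }

    public static void main(String[] args) {
        long sink = 0;
        for (int call = 0; call < 20_000; call++) { // many calls -> hot method
            sink += sum(1_000);
        }
        System.out.println(sink); // keep the result alive so the loop isn't dead code
    }
}
```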

Interpreter/compiler interaction:

**The HotSpot virtual machine has two just-in-time compilers built in, called the Client Compiler and the Server Compiler. It automatically selects a running mode based on its own version and the hardware performance of the host machine. Users can also use the "-client" or "-server" parameter to force the VM to run in client or server mode.**

What is a JIT compiler

  • The just-in-time compiler is a component of the Java runtime environment that improves the performance of Java applications at run time. Nothing in the JVM affects performance more than the compiler, and the choice of compiler is one of the first decisions you make when running a Java application.

  • When an aggressive optimization made by the compiler turns out to be invalid, for example when loading a new class changes the type inheritance structure, or when a "rare trap" is hit, the VM can deoptimize: it falls back to the interpreted state and continues running from there.

How to use the interpreter with the compiler:

The HotSpot JVM has two built-in compilers, the Client Compiler and the Server Compiler, and it selects a default mode based on the host machine (64-bit HotSpot builds default to Server mode, as the output below shows). We can override this with:

  • -client: forces the VM to run in client mode
  • -server: forces the VM to run in server mode
  • Default: mixed mode (as reported by java -version)

In either Client or Server mode, the virtual machine runs in a mixed mode in which the interpreter and the compiler work together. This can be changed with:

  • Interpreted mode (java -Xint -version): forces the VM to run in interpreted mode, using only the interpreter.
  • Compiled mode (java -Xcomp -version): executes the program preferentially in compiled mode, but the interpreter still steps in when compilation is not possible.
```
$ java -version
java version "1.8.0_121"
Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)

$ java -Xint -version
java version "1.8.0_121"
Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, interpreted mode)

$ java -Xcomp -version
java version "1.8.0_121"
Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, compiled mode)
```

The key to Java's "compile once, run anywhere" promise is bytecode. How that bytecode is translated into machine instructions has a big impact on application speed: bytecode can be interpreted, compiled into native code, or even executed directly on a processor whose instruction set architecture implements the bytecode specification.

  • Bytecode is interpreted by the standard implementation of the Java Virtual Machine (JVM), which slows program execution down. To improve performance, the JIT compiler interacts with the JVM at run time and compiles appropriate sequences of bytecode into native machine code.

    • With a JIT compiler, the hardware can execute the native code directly instead of having the JVM repeatedly interpret the same bytecode sequence, which avoids the relatively lengthy translation process. This speeds up execution, provided the method executes frequently enough.

    • The time the JIT compiler spends compiling bytecode is added to the overall execution time, so infrequently called JIT-compiled methods can end up taking longer than they would under pure interpretation.

  • The JIT compiler performs some optimizations when compiling bytecode to native code.

  • Because the JIT compiler converts a series of bytecodes into native instructions, it can perform some simple optimizations.

    • Common optimizations performed by JIT compilers include data-flow analysis, converting stack operations into register operations, reducing memory accesses through register allocation, eliminating common subexpressions, and so on. A source-level sketch of the last of these follows this list.

    • The more optimization the JIT compiler applies, the more time it spends compiling, and that time counts toward the program's overall execution.
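To make "eliminating common subexpressions" concrete, here is a source-level sketch. This only models the idea: the JIT applies the transformation to the compiled code, not to your Java source.

```java
// Conceptual sketch of common-subexpression elimination (CSE).
// The "after" version models what the optimizer effectively does.
class CseSketch {
    static int before(int a, int b, int c) {
        int x = (a + b) * c;
        int y = (a + b) * c + 1;   // (a + b) * c is computed twice
        return x + y;
    }

    static int after(int a, int b, int c) {
        int common = (a + b) * c;  // computed once, reused
        int x = common;
        int y = common + 1;
        return x + y;
    }
}
```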

As a result, the JIT compiler cannot afford all the optimizations a static compiler performs, not only because of the execution-time overhead, but also because it has only a partial view of the program.

  • The JIT compiler is enabled by default and is activated when Java methods are called.

  • The JIT compiler compiles a method's bytecode into native machine code "just in time" for it to run.

  • After compiling a method, the JVM calls the compiled code for that method directly, rather than interpreting it.

In theory, if compilation cost no processor time or memory, compiling every method would let Java programs approach the speed of native applications.

In practice, JIT compilation does require processor time and memory. When the JVM first starts, thousands of methods are invoked; compiling all of them would seriously hurt startup time, even if the program eventually reached very good peak performance.
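A rough way to observe this warmup effect is to time successive batches of the same work: early batches typically run slower than later ones executing JIT-compiled code. This is a crude sketch only (names invented; serious measurements should use a harness such as JMH):

```java
// WarmupDemo.java — crude timing sketch of JIT warmup.
// Early batches tend to be slower (interpreted / freshly compiling)
// than later batches running fully compiled code.
public class WarmupDemo {
    static long work() {
        long total = 0;
        for (int i = 0; i < 100_000; i++) total += i * 31L % 7;
        return total;
    }

    public static void main(String[] args) {
        long sink = 0;
        for (int batch = 0; batch < 10; batch++) {
            long start = System.nanoTime();
            for (int i = 0; i < 200; i++) sink += work();
            System.out.printf("batch %d: %d us%n", batch,
                    (System.nanoTime() - start) / 1_000);
        }
        System.out.println(sink); // keep results alive
    }
}
```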

Different compilers for different applications

JIT compilers come in two forms, and choosing between them is often the only compiler tuning you need to do when running an application. In fact, you may need to think about the choice even before installing Java, because different Java binaries ship with different compilers.

Client compiler

The best-known optimizing compiler is C1, enabled with the -client JVM startup option. As the name suggests, C1 is the client compiler. It is designed for client applications that have fewer resources available and are, in many cases, sensitive to startup time. C1 uses performance counters for code profiling and performs simple, relatively unintrusive optimizations.

Server-side compiler

For long-running applications, such as server-side enterprise Java applications, the client compiler may not be enough; a server-side compiler like C2 can be used instead. C2 is typically enabled by adding the -server JVM startup option to the command line. Since most server-side applications are expected to run for a long time, enabling C2 means collecting more profiling data than a short-lived, lightweight client application could provide, so more advanced optimization techniques and algorithms can be applied.

Layered compilation

Why layered compilation

The compiler needs time to produce native code, and producing more highly optimized code takes even longer. Moreover, to compile better-optimized code, the interpreter may first have to collect performance monitoring information for the compiler, which slows interpretation down as well. Layered compilation is the strategy HotSpot adopts to strike a balance between startup responsiveness and peak execution efficiency.

  • Layered compilation combines client-side and server-side compilation, taking advantage of both of the JVM's compilers.

  • The client compiler is most active during application startup and performs optimizations triggered by lower performance counter thresholds.

  • The client-side compiler also inserts performance counters and prepares the instruction set for more advanced optimizations, which the server-side compiler addresses at a later stage.

Layered compilation is a very resource-efficient approach to performance analysis because the compiler is able to collect data during low-impact compiler activities that can later be used for more advanced optimizations. This approach can also yield more information than would be obtained by simply using interpreted code profile counters.

The layers in this strategy are as follows (a sketch for observing them follows the list):
  • Layer 0: the program runs interpreted. The interpreter does not enable its performance monitoring, and it can trigger layer 1 compilation.
  • Layer 1: C1 compilation. Bytecode is compiled to native code with simple, reliable optimizations, adding performance monitoring logic if necessary.
  • Layer 2: C2 compilation. Bytecode is compiled to native code, but this time with optimizations that take longer to compile, and even with some aggressive optimizations that are not always safe, guided by the performance monitoring information.
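One way to watch the layers in action is HotSpot's -XX:+PrintCompilation flag. Note that modern HotSpot actually distinguishes more tiers (0 to 4, where 1-3 belong to C1 and 4 to C2) than the three-layer description above. A sketch with invented names; the exact column layout of the output varies by JDK version:

```java
// TierDemo.java — run with: java -XX:+PrintCompilation TierDemo
// In the output, the small integer column is the compilation tier:
// a hot method often appears first at a C1 tier and later at tier 4 (C2).
public class TierDemo {
    static long hot(int n) {
        long t = 0;
        for (int i = 0; i < n; i++) t += i;
        return t;
    }

    public static void main(String[] args) {
        long sink = 0;
        for (int i = 0; i < 50_000; i++) sink += hot(1_000);
        System.out.println(sink);
    }
}
```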

Code optimization

  • When the JVM selects a method for compilation, it feeds the method's bytecode to the just-in-time (JIT) compiler. The JIT must understand the semantics and syntax of the bytecode before it can compile the method correctly.

    • To help the JIT compiler analyze the method, its bytecode is first reformatted into an internal representation called trees, which resembles machine code more closely than bytecode does.

    • Then the tree of the method is analyzed and optimized.

    • Finally, the tree is converted to native code.

  • The JIT compiler can use multiple compilation threads to perform JIT compilation tasks, and using multiple threads can potentially help Java applications start faster.

The JVM chooses the default number of compilation threads based on the system configuration. If that number is not optimal, the JVM's decision can be overridden with the -XcompilationThreads option.

Compilation consists of the following phases:

Inlining

Inlining is the process of merging or “inlining” the tree of a smaller method into the tree of its caller. This can speed up frequent method calls.
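A sketch of a typical inlining candidate (names invented; the diagnostic flags are real HotSpot options, though their output format varies by JVM version):

```java
// InlineDemo.java — run with:
//   java -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining InlineDemo
// Tiny accessors like getX() are classic inlining candidates: the JIT can
// merge their body into the caller and remove the call overhead entirely.
public class InlineDemo {
    private final int x;
    InlineDemo(int x) { this.x = x; }
    int getX() { return x; }   // trivial method: likely inlined into sum()

    static long sum(InlineDemo[] points) {
        long total = 0;
        for (InlineDemo p : points) total += p.getX();
        return total;
    }

    public static void main(String[] args) {
        InlineDemo[] points = new InlineDemo[1_000];
        for (int i = 0; i < points.length; i++) points[i] = new InlineDemo(i);
        long sink = 0;
        for (int i = 0; i < 20_000; i++) sink += sum(points); // make sum() hot
        System.out.println(sink);
    }
}
```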

Local optimization

Local optimization analyzes and improves a small piece of code at a time. Many local optimizations implement time-tested techniques used in classical static compilers.

Control flow optimization

Control flow optimization analyzes the control flow within a method (or a particular part of a method) and rearranges code paths to improve their efficiency.

Global optimization

Global optimizations apply to the whole method at once. They are more "expensive", requiring significant compilation time, but can greatly improve performance.

Native code generation

Native code generation varies by platform architecture. Typically, at this stage of compilation, the method's tree is converted into machine code instructions, and some small optimizations are performed based on the characteristics of the architecture.

Compilation targets

The compilation targets are the pieces of "hot code" that will be compiled and optimized. They fall into two categories (a sketch of the second follows the list):

  • Methods that are called many times
  • Loop bodies that are executed many times
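The second category is handled by On-Stack Replacement (OSR): the loop is swapped over to compiled code while the method is still running. A sketch (the class name is invented; in -XX:+PrintCompilation output, HotSpot marks OSR compilations with a '%'):

```java
// OsrDemo.java — run with: java -XX:+PrintCompilation OsrDemo
// One call to main(), but a very hot loop body inside it: HotSpot
// compiles the loop via on-stack replacement, flagged with '%' in the log.
public class OsrDemo {
    public static void main(String[] args) {
        long total = 0;
        for (int i = 0; i < 50_000_000; i++) { // one call, many iterations
            total += i % 7;
        }
        System.out.println(total);
    }
}
```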

Trigger conditions

This raises the question of how to decide whether a piece of code is hot code and whether it should trigger just-in-time compilation. This decision is called Hot Spot Detection.

There are two ways to detect hot spots:

Sample-Based Hot Spot Detection

The virtual machine periodically checks the top of each thread's stack; if a method frequently appears at the top of the stack, it is considered a hot method.

Counter-Based Hot Spot Detection

The virtual machine sets up a counter for each method (or block of code) and counts how many times it has run. If the count exceeds a certain threshold, the method is considered hot.
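A toy model of counter-based detection, purely for illustration: the real counters live inside the VM, not in Java code, and THRESHOLD here is just a made-up stand-in for CompileThreshold.

```java
// Toy model of counter-based hot spot detection (illustrative only).
import java.util.HashMap;
import java.util.Map;

class CounterSketch {
    static final int THRESHOLD = 10_000;  // stand-in for CompileThreshold
    static final Map<String, Integer> counters = new HashMap<>();

    static void onMethodEntry(String method) {
        int count = counters.merge(method, 1, Integer::sum);
        if (count == THRESHOLD) {
            // In the real VM this submits the method to the JIT compiler.
            System.out.println(method + " is hot -> submit for compilation");
        }
    }
}
```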

The HotSpot JVM uses the second approach, counter-based hot spot detection. It maintains two kinds of counters for each method:

Method call counter

The method call counter threshold is 1500 in Client mode and 10000 in Server mode; it can be set manually with the -XX:CompileThreshold parameter.

  • The method call counter does not record the absolute number of times a method is called, but a relative execution frequency: the number of calls within a period of time. When that period expires and the call count is still not high enough for the method to be submitted to the just-in-time compiler, the counter is halved. This process is called Counter Decay.

  • That period is called the Counter Half Life Time of the method. Counter decay can be turned off with the -XX:-UseCounterDecay parameter; the toy sketch below shows the halving behavior.
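Extending the toy model above, the decay step can be sketched like this (again purely illustrative):

```java
// Illustrative counter decay: when the half-life period expires, a counter
// that has not crossed the threshold is cut in half, so only a sustained
// call frequency (not a lifetime total) makes a method "hot".
class DecaySketch {
    static int onHalfLifeExpired(int counter, int threshold) {
        return counter >= threshold ? counter : counter / 2;
    }
}
```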

The entire process by which the method call counter triggers just-in-time compilation is shown in the following figure:

Back edge counter

What is the back edge?

In bytecode, an instruction that jumps backward in the control flow is called a back edge.

  • The back edge counter counts the number of times the loop body code in a method has run. The back edge counter threshold can be adjusted with the -XX:OnStackReplacePercentage parameter.
When the virtual machine runs in Client mode, the back edge counter threshold is calculated as:

```
method call counter threshold (CompileThreshold) × OSR ratio (OnStackReplacePercentage) / 100
```

The default value of OnStackReplacePercentage in Client mode is 933. Taking the defaults, the back edge counter threshold of a Client mode VM is 1500 × 933 / 100 = 13995.

When the virtual machine runs in Server mode, the back edge counter threshold is calculated as:

```
method call counter threshold (CompileThreshold) × (OSR ratio (OnStackReplacePercentage) − interpreter profiling ratio (InterpreterProfilePercentage)) / 100
```
  • The default value of OnStackReplacePercentage in Server mode is 140, and InterpreterProfilePercentage defaults to 33.

  • Taking the defaults, the back edge counter threshold of a Server mode VM is 10000 × (140 − 33) / 100 = 10700.
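A quick arithmetic check of the two thresholds, using only the default values quoted above:

```java
// Reproducing the two back edge counter thresholds from the defaults above.
class OsrThresholds {
    public static void main(String[] args) {
        int client = 1500 * 933 / 100;          // Client mode: 13995
        int server = 10000 * (140 - 33) / 100;  // Server mode: 10700
        System.out.println(client + " " + server);
    }
}
```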

The process by which the back edge counter triggers just-in-time compilation is shown in the following figure:

Unlike the method call counter, the back edge counter has no counter decay, so it counts the absolute number of times the loop has run.

The compilation process

By default, whether the method call counter triggers a standard just-in-time compilation request or the back edge counter triggers an OSR compilation request, the virtual machine keeps running the code in interpreted mode until compilation finishes; the compilation itself proceeds on a background compilation thread. Background compilation can be disabled with -XX:-BackgroundCompilation (for which -Xbatch is the conventional shorthand); in that case, a thread that triggers JIT compilation blocks until compilation completes and then runs the compiler's native code output.

So what does the compiler do during background compilation?

Client Compiler compilation process

  • Stage 1: a platform-independent front end constructs a High-Level Intermediate Representation (HIR) from the bytecode. HIR uses static single assignment (SSA) form to represent code values, which makes certain optimizations easier to perform during and after its construction. Before this, the compiler has already done some basic optimizations on the bytecode, such as method inlining and constant propagation.

  • Stage 2: a platform-dependent back end generates a Low-Level Intermediate Representation (LIR) from the HIR. Before this, some optimizations are completed on the HIR, such as null check elimination and range check elimination, to turn the HIR into a more efficient code representation.

  • Stage 3: the platform-dependent back end allocates registers on the LIR using the Linear Scan Register Allocation algorithm, performs peephole optimization on the LIR, and then generates machine code.