Foreword

A brief introduction to the JVM (Java Virtual Machine). Although it is called a virtual machine, it is not a virtual machine in the usual sense; it is closer to an ordinary process in the operating system. The JVM hides the details of each underlying operating system: Java programs are simply compiled into bytecode files, which the JVM interprets and executes.

The Java Development Kit (JDK) can be regarded as the core of Java. It contains the compiler, debugging tools, and the basic class library, and it also includes the JRE.

The Java Runtime Environment (JRE) contains the JVM and the basic class libraries. The JVM is responsible for interpretation, mapping bytecode instructions onto the CPU instruction set of each device, so you only need to install the matching version of the virtual machine for each operating system. This is what gives Java its cross-platform ability.

The development of the JVM

Just as with third-party libraries, there can be different implementations of the same functionality, and the same is true of JVMs. The first JVM was Sun's Classic VM, which remained the default until JDK 1.2 and was gradually replaced by the now well-known HotSpot; by JDK 1.4 the Classic VM was completely abandoned.

HotSpot is probably the most widely used virtual machine today and is the one included in OpenJDK. What you may not know is that HotSpot was not originally developed by Sun: it was designed and implemented by a small company, and it was not even originally designed for the Java language. Sun saw the promise of the virtual machine's JIT compiler and bought the company to acquire the HotSpot VM.

Runtime memory area

You may have faced tough interview questions before, but if an OOM (Out Of Memory) error happens in production, how would you troubleshoot it? If you were asked to tune the runtime parameters of a JVM, where would you start?

Unlike C++, where the programmer takes full control of memory, acting as both its absolute ruler and its lowliest laborer, in Java we leave memory management to the JVM. If we don't understand how runtime memory is laid out and how garbage collection works, we probably won't be able to solve these problems when they actually occur. Understanding this is also necessary for understanding the JVM in depth.

At runtime, Java divides the memory it manages into the following areas: the heap, the method area, the virtual machine stack, the native method stack, and the program counter.

The heap

The Java heap is the largest chunk of memory managed by the JVM. Objects instantiated with the new keyword in everyday development are almost always allocated on the heap, and objects allocated on the heap are shared by all threads.

The heap is also the main area of garbage collection in the JVM, and because garbage collection is generational, the heap can be further divided into a young generation and an old generation. We'll talk more about GC later.

So why "almost"? The JVM specification itself states that all objects are allocated memory on the heap, but with the maturing of JIT (Just In Time) compilers and escape analysis techniques, "all objects are allocated on the heap" has become less absolute.

The JIT compiler

You may have heard that the 80/20 rule also applies to programs: 20% of the code consumes 80% of the running resources. In the code we write, there may be some hot code that gets called frequently. Besides frequently called methods, the body of a loop that executes many times also counts as hot code.

The JIT compiler will optimize this part of the code by compiling it into machine code and applying various optimizations along the way. If you're not familiar with this, you might ask: isn't our code already compiled into bytecode? How does it get compiled into machine code again?

Bytecode is only an intermediate form. At run time, the JVM translates bytecode into machine code, much like an interpreted language, and it is this machine code that the operating system can execute directly. The JIT compiler caches the machine code produced for hot code, so that on subsequent calls the bytecode-to-machine-code step is skipped, and efficiency naturally improves.
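As a small illustration (class and method names are mine, not from the text): a tiny method invoked a million times is a textbook JIT candidate. Running it with the real HotSpot flag -XX:+PrintCompilation should print a line for the method once it crosses the compilation threshold.

```java
// HotLoop.java -- a method called often enough to become "hot code".
// Try: java -XX:+PrintCompilation HotLoop
// On HotSpot you should see HotLoop::square listed once the JIT kicks in.
public class HotLoop {
    static long square(long x) {
        return x * x; // tiny, frequently called method: a prime JIT candidate
    }

    public static void main(String[] args) {
        long sum = 0;
        for (int i = 0; i < 1_000_000; i++) {
            sum += square(i); // the loop body executes a million times
        }
        System.out.println("sum = " + sum);
    }
}
```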

Escape analysis

As we mentioned earlier, almost all objects in Java are allocated on the heap, and heap memory is shared by all threads, so synchronization becomes a concern in multithreaded code. But what if a variable is a local variable that is only ever accessed inside one function?

Such a local variable is said not to escape. And what if the variable can be accessed elsewhere? Then it has escaped its current scope. Escape analysis lets the JVM determine which variables have not escaped their current scope, so that those objects' memory can be allocated on the stack instead. When the call ends and the thread continues executing, the stack frame is popped, and the memory of those local variables is reclaimed along with it.
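A minimal sketch of the two cases (class names are hypothetical): the object in the first method never leaves the method, so with escape analysis enabled (-XX:+DoEscapeAnalysis, on by default in modern HotSpot) the JVM may stack-allocate or scalar-replace it; the second object is published through a static field and therefore must live on the heap.

```java
// EscapeDemo.java -- escaping vs. non-escaping objects.
public class EscapeDemo {
    static class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    static Point escaped; // writing here makes an object visible to all threads

    // Non-escaping: the Point is only used inside this method,
    // so it is a candidate for stack allocation / scalar replacement.
    static int manhattan(int x, int y) {
        Point p = new Point(x, y);
        return Math.abs(p.x) + Math.abs(p.y);
    }

    // Escaping: the Point is stored in a static field, so it has
    // escaped the method's scope and must be heap-allocated.
    static void publish(int x, int y) {
        escaped = new Point(x, y);
    }

    public static void main(String[] args) {
        System.out.println("manhattan = " + manhattan(-3, 4));
        publish(1, 2);
        System.out.println("escaped.x = " + escaped.x);
    }
}
```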

Method area

The method area holds loaded class information, constants, static variables, and the output of the JIT compiler. Like the heap, the method area is shared by all threads. Unlike the heap, it is garbage-collected far less intensively, but GC does still happen there.

The virtual machine stack

The virtual machine stack is thread-private, so no synchronization is needed there: it is inherently thread-safe. Each time a method is executed, a stack frame is created on the virtual machine stack of the current thread. Each method's life, from call to return, corresponds to a frame being pushed onto and popped off the virtual machine stack. Naturally, a stack frame contains the method's local variables, its operand stack, a dynamic link, and return address information.

If you've ever written recursion inside a method whose exit condition was never reached, the program recursed endlessly and you saw it throw a StackOverflowError. The stack that overflows here is the virtual machine stack: each recursive call pushes another frame until the stack's depth limit is exceeded.

If the virtual machine stack cannot obtain enough memory when it needs to grow, an OutOfMemoryError (OOM) is thrown instead.
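The stack-overflow case is easy to reproduce (the class name and counter are mine): recursion with no exit condition keeps pushing frames until the virtual machine stack is exhausted.

```java
// StackDepth.java -- unbounded recursion exhausts the thread's
// virtual machine stack and the JVM throws StackOverflowError.
public class StackDepth {
    static int depth = 0;

    static void recurse() {
        depth++;    // count frames so we can see roughly how deep we got
        recurse();  // no exit condition: each call pushes a new stack frame
    }

    public static void main(String[] args) {
        try {
            recurse();
        } catch (StackOverflowError e) {
            // the exact depth depends on the stack size (-Xss) and frame size
            System.out.println("StackOverflowError after " + depth + " frames");
        }
    }
}
```

The depth printed varies with the configured stack size (the -Xss flag), which is why no fixed number is asserted here.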

Native method stack

The native method stack works much like the virtual machine stack, except that the virtual machine stack serves the Java methods running in the JVM, while the native method stack serves native methods.

GC

In fact, the heap can be divided into the young generation and the old generation, and the young generation is further subdivided into Eden, From Survivor, and To Survivor. Newly allocated object instances first go to Eden, which is generally the largest region of the young generation, with a default ratio of 8:1 to each Survivor space (this ratio can be changed with JVM parameters). When a newly allocated object is large, it goes straight into the old generation.

The reason for more detailed memory partition of the heap is to make garbage collection more efficient.

Garbage identification mechanism

How does the JVM determine which objects are “garbage” and need to be collected? We need to understand how the JVM determines which memory needs to be reclaimed.

Reference counting

The idea is to attach a reference counter to each object. Each time another object references it, the counter is incremented by 1; each time a reference is dropped, it is decremented by 1. When an object's reference count is 0, nothing references it, and it can be reclaimed.

At first glance this is fine, so why hasn’t Java taken this approach?

Imagine a function that defines two objects, O1 and O2, where O1 references O2 and O2 references O1. Neither object's reference counter can ever drop to zero, yet neither will ever be accessed again.
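The cycle problem can be simulated with a toy reference counter (this is a simulation I wrote for illustration, not how the JVM actually tracks objects; HotSpot uses reachability analysis instead):

```java
import java.util.ArrayList;
import java.util.List;

// RefCountCycle.java -- a toy reference-counting simulation showing
// why a cycle is never reclaimed by pure reference counting.
public class RefCountCycle {
    static class Obj {
        int refCount = 0;
        List<Obj> fields = new ArrayList<>();

        void addRef(Obj target) {   // this object starts referencing target
            fields.add(target);
            target.refCount++;
        }
    }

    public static void main(String[] args) {
        Obj o1 = new Obj();
        Obj o2 = new Obj();
        o1.refCount++;  // a local variable "holds" o1
        o2.refCount++;  // and another holds o2
        o1.addRef(o2);  // O1 -> O2
        o2.addRef(o1);  // O2 -> O1: a cycle
        o1.refCount--;  // the locals go out of scope when the function returns
        o2.refCount--;
        // Both counters are still 1 because of the cycle, so a pure
        // reference-counting collector would never reclaim either object.
        System.out.println("o1.refCount = " + o1.refCount);
        System.out.println("o2.refCount = " + o2.refCount);
    }
}
```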

So we need another solution to solve this problem.

Reachability analysis

Reachability analysis can be understood as traversing a graph: the root nodes are a set of special objects, and edges lead from each object to the objects it references. The traversal starts from the roots; once every path from the roots has been followed, any object that was never visited is unreachable and needs to be reclaimed.

These root nodes are called GC Roots. What resources can be considered GC Roots?

  • Objects referenced by local variables in stack frames
  • Objects referenced by static class attributes in the method area
  • Objects referenced by constants in the method area
  • Objects referenced in the native method stack
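The traversal described above can be sketched as a toy mark phase (the graph, names, and helper are mine, for illustration only): walk the object graph from the GC Roots; anything never visited is garbage. Note that a cycle unreachable from the roots is correctly identified, which is exactly what reference counting could not do.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// MarkPhase.java -- a toy mark phase over a named object graph.
public class MarkPhase {
    static Map<String, List<String>> graph = new HashMap<>();

    // Depth-first walk from the GC Roots; returns every reachable object.
    static Set<String> mark(List<String> gcRoots) {
        Set<String> reachable = new HashSet<>();
        Deque<String> stack = new ArrayDeque<>(gcRoots);
        while (!stack.isEmpty()) {
            String obj = stack.pop();
            if (reachable.add(obj)) { // first visit: mark, then follow its refs
                stack.addAll(graph.getOrDefault(obj, List.of()));
            }
        }
        return reachable;
    }

    public static void main(String[] args) {
        // root -> a -> b ; c and d reference each other, but nothing
        // reachable from the roots references them (a dead cycle).
        graph.put("root", List.of("a"));
        graph.put("a", List.of("b"));
        graph.put("c", List.of("d"));
        graph.put("d", List.of("c"));

        Set<String> live = mark(List.of("root"));
        Set<String> garbage = new TreeSet<>(Set.of("root", "a", "b", "c", "d"));
        garbage.removeAll(live);
        System.out.println("garbage = " + garbage); // the dead cycle is found
    }
}
```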

As we said earlier, with reference counting memory is reclaimed as soon as the counter reaches 0. With reachability analysis, however, an object without any references is not necessarily reclaimed right away: it is first marked as unreachable, given a chance to save itself in finalize(), and only collected if it is still unreachable on a second marking pass.

Garbage collection algorithm

After talking about how the JVM determines whether an object needs to be reclaimed, let’s talk about how the JVM does it.

Mark-sweep

As the name suggests, the process has two stages: marking and sweeping. All objects that need to be reclaimed are marked first, and then the marked objects are reclaimed in one pass. This algorithm has clear limitations: neither marking nor sweeping is particularly efficient, and sweeping in place produces a large amount of memory fragmentation. What does that mean?

It means that although there appears to be plenty of free memory overall, it is scattered around in many small pieces. If you then need to allocate space for a large object, the JVM cannot find a contiguous chunk big enough, and even though total memory is sufficient, another GC will be triggered.
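The fragmentation problem can be made concrete with a tiny simulation (the 16-cell "heap" and its layout are invented for illustration): plenty of cells are free in total, yet no contiguous run is big enough for a larger object.

```java
// Fragmentation.java -- why mark-sweep can fail a large allocation:
// lots of memory is free in total, but no single free run is big enough.
public class Fragmentation {
    public static void main(String[] args) {
        // A 16-cell "heap" after a sweep: true = in use, false = free.
        boolean[] heap = {
            true, false, false, true, true, false, true, false,
            false, true, false, true, true, false, false, true
        };

        int free = 0, run = 0, longestRun = 0;
        for (boolean used : heap) {
            if (!used) { free++; run++; longestRun = Math.max(longestRun, run); }
            else       { run = 0; }
        }
        System.out.println("free cells = " + free);
        System.out.println("longest contiguous run = " + longestRun);
        // 8 cells are free overall, yet an object needing 3 contiguous
        // cells cannot be placed -> the allocation would trigger another GC.
    }
}
```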


Copying

The general idea is to split the available memory into two halves, A and B. All new objects are allocated in A. When A is used up, liveness is determined, the surviving objects in A are copied over to B, and then all of A's memory is reclaimed in one go.


This eliminates the memory fragmentation caused by mark-sweep. However, it has its own shortcoming: it comes at the cost of halving the usable memory, which in some cases is quite a high price.
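A toy sketch of the copying step (the slot array stands in for real memory; names are mine): survivors from the "from" half are copied compactly into the "to" half with a simple bump pointer, and the whole "from" half is then freed at once, which is why no fragmentation remains.

```java
import java.util.Arrays;

// SemispaceCopy.java -- a toy copying collector: live objects in the
// "from" half are copied compactly into the "to" half, then roles swap.
public class SemispaceCopy {
    public static void main(String[] args) {
        // from-space: slots hold object names, null = garbage or free.
        String[] from = {"a", null, "b", null, null, "c", null, "d"};
        String[] to = new String[from.length];

        int next = 0; // bump pointer into to-space
        for (String obj : from) {
            if (obj != null) {
                to[next++] = obj; // survivors end up contiguous, no holes
            }
        }
        Arrays.fill(from, null); // reclaim the entire from-space in one go

        System.out.println("to-space: " + Arrays.toString(to));
        System.out.println("live objects: " + next);
    }
}
```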

The young generation of the heap uses the copying algorithm. As mentioned earlier, the young generation is divided into Eden, From Survivor, and To Survivor. Since almost all new objects are allocated here, Eden is much larger than each Survivor space, so Eden and the Survivor spaces do not need to follow the copying algorithm's default 1:1 split.

The default Eden-to-Survivor ratio in HotSpot is 8:1, which means only 10% of the young generation's space is "wasted".
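The 10% figure falls straight out of the layout arithmetic (class name mine): with Eden : From : To = 8 : 1 : 1, allocation can use Eden plus one Survivor space, while the other Survivor space is kept empty as the copy target.

```java
// SurvivorMath.java -- usable fraction of the young generation with the
// default HotSpot layout Eden : From : To = 8 : 1 : 1.
public class SurvivorMath {
    public static void main(String[] args) {
        int eden = 8, from = 1, to = 1;
        int total = eden + from + to;  // 10 parts in total
        int usable = eden + from;      // Eden plus one Survivor space
        System.out.println("usable = " + (100 * usable / total) + "%");
        System.out.println("wasted = " + (100 * to / total) + "%");
    }
}
```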

You might see a problem here.

Since Eden is so much larger than the Survivor spaces, what if the total size of the objects surviving a GC exceeds the size of a Survivor space?

Indeed, in a young-generation GC the worst case is that every object in Eden survives. What does the JVM do then? Here we need to introduce a concept called the allocation guarantee.

When this happens, the young generation borrows memory from the old generation as a guarantee: objects that cannot fit into the Survivor space are moved directly into the old generation.

Mark-compact

The marking phase of mark-compact is the same as in mark-sweep, but instead of sweeping in place, all surviving objects are moved toward one end of the memory, and everything beyond that boundary is reclaimed, so there is no fragmentation as with mark-sweep.


Generational collection

This is the approach used by today's mainstream virtual machines: for the characteristics of each memory area, apply the most suitable of the algorithms described above.

For example, in the young generation most objects need to be reclaimed and only a few survive, so the young generation generally uses the copying algorithm.

The old generation, by contrast, is the memory space where object survival rates are very high, so the mark-sweep and mark-compact algorithms are used for its garbage collection.

Garbage collector

New generation collectors

Having covered the garbage collection algorithms, we now need to see how GC is actually implemented in practice, that is, the concrete applications of the algorithms above.

Serial

Serial is a single-threaded garbage collector that uses the copying algorithm. When Serial performs garbage collection, it must suspend all other threads until the collection completes, an event called STW (Stop The World). The GC in Golang also has STW pauses, suspending all running goroutines in preparation for its marking phase.

Moreover, the cause of the pause is invisible to the user, who may only notice that a request took a long time; without experience, it is hard to connect that to GC.

On the other hand, on a single-core machine Serial has no thread-interaction overhead, which can make GC more efficient. This is why Serial is still the default new generation collector in Client mode.


ParNew

The only difference between ParNew and Serial is that ParNew is multithreaded while Serial is single-threaded; otherwise the collection algorithm and collection behavior are exactly the same.

This collector may perform somewhat worse than Serial in a single-core environment, because a single core cannot exploit multithreading. In a multi-core environment, the default number of GC threads equals the number of CPUs.


Parallel Scavenge

Parallel Scavenge is also a multithreaded collector and the default garbage collector in Server mode. The two collectors above focus on reducing STW pause time, while Parallel Scavenge focuses more on system throughput.

For example, if the JVM has run for 100 minutes in total, 1 minute of which was spent in GC, the system's throughput is (100 − 1)/100 = 99%.
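The throughput formula from the example can be written out directly (class and method names are mine): throughput is the fraction of total run time not spent in GC.

```java
// Throughput.java -- GC throughput = useful run time / total run time.
public class Throughput {
    static double throughput(double totalMinutes, double gcMinutes) {
        return (totalMinutes - gcMinutes) / totalMinutes;
    }

    public static void main(String[] args) {
        // The example from the text: 100 minutes total, 1 minute of GC.
        System.out.println("throughput = " + throughput(100, 1));
    }
}
```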

The focus of throughput and short pause time is different, and you need to judge according to your actual situation.

High throughput

The shorter the total GC time, the higher the system throughput. High throughput means an individual STW pause may take a little longer than usual, which makes it more suitable for background systems without much user interaction: real-time responsiveness matters less there, and tasks get completed efficiently.

Short pause

If the STW time is short, the system responds quickly, which matters for systems that interact frequently with users: low response times make for a better user experience.

Old generation collectors

Serial Old

Serial Old is the old generation version of Serial, and it uses the mark-compact algorithm.

New generation: most objects need to be reclaimed

Old generation: most objects do not need to be reclaimed

Therefore, new generation collectors basically use the copying algorithm, and old generation collectors basically use the mark-compact algorithm.

Serial Old is also the old generation collector used by JVMs in Client mode.

Parallel Old

Parallel Old is the old generation version of Parallel Scavenge: a multithreaded collector using the mark-compact algorithm. If system throughput is your main concern, as just discussed, consider the combination of Parallel Scavenge and Parallel Old.


CMS

CMS stands for Concurrent Mark Sweep and, as the name says, uses the mark-sweep algorithm. It is a collector focused on the lowest possible STW time; if your application cares deeply about response time, consider using CMS.


The core steps are:

  • Initial mark: mark all objects directly reachable from the GC Roots. This requires STW but does not take long.
  • Concurrent mark: multiple threads perform reachability analysis from the GC Roots (GC Roots Tracing) while the application keeps running. This is the time-consuming phase.
  • Remark: because the program kept running during concurrent marking, the state of some objects may have changed, so a remark pass is needed, which again requires STW. In time taken: initial mark < remark < concurrent mark.
  • Concurrent sweep: once marking is complete, the sweep is performed concurrently with the application.

CMS is a garbage collector with obvious advantages, such as concurrent GC and low STW times. But CMS also has several disadvantages.

Disadvantages

As we mentioned at the beginning, the mark-sweep algorithm leaves discontiguous memory, known as memory fragmentation. If you then need to allocate space for a larger object, you may find there is no contiguous space big enough, triggering yet another Full GC.

Second, CMS may consume more CPU resources than a throughput-oriented collector. If the application is already CPU-intensive, less CPU is left for GC, the overall GC takes longer, and system throughput drops.

G1

G1, which stands for Garbage First, is so highly regarded that it became the default garbage collector in JDK 9. Parallel Scavenge focuses on throughput, while CMS focuses on shorter STW times; G1 is designed to keep STW time as low as possible while maximizing throughput.

As we've seen, the collectors discussed above divide a contiguous heap into a new generation and an old generation, with the new generation further split into Eden and two smaller Survivor spaces, all contiguous. G1, on the other hand, introduces a new concept: the Region.

A Region is one of many equal-sized but non-contiguous blocks of memory. G1 still adopts the idea of generations, but without the physical separation of the other collectors: Regions belonging to the new generation and the old generation are scattered across the heap.


H stands for Humongous, a Region dedicated to large objects. To avoid frequently copying large objects, G1 places them directly in the old generation. So how does G1 compare with the other garbage collectors?

Viewed as a whole, G1 uses a mark-compact algorithm; viewed between Regions, it uses the copying algorithm. Either way, G1 does not produce memory fragmentation at run time the way CMS does.

In addition, G1 can use multiple CPUs to shorten STW time and can execute concurrently with user threads. It also offers a predictable pause-time model, letting the user specify that no more than a given number of milliseconds in a time slice should be spent on GC. G1 can do this because, unlike other collectors, it does not always collect the entire new or old generation: it tracks the amount of garbage in each Region and collects the most garbage-laden Regions first (hence "Garbage First"), avoiding garbage collection across the whole heap.

conclusion

This summary is from the reference blog, and it sums things up nicely.

  • Serial: serial, new generation, copying algorithm, response-time priority; Client mode in a single-CPU environment
  • Serial Old: serial, old generation, mark-compact, response-time priority; Client mode in a single-CPU environment, and the backup for CMS
  • ParNew: parallel, new generation, copying algorithm, response-time priority; paired with CMS in Server mode on multi-CPU machines
  • Parallel Scavenge: parallel, new generation, copying algorithm, throughput priority; background computation without much interaction
  • Parallel Old: parallel, old generation, mark-compact, throughput priority; background computation without much interaction
  • CMS: concurrent, old generation, mark-sweep, response-time priority; Java applications on Internet sites or B/S system servers
  • G1: concurrent, both generations, mark-compact + copying, response-time priority; server-side applications, intended to replace CMS in the future

References

  • Java garbage Collection (GC) mechanism in detail
  • An in-depth understanding of the JVM(3) — seven garbage collectors
