“Java Virtual Machine Garbage Collection (1): Basics” explained how to determine whether an object is alive or dead, and “Java Virtual Machine Garbage Collection (2): Garbage Collection Algorithms” covered several common garbage collection algorithms.

This article looks at the seven garbage collectors in the HotSpot virtual machine: Serial, ParNew, Parallel Scavenge, Serial Old, Parallel Old, CMS, and G1. It first introduces some garbage collection concepts, then covers each collector’s features, application scenarios, parameters, and basic operating principles.

1. Overview of garbage collector

The garbage collector is the concrete implementation of the garbage collection algorithms (mark-sweep, copying, mark-compact, train). Garbage collectors may differ between vendors and between JVM versions; this article focuses on the garbage collectors in the HotSpot virtual machine.

1-1. Garbage collector combination

As of JDK 7/8, all collectors in the HotSpot virtual machine and their possible combinations (connecting lines) are as shown below:

(A) The figure shows 7 collectors that work on different generations:

Serial, ParNew, Parallel Scavenge, Serial Old, Parallel Old, CMS, G1;

(B) Their position indicates whether they are young generation or old generation collectors:

Young generation collectors: Serial, ParNew, Parallel Scavenge;

Old generation collectors: Serial Old, Parallel Old, CMS;

Whole-heap collector: G1;

(C) A line between two collectors indicates that they can be used together:

Serial/Serial Old, Serial/CMS, ParNew/Serial Old, ParNew/CMS, Parallel Scavenge/Serial Old, Parallel Scavenge/Parallel Old, G1;

(D) Serial Old also serves as the fallback when CMS encounters a “Concurrent Mode Failure” (described later);
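To see which combination a running JVM has actually selected, the standard java.lang.management API can be queried. The following is a minimal sketch; the class name ActiveCollectors is an illustrative assumption, and the names the beans report are implementation-specific (HotSpot typically reports one bean for the young generation collector and one for the old generation collector).

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of the JDK.
final class ActiveCollectors {

    /** Returns the names the running JVM reports for its garbage collectors. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(bean.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Print each collector with its collection count and accumulated time.
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: collections=%d, time=%d ms%n",
                    bean.getName(), bean.getCollectionCount(), bean.getCollectionTime());
        }
    }
}
```

Running it with different collector flags (for example the ones listed in the sections below) is a quick way to confirm which pairing is in effect.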

1-2. The difference between concurrent and parallel garbage collection

(A) Parallel

Multiple garbage collection threads work in parallel while the user threads remain in a waiting state;

Examples: ParNew, Parallel Scavenge, Parallel Old;

(B) Concurrent

Refers to the simultaneous execution of the user thread and the garbage collection thread (but not necessarily in parallel and may be executed alternately);

The user program continues to run while the garbage collector thread runs on another CPU.

Examples: CMS, G1 (which is also parallel);

1-3. The difference between Minor and Full GC

(A) Minor GC

Also known as young generation GC; refers to garbage collection that takes place in the young generation;

Because most Java objects are short-lived, Minor GC is frequent and generally fast;

(B) Full GC

Also known as Major GC or old generation GC; refers to GC that takes place in the old generation;

A Full GC is often accompanied by at least one Minor GC (not always: the Parallel Scavenge collector’s collection policy allows it to go straight to a Major GC);

Major GC is typically 10 or more times slower than Minor GC.

The following sections describe the features, rationale, and usage scenarios of these collectors, focusing on CMS and G1, two relatively complex collectors. But one point needs to be made clear:

There is no best collector, and there is no universal collector;

You can only select collectors that are appropriate for specific application scenarios.

2. Serial collector

The Serial garbage collector is the most basic collector and has the longest history;

Before JDK 1.3.1, it was the only choice for young generation collection in HotSpot;

1. Features

Targets the young generation;

Uses the copying algorithm;

Collects with a single thread;

While it collects garbage, all worker threads must be paused until it completes;

That is, “Stop The World”;

The Serial/Serial Old collector runs as follows:

2. Application scenarios

It is still the default new generation collector for HotSpot in Client mode.

There are also advantages over other collectors:

Simple and efficient (compared with other collectors running on a single thread);

In an environment limited to a single CPU, the Serial collector has no thread interaction (switching) overhead and therefore achieves the highest single-thread collection efficiency;

In desktop application scenarios, the memory in use is generally small (tens of megabytes to one or two hundred megabytes), so a collection finishes in a fairly short time (tens of milliseconds to a little over one hundred milliseconds); as long as it does not happen frequently, this is acceptable;

3. Set parameters

“-XX:+UseSerialGC”: add this parameter to explicitly select the serial garbage collector;

4. Stop The World

The JVM initiates and completes it automatically in the background, invisibly to the user, and all of the user’s normal worker threads are stopped; this is the GC pause;

It gives users a poor experience;

From JDK 1.3 to the present, moving from the Serial collector to the Parallel collectors, then CMS, then G1, user thread pauses have kept getting shorter, but they still cannot be eliminated completely;

For more information about “Stop The World”, see section “2-2, Reachability Analysis Algorithm” in “Java Virtual Machine Garbage Collection (1): Basics”.

For more information about the Serial collector, see:

“Memory Management in the Java HotSpot™ Virtual Machine” white paper, section 4.3 Serial Collector: www.oracle.com/technetwork…

“Java Platform, Standard Edition HotSpot Virtual Machine Garbage Collection Tuning Guide”, section 5 Available Collectors: docs.oracle.com/javase/8/do…

3. ParNew collector

The ParNew garbage collector is the multithreaded version of the Serial collector.

1. Features

Apart from using multiple threads, its behavior and characteristics are the same as the Serial collector’s;

For example, the control parameters available to the Serial collector, the collection algorithm, Stop The World, the object allocation rules, and the reclamation strategy are all the same;

The two collectors share a lot of code;

The ParNew/Serial Old collector runs as follows:

2. Application scenarios

In Server mode, the ParNew collector is very important because, apart from Serial, it is currently the only collector that can work with the CMS collector;

However, in a single-CPU environment it is no better than the Serial collector, because of the thread interaction overhead.

3. Set parameters

“-XX:+UseConcMarkSweepGC”: once CMS is specified, ParNew is used as the young generation collector by default;

“-XX:+UseParNewGC”: explicitly selects ParNew;

“-XX:ParallelGCThreads”: sets the number of garbage collection threads; by default ParNew starts as many collection threads as there are CPUs;

4. Why only ParNew works with the CMS collector

CMS, introduced in JDK 1.5, is HotSpot’s first truly concurrent collector: the first to let garbage collection threads work (basically) at the same time as user threads;

As an old generation collector, CMS cannot work with Parallel Scavenge, the young generation collector that already existed in JDK 1.4;

This is because Parallel Scavenge (and G1) do not use the traditional GC collector code framework and are implemented independently, while the other collectors share part of the framework code;

More on the CMS collector later.

4. Parallel Scavenge collector

The Parallel Scavenge garbage collector is also known as the Throughput Collector because it is closely tied to throughput.

1. Features

(A) Some characteristics are similar to the ParNew collector

Young generation collector;

Uses the copying algorithm;

Multithreaded collection;

(B) Its main feature is that its focus differs from that of other collectors

Collectors such as CMS focus on minimizing the time user threads are paused during garbage collection;

The goal of the Parallel Scavenge collector is to achieve a controllable throughput;

Throughput and collector goals are described in detail later in this section;

2. Application scenarios

The goal is high throughput, that is, reducing garbage collection time so that user code gets more time to run;

It suits applications that run on multiple CPUs and have no particularly strict pause time requirement, that is, programs that mainly compute in the background and need little interaction with users;

For example, applications that perform batch processing, order processing, payroll, or scientific computation;

3. Set parameters

The Parallel Scavenge collector provides two parameters for precise throughput control:

(A) “-XX:MaxGCPauseMillis”

Sets the maximum garbage collection pause time, in milliseconds, as a value greater than 0;

Setting MaxGCPauseMillis to a smaller value may shorten pauses, but it may also reduce throughput,

because garbage collection may then occur more frequently;

(B) “-XX:GCTimeRatio”

Sets the ratio of garbage collection time to total time, as an integer n with 0<n<100;

GCTimeRatio is effectively a direct setting of the throughput goal.

The fraction of total time spent in garbage collection is calculated as follows:

      1 / (1 + n)

For example, -XX:GCTimeRatio=19 sets a goal of 5% of the total time for garbage collection: 1/(1+19);

The default is 1%, that is 1/(1+99), with n=99;

The time spent in garbage collection is the total for both young generation and old generation collections;

If the throughput goal is not met, the generation sizes are increased so that the application can run longer between collections;
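The arithmetic behind GCTimeRatio can be checked with a few lines of Java. This is only a sketch of the 1 / (1 + n) formula above; the class and method names are illustrative assumptions and are not part of HotSpot.

```java
// Hypothetical helper illustrating the GCTimeRatio formula; not JVM code.
final class GcTimeRatioSketch {

    /** Fraction of total time allowed for GC when -XX:GCTimeRatio=n. */
    static double gcTimeFraction(int n) {
        if (n <= 0 || n >= 100) {
            throw new IllegalArgumentException("n must satisfy 0 < n < 100");
        }
        return 1.0 / (1 + n);
    }

    /** The throughput goal implied by the same setting. */
    static double throughputGoal(int n) {
        return 1.0 - gcTimeFraction(n);
    }

    public static void main(String[] args) {
        for (int n : new int[] {19, 99}) {
            System.out.printf("GCTimeRatio=%d -> GC %.1f%%, throughput goal %.1f%%%n",
                    n, 100 * gcTimeFraction(n), 100 * throughputGoal(n));
        }
    }
}
```

With n=19 the GC share is 5% and the throughput goal 95%; with the default n=99 they are 1% and 99%.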

In addition, there is another parameter worth watching:

(C) “-XX:+UseAdaptiveSizePolicy”

With this parameter enabled, there is no need to manually specify details such as:

The young generation size (-Xmn), the Eden-to-Survivor ratio (-XX:SurvivorRatio), and the size threshold for allocating objects directly in the old generation (-XX:PretenureSizeThreshold);

The JVM collects performance monitoring data as the system runs and adjusts these parameters dynamically to provide the most suitable pause time or the maximum throughput; this is known as GC adaptive tuning (GC Ergonomics);

Here is a recommended approach:

(1) Set only the basic memory sizes (such as “-Xmx” for the maximum heap);

(2) Then use “-XX:MaxGCPauseMillis” or “-XX:GCTimeRatio” to give the JVM an optimization goal;

(3) Leave the tuning of the detailed parameters to the JVM’s adaptive policy;

This is an important difference between the Parallel Scavenge collector and the ParNew collector;

For more information on target tuning and GC adaptive tuning strategies, see:

“Memory Management in the Java HotSpot™ Virtual Machine”, “Automatic Selections and Behavior Tuning”: www.oracle.com/technetwork…

“Java Platform, Standard Edition HotSpot Virtual Machine Garbage Collection Tuning Guide”: docs.oracle.com/javase/8/do…

4. Throughput and collector goals

(A) Throughput

The ratio of the CPU time spent running user code to the total CPU time consumed;

Throughput = time running user code / (time running user code + garbage collection time);

High throughput means less garbage collection time and longer runtime for user code;
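As a rough illustration of the throughput formula, the sketch below computes it from two durations and, in main, estimates it for the running JVM. The class name is an assumption; the estimate uses wall-clock uptime and the accumulated collection time reported by the GC beans, so it only approximates the CPU-time definition given above.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper; an approximation, not an official JVM metric.
final class ThroughputSketch {

    /** Throughput = user time / (user time + GC time). */
    static double throughput(long userMillis, long gcMillis) {
        return (double) userMillis / (userMillis + gcMillis);
    }

    public static void main(String[] args) {
        long gcMillis = 0;
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            // getCollectionTime() returns -1 when the value is undefined.
            gcMillis += Math.max(bean.getCollectionTime(), 0);
        }
        long uptimeMillis = ManagementFactory.getRuntimeMXBean().getUptime();
        long userMillis = Math.max(uptimeMillis - gcMillis, 0);
        System.out.printf("approximate throughput: %.2f%%%n",
                100 * throughput(userMillis, gcMillis));
    }
}
```

For example, 99 minutes of user code and 1 minute of garbage collection give a throughput of 99%.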

(B) Goals the garbage collector aims for (its concerns)

(1) Pause time

Shorter pauses suit applications that need to interact with users;

Fast response improves the user experience;

(2) Throughput

High throughput uses CPU time efficiently and finishes computation tasks as soon as possible;

Mainly suitable for background tasks that do not need much interaction;

(3) Footprint

Minimize the heap’s memory footprint while meeting the first two goals;

This can yield better spatial locality;

For more information about the Parallel Scavenge collector, see:

Section 6 of the official garbage collection tuning guide: docs.oracle.com/javase/8/do…

All of the collectors above are young generation collectors; next we turn to the old generation collectors;

5. Serial Old collector

Serial Old is the old generation version of the Serial collector;

1. Features

Targets the old generation;

Uses the “mark-compact” algorithm (with compaction; Mark-Sweep-Compact);

Single-threaded collection;

The Serial/Serial Old collector runs as follows:

2. Application scenarios

It is mainly used in Client mode;

In Server mode it has two main uses:

(A) Paired with the Parallel Scavenge collector in JDK 1.5 and earlier (before Parallel Old existed);

(B) As the fallback for the CMS collector, used when a Concurrent Mode Failure occurs during concurrent collection (detailed below);

For more information about the Serial Old collector:

White paper on memory management section 4.3.2: www.oracle.com/technetwork…

6. Parallel Old collector

The Parallel Old garbage collector is the old generation version of the Parallel Scavenge collector;

It has only been available since JDK 1.6;

1. Features

Targets the old generation;

Uses the “mark-compact” algorithm;

Multithreaded collection;

The Parallel Scavenge/Parallel Old collector runs as follows:

2. Application scenarios

In JDK 1.6 and later, it is used in place of the Serial Old collector;

Especially in Server mode on machines with multiple CPUs;

For applications that prioritize throughput and are sensitive to CPU resources, consider Parallel Scavenge together with Parallel Old;

3. Set parameters

“-XX:+UseParallelOldGC”: selects the Parallel Old collector;

For more information about the Parallel Old collection process, see:

“Memory Management in the Java HotSpot™ Virtual Machine” white paper, section 4.5.2: http://www.oracle.com/technetwork/java/javase/tech/memorymanagement-whitepaper-1-150020.pdf

7. CMS collector

The Concurrent Mark Sweep (CMS) collector is also called the Concurrent Low Pause Collector or the low-latency garbage collector;

It was briefly introduced earlier, in the ParNew collector section;

1. Features

Targets the old generation;

Based on the “mark-sweep” algorithm (no compaction, so it produces memory fragmentation);

Aims for the shortest possible collection pause;

Concurrent collection, low pause;

Requires more memory (see the disadvantages below);

It is HotSpot’s first truly concurrent collector, introduced in JDK 1.5;

For the first time, garbage collection threads work (basically) at the same time as user threads;

2. Application scenarios

Scenarios with a lot of user interaction;

The system should pause for as short a time as possible, with emphasis on service response speed,

so as to give users a better experience;

For example, typical web applications and B/S (browser/server) systems running on servers;

3. Set parameters

“-XX:+UseConcMarkSweepGC”: selects the CMS collector;

4. CMS collector operation process

It is more complex than the previous collectors and can be divided into four steps:

(A) Initial mark (CMS initial mark)

Marks only the objects directly reachable from GC Roots;

Very fast;

But it requires “Stop The World”;

(B) Concurrent mark (CMS concurrent mark)

Performs GC Roots tracing;

Marks the live objects, starting from the set produced by the initial mark;

The application keeps running at the same time;

There is no guarantee that all live objects are marked;

(C) Remark (CMS remark)

Corrects the marks of objects whose state changed because the user program kept running during concurrent marking;

Requires “Stop The World”; the pause is slightly longer than the initial mark but much shorter than the concurrent mark;

Runs on multiple threads in parallel to improve efficiency;

(D) Concurrent sweep (CMS concurrent sweep)

Reclaims all garbage objects;

Concurrent mark and concurrent sweep, which take the most time in the whole process, both run alongside the user threads;

So overall, the CMS collector’s memory reclamation runs concurrently with the user threads;
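The four phases and their pause behavior can be summarized in a small model. This is purely an illustrative sketch of the description above (the enum and class names are assumptions), not how HotSpot represents the phases internally.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model of the CMS phases described in the text.
enum CmsPhase {
    INITIAL_MARK(true),      // marks objects directly reachable from GC Roots
    CONCURRENT_MARK(false),  // traces from GC Roots while the application runs
    REMARK(true),            // corrects marks changed during concurrent marking
    CONCURRENT_SWEEP(false); // reclaims garbage while the application runs

    final boolean stopTheWorld;

    CmsPhase(boolean stopTheWorld) {
        this.stopTheWorld = stopTheWorld;
    }
}

final class CmsCycleSketch {

    /** Returns the phases during which user threads are paused. */
    static List<CmsPhase> pausingPhases() {
        List<CmsPhase> result = new ArrayList<>();
        for (CmsPhase phase : CmsPhase.values()) {
            if (phase.stopTheWorld) {
                result.add(phase);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (CmsPhase phase : CmsPhase.values()) {
            System.out.println(phase + (phase.stopTheWorld ? " (Stop The World)" : " (concurrent)"));
        }
    }
}
```

Only the two short marking phases pause the application; the two long phases are concurrent, which is where the low-pause behavior comes from.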

The CMS collector runs as follows:

5. The CMS collector has three obvious disadvantages

(A) Very sensitive to CPU resources

Concurrent collection does not pause user threads, but because it takes up part of the CPU resources, it slows the application down and reduces overall throughput;

Default number of CMS collection threads = (number of CPUs + 3) / 4;

With 4 or more CPUs, the collection threads take at least 25% of the CPU resources during concurrent collection, which may noticeably affect the user program; with fewer than 4 CPUs the impact is greater and may be unacceptable.
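The effect of the thread-count formula is easy to tabulate. The sketch below simply evaluates (number of CPUs + 3) / 4 with integer division, as stated above, and the share of CPUs this represents; the class and method names are illustrative assumptions.

```java
// Hypothetical helper evaluating the formula from the text; not JVM code.
final class CmsThreadsSketch {

    /** Default CMS collection threads per the formula (cpus + 3) / 4. */
    static int defaultCmsThreads(int cpus) {
        return (cpus + 3) / 4; // integer division
    }

    /** Share of the CPUs occupied by the collection threads. */
    static double cpuShare(int cpus) {
        return (double) defaultCmsThreads(cpus) / cpus;
    }

    public static void main(String[] args) {
        for (int cpus : new int[] {1, 2, 4, 8, 16}) {
            System.out.printf("%2d CPUs -> %d CMS thread(s), %.0f%% of the CPUs%n",
                    cpus, defaultCmsThreads(cpus), 100 * cpuShare(cpus));
        }
    }
}
```

With 2 CPUs a single collection thread already takes half of the processing capacity, which is why small machines are hit hardest.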

Incremental concurrent collector:

To address this situation, the “Incremental Concurrent Mark Sweep” collector (i-CMS) was introduced;

Similar to using preemption to simulate multitasking, it lets the collection threads and user threads run alternately, reducing the time the collection threads occupy the CPU;

But the results were not ideal, and since JDK 1.6 it is officially no longer recommended.

For more information, see:

Section 8.8 Incremental Mode of the official garbage collection tuning guide: docs.oracle.com/javase/8/do…

Section 4.6.3 of the memory management white paper also has some description;

(B) Floating garbage cannot be handled, and a “Concurrent Mode Failure” may occur

(1) Floating garbage

During concurrent sweep, user threads keep producing new garbage, known as floating garbage;

CMS therefore has to reserve some memory for the application to use during concurrent collection; unlike other collectors, it cannot wait until the old generation is almost full before collecting;

In other words, CMS needs more space than other garbage collectors;

“-XX:CMSInitiatingOccupancyFraction”: sets the old generation occupancy percentage at which a CMS collection starts, and thus how much space is held in reserve;

The JDK 1.5 default is 68%;

In JDK 1.6 it became about 92%;

(2) “Concurrent Mode Failure”

A “Concurrent Mode Failure” occurs if the memory CMS has reserved is not enough for the program’s needs;

At this point, the JVM falls back to its backup plan: it temporarily switches to the Serial Old collector, which triggers another Full GC;

This is very costly, so CMSInitiatingOccupancyFraction must not be set too high.

(C) A large number of memory fragments are generated

Because CMS is based on the “mark-sweep” algorithm, no compaction is performed after sweeping;

As noted in the introduction to the “mark-sweep” algorithm in “Java Virtual Machine Garbage Collection (2): Garbage Collection Algorithms”:

A large number of non-contiguous memory fragments can mean that no sufficiently large contiguous block is found when a large object must be allocated, so another Full GC has to be triggered early.

Solutions:

(1) “-XX:+UseCMSCompactAtFullCollection”

When the situation above forces CMS into a Full GC, this turns on the memory compaction (defragmentation) process;

However, compaction cannot run concurrently, so the pause becomes longer;

Enabled by default (how often compaction actually runs is governed by CMSFullGCsBeforeCompaction, below);

(2) “-XX:CMSFullGCsBeforeCompaction”

Sets how many Full GCs without compaction are performed before one with compaction;

This reduces the pause time spent on compaction;

The default is 0, which means compaction is performed on every Full GC;

Since free space is no longer contiguous, CMS has to use the “free list” allocation method, which is more expensive than simple “bump-the-pointer” allocation.
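The difference between the two allocation schemes, and why fragmentation can make a large allocation fail, can be shown with a toy model. This is a simplified sketch under stated assumptions (offsets in an abstract address space, first-fit search, no coalescing of holes); all names are hypothetical and it does not reflect HotSpot’s actual allocator.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Toy allocators for illustration only.
final class AllocationSketch {

    /** Pointer-bumping: free space is one contiguous block, so allocating is one addition. */
    static final class BumpAllocator {
        private final int capacity;
        private int top = 0;

        BumpAllocator(int capacity) {
            this.capacity = capacity;
        }

        /** Returns the start offset of the new block, or -1 if it does not fit. */
        int allocate(int size) {
            if (top + size > capacity) {
                return -1;
            }
            int start = top;
            top += size;
            return start;
        }
    }

    /** Free list: free space is a list of holes that must be searched (first fit). */
    static final class FreeListAllocator {
        private final List<int[]> holes = new ArrayList<>(); // each entry: {start, length}

        void addHole(int start, int length) {
            holes.add(new int[] {start, length});
        }

        /** Returns the start offset of the new block, or -1 if no single hole is large enough. */
        int allocate(int size) {
            for (Iterator<int[]> it = holes.iterator(); it.hasNext();) {
                int[] hole = it.next();
                if (hole[1] >= size) {
                    int start = hole[0];
                    hole[0] += size;
                    hole[1] -= size;
                    if (hole[1] == 0) {
                        it.remove(); // hole fully consumed
                    }
                    return start;
                }
            }
            return -1;
        }
    }
}
```

With two free holes of 30 units each, a request for 40 units fails even though 60 units are free in total; this is the fragmentation problem that compaction solves.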

For more information about how to allocate memory, see Java Object Creation in a Java Virtual Machine.

Overall, compared with the Parallel Old garbage collector, CMS reduces the application pause time during old generation garbage collection;

However, it increases the application pause time for young generation garbage collection, reduces throughput, and takes up more heap space.

For more information about the CMS collector, see:

Section 8 Concurrent Mark Sweep (CMS) Collector of the garbage collection tuning guide: docs.oracle.com/javase/8/do…

Section 4.6 Concurrent Mark-Sweep (CMS) Collector of the memory management white paper: www.oracle.com/technetwork…

8. G1 collector

G1 (Garbage-First) is the collector that became commercially available in JDK 7u4.

1. Features

(A) Parallelism and concurrency

It makes full use of the hardware advantages of multi-CPU, multi-core environments;

It can collect in parallel to shorten the “Stop The World” pause;

It can also run garbage collection concurrently with the user program;

(B) Generational collection, covering both the young and old generations

It can manage the entire GC heap (young and old generations) on its own, without being paired with another collector;

It can handle objects of different ages in different ways;

Although the generational concept remains, the memory layout of the Java heap is very different;

The whole heap is divided into multiple independent Regions of equal size;

The young and old generations are no longer physically separate; each is a set of Regions (which need not be contiguous);

For more information about G1 memory layout, see:

Section 9 of the garbage collection tuning guide: docs.oracle.com/javase/8/do…

(C) Combines several garbage collection algorithms; compacts space and produces no fragmentation

Viewed as a whole, it is based on the mark-compact algorithm;

Viewed locally (between two Regions), it is based on the copying algorithm;

This is an implementation similar to the train algorithm;

It produces no memory fragmentation, which is good for long-running applications;

(D) Predictable pauses: low pauses together with high throughput

Besides pursuing low pauses, G1 can also build a predictable pause time model;

You can explicitly specify that within a time slice of M milliseconds, garbage collection takes no more than N milliseconds;

2. Application scenarios

Server-side applications on machines with large memory and multiple processors;

Its main use is to provide a solution for applications that need low GC latency and have a large heap;

For example, with a heap of about 6GB or larger, predictable pause times can be kept below 0.5 seconds;

It is intended to replace the CMS collector released in JDK 1.5;

G1 may be a better choice than CMS when:

(1) More than 50% of the Java heap is occupied by live data;

(2) The object allocation rate or promotion rate varies significantly;

(3) GC pauses are too long (longer than 0.5 to 1 second).

Is G1 a must? Not necessarily:

If the current collector causes no problems, there is no rush to switch to G1;

If your application is looking for low pauses, G1 is worth trying;

Whether it should replace CMS has to be decided by testing in the actual scenario.

3. Set parameters

“-XX:+UseG1GC”: selects the G1 collector;

“-XX:InitiatingHeapOccupancyPercent”: the Java heap occupancy percentage at which the concurrent marking phase starts; the default value is 45;

“-XX:MaxGCPauseMillis”: sets the pause time goal for G1; the default value is 200 ms;

“-XX:G1HeapRegionSize”: sets the size of each Region, ranging from 1MB to 32MB; the goal is to have about 2048 Regions based on the minimum Java heap size;

For more information about G1 parameter Settings, see:

Section 10.5 of the garbage collection tuning guide: docs.oracle.com/javase/8/do…
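The relationship between heap size and Region size described for G1HeapRegionSize can be approximated as follows. This is a simplified sketch, assuming the size is derived by dividing the heap by the target of about 2048 Regions, rounding down to a power of two, and clamping to the 1MB to 32MB range; the class name is hypothetical and the real HotSpot heuristic may differ in detail.

```java
// Simplified approximation of how a Region size could be derived; not HotSpot code.
final class G1RegionSizeSketch {
    static final long MB = 1024L * 1024L;
    static final long MIN_REGION_SIZE = 1 * MB;
    static final long MAX_REGION_SIZE = 32 * MB;
    static final long TARGET_REGION_COUNT = 2048;

    /** Returns an approximate Region size in bytes for the given heap size. */
    static long regionSize(long heapBytes) {
        long size = Math.max(heapBytes / TARGET_REGION_COUNT, 1);
        size = Long.highestOneBit(size); // round down to a power of two
        return Math.min(Math.max(size, MIN_REGION_SIZE), MAX_REGION_SIZE);
    }

    public static void main(String[] args) {
        for (long heapMb : new long[] {512, 4096, 6144, 65536}) {
            System.out.printf("heap %6d MB -> region %2d MB%n",
                    heapMb, regionSize(heapMb * MB) / MB);
        }
    }
}
```

Under these assumptions a 4GB heap gets 2MB Regions, while very small and very large heaps hit the 1MB and 32MB limits.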

4. Why can the G1 collector achieve predictable pauses

G1 can build a predictable pause time model because:

It can deliberately avoid collecting the entire Java heap at once;

G1 tracks the reclamation value of the garbage in each Region and maintains a priority list in the background;

Each time, within the allowed collection time, the Region with the highest value is reclaimed first (hence the name Garbage-First);

This ensures the highest possible collection efficiency within a limited time;
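The idea of reclaiming the most valuable Regions first within a time budget can be sketched as a greedy selection. Everything here is an illustrative assumption: the Region fields, the definition of value as reclaimable bytes per millisecond, and the greedy loop are not G1’s actual prediction model.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of "garbage first" selection; not G1's real algorithm.
final class GarbageFirstSketch {

    static final class Region {
        final String name;
        final long reclaimableBytes;  // estimated space freed by collecting it
        final double collectMillis;   // estimated time needed to collect it

        Region(String name, long reclaimableBytes, double collectMillis) {
            this.name = name;
            this.reclaimableBytes = reclaimableBytes;
            this.collectMillis = collectMillis;
        }

        /** Value of a Region: bytes reclaimed per millisecond spent. */
        double value() {
            return reclaimableBytes / collectMillis;
        }
    }

    /** Picks Regions in descending value order while they fit in the pause budget. */
    static List<Region> chooseCollectionSet(List<Region> regions, double pauseBudgetMillis) {
        List<Region> sorted = new ArrayList<>(regions);
        sorted.sort((a, b) -> Double.compare(b.value(), a.value()));
        List<Region> chosen = new ArrayList<>();
        double spent = 0;
        for (Region region : sorted) {
            if (spent + region.collectMillis <= pauseBudgetMillis) {
                chosen.add(region);
                spent += region.collectMillis;
            }
        }
        return chosen;
    }
}
```

A tighter pause budget simply means fewer Regions are chosen per collection, which is how a pause time target translates into a bounded amount of work.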

5. The problem of an object being referenced by different regions

A Region is not isolated: an object in one Region can be referenced by objects in any other Region. Does the entire Java heap have to be scanned to determine whether an object is alive?

The same problem exists in other generational collectors (it is just more pronounced in G1):

When collecting the young generation, does the old generation have to be scanned as well?

That would reduce the efficiency of Minor GC;

Solution:

In both G1 and the other generational collectors, the JVM uses a Remembered Set to avoid global scans:

Each Region has a corresponding Remembered Set;

Every write to a reference-type field triggers a Write Barrier;

The barrier checks whether the reference being written points to an object in a different Region from the one holding the reference-type field (in other collectors: whether an old generation object refers to a young generation object);

If so, the reference information is recorded, through the CardTable, in the Remembered Set of the Region that contains the referenced object;

During garbage collection, the Remembered Set is added to the enumeration scope of the GC roots;

This guarantees that no global scan is needed and that nothing is missed.
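The write barrier and Remembered Set mechanism can be modeled in a few lines. This is a deliberately simplified sketch: real collectors record card-table entries rather than object references and handle stale entries, and all class and method names below are hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model of a per-Region Remembered Set maintained by a write barrier.
final class RememberedSetSketch {

    static final class Obj {
        final int region; // id of the Region this object lives in
        Obj field;        // a single reference-type field

        Obj(int region) {
            this.region = region;
        }
    }

    // Region id -> objects in OTHER Regions that hold references into it.
    private final Map<Integer, Set<Obj>> rememberedSets = new HashMap<>();

    /** Every reference store goes through here, playing the role of the write barrier. */
    void writeField(Obj holder, Obj target) {
        holder.field = target; // the actual reference write
        if (target != null && target.region != holder.region) {
            // Cross-Region reference: record it in the target Region's Remembered Set.
            rememberedSets.computeIfAbsent(target.region, k -> new HashSet<>()).add(holder);
        }
    }

    /** Extra roots to scan when collecting a Region, instead of scanning the whole heap. */
    Set<Obj> extraRootsFor(int region) {
        return rememberedSets.getOrDefault(region, new HashSet<>());
    }
}
```

When Region 1 is collected in this model, only the GC roots plus extraRootsFor(1) need to be examined; references within the same Region are never recorded.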

6. G1 collector operation process

Not counting the maintenance of the Remembered Set, the process can be divided into four steps (similar to CMS).

(A) Initial Marking

Marks only the objects directly reachable from GC Roots;

It also updates the TAMS (Next Top at Mark Start) value so that, while the next phase runs concurrently, the user program can create new objects in the correct available Regions;

Requires “Stop The World”, but is very fast;

(B) Concurrent Marking

Performs GC Roots tracing;

Marks the live objects, starting from the set produced by the initial marking;

Takes longer, but the application keeps running;

There is no guarantee that all live objects are marked;

(C) Final Marking

Corrects the marks of objects whose state changed because the user program kept running during concurrent marking;

Object changes from the previous phase are recorded in each thread’s Remembered Set Log;

The Remembered Set Log is then merged into the Remembered Set;

Requires “Stop The World”; the pause is slightly longer than the initial marking but much shorter than concurrent marking;

Runs on multiple threads in parallel to improve efficiency;

(D) Live Data Counting and Evacuation

First, the Regions are sorted by their reclamation value and cost;

Then a collection plan is made based on the GC pause time the user expects;

Finally, the garbage objects in some of the high-value Regions are reclaimed according to the plan;

The “copying” algorithm is used during reclamation: live objects are copied from one or more Regions into an empty Region on the heap, compacting and freeing memory in the process;

Evacuation is performed in parallel on multiprocessors, which reduces pause times and increases throughput;

The G1 collector runs as follows:

For more information about the G1 collector, see:

Section 9 Garbage-First Garbage Collector of the garbage collection tuning guide: docs.oracle.com/javase/8/do…

Section 10 Garbage-First Garbage Collector Tuning of the garbage collection tuning guide: docs.oracle.com/javase/8/do…

Now that we have a general overview of all the garbage collectors in the HotSpot virtual machine, later articles will look at JVM memory allocation and reclamation strategies and at JVM garbage collection tuning methods…

Original article: blog.csdn.net/tjiyu/artic…

[References]

1. “Principles of Compilation”, 2nd edition, Chapter 7

2. “Understanding the Java Virtual Machine: Advanced JVM Features and Best Practices”, 2nd edition, Chapter 3

3. “The Java Virtual Machine Specification, Java SE 8 Edition”: docs.oracle.com/javase/spec…

4. “Java Platform, Standard Edition HotSpot Virtual Machine Garbage Collection Tuning Guide”: docs.oracle.com/javase/8/do…

5. “Memory Management in the Java HotSpot™ Virtual Machine”: www.oracle.com/technetwork…

6. Official documentation of the HotSpot virtual machine parameters: docs.oracle.com/javase/8/do…

7. “Thinking in Java”, 4th edition, section 5.5 Cleanup: Finalization and Garbage Collection