In the previous section, we learned about generational collection in the JVM. JVM garbage collection mainly uses the mark-copy algorithm in the new generation, and the mark-sweep and mark-compact algorithms in the old generation. Next, let’s take a look at some of the garbage collector implementations in HotSpot, the JDK’s default virtual machine.

1. Common garbage collectors

First, take a look at all the garbage collectors available prior to JDK 11.

Seven garbage collectors are shown in the figure. A line between two collectors indicates that they can work together, and the area a collector sits in indicates whether it belongs to the new generation or the old generation.

The collection algorithm used by each garbage collector is also marked. The G1 collector is a special case: viewed as a whole it uses the mark-compact algorithm, while locally it uses the mark-copy algorithm, but more on that later.

1.1 Serial collector

The Serial collector is the most basic and oldest.

As its name suggests, it is a single-threaded collector: it uses only one processor or one collection thread to do garbage collection. While garbage collection is taking place, all other worker threads must be paused until the collection is complete. This is called “Stop The World.”

The Serial/Serial Old collector runs as follows:

1.2. ParNew collector

The ParNew collector is essentially a multithreaded parallel version of the Serial collector, using multiple threads for garbage collection.

The working process of the ParNew collector is shown below:

Note that Par stands for Parallel, and that Parallel here only means that multiple GC threads work together at the same time, not that GC threads and user threads run at the same time. ParNew garbage collection still needs to Stop The World.

1.3. Parallel Scavenge collector

The Parallel Scavenge collector is a new generation collector based on the mark-copy algorithm that can also collect in parallel. It is similar to ParNew, but Parallel Scavenge is primarily focused on throughput.

Throughput is the ratio of the time spent running user code to the total processor time consumed (user code time plus garbage collection time). The higher the ratio, the smaller the share of time spent on garbage collection.
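As a rough illustration of this definition, the sketch below computes throughput from a user-code time and a GC time, and also estimates it for the running JVM using the standard java.lang.management beans. The class and method names are made up for this example. The live estimate is only an approximation, because the collection time a collector reports is an accumulated elapsed time and may include work done alongside user threads.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcThroughput {
    /** Throughput = user-code time / (user-code time + GC time). */
    static double throughput(long userMillis, long gcMillis) {
        long total = userMillis + gcMillis;
        return total == 0 ? 1.0 : (double) userMillis / total;
    }

    /** Sums the accumulated collection time reported by every collector in this JVM. */
    static long totalGcMillis() {
        long sum = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 if the collector does not report a time
            if (t > 0) {
                sum += t;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        // Assumption: everything that is not GC time counts as user-code time.
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        long gc = totalGcMillis();
        long user = Math.max(0, uptime - gc);
        System.out.printf("approximate throughput: %.4f%n", throughput(user, gc));
    }
}
```

For instance, 99 minutes of user code and 1 minute of garbage collection give throughput(99, 1), that is, a throughput of 99%.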

The Parallel Scavenge collector provides two parameters for precise throughput control:

  • -XX:MaxGCPauseMillis, the maximum garbage collection pause time. This is a goal rather than a guarantee: the collector adjusts the size of the new generation to try to keep each collection under this pause time. Simply put, the smaller the area, the less time it takes to collect. Smaller is therefore not always better for this parameter: if it is set too low, the new generation will be too small and GC will be triggered more frequently.

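  • -XX:GCTimeRatio, which sets the ratio of garbage collection time to application time. A value of N means the collector aims to spend no more than 1/(1+N) of the total time in garbage collection, so a larger N asks for higher throughput. Like MaxGCPauseMillis, it is a goal that the collector tries to meet by adjusting the heap size.

The Parallel Scavenge collector is also often referred to as a “throughput-first collector” because of its focus on throughput.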
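To make the GCTimeRatio arithmetic concrete, here is a small sketch that converts a -XX:GCTimeRatio value N into the GC time share 1/(1+N) that the collector aims for, and into the corresponding throughput goal. The class and method names are illustrative only and are not part of any JVM API.

```java
class GcTimeRatioGoal {
    /** Target share of total time spent in GC for -XX:GCTimeRatio=n, i.e. 1 / (1 + n). */
    static double gcTimeFraction(int gcTimeRatio) {
        if (gcTimeRatio < 0) {
            throw new IllegalArgumentException("GCTimeRatio must not be negative");
        }
        return 1.0 / (1 + gcTimeRatio);
    }

    /** Target throughput is whatever remains after the GC share. */
    static double throughputGoal(int gcTimeRatio) {
        return 1.0 - gcTimeFraction(gcTimeRatio);
    }

    public static void main(String[] args) {
        for (int n : new int[] {19, 99}) {
            System.out.printf("GCTimeRatio=%d -> GC share %.2f%%, throughput goal %.2f%%%n",
                    n, 100 * gcTimeFraction(n), 100 * throughputGoal(n));
        }
    }
}
```

With N = 19 the GC share is 1/20, or 5%; with N = 99 it is 1/100, or 1%. Both goals could be passed on a command line such as java -XX:+UseParallelGC -XX:MaxGCPauseMillis=200 -XX:GCTimeRatio=19 -jar app.jar, where app.jar and the specific values are placeholders.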

1.4 Serial Old collector

Serial Old is the old generation version of the Serial collector. It is also a single-threaded collector and uses the mark-compact algorithm.

The Serial Old collector works as follows:

1.5. Parallel Old collector

Parallel Old is the old generation version of the Parallel Scavenge collector. It supports parallel collection with multiple threads and is based on the mark-compact algorithm.

1.6. CMS collector

The CMS (Concurrent Mark Sweep) collector aims for the shortest possible collection pause time. It is also an old generation collector and uses the mark-sweep algorithm.

CMS collects garbage in four steps:

  • Initial mark (CMS Initial Mark): runs single-threaded and needs to Stop The World; marks only the objects that GC Roots reference directly.
  • Concurrent mark (CMS Concurrent Mark): no pause; runs at the same time as the user threads, traversing the entire object graph starting from the objects directly referenced by GC Roots.
  • Remark (CMS Remark): runs multithreaded and needs to Stop The World; corrects the marks of objects whose references changed while the user threads ran during concurrent marking.
  • Concurrent sweep (CMS Concurrent Sweep): no pause; runs at the same time as the user threads, clearing the dead objects identified by the marking phases.

A short digression on the tricolor abstraction is inserted here because it relates to the multiple marking phases. The tricolor abstraction describes the state of an object during garbage collection.

Usually white indicates that the object has not been visited yet, gray indicates that the object has been visited but the objects it references have not all been scanned, and black indicates that the object and all of its direct references have been scanned. Objects that are still white when marking ends are unreachable. This abstraction is used in the CMS marking and sweeping process; see [5] for more details.
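The three colors can be sketched as a simple worklist traversal. This is a minimal, single-threaded illustration with invented class names, not how CMS is implemented: a real concurrent collector also needs barriers to cope with user threads changing references while marking is in progress.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

class TriColorMarking {
    enum Color { WHITE, GRAY, BLACK }

    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        Color color = Color.WHITE; // every object starts out unvisited

        Node(String name) {
            this.name = name;
        }
    }

    /** Marks everything reachable from the roots; nodes left WHITE are garbage. */
    static void mark(List<Node> roots) {
        Deque<Node> gray = new ArrayDeque<>();
        for (Node root : roots) {
            if (root.color == Color.WHITE) {
                root.color = Color.GRAY;
                gray.push(root);
            }
        }
        while (!gray.isEmpty()) {
            Node node = gray.pop();
            for (Node child : node.refs) {
                if (child.color == Color.WHITE) {
                    child.color = Color.GRAY; // visited, references not scanned yet
                    gray.push(child);
                }
            }
            node.color = Color.BLACK; // all direct references have been scanned
        }
    }

    public static void main(String[] args) {
        Node a = new Node("a");
        Node b = new Node("b");
        Node c = new Node("c"); // not reachable from the root
        a.refs.add(b);
        c.refs.add(b);
        mark(Arrays.asList(a));
        for (Node n : Arrays.asList(a, b, c)) {
            System.out.println(n.name + " is " + n.color);
        }
    }
}
```

In this example a and b end up black, while c stays white and would be reclaimed by the sweep.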

The Concurrent Mark Sweep collector runs as follows:

Advantages: The main advantages of CMS are already in the name – concurrent collection, low pauses.

Disadvantages: CMS also has three obvious disadvantages.

  • The Mark Sweep algorithm causes memory fragmentation

  • The concurrent capability of CMS depends on CPU resources. During concurrent collection, garbage collection threads may preempt resources of user threads, resulting in performance degradation of user programs.

  • In the concurrent sweep phase, the user threads are still running and keep producing new garbage. This so-called “floating garbage” cannot be handled by the current garbage collection and must be left for the next one. If there is too much floating garbage, new garbage collections can be triggered, resulting in performance degradation.

1.7. Garbage First collector

The Garbage First (G1 for short) collector is a milestone in the development of garbage collector technology. It pioneered the design idea of partial collection and a Region-based memory layout.

While G1 is still designed to follow generational collection theory, its heap memory layout is very different from that of other collectors. Earlier collectors divide the heap into fixed generations: the new generation, the old generation, the permanent generation, and so on.

G1 divides the contiguous Java heap into independent Regions of equal size, each of which can act as the Eden space of the new generation, a Survivor space, or old generation space, as needed. The collector can apply different policies to Regions that play different roles.

In this way, garbage is collected in units of Regions rather than across the entire heap. In addition, G1 maintains a priority list that tracks the collection value of each Region (how much space a collection would free and how long it would take), and the Region with the highest value is collected first.
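The idea of collecting the most valuable Regions first can be sketched as a greedy selection under a pause budget. Everything here is a simplifying assumption made for illustration: the Region fields, the definition of value as reclaimable bytes per millisecond of estimated cost, and the greedy loop. The real G1 policy is considerably more involved.

```java
import java.util.ArrayList;
import java.util.List;

class RegionSelection {
    static final class Region {
        final int id;
        final long reclaimableBytes;  // space a collection of this Region would free
        final double estimatedCostMs; // estimated time needed to collect it

        Region(int id, long reclaimableBytes, double estimatedCostMs) {
            if (estimatedCostMs <= 0) {
                throw new IllegalArgumentException("cost must be positive");
            }
            this.id = id;
            this.reclaimableBytes = reclaimableBytes;
            this.estimatedCostMs = estimatedCostMs;
        }

        /** Hypothetical "value": bytes freed per millisecond of pause. */
        double value() {
            return reclaimableBytes / estimatedCostMs;
        }
    }

    /** Picks Regions in descending value order while they still fit the pause budget. */
    static List<Region> chooseCollectionSet(List<Region> candidates, double pauseBudgetMs) {
        List<Region> sorted = new ArrayList<>(candidates);
        sorted.sort((a, b) -> Double.compare(b.value(), a.value()));
        List<Region> chosen = new ArrayList<>();
        double spentMs = 0;
        for (Region r : sorted) {
            if (spentMs + r.estimatedCostMs <= pauseBudgetMs) {
                chosen.add(r);
                spentMs += r.estimatedCostMs;
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<Region> regions = new ArrayList<>();
        regions.add(new Region(0, 900, 3));
        regions.add(new Region(1, 200, 1));
        regions.add(new Region(2, 1000, 10));
        regions.add(new Region(3, 50, 5));
        for (Region r : chooseCollectionSet(regions, 5.0)) {
            System.out.println("collect region " + r.id);
        }
    }
}
```

With a 5 ms budget, Regions 0 and 1 are selected. Region 2 would free the most space in absolute terms, but it does not fit the budget and has a lower value per millisecond.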

The G1 collector can be roughly divided into the following four steps:

  • Initial mark: marks the objects that GC Roots reference directly. This step needs to Stop The World (STW).

  • Concurrent marking: runs concurrently with the user threads; starting from GC Roots, it analyzes the reachability of objects in the heap, recursively scanning the whole object graph to find the objects to be reclaimed.

  • Final marking (Remark): needs to STW; handles the objects whose references changed during concurrent marking.

  • Live Data Counting and Evacuation: draws up a collection plan, selects multiple Regions to form a collection set, copies the live objects of the Regions in the collection set into empty Regions, and then clears the entire space of the old Regions. This step needs to STW.

G1 has many advantages over CMS, such as letting the user specify a maximum pause time, laying out memory by Region, and dynamically choosing which Regions to collect based on the expected payoff.

From the memory point of view alone, unlike the mark-sweep algorithm of CMS, G1 as a whole is a collector based on the mark-compact algorithm, while locally (between two Regions) it is based on the mark-copy algorithm. Either way, both algorithms mean that G1 does not produce memory fragmentation while it runs and that it provides contiguous free memory once garbage collection is complete.
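Why copying between Regions leaves no fragmentation can be shown with a toy model. In this sketch a Region is just an array of slots, and the live flags stand in for the result of the marking phases; both are assumptions made for illustration and say nothing about how HotSpot represents objects.

```java
import java.util.Arrays;

class RegionEvacuation {
    /**
     * Copies the live objects of one Region to the front of an empty Region,
     * then frees the whole source Region. Returns the number of objects copied.
     */
    static int evacuate(String[] from, boolean[] live, String[] to) {
        int top = 0; // next free slot in the target Region
        for (int i = 0; i < from.length; i++) {
            if (from[i] != null && live[i]) {
                to[top++] = from[i];
            }
        }
        Arrays.fill(from, null); // the source Region is now entirely free
        return top;
    }

    public static void main(String[] args) {
        String[] from = {"a", "b", "c", "d"};
        boolean[] live = {true, false, true, false};
        String[] to = new String[from.length];
        int copied = evacuate(from, live, to);
        System.out.println(copied + " live objects, target region: " + Arrays.toString(to));
    }
}
```

The survivors sit next to each other at the start of the target Region, and the source Region is returned as one contiguous block of free space.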

2. Cutting-edge garbage collector

2.1. ZGC collector

In JDK 11, ZGC was added as an experimental collector. Its average pause time is less than 2 milliseconds. It is a low-pause, highly concurrent collector.

Like ParNew (used with CMS) and G1, ZGC uses a mark-copy algorithm, but with a major improvement: ZGC runs almost entirely concurrently in the mark, transfer, and relocation phases, which is the most critical reason ZGC achieves its goal of pause times under 10 ms.

Although ZGC is still experimental in JDK 11, its algorithm and ideas are a big step forward, and it has a very promising future.

3. Garbage collector selection

3.1 Collector selection trade-offs

There are a number of trade-offs in choosing a garbage collector: for example, what infrastructure does the application run on, and which vendor’s JDK distribution is it using?

Here’s a brief list of the scenarios suited to some of the collectors mentioned above:

  • Serial: if the application has a small memory footprint (up to about 100 MB), or if it runs on a single processor and has no pause time requirements.
  • Parallel: if peak performance of the application is the first priority and there are no pause time requirements, or pauses of 1 second or more are acceptable.
  • CMS/G1: if response time matters more than overall throughput and garbage collection pauses must be kept shorter than approximately 1 second.
  • ZGC: if response time is a high priority, or the heap is very large.

3.2. Set up the garbage collector

Set the garbage collector (combination) parameters as follows:

New generation       Old generation    JVM parameter
Incremental          Incremental       -Xincgc
Serial               Serial Old        -XX:+UseSerialGC
Parallel Scavenge    Serial Old        -XX:+UseParallelGC -XX:-UseParallelOldGC
ParNew               Serial Old        N/A
Serial               Parallel Old      N/A
Parallel Scavenge    Parallel Old      -XX:+UseParallelGC -XX:+UseParallelOldGC
ParNew               Parallel Old      N/A
Serial               CMS               -XX:-UseParNewGC -XX:+UseConcMarkSweepGC
Parallel Scavenge    CMS               N/A
ParNew               CMS               -XX:+UseParNewGC -XX:+UseConcMarkSweepGC
G1                   G1                -XX:+UseG1GC
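To check which combination a given set of flags actually selected, one option is to ask the running JVM for the names of its collectors. The sketch below uses the standard java.lang.management API; the class name ActiveCollectors is made up for this example.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class ActiveCollectors {
    /** Returns the names of the garbage collectors registered in this JVM. */
    static List<String> names() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (String name : names()) {
            System.out.println(name);
        }
    }
}
```

Running it once with java -XX:+UseSerialGC ActiveCollectors and once with java -XX:+UseG1GC ActiveCollectors should print different collector names; the exact strings depend on the JVM version.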

References:

[1] : Zhou Zhiming, Understanding the Java Virtual Machine in Depth: Advanced JVM Features and Best Practices

[2] : The Garbage Collection Handbook: The Art of Automatic Memory Management

[3] : Garbage Collection in Java — What is GC and How it Works in the JVM

[4] : Some key technologies of Java HotSpot G1 GC

[5] : GC Algorithms: Implementations

[6] : Exploration and practice of a new generation of garbage collector ZGC