Object Search algorithm and Collection Algorithm introduces the basic algorithm of garbage collection, which is equivalent to the methodology of garbage collection. Let’s look at the implementation of garbage collection in detail.

As mentioned above, modern commercial virtual machines use generational collection, with different collectors for different regions. There are seven commonly used collectors, whose scope of application is shown in the figure


Let’s look at the characteristics of each collector

Serial collector

Serial is a single thread that performs garbage collection. When it is time to do a garbage collection, the program pauses everything in hand and executes a single thread of garbage collection.

Because the new generation is characterized by low survival rate of objects, the collection algorithm uses the replication algorithm to copy the new generation of surviving objects to the old age, with little content and good performance.

ParNew collector

ParNew, also used in the new generation, is a multithreaded version of Serial and is identical to Serial in terms of parameters and algorithms (again, copying algorithms).

Par is short for Parallel, but its parallelism only refers to collecting multithreaded parallelism, not that the collection and the original program can be done in Parallel. ParNew also needs to suspend everything in the program and then multithread garbage collection.

Parallel avenge

The new generation of collectors, also using the replication algorithm, is also parallel multi-threaded collection. The biggest difference with ParNew is that it focuses on garbage collection throughput.

Throughput here refers to the ratio of total time to garbage collection time. The higher the ratio, the smaller the garbage collection is.

The Parallel Scavenge collector provides two parameters to control garbage collection execution:

  • -xx :MaxGCPauseMillis, the maximum garbage collection pause time. The principle of this parameter is that the collector will control the area size of the new generation to ensure that the collection is less than this maximum pause time. Simply put, the smaller the area, the less time it takes to recycle. So this parameter is not set as small as possible. If set too low, the Cenozoic generation space will be too small and GC will be triggered more frequently.
  • -xx :GCTimeRatio, ratio of garbage collection time to total time. This is the reciprocal of throughput, same principle as MaxGCPauseMillis.

The Parallel Insane collector is focused on throughput, so when you set the above parameters, you don’t want to set the area size (Cenozoic, old age, etc.). You can turn on the ** -xx :UseAdaptiveSizePolicy** argument to let the JVM monitor collected performance and adjust these region size parameters dynamically.

Serial Old collector

Older collectors, like Serial, were single-threaded, except that the algorithm used mark-compact.

Parallel Old collector

The Older Collector is a version of the Parallel Avenge. The algorithm was replaced with Mark-compact.

CMS collector

CMS, Concurrent Mark Sweep, is also an older collector. It focuses on the minimum pause times (low pauses) for garbage collection, which is appropriate in older, infrequent GC scenarios.

The naming of concurrent, rather than parallel, indicates that the collector is capable of concurrent work execution. MS means it uses the Mark Sweep algorithm.

Let’s see how it works. The whole CMS process is more complex than the previous collector, the whole process is divided into four steps:

  • Initial mark, single-threaded execution, needs to “Stop The World”, but only to mark objects directly reachable by GC Roots, which is very fast because The directly reachable objects are small.
  • Concurrent marking, the concurrent tracing of initial mark objects marked by the initial marking process while other threads continue to work. Here the time is longer, but there is no pause.
  • Remark: During the concurrent marking process, new garbage may be generated, so the newly generated garbage needs to be re-marked. The parallel marker is executed here, not concurrent with The user thread, so it’s still “Stop The World”, which takes a little longer than The initial time.
  • Concurrent sweep, a concurrent sweep of previously flagged garbage. Other user threads can still work without pauses.

Due to the above features of CMS, the disadvantages are also relatively obvious,

  • The Mark Sweep algorithm causes memory fragmentation
  • The concurrent capability of a CMS depends on CPU resources. Therefore, the CMS performance deteriorates when the NUMBER of cpus is small or the CPU resources are tight
  • In the concurrent cleanup phase, the user thread is still running, so new garbage is still generated, and garbage from this phase is not collected in the current GC, but stored in the next one. Therefore, GC cannot wait for memory to run out before GC, which can result in concurrent cleanouts when user threads run out of space. So there’s going to be some wasted memory reserved for the user thread.

One might think that since Mark Sweep causes memory fragmentation, why not change the algorithm to Mark Compact?

The answer is simple, because when concurrent cleanup is done, using Compact memory, how do you use the memory used by the user thread? To ensure that the user thread can continue to execute, the resource it is running on is not affected. The Mark Compact is better suited for “Stop the World” scenarios.

G1 collector

G1, Garbage First, was officially launched in JDK 1.7 and was the cutting edge Garbage collector at the time. The G1 is the ultimate CMS improvement, addressing the problem of fragmentation and more memory space. Although the process is similar to a CMS, the underlying principles are completely different.

High efficiency is preferred. G1 predicts the pause time of garbage collection by calculating the benefit rate of old objects and prioritizing those with the most benefit.

Heap memory structure differences. Previous collector generation is divided into new generation, old age, lasting generation, etc.

G1 divides memory into multiple regions of the same size. Each Region has its own generation attributes, but these generations need not be continuous.

Such partitioning can effectively avoid memory fragmentation problems.

But this also introduces a new problem of generational memory discontinuities, resulting in the need for a full sweep to find reference memory when GC searches for garbage objects.

To solve this problem, G1 maintains a Remembered Set for each Region, which records object references. Search for references to Remembered Set when GC occurs.

There are two GC modes:

  • The Young GC focuses on all regions of the Young generation and controls the collection time of GC by controlling the number of regions collected in the Young generation.
  • Mixed GC focuses on all regions of the younger generation and adds several regions of the older generation to calculate the maximum payoff by prediction.

The overall execution process:

  • Initial mark, which marks objects directly associated with reachability from GC Root. Stop the World (STW) is executed.
  • Concurrent marking, in which the object of the initial marking is concurrently marked, while the user thread can still execute.
  • Final marking (Remark), STW, marking and then the garbage generated in the process of concurrent marking.
  • Filter (Live Data Counting And Evacuation), evaluate marked garbage, And recover garbage based on GC mode. STW execution.

Amazing ZGC

In JDK 11, an experimental ZGC was added. It takes less than 2 milliseconds to reclaim on average. It is a collector with low pauses and high concurrency.

ZGC is executed almost everywhere concurrently, except for the STW that is initially marked. So the pause time is almost spent on the initial tag, which is actually very little. So how can other phases be executed concurrently?

“Colored Pointer” and “Load Barrier” are two new technologies in ZGC.

ZGC uses the 64-bit bits of the Pointer to denote Finalizable, Remapped, Marked1, and Marked0 (ZGC only supports 64-bit platforms) to mark the storage state of the Pointer. The object’s pointer is labeled with information about the object. Note that the pointer here is equivalent to a reference in Java terminology.

When the pointed memory changes (when the Compact is moved), the color changes.

As mentioned in G1, the Compact phase requires STW, otherwise it will affect user thread execution. So how to solve this problem?

Because of the coloring pointer, it is easy to know the state of the object in memory when the program is accessing it at runtime, if the requested memory is being colored. Then the read barrier is triggered. The read barrier updates the pointer and returns the result, at some cost, in order to achieve the effect of concurrency with the user thread.

Put these two technologies together, to quote RednaxelaFX

Compared with the traditional algorithm of tagged objects, ZGC marking on the pointer, join the Load when accessing the pointer Barrier (read Barrier), such as when the object moving by GC, the pointer on the color will be wrong, the Barrier will first pointer updates to effectively address and back, that is, never read only a single object probability were to slow down, There is no Stop The World that grates on The whole in order to keep The application consistent with GC.

ZGC is still in the experimental stage in JDK 11, but it will become a mainstream GC collector in the near future because the algorithm and ideas are such a big improvement.


More technical articles, wonderful dry goods, please pay attention to the blog: Zackku.com wechat official number: Zack said code