Garbage collection algorithm

Garbage collection algorithm classification

Generational collection theory

Now the common garbage collectors on the market have adopted the theory of generational collection. Generational collection is the division of memory into young and old based on the lifetime of objects. In the new generation of objects, a large number of objects (99%) die per collection, so you can opt for a mark-copy algorithm that can complete each garbage collection with a small amount of object replication cost. However, the survival probability of old objects is higher, and there are more surviving objects. If the replication algorithm needs to pay a high IO cost, and there is no additional space for replication, it is reasonable to choose mark-clear or mark-tidy.

Mark-copy algorithm

The mark-copy algorithm divides memory into two regions of the same size (such as a new generation of Survivor regions), allocates elements in one of these regions at a time, and when this region is full, copies the surviving elements to another memory region and emptying the current memory region.

  • Cons: Waste half of memory space.
  • Pros: Simple and efficient.

The JVM stores new objects in Eden, and during GC, copies the surviving objects from Eden and Survivor to another partition of Survivor. This is an optimization of the replication algorithm by the JVM. Only 1/10 of the memory space is wasted.

Mark-clear algorithm

The algorithm consists of two stages: marking and clearing, marking the objects that are alive, retrieving the unmarked objects uniformly (which is the general case), or vice versa, marking all the objects that need to be collected and recycling (the cost of marking is relatively large).

  • Advantages: the most basic collection algorithm, relatively simple
  • Disadvantages: low efficiency (if more objects need to be marked), space problems (a large number of discontinuous memory space will be generated after the mark is cleared)

Space problem Possible problems: After A GC occurs, the memory reclaims a large number of dead objects, resulting in a large amount of free memory. In this case, the system generates a large object, which may fail because there is not enough contiguous memory space. [Enough space, but not enough continuous space]

Mark-collation algorithm

Solves the space problem in tag clearing.

The marking process is still the same as the mark-clean algorithm, but instead of reclaiming the recyclable object directly, the next step is to move all the living objects to one end, and then directly clean up the memory beyond the end boundary.

Garbage collector

The current mainstream garbage collector on the market are: Serial collector, Parallel Scavenge collector [JDK8 default collector], ParNew collector, CMS collector, G1 collector.

Serial collector | – XX: + UseSerialGC – XX: + UseSerialOldGC

THE single-threaded garbage collector, THE oldest garbage collector, “stops THE WORLD” stops THE WORLD during THE execution of THE Serial collector: only runs THE GC thread, and pauses THE other threads.

The new generation uses the replication algorithm, and the old generation uses the mark-sorting algorithm.



It is simple and efficient (compared to the single threads of other collectors). Serial collectors can achieve high single-threaded collection efficiency without the overhead of thread interaction.

The Parallel Scavenge collector | – XX: + UseParallelGC – XX: + UseParallelOldGC

Similar to the multithreaded version of the Serial collector, the number of GC threads enabled by default is the same as the number of CPU cores (-xx :ParallelGCThreads can be modified, butChange is not recommended.)



The application of the Parallel Scexploiture is the throughput (efficient utilization of the CPU). Garbage collectors such as CMS are more concerned with pause times for user threads (improving user experience). Throughput is the ratio of the CPU time spent running user code to the total CPU consumption.

| – XX ParNew collector: + UseParNewGC

ParNew and the Parallel Scavenge collector are extremely familiar. ParNew is a new Parallel Scaenge collector developed to support the CMS collector.

CMS collector | – XX: + UseConcMarkSweepGC

The CMS collector is an old-zone garbage collector that is often used in conjunction with the ParNew collector. It is suitable for use in applications that focus on user experience and allows GC threads and user threads to run concurrently (partial steps). The mark-clear algorithm is adopted. The recycling process is divided into five steps:

  1. Initial flag: Suspends other threads (STW) and flags objects directly referenced by GC Roots. Process soon
  2. Concurrent marking: Start with the object directly referenced by GC Roots and work down to all referenced objects. This process is the most time-consuming and is executed concurrently with the user thread. Possible problem with this process: new garbage (floating garbage) may be generated during user thread execution, which cannot be flagged.
  3. Re-marking: to identify the object that changes in the correction concurrent marking, the incremental update algorithm in the three-color marking is mainly used to mark. This process suspends the other processes (STW).
  4. Concurrent cleanup: Objects marked as recyclable are reclaimed and executed concurrently with user threads. * This process also generates new garbage objects (floating garbage) that will be collected in the next GC.
  5. Concurrent reset: Clears flags for objects marked as unrecyclable.

G1 collector | – XX: + UseG1GCwebsite

Heap structure

The G1 collector uses a completely different heap memory allocation method. It divides the heap memory into 2048 regions of the same size (each with a size of 1MB to 32MB).

Allocation of heap memory

These regions are logically and dynamically divided into Eden, Survivor, and old generation. These regions don’t have to be continuous.

In addition to these three categories, there is a new type added to G1: Humongous Regions. If the object is in more than 50% of the Regions, it will be moved into the region of this type.Such large objects should be avoided as much as possible

Expected GC-STW time

G1 adds a new configuration (-xx :MaxGCPauseMillis=200) to set the time we expect for each GC-STW. This is a relative value and will not be strictly executed at this time. The JVM will evaluate the total GC time, and if the expected time is not met, the JVM will use some algorithm to reclaim only part of the cost-effective memory space to achieve this goal.

Young GC

By default, 5% of the heap memory is allocated for the new generation. When the new generation is full, the JVM evaluates how long it will take to collect the new generation. If it is shorter than the expected GC-STW time, the Young GC will not be triggered immediately, and the new generation will be expanded. Otherwise, the Young GC is triggered.

The Young GC uses a replication algorithm to copy the surviving objects to Survivor regions. And clear the original region.

  • This process “stops the world” and recalculates and saves the size of the new generation so that the next GC can proceed quickly.
  • Young GC is multi-threaded.

Mixed GC

The entire heap usage rate (- XX: InitiatingHeapOccupancyPercent = 45), trigger Mixed GC

Mixed process of GC

  1. Initial flag: Suspends other threads (STW) and flags objects directly referenced by GC Roots. Process soon
  2. Concurrent marking: Start with the object directly referenced by GC Roots and work down to all referenced objects. This process is the most time-consuming and is executed concurrently with the user thread. The region will be marked blank —-
  3. Resigning: This process suspends other processes (STW) in order to fix the object identity that has changed in concurrent markup. [Then the blank region is reclaimed at this step]
  4. Concurrent cleanup: During this phase, G1 compares regions of older ages. Regions with lower “active” are reclaimed first. (Active: the region with the fewest living objects has lower replication cost and faster reclamation.) The old and the new recycle together.
  5. Copy survivor objects: The surviving objects of the previous reclamation region are copied to an unused region region and compressed.

G1’s official recommendation

  1. Do not set the young generation size
    • Setting the size of the younger generation disables the target for the desired GC-STW time.
    • The G1 no longer expands and shrinks the younger generation as needed. Because the dimensions are fixed, you cannot change the dimensions.
  2. Expected GC-STW time
    • It is best not to set the average GC time, but to use 90% GC time.
  3. In Mixed GC, Evacuation Failure
    • Evacuation Failure occurs when no more free regions are promoted to the older generation or copied to the surviving space, and the heap cannot expand because it has reached its maximum value. This triggers the Full GC, which is similar to the single-threaded garbage collection of the Serial collector and is very time-consuming
    • Increase the size of (-xx :G1ReservePercent=10)
    • To reduce the size of the (- XX: InitiatingHeapOccupancyPercent = 45)
    • Increased the number of concurrent token threads (-xx :ConcGCThreads=n)

For details about G1 JVM parameters, see the official website