An overview of the

This article mainly describes the structure of G1 garbage collector, garbage collector processing steps, garbage collection classification, common parameter Settings, as well as application scenarios and suggestions

This article is mainly based on OpenJDK-1.8

G1 garbage Collector

G1 (garbage-first) is a server-oriented Garbage collector, mainly for machines matching multi-core processors and large memory capacity. It meets the requirements of GC pause time with a high probability and features high throughput.

G1 garbage collector structure

G1’s region-based heap memory layout is key to its ability to achieve this goal. While G1 is still designed to follow the generational collection theory, its heap memory layout is very different from other collectors: G1 no longer insists on a fixed size and a fixed number of generational regions, but divides the continuous Java heap into multiple independent regions of equal size. Each Region can act as the Eden space of the new generation, Survivor space, or old chronospace as required. The collector can apply different strategies to regions that play different roles, so that both newly created objects and old objects that have survived multiple collections for a while can be collected well.

Regions also have a special class of Humongous regions that are used to store large objects. G1 considers large objects as long as the size of a Region exceeds half of its capacity. ** the size of each Region can be set using -xx :G1HeapRegionSize. The value ranges from 1MB to 32MB and is the NTH power of 2. ** The relevant setup code is as follows:

// Minimum region size; we won't go lower than that.
// We might want to decrease this in the future, to deal with small
// heaps a bit more efficiently.
#define MIN_REGION_SIZE  (      1024 * 1024 )

// Maximum region size; we don't go higher than that. There's a good
// reason for having an upper bound. We don't want regions to get too
// large, otherwise cleanup's effectiveness would decrease as there
// will be fewer opportunities to find totally empty regions after
// marking.
#define MAX_REGION_SIZE  ( 32 * 1024 * 1024 )

// The automatic region size calculation will try to have around this
// many regions in the heap (based on the min heap size).
#define TARGET_REGION_NUMBER          2048

size_t HeapRegion::max_region_size(a) {
  return (size_t)MAX_REGION_SIZE;
}

void HeapRegion::setup_heap_region_size(size_t initial_heap_size, size_t max_heap_size) {
  uintx region_size = G1HeapRegionSize;
  if (FLAG_IS_DEFAULT(G1HeapRegionSize)) {
    size_t average_heap_size = (initial_heap_size + max_heap_size) / 2;
    region_size = MAX2(average_heap_size / TARGET_REGION_NUMBER,
                       (uintx) MIN_REGION_SIZE);
  }

  int region_size_log = log2_long((jlong) region_size);
  // Recalculate the region size to make sure it's a power of
  // 2. This means that region_size is the largest power of 2 that's
  // <= what we've calculated so far.
  region_size = ((uintx)1 << region_size_log);

  // Now make sure that we don't go over or under our limits.
  if (region_size < MIN_REGION_SIZE) {
    region_size = MIN_REGION_SIZE;
  } else if (region_size > MAX_REGION_SIZE) {
    region_size = MAX_REGION_SIZE;
  }

  // And recalculate the log.
  region_size_log = log2_long((jlong) region_size);

  // Now, set up the globals.
  guarantee(LogOfHRGrainBytes == 0."we should only set it once");
  LogOfHRGrainBytes = region_size_log;

  guarantee(LogOfHRGrainWords == 0."we should only set it once");
  LogOfHRGrainWords = LogOfHRGrainBytes - LogHeapWordSize;

  guarantee(GrainBytes == 0."we should only set it once");
  // The cast to int is safe, given that we've bounded region_size by
  // MIN_REGION_SIZE and MAX_REGION_SIZE.
  GrainBytes = (size_t)region_size;

  guarantee(GrainWords == 0."we should only set it once");
  GrainWords = GrainBytes >> LogHeapWordSize;
  guarantee((size_t) 1 << LogOfHRGrainWords == GrainWords, "sanity");

  guarantee(CardsPerRegion == 0."we should only set it once");
  CardsPerRegion = GrainBytes >> CardTableModRefBS::card_shift;
}
Copy the code

For super-large objects that exceed the capacity of the entire Region, they will be stored in N consecutive Humongous regions. Most behaviors of G1 treat Humongous regions as part of the old era, as shown in the figure.

While G1 still retains the concept of Cenozoic and oldene, Cenozoic and oldene are no longer fixed, but rather dynamic collections of a series of regions that do not need to be continuous. The G1 collector is able to model predictable pause times because it uses regions as the smallest unit of a single collection, that is, each collection of memory is an integer multiple of Region size, so that it can systematically avoid all-region garbage collection across the entire Java heap. A more specific approach is to have the G1 collector track the “value” of garbage accumulation in each Region, which is the amount of space collected and the experience of how long it takes to collect, and then maintain a list of priorities in the background, each time according to the user set the allowable collection pause time (using the parameter -xx: MaxGCPauseMillis specifies, the default is 200 milliseconds), which prioritises regions that benefit most from collecting value, hence the name “Garbage First”. This use of regions and prioritized Region collection ensures that the G1 collector achieves the highest possible collection efficiency in a limited amount of time.

The G1 garbage collector has the following features:

  • Parallelism and concurrency: G1 can make full use of the hardware advantages of multi-CPU, multi-core environment, using multiple cpus to shorten the stop-the-world downtime time, some of the other collectors originally need to pause Java threads to perform GC operations, G1 collector can still allow Java programs to continue to run through the way of concurrency.
  • Generational collection: Although G1 can manage the entire GC heap independently without the assistance of other collectors, the concept of generational collection is retained.
  • Spatial consolidation: Unlike CMS’s mark-clean algorithm, G1 is a collector based on the mark-clean algorithm as a whole and on the “copy” algorithm locally (between two regions). However, both algorithms mean that G1 does not generate memory space fragmentation during operation, and that it can be collected to provide neat free memory. This feature helps programs run for a long time and allocate large objects without triggering the next GC prematurely because contiguity memory space cannot be found.
  • Predictable pauses: This is one of G1’s advantages over CMS. Reducing pauses is a common concern of BOTH G1 and CMS, but G1 also models predictable pauses that allow users to explicitly specify a time segment of M milliseconds to complete garbage collection.

G1 Collection Procedure

  • Initial Marking (STW) : Simply mark objects to which GC Roots can be directly associated, and modify the value of the TAMS pointer so that the next phase of user threads running concurrently allocates new objects in the available regions correctly. ** This phase requires the thread to pause, but it is very short, and it is done synchronously during the Minor GC, so the G1 collector actually has no extra pauses during this phase. 支那

  • Concurrent Marking: An analysis of the reachability of objects in the heap, starting with GC Root, recursively scans the entire heap object graph to find objects to reclaim. This phase is time-consuming but can be performed concurrently with user programs. When the object graph scan is complete, it is also necessary to reprocess objects recorded by SATB that have reference changes at the time of concurrency.

  • Final Marking (STW) : Another short pause on the user thread to process the last few SATB records that remain after the concurrent phase is over.

  • Live Data Counting and Evacuation (STW) : Responsible for updating statistics for regions and ordering the recovery value and cost of each Region (set JVM parameters: -xx :MaxGCPausemillis). You can select multiple regions to form a collection and copy the surviving objects of the Region to an empty Region. Then clean up the entire old Region. The operation here involves the movement of the living object, which must suspend the user thread, and is done in parallel by multiple collector threads. The reclamation algorithm mainly adopts the replication algorithm to copy the living objects of one region to other regions. This method does not have many memory fragments to be defragmentation after reclamation like CMS. G1 uses the replication algorithm to recycle memory fragments

The G1 collector maintains a list of priorities in the background. Each time, the Region with the highest collection value (hence its name garbage-first) is selected based on the allowed collection time. For example, a Region can collect 10M Garbage in 200ms. The other Region can collect 20M garbage in 50ms. If the time for collecting garbage is limited, G1 will choose the latter Region to collect garbage only in limited time. This method of using regions to divide memory space and priority ensures that the G1 collector can improve garbage collection efficiency as much as possible within a limited time.

G1 Garbage collection and classification

Young GC

G1 will calculate how long it will take to reclaim Eden. If the reclaim time is much less than the value set by -xx :MaxGCPauseMills, then add the Region of the Young generation and continue to store objects. The Young GC will not be done immediately until the next Eden area is full and G1 calculates the reclaim time close to the value set by parameter: -xx :MaxGCPauseMills, then the Young GC will be triggered.

Mixed GC

Is not a Full GC, the heap of old s share reach parameters (- XX: InitiatingHeapOccupancyPercent) set the value of the trigger, Collect all Young and part of Old (priorializing garbage collection for Old areas based on expected GC pauses) and large object areas. Normally G1’s garbage collector is MixedGC, which uses a replication algorithm. Copy live objects from each Region to other regions. If there are not enough regions available to hold the copied objects, a Full GC will be triggered

Full GC

Stopping the system program and using a single thread to mark, clean, and compress to free up a number of regions to provide a Mixed GC can be very time consuming. (Shenandoah optimized for multithreaded collection)

G1 collector parameter Settings

-xx :+UseG1GC: uses the G1 collector

-xx :ParallelGCThreads: Specifies the number of GC threads to work on

-xx :G1HeapRegionSize: specifies the partition size (1MB to 32MB, N acts of 2). By default, the heap is divided into 2048 partitions

-xx :MaxGCPausemillis: Day mark pause time (default 200ms)

-xx :G1NewSizePercent: specifies the initial memory size of the new generation. (The default value is 5%. The default value is an integer.

-xx :G1MaxNewSizePercent: indicates the maximum memory size of the new generation

-XX:TargetSurvivorRatio: If the total number of Survivors in Survivor area (age 1+ age 2+ age N) exceeds 50% of the Survivors area, all Survivors with age n and above will be transferred to older ages

-xx: MaxtenuringThreshold: maximum age threshold (default 15)

-XX:InitiatingHeapOccupancyPercent: If the space occupied by the old generation reaches the total heap memory value (45% by default), the Mixed GC of the new generation and the old generation will be implemented. For example, as we said before, the heap has 2048 regions by default. If nearly 1000 regions are regions of the old generation F, The MixedGC might be triggered.

– XX: G1MixedGCLiveThresholdPercent: 85% (the default) in the Region live objects below this value オ will recycle the Region, if more than this value, live objects too much, the significance of recycling of small

-xx :G1MixedGCCountTarget: Specifies the number of filter recycling times in a collection process (the default is 8 times). In the last filter recycling phase, the collection can be collected for a while, and then the collection can be paused to resume the system operation. In this way, the system word pause time can be reduced.

-xx :G1HeapWastePercent: (default 5%): Check whether the threshold of the Region hollow out during GC is sufficient. In mixed reclamation, the Region is reclaimed based on the replication algorithm. In all cases, the surviving objects of the Region to be reclaimed are added to other regions and then to this Region Once the number of free regions reaches 5% of the heap memory, the mixed collection stops immediately, indicating the end of the mixed collection.

G1 collector optimization recommendations

Assuming that the -xx :MaxGCPauseMills parameter is set to a large value, the young generation may occupy 60% of the heap memory if the system is running for a long time. The era GC is triggered at this point.

In this case, there may be too many surviving objects in the Survivor area, and the old age will be entered.

Or after the young generation GC, too many objects survive and enter the Survivor zone, triggering the dynamic year-by-year judgment rule, reaching 50% of the Survivor zone, and also causing some objects to enter the old age.

The key here is to adjust the value of the -xx :MaxGCPauseMills parameter to ensure that its young GC is not too frequent, but also to consider how many surviving objects will enter the old age after each GC, and to trigger the Mixed GC frequently.

Application scenarios of G1

  • More than 50% of the Java heap is occupied by active data
  • The frequency of object assignment or chronological ascension varies greatly
  • GC pauses are too long (longer than 0.5 to 1 second)
  • More than 8GB of heap memory (recommended)
  • The pause time is less than 500ms

How can a system optimize the JVM for hundreds of thousands of concurrent requests per second

Kafka has a similar system that supports high concurrency. It is common for Kafka to handle tens of thousands or even hundreds of thousands of messages per second. Generally, partial cursing Kafka requires large memory machines (such as 64GB), which means that young generations can be allocated 40 GB of memory to support high concurrency. We used to say that the young GC is fast for Eden. Will it be fast in this case? Given that 30 or 40 GIGABytes of memory can be recollected in a few seconds at the soonest, it may take only a minute or two to fill up 30 or 40 GIGABytes of Eden with Kafka concurrency. That means the entire system runs for a minute or two and is unable to process new messages for a few seconds because the Young GC is stuck, which is obviously not a good idea. So to optimize this situation, we can use the G1 collector and set -xx: MaxgcpauseMills is 50ms, assuming that 50ms can recover three to four GIGABytes of memory, and then 50ms stackage is actually completely acceptable and the user has almost no perception, so the whole system can process business and collect garbage with almost no perception of stackage.

G1 is a natural fit for JM running on such large memory machines, and can perfectly solve the problem of large memory garbage collection taking too long.

The resources

  1. Oracle Jdk documentation
  2. Some of the key technologies for Java Hotspot G1 GC are provided by the Meituan technical team
  3. Deep Understanding of the Java Virtual Machine, 3rd edition, zhiming Zhou
  4. Java Virtual Machine specification (Java SE 8 edition), translated by Love Fly, Zhou Zhiming et al