What is garbage

Garbage is memory that is no longer in use. In Java, memory is allocated dynamically and reclaimed automatically. Understanding the garbage collection mechanisms and tuning strategies helps us diagnose and resolve memory problems such as leaks.

The JVM runtime data area is divided into the program counter, the virtual machine stack, the native method stack, the heap, and the method area. The program counter, virtual machine stack, and native method stack are thread-private and are released automatically when the thread ends, so they need no management. Garbage collection therefore only needs to focus on the heap and the method area.

Java memory allocation

Heap allocation

  • If thread-local allocation buffers (TLAB) are enabled, each thread allocates from its own TLAB first.
  • Objects are allocated in Eden first.
  • Large objects go directly into the old generation.
  • Long-lived objects are promoted to the old generation.
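
The placement rules above can be sketched as a small decision function. This is only an illustration: the class name, the "large object" cutoff, and the promotion age are assumptions chosen for the example, and real values depend on the collector and its flags.

```java
public class AllocationSketch {
    enum Space { EDEN, OLD }

    // Hypothetical thresholds for illustration only.
    static final long LARGE_OBJECT_BYTES = 1024 * 1024; // assumed "large object" cutoff
    static final int TENURING_AGE = 15;                 // assumed promotion age

    // Where a newly created object of the given size would be placed.
    static Space placeNew(long sizeBytes) {
        return sizeBytes >= LARGE_OBJECT_BYTES ? Space.OLD : Space.EDEN;
    }

    // Whether an object that has survived `age` young collections is promoted.
    static boolean shouldPromote(int age) {
        return age >= TENURING_AGE;
    }

    public static void main(String[] args) {
        System.out.println(placeNew(64));                // small object
        System.out.println(placeNew(4L * 1024 * 1024));  // large object
        System.out.println(shouldPromote(3));
        System.out.println(shouldPromote(15));
    }
}
```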

Stack allocation

With escape analysis, the JVM can determine whether an object is only ever used inside one method. If the object does not escape the method, it can be allocated on the stack and is destroyed automatically when the method returns, which reduces the pressure on the garbage collector.
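
As a rough illustration of what "escaping" means, the sketch below contrasts an object that stays inside its method with one that is stored in a field. The class and method names are made up for the example, and whether the JIT actually applies the optimization depends on the JVM and its settings.

```java
public class EscapeSketch {
    static final class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    // `p` never leaves this method: it is not returned, stored in a field,
    // or handed to another thread, so it is a candidate for the optimization.
    static long sumNoEscape(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            Point p = new Point(i, i + 1);
            total += p.x + p.y;
        }
        return total;
    }

    static Point escaped;

    // Storing the object in a static field makes it escape the method,
    // so it has to live on the heap.
    static void escape(int x, int y) {
        escaped = new Point(x, y);
    }

    public static void main(String[] args) {
        System.out.println(sumNoEscape(3));
        escape(1, 2);
        System.out.println(escaped.x + escaped.y);
    }
}
```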

How to identify garbage

Reference counting method

Add a reference counter to each object. Each time a reference to the object is created, the counter is incremented by 1; each time a reference becomes invalid, it is decremented by 1. An object whose counter is 0 is garbage.

  • Advantages: simple to implement and efficient.
  • Disadvantages: cannot handle circular references between objects. In a multithreaded environment, updating the counters also requires expensive synchronization, which lowers performance.
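
The circular reference problem can be shown with a hand-rolled counter. This is a simulation only: the `refCount` field and the `addRef`/`release` helpers are invented for the example and are not something the JVM maintains.

```java
public class RefCountSketch {
    static final class Node {
        int refCount;   // simulated counter
        Node next;
    }

    static void addRef(Node n) { n.refCount++; }
    static void release(Node n) { n.refCount--; }

    // Returns the counters of two nodes that reference each other
    // after every external reference has been dropped.
    static int[] cycleCounts() {
        Node a = new Node();
        Node b = new Node();
        addRef(a);                 // external reference to a
        addRef(b);                 // external reference to b
        a.next = b; addRef(b);     // a -> b
        b.next = a; addRef(a);     // b -> a
        release(a);                // drop external reference to a
        release(b);                // drop external reference to b
        return new int[] { a.refCount, b.refCount };
    }

    public static void main(String[] args) {
        int[] counts = cycleCounts();
        // Neither counter reaches 0, so a pure reference counter never frees the pair.
        System.out.println(counts[0] + " " + counts[1]);
    }
}
```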

Reachability analysis

This is the algorithm used by today's mainstream virtual machines. It starts from root objects (GC Roots) and searches downward through the object graph; the path it follows is called a reference chain. If no reference chain connects an object to any GC Root, the object is considered garbage.
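
A minimal sketch of the idea, assuming the object graph is given as a map from an object name to the names it references (a toy model, not how a JVM represents the heap):

```java
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class ReachabilitySketch {
    // Returns every object reachable from the roots by following references.
    static Set<String> reachable(Map<String, List<String>> refs, Collection<String> roots) {
        Set<String> marked = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String obj = pending.pop();
            if (marked.add(obj)) {   // first visit: follow its outgoing references
                pending.addAll(refs.getOrDefault(obj, List.of()));
            }
        }
        return marked;
    }

    public static void main(String[] args) {
        Map<String, List<String>> refs = Map.of(
            "A", List.of("B"),
            "B", List.of("C"),
            "D", List.of("E"),
            "E", List.of("D"));
        Set<String> live = reachable(refs, List.of("A"));
        System.out.println(live.contains("C")); // reachable through A -> B -> C
        System.out.println(live.contains("D")); // D and E only reference each other
    }
}
```

Unlike reference counting, the isolated cycle D <-> E is correctly identified as garbage because nothing connects it to a root.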

GC Roots

Objects that can be used as GC Roots include:

  • Objects referenced from the virtual machine stack
  • Objects referenced by static class fields in the method area
  • Objects referenced by constants in the method area
  • Objects referenced by JNI (native methods) in the native method stack

Reference types

Whether an object survives depends on references. Before JDK 1.2, a reference simply meant that a value of a reference type stored the address of another chunk of memory. Since JDK 1.2, Java references fall into the following four categories.

  • Strong reference: Object obj = new Object(); This is the most common kind: creating an object with new and assigning it produces a strong reference. As long as an object is reachable from GC Roots through strong references, it will not be reclaimed.

  • Soft reference: the SoftReference class. Represents objects that are useful but not essential. Before an OutOfMemoryError (OOM) is thrown, the garbage collector includes softly referenced objects in the collection scope, so they are reclaimed only when memory is about to run out.

  • Weak reference: the WeakReference class. Represents non-essential objects, which are reclaimed at the next garbage collection, for example a young GC (YGC). Because the timing of a YGC is not predictable, weakly referenced objects can be collected at any time.

  • Phantom reference: the PhantomReference class. It cannot be used to obtain the object it points to, and that object can be collected by the garbage collector at any time. Phantom references are used to track when objects are collected by the garbage collector and must be used together with a reference queue.

For example:

ReferenceQueue<String> queue = new ReferenceQueue<>();
PhantomReference<String> phantomReference = new PhantomReference<>("Hello", queue);

When the garbage collector is about to reclaim an object and finds that it has a phantom reference, it adds the phantom reference to the associated queue before reclaiming the object’s memory. The program can tell whether the object is about to be collected by checking whether the phantom reference has appeared in the queue.
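
The soft and weak reference classes described above can be tried out with a short sketch. It deliberately avoids calling System.gc(), because the timing of a real collection is not guaranteed; instead it shows the two behaviours that are deterministic. The class and method names are made up for the example.

```java
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

public class ReferenceSketch {
    // While a strong reference exists, the weak reference still returns the referent.
    static boolean weakVisibleWhileStronglyHeld() {
        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);
        return weak.get() == strong;
    }

    // clear() drops the referent explicitly, which mirrors what the collector
    // does on its own once the object is no longer strongly reachable.
    static boolean clearedReferenceReturnsNull() {
        SoftReference<byte[]> soft = new SoftReference<>(new byte[1024]);
        soft.clear();
        return soft.get() == null;
    }

    public static void main(String[] args) {
        System.out.println(weakVisibleWhileStronglyHeld());
        System.out.println(clearedReferenceReturnsNull());
    }
}
```

Code that reads a soft or weak reference should always handle a null result from get(), since the referent may have been reclaimed in the meantime.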

Garbage collection algorithm

Mark-sweep

This is the most basic algorithm. It first marks: starting from GC Roots, it follows the reference relationships and marks every object it reaches. It then sweeps, clearing every unmarked object. It has two disadvantages. First, neither the mark phase nor the sweep phase is very efficient. Second, sweeping leaves behind many discontinuous memory fragments, so when a large object has to be allocated there may be no contiguous block big enough, which triggers another GC.
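
The fragmentation problem can be seen in a toy model where the heap is an array of slots and each live object occupies one slot. The representation (an int array with 0 meaning free) is an assumption made for the example.

```java
import java.util.Set;

public class MarkSweepSketch {
    static final int FREE = 0;

    // Sweep: every slot whose object id was not marked becomes free in place.
    static int[] sweep(int[] heap, Set<Integer> marked) {
        int[] result = heap.clone();
        for (int i = 0; i < result.length; i++) {
            if (!marked.contains(result[i])) {
                result[i] = FREE;
            }
        }
        return result;
    }

    static int totalFree(int[] heap) {
        int free = 0;
        for (int slot : heap) {
            if (slot == FREE) free++;
        }
        return free;
    }

    // Length of the longest run of adjacent free slots.
    static int largestFreeRun(int[] heap) {
        int best = 0, run = 0;
        for (int slot : heap) {
            run = (slot == FREE) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        int[] heap = {1, 2, 3, 4, 5, 6, 7, 8};
        int[] swept = sweep(heap, Set.of(1, 3, 5, 7));
        // Four slots are free, but no two of them are adjacent, so an object
        // needing two contiguous slots cannot be allocated.
        System.out.println(totalFree(swept) + " free, largest run " + largestFreeRun(swept));
    }
}
```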

Copy

The copying algorithm divides memory into two areas of equal size and uses only one of them at a time. When that area is used up, the surviving objects are copied to the other area and the first area is cleared in one go. There is no memory fragmentation, but memory utilization is low because half of the space is always idle.

Most objects in the young generation die shortly after they are created, so HotSpot by default divides the young generation into one large Eden region and two small Survivor regions. During a GC, the live objects in Eden and one Survivor are copied to the other Survivor. The default ratio of Eden to the two Survivors is 8:1:1, so only 10% of the space is wasted.
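
The arithmetic behind these figures can be written out as follows. The method name is made up for the example; the ratio parameter means Eden size divided by the size of one Survivor space, which is how HotSpot's -XX:SurvivorRatio flag is defined.

```java
public class SurvivorRatioSketch {
    // Fraction of the young generation usable at any time:
    // Eden plus one Survivor, out of Eden plus two Survivors.
    static double usableFraction(int survivorRatio) {
        double totalParts = survivorRatio + 2;
        return (survivorRatio + 1) / totalParts;
    }

    public static void main(String[] args) {
        // 8:1:1 layout: 9 of 10 parts usable, 1 part (10%) held in reserve.
        System.out.println(usableFraction(8));
        // 1:1:1 layout for comparison: only 2 of 3 parts usable.
        System.out.println(usableFraction(1));
    }
}
```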

Mark-compact

As with mark-sweep, objects are marked first. Instead of clearing the dead objects in place, however, all live objects are moved toward one end of the memory, and everything beyond the resulting boundary is then freed directly.

Because the copying algorithm is inefficient and wastes space when many objects survive, the mark-compact algorithm is generally used in the old generation.
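
The compaction step can be illustrated with a toy heap in which each slot holds an object id and 0 means free. This representation is an assumption for the example; a real collector also has to update every reference to the moved objects.

```java
import java.util.Arrays;

public class CompactSketch {
    static final int FREE = 0;

    // Slide live objects (non-zero slots) toward index 0, preserving their order.
    // Everything after the last moved object is free, contiguous space.
    static int[] compact(int[] heap) {
        int[] result = new int[heap.length]; // all slots start as FREE
        int next = 0;
        for (int slot : heap) {
            if (slot != FREE) {
                result[next++] = slot;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        int[] fragmented = {1, 0, 3, 0, 5, 0, 7, 0};
        System.out.println(Arrays.toString(compact(fragmented)));
    }
}
```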

Comparison of three GC algorithms

Algorithm    | Advantages                         | Disadvantages
Mark-sweep   | Simple to implement                | Memory fragmentation
Copying      | No fragmentation, good performance | Low memory utilization
Mark-compact | No fragmentation                   | Compaction is expensive

Generational collection

Generational collection combines the algorithms above and applies each one where it fits best. The heap is generally divided into a young generation and an old generation, and each uses the algorithm that suits its characteristics. In the young generation, most objects die and only a few survive each GC, so the copying algorithm is used: only a few objects have to be copied, and no memory fragmentation is produced. In the old generation, the object survival rate is high, so mark-compact or mark-sweep is used.

Garbage collector

Terminology

  • STW: short for “Stop The World”. All application threads are suspended and the service does not respond during the pause.
  • Serial collection: a single GC thread reclaims memory while all user threads are suspended. Examples: Serial, Serial Old.
  • Parallel collection: multiple GC threads collect in parallel while the user threads are suspended. Example: Parallel.
  • Concurrent collection: the user threads and the GC threads run at the same time, without pausing the user threads. Suitable for scenarios with strict response-time requirements. Example: the CMS collector.

Serial/Serial Old

Serial is a single-threaded collector: it uses only one CPU or one collection thread for garbage collection and must stop all worker threads until the collection is complete. Serial stops the user threads and collects the young generation with the copying algorithm, while Serial Old collects the old generation with the mark-compact algorithm.

Features: Single threaded collection, STW

Use -XX:+UseSerialGC to enable Serial + Serial Old.
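
To check which collectors a running JVM has actually selected, one option is the standard java.lang.management API, as in the sketch below. The class name is made up for the example, and the names that are printed depend on the JVM and the flags it was started with.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

public class ActiveCollectors {
    // Collects the names of the garbage collectors registered in this JVM.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(bean.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Run with different -XX:+Use...GC flags to compare the output.
        for (String name : collectorNames()) {
            System.out.println(name);
        }
    }
}
```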

ParNew

ParNew is a multithreaded version of Serial that uses multiple threads for garbage collection. It is a young-generation collector that works together with the old-generation CMS collector, so when CMS is enabled, ParNew is the default young-generation collector.

The number of collection threads can be controlled with the -XX:ParallelGCThreads parameter. ParNew is also STW.

Parallel Scavenge/Parallel Old

Parallel Scavenge is a young-generation collector that uses the copying algorithm. Parallel Old is an old-generation collector that uses the mark-compact algorithm. Both collect with multiple threads.

Main parameters

  • -XX:+UseParallelGC Enables the collector
  • -XX:MaxGCPauseMillis Maximum garbage collection pause time
  • -XX:GCTimeRatio Sets the throughput

Controllable throughput

The maximum GC pause time is controlled by the -XX:MaxGCPauseMillis parameter, and the throughput is set directly with -XX:GCTimeRatio.

Increased throughput enables efficient use of CPU time and faster completion of program tasks.
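
As a worked example of how the throughput setting translates into numbers: HotSpot documents -XX:GCTimeRatio=N as a target of at most 1/(1+N) of total time spent in GC. The helper names below are made up for the example.

```java
public class ThroughputSketch {
    // Target upper bound on the fraction of time spent in GC for -XX:GCTimeRatio=N.
    static double maxGcFraction(int gcTimeRatio) {
        return 1.0 / (1 + gcTimeRatio);
    }

    // Throughput is the remaining fraction, spent running application code.
    static double throughput(int gcTimeRatio) {
        return 1.0 - maxGcFraction(gcTimeRatio);
    }

    public static void main(String[] args) {
        System.out.println(maxGcFraction(99)); // GC may take about 1% of the time
        System.out.println(throughput(19));    // about 95% left for the application
    }
}
```

This is a goal the collector tries to meet, not a guarantee; a tighter pause-time target usually costs some throughput.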

Adaptive GC strategy

Besides the two parameters above for controlling throughput, Parallel Scavenge can also enable an adaptive GC policy with -XX:+UseAdaptiveSizePolicy. Once it is enabled, there is no need to set the young generation size, the Eden/Survivor ratio, and similar parameters by hand. The virtual machine adjusts them dynamically according to how the system is running, to achieve the most suitable pause time or the maximum throughput.

CMS

CMS is an old-generation garbage collector. A collection goes through the following phases:

  • 1. Initial mark: marks only the objects directly reachable from GC Roots. STW.
  • 2. Concurrent mark: traces the object graph from GC Roots; the GC threads and user threads run at the same time.
  • 3. Remark: corrects the marks of objects that changed because the program kept running during concurrent marking. STW.
  • 4. Concurrent sweep: reclaims garbage concurrently (GC threads and user threads run at the same time).
  • 5. Concurrent reset: clears the CMS GC context information to prepare for the next GC.

Advantages: Low pauses, concurrent execution

Disadvantages:

  • Because it runs concurrently, it puts pressure on CPU resources.
  • It cannot handle floating garbage, that is, garbage produced while the collection is in progress.
  • Because it uses the mark-sweep algorithm, it produces a lot of memory fragmentation. A Full GC is triggered when a large object has to be allocated and no sufficiently large contiguous block is available.

Use -XX:+UseConcMarkSweepGC to enable the ParNew + CMS + Serial Old combination: the young generation uses ParNew, the old generation uses CMS, and Serial Old serves as the fallback when CMS fails. To address fragmentation, -XX:+UseCMSCompactAtFullCollection makes the JVM compact the old generation during a Full GC; this defragmentation is itself STW. To reduce the number of these pauses, set -XX:CMSFullGCsBeforeCompaction=n so that the JVM compacts the old generation only after n Full GCs.

CMS has been marked deprecated in JDK9 and removed in JDK14.

G1

G1, introduced in JDK 7, is a next-generation collector aimed at server-side applications. Unlike the earlier collectors, which each work on only the young or the old generation, G1 works on the entire heap.

G1 divides the Java heap into regions of equal size. The region size is set with -XX:G1HeapRegionSize; it ranges from 1 MB to 32 MB and must be a power of 2. G1 assigns each region a role: Eden, Survivor, Old, or Humongous. A Humongous region is a special kind of Old region used to store large objects.
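
The region size constraint can be expressed as a small check. The humongous test below is an assumption added for illustration: G1's documentation describes humongous objects as those of roughly half a region or more, and the exact boundary is an implementation detail. The class and method names are made up.

```java
public class RegionSketch {
    static final long MB = 1024L * 1024L;

    // A valid region size is a power of 2 between 1 MB and 32 MB.
    static boolean isValidRegionSize(long bytes) {
        return bytes >= MB && bytes <= 32 * MB && Long.bitCount(bytes) == 1;
    }

    // Assumed rule of thumb: an object of about half a region or more is humongous.
    static boolean isHumongous(long objectBytes, long regionBytes) {
        return objectBytes >= regionBytes / 2;
    }

    public static void main(String[] args) {
        System.out.println(isValidRegionSize(4 * MB));   // power of 2 within range
        System.out.println(isValidRegionSize(3 * MB));   // not a power of 2
        System.out.println(isHumongous(3 * MB, 4 * MB)); // well above half a region
    }
}
```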

The G1’s heap memory layout is different from the traditional heap memory layout.

G1 tracks how much garbage has accumulated in each region and maintains a priority list so that the regions with the most reclaimable garbage are collected first, which is why it is called Garbage-First.

Features of G1 compared with CMS

  • Concurrency and parallelism: G1 makes full use of multiple CPUs to shorten STW time. The concurrent marking phase runs concurrently with the user threads, and the final marking phase runs on multiple GC threads in parallel.
  • Generational collection: G1 keeps the generational model but can manage the entire GC heap on its own, without cooperating with other collectors.
  • Space compaction: G1 as a whole behaves like the mark-compact algorithm, while between regions (for example Eden and Survivor regions) it works like the copying algorithm. Neither produces memory fragmentation.
  • Predictable pauses: besides aiming for low pauses, G1 builds a predictable pause-time model, which lets users specify a target for how much time may be spent on garbage collection within a given time window.

G1 garbage collection modes

Young GC

  • 1. When all Eden regions are full, a Young GC is triggered.
  • 2. Objects in Eden regions are moved to Survivor regions.
  • 3. Objects in the original Survivor regions are moved to another Survivor region or promoted to an Old region.
  • 4. The freed regions are added to the free list for future use.

Mixed GC

When the old generation's share of the total heap reaches a threshold (-XX:InitiatingHeapOccupancyPercent, default 45%), a Mixed GC is triggered, which collects the entire young generation and part of the old generation.

Mixed GC collection process

  • 1. Initial mark: marks only the objects directly reachable from GC Roots and updates the TAMS (Top at Mark Start) value. STW.
  • 2. Concurrent mark: starting from GC Roots, performs reachability analysis on the objects in the heap to find the live ones. This phase runs concurrently with the user threads.
  • 3. Final mark: corrects the marking records that changed because the user threads kept running during concurrent marking. This phase is STW, but runs on multiple GC threads in parallel.
  • 4. Selective collection: ranks the regions by reclaim value and cost, and builds a collection plan according to the pause time the user expects.
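
The last step can be pictured as a greedy selection under a pause budget. The sketch below is a deliberately simplified illustration, not G1's actual heuristics: the Region class, its fields, and the cost figures are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class CollectionSetSketch {
    // Hypothetical per-region statistics.
    static final class Region {
        final String name;
        final long garbageBytes; // reclaimable space
        final double costMs;     // estimated time to collect the region

        Region(String name, long garbageBytes, double costMs) {
            this.name = name;
            this.garbageBytes = garbageBytes;
            this.costMs = costMs;
        }
    }

    // Pick regions with the best garbage-per-millisecond ratio first,
    // skipping any region that would exceed the pause budget.
    static List<String> choose(List<Region> candidates, double pauseBudgetMs) {
        List<Region> sorted = new ArrayList<>(candidates);
        sorted.sort(Comparator.comparingDouble((Region r) -> r.garbageBytes / r.costMs).reversed());
        List<String> chosen = new ArrayList<>();
        double spent = 0;
        for (Region r : sorted) {
            if (spent + r.costMs <= pauseBudgetMs) {
                chosen.add(r.name);
                spent += r.costMs;
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<Region> regions = List.of(
            new Region("A", 800, 4.0),
            new Region("B", 300, 1.0),
            new Region("C", 900, 9.0),
            new Region("D", 100, 2.0));
        // With a 6 ms budget, B and A fit; C and D would exceed it.
        System.out.println(choose(regions, 6.0));
    }
}
```
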

Full GC

In Full GC mode, G1 falls back to a Serial Old style collection, which is STW. It is triggered when copying objects runs out of memory or when enough space cannot be allocated.

How can I reduce Full GC?

  • Increase -XX:G1ReservePercent to enlarge the reserved memory.
  • Lower -XX:InitiatingHeapOccupancyPercent so that Mixed GC starts earlier; a Mixed GC is triggered when old-generation occupancy reaches this value.
  • Increase -XX:ConcGCThreads, the number of concurrent GC threads.

Conclusion

A summary of the garbage collectors described above:

  • Serial: serial, young generation, copying algorithm
  • ParNew: parallel, young generation, copying algorithm
  • Serial Old: serial, old generation, mark-compact algorithm
  • Parallel Scavenge: parallel, young generation, copying algorithm
  • Parallel Old: parallel, old generation, mark-compact algorithm
  • CMS: concurrent, old generation, mark-sweep algorithm
  • G1: concurrent + parallel, whole heap, copying and mark-compact algorithms

