I find the JVM very interesting to study. The material is dense and practical but easy to forget, so to prepare for autumn recruiting I have summarized it once more. Comments and corrections from readers are welcome. This article covers only the garbage collection part.

Don't panic: we start with an overview picture to calm the nerves.

Why – Why learn about garbage collection

  • To troubleshoot memory overflows and memory leaks
  • To tune the system when garbage collection becomes the bottleneck that keeps it from reaching higher concurrency

What1 – Which memory areas need to be reclaimed

  • Areas that need no collection: the program counter, the virtual machine stack, and the native method stack. These three regions are private to each thread. How much memory each stack frame needs is essentially known once the class structure is determined, so allocation and reclamation in these regions are deterministic: the memory is released when the method or the thread ends, and there is no need to think about collecting it
  • Areas that need collection: memory allocation in the Java heap and the method area is dynamic. Which objects will be created is only known while the program is running, so garbage collection focuses on these two regions

What2 – Which objects need to be reclaimed (Object Survival Determination Algorithm)

Reference counting algorithm

  • Principle: add a reference counter to each object. Every time a reference to the object is created, the counter is incremented by 1; when a reference becomes invalid, the counter is decremented by 1. An object whose counter is 0 can no longer be used
  • Advantages: simple implementation, efficient judgment
  • Disadvantages: it cannot handle circular references between objects, because objects that reference each other never reach a count of 0 even when nothing else refers to them
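The circular reference problem is easy to see in a toy model. The sketch below is only an illustration: `Node`, `addRef`, `addRootRef` and `dropRootRef` are hypothetical names made up for this example, and the JVM does not expose reference counts like this.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of reference counting; this is NOT how HotSpot manages memory.
class RefCountDemo {
    // Hypothetical object type that tracks how many references point at it.
    static final class Node {
        final String name;
        int refCount = 0;
        final List<Node> fields = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    // Simulates "holder.field = target": the target gains one reference.
    static void addRef(Node holder, Node target) {
        holder.fields.add(target);
        target.refCount++;
    }

    // Simulates a root (for example a local variable) starting to point at target.
    static void addRootRef(Node target) { target.refCount++; }

    // Simulates that root variable being cleared.
    static void dropRootRef(Node target) { target.refCount--; }

    // Pure reference counting only reclaims objects whose count is 0.
    static boolean reclaimable(Node n) { return n.refCount == 0; }

    public static void main(String[] args) {
        Node a = new Node("a");
        Node b = new Node("b");
        addRootRef(a);
        addRootRef(b);
        addRef(a, b); // a -> b
        addRef(b, a); // b -> a, forming a cycle
        dropRootRef(a);
        dropRootRef(b);
        // Nothing outside the cycle refers to a or b, yet both counts stay at 1.
        System.out.println(reclaimable(a)); // false
        System.out.println(reclaimable(b)); // false
    }
}
```

Both objects are garbage from the program's point of view, but a pure reference counter never frees them.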

Reachability analysis algorithm (GC Roots)

  • Principle: a set of objects called “GC Roots” serves as the starting points. The search proceeds downward from these nodes, and the path it travels is called a reference chain. When no reference chain connects an object to any GC Root, the object is proven to be unusable
  • Objects that can serve as GC Roots
    • Objects referenced in the virtual machine stack (the local variable table in a stack frame)
    • Objects referenced by static class fields in the method area
    • Objects referenced by constants in the method area
    • Objects referenced by JNI (that is, by native methods) in the native method stack
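Reachability analysis is essentially a graph search from the roots. Here is a minimal sketch, assuming the object graph is given as a plain map from object name to the names it references; the map, the names and the `reachable` helper are all made up for illustration and have nothing to do with HotSpot internals.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ReachabilityDemo {
    // Returns every object that can be reached by following references from the roots.
    static Set<String> reachable(Map<String, List<String>> refs, List<String> roots) {
        Set<String> marked = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String obj = pending.pop();
            // add() returns false if the object was already visited, which also stops cycles.
            if (marked.add(obj)) {
                pending.addAll(refs.getOrDefault(obj, List.of()));
            }
        }
        return marked;
    }

    public static void main(String[] args) {
        Map<String, List<String>> refs = new HashMap<>();
        refs.put("root", List.of("a"));
        refs.put("a", List.of("b"));
        refs.put("c", List.of("d")); // c and d reference each other...
        refs.put("d", List.of("c")); // ...but no root leads to them
        Set<String> live = reachable(refs, List.of("root"));
        System.out.println(live.contains("b")); // true
        System.out.println(live.contains("c")); // false
    }
}
```

Unlike reference counting, the isolated cycle `c <-> d` is correctly treated as garbage because no chain from a root reaches it.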

When – When does collection happen

Trigger conditions for Minor GC

  • A Minor GC is triggered when the Eden region does not have enough space

Trigger conditions for Full GC

  • When System.gc() is called: this suggests that the JVM run a Full GC, but the JVM is not required to run one
  • When the old generation runs out of space
  • When the method area runs out of space
  • When the average size of the objects promoted to the old generation by previous Minor GCs is greater than the memory available in the old generation
  • When the objects surviving a Minor GC are larger than the available memory of the To Survivor region and so have to be moved to the old generation, but the available memory of the old generation is smaller than those objects

The difference between Minor and Full GC

  • Minor GC refers to garbage collection that occurs in the young generation. Because most Java objects are short-lived, Minor GC happens very frequently and is generally fast

  • Major GC (Full GC) refers to garbage collection that occurs in the old generation. A Major GC is often accompanied by at least one Minor GC, and it is typically 10 times slower than a Minor GC.

How – How to collect (garbage collection algorithms)

Mark-sweep algorithm

  • The algorithm has two phases: “mark” and “sweep”. First, all objects to be reclaimed are marked; once marking is complete, all marked objects are reclaimed in one pass.
  • Characteristics
    • Efficiency problem: neither the marking phase nor the sweeping phase is efficient
    • Space problem: sweeping leaves a large number of non-contiguous memory fragments. With too much fragmentation, the program may fail to find enough contiguous memory when it later needs to allocate a large object, and another garbage collection has to be triggered early.
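The fragmentation problem can be shown with a toy heap. In this sketch (an illustration only, not HotSpot code) the heap is an `int[]` whose slots hold an object id or 0 for "free", and the set of marked ids is assumed to come from an earlier mark phase; `sweep` and `largestFreeRun` are hypothetical helpers.

```java
import java.util.Arrays;
import java.util.Set;

class MarkSweepDemo {
    // Sweep phase: free every slot whose object id was not marked as live.
    static void sweep(int[] heap, Set<Integer> marked) {
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != 0 && !marked.contains(heap[i])) {
                heap[i] = 0; // 0 means "free slot" in this toy model
            }
        }
    }

    // Length of the longest run of adjacent free slots.
    static int largestFreeRun(int[] heap) {
        int best = 0;
        int run = 0;
        for (int slot : heap) {
            run = (slot == 0) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        int[] heap = {1, 2, 3, 4, 5, 6};
        sweep(heap, Set.of(1, 3, 5));
        System.out.println(Arrays.toString(heap)); // [1, 0, 3, 0, 5, 0]
        System.out.println(largestFreeRun(heap));  // 1
    }
}
```

Half of the heap is free after the sweep, yet an object that needs two adjacent slots still cannot be allocated.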

Copying algorithm (used by the young generation)

  • The available memory is divided into two halves of equal capacity, and only one half is used at a time. When that half is used up, the surviving objects are copied to the other half and the used half is cleared in one go
  • Characteristics
    • Advantages: each collection reclaims an entire half, so allocation never has to deal with fragmentation. Memory is allocated in order simply by moving the heap-top pointer, which is simple to implement and efficient to run.
    • Disadvantages: the usable memory shrinks to half of its original size
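A minimal sketch of the copying step, using the same toy representation (an `int[]` of object ids with 0 as "free" and a set of live ids assumed to come from a mark phase). `copySurvivors` is a hypothetical helper, not a JVM API.

```java
import java.util.Arrays;
import java.util.Set;

class CopyingDemo {
    // Copies live objects into the to-space and returns the new heap-top (bump) pointer.
    static int copySurvivors(int[] from, Set<Integer> marked, int[] to) {
        int top = 0; // next free slot in the to-space
        for (int id : from) {
            if (id != 0 && marked.contains(id)) {
                to[top++] = id; // survivors end up packed next to each other
            }
        }
        Arrays.fill(from, 0); // the whole from-space is reclaimed at once
        return top;
    }

    public static void main(String[] args) {
        int[] from = {1, 2, 3, 4};
        int[] to = new int[4];
        int top = copySurvivors(from, Set.of(2, 4), to);
        System.out.println(Arrays.toString(to)); // [2, 4, 0, 0]
        System.out.println(top);                 // 2
    }
}
```

After the copy, the next allocation only needs to bump `top`, which is why allocation is so cheap with this algorithm.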

Mark-compact algorithm

  • The algorithm has two phases: “mark” and “compact”. The “mark” phase is the same as in the “mark-sweep” algorithm. In the compact phase, instead of reclaiming the dead objects in place, all surviving objects are moved toward one end, and the memory beyond the boundary is then cleared directly
  • Characteristics
    • Advantages: avoids fragmentation and improves space utilization
    • Disadvantages: inefficient, because both the marking phase and the compacting phase are slow
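The compact phase can be sketched as sliding the live objects toward the start of the heap in place. This again uses a toy `int[]` heap with 0 as "free" and a set of marked ids assumed to come from the mark phase; `compact` is a hypothetical helper.

```java
import java.util.Arrays;
import java.util.Set;

class MarkCompactDemo {
    // Slides live objects to the low end of the heap and returns the boundary index.
    static int compact(int[] heap, Set<Integer> marked) {
        int next = 0; // where the next surviving object will be placed
        for (int i = 0; i < heap.length; i++) {
            int id = heap[i];
            if (id != 0 && marked.contains(id)) {
                heap[next++] = id;
            }
        }
        Arrays.fill(heap, next, heap.length, 0); // everything past the boundary is free
        return next;
    }

    public static void main(String[] args) {
        int[] heap = {1, 2, 3, 4, 5, 6};
        int boundary = compact(heap, Set.of(1, 3, 5));
        System.out.println(Arrays.toString(heap)); // [1, 3, 5, 0, 0, 0]
        System.out.println(boundary);              // 3
    }
}
```

The same live set that left three isolated holes under mark-sweep now leaves one contiguous free block of three slots, at the cost of moving objects.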

Generational collection algorithm

  • The Java heap is divided into a young generation and an old generation, and each generation uses the collection algorithm that best fits its characteristics.
  • In the young generation, only a small number of objects survive each garbage collection, so the copying algorithm can complete the collection at the cost of copying just those few survivors
  • In the old generation, objects have a high survival rate and there is no extra space to act as an allocation guarantee, so the “mark-sweep” or “mark-compact” algorithm is used for collection

Garbage collectors

Metrics

  • Throughput: the ratio of CPU time spent running user code to total CPU time consumed, i.e., throughput = user code run time / (user code run time + garbage collection time)
  • Pause time: the length of time during which the application threads are suspended so that the GC threads can run garbage collection; the application makes no progress during this time
  • The relationship between the two
    • The ideal JVM garbage collector would have “throughput as high as possible, pause times as short as possible”;
    • In practice you cannot have both the highest throughput and the lowest pause times. When choosing a garbage collector we must settle on realistic goals: a GC algorithm can focus on maximum throughput, on minimum pause times, or on a trade-off between the two;
    • Short pause times suit programs that interact with users, because good response times improve the user experience. High throughput uses CPU time efficiently and finishes the program's work as soon as possible, which suits background tasks that need little interaction;
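The throughput formula above can be tried against a running JVM. This is a rough sketch: `throughput` is a helper written for this example, and treating "uptime minus reported GC time" as user time is a simplifying assumption (with concurrent collectors the reported time is not pure pause time). The `java.lang.management` calls are standard JDK APIs.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class ThroughputDemo {
    // throughput = user time / (user time + GC time), as defined above.
    static double throughput(long userMillis, long gcMillis) {
        return (double) userMillis / (userMillis + gcMillis);
    }

    public static void main(String[] args) {
        long gcMillis = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 if the collector does not report it
            if (t > 0) {
                gcMillis += t;
            }
            System.out.println(gc.getName() + ": " + gc.getCollectionCount() + " collections");
        }
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        // Rough estimate only: all wall-clock time not reported as GC counts as user time.
        long userMillis = Math.max(uptime - gcMillis, 0);
        System.out.println("approx. throughput: " + throughput(userMillis, gcMillis));
    }
}
```

For example, 99 ms of user code and 1 ms of GC give a throughput of 0.99. The collector names printed by the loop also show which collectors the current JVM is actually using.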

Serial collector

  • Characteristics

    • A single-threaded collector: it uses only one CPU or one collection thread to do garbage collection;
    • During garbage collection, all other worker threads must be suspended until the collection is complete (also called “Stop The World”);
    • Its advantage is that it is simple and efficient (compared with the single-threaded performance of other collectors). In an environment limited to a single CPU, the Serial collector has no thread-interaction overhead and can focus entirely on garbage collection, so it naturally achieves the highest single-threaded collection efficiency;
  • Application scenarios and parameter settings

    • Scenario: the default young generation collector for the HotSpot virtual machine running in Client mode; suitable for single-CPU environments
    • Algorithm: the young generation uses the “copying” algorithm; the old generation uses the “mark-compact” algorithm
    • Configuration: -XX:+UseSerialGC; Serial is used in the young generation and Serial Old in the old generation

ParNew collector

  • Characteristics

    • The ParNew collector is the multithreaded version of the Serial collector;
    • It can only be used for the young generation;
    • It collects with multiple threads in parallel;
    • In a multi-CPU environment it helps GC make efficient use of system resources as the number of CPUs grows. By default, it starts as many collection threads as there are CPUs;
    • In a single-CPU environment the ParNew collector will never outperform the Serial collector, and even in a two-CPU environment with hyperthreading it is not 100% guaranteed to outperform it;
  • Application scenarios and parameter settings

    • Algorithm: the young generation uses the “copying” algorithm
    • Configuration: -XX:+UseParNewGC; the young generation uses ParNew and the old generation uses Serial Old
    • -XX:ParallelGCThreads: sets how many threads are started to reclaim memory on a multi-CPU machine

Parallel Scavenge collector

  • Characteristics

    • Multithreaded collector
    • Only for the young generation
    • Adaptive adjustment strategy
    • The goal of the Parallel Scavenge collector is to achieve a controllable throughput
    • The Parallel Scavenge collector cannot be used together with the CMS collector
  • Application scenarios and parameter settings

    • Young generation: copying algorithm. Parameter: -XX:+UseParallelGC;
    • Old generation: multithreaded, “mark-compact” algorithm. Parameter: -XX:+UseParallelOldGC;
    • -XX:ParallelGCThreads=: sets the number of parallel GC threads;
    • -XX:MaxGCPauseMillis: sets the maximum garbage collection pause time.
    • -XX:GCTimeRatio: sets the throughput target.
    • -XX:+UseAdaptiveSizePolicy: with this switch on, there is no need to manually specify the young generation size (-Xmn), the Eden to Survivor ratio (-XX:SurvivorRatio), the size threshold for objects that go straight to the old generation (-XX:PretenureSizeThreshold), or similar detail parameters. The virtual machine collects performance monitoring information based on how the system is currently running and adjusts these parameters dynamically to provide the most suitable pause time or the maximum throughput. This approach is called GC Ergonomics.
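As a worked example of how -XX:GCTimeRatio relates to throughput: HotSpot documents `-XX:GCTimeRatio=N` as a target ratio of GC time to application time of 1:N, so the GC share of total time is 1 / (1 + N). The helper name `gcFraction` below is made up for this sketch.

```java
class GcTimeRatioDemo {
    // Target share of total time spent in GC for -XX:GCTimeRatio=n, i.e. 1 / (1 + n).
    static double gcFraction(int gcTimeRatio) {
        return 1.0 / (1 + gcTimeRatio);
    }

    public static void main(String[] args) {
        System.out.println(gcFraction(99)); // 0.01, i.e. a 99% throughput target
        System.out.println(gcFraction(19)); // 0.05, i.e. a 95% throughput target
    }
}
```

This is only a goal the collector tries to meet by resizing the heap; it is not a guarantee.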

CMS collector

  • Operation process

    • Initial mark: only marks the objects that the GC Roots can reach directly. It is fast and requires “Stop The World”
    • Concurrent mark: traces all objects starting from the GC Roots. This is the longest phase of the whole process
    • Remark: corrects the mark records of the objects whose marks changed during concurrent marking because the user program kept running. The pause in this phase is generally slightly longer than in the initial mark phase but much shorter than the concurrent mark phase. This phase also requires “Stop The World”
    • Concurrent sweep
  • Characteristics

    • CMS (Concurrent Mark Sweep) is based on the “mark-sweep” algorithm
    • Time taken by each phase: concurrent mark / concurrent sweep > remark > initial mark
    • Very sensitive to CPU resources
    • Suffers from the fragmentation that the “mark-sweep” algorithm produces
    • Collects concurrently with low pauses, so the CMS collector is also called the concurrent low-pause collector
    • Cannot handle floating garbage, so a “Concurrent Mode Failure” may occur and lead to another Full GC
    • The concurrent mark and concurrent sweep phases take the longest time in the whole process, and in both of them the collector threads work alongside the user threads. Overall, therefore, the memory reclamation of the CMS collector runs concurrently with the user threads.
  • Application scenarios and parameter settings

    • Choose the CMS collector when your application needs short pauses and can accept the garbage collector sharing CPU resources with the application
    • -XX:+UseConcMarkSweepGC: use the CMS collector
    • -XX:+UseCMSCompactAtFullCollection: defragment after a Full GC. The compaction is exclusive (it cannot run concurrently), so it makes the pause longer
    • -XX:CMSFullGCsBeforeCompaction: sets how many Full GCs run before a defragmentation is performed
    • -XX:ParallelCMSThreads: sets the number of CMS threads (usually roughly equal to the number of available CPUs)

G1 collector

  • Operation process

    • Initial mark: only marks the objects that the GC Roots can reach directly. It is fast and requires “Stop The World”
    • Concurrent mark: traces all objects starting from the GC Roots; it can run concurrently with the user program
    • Final mark: corrects the part of the mark records that changed during concurrent marking because the user program kept running
    • Screening and collection: sorts the Regions by their collection value and cost, and draws up a collection plan based on the GC pause time the user expects
  • Characteristics

    • A garbage collector for server-side applications
    • Parallelism and concurrency
    • Generational collection
    • Space compaction: viewed as a whole it is based on the “mark-compact” algorithm, and viewed locally (between two Regions) it is based on the “copying” algorithm
    • Predictable pauses: G1 can control pauses precisely enough that the user can explicitly specify that, within a time slice of M milliseconds, no more than N milliseconds may be spent on garbage collection. This is almost a feature of a real-time Java garbage collector
    • G1 divides the entire Java heap (both the young generation and the old generation) into multiple memory blocks of equal size (Regions). Each Region is a logically contiguous segment of memory. G1 maintains a priority list in the background and, within the allowed collection time, first collects the Regions that yield the most garbage
  • Application scenarios and parameter settings

    • -XX:MaxGCPauseMillis=50: sets the maximum allowed GC pause time (here 50 ms)
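The idea of ranking Regions by value and cost and then collecting within a pause budget can be sketched as a simple greedy selection. This is only a toy: the `Region` class, its estimates and `chooseCollectionSet` are hypothetical, and the real G1 heuristics are considerably more involved.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class RegionSelectionDemo {
    // Hypothetical per-region estimates; real G1 derives its own from runtime statistics.
    static final class Region {
        final String name;
        final long garbageBytes; // estimated reclaimable bytes
        final double costMs;     // estimated time needed to collect the region
        Region(String name, long garbageBytes, double costMs) {
            this.name = name;
            this.garbageBytes = garbageBytes;
            this.costMs = costMs;
        }
    }

    // Greedily picks the regions with the best garbage-per-millisecond until the budget is used.
    static List<String> chooseCollectionSet(List<Region> regions, double pauseBudgetMs) {
        List<Region> sorted = new ArrayList<>(regions);
        sorted.sort(Comparator.comparingDouble((Region r) -> r.garbageBytes / r.costMs).reversed());
        List<String> chosen = new ArrayList<>();
        double spent = 0;
        for (Region r : sorted) {
            if (spent + r.costMs <= pauseBudgetMs) {
                chosen.add(r.name);
                spent += r.costMs;
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<Region> regions = List.of(
                new Region("A", 800, 10),
                new Region("B", 300, 20),
                new Region("C", 600, 5),
                new Region("D", 100, 30));
        System.out.println(chooseCollectionSet(regions, 20)); // [C, A]
    }
}
```

With a 20 ms budget, C (5 ms) and A (10 ms) are collected; B and D would exceed the budget and are left for a later cycle.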

Additional notes

Reasons behind the generational collection algorithms

  • Why does the young generation of the JVM heap use the “copying” algorithm?
    • In the young generation most objects die young, that is, only a few objects survive each garbage collection. The HotSpot JVM therefore divides the young generation into three parts, Eden, Survivor1 and Survivor2, in an 8:1:1 size ratio.
    • When allocating memory, only Eden and one of the Survivor spaces are used. For example, when Eden+Survivor1 is running low on memory, the JVM starts a Minor GC that clears the dead objects and copies all surviving objects into Survivor2. After that, Eden+Survivor2 is used for allocation. In this way only 10% of the memory is wasted, and the collection compacts as it goes, which avoids the memory fragmentation problem.
    • However, when an object requests memory and the remaining space in Eden+Survivor cannot hold it, a Minor GC has to be performed. If the free space after the Minor GC still cannot hold the object, the live objects have to be moved to the old generation and the new object is then stored in Eden. This is called the “allocation guarantee”.
  • Why does the old generation of the JVM heap use the “mark-compact” algorithm?
    • Objects in the old generation tend to live longer, so a large number of objects survive each GC. With the “copying” algorithm, a large number of live objects would have to be copied every time, which is inefficient. Moreover, when the young generation uses the “copying” algorithm and an object does not fit into Eden+Survivor, the old generation can provide the “allocation guarantee”. If the old generation used the same algorithm, there would be no other region to provide that guarantee when an object does not fit. Therefore the old generation generally uses the “mark-compact” algorithm.
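The 8:1:1 split and the "only 10% wasted" claim can be checked with a little arithmetic. In this sketch, `survivorRatio` plays the role of -XX:SurvivorRatio (Eden is that many times the size of one Survivor space); the `layout` helper is made up for illustration.

```java
class YoungGenLayoutDemo {
    // Returns {eden, oneSurvivor, usable} in the same unit as youngSize.
    static long[] layout(long youngSize, int survivorRatio) {
        long survivor = youngSize / (survivorRatio + 2); // two Survivor spaces plus Eden
        long eden = youngSize - 2 * survivor;
        long usable = eden + survivor; // one Survivor space is always kept empty
        return new long[] {eden, survivor, usable};
    }

    public static void main(String[] args) {
        long[] parts = layout(100, 8); // a 100 MB young generation with an 8:1:1 split
        System.out.println("Eden=" + parts[0] + " Survivor=" + parts[1] + " usable=" + parts[2]);
        // Eden=80 Survivor=10 usable=90
    }
}
```

So 90 of the 100 MB are available for allocation at any time, compared with only 50 MB under a plain half-and-half copying scheme.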

Object self-rescue

  • First mark: if reachability analysis finds that an object has no reference chain connecting it to the GC Roots, the object is marked for the first time and filtered according to whether it needs to run its finalize() method
    • If the object overrides finalize() and the method has not yet been executed for it, the object is placed in the F-Queue
    • If the object does not override finalize(), or the method has already been executed for it, the object goes straight into the “about to be reclaimed” collection
  • Second mark: the virtual machine automatically creates a low-priority Finalizer thread to execute the finalize() methods of the objects in the F-Queue, and the GC then marks the objects in the F-Queue a second time. If an object re-attaches itself to any object on a reference chain (for example by assigning itself, via the this keyword, to a class variable or to a member variable of another object), the second mark moves it out of the “about to be reclaimed” collection

Classification of references

  • Strong references are the ones we use all the time, as in A a = new A(); that is, the reference to an object created with the new keyword is a strong reference. As long as a strong reference exists, the object is never reclaimed.
  • Soft references: the referenced object is reclaimed by the JVM only when the heap is about to hit an OutOfMemoryError. Soft references are implemented by the SoftReference class. They have a shorter lifetime than strong references.
  • Weak references: as soon as the garbage collector runs, objects that only weak references point to are collected. Weak references are implemented by the WeakReference class. They have a shorter lifetime than soft references.
  • Phantom references, also called virtual references, are the same as having no reference at all: they cannot be used to obtain the object instance. The only purpose of associating a phantom reference with an object is to receive a system notification when the object is collected by the garbage collector. Phantom references are implemented by the PhantomReference class.
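The reference classes can be tried out directly; they all live in the standard `java.lang.ref` package. A small sketch follows: the class and method names are made up for this example, and whether `System.gc()` actually clears the weak reference is up to the JVM.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

class ReferenceKindsDemo {
    // Checks what each reference type returns while a strong reference still exists.
    static boolean[] whileStronglyReachable() {
        Object strong = new Object();
        SoftReference<Object> soft = new SoftReference<>(strong);
        WeakReference<Object> weak = new WeakReference<>(strong);
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        PhantomReference<Object> phantom = new PhantomReference<>(strong, queue);
        return new boolean[] {
            soft.get() == strong,  // still set: the object is strongly reachable
            weak.get() == strong,  // still set for the same reason
            phantom.get() == null  // a phantom reference never hands out its referent
        };
    }

    public static void main(String[] args) {
        boolean[] r = whileStronglyReachable();
        System.out.println("soft set: " + r[0] + ", weak set: " + r[1] + ", phantom.get() is null: " + r[2]);

        WeakReference<Object> weak = new WeakReference<>(new Object());
        System.gc(); // only a request; the JVM may ignore it
        // Usually null once a collection has run, but that is not guaranteed.
        System.out.println("weak referent after System.gc(): " + weak.get());
    }
}
```

The first line of output is deterministic; the second depends on whether a collection actually ran.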

This article is the author's tidied-up reading notes. If you find any mistakes, please point them out; guidance from more experienced readers is welcome.

The author will start writing articles on a public account after autumn recruiting; follows are welcome.

Reference source: Deep Understanding of the Java Virtual Machine, public account: Java big back end