Section one: Talking about the death of the object

An overview,

  • The amount of memory allocated to each stack frame is determined when the class structure is determined.
  • Java heap, whose size can only be determined at run time.

Second, reference counting algorithm

  • Overview: Simple implementation, high judgment efficiency. Python, AS uses this algorithm. Also, because reference counting algorithms do not solve the problem of circular references to objects, the JVM does not use them.
  • Add a reference counter to the object, which is +1 every time the object is referenced and -1 every time the reference fails, all the way down to 0

Accessibility analysis algorithm

  • Overview: Current mainstream JVM adoption
  • Schematic diagram of accessibility analysis algorithm

1. Chain of reference

  • The GC Roots object acts as the starting point and searches down from the node. The path searched is called the reference chain.

2. Recovery conditions

  • When an object reaches GC Roots without any reference chains, the object is reclaimed. (GC Roots creates a reference tree that is reclaimed when the object is no longer on the tree.)

3. Four GC Roots:

  • Pointer to the reference type in the stack frame local variable table, holding the address of the object
  • The method area has a reference to obj in the class static attribute (static attributes are class attributes and therefore have the same life cycle as class).
  • There are references to obj in the constant pool (the constant pool has nothing to do with the objects generated by class).
  • Native uses obJ.

Strong and weak

  • Strong references, as long as there are references, the GC will never reclaim
  • Soft references, useful but not required objects. Recycle before OOM.
  • Weak reference, non-required object. Only live until the next GC.
  • A virtual reference, which cannot get the object at all, exists only to get a notification before being GC.

Determination of death

  • Summary: Determine that an OBJ has been marked dead at least twice (there is no reference to this object on the GC Roots reference chain).
  • Use accessibility analysis to make death determination schematics

1. Twice marking process

  • When the reference is not found in GC Roots reference chain for the first time, tag for the first time and filter if finalize() is necessary:
    • There is no necessary
      • Object overrides the Finalize () method
      • The VIRTUAL machine has called finalize()
    • It is necessary to
      • The flag to the F-queue is executed by the low-priority thread Finalizer.
    • Finalize () is more of a compromise for C/C++ programmers to understand.

5. Method area recycling (hotspot Permanent generation recycling)

1. Deprecate constants (for constant pools in the method area)

  • Symbolic references to classes (interfaces), methods, and fields in the constant pool are reclaimed when there are no references in the system.

2. Useless classes (uninstallation of involved classes, more on this later)

  • All instances of this class have been reclaimed
  • The classLoader that loaded this class has been reclaimed
  • This class is not applied to the class object in the method area (the entry to access information about the class in the method area) and cannot be used by reflact
  • The VM reclaims classes that meet the preceding three conditions. Node:
    • It is not like an object that must be recycled when it is not used.
    • A large number of dynamic class loading framework, resulting in a large number of classes in the method area, to prevent overflow. (Extensive use of reflection, dynamic proxy, dynamically generated JSP, frequent custom classloader)

Section 2: Overview of JVM reclamation algorithms

  • Abstract: This paper mainly describes the basis of garbage recycling algorithm, laying the foundation of the serial, parNew, Parallel Avenge, CMS, G1, Serial Old, Parallel Old algorithm

    • Mark-sweep Algorithm (CMS)
    • Replication algorithm
    • Mark-collation algorithms (Serial, parNew)
    • Generational collection algorithm

1. Mark-sweep Algorithm

  • Node: Subsequent collection algorithms are based on the mark-sweep algorithm

1. Implementation principle

  • tag
  • The object has already been marked (that is, in the F-queue) during the two determinations that the object is dead
  • remove
  • After the mark is completed, it is cleared uniformly.

2. There are problems

  • Efficiency problem, marking and cleaning efficiency is not high
  • Space problem. When the mark is cleared, it will cause many discontinuous memory space. This can be particularly complicated when allocating memory for large objects and can cause GC to be triggered again.

I. Copying Algorithms

  • Node: To solve efficiency and space problems in the mark-clear algorithm (insufficient space for a single large object).

1. Implementation principle

  • 1. Divide the memory into two equal parts and use only one piece at a time
  • 2. After a block is used up, copy the surviving objects in the block to another block

2. There are problems

  • Waste of space, half of which is sitting idle
  • Need old age to carry out space guarantee
  • In the case of high object survival, large areas of replication will occur

3,Node:

  • At present, the new generation of mainstream virtual machines all use the replication algorithm.
  • IBM research shows that 98% of OBJs in the new generation die immediately, so memory doesn’t have to be allocated 1:1
  • The hotspot in according to theEden:Survivor:Survivor= 8:1:1Each time Eden+Survivor is used, the surviving OBJ of these two pieces is copied to the remaining Survivor when a GC is triggered.
    • However, if there is not enough Survivor left to store a viable OBJ, the older generation needs to share
    • Here’s a little tidbit: Obj’s age is in the Header, and generally defaults to 15 years old

Iii. Mark-compact Algorithm

1. Implementation principle

  • Marking: The two-time marking process is still used, and the object marks are placed in the F-queue.
  • Collation: Survive obJ to move a segment, then clear the boundary outside, not inGC RootsReferences objects on the chain.
  • For the new generation, efficiency is not so high
  • Serial, parNew, serial Old, parallel Old

Iv. Generation – collection algorithm

1. Implementation principle

  • Memory is divided into chunks based on the object lifetime
  • New generation: copy-collect algorithm (each time a large number of objects die, only a small fraction of the cost of copying the surviving objects)
  • Old age: mark-copy algorithm (high survival rate of objects, no guaranteed space)

Section 3: Overview of HotSpot algorithm implementation

1. Enumerate the root node

  • Problem:
    • Traversal duration: traversal the whole reference chain of GC Roots (especially when the method area is large), with high time complexity
    • Consistency: Stop all Java threads while traversing the entire GC Roots reference chain to ensure that the reference relationship is not messed up.
      • Even the supposedly never-stopping CMS collector pauses while enumerating the root node.
      • Sun calls this process stop-the-world
  • Solution:
    • The OopMap data structure holds object references.
      • When the JVM creates the OBJ, it calculates the data type.
      • JIT compilation also records at safe points where stacks and registers are references.

2. SafePoint

  • HotSpot does GC Roots enumerations accurately and quickly with OopMap.
  • Problem: Logging information in OopMap for each bytecode instruction is a waste of memory
  • Solution:
    • SafePoint: the SafePoint records the obj reference and when it runs to SafePoint, the GC is triggered
    • GC won’t Stop-The-World anytime, anywhere, HotSpot will Stop in a safe spot
  • HotSpot does not actively stop all threads when GC occurs.
    • Set a flag bit that suspends when the thread encounters true
    • The flag bit overlaps with SafePoint

3. SafeRegin

  • Problem: The thread is unable to respond to the JVM’s interrupt flag in the case of a sleep or block
  • Solution:
    • Security zone: Within a code snippet, reference relationships do not change. GC can be started anywhere, extending safePoint
    • When the thread leaves SafeRegin, check to see if the GC Roots enumeration is complete. The completion thread continues and waits for GC to complete.

4, summarize

  • HotSpot triggers GC automatically from all threads, via safe points and safe zones in the OopMap data structure

Section 4: GC collector

An overview of the

  • Young Generation:
    • Serial
    • ParNew
    • Parallel Scavenge
  • The Tenured Generation
    • Serial Old
    • Parallel Old
    • CMS
  • All applicable
    • G1
  • There is no best collector, so make your choice accordingly

1. Serial collector

  • Features:
    • Replication algorithm
    • Single-threaded collector, pauses all worker threads (stop-the-world)
    • Simple and efficient (compared to the single-threading of other collectors), with no interaction overhead
    • In Client mode, the VM performs well when the vm has a small number of Cpu cores and does not have excessive memory allocation

Serial Old collector

  • Features:
    • Mark-collation algorithm
    • Old age collector
    • Single thread
    • Serial Older version
    • Prior to jdk1.5 with Parallel scavengehe, and CMS alternatives
  • Serialserial Old collector and schematic

ParNew collector

  • Features:
    • Replication algorithm
    • Multithreaded version of serial collector
    • New generation collector in Server mode
    • Multiple threads can only be used with CMS collector (serial single thread can also be used)
    • Single-core is inferior to serial, dual-core is not guaranteed, multi-core performance is ok
    • The default number of threads is the same as the number of CPU cores
  • ParNew collector schematic

4, the Parallel Scavenge

  • Features:
    • Replication algorithm
    • Multithreaded collector
    • Achieve manageable throughput
      • Throughput = run user code time /(Run user code time +)
    • High throughput can make efficient use of CPU, suitable for background applications
    • Shortening GC pauses reduces throughput and reduces space for new generation.
    • If this parameter is not specified, the VM can automatically collect performance monitoring information and adjust the pause time and throughput

5, the Parallel Old

  • Features:
    • Mark-collation algorithm
    • multithreading
    • The Insane (in 1.6)
    • This combination completes throughput first

6. CMS(Current Mark Sweep)

  • Features:

    • Unique Mark-clear algorithm collector (epoch-making)
    • multithreading
    • Get minimum pause time for target collector, concurrent collection, low pause
  • Operation process:

    • Initial tag

      • Stop-the-world marks objects that GC Roots can reach, fast
    • Concurrent tags

      • This is the process of GC Roots Tracing
    • To mark

      • Stop-the-world fixes the concurrent marking phase, the program continues to run, causing the OBJ marking to change, longer than the initial marking time, much shorter than the concurrent marking time
    • Concurrent remove

    • Both concurrent phases can work with the user thread. Overall, the CMS memory reclamation process is executed concurrently with the user thread.

  • Disadvantages:

    • Sensitive to CPU resources
      • Concurrent CPU usage reduces throughput. The program is slow.
      • CMS starts (CPU +3)/4 threads by default, so in the case of low CPU cores, especially those below 4 cores, CMS takes up very significant resources and user program execution is slow
    • Unable to handle floating garbage
      • CMS concurrent cleaning, cleaning process, user threads are also constantly generating garbage. And this part of the garbage, WHEN CMS cannot deal with, can only be left for the next time.
    • Space debris
      • When no contiguous memory space is allocated to large objects, a Full GC has to be triggered early
  • CMS collector diagram

G1 collector

  • Features:
    • Service-oriented garbage collector
    • Parallelism and concurrency. Give full play to multi-nuclear advantages to shorten stop-the-world time.
    • Collect by generation. The entire heap can be managed independently without the cooperation of other collectors. Can use different algorithms according to different situations
    • Spatial integration
      • It is a mark-collation algorithm as a whole and a copy algorithm as a part
      • Both algorithms provide continuous space for allocating large objects without triggering GC
    • Predictable pause times
      • Another advantage over CMS is that G1 has a predictable pause model.
      • Allow time segments of M ms to be set and GC pauses cannot exceed M ms
  • The G1 doesn’t differentiate between the new generation and the old generation in terms of overall memory structure.
    • Instead of physically isolating the heap as with previous collectors, divide the heap into equal sized areas (small areas dividing the new generation and the old generation).
    • There is a rememberSet for each region
  • G1 Builds a predictable pause time model
    • Avoid entire Java heap garbage collection
    • Based on the value of garbage collection in each Region (how much space is acquired and how long it takes to collect)
  • G1 avoids full heap scan
    • There is a rememberSet for each region
    • When the JVM finds that the reference type is being written to, it interrupts the write to check whether the reference is from different regions. If so, add a RememberSet to the root node when GC Roots enumerates to ensure that the whole heap will not be scanned and that nothing will be missed
  • Operation process:
    • Initial tag
      • Stop-the-world, annotate only objects that GC Roots can associate with
    • Concurrent tags
      • Reachability analysis, concurrent with user threads
    • In the end tag
      • A pause thread is required and can be executed in parallel
    • Screening of recycling
      • According to the pause time set by the user, the optimal recovery
  • G1 collector Diagram

Section 5: Memory allocation and reclamation strategies

  • Automatic Java memory management: OBJ memory allocation, OBJ memory reclamation

First, memory allocation

1. Allocate space in Eden first

  • A Minor GC is triggered when Eden space is insufficient
  • At this time, in the new generation, there are two survivor: one is the surviving object of the previous generation, and the other is the replication space to be reserved for the next generation
  • Cenozoic OBj life is very short Minor GC is very frequent and very fast
  • Major GC: Obj has a long life and is more than 10 times slower than Minor GC

2, big object directly into the old age

  • Large objects: Java objects with contiguous storage space, most typically long strings and bytes []
    • Large objects with a short life cycle will cause great pressure to be recycled in the old age

3. Age of the object

  • Generational idea
    • If you enter Eden space for the first time, you are moved to Survivor space (if there is always enough space, you do not enter the old age directly). After surviving Minor GC, your age is +1, and you enter the old age until you are 15
  • Dynamic object age
    • If the sum of all object sizes of the same age in a survivor space is greater than half of the survivor space. Objects older than or equal to that age go directly to the old age

4. Guarantee of space allocation

  • Before Minor GC, check whether the old space is larger than the total space of all objects (as long as it is larger than Eden)
    • If it is greater than, it is safe, then Minor GC is performed
    • If not, it is dangerous:
      • To allow the guarantee to fail, check if the maximum contiguous available space of the old age is greater than the average size of objects promoted to the old age, risk Minor GC if it is greater, and Major GC if it is less
      • Do not allow guarantees to fail, then major GC