Moment For Technology

Understand JVM PART II: Garbage collection algorithms

Posted on May 28, 2023, 2:22 a.m. by Stuart Cresswell
Category: The back-end Tag: The back-end java algorithm jvm

Determine which objects need to be reclaimed

  • Reference counting algorithm:
    • Add a reference counter to the object. The counter is incremented each time a reference is made. When a reference is invalid, the counter value is reduced by 1; An object whose counter is 0 at any point in time cannot be used again. But the JVM doesn't use this method because it doesn't solve the problem of two objects referring to each other in a loop.
  • Accessibility analysis algorithm:
    • The basic idea of this algorithm is to search down from a series of objects called "GC Roots" as the starting point, and the search path is called the Reference Chain. When an object is not connected to GC Roots by any Reference Chain, it is proved that the object is unavailable.
  • In the Java language, objects that can be used as GC Roots include the following:
    • The object referenced in the virtual machine stack (the local variable table in the stack frame).
    • The object referenced by the class static property in the method area.
    • The object referenced by the constant in the method area.
    • Objects referenced by JNI (commonly referred to as Native methods) in the Native method stack.

After JDK1.2, there are four kinds of references, and the intensity of references gradually decreases

  • Strong Reference
    • Strong references are common in program code, similarObject obj=new Object()For such references, the garbage collector will never reclaim the referenced object as long as the strong reference exists.
  • Soft Reference
    • Soft references are used to describe objects that are useful but not necessary. Objects associated with soft references are listed in the collection scope for a second collection before the system is about to run out of memory. An out-of-memory exception is thrown if there is not enough memory for this collection. Provided after JDK1.2SoftReferenceClass to implement.
  • Weak Reference
    • Weak references are also used to describe non-essential objects, but they are weaker than soft references, and objects associated with weak references only survive until the next garbage collection occurs. When the garbage collector works, objects associated only with weak references are reclaimed regardless of whether there is currently enough memory. Provided after JDK1.2WeakReferenceClass to implement.
  • Phantom Reference
    • Virtual references, also known as ghost references or phantom references, are the weakest type of reference relationship. The existence of a virtual reference does not affect the lifetime of an object, nor can an object instance be obtained through a virtual reference. The sole purpose of setting a virtual reference association for an object is to receive a system notification when the object is reclaimed by the collector. Provided after JDK 1.2PhantomReferenceClass to implement.

Garbage collection algorithm

Mark-sweep algorithm

  • This method is divided into "mark" and "clear" two stages: first mark all the objects that need to be recycled, after the completion of mark all the marked objects. It is the most basic collection algorithm, and subsequent collection algorithms are based on this idea and improve its shortcomings.
  • There are two main deficiencies: one is the efficiency problem, the efficiency of both marking and clearing is not high; The other is the space problem. After the mark is cleared, a large number of discontinuous memory fragments will be generated. Too much space fragment may cause that when the program needs to allocate large objects in the future, it cannot find enough contiguous memory and has to trigger another garbage collection action in advance.

Copying algorithms

  • This method divides the available memory into two equally sized pieces by capacity and uses only one of them at a time. When this area of memory is used up, the surviving objects are copied to the other area, and the used memory space is cleaned up again. In this way, the memory is reclaimed for the whole half region every time, and there is no need to consider the complicated situation of memory fragmentation when allocating memory. As long as the heap top pointer is moved, the memory can be allocated in order, which is simple to implement and efficient to run. But half the available memory is costly.

  • This method is generally used in the recovery of the new generation, because 98% of the objects of the new generation will be recovered soon, so the division is not 1:1, but divided into a large Eden space and two small Survivor Spaces. Use Eden and one of the pieces Survivor each time. When recycling is done, the surviving objects in Eden and Survivor are copied to another Survivor space once and for all, and Eden and the Survivor space that was just used are cleaned up. By default, the HotSpot VIRTUAL machine has an 8:1:1 ratio of Eden to Survivor, meaning that 90% of the available memory in the new generation is wasted and only 10% is wasted.

Mark-compact algorithm

  • The copy-collection algorithm has to carry out more replication operations when the object survival rate is high, and the efficiency will be low. More importantly, if you do not want to waste 50% of the space, you need to have extra space for allocation guarantee, in the extreme case that all objects in the memory used are 100% alive, so in the old days, this algorithm generally cannot be used directly.
  • According to the characteristics of the old s, someone put forward another "tag - finishing" (Mark - Compact) algorithm, the labeling still like "tag - clear" algorithm, but the subsequent steps are not directly to recyclable objects to clean up, but let all live objects move to the end, and then clear directly outside the boundary of the memory.

Generational Collection

  • The current garbage collection of commercial virtual machines adopts the "generational collection" algorithm, which has no new idea, but divides the memory into several blocks according to the different life cycle of objects. Typically, the Java heap is divided into the new generation and the old generation, so that the most appropriate collection algorithm can be used for each generation. In the new generation, a large number of objects are found dead and only a few survive in garbage collection, so the replication algorithm is selected, and only a small amount of the replication cost of the surviving objects can be collected. In the old days, because the object has a high survival rate and there is no extra space to allocate it, it has to use the "mark-clean" or "mark-tidy" algorithm for recycling.
About (Moment For Technology) is a global community with thousands techies from across the global hang out!Passionate technologists, be it gadget freaks, tech enthusiasts, coders, technopreneurs, or CIOs, you would find them all here.