The object is dead

Generally, there are two ways to determine whether an object is “dead”, one is reference counting and the other is reachability analysis.

Reference counting algorithm

Add a reference counter to the object, incrementing it every time a reference is made to it; When a reference is invalid, the counter value is reduced by 1; An object whose counter is 0 at any time cannot be used again.

In Java virtual machines, reference counting algorithms are not used to determine whether objects are alive or not.

Accessibility analysis algorithm

Through a series of objects called “GC Roots” as the starting point, search down from these nodes is called the Reference Chain. When there is no Reference Chain between an object and GC Roots (that is, from GC Roots to this object is unreachable), Proves that the object is not available. In the Java language, objects that can be used as GC Roots include the following:

  • The object referenced in the virtual machine stack (the local variable table in the stack frame)
  • The object referenced by the class static property in the method area.
  • The object referenced by the constant in the method area
  • Objects referenced by JNI in the local method stack

Talk about reference

Sometimes we may need to keep it in memory when there is enough space. When memory space is too tight after garbage collection, these objects can be discarded. After JDK 1.2, Java introduced four types of references: Strong Reference, Soft Reference, Weak Reference, Phantom Reference. These four references are briefly described below.

  • A strong reference is typically a reference created with new, and the garbage collector will never reclaim the referenced object as long as the strong reference exists.
  • Soft references are used to describe objects that are useful but not necessary. These objects are reclaimed before the system is about to run out of memory. An out-of-memory exception is thrown if there is not enough memory for this collection.
  • Weak references are also used to describe non-essential objects, and objects associated with weak references can only survive until the next garbage collection occurs. When the garbage collector works, objects associated only with weak references are reclaimed regardless of whether there is currently enough memory.
  • The only purpose of a virtual reference is to receive a system notification when the object is reclaimed by the collector.

Garbage collection algorithm

Mark-clear algorithm

The most basic collection algorithm is the Mark-sweep algorithm, which is divided into two phases: “Mark” and “Sweep” : flag all objects that need to be collected, and then collect them all when they’re done. The markup here is marked by an accessibility analysis algorithm.

This method has two shortcomings: one is efficiency, the efficiency of both marking and clearing is not high; The other is the space problem. After the mark is cleared, there will be a large amount of discontinuous memory fragmentation. Too much space fragmentation may cause that when allocating large objects later, enough contiguous memory cannot be found and another garbage collection action has to be triggered in advance.

Replication algorithm

To solve the efficiency problem, it divides the available memory capacity into two equally sized pieces, using only one of them at a time. When this block of memory is used up, the surviving objects are copied to the other block, and the used memory space is cleaned up again.

Modern commercial virtual machines use this collection algorithm to reclaim the new generation, dividing the memory into a large Eden space and two smaller Survivor Spaces, using Eden and one Survivor at a time. When recycling is done, the surviving objects in Eden and Survivor are copied to another Survivor space once and for all, and Eden and the Survivor space that was just used are cleaned up.

Mark-collation algorithm

Typical of the old days, mark-compact is used. The marking process is still the same as mark-clean, but instead of cleaning up the recyclable objects directly, the next step is to move all surviving objects to one end and then clean up the memory directly beyond the end boundary.

Generational collection algorithm

Current garbage Collection on commercial virtual machines uses a “Generational Collection” algorithm that divides memory into chunks based on the life cycle of an object. Generally, the Java heap is divided into new generation and old generation, and the most appropriate collection algorithm is adopted according to the characteristics of each generation. That is, in the new generation, a large number of objects will die every time garbage collection, and only a small number of objects survive, so the replication algorithm is used. For old age objects with a high survival rate and no extra space to allocate guarantees for it, the “mark-clean” or “mark-clean” algorithms must be used for recycling.

Garbage collector

If the collection algorithm is the methodology of memory collection, then the garbage collector is the concrete implementation of memory collection. The CMS collector and G1 collector are introduced here.

CMS collector

The CMS(Concurrent Mark Sweep) collector is a collector whose goal is to obtain the shortest collection pause time. Its operation process is divided into four steps:

  • CMS Initial Mark
  • CMS Concurrent Mark
  • Re-marking (CMS Remark)
  • CMS Concurrent sweep

The initial marking and re-marking steps still need to “Stop The World”. Initial marking only marks the objects directly associated with GC Roots, which is very fast. The concurrent marking stage is the process of GC Roots Tracing, while the re-marking stage is to revise the marking records of those objects whose marks are changed due to the continuous running of user programs during concurrent marking. The pause time in this phase is generally slightly longer than the initial tag phase, but much shorter than the concurrent tag.

G1 collector

The G1(garbage-First) collector is a Garbage collector for server-side applications. The G1 features the following:

  • Parallelism and concurrency
  • Generational collection
  • Spatial integration: Unlike “tag – cleaning” algorithm of CMS, the G1 as a whole is based on “tag – sorting algorithm of the collector”, from the perspective on the local () between the two Region is based on the “copy” algorithm, but anyway, these two means during the running of the G1 algorithm does not produce memory space debris, after collection can provide neat available memory.
  • Predictable pauses: This is another advantage G1 has over CMS. Reducing pause times is a common focus for BOTH G1 and CMS, but G1 models predictable pauses in addition to pursuing low pauses.

While other collectors prior to G1 collected the entire Cenozoic or old age, G1 no longer does. The OPERATION of the G1 collector can be roughly divided into the following steps:

  • Initial Marking
  • Concurrent Marking
  • Final Marking
  • Live Data Counting and Evacuation

Memory allocation and reclamation policies

The automatic memory management advocated in the Java technology architecture ultimately boils down to the automated solution of two problems: allocating memory to objects and reclaiming memory allocated to objects.

Objects are allocated in Eden first

In most cases, objects are allocated in the Eden region of the new generation. When the Eden area does not have enough space to allocate, the virtual machine will initiate a Minor GC.

Big object goes straight to the old age

Large objects are Java objects that require a large amount of contiguous memory. The most typical large objects are long strings and arrays.

Long-lived objects will enter the old age

Virtual machines use the idea of generational collection to manage memory, so the memory collection must be able to identify which objects should be in the new generation and which objects should be in the old generation. To do this, the virtual machine defines an object Age counter for each object. If the object is still alive after Eden’s birth and after the first Minor GC and can be accommodated by Survivor, it is moved to Survivor space and the object’s age is set to 1. Each time an object passes through a Minor GC, its age increases by one year, and when it reaches a certain age (15 by default), it will be promoted to the old age.

Dynamic object age determination

In order to better adapt to the memory status of different programs, the virtual machine does not always require that the age of the object must reach Max TenuringThreshold to advance to the old age. If the sum of all object sizes of the same age in the Survivor space is greater than half of the Survivor space, Objects older than or equal to this age can go directly to the old age without waiting until the age specified in MaxTenuringThreshold.

conclusion

The above content mainly refers to chapter 3 garbage collector and Memory allocation strategy in In-depth Understanding of Java Virtual Machine. It mainly explains the algorithm to determine whether objects are dead in Java, garbage collection algorithm, and mainly introduces CMS and G1 garbage collector.

reference

  • In-depth Understanding of the Java Virtual Machine