The Java virtual machine heap holds all the objects created while a program runs. The virtual machine creates objects with the new, newarray, anewarray, and multianewarray instructions, but there is no instruction that explicitly releases them. Garbage collection is the process of automatically releasing objects that the program no longer uses.

This article does not describe an official Java garbage collector, because no such official description exists. As mentioned earlier, the Java Virtual Machine specification does not require any specific garbage collection technique; in fact, it does not require garbage collection at all. But until unlimited memory is invented, most Java virtual machines will come with a garbage collector.

Why use garbage collection

The name garbage collection implies that objects no longer needed by a program are garbage and can be discarded. More accurately, it should be called memory reclamation. When an object is no longer referenced by a program, the heap space it uses can be reclaimed for subsequent use by new objects.

The garbage collector must be able to identify which objects are no longer referenced and free up the heap space they occupy. During the process of freeing an object, the garbage collector also runs finalizers for the object to be freed.

In addition to releasing objects that are no longer referenced, the garbage collector also deals with heap fragmentation. Fragmentation builds up as a program runs: when memory is requested for a new object, the heap may have to grow even though the total free space is sufficient, because no single contiguous free block is large enough to hold the new object.

Leaving the task of memory reclamation to the virtual machine has several advantages:

  • Improved productivity. When programming in a language without garbage collection, you can spend a lot of time hunting down elusive memory problems. (I rarely use C/C++, so I have not experienced this myself.)
  • Program integrity. Garbage collection is an important part of Java’s security strategy: a Java programmer cannot crash the Java virtual machine by mistakenly freeing memory.

However, a potential drawback of garbage collection is the overhead it adds, which may affect application performance. The Java virtual machine must keep track of which objects are in use and which must be released, and this automatic process generally needs more CPU time than explicit freeing would.

Garbage collection algorithm

Any garbage collection algorithm must do two things:

  • Detect garbage objects
  • Reclaim heap space used by garbage objects and return it to the program

Root object (Root)

Garbage detection is typically done by defining a set of root objects and determining which objects are reachable from them. An object is reachable (touchable) if there is a path of references from a root object to it that the executing program can follow. The root objects themselves are always accessible to the program. Any object that can be reached from the roots is considered live; objects that cannot be reached are considered garbage, because the program can no longer use them.

Any object referenced by a root object is reachable and therefore live, and any object referenced by a live object is also reachable. The program can access every object it can reach, so those objects must stay in the heap. Any unreachable object can be collected, because the program has no way to access it.

Sources of root objects

The set of root objects in a Java virtual machine depends on the implementation, but it always includes the object references in the local variables and operand stacks of stack frames, as well as the object references in class variables. Other possible sources of root objects are:

  • One source is a reference to an object in the class’s constant pool, such as a string. The constant pool of a loaded class may point to strings stored in the heap, such as the class name, superclass name, field name, field signature, method name, or method signature.
  • Another source is an object reference passed to a native method that the native method has not yet released. (Depending on the native method interface, a native method may release a reference by simply returning, by explicitly calling a callback function, or by a combination of the two.)
  • Another potential source is any part of the Java virtual machine’s runtime data areas that is allocated on the garbage-collected heap. For example, in some implementations the class data in the method area is itself stored on the garbage-collected heap, so that classes that are no longer used can be detected and unloaded by the same garbage collection algorithm as objects.

To be honest, these three sources are rather abstract; let’s read on for now.

Interference from primitive types

In Java virtual machine implementations, some garbage collectors can distinguish a genuine object reference from a primitive value (such as an int) that merely looks like a valid object reference. (For example, an int that, if interpreted as a native pointer, would point to an object in the heap.) Other garbage collectors choose not to distinguish genuine object references from look-alikes. Such collectors are called conservative, because they may not free every object that is no longer referenced: a garbage object is sometimes wrongly judged to be alive because a primitive value that looks like an object reference appears to “reference” it. Conservative collectors gain collection speed at the cost of occasionally leaving some garbage uncollected.

Basic algorithm classification

The two basic ways to distinguish live objects from junk objects are reference counting and tracing.

  • A reference counting garbage collector distinguishes live objects from garbage by keeping a count for each object in the heap. The count records the number of references to the object.
  • A tracing garbage collector traces the reference graph starting from the root nodes. Objects encountered during tracing are marked in some way; when tracing is complete, unmarked objects are known to be unreachable and can be reclaimed.

Reference counting collector

Reference counting was an early strategy for garbage collection. In this approach, each object in the heap has a reference count.

The rules

The rules include:

  • When an object is created and a reference to the object is assigned to a variable, the reference count of the object is set to 1.
  • When any other variable is assigned a reference to this object, the count is incremented by one.
  • When a reference to an object goes out of scope or is set to a new value, the object’s reference count is reduced by one.
  • Any object with a count of 0 can be garbage collected.
  • When an object is reclaimed, the count of the objects it references is also reduced by one.

In this approach, an object being garbage collected may trigger subsequent garbage collection actions for other objects.
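The rules above can be sketched as a toy model in Java. This is only an illustration under stated assumptions: the class RefCountSketch, its Obj type, and the addRef and release helpers are hypothetical names invented here, and a real collector would maintain the counts inside the virtual machine rather than in application code.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of reference counting; every name here is hypothetical. */
class RefCountSketch {
    /** Names of reclaimed objects, in the order they were reclaimed. */
    static final List<String> reclaimed = new ArrayList<>();

    static class Obj {
        final String name;
        int count = 0;                              // number of references to this object
        final List<Obj> fields = new ArrayList<>(); // objects this object references
        Obj(String name) { this.name = name; }
    }

    /** A variable or field starts referring to target: the count goes up by one. */
    static void addRef(Obj target) {
        target.count++;
    }

    /** A reference goes out of scope or is overwritten: the count goes down by one. */
    static void release(Obj target) {
        target.count--;
        if (target.count == 0) {
            reclaimed.add(target.name);
            // Reclaiming an object also drops the references it held,
            // which may trigger further reclamation.
            for (Obj child : target.fields) {
                release(child);
            }
        }
    }

    public static void main(String[] args) {
        Obj parent = new Obj("parent");
        Obj child = new Obj("child");
        addRef(parent);            // variable p refers to parent: count 1
        addRef(child);             // variable c refers to child: count 1
        parent.fields.add(child);
        addRef(child);             // parent's field refers to child: count 2
        release(child);            // c = null: count 1, child stays alive
        release(parent);           // p = null: parent reaches 0, then child reaches 0
        System.out.println(reclaimed); // prints [parent, child]
    }
}
```

Releasing the last reference to parent also reclaims child, which is the cascading effect described above.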

Disadvantages

Reference counting cannot detect circular references, that is, two or more objects that refer to each other.

An example of a circular reference:

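class A {
    public B b;
}

class B {
    public A a;
}

public class Main {
    public static void main(String[] args) {
        A a = new A();
        B b = new B();
        a.b = b;   // the A instance refers to the B instance
        b.a = a;   // the B instance refers back to the A instance
        a = null;
        b = null;  // both objects are now unreachable but still reference each other
    }
}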

The variables a and b are now null, but under the reference counting rules the two objects are never collected, because each still holds a reference to the other and neither count drops to 0.

Trace collector

The trace collector traces the object reference graph starting from the root nodes. Objects encountered during tracing are marked in some way, either by setting a flag in the object itself or by setting a bit in a separate bitmap. When tracing ends, unmarked objects are known to be unreachable and can be collected.

The basic tracing algorithm is called mark-sweep. The name refers to the two phases of the garbage collection process:

  • Mark phase: The garbage collector traverses the reference graph, marking each object it encounters.
  • Sweep phase: The memory occupied by unmarked objects is freed. This phase is also when the objects’ finalization methods are triggered.
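The two phases can be sketched over a small object graph. This is a minimal sketch, not how any particular virtual machine does it: the heap is modeled as a plain list, the mark is a boolean flag on each node, and the names MarkSweepSketch, Node, mark, sweep, and collect are all hypothetical. Finalization is left out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.List;

/** Toy mark-sweep over a list-based "heap"; every name here is hypothetical. */
class MarkSweepSketch {
    static class Node {
        final String name;
        boolean marked;                              // the mark lives on the object itself
        final List<Node> refs = new ArrayList<>();   // outgoing references
        Node(String name) { this.name = name; }
    }

    /** Mark phase: flag every node reachable from the roots. */
    static void mark(List<Node> roots) {
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node n = pending.pop();
            if (n.marked) continue;   // already visited, which also makes cycles safe
            n.marked = true;
            pending.addAll(n.refs);
        }
    }

    /** Sweep phase: remove unmarked nodes, clear marks on survivors, return reclaimed names. */
    static List<String> sweep(List<Node> heap) {
        List<String> reclaimed = new ArrayList<>();
        heap.removeIf(n -> {
            if (n.marked) {
                n.marked = false;     // reset for the next collection
                return false;
            }
            reclaimed.add(n.name);
            return true;
        });
        return reclaimed;
    }

    static List<String> collect(List<Node> heap, List<Node> roots) {
        mark(roots);
        return sweep(heap);
    }

    public static void main(String[] args) {
        Node a = new Node("a"), b = new Node("b"), c = new Node("c"), d = new Node("d");
        a.refs.add(b);                // a -> b, both reachable from the root a
        c.refs.add(d);
        d.refs.add(c);                // c <-> d: an unreachable cycle
        List<Node> heap = new ArrayList<>(Arrays.asList(a, b, c, d));
        System.out.println(collect(heap, Collections.singletonList(a))); // prints [c, d]
    }
}
```

Because tracing starts from the roots, the unreachable cycle between c and d is reclaimed, which reference counting alone could not do.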

Compacting collector

A Java virtual machine’s garbage collector will probably have a strategy for combating heap fragmentation. The two strategies commonly used by mark-sweep collectors are compaction and copying. Both reduce fragmentation by moving objects.

A compacting collector slides live objects over free space toward one end of the heap, leaving a large contiguous free area at the other end. All references to the moved objects are then updated to point to the new locations.

Updating references to moved objects is sometimes made simpler by adding a level of indirection to object references. Instead of referring directly to objects in the heap, object references refer to a table of object handles, and each handle holds the actual location of its object in the heap. When an object is moved, only its entry in the handle table needs to be updated. The cost is that every object access goes through the handle table, which slows access down.
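The handle-table idea can be sketched as follows. This is only an illustration: heap addresses are modeled as plain integers, and HandleTableSketch, allocate, move, and resolve are hypothetical names rather than anything a real virtual machine exposes.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy handle table; every name here is hypothetical. */
class HandleTableSketch {
    /** heapAddress.get(h) is the current heap address of the object with handle h. */
    static final List<Integer> heapAddress = new ArrayList<>();

    /** Registers an object at the given address and returns its handle. */
    static int allocate(int address) {
        heapAddress.add(address);
        return heapAddress.size() - 1;   // the "reference" handed to the program
    }

    /** Moving an object touches one table slot, not every reference to it. */
    static void move(int handle, int newAddress) {
        heapAddress.set(handle, newAddress);
    }

    /** Every access pays for one extra lookup through the table. */
    static int resolve(int handle) {
        return heapAddress.get(handle);
    }

    public static void main(String[] args) {
        int ref1 = allocate(4096);
        int ref2 = ref1;                 // two references share one handle
        move(ref1, 128);                 // compaction relocates the object
        System.out.println(resolve(ref1) + " " + resolve(ref2)); // prints 128 128
    }
}
```

Both references see the new address after a single table update, at the price of the extra lookup in resolve on every access.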

Copying collector

A copying garbage collector moves all live objects to a new region. As they are copied, the objects are placed next to each other, which eliminates the gaps that separated them in the old region. The old region is then considered free.

The advantage of this approach is that objects are copied as soon as they are discovered during the traversal from the root objects, so there are no separate mark and sweep phases. Each object is copied to the new region on the fly, and a forwarding pointer is left in its old location. The forwarding pointer lets the garbage collector recognize references to objects that have already been moved; it then sets such a reference to the value of the forwarding pointer so that it points to the object’s new location.

A common copying collector algorithm is called stop-copy. The scheme is as follows:

  • The heap is divided into two regions, and only one of them is used at any one time.
  • Objects are allocated in the same region until the region is exhausted.
  • At that point, program execution is suspended, the heap is traversed, and the live objects encountered during the traversal are copied to the other region.
  • When the stop-copy process finishes, the program resumes execution.
  • Memory is then allocated from the new region until it too is used up. The program is suspended again and the process is repeated.

The price of this approach is that a heap of a given usable size actually needs twice as much memory.
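One collection cycle of such a scheme, including the forwarding pointers, might be sketched like this. It is a simplified model under stated assumptions: the two regions are modeled as Java lists rather than raw memory, the breadth-first scan of the new region is a choice made for this sketch (in the style of Cheney’s algorithm), and StopCopySketch, Obj, forwardedTo, copy, and collect are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Toy stop-copy cycle with forwarding pointers; every name here is hypothetical. */
class StopCopySketch {
    static class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        Obj forwardedTo;              // forwarding pointer left behind in the old region
        Obj(String name) { this.name = name; }
    }

    /** Copies obj into toSpace the first time it is seen and returns the copy. */
    static Obj copy(Obj obj, List<Obj> toSpace) {
        if (obj.forwardedTo == null) {
            Obj moved = new Obj(obj.name);
            moved.refs.addAll(obj.refs);   // these still point at old copies for now
            obj.forwardedTo = moved;
            toSpace.add(moved);            // copies end up next to each other
        }
        return obj.forwardedTo;
    }

    /** Copies everything reachable from the roots; roots are updated in place. */
    static List<Obj> collect(List<Obj> roots) {
        List<Obj> toSpace = new ArrayList<>();
        for (int i = 0; i < roots.size(); i++) {
            roots.set(i, copy(roots.get(i), toSpace));
        }
        // Scan the copies in order; toSpace grows as new objects are discovered.
        for (int scan = 0; scan < toSpace.size(); scan++) {
            List<Obj> refs = toSpace.get(scan).refs;
            for (int i = 0; i < refs.size(); i++) {
                refs.set(i, copy(refs.get(i), toSpace));  // follow the forwarding pointer
            }
        }
        return toSpace;
    }

    public static void main(String[] args) {
        Obj a = new Obj("a"), b = new Obj("b"), garbage = new Obj("garbage");
        a.refs.add(b);
        b.refs.add(a);                // live cycle
        garbage.refs.add(a);          // garbage may point at live data; it is never visited
        List<Obj> roots = new ArrayList<>(Collections.singletonList(a));
        List<Obj> newRegion = collect(roots);
        StringBuilder names = new StringBuilder();
        for (Obj o : newRegion) names.append(o.name).append(' ');
        System.out.println(names.toString().trim()); // prints a b
    }
}
```

Only reachable objects are ever touched, so the cost of a cycle depends on the amount of live data, not on the amount of garbage.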

The process can be pictured as a series of snapshots:

  • In snapshot 1, the bottom half of the heap is unused, and the top half is sparsely filled with objects (the parts that contain objects are shown in orange).
  • In snapshot 2, the top half gradually fills with objects as the program runs.
  • In snapshot 3, the top half is full.
  • In snapshot 4, because the top half is full, the garbage collector stops the program and traces the graph of live objects from the root nodes. Each live object it encounters is copied to the bottom half of the heap, placed right next to the previous one.
  • In snapshot 5, garbage collection has just finished and the program has resumed. The top half has been cleared and is now unused; the objects that were alive have moved to the bottom half.
  • In snapshot 6, the bottom half gradually fills with objects as the program runs.
  • In snapshot 7, the bottom half is full.
  • In snapshot 8, the garbage collector stops the program again and traces the live objects. This time it copies every live object it encounters to the top half of the heap.
  • In snapshot 9, garbage collection is complete; the garbage objects in the bottom half have been cleared and that half is unused. The objects that were alive have moved to the top half.

This process is repeated over and over again during program execution.

Generational collector

The disadvantage of a simple stop-copy collector is that all live objects must be copied at every collection. Most programs in most languages have the following two characteristics, and the copying algorithm can be improved by taking both into account:

  • Most objects created by most programs have very short lifetimes.
  • Most programs create some objects that have very long lifetimes.

One of the main sources of inefficiency in simple stop-copy collectors is that they spend a lot of time copying the same long-lived objects back and forth at every collection.

A generational collector addresses this inefficiency by grouping objects by age and collecting short-lived young objects more often than long-lived ones. The logic is as follows:

  • The heap is divided into two or more subheaps.
  • Each subheap serves one generation of objects.
  • The youngest generation is collected most often. Because most objects are short-lived, only a small percentage of young objects survive their first collection.
  • Once an object in the youngest generation has survived several garbage collections, it is promoted to an older generation, that is, moved to another subheap.
  • Each older generation is collected less often than the generation younger than it.
  • Each time an object matures in its current generation (by surviving multiple garbage collections), it is moved to the next older generation.

In addition to stop-copy garbage collection algorithms, generational collection techniques can also be used for mark-sweep garbage collection algorithms. In either case, breaking down the heap by age can improve the performance of the most basic garbage collection algorithm.
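Promotion by age can be sketched with two lists standing in for the young and old subheaps. This is a rough illustration only: the threshold PROMOTION_AGE, the class GenerationalSketch, and the minorCollect method are hypothetical, and the set of live objects is passed in directly instead of being computed by a real reachability trace.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

/** Toy two-generation heap; every name and the threshold are hypothetical. */
class GenerationalSketch {
    static final int PROMOTION_AGE = 2;   // collections an object must survive to be promoted

    static class Obj {
        final String name;
        int age;                          // number of young collections survived so far
        Obj(String name) { this.name = name; }
    }

    final List<Obj> young = new ArrayList<>();
    final List<Obj> old = new ArrayList<>();

    /** Collects only the young generation; live stands in for a real reachability trace. */
    void minorCollect(Set<Obj> live) {
        Iterator<Obj> it = young.iterator();
        while (it.hasNext()) {
            Obj o = it.next();
            if (!live.contains(o)) {
                it.remove();              // short-lived garbage is dropped
            } else if (++o.age >= PROMOTION_AGE) {
                it.remove();              // survivor has matured: move it to the old subheap
                old.add(o);
            }
        }
    }

    public static void main(String[] args) {
        GenerationalSketch heap = new GenerationalSketch();
        Obj cache = new Obj("cache"), temp = new Obj("temp");
        heap.young.add(cache);
        heap.young.add(temp);
        heap.minorCollect(Collections.singleton(cache));  // temp dies, cache survives (age 1)
        heap.minorCollect(Collections.singleton(cache));  // cache reaches age 2 and is promoted
        System.out.println(heap.young.size() + " " + heap.old.get(0).name); // prints 0 cache
    }
}
```

Once cache has been promoted, later young collections no longer examine or copy it, which is the saving the generational approach is after.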

Adaptive collector

The adaptive collector algorithm takes advantage of the fact that some garbage collection algorithms work better in some situations, while others work better in other situations.

An adaptive algorithm monitors the current situation in the heap and adjusts its garbage collection technique accordingly.

With an adaptive approach, the implementer of a Java virtual machine does not have to choose just one algorithm. Several techniques can be used, each where it works best.

Train algorithm (train GC)

The train algorithm was first proposed by Richard Hudson and Eliot Moss to provide time-bounded incremental collection of the mature object space, and it was used in Sun’s HotSpot virtual machine. The algorithm specifies how the mature object space of a generational garbage collector is organized.

Sadly, the train GC was removed entirely in Sun JDK 6, but the idea matters more than the implementation, so let’s take a look.

Problems with the preceding collection algorithms

Compared with explicitly freeing objects, garbage collection algorithms have a potential disadvantage: the programmer has no control over when CPU time is scheduled for reclaiming memory.

It is almost impossible to predict exactly when garbage collection will take place and how long it will take. Because garbage collectors generally suspend the entire program while they find and collect garbage objects, they can cause pauses of unpredictable length at arbitrary points in the program’s execution. These pauses are sometimes long enough for users to notice.

A garbage collection algorithm is said to be disruptive when it may cause pauses that users can notice or that prevent the program from meeting the requirements of a real-time system.

Incremental collection algorithm

The way to achieve non-disruptive garbage collection is to use an incremental garbage collection algorithm.

An incremental garbage collector does not try to find and reclaim all garbage objects at once; instead, it finds and reclaims a portion at a time. Because only part of the heap is collected on each invocation, each collection should in theory take less time.

If every collection can be guaranteed not to exceed some maximum duration, the Java virtual machine becomes suitable for real-time environments, and pauses that users could notice are eliminated.

Incremental collectors are typically generational collectors.

Cars, trains, and a railway station

The train algorithm divides the mature object space into fixed-size blocks of memory, and each invocation of the algorithm works on a single block. The rules are:

  • Each block belongs to a set, and the blocks within a set are ordered.
  • The sets themselves are also ordered.

In the original paper, the blocks are called cars and the sets are called trains. The mature object space is called the train station.

The algorithm organization chart is as follows:

Naming scheme

Trains are numbered in the order in which they are created. Suppose the first train to arrive (holding the objects that entered this generation first) is pulled onto track 1 and called train 1. The second train to arrive is pulled onto track 2 and called train 2, the next onto track 3, and so on. Under this scheme, a lower-numbered train is always an older train.

Within a train, new cars (blocks) are always attached to the rear. The first car attached is called car 1, the next car 2, and so on. So within a train, a lower number always indicates an older car.

This naming scheme gives the overall order of memory blocks in the mature object space.

The figure above shows three trains, labeled train 1, train 2, and train 3.

  • Train 1 has four cars, labeled 1.1 to 1.4.
  • Train 2 has three cars, labeled 2.1 to 2.3.
  • Train 3 has five cars, labeled 3.1 to 3.5.

The resulting order is:

  • Car 1.1 precedes car 1.2, car 1.2 precedes car 1.3, and so on.
  • The last car of train 1 always precedes the first car of train 2, so car 1.4 precedes car 2.1. Similarly, car 2.3 precedes car 3.1.

The train algorithm collects only one block per invocation: the lowest-numbered one. For the figure above, it collects car 1.1 first and car 1.2 on the next invocation. After it has collected the last car of train 1, the next invocation collects car 2.1 of train 2. (This implies that once a car has been collected, the algorithm removes it.)
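The overall ordering of cars, and the choice of which car to collect next, can be sketched as a comparison on (train number, car number). This is only an illustration of the numbering scheme: TrainOrderSketch, Car, ORDER, and nextToCollect are hypothetical names, and the cars here hold no objects.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

/** Toy model of car ordering in the train station; every name here is hypothetical. */
class TrainOrderSketch {
    static class Car {
        final int train;   // train number: lower means created earlier
        final int car;     // position within the train: lower means attached earlier
        Car(int train, int car) { this.train = train; this.car = car; }
        @Override public String toString() { return train + "." + car; }
    }

    /** Overall order: by train number first, then by car number within the train. */
    static final Comparator<Car> ORDER =
            Comparator.<Car>comparingInt(c -> c.train).thenComparingInt(c -> c.car);

    /** Removes and returns the lowest-numbered car, the next one to be collected. */
    static Car nextToCollect(List<Car> station) {
        Car lowest = Collections.min(station, ORDER);
        station.remove(lowest);
        return lowest;
    }

    public static void main(String[] args) {
        List<Car> station = new ArrayList<>();
        station.add(new Car(2, 1));
        station.add(new Car(1, 2));
        station.add(new Car(1, 1));
        StringBuilder order = new StringBuilder();
        while (!station.isEmpty()) {
            order.append(nextToCollect(station)).append(' ');
        }
        System.out.println(order.toString().trim()); // prints 1.1 1.2 2.1
    }
}
```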

Objects enter the mature object space from the subheaps of younger generations. When they arrive, they are either added to an existing train (any train except the lowest-numbered one) or placed in one or more new trains created to hold them. That is, an object can arrive at the train station in two ways:

  • Packed into cars that are attached to the rear of an existing train other than the lowest-numbered train
  • Pulled into the station as a new train

Why exclude the lowest-numbered train? Because the algorithm always examines the lowest-numbered train, or the lowest-numbered car of that train, so that train never directly receives objects that have just entered the station. The description of car collection below makes this clear.

Car collection

Each time the algorithm is invoked, it collects either the lowest-numbered car of the lowest-numbered train or the entire lowest-numbered train. The idea is as follows:

  • First, check for references into any car of the lowest-numbered train. If no reference from outside the lowest-numbered train points to an object inside it, the entire train is garbage and can be discarded.
  • If the lowest-numbered train is not all garbage, the algorithm turns its attention to the lowest-numbered car of that train. It moves every object in the car that is still referenced to some other car; any objects left in the car afterward can be reclaimed.

We have seen that circular references are a problem. For the train algorithm, the key to guaranteeing that a cyclic data structure ends up entirely in one train is the way the algorithm moves objects, which follows these rules:

  • If an object in the car being collected (the lowest-numbered car of the lowest-numbered train) is referenced from outside the train station, the object is moved to a car in some train other than the one being collected.
  • If the object is referenced from another train in the station, it is moved to the train that references it.
    • Each moved object is then scanned, and any objects it references in the original car are moved to the train that now references them. This is repeated until no references from other trains point into the car being collected.
    • If a receiving train runs out of space, the algorithm creates a new car and attaches it to the rear of that train.
  • Once there are no references from outside the station and none from other trains inside the station, any remaining external references into the car being collected must come from other cars of the same train.
    • The algorithm moves the objects referenced this way to the last car of the lowest-numbered train.
    • The newly moved objects are then scanned for references to objects in the car being collected.
      • Any newly discovered objects are also moved to the last car of the lowest-numbered train.
      • They are scanned in turn, and the process repeats
      • until no reference of any kind points into the car being collected.
    • The algorithm then reclaims the entire space occupied by the lowest-numbered car, freeing any objects still in it, and returns.

Thus, on each invocation, the train algorithm collects either the lowest-numbered car of the lowest-numbered train or the entire lowest-numbered train. Because objects are moved to the trains that reference them, related objects become clustered together. Eventually, all the objects of a cyclic data structure that has become garbage, no matter how large, end up in the same train; a larger cyclic structure simply results in a train with more cars. And because the algorithm first checks whether the lowest-numbered train is entirely garbage, ignoring references internal to the train, it can collect cyclic data structures.

Remembered sets

The goal of the train algorithm is to provide time-bounded incremental collection for a generational garbage collector.

A maximum size can be specified for a car, and each invocation collects only one car, so in most cases the algorithm can keep each invocation within some maximum time limit. It cannot guarantee this in every case, however, because the algorithm does more than just copy objects.

To optimize the collection process, the train algorithm uses remembered sets. A remembered set is a data structure that records all the references from outside a car or train to objects inside it. The algorithm maintains a remembered set for each car and each train in the train station (the mature object space). An empty remembered set means that the objects in the car or train are no longer referenced from anywhere outside it (they have been forgotten). Forgotten objects are unreachable and can be reclaimed.

The advantage of remembered sets

Remembered sets help the train algorithm do its job more efficiently. When the algorithm finds that the remembered set of a car or train is empty, it knows that the car or train holds only garbage and can free its memory for reuse. And when it moves an object to another car, the information in the remembered set lets it efficiently update all references to the moved object.
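The emptiness check can be sketched with a set of labels standing in for the external locations that refer into a car. This is a deliberately small model: RememberedSetSketch, Car, rememberedSet, and isAllGarbage are hypothetical names, and a real collector would record actual reference locations and keep the set up to date whenever references are written.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy remembered set for one car; every name here is hypothetical. */
class RememberedSetSketch {
    static class Car {
        final String id;
        /** Labels of locations outside this car that hold references into it. */
        final Set<String> rememberedSet = new HashSet<>();
        Car(String id) { this.id = id; }

        /** An empty remembered set means nothing outside the car refers into it. */
        boolean isAllGarbage() {
            return rememberedSet.isEmpty();
        }
    }

    public static void main(String[] args) {
        Car car = new Car("1.1");
        String location = "car 2.3, object x, field next";
        car.rememberedSet.add(location);          // a reference into the car is stored
        System.out.println(car.isAllGarbage());   // prints false
        car.rememberedSet.remove(location);       // that reference is overwritten
        System.out.println(car.isAllGarbage());   // prints true
    }
}
```

Checking the set replaces a scan of the rest of the heap, and when an object moves, the same set lists exactly which locations need to be updated.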

Limitations

Limiting the size of a car puts an upper bound on the number of bytes copied per invocation, but the work needed to move a popular object (one with many references to it from outside) is almost impossible to bound. Each time the algorithm moves an object, it must walk the object’s remembered set and update every reference so that it points to the new address. Because the number of references to an object is unlimited, so is the time needed to update all references to a moved object.

That is, under certain conditions the train algorithm can still be disruptive. Apart from the special case of popular objects, though, it works well most of the time.

Again, the train GC was removed entirely in Sun JDK 6, but how different can later GC strategies really be?

Finalization

In the Java language, an object can have a finalization method (finalizer): a method that the garbage collector must run before it frees the object. The possible presence of finalizers complicates the job of the garbage collector in any Java virtual machine.

Finalization methods

To add a finalizer to a class, do this:

public class FinalizerTest {
    @Override
    protected void finalize() throws Throwable {
        // do something
        super.finalize();
    }
}

The garbage collector must check whether the unreferenced objects it finds declare a finalize() method.

Because of finalization methods, the garbage collector of a Java virtual machine must perform some extra steps at each collection:

  • First, the garbage collector must somehow detect the objects that are no longer referenced (call this the first pass).
  • It must then check whether any of the unreferenced objects it has detected declare a finalization method. (If time permits, the garbage collector may start running those finalization methods at this point.)
  • After all finalization methods have run, the garbage collector must again detect unreferenced objects, starting from the root nodes (call this the second pass). This step is necessary because a finalization method may resurrect an unreferenced object by making it referenced again.
  • Finally, the garbage collector can free every object that was found to be unreferenced in both the first and the second pass.

To reduce the time it takes to free memory, the garbage collector can optionally insert a step between detecting the unreferenced objects that have finalization methods and running those methods:

  • Once the garbage collector has completed the first pass and found the unreferenced objects that need to be finalized, it can run a small trace that starts from the objects waiting to be finalized rather than from the root nodes:
    • Any object that is not reachable from the root nodes and also not reachable from the objects waiting to be finalized cannot be resurrected when the finalization methods run, so it can be freed immediately.
    • Note that the two conditions above are joined by AND: both must hold.

If an object with a finalization method is no longer referenced and its finalization method has already run, the garbage collector must somehow remember this so that it never runs that object’s finalization method again.

If the object is resurrected by its own finalization method or by that of another object and later becomes unreferenced again, the garbage collector must treat it as an object with no finalization method. (This is why finalize() runs at most once per object.)

When programming in Java, keep in mind that it is the garbage collector that runs an object’s finalization method. Because we cannot predict when garbage collection will be triggered, we cannot predict when an object’s finalization method will run.

The life cycle of object reachability

Prior to version 1.2, every object in the heap was in one of three states from the garbage collector’s point of view:

  • Reachable: The garbage collector can reach the object by tracing from the root nodes.
  • Resurrectable: Once the program has released all references to an object (so that it can no longer be reached by tracing from the root nodes), the object becomes resurrectable. Regarding resurrection, note that:
    • All objects pass through the resurrectable state, not only those that declare a finalize() method.
    • Because a finalize() method can be written to make an object referenced again, any object in the resurrectable state may be resurrected.
    • After the garbage collector has run finalize() (where declared) on all resurrectable objects, each of them becomes either reachable again or unreachable.
  • Unreachable: The object can no longer be reached and cannot be resurrected by any finalization method. Unreachable objects no longer affect program execution and can be reclaimed freely.

In version 1.2, the reachable state was extended with three new states: softly reachable, weakly reachable, and phantom reachable, and the original reachable state became strongly reachable. (These correspond to the strong references, weak references, and so on that we use when programming.)

Any object referenced directly from a root node, such as from a local variable, is strongly reachable. Similarly, any object referenced by a strongly reachable object is also strongly reachable.

Reference objects (Reference)

Java provides the java.lang.ref.Reference class to manage these kinds of references, together with its subclasses SoftReference, WeakReference, and PhantomReference. The inheritance diagram is as follows:

  • SoftReference: encapsulates a soft reference to a referent
  • WeakReference: encapsulates a weak reference to a referent
  • PhantomReference: encapsulates a phantom reference to a referent

The difference between a strong reference and the three kinds above is that a strong reference prevents its target from being garbage collected, while soft, weak, and phantom references do not.

To create a reference object, simply pass a strong reference to the target to the constructor of the corresponding Reference subclass. The following uses SoftReference as an example:

import java.lang.ref.SoftReference;

public class ReferenceTest {
    public static void main(String[] args) {
        Cow c = new Cow();
        SoftReference<Cow> softReference = new SoftReference<Cow>(c);
        // Drop the strong reference; the Cow is now only softly reachable
        c = null;
    }
}

class Cow {}

Through the local variable softReference we hold a soft reference to a Cow instance. The reference diagram is as follows:

The SoftReference object encapsulates a soft reference to the Cow object. The SoftReference object itself is strongly referenced by the local variable softReference, which, like all local variables, is a root node from the garbage collector’s point of view. (I am not entirely sure about this part.)

Once a reference object is created, it keeps its soft, weak, or phantom reference to its target until it is cleared by the program or by the garbage collector. A program clears a reference object by calling its clear() method; when the garbage collector clears one, the effect is the same.

Changes to the reachable state

As mentioned earlier, the purpose of reference objects is to let a program refer to objects that the garbage collector is still free to collect. In other words, the garbage collector may change the reachability state of any object that is not strongly reachable whenever it likes.

To be notified of such state changes, we can use the java.lang.ref.ReferenceQueue<T> class. Reference has a constructor that accepts a queue:

public abstract class Reference<T> {
    // Excerpt: fields and other members omitted
    Reference(T referent) {
        this(referent, null);
    }

    Reference(T referent, ReferenceQueue<? super T> queue) {
        this.referent = referent;
        this.queue = (queue == null) ? ReferenceQueue.NULL : queue;
    }
}

As for the overall structure of Reference, the figure is as follows:

To be honest, I have not worked much with the Reference classes before, apart from simple use of WeakReference’s get method on Android. I will look at them separately once this garbage collection topic is finished.

So we could write:

import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

class ReferenceTest {
    public static void main(String[] args) throws InterruptedException {
        ReferenceQueue<Cow> referenceQueue = new ReferenceQueue<>();
        Cow c = new Cow();
        WeakReference<Cow> weakReference = new WeakReference<Cow>(c, referenceQueue);
        // Set the strong reference to the Cow to null
        c = null;

        weakReference.clear();
        System.out.println("clear Reference");

        System.out.println("Get Cow under weak reference = " + weakReference.get());

        // The virtual machine would do this step eventually; to save time, we trigger it manually
        weakReference.enqueue();

        // remove() blocks until a reference object is available in the queue
        Reference<? extends Cow> cow = referenceQueue.remove();
        System.out.println("Get Cow from the release queue = " + cow);
    }
}

class Cow {}

When the garbage collector decides to collect a weakly reachable object, it clears the reference to the Cow object held in the WeakReference object (with the same effect as the clear() method). The WeakReference object may then be added to its reference queue, either immediately or at some later time.

Adding a reference object to its associated queue has the same effect as calling its enqueue() method, although the garbage collector enqueues references directly rather than through that method. A reference object can only be associated with a queue when it is created, and it is added to that queue at most once: only the first enqueue succeeds.
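The queue rules described above can be observed through the boolean that enqueue() returns. This is a minimal sketch with made-up names (EnqueueSketch, REFERENT); the target is held in a static field so that the garbage collector never enqueues the references on its own:

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

class EnqueueSketch {
    // Strongly held for the lifetime of the class, so only our own calls enqueue anything
    static final Object REFERENT = new Object();

    // Results of: first enqueue(), second enqueue(), enqueue() on a reference created without a queue
    static boolean[] enqueueResults() {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        WeakReference<Object> withQueue = new WeakReference<>(REFERENT, queue);
        WeakReference<Object> withoutQueue = new WeakReference<>(REFERENT);

        boolean first = withQueue.enqueue();      // succeeds: registered with a queue, not yet enqueued
        boolean second = withQueue.enqueue();     // fails: already enqueued once
        boolean noQueue = withoutQueue.enqueue(); // fails: no queue was given at construction
        return new boolean[] { first, second, noQueue };
    }

    // The object that comes out of the queue is the reference object itself, not its target
    static boolean polledIsSameReference() {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        WeakReference<Object> ref = new WeakReference<>(REFERENT, queue);
        ref.enqueue();
        Reference<?> polled = queue.poll();       // non-blocking; returns null when the queue is empty
        return polled == ref && queue.poll() == null;
    }

    public static void main(String[] args) {
        boolean[] r = enqueueResults();
        System.out.println(r[0] + " " + r[1] + " " + r[2]);
        System.out.println(polledIsSameReference());
    }
}
```

Using poll() instead of remove() keeps the sketch from blocking if nothing has been enqueued.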

The garbage collector enqueues soft, weak, and phantom reference objects in different situations, to signal three different reachability state transitions. Altogether there are six reachability states:

  • Strongly reachable: The object can be reached from a root node without passing through any reference object. An object begins its life cycle in the strongly reachable state and stays there as long as a root node or another strongly reachable object references it. The garbage collector does not attempt to reclaim objects in this state.
  • Softly reachable: The object is not strongly reachable, but can be reached from a root node through one or more soft reference objects. The garbage collector may reclaim softly reachable objects. If it does, it clears all soft references to the object. When the garbage collector clears a soft reference object that is associated with a reference queue, it enqueues that soft reference object.
  • Weakly reachable: The object is neither strongly nor softly reachable, but can be reached from a root node through one or more weak reference objects. The garbage collector must reclaim the memory occupied by weakly reachable objects. When it does, it clears all weak references to the object and enqueues those weak reference objects (if they are associated with a queue).
  • Resurrectable: The object is neither strongly, softly, nor weakly reachable, but may still be resurrected to one of these states by some finalizer.
  • Phantom reachable: The object is neither strongly, softly, nor weakly reachable, it has been determined that no finalizer can resurrect it (if the object defines a finalizer, that finalizer has already run), and it can be reached from a root node through one or more phantom reference objects. As soon as the target of a phantom reference becomes phantom reachable, the garbage collector enqueues the reference object. In Java 8 and earlier, the garbage collector does not clear a phantom reference; all phantom references must be cleared explicitly by the program.
  • Unreachable: The object is not strongly, softly, weakly, or phantom reachable, and it cannot be resurrected. Unreachable objects are ready to be reclaimed.

Note that the garbage collector enqueues soft and weak reference objects when their target leaves the corresponding reachability state (that is, once the reference has been cleared), whereas it enqueues phantom reference objects when their target enters the corresponding state. In other words, an enqueued soft or weak reference indicates that its target has just left the softly or weakly reachable state, while an enqueued phantom reference indicates that its target has just become phantom reachable. In Java 8 and earlier, a phantom reachable object remains phantom reachable until the program explicitly clears the reference object.

Usage of different types of references

The garbage collector treats soft, weak, and phantom reference objects differently because each is designed to provide a different service to the program.

  • Soft references can be used to build an in-memory cache whose size adapts to the overall memory needs of the program.
  • Weak references can be used to build a canonical mapping, such as a hash table, whose keys and values can be purged from the table when nothing else references them.
  • Phantom references can be used to implement end-of-life cleanup policies that are more elaborate than what finalizers allow.
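As an illustration of the third use, here is a minimal sketch of queue-driven cleanup built on PhantomReference. All names in it (PhantomCleanupSketch, ResourceRef, track, releaseCollected) are hypothetical, and the "release" step is reduced to returning the resource name. In main the reference is enqueued manually to stand in for the garbage collector, in the same spirit as the article's own examples:

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class PhantomCleanupSketch {
    // Hypothetical reference subclass that remembers which resource belongs to the owner
    static final class ResourceRef extends PhantomReference<Object> {
        final String resourceName;

        ResourceRef(Object owner, ReferenceQueue<Object> queue, String resourceName) {
            super(owner, queue);
            this.resourceName = resourceName;
        }
    }

    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();
    // The reference objects themselves must stay strongly reachable until they are processed
    private final Set<ResourceRef> pending = new HashSet<>();

    ResourceRef track(Object owner, String resourceName) {
        ResourceRef ref = new ResourceRef(owner, queue, resourceName);
        pending.add(ref);
        return ref;
    }

    // Drains the queue without blocking and returns the names of the resources to release
    List<String> releaseCollected() {
        List<String> released = new ArrayList<>();
        Reference<?> ref;
        while ((ref = queue.poll()) != null) {
            ResourceRef resourceRef = (ResourceRef) ref;
            pending.remove(resourceRef);
            resourceRef.clear();                  // harmless if the reference is already cleared
            released.add(resourceRef.resourceName);
        }
        return released;
    }

    public static void main(String[] args) {
        PhantomCleanupSketch tracker = new PhantomCleanupSketch();
        Object owner = new Object();
        ResourceRef ref = tracker.track(owner, "temp-file");
        ref.enqueue();                            // stand-in for the garbage collector's enqueue
        System.out.println(tracker.releaseCollected());
    }
}
```

The cleanup code never touches the owner object, only the data stored next to the reference, which is why this pattern works even though get() on a phantom reference always returns null.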

Some considerations for phantom references

Note that to reach the target of a soft or weak reference, you call the get() method of the reference object. If the reference has not been cleared, get() returns the referenced object; if it has been cleared, get() returns null.
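Because a reference can be cleared at any moment, a common way to use get() is to call it once, keep the result in a local variable, and test that variable. The sketch below shows the idiom; the class and method names (GetIdiomSketch, describe) are made up for illustration:

```java
import java.lang.ref.WeakReference;

class GetIdiomSketch {
    // Calls get() exactly once; the local variable is a strong reference while we use it
    static String describe(WeakReference<StringBuilder> ref) {
        StringBuilder target = ref.get();
        return (target != null) ? "alive: " + target : "cleared";
    }

    public static void main(String[] args) {
        StringBuilder name = new StringBuilder("cow");   // strong reference held by main
        WeakReference<StringBuilder> ref = new WeakReference<>(name);
        System.out.println(describe(ref));
        ref.clear();
        System.out.println(describe(ref));
    }
}
```

Checking ref.get() != null and then calling ref.get() a second time would be unsafe, since the reference may be cleared between the two calls.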

For a phantom reference object, however, get() always returns null. Take a look at the source code of PhantomReference:

public class PhantomReference<T> extends Reference<T> {
    public T get() {
        return null;
    }

    public PhantomReference(T referent, ReferenceQueue<? super T> q) {
        super(referent, q);
    }
}

It really is that simple. Why? Of the six states described earlier, the phantom reachable state means that the object can no longer be resurrected. If the get() method of a phantom reference returned the object, this rule would be broken. Remember: once an object becomes phantom reachable, it cannot be resurrected.

But is virtual machine design really that rigorous? Let’s look at the following code:

    // Requires imports of java.lang.ref.* and java.lang.reflect.Field.
    // The behavior shown is that of Java 8; newer versions restrict this reflective access.
    public static void main(String[] args) throws Exception {
        ReferenceQueue<Cow> referenceQueue = new ReferenceQueue<>();
        Cow c = new Cow();
        PhantomReference<Cow> phantomReference = new PhantomReference<Cow>(c, referenceQueue);
        // Set the strong reference to the Cow to null
        c = null;

        System.out.println("get() on the phantom reference = " + phantomReference.get());

        // The virtual machine would do this step eventually; to save time, we trigger it manually
        phantomReference.enqueue();

        // Remove the phantom reference object from the queue
        Reference<? extends Cow> cow = referenceQueue.remove();

        // get() cannot return the object, so read the private referent field via reflection
        Field field = Reference.class.getDeclaredField("referent");
        field.setAccessible(true);
        Object obj = field.get(cow);
        System.out.println("Referent read via reflection before clear() = " + obj);

        // Release it manually
        cow.clear();

        // Read the field again via reflection
        obj = field.get(cow);
        System.out.println("Referent read via reflection after clear() = " + obj);
    }

The output is as follows:

get() on the phantom reference = null
Referent read via reflection before clear() = hua.lee.jvm.Cow@60e53b93
Referent read via reflection after clear() = null

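It is important to note that, in Java 8 and earlier, an object in the phantom reachable state is not reclaimed by the garbage collector on its own. We need to call clear() manually, as in the example above, before the object can be freed. Since Java 9 the garbage collector clears phantom references automatically when it enqueues them, so the manual step is no longer needed there.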

Common scenarios for soft references

Virtual machine implementations must clear all soft references to softly reachable objects before throwing an OutOfMemoryError, but otherwise they are free to choose when, or whether, to clear them. Ideally, an implementation clears soft references only when memory is low, clears older ones before newer ones, and clears long-unused ones before recently used ones.

Soft references allow you to cache data in memory that requires time to be retrieved from external sources, such as files, databases, or data on the network. As long as the virtual machine has enough memory, all strong and soft reference data can be stored in the heap. If memory is tight, the garbage collector decides to clean up soft references, reclaiming the space occupied by soft-referenced data. The next time a program needs to use this data, it may have to load it again from an external data source.
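A cache of this kind can be sketched in a few lines. The class name SoftCacheSketch and its loader parameter are assumptions made for illustration; the loader stands in for the slow external source (file, database, network). The sketch is not thread-safe and never removes map entries whose references have been cleared:

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

class SoftCacheSketch<K, V> {
    private final Map<K, SoftReference<V>> entries = new HashMap<>();
    private final Function<K, V> loader;   // hypothetical stand-in for the slow external source

    SoftCacheSketch(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        SoftReference<V> ref = entries.get(key);
        // A missing entry and a cleared reference are handled the same way: reload
        V value = (ref != null) ? ref.get() : null;
        if (value == null) {
            value = loader.apply(key);
            entries.put(key, new SoftReference<>(value));
        }
        return value;
    }

    public static void main(String[] args) {
        SoftCacheSketch<String, String> cache =
                new SoftCacheSketch<String, String>(key -> "contents of " + key);
        System.out.println(cache.get("report.txt"));   // loaded through the loader
        System.out.println(cache.get("report.txt"));   // served from the cache unless it was cleared
    }
}
```

Callers only ever see strong references to the values, so whether a given lookup hit the cache or reloaded the data is invisible to them apart from the time it takes.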

Common scenarios for weak references

Weak references are similar to soft references, but differ:

  • The garbage collector can decide whether to remove soft references to objects in a soft reachable state.
  • For weakly reachable objects, the garbage collector immediately clears the associated weak references.

This property of weak references lets us build canonical mappings of keys and values. The java.util.WeakHashMap class provides such a mapping based on weak references. Key-value pairs are added to a WeakHashMap instance with the put() method. Unlike a HashMap, however, a WeakHashMap holds each key object through a weak reference that is associated with a reference queue. When the garbage collector detects that a key object is weakly reachable, it clears the reference and adds the weak reference object to its queue. The next time the WeakHashMap is accessed, it pulls from the reference queue all weak reference objects that the garbage collector has put there and removes the mappings associated with them.
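The behavior can be tried out with the following sketch; the class and method names are made up for illustration. The first method is deterministic because the key stays strongly referenced. The second depends on System.gc(), which is only a hint to the virtual machine, so it retries a bounded number of times and its result may vary between runs and virtual machines:

```java
import java.util.Map;
import java.util.WeakHashMap;

class WeakHashMapSketch {
    // While the key is strongly referenced, its entry must stay in the map
    static boolean entryPresentWhileKeyHeld() {
        Map<Object, String> map = new WeakHashMap<>();
        Object key = new Object();
        map.put(key, "metadata");
        System.gc();                                   // only a hint to the virtual machine
        return "metadata".equals(map.get(key));        // key is still in use here
    }

    // Once nothing else references the key, the entry may disappear after a collection
    static boolean entryGoneAfterKeyDropped() throws InterruptedException {
        Map<Object, String> map = new WeakHashMap<>();
        map.put(new Object(), "metadata");             // no strong reference to the key is kept
        for (int i = 0; i < 50 && !map.isEmpty(); i++) {
            System.gc();
            Thread.sleep(10);
        }
        return map.isEmpty();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("present while key held: " + entryPresentWhileKeyHeld());
        System.out.println("gone after key dropped: " + entryGoneAfterKeyDropped());
    }
}
```

Note that only the keys are weakly referenced. A value that refers back to its own key keeps that key strongly reachable, and the entry is then never removed.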