Introduction

Hello everyone, I'm South Orange. It has been almost two years since I first touched Java. In those two years I have gone from a complete beginner who did not understand even a few of Java's data structures to a slightly more advanced beginner, and I have learned a lot along the way. Knowledge becomes more valuable the more it is shared, so I have recently been summarizing (including material learned and quoted from more experienced developers) the points that I personally think matter most in everyday study and in interviews. I hope it helps.

Here are some of the previous articles, if you are interested, you can read them.

  • Some overlooked points about indexes
  • Redis basics in two articles (Part 1)
  • Redis basics in two articles (Part 2)
  • Everything about locks in Java
  • Conquering the JVM: JVM objects and object access location (Part 1)
  • Conquering the JVM: Garbage collection in the JVM (Part 2)
  • Conquering the JVM: The JVM's garbage collectors (Part 3)

The idea for this article comes from Ape Man Valley, whose technical skills are excellent and who puts real effort into writing; I got a great deal out of those articles. (^_^)

If you are interested, you can follow my public account, where new articles appear first. You can also ask me for the mind map.

The previous article covered JVM objects and how they are accessed, and referred to garbage collection many times. Compared with C++, Java gives up quite a few things (such as pointers, my favorite), but it also has things C++ lacks (such as garbage collection). In this article I'll focus on garbage collection in the JVM.

I. Confirming the death of an object

To reclaim an object, the JVM must first make sure the object is dead. The previous article said that an object has to be marked at least twice before it is declared dead, so what methods does the JVM use to mark objects and decide that they are dead?

1. Reference counting

Add a reference counter to each object and track every reference update. Each time a new reference to the object is created, the counter is incremented; each time a reference becomes invalid, the counter is decremented. An object whose counter reaches 0 is dead.

However, reference counting has several disadvantages:

  • 1. It needs extra space to store the counters, and the counter must be updated on every reference change, which is tedious and costly.
  • 2. It cannot handle circular references (e.g. A references B and B references A): the counters never drop to 0, so the objects are never reclaimed.
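The circular-reference problem is easy to reproduce in a toy model. The sketch below is only an illustration: `Node`, `addField`, and `release` are hypothetical names invented here, and no real JVM manages memory this way.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of reference counting; purely illustrative. */
final class RefCountDemo {
    /** A hypothetical heap object that tracks how many references point at it. */
    static final class Node {
        final String name;
        int refCount = 0;
        boolean freed = false;
        final List<Node> fields = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    /** Simulates storing a reference to target in one of owner's fields. */
    static void addField(Node owner, Node target) {
        owner.fields.add(target);
        target.refCount++;
    }

    /** Simulates a reference becoming invalid; frees the object when the count hits zero. */
    static void release(Node target) {
        target.refCount--;
        if (target.refCount == 0 && !target.freed) {
            target.freed = true;
            for (Node child : target.fields) {
                release(child); // a freed object no longer references its children
            }
        }
    }

    /** A -> B only: dropping the external references frees both objects. */
    static boolean chainIsFreed() {
        Node a = new Node("A");
        Node b = new Node("B");
        a.refCount++; // external (stack) reference to A
        b.refCount++; // external (stack) reference to B
        addField(a, b);
        release(a);
        release(b);
        return a.freed && b.freed;
    }

    /** A <-> B: each keeps the other's counter at 1, so neither is ever freed. */
    static boolean cycleIsFreed() {
        Node a = new Node("A");
        Node b = new Node("B");
        a.refCount++;
        b.refCount++;
        addField(a, b);
        addField(b, a);
        release(a);
        release(b);
        return a.freed || b.freed;
    }

    public static void main(String[] args) {
        System.out.println("chain freed: " + chainIsFreed());
        System.out.println("cycle freed: " + cycleIsFreed());
    }
}
```

Running `main` prints `chain freed: true` and then `cycle freed: false`: after both external references are gone, A and B in the cycle still hold a count of 1 each.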

2. Reachability analysis

An object is reachable if at least one variable in the program refers to it, either directly or indirectly through other reachable objects. More precisely, an object is considered reachable if either of the following two conditions is met:

  • 1. The object itself is a root object. A root is an object that can be accessed from outside the heap. The JVM marks a set of objects as roots, including global variables, some system classes, and objects referenced from the stack, such as local variables and parameters in the current stack frame.
  • 2. The object is referenced by a reachable object.

GC Roots (references from outside the heap into the heap) include the following:

  • Objects referenced from the virtual machine stack (the local variable table in a stack frame).
  • Objects referenced by static class fields in the method area.
  • Objects referenced by constants in the method area.
  • Objects referenced by JNI (that is, native methods) in the native method stack.
  • Java threads that have been started and have not yet stopped.
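Reachability analysis is essentially a graph traversal that starts at the roots. The sketch below models the heap as a map from object name to the names it references; the map layout and the object names are assumptions made for illustration only.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy reachability analysis over a hypothetical object graph. */
final class ReachabilityDemo {
    /** Returns every object reachable from the roots by following reference edges. */
    static Set<String> reachable(Map<String, List<String>> refs, List<String> roots) {
        Set<String> marked = new HashSet<>();
        Deque<String> worklist = new ArrayDeque<>(roots);
        while (!worklist.isEmpty()) {
            String obj = worklist.pop();
            if (marked.add(obj)) { // first visit: follow its outgoing references
                worklist.addAll(refs.getOrDefault(obj, Collections.emptyList()));
            }
        }
        return marked;
    }

    /** "local" plays the role of a stack variable (a GC root); C and D only reference each other. */
    static Map<String, List<String>> sampleHeap() {
        Map<String, List<String>> refs = new HashMap<>();
        refs.put("local", Arrays.asList("A"));
        refs.put("A", Arrays.asList("B"));
        refs.put("B", Collections.emptyList());
        refs.put("C", Arrays.asList("D"));
        refs.put("D", Arrays.asList("C"));
        return refs;
    }

    public static void main(String[] args) {
        Set<String> live = reachable(sampleHeap(), Arrays.asList("local"));
        System.out.println("reachable: " + live);
        System.out.println("C is garbage: " + !live.contains("C"));
    }
}
```

Unlike reference counting, the C/D cycle is correctly treated as garbage here, because nothing reachable from a root points to it.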

The basic idea of this algorithm was described in the final section of the previous article, on object reclamation; review it there if you are interested.

However, reachability analysis has problems of its own. For example, in a multithreaded environment, other threads may update references in objects that have already been visited, causing false positives (a reference is set to null) or false negatives (a reference is set to an object that has not been visited). With a false positive, the Java virtual machine at worst misses an opportunity to collect some garbage. A false negative is a serious problem, because the garbage collector may reclaim the memory of an object that is actually still referenced. Once the reclaimed object is accessed through the original reference, the Java virtual machine is likely to crash outright.

II. Garbage collection algorithms

Next, let’s look at how to quickly recycle previously found garbage.

1. Mark-sweep algorithm

Mark (mark all objects that can be reclaimed), then sweep (reclaim every marked object and free its space).

The advantage of mark-sweep is that it handles circular references and is simple and practical. Few collectors rely on it, though; the CMS garbage collector uses it for old-generation collection.

Disadvantages of the mark-sweep algorithm:

  • 1. Memory fragmentation. Because each object in the Java virtual machine heap must occupy contiguous memory, there can be extreme cases where the total free memory is sufficient but an allocation still fails: no contiguous block is large enough, so another garbage collection has to be triggered early.
  • 2. Low allocation efficiency. With one contiguous block of free memory, allocation can be done by pointer bumping. With a free list, the Java virtual machine has to walk the list entry by entry to find a free block that can hold the new object.
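The fragmentation problem can be seen in a very small model. In the sketch below the heap is an array of equally sized slots, and `marked[i]` means slot i was found to be live during the mark phase; both are simplifying assumptions of this example, not a description of any real collector.

```java
/** Toy sweep phase over a heap of fixed-size slots. */
final class MarkSweepDemo {
    /** Sweep: a slot stays allocated only if it was allocated and found live. */
    static boolean[] sweep(boolean[] allocated, boolean[] marked) {
        boolean[] after = new boolean[allocated.length];
        for (int i = 0; i < allocated.length; i++) {
            after[i] = allocated[i] && marked[i];
        }
        return after;
    }

    /** Total number of free slots. */
    static int totalFree(boolean[] allocated) {
        int free = 0;
        for (boolean slot : allocated) {
            if (!slot) free++;
        }
        return free;
    }

    /** Size of the largest run of adjacent free slots. */
    static int largestFreeRun(boolean[] allocated) {
        int best = 0;
        int run = 0;
        for (boolean slot : allocated) {
            run = slot ? 0 : run + 1;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        boolean[] allocated = {true, true, true, true, true, true, true, true};
        boolean[] marked    = {true, false, true, false, true, false, true, false};
        boolean[] after = sweep(allocated, marked);
        System.out.println("free slots: " + totalFree(after));
        System.out.println("largest contiguous free run: " + largestFreeRun(after));
    }
}
```

After the sweep there are 4 free slots in total, but the largest contiguous run is 1, so an object that needs 2 adjacent slots cannot be allocated even though half the heap is free.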

2. Mark-copy algorithm

The mark-copy algorithm solves the problem of memory fragmentation, and its process is as follows:

  • 1. Partition: memory is divided into one Eden area, the "main battlefield" where objects are allocated, and two Survivor areas (the survivor space, split into two equally sized areas, from and to).
  • 2. Copy: during collection, copy the surviving objects in Eden (and in the from area) to the other Survivor area.
  • 3. Clear: since the previous stage has put the surviving objects in a safe place, you can now "clear the battlefield" and release Eden and the from Survivor area.
  • 4. Promotion: if, during the copy phase, one Survivor area cannot hold all the surviving objects, the overflow goes straight to the old generation.

From these steps alone, we can see that the mark-copy algorithm uses heap space inefficiently, because part of it is always held in reserve. In particular, when the object survival rate is high, more copying is required and efficiency drops.
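The four steps can be sketched with plain lists. This is a deliberately simplified model: the class name `CopyingDemo`, the `survivorCapacity` parameter, and the externally supplied `live` set are assumptions of the example, and details such as object ages are left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy young generation: Eden plus two Survivor areas, with overflow promoted to the old generation. */
final class CopyingDemo {
    final List<String> eden = new ArrayList<>();
    List<String> from = new ArrayList<>();
    List<String> to = new ArrayList<>();
    final List<String> old = new ArrayList<>();
    final int survivorCapacity;

    CopyingDemo(int survivorCapacity) {
        this.survivorCapacity = survivorCapacity;
    }

    /** One collection: copy survivors into "to", clear Eden and "from", then swap the Survivor roles. */
    void minorGc(Set<String> live) {
        copySurvivors(eden, live);
        copySurvivors(from, live);
        eden.clear();
        from.clear();
        List<String> tmp = from;
        from = to;
        to = tmp;
    }

    /** Copies live objects into "to"; anything that does not fit is promoted. */
    private void copySurvivors(List<String> space, Set<String> live) {
        for (String obj : space) {
            if (!live.contains(obj)) {
                continue; // dead objects are simply not copied
            }
            if (to.size() < survivorCapacity) {
                to.add(obj);
            } else {
                old.add(obj);
            }
        }
    }

    public static void main(String[] args) {
        CopyingDemo heap = new CopyingDemo(2);
        heap.eden.addAll(Arrays.asList("a", "b", "c", "d"));
        heap.minorGc(new HashSet<>(Arrays.asList("a", "c", "d")));
        System.out.println("eden=" + heap.eden + " from=" + heap.from + " old=" + heap.old);
    }
}
```

With a Survivor capacity of 2 and three live objects, `main` prints `eden=[] from=[a, c] old=[d]`: two survivors are copied, the third is promoted, and Eden is empty again.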

3. Mark-compact algorithm

Mark (mark all recyclable objects), then compact (move the surviving objects to one end of the space and free the rest).

The marking phase is the same as in mark-sweep, but instead of reclaiming the recyclable objects in place, the next step moves all living objects toward one end and then frees all memory beyond the end boundary.

The mark-compact algorithm solves the memory fragmentation problem and avoids the copying algorithm's drawback of being able to use only half of the memory area. However, it moves memory around more, has to update every reference to the living objects it moves, and is much less efficient than the copying algorithm.
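A minimal sketch of the compaction step, assuming a heap modelled as an array of slots where `null` means free and `marked[i]` means slot i is live (both assumptions of this example). A real collector would also have to rewrite every reference to a moved object, which is omitted here.

```java
import java.util.Arrays;

/** Toy sliding compaction over an array-shaped heap. */
final class CompactDemo {
    /** Slides live objects to the front of the heap and returns the new free pointer. */
    static int compact(String[] heap, boolean[] marked) {
        int free = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && marked[i]) {
                heap[free] = heap[i]; // free <= i, so nothing live is overwritten
                free++;
            }
        }
        for (int i = free; i < heap.length; i++) {
            heap[i] = null; // everything past the boundary is released at once
        }
        return free;
    }

    public static void main(String[] args) {
        String[] heap = {"a", "b", "c", "d", "e", "f"};
        boolean[] marked = {true, false, true, false, false, true};
        int free = compact(heap, marked);
        System.out.println(Arrays.toString(heap) + " free pointer=" + free);
    }
}
```

This prints `[a, c, f, null, null, null] free pointer=3`. Because the free space is now one contiguous block, the next allocation can use simple pointer bumping from index 3.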

4. Generational collection algorithm

Essentially all JVMs collect memory by generation: memory is divided into several regions according to object lifetime, so that the most suitable collection algorithm can be used for each generation.

  • In the young generation, a large number of objects die and only a few survive each garbage collection, so the mark-copy algorithm is used: the collection costs only the copying of a few surviving objects.
  • In the old generation, objects have a high survival rate and there is no extra space to guarantee allocation, so it must be reclaimed with the mark-sweep or mark-compact algorithm.

III. GC classification

Having mentioned generational collection, let’s talk about how generational collection is defined in the JVM.

1. Minor GC

Minor GC is triggered when the JVM is unable to allocate space for a new object, such as when the Eden region is full. Therefore, the higher the allocation rate, the more frequently Minor GC is performed.

Every Minor GC triggers stop-the-world and pauses the application threads. For most applications the resulting pause is negligible, because most objects in Eden are garbage and are never copied to the Survivor or old-generation spaces. In the opposite case, where most new objects in Eden are not eligible for collection, Minor GC pauses take much longer.

Trigger condition: When the Eden space is full

2. Major GC

Major GC refers to collection of the old generation. A Major GC is often accompanied by at least one Minor GC, and it is typically more than 10 times slower than a Minor GC.

Trigger condition: a Minor GC promotes objects to the old generation, and if the old generation runs short of space, a Major GC is triggered.

3. Full GC

Cleans the entire heap. In a sense, Full GC is a combination of Minor GC and Major GC.

Trigger conditions: a call to System.gc(); insufficient space in the old generation; failure of the space allocation guarantee.
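To watch these collections happen on a running JVM, one option is the standard `java.lang.management` API. The sketch below only reads the counters; the class name `GcCountDemo` is made up for the example, and the collector names it prints depend on which garbage collector the JVM is using.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.Map;

/** Prints per-collector collection counts before and after requesting a GC. */
final class GcCountDemo {
    /** Snapshot of collection counts, keyed by the collector name the JVM reports. */
    static Map<String, Long> collectionCounts() {
        Map<String, Long> counts = new LinkedHashMap<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            counts.put(bean.getName(), bean.getCollectionCount());
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println("before: " + collectionCounts());
        System.gc(); // only a request; the JVM may ignore it, for example with -XX:+DisableExplicitGC
        System.out.println("after:  " + collectionCounts());
    }
}
```

On a typical HotSpot setup the counts go up after the `System.gc()` call, but since the call is only a hint, the example does not rely on that.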

Stop-the-world

Wait, we’ve talked about stop-the-world a lot. What is stop-the-world?

All Java execution threads must be paused while GC proceeds, which is called stop-the-world.

Reachability analysis must run against a snapshot that guarantees consistency. "Consistency" here means that the whole execution system appears frozen at a single point in time for the entire analysis; object reference relationships must not keep changing while the analysis is in progress. If this condition is not met, the accuracy of the analysis cannot be guaranteed.

Stop-the-world is implemented through the safepoint mechanism. When the Java virtual machine receives a stop-the-world request, it waits for all threads to reach a safepoint before letting the thread that requested stop-the-world work exclusively.

IV. HotSpot algorithm implementation

In explaining stop-the-world we mentioned safepoints, so what is a safepoint? Let's approach the topic through the HotSpot virtual machine's implementation of the garbage collection algorithms.

1. Safepoint

A safepoint means that the program cannot stop for GC at just any point during execution, only when it reaches a safepoint. Safepoints should be neither so sparse that the GC waits too long, nor so frequent that they add too much runtime overhead.

The original purpose of a safepoint is not to stop other threads but to find a stable execution state, one in which the Java virtual machine stack does not change. In that state the garbage collector can "safely" perform reachability analysis. For example, as long as native code does not leave its safepoint, the Java virtual machine can let it keep running while garbage collection is carried out.

Safepoints are chosen mainly by asking "does this point have characteristics that let the program run for a long time?" The most obvious sign of "long execution" is the reuse of instruction sequences, such as method calls, loop jumps, and exception jumps, so instructions with these characteristics produce safepoints.

Another consideration is how to get all threads (excluding threads executing JNI calls) to "run" to the nearest safepoint and stop there when a GC occurs.

Two solutions:

Pre-emptive suspension

Pre-emptive suspension does not require the thread's code to cooperate. When a GC occurs, all threads are interrupted first. If a thread turns out not to be at a safepoint, it is resumed and allowed to "run" to one. Almost no virtual machine pauses threads this way in response to GC events today.

Voluntary suspension

The idea of voluntary suspension is that when the GC needs to pause threads, it does not operate on them directly. It simply sets a flag, which each thread polls as it executes; when a thread finds the flag set, it suspends itself. The polling locations coincide with the safepoints, plus the places where an object is created and memory has to be allocated.
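The flag-polling idea can be imitated with ordinary Java threads. The sketch below is an application-level analogy only, not HotSpot's actual mechanism: `pauseRequested`, the latches, and the per-thread counters are all names and structures invented for this example. Each worker polls the flag at the end of every loop iteration (its "safepoint") and parks itself when the flag is set.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

/** Toy stop-the-world: workers poll a flag at loop back-edges and park themselves. */
final class SafepointPollDemo {
    /** Returns true if the counters did not change while the "world" was stopped. */
    static boolean stopTheWorld(int workers) throws InterruptedException {
        final AtomicBoolean pauseRequested = new AtomicBoolean(false);
        final AtomicBoolean done = new AtomicBoolean(false);
        final long[] counters = new long[workers];
        final CountDownLatch arrived = new CountDownLatch(workers);
        final CountDownLatch resume = new CountDownLatch(1);

        Thread[] threads = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            final int id = i;
            threads[i] = new Thread(() -> {
                while (!done.get()) {
                    counters[id]++;                // "mutator" work
                    if (pauseRequested.get()) {    // poll at the loop back-edge
                        arrived.countDown();       // report arrival at the safepoint
                        try {
                            resume.await();        // suspend until the "GC" finishes
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                            return;
                        }
                    }
                }
            });
            threads[i].start();
        }

        pauseRequested.set(true);   // the "GC" only sets a flag
        arrived.await();            // wait until every worker has parked
        long first = sum(counters); // consistent snapshot
        Thread.sleep(20);
        long second = sum(counters);

        done.set(true);
        pauseRequested.set(false);
        resume.countDown();         // let the workers continue (and exit)
        for (Thread t : threads) {
            t.join();
        }
        return first == second;
    }

    static long sum(long[] values) {
        long total = 0;
        for (long v : values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("snapshot stable while paused: " + stopTheWorld(4));
    }
}
```

The requesting thread never touches the workers directly; it waits on `arrived` until all of them have parked, which mirrors waiting for all threads to reach a safepoint.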

2. Safe region

A safe region is a section of code in which reference relationships do not change, so it is safe to start a GC anywhere inside it. You can think of a safe region as an extended safepoint.

3. Card table

After a GC, some young-generation objects are promoted to the old generation, so old-generation objects may hold references to young-generation objects. When marking surviving objects, a Minor GC therefore has to scan the old generation as well, which amounts to a full heap scan. Gul'dan, that's too high a price to pay.

HotSpot's solution is a technique called the card table. It divides the heap into 512-byte cards and maintains a card table that stores one flag per card. The flag indicates whether the corresponding card may contain a reference to a young-generation object. If it may, the card is considered dirty.

During a Minor GC, instead of scanning the entire old generation, we can look up the dirty cards in the card table and add the objects in those cards to the Minor GC's GC Roots. Once all dirty cards have been scanned, the Java virtual machine clears their flags.

To ensure that every card that might contain a reference to a young-generation object is marked dirty, the Java virtual machine has to intercept every write to a reference-typed instance variable and set the corresponding flag.

The card table avoids scanning the whole old generation, which greatly improves GC efficiency.

Root node enumeration was already covered above, so I won't repeat it here.
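The mapping from an address to its card is just a shift by 9 bits (2^9 = 512). The sketch below models that with byte offsets into a hypothetical old generation; `CardTableDemo`, `onReferenceStore`, and `drainDirtyCards` are illustrative names, and the barrier here marks the card on every reference store without checking where the reference points.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy card table covering a hypothetical old generation of oldGenBytes bytes. */
final class CardTableDemo {
    static final int CARD_SHIFT = 9; // 2^9 = 512-byte cards
    private final byte[] cards;

    CardTableDemo(int oldGenBytes) {
        int cardSize = 1 << CARD_SHIFT;
        cards = new byte[(oldGenBytes + cardSize - 1) / cardSize];
    }

    /** Write barrier: run on every reference store at the given byte offset. */
    void onReferenceStore(int fieldOffset) {
        cards[fieldOffset >>> CARD_SHIFT] = 1; // mark the containing card dirty
    }

    /** Minor GC side: returns the dirty card indices to scan and clears their flags. */
    List<Integer> drainDirtyCards() {
        List<Integer> dirty = new ArrayList<>();
        for (int i = 0; i < cards.length; i++) {
            if (cards[i] != 0) {
                dirty.add(i);
                cards[i] = 0;
            }
        }
        return dirty;
    }

    public static void main(String[] args) {
        CardTableDemo table = new CardTableDemo(4096); // 8 cards
        table.onReferenceStore(100);   // card 0
        table.onReferenceStore(1030);  // card 2
        table.onReferenceStore(1500);  // card 2 again
        System.out.println("cards to scan: " + table.drainDirtyCards());
        System.out.println("after clearing: " + table.drainDirtyCards());
    }
}
```

This prints `cards to scan: [0, 2]` and then `after clearing: []`: only 2 of the 8 cards need scanning, rather than all 4096 bytes.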

V. Recycling of special objects

Reclaiming method area objects

We've been talking about GC in the heap, so when are objects in the method area collected? And when are static objects reclaimed?

A class in the method area must meet all three of the following conditions before it can be reclaimed:

  • All instances of the class have been reclaimed, that is, no instance of the class exists in the Java heap.

  • The class loader that loaded the class has been reclaimed.

  • The java.lang.Class object for the class is not referenced anywhere, so the class's methods cannot be accessed through reflection from anywhere.

Reclaiming static members

Static members are divided into static primitive types and static reference types.

Static primitives are stored in the static variable area.

For static reference types, the reference is stored in the static variable area while the instance (the actual content) is stored on the heap. Static members are loaded when the class is loaded (the first time it is accessed): all static members of the class are placed in the static storage area. Members stored in the static variable area, once created, are not collected until the program exits.

Of course, if we set a static field to null, the variable in the static storage area still exists, but the instance on the heap can be collected because no variable points to it any more. So if a static reference is no longer needed, set it to null so that the GC can reclaim the space on the heap.
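One way to observe this is with a `WeakReference`, which does not keep its target alive. In the sketch below, `cache` is a hypothetical static field; the first method is deterministic, while the second depends on the JVM actually honoring the `System.gc()` request, so treat its result as typical rather than guaranteed.

```java
import java.lang.ref.WeakReference;

/** Observes a heap object held by a static field, before and after the field is set to null. */
final class StaticFieldDemo {
    static byte[] cache; // the reference is static; the array itself lives on the heap

    /** While the static field points at the array, a GC cannot reclaim it. */
    static boolean aliveWhileStaticallyReferenced() {
        cache = new byte[1024];
        WeakReference<byte[]> probe = new WeakReference<>(cache);
        System.gc();
        return probe.get() != null; // the static field keeps the array strongly reachable
    }

    /** After the field is set to null, the array becomes eligible for collection. */
    static boolean collectedAfterNull() {
        cache = new byte[1024];
        WeakReference<byte[]> probe = new WeakReference<>(cache);
        cache = null;
        System.gc(); // only a request; collection is likely but not guaranteed
        return probe.get() == null;
    }

    public static void main(String[] args) {
        System.out.println("alive while referenced: " + aliveWhileStaticallyReferenced());
        System.out.println("collected after null:   " + collectedAfterNull());
    }
}
```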

Reclaiming singletons

The method area is the region of JVM memory where class-related information is stored. In Java, an object created by the singleton pattern is referenced by a static field of its own class, which makes it reachable from the GC Roots, so a singleton object is normally not garbage collected by the JVM.

However, a singleton can still be reclaimed when its class meets the conditions for reclaiming method area objects described above.

Conclusion

That wraps up the second chapter on the JVM. It is the Tomb-Sweeping Day holiday and I have had a lot going on, so I had less time to study and write. I wanted to finish garbage collection in one go, but there are so many garbage collectors that they deserve a chapter of their own. If you think this was OK, give it a "like" or something. See you!

Also, if you need the mind map, you can contact me. After all, the more knowledge is shared, the sweeter it gets!