This article introduces the concept of garbage collection and the algorithm and working principles behind Golang's GC. After reading it you should have a solid overall understanding of Golang's garbage collection mechanism. I am not familiar with the GC of other languages, so I do not compare their algorithms here; you can look those up yourself if you need them.

What is garbage collection

Garbage collection (abbreviated GC) is an automatic memory management mechanism in computer science. When dynamically allocated memory is no longer needed, it should be released so that it can be reused; managing memory this way is called garbage collection. A garbage collector takes much of this burden off the programmer and reduces the chance of mistakes. (From Wikipedia)

Simply put, garbage collection (GC) is a task running in the background that monitors the state of objects, identifying and discarding the ones that are no longer used so that their resources can be freed and reused.

Go’s garbage collection

Golang currently uses tri-color marking combined with a write barrier and GC assist, which is an enhanced version of mark-and-sweep.

Mark and sweep

The original mark-and-sweep algorithm has two steps:

  1. Mark. Stop the world (STW): suspend all running threads of the program and mark the objects that are still referenced.
  2. Sweep. Clear the objects that were not marked, that is, reclaim their memory, and then resume the suspended threads.

The big problem with this approach is STW: to ensure that object state does not change while the GC is marking, the whole program is paused, and from the outside it appears to stall.
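The two steps can be sketched on a toy heap. This is only an illustration, not how the Go runtime is implemented: the Object type and the markAndSweep function are hypothetical names, and the "world" is stopped trivially because nothing else runs while the function executes.

```go
package main

import "fmt"

// Object is a hypothetical heap object used only for this illustration.
type Object struct {
	Name   string
	Refs   []*Object
	Marked bool
}

// mark flags obj and everything reachable from it.
func mark(obj *Object) {
	if obj == nil || obj.Marked {
		return
	}
	obj.Marked = true
	for _, ref := range obj.Refs {
		mark(ref)
	}
}

// markAndSweep marks from the roots, then walks the whole heap.
// It returns the surviving objects and the names of the swept ones.
func markAndSweep(heap, roots []*Object) (live []*Object, swept []string) {
	for _, r := range roots {
		mark(r)
	}
	for _, obj := range heap {
		if obj.Marked {
			obj.Marked = false // reset the flag for the next cycle
			live = append(live, obj)
		} else {
			swept = append(swept, obj.Name)
		}
	}
	return live, swept
}

func main() {
	a := &Object{Name: "a"}
	b := &Object{Name: "b"}
	c := &Object{Name: "c"} // nothing references c
	a.Refs = []*Object{b}

	live, swept := markAndSweep([]*Object{a, b, c}, []*Object{a})
	fmt.Println(len(live), swept) // prints: 2 [c]
}
```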

Tricolor marking

Tri-color marking improves on the mark phase. It works as follows:

  1. Initially, all objects are white.
  2. Starting from the roots, scan all root objects (a and b in the figure) and mark the objects they reference gray (A and B in the figure).

So what is a root? The root set is mainly the stacks and the global data area of the program as it is running at that moment.

  3. Examine each gray object to see whether it references other objects. If it references none, mark it black (A in the figure above); if it does, mark it black and mark the white objects it references gray.
  4. Repeat step 3 until the gray object queue is empty. The objects that are still white are garbage and are collected.
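The marking steps above can be sketched with an explicit gray queue. This is a minimal, single-threaded illustration; the Node type, the Color constants and triColorMark are hypothetical names, and the roots are modeled simply as a list of pointers held by stacks and globals.

```go
package main

import "fmt"

type Color int

const (
	White Color = iota // not visited yet; garbage if still white at the end
	Gray               // reachable, but its references are not scanned yet
	Black              // reachable and fully scanned
)

// Node is a hypothetical heap object.
type Node struct {
	Name  string
	Refs  []*Node
	Color Color
}

// triColorMark colors the heap and returns the names of the white objects.
func triColorMark(heap, roots []*Node) []string {
	for _, n := range heap {
		n.Color = White // every object starts white
	}

	var gray []*Node
	shade := func(n *Node) { // white -> gray, and enqueue
		if n != nil && n.Color == White {
			n.Color = Gray
			gray = append(gray, n)
		}
	}

	for _, r := range roots {
		shade(r) // objects referenced by the roots become gray
	}
	for len(gray) > 0 { // repeat until the gray queue is empty
		n := gray[0]
		gray = gray[1:]
		for _, ref := range n.Refs {
			shade(ref)
		}
		n.Color = Black
	}

	var garbage []string
	for _, n := range heap {
		if n.Color == White {
			garbage = append(garbage, n.Name)
		}
	}
	return garbage
}

func main() {
	A := &Node{Name: "A"}
	B := &Node{Name: "B"}
	C := &Node{Name: "C"}
	D := &Node{Name: "D"} // nothing points to D
	A.Refs = []*Node{C}
	C.Refs = []*Node{A} // cycles are fine: only white objects are shaded

	fmt.Println(triColorMark([]*Node{A, B, C, D}, []*Node{A, B})) // prints: [D]
}
```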

You can also refer to the following GIF to help you understand:

How does the Go GC work

How can the impact of STW on the program be reduced? Golang's GC does this by running most of its work concurrently with user code.

User code may change the state of objects while the GC is running, so how can the GC and user code run at the same time? Let’s look at the complete flow of a GC cycle:

  1. Mark, which has two parts:
  • Mark Prepare: initializes the GC task, including enabling the write barrier and mutator assist and counting the root scanning tasks. This step requires STW.
  • GC Drains: scans all root objects, including global pointers and pointers on goroutine (G) stacks (a goroutine is stopped while its stack is scanned), adds them to the mark queue (the gray queue), and processes objects from the gray queue in a loop until it is empty. This step runs concurrently in the background.
  2. Mark Termination: completes marking and re-scans the global pointers and stacks. Because marking runs concurrently with the user program, new allocations and pointer assignments may happen during marking; these are recorded through the write barrier and checked by the re-scan. This step also requires STW.
  3. Sweep: reclaims all white objects according to the mark results. This step runs concurrently in the background.
  4. Sweep Termination: sweeps any spans that have not been swept yet. A new GC cycle can start only after the sweep of the previous cycle is complete.

What if user logic changes the references of an object that has just been marked while marking is still in progress?

Write Barriers

Write barrier: in general, a barrier guarantees that the write operations before it are visible to the other components of the system. Combined with the GC flow above this is easy to understand: at the start of each GC cycle a “write barrier” is enabled, and the pointer writes that user code makes during marking are recorded through it. When the re-scan runs, objects whose references changed are marked gray so that they are not lost, while objects whose state did not change are processed as usual.
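To see why the barrier matters, here is a toy simulation of a simple insertion-style write barrier that shades the target of every pointer write. It is an assumption-laden sketch, not the barrier the Go runtime actually uses; Collector, writePointer and simulate are hypothetical names. In the scenario, A is already black, B is gray and points to C, and user code moves the only reference to C from B to A while marking is in progress.

```go
package main

import "fmt"

type Color int

const (
	White Color = iota
	Gray
	Black
)

// Node is a hypothetical heap object with a single pointer field.
type Node struct {
	Ref   *Node
	Color Color
}

// Collector holds the gray queue and whether the barrier is enabled.
type Collector struct {
	gray      []*Node
	barrierOn bool
}

func (c *Collector) shade(n *Node) {
	if n != nil && n.Color == White {
		n.Color = Gray
		c.gray = append(c.gray, n)
	}
}

// writePointer stands in for the code emitted for "dst.Ref = src".
// With the barrier on, the newly written target is shaded first.
func (c *Collector) writePointer(dst, src *Node) {
	if c.barrierOn {
		c.shade(src)
	}
	dst.Ref = src
}

// drain processes the gray queue until it is empty.
func (c *Collector) drain() {
	for len(c.gray) > 0 {
		n := c.gray[0]
		c.gray = c.gray[1:]
		c.shade(n.Ref)
		n.Color = Black
	}
}

// simulate returns the final color of C.
func simulate(barrierOn bool) Color {
	a := &Node{Color: Black} // already scanned
	c := &Node{}             // still white
	b := &Node{Color: Gray, Ref: c}
	col := &Collector{gray: []*Node{b}, barrierOn: barrierOn}

	// User code runs in the middle of marking.
	col.writePointer(a, c)   // A.Ref = C
	col.writePointer(b, nil) // B.Ref = nil

	col.drain()
	return c.Color
}

func main() {
	fmt.Println(simulate(false) == White) // prints: true (C would be freed while A still points to it)
	fmt.Println(simulate(true) == Black)  // prints: true (the barrier kept C alive)
}
```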

GC assist

As the complete GC flow above shows, Golang's GC effectively breaks one long pause into several short ones. What used to be “user code -> large GC -> user code” becomes “user code -> small GC -> user code -> small GC -> user code”. But what if the GC cannot mark and collect objects as fast as user code allocates them? If the runtime finds that collection is not keeping up with allocation, it pauses the user logic of the goroutines that are allocating, so that they create no new objects for the moment, and puts those goroutines to work on garbage collection to speed it up. So even though the GC is concurrent, user code can still be held up at this point; otherwise marking and collection would never finish, because objects would be allocated faster than they are collected. This mechanism is called GC assist (mutator assist).
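The pacing idea can be illustrated with a toy model. Everything here is hypothetical (the function name, the units and the numbers) and it is not the runtime's real pacer: scanWork is the marking work still to do, headroom is how much the program may still allocate before the heap limit for this cycle is hit, and with assist enabled each allocation is charged enough marking work to finish in time.

```go
package main

import "fmt"

// markFinishesBeforeGoal reports whether marking completes before the
// allocating code uses up the remaining headroom. All quantities are in
// arbitrary units; this is a toy model, not the Go runtime's algorithm.
func markFinishesBeforeGoal(scanWork, headroom, allocPerTick, bgMarkPerTick int, assist bool) bool {
	for scanWork > 0 {
		if headroom <= 0 {
			return false // allocation outran the collector
		}
		if assist {
			// Charge the allocating goroutine: work per allocated unit is
			// the remaining scan work divided by the remaining headroom,
			// rounded up.
			perUnit := (scanWork + headroom - 1) / headroom
			scanWork -= allocPerTick * perUnit
		}
		headroom -= allocPerTick
		scanWork -= bgMarkPerTick // background marking continues regardless
	}
	return true
}

func main() {
	fmt.Println(
		markFinishesBeforeGoal(1000, 100, 10, 20, false),
		markFinishesBeforeGoal(1000, 100, 10, 20, true),
	) // prints: false true
}
```

The point of the design is that the cost lands on the goroutines that allocate the most, instead of stopping every goroutine.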

How to tune the GC

To measure the impact of GC on programs, see this article, Performance Debugging issues for Go programs.
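As a quick first measurement, the runtime package exposes GC counters through runtime.ReadMemStats. The sketch below compares the counters before and after a piece of work; gcDelta and churn are hypothetical helper names, and the allocation loop is only a stand-in for real work.

```go
package main

import (
	"fmt"
	"runtime"
)

// sink keeps allocations from being optimized away.
var sink []byte

// churn allocates roughly 100 MB of short-lived garbage.
func churn() {
	for i := 0; i < 100000; i++ {
		sink = make([]byte, 1024)
	}
}

// gcDelta runs work and reports how many GC cycles completed during it
// and how much stop-the-world pause time accumulated.
func gcDelta(work func()) (cycles uint32, pauseNs uint64) {
	var before, after runtime.MemStats
	runtime.ReadMemStats(&before)
	work()
	runtime.ReadMemStats(&after)
	return after.NumGC - before.NumGC, after.PauseTotalNs - before.PauseTotalNs
}

func main() {
	cycles, pauseNs := gcDelta(churn)
	fmt.Printf("GC cycles: %d, total pause: %d ns\n", cycles, pauseNs)
}
```

The numbers vary from run to run and between machines. Running any Go program with the environment variable GODEBUG=gctrace=1 also prints one summary line per GC cycle.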

Reduce object allocation and reuse objects where reasonable.

Avoid string / []byte conversions.

Converting between the two copies the underlying data, which adds allocations and lowers GC efficiency.
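A small check makes the copy visible: changing the byte slice obtained from a string leaves the string untouched, so the two cannot share memory. The roundTrip helper is a hypothetical name used only for this illustration.

```go
package main

import "fmt"

// roundTrip converts s to a byte slice, changes the first byte of the
// copy, and converts it back. Both conversions copy the data.
func roundTrip(s string) (orig, changed string) {
	b := []byte(s) // copy 1: string -> []byte
	if len(b) > 0 {
		b[0] = 'J'
	}
	return s, string(b) // copy 2: []byte -> string
}

func main() {
	orig, changed := roundTrip("hello")
	fmt.Println(orig, changed) // prints: hello Jello
}
```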

Avoid heavy use of + to concatenate strings.

A string is one of the most basic types in Go. It is read-only, so every concatenation creates a new string. For a few small pieces of text, use “+”; for many small pieces, use strings.Join; for large amounts of text, use bytes.Buffer.
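The three approaches side by side, as a minimal sketch (the function names are hypothetical). All three produce the same result; they differ in how many intermediate strings they leave behind for the GC.

```go
package main

import (
	"bytes"
	"fmt"
	"strings"
)

// concatPlus creates a new string on every iteration.
func concatPlus(parts []string) string {
	s := ""
	for _, p := range parts {
		s += p
	}
	return s
}

// concatJoin builds the result in one pass.
func concatJoin(parts []string) string {
	return strings.Join(parts, "")
}

// concatBuffer appends into a growable buffer and converts once at the end.
func concatBuffer(parts []string) string {
	var buf bytes.Buffer
	for _, p := range parts {
		buf.WriteString(p)
	}
	return buf.String()
}

func main() {
	parts := []string{"go", " ", "gc"}
	fmt.Println(concatPlus(parts) == concatJoin(parts), concatBuffer(parts)) // prints: true go gc
}
```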

GC trigger condition

There are two triggers for automatic garbage collection:

  1. The heap size exceeds the memory threshold.
  2. The time threshold is reached.

The memory threshold is controlled by a variable called gcpercent: a collection is triggered when newly allocated memory reaches gcpercent percent of the memory still in use after the previous collection. For example, with the default gcpercent of 100, if 5M is in use after a collection, the next collection happens when allocated memory reaches 10M. In other words, the more memory is in use, the higher the threshold for the next collection. What if the memory threshold is never reached? In that case GC is triggered on a timer. For example, if usage stays below 10M, GC still runs at a fixed interval (once every 2 minutes by default) to make sure resources are reclaimed.
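The arithmetic of the memory threshold can be written down directly. The nextGCTarget function below is a hypothetical helper that only reproduces the 5M to 10M example; the real runtime pacer takes more into account. debug.SetGCPercent is the standard library call for changing the setting at run time, and it returns the previous value.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// nextGCTarget returns the heap size at which the next collection starts,
// given the memory in use after the last collection.
// Simplified model; assumes gcPercent >= 0.
func nextGCTarget(liveBytes uint64, gcPercent int) uint64 {
	return liveBytes + liveBytes*uint64(gcPercent)/100
}

func main() {
	const mb = 1 << 20

	fmt.Println(nextGCTarget(5*mb, 100) / mb) // prints: 10

	// Raising gcpercent makes collections less frequent at the cost of
	// a larger heap.
	old := debug.SetGCPercent(200)
	fmt.Println(nextGCTarget(5*mb, 200) / mb) // prints: 15
	debug.SetGCPercent(old)                   // restore the previous setting
}
```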

Final words

Although Golang has automatic garbage collection, GC is not a cure-all, so it is best to get into the habit of helping it reclaim memory: for example, release memory you no longer use by setting references to nil, or consider calling runtime.GC() at an appropriate time to trigger a collection.
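A minimal sketch of that habit, assuming a hypothetical package-level cache variable: the reference is dropped by setting it to nil, and runtime.GC() then forces a collection and blocks until it has finished.

```go
package main

import (
	"fmt"
	"runtime"
)

// cache is a hypothetical long-lived reference to a large buffer.
var cache []byte

// releaseAndCollect drops the reference, forces a collection, and returns
// the number of GC cycles that completed during the call.
func releaseAndCollect() uint32 {
	var before, after runtime.MemStats
	runtime.ReadMemStats(&before)

	cache = nil  // the buffer is no longer reachable
	runtime.GC() // blocks until the collection is complete

	runtime.ReadMemStats(&after)
	return after.NumGC - before.NumGC
}

func main() {
	cache = make([]byte, 64<<20)
	fmt.Println(releaseAndCollect() >= 1) // prints: true
}
```

Forcing a collection has a cost of its own, so it is usually reserved for moments such as right after a large temporary structure has been released.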

I have recently been maintaining a set of Go learning sample code; newcomers who are interested can follow go programming.
