WeChat search "The brain is in the fried fish" to follow this hard-working fried fish. This article is included at github.com/eddycjy/blog, along with my series of articles, materials, and open-source Go books.

Hello, everyone. I’m Fried Fish.

defer is an interesting keyword feature in the Go language. For example:

package main

import "fmt"

func main() {
    defer fmt.Println("The brain is in the fried fish")
}

The output is:

The brain is in the fried fish

A few days ago, some of my readers discussed the following question:

In a nutshell, the question is: does using the defer keyword inside a for loop have any performance impact?

Because in the Go runtime's underlying design, deferred calls are stored as a linked list:

The worry is that too many loop iterations will make the linked list too long and inefficient. Or perhaps, as with Redis's data structures, Go's defer design has built-in optimizations that make the length a non-issue?

In this article, we will explore calling defer in a loop in Go. Does making the underlying linked list too long cause any problems? If so, what is the specific effect?

Let's dive in.

defer performance optimized by 30%

Starting with Go 1.13, the Go team did a round of performance tuning on defer, which improved its performance by 30% in most scenarios:

Let's review the Go 1.13 changes and see where the defer optimizations were made, which is the crux of the matter.

Before and after

In Go 1.12, the assembly code for a defer call was as follows:

    0x0070 00112 (main.go:6)    CALL    runtime.deferproc(SB)
    0x0075 00117 (main.go:6)    TESTL    AX, AX
    0x0077 00119 (main.go:6)    JNE    137
    0x0079 00121 (main.go:7)    XCHGL    AX, AX
    0x007a 00122 (main.go:7)    CALL    runtime.deferreturn(SB)
    0x007f 00127 (main.go:7)    MOVQ    56(SP), BP

In Go 1.13 and later, the assembly code for a defer call is as follows:

    0x006e 00110 (main.go:4)    MOVQ    AX, (SP)
    0x0072 00114 (main.go:4)    CALL    runtime.deferprocStack(SB)
    0x0077 00119 (main.go:4)    TESTL    AX, AX
    0x0079 00121 (main.go:4)    JNE    139
    0x007b 00123 (main.go:7)    XCHGL    AX, AX
    0x007c 00124 (main.go:7)    CALL    runtime.deferreturn(SB)
    0x0081 00129 (main.go:7)    MOVQ    112(SP), BP

From the assembly's perspective, the runtime now calls runtime.deferprocStack instead of runtime.deferproc. What changed?

Let's read on with that question in mind.

The smallest unit of defer: _defer

Compared with previous versions, the smallest unit of defer, the _defer struct, mainly adds the heap field:

type _defer struct {
    siz     int32 // includes both arguments and results
    started bool
    heap    bool
    sp      uintptr // sp at time of defer
    pc      uintptr
    fn      *funcval
    _panic  *_panic
    link    *_defer
}

This field identifies whether the _defer is allocated on the heap or on the stack. Since the rest of the fields have not changed significantly, we can focus on defer's stack allocation and see what was done.


func deferprocStack(d *_defer) {
    gp := getg()
    if gp.m.curg != gp {
        throw("defer on system stack")
    }
    d.started = false
    d.heap = false
    d.sp = getcallersp()
    d.pc = getcallerpc()
    *(*uintptr)(unsafe.Pointer(&d._panic)) = 0
    *(*uintptr)(unsafe.Pointer(&d.link)) = uintptr(unsafe.Pointer(gp._defer))
    *(*uintptr)(unsafe.Pointer(&gp._defer)) = uintptr(unsafe.Pointer(d))
    return0()
}

This bit of code is fairly routine: it mainly records the stack pointer at the defer call site, the address of the arguments passed to the deferred function, and the PC (program counter). I covered these in detail in the previous article, "Understanding Go Defer," so I won't go over them again here.

What’s special about this deferprocStack?

You can see that it sets d.heap to false, which means that deferprocStack serves the scenario where _defer is allocated on the stack.


The question is: where is the heap-allocation scenario handled?

func newdefer(siz int32) *_defer {
    // ...
    d.heap = true
    d.link = gp._defer
    gp._defer = d
    return d
}

Where is newdefer called? As follows:

func deferproc(siz int32, fn *funcval) { // arguments of fn follow fn
    sp := getcallersp()
    argp := uintptr(unsafe.Pointer(&fn)) + unsafe.Sizeof(fn)
    callerpc := getcallerpc()

    d := newdefer(siz)
    // ...
}
It is clear that the deferproc method called in previous versions is now used for the heap-allocation scenario.


  • One thing is certain: deferproc was not eliminated; it was optimized.
  • The Go compiler chooses between the deferproc and deferprocStack methods according to the scenario; they serve heap and stack allocation respectively.

Where is the optimization?

The main optimization is a change in the allocation rule for the defer record: the compiler analyzes the for-loop depth at the site of the defer statement.

// src/cmd/compile/internal/gc/esc.go
case ODEFER:
    if e.loopdepth == 1 { // top level
        n.Esc = EscNever // force stack allocation of defer record (see ssa.go)
    }

If the Go compiler detects that loopdepth is 1, it sets the escape-analysis result so that the record is allocated on the stack; otherwise it is allocated on the heap.

// src/cmd/compile/internal/gc/ssa.go
case ODEFER:
    d := callDefer
    if n.Esc == EscNever {
        d = callDeferStack
    }
    s.call(n.Left, d)

In this way, a large amount of the performance overhead caused by frequent calls to systemstack, mallocgc, and other methods is removed, which improves performance in most scenarios.

Calling defer in a loop

Back to the question itself: now that you know the mechanics of the defer optimization, is there any performance impact from using the defer keyword in a loop?

The most immediate effect is that the roughly 30% performance optimization is completely lost. On top of that, because of this incorrect usage, defer's pre-existing overhead (the long linked list) also grows, so performance degrades further.

So we want to avoid code for the following two scenarios:

  • Explicit loop: the defer keyword is called inside an explicit loop, such as a for-loop statement.
  • Implicit loop: there is loop-like nested logic around the defer call, for example via a goto statement.

Explicit loop

The first example uses the defer keyword directly inside a for loop:

func main() {
    for i := 0; i <= 99; i++ {
        defer func() {
            fmt.Println("The brain is in the fried fish")
        }()
    }
}

This is a very common pattern, whether in crawlers or in goroutine-related calls; many people like to write it this way.

This is an explicit loop around the defer call.

Implicit loop

The second example is using a keyword like goto in your code:

func main() {
    i := 1
food:
    defer func() {}()
    if i == 1 {
        i -= 1
        goto food
    }
}

This is rare, because the goto keyword is often discouraged or even banned by code guidelines as a frequent source of abuse, so most logic is implemented in other ways.

This is an implicit loop: the goto produces loop-like behavior around the defer call.


Clearly, defer does not do anything particularly fanciful in its design. The optimization is mainly based on common real-world application scenarios, in order to achieve better performance.

Although defer itself involves a little overhead, it is not as costly as you might think. Only consider tuning if the deferred code sits on a path that is executed frequently.

Otherwise, there's no need to get too tangled up. In practice, when you suspect or run into a performance problem, just look at the pprof analysis to see whether the defer is on a hot path, and tune accordingly.

One possible optimization is simply to remove the defer and perform the cleanup manually, which is not complicated. When coding, avoid the minefields of explicit and implicit loops around defer to get the best performance.

If you have any questions, please feel free to give feedback and exchange in the comment section. The best relationship is mutual achievement. Your thumb up is the biggest motivation for Fried Fish’s creation.
