This article analyzes the growth mechanism of Go slices. Environment: 64-bit CentOS docker image + go1.12.1.

Normal operation

Growth happens during append: when the slice's cap is insufficient to hold the new elements, growslice is called.

For example, for the code below

slice1 := make([]int, 1)
fmt.Println("cap of slice1", cap(slice1))
slice1 = append(slice1, 1)
fmt.Println("cap of slice1", cap(slice1))
slice1 = append(slice1, 2)
fmt.Println("cap of slice1", cap(slice1))

fmt.Println()

slice1024 := make([]int, 1024)
fmt.Println("cap of slice1024", cap(slice1024))
slice1024 = append(slice1024, 1)
fmt.Println("cap of slice1024", cap(slice1024))
slice1024 = append(slice1024, 2)
fmt.Println("cap of slice1024", cap(slice1024))

The output

cap of slice1 1
cap of slice1 2
cap of slice1 4

cap of slice1024 1024
cap of slice1024 1280
cap of slice1024 1280

Many blogs on the Internet describe slice growth this way: if the cap is below 1024, it is doubled; once the cap exceeds 1024, the new cap becomes 1.25 times the old cap.

This rule corresponds to the following portion of the source code, found in $GOROOT/src/runtime/slice.go:

func growslice(et *_type, old slice, cap int) slice {
	// omit some checks...

	newcap := old.cap
	doublecap := newcap + newcap
	if cap > doublecap {
		newcap = cap
	} else {
		if old.len < 1024 {
			newcap = doublecap
		} else {
			// Check 0 < newcap to detect overflow
			// and prevent an infinite loop.
			for 0 < newcap && newcap < cap {
				newcap += newcap / 4
			}
			// Set newcap to the requested cap when
			// the newcap calculation overflowed.
			if newcap <= 0 {
				newcap = cap
			}
		}
	}
	// omit some follow-up...
}
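To make the branch logic concrete, here is a small standalone sketch of the capacity calculation above. The helper name growNewcap is mine, not the runtime's, and the real growslice then further adjusts newcap for memory-size rounding, as we will see later:

```go
package main

import "fmt"

// growNewcap mirrors the first stage of growslice's capacity
// calculation (a sketch; the real function then rounds the result
// up to a memory size class).
func growNewcap(oldLen, oldCap, cap int) int {
	newcap := oldCap
	doublecap := newcap + newcap
	if cap > doublecap {
		newcap = cap
	} else {
		if oldLen < 1024 {
			newcap = doublecap
		} else {
			// Grow by 1.25x until the requested cap fits;
			// 0 < newcap guards against overflow.
			for 0 < newcap && newcap < cap {
				newcap += newcap / 4
			}
			if newcap <= 0 {
				newcap = cap
			}
		}
	}
	return newcap
}

func main() {
	fmt.Println(growNewcap(1, 1, 2))          // 2: small slice, double
	fmt.Println(growNewcap(1024, 1024, 1025)) // 1280: large slice, 1.25x
	fmt.Println(growNewcap(2, 2, 5))          // 5: requested cap exceeds double, so it wins
}
```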

Sharp-eyed readers may have spotted the problem: the rule above corresponds to only one branch of the source code. In other words, growth does not always work that way. So how does it actually work? Take that question into the next section.

Unconventional operation

The operations above append one element at a time. Consider the other case: appending many elements at once. What happens then? For example, what are the capacities after the following code runs?

package main

import "fmt"

func main() {
	a := []byte{1, 0}
	a = append(a, 1, 1, 1)
	fmt.Println("cap of a is ", cap(a))

	b := []int{23, 51}
	b = append(b, 4, 5, 6)
	fmt.Println("cap of b is ", cap(b))

	c := []int32{1, 23}
	c = append(c, 2, 5, 6)
	fmt.Println("cap of c is ", cap(c))

	type D struct {
		age  byte
		name string
	}
	d := []D{
		{1, "123"},
		{2, "234"},
	}

	d = append(d, D{4, "456"}, D{5, "567"}, D{6, "678"})
	fmt.Println("cap of d is ", cap(d))
}

It should be four eights, right? Based on the doubling rule, cap goes 2 -> 4 -> 8.

Or four fives? That guess rests on the following reasoning: if append adds multiple elements and the current capacity cannot hold them, then, were I the designer, I would estimate how much capacity is needed to hold the elements and grow once. The advantage is that there is no need to repeatedly allocate a new underlying array or repeatedly copy data.

But the results were a bit surprising.

cap of a is  8
cap of b is  6
cap of c is  8
cap of d is  5

Do you feel confused? "No, I knew it would turn out this way." Then, dude, you can close this article now.

Why this strange phenomenon? Let's dig in.

GDB analysis

Reading the source code alone didn't get me far, so I turned to a tool to watch the program actually run and better analyze the code. As it happens, GDB is well suited for this.

Using the same code as above, compile it and load it into GDB:

[root@a385d77a9056 jack]# go build -o jack
[root@a385d77a9056 jack]# ls
jack  main.go
[root@a385d77a9056 jack]# gdb jack
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-114.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/goblog/src/jack/jack...done.
Loading Go Runtime support.
(gdb)

Set a breakpoint on the line where append occurs, then run the program. For a better illustration, set the breakpoint on the append of the []int slice b, whose capacity is 6 after growth.

(gdb) l 10
5	)
6
7	func main() {
8
9		a := []byte{1, 0}
10		a = append(a, 1, 1, 1)
11		fmt.Println("cap of a is ".cap(a))
12
13		b := []int{23, 51}
14		b = append(b, 4, 5, 6)
(gdb) b 14
Breakpoint 2 at 0x4872d5: file /home/goblog/src/jack/main.go, line 14.
(gdb) r
Starting program: /home/goblog/src/jack/jack
cap of a is  8

Breakpoint 2, main.main () at /home/goblog/src/jack/main.go:14
14		b = append(b, 4, 5, 6)

Step in and see how it works

(gdb) s
runtime.growslice (et=0x497dc0, old=..., cap=5, ~r3=...) at /usr/local/src/go/src/runtime/slice.go:76
76	func growslice(et *_type, old slice, cap int) slice {
(gdb) p *et
$1 = {size = 8, ptrdata = 0, hash = 4149441018, tflag = 7 '\a', align = 8 '\b', fieldalign = 8 '\b', kind = 130 '\202', alg = 0x555df0 <runtime.algarray+80>,
  gcdata = 0x4ce4f8 "...", str = 987, ptrToThis = 45312}
(gdb) p old
$2 = {array = 0xc000074ec8, len = 2, cap = 2}

It's complicated. Here is what I could make sense of at first.

First, the cap passed in is 5, which means the earlier guess holds so far: when append adds multiple elements, the required capacity (old length 2 plus 3 new elements) is estimated before growing. Note also that old is passed by value: slice is a struct, and structs are value types.

et is metadata describing the type of the slice's elements. For analyzing slice growth it is enough to know et.size, the number of bytes an element occupies in memory. I'm using a 64-bit CentOS docker image, so int is int64, i.e. 8 bytes.

Moving on, the next part of the analysis involves another piece of the code, pasted here first:

switch {
case et.size == 1:
    lenmem = uintptr(old.len)
    newlenmem = uintptr(cap)
    capmem = roundupsize(uintptr(newcap))
    overflow = uintptr(newcap) > maxAlloc
    newcap = int(capmem)
case et.size == sys.PtrSize:
    lenmem = uintptr(old.len) * sys.PtrSize
    newlenmem = uintptr(cap) * sys.PtrSize
    capmem = roundupsize(uintptr(newcap) * sys.PtrSize)
    overflow = uintptr(newcap) > maxAlloc/sys.PtrSize
    newcap = int(capmem / sys.PtrSize)
case isPowerOfTwo(et.size):
    var shift uintptr
    if sys.PtrSize == 8 {
        // Mask shift for better code generation.
        shift = uintptr(sys.Ctz64(uint64(et.size))) & 63
    } else {
        shift = uintptr(sys.Ctz32(uint32(et.size))) & 31
    }
    lenmem = uintptr(old.len) << shift
    newlenmem = uintptr(cap) << shift
    capmem = roundupsize(uintptr(newcap) << shift)
    overflow = uintptr(newcap) > (maxAlloc >> shift)
    newcap = int(capmem >> shift)
default:
    lenmem = uintptr(old.len) * et.size
    newlenmem = uintptr(cap) * et.size
    capmem, overflow = math.MulUintptr(et.size, uintptr(newcap))
    capmem = roundupsize(capmem)
    newcap = int(capmem / et.size)
}

Now the GDB session, with some details omitted, keeping only the more important steps:

(gdb) n
96		doublecap := newcap + newcap   // newcap is initialized to old.cap, which is 2, so doublecap is 4
(gdb) n
97		if cap > doublecap {           // cap is the parameter passed in, value 5, larger than doublecap = 4
(gdb) n
98		newcap = cap                   // so newcap gets the estimated capacity 5; the len < 1024 branch is not entered
(gdb) n
123		case et.size == 1:
(gdb) disp newcap                      // display the value of newcap after each step
3: newcap = 5
(gdb) n
129		case et.size == sys.PtrSize:   // et.size = 8, exactly the 64-bit pointer size
3: newcap = 5
(gdb) n
132		capmem = roundupsize(uintptr(newcap) * sys.PtrSize)
3: newcap = 5
(gdb) disp capmem
4: capmem = <optimized out>
(gdb) n
134		newcap = int(capmem / sys.PtrSize)
4: capmem = <optimized out>
3: newcap = 5
(gdb) n
122		switch {
4: capmem = <optimized out>
3: newcap = 5
(gdb) n
169		if overflow || capmem > maxAlloc {   // code after the switch block; note newcap is now 6, exactly the cap(b) printed earlier
4: capmem = 48
3: newcap = 6

The key is this line: capmem = roundupsize(uintptr(newcap) * sys.PtrSize)

Round up: roundupsize takes a size and rounds it upward. The product uintptr(newcap) * sys.PtrSize is 5*8 = 40; rounding it up gives the memory actually required, capmem = 48; dividing that by the type size, int(capmem / sys.PtrSize), gives the new capacity: 6.
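The rounding step can be sketched with the small size classes listed in sizeclasses.go. Note that roundupSketch is my own simplified stand-in for the runtime's roundupsize, valid only for sizes up to 144 bytes:

```go
package main

import "fmt"

// The first few small size classes from runtime/sizeclasses.go.
var classSizes = []uintptr{8, 16, 32, 48, 64, 80, 96, 112, 128, 144}

// roundupSketch rounds a byte count up to the nearest size class,
// mimicking runtime.roundupsize for small sizes.
func roundupSketch(size uintptr) uintptr {
	for _, c := range classSizes {
		if size <= c {
			return c
		}
	}
	return size // larger sizes need the full table
}

func main() {
	// append([]int{23, 51}, 4, 5, 6): estimated newcap is 5,
	// which needs 5 * 8 = 40 bytes.
	capmem := roundupSketch(5 * 8)
	fmt.Println(capmem, capmem/8) // 48 bytes allocated → final cap 6
}
```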

To understand why roundupsize turns 40 into 48, we need a brief look at Go's memory management. Trace into the roundupsize method and then into the sizeclasses.go file, which begins with a table of Go's object size classes that looks like this:

// class bytes/obj bytes/span objects tail waste max waste
// 1 8 8192 1024 0 87.50%
// 2 16 8192 512 0 43.75%
// 3 32 8192 256 0 46.88%
// 4 48 8192 170 32 31.52%
// 5 64 8192 128 0 23.44%
// 6 80 8192 102 32 19.07%
// 7 96 8192 85 32 15.95%
// 8 112 8192 73 16 13.56%
// 9 128 8192 64 0 11.72%
// 10 144 8192 56 128 11.82%

// ...
// 65 28672 57344 2 0 4.91%
// 66 32768 32768 1 0 12.50%

The bytes/obj column lists the predefined object sizes in Go: the smallest is 8B and the largest is 32K. Objects larger than 32K form their own class, giving 67 classes in total (66 + 1 = 67). As you can see, there is no class of size 40, so 40 is rounded up to 48; that is what happens inside roundupsize. There's a technical term for this: memory alignment. Why is it needed? If you're interested, take a closer look at Go's memory management; I won't go into it here.

The other appends in the unconventional example behave similarly, all going through the same roundupsize step, so I won't paste their GDB sessions here. Interested readers can play with them on their own.

Doubt

Inside growslice, roundupsize is not confined to the multi-element branch; it applies to every branch. So I suspected that growth is not always exactly double or 1.25 times.

So I tested again:

e := []int32{1, 2, 3}
fmt.Println("cap of e before:", cap(e))
e = append(e, 4)
fmt.Println("cap of e after:", cap(e))

f := []int{1, 2, 3}
fmt.Println("cap of f before:", cap(f))
f = append(f, 4)
fmt.Println("cap of f after:", cap(f))

cap of e before: 3
cap of e after: 8
cap of f before: 3
cap of f after: 6

Sure enough: the capacity after growth also depends on the element type. (For []int32, doubling 3 gives 6 elements = 24 bytes, which rounds up to the 32-byte class, so cap becomes 32/4 = 8; for []int, 6 elements = 48 bytes is already a size class, so cap stays 6.)

summary

The flow above was a little jumpy, so to summarize.

Growth happens during append:

  • Appending a single element, or appending a few elements that the doubled capacity can hold, follows the familiar rule: below 1024, double the capacity; above 1024, grow by 1.25x.

  • If multiple elements are appended and even the doubled capacity cannot hold them, the estimated required capacity is used directly.

Key point!!! After either branch produces a new capacity, the memory it requires, capmem, is computed from the element size of the slice; capmem is then rounded up to get the memory actually allocated; dividing that by the element size yields the true final capacity, which is the capacity of the new slice.
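Putting the two steps together, the four surprising capacities from earlier can all be predicted. This is a sketch for small slices on a 64-bit platform; predictCap, roundup, and the truncated class list are my own simplified stand-ins, not runtime code (the 1.25x branch for large slices is omitted since none of the examples reach it):

```go
package main

import "fmt"

// Size classes up to 144 bytes from runtime/sizeclasses.go (a sketch).
var classes = []uintptr{8, 16, 32, 48, 64, 80, 96, 112, 128, 144}

// roundup rounds a byte count up to the nearest size class.
func roundup(n uintptr) uintptr {
	for _, c := range classes {
		if n <= c {
			return c
		}
	}
	return n
}

// predictCap sketches growslice for small slices on 64-bit:
// double the cap (or take the requested cap if it is larger),
// then round the byte size up to a size class and divide back
// by the element size.
func predictCap(oldLen, oldCap, want int, elemSize uintptr) int {
	newcap := oldCap
	if want > 2*oldCap {
		newcap = want
	} else if oldLen < 1024 {
		newcap = 2 * oldCap
	}
	capmem := roundup(uintptr(newcap) * elemSize)
	return int(capmem / elemSize)
}

func main() {
	// The four examples: len 2, cap 2, append 3 elements → want 5.
	fmt.Println(predictCap(2, 2, 5, 1))  // []byte:  5 B  → 8 B class  → cap 8
	fmt.Println(predictCap(2, 2, 5, 8))  // []int:   40 B → 48 B class → cap 6
	fmt.Println(predictCap(2, 2, 5, 4))  // []int32: 20 B → 32 B class → cap 8
	fmt.Println(predictCap(2, 2, 5, 24)) // []D (24 B each): 120 B → 128 B → cap 5
}
```

Matching the observed output: cap of a is 8, b is 6, c is 8, d is 5.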

That's all. Thanks for reading, and discussion is welcome ~