Introduction

The slice is a data structure provided by the Go language that is simple and convenient to use. However, because of how slices are implemented, they often produce confusing results. Understanding the underlying structure and principles of slices helps avoid many common misunderstandings.

The underlying structure

The slice structure is defined in the slice.go file of the runtime package in the Go source:

// src/runtime/slice.go
type slice struct {
  array unsafe.Pointer
  len int
  cap int
}
  • array: a pointer to the underlying array that stores the data
  • len: the slice length, which we can get in code with the len() function
  • cap: the slice capacity, that is, how many elements it can hold without expansion. We can get this value in code with the cap() function

We can output the underlying structure of the slice with the following code:

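package main

import (
  "fmt"
  "unsafe"
)

// slice mirrors the field layout of the unexported runtime.slice.
type slice struct {
  array unsafe.Pointer
  len   int
  cap   int
}

func printSlice() {
  s := make([]uint32, 1, 10)
  // Reinterpret the slice header as our own struct to print its fields.
  fmt.Printf("%#v\n", *(*slice)(unsafe.Pointer(&s)))
}

func main() {
  printSlice()
}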

Run output:

main.slice{array:(unsafe.Pointer)(0xc0000d6030), len:1, cap:10}

Note one detail here: since the runtime.slice structure is unexported, we can’t use it directly. So I manually defined a slice structure in the code with the same fields as runtime.slice.

With the underlying structure in mind, we first review the basics of slices and then look at common slice pitfalls one by one.

Basics

Create a slice

There are four ways to create a slice:

  1. var

A var declaration creates a variable of slice type whose value is nil.

var s []uint32

For slices created in this way, the array field is a null pointer, and the len and cap fields are both 0.
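As a small sketch of what this means in practice (the helper name describe is made up for this example), a nil slice reports zero length and capacity, yet it can still be passed to append():

```go
package main

import "fmt"

// describe is a hypothetical helper for this example: it reports whether s
// is nil, together with its length and capacity.
func describe(s []uint32) (isNil bool, length, capacity int) {
	return s == nil, len(s), cap(s)
}

func main() {
	var s []uint32 // declared with var: a nil slice

	isNil, length, capacity := describe(s)
	fmt.Println(isNil, length, capacity) // true 0 0

	// append() works on a nil slice and allocates an underlying array.
	s = append(s, 1)
	isNil, length, _ = describe(s)
	fmt.Println(isNil, length) // false 1
}
```

So a nil slice usually needs no special handling before elements are added to it.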

  2. Slice literal

List all elements using a slice literal. The length and capacity of the slice are both equal to the number of elements listed.

s := []uint32{1, 2, 3}

After creation, both the len and cap fields of s are equal to 3.

  3. make

When creating a slice with make, you can specify the length and capacity. The format is make([]type, len[, cap]). You can specify only the length, or both the length and the capacity:

s1 := make([]uint32, 1)     // length 1; capacity defaults to the length
s2 := make([]uint32, 1, 10) // length 1, capacity 10
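A minimal sketch to check the two forms of make (the lenCap helper is a name invented for this example):

```go
package main

import "fmt"

// lenCap is a hypothetical helper that returns the length and capacity of s.
func lenCap(s []uint32) (int, int) {
	return len(s), cap(s)
}

func main() {
	a := make([]uint32, 1)     // length only: capacity equals the length
	b := make([]uint32, 1, 10) // length and capacity

	l, c := lenCap(a)
	fmt.Println(l, c) // 1 1

	l, c = lenCap(b)
	fmt.Println(l, c) // 1 10

	// Elements within the length are zero-initialized.
	fmt.Println(a[0], b[0]) // 0 0
}
```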
  4. Slice operator

Use the slice operator to create a new slice by taking a portion of an existing slice or array. The slice operator has the form [low:high], for example:

var arr [10]uint32
s1 := arr[0:5]
s2 := arr[:5]
s3 := arr[5:]
s4 := arr[:]

The interval is closed on the left and open on the right, i.e. [low, high): it includes index low but excludes index high. The length of the resulting slice is high-low.

In addition, both low and high have default values: low defaults to 0, and high defaults to the length of the original slice or array. Either can be omitted, in which case the default value is used.

A slice created this way shares the same underlying data with the original slice or array, so writing through one may overwrite data seen through the other. Be careful.
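To make the sharing concrete, here is a small sketch (shareDemo is a name invented for this example). It slices one array twice and writes through the first slice. It also returns the capacity of the first slice, which for the two-index form runs from low to the end of the array:

```go
package main

import "fmt"

// shareDemo slices one array twice, writes through the first slice, and
// returns what the second slice and the array see afterwards, plus the
// length and capacity of the first slice.
func shareDemo() (viaS2, viaArr uint32, length, capacity int) {
	var arr [10]uint32
	s1 := arr[2:5] // indexes 2, 3, 4 of arr
	s2 := arr[4:8] // indexes 4..7 of arr; overlaps s1 at index 4

	s1[2] = 42 // writes arr[4], which is also s2[0]

	return s2[0], arr[4], len(s1), cap(s1)
}

func main() {
	viaS2, viaArr, length, capacity := shareDemo()
	fmt.Println(viaS2, viaArr)    // 42 42
	fmt.Println(length, capacity) // 3 8
}
```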

Add elements

You can use the append() function to add elements to a slice, zero or more at a time. If the remaining space (that is, cap-len) is sufficient for the new elements, append() simply places them at the end and increases len. Otherwise it has to expand: it allocates a larger array, copies the elements over from the old array, and then adds the new ones.

package main

import "fmt"

func main() {
  s := make([]uint32, 0, 4)

  s = append(s, 1, 2, 3)
  fmt.Println(len(s), cap(s)) // 3 4

  s = append(s, 4, 5, 6)
  fmt.Println(len(s), cap(s)) // 6 8
}

You don’t know slice

  1. Is an empty slice equal to nil?

What is the output of the following code?

package main

import "fmt"

func main() {
  var s1 []uint32
  s2 := make([]uint32, 0)

  fmt.Println(s1 == nil)
  fmt.Println(s2 == nil)
  fmt.Println("nil slice:", len(s1), cap(s1))
  fmt.Println("empty slice:", len(s2), cap(s2))
}

Analysis:

First of all, both s1 and s2 have a length and capacity of 0, which is easy to understand. Comparing a slice with nil actually checks whether the array field in the slice structure is a null pointer. So s1 == nil returns true and s2 == nil returns false: even though s2 has length 0, make() still gives it a non-nil array pointer. Therefore, the var form is generally used to define a slice of length 0.
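Since both kinds of slice have length 0, code that only cares whether there are any elements can check the length instead of comparing with nil. A minimal sketch, where isEmpty is a helper name made up for this example:

```go
package main

import "fmt"

// isEmpty is a hypothetical helper: it treats nil and empty slices alike
// by checking the length instead of comparing with nil.
func isEmpty(s []uint32) bool {
	return len(s) == 0
}

func main() {
	var s1 []uint32         // nil slice
	s2 := make([]uint32, 0) // empty, non-nil slice

	fmt.Println(s1 == nil, isEmpty(s1)) // true true
	fmt.Println(s2 == nil, isEmpty(s2)) // false true
}
```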

  2. Pass by value or by reference?

What is the output of the following code?

package main

import "fmt"

func main() {
  s1 := []uint32{1, 2, 3}
  s2 := append(s1, 4)

  fmt.Println(s1)
  fmt.Println(s2)
}

Analysis:

Why does append() have a return value? Because when we pass a slice to append(), what we pass is the runtime.slice structure, and it is passed by value. Changes to the array/len/cap fields inside the function therefore do not affect the caller's slice structure. In the code above, len and cap of s1 remain unchanged after append(), so the output is:

[1 2 3]
[1 2 3 4]

So we call append() in the form of s = append(s, elem) and assign the return value to the original slice to overwrite the array/len/cap fields.

Beginners may also make the mistake of ignoring the return value of append():

append(s, elem)

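This is even worse. The Go compiler rejects a bare append() call whose result is unused. Even if the result is discarded explicitly, for example with _ = append(s, elem), the added elements are lost, because the fields of the caller's slice remain unchanged.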

So slices look as if they are passed by reference, but what is actually passed is the value of the runtime.slice structure. Changes to existing elements are visible outside the function because the underlying array is shared.
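Both effects can be seen in one short sketch. The function names modify and grow are invented for this example: modify changes an element and appends without returning anything, while grow returns the result of append().

```go
package main

import "fmt"

// modify is a hypothetical function: it changes an existing element and
// then appends. Only the element change is visible to the caller.
func modify(s []uint32) {
	s[0] = 100       // writes to the shared underlying array
	s = append(s, 4) // updates only the local copy of the slice structure
}

// grow returns the result of append() so the caller can keep it.
func grow(s []uint32) []uint32 {
	return append(s, 4)
}

func main() {
	s := []uint32{1, 2, 3}

	modify(s)
	fmt.Println(s) // [100 2 3]

	s = grow(s)
	fmt.Println(s) // [100 2 3 4]
}
```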

  3. Slice capacity expansion policy

What is the output of the following code?

package main

import "fmt"

func main() {
  var s1 []uint32
  s1 = append(s1, 1, 2, 3)
  s2 := append(s1, 4)
  fmt.Println(&s1[0] == &s2[0])
}

This involves the capacity expansion policy for slices. When a slice has to grow:

  • If the required capacity is more than double the current capacity, the required capacity is used directly.
  • Otherwise, if the current capacity is less than 1024, the capacity is doubled.
  • Otherwise, if the current capacity is greater than or equal to 1024, the capacity is increased by 25% repeatedly until it meets the required capacity.

I looked at the growth code in runtime/slice.go from the Go 1.16 source. After applying the rules above, it further adjusts the result according to the size of the slice elements and the machine word size. The whole process is fairly complex; interested readers can study it on their own.

All we need to know is that while the capacity is small it is doubled, which reduces how often later appends trigger expansion. Once the capacity is large, doubling would waste a lot of memory, so growth slows down.
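One way to observe the policy without reading the runtime source is to append elements one at a time and record every capacity change. The sketch below does that; capChanges is a name made up for this example, and the exact capacities it prints depend on the Go version and element type, so none are listed here.

```go
package main

import "fmt"

// capChanges is a hypothetical helper: it appends n elements one at a time
// and records each new capacity whenever append() had to reallocate.
func capChanges(n int) []int {
	var s []uint32
	var caps []int
	for i := 0; i < n; i++ {
		before := cap(s)
		s = append(s, uint32(i))
		if cap(s) != before {
			caps = append(caps, cap(s))
		}
	}
	return caps
}

func main() {
	// The exact values depend on the Go version and element size.
	fmt.Println(capChanges(2000))
}
```

The list of capacities is much shorter than the number of appends, which is the point of growing in large steps.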

In the example above, s1 = append(s1, 1, 2, 3) grows the capacity to 4. When s2 := append(s1, 4) runs there is enough space, so no new underlying array is allocated. s1 and s2 therefore have the same address for their first element, and the output is true.

  4. Slicing strings with the slice operator

The slice operator can also be applied to strings, but the result differs from slicing slices and arrays. Slicing a string returns a string, not a slice. This is because strings are immutable: if a slice were returned, it would share the underlying data with the string, and the string could then be modified through that slice.

package main

import "fmt"

func main() {
  str := "hello, world"
  fmt.Println(str[:5])
}

The output is hello.
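If the bytes of a string do need to be changed, the usual route is to convert it to a []byte first, which copies the data, and convert back afterwards. A small sketch, assuming ASCII input; upperFirst is a helper name invented for this example:

```go
package main

import "fmt"

// upperFirst is a hypothetical helper: it returns a copy of s with the first
// byte upper-cased (ASCII only), leaving the original string untouched.
func upperFirst(s string) string {
	if s == "" {
		return s
	}
	b := []byte(s) // the conversion copies the string's bytes
	if b[0] >= 'a' && b[0] <= 'z' {
		b[0] -= 'a' - 'A'
	}
	return string(b)
}

func main() {
	str := "hello, world"
	sub := str[:5] // sub has type string, not []byte

	fmt.Println(sub)             // hello
	fmt.Println(upperFirst(sub)) // Hello
	fmt.Println(str)             // hello, world
}
```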

  5. Slices share underlying data

What is the output of the following code?

package main

import "fmt"

func main() {
  array := [10]uint32{1, 2, 3, 4, 5}
  s1 := array[:5]

  s2 := s1[5:10]
  fmt.Println(s2)

  s1 = append(s1, 6)
  fmt.Println(s1)
  fmt.Println(s2)
}

Analysis:

First, notice that in s2 := s1[5:10] the upper bound 10 is greater than the length of s1. Remember that when the slice operator is applied to a slice, the upper bound is limited by the slice's capacity, not its length. At this point the two slices share the same underlying array: s1 covers array[0:5] and s2 covers array[5:10].

Then the output s2 is:

[0 0 0 0 0]

Then element 6 is appended to s1. Since s1 still has spare capacity, the element is written into array[5], which is also s2[0], so s1 and s2 now share element 6.

In this case, the output s1 and s2 are:

[1 2 3 4 5 6]
[6 0 0 0 0]

It can be seen that, because slices share underlying data, modifying one slice may modify other slices. This can sometimes cause bugs that are difficult to debug. To alleviate this problem, Go 1.2 introduced an extended slice operator, [low:high:max], which limits the capacity of the new slice. The capacity of a slice produced this way is max-low.

package main

import "fmt"

func main() {
  array := [10]uint32{1, 2, 3, 4, 5}
  s1 := array[:5:5]

  s2 := array[5:10:10]
  fmt.Println(s2)

  s1 = append(s1, 6)
  fmt.Println(s1)
  fmt.Println(s2)
}

Executing s1 := array[:5:5] limits the capacity of s1 to 5.

When s1 = append(s1, 6) runs, len == cap == 5, so append() allocates a new underlying array, copies the existing elements into it, and adds the new element there. After that, s1 and s2 do not interfere with each other: s1 prints as [1 2 3 4 5 6], while s2 still prints as [0 0 0 0 0].
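The two forms of the slice operator can also be compared side by side. In the sketch below, appendAffectsNeighbor is a name made up for this example; it reports whether appending to the first half of an array overwrites the first element of the second half.

```go
package main

import "fmt"

// appendAffectsNeighbor is a hypothetical helper: it appends to a slice of
// the first half of an array and reports whether the slice covering the
// second half sees the appended value.
func appendAffectsNeighbor(limitCap bool) bool {
	array := [10]uint32{1, 2, 3, 4, 5}

	var s1 []uint32
	if limitCap {
		s1 = array[:5:5] // capacity limited to 5
	} else {
		s1 = array[:5] // capacity 10, reaches into the second half
	}
	s2 := array[5:10]

	s1 = append(s1, 6)
	return len(s1) == 6 && s2[0] == 6
}

func main() {
	fmt.Println(appendAffectsNeighbor(false)) // true
	fmt.Println(appendAffectsNeighbor(true))  // false
}
```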

Conclusion

By understanding the underlying data structure of slices and knowing that what gets passed around is a runtime.slice value, we can solve more than 90% of slice problems. Picturing the underlying array alongside each slice makes it easy to see how the data is manipulated.

The name of this series is my imitation of the book You Don’t Know JS 😀.

References

  1. Go Expert Programming, Douban link: book.douban.com/subject/351…
  2. You Don’t Know Go, GitHub: github.com/darjun/you-…

About me

My blog: darjun.github.io

Welcome to follow my WeChat official account [GoUpUp]. Let’s learn and improve together ~