
Preface

unsafe.Pointer is used extensively in the Go standard library. So what is unsafe.Pointer? Let's take a look at the unsafe package.

What is unsafe

As we all know, Go is designed as a strongly typed, static language. Strongly typed means a value's type cannot be changed at will, and static means type checking is done before the program runs. So the language does not allow converting between two different pointer types. Readers who have used C know that C permits this; Go forbids it for safety reasons. Forced conversions cause all sorts of trouble: sometimes the trouble is easy to spot, and sometimes it is buried deep and hard to detect. Some readers may not see why such a cast is unsafe, so here is a simple example in C:

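int main(void) {
  double pi = 3.1415926;
  double *pv = &pi;  /* points to an 8-byte double */
  void *temp = pv;   /* void * acts as the intermediary */
  int *p = temp;     /* now treated as a pointer to a 4-byte int */
}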

In standard C, a pointer to any non-void object type can be assigned to and from a void pointer, so a void pointer can serve as an intermediary for indirect conversion between pointers of different types. In the example above, pv points to an 8-byte double, but after the conversion p points to a 4-byte int. Reading through p sees only part of the original value (the memory is truncated), so memory access after such a conversion is a safety hazard. I think this is one of the reasons Go was designed to be strongly typed.

Although such conversions are unsafe, in some special scenarios they can be used to break through Go's type and memory safety mechanisms, bypassing the inefficiencies of the type system and improving performance. That is why the Go standard library provides the unsafe package. It is called unsafe because its use is not recommended, but that does not mean it cannot be used: once you know it thoroughly, you can put it into practice.

How unsafe works

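Let's look at the unsafe source first. At the time of writing, the standard library's unsafe package provides only three functions (later releases add more, such as unsafe.Add and unsafe.Slice in Go 1.17):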

func Sizeof(x ArbitraryType) uintptr
func Offsetof(x ArbitraryType) uintptr
func Alignof(x ArbitraryType) uintptr

  • Sizeof(x ArbitraryType) returns the number of bytes occupied by the type of x, not including the size of whatever x points to. It serves the same purpose as the sizeof operator in C. For example, on a 32-bit machine a pointer has a size of 4 bytes.
  • Offsetof(x ArbitraryType) returns the number of bytes between the start of a struct and the start of one of its fields. The argument must be a struct field (of the form s.f), and the return value is a constant.
  • Alignof(x ArbitraryType) returns the alignment value of a type, also called the alignment coefficient or alignment multiple. The alignment value is tied to memory alignment, and proper alignment improves the performance of memory reads and writes. Alignment values are generally of the form 2^n and typically do not exceed 8 (a consequence of memory alignment). The reflect package can also report the alignment value: unsafe.Alignof(x) is equivalent to reflect.TypeOf(x).Align(). For a variable x of any type, unsafe.Alignof(x) is at least 1. For a variable x of struct type, unsafe.Alignof(x) is the largest of the values unsafe.Alignof(x.f) over the fields f of x. For a variable x of array type, unsafe.Alignof(x) equals the alignment of the array's element type. A struct{} with no fields and an array with no elements occupy 0 bytes of memory, and different zero-size variables may share the same address.
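The Alignof properties listed above can be checked with a small program. This is only a sketch: the pair struct and the two helper functions are hypothetical names made up for the illustration, not part of the unsafe package.

```go
package main

import (
	"fmt"
	"reflect"
	"unsafe"
)

// pair is a hypothetical struct used only for this illustration.
type pair struct {
	flag  bool
	count int64
}

// alignMatchesReflect reports whether unsafe.Alignof and
// reflect.TypeOf(...).Align() agree for a pair value.
func alignMatchesReflect() bool {
	var p pair
	return unsafe.Alignof(p) == uintptr(reflect.TypeOf(p).Align())
}

// structAlignIsMaxField reports whether the alignment of pair equals
// the largest alignment among its fields.
func structAlignIsMaxField() bool {
	var p pair
	maxAlign := unsafe.Alignof(p.flag)
	if a := unsafe.Alignof(p.count); a > maxAlign {
		maxAlign = a
	}
	return unsafe.Alignof(p) == maxAlign
}

func main() {
	var arr [4]int16
	fmt.Println(alignMatchesReflect())
	fmt.Println(structAlignIsMaxField())
	// An array is aligned like its element type.
	fmt.Println(unsafe.Alignof(arr) == unsafe.Alignof(arr[0]))
	// Zero-size types occupy no memory.
	fmt.Println(unsafe.Sizeof(struct{}{}), unsafe.Sizeof([0]int{}))
}
```

All three comparisons should print true, and the last line should print two zeros.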

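Careful readers will notice that all three functions return a uintptr. A uintptr can be converted to and from unsafe.Pointer, and unlike *T it supports arithmetic, so offsets can be computed with it. In this way specific memory can be accessed, which makes it possible to read and write different memory locations. All three functions take an argument of type ArbitraryType, which stands for any type. The package also provides a Pointer type, a generic pointer similar to void * in C.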

type ArbitraryType int
type Pointer *ArbitraryType
// uintptr is an integer type that is large enough to hold the bit pattern of any pointer.
type uintptr uintptr

This may seem a bit confusing, but here’s a summary of the three pointer types:

  • *T: an ordinary typed pointer, used to pass the address of an object. It does not support pointer arithmetic.
  • unsafe.Pointer: a generic pointer type, used to convert between pointers of different types. It does not support pointer arithmetic, and the value stored in memory cannot be read through it directly (it must first be converted to an ordinary pointer type).
  • uintptr: used for pointer arithmetic. The GC does not treat a uintptr as a pointer, so a uintptr cannot keep an object alive, and the object a uintptr refers to may be reclaimed.

unsafe.Pointer is a bridge: pointers of any type can be converted to and from it, and it can be converted to uintptr for pointer arithmetic. In other words, uintptr is used together with unsafe.Pointer to perform pointer arithmetic. Schematically: *T <-> unsafe.Pointer <-> uintptr.
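As a minimal sketch of that bridge, the following program walks the elements of a slice by doing arithmetic on a uintptr and converting back. The function name sumInts is made up for this illustration; real code would simply index the slice.

```go
package main

import (
	"fmt"
	"unsafe"
)

// sumInts adds up the elements of nums by stepping a pointer through
// the backing array instead of indexing the slice.
func sumInts(nums []int) int {
	if len(nums) == 0 {
		return 0
	}
	base := unsafe.Pointer(&nums[0])
	size := unsafe.Sizeof(nums[0])
	total := 0
	for i := 0; i < len(nums); i++ {
		// *T -> unsafe.Pointer -> uintptr (arithmetic) -> unsafe.Pointer -> *T,
		// all inside a single expression.
		p := (*int)(unsafe.Pointer(uintptr(base) + uintptr(i)*size))
		total += *p
	}
	return total
}

func main() {
	fmt.Println(sumInts([]int{1, 2, 3, 4})) // 10
}
```

Note that the uintptr never leaves the expression in which it is computed; the reason for that is discussed further down.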

So much for the basic principles; now let's see how to use it.

Basic use of unsafe.Pointer

sync/atomic/value.go defines an ifaceWords struct whose typ and data fields are both of type unsafe.Pointer. unsafe.Pointer is used here because the value passed in is an interface{}, which is cast to the ifaceWords type via unsafe.Pointer. This exposes both the type and the data of the value, so that the type can be checked on later writes. An excerpt of the code:

// ifaceWords is interface{} internal representation.
type ifaceWords struct {
	typ  unsafe.Pointer
	data unsafe.Pointer
}

// Store sets the value of the Value to x.
func (v *Value) Store(x interface{}) {
	vp := (*ifaceWords)(unsafe.Pointer(v))
	xp := (*ifaceWords)(unsafe.Pointer(&x))
	for {
		typ := LoadPointer(&vp.typ) // read the type of the existing value
		// ... omitted ...
		// First store completed. Check type and overwrite data.
		if typ != xp.typ { // compare the stored type with the type being stored
			panic("sync/atomic: store of inconsistently typed value into Value")
		}
		// ... omitted ...
	}
}

This is one example of unsafe.Pointer in the Go source code; once you start reading the source, you will find it everywhere. Now let's write a simple example to see how unsafe.Pointer is used.

package main

import (
	"fmt"
	"unsafe"
)

func main() {
	number := 5
	pointer := &number
	fmt.Printf("number:addr:%p, value:%d\n", pointer, *pointer)

	// Reinterpret the same address as a *float32; no value conversion happens.
	float32Number := (*float32)(unsafe.Pointer(pointer))
	*float32Number = *float32Number + 3

	fmt.Printf("float32:addr:%p, value:%f\n", float32Number, *float32Number)
}

Running results:

number:addr:0xc000018090, value:5
float32:addr:0xc000018090, value:3.000000

The output shows that converting through unsafe.Pointer does not change the address the pointer holds, only its type. The example has no practical meaning in itself, and code like this would not appear in a normal project.

To summarize the basic usage: first convert the *T to unsafe.Pointer, then convert that to the pointer type you need.
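A slightly more meaningful variant of the same two-step conversion reinterprets a value as another type of the same size. The sketch below reads the raw bits of a float32 as a uint32 and compares the result with math.Float32bits from the standard library; float32Bits is a name made up for this illustration.

```go
package main

import (
	"fmt"
	"math"
	"unsafe"
)

// float32Bits reinterprets the 4 bytes of f as a uint32.
// No numeric conversion takes place: *float32 -> unsafe.Pointer -> *uint32.
func float32Bits(f float32) uint32 {
	return *(*uint32)(unsafe.Pointer(&f))
}

func main() {
	f := float32(1.5)
	fmt.Printf("%#x\n", float32Bits(f))                // 0x3fc00000
	fmt.Println(float32Bits(f) == math.Float32bits(f)) // true
}
```

Because both types are 4 bytes wide, nothing is truncated here, unlike the int to float32 example above.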

Basic use of the three functions Sizeof, Alignof and Offsetof

Let’s start with an example:

type User struct {
	Name string
	Age uint32
	Gender bool // male: true, female: false (for illustration)
}

func funcExample() {
	// sizeof
	fmt.Println(unsafe.Sizeof(true))
	fmt.Println(unsafe.Sizeof(int8(0)))
	fmt.Println(unsafe.Sizeof(int16(10)))
	fmt.Println(unsafe.Sizeof(int(10)))
	fmt.Println(unsafe.Sizeof(int32(190)))
	fmt.Println(unsafe.Sizeof("asong"))
	fmt.Println(unsafe.Sizeof([]int{1, 3, 4}))
	// Offsetof
	user := User{Name: "Asong", Age: 23,Gender: true}
	userNamePointer := unsafe.Pointer(&user)

	nNamePointer := (*string)(unsafe.Pointer(userNamePointer))
	*nNamePointer = "Golang Dream Factory"

	nAgePointer := (*uint32)(unsafe.Pointer(uintptr(userNamePointer) + unsafe.Offsetof(user.Age)))
	*nAgePointer = 25

	nGender := (*bool)(unsafe.Pointer(uintptr(userNamePointer)+unsafe.Offsetof(user.Gender)))
	*nGender = false

	fmt.Printf("u.Name: %s, u.Age: %d, u.Gender: %v\n", user.Name, user.Age,user.Gender)
	// Alignof
	var b bool
	var i8 int8
	var i16 int16
	var i64 int64
	var f32 float32
	var s string
	var m map[string]string
	var p *int32

	fmt.Println(unsafe.Alignof(b))
	fmt.Println(unsafe.Alignof(i8))
	fmt.Println(unsafe.Alignof(i16))
	fmt.Println(unsafe.Alignof(i64))
	fmt.Println(unsafe.Alignof(f32))
	fmt.Println(unsafe.Alignof(s))
	fmt.Println(unsafe.Alignof(m))
	fmt.Println(unsafe.Alignof(p))
}

Start with the Sizeof part. The size of int depends on the CPU word size: on a 32-bit CPU an int is 4 bytes, and on a 64-bit CPU it is 8 bytes. My computer is 64-bit, so the result here is 8 bytes.
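One detail of Sizeof that is easy to miss: for strings and slices it reports the size of the header, not of the data referred to. A minimal sketch (the helper name headerSizes is made up for this illustration):

```go
package main

import (
	"fmt"
	"unsafe"
)

// headerSizes returns the sizes of a string value and a []int value.
// These are the sizes of their headers, not of the data they refer to.
func headerSizes(s string, xs []int) (uintptr, uintptr) {
	return unsafe.Sizeof(s), unsafe.Sizeof(xs)
}

func main() {
	word := unsafe.Sizeof(uintptr(0)) // 4 on 32-bit, 8 on 64-bit
	s1, x1 := headerSizes("", nil)
	s2, x2 := headerSizes("a much longer string value", make([]int, 1000))
	fmt.Println(s1 == s2, x1 == x2)         // true true
	fmt.Println(s1 == 2*word, x1 == 3*word) // true true
}
```

On a 64-bit machine this corresponds to 16 bytes for any string and 24 bytes for any slice.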

Next is the Offsetof part, where the struct's fields are modified through pointers. The first field needs no offset calculation: take the address of the struct, convert it to unsafe.Pointer, and then convert that to a *string. To modify the other fields, their addresses have to be computed from an offset. Offsetof returns the offset of a field within the struct, that is, the number of bytes between the start of the struct and the field. Pay attention to how uintptr is used in the code: a uintptr must not be stored in a temporary variable. As mentioned above, uintptr is used for pointer arithmetic, but the GC does not treat a uintptr as a pointer, and a uintptr cannot keep an object alive. The object a uintptr refers to may be reclaimed, so you cannot know when it will be GC'd or what errors the subsequent memory operations will cause. Here's an example:

// Do not use it this way
p1 := uintptr(userNamePointer)
nAgePointer := (*uint32)(unsafe.Pointer(p1 + unsafe.Offsetof(user.Age)))
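By contrast, the two forms sketched below avoid holding a uintptr in a variable. The account struct and both function names are hypothetical, and the second form assumes Go 1.17 or later, where unsafe.Add is available.

```go
package main

import (
	"fmt"
	"unsafe"
)

// account is a hypothetical struct used only for this illustration.
type account struct {
	name  string
	level uint32
}

// setLevel writes the level field through pointer arithmetic.
// The conversion to uintptr, the addition and the conversion back
// all happen in one expression, so no uintptr is stored in a variable.
func setLevel(a *account, v uint32) {
	p := (*uint32)(unsafe.Pointer(uintptr(unsafe.Pointer(a)) + unsafe.Offsetof(a.level)))
	*p = v
}

// setLevelAdd does the same with unsafe.Add (Go 1.17+), which takes
// and returns an unsafe.Pointer.
func setLevelAdd(a *account, v uint32) {
	p := (*uint32)(unsafe.Add(unsafe.Pointer(a), unsafe.Offsetof(a.level)))
	*p = v
}

func main() {
	a := &account{name: "asong"}
	setLevel(a, 7)
	fmt.Println(a.level) // 7
	setLevelAdd(a, 9)
	fmt.Println(a.level) // 9
}
```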

Finally, the Alignof part, which reports the alignment value of each variable. The alignment value of a basic type is fixed, except for types whose size depends on the CPU word size, such as int and uintptr. The alignment value of a struct is the largest alignment value among its fields.

Classic use: string and []byte conversion

To convert between string and []byte, we would normally write the standard conversions:

// string to []byte
str1 := "Golang Dream Factory"
by := []byte(str1)

// []byte to string
str2 := string(by)

The standard conversions copy the underlying data. With unsafe.Pointer, the conversion between string and []byte can be done without that copy. The reflect package has structs that describe string and slice at run time:

type StringHeader struct {
	Data uintptr
	Len  int
}

type SliceHeader struct {
	Data uintptr
	Len  int
	Cap  int
}

StringHeader is the runtime representation of a string, and SliceHeader is that of a slice. Comparing the two shows that they differ only by the Cap field, so the leading part of their memory layouts lines up. That makes unsafe.Pointer a natural fit for the conversion, and we can write the following code:

func stringToByte(s string) []byte {
	header := (*reflect.StringHeader)(unsafe.Pointer(&s))

	newHeader := reflect.SliceHeader{
		Data: header.Data,
		Len:  header.Len,
		Cap:  header.Len,
	}

	return *(*[]byte)(unsafe.Pointer(&newHeader))
}

func bytesToString(b []byte) string {
	header := (*reflect.SliceHeader)(unsafe.Pointer(&b))

	newHeader := reflect.StringHeader{
		Data: header.Data,
		Len:  header.Len,
	}

	return *(*string)(unsafe.Pointer(&newHeader))
}

The code above performs the conversion by constructing a new slice header and string header. For the []byte to string direction, building the StringHeader by hand can be skipped: because the Data and Len fields of a slice header line up with those of a string header, casting the pointer directly is enough. The simplified code is as follows:

func bytesToString(b []byte) string {
	return *(*string)(unsafe.Pointer(&b))
}

Although this approach is more efficient, I do not recommend it for general use. As stressed earlier, it is unsafe: used improperly it carries serious hidden dangers, and some of the resulting failures cannot be caught with recover.
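For readers on newer toolchains, the same zero-copy idea can be sketched without reflect.StringHeader and reflect.SliceHeader, which are marked deprecated in recent Go releases. The sketch assumes Go 1.20 or later (unsafe.String, unsafe.StringData, unsafe.Slice and unsafe.SliceData); the function names are made up for this illustration, and the same safety caveats apply.

```go
package main

import (
	"fmt"
	"unsafe"
)

// bytesToStringNoCopy returns a string that shares b's memory.
// The caller must not modify b afterwards.
func bytesToStringNoCopy(b []byte) string {
	if len(b) == 0 {
		return ""
	}
	return unsafe.String(unsafe.SliceData(b), len(b))
}

// stringToBytesNoCopy returns a []byte that shares s's memory.
// The result must never be written to, because strings are immutable.
func stringToBytesNoCopy(s string) []byte {
	if s == "" {
		return nil
	}
	return unsafe.Slice(unsafe.StringData(s), len(s))
}

func main() {
	b := []byte("Golang Dream Factory")
	s := bytesToStringNoCopy(b)
	fmt.Println(s)
	// Both views start at the same address, so no copy was made.
	fmt.Println(unsafe.StringData(s) == unsafe.SliceData(b)) // true
	fmt.Println(len(stringToBytesNoCopy("asong")))           // 5
}
```

Writing to the slice returned by stringToBytesNoCopy can crash the program in a way that recover cannot intercept, which is exactly the kind of hidden danger mentioned above.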

Memory alignment

Memory is addressed in bytes, so in theory a variable of any type could be accessed starting at any address. In practice, variables of a given type are usually accessed at particular addresses, which requires data to be laid out in memory according to certain rules rather than packed one after another. This is alignment.

When the CPU accesses memory, it does so in units of its word size rather than byte by byte. For example, a 32-bit CPU has a 4-byte word, so it accesses memory 4 bytes at a time. This design reduces the number of memory accesses and increases memory throughput: to read 8 bytes of data at 4 bytes per access, only 2 reads are needed. Memory alignment also helps with atomic operations on variables. Each memory access is atomic, so if a variable is no larger than a word and is properly aligned, access to it is atomic, which is critical in concurrent scenarios.

Let’s look at an example:

// 64-bit platform, alignment parameter is 8
type User1 struct {
	A int32   // 4
	B []int32 // 24
	C string  // 16
	D bool    // 1
}

type User2 struct {
	B []int32
	A int32
	D bool
	C string
}

type User3 struct {
	D bool
	B []int32
	A int32
	C string
}
func main() {
	var u1 User1
	var u2 User2
	var u3 User3

	fmt.Println("u1 size is ",unsafe.Sizeof(u1))
	fmt.Println("u2 size is ",unsafe.Sizeof(u2))
	fmt.Println("u3 size is ",unsafe.Sizeof(u3))
}
// Output (macOS, 64-bit):
u1 size is  56
u2 size is  48
u3 size is  56

As the results show, a different field order occupies a different amount of memory. This is because memory alignment affects the size of a struct, so ordering the fields sensibly can reduce memory overhead. Go's alignment rules are the same as C's, so the C rules apply here as well:

  • For the members of a struct, the first member is at offset 0. Each subsequent member's offset from the start of the struct must be an integer multiple of the smaller of that member's size and the effective alignment value; where necessary, the compiler inserts padding bytes between members.
  • Besides its members, the struct itself must be aligned: the total size of the struct must be an integer multiple of the smaller of the compiler's default alignment and the size of its largest member type.

Now that we know the rules, let's analyze the example above. My Mac has a 64-bit CPU, so the analysis uses an alignment parameter of 8. The alignment values of int32, []int32, string and bool are 4, 8, 8 and 1, and their sizes are 4, 24, 16 and 1 bytes respectively. Let's first analyze User1 under the first rule:

  • The first field is an int32, with alignment value 4 and size 4, so it is placed first in the layout, at bytes 0 to 3.
  • The second field is a []int32, with alignment value 8 and size 24, so its offset must be a multiple of 8. In User1 it therefore cannot start at offset 4; it has to start at offset 8. Bytes 4 to 7 are padding inserted by the compiler, usually zero-filled and also called a hole. Bytes 8 to 31 hold the second field, B.
  • The third field is a string, with alignment value 8 and size 16, so its offset must be a multiple of 8. The first two fields already occupy everything up to byte 31, so the next offset is 32, which is a multiple of the alignment value of C. No padding is needed, and bytes 32 to 47 hold the third field, C.
  • The fourth field is a bool, with alignment value 1 and size 1, so its offset must be a multiple of 1. The first three fields already occupy everything up to byte 47, so the next offset is 48, which is a multiple of the alignment value of D. No padding is needed, and byte 48 holds the fourth field, D.
  • After applying the first rule, the length is 49 bytes. Now apply the second rule. The default alignment is 8 and the largest field size is 24; taking the smaller of the two gives a struct alignment value of 8. The current length, 49, is not a multiple of 8, so it has to be rounded up: the final size is 56, with 7 bytes of padding at the end.

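Put together, the layout of User1 is: A at bytes 0 to 3, padding at bytes 4 to 7, B at bytes 8 to 31, C at bytes 32 to 47, D at byte 48, and padding at bytes 49 to 55.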

Now you have the idea. The other two structs can be analyzed in the same way, so I won't go through them here.
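The analysis can be cross-checked mechanically. The sketch below redeclares the three structs and prints each field's offset using the reflect package; fieldOffsets is a helper name made up for this illustration, and the numbers in the comments assume a 64-bit platform.

```go
package main

import (
	"fmt"
	"reflect"
)

type User1 struct {
	A int32
	B []int32
	C string
	D bool
}

type User2 struct {
	B []int32
	A int32
	D bool
	C string
}

type User3 struct {
	D bool
	B []int32
	A int32
	C string
}

// fieldOffsets returns the byte offset of every field of the struct
// value v, in declaration order.
func fieldOffsets(v interface{}) []uintptr {
	t := reflect.TypeOf(v)
	offsets := make([]uintptr, t.NumField())
	for i := 0; i < t.NumField(); i++ {
		offsets[i] = t.Field(i).Offset
	}
	return offsets
}

func main() {
	for _, v := range []interface{}{User1{}, User2{}, User3{}} {
		t := reflect.TypeOf(v)
		fmt.Println(t.Name(), fieldOffsets(v), "size", t.Size())
	}
	// On a 64-bit platform:
	// User1 [0 8 32 48] size 56
	// User2 [0 24 28 32] size 48
	// User3 [0 8 32 40] size 56
}
```

User2 is the smallest because A and D share the 8-byte slot after B instead of each forcing its own padding.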

One last point about memory alignment: an empty struct{} occupies no storage, and as a field of another struct it generally needs no alignment padding. There is one exception: when a struct{} is the last field of a struct, padding is required. If a pointer to that field were taken, the returned address would lie outside the struct, and if that pointer stayed alive without the corresponding memory being released, there would be a memory leak (the memory is not released along with the struct). Here's an example:

func main() {
	fmt.Println(unsafe.Sizeof(test1{})) // 8
	fmt.Println(unsafe.Sizeof(test2{})) // 4
}

type test1 struct {
	a int32
	b struct{}
}

type test2 struct {
	a struct{}
	b int32
}

Simply put, when a type that occupies 0 bytes, such as struct{} or [0]byte, appears at the end of a struct, it is treated as occupying 1 byte. So the test1 struct effectively looks like this:

type test1 struct {
	a int32
	// b struct{}
	b [1]byte
}

Therefore, after memory alignment, the struct ends up occupying 8 bytes.

Important note: Do not add a zero-size type to the end of the structure definition
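The effect of that trailing padding can be observed with Offsetof. In the sketch below, tail and head are hypothetical structs mirroring test1 and test2, and the printed numbers assume the standard gc compiler.

```go
package main

import (
	"fmt"
	"unsafe"
)

// tail ends with a zero-size field; head starts with one.
type tail struct {
	a int32
	b struct{}
}

type head struct {
	a struct{}
	b int32
}

// tailFieldInside reports whether the address of the trailing
// zero-size field still falls inside the memory of the struct.
func tailFieldInside() bool {
	var t tail
	return unsafe.Offsetof(t.b) < unsafe.Sizeof(t)
}

func main() {
	var t tail
	var h head
	fmt.Println(unsafe.Offsetof(t.b), unsafe.Sizeof(t)) // 4 8
	fmt.Println(unsafe.Offsetof(h.b), unsafe.Sizeof(h)) // 0 4
	fmt.Println(tailFieldInside())                      // true
}
```

The trailing field b sits at offset 4 and the struct is padded to 8 bytes, so a pointer to b never points past the end of the struct; the price is the extra 4 bytes.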

Conclusion

To conclude, the unsafe package bypasses Go's type system in order to manipulate memory directly, so using it carries risk. In some cases, though, the functions in the unsafe package make code more efficient, and the Go source code itself uses unsafe extensively.

The unsafe package defines Pointer and three functions:

type ArbitraryType int
type Pointer *ArbitraryType

func Sizeof(x ArbitraryType) uintptr
func Offsetof(x ArbitraryType) uintptr
func Alignof(x ArbitraryType) uintptr

A uintptr can be converted to and from unsafe.Pointer, and a uintptr supports arithmetic. Combining uintptr with unsafe.Pointer therefore gets around the restriction that Go pointers cannot do arithmetic. With the functions in unsafe, you can obtain the addresses of a struct's private fields and read or write them, breaking through Go's type-safety restrictions.

Finally, we covered memory alignment. Alignment reduces the number of times the CPU accesses memory and increases memory throughput, and ordering struct fields sensibly can save memory.


I am Asong, an ordinary programmer. Let's get stronger together. See you next time.
