This article covers usage notes and pitfalls for Go's built-in string type.

The reflect/value.go file in the Go source describes the runtime structure of the built-in string type: Data is a pointer to the underlying bytes and Len is the length in bytes.

type StringHeader struct {
	Data uintptr
	Len  int
}

In Go, strings are immutable, and multiple strings can share the same underlying data.

// stringptr returns the address of the string's underlying data.
func stringptr(s string) uintptr {
	return (*reflect.StringHeader)(unsafe.Pointer(&s)).Data
}

func TestShare(t *testing.T) {
	s1 := "1234"
	s2 := s1[:2] // "12"
	// s1 and s2 are different strings that point to the same underlying data
	t.Log(stringptr(s1) == stringptr(s2)) // true

	s3 := "12"
	s4 := "1" + "2"
	// Identical string constants are deduplicated at compile time
	t.Log(stringptr(s3) == stringptr(s4)) // true

	s5 := "12"
	s6 := strconv.Itoa(12)
	// Strings created at run time are not interned automatically
	t.Log(stringptr(s5) == stringptr(s6)) // false
}

Type conversion

A string's data is a sequence of bytes, so a string can be converted to []byte.

The runtime structure of a slice, SliceHeader, is similar to StringHeader, with an additional Cap field.

type SliceHeader struct {
	Data uintptr
	Len  int
	Cap  int
}

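A memory copy occurs when a string is converted directly to []byte. The String2Bytes function in the code that follows avoids the copy by reusing the string's data pointer; the resulting slice shares memory with the string and must never be written to.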
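
The copy made by the built-in conversion can be observed directly. In this small sketch, writing to the converted slice leaves the original string unchanged:

```go
package main

import "fmt"

func main() {
	s := "hello"
	b := []byte(s) // allocates a new backing array and copies the bytes
	b[0] = 'H'     // modifies only the copy

	fmt.Println(s)         // hello
	fmt.Println(string(b)) // Hello
}
```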

func String2Bytes(s string) []byte {
	sh := (*reflect.StringHeader)(unsafe.Pointer(&s))
	bh := reflect.SliceHeader{
		Data: sh.Data,
		Len:  sh.Len,
		Cap:  sh.Len,
	}
	return *(*[]byte)(unsafe.Pointer(&bh))
}

func Benchmark_NormalString2Bytes(b *testing.B) {
	x := "Hello Gopher! Hello Gopher! Hello Gopher!"
	for i := 0; i < b.N; i++ {
		_ = []byte(x)
	}
}

func Benchmark_String2Bytes(b *testing.B) {
	x := "Hello Gopher! Hello Gopher! Hello Gopher!"
	for i := 0; i < b.N; i++ {
		_ = String2Bytes(x)
	}
}

Run the benchmarks

go test -bench=. -benchmem -run=^Benchmark_$    

goos: darwin
goarch: amd64
pkg: github.com/liangyaopei/GolangTester/str
Benchmark_NormalString2Bytes-8          27215298                40.2 ns/op            48 B/op          1 allocs/op
Benchmark_String2Bytes-8                1000000000               0.306 ns/op           0 B/op          0 allocs/op
PASS
ok      github.com/liangyaopei/GolangTester/str 1.494s


Traversal

There are two ways to traverse a string:

  • for-range: decodes the string as UTF-8 and yields one rune per iteration.
  • Index traversal: yields one byte per iteration.

for-range traversal

func TestRangeStr(t *testing.T) {
	const nihongo = "日本語"
	for index, runeValue := range nihongo {
		t.Logf("%#U starts at byte position %d\n", runeValue, index)
	}
}

// === RUN TestRangeStr
// str_test.go:59: U+65E5 '日' starts at byte position 0
// str_test.go:59: U+672C '本' starts at byte position 3
// str_test.go:59: U+8A9E '語' starts at byte position 6
// --- PASS: TestRangeStr (0.00s)
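
To make the decoding step explicit, here is a rough sketch of what for-range does, written by hand with utf8.DecodeRuneInString from the standard library. The function name runeStarts is hypothetical.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// runeStarts returns the byte offset at which each rune of s begins,
// decoding one UTF-8 sequence at a time.
func runeStarts(s string) []int {
	var starts []int
	for i := 0; i < len(s); {
		// size is the number of bytes the decoded rune occupies (at least 1).
		_, size := utf8.DecodeRuneInString(s[i:])
		starts = append(starts, i)
		i += size
	}
	return starts
}

func main() {
	fmt.Println(runeStarts("日本語")) // [0 3 6]
}
```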

Index traversal

func TestRangeStr2(t *testing.T) {
	const nihongo = "日本語"
	for i := 0; i < len(nihongo); i++ {
		t.Logf("%v starts at byte position %d\n", nihongo[i], i)
	}
}

Interning

String interning is a technique for keeping only one copy of each distinct string in memory. For applications that store large numbers of duplicate strings, it can significantly reduce memory footprint.

package main

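import (
    "fmt"
    "reflect"
    "strconv"
    "unsafe"
)

// stringptr returns the address of the string's underlying data.
func stringptr(s string) uintptr {
    return (*reflect.StringHeader)(unsafe.Pointer(&s)).Data
}

// stringInterner maps each distinct string to its single stored copy.
type stringInterner map[string]string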

func (si stringInterner) Intern(s string) string {
    if interned, ok := si[s]; ok {
        return interned
    }
    si[s] = s
    return s
}

func main() {
    si := stringInterner{}
    s1 := si.Intern("12")
    s2 := si.Intern(strconv.Itoa(12))
    fmt.Println(stringptr(s1) == stringptr(s2)) // true
}


My official account: THE place to share lyP

My Zhihu column: zhuanlan.zhihu.com/c_127546654…

My blog: www.liangyaopei.com

Github Page: liangyaopei.github.io/