Strings are one of the most common data types, and in many languages, including Go, they are designed to be immutable for safety. Any operation that appears to change a string actually creates a new string; the existing string is never modified.

In Go, strings can be concatenated in several ways: directly with +, or with fmt.Sprintf, strings.Builder, or bytes.Buffer.
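As a quick illustration, the sketch below builds the same string with +, fmt.Sprintf, and strings.Builder. The helper names (joinPlus, joinSprintf, joinBuilder) are made up for this example and are not part of any library.

```go
package main

import (
	"fmt"
	"strings"
)

// joinPlus concatenates two strings with the + operator.
func joinPlus(a, b string) string {
	return a + b
}

// joinSprintf concatenates two strings through a format string.
func joinSprintf(a, b string) string {
	return fmt.Sprintf("%s%s", a, b)
}

// joinBuilder concatenates two strings by writing them into a strings.Builder.
func joinBuilder(a, b string) string {
	var sb strings.Builder
	sb.WriteString(a)
	sb.WriteString(b)
	return sb.String()
}

func main() {
	// All three calls print "Hello, Go".
	fmt.Println(joinPlus("Hello, ", "Go"))
	fmt.Println(joinSprintf("Hello, ", "Go"))
	fmt.Println(joinBuilder("Hello, ", "Go"))
}
```

The results are identical; the approaches differ only in convenience and cost.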

In this article, I’ll discuss how best to do string concatenation in code.

1. Run a benchmark

Before we start looking at the pros and cons of each concatenation method, let’s run a simple benchmark test to see how each string concatenation method performs.

Go ships with a benchmark framework in its testing package. The test file name must end with _test.go, and each benchmark function name must start with Benchmark. Here, three approaches are benchmarked: +, fmt.Sprintf, and strings.Builder.

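// Package name assumed from the import path shown in the benchmark output.
package string_benchmark

import (
	"fmt"
	"strings"
	"testing"
)

// BenchmarkPlus measures concatenation with the + operator.
func BenchmarkPlus(b *testing.B) {
	str := "this is just a string"

	for i := 0; i < b.N; i++ {
		stringPlus(str)
	}
}

// BenchmarkSPrintf measures concatenation with fmt.Sprintf.
func BenchmarkSPrintf(b *testing.B) {
	str := "this is just a string"
	for i := 0; i < b.N; i++ {
		stringSprintf(str)
	}
}

// BenchmarkStringBuilder measures concatenation with strings.Builder.
func BenchmarkStringBuilder(b *testing.B) {
	str := "this is just a string"
	for i := 0; i < b.N; i++ {
		stringBuilder(str)
	}
}

// stringPlus appends str 10,000 times using +=.
func stringPlus(str string) string {
	s := ""
	for i := 0; i < 10000; i++ {
		s += str
	}
	return s
}

// stringSprintf appends str 10,000 times using fmt.Sprintf.
func stringSprintf(str string) string {
	s := ""
	for i := 0; i < 10000; i++ {
		s = fmt.Sprintf("%s%s", s, str)
	}
	return s
}

// stringBuilder appends str using strings.Builder.
// Note: this loop runs 100,000 times, ten times more than the two helpers above.
func stringBuilder(str string) string {
	builder := strings.Builder{}
	for i := 0; i < 100000; i++ {
		builder.WriteString(str)
	}
	return builder.String()
}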

Each benchmark function receives a *testing.B. Its field b.N is not a fixed value: the framework adjusts it until the benchmark has run long enough to be timed reliably.

Here, stringPlus and stringSprintf append a fixed string 10,000 times, while stringBuilder, as written, appends it 100,000 times. The framework then reports the average execution time and memory consumption per call. Run the benchmark with the following command:

go test -bench=. -benchmem

The -bench=. argument runs all benchmarks in the current package (its value is a regular expression matched against benchmark names), and -benchmem adds memory allocation statistics. After running the above command, the output is as follows:

goos: darwin
goarch: amd64
pkg: zxin.com/zx-demo/string_benchmark
BenchmarkPlus-12                   12   96586447 ns/op   1086401355 B/op   10057 allocs/op
BenchmarkSPrintf-12                12   97037216 ns/op   1086402698 B/op   10065 allocs/op
BenchmarkStringBuilder-12         655    1713353 ns/op     11671537 B/op      35 allocs/op
PASS
ok      zxin.com/zx-demo/string_benchmark       6.186s

The first column is the benchmark name with the GOMAXPROCS value appended (-12 here). The second column is the number of iterations the framework ran (b.N), the third is the average time per iteration in nanoseconds, the fourth is the average number of bytes allocated per iteration, and the fifth is the average number of memory allocations per iteration.
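The same figures can also be read programmatically. The sketch below is only an illustration: it uses testing.Benchmark from the standard library to run a small + loop outside of go test and prints the fields that correspond to columns two through five. The helper names concatPlus and measure, and the loop count of 100, are chosen for this example.

```go
package main

import (
	"fmt"
	"testing"
)

// concatPlus appends str to an empty string n times using +=.
func concatPlus(str string, n int) string {
	s := ""
	for i := 0; i < n; i++ {
		s += str
	}
	return s
}

// measure runs concatPlus under the benchmark framework and returns the result.
func measure() testing.BenchmarkResult {
	return testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			concatPlus("this is just a string", 100)
		}
	})
}

func main() {
	r := measure()
	fmt.Println("iterations (column 2):", r.N)
	fmt.Println("ns/op      (column 3):", r.NsPerOp())
	fmt.Println("B/op       (column 4):", r.AllocedBytesPerOp())
	fmt.Println("allocs/op  (column 5):", r.AllocsPerOp())
}
```

The printed numbers depend on the machine and the Go version, so no expected output is shown.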

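From the above tests, you can see that strings.Builder performs best by a wide margin: about 1.7 ms per operation versus about 97 ms, and about 11.7 MB allocated versus about 1.09 GB, which is roughly 93 times less memory than concatenating strings with a plus sign.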

2. Why such a big difference in performance

As you can see from the benchmark above, performance varies greatly depending on how strings are concatenated.

Go strings are immutable, so concatenating with a plus sign allocates a new string and copies both operands into it every time. strings.Builder instead appends to an internal byte buffer that grows automatically as more data is written, so most writes need no new allocation.
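To get a feel for the cost of allocating on every +, here is a rough estimate rather than a measurement. It assumes that the i-th append allocates a fresh string of i times the length of the appended string, and it ignores allocator rounding and compiler optimizations. The function name estimatePlusBytes is made up for this sketch.

```go
package main

import "fmt"

// estimatePlusBytes estimates the total bytes allocated when a string of
// length strLen is appended n times with +=, assuming the i-th append
// allocates a new string of i*strLen bytes.
func estimatePlusBytes(strLen, n int) int {
	total := 0
	for i := 1; i <= n; i++ {
		total += i * strLen
	}
	return total
}

func main() {
	// "this is just a string" is 21 bytes long.
	fmt.Println(estimatePlusBytes(21, 10000)) // prints 1050105000
}
```

The estimate of about 1.05 GB is close to the 1086401355 B/op reported for BenchmarkPlus, which suggests that nearly all of the memory goes into intermediate strings that are thrown away immediately.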

The underlying storage of strings.Builder is a []byte that grows through append. The initial allocation is 32 bytes, and the capacity is then doubled with each expansion.

type Builder struct {
	addr *Builder // of receiver, to detect copies by value
	buf  []byte   // accumulated bytes; grows via append
}

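Once the length reaches 2048, the buffer is no longer doubled outright. Instead it grows by a multiple of 640 each time: 640 for the first increase, 1280 for the second, and so on. These figures, like the initial size, come from the runtime's slice growth policy and may differ between Go releases.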
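Because the exact growth steps depend on the Go version, one way to check them on your own toolchain is to record the Builder's capacity as it changes. This is a minimal sketch; the helper name capacitySteps is an assumption of this example, while Cap and WriteString are methods of strings.Builder.

```go
package main

import (
	"fmt"
	"strings"
)

// capacitySteps writes str into a strings.Builder n times and records the
// capacity every time it changes.
func capacitySteps(str string, n int) []int {
	var sb strings.Builder
	var steps []int
	last := sb.Cap()
	for i := 0; i < n; i++ {
		sb.WriteString(str)
		if c := sb.Cap(); c != last {
			steps = append(steps, c)
			last = c
		}
	}
	return steps
}

func main() {
	steps := capacitySteps("this is just a string", 10000)
	fmt.Println("number of expansions:", len(steps))
	fmt.Println("capacities:", steps)
}
```

Whatever the exact sequence, the number of expansions stays far below the number of writes, which is where the savings come from.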

This makes strings.Builder much more efficient than the plus sign when concatenating a large number of strings.

bytes.Buffer is a similar type with comparable performance, but strings.Builder is still recommended when the goal is purely to build a string, since its String method returns the accumulated bytes without copying them.
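For comparison, here is a small sketch of the same loop written against both types. The function names repeatWithBuffer and repeatWithBuilder are hypothetical; the methods used are the standard WriteString and String.

```go
package main

import (
	"bytes"
	"fmt"
	"strings"
)

// repeatWithBuffer appends str n times using bytes.Buffer.
func repeatWithBuffer(str string, n int) string {
	var buf bytes.Buffer
	for i := 0; i < n; i++ {
		buf.WriteString(str)
	}
	return buf.String() // copies the buffered bytes into a new string
}

// repeatWithBuilder appends str n times using strings.Builder.
func repeatWithBuilder(str string, n int) string {
	var sb strings.Builder
	for i := 0; i < n; i++ {
		sb.WriteString(str)
	}
	return sb.String() // returns the accumulated bytes without copying
}

func main() {
	a := repeatWithBuffer("ab", 3)
	b := repeatWithBuilder("ab", 3)
	fmt.Println(a, b, a == b) // prints "ababab ababab true"
}
```

The two versions are nearly interchangeable in code; bytes.Buffer is the better fit when you also need to read from the buffer or work with byte slices.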

3. Best practices for concatenating strings

Although strings.Builder performs well, it is not the right tool for every scenario. If it's just a simple string concatenation, use the plus sign.

If some string formatting is involved, fmt.Sprintf is more appropriate.
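A short sketch of both cases follows. The functions greeting and summary, and the values passed to them, are invented for illustration.

```go
package main

import "fmt"

// greeting joins two short strings; + is the simplest choice here.
func greeting(name string) string {
	return "Hello, " + name
}

// summary mixes strings and numbers, which is where fmt.Sprintf helps.
func summary(name string, count int, ratio float64) string {
	return fmt.Sprintf("%s: %d items (%.1f%%)", name, count, ratio)
}

func main() {
	fmt.Println(greeting("Go"))           // prints "Hello, Go"
	fmt.Println(summary("cart", 3, 42.5)) // prints "cart: 3 items (42.5%)"
}
```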

When a large number of strings are concatenated, use strings.Builder. As the string keeps growing, the Builder's underlying storage has to be expanded repeatedly. If you can predict the final length of the string, call Grow to allocate the memory in advance and reduce the number of expansions.

Add a test case:

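// BenchmarkStringBuilderPre measures strings.Builder with capacity reserved up front.
func BenchmarkStringBuilderPre(b *testing.B) {
	str := "this is just a string"
	for i := 0; i < b.N; i++ {
		stringBuilderPre(str)
	}
}

// stringBuilderPre appends str 100,000 times after calling Grow.
func stringBuilderPre(str string) string {
	builder := strings.Builder{}
	// Reserve 1,000,000 bytes. The final string is about 2.1 MB
	// (100,000 x 21 bytes), so a few expansions still occur.
	builder.Grow(1000000)
	for i := 0; i < 100000; i++ {
		builder.WriteString(str)
	}
	return builder.String()
}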

Here are the results of the benchmark:

pkg: zxin.com/zx-demo/string_benchmark
BenchmarkPlus-12                   12   96676019 ns/op   1086401676 B/op   10057 allocs/op
BenchmarkSPrintf-12                12   96693407 ns/op   1086402022 B/op   10058 allocs/op
BenchmarkStringBuilder-12         607    1822282 ns/op     11671543 B/op      35 allocs/op
BenchmarkStringBuilderPre-12      860    1393689 ns/op      8257539 B/op       5 allocs/op

As you can see, reserving capacity in advance improves performance: the running time drops from about 1.82 ms to about 1.39 ms per operation, allocated memory drops from about 11.7 MB to about 8.3 MB, and the number of allocations drops from 35 to 5.
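When all the pieces are known before concatenation starts, the required size can be computed exactly instead of guessed. The following is a minimal sketch under that assumption. The names joinAll, joinAllNoGrow, and sink are hypothetical, and testing.AllocsPerRun from the standard library is used only to compare allocation counts.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// sink keeps results alive so the calls below are not optimized away.
var sink string

// joinAll concatenates parts after reserving exactly the space they need.
func joinAll(parts []string) string {
	total := 0
	for _, p := range parts {
		total += len(p)
	}
	var sb strings.Builder
	sb.Grow(total) // one up-front allocation sized to the final string
	for _, p := range parts {
		sb.WriteString(p)
	}
	return sb.String()
}

// joinAllNoGrow concatenates parts and lets the Builder expand on demand.
func joinAllNoGrow(parts []string) string {
	var sb strings.Builder
	for _, p := range parts {
		sb.WriteString(p)
	}
	return sb.String()
}

func main() {
	parts := make([]string, 1000)
	for i := range parts {
		parts[i] = "this is just a string"
	}
	withGrow := testing.AllocsPerRun(100, func() { sink = joinAll(parts) })
	without := testing.AllocsPerRun(100, func() { sink = joinAllNoGrow(parts) })
	fmt.Println("allocations with Grow:", withGrow)
	fmt.Println("allocations without Grow:", without)
}
```

The exact counts depend on the Go version, but the pre-sized version should need no more allocations than the version that grows on demand.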

Written by Rayjun