The problem

While coding yesterday, I discovered a feature of the Go language that I hadn’t noticed before.

Take a look at the code below:

import "strings"

func someFunc() {
    s := "some words"
    for _, c := range s {
        // c is a rune here, so this call does not type-check
        strings.Contains("another string", c)
    }
    }
}

This code does not compile: the signature of Contains is func Contains(s, substr string) bool, but c has type rune, not string.


Why

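"Rune" is the English word for the magical glyphs that anyone who plays fantasy RPGs will recognize. In Go, a rune represents a Unicode code point (the rune type is an alias for int32). A for range loop over a string automatically decodes it as UTF-8: the first loop variable is the byte position at which each character starts, and the second is the character itself, as a rune.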

Take this code for example (from the official documentation: golang.org/doc/effecti…):

for pos, char := range "日本\x80語" { // \x80 is an illegal UTF-8 encoding
    fmt.Printf("character %#U starts at byte position %d\n", char, pos)
}

prints

character U+65E5 '日' starts at byte position 0
character U+672C '本' starts at byte position 3
character U+FFFD '�' starts at byte position 6
character U+8A9E '語' starts at byte position 7
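
To make the difference between byte positions and runes more concrete, here is a minimal sketch. The helper runeStarts is a hypothetical name introduced only for illustration; it collects the byte offsets that a for range loop reports for a string.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// runeStarts (hypothetical helper) returns the byte offset at which
// each rune of s begins, as reported by a for-range loop.
func runeStarts(s string) []int {
	var starts []int
	for i := range s {
		starts = append(starts, i)
	}
	return starts
}

func main() {
	s := "日本語"

	// len counts bytes; utf8.RuneCountInString counts code points.
	fmt.Println(len(s), utf8.RuneCountInString(s)) // 9 3

	// Indexing a string yields a single byte, not a character.
	fmt.Printf("%T %T\n", s[0], []rune(s)[0]) // uint8 int32

	// Each of these characters occupies three bytes in UTF-8.
	fmt.Println(runeStarts(s)) // [0 3 6]
}
```

The offsets jump by three because each of these characters takes three bytes in UTF-8, which is why the index from range should not be treated as a character count.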

The answer

So there are two ways to fix my code.

Method 1

import "strings"

func someFunc() {
    s := "some words"
    for _, c := range s {
        // string(c) converts the rune to its UTF-8 string form
        strings.Contains("another string", string(c))
    }
}

Convert c back to a string.

Method 2

import "strings"

func someFunc() {
    s := "some words"
    for _, c := range s {
        // ContainsRune takes the rune directly, with no conversion
        strings.ContainsRune("another string", c)
    }
}

Use the ContainsRune function.
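
As a rough check that the two methods behave the same on ordinary text, here is a small runnable sketch. The function names countWithConversion and countWithContainsRune are hypothetical, made up for this comparison; each counts how many characters of one string occur in another. For strings containing invalid UTF-8 the two approaches may not agree, so treat this as an illustration for valid input only.

```go
package main

import (
	"fmt"
	"strings"
)

// countWithConversion (hypothetical) counts the runes of s that occur
// in other, converting each rune to a string for strings.Contains.
func countWithConversion(s, other string) int {
	n := 0
	for _, c := range s {
		if strings.Contains(other, string(c)) {
			n++
		}
	}
	return n
}

// countWithContainsRune (hypothetical) does the same, but passes the
// rune straight to strings.ContainsRune.
func countWithContainsRune(s, other string) int {
	n := 0
	for _, c := range s {
		if strings.ContainsRune(other, c) {
			n++
		}
	}
	return n
}

func main() {
	fmt.Println(countWithConversion("some words", "another string"))   // 7
	fmt.Println(countWithContainsRune("some words", "another string")) // 7
}
```

ContainsRune avoids building a temporary one-character string on every iteration, so it is probably the tidier choice when a rune is already in hand.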


Conclusion

Programmers working in non-English-speaking countries tend to have more experience with Unicode, and I rarely have to deal with it myself. Still, I like how Go separates []byte, string, and rune, which makes character processing very easy. As a former Python 2 programmer, I feel entitled to say so.
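
As a small illustration of how the three types fit together, here is a sketch that converts between them. The helper reverseRunes is a hypothetical example, not part of the standard library; it reverses a string by code point rather than by byte.

```go
package main

import "fmt"

// reverseRunes (hypothetical helper) returns s with its code points
// in reverse order. Working on []rune keeps multi-byte characters intact.
func reverseRunes(s string) string {
	r := []rune(s)
	for i, j := 0, len(r)-1; i < j; i, j = i+1, j-1 {
		r[i], r[j] = r[j], r[i]
	}
	return string(r)
}

func main() {
	s := "Go語"

	// []byte(s) exposes the UTF-8 bytes; []rune(s) exposes the code points.
	fmt.Println(len([]byte(s)), len([]rune(s))) // 5 3

	fmt.Println(reverseRunes(s)) // 語oG
}
```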

space.bilibili.com/16696495

Feel free to follow my WeChat official account and my Bilibili channel!