Problems with writing Go code [Issue 3]

I have a habit of recording the problems I run into while programming as they happen (the scene of the problem, its cause, and my analysis), and I like to review and summarize what I gained and lost while coding over a period of time. This can include a new understanding of language syntax, a pitfall I fell into, or a few tips and practices that improved my work. When I have recorded and summarized enough and it feels valuable, I write it up for the blog; small points, things that are still unclear, and ideas that cannot yet be organized into a coherent whole stay in my notes for later. The “Problems With Writing Go Code” series is based on this idea.

In this article I have divided the problems into three categories: language, libraries and tools, and practice, which should make them easier to read and understand. In addition, we first take a look at the current state of the Go language, drawing on Twitter, Reddit, the golang-dev forum, issues and CLs of the Go project on GitHub, and talks from various GopherCon events.

Zero. Current status of the Go language

1. vgo

Go 1.10 was officially released during the Chinese Lunar New Year. Then the Go Team entered the Go 1.11 development cycle.

In the 2017 Go user survey, the lack of a good package management tool and the lack of generics ranked as the top two challenges Gophers faced. The Go team is finally taking them seriously, especially package dependency management. At the end of February this year, Russ Cox published a series of seven blog posts detailing the design and implementation of vgo, a version-aware Go command-line tool, and formally submitted the “versioned Go” proposal at the end of March.

Currently, relatively mature package management schemes are as follows:

"Semantic versioning" + manifest file (a manually maintained description of dependency constraints) + lock file (a tool-generated description of transitive dependencies) + version selection engine (such as gps, the Go Packaging Solver, in dep)

In contrast, vgo both inherits and innovates. It inherits support for semantic versioning, introduces new mechanisms such as Semantic Import Versioning and Minimal Version Selection, and stays compatible with Go 1 syntax. According to Russ Cox’s plan, Go 1.11 is likely to ship an experimental vgo implementation (presumably merged into the go tool) for Gophers to try out and give feedback on; then, like vendor, it would gradually become the default in subsequent Go releases.

2. wasm porting

Richard Musiol, author of the popular open source project GopherJS, submitted a proposal last month: WebAssembly architecture for Go. Its purpose is to let Gophers write front-end code in Go, so that code written in Go can run in a browser. This does not make Go run directly in a browser or Node.js the way JavaScript does. Instead, Go is compiled to WebAssembly (wasm) intermediate bytecode, which runs in a runtime environment initialized by the browser or Node.js. Here is a sketch comparing a Go app compiled to native machine code with one compiled to wasm intermediate code; I hope it is useful:

The wasm port has had its first commit and is likely to release its first version together with Go 1.11.

3. Non-cooperative Goroutine preemptive scheduling

Currently, goroutine “preemptive” scheduling relies on “cooperative preemption points” that the compiler automatically inserts into functions, but this approach still has various problems, such as performance degradation from the checks, strange whole-program latency issues, and debugging difficulties. Austin Clements, who is responsible for the design and implementation of the GC in the Go runtime, recently put forward a proposal: Non-cooperative goroutine preemption. The proposal would remove cooperative preemption points and instead implement goroutine preemption by building and recording stack and register maps for every instruction. It is expected to be implemented in Go 1.12.

4. History and future of Go

At GopherConRu 2018, Brad Fitzpatrick, a key member of the Go team, gave a keynote titled “The History and Future of Go”. In it, bradfitz “revealed” several possibilities for Go2. Given his position on the Go team, these possibilities are quite credible:

1). Never split the community the way Perl6 and Python3 did
2). Go1 packages can import Go2 packages
3). Ian Lance Taylor should lead the proposal
4). Go2 will improve error handling, but not in the form of try-catch
5). Compared with Go1, Go2 will make significant changes in only 1-3 areas
6). Go2 will likely have a new standard library that is smaller than the existing one, with many features moved outside it
7). But Go2 will publish a list of the most popular, recommended, and possibly certified common packages outside the standard library; these can be updated continuously, unlike packages in the standard library, which can only be updated every six months

One. Language

1. Use len(channel)

len is a built-in function of the Go language. It accepts an argument of array, slice, map, string, or channel type and returns the length of that value as an integer:

len(s)
    If s is a string, len(s) returns the number of bytes in the string.
    If s is an array of type [n]T or a pointer to an array of type *[n]T, len(s) returns the array length n.
    If s is a slice of type []T, len(s) returns the current length of the slice.
    If s is a map of type map[K]T, len(s) returns the number of defined keys in the map.
    If s is a channel of type chan T, len(s) returns the number of elements queued (not yet read) in the channel buffer.

In our code we often see len called on arrays, slices, and strings, while len is rarely used with a channel. Does that mean len(channel) is unusable? Let’s first look at the semantics of len(channel).

When a channel is unbuffered, len(channel) always returns 0;
When a channel is buffered, len(channel) returns the number of unread elements in the current channel.
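These two rules are easy to check. Below is a minimal sketch; the helper name chanState is made up for this illustration and is not part of the article's sample code.

```go
package main

import "fmt"

// chanState (hypothetical helper) reports len and cap of ch at the
// moment of the call.
func chanState(ch chan int) (int, int) {
	return len(ch), cap(ch)
}

func main() {
	unbuffered := make(chan int)
	buffered := make(chan int, 3)
	buffered <- 1
	buffered <- 2

	l, c := chanState(unbuffered)
	fmt.Println(l, c) // 0 0

	l, c = chanState(buffered)
	fmt.Println(l, c) // 2 3

	<-buffered // read one element out
	l, c = chanState(buffered)
	fmt.Println(l, c) // 1 3
}
```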

So len(channel) is only meaningful for a buffered channel. Semantically, len(channel) would be used for “full”, “has data”, and “empty” checks:

if len(channel) == 0 {
    // channel is empty?
}

if len(channel) > 0 {
    // channel has data available?
}

if len(channel) == cap(channel) {
    // channel is full?
}

As you can see, I put a question mark after the comments “empty”, “data available”, and “full” in the code above. A channel is used for communication between goroutines. Once several goroutines read and write the same channel, len(channel) creates a race condition between them: if you rely on len(channel) alone to judge the queue state, there is no guarantee that the state is still the same by the time you actually read or write the channel. Take the empty check as an example:

As the figure above shows, Goroutine1 uses len(channel) to check that the channel is not empty and then tries to read from it. But before Goroutine1 actually reads, another goroutine, Goroutine2, has already read the data out, so Goroutine1’s read blocks on the channel and the subsequent logic breaks. Therefore, to avoid blocking on a channel, the common approach is to perform “check empty and read” or “check full and write” as one step, and to get this “transactional” behavior through select:

//writing-go-code-issues/3rd-issue/channel_len.go/channel_len.go.go
func readFromChan(ch <-chan int) (int, bool) {
    select {
    case i := <-ch:
        return i, true
    default:
        return 0, false // channel is empty
    }
}

func writeToChan(ch chan<- int, i int) bool {
    select {
    case ch <- i:
        return true
    default:
        return false // channel is full
    }
}


We can see that, thanks to the select-default trick, readFromChan does not block when the channel is empty and writeToChan does not block when the channel is full. This approach fits most situations, but it has one “problem”: it changes the state of the channel, because an element is read out or written in. Sometimes we do not want that; we simply want to detect the state of the channel without changing it. Unfortunately, no single method works in every situation, but in certain scenarios we can use len(channel). Take this scenario:

This is a “multiple producers + one consumer” scenario. The controller is a master goroutine that initially checks whether the channel holds any message. If it does, the controller does not consume the message itself; it creates a consumer to consume messages until the consumer exits for some reason and control returns to the controller. The controller does not immediately create a new consumer. Instead, it waits until the channel has a message again. In such a scenario, we can use len(channel) to determine whether there is a message.

2. Formatted output of time

Formatting time for output is a common problem in everyday programming. In the past, when programming in C, we used strftime. Let’s recall the C code:

// writing-go-code-issues/3rd-issue/time-format/strftime_in_c.c
#include <stdio.h>
#include <time.h>

int main() {
    time_t now = time(NULL);
    struct tm *localTm;
    localTm = localtime(&now);

    char strTime[100];
    strftime(strTime, sizeof(strTime), "%Y-%m-%d %H:%M:%S", localTm);
    printf("%s\n", strTime);
    return 0;
}

The output of this C code is:

2018-04-04 16:07:00

We see that strftime uses character placeholders (such as %Y and %m) to “spell out” the layout of the target output (for example, “%Y-%m-%d %H:%M:%S”). This approach is used not only in C but also in many other mainstream languages, such as shell, Python, Ruby, and Java; it seems to have become the standard for time formatting across programming languages. These placeholders are relatively easy to remember because the characters (Y, M, H, for example) are the initial letters of the corresponding English words.

But if you use strftime’s “standards” in Go, you’re going to cry foul the moment you see the output.

// writing-go-code-issues/3rd-issue/time-format/timeformat_in_c_way.go
package main

import (
    "fmt"
    "time"
)

func main() {
    fmt.Println(time.Now().Format("%Y-%m-%d %H:%M:%S"))
}

The output of the above Go code is as follows:

%Y-%m-%d %H:%M:%S

Go prints the “time format placeholder string” intact!

This is because Go takes a different approach to time formatting than strftime. Go’s designers reasoned as follows: even though each strftime placeholder is the initial letter of the corresponding word, it is hard to actually spell out a complex time format without opening the strftime manual or looking at a web version of the strftime cheat sheet. And given a format string such as “%Y-%m-%d %H:%M:%S”, it is hard to work out the formatted result in your head without consulting the documentation. For example, what is the difference between %Y and %y, or between %M and %m?

Go uses a more intuitive “reference time” instead of strftime’s various placeholders. A “time format string” built from the reference time looks “exactly like” the final output, which saves the programmer from parsing the format string again in his head:

Format string: "2006 01 02 15:04:05" => Output result: 2018 04 18:13:08

The standard reference times are as follows:

2006-01-02 15:04:05 PM -07:00 Jan Mon MST

This absolute time has no practical significance in itself. To make it easy to remember, we can write the reference time in another output format:

01/02 03:04:05PM '06 -0700

Here we can see the care of Go’s designers: the fields of this time happen to be the mnemonic numbers in ascending order (from 01 to 07). It can be read as: 01 corresponds to %m, 02 corresponds to %d, and so on. The following figure shows the relationship between the “reference time”, the “format string”, and the final formatted output:

In my own experience with Go, I still have to run go doc on the time package or open its web manual when formatting time output, especially when building a slightly more complex format. Judging from community feedback, many Gophers have had similar experiences, especially those used to the strftime format. Someone even created a “Fucking Go Date Format” page to help convert strftime formats to Go time layouts automatically.

The following cheatsheet can also help (it is generated from the output of writing-go-code-issues/3rd-issue/time-format/timeformat_cheatsheet.go):

Two. Libraries and tools

1. A pitfall in golang.org/x/text/encoding/unicode

In the gocmpp project, I needed character set conversions: UTF-8 to UCS-2 (UTF-16), UCS-2 to UTF-8, UTF-8 to GB18030, and so on. I used the encoding/unicode and transform packages under golang.org/x/text to implement them. x/text is a text processing toolkit maintained by the Go team, and it contains operations on the Unicode character set.

To convert UTF-8 to UCS-2 (UTF-16), code like the following is enough (this was my original implementation):

func Utf8ToUcs2(in string) (string, error) {
    if !utf8.ValidString(in) {
        return "", ErrInvalidUtf8Rune
    }

    r := bytes.NewReader([]byte(in))
    // UTF-16 big-endian, no BOM
    t := transform.NewReader(r, unicode.All[1].NewEncoder())
    out, err := ioutil.ReadAll(t)
    if err != nil {
        return "", err
    }
    return string(out), nil
}

Note that the unicode.All slice holds all the UTF-16 formats:

var All = []encoding.Encoding{
    UTF16(BigEndian, UseBOM),
    UTF16(BigEndian, IgnoreBOM),
    UTF16(LittleEndian, IgnoreBOM),
}

Here I originally used All[1], which is UTF16(BigEndian, IgnoreBOM), and everything is fine.

Not long ago, however, I updated the text project to the latest version and found that the unit tests failed:

--- FAIL: TestUtf8ToUcs2 (0.00s)
    utils_test.go:58: The first char is fe, not equal to expected 6c
FAIL
FAIL    github.com/bigwhite/gocmpp/utils    0.008s

After some searching I found that the golang.org/x/text/encoding/unicode package had made an incompatible change. The unicode.All slice above had become the following:

// All lists a configuration for each IANA-defined UTF-16 variant.
var All = []encoding.Encoding{
    UTF8,
    UTF16(BigEndian, UseBOM),
    UTF16(BigEndian, IgnoreBOM),
    UTF16(LittleEndian, IgnoreBOM),
}

A UTF8 element was inserted at the beginning of the All slice, so All[1] in my code now selects UTF16(BigEndian, UseBOM) instead of UTF16(BigEndian, IgnoreBOM). The encoded output therefore starts with a byte order mark, which explains the unexpected first byte fe in the test.

How to fix it? This time I use UTF16(BigEndian, IgnoreBOM) directly instead of indexing into the All slice:

func Utf8ToUcs2(in string) (string, error) {
    if !utf8.ValidString(in) {
        return "", ErrInvalidUtf8Rune
    }

    r := bytes.NewReader([]byte(in))
    // UTF-16 big-endian, no BOM
    t := transform.NewReader(r, unicode.UTF16(unicode.BigEndian, unicode.IgnoreBOM).NewEncoder())
    out, err := ioutil.ReadAll(t)
    if err != nil {
        return "", err
    }
    return string(out), nil
}
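Why the first byte turned into fe can be seen with a small sketch that uses only the standard library (unicode/utf16 and encoding/binary) instead of x/text. The helper toUTF16BE and the test character “汉” (U+6C49) are assumptions chosen for this illustration; the input used by the real unit test is not shown in this article.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"unicode/utf16"
)

// toUTF16BE (hypothetical helper) encodes s as big-endian UTF-16,
// optionally prefixed with the byte order mark U+FEFF.
func toUTF16BE(s string, withBOM bool) []byte {
	units := utf16.Encode([]rune(s))
	out := make([]byte, 0, 2*len(units)+2)
	if withBOM {
		out = append(out, 0xfe, 0xff) // big-endian BOM
	}
	var b [2]byte
	for _, u := range units {
		binary.BigEndian.PutUint16(b[:], u)
		out = append(out, b[:]...)
	}
	return out
}

func main() {
	fmt.Printf("% x\n", toUTF16BE("汉", false)) // 6c 49
	fmt.Printf("% x\n", toUTF16BE("汉", true))  // fe ff 6c 49
}
```

With the BOM in front, the first byte of the result is fe rather than the first byte of the first character.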

This way, if the All slice changes any more, my code won’t be affected.

2. Logrus customized output of unstructured logs

In the first article of this series, I mentioned using logrus + lumberjack to implement logging with rotation support.

By default the log output format is like this (writing-go-code-issues/3rd-issue/logrus/logrus2lumberjack_default.go):

time="2018-04-05T06:08:53+08:00" level=info msg="logrus log to lumberjack in normal text formatter"

This relatively structured log is suitable for later centralized log analysis. However, such logs carry a lot of “meta information” (time, level, msg), which is not welcome in every situation, so we want plain unstructured log output. We customize the formatter:

// writing-go-code-issues/3rd-issue/logrus/logrus2lumberjack.go
func main() {
    customFormatter := &logrus.TextFormatter{
        FullTimestamp:   true,
        TimestampFormat: "2006-01-02 15:04:05",
    }
    logger := logrus.New()
    logger.Formatter = customFormatter

    rotateLogger := &lumberjack.Logger{
        Filename: "./foo.log",
    }
    logger.Out = rotateLogger
    logger.Info("logrus log to lumberjack in normal text formatter")
}

We use TextFormatter and customize the timestamp format to produce the following output:

time="2018-04-05 06:22:57" level=info msg="logrus log to lumberjack in normal text formatter"

The log is still not in the form we want. But the same customFormatter writing to a terminal produces exactly what we want:

//writing-go-code-issues/3rd-issue/logrus/logrus2tty.go

INFO[2018-04-05 06:26:16] logrus log to tty in normal text formatter

So how do we configure TextFormatter so that the log written through lumberjack has the format we want? We had to dig into the logrus source code, where we found this:

// github.com/sirupsen/logrus/text_formatter.go

// Format renders a single log entry
func (f *TextFormatter) Format(entry *Entry) ([]byte, error) {
    ... ...
    isColored := (f.ForceColors || f.isTerminal) && !f.DisableColors

    timestampFormat := f.TimestampFormat
    if timestampFormat == "" {
        timestampFormat = defaultTimestampFormat
    }
    if isColored {
        f.printColored(b, entry, keys, timestampFormat)
    } else {
        if !f.DisableTimestamp {
            f.appendKeyValue(b, "time", entry.Time.Format(timestampFormat))
        }
        f.appendKeyValue(b, "level", entry.Level.String())
        if entry.Message != "" {
            f.appendKeyValue(b, "msg", entry.Message)
        }
        for _, key := range keys {
            f.appendKeyValue(b, key, entry.Data[key])
        }
    }

    b.WriteByte('\n')
    return b.Bytes(), nil
}

If isColored is false, the output is a structured log with time, level, and msg. Only when isColored is true do we get the plain log we want. The value of isColored depends on three fields: ForceColors, isTerminal, and DisableColors. Let’s set them to a combination that makes isColored true. isTerminal is automatically false because the output goes to a file.

// writing-go-code-issues/3rd-issue/logrus/logrus2lumberjack_normal.go
func main() {
    // isColored := (f.ForceColors || f.isTerminal) && !f.DisableColors
    customFormatter := &logrus.TextFormatter{
        FullTimestamp:   true,
        TimestampFormat: "2006-01-02 15:04:05",
        ForceColors:     true,
    }
    logger := logrus.New()
    logger.Formatter = customFormatter

    rotateLogger := &lumberjack.Logger{
        Filename: "./foo.log",
    }
    logger.Out = rotateLogger
    logger.Info("logrus log to lumberjack in normal text formatter")
}
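The condition can also be checked in isolation. The standalone function below only mirrors the boolean expression quoted from text_formatter.go; it is a sketch for reasoning about the combinations and is not part of logrus.

```go
package main

import "fmt"

// isColored mirrors the expression
// (f.ForceColors || f.isTerminal) && !f.DisableColors
// as a free function, purely for illustration.
func isColored(forceColors, isTerminal, disableColors bool) bool {
	return (forceColors || isTerminal) && !disableColors
}

func main() {
	// Output goes to a file, so isTerminal is false in every case below.
	fmt.Println(isColored(false, false, false)) // false: structured output
	fmt.Println(isColored(true, false, false))  // true: plain output
	fmt.Println(isColored(true, false, true))   // false: DisableColors wins
}
```

So when writing to a file, ForceColors must be true and DisableColors must stay false.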

When we set ForceColors to true, we get the output we expect in foo.log:

INFO[2018-04-05 06:33:22] logrus log to lumberjack in normal text formatter

Three. Practice

1. Timeouts when reading network data: SetReadDeadline as an example

Go is a natural fit for network programming, but the complexity of network programming is well known, and many factors must be considered to write a stable and efficient network program. One of them is the timeout when reading data from a socket.

Go’s standard network library does not implement the kind of “idle timeout” that is implemented with epoll; it provides a deadline mechanism instead. Let’s use a picture to compare the two mechanisms:

Parts a) and b) of the figure above show the “idle timeout” mechanism. With an idle timeout, the timer only runs while no data is ready (as in a). If data is ready to be read (as in b), the timeout is suspended until the data has been read. When the reader starts waiting for data again, the idle timeout starts again.

With a deadline (taking the read deadline as an example), a Read issued after the deadline returns a timeout error regardless of whether data is ready or a read is in progress, and all subsequent network reads also return timeout (d in the figure), unless SetReadDeadline(time.Time{}) is called to cancel the deadline or the deadline is reset before the next read. Go network programming usually follows a blocking model, so why use SetReadDeadline at all? Sometimes we want to give the caller a chance to notice other “exceptional” events, such as an exit notification from the main goroutine.
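The “sticky” nature of a deadline can be observed with an in-memory connection. The sketch below uses net.Pipe from the standard library in place of a real socket; the function deadlineDemo and the 20ms value are assumptions made for the illustration.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// isTimeout reports whether err is a network timeout error.
func isTimeout(err error) bool {
	ne, ok := err.(net.Error)
	return ok && ne.Timeout()
}

// deadlineDemo (hypothetical) reads twice after one SetReadDeadline,
// then clears the deadline and reads again.
func deadlineDemo() (first, second bool, n int, err error) {
	client, server := net.Pipe()
	defer client.Close()
	defer server.Close()

	buf := make([]byte, 8)
	server.SetReadDeadline(time.Now().Add(20 * time.Millisecond))

	_, err = server.Read(buf) // nobody writes: blocks until the deadline
	first = isTimeout(err)
	_, err = server.Read(buf) // the deadline is still in the past
	second = isTimeout(err)

	server.SetReadDeadline(time.Time{}) // cancel the deadline
	go client.Write([]byte("hi"))
	n, err = server.Read(buf) // reads normally again
	return first, second, n, err
}

func main() {
	fmt.Println(deadlineDemo()) // true true 2 <nil>
}
```

Both reads after the expired deadline time out; only after the deadline is cleared does Read behave as a plain blocking read again.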

The Deadline mechanism is prone to error. Here are two examples:

A) Assuming that one SetReadDeadline call gives every subsequent Read an idle timeout

In the figure above, the process reads one complete business packet with three Read calls, but SetReadDeadline is called only before the first Read. This usage gives a full “idle timeout” only to Read A, and only while data A is not yet ready. Once data A is ready and has been read, expecting a full idle timeout for reading data B and C misunderstands what SetReadDeadline means. Therefore, to get a full “idle timeout” for every Read, the deadline must be reset before each Read.

B) Handling exceptions when a complete “business packet” is read in several Reads

In this picture, the deadline is reset before each Read. Is that enough? For business logic that reads a “complete business packet” in one procedure, we also have to handle the exceptions of each Read, especially timeouts. In this example there are three Read positions whose exception handling must be considered.

If Read A reads no data before the deadline expires and returns timeout, this is the easiest case: the previous complete packet has already been read and the new one has not arrived yet. The outer control logic receives the timeout and simply starts the read procedure again.

If Read B or Read C reads no data before the deadline expires, exception handling becomes trickier, because part of a complete packet (A) has already been read from the stream. The remaining data is not a complete business packet, so the outer control logic cannot simply restart the read procedure. Either we retry Read B or Read C several times until the complete packet has been read, or we consider a timeout at B or C unreasonable and return an error code different from the one at A, so that the outer control logic can decide whether the connection is abnormal.

Note: The sample code for this article can be downloaded here.

Weibo: @tonybai_cn; WeChat public account: iamtonybai; GitHub: https://github.com/bigwhite
