Introduction | Error handling in backend services written in Go has long been done in many different ways. This article discusses and proposes a complete scheme for passing, returning, and tracing errors, from inside a service to outside it. It is offered as a starting point, and I hope to share and discuss it with everyone.

First, raising the question

In backend development, error handling has three dimensions that need to be addressed:

  1. Error handling within a function: how a function handles the various errors it encounters during execution. This is a language-level problem.

  2. Returning error information from a function/module: after a function fails, how to return the error information gracefully so that the caller can handle it easily (and just as gracefully). This is also a language-level problem.

  3. Returning error information from a service/system: when a microservice/system fails to process a request, how to return a friendly error message that the caller can still understand and handle gracefully. This is a service-level problem and applies to any language.

Error handling within functions

A procedural function needs to handle different errors at different steps of its processing; an object-oriented function may need to handle the different types of errors returned by a single operation. In addition, when an error is encountered, an assertion can be used to stop the function quickly, which greatly improves the readability of the code.

Many high-level languages provide try ... catch syntax, which lets a function implement unified error handling logic. Even a "mid-level language" such as C, which has no try ... catch, lets programmers implement some degree of error assertion with macros. In Go, however, the situation is more awkward.

(1) Error assertions in Go

Let's start with assertions. Our goal is to check for an error and terminate the current function with a single line of code. Since Go has neither throw nor macros, there are two ways to write a one-line assertion.

The first is to write the if error check on one line, as in:

if err != nil { return err }

The second method is to borrow the panic function, combined with recover:

func SomeProcess() (err error) {
    defer func() {
        if e := recover(); e != nil {
            err = e.(error)
        }
    }()

    assert := func(cond bool, f string, a ...interface{}) {
        if !cond {
            panic(fmt.Errorf(f, a...))
        }
    }

    // ...

    err = DoSomething()
    assert(err == nil, "DoSomething() error: %w", err)

    // ...
}

Both approaches are questionable.

First, the problems with the one-line if are:

  1. Although this style is technically allowed by Go's coding conventions, keeping the braces on the same line is still controversial in practice, and I have rarely seen it in real code.

  2. It is not intuitive, and it is inconvenient to put other statements inside the braces, because Go conventions strongly discourage using ; to separate multiple statements (except in the if condition).

As for the second method, it depends on the situation:

  • First, panic is designed to be called only when a program or goroutine hits a critical error and cannot continue running at all (such as a segmentation fault or a race on a shared resource). This is equivalent to a FATAL-level error log in Linux. Using this mechanism merely for ordinary error handling (ERROR level) is overkill.
  • Compared with normal business logic, a panic call carries considerable overhead. Error handling can be routine logic, and frequent panic-recover operations can significantly reduce system throughput.

However, although panic-based assertions are rarely used in business logic, they are very common in tests. It is only testing, so why not use a sledgehammer? A little extra overhead is fine. GoConvey, a very popular unit testing framework for Go, uses the panic mechanism to implement assertions in unit tests, and people use it happily.

To sum up, I do not recommend using assertions in Go business code. When an error occurs, I recommend plainly using this format:

if err := DoSomething(); err != nil {
    // ...
}

In unit-test code, however, it is perfectly acceptable to adopt panic-based assertions like GoConvey's.

(2) try ... catch in Go

As is well known, Go (currently) has no try ... catch, and judging from official statements, there is no plan to consider it any time soon. But programmers need it. My approach is to make the err variable to be returned visible throughout the function by declaring it as a named return value, and then handle it uniformly with defer:

func SomeProcess() (err error) { // <-- note: err must be declared here
    defer func() {
        if err == nil {
            return
        }

        // The logic below plays the role of catch
        if errors.Is(err, somepkg.ErrRecordNotExist) {
            // This is an example: some caught errors are not errors
            // for this function, therefore err = nil
            err = nil
        } else if errors.Is(err, somepkg.ErrConnectionClosed) {
            // ...
            // If the connection is closed, you may need to reconnect here.
        } else {
            // ...
        }
    }()

    // ...

    if err = DoSomething(); err != nil {
        return
    }

    // ...
}

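This approach requires special attention to variable scope. For example, in the line if err = DoSomething(); err != nil {, if we change err = ... to err := ..., then the err on that line is no longer the same variable as the (err error) declared in the function signature; it shadows it. A bare return inside that block then fails to compile, because the result parameter is no longer in scope, and if the block does not explicitly return its err, the error is never assigned to the named return value, so the deferred function cannot catch it.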

As for try ... catch, I do not have a particularly good way to simulate it, and even the method above has a very annoying problem: writing the handling in defer puts the error processing first and the normal logic after it, which hurts readability. I therefore hope readers can offer suggestions, and that the Go team will keep iterating and eventually support this syntax.

Returning error information from functions/modules

On this point Go seemed fairly uniform at the start. From the very beginning Go defined the error type, which standardized how errors are returned at the function level within a process. The caller uses if err != nil to determine whether a call succeeded.

However, as Go spread, and because the error interface leaves so much freedom, programmers came to hold different views on "how to determine what the error is".

(1) Before Go 1.13

Prior to Go 1.13, there were three common styles of passing error types:

  • Enumerated-value school

This school simply defines each error as an enum-like value, for example:

var (
    ErrRecordNotExist   = errors.New("record not exist")
    ErrConnectionClosed = errors.New("connection closed")
    // ...
)

When an error occurs, the function simply returns the corresponding enumerated error value. Callers can then conveniently use switch-case to determine which error it is:

switch err {
case nil:
    // ...
case ErrRecordNotExist:
    // ...
default:
    // ...
}

Personally, I think this design is still, in essence, the C-style error code pattern.

  • Type assertion school

This school takes full advantage of the fact that error is an interface and defines its own error types. On the one hand, different types represent different kinds of errors. On the other hand, a type can carry richer information to the caller for the same kind of error. For example, we can define several different error types as follows:

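// Each error type embeds errImpl so that its pointer type gets the
// Error() method and therefore satisfies the error interface.
type ErrRecordNotExist struct{ errImpl }

type ErrPermissionDenined struct{ errImpl }

type ErrOperationTimeout struct{ errImpl }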

type errImpl struct {
    msg string
}

func (e *errImpl) Error() string {
    return e.msg
}

For the caller, the different errors are determined by the following code:

if err == nil {
    // OK
} else if _, ok := err.(*ErrRecordNotExist); ok {
    // handle the record-not-exist error
} else if _, ok := err.(*ErrPermissionDenined); ok {
    // handle permission errors
} else {
    // handle other types of errors
}

  • fmt.Errorf school

if err := DoSomething(); err != nil {
    return fmt.Errorf("DoSomething() error: %v", err)
}

This style can pass the underlying error through and, at the same time, add custom information. For the caller, however, it is a disaster: to determine the specific type of an error, the only option is strings.Contains(), and the text of an error is unreliable. The same kind of error may be worded differently, and the extra information each business layer adds through fmt.Errorf may differ as well. This makes error detection very unreliable and increases the coupling between modules.

(2) Go 1.13 and later

With the release of Go 1.13, wrapping support (the %w verb) was added to fmt.Errorf, and the Is() and As() functions were added to the errors package. Many articles have been written about the principles and uses of this pattern, so I will not cover them here.

This feature addresses the main weakness of the fmt.Errorf school: an error wrapped with fmt.Errorf() and %w can still be identified reliably with errors.Is(). In addition, it is an official endorsement of the type assertion school (specifically supported by the As() function).

In practice, functions/modules should pass errors through using Go's error wrapping, that is, fmt.Errorf() together with %w. Each business layer can then add its own information without worry, as long as callers use errors.Is() and errors.As() to examine the error.
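As a minimal sketch of this recommendation, the example below wraps errors with %w and classifies them with errors.Is() and errors.As(). The names PermissionError, find, handle, and classify are hypothetical and chosen only for this illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrRecordNotExist is an enum-like sentinel error.
var ErrRecordNotExist = errors.New("record not exist")

// PermissionError is a typed error that carries extra information.
type PermissionError struct{ User string }

func (e *PermissionError) Error() string { return "permission denied for " + e.User }

// find is a hypothetical low-level function.
func find(id int) error {
	switch id {
	case 1:
		return ErrRecordNotExist
	case 2:
		return &PermissionError{User: "guest"}
	}
	return nil
}

// handle adds its own context while keeping the original error reachable.
func handle(id int) error {
	if err := find(id); err != nil {
		return fmt.Errorf("handle(%d): %w", id, err)
	}
	return nil
}

// classify inspects the wrapped chain instead of the error text.
func classify(err error) string {
	var permErr *PermissionError
	switch {
	case err == nil:
		return "ok"
	case errors.Is(err, ErrRecordNotExist):
		return "not found"
	case errors.As(err, &permErr):
		return "denied: " + permErr.User
	default:
		return "other"
	}
}

func main() {
	for id := 0; id < 3; id++ {
		fmt.Println(classify(handle(id)))
	}
}
```

The caller never touches the message text, so each layer is free to reword its context.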

Returning error information from services/systems

(1) The traditional approach

At the service/system level, most protocols for returning error information can be seen as the code-message pattern or a variation of it:

  • code is either a number or a predefined string, and can be treated as an enumerated value of integer or string type.
  1. If it is a number, 0 indicates success in most cases; in a few cases a round decimal number such as 1000 or 10000 is used instead.
  2. If it is a predefined string, success is indicated by a string such as "success" or "OK", by an empty string, or even by omitting the field altogether.
  • The message field is a detailed description of the error, in most cases a human-readable sentence.

  • In general, the message field needs to be returned only when code indicates an error.

The characteristic of this pattern is that code is for programs: the caller's code uses it to determine what kind of error occurred and enters the corresponding branch. message is for people: it can be shown to the user or written to a log in some form.
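The code-message pattern can be sketched as a small JSON envelope. This is only an illustration under assumptions: the field names, the numeric code 1001, and the data field are hypothetical and not taken from any particular protocol.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Response is a hypothetical code-message envelope.
// message is omitted on success, data is omitted on failure.
type Response struct {
	Code    int         `json:"code"`
	Message string      `json:"message,omitempty"`
	Data    interface{} `json:"data,omitempty"`
}

// render serializes a response, falling back to a fixed body on failure.
func render(r Response) string {
	b, err := json.Marshal(r)
	if err != nil {
		return `{"code":-1,"message":"internal error"}`
	}
	return string(b)
}

func main() {
	fmt.Println(render(Response{Code: 0, Data: map[string]int{"id": 1}}))
	// {"code":0,"data":{"id":1}}
	fmt.Println(render(Response{Code: 1001, Message: "record not exist"}))
	// {"code":1001,"message":"record not exist"}
}
```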

(2) Existing problems

What are the problems at this level? code is for machines, and message is for people.

But sometimes we receive feedback from a user/customer with a question like: "XXX reported an error, could you check what the problem is?" Can users not understand our error messages?

In my experience, when we use the code-message pattern, especially in the early days of a business, it is hard to avoid cases where the copy designed by the front end and back end does not cover every error case, or where an error is extremely rare. When such an error occurs, the prompt is ambiguous (or is simply the raw error message), leaving the user unable to find a solution from it.

In this case, covering as many error paths as possible is of course the ideal approach. Until that is done, however, programmers usually fall back on one of the following solutions:

  • When an undefined error is encountered, the back end returns a uniform error code in code and puts the detailed error information in message. However, this approach has the following problems:
  1. If the client displays the message directly, the user may see a lot of text that they cannot understand (and do not need to understand), and the text may be very long (for example, when it is a panic message), which is very unfriendly.

  2. If the developer is not careful, the message may leak details of the application, such as the database user name and IP address in a DB connection failure message. Once sensitive information is exposed, the consequences range from mandatory security training at best to disciplinary action for crossing a red line at worst.

  • Similar to the above, the back end returns a uniform error code, but message is a generic text such as "Unknown error" or "Unknown error, please contact XXX". But then, how do we locate the error?
  1. If the caller is another module, the user must be a programmer. In this case, just provide the requestID/traceID to the caller.

  2. If the other party is an ordinary user, should the user press F12 and look at the console? On mobile there is no way to look at all. And if you expose the traceID to the user, who can remember that ID?

It’s hard to hide information and expose it at the same time…

(3) Solution

Here I take inspiration from the increasingly popular SMS verification codes: people's short-term memory copes well with four characters, so we can consider shortening the error code to four characters. The code is case-insensitive, because having to remember case would make it much harder.

How do we represent as much data as possible in four characters? Digits plus letters give 36 characters in total. In theory, a 4-digit base-36 number can represent 36 x 36 x 36 x 36 = 1679616 values. So we just need a hash algorithm for the error string whose output is limited to the range 0 to 1679615.

I will use MD5 as an example. MD5 output is 128 bits; in theory I could take the MD5 output modulo 1679616 to get a simple result. In practice, to avoid the division, I took a shortcut: keep only 20 bits of the digest by masking with 0xFFFFF (the maximum value of a 20-bit binary number is 1048575), and then print that number as a base-36 string.
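The arithmetic above can be checked with a few lines using only the standard library. The function name maxCode is hypothetical; the sketch just confirms that a 20-bit value always fits into four base-36 digits.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

const (
	codeSpace = 36 * 36 * 36 * 36 // number of distinct 4-digit base-36 codes
	mask      = 0xFFFFF           // largest 20-bit value
)

// maxCode returns the largest code a 20-bit hash can produce, in base 36.
func maxCode() string {
	return strings.ToUpper(strconv.FormatUint(mask, 36))
}

func main() {
	fmt.Println(codeSpace)        // 1679616
	fmt.Println(mask)             // 1048575
	fmt.Println(mask < codeSpace) // true
	fmt.Println(maxCode())        // MH33
}
```

Because the mask stays below 36^4, roughly 62% of the four-character space is used, and no code ever needs a fifth character.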

When an unexpected error occurs, we can display a message like this: "Unknown error, error code 30EV. If you need assistance, please contact XXX." By the way, 30EV is the result computed for "Access denied for user 'db_user'@'127.0.0.1'", so the sensitive information is hidden from the caller.

On the back end, you still need to record the hash value together with the specific error message in a log or another searchable channel. When the user provides the code, the problem can then be located quickly.

The advantages of this scheme are obvious:

  • The code is short enough for the user to remember and report back to the developers for debugging.
  • Because of the nature of hashing, the same error always produces the same code. Even if there is a collision, it can be resolved quickly as long as there are not too many distinct error messages.
  • Since the front end receives only four characters no matter how long the error message is, the back end can safely use the error wrapping mechanism of Go 1.13 and record as much detail as it needs.

The simple error code generation code is as follows:

import (
    // ...
    "github.com/martinlindhe/base36"
)

var (
    // replacer maps padding spaces and easily confused letters onto digits
    replacer = strings.NewReplacer(
        " ", "0",
        "O", "0",
        "I", "1",
    )
)

// ...

func Err2Hashcode(err error) (uint64, string) {
    u64 := hash(err.Error())
    codeStr := encode(u64)
    u64, _ = decode(codeStr)
    return u64, codeStr
}

func encode(code uint64) string {
    s := fmt.Sprintf("%4s", base36.Encode(code))
    return replacer.Replace(s)
}

func decode(s string) (uint64, bool) {
    if len(s) != 4 {
        return 0, false
    }
    s = strings.Replace(s, "l", "1", -1)
    s = strings.ToUpper(s)
    s = replacer.Replace(s)
    code := base36.Decode(s)
    return code, true
}

func hash(s string) uint64 {
    h := md5.Sum([]byte(s))
    u := binary.BigEndian.Uint32(h[0:16]) // only the first 4 bytes are read
    return uint64(u & 0xFFFFF)
}

Of course, this scheme also has limitations. I can think of the following two points to watch out for:

  • When generating an error, avoid including random data, non-repeatable data, or highly variable data such as timestamps, account numbers, and serial numbers, so that users get the same error code whenever they perform the same operation.
  • Since the digit 1 resembles the letter I and the digit 0 resembles the letter O, a uniform conversion is needed to avoid ambiguity. This is why, in Err2Hashcode, the hash result is decoded again before being returned.

In addition, I want to stress again that during development, distinct and properly defined error codes and messages still need to cover all error cases, and the existing code-message mechanism should be used as much as possible to give the caller clear information. This hash-based error code is suitable only as a temporary measure for finding and debugging missed error cases during delivery iterations.

About the author

Zhang Min

Senior backend engineer at Tencent.