Why do you need timeout control?

A common pattern in cascading failures is a server spending heavy resources on requests whose client deadline has already passed. That work produces nothing of value: the client has given up, so there is no point in responding to a request that has already expired.

Timeout control is an important line of defense for service stability. At its core it means failing fast. A good timeout strategy drops slow requests as early as possible and frees their resources quickly, which keeps requests from piling up.

Passing timeouts between services

If a request has multiple phases, such as a series of RPC calls, the service should check the deadline before each phase to confirm there is enough time left to process the request, so that it does not start work that is already futile.

A common mistake is to set a fixed timeout for each RPC service. Instead, the timeout should be passed from service to service, and it can be set once at the top of the call chain. For example, set the timeout at the top layer of the request to 3s. Service A calls service B, service B calls service C, service C calls service D, which takes 500ms, and so on, with each hop working against whatever remains of the original 3s. Ideally, the same timeout passing mechanism is used throughout the call chain.

Without timeout passing, the following happens:

  1. Service A sends a request to service B with a timeout of 3s.
  2. Service B takes 2s to process the request and then calls service C.
  3. With timeout passing, the timeout for service C would be the remaining 1s. Without it, service C gets the 3s hard-coded in the configuration.
  4. Service C runs for 2s. By this point the timeout set at the top layer has already expired, so any further work is meaningless.
  5. The call chain continues to service D anyway.

If service B passed its timeout along, service C would abandon the request as soon as the deadline arrived, because by then the client has probably already reported an error. When passing a deadline downstream, we typically shorten it a little, say by 100 milliseconds, to allow for network transmission time and for the processing the client does after it receives the reply.

Passing timeouts within a process

Timeouts need to be passed within a process as well as between services. Suppose one process calls MySQL, Redis, and service B in sequence, with a total request budget of 3s. The MySQL call takes 1s, which leaves 2s for the Redis call. Redis takes 500ms, which leaves 1.5s for the call to service B. Each middleware or service also has its own fixed timeout in the configuration file, so the timeout actually applied to each call is the smaller of the remaining time and the configured value.

Implementing timeout passing with context

The idea behind context is very simple, yet very powerful. The Go standard library supports context, and so do many open source frameworks. Context has become the standard way to carry a deadline, and timeout passing relies on it.

We typically start timeout control at the top of the service by creating the initial context with a timeout, for example 3s:

ctx, cancel := context.WithTimeout(context.Background(), time.Second*3)
defer cancel()

When the context is passed down, for example to the Redis call in the example above, get the request deadline with ctx.Deadline() and use whichever comes first, the request deadline or the Redis timeout:

// Start from the fixed timeout configured for Redis (3s here).
timeout := time.Now().Add(time.Second * 3)
// If the context carries an earlier deadline, use that instead.
if dl, ok := ctx.Deadline(); ok && dl.Before(timeout) {
	timeout = dl
}

Timeout passing between services mainly means passing the timeout along with RPC calls. gRPC needs no extra handling: it supports timeout passing itself, on the same principle as above. The following code is from grpc-go/internal/transport/handler_server.go:79:

if v := r.Header.Get("grpc-timeout"); v != "" {
	to, err := decodeTimeout(v)
	if err != nil {
		return nil, status.Errorf(codes.Internal, "malformed time-out: %v", err)
	}
	st.timeoutSet = true
	st.timeout = to
}

Timeout passing is an important line of defense for service stability, and both the principle and the implementation are simple. Does your framework implement timeout passing? If not, now is a good time to add it.

Timeout passing in go-zero

In go-zero, timeouts for the API gateway and RPC services are set through the Timeout field in the configuration file, and they are passed between services automatically.

The previous article, "Figuring out how to implement Go timeout control", explained how to use timeout control.

Reference

Site Reliability Engineering: How Google Runs Production Systems

Project address

Github.com/zeromicro/g…

You are welcome to use go-zero. A star or a fork helps support the project!