Background and Objectives

I have often run into timeout exceptions on interface calls. In the past I changed the timeout settings in client code, server code, and the proxy server (such as Nginx) without first determining what caused the timeout. Sometimes this happened to solve the problem; at other times it left a hidden risk because the wrong thing had been adjusted. The purpose of this article is to offer systematic suggestions for dealing with timeouts. I try to analyze the causes of each kind of timeout objectively, but some points are subjective and some are incomplete. Differing opinions are welcome.

Timeout types

  1. Client Timeout

    • Send the request with XMLHttpRequest

      • Timeout settings: developer.mozilla.org/en-US/docs/…
    • Send the request via axios (which calls XMLHttpRequest)
      • Timeout settings: github.com/axios/axios…
      • Client symptom: with timeout set to 1000 ms and the exception not caught, the Chrome console displays the error message "timeout of 1000ms exceeded", thrown by XMLHttpRequest
  2. Server gateway or proxy (such as Nginx) timeout

    • Concept: developer.mozilla.org/en-US/docs/…
    • Symptom: 504 Gateway Timeout
    • Solution: Adjust the gateway or proxy timeout as appropriate

    • Nginx timeout configuration: www.scalescale.com/tips/nginx/…

  3. Transmission link timeout

    • Concept: The client does not reach the server only through the server gateway (such as Nginx); traffic also passes through the carrier’s routing and gateway devices. These devices have their own timeout settings, so an interface call can fail because the transmission link timed out

      • Client symptom: when called through axios with the exception not caught, the Chrome console displays the following error

        Error message: "Network Error", thrown by XMLHttpRequest, with ERR_EMPTY_RESPONSE

    • Breakdown of transmission link timeouts
      • NAT timeout
        • NAT concept: NAT (Network Address Translation) helps slow the depletion of the available IP address space by letting a small number of public IP addresses represent many private IP addresses
        • Description: see "About TCP long connections, NAT timeouts, and heartbeat packets"
  4. Server Timeout

    • Koa timeout settings (Koa is the framework I am most familiar with, so I use only Koa as an example)

      • Method 1: call server.setTimeout (github.com/koajs/koa/i…)
        • Client symptom: when called through axios with the exception not caught, the Chrome console displays the same error as for a transmission link timeout. The root cause therefore cannot be identified from the client alone; the server logs have to be checked as well
      • Method 2: use Koa middleware to return status code 408. On the client, it is then clear that the interface error was caused by a server timeout
        • Example code: github.com/wejendorp/k…
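
The symptoms listed above can be told apart, as far as the client is able to, with a small helper. The TypeScript sketch below is one possible way to do this, not part of axios: `HttpClientError`, `TimeoutKind`, and `classifyTimeout` are hypothetical names, the error shape is a simplified stand-in for what a client such as axios passes to a catch handler, and the `ECONNABORTED` check is an assumption about how axios labels its own timeouts. The two message strings are the ones quoted above.

```typescript
// Hypothetical, simplified shape of an HTTP client error (loosely modelled on
// what axios passes to a catch handler; this is not the library's real type).
interface HttpClientError {
  message: string;
  code?: string;                 // assumption: "ECONNABORTED" marks a client-side timeout
  response?: { status: number }; // present only if a server or proxy actually answered
}

type TimeoutKind =
  | "client-timeout"   // the client's own timeout fired
  | "gateway-timeout"  // a gateway or proxy such as Nginx answered 504
  | "server-timeout"   // the server answered 408 (for example from middleware)
  | "no-response"      // link timeout or closed socket; needs server logs to tell apart
  | "not-a-timeout";

function classifyTimeout(err: HttpClientError): TimeoutKind {
  // A response exists: the status code says who gave up.
  if (err.response) {
    if (err.response.status === 504) return "gateway-timeout";
    if (err.response.status === 408) return "server-timeout";
    return "not-a-timeout";
  }
  // No response: either the client stopped waiting...
  if (err.code === "ECONNABORTED" || /^timeout of \d+ms exceeded$/.test(err.message)) {
    return "client-timeout";
  }
  // ...or the connection ended without an answer.
  if (err.message === "Network Error") return "no-response";
  return "not-a-timeout";
}
```

Note that a transmission link timeout and a socket closed by server.setTimeout both end up as "no-response" here, which is why returning 408 from the server gives the client more to work with.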

Practical advice

  1. Client
    • Timeout settings: set a client timeout even if one is set on the server (neither XMLHttpRequest nor axios sets a timeout by default), and show the user a sensible message when it fires. When a request fails to reach the server, the transmission link timeout is what eventually triggers, and it is usually very long. Without a client timeout, the request stays pending for a long time
    • Exception handling: besides business errors and client timeout errors, the client needs to catch errors caused by link timeouts and server timeouts, and show the user a sensible error message
  2. Server
    • Timeout settings: required when performance and stability matter, so that connections that have been unresponsive for a long time can be closed when necessary, freeing resources for new requests. In addition, so that the client can tell a link timeout from a server timeout, the server needs to return status code 408
  3. Transmission link: the transmission link timeout is outside your control and cannot be avoided
  4. Interface design: in summary, even if no client or server timeout is set, the interface will still time out because the transmission link times out. If an interface takes a long time, consider an asynchronous solution; if it takes more than 1 minute, an asynchronous solution is required. Options include HTTP polling and WebSocket.
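
The HTTP polling option can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: `JobState`, `PollOptions`, and `pollUntilDone` are hypothetical names, and the `checkStatus` callback stands for a short status request (itself sent with a client timeout) against a job that the server started earlier and processes in the background.

```typescript
// Hypothetical job state returned by a status endpoint.
type JobState<T> =
  | { done: false }
  | { done: true; result: T };

interface PollOptions {
  intervalMs: number;  // wait between two status checks
  maxAttempts: number; // give up after this many checks
  sleep?: (ms: number) => Promise<void>; // injectable so tests can skip real waiting
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Repeatedly asks for the job status until it is done or attempts run out.
// Each individual request stays short, so no single call approaches the
// client, proxy, server, or transmission link timeout.
async function pollUntilDone<T>(
  checkStatus: () => Promise<JobState<T>>,
  opts: PollOptions,
): Promise<T> {
  const sleep = opts.sleep ?? defaultSleep;
  for (let attempt = 1; attempt <= opts.maxAttempts; attempt++) {
    const state = await checkStatus();
    if (state.done) return state.result;
    // Do not wait after the final attempt; fail straight away instead.
    if (attempt < opts.maxAttempts) await sleep(opts.intervalMs);
  }
  throw new Error(`job not finished after ${opts.maxAttempts} status checks`);
}
```

A caller would pass its own status request as `checkStatus` and pick `intervalMs` and `maxAttempts` so that their product covers the longest time the job is expected to take. WebSocket avoids the repeated requests but needs a long-lived connection, which brings back the NAT timeout and heartbeat concerns described earlier.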