Optimizations such as design improvements, service separation, and automatic capacity expansion sometimes cannot solve the problem completely. For example, traffic may increase so fast that servers become overwhelmed before the expansion process finishes.
Since we cannot predict unexpected traffic, and the business cannot be fully independent of external services and resources, what can we do to minimize the impact of these problems on the core business?

Flow control

01. Common algorithms for flow control

At present, there are two commonly used flow control methods in the industry: leaky bucket algorithm and token bucket algorithm

  • Leaky bucket algorithm

The leaky bucket algorithm controls the rate at which data is injected into the network and smooths out bursts of traffic. It works exactly as its name suggests: imagine a leaking bucket into which all incoming water is poured, while the bucket leaks water out evenly at a constant rate. If the bucket is full, no more water can be poured in. Here you can think of the incoming water as the original requests, and the water leaking out of the bucket as the requests smoothed by the algorithm. As this shows, the leaky bucket algorithm precisely controls the rate at which traffic is admitted.
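The leaking-bucket idea above can be sketched in a few lines of Python. This is an illustrative single-process sketch, not a production implementation; the class and parameter names (`LeakyBucket`, `capacity`, `leak_rate`) are my own.

```python
import time

class LeakyBucket:
    """Sketch of a leaky bucket limiter: incoming requests fill the
    bucket, and the bucket drains at a fixed rate; a full bucket
    rejects further requests."""

    def __init__(self, capacity, leak_rate):
        self.capacity = capacity      # how much "water" the bucket holds
        self.leak_rate = leak_rate    # requests drained per second
        self.water = 0.0
        self.last_leak = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Drain the water that has leaked out since the last check.
        self.water = max(0.0, self.water - (now - self.last_leak) * self.leak_rate)
        self.last_leak = now
        if self.water + 1 <= self.capacity:
            self.water += 1           # request enters the bucket
            return True
        return False                  # bucket is full: reject

bucket = LeakyBucket(capacity=2, leak_rate=10)
print([bucket.allow() for _ in range(4)])  # [True, True, False, False]
```

Because requests only leave the bucket at `leak_rate`, the downstream sees a smooth, constant-rate stream no matter how bursty the input is.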

  • Token bucket algorithm

The token bucket algorithm is another common flow control algorithm; it controls the amount of data that passes within a time window. It works as follows:

  1. One token is placed in the bucket every 1/r seconds, where r is the user-configured average sending rate (r tokens are placed per second).

  2. At most b tokens can be held in the bucket. If the bucket is full, new tokens are discarded.

  3. When n requests arrive, n tokens are consumed from the bucket.

  4. If fewer than n tokens are available in the bucket, the n requests are discarded or made to wait for new tokens.

The algorithm puts tokens into the bucket evenly at a fixed rate. When original requests arrive, the required number of tokens is taken from the bucket according to the number of requests. If there are not enough tokens, the excess requests are discarded directly, while the requests that successfully obtain tokens proceed to the back-end server.

Unlike the "exact rate control" of the leaky bucket algorithm, because the bucket itself has a capacity that allows all of its tokens to be taken out at once, **the token bucket algorithm limits the average request rate while allowing a certain degree of burst traffic**.
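The four steps above can be sketched as follows. This is a minimal single-process sketch with illustrative names (`TokenBucket`, `rate`, `burst`); instead of a timer adding a token every 1/r seconds, it lazily accrues the tokens owed for the elapsed interval, which is equivalent and a common implementation trick.

```python
import time

class TokenBucket:
    """Sketch of a token bucket: tokens accrue at rate r (tokens/sec),
    capped at burst size b; each request consumes one token."""

    def __init__(self, rate, burst):
        self.rate = rate              # r: tokens added per second
        self.burst = burst            # b: bucket capacity
        self.tokens = burst           # start full, permitting an initial burst
        self.last = time.monotonic()

    def allow(self, n=1):
        now = time.monotonic()
        # Accrue tokens for the elapsed interval, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= n:
            self.tokens -= n          # consume n tokens for n requests
            return True
        return False                  # not enough tokens: reject or wait

tb = TokenBucket(rate=5, burst=3)
print([tb.allow() for _ in range(5)])  # [True, True, True, False, False]
```

Note how the first three requests pass back-to-back: that is exactly the burst tolerance the leaky bucket lacks.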
02. Global flow control

In the scenario of distributed services, the bottleneck point in many cases lies in global resources or dependencies. In this case, distributed global flow control is needed to protect the whole business.

The most common global flow control solution in the industry is to use a centralized resource (e.g., Redis or Nginx) with scripts to implement a global counter, or to implement the more complex leaky bucket and token bucket algorithms; for example, the Redis INCR command combined with a Lua script can implement a flow control component that limits QPS (queries per second).
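The counter logic can be sketched as follows. This is a minimal in-memory stand-in written in plain Python, assuming a fixed one-second window; in production the counter would live in Redis, with the INCR and the expiration set together atomically in one Lua script, as discussed below. All class and key names here are illustrative.

```python
import time

class FixedWindowLimiter:
    """Sketch of a fixed-window QPS counter. A dict keyed by
    (resource, second) plays the role of Redis: incrementing the
    current window's key mimics INCR, and dropping old windows
    mimics the key expiring after one second."""

    def __init__(self, limit_qps):
        self.limit = limit_qps
        self.counters = {}            # (resource, window) -> request count

    def allow(self, resource):
        window = int(time.time())     # one key per second
        key = (resource, window)
        # Expire counters from past windows (Redis would do this via EXPIRE).
        self.counters = {k: v for k, v in self.counters.items() if k[1] >= window}
        self.counters[key] = self.counters.get(key, 0) + 1  # the "INCR"
        return self.counters[key] <= self.limit

limiter = FixedWindowLimiter(limit_qps=2)
print([limiter.allow("api:sendmsg") for _ in range(3)])  # [True, True, False]
```

Because every application server talks to the same central counter, the limit is enforced globally rather than per instance.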

One important detail: each time you create a rate-limiting key, you must set an expiration time, and the whole operation must be atomic. Otherwise, in a distributed setting, setting the expiration time may fail, leaving a rate-limiting key that can never be reset and rendering the limiter unusable. In addition, two problems need attention when implementing global flow control: one is the granularity of flow control, the other is the bottleneck of the resource that flow control itself depends on. Let's look at how these two problems are solved.
03. Fine-grained control

The first problem is the granularity of flow control. For example, when limiting QPS, if the granularity is too coarse, the QPS is not evenly distributed across each millisecond, and handling at window boundaries is not smooth. For instance, if peak traffic arrives in the last millisecond of one second and the first millisecond of the next, then within those two milliseconds the system can admit double the configured QPS.

A simple approach is to divide one second into several buckets of N milliseconds each, refining the flow control granularity to N milliseconds with a sliding window, and to compute QPS over the sliding window on each request. This also avoids the uneven boundary handling described above.
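The sliding-window refinement can be sketched as below. This is an illustrative single-process sketch (names like `SlidingWindowLimiter` and `bucket_ms` are my own); it keeps per-100ms bucket counts and sums them over the trailing one second, so a burst straddling a second boundary still counts against the same window.

```python
import time
from collections import deque

class SlidingWindowLimiter:
    """Sketch: split one second into N-millisecond buckets and count
    requests over the sliding one-second window, smoothing out the
    fixed-window boundary problem."""

    def __init__(self, limit_qps, bucket_ms=100):
        self.limit = limit_qps
        self.bucket_ms = bucket_ms
        self.buckets = deque()        # (bucket_index, count), oldest first

    def allow(self):
        now_bucket = int(time.monotonic() * 1000) // self.bucket_ms
        window_start = now_bucket - 1000 // self.bucket_ms + 1
        # Drop buckets that have slid out of the one-second window.
        while self.buckets and self.buckets[0][0] < window_start:
            self.buckets.popleft()
        if sum(c for _, c in self.buckets) >= self.limit:
            return False              # trailing 1s window is at the QPS limit
        # Count this request in the current N-ms bucket.
        if self.buckets and self.buckets[-1][0] == now_bucket:
            self.buckets[-1] = (now_bucket, self.buckets[-1][1] + 1)
        else:
            self.buckets.append((now_bucket, 1))
        return True

sw = SlidingWindowLimiter(limit_qps=2)
print([sw.allow() for _ in range(3)])  # [True, True, False]
```

Smaller buckets give finer granularity at the cost of a little more bookkeeping per request.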

Automatic circuit breaker mechanism

In a system of many interdependent services, an avalanche often starts with a single failing service and ends up dragging down the entire system.

To facilitate management and isolation, we often decouple services into separate microservices that call and depend on each other through RPC. To protect the system when a dependency degrades, there are two common approaches:

  • Manual degradation: operators turn a dependency off with a switch to downgrade it by hand.

  • Automatic circuit breaking: the system continuously collects performance metrics and failure data for the services or resources it depends on. When performance degrades to a certain degree or the number of failures reaches a threshold, the circuit breaker trips automatically, making calls to that dependency fail fast and falling back to a backup dependency, or temporarily storing requests elsewhere to facilitate later retry and recovery. While the breaker is open, the system keeps probing whether the dependency has recovered, in order to decide whether to close the breaker automatically and restore normal service.
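The trip/fail-fast/probe/recover cycle described above can be sketched as a small state machine. This is a minimal illustration, not a production breaker (real ones, such as Hystrix-style implementations, track error rates over rolling windows); the names `CircuitBreaker`, `failure_threshold`, and `reset_timeout` are my own.

```python
import time

class CircuitBreaker:
    """Sketch of a circuit breaker with three states: closed (normal),
    open (fail fast after too many failures), and half-open (one probe
    call after a cool-down to check whether the dependency recovered)."""

    def __init__(self, failure_threshold=3, reset_timeout=5.0):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.state = "closed"
        self.opened_at = 0.0

    def call(self, func, fallback):
        if self.state == "open":
            if time.monotonic() - self.opened_at >= self.reset_timeout:
                self.state = "half-open"   # cool-down over: allow one probe
            else:
                return fallback()          # fail fast, take the degraded path
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.state == "half-open" or self.failures >= self.failure_threshold:
                self.state = "open"        # trip (or re-trip) the breaker
                self.opened_at = time.monotonic()
            return fallback()
        self.failures = 0
        self.state = "closed"              # success: close the breaker
        return result

def flaky():
    raise RuntimeError("dependency down")

cb = CircuitBreaker(failure_threshold=2)
results = [cb.call(flaky, lambda: "fallback") for _ in range(3)]
print(results, cb.state)  # three fallbacks; the breaker ends up "open"
```

Once open, the breaker spares the failing dependency from further load and spares callers the latency of waiting on timeouts, which is precisely what prevents the avalanche from spreading.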