I once read a sentence in a big god’s blog: when developing a high concurrency system, there are three powerful tools to protect the system: cache, degrade and limiting traffic. So what is flow limiting? As the name implies, limiting the flow is to limit the flow, just like you broadband packet 1 GIGAByte of data, used up. Through current limiting, we can control the QPS of the system well, so as to achieve the purpose of protecting the system. This article will introduce the common traffic limiting algorithms and their characteristics.

Algorithm is introduced

Counter method

The counter method is the simplest and easiest algorithm to implement in the current limiting algorithm. For example, we stipulate that for interface A, we can not access more than 100 times per minute. We can start by setting a counter that increments every time a request comes in. If the value of counter is greater than 100 and the interval between the request and the first request is less than 1 minute, then there are too many requests. If the interval between this request and the first request is greater than 1 minute and the value of counter is still within the flow limit range, then reset counter. The algorithm is shown as follows:

The specific pseudocode is as follows:

public class CounterDemo { public long timeStamp = getNowTime(); public int reqCount = 0; public final int limit = 100; Public final Long Interval = 1000; Ms public Boolean grant() {long now = getNowTime(); If (now < timeStamp + interval) {// reqCount++; Return reqCount > limit; } else { timeStamp = now; ReqCount = 1; return true; }}}Copy the code

Although this algorithm is simple, there is a very fatal problem, that is, the critical problem, we see the following figure:

As can be seen from the figure above, if a malicious user sends 100 requests at 0:59 and another 100 requests at 1:00, then the user sends 200 requests in one second. We just specified a maximum of 100 requests per minute, which is 1.7 requests per second, and the user can exceed our rate limit instantaneously by popping requests at the reset node in the time window. Users could overwhelm our app with this flaw in the algorithm.

Smart friends may have noticed that the problem is actually due to our low statistical accuracy. So how to deal with this problem well? Or, how do you reduce the impact of critical problems? We can look at the sliding window algorithm below.

The sliding window

Rolling window, also known as rolling Window. To solve this problem, we introduce the sliding window algorithm. If you have learned the TCP network protocol, then you must be familiar with the term sliding window. Here is a good illustration of the sliding window algorithm:

In the figure above, the entire red rectangle represents a time window, in our case a time window is a minute. Then we divide the time window, for example, in the picture, we have divided the sliding window into 6 squares, so each square represents 10 minutes. Every 10 minutes, our time window slides one space to the right. Each grid has its own counter. For example, when a request arrives at 0:35 seconds, the counter from 0:30 to 0:39 is incremented by one.

So how does sliding Windows solve the critical problem just now? If you look at the figure above, 100 requests arriving at 0:59 will land in the gray grid, and requests arriving at 1:00 will land in the orange grid. When the time reaches 1:00, our window will move one space to the right, so the total number of requests in the time window at this time is 200, exceeding the limit of 100, so it can be detected that the traffic limiting is triggered at this time.

Let me review the counter algorithm, and we can see that the counter algorithm is actually a sliding window algorithm. But it doesn’t divide the time window further, so there’s only one.

It can be seen that the more grids the sliding window is divided, the smoother the rolling of the sliding window will be, and the more accurate the statistics of current limiting will be.

Bucket algorithm

Leaky Bucket algorithm, also known as leaky bucket. To understand the leaky bucket algorithm, let’s take a look at wikipedia’s illustration of the algorithm:

As you can see from the figure, the whole algorithm is actually quite simple. First of all, we have a fixed capacity bucket with water coming in and water going out. We can’t predict how much water will flow in, or how fast it will flow. But for the water that goes out, the bucket can fix the rate of flow out. Also, when the bucket is full, the excess water will overflow.

Replacing the water in the algorithm with the request in the real world, we can see that the leaky bucket algorithm inherently limits the speed of the request. When using the leaky bucket algorithm, we can guarantee that the interface will process requests at a constant rate. So leaky bucket algorithms inherently don’t have criticality problems. Specific pseudo-code implementation is as follows:

public class LeakyDemo { public long timeStamp = getNowTime(); public int capacity; // Bucket capacity public int rate; Public int water; Public Boolean grant() {long now = getNowTime(); water = max(0, water - (now - timeStamp) * rate); TimeStamp = now; If ((water + 1) < capacity) {// if ((water + 1) < capacity) { return true; } else {return false when water is full; }}}Copy the code

Token bucket algorithm

Token bucket algorithm, also known as token bucket. To understand the algorithm, let’s take a look at wikipedia’s illustration of the algorithm:

As we can see from the figure, the token bucket algorithm is slightly more complex than the leaky bucket algorithm. First, we have a fixed capacity bucket with tokens in it. The bucket is initially empty, and the tokens fill the bucket at a constant rate r until the bucket reaches its capacity, and excess tokens are discarded. Every time a request comes in, an attempt is made to remove a token from the bucket, and if there is no token, the request cannot pass.

Specific pseudo-code implementation is as follows:

public class TokenBucketDemo { public long timeStamp = getNowTime(); public int capacity; // Bucket capacity public int rate; // Add tokens public int tokens; Public Boolean grant() {long now = getNowTime(); Tokens = min(capacity, tokens + (now-timestamp) * rate); timeStamp = now; If (tokens < 1) {// Tokens < 1; } else {// Tokens -= 1; return true; }}}Copy the code

Related variants

If we take a closer look at the algorithm, we see that by default removing the token from the bucket is not time consuming. If you set a delay for removing the token, then you are using the leak-bucket algorithm again. The SmoothWarmingUp class from Google’s Guava library takes this approach.

The critical problem

Let’s consider the critical problem scenario again. At 0:59 seconds, since the bucket was full of 100 tokens, the 100 requests could pass instantaneously. However, tokens are filled at a low rate. Therefore, the number of tokens in the bucket cannot reach 100 at 1:00. In this case, no more 100 requests can pass. So token bucket algorithm can solve the critical problem well. The figure below compares the rate changes of the counter (left) and the token bucket algorithm (right) at critical points. We can see that although the token bucket algorithm allows the burst rate, the next burst rate cannot occur until there are enough tokens in the bucket:


Counter VS sliding window

The counter algorithm is the simplest algorithm and can be regarded as a low-precision implementation of the sliding window. Sliding Windows require more storage space to implement because they need to store multiple counters (one per cell). In other words, the higher the accuracy of the sliding window, the more storage space is required.

Leaky bucket vs. token bucket

The most obvious difference between the leaky bucket algorithm and the token bucket algorithm is that the token bucket algorithm allows a certain degree of burst traffic. Because of the default token bucket algorithm, taking tokens is not time consuming, that is, assuming there are 100 tokens in the bucket, 100 requests can be allowed through instantly.

Token bucket algorithm is widely used in the industry because it is simple to implement, allows some traffic burst and is user-friendly. Of course, we need to do it on a case-by-case basis, there is only the best algorithm, there is no optimal algorithm.