Reference

mp.weixin.qq.com/s/k9tm-4lBw…

segmentfault.com/a/119000001…

Read these two articles first.

The main purpose of this article is to explain the sliding time window.

Schematic diagrams

(diagram: fixed time window)

(diagram: sliding time window)

The core

The algorithm has two core parameters: 1. a time window, 1s; 2. a request limit, 100 requests.

In other words, at most 100 requests per second.

Fixed time window

Suppose 100 requests arrive in the last 250ms of the first 1s window, and another 100 requests arrive in the first 250ms of the second 1s window.

The machine can only handle 100 requests per second, and each window's counter stays within its limit. But both bursts land in the same second of real time: in the case above, it is actually 200 requests within 500ms, at which point the machine dies. What can we do?
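To make the flaw concrete, here is a minimal fixed-window limiter in Java (a sketch; the class and parameter names are mine, not from the referenced articles). The counter resets to zero at every window boundary, which is exactly what lets 200 requests through in 500ms:

```java
/**
 * Minimal fixed-window limiter (sketch): at most `limit` requests
 * per window of `windowMillis`, e.g. 100 requests per 1000ms.
 */
public class FixedWindowLimiter {
    private final int limit;
    private final long windowMillis;
    private long windowStart = System.currentTimeMillis();
    private int count = 0;

    public FixedWindowLimiter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    public synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        if (now - windowStart >= windowMillis) {
            // New window: the counter resets to zero here. This reset is
            // what allows a burst at the end of one window plus a burst at
            // the start of the next, i.e. 200 requests within 500ms.
            windowStart = now;
            count = 0;
        }
        if (count < limit) {
            count++;
            return true;  // within the limit: accept
        }
        return false;     // over the limit: discard
    }
}
```

With this code, both 100-request bursts are accepted, because each stays within its own window's limit.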

First, we need to be clear about the goal: the goal is to keep the machine from dying, and to achieve that we are allowed to discard some requests. How?

For example, in the case above, treat the 100 requests in the first 250ms of that 500ms span as normal, and discard the 100 requests that come after. Implementing exactly this is the sliding time window algorithm.

Sliding time window

Look at the diagram.

Let's say there are only 200 requests: 100 in the last 250ms of the first 1s, and 100 in the first 250ms of the second 1s. Now: the first 100 requests fall in the first 1s, within the limit, so they are OK.

Then comes the 101st request. Under a fixed time window it would actually be OK, because in the second 1s the counter has gone back to zero, so it can continue. But as we saw, this is what can cause the machine to die. What do we do?

When the 101st request arrives and its time falls beyond the first 1s, the window does not jump to start at the end of the first 1s. Instead, it slides forward by one small sub-period, so it still covers most of the first 1s, as shown in the diagram.

So at this point, the 101st request is still inside the window, but the window's counter has not gone back to 0: it is still 100, because the window still covers the 100 requests from the tail of the first 1s. The time falls inside the window, but the count has already reached the limit, so the request is discarded. The next 100 requests are discarded the same way as the 101st. In the end, instead of being overwhelmed, the machine just discarded some of the requests. This is rate limiting.
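A common way to implement this (a sketch in Java; the names and the choice of 10 sub-periods are my own assumptions, since the text only describes the idea) is to split the 1s window into small sub-periods and slide forward one sub-period at a time. Only the buckets that slide out of the window are zeroed, so the count never drops to 0 at a hard second boundary:

```java
/**
 * Sliding-window limiter (sketch): the window is divided into small
 * sub-periods (buckets). The window slides one bucket at a time, so
 * requests from the tail of the previous second still count.
 */
public class SlidingWindowLimiter {
    private final int limit;         // max requests in the window, e.g. 100
    private final int buckets;       // number of sub-periods, e.g. 10
    private final long bucketMillis; // sub-period length, e.g. 100ms
    private final int[] counts;      // per-bucket request counts
    private long lastTick;           // start time of the current bucket

    public SlidingWindowLimiter(int limit, int buckets, long bucketMillis) {
        this.limit = limit;
        this.buckets = buckets;
        this.bucketMillis = bucketMillis;
        this.counts = new int[buckets];
        this.lastTick = System.currentTimeMillis();
    }

    public synchronized boolean tryAcquire() {
        slide(System.currentTimeMillis());
        int total = 0;
        for (int c : counts) total += c;   // sum over the whole window
        if (total >= limit) {
            return false;                  // window is full: discard
        }
        counts[(int) ((lastTick / bucketMillis) % buckets)]++;
        return true;                       // accept
    }

    /** Advance the window, zeroing only the buckets that slid out. */
    private void slide(long now) {
        long elapsed = (now - lastTick) / bucketMillis;
        long current = lastTick / bucketMillis;
        for (long i = 1; i <= Math.min(elapsed, buckets); i++) {
            counts[(int) ((current + i) % buckets)] = 0;
        }
        if (elapsed > 0) {
            lastTick += elapsed * bucketMillis;
        }
    }
}
```

In the scenario above, the 101st request only slides the window forward by one 100ms bucket, so the window still counts the earlier 100 requests and the request is discarded. The more buckets you use, the closer this gets to an exact sliding window.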

Rate limiting keeps the machine from collapsing. The idea is that the server handles only as many requests as it can handle and discards the rest, rather than being overwhelmed and unable to handle a single request.

What exactly is rate limiting?

Without rate limiting, requests just keep coming in, without restriction, until the machine breaks down.

With rate limiting, a request beyond the limit is refused service. How? When the request comes in, check it; if it exceeds the limit, return right away and say it has been rate-limited. In effect, the request is discarded on the server side: it is accepted but never processed. Since it is the client that sends requests, the server cannot stop them from arriving; but if the client code is your own, you can also rate-limit on the client side by time.
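As a sketch of that check on the server side (a hypothetical handler that reuses the SlidingWindowLimiter sketch above; the response text is illustrative):

```java
/** Hypothetical server-side check (sketch), reusing the limiter above. */
public class RateLimitedHandler {
    private final SlidingWindowLimiter limiter =
            new SlidingWindowLimiter(100, 10, 100); // 100 requests per 1s

    public String handle(String request) {
        if (!limiter.tryAcquire()) {
            // Over the limit: the request was accepted but is not processed;
            // just tell the caller it has been rate-limited.
            return "429 Too Many Requests";
        }
        return "OK: processed " + request; // stand-in for real processing
    }
}
```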

What happens when you overwhelm the system?

The system jams completely and cannot respond to any request. The fix is to discard the requests it cannot handle; if nothing is discarded, not a single request gets processed. Discarding the excess is exactly what rate limiting does.