preface

Recently, we introduced Guava’s RateLimiter stream limiting component to our system, which is based on the token bucket algorithm, which is a very classic stream limiting algorithm. This article will learn several classical traffic limiting algorithms together with you.

  • Public number: a boy picking up snails
  • Making the address

What is limiting current?

Wikipedia’s concept is as follows:

In computer networks, rate limiting is used to control the rate of requests sent or received by a network interface controller. It can be used  to prevent DoS attacks and limit web scrapingCopy the code

In computer networks, limiting traffic is controlling the rate at which a network interface sends or receives requests, preventing DoS attacks and limiting Web crawlers.

Flow limiting, also known as flow control. In the case of high concurrency or heavy traffic requests, the system blocks new requests from accessing the system to ensure system stability. Limiting the flow may cause some user requests to be delayed or rejected, which affects user experience. So there is a balance between system stability and user experience. Here’s an example from life:

Some popular tourist attractions, the general daily travel to visit the number of restrictions. Only a fixed number of tickets, say 5,000, will be sold each day. If you arrive late during the May Day or National Day holidays, the tickets may have been sold out and you will not be able to go in. Even if you get in, the line is long enough for you to doubt your life.

Common traffic limiting algorithms

Fixed window flow limiting algorithm

Start by maintaining a counter that treats the unit time period as a window that records the number of requests received by that window.

  • When the number of times is below the limit threshold, access is allowed and the counter +1
  • When the number of times is greater than the limit threshold, access is denied.
  • After the current time window has passed, the counter is cleared to zero.

Assume the unit time is 1 second and the limit threshold is 3. The counter is incremented by 1 for each request within 1 second of time. If the counter accumulates more than 3 times, all subsequent requests are rejected. Wait until the 1s end, the counter clear 0, start counting again. The diagram below:

The pseudocode is as follows:

/** * fixedWindowsTryAcquire() {long currentTime = system.currentTimemillis (); If (currentTime-lastreQuestTime > windowUnit) {if (currentTime-lastreQuestTime > windowUnit) { LastRequestTime = currentTime; If (counter < threshold) {counter++; // count increment 1 return true; } return false; }Copy the code

However, this algorithm has an obvious critical problem: assuming a limit threshold of 5 requests and a unit time window of 1s, if we concurrently have 5 requests in the first 0.8-1s and 1-1.2s per unit time, respectively. None of them exceed the threshold, but if you count 0.8-1.2s, the number of concurrent requests is up to 10, which has exceeded the definition of the threshold of 1s per unit time not exceeding 5.

Sliding window flow limiting algorithm

Sliding window current limiting solves the problem of fixed window critical value. It divides a unit time period into n periods, records the number of interface access times in each period, and deletes expired periods based on time.

A diagram explains the sliding window algorithm, as follows:

Assuming the unit time is 1s again, the sliding window algorithm divides it into five small periods, so the sliding window is divided into five small cells. Each cell represents 0.2s. Every 0.2s, the time window slides one space to the right. Then, each cycle has its own counter. If the request arrives in 0.83s, the counter for 0.8 to 1.0s is incremented by 1.

How does sliding Windows solve critical problems?

Suppose we still have 5 requests in 1s, and 5 requests come in 0.8-1.0s (say 0.9s) in the yellow box. After the 1.0s point, five more requests fall on the purple grid. If it’s a fixed window algorithm, it won’t be limited, but if it’s a sliding window, it’ll move one cell to the right after every little period. After 1.0s, it will move one space to the right. The current time interval is 0.2-1.2s. Requests in this area have exceeded the limit 5.

TIPS: When the sliding window’s lattice period is divided more, then the sliding window’s scrolling will be smoother, and the statistics of flow limiting will be more accurate.

The pseudo-code of the sliding window algorithm is as follows:

/** * private int SUB_CYCLE = 10; /** private int SUB_CYCLE = 10; /** ** private int thresholdPerMin = 100; */ private final TreeMap<Long, Integer> counters = new TreeMap<>(); */ Boolean slidingWindowsTryAcquire() {long currentWindowTime = LocalDateTime.now().toEpochSecond(ZoneOffset.UTC) / SUB_CYCLE * SUB_CYCLE; Int currentWindowNum = countCurrentWindow(currentWindowTime); If (currentWindowNum >= thresholdPerMin) {return false; } // counter +1 counters.get(currentWindowTime)++; return true; } /** * Count requests for the current window */ private int countCurrentWindow(Long currentWindowTime) {// Count window start position long startTime = currentWindowTime - SUB_CYCLE* (60s/SUB_CYCLE-1); int count = 0; Iterator< map.entry <Long, Integer>> Iterator = counters.entryset ().iterator(); while (iterator.hasNext()) { Map.Entry<Long, Integer> entry = iterator.next(); If (entry.getKey() < startTime) {iterator.remove(); Count =count + entry.getValue();} else {count =count + entry.getValue(); } } return count; }Copy the code

Although the sliding window algorithm solves the critical problem of fixed window, once the flow limit is reached, the request will be rejected directly. So we’re going to lose some requests, which is actually not very friendly for the product.

Bucket algorithm

Leak-bucket algorithm is more flexible in the face of current limiting, and there is no direct violent rejection.

Its principle is very simple, can be considered to be the process of water leakage. Pour water into the drain bucket at any rate and drain water at a constant rate. When the water exceeds the capacity of the bucket, it overflows, or is discarded. Because the bucket capacity is constant, the overall rate is guaranteed.

  • The inflow of water droplets, which can be viewed as requests to access the system, is at an indeterminate rate.
  • The bucket capacity generally indicates the number of requests that the system can handle.
  • If the bucket is full, the limit threshold is reached and the drop is discarded (request rejected)
  • The outgoing water droplets are constantly filtered, and the corresponding service processes requests at a fixed rate.

The pseudocode of the leaky bucket algorithm is as follows:

/** ** private long rate; /** ** private long currentWater; /** * private long refreshTime; /** * private long capacity; / * * * * @ return * / bucket algorithm Boolean leakybucketLimitTryAcquire () {long currentTime = System. CurrentTimeMillis (); Long outWater = (currentTime-refreshTime) / 1000 * rate; CurrentWater = math. Max (0, currentwer-outwater); currentWater = math. Max (0, currentWater); // Current water = current water - current water refreshTime = currentTime; If (currentWater < capacity) {currentWater++; return true; } // If the current amount of water is greater than or equal to the bucket capacity, return false; }Copy the code

During normal traffic, the system processes requests at a fixed rate, which is what we want. However, in the face of sudden traffic, the leaky bucket algorithm will still handle requests in a conventional way, which is not desirable. When traffic changes suddenly, we want the system to process requests as quickly as possible to improve the user experience.

Token bucket algorithm

In the face of burst traffic, we can use the token bucket algorithm to limit traffic.

Token bucket algorithm principle:

  • There is a token manager who puts tokens into the token bucket at a fixed speed according to the flow limit.
  • If the token count is full and exceeds the token bucket capacity limit, it is discarded.
  • When the system receives a user request, it first goes to the token bucket and asks for a token. If the token is given, the business logic of the request is processed;
  • If the token is not available, the request is simply rejected.

The pseudocode of the leaky bucket algorithm is as follows:

/** * Private long putTokenRate; /** * private long refreshTime; /** * private long capacity; /** * private long currentToken = 0L; /** * tokenBucketTryAcquire() {long currentTime = system.currentTimemillis (); Long generateToken = (CurrentTime-refreshTime) / 1000 * putTokenRate; CurrentToken = math.min (Capacity, generateToken + currentToken); // Current token count = previous token count + token count refreshTime = currentTime; If (currentToken > 0) {currentToken--; // Token count -1 return true; } return false; }Copy the code

If the token issuing strategy is correct, the system will not be overwhelmed and the utilization of the machine will increase. Guava’s RateLimiter stream limiting component is based on the token bucket algorithm.

Reference and thanks

  • Interviewer: Come on, young man! Please hand off 5 common traffic limiting algorithms!
  • Ali Cloud: How much do you know about limiting traffic?