This paper mainly describes several common flow limiting algorithm: counter algorithm, leaky bucket algorithm, token bucket algorithm. Then combined with my understanding of Sentinel 1.8.0, I will share with you how Sentinel uses these algorithms in the source code for flow control judgment. Due to my limited understanding, if there is an incorrect place, I hope you can leave a message to discuss 😊😊😊.

Counter current limiting algorithm

We can limit the number of requests we can receive per second directly through a counter. For example, if QPS is set to 1000, then the idea is to start from the first request, in the next 1s, each request, add 1 to the count, if the cumulative number reaches 1000, then all subsequent requests will be rejected. Wait until the 1s is over, return the count to 0 and start counting again.

Advantage: Simple implementation

Disadvantages: If 1000 requests have been passed in the first half of the 1s, the second half of the 1s can only be rejected. This phenomenon is called “spike phenomenon”.

Implementation code case:

public class Counter {
    public long timeStamp = getNowTime();
    public int reqCount = 0;
    public final int limit = 100; // The maximum number of requests in the time window
    public final long interval = 1000; // Time window ms

    public boolean limit(a) {
        long now = getNowTime();
        if (now < timeStamp + interval) {
            // Within the time window
            reqCount++;
            // Determine whether the maximum number of requests has been exceeded in the current time window
            return reqCount <= limit;
        } else {
            timeStamp = now;
            // Reset after timeout
            reqCount = 1;
            return true; }}public long getNowTime(a) {
        returnSystem.currentTimeMillis(); }}Copy the code

Sliding time window algorithm

A sliding Window is also called a Rolling Window. In order to solve the defects of the counter algorithm, we introduce the sliding window algorithm. Here’s a good illustration of the sliding window algorithm:

In the figure above, the entire red rectangle represents a time window, which in our case is one minute. Then we divided the time window. For example, in the figure, we divided the sliding window into 6 cells, so each cell represents 10 seconds. Every 10 seconds, our time window slides one space to the right. Each cell has its own counter. For example, when a request arrives at 0:35 seconds, the counter between 0:30 and 0:39 increases by 1.

So how does the sliding window solve the critical problem? If we look at the figure above, 100 requests that arrive at 0:59 will fall into the gray grid, while requests that arrive at 1:00 will fall into the orange grid. When the time reaches 1:00, our window will move one space to the right, then the total number of requests in the time window is 200 in total, exceeding the limit of 100, so the flow limiting can be detected at this time.

Let me just review the counter algorithm, and we can see that the counter algorithm is actually a sliding window algorithm. It just doesn’t divide the time window any further, so it only has 1 space.

It can be seen that the more compartments of the sliding window are divided, the smoother the scrolling of the sliding window will be, and the more accurate the statistics of flow limiting will be.

Implementation code case:

public class SlideWindow {

    /** The mapping between the queue ID and the queue. The queue stores the timestamp of each pass. This allows the program to have multiple flow limiting queues */
    private volatile static Map<String, List<Long>> MAP = new ConcurrentHashMap<>();

    private SlideWindow(a) {}

    public static void main(String[] args) throws InterruptedException {
        while (true) {
            // In any 10 seconds, only 2 passes are allowed
            System.out.println(LocalTime.now().toString() + SlideWindow.isGo("ListId".2.10000L));
            // Sleep 0-10 seconds
            Thread.sleep(1000 * new Random().nextInt(10)); }}/** * Sliding time window flow limiting algorithm * in the specified time window, the specified limit times, whether to allow through **@paramListId Queue ID *@paramCount Limit number of times *@paramTimeWindow timeWindow size *@returnWhether to allow */ to pass
    public static synchronized boolean isGo(String listId, int count, long timeWindow) {
        // Get the current time
        long nowTime = System.currentTimeMillis();
        // Fetch the corresponding traffic limiting queue based on the queue ID. If no traffic limiting queue exists, create it
        List<Long> list = MAP.computeIfAbsent(listId, k -> new LinkedList<>());
        // If the queue is not full, allow it to pass and add the current timestamp to the start of the queue
        if (list.size() < count) {
            list.add(0, nowTime);
            return true;
        }

        // If the queue is full (the limit number of times reached), the earliest timestamp added to the queue is obtained
        Long farTime = list.get(count - 1);
        // Subtract the earliest added timestamp from the current timestamp
        if (nowTime - farTime <= timeWindow) {
            // If the result is less than or equal to timeWindow, the number of passes in timeWindow is greater than count
            // Not allowed to pass
            return false;
        } else {
            // If the result is greater than timeWindow, the number of passes in timeWindow is less than or equal to count
            // Allow the queue to pass and remove the earliest added timestamp, adding the current time to the start of the queue
            list.remove(count - 1);
            list.add(0, nowTime);
            return true; }}}Copy the code

In Sentinel, LeapArray structure is used to implement the time window algorithm, and its core code is as follows (only the time window method is listed) :

** Get bucket item at provided timestamp. **@param timeMillis a valid timestamp in milliseconds
     * @return current bucket item at provided timestamp if the time is valid; null if time is invalid
     */
public WindowWrap<T> currentWindow(long timeMillis) {
  if (timeMillis < 0) {
    return null;
  }

  int idx = calculateTimeIdx(timeMillis);
  // Calculate current bucket start time.
  // Calculate the start time of the window. Calculate the start time of each cell
  long windowStart = calculateWindowStart(timeMillis);

  /* * Get bucket item at given time from the array. * * (1) Bucket is absent, then just create a new bucket and CAS update to circular array. * (2) Bucket is up-to-date, then just return the bucket. * (3) Bucket is deprecated, then reset current bucket and clean all deprecated buckets. */
  while (true) {
    WindowWrap<T> old = array.get(idx);
    // If no panes are available, create panes
    if (old == null) {
      /* * B0 B1 B2 NULL B4 * ||_______|_______|_______|_______|_______||___ * 200 400 600 800 1000 1200 timestamp * ^ * time=888 * bucket is empty, so create new and update * * If the old bucket is absent, then we create a new bucket at {@code windowStart}, * then try to update circular array via a CAS operation. Only one thread can * succeed to update, while other threads yield its time slice. */
      WindowWrap<T> window = new WindowWrap<T>(windowLengthInMs, windowStart, newEmptyBucket(timeMillis));
      if (array.compareAndSet(idx, null, window)) {
        // Successfully updated, return the created bucket.
        return window;
      } else {
        // Contention failed, the thread will yield its time slice to wait for bucket available.
        Thread.yield();
      }
      // The current pane exists. Return the history pane
    } else if (windowStart == old.windowStart()) {
      /* * B0 B1 B2 B3 B4 * ||_______|_______|_______|_______|_______||___ * 200 400 600 800 1000 1200 timestamp * ^ * time=888 * startTime of Bucket 3: 800, so it's up-to-date * * If current {@code windowStart} is equal to the start timestamp of old bucket, * that means the time is within the bucket, so directly return the bucket. */
      return old;
      //
    } else if (windowStart > old.windowStart()) {
      /* * (old) * B0 B1 B2 NULL B4 * |_______||_______|_______|_______|_______|_______||___ * ... 1200 1400 1600 1800 2000 2200 timestamp * ^ * time=1676 * startTime of Bucket 2: 400, deprecated, should be reset * * If the start timestamp of old bucket is behind provided time, that means * the bucket is deprecated. We have to reset the bucket to current {@code windowStart}. * Note that the reset  and clean-up operations are hard to be atomic, * so we need a update lock to guarantee the correctness of bucket update. * * The update lock is conditional (tiny scope) and will take effect only when * bucket is deprecated, so in most cases it won't lead to performance loss. */
      if (updateLock.tryLock()) {
        try {
          // Successfully get the update lock, now we reset the bucket.
          // Clear all pane data
          return resetWindowTo(old, windowStart);
        } finally{ updateLock.unlock(); }}else {
        // Contention failed, the thread will yield its time slice to wait for bucket available.
        Thread.yield();
      }
      // If the clock is set back, recreate the time grid
    } else if (windowStart < old.windowStart()) {
      // Should not go through here, as the provided time is already behind.
      return newWindowWrap<T>(windowLengthInMs, windowStart, newEmptyBucket(timeMillis)); }}}Copy the code

Bucket algorithm

Leaky Bucket algorithm Leaky Bucket algorithm is an algorithm often used in Traffic Shaping or Rate Limiting in the network world. Its main purpose is to control the Rate of data injection into the network and smooth the burst Traffic on the network. The leaky bucket algorithm provides a mechanism by which burst traffic can be shaped to provide a stable flow to the network, as shown in the following figure.

Implementation code case:

public class LeakyBucket {
  public long timeStamp = System.currentTimeMillis();  // The current time
  public long capacity; // The capacity of the bucket
  public long rate; // The rate of water leakage
  public long water; // Current volume (current cumulative requests)

  public boolean grant(a) {
    long now = System.currentTimeMillis();
    // Execute the leakage first, calculate the remaining water quantity
    water = Math.max(0, water - (now - timeStamp) * rate); 

    timeStamp = now;
    if ((water + 1) < capacity) {
      // Try to add water, and the water is not full
      water += 1;
      return true;
    } else {
      // Water is full, refuse to add water
      return false; }}}Copy the code

Description:

(1) Not fully adding water: continuously adding water through the code water +=1. (2) Water leakage: through the time difference to calculate the water leakage. (3) Surplus water: total water – water leakage.

RateLimiterController in Sentine implements the leaky bucket algorithm. The core code is as follows

@Override
public boolean canPass(Node node, int acquireCount, boolean prioritized) {
  // Pass when acquire count is less or equal than 0.
  if (acquireCount <= 0) {
    return true;
  }
  // Reject when count is less or equal than 0.
  // Otherwise,the costTime will be max of long and waitTime will overflow in some cases.
  if (count <= 0) {
    return false;
  }

  long currentTime = TimeUtil.currentTimeMillis();
  // Calculate the interval between every two requests.
  // Calculate the time interval
  long costTime = Math.round(1.0 * (acquireCount) / count * 1000);

  // Expected pass time of this request.
  // Expected execution time
  long expectedTime = costTime + latestPassedTime.get();

  Current time > expected time
  if (expectedTime <= currentTime) {
    // Contention may exist here, but it's okay.
    // Can pass, and set the last pass time
    latestPassedTime.set(currentTime);
    return true;
  } else {
    // Calculate the time to wait.
    // Wait time = expected time - last time - current time
    long waitTime = costTime + latestPassedTime.get() - TimeUtil.currentTimeMillis();
    // Wait time > maximum queue time
    if (waitTime > maxQueueingTimeMs) {
      return false;
    } else {
      // Last time + interval time
      long oldTime = latestPassedTime.addAndGet(costTime);
      try {
        // Wait time
        waitTime = oldTime - TimeUtil.currentTimeMillis();
        // Wait time > maximum queue time
        if (waitTime > maxQueueingTimeMs) {
          latestPassedTime.addAndGet(-costTime);
          return false;
        }
        // in race condition waitTime may <= 0
        // Wait
        if (waitTime > 0) {
          Thread.sleep(waitTime);
        }
        // After waiting, release
        return true;
      } catch (InterruptedException e) {
      }
    }
  }
  return false;
}
Copy the code

Token bucket algorithm

Token bucket algorithm is one of the most commonly used algorithms in Traffic Shaping and Rate Limiting. Typically, the token bucket algorithm is used to control the amount of data sent to the network and allow bursts of data to be sent. As shown in the figure below:

In simple terms, one side consumes tokens from the bucket while the other side puts tokens into the bucket at a fixed rate. When the number of requests consumed is greater than the rate at which they can be put in, take action, such as waiting, or rejecting.

Implementation code case:

public class TokenBucket {
  public long timeStamp = System.currentTimeMillis();  // The current time
  public long capacity; // The capacity of the bucket
  public long rate; // Token placement speed
  public long tokens; // The current number of tokens

  public boolean grant(a) {
    long now = System.currentTimeMillis();
    // Add the token first
    tokens = Math.min(capacity, tokens + (now - timeStamp) * rate);
    timeStamp = now;
    if (tokens < 1) {
      // If there is less than 1 token, reject
      return false;
    } else {
      // Get the token
      tokens -= 1;
      return true; }}}Copy the code

Sentinel uses the token-bucket algorithm in WarmUpController, where it can warm up the system, set warm-up times and water levels, and reject unwanted requests during warm-up.

public boolean canPass(Node node, int acquireCount, boolean prioritized) {
  long passQps = (long) node.passQps();

  long previousQps = (long) node.previousPassQps();
  syncToken(previousQps);

  // Start calculating its slope
  // If he enters the warning line, start adjusting his QPS
  long restToken = storedTokens.get();
  if (restToken >= warningToken) {
    long aboveToken = restToken - warningToken;
    // Consume faster than warning, but slower
    // current interval = restToken*slope+1/count
    double warningQps = Math.nextUp(1.0 / (aboveToken * slope + 1.0 / count));
    if (passQps + acquireCount <= warningQps) {
      return true; }}else {
    if (passQps + acquireCount <= count) {
      return true; }}return false;
}
Copy the code

Summary of current limiting algorithm

Counters VS time Windows

  1. The essence of the time window algorithm is also realized by the counter algorithm.
  2. The more compartments the time window algorithm has, the smoother the scrolling of the sliding window will be, the more accurate the current-limiting statistics will be, but the more memory will be used.

Open bucket vs. token bucket

  1. The leaky bucket algorithm and the token bucket algorithm are essentially designed to do traffic shaping or rate limiting to prevent the system from being broken due to heavy traffic, but the core difference between the two algorithms is that the direction of traffic limiting is opposite

  2. Leaky bucket: Limits the flow rate and is relatively fixed.

  3. Token bucket: Limits the average inflow rate of traffic and allows a certain degree of unexpected traffic. The maximum rate is the capacity of the bucket and the rate of token generation.

  4. In some scenarios, the leaky bucket algorithm cannot effectively use network resources. Because the leakage rate of leaky buckets is relatively fixed, the leaky buckets are still limited and cannot be released when the network situation is good and there is no congestion, so network resources cannot be effectively used. The token bucket algorithm, on the other hand, supports a certain burst of traffic while limiting the average rate.

Reference documentation

www.cnblogs.com/linjiqin/p/…

www.cnblogs.com/dijia478/p/…