
The most commonly used rate-limiting algorithms, and how to add flow control in HTTP middleware

What is rate limiting?

Rate limiting protects a system by capping the rate of concurrent access, or the number of requests within a time window. Once the limit is reached, excess traffic can be denied service, made to queue and wait, or handled by degradation

In short: limit the request rate, or limit the total number of requests within a given period

For example, a flash-sale site selling 999 items between 22:05 and 22:10 might let through only 50,000 requests; every request after the 50,000th is rejected at the door, which is when we see the "system is busy" page

Why rate limit?

  • Back-end service capacity is limited; without rate limiting, the service may crash
  • Rate-limit settings can be derived from performance testing, such as the maximum number of connections or QPS (maximum requests per second)
  • It helps fend off crawlers and malicious attacks

For example, system traffic may suddenly surge and a flood of requests pours in. We may even know a peak is coming, but if the server cannot bear the pressure, it will crash. Typical cases include:

  • Flash-sale ("seckill") business
  • All kinds of leaderboard refreshing
  • Weibo trending topics
  • A surge of over-enthusiastic users for whatever reason
  • A large number of users (malicious ones, or normal users making excessive requests) hitting the server so frequently that its capacity runs out
  • Web crawlers, and so on

How is rate limiting generally implemented?

Remember anxiously watching the payment queue count down, one by one, during a shopping spree on some e-commerce site over a big holiday sale?

Remember frantically placing orders, only to get the "system is busy, please try again later" pop-up over and over because we clicked too fast and too eagerly?

And when traffic gets too heavy, the site simply tells us access is refused. Scenes like these probably all come to mind.

From the scenarios above, our rate-limiting ideas look like this:

  • Reject excess requests outright
  • Queue requests until the per-unit-time request count returns to a normal level

Rejecting requests is relatively straightforward, while queuing can be set up in several ways, so let's keep going

Besides rate limiting, what other ways can solve or mitigate such a sudden flood of requests?

Circuit breaking and service degradation can both tackle such problems effectively

What is service degradation?

To keep core modules highly available when pressure on our servers spikes, we deliberately degrade parts of our own system. Two common approaches:

  • Reduce the performance of non-core modules
  • Directly shut down unimportant functions to ensure the normal functioning of the core module

For example, when user requests surge and the server is overwhelmed, a website can choose to switch off comments, password changes, and other non-essential functions, so that core functions such as payments and data services keep running normally

Oh? What is a circuit breaker?

It differs from service degradation: a circuit breaker severs the connection to a dependent external interface when that interface fails.

Say server A relies on an external interface of server B. If B's interface misbehaves at some moment and responds extremely slowly, that one interface can drag down the whole of server A. With a circuit breaker, when server A requests B's interface, it returns a preset error by default instead of waiting

The most commonly used rate-limiting algorithms

Let's go through the most commonly used rate-limiting algorithms, which fall roughly into the following four types:

  • Fixed window counter
  • Sliding window counter
  • Leaky bucket
  • Token bucket

Fixed window counter

The simplest approach uses a counter: allow only a fixed number of requests within each fixed time window

In the figure above, a fixed time window limits processing to only 2 requests per second, and the requests in red are directly discarded

  • The limit is fixed at 2 requests per second
  • Requests in red are dropped, which reduces the overall service load
  • But dropping requests is bad for the experience; a minimal counter sketch follows this list
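
To make this concrete, here is a minimal fixed-window counter sketch in Go; the type and method names are my own, not from any library:

package limiter

import (
	"sync"
	"time"
)

// FixedWindowLimiter allows at most `limit` requests per fixed window
type FixedWindowLimiter struct {
	mu          sync.Mutex
	limit       int
	window      time.Duration
	count       int
	windowStart time.Time
}

func NewFixedWindowLimiter(limit int, window time.Duration) *FixedWindowLimiter {
	return &FixedWindowLimiter{limit: limit, window: window, windowStart: time.Now()}
}

// Allow reports whether this request fits into the current window
func (l *FixedWindowLimiter) Allow() bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if now := time.Now(); now.Sub(l.windowStart) >= l.window {
		// a new window starts: reset the counter
		l.windowStart, l.count = now, 0
	}
	if l.count >= l.limit {
		return false // over the limit: the request is discarded
	}
	l.count++
	return true
}

NewFixedWindowLimiter(2, time.Second) reproduces the 2-requests-per-second example above. Note the boundary problem: up to 2 requests at the tail of one window plus 2 at the head of the next can all pass within a single second, which is exactly what the sliding window below fixes.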

Sliding window counter algorithm

It smooths out the number of requests being processed. The sliding window counter subdivides the window into small intervals and slides it forward over time. This avoids the doubled burst that the fixed window algorithm allows at window boundaries, but the finer the time intervals, the more space the algorithm needs

  • Divide time into small intervals
  • Each request increments the counter of its interval; the window is maintained across multiple intervals
  • As each interval elapses, the oldest interval is discarded and the newest one is included
  • If the total count in the current window exceeds the limit, further requests in the window are discarded (see the sketch after this list)
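
A sketch of that interval bookkeeping, under the same caveat that the names are mine:

package limiter

import (
	"sync"
	"time"
)

// SlidingWindowLimiter splits the window into small intervals and
// discards the oldest interval as time slides forward
type SlidingWindowLimiter struct {
	mu       sync.Mutex
	limit    int           // max requests per full window
	interval time.Duration // length of one sub-interval
	buckets  []int         // request count per sub-interval
	pos      int           // index of the current sub-interval
	lastTick time.Time
}

func NewSlidingWindowLimiter(limit, slots int, window time.Duration) *SlidingWindowLimiter {
	return &SlidingWindowLimiter{
		limit:    limit,
		interval: window / time.Duration(slots),
		buckets:  make([]int, slots),
		lastTick: time.Now(),
	}
}

func (l *SlidingWindowLimiter) Allow() bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	// slide: each elapsed interval drops the oldest bucket
	for now := time.Now(); now.Sub(l.lastTick) >= l.interval; l.lastTick = l.lastTick.Add(l.interval) {
		l.pos = (l.pos + 1) % len(l.buckets)
		l.buckets[l.pos] = 0
	}
	total := 0
	for _, c := range l.buckets {
		total += c
	}
	if total >= l.limit {
		return false // window is full: discard the request
	}
	l.buckets[l.pos]++
	return true
}

More slots give finer accuracy at the cost of more space, which is the trade-off mentioned above.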

Leaky bucket

To avoid simply losing the red requests above, the leaky bucket method adds a cache: any incoming request is first put into the bucket's cache

A leaky bucket sounds a bit like a funnel, with water trickling out of the bottom

As shown in the figure, each water drop is a request event. If the bucket can cache 5,000 events while the server actually processes 1,000 events per second, then at peak time the response time can stretch to about 5 seconds. It can't be rush hour all the time, though; a constant 5-second response would be painfully slow and badly hurt the experience

If the bucket is full and more requests keep arriving, they are simply thrown away; once again, requests get lost

  • Each request is treated as a water drop and stored in the bucket
  • The bucket leaks water at a constant rate and stops leaking when empty; for example, leaking 1,000 drops per second corresponds to processing 1,000 requests per second
  • If drops arrive faster than the bucket drains and it fills up, the excess drops are discarded (a channel-based sketch follows this list)
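
A leaky bucket maps naturally onto a buffered channel plus a fixed-rate consumer. A rough sketch, with illustrative names and *http.Request standing in for any task type:

package limiter

import (
	"net/http"
	"time"
)

// LeakyBucket: the buffered channel is the bucket's cache,
// the ticker goroutine is the constant-rate leak
type LeakyBucket struct {
	drops chan *http.Request
}

func NewLeakyBucket(capacity int, leakInterval time.Duration, handle func(*http.Request)) *LeakyBucket {
	b := &LeakyBucket{drops: make(chan *http.Request, capacity)}
	go func() {
		ticker := time.NewTicker(leakInterval)
		defer ticker.Stop()
		for range ticker.C {
			select {
			case r := <-b.drops:
				handle(r) // leak one drop per tick: a constant processing rate
			default:
				// bucket is empty: nothing leaks this tick
			}
		}
	}()
	return b
}

// Add puts a request into the bucket, dropping it if the bucket is full
func (b *LeakyBucket) Add(r *http.Request) bool {
	select {
	case b.drops <- r:
		return true
	default:
		return false // overflow: the drop is discarded
	}
}

With capacity 5000 and a 1 ms leak interval, you get the 1,000-requests-per-second, up-to-5-seconds-latency behavior described above.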

advantage

  • It has a certain cache capacity, which makes it better than the two approaches above

disadvantage

  • When the bucket is full and new requests arrive, data is still lost
  • If the bucket stays full for long, response latency suffers; whether the experience is acceptable depends on the bucket size

The token bucket

By dynamically controlling the number of tokens, it can serve client requests better: both how many tokens are generated and how fast they are generated can be tuned flexibly

As above, the differences between the token bucket and the leaky bucket are:

  • The token bucket can control the rate at which tokens are generated. For example, during peak hours, more tokens can be generated to meet the needs of clients
  • You can also cache data

If you find that it is always peak time, consider expanding the token bucket

advantage
  • The token bucket can dynamically control the rate at which tokens are generated
  • You can also cache data (a hand-rolled sketch of the mechanics follows)
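
To show the mechanics, here is a hand-rolled token bucket sketch; the names are mine, and in practice you would use golang.org/x/time/rate, covered below:

package limiter

import (
	"sync"
	"time"
)

// TokenBucket refills `rate` tokens per second, up to `capacity`
type TokenBucket struct {
	mu       sync.Mutex
	rate     float64 // tokens generated per second
	capacity float64 // bucket size
	tokens   float64
	last     time.Time
}

func NewTokenBucket(rate, capacity float64) *TokenBucket {
	return &TokenBucket{rate: rate, capacity: capacity, tokens: capacity, last: time.Now()}
}

func (b *TokenBucket) Allow() bool {
	b.mu.Lock()
	defer b.mu.Unlock()
	now := time.Now()
	// refill lazily, based on the time elapsed since the last call
	b.tokens += now.Sub(b.last).Seconds() * b.rate
	if b.tokens > b.capacity {
		b.tokens = b.capacity // a full bucket discards extra tokens
	}
	b.last = now
	if b.tokens < 1 {
		return false // no token available: reject
	}
	b.tokens--
	return true
}

// SetRate is what makes the token bucket flexible:
// raise the rate during peak hours, lower it afterwards
func (b *TokenBucket) SetRate(r float64) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.rate = r
}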

How to add flow control to HTTP Middleware

How do we add flow control to HTTP middleware? Every request has to pass through the middleware, which gives us one place to either turn it away or let it through to be processed. The original snippet only shows the constructor; I've also filled in a ServeHTTP method below (assuming GetConn/ReleaseConn methods on ConnLimiter) so the type actually satisfies http.Handler:

type middleWareHandler struct {
	r *httprouter.Router
	l *ConnLimiter
}

func NewMiddleWareHandler(r *httprouter.Router, cc int) http.Handler {
	m := middleWareHandler{}
	m.r = r
	m.l = NewConnLimiter(cc) // limit the number of concurrent connections
	return m
}

// ServeHTTP routes every request through the connection limiter first
func (m middleWareHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	if !m.l.GetConn() { // no slot left: reject the request
		http.Error(w, "too many requests", http.StatusTooManyRequests)
		return
	}
	defer m.l.ReleaseConn() // free the slot once the request is done
	m.r.ServeHTTP(w, r)
}
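
The post doesn't show ConnLimiter itself, so here is one plausible implementation, matching the GetConn/ReleaseConn names assumed above: a buffered channel acting as a counting semaphore

type ConnLimiter struct {
	concurrentConn int
	bucket         chan int
}

func NewConnLimiter(cc int) *ConnLimiter {
	return &ConnLimiter{
		concurrentConn: cc,
		bucket:         make(chan int, cc),
	}
}

// GetConn tries to take a slot; it fails once cc connections are in flight
func (cl *ConnLimiter) GetConn() bool {
	select {
	case cl.bucket <- 1:
		return true
	default:
		return false // bucket full: all cc slots taken
	}
}

// ReleaseConn frees a slot when the request finishes
func (cl *ConnLimiter) ReleaseConn() {
	<-cl.bucket
}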

With the algorithms out of the way, let's talk about the rate limiter

Rate limiter

The rate limiter is a very important component in back-end services

What can it be used for?

  • Limiting request rate
  • Protection services

There are many ways to implement a rate limiter, essentially all based on the algorithms above:

  • Sliding window method
  • Token Bucket
  • Leaky Bucket

Go's quasi-standard extension packages already implement the rate-limiting algorithm for us, so we don't need to build our own wheels

golang.org/x/time/rate can be used directly; its rate limiter is based on the token bucket

The token bucket here is the bucket we described above: it holds tokens, and the system pours tokens into it at a constant rate

When the bucket is full, newly generated tokens are discarded. A user request takes a token from the bucket:

  • If tokens remain, requests can keep taking them and proceed

  • If no tokens are left, the request has to wait until the system puts a new token into the bucket

Let's see how the rate limiter works

Construct a rate limiter object

limiter := rate.NewLimiter(5, 1)
  • The first parameter is r Limit, representing how many tokens are generated per second and put into the token bucket
  • The second parameter is b int, representing the size of the token bucket

In other words, the rate limiter constructed here has:

  • The token bucket size is 1
  • Tokens are placed into the bucket at a rate of five tokens per second

Of course, we can also use other settings provided by the package

limit := rate.Every(500 * time.Millisecond)
limiter := rate.NewLimiter(limit, 1)

The Every method specifies the interval at which tokens are put into the token bucket; the code above puts one token into the bucket every 500 ms

In other words, the code above generates 2 tokens in 1 second

Limiter also supports adjusting the rate and bucket size on the fly; let's look at the two methods:

// Change the rate at which tokens are placed
SetLimit(Limit) 
// Change the token bucket size
SetBurst(int) 

Let’s write an example:

package main

import (
    "context"
    "log"
    "time"

    "golang.org/x/time/rate"
)

func main() {
    l := rate.NewLimiter(1, 2)
    // Limit is the number of tokens generated per second,
    // Burst is the maximum number of tokens the bucket can hold
    log.Println(l.Limit(), l.Burst())
    for i := 0; i < 50; i++ {
        log.Println("...... wait")
        // Wait blocks until a token is fetched or the 2s timeout expires
        c, cancel := context.WithTimeout(context.Background(), time.Second*2)
        if err := l.Wait(c); err != nil {
            log.Println("limiter wait error : " + err.Error())
        }
        cancel()
        log.Println("wait ......")

        // Reserve returns how long we would have to wait for a new token,
        // so the task can sleep for that long before proceeding
        r := l.Reserve()
        log.Println("reserve time :", r.Delay())

        // Allow reports whether a token can be taken right now
        a := l.Allow()
        log.Println("Allow == ", a)
    }
}

Looking at the example above, we can see the package provides three consumption methods:

  • Wait

Wait(ctx) is equivalent to WaitN(ctx, 1)

If there are not enough tokens in the bucket (fewer than n), the Wait method blocks until the tokens become available, or returns an error if the context is canceled or its deadline expires

If the condition is already met, it returns immediately

Wait takes a context parameter, so you can set a timeout on the wait

  • Allow

Look at the function prototypes

func (lim *Limiter) Allow() bool
func (lim *Limiter) AllowN(now time.Time, n int) bool

Allow() is equivalent to AllowN(time.Now(), 1)

AllowN reports whether at least n tokens are available in the bucket at the given time. If so, it consumes n tokens and returns true; otherwise it consumes nothing and returns false. Allow is handy for non-blocking limiting, for example in HTTP middleware, as sketched below.
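
A sketch of a tiny HTTP middleware built on Allow; the handler wiring is my own, only rate.NewLimiter and Allow come from the package:

package main

import (
    "net/http"

    "golang.org/x/time/rate"
)

// rateLimit rejects requests with 429 when no token is available
func rateLimit(l *rate.Limiter, next http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        if !l.Allow() {
            http.Error(w, "too many requests", http.StatusTooManyRequests)
            return
        }
        next.ServeHTTP(w, r)
    })
}

func main() {
    l := rate.NewLimiter(100, 200) // 100 tokens per second, burst of 200
    http.Handle("/", rateLimit(l, http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        w.Write([]byte("ok"))
    })))
    http.ListenAndServe(":8080", nil)
}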

  • Reserve

Reserve() is equivalent to ReserveN(time.Now(), 1)

ReserveN always returns a *Reservation object when called, regardless of whether enough tokens are available

We can call the Delay() method of this object, with the following notes:

This method returns the time we still need to wait:

  • If the waiting time is 0 seconds, no waiting is required
  • If it is greater than 0 seconds, we must wait that long before proceeding

Of course, if you don't want to wait, you can give the reserved tokens back by calling the object's Cancel() method. A typical flow is sketched below
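
Putting Reserve, Delay, and Cancel together; maxWait is an illustrative patience threshold of our own, not part of the package:

package main

import (
    "log"
    "time"

    "golang.org/x/time/rate"
)

func main() {
    l := rate.NewLimiter(1, 2)
    const maxWait = 3 * time.Second // how long we are willing to wait

    r := l.Reserve()
    if !r.OK() {
        // only happens when n exceeds the burst size in ReserveN
        log.Println("reservation can never be satisfied")
        return
    }
    if d := r.Delay(); d > maxWait {
        r.Cancel() // not willing to wait: give the token back
        log.Println("would have to wait", d, ", giving up")
    } else {
        time.Sleep(d) // wait out the delay, then do the work
        log.Println("task executed")
    }
}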

conclusion

  • This post briefly introduced rate limiting, circuit breaking, and service degradation
  • Illustrated the four common rate-limiting algorithms
  • Showed a simple way to add flow control to HTTP middleware
  • Shared the basic usage of the rate limiter in golang.org/x/time/rate

Okay, that's it for this time; next up is a presentation and share on internet protocols

Technology is open, and so should our mindset be. Embrace change, live in the sunshine, and strive to move forward.

I am Nezha, welcome to like, see you next time ~