An introduction to rate limiting

When it comes to high-availability systems, three protections come up constantly: caching, degradation, and rate limiting. This post focuses on rate limiting. A rate limit allows only a specified volume of events into the system; events that exceed the limit are denied service, queued to wait, or degraded.

For a server, rate limiting ensures that at least part of the request traffic still gets a proper response, which is better than responding to no requests at all, or worse, triggering a system avalanche. Rate limiting and circuit breaking are often confused; in my view, the biggest difference is that rate limiting is mainly implemented on the server side, while circuit breaking is mainly implemented on the client side. Of course, a service can act as both server and client, in which case rate limiting and circuit breaking coexist in the same service.

So why do we need to limit traffic at all? Many people's first reaction is "because otherwise the service breaks", but that is not the whole picture. I would argue that rate limiting is a self-protection measure driven by resource scarcity or security concerns. It lets a service make the most of limited resources: the service handles the expected volume of traffic, and whatever exceeds the limit is denied, queued, or degraded.

Support for rate limiting varies from system to system, but there are standards. RFC 6585 defines the HTTP status code 429 Too Many Requests, indicating that the user has sent too many requests in a given amount of time ("rate limiting"). The response may also include a Retry-After header telling the client how long to wait before making a new request:

```http
HTTP/1.1 429 Too Many Requests
Content-Type: text/html
Retry-After: 3600

<html>
  <head>
    <title>Too Many Requests</title>
  </head>
  <body>
    <h1>Too Many Requests</h1>
    <p>I only allow 50 requests per hour to this Web site per
       logged in user. Try again soon.</p>
  </body>
</html>
```

Many application frameworks also integrate rate limiting and return explicit rate-limit fields in the response headers:

  • X-Rate-Limit-Limit: the maximum number of requests allowed in the current period.

  • X-Rate-Limit-Remaining: the number of requests remaining in the current period.

  • X-Rate-Limit-Reset: the number of seconds until the current period resets.

These response headers tell the caller the server-side rate limit, guaranteeing a ceiling on backend interface access, and the client can adapt its requests according to the headers in the response.
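
As an illustration, here is a minimal sketch of how a client might honor these conventions over a plain net/http GET. The helper name doWithBackoff, the single retry, and the one-second default are illustrative choices, not part of any standard:

```go
package main

import (
	"fmt"
	"net/http"
	"strconv"
	"time"
)

// doWithBackoff issues a GET and, on 429, waits the duration the server
// suggests in Retry-After (seconds) before retrying once.
func doWithBackoff(url string) (*http.Response, error) {
	resp, err := http.Get(url)
	if err != nil {
		return nil, err
	}
	if resp.StatusCode == http.StatusTooManyRequests {
		resp.Body.Close()
		wait := 1 * time.Second // default when the header is absent
		if s := resp.Header.Get("Retry-After"); s != "" {
			if secs, err := strconv.Atoi(s); err == nil {
				wait = time.Duration(secs) * time.Second
			}
		}
		fmt.Println("rate limited, retrying after", wait)
		time.Sleep(wait)
		return http.Get(url)
	}
	return resp, nil
}
```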

Classification of rate limiting

Take the term apart: "limit" and "flow". "Limit", a verb meaning to restrict, is easy to understand. "Flow", however, refers to different resources or metrics in different scenarios, and that is where the diversity lies: in networking it can be a byte stream, in a database it can be TPS, for an API it can be QPS or concurrent requests, for merchandise it can even be inventory. Whatever the flow is, it must be quantifiable, measurable, observable, and amenable to statistics. Rate limiting can be classified along several dimensions, as shown in the figure below.

(Figure: classification of rate limiting)

Space is limited, so this article covers only a few common categories.

Classification by granularity

By granularity, rate limiting divides into:

  • Single-node rate limiting

  • Distributed rate limiting

Today's systems are mostly distributed architectures; truly single-machine deployments are rare, so "single-machine rate limiting" is more precisely "single-service-node rate limiting": once requests to one service node reach the threshold, that node takes limiting measures on its own.

(Figure: single-node rate limiting)

In the narrow sense, distributed rate limiting means multi-node cooperative limiting at the access layer, e.g. NGINX + Redis or a distributed gateway. In the broad sense, it means organically combining multiple nodes (possibly belonging to different services) into one overall limiting service.

(Figure: distributed rate limiting)

Single-node limiting prevents traffic from crushing an individual service node but has no view of the overall traffic. Distributed limiting suits fine-grained control and can match different limiting rules to different scenarios. Unlike the single-node variant, distributed limiting requires centralized storage, usually Redis. Introducing centralized storage raises the following issues:

  • Data consistency. The ideal model is point-in-time consistency: every component sees exactly the same data at any moment. Since information propagates at a finite speed (at most the speed of light), perfect agreement at every instant is physically unattainable; it remains a theoretical model. For rate limiting we only need reads to observe the latest write, i.e. linearizability, which is achievable.

  • Time consistency. Distinct from the point-in-time consistency above, this is about clocks agreeing across service nodes. Suppose a cluster has three machines: A and B read Tue Dec 3 16:29:28 CST 2019 while C reads a slightly different time. Even synchronizing with ntpdate leaves some residual error, which is a real problem for algorithms sensitive to time windows.

  • Timeouts. In a distributed system, nodes communicate over the network: network jitter, an overloaded limiting middleware responding slowly, or a badly chosen timeout threshold can all cause the application node's limiting call to time out. In that case, should the request be allowed or denied?

  • Performance and reliability. The limiting middleware's resources are always finite (writes may even go through a single point), so it has a performance ceiling. And if the middleware becomes unavailable, gracefully falling back to single-node limiting is a sound degradation scheme, as sketched below.
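
As a sketch of that idea, here is a naive distributed fixed-window limiter, assuming the go-redis v8 client; the function name, key scheme, and allow-on-error policy are my own illustrative choices:

```go
package main

import (
	"context"
	"fmt"
	"time"

	"github.com/go-redis/redis/v8"
)

// fixedWindowAllow is a naive distributed fixed-window limiter:
// INCR a per-window key, set its TTL on first use, and compare the
// count to the limit. If Redis is unreachable, it degrades by
// allowing the request (failing open).
func fixedWindowAllow(ctx context.Context, rdb *redis.Client, key string, limit int64, window time.Duration) bool {
	// all nodes map "now" onto the same window index
	windowKey := fmt.Sprintf("%s:%d", key, time.Now().UnixNano()/int64(window))
	count, err := rdb.Incr(ctx, windowKey).Result()
	if err != nil {
		// Redis down or slow: a real system might fall back to a
		// local single-node limiter here instead of always allowing
		return true
	}
	if count == 1 {
		rdb.Expire(ctx, windowKey, window)
	}
	return count <= limit
}
```

Failing open (allowing traffic when Redis is unreachable) trades protection for availability; falling back to a local limiter is usually the better production choice.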

Classification by limited object

By object type:

  • Request-based rate limiting

  • Resource-based rate limiting

Request-based limiting is generally implemented as limiting a total quota or limiting QPS. Limiting the total caps some metric outright: if a product on sale has 100,000 units of stock, at most 100,000 can be sold. When a WeChat red envelope sent to a group is split into 10 pieces, only 10 people can grab one; the 11th opens it to see "too slow, the red envelopes are all gone".

Limiting QPS is what is usually meant by rate limiting, and it is generally done at the interface level: if an interface may be accessed only 100 times per second, its peak QPS can only be 100. The hardest part of QPS limiting is estimating and locating the threshold, which is discussed below.

Resource-based limiting keys off the usage of service resources: identify a service's critical resources and restrict them, for example the number of TCP connections, threads, or memory usage. Limiting resources reflects the service's current load more directly, but as with QPS limiting, the difficulty lies in identifying the threshold for each resource, which must be tuned iteratively in practice toward a satisfactory value. One common instance, limiting concurrent requests, is sketched below.
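
For instance, limiting in-flight requests (one concrete resource) can be as simple as a counting semaphore built from a buffered channel. A minimal sketch, where the middleware name and the limit of 100 are illustrative:

```go
package main

import (
	"errors"
	"net/http"
)

var inflight = make(chan struct{}, 100) // at most 100 concurrent requests

var errOverloaded = errors.New("too many concurrent requests")

// limitConcurrency wraps a handler and rejects requests once the
// concurrency budget is exhausted, returning 429 to the caller.
func limitConcurrency(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		select {
		case inflight <- struct{}{}:
			defer func() { <-inflight }()
			next.ServeHTTP(w, r)
		default:
			http.Error(w, errOverloaded.Error(), http.StatusTooManyRequests)
		}
	})
}
```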

Classification by algorithm

Whatever the dimension or method of classification, rate limiting is ultimately implemented by an algorithm. The common implementations:

  • Counter

  • Token bucket

  • Leaky bucket

Counter

Fixed window counter

Counter-based limiting is the simplest limiting algorithm; in daily development it usually means the fixed window counter. For example, if an interface or service may receive at most 1,000 requests per second, we set the limit at 1000 QPS. The idea is very simple: maintain a counter for a fixed unit of time, and reset it to zero once that unit of time has passed.

(Figure: fixed window counter)

Its operation steps:

  1. Divide the timeline into multiple independent, fixed-size windows;

  2. Each request falling within a time window increments that window's counter by one;

  3. Once the counter exceeds the threshold, subsequent requests in this window are rejected; when time enters the next window, the counter resets to 0.

Let's implement it in a bit of code.

```go
package limit

import (
	"sync/atomic"
	"time"
)

type Counter struct {
	Count       uint64 // current count within the window
	Limit       uint64 // maximum number of requests allowed per window
	Interval    int64  // window size, in milliseconds
	RefreshTime int64  // start of the current window, in milliseconds
}

func NewCounter(count, limit uint64, interval, rt int64) *Counter {
	return &Counter{
		Count:       count,
		Limit:       limit,
		Interval:    interval,
		RefreshTime: rt,
	}
}

// RateLimit reports whether the current request is allowed. Note: the
// read of c.Count in the comparison is not atomic with the increment,
// so this is a simplified demo rather than a strictly race-free limiter.
func (c *Counter) RateLimit() bool {
	now := time.Now().UnixNano() / 1e6 // current time in milliseconds
	if now < (c.RefreshTime + c.Interval) {
		atomic.AddUint64(&c.Count, 1)
		return c.Count <= c.Limit
	}
	// the window has passed: start a new one and zero the counter
	c.RefreshTime = now
	atomic.AddUint64(&c.Count, -c.Count) // unsigned negation wraps, zeroing Count
	return true
}
```

Test code:

```go
package limit

import (
	"fmt"
	"testing"
	"time"
)

func Test_Counter(t *testing.T) {
	// RefreshTime must be in milliseconds to match RateLimit's clock
	counter := NewCounter(0, 5, 100, time.Now().UnixNano()/1e6)
	for i := 0; i < 10; i++ {
		go func(i int) {
			for k := 0; k <= 10; k++ {
				fmt.Println(counter.RateLimit())
				if k%3 == 0 {
					time.Sleep(102 * time.Millisecond)
				}
			}
		}(i)
	}
	time.Sleep(10 * time.Second)
}
```

Looking at the logic above, doesn't the fixed window counter seem easy? It is, and that's one of its advantages. But it also has two serious drawbacks. Imagine a fixed 1s window with a threshold of 100, where 99 requests arrive in the first 100ms: only one more request can be accepted in the remaining 900ms, so the algorithm has essentially no ability to absorb bursts. The second drawback is the boundary problem: suppose 100 requests pass in the last 500ms of window 00:00:00 and another 100 pass in the first 500ms of window 00:00:01; within the one second straddling the boundary, the service has accepted twice the threshold.

Sliding window counter

The sliding window algorithm is an improvement on the fixed window, familiar from TCP flow control. The fixed window counter can be seen as a special case of the sliding window counter, which works as follows:

  1. Divide the unit of time into multiple intervals, generally a number of small time slots;

  2. Each interval has its own counter; a request falling into an interval increments that interval's counter by one;

  3. When a time slot elapses, the window slides one slot to the right, discarding the oldest interval and adding a new one;

  4. To compute the total number of requests in the whole window, sum the counters of all its intervals; if the total exceeds the limit, all further requests within the window are rejected.

By splitting the window into finer slots and "sliding" with time, this algorithm avoids both problems of the fixed window counter. The drawback is that the finer the time slots, the more space the algorithm requires.

Common implementations are based on a Redis zset or on a circular queue. With Redis zset, the key serves as the rate-limit identifier, the value must be unique (a UUID works), and the score is the timestamp, preferably at nanosecond precision. ZADD, EXPIRE, ZCOUNT, and ZREMRANGEBYSCORE do the work, and enabling pipelining maximizes performance. The implementation is simple, but the downside is that the zset keeps growing.
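
Here is a minimal in-memory sketch of the circular-queue variant, with one counter per small time slot; the type and field names are my own:

```go
package limit

import (
	"sync"
	"time"
)

// SlidingWindow splits the window into len(buckets) slots and rotates
// through them; the total across all slots is compared to the limit.
type SlidingWindow struct {
	mu       sync.Mutex
	buckets  []int
	limit    int
	slot     time.Duration // width of one slot
	pos      int           // index of the current slot
	lastTick time.Time
}

func NewSlidingWindow(limit, slots int, window time.Duration) *SlidingWindow {
	return &SlidingWindow{
		buckets:  make([]int, slots),
		limit:    limit,
		slot:     window / time.Duration(slots),
		lastTick: time.Now(),
	}
}

func (s *SlidingWindow) Allow() bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	// advance the ring, zeroing every slot we slide past
	elapsed := time.Since(s.lastTick)
	for elapsed >= s.slot {
		s.pos = (s.pos + 1) % len(s.buckets)
		s.buckets[s.pos] = 0
		elapsed -= s.slot
		s.lastTick = s.lastTick.Add(s.slot)
	}
	total := 0
	for _, c := range s.buckets {
		total += c
	}
	if total >= s.limit {
		return false
	}
	s.buckets[s.pos]++
	return true
}
```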

Leaky bucket

In the leaky bucket algorithm, water first enters the bucket, and the bucket releases water at a fixed rate; when inflow exceeds outflow, the excess water simply spills over. Translated to requests, the leaky bucket is a server-side queue: requests beyond the limiting threshold are denied service. The algorithm uses the queue to release traffic at a fixed rate, smoothing it out.

It is well captured by one of the most widely circulated diagrams on the Internet: water (requests) pours in at the top while the bucket leaks at a constant rate from the bottom.

Leaky bucket implementation steps:

  1. Store each arriving request in a fixed-size queue;

  2. Dequeue requests at a fixed rate; if the queue is empty, dequeuing pauses;

  3. If the queue is full, additional requests are rejected. (A minimal sketch follows.)
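
A minimal sketch of these steps in Go, with a buffered channel standing in for the queue and a ticker draining it at a fixed rate; the capacity of 5 and the 200ms drain interval are arbitrary illustrations:

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	bucket := make(chan int, 5) // the bucket (queue) holds at most 5 requests

	// consumer: leak one request every 200ms, i.e. a steady 5 QPS
	go func() {
		for range time.Tick(200 * time.Millisecond) {
			select {
			case req := <-bucket:
				fmt.Println("processing request", req, "at", time.Now())
			default: // bucket empty, nothing to leak
			}
		}
	}()

	// producer: a burst of 10 requests; those beyond capacity are rejected
	for i := 1; i <= 10; i++ {
		select {
		case bucket <- i:
			fmt.Println("request", i, "queued")
		default:
			fmt.Println("request", i, "rejected: bucket full")
		}
	}
	time.Sleep(2 * time.Second)
}
```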

The leaky bucket algorithm has an obvious drawback: when a large burst of requests arrives in a short time, each request must wait in the queue for some time before being served, even if the server is not under heavy load.

Token bucket

The token bucket algorithm puts tokens into a bucket at a constant rate. To be processed, a request must first obtain a token from the bucket; when no tokens are left, service is denied. In principle, the token bucket is the leaky bucket reversed: one controls what goes "in", the other what goes "out".

Beyond this difference in "direction", there is a more important one: the token bucket limits the average inflow rate while allowing a degree of burst traffic; as long as enough tokens remain, bursts pass, and several tokens may be taken at once.

The steps of the token bucket algorithm:

  1. Generate tokens at a fixed rate and put them into the token bucket;

  2. If the bucket is full, excess tokens are discarded. When a request arrives, it tries to take a token from the bucket; requests holding a token get executed;

  3. If the bucket is empty, the request is rejected. (See the sketch below.)
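
Before turning to the golang.org/x/time/rate library discussed below, here is a bare-bones sketch of these steps; it refills lazily from elapsed time instead of running a background goroutine, and all names are my own:

```go
package limit

import (
	"sync"
	"time"
)

// TokenBucket refills lazily: instead of a background goroutine adding
// tokens, each Allow() credits the tokens accumulated since the last call.
type TokenBucket struct {
	mu       sync.Mutex
	capacity float64   // bucket size, i.e. the maximum burst
	tokens   float64   // tokens currently available
	rate     float64   // tokens added per second
	last     time.Time // time of the last refill
}

func NewTokenBucket(rate, capacity float64) *TokenBucket {
	return &TokenBucket{capacity: capacity, tokens: capacity, rate: rate, last: time.Now()}
}

func (tb *TokenBucket) Allow() bool {
	tb.mu.Lock()
	defer tb.mu.Unlock()
	now := time.Now()
	tb.tokens += now.Sub(tb.last).Seconds() * tb.rate
	if tb.tokens > tb.capacity {
		tb.tokens = tb.capacity // overflow tokens are discarded
	}
	tb.last = now
	if tb.tokens < 1 {
		return false // bucket empty: reject
	}
	tb.tokens--
	return true
}
```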

How to choose among the four strategies?

  • Fixed window: simple to implement but crude; unless the situation is urgent and you need to stop the bleeding right now, use it only as a temporary emergency plan.

  • Sliding window: simple and easy to implement, and can cope with moderate bursts of traffic.

  • Leaky bucket: a strong fit when the flow must be absolutely uniform; resource utilization is not pushed to the extreme, but its "wide in, strict out" pattern protects the system while keeping some headroom, making it a broadly applicable scheme.

  • Token bucket: best when the system regularly sees traffic surges and you want to squeeze out as much service performance as possible.

How to set the rate-limit threshold?

Whatever the classification or implementation, every system faces a common problem: how to determine the limiting threshold. Some teams start from a small, experience-based threshold and adjust gradually; others derive it from stress tests. The problem with the latter is that the test model may not match the production environment: stress-testing a single interface cannot reflect the state of the whole system, and even full-link stress tests rarely reproduce the traffic mix of real scenarios.

Another idea is stress testing plus application monitoring data: from peak QPS and resource usage, extrapolate proportionally to estimate the threshold. The problem is that the system's performance inflection point is unknown, so a simple projection may be inaccurate or even wildly off. And as described in "Overload Control for Scaling WeChat Microservices", in systems with complex dependencies, overload control on one particular service may harm the whole system, or the service's implementation may itself be flawed.

Ideally, a more AI-driven feedback system would set limiting thresholds automatically in the future, applying overload protection dynamically based on current QPS, resource status, RT, and other related signals.

Whatever the threshold, the system should pay attention to the following:

  1. Operating indicators: current service QPS, machine resource usage, database connections, number of concurrent threads, etc.;

  2. Relationships between resources and calls: external link requests, associations between internal services, strong and weak dependencies between services, etc.;

  3. Control behavior when the limit is hit: reject subsequent requests outright and fail fast, or queue them to wait.

Using a Go rate-limiting library

Java has many rate-limiting libraries, such as concurrency-limits, Sentinel, and Guava; on the Go side, the quasi-official choice lives under golang.org/x/time:

Github.com/golang/time…

Code that makes it into a language's own libraries is worth spending time reading. Anyone who has studied Java has likely marveled at the exquisite design of AQS; time/rate has its own subtleties. Let's start with how the library is used.

github.com/golang/time/rate

Before analyzing source code, it's best to understand how the library is used and what its API looks like; with that initial understanding, reading the code yields twice the result for half the effort. Space is limited, so later posts will analyze the source of several rate-limiting libraries.

The library's API documentation:

Godoc.org/golang.org/…

The time/rate library implements rate limiting with the token bucket algorithm. As described above, the system puts tokens into a fixed-size bucket at a constant rate, and burst traffic is allowed. Browsing the documentation, we find the constructor:

```go
func NewLimiter(r Limit, b int) *Limiter
```

NewLimiter returns a new Limiter that allows events up to rate r and permits bursts of at most b tokens. In other words, the limiter caps the frequency of events: the bucket holds at most b tokens and starts full (so at most b events can happen at once, each event consuming one token), and r tokens are added to the bucket per unit time interval (1s by default).

```go
limiter := rate.NewLimiter(10, 5)
```

The example above creates a token bucket with a capacity of 5, into which 10 tokens are put every second. A careful reader will notice that the first argument of NewLimiter has type Limit; a look at the source shows that Limit is actually an alias of float64:

```go
// Limit defines the maximum frequency of some events.
// Limit is represented as number of events per second.
// A zero Limit allows no events.
type Limit float64
```

A limiter can also be constructed by specifying the interval at which tokens are placed in the bucket:

```go
limiter := rate.NewLimiter(rate.Every(100*time.Millisecond), 5)
```

The two examples are equivalent, except that the 10 tokens are not all deposited at the start of each second but spread out evenly, one every 100ms. rate.Limiter provides three families of methods for limiting speed:

  • Allow/AllowN

  • Wait/WaitN

  • Reserve/ReserveN

The following compares the usage and application scenarios of the three. The first family:

```go
func (lim *Limiter) Allow() bool
func (lim *Limiter) AllowN(now time.Time, n int) bool
```

Allow is shorthand for AllowN(time.Now(), 1). The API may look a little abstract at first; the documentation clears it up:

```
AllowN reports whether n events may happen at time now.
Use this method if you intend to drop / skip events that exceed the rate limit.
Otherwise use Reserve or Wait.
```

In essence, AllowN reports whether n tokens can be taken from the bucket at the given time, i.e. whether n events may happen at once. Both methods are non-blocking: when the condition isn't met, the event is simply skipped rather than waiting for enough tokens.

As the second line of the documentation says, use this method when you intend to drop or skip events that exceed the rate limit. For example, with the limiter instantiated earlier, suppose the server receives eight requests simultaneously at some instant; if the bucket holds fewer than eight tokens, the requests that cannot obtain a token are discarded. A small example:

```go
func AllowDemo() {
	limiter := rate.NewLimiter(rate.Every(200*time.Millisecond), 5)
	i := 0
	for {
		i++
		if limiter.Allow() {
			fmt.Println(i, "====Allow======", time.Now())
		} else {
			fmt.Println(i, "====Disallow======", time.Now())
		}
		time.Sleep(80 * time.Millisecond)
		if i == 15 {
			return
		}
	}
}
```

Execution Result:

```
1 ====Allow====== 2019-12-14 15:54:09.9852178 +0800 CST m=+0.005998001
2 ====Allow====== 2019-12-14 15:54:10.1012231 +0800 CST m=+0.122003301
3 ====Allow====== 2019-12-14 15:54:10.1823056 +0800 CST m=+0.203085801
4 ====Allow====== 2019-12-14 15:54:10.263238 +0800 CST m=+0.284018201
5 ====Allow====== 2019-12-14 15:54:10.344224 +0800 CST m=+0.365004201
6 ====Allow====== 2019-12-14 15:54:10.4242458 +0800 CST m=+0.445026001
7 ====Allow====== 2019-12-14 15:54:10.5043101 +0800 CST m=+0.525090301
8 ====Allow====== 2019-12-14 15:54:10.5852232 +0800 CST m=+0.606003401
9 ====Disallow====== 2019-12-14 15:54:10.6662181 +0800 CST m=+0.686998301
10 ====Disallow====== 2019-12-14 15:54:10.7462189 +0800 CST m=+0.766999101
11 ====Allow====== 2019-12-14 15:54:10.8272182 +0800 CST m=+0.847998401
12 ====Disallow====== 2019-12-14 15:54:10.9072192 +0800 CST m=+0.927999401
13 ====Allow====== 2019-12-14 15:54:10.9872224 +0800 CST m=+1.008002601
14 ====Disallow====== 2019-12-14 15:54:11.0672253 +0800 CST m=+1.088005501
15 ====Disallow====== 2019-12-14 15:54:11.1472946 +0800 CST m=+1.168074801
```

The second family (since ReserveN is the more complex one, we cover WaitN first):

```go
func (lim *Limiter) Wait(ctx context.Context) (err error)
func (lim *Limiter) WaitN(ctx context.Context, n int) (err error)
```

Likewise, Wait is shorthand for WaitN(ctx, 1). Unlike AllowN, WaitN blocks: if the bucket holds fewer than n tokens, it waits until enough tokens accumulate. The maximum blocking time can be bounded via the first parameter, by passing a context created with context.WithDeadline or context.WithTimeout.

```go
func WaitNDemo() {
	limiter := rate.NewLimiter(10, 5)
	i := 0
	for {
		i++
		// 500ms leaves enough headroom to accumulate the 4 tokens requested below
		ctx, cancel := context.WithTimeout(context.Background(), 500*time.Millisecond)
		if i == 6 {
			cancel() // cancel the 6th request up front to demonstrate the error path
		}
		err := limiter.WaitN(ctx, 4)
		if err != nil {
			fmt.Println(err) // prints "context canceled" for i == 6
			continue
		}
		fmt.Println(i, ", execute:", time.Now())
		if i == 10 {
			return
		}
	}
}
```

Execution Result:

```
1, execute: 2019-12-14 15:45:15.538539 +0800 CST m=+0.011023401
2, execute: 2019-12-14 15:45:15.8395195 +0800 CST m=+0.312003901
3, execute: 2019-12-14 15:45:16.2396051 +0800 CST m=+0.712089501
4, execute: 2019-12-14 15:45:16.6395169 +0800 CST m=+1.112001301
5, execute: 2019-12-14 15:45:17.0385893 +0800 CST m=+1.511073701
context canceled
7, execute: 2019-12-14 15:45:17.440514 +0800 CST m=+1.912998401
8, execute: 2019-12-14 15:45:17.8405152 +0800 CST m=+2.312999601
9, execute: 2019-12-14 15:45:18.2405402 +0800 CST m=+2.713024601
10, execute: 2019-12-14 15:45:18.6405179 +0800 CST m=+3.113002301
```

Blocking waits suit scenarios like consuming from a message queue: capping the maximum consumption rate keeps an over-fast consumer from overloading itself.

The third family:

```go
func (lim *Limiter) Reserve() *Reservation
func (lim *Limiter) ReserveN(now time.Time, n int) *Reservation
```

Unlike the previous two families, Reserve/ReserveN returns a *Reservation. The API documentation lists five methods on Reservation:

```go
func (r *Reservation) Cancel()                     // equivalent to CancelAt(time.Now())
func (r *Reservation) CancelAt(now time.Time)
func (r *Reservation) Delay() time.Duration        // equivalent to DelayFrom(time.Now())
func (r *Reservation) DelayFrom(now time.Time) time.Duration
func (r *Reservation) OK() bool
```

These five methods give developers room to act according to the business scenario; they are more flexible, and more involved, than the two automatic families above. A quick example of Reserve/ReserveN:

```go
func ReserveNDemo() {
	limiter := rate.NewLimiter(10, 5)
	i := 0
	for {
		i++
		reserve := limiter.ReserveN(time.Now(), 4)
		// OK() reports false when the requested tokens can never be supplied,
		// e.g. when n exceeds the bucket capacity
		if !reserve.OK() {
			return
		}
		ts := reserve.Delay()
		time.Sleep(ts)
		fmt.Println("execute:", time.Now(), ts)
		if i == 10 {
			return
		}
	}
}
```

Execution Result:

```
execute: 2019-12-14 16:22:26.6446468 +0800 CST m=+0.008000201 0s
execute: 2019-12-14 16:22:26.9466454 +0800 CST m=+0.309998801 247.999299ms
execute: 2019-12-14 16:22:27.3446473 +0800 CST m=+0.708000701 398.001399ms
execute: 2019-12-14 16:22:27.7456488 +0800 CST m=+1.109002201 399.999499ms
execute: 2019-12-14 16:22:28.1456465 +0800 CST m=+1.508999901 398.997999ms
execute: 2019-12-14 16:22:28.5456457 +0800 CST m=+1.908999101 399.0003ms
execute: 2019-12-14 16:22:28.9446482 +0800 CST m=+2.308001601 399.001099ms
execute: 2019-12-14 16:22:29.3446524 +0800 CST m=+2.708005801 399.998599ms
execute: 2019-12-14 16:22:29.7446514 +0800 CST m=+3.108004801 399.9944ms
execute: 2019-12-14 16:22:30.1446475 +0800 CST m=+3.508000901 399.9954ms
```

If Cancel() is called before Delay(), the returned delay is zero, meaning the operation can proceed immediately without being limited:

```go
func ReserveNDemo2() {
	limiter := rate.NewLimiter(5, 5)
	i := 0
	for {
		i++
		reserve := limiter.ReserveN(time.Now(), 4)
		// OK() reports false when the requested tokens can never be supplied,
		// e.g. when n exceeds the bucket capacity
		if !reserve.OK() {
			return
		}
		if i == 6 || i == 5 {
			reserve.Cancel() // return the reserved tokens to the bucket
		}
		ts := reserve.Delay()
		time.Sleep(ts)
		fmt.Println(i, ", execute:", time.Now(), ts)
		if i == 10 {
			return
		}
	}
}
```

Execution Result:

```
1, execute: 2019-12-14 16:25:45.7974857 +0800 CST m=+0.007005901 0s
2, execute: 2019-12-14 16:25:46.3985135 +0800 CST m=+0.608033701 552.0048ms
3, execute: 2019-12-14 16:25:47.1984796 +0800 CST m=+1.407999801 798.9722ms
4, execute: 2019-12-14 16:25:47.9975269 +0800 CST m=+2.207047101 799.0061ms
5, execute: 2019-12-14 16:25:48.7994803 +0800 CST m=+3.009000501 799.9588ms
6, execute: 2019-12-14 16:25:48.7994803 +0800 CST m=+3.009000501 0s
7, execute: 2019-12-14 16:25:48.7994803 +0800 CST m=+3.009000501 0s
8, execute: 2019-12-14 16:25:49.5984782 +0800 CST m=+3.807998401 798.0054ms
9, execute: 2019-12-14 16:25:50.3984779 +0800 CST m=+4.607998101 799.0075ms
10, execute: 2019-12-14 16:25:51.1995131 +0800 CST m=+5.409033301 799.0078ms
```

Besides the three limiting modes above, time/rate can also adjust the limiter's parameters dynamically. The relevant APIs:

```go
func (lim *Limiter) SetBurst(newBurst int)                    // equivalent to SetBurstAt(time.Now(), newBurst)
func (lim *Limiter) SetBurstAt(now time.Time, newBurst int)   // reset the bucket capacity (burst size)
func (lim *Limiter) SetLimit(newLimit Limit)                  // equivalent to SetLimitAt(time.Now(), newLimit)
func (lim *Limiter) SetLimitAt(now time.Time, newLimit Limit) // reset the rate at which tokens are placed
```

With these four methods, an application can tune the token refill rate and the bucket capacity on the fly based on its own state.
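
For example, a service could tighten its own limits under high load and relax them afterwards. A hypothetical sketch, with the load signal faked purely for illustration:

```go
package main

import (
	"fmt"
	"time"

	"golang.org/x/time/rate"
)

func main() {
	lim := rate.NewLimiter(10, 5)

	// pretend load signal: "high load" for the first 3 seconds only
	start := time.Now()
	highLoad := func() bool { return time.Since(start) < 3*time.Second }

	// periodically tune the limiter from the load signal
	go func() {
		for range time.Tick(time.Second) {
			if highLoad() {
				lim.SetLimit(5) // halve the token refill rate
				lim.SetBurst(2) // shrink the bucket
			} else {
				lim.SetLimit(10) // restore the normal rate
				lim.SetBurst(5)  // restore the normal burst
			}
		}
	}()

	for i := 0; i < 50; i++ {
		fmt.Println(i, lim.Allow())
		time.Sleep(150 * time.Millisecond)
	}
}
```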

Wrapping up

With the above, you should have a general grasp of each limiting approach's application scenarios, strengths, and weaknesses; hopefully it helps in daily development. Rate limiting is only a small part of service governance as a whole, and it must be combined with other techniques to improve service stability and user experience.

Original text: mp.weixin.qq.com/s/HTQoAo1hV…
