In the second half of this year, Alibaba open-sourced its flow-limiting system, Sentinel. The official introduction describes Sentinel with a series of lofty terms such as flow limiting, circuit breaking and degradation, traffic shaping, and system load protection, along with flattering adjectives such as lightweight, professional, and real-time. As a consumer of technology, you can't help but exclaim: impressive! What's more, Sentinel's launch was presented by Zijin, a senior technical specialist at Alibaba and a female developer, a rare sight in the male-dominated IT industry.

I have spent a whole day studying Sentinel's functionality and code, and now have a rough understanding of its overall architecture and some of its technical details. Here is a comprehensive write-up to share with you.

Getting started with Sentinel

First of all, Sentinel is not a particularly complex system; its principles and architecture can be understood by any ordinary developer. Don't be put off by architecture diagrams showing Sentinel surrounded by a whole bunch of big open-source middleware: that is just fancy wrapping, and the Sentinel core is genuinely lightweight.

Let's start with its Hello World and get some insight into the architecture by walking through this introductory code.

<dependency>
    <groupId>com.alibaba.csp</groupId>
    <artifactId>sentinel-core</artifactId>
    <version>1.4.0</version>
</dependency>

Sentinel supports two modes of flow limiting: single-node and distributed. Single-node flow limiting constrains a code fragment in the current process by QPS, by the number of concurrent threads, or by whole-machine load. Once the configured threshold is exceeded, an exception is thrown or false is returned. I will refer to the restricted code fragments as "critical sections."

Distributed flow limiting adds a centralized ticket server, which generates only a fixed number of tickets per second for each configured resource. Before executing the critical-section code, a process must obtain a ticket from this server; if it succeeds it may proceed, otherwise a limiting exception is thrown. Distributed flow limiting therefore requires extra network round trips. Readers of my little book Redis Deep Adventure will find that its Redis flow-limiting module is similar to Sentinel's, except that Sentinel's ticket server is self-developed on top of the Netty framework.
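The per-second ticket idea can be sketched in a few lines of plain Java. This is an illustration of the concept only, not Sentinel's actual TokenService API; all names below are invented.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// A minimal sketch of a per-second "ticket" counter: each resource gets a
// fixed quota of tickets per second, and callers must acquire one before
// entering the critical section.
public class TicketSketch {
    private final Map<String, Integer> quota = new ConcurrentHashMap<>();
    private final Map<String, Integer> used = new ConcurrentHashMap<>();
    private long windowStart = System.currentTimeMillis();

    public void setQuota(String resource, int ticketsPerSecond) {
        quota.put(resource, ticketsPerSecond);
    }

    // Returns true if a ticket was granted within the current one-second window.
    public synchronized boolean tryAcquire(String resource) {
        long now = System.currentTimeMillis();
        if (now - windowStart >= 1000) { // new window: reset all counters
            windowStart = now;
            used.clear();
        }
        int limit = quota.getOrDefault(resource, 0);
        int taken = used.getOrDefault(resource, 0);
        if (taken >= limit) {
            return false; // out of tickets, caller should be blocked
        }
        used.put(resource, taken + 1);
        return true;
    }

    public static void main(String[] args) {
        TicketSketch server = new TicketSketch();
        server.setQuota("tutorial", 2);
        System.out.println(server.tryAcquire("tutorial")); // true
        System.out.println(server.tryAcquire("tutorial")); // true
        System.out.println(server.tryAcquire("tutorial")); // false
    }
}
```

In the real distributed setup this counter lives in the remote ticket server, so every `tryAcquire` costs a network round trip.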

Sentinel can be used in two forms: exception catching and Boolean. That is, when limiting is triggered, either an exception is thrown or false is returned. Let's look at the exception-catching form first, in standalone mode.

import java.util.ArrayList;
import java.util.List;
import com.alibaba.csp.sentinel.Entry;
import com.alibaba.csp.sentinel.SphU;
import com.alibaba.csp.sentinel.slots.block.BlockException;
import com.alibaba.csp.sentinel.slots.block.RuleConstant;
import com.alibaba.csp.sentinel.slots.block.flow.FlowRule;
import com.alibaba.csp.sentinel.slots.block.flow.FlowRuleManager;

public class SentinelTest {
	public static void main(String[] args) {
		// Configure rules
		List<FlowRule> rules = new ArrayList<>();
		FlowRule rule = new FlowRule();
		rule.setResource("tutorial");
		// QPS must not exceed 1
		rule.setCount(1);
		rule.setGrade(RuleConstant.FLOW_GRADE_QPS);
		rule.setLimitApp("default");
		rules.add(rule);
        // Load the rule
		FlowRuleManager.loadRules(rules);
        // Start running the flow-limited scoped code
		while (true) {
			Entry entry = null;
			try {
				entry = SphU.entry("tutorial");
				System.out.println("hello world");
			} catch (BlockException e) {
				System.out.println("blocked");
			} finally {
				if (entry != null) {
					entry.exit();
				}
			}
			try {
				Thread.sleep(500);
			} catch (InterruptedException e) {}
		}
	}
}

Using Sentinel requires us to supply flow-limiting rules and then wrap the critical-section code in a limiting scope. In the example above, the standalone QPS of the tutorial resource is limited to 1, while the loop actually attempts to run at a QPS of 2. The excess executions are limited, and the corresponding SphU.entry() call throws a BlockException. Here is the output:

INFO: log base dir is: /Users/qianwp/logs/csp/
INFO: log name use pid is: false
hello world
blocked
hello world
blocked
hello world
blocked
hello world
blocked
...

As the output shows, Sentinel records detailed flow-limiting logs in local files, which can be collected as a data source for alerting.

Now let's look at the Boolean form, which is just as easy to use and very similar.

import java.util.ArrayList;
import java.util.List;
import com.alibaba.csp.sentinel.SphO;
import com.alibaba.csp.sentinel.slots.block.RuleConstant;
import com.alibaba.csp.sentinel.slots.block.flow.FlowRule;
import com.alibaba.csp.sentinel.slots.block.flow.FlowRuleManager;

public class SentinelTest {
	public static void main(String[] args) {
		// Configure rules
		List<FlowRule> rules = new ArrayList<>();
		FlowRule rule = new FlowRule();
		rule.setResource("tutorial");
		// QPS must not exceed 1
		rule.setCount(1);
		rule.setGrade(RuleConstant.FLOW_GRADE_QPS);
		rule.setLimitApp("default");
		rules.add(rule);
		FlowRuleManager.loadRules(rules);
		// Run restricted-scoped code
		while (true) {
			if (SphO.entry("tutorial")) {
				try {
					System.out.println("hello world");
				} finally {
					SphO.exit();
				}
			} else {
				System.out.println("blocked");
			}
			try {
				Thread.sleep(500);
			} catch (InterruptedException e) {}
		}
	}
}

Dynamic rule configuration

In the examples above, the rules are written in code. In a real project, rules need to support dynamic configuration. This requires a rule configuration source (a database such as Redis or ZooKeeper), a rule-change notification mechanism, and a rule configuration console, so that administrators can adjust rules in a management UI and have them delivered to the application servers in real time.

// Redis address
RedisConnectionConfig redisConf = new RedisConnectionConfig("localhost", 6379, 1000);
// Deserialization converter
Converter<String, List<FlowRule>> converter = r -> JSON.parseArray(r, FlowRule.class);
// Define the rule source, with a full part and an incremental part:
// the full rule set lives under a string key, increments arrive on a Pub/Sub channel key
ReadableDataSource<String, List<FlowRule>> redisDataSource =
        new RedisDataSource<List<FlowRule>>(redisConf, "app_key", "app_pubsub_key", converter);
FlowRuleManager.register2Property(redisDataSource.getProperty());

Health status reporting

Application servers that integrate Sentinel need to report their flow-limiting status to the Dashboard, so that the status of all services can be displayed in real time. Sentinel uses a pull model for this: each process registers an HTTP service, and the Dashboard periodically polls it to obtain health and flow-limiting information.

The current open-source Dashboard has no persistence capability. When an administrator modifies rules in the console, the Dashboard pushes the new rules to the specific service processes through their HTTP health-service addresses; if an application restarts, its rules are reset. If you want to persist the rule source in something like Redis, you will need to customize the Dashboard yourself. Customization is easy: just implement its built-in persistence interface.

Distributed flow limiting

As mentioned above, distributed flow limiting requires a separate Ticket Server to issue tickets. Only when a ticket has been obtained can the critical-section code execute, and the Ticket Server needs its own rule input source as well.

Framework adaptation

The critical sections Sentinel protects are code blocks, so by stretching the boundaries of the critical section Sentinel can be adapted directly to frameworks such as Dubbo, Spring Boot, gRPC, and message queues. The adapter for each framework defines the critical-section scope uniformly at the request boundary, so that users get limiting protection automatically without writing any protective code by hand.
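What an adapter does can be sketched as a generic wrapper around a request handler. This is a hypothetical illustration with invented names, not Sentinel's real adapter classes: the limiting check here is a trivial stand-in for SphO.entry/exit.

```java
import java.util.function.Supplier;

// A sketch of the framework-adapter pattern: every incoming request is wrapped
// in a limiting scope named after the request, so application code needs no
// explicit limiting logic.
public class AdapterSketch {
    // Stand-in for SphO.entry/exit: allow only the first N requests.
    static int budget = 2;

    static boolean tryEnter(String resource) {
        if (budget > 0) { budget--; return true; }
        return false;
    }

    static void exit() { /* release bookkeeping would go here */ }

    // The "adapter": wraps a request handler in a limiting scope.
    static <T> T handle(String resource, Supplier<T> handler, T fallback) {
        if (!tryEnter(resource)) {
            return fallback;          // request blocked, degrade gracefully
        }
        try {
            return handler.get();     // normal request processing
        } finally {
            exit();
        }
    }

    public static void main(String[] args) {
        for (int i = 0; i < 3; i++) {
            System.out.println(handle("GET /books", () -> "200 OK", "429 Too Many Requests"));
        }
    }
}
```

The real adapters hook this wrapping into each framework's interception point (a Dubbo filter, a Spring interceptor, a gRPC server interceptor), which is why users see protection "for free."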

Circuit breaking and degradation

Flow limiting caps QPS or the number of concurrent threads. Separately, unstable request processing or a broken downstream service can cause requests to take too long or throw exceptions too frequently; in that case the service needs to be degraded. In form, degradation is hardly different from flow limiting: a critical section is defined in the same way. The difference is that thrown exceptions must be recorded, so the frequency of failing requests can be measured; only with this metric can degradation be triggered.

// Define the degradation rule
List<DegradeRule> rules = new ArrayList<>();
DegradeRule rule = new DegradeRule();
rule.setResource("tutorial");
// Within 5s, exceptions should not exceed 10
rule.setCount(10);
rule.setGrade(RuleConstant.DEGRADE_GRADE_EXCEPTION_COUNT);
rule.setLimitApp("default");
rules.add(rule);
DegradeRuleManager.loadRules(rules);

Entry entry = null;
try {
  entry = SphU.entry(key);
  // The business code is here
} catch (Throwable t) {
  // Record an exception
  if (!BlockException.isBlockException(t)) {
    Tracer.trace(t);
  }
} finally {
  if (entry != null) {
    entry.exit();
  }
}

FlowException is thrown when flow limiting is triggered, and DegradeException when circuit breaking is triggered; both inherit from BlockException.
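The hierarchy lets callers react differently to the two outcomes. The sketch below mirrors the names of Sentinel's exception classes but defines its own stand-ins so it runs standalone; it is an illustration, not the library's code.

```java
// Distinguishing the two limiting outcomes via the exception hierarchy.
public class BlockHandlingSketch {
    static class BlockException extends Exception {}
    static class FlowException extends BlockException {}
    static class DegradeException extends BlockException {}

    static String describe(BlockException e) {
        if (e instanceof FlowException) {
            return "rejected by flow rule";       // too many requests
        } else if (e instanceof DegradeException) {
            return "rejected by degrade rule";    // downstream unhealthy
        }
        return "rejected";
    }

    public static void main(String[] args) {
        System.out.println(describe(new FlowException()));    // rejected by flow rule
        System.out.println(describe(new DegradeException())); // rejected by degrade rule
    }
}
```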

Hotspot flow limiting

There is also a special rule type for limiting dynamic hotspot resources. Internally, an LRU algorithm computes the top-N hotspot parameter values, and flow limiting is applied to those values; special per-value settings can also be supplied for particular parameters. For example, the following limits the access frequency per user and per book, but sets a higher special limit for one particular user and one particular book.

ParamFlowRule ruleUser = new ParamFlowRule();
// QPS for the same userId must not exceed 10
ruleUser.setParamIdx(0).setCount(10);
// But for user "qianwp", the QPS limit is 100
ParamFlowItem uitem = new ParamFlowItem("qianwp", 100, String.class);
ruleUser.setParamFlowItemList(Collections.singletonList(uitem));

ParamFlowRule ruleBook = new ParamFlowRule();
// QPS for the same bookId must not exceed 20
ruleBook.setParamIdx(1).setCount(20);
// But for the "redis" book, the QPS limit is 100
ParamFlowItem bitem = new ParamFlowItem("redis", 100, String.class);
ruleBook.setParamFlowItemList(Collections.singletonList(bitem));

// Load the rules
List<ParamFlowRule> rules = new ArrayList<>();
rules.add(ruleUser);
rules.add(ruleBook);
ParamFlowRuleManager.loadRules(rules);

// The userId user accesses the bookId book
Entry entry = SphU.entry(key, EntryType.IN, 1, userId, bookId);

The difficulty of hotspot flow limiting lies in counting traffic for hotspot resources within a fixed-length sliding time window. Sentinel designs a special data structure, LeapArray, for this purpose; its algorithm is intricate and deserves a separate analysis later.
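The core idea behind a LeapArray-style structure can be sketched with a small bucketed counter. This is an illustration in the same spirit, not Sentinel's actual implementation: time is divided into fixed buckets arranged in a ring, and stale buckets are lazily reset when the window wraps around.

```java
// A minimal sliding-window counter: bucketCount buckets of bucketMillis each.
public class SlidingWindowSketch {
    private final int bucketCount;
    private final long bucketMillis;
    private final long[] bucketStart;   // start time of each bucket
    private final long[] counts;        // events counted in each bucket

    public SlidingWindowSketch(int bucketCount, long bucketMillis) {
        this.bucketCount = bucketCount;
        this.bucketMillis = bucketMillis;
        this.bucketStart = new long[bucketCount];
        this.counts = new long[bucketCount];
    }

    public synchronized void add(long nowMillis) {
        int idx = (int) ((nowMillis / bucketMillis) % bucketCount);
        long start = nowMillis - nowMillis % bucketMillis;
        if (bucketStart[idx] != start) { // bucket is stale: reuse it
            bucketStart[idx] = start;
            counts[idx] = 0;
        }
        counts[idx]++;
    }

    // Sum of all buckets still inside the window ending at nowMillis.
    public synchronized long total(long nowMillis) {
        long windowStart = nowMillis - (long) bucketCount * bucketMillis;
        long sum = 0;
        for (int i = 0; i < bucketCount; i++) {
            if (bucketStart[i] > windowStart) {
                sum += counts[i];
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        SlidingWindowSketch w = new SlidingWindowSketch(2, 500); // 1-second window
        w.add(0); w.add(100); w.add(600);
        System.out.println(w.total(700));  // 3: all events inside the window
        System.out.println(w.total(1400)); // 1: only the event at t=600 remains
    }
}
```

Reusing buckets in place instead of allocating new ones is what keeps this O(1) per event, which matters when the statistics run on every request.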

System adaptive flow limiting: overload protection

When the system is heavily loaded, it must limit incoming traffic to avoid being overwhelmed by a flood of requests. The protection strategy is to gradually restrict QPS, watch the system load recover, then gradually release QPS again; if the load rises again, QPS is tightened once more, reaching a dynamic equilibrium through a dedicated balancing algorithm.

One difficulty is the load metric itself, which comes from the operating system's load1 value. load1 is not updated in real time, and there is a long lag between overload and apparent recovery. A one-size-fits-all scheme that blocks every request during this recovery window, then releases everything the moment load1 drops, inevitably causes the load to swing up and down and service processing to stutter on and off. The author therefore transplanted ideas from TCP congestion control to implement smooth overload protection. The algorithm is very clever, the implementation is not complicated, and the effect is significant.

The algorithm defines a steady-state formula; once the steady state is broken, the system load fluctuates. The essence of the algorithm is to continuously adjust the relevant parameters so that the steady state is re-established.

The steady-state formula is simple: ThreadNum * (1/ResponseTime) = QPS. That is, system QPS equals the number of threads times the number of requests a single thread can complete per second. The system continuously samples QPS and ResponseTime for all critical sections, from which the steady-state number of concurrent threads can be computed. When the load exceeds the threshold, a request is rejected if the current thread count exceeds the steady-state thread count.
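A worked example makes the formula concrete. The numbers below are made up for illustration; the code simply rearranges the steady-state formula into ThreadNum = QPS * ResponseTime and uses it as an admission threshold.

```java
// Worked example of ThreadNum * (1/RT) = QPS used for admission control.
public class SteadyStateSketch {
    // Steady-state threads = QPS * ResponseTime (RT converted to seconds).
    static double steadyStateThreads(double qps, double rtMillis) {
        return qps * (rtMillis / 1000.0);
    }

    // Under overload, admit a request only if current concurrency is at or
    // below the steady-state concurrency.
    static boolean admit(int currentThreads, double qps, double rtMillis) {
        return currentThreads <= steadyStateThreads(qps, rtMillis);
    }

    public static void main(String[] args) {
        // Sampled: 200 QPS with a 50 ms average response time
        System.out.println(steadyStateThreads(200, 50)); // 10.0 steady-state threads
        System.out.println(admit(8, 200, 50));   // true: below steady state
        System.out.println(admit(12, 200, 50));  // false: reject under overload
    }
}
```

The sampled QPS and RT keep changing, so the steady-state thread count adapts automatically: if RT climbs because the machine is struggling, the admission threshold tightens by itself.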

Defining an adaptive limiting rule involves several parameters:

  1. The system load level above which overload protection is triggered
  2. The thresholds enforced once protection is active: maximum number of threads, maximum average response time, and maximum QPS
List<SystemRule> rules = new ArrayList<SystemRule>();
SystemRule rule = new SystemRule();
rule.setHighestSystemLoad(3.0);
rule.setAvgRt(10);
rule.setQps(20);
rule.setMaxThread(10);
rules.add(rule);
SystemRuleManager.loadRules(rules);

As the code also shows, a system adaptive rule needs no resource name: it is a global rule, automatically applied to all critical sections. When the load exceeds the threshold, every critical-section resource tightens its belt together.