1. Overview

In Sentinel, every resource is identified by a resource name (resourceName), and each call to a resource creates an Entry object. An Entry can be created automatically through adapters for mainstream frameworks, or explicitly through annotations or the SphU API. When an Entry is created, a chain of slots (the slot chain) is created as well. Each slot has its own responsibility, for example:

  • NodeSelectorSlot: collects the call paths of resources and stores them in a tree structure, so that flow limiting and degradation can be applied per call path;
  • ClusterBuilderSlot: stores the statistics of a resource and its caller information, such as RT, QPS and thread count, which serve as the basis for multi-dimensional flow limiting and degradation;
  • LogSlot: records block exceptions and provides specific logs for troubleshooting;
  • StatisticSlot: records and aggregates runtime metrics across different dimensions;
  • AuthoritySlot: performs blacklist and whitelist control based on the configured lists and the call origin;
  • SystemSlot: controls the total inbound traffic according to the state of the system, such as load1;
  • FlowSlot: performs flow control according to the preset flow limiting rules and the statistics gathered by the preceding slots;
  • DegradeSlot: performs circuit breaking and degradation according to statistics and preset rules.
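
To make the idea of a slot chain concrete, here is a minimal chain-of-responsibility sketch. The classes SketchSlot, NamedSlot and SlotChainSketch are hypothetical stand-ins written for this article, not Sentinel's real slot API, and only a subset of the slots is wired in. Each slot does its own work and then calls fireEntry to hand the request to the next slot.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a slot; not Sentinel's real slot interface.
abstract class SketchSlot {
    private SketchSlot next;

    // Links the given slot after this one and returns it, so calls can be chained.
    SketchSlot linkTo(SketchSlot next) {
        this.next = next;
        return next;
    }

    // Each slot does its own work here, then hands over to the next slot.
    abstract void entry(String resource, List<String> trace);

    void fireEntry(String resource, List<String> trace) {
        if (next != null) {
            next.entry(resource, trace);
        }
    }
}

class NamedSlot extends SketchSlot {
    private final String name;

    NamedSlot(String name) {
        this.name = name;
    }

    @Override
    void entry(String resource, List<String> trace) {
        trace.add(name);            // the slot's own responsibility would go here
        fireEntry(resource, trace); // then continue down the chain
    }
}

class SlotChainSketch {
    // Builds a small chain and returns the order in which the slots were visited.
    static List<String> run(String resource) {
        SketchSlot head = new NamedSlot("NodeSelectorSlot");
        head.linkTo(new NamedSlot("ClusterBuilderSlot"))
            .linkTo(new NamedSlot("StatisticSlot"))
            .linkTo(new NamedSlot("FlowSlot"))
            .linkTo(new NamedSlot("DegradeSlot"));
        List<String> trace = new ArrayList<>();
        head.entry(resource, trace);
        return trace;
    }

    public static void main(String[] args) {
        System.out.println(run("demoResource"));
    }
}
```

A slot that wants to reject the request would simply throw instead of calling fireEntry, which is the pattern the FlowSlot and DegradeSlot source code follows.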

The following diagram shows the relationship between these slots:


2. FlowSlot analysis

1. Introduction to FlowSlot

The official documentation describes FlowSlot as follows:

Flow control monitors application traffic indicators, such as QPS or concurrent threads, and controls the traffic when it reaches a specified threshold to avoid being overwhelmed by instantaneous traffic peaks and ensure high availability of applications.

FlowSlot controls traffic based on preset rules and real-time information collected by NodeSelectorSlot, ClusterBuilderSlot, and StatisticSlot.

When a rule is triggered, executing Entry nodeA = SphU.entry(resourceName) throws a FlowException. FlowException is a subclass of BlockException, so you can catch BlockException to define your own handling logic for flow-limited requests.

Multiple flow limiting rules can be created for one resource. FlowSlot traverses all the flow limiting rules of the resource until one of them triggers flow limiting or all of them have been checked.
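
The two points above (rules are traversed in order, and callers catch the parent exception type) can be sketched as follows. SketchBlockException, SketchFlowException and FlowCheckSketch are hypothetical stand-ins that only mirror the exception hierarchy described here; a rule is reduced to a plain QPS threshold, which is an assumption made for brevity.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins that only mirror the exception hierarchy described above.
class SketchBlockException extends Exception {
    SketchBlockException(String message) {
        super(message);
    }
}

class SketchFlowException extends SketchBlockException {
    SketchFlowException(String message) {
        super(message);
    }
}

class FlowCheckSketch {
    // Walks the thresholds in order and stops at the first one the new request would exceed.
    static void checkFlow(List<Integer> thresholds, int currentQps) throws SketchBlockException {
        for (int threshold : thresholds) {
            if (currentQps + 1 > threshold) {
                throw new SketchFlowException("threshold " + threshold + " exceeded");
            }
        }
    }

    // Callers catch the parent type, so one handler covers every kind of block.
    static String call(List<Integer> thresholds, int currentQps) {
        try {
            checkFlow(thresholds, currentQps);
            return "passed";
        } catch (SketchBlockException e) {
            return "blocked: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(call(Arrays.asList(100, 10), 5));
        System.out.println(call(Arrays.asList(100, 10), 10));
    }
}
```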

A traffic limiting rule mainly consists of the following factors, which can be combined to achieve different traffic limiting effects:

  • Resource: indicates the name of the resource, which is the object of the traffic limiting rule

  • Count: traffic limiting threshold

  • Grade: Type of flow limiting threshold (QPS or number of concurrent threads)

  • LimitApp: the call origin that the rule applies to. If it is default, call origins are not distinguished

  • Strategy: the flow limiting strategy based on call relationships (direct, related resource, or call chain)

  • ControlBehavior: Flow control effect (direct reject, Warm Up, uniform queuing)

    • Immediate reject (RuleConstant.CONTROL_BEHAVIOR_DEFAULT): this is the default behavior. Requests that exceed the threshold are rejected immediately and a FlowException is thrown.
    • Warm Up (RuleConstant.CONTROL_BEHAVIOR_WARM_UP): if the system load has been low for a while and a large number of requests suddenly arrive, the system may not be able to handle all of them at once. If incoming requests are increased steadily instead, the system warms up and can eventually handle all of them. The warm-up period is set through the warmUpPeriodSec field of the flow rule.
    • Uniform rate limiting, also called uniform queuing (RuleConstant.CONTROL_BEHAVIOR_RATE_LIMITER): this policy strictly controls the interval between requests, so that requests pass at a steady, uniform rate. It is an implementation of the leaky bucket algorithm and is often used to smooth out bursty traffic (for example, message processing).
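
The uniform queuing behavior is the easiest of the three to illustrate. Below is a simplified, single-threaded sketch of the leaky bucket idea, not Sentinel's actual controller. The class name UniformRateSketch, the maxQueueingMs parameter and the explicit nowMs argument (used instead of the system clock so the example is deterministic) are all assumptions of this sketch.

```java
// Simplified, single-threaded sketch of uniform queuing; not Sentinel's real controller.
class UniformRateSketch {
    private final long intervalMs;      // minimum spacing between two passed requests
    private final long maxQueueingMs;   // longest time a request is allowed to wait
    private long latestPassedTime = -1; // scheduled pass time of the most recent request

    UniformRateSketch(int permitsPerSecond, long maxQueueingMs) {
        this.intervalMs = 1000L / permitsPerSecond;
        this.maxQueueingMs = maxQueueingMs;
    }

    // Returns how many milliseconds the request must wait before passing, or -1 if rejected.
    long tryAcquire(long nowMs) {
        long expected = latestPassedTime < 0 ? nowMs : latestPassedTime + intervalMs;
        if (expected <= nowMs) {
            latestPassedTime = nowMs; // the slot is already free, pass right away
            return 0;
        }
        long wait = expected - nowMs;
        if (wait > maxQueueingMs) {
            return -1;                // queue is too long, reject
        }
        latestPassedTime = expected;  // reserve the next slot in the queue
        return wait;
    }

    public static void main(String[] args) {
        // 10 requests per second means one request every 100 ms.
        UniformRateSketch limiter = new UniformRateSketch(10, 200);
        for (int i = 0; i < 4; i++) {
            System.out.println(limiter.tryAcquire(0));
        }
    }
}
```

With four requests arriving at the same instant, the first passes immediately, the next two are spaced 100 ms apart, and the fourth is rejected because it would have to wait longer than the 200 ms queueing limit.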

We can view the real-time statistics of resources with the following command:

// Command:
curl http://localhost:8719/tree
// Example:
idx id    thread pass  blocked   success total aRt   1m-pass   1m-block   1m-all   e
2   abc647 0      460    46          46   1    27      630       276        897      0


Among them:

  • thread: the number of threads currently processing the resource;
  • pass: the number of requests that arrived within one second;
  • blocked: the number of requests that were flow-controlled within one second;
  • success: the number of requests that completed successfully within one second;
  • total: the sum of incoming requests and blocked requests within one second;
  • aRt: the average response time of the resource within one second;
  • 1m-pass: the number of requests that arrived within one minute;
  • 1m-block: the number of requests that were blocked within one minute;
  • 1m-all: the sum of incoming requests and blocked requests within one minute;
  • e (exception): the number of business exceptions within one second.
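
If you want to read these columns programmatically, one rough approach is to pair the header with a data row by position. The sketch below assumes the output is whitespace-separated exactly as in the example above; TreeLineSketch is a hypothetical helper and not part of Sentinel.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class TreeLineSketch {
    // Pairs each header column with the value in the same position of a data row.
    static Map<String, String> parse(String header, String row) {
        String[] names = header.trim().split("\\s+");
        String[] values = row.trim().split("\\s+");
        if (names.length != values.length) {
            throw new IllegalArgumentException("column count mismatch");
        }
        Map<String, String> fields = new LinkedHashMap<>();
        for (int i = 0; i < names.length; i++) {
            fields.put(names[i], values[i]);
        }
        return fields;
    }

    public static void main(String[] args) {
        String header = "idx id    thread pass  blocked   success total aRt   1m-pass   1m-block   1m-all   e";
        String row = "2   abc647 0      460    46          46   1    27      630       276        897      0";
        Map<String, String> fields = parse(header, row);
        System.out.println("pass=" + fields.get("pass") + ", blocked=" + fields.get("blocked"));
    }
}
```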

2. Source code interpretation

@Override
public void entry(Context context, ResourceWrapper resourceWrapper, DefaultNode node, int count, boolean prioritized, Object... args) throws Throwable {
    checkFlow(resourceWrapper, context, node, count, prioritized);

    fireEntry(context, resourceWrapper, node, count, prioritized, args);
}


@Override
public void exit(Context context, ResourceWrapper resourceWrapper, int count, Object... args) {
    fireExit(context, resourceWrapper, count, args);
}

1. In the entry phase, the checkFlow method is executed first.

void checkFlow(ResourceWrapper resource, Context context, DefaultNode node, int count, boolean prioritized) throws BlockException {
    checker.checkFlow(ruleProvider, resource, context, node, count, prioritized);
}

2. Note the extra key parameter “ruleProvider”. Let’s look at how it is implemented.

private final Function<String, Collection<FlowRule>> ruleProvider = new Function<String, Collection<FlowRule>>() {
    @Override
    public Collection<FlowRule> apply(String resource) {
        // Flow rule map should not be null.
        Map<String, List<FlowRule>> flowRules = FlowRuleManager.getFlowRuleMap();
        return flowRules.get(resource);
    }
};

3. First, the global flow rule map is obtained from FlowRuleManager. Then the list of FlowRule objects for the resource is looked up by our unique key, the resource name.

4. Let’s look at the checkFlow method.

public void checkFlow(Function<String, Collection<FlowRule>> ruleProvider, ResourceWrapper resource, Context context, DefaultNode node, int count, boolean prioritized) throws BlockException {

    if (ruleProvider == null || resource == null) {
        return;
    }
    Collection<FlowRule> rules = ruleProvider.apply(resource.getName());
    if (rules != null) {
        for (FlowRule rule : rules) {
            if (!canPassCheck(rule, context, node, count, prioritized)) {
                throw new FlowException(rule.getLimitApp(), rule);
            }
        }
    }
}

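5. First, ruleProvider and resource are checked; if either is null, or if the resource has no rules, the check is skipped. Otherwise each rule is checked in turn, and as soon as one rule fails canPassCheck, an exception is thrown: throw new FlowException(rule.getLimitApp(), rule);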

public boolean canPassCheck(/*@NonNull*/ FlowRule rule, Context context, DefaultNode node, int acquireCount, boolean prioritized) {
    String limitApp = rule.getLimitApp();
    if (limitApp == null) {
        return true;
    }

    if (rule.isClusterMode()) {
        return passClusterCheck(rule, context, node, acquireCount, prioritized);
    }

    return passLocalCheck(rule, context, node, acquireCount, prioritized);
}

6. Stepping into this method, we can see that cluster mode and local mode are distinguished here. Even when cluster mode is selected, the subsequent code verifies the cluster again, and if that verification fails, the check is degraded to local mode.
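
The fallback described in step 6 can be sketched roughly as follows. Everything here is hypothetical: the TokenStatus enum, the fallbackToLocal flag and the ClusterFallbackSketch class are assumptions made for illustration, and the choice to let the request pass when no fallback is configured is also an assumption of this sketch rather than something stated in the source above.

```java
import java.util.function.IntPredicate;

class ClusterFallbackSketch {
    // Hypothetical outcome of asking a remote token server for permission.
    enum TokenStatus { OK, BLOCKED, FAIL }

    // Tries the cluster result first. If the cluster could not answer (FAIL),
    // falls back to the local check when allowed, otherwise lets the request pass.
    static boolean canPass(boolean clusterMode, TokenStatus clusterResult,
                           boolean fallbackToLocal, IntPredicate localCheck, int acquireCount) {
        if (!clusterMode) {
            return localCheck.test(acquireCount);
        }
        switch (clusterResult) {
            case OK:
                return true;
            case BLOCKED:
                return false;
            default:
                return !fallbackToLocal || localCheck.test(acquireCount);
        }
    }

    public static void main(String[] args) {
        IntPredicate localCheck = count -> count <= 5; // stand-in for the local rule check
        System.out.println(canPass(true, TokenStatus.FAIL, true, localCheck, 3));
        System.out.println(canPass(true, TokenStatus.FAIL, true, localCheck, 8));
    }
}
```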

static Node selectNodeByRequesterAndStrategy(/*@NonNull*/ FlowRule rule, Context context, DefaultNode node) {
    // The limit app should not be empty.
    String limitApp = rule.getLimitApp();
    int strategy = rule.getStrategy();
    String origin = context.getOrigin();

    if (limitApp.equals(origin) && filterOrigin(origin)) {
        if (strategy == RuleConstant.STRATEGY_DIRECT) {
            // Matches limit origin, return origin statistic node.
            return context.getOriginNode();
        }

        return selectReferenceNode(rule, context, node);
    } else if (RuleConstant.LIMIT_APP_DEFAULT.equals(limitApp)) {
        if (strategy == RuleConstant.STRATEGY_DIRECT) {
            // Return the cluster node.
            return node.getClusterNode();
        }

        return selectReferenceNode(rule, context, node);
    } else if (RuleConstant.LIMIT_APP_OTHER.equals(limitApp)
               && FlowRuleManager.isOtherOrigin(origin, rule.getResource())) {
        if (strategy == RuleConstant.STRATEGY_DIRECT) {
            return context.getOriginNode();
        }

        return selectReferenceNode(rule, context, node);
    }

    return null;
}

7. Now we finally reach the strategy logic. The method branches on limitApp: a specific origin, default (which uses the cluster node), and other. In each branch, STRATEGY_DIRECT is handled directly; any other strategy goes to selectReferenceNode, which covers the STRATEGY_RELATE and STRATEGY_CHAIN flows. If no branch matches, null is returned.

Then comes the final piece: controlBehavior

The next step is to decide, based on the collected statistics and the flow control rule, whether the request may pass.

8. The controller (TrafficShapingController) has four implementation classes, corresponding to the last key factor of a flow rule, “controlBehavior”:

  • DefaultController: the default controller (immediate reject).
  • RateLimiterController: uniform queuing at a steady rate (leaky bucket).
  • WarmUpController: service warm-up (Warm Up).
  • WarmUpRateLimiterController: warm-up combined with uniform queuing (Warm Up + RateLimiter).

FlowSlot is the core flow limiting component of Sentinel. Rules for specific resources are preset in the usual way and registered in a map when they are initialized. Here we can see that Sentinel applies the copy-on-write idea when operating on the flow rule map. By combining grade, strategy and controlBehavior across multiple dimensions, the flow limiting function is fully realized.
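
The copy-on-write idea mentioned above can be sketched as follows: readers always see a complete, immutable map, and a writer builds a new map and swaps the reference in one step. RuleMapSketch is a hypothetical class, and rules are reduced to integer thresholds; this illustrates the general technique and is not a copy of Sentinel's rule manager.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class RuleMapSketch {
    // Readers always see a complete map; writers publish a new map instead of mutating it.
    private static volatile Map<String, List<Integer>> ruleMap = Collections.emptyMap();

    // Builds a fresh, read-only map from the given rules and swaps the reference in one step.
    static void loadRules(Map<String, List<Integer>> newRules) {
        Map<String, List<Integer>> copy = new HashMap<>();
        for (Map.Entry<String, List<Integer>> e : newRules.entrySet()) {
            copy.put(e.getKey(), Collections.unmodifiableList(new ArrayList<>(e.getValue())));
        }
        ruleMap = Collections.unmodifiableMap(copy);
    }

    // Looks up the rules of a resource in whichever map is currently published.
    static List<Integer> rulesFor(String resource) {
        List<Integer> rules = ruleMap.get(resource);
        return rules == null ? Collections.<Integer>emptyList() : rules;
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> rules = new HashMap<>();
        rules.put("orderService", Collections.singletonList(100));
        loadRules(rules);
        System.out.println(rulesFor("orderService"));
    }
}
```

Because a reader that already holds the old list keeps a consistent snapshot, rule checks never observe a half-updated map, at the cost of rebuilding the map on every reload.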

3. DegradeSlot analysis

1. Introduction to DegradeSlot

DegradeSlot implements circuit breaking and degradation. In my opinion, this is the main difference between Sentinel and an ordinary gateway: it offers both rich flow limiting and circuit breaking capabilities.

2. Source code interpretation

@Override
public void entry(Context context, ResourceWrapper resourceWrapper, DefaultNode node, int count,  boolean prioritized, Object... args) throws Throwable {
    performChecking(context, resourceWrapper);

    fireEntry(context, resourceWrapper, node, count, prioritized, args);
}


@Override
public void exit(Context context, ResourceWrapper r, int count, Object... args) {
    Entry curEntry = context.getCurEntry();
    if (curEntry.getBlockError() != null) {
        fireExit(context, r, count, args);
        return;
    }
    List<CircuitBreaker> circuitBreakers = DegradeRuleManager.getCircuitBreakers(r.getName());
    if (circuitBreakers == null || circuitBreakers.isEmpty()) {
        fireExit(context, r, count, args);
        return;
    }

    if (curEntry.getBlockError() == null) {
        // passed request
        for (CircuitBreaker circuitBreaker : circuitBreakers) {
            circuitBreaker.onRequestComplete(context);
        }
    }

    fireExit(context, r, count, args);
}

1. Both entry and exit revolve around the CircuitBreaker, so let’s look at the CircuitBreaker interface.

public interface CircuitBreaker {

    /**
     * Gets the associated circuit breaking rule.
     */
    DegradeRule getRule();

    /**
     * Acquires permission for a call only if it is available at the time of the call.
     *
     * @param context context of the current call
     * @return {@code true} if permission was acquired, {@code false} otherwise
     */
    boolean tryPass(Context context);

    /**
     * Gets the current state of the circuit breaker.
     */
    State currentState();

    /**
     * Records a completed request with the context and performs the state transition of the circuit breaker.
     */
    void onRequestComplete(Context context);

    /**
     * Circuit breaker state.
     */
    enum State {
        /**
         * In {@code OPEN} state, all requests are rejected until the next recovery time point.
         */
        OPEN,
        /**
         * In {@code HALF_OPEN} state, the circuit breaker allows a "probe" call.
         * If the call is abnormal according to the strategy (for example, it is slow), the circuit breaker
         * reverts to {@code OPEN} state and waits for the next recovery time point;
         * otherwise the resource is regarded as "recovered" and the circuit breaker
         * stops cutting off requests and transforms to {@code CLOSED} state.
         */
        HALF_OPEN,
        /**
         * In {@code CLOSED} state, all requests are permitted. When the current metric value exceeds
         * the threshold, the circuit breaker transforms to {@code OPEN} state.
         */
        CLOSED
    }
}


The whole interface is about maintaining the state of the circuit breaker. Note that its methods no longer take the resource but the context. The state determines whether a request is allowed to pass. If the request has already been blocked and rejected by an earlier slot, there is no need to run it through the circuit breaker again (although it is still counted in the statistics).
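
The state transitions documented in the interface can be sketched with a small state machine. BreakerSketch is hypothetical and heavily simplified: it counts failures since the breaker was last closed instead of using sliding-window statistics, and it takes the current time as an explicit nowMs argument so that the behavior is deterministic. The failureThreshold and recoveryTimeoutMs parameters are assumptions of this sketch.

```java
class BreakerSketch {
    enum State { OPEN, HALF_OPEN, CLOSED }

    private final int failureThreshold;   // failures tolerated before the breaker opens
    private final long recoveryTimeoutMs; // how long to stay OPEN before allowing a probe
    private State state = State.CLOSED;
    private int failures = 0;
    private long nextRetryMs = 0;

    BreakerSketch(int failureThreshold, long recoveryTimeoutMs) {
        this.failureThreshold = failureThreshold;
        this.recoveryTimeoutMs = recoveryTimeoutMs;
    }

    State currentState() {
        return state;
    }

    // Decides whether a call may proceed at time nowMs.
    boolean tryPass(long nowMs) {
        if (state == State.CLOSED) {
            return true;
        }
        if (state == State.OPEN && nowMs >= nextRetryMs) {
            state = State.HALF_OPEN; // let exactly one probe call through
            return true;
        }
        return false;
    }

    // Records the outcome of a completed call and performs the state transition.
    void onRequestComplete(boolean failed, long nowMs) {
        if (state == State.HALF_OPEN) {
            if (failed) {
                open(nowMs);  // probe failed, wait for the next recovery point
            } else {
                close();      // probe succeeded, the resource is considered recovered
            }
            return;
        }
        if (state == State.CLOSED && failed && ++failures >= failureThreshold) {
            open(nowMs);
        }
    }

    private void open(long nowMs) {
        state = State.OPEN;
        nextRetryMs = nowMs + recoveryTimeoutMs;
    }

    private void close() {
        state = State.CLOSED;
        failures = 0;
    }

    public static void main(String[] args) {
        BreakerSketch breaker = new BreakerSketch(2, 1000);
        breaker.onRequestComplete(true, 0);
        breaker.onRequestComplete(true, 10);
        System.out.println(breaker.currentState());
    }
}
```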

void performChecking(Context context, ResourceWrapper r) throws BlockException {
    List<CircuitBreaker> circuitBreakers = DegradeRuleManager.getCircuitBreakers(r.getName());
    if (circuitBreakers == null || circuitBreakers.isEmpty()) {
        return;
    }
    for (CircuitBreaker cb : circuitBreakers) {
        if (!cb.tryPass(context)) {
            throw new DegradeException(cb.getRule().getLimitApp(), cb.getRule());
        }
    }
}

2. The main logic of performChecking is to obtain the circuit breakers of the resource from DegradeRuleManager and call tryPass on each of them. If any circuit breaker refuses the request, an exception is thrown: throw new DegradeException(cb.getRule().getLimitApp(), cb.getRule());

4. Summary

This installment introduces the FlowSlot and DegradeSlot subclasses of Slot.

Now let’s build our knowledge tree.

Instantiate DefaultNode and ClusterNode to create a structure tree


When creating a context, the presence of DefaultNode is first checked in NodeSelectorSlot.

If there is none, a new resource-based DefaultNode is added, and then the next slot is executed.

The next slot is ClusterBuilderSlot. ClusterBuilderSlot checks whether a corresponding ClusterNode exists. If there is none, a new resource-based ClusterNode is added, and processing continues with the next slot.

In summary, these two slots set the tone for resource-based global control.
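
The get-or-create behavior of these two slots can be sketched with two maps. NodeTreeSketch and its nested node classes are hypothetical and heavily simplified: both maps are keyed by resource name only, following the description above, and no statistics are stored in the nodes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class NodeTreeSketch {
    // Hypothetical, heavily simplified node types that only remember their resource.
    static class ClusterNodeSketch {
        final String resource;

        ClusterNodeSketch(String resource) {
            this.resource = resource;
        }
    }

    static class DefaultNodeSketch {
        final String resource;
        ClusterNodeSketch clusterNode;

        DefaultNodeSketch(String resource) {
            this.resource = resource;
        }
    }

    private final Map<String, DefaultNodeSketch> defaultNodes = new ConcurrentHashMap<>();
    private final Map<String, ClusterNodeSketch> clusterNodes = new ConcurrentHashMap<>();

    // First step: reuse or create the DefaultNode. Second step: reuse or create the ClusterNode.
    DefaultNodeSketch enter(String resource) {
        DefaultNodeSketch node = defaultNodes.computeIfAbsent(resource, DefaultNodeSketch::new);
        node.clusterNode = clusterNodes.computeIfAbsent(resource, ClusterNodeSketch::new);
        return node;
    }

    public static void main(String[] args) {
        NodeTreeSketch tree = new NodeTreeSketch();
        System.out.println(tree.enter("orderService") == tree.enter("orderService"));
    }
}
```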

Collect information


After DefaultNode and ClusterNode are initialized, LogSlot serves as the dividing point before the service-level slots, collecting and handling exceptions globally.

StatisticSlot, as the global statistics slot, stores global RT, QPS, thread count and other information in clusterNodeMap based on ClusterNode.

Perform permission verification and system-level traffic limiting


After the tree structure and the information-collecting slots are in place, the business logic begins. The first piece is the blacklist and whitelist capability of AuthoritySlot. Relying on Sentinel's resource definition, we can easily obtain the authorityRules of a resource and then use them to evaluate the blacklist and whitelist. This can also be regarded as flow limiting at the permission level.

SystemSlot performs global statistics and global flow limiting. It reads the configured limits from the “origins” configuration at the call point and completes all the checks (QPS, thread count, number of successful requests, RT, and CPU status) before the next slot runs. If a check fails, it throws a BlockException and hands the corresponding handling back to the preceding slots. At this point, a basic flow limiting framework is in place.

Perform current limiting and fusing


Once all configuration items have been configured and traffic limiting at the permission level and system level is complete, it is time to turn to the last two slots.

FlowSlot and DegradeSlot correspond to flow limiting and circuit breaking with degradation, respectively.

At this point, a full-fledged distributed gateway has been completed and the full functionality of our Sentinel has been described.