Foreword: Why flow control and degradation are needed

Production environments regularly run into unstable situations, such as:

  • During a flash sale, an instantaneous traffic peak pushes the system past its maximum load; the load skyrockets, the system crashes, and users cannot place orders
  • A “dark horse” hot item breaks through the cache, the database is overwhelmed, and normal traffic is crowded out
  • The caller is dragged down by an unstable service; its thread pool fills up and the entire call chain freezes

These unstable scenarios can have serious consequences. You may be wondering: how can user access be kept even and smooth? How can we guard against the impact of heavy traffic or unstable services? This is where a key tool for microservice stability comes in: high-availability traffic protection. Its main techniques are flow control and circuit breaking with degradation, both of which are important parts of keeping microservices stable.

Why flow control?

Traffic is highly random and unpredictable. One second may be calm and the next may bring a flood peak (for example, midnight on Double Eleven). However, the capacity of our system is always limited. If a sudden burst exceeds that capacity, requests may go unprocessed, queued requests are handled slowly, CPU usage and load climb, and the system eventually crashes. We therefore need to limit such bursts so that the service handles as many requests as possible without being overwhelmed. This is flow control.

Why circuit breaking and degradation?

A service often calls other modules: another remote service, a database, a third-party API, and so on. For example, making a payment may require a remote call to an API provided by UnionPay, and querying the price of an item may require a database query. However, the stability of these dependencies is not guaranteed. If a dependency becomes unstable and its response time grows, the response time of the calling method grows too, threads pile up, and eventually the caller’s own thread pool may be exhausted, leaving the service itself unavailable.

Modern microservice architectures are distributed and consist of a very large number of services. Services call each other and form complex call chains, which amplify the problems above: if one link in a complex chain is unstable, the failure may cascade and make the whole chain unavailable. We therefore need to apply circuit breaking and degradation to unstable weak dependencies, temporarily cutting off unstable calls so that a local failure does not cause a global avalanche.
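To illustrate how limiting concurrency keeps a slow dependency from exhausting the caller’s threads, here is a minimal sketch of semaphore-style isolation in plain Java. It is not Sentinel’s implementation; the class name ConcurrencyGuard and its method are hypothetical and exist only for this example.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Illustrative concurrency guard for one dependency; not part of Sentinel. */
class ConcurrencyGuard {
    private final Semaphore permits;

    ConcurrencyGuard(int maxConcurrent) {
        this.permits = new Semaphore(maxConcurrent);
    }

    /** Runs the call if a permit is free, otherwise returns the fallback immediately. */
    <T> T call(Supplier<T> remoteCall, Supplier<T> fallback) {
        if (!permits.tryAcquire()) {
            // The dependency is saturated: fail fast instead of letting threads pile up.
            return fallback.get();
        }
        try {
            return remoteCall.get();
        } finally {
            permits.release();
        }
    }
}
```

Because a rejected call returns at once, at most maxConcurrent threads can ever be stuck waiting on the unstable dependency; the rest stay available for other work.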

Sentinel: a powerful guard for high availability

Sentinel is a high-availability protection component for distributed service architectures, open-sourced by Alibaba. It takes traffic as its entry point and helps developers keep microservices stable along multiple dimensions, including flow control, traffic shaping, circuit breaking and degradation, adaptive system protection, and hotspot protection. Over the past 10 years Sentinel has handled the core scenarios of Alibaba’s Double Eleven promotions, such as flash sales, cold start, message peak clipping and valley filling, adaptive flow control, and real-time circuit breaking of unavailable downstream services. It supports Java, Go, C++ and other languages, and it also provides global flow control for Istio/Envoy to bring high-availability protection to the Service Mesh.

Sentinel technology highlights:

  • High extensibility: a basic core plus SPI extension points, so users can easily extend flow control, communication, monitoring and other functions
  • Diversified flow control strategies (resource granularity, call relationship, flow control metric, flow control effect, and so on), plus distributed cluster flow control
  • Hotspot traffic detection and protection
  • Circuit breaking, degradation, and isolation of unstable services
  • System-wide adaptive load protection, which regulates traffic in real time according to the system’s load level
  • Coverage of API gateway scenarios, providing gateway flow control for Spring Cloud Gateway and Zuul
  • Real-time monitoring and dynamic rule configuration management

Some common usage scenarios:

  • On the service provider side, we need to protect the provider from being overwhelmed by traffic peaks. Flow control here is usually based on the provider’s service capacity, or applied to a particular caller. We can evaluate the capacity of the core interfaces through load testing beforehand and configure rate limiting in QPS mode. When the number of requests per second exceeds the configured threshold, the excess requests are rejected automatically.
  • To avoid being dragged down by unstable services when calling them, we need to isolate and circuit-break unstable dependencies on the caller side. The available means include semaphore isolation, degradation by exception ratio, degradation by response time (RT), and so on.
  • When the system has been idle for a long time and traffic suddenly surges, pushing the system straight to a high load level may overwhelm it instantly. In this case Sentinel’s WarmUp flow control mode lets the admitted traffic grow slowly and reach the threshold gradually over a period of time, instead of releasing all the traffic at once. This gives a cold system time to warm up and prevents it from being overwhelmed.
  • Sentinel’s uniform queuing mode performs “peak clipping and valley filling”: it spreads request spikes evenly over a period of time, keeping the system load within its processing capacity while handling as many requests as possible.
  • Sentinel’s gateway flow control protects traffic at the gateway entrance or limits how frequently an API is called.
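The warm-up and uniform queuing ideas from the list above can be sketched in a few lines of plain Java. This is an illustration under simplifying assumptions (a linear warm-up ramp and caller-supplied timestamps), not Sentinel’s actual algorithm; WarmUpThreshold and UniformPacer are hypothetical names.

```java
/** Illustrative only: linear ramp of the allowed QPS from a cold value to the maximum. */
class WarmUpThreshold {
    private final double maxQps;
    private final double coldQps;
    private final long warmUpMillis;

    WarmUpThreshold(double maxQps, double coldQps, long warmUpMillis) {
        this.maxQps = maxQps;
        this.coldQps = coldQps;
        this.warmUpMillis = warmUpMillis;
    }

    /** Allowed QPS after traffic has been flowing for elapsedMillis. */
    double allowedQps(long elapsedMillis) {
        if (elapsedMillis <= 0) return coldQps;
        if (elapsedMillis >= warmUpMillis) return maxQps;
        return coldQps + (maxQps - coldQps) * elapsedMillis / warmUpMillis;
    }
}

/** Illustrative only: spaces requests evenly and queues each one for a bounded time. */
class UniformPacer {
    private final long intervalMillis;   // gap between two admitted requests
    private final long maxWaitMillis;    // longest time a request may queue
    private long nextFreeSlot = 0;       // earliest time the next request may run

    UniformPacer(long intervalMillis, long maxWaitMillis) {
        this.intervalMillis = intervalMillis;
        this.maxWaitMillis = maxWaitMillis;
    }

    /** Returns how long the caller should wait before proceeding, or -1 if rejected. */
    synchronized long acquire(long nowMillis) {
        long slot = Math.max(nowMillis, nextFreeSlot);
        long wait = slot - nowMillis;
        if (wait > maxWaitMillis) {
            return -1;                   // queue is too long: reject instead of waiting
        }
        nextFreeSlot = slot + intervalMillis;
        return wait;
    }
}
```

With an interval of 100 ms, a burst of requests arriving at the same instant is admitted at 0 ms, 100 ms, 200 ms and so on, and anything that would have to wait longer than maxWaitMillis is rejected.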

Sentinel has a rich open source ecosystem. Soon after being open-sourced, Sentinel was included in the CNCF Landscape, and it is one of the flow control and degradation components officially recommended by Spring Cloud. The community provides out-of-the-box adapters for commonly used microservice frameworks such as Spring Cloud, Dubbo, gRPC and Quarkus. Sentinel also supports the reactive ecosystem, including asynchronous and reactive architectures built on Reactor and Spring WebFlux, and it increasingly covers API Gateway and Service Mesh scenarios to play a larger role in cloud native architectures.

In the original Spring Cloud Netflix stack, Hystrix was the built-in circuit breaker. It is an open source component from Netflix that provides these features, but in November 2018 Hystrix announced that it would stop iterative development and enter maintenance mode. In the same year, Spring Cloud Alibaba (SCA) was open-sourced, providing a one-stop solution that by default integrates Sentinel with Spring Web, RestTemplate, FeignClient and Spring WebFlux. Sentinel not only fills Hystrix’s gaps around Servlet, RestTemplate and API Gateway in the Spring Cloud ecosystem, it is also fully compatible with how Hystrix is used for rate limiting and degradation in FeignClient, and it supports flexible runtime configuration and adjustment of rate limiting and degradation rules. SCA also integrates the API Gateway flow control module provided by Sentinel, which seamlessly supports flow control and degradation for Spring Cloud Gateway and Zuul.

Spring Cloud Alibaba Sentinel in practice: service rate limiting and circuit breaking

It’s time to do it!

We have integrated the example code into a sandbox, which can be accessed directly by clicking on the link.

Let’s put rate limiting and circuit breaking for Spring Cloud services into practice with an example. Our example project consists of four modules:

  • service-api: the service interface definitions, referenced by both the consumer and the provider
  • dubbo-provider: a Dubbo provider that exposes services externally
  • web-api-demo: a Spring Boot Web application in which some APIs call dubbo-provider as a consumer to get results. Three API paths are defined:
      • /demo/hello: takes a name parameter and calls FooService:sayHello(name) on the back end.
      • /demo/time: calls FooService:getCurrentTime on the back end to get the current time. The slow request parameter can be used to simulate slow calls.
      • /demo/bonjour/{name}: directly invokes the local DemoService.
  • demo-gateway: a Spring Cloud Gateway application. It serves as the access gateway for the entire project and forwards traffic to back-end or third-party services. All entry URL access goes through the API gateway. The route configuration of demo-gateway is as follows:

    spring:
      cloud:
        gateway:
          enabled: true
          discovery:
            locator:
              lower-case-service-id: true
          routes:
            # route ID
            - id: foo-service-route
              uri: http://localhost:9669/
              predicates:
                - Path=/demo/**
            - id: httpbin-route
              uri: https://httpbin.org
              predicates:
                - Path=/httpbin/**
              filters:
                - RewritePath=/httpbin/(?<segment>.*), /${segment}

This routing configuration contains two routes:

  • foo-service-route: routes requests whose path starts with /demo/ to the back-end service at localhost:9669, which is our Web service. We will access the APIs in the example through this route, for example localhost:8090/demo/time.
  • httpbin-route: an example route that sends requests whose path starts with /httpbin/ to the example website httpbin.org. For example, localhost:8090/httpbin/json actually maps to https://httpbin.org/json.
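To make the path rewrite of httpbin-route concrete, the following minimal Java sketch applies the same regular expression and replacement to a request path with String.replaceAll. It assumes the filter behaves like a plain regex replace; the class and method names are hypothetical and not part of Spring Cloud Gateway.

```java
/** Illustrative only: mimics a RewritePath-style regex rewrite on a request path. */
class RewritePathSketch {

    /** Replaces every match of regex in path, resolving named groups such as ${segment}. */
    static String rewrite(String path, String regex, String replacement) {
        return path.replaceAll(regex, replacement);
    }

    public static void main(String[] args) {
        // The named group "segment" captures everything after /httpbin/.
        System.out.println(rewrite("/httpbin/json", "/httpbin/(?<segment>.*)", "/${segment}"));
        // prints "/json"
    }
}
```

A request for /httpbin/json is therefore forwarded upstream as /json, which matches the mapping to https://httpbin.org/json described above.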

Our environment also includes the Sentinel console, which is already running and directly reachable by the services. Its address is: http://139.196.203.133:8080

Next, let’s integrate SCA Sentinel step by step and verify the effect by configuring flow control and degradation rules through the console or a Nacos dynamic data source.

spring-cloud-alibaba-dependencies configuration

The first step is to import the latest version of spring-cloud-alibaba-dependencies in the project’s parent POM, so that we do not need to specify a version number when we actually add SCA-related dependencies:

<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>com.alibaba.cloud</groupId>
            <artifactId>spring-cloud-alibaba-dependencies</artifactId>
            <version>2.2.2.RELEASE</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>

Integrating services with SCA Sentinel

First, we add the Spring Cloud Alibaba Sentinel dependency to each of the three service modules:

<dependency>
    <groupId>com.alibaba.cloud</groupId>
    <artifactId>spring-cloud-starter-alibaba-sentinel</artifactId>
</dependency>

The starter automatically configures the Sentinel adapter modules, so an application can integrate Sentinel and connect to the Sentinel console with only a little configuration.

For Dubbo services, we also need to add a Dubbo adapter module. Sentinel provides an out-of-the-box integration module for Apache Dubbo: simply add the sentinel-apache-dubbo-adapter dependency and Dubbo calls are instrumented automatically (both provider and consumer are supported):

<dependency>
    <groupId>com.alibaba.csp</groupId>
    <artifactId>sentinel-apache-dubbo-adapter</artifactId>
    <version>1.8.0</version>
</dependency>

We added the adapter dependency to the POM files of web-api-demo and dubbo-provider so that the Dubbo consumer and provider interfaces of the two applications are tracked by Sentinel automatically.

For Spring Cloud Gateway, Zuul 1.x and other gateways, we also need to add the spring-cloud-alibaba-sentinel-gateway dependency on top of the SCA dependencies above:

<dependency>
    <groupId>com.alibaba.cloud</groupId>
    <artifactId>spring-cloud-alibaba-sentinel-gateway</artifactId>
</dependency>

This dependency automatically adds the Sentinel-related configuration to the gateway so that the API gateway can integrate Sentinel automatically. We added this dependency to the POM file of the demo-gateway application so that our gateway application is connected to Sentinel.

With the dependencies in place, a little configuration is enough to connect to the Sentinel console. We configure the application name and the console address in the properties file, using web-api-demo as an example:

spring.application.name=foo-web
spring.cloud.sentinel.transport.dashboard=localhost:8080

You are probably already familiar with spring.application.name; Spring Cloud Alibaba Sentinel automatically uses this value as the appName of the connected application. The spring.cloud.sentinel.transport.dashboard property sets the address and port of the console to connect to.

After the configuration is complete, start the dubbo-provider, web-api-demo, and demo-gateway applications in sequence, and access localhost:8090/demo/time through the gateway to obtain the current time. Once the services have been triggered, our three applications show up on the Sentinel console shortly afterwards, and the monitoring page shows the access information, which means the integration succeeded.

On the cluster point link page of each application we can see the instrumentation points of that application. For example, the Web application shows both the Web URLs and the Dubbo consumer calls:

Flow control rules

Now let’s configure the simplest possible flow control rule. On the Dubbo provider side, we open the cluster point link page and configure a rate limiting rule for the service call com.alibaba.csp.sentinel.demo.dubbo.FooService:getCurrentTime(boolean) (the resource only shows up after it has received some traffic). We add a flow control rule with a QPS threshold of 1, which means the service method may be called at most once per second; calls beyond that are rejected outright.
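What a QPS threshold of 1 means in terms of passed and rejected calls can be sketched with a simple fixed-window counter. This is only an illustration: Sentinel keeps its statistics in a sliding window, and FixedWindowQpsLimiter is a hypothetical class written for this example.

```java
/** Illustrative fixed-window QPS limiter; Sentinel itself uses a sliding window. */
class FixedWindowQpsLimiter {
    private final int maxPerSecond;
    private long windowStart = -1;   // start of the current one-second window
    private int count;               // calls admitted in the current window

    FixedWindowQpsLimiter(int maxPerSecond) {
        this.maxPerSecond = maxPerSecond;
    }

    /** Returns true if the call may pass, false if it should be rejected. */
    synchronized boolean tryPass(long nowMillis) {
        long window = nowMillis - (nowMillis % 1000);
        if (window != windowStart) {
            // A new second has started: reset the counter.
            windowStart = window;
            count = 0;
        }
        if (count >= maxPerSecond) {
            return false;
        }
        count++;
        return true;
    }
}
```

With maxPerSecond set to 1, the first call in each second passes and every further call in that second is rejected, which is the behavior we expect from the rule configured above.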

Click the “Add” button and the rule is added. We can now request localhost:8090/demo/time repeatedly in the browser (not too slowly) and see the rate limiting error. (By default, the Dubbo provider throws an exception when a call is rate limited; Dubbo returns the exception message directly and Spring renders it as the default error page):

We can also see the real-time number of passed and rejected requests on the “Real-time Monitoring” page:

We can also configure rate limiting rules on the Web API to see the effect. The default rate limiting behavior for Spring Web is to return the default message (Blocked by Sentinel) with status code 429. We describe how to customize the flow control handling logic in a later section.

Now that you know the basic usage of rate limiting, you may wonder: do I need to configure flow control rules for every interface in production? What if the thresholds are set badly? In fact, rate limiting and degradation should be configured together with capacity planning and dependency analysis. We can use load testing tools such as JMeter or Alibaba Cloud PTS to run full-link load tests on our services, learn the maximum capacity of each service, and use the maximum capacity of the core interfaces as QPS thresholds.

Gateway flow control rules

Sentinel provides flow control tailored to API gateway scenarios. It supports flow control for gateway routes (such as the foo-service-route defined above) or for user-defined API groups, and it also supports flow control based on request attributes (such as a header). Users can define API groups in the Sentinel console; a group can be viewed as a combination of URL matchers. For example, we could define an API group called my_api and have requests for /foo/** and /baz/** fall under it. Rate limiting can then be applied at the level of this user-defined API group.

Let’s configure a gateway flow control rule for the gateway on the console. The console pages for an API gateway differ somewhat from those of a normal application; these are customizations for the gateway scenario. Sentinel gateway flow control rules can extract request attributes of a route, including the remote IP, headers, URL parameters and cookies, automatically count hot values and limit each of them separately, or limit one specific value (for example, a particular uid).

We add a gateway flow control rule on foo-service-route based on a request attribute. The rule limits each hot value of the uid URL parameter to at most two requests per minute.

Once the rule is saved, we can send requests to the back-end service with different uid parameters (even though the back end does not use them), such as localhost:8090/demo/time?uid=xxx. We can observe that more than two requests per uid per minute result in the rate limiting page.
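The per-parameter behavior of this rule (two requests per minute for each distinct uid) can be sketched with one counter per key. This is a simplified illustration rather than Sentinel’s hotspot statistics; PerKeyLimiter is a hypothetical class, and a real implementation would also bound the number of tracked keys.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: independent fixed-window limit for each parameter value. */
class PerKeyLimiter {
    private final int maxPerWindow;
    private final long windowMillis;
    // key -> {windowStart, count}; grows without bound in this sketch
    private final Map<String, long[]> state = new HashMap<>();

    PerKeyLimiter(int maxPerWindow, long windowMillis) {
        this.maxPerWindow = maxPerWindow;
        this.windowMillis = windowMillis;
    }

    /** Returns true if a request carrying this key may pass at nowMillis. */
    synchronized boolean tryPass(String key, long nowMillis) {
        long window = nowMillis - (nowMillis % windowMillis);
        long[] s = state.computeIfAbsent(key, k -> new long[] {window, 0});
        if (s[0] != window) {
            // New window for this key: reset its counter.
            s[0] = window;
            s[1] = 0;
        }
        if (s[1] >= maxPerWindow) {
            return false;
        }
        s[1]++;
        return true;
    }
}
```

Each uid gets its own budget, so a third request with the same uid inside one minute is rejected while a request with a different uid still passes.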

For detailed configuration guidelines and implementation principles of Sentinel gateway flow control, please refer to the Gateway flow control documentation.

Circuit breaking and degradation rules

Circuit breaking and degradation are typically used to cut off unstable services automatically so that they do not drag down their callers and cause cascading failures. The rules are usually configured on the calling side for weakly dependent calls. While the circuit is open, a predefined fallback value is returned so that the core path is not affected by unstable side paths.

Sentinel offers the following circuit breaker strategies:

  • Slow call ratio (SLOW_REQUEST_RATIO): the ratio of slow calls is used as the threshold. You need to set the RT (maximum response time) for slow calls; a request whose response time exceeds this value counts as a slow call. If the number of requests within one statistics interval (statIntervalMs, 1s by default) is greater than the minimum number of requests and the ratio of slow calls is greater than the threshold, requests are circuit-broken automatically for the following circuit breaking duration. After that duration, the circuit breaker enters the probing recovery state (half-open state). If the response time of the next request is less than the configured slow call RT, circuit breaking ends; if it is greater, the circuit is opened again.
  • Exception ratio (ERROR_RATIO): if the number of requests within one statistics interval is greater than the minimum number of requests and the ratio of exceptions is greater than the threshold, requests are circuit-broken automatically for the following circuit breaking duration. After that duration, the circuit breaker enters the half-open state: circuit breaking ends if the next request completes successfully (without error), otherwise the circuit is opened again. The threshold range for the exception ratio is [0.0, 1.0], representing 0% to 100%.
  • Exception count (ERROR_COUNT): if the number of exceptions within one statistics interval exceeds the threshold, requests are circuit-broken automatically. After the circuit breaking duration, the circuit breaker enters the half-open state: circuit breaking ends if the next request completes successfully (without error), otherwise the circuit is opened again.
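The closed, open and half-open states described in the list above can be sketched as a small state machine for the slow call ratio strategy. This is a simplified illustration, not Sentinel’s implementation: it uses a fixed statistics window, takes timestamps from the caller, and assumes "at least the minimum number of requests" and "ratio strictly above the threshold" as trigger conditions. SlowRatioBreaker is a hypothetical class.

```java
/** Illustrative slow-call-ratio circuit breaker with CLOSED / OPEN / HALF_OPEN states. */
class SlowRatioBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final long maxRtMillis;          // calls slower than this count as slow
    private final double slowRatioThreshold; // e.g. 0.8 for 80%
    private final int minRequests;           // minimum calls before the ratio is checked
    private final long statIntervalMillis;   // length of one statistics window
    private final long openMillis;           // how long the circuit stays open

    private State state = State.CLOSED;
    private long windowStart;
    private long openedAt;
    private int total;
    private int slow;

    SlowRatioBreaker(long maxRtMillis, double slowRatioThreshold, int minRequests,
                     long statIntervalMillis, long openMillis) {
        this.maxRtMillis = maxRtMillis;
        this.slowRatioThreshold = slowRatioThreshold;
        this.minRequests = minRequests;
        this.statIntervalMillis = statIntervalMillis;
        this.openMillis = openMillis;
    }

    /** Returns true if a call may proceed at nowMillis. */
    synchronized boolean tryPass(long nowMillis) {
        if (state == State.CLOSED) return true;
        if (state == State.OPEN && nowMillis - openedAt >= openMillis) {
            state = State.HALF_OPEN;   // let exactly one probe request through
            return true;
        }
        return false;                  // still open, or a probe is already in flight
    }

    /** Records the response time of a call that was allowed through. */
    synchronized void onComplete(long nowMillis, long rtMillis) {
        boolean isSlow = rtMillis > maxRtMillis;
        if (state == State.HALF_OPEN) {
            // The probe decides: a fast call closes the circuit, a slow one reopens it.
            if (isSlow) open(nowMillis); else close(nowMillis);
            return;
        }
        if (state == State.OPEN) return;   // late completion of an earlier call
        if (nowMillis - windowStart >= statIntervalMillis) {
            windowStart = nowMillis;
            total = 0;
            slow = 0;
        }
        total++;
        if (isSlow) slow++;
        if (total >= minRequests && (double) slow / total > slowRatioThreshold) {
            open(nowMillis);
        }
    }

    State state() { return state; }

    private void open(long nowMillis) { state = State.OPEN; openedAt = nowMillis; }

    private void close(long nowMillis) {
        state = State.CLOSED;
        windowStart = nowMillis;
        total = 0;
        slow = 0;
    }
}
```

A caller checks tryPass before invoking the dependency, returns its fallback when the result is false, and reports the measured response time through onComplete otherwise.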

Let’s configure a slow call circuit breaking rule for the Dubbo consumer in the Web application and simulate slow calls to see the effect. In web-api-demo we configure a circuit breaking rule for the com.alibaba.csp.sentinel.demo.dubbo.FooService service call.

The statistics interval configured on the console is 1s by default. In the rule above, we set the slow call threshold to 50ms, so any response time above 50ms is recorded as a slow call. When the number of requests within the statistics interval is >= 5 and the ratio of slow calls exceeds the configured threshold (80%), circuit breaking is triggered. The circuit breaking duration is 5s; after that, one probe request is allowed to pass.

In our example, the /demo/time API simulates slow calls through the slow request parameter: with slow=true the call takes more than 100ms. By requesting localhost:8090/demo/time?slow=true repeatedly, we can observe that a circuit breaking error is returned.

If we keep simulating slow calls, we can observe that one request is allowed to pass every 5s after the circuit opens, but because that request is still a slow call, the circuit opens again without recovering. After circuit breaking has been triggered, we can wait for a while and manually send a normal request without slow=true; we can then observe that the circuit recovers.

It is important to note that even if the service caller uses circuit breaking and degradation, we still need to configure request timeouts on the HTTP or RPC client as a safety net.
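As one way to apply this advice, the sketch below sets a connect timeout and a response timeout with the JDK’s built-in java.net.http client (Java 11 or later). The example project may well use a different client, so treat this as an assumption; the class TimeoutClientSketch and the chosen values are hypothetical.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.time.Duration;

/** Illustrative only: explicit timeouts on the JDK HTTP client. */
class TimeoutClientSketch {

    /** Client that gives up on establishing a connection after connectTimeout. */
    static HttpClient newClient(Duration connectTimeout) {
        return HttpClient.newBuilder()
                .connectTimeout(connectTimeout)
                .build();
    }

    /** GET request that fails with a timeout exception if no response arrives in time. */
    static HttpRequest newRequest(String url, Duration responseTimeout) {
        return HttpRequest.newBuilder(URI.create(url))
                .timeout(responseTimeout)
                .GET()
                .build();
    }
}
```

With both timeouts in place, a hanging dependency releases the calling thread after a bounded time even before any circuit breaking rule takes effect.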

Custom instrumentation points with annotations

The instrumentation points we have seen so far are created automatically by the Sentinel adapter modules. Sometimes automatic instrumentation does not meet our needs and we want to apply rate limiting at a specific place in the business logic. Can we do that? Of course! Sentinel provides two ways to define custom instrumentation points: the SphU API and the @SentinelResource annotation. The SphU API is the most general, but the resulting code is more complex and more tightly coupled. Annotations are less intrusive but are limited in where they can be used. Here we add an annotation to the DemoService of the Web application to instrument a local service.

In DemoService we implemented a simple greeting service:

@Service
public class DemoService {

    public String bonjour(String name) {
        return "Bonjour, " + name;
    }
}

Let’s add a @SentinelResource annotation to the bonjour method. The value of the annotation is the name of the instrumentation point, which is displayed on the cluster point link and monitoring pages.

@SentinelResource(value = "DemoService#bonjour")
public String bonjour(String name)

After adding this annotation, when we access the /demo/bonjour/{name} API through the gateway, we can see our custom DemoService#bonjour instrumentation point on the cluster point link page.

Adding the annotation is just the first step. In a production environment, we usually want some fallback logic when these custom instrumentation points are rate limited, rather than throwing an exception directly. Here we can write a fallback function:

public String bonjourFallback(Throwable t) {
    if (BlockException.isBlockException(t)) {
        return "Blocked by Sentinel: " + t.getClass().getSimpleName();
    }
    return "Oops, failed: " + t.getClass().getCanonicalName();
}

Our fallback function takes a Throwable parameter from which the exception information can be retrieved. The fallback of the Sentinel annotation catches both business exceptions and flow control exceptions (that is, BlockException and its subclasses); we can handle them in the fallback logic (for example, by logging) and return a fallback value.

Note: The Sentinel annotation imposes requirements on the method signatures of fallback and blockHandler functions, as described in the documentation here.

Once we have written the fallback function, we specify it in the @SentinelResource annotation:

@SentinelResource(value = "DemoService#bonjour", defaultFallback = "bonjourFallback")
public String bonjour(String name)

This way, when our custom DemoService#bonjour resource is rate limited or circuit-broken, the request goes to the fallback logic and returns the fallback result instead of throwing an exception directly. We can set a rate limiting rule with QPS=1 and observe the return value after sending requests in quick succession:

$ curl http://localhost:8090/demo/bonjour/Sentinel
Bonjour, Sentinel
$ curl http://localhost:8090/demo/bonjour/Sentinel
Blocked by Sentinel: FlowException

Note: using the @SentinelResource annotation requires that the class be managed by Spring (that is, it must be a Spring bean). The annotation does not take effect on calls made from inside the same bean (they do not go through the proxy) or on private methods, because the Sentinel annotation support relies on the Spring AOP dynamic proxy mechanism.
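Why internal calls bypass the proxy can be demonstrated with a plain JDK dynamic proxy. This is a standalone sketch, independent of Spring and Sentinel; Greeter, PlainGreeter and ProxyDemo are hypothetical names, and the interceptor merely counts calls where a real aspect would apply flow control.

```java
import java.lang.reflect.Proxy;
import java.util.concurrent.atomic.AtomicInteger;

interface Greeter {
    String bonjour(String name);
    String bonjourTwice(String name);
}

class PlainGreeter implements Greeter {
    public String bonjour(String name) {
        return "Bonjour, " + name;
    }

    // Calls bonjour() on `this`, so these inner calls never reach the proxy.
    public String bonjourTwice(String name) {
        return bonjour(name) + " & " + bonjour(name);
    }
}

class ProxyDemo {
    /** Wraps target in a JDK dynamic proxy that counts every intercepted call. */
    static Greeter counted(Greeter target, AtomicInteger intercepted) {
        return (Greeter) Proxy.newProxyInstance(
                Greeter.class.getClassLoader(),
                new Class<?>[] {Greeter.class},
                (proxy, method, args) -> {
                    intercepted.incrementAndGet();
                    return method.invoke(target, args);
                });
    }
}
```

Calling bonjourTwice through the proxy increments the counter once, not three times: only the outer call is intercepted, and the two inner bonjour calls run directly on the target object. An annotated method invoked from within the same bean is skipped in the same way.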

Configure custom flow control processing logic

Each Sentinel adapter supports custom flow control handling logic. Taking the Spring Web adapter as an example, we only need to provide a custom BlockExceptionHandler implementation and register it as a bean to supply custom handling logic for Web instrumentation points. BlockExceptionHandler is defined as follows:

public interface BlockExceptionHandler {

    // Handle traffic limiting exceptions here.
    void handle(HttpServletRequest request, HttpServletResponse response, BlockException e) throws Exception;
}

Our Web application includes an example of custom flow control handling logic for Web instrumentation points:

@Configuration
public class SentinelWebConfig {

    @Bean
    public BlockExceptionHandler sentinelBlockExceptionHandler() {
        return (request, response, e) -> {
            // 429 Too Many Requests
            response.setStatus(429);

            PrintWriter out = response.getWriter();
            out.print("Oops, blocked by Sentinel: " + e.getClass().getSimpleName());
            out.flush();
            out.close();
        };
    }
}

This handler reads the type of the flow control exception and writes a response message with status code 429. You can configure a redirect or customize the returned message according to your actual business requirements.

For the annotation approach, we mentioned in the previous section that a fallback function can be specified to handle flow control exceptions and business exceptions. For the Dubbo adapter, we can register a provider or consumer fallback through DubboAdapterGlobalConfig to supply custom flow control handling logic. For the Spring Cloud Gateway adapter, we can register a custom BlockRequestHandler implementation to supply custom handling logic for gateway flow control.

Support for other Spring Cloud components

Spring Cloud Alibaba Sentinel also supports other common Spring Cloud components, including RestTemplate and Feign. We do not have space to cover them here; you can refer to the Spring Cloud Alibaba documentation for integration and configuration.

How to choose a flow control and degradation component

At this point, you might wonder: how does Sentinel compare with other products such as Hystrix? Is it necessary to migrate to Sentinel? How can we migrate quickly? Here is how Sentinel compares with other fault-tolerance components:

Feature | Sentinel | Hystrix | resilience4j
Isolation strategy | Semaphore isolation (concurrency control) | Thread pool isolation / semaphore isolation | Semaphore isolation
Circuit breaking strategy | Based on slow call ratio, exception ratio, and exception count | Based on exception ratio | Based on exception ratio and response time
Real-time statistics implementation | Sliding window (LeapArray) | Sliding window (based on RxJava) | Ring Bit Buffer
Dynamic rule configuration | Supports multiple data sources | Supports multiple data sources | Limited support
Extensibility | Multiple extension points | Plug-ins | Interfaces
Annotation-based support | Supported | Supported | Supported
Rate limiting | Based on QPS; rate limiting based on call relationships is supported | Limited support | Rate Limiter
Traffic shaping | Supports warm-up mode and uniform queuing | Not supported | Simple Rate Limiter mode
System adaptive protection | Supported | Not supported | Not supported
Multi-language support | Java/Go/C++ | Java | Java
Service Mesh support | Supports Envoy/Istio | Not supported | Not supported
Console | Out-of-the-box console for configuring rules, real-time monitoring, machine discovery, and more | Simple monitoring view | No console provided; can be connected to other monitoring systems

Conclusion

In this tutorial we covered why flow control and degradation matter as a means of high-availability protection, looked at the core features and principles of Sentinel, and learned hands-on how to integrate SCA Sentinel quickly to apply flow control and degradation to microservices. Sentinel also has many advanced features to explore, such as hotspot protection and cluster flow control. You can refer to the official Sentinel documentation for more features and scenarios.

So, if a service handles very little traffic, is rate limiting unnecessary? If a microservice architecture is simple, is there no need for circuit breakers? In fact, this has nothing to do with request volume or architectural complexity. Very often it is a marginal service that fails and takes the entire business down with it, causing huge losses. We need a failure-oriented design mindset: do capacity planning and sort out strong and weak dependencies as part of routine work, configure flow control and degradation rules sensibly, and prevent problems in advance rather than remedy them after they occur in production.

We also offer AHAS Sentinel, the enterprise version of Sentinel, on Alibaba Cloud, providing enterprise-grade high-availability protection out of the box. Compared with the open source version, AHAS additionally offers the following capabilities:

  • Reliable real-time monitoring and second-level historical monitoring data, including interface QPS, response time, system load, and CPU usage, which can be broken down by call type and compared with earlier periods
  • Top K interface monitoring statistics for quickly finding slow calls and high-traffic interfaces; a heat map overview for quickly locating unstable machines
  • Quick, zero-intrusion onboarding of Java applications via Java Agent or K8s, supporting nearly 20 mainstream frameworks and API gateways
  • Fully automated, managed, highly available cluster flow control
  • Nginx flow control, with support for dynamic rule configuration and cluster flow control

You are welcome to try the enterprise version of Sentinel on the cloud, and to contribute and help the community evolve.