Summary: The stability of microservices has always been a major concern for developers. As services evolve from monolithic to distributed architectures and deployment modes change, the dependencies between services grow increasingly complex, and service systems face significant high-availability challenges.

Author: Doodle

The stability of microservices has always been a major concern for developers. As services evolve from monolithic to distributed architectures and deployment modes change, the dependencies between services grow increasingly complex, and service systems face significant high-availability challenges. Application High Availability Service (AHAS) is a cloud product distilled from Alibaba's internal high-availability system over many years. Taking traffic and fault tolerance as its entry point, it helps ensure the stability of services and gateways across multiple dimensions, including flow control, isolation of unstable calls, circuit breaking and degradation, hotspot traffic protection, adaptive system protection, and cluster flow control, and it also provides second-level traffic monitoring and analysis. AHAS is not only widely used within Alibaba in e-commerce scenarios such as Taobao and Tmall, but has also been adopted extensively in Internet finance, online education, gaming, live streaming, and large government and enterprise organizations.

Traffic funnel protection principles

In a distributed system architecture, each request passes through many layers of processing, such as calls from the gateway to the web server to the service, and from the service to caches or databases. In a high-availability traffic protection system, we usually follow the traffic funnel principle: at each layer of the traffic link, we apply targeted traffic protection and fault tolerance measures to keep the service stable. At the same time, traffic protection should be pushed as far forward as possible. For example, flow control for HTTP requests should be moved up to the gateway layer so that excess traffic is intercepted in advance; this prevents surplus traffic from reaching the back end, where it would add pressure and waste resources.

Ingress/Nginx Gateway traffic control

Nginx is a popular high-performance open source server, and Ingress is the de facto traffic entry point for Kubernetes clusters. AHAS Sentinel provides the Ingress/Nginx gateway with native inbound traffic control, moving traffic protection to the front and intercepting excess traffic in advance to keep back-end services stable. The recently released AHAS Nginx traffic protection plug-in is based on the native C++ version of Sentinel. Compared with the old sidecar version, the new plug-in has been heavily optimized for performance: it can enforce accurate flow control in scenarios of tens of thousands of QPS without affecting the performance of the gateway itself.

AHAS Nginx/Ingress protection has the following core capabilities and advantages:

  • Low cost of use: with simple configuration, the Nginx/Ingress gateway can be quickly connected to AHAS traffic protection, and monitoring, rules, and return behavior can be configured visually on the console
  • Flow control rules are configured dynamically on the console and take effect in real time, without reloading Nginx
  • Precise total flow control at the entrance: AHAS Nginx/Ingress protection supports precise total flow control at the level of tens of thousands of QPS and supports customizable flow control granularity (such as a combination of Host and URL dimensions, or even parameter and IP dimensions)
  • Observability support, so you can learn about gateway traffic and the effectiveness of protection rules in real time

The following example shows how to quickly connect the Ingress gateway in a Kubernetes cluster to AHAS, enabling its flow control capabilities to keep services stable.

Quickly enabling AHAS Ingress traffic protection

First, assume we have already created an ACK cluster on Alibaba Cloud Container Service for Kubernetes (if the cluster has no Ingress, you can install it manually from the ACK component management page). We only need to add the following two fields to the nginx-configuration ConfigMap in the kube-system namespace:

use-sentinel: true
sentinel-params: --app=ahas-ingress-demo

With that, Nginx/Ingress traffic protection is enabled. Open the AHAS console and you will see an Ingress gateway named ahas-ingress-demo.
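
For reference, here is a minimal sketch of what the edited ConfigMap could look like, assuming a standard ACK Ingress installation whose ConfigMap is named nginx-configuration in the kube-system namespace, as in this example (any keys already present in your cluster should be kept unchanged):

apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-configuration
  namespace: kube-system
data:
  # ...existing keys omitted for brevity; keep them as they are
  use-sentinel: "true"                        # enable the AHAS Sentinel plug-in
  sentinel-params: "--app=ahas-ingress-demo"  # application name shown on the AHAS console

The change can be applied with kubectl edit configmap nginx-configuration -n kube-system, or by running kubectl apply -f on the edited manifest.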

Once AHAS traffic protection is enabled, all we need to do is define a request group. Click the Request Group Management tab and create a new request group named test1. Set Host to an exact-match type with the value 127.0.0.1, and set Path to a prefix-match type with the value /test/. The specific configuration is shown in the figure below:

At this point, all requests whose Host is 127.0.0.1 and whose path starts with /test/ will be grouped into the group named test1. If we now visit a URL that matches the request group, such as http://127.0.0.1/test/demo, the traffic monitoring for the test1 group can be seen on the interface monitoring page of the AHAS console.
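
As a quick sanity check, a matching request can be sent from the gateway node with a plain curl call; the host and path below simply follow the example configuration above:

# Send a request that matches the test1 group (Host 127.0.0.1, path prefix /test/)
curl -i http://127.0.0.1/test/demo

After a handful of such requests, the QPS curve for the test1 group should show up on the interface monitoring page.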

Next, add a new flow control rule on the Interface Details tab or the Rule Management tab:

This completes the flow control configuration for the test1 request group. The rule means that, within one second, requests beyond the first 10 in this group will be blocked; the threshold takes effect per gateway instance (single-node dimension). By default, an intercepted request receives the 429 Too Many Requests status code. The return logic for blocked requests can also be customized, either through the ConfigMap or directly on the console.

If we use a load testing tool to generate traffic at more than 10 QPS, the effect is shown in the figure below (interface detail monitoring):
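
Any HTTP load generator will do; as one option (not something required by AHAS), a short run of the open source wrk tool easily exceeds the 10 QPS threshold:

# Drive far more than 10 QPS against the matched URL for 30 seconds
# (1 thread, 10 concurrent connections)
wrk -t1 -c10 -d30s http://127.0.0.1/test/demo

Requests above the threshold are rejected with 429 and appear in wrk's report as non-2xx or 3xx responses, while the passed QPS on the AHAS monitoring page should stabilize around the configured limit of 10.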

If we want to precisely control the total number of accesses to a request group across the whole cluster, regardless of the number of gateway instances or how load is balanced, we can configure a cluster flow control rule and set an overall threshold.

To sum up, flow control for the Ingress/Nginx gateway requires defining a request group and then configuring flow control rules for that group. The complete process is shown in the following flow chart:

The above is an example of Ingress flow control on an Alibaba Cloud Container Service (ACK) cluster. If you run a self-built Ingress or Nginx, you can also refer to the following two articles for quick access:

  • Help.aliyun.com/document\_d…
  • Help.aliyun.com/document\_d…

The original link

This article is original content from Alibaba Cloud and may not be reproduced without permission.