Gateway Spring Cloud Zuul adds automatic retry to Zuul routes

The article directories

- The profile
- Rely on the import
- The configuration file
- Isolation mechanism
- Retry mechanism

The profile

When Netflix Zuul is used as a gateway to forward incoming requests to a back-end service, there is always a chance that the back-end service is unavailable and the request fails.

When a request fails, you may want to retry it automatically. To do this with Spring Cloud Netflix, you need to include Spring Retry on your application's classpath. When Spring Retry is present, load-balanced Zuul automatically retries failed requests (in the example below, Zuul retries twice if the back-end service is down).

The default HTTP client used by Zuul is now the Apache HTTP client instead of the deprecated Ribbon RestClient.
The Netflix Ribbon HTTP client can still be enabled by setting ribbon.restclient.enabled=true. This client has limitations, including no support for the PATCH method, but it also has built-in retry.

Source code of the corresponding client:

/**
 * An Apache HTTP client which leverages Spring Retry to retry failed requests.
 *
 * @author Ryan Baxter
 * @author Gang Li
 */
public class RetryableRibbonLoadBalancingHttpClient
		extends RibbonLoadBalancingHttpClient {

Rely on the import

<dependency>
    <groupId>org.springframework.retry</groupId>
    <artifactId>spring-retry</artifactId>
    <version>1.3.0</version>
</dependency>

The configuration file

Configuration items:

ribbon.MaxAutoRetries: 1 (maximum number of retries on the same server, not counting the first attempt)
ribbon.MaxAutoRetriesNextServer: 1 (maximum number of other servers to retry on, not counting the first server)
ribbon.OkToRetryOnAllOperations: true (whether all operations may be retried for this client)
ribbon.ServerListRefreshInterval: 2000 (refresh interval of the server list, in milliseconds)

roadnet-service:
  ribbon:
    NIWSServerListClassName: com.netflix.loadbalancer.ConfigurationBasedServerList
    listOfServers: http://10.7.11.13:9006,http://localhost:8081
    ConnectTimeout: 1000
    ReadTimeout: 3000
    MaxTotalHttpConnections: 500
    MaxConnectionsPerHost: 100
    MaxAutoRetries: 1
    MaxAutoRetriesNextServer: 1
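
As a rough worked example of what these values imply, the sketch below computes the maximum number of attempts and the worst-case time spent on one request. It assumes the usual reading of the two retry settings (every server tried gets its own same-server retries) and that each attempt can take up to ConnectTimeout + ReadTimeout. RetryTimeBudget is a hypothetical helper, not part of Zuul or Ribbon.

```java
// Hypothetical helper for illustration only; not Zuul or Ribbon API.
class RetryTimeBudget {

    // First attempt plus retries on each server, times first server plus next servers.
    static int maxAttempts(int maxAutoRetries, int maxAutoRetriesNextServer) {
        return (maxAutoRetries + 1) * (maxAutoRetriesNextServer + 1);
    }

    // Assumes every attempt uses up the full connect and read timeouts.
    static long worstCaseMillis(long connectTimeout, long readTimeout,
                                int maxAutoRetries, int maxAutoRetriesNextServer) {
        return (connectTimeout + readTimeout)
                * maxAttempts(maxAutoRetries, maxAutoRetriesNextServer);
    }

    public static void main(String[] args) {
        // Values taken from the roadnet-service configuration above.
        System.out.println(maxAttempts(1, 1));                 // 4
        System.out.println(worstCaseMillis(1000, 3000, 1, 1)); // 16000
    }
}
```

Under these assumptions a single request may occupy the gateway for up to 16 seconds, which is worth keeping in mind when choosing any timeout that wraps the whole call.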

Isolation mechanism

In the microservices model, the coupling between applications becomes looser, and ideally an application that is overloaded or goes down should not affect the other applications. But at the Gateway level, is it possible for one application to become so overloaded that the Gateway collapses and all applications are cut off?

This is certainly possible. Imagine an application that receives many requests per second. Under normal circumstances, these requests might be answered within 10 milliseconds, but if something goes wrong one day, every request may block until a 30-second timeout expires (for example, because frequent Full GCs fail to free memory). At that point, the Gateway will also have a large number of threads waiting for responses, eventually using up all its threads and affecting requests to other, healthy applications.

In Zuul, each back-end application is called a Route. To prevent one Route from preempting too many resources and affecting other routes, Zuul uses Hystrix to isolate and limit traffic for each Route.

Hystrix has two isolation strategies: thread-based and semaphore-based. With thread-based isolation, each Route's requests are executed in a separate, fixed-size thread pool, so that if one Route has a problem, only its own thread pool blocks and the other routes are not affected. Thread-based isolation was Zuul's default in earlier versions; in 2.27 the default is semaphore-based (the strategy is configurable through zuul.ribbonIsolationStrategy).
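
The thread-pool idea can be sketched with plain java.util.concurrent, without Hystrix. This is a minimal illustration under assumptions: the class RouteThreadPools and the route names slow-service and fast-service are hypothetical, and a latch that is never released stands in for a hung back end.

```java
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration; this is not Hystrix or Zuul API.
class RouteThreadPools {
    private final Map<String, ExecutorService> pools = new ConcurrentHashMap<>();
    private final int threadsPerRoute;

    RouteThreadPools(int threadsPerRoute) {
        this.threadsPerRoute = threadsPerRoute;
    }

    // Each route lazily gets its own fixed-size pool.
    <T> Future<T> submit(String route, Callable<T> call) {
        return pools
                .computeIfAbsent(route, r -> Executors.newFixedThreadPool(threadsPerRoute))
                .submit(call);
    }

    void shutdownNow() {
        pools.values().forEach(ExecutorService::shutdownNow);
    }

    // Saturates one route, then checks that another route still answers.
    static String demo() {
        RouteThreadPools gateway = new RouteThreadPools(2);
        CountDownLatch hung = new CountDownLatch(1); // never counted down
        try {
            for (int i = 0; i < 10; i++) {
                gateway.submit("slow-service", () -> {
                    hung.await(); // blocks until interrupted at shutdown
                    return "late";
                });
            }
            // The slow route's two threads are stuck, but this route has its own pool.
            return gateway.submit("fast-service", () -> "ok").get(2, TimeUnit.SECONDS);
        } catch (Exception e) {
            return "failed: " + e;
        } finally {
            gateway.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // ok
    }
}
```

With a single shared pool of the same size, the fast request would have queued behind the ten stuck ones and timed out instead.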

With Hystrix, semaphore isolation is typically used only when the call volume is so high that the overhead of separate threads becomes a problem. For network requests such as those Zuul forwards, thread isolation is safer.
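
The semaphore strategy can be sketched as a simple permit counter: the call runs on the caller's thread, and a request that finds no free permit is rejected immediately instead of waiting. SemaphoreIsolation below is a hypothetical class written for illustration, not Hystrix API, and the nested call is just a single-threaded way to simulate a second request arriving while the first is still in flight.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

// Hypothetical illustration; this is not Hystrix or Zuul API.
class SemaphoreIsolation {
    private final Semaphore permits;

    SemaphoreIsolation(int maxConcurrentCalls) {
        this.permits = new Semaphore(maxConcurrentCalls);
    }

    // Runs the call on the caller's thread if a permit is free; otherwise
    // returns the fallback immediately instead of queueing.
    <T> T call(Supplier<T> backend, Supplier<T> fallback) {
        if (!permits.tryAcquire()) {
            return fallback.get();
        }
        try {
            return backend.get();
        } finally {
            permits.release();
        }
    }

    public static void main(String[] args) {
        SemaphoreIsolation route = new SemaphoreIsolation(1);
        // The inner call arrives while the outer call still holds the only permit.
        String nested = route.call(
                () -> "outer+" + route.call(() -> "inner", () -> "rejected"),
                () -> "rejected");
        System.out.println(nested); // outer+rejected

        // Once the permit is released, the same call goes through.
        String later = route.call(() -> "inner", () -> "rejected");
        System.out.println(later); // inner
    }
}
```

Because no extra thread is involved, a call that hangs still ties up the calling thread; the semaphore only caps how many callers can be stuck at once.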

Retry mechanism

In general, the health of back-end applications is unstable and the list of applications can change at any time, so the Gateway must have sufficient fault tolerance to reduce the impact of back-end application changes.

Zuul can route requests in two modes: through Eureka or through Ribbon. The following describes the fault-tolerance settings that Ribbon supports.

There are three retry scenarios:

OkToRetryOnConnectErrors: retry on network (connection) errors only
OkToRetryOnAllErrors: retry on all errors
OkToRetryOnAllOperations: retry all operations (all HTTP methods)

There are two types of retries:

MaxAutoRetries: the maximum number of retries on the same node
MaxAutoRetriesNextServer: the maximum number of other nodes to switch to and retry on
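
How the two counters combine can be sketched as a nested loop: the same node is retried first, and only then does the request move on to another node. This is a simplified model under assumptions, not Ribbon's actual code: RetryOrderSketch and the server names are hypothetical, and the next node is picked by plain round robin.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Simplified, hypothetical model of the retry order; not Ribbon's implementation.
class RetryOrderSketch {

    // Returns the nodes contacted, in order, until one succeeds or the budget is spent.
    static List<String> attempts(List<String> servers, int maxAutoRetries,
                                 int maxAutoRetriesNextServer, Predicate<String> isUp) {
        List<String> contacted = new ArrayList<>();
        // First node plus up to maxAutoRetriesNextServer further nodes.
        for (int next = 0; next <= maxAutoRetriesNextServer; next++) {
            String server = servers.get(next % servers.size());
            // First attempt plus up to maxAutoRetries retries on this node.
            for (int same = 0; same <= maxAutoRetries; same++) {
                contacted.add(server);
                if (isUp.test(server)) {
                    return contacted;
                }
            }
        }
        return contacted;
    }

    public static void main(String[] args) {
        List<String> servers = Arrays.asList("server-a", "server-b");
        // Every node is down: the whole budget is used.
        System.out.println(attempts(servers, 1, 1, s -> false));
        // [server-a, server-a, server-b, server-b]

        // Only server-b is up: the request succeeds after switching nodes.
        System.out.println(attempts(servers, 1, 1, "server-b"::equals));
        // [server-a, server-a, server-b]
    }
}
```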

In general, we want to retry only when the network connection fails, or to retry GET requests that return 5XX (retrying POST requests is not recommended, because POST is not guaranteed to be idempotent and a retry may cause data inconsistency). The number of retries on a single node should be as small as possible and the number of nodes to retry on should be as large as possible, to achieve better overall performance.
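
The recommendation above can be written down as a small decision function. This is a sketch of the policy as described in the text, not of how Ribbon or Zuul decides internally; RetryDecisionSketch and its parameters are hypothetical.

```java
// Hypothetical policy sketch; not Ribbon or Zuul API.
class RetryDecisionSketch {

    // connectFailed: the connection could not be established, so the back end
    // never received the request. status is ignored in that case.
    static boolean shouldRetry(String method, boolean connectFailed, int status) {
        if (connectFailed) {
            return true; // nothing was processed, so any method can be retried
        }
        boolean serverError = status >= 500 && status <= 599;
        // Only GET is retried on a 5XX response; POST may not be idempotent.
        return "GET".equalsIgnoreCase(method) && serverError;
    }

    public static void main(String[] args) {
        System.out.println(shouldRetry("POST", true, 0));   // true
        System.out.println(shouldRetry("GET", false, 503)); // true
        System.out.println(shouldRetry("POST", false, 503)); // false
        System.out.println(shouldRetry("GET", false, 404)); // false
    }
}
```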

For more complicated retry scenarios, such as retrying only specific APIs or only specific return values, you can implement custom logic by extending RequestSpecificRetryHandler (implementing RetryHandler directly is not recommended, because the subclass lets you reuse a lot of existing functionality).

Reference:

www.jianshu.com/p/e0434a421…
