Spring Cloud (07): Introduction and Use of Ribbon Load Balancing and Spring Cloud (08): Feign Service Interface Invocation solved the problem of calls between services. This article deals with circuit breaking and service degradation in distributed scenarios.

1. Background knowledge

Problems faced by distributed systems

Applications in complex distributed architectures have dozens of dependencies, each of which will inevitably fail at some point.

As shown above, the request needs to call four services: A, P, H and I. If all goes well, there is no problem. But what happens if service I times out? The result can be a service avalanche.

Service avalanche

When microservices call one another, suppose microservice A calls microservices B and C, which in turn call other microservices; this is the so-called “fan-out”. If a microservice on the fan-out link responds too slowly or is unavailable, calls to microservice A occupy more and more system resources until the system crashes; this is the “avalanche effect”.

2. Hystrix Circuit breaker Overview

Hystrix introduction

Hystrix is an open source library for handling latency and fault tolerance in distributed systems, where calls to many dependencies inevitably fail through timeouts, exceptions, and so on. Hystrix improves the resilience of a distributed system by ensuring that one failing dependency does not bring down the whole service, which avoids cascading failures.

A “circuit breaker” is itself a switching device. When a service unit fails, the circuit breaker's fault monitoring (similar to a blown fuse) returns an expected, manageable fallback response to the caller, instead of letting it wait for a long time or throwing an exception the caller cannot handle. This ensures that the caller's threads are not tied up unnecessarily for long periods, and stops the failure from spreading through the distributed system and turning into an avalanche.

Hystrix was designed for:

  • Service degradation
  • Service circuit breaking
  • Near real-time monitoring
  • ...

Hystrix official site

The Hystrix GitHub page carries this notice:

Hystrix is no longer in active development, and is currently in maintenance mode.

Although Hystrix is no longer being updated, it remains popular.

3. Important Concepts of Hystrix

Service degradation

Instead of making the caller wait for a long time or throwing an exception that the calling method cannot handle, return an expected, manageable alternative response (fallback) to the caller.

Service degradation is triggered when:

  • The program throws an exception at run time
  • A call times out
  • The circuit breaker trips
  • The thread pool or semaphore is full
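
The first two triggers (an exception and a timeout) can be sketched in plain Java. This is only an illustration of the idea, not how Hystrix is implemented; the class FallbackDemo and the method callWithFallback are hypothetical names invented for this example.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

// Hypothetical helper, not part of Hystrix: runs a call with a time limit
// and returns a fallback value when the call fails or is too slow.
final class FallbackDemo {

    static <T> T callWithFallback(Supplier<T> primary, Supplier<T> fallback, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Callable<T> task = primary::get;
        Future<T> future = executor.submit(task);
        try {
            // Wait for the primary call, but only up to the time limit
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException | ExecutionException e) {
            // Timeout or exception in the primary call: degrade to the fallback
            future.cancel(true);
            return fallback.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            future.cancel(true);
            return fallback.get();
        } finally {
            executor.shutdownNow();
        }
    }

    // Simulates a slow dependency
    static String slowCall() {
        try {
            TimeUnit.SECONDS.sleep(2);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "slow result";
    }

    public static void main(String[] args) {
        Supplier<String> failing = () -> { throw new IllegalStateException("boom"); };
        String normal = callWithFallback(() -> "ok", () -> "fallback", 500);
        String timedOut = callWithFallback(FallbackDemo::slowCall, () -> "fallback", 200);
        String failed = callWithFallback(failing, () -> "fallback", 500);
        System.out.println(normal + ", " + timedOut + ", " + failed);
    }
}
```

The caller always gets an answer within roughly the time limit, which is the point of degradation: a friendly response replaces a long wait or an unhandled exception.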

Service circuit breaking

By analogy with a fuse: once the service reaches its maximum load, access is denied outright, like cutting the power; the degradation method is then called and a friendly prompt is returned.

Service rate limiting

Rate limiting protects the system by capping the rate of concurrent accesses or requests, or the number of requests within a time window. Once the limit is reached, requests can be rejected (redirected to an error page or told that the resource is unavailable), queued or made to wait (as in flash sales, comments, and order placement), or degraded (returning fallback or default data, for example showing the product detail page as in stock by default).
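
As a rough illustration of limiting requests within a time window, here is a minimal fixed-window counter in plain Java. It is a sketch under assumptions: FixedWindowRateLimiter is a hypothetical class written for this article, Hystrix does not ship it, and production limiters usually use smoother algorithms such as token buckets.

```java
import java.util.function.LongSupplier;

// Hypothetical fixed-window rate limiter, for illustration only.
final class FixedWindowRateLimiter {
    private final int maxRequestsPerWindow;
    private final long windowMillis;
    private final LongSupplier clock;   // injectable clock, so the limiter can be tested
    private long windowStart;
    private int count;

    FixedWindowRateLimiter(int maxRequestsPerWindow, long windowMillis, LongSupplier clock) {
        this.maxRequestsPerWindow = maxRequestsPerWindow;
        this.windowMillis = windowMillis;
        this.clock = clock;
        this.windowStart = clock.getAsLong();
    }

    // Returns true if the request may proceed; false means the caller should
    // reject, queue, or degrade the request.
    synchronized boolean tryAcquire() {
        long now = clock.getAsLong();
        if (now - windowStart >= windowMillis) {
            // A new window starts: reset the counter
            windowStart = now;
            count = 0;
        }
        if (count < maxRequestsPerWindow) {
            count++;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        FixedWindowRateLimiter limiter =
                new FixedWindowRateLimiter(2, 1000, System::currentTimeMillis);
        for (int i = 1; i <= 3; i++) {
            System.out.println("request " + i + (limiter.tryAcquire() ? " allowed" : " rejected"));
        }
    }
}
```

What happens to a rejected request (error page, queue, or default data) is a business decision made by the caller, as described above.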

4. Hystrix environment setup + JMeter high-concurrency test

4.1 Establishment of Hystrix service provider environment

1. Create the cloud-provider-hystrix-payment8001 module

2. Import POM dependencies

<dependencies>
    <dependency>
        <groupId>org.springframework.cloud</groupId>
        <artifactId>spring-cloud-starter-netflix-hystrix</artifactId>
    </dependency>

    <dependency>
        <groupId>org.springframework.cloud</groupId>
        <artifactId>spring-cloud-starter-netflix-eureka-client</artifactId>
    </dependency>

    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>

    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-actuator</artifactId>
    </dependency>

    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-devtools</artifactId>
        <scope>runtime</scope>
        <optional>true</optional>
    </dependency>

    <dependency>
        <groupId>com.cheng.springcloud</groupId>
        <artifactId>cloud-api-commons</artifactId>
        <version>${project.version}</version>
    </dependency>

    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-test</artifactId>
    </dependency>

    <dependency>
        <groupId>org.projectlombok</groupId>
        <artifactId>lombok</artifactId>
        <optional>true</optional>
    </dependency>

</dependencies>

3. Write the YML configuration file

server:
  port: 8001

spring:
  application:
    name: cloud-provider-hystrix-payment


# eureka configuration
eureka:
  client:
    service-url:
      defaultZone: http://localhost:7001/eureka/   # stand-alone version
      # defaultZone: http://eureka7001.com:7001/eureka/,http://eureka7002.com:7002/eureka/
    register-with-eureka: true   # register this service with Eureka Server
    # Whether to fetch existing registration information from Eureka Server. Default is true. It does not matter for a single node, but a Eureka cluster must set it to true in order to use load balancing with Ribbon
    fetch-registry: true
  instance:
    instance-id: springcloud-provider-payment8001   # customize status information
    prefer-ip-address: true   # show the IP address in the access path
info:
  app.name: springcloud-pengcheng
  company.name: wanli

4. Create the main startup class

@SpringBootApplication
@EnableEurekaClient
public class PaymentHystrixMain8001 {
    public static void main(String[] args) {
        SpringApplication.run(PaymentHystrixMain8001.class, args);
    }
}

5. Write business classes

Two methods: one is executed directly and the other is executed after a delay of 3 seconds

package com.cheng.springcloud.service;

import org.springframework.stereotype.Service;

import java.util.concurrent.TimeUnit;

@Service
public class PaymentService {

    public String paymentInfo_Ok(Integer id){
        return "Thread pool:"+Thread.currentThread().getName()+" paymentInfo_OK,id: "+id+"\t"+"O (studying studying) O ha ha ~";

    }

    // Delay execution by 3 seconds
    public String paymentInfo_TimeOut(Integer id){
        int timeout = 3;
        try {
            TimeUnit.SECONDS.sleep(timeout);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        return "Thread pool:"+Thread.currentThread().getName()+" paymentInfo_TimeOut,id: "+id+"\t"+"O (studying studying) O ha ha ~"+"Time (s) :"+timeout;
    }
}

6. Write controllers

package com.cheng.springcloud.controller;

import com.cheng.springcloud.service.PaymentService;
import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;

import javax.annotation.Resource;

@RestController
@Slf4j
public class PaymentController {

    @Resource
    private PaymentService paymentService;

    @Value("${server.port}")
    private String serverPort;

    @GetMapping(value = "/payment/hystrix/ok/{id}")
    public String paymentInfo_Ok(@PathVariable("id") Integer id){
        String result = paymentService.paymentInfo_Ok(id);
        log.info("=====result:"+result);
        return result;
    }
    
    @GetMapping(value = "/payment/hystrix/timeout/{id}")
    public String paymentInfo_TimeOut(@PathVariable("id") Integer id){
        String result = paymentService.paymentInfo_TimeOut(id);
        log.info("=====result:"+result);
        return result;
    }
}

7. Test

  1. Start the cloud-eureka-server7001 and cloud-eureka-server7002 modules
  2. Start the cloud-provider-hystrix-payment8001 module

Access request: http://localhost:8001/payment/hystrix/ok/1

Access request: http://localhost:8001/payment/hystrix/timeout/1 (responds after a 3-second delay)

Self-test OK!

4.2 JMeter high-concurrency test

Without high concurrency, the service above copes well enough. Next we load-test it with JMeter.

Start JMeter and send 20000 concurrent requests to the paymentInfo_TimeOut service on 8001:

1. Add a thread group in JMeter

2. Set the thread group to 20000 concurrent threads and save

3. Create the HTTP request

4. Set the access path in the request and save it

5. Start the 20000-thread concurrent test

6. View the results

  1. Start the cloud-eureka-server7001 and cloud-eureka-server7002 modules
  2. Start the cloud-provider-hystrix-payment8001 module
  3. Start the cloud-consumer-feign-hystrix-order80 module

Access request: http://localhost:8001/payment/hystrix/ok/1

Access request: http://localhost/consumer/payment/hystrix/timeout/2

The results show:

Neither paymentInfo_TimeOut nor paymentInfo_Ok responds immediately. The two endpoints belong to the same microservice, and the flood of concurrent requests to paymentInfo_TimeOut uses up Tomcat's default pool of worker threads. No threads are left to absorb the load, so paymentInfo_Ok is delayed as well.
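
The effect can be reproduced without Tomcat. The sketch below is a simplified model, not the actual Tomcat thread pool: a small fixed pool stands in for the worker threads, and SharedPoolDemo is a hypothetical class written for this illustration. Slow tasks occupy every worker, so a fast task submitted afterwards cannot even start until a slow one finishes.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical demo: one shared pool serving both slow and fast requests.
final class SharedPoolDemo {

    // Returns { fast task done while slow tasks hold all workers,
    //           fast task done after the slow tasks were released }
    static boolean[] run(int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        CountDownLatch release = new CountDownLatch(1);
        try {
            // Occupy every worker with a "slow" request that waits for the latch
            for (int i = 0; i < workers; i++) {
                pool.submit(() -> {
                    release.await();
                    return "slow";
                });
            }
            // The "fast" request is queued behind the slow ones
            Future<String> fast = pool.submit(() -> "fast");
            boolean doneWhileBlocked = fast.isDone();

            // Let the slow requests finish; only now can the fast one run
            release.countDown();
            boolean doneAfterRelease = "fast".equals(fast.get());
            return new boolean[] { doneWhileBlocked, doneAfterRelease };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            release.countDown();
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        boolean[] result = run(2);
        System.out.println("fast done while pool is saturated: " + result[0]);
        System.out.println("fast done after slow tasks finish: " + result[1]);
    }
}
```

This is why isolating a slow dependency in its own thread pool or behind a semaphore keeps unrelated endpoints responsive.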

4.3 Add a service consumer to the high-concurrency test

1. Create the cloud-consumer-feign-hystrix-order80 module

2. Import POM dependencies

<dependencies>

    <dependency>
        <groupId>org.springframework.cloud</groupId>
        <artifactId>spring-cloud-starter-openfeign</artifactId>
    </dependency>

    <dependency>
        <groupId>org.springframework.cloud</groupId>
        <artifactId>spring-cloud-starter-netflix-hystrix</artifactId>
    </dependency>

    <dependency>
        <groupId>org.springframework.cloud</groupId>
        <artifactId>spring-cloud-starter-netflix-eureka-client</artifactId>
    </dependency>

    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>

    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-actuator</artifactId>
    </dependency>

    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-devtools</artifactId>
        <scope>runtime</scope>
        <optional>true</optional>
    </dependency>

    <dependency>
        <groupId>com.cheng.springcloud</groupId>
        <artifactId>cloud-api-commons</artifactId>
        <version>${project.version}</version>
    </dependency>

    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-test</artifactId>
    </dependency>

    <dependency>
        <groupId>org.projectlombok</groupId>
        <artifactId>lombok</artifactId>
        <optional>true</optional>
    </dependency>

</dependencies>

3. Write the YML configuration file

server:
  port: 80


eureka:
  client:
    register-with-eureka: false
    service-url:
      defaultZone: http://eureka7001.com:7001/eureka/,http://eureka7002.com:7002/eureka/
      
# Set feign client timeout
ribbon:
  # Time allowed to read the response from the server after the connection is established
  ReadTimeout: 5000
  # Time allowed to establish the connection
  ConnectTimeout: 5000

4. Create the main startup class

@SpringBootApplication
@EnableFeignClients
public class OrderHystrixMain80 {
    public static void main(String[] args) {
        SpringApplication.run(OrderHystrixMain80.class, args);
    }
}

5. Write the Feign interface

@Component
@FeignClient(value = "CLOUD-PROVIDER-HYSTRIX-PAYMENT")
public interface PaymentHystrixService {

    @GetMapping(value = "/payment/hystrix/ok/{id}")
    public String paymentInfo_Ok(@PathVariable("id") Integer id);

    @GetMapping(value = "/payment/hystrix/timeout/{id}")
    public String paymentInfo_TimeOut(@PathVariable("id") Integer id);

}

6. Write controllers

@RestController
@Slf4j
public class PaymentHystrixController {

    @Resource
    private PaymentHystrixService paymentHystrixService;

    @GetMapping(value = "/consumer/payment/hystrix/ok/{id}")
    public String paymentInfo_Ok(@PathVariable("id") Integer id){
        return paymentHystrixService.paymentInfo_Ok(id);
    }

    @GetMapping(value = "/consumer/payment/hystrix/timeout/{id}")
    public String paymentInfo_TimeOut(@PathVariable("id") Integer id){
        return paymentHystrixService.paymentInfo_TimeOut(id);
    }
}

7. Test

  1. Start the cloud-eureka-server7001 and cloud-eureka-server7002 modules
  2. Start the cloud-provider-hystrix-payment8001 module
  3. Start the cloud-consumer-feign-hystrix-order80 module

Access request: http://localhost:8001/payment/hystrix/ok/1

Access request: http://localhost/consumer/payment/hystrix/timeout/2

Results analysis:

Other endpoints on the 8001 microservice are stuck because the Tomcat worker thread pool is exhausted. When consumer 80 calls 8001 at this point, the client responds slowly or even reports a timeout exception.

Conclusion

Problems encountered in the tests above:

  • Timeouts make the server respond slowly
  • Errors (downtime or program execution errors)

How to solve them:

  • If service provider 8001 times out, service consumer 80 must not stay stuck waiting; there must be service degradation
  • If service provider 8001 is down, service consumer 80 must not stay stuck waiting; there must be service degradation
  • If service provider 8001 is fine but service consumer 80 itself fails or has its own requirements (for example, its waiting time is shorter than the provider's processing time), consumer 80 handles the degradation itself

5. Service degradation

5.1 Service provider 8001 degrades the service

Handling timeouts: for the service exposed by provider 8001, set an upper limit (3 s) on the call time. Within the limit the call runs normally; beyond it, a fallback method takes over as the service degradation.

1. Modify the service in the cloud-provider-hystrix-payment8001 module:

package com.cheng.springcloud.service;

import com.netflix.hystrix.contrib.javanica.annotation.HystrixCommand;
import com.netflix.hystrix.contrib.javanica.annotation.HystrixProperty;
import org.springframework.stereotype.Service;
import org.springframework.web.bind.annotation.PathVariable;

import java.util.concurrent.TimeUnit;

@Service
public class PaymentService {

    public String paymentInfo_Ok(Integer id){
        return "Thread pool:"+Thread.currentThread().getName()+" paymentInfo_OK,id: "+id+"\t"+"O (studying studying) O ha ha ~";

    }

    @HystrixCommand(fallbackMethod = "paymentInfo_TimeOutHandler", commandProperties = {
            // Normal business calls must finish within 3 s
            @HystrixProperty(name = "execution.isolation.thread.timeoutInMilliseconds", value = "3000")
    })
    public String paymentInfo_TimeOut(Integer id){
        int timeout = 5;
        try {
            TimeUnit.SECONDS.sleep(timeout);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        return "Thread pool:"+Thread.currentThread().getName()+" paymentInfo_TimeOut,id: "+id+"\t"+"O (studying studying) O ha ha ~"+"Time (s) :"+timeout;
    }

    public String paymentInfo_TimeOutHandler(@PathVariable("id") Integer id){
        return "Thread pool:"+Thread.currentThread().getName()+" paymentInfo_TimeOutHandler,id: "+id+"\t"+"o(╥﹏╥)o";
    }
}

2. Add the @EnableCircuitBreaker annotation to the main startup class

3. Start the cloud-provider-hystrix-payment8001 module and test

Access request: http://localhost:8001/payment/hystrix/timeout/1

Once the call exceeded the time limit, the service was degraded and the fallback method was executed.

Now replace the timeout above with an arithmetic exception, int age = 10/0

    @HystrixCommand(fallbackMethod = "paymentInfo_TimeOutHandler", commandProperties = {
            // Normal business calls must finish within 3 s
            @HystrixProperty(name = "execution.isolation.thread.timeoutInMilliseconds", value = "3000")
    })
    public String paymentInfo_TimeOut(Integer id){
        int age = 10/0;
        return "Thread pool:"+Thread.currentThread().getName()+" paymentInfo_TimeOut,id: "+id+"\t"+"O (studying studying) O ha ha ~";
    }

    public String paymentInfo_TimeOutHandler(@PathVariable("id") Integer id){
        return "Thread pool:"+Thread.currentThread().getName()+" paymentInfo_TimeOutHandler,id: "+id+"\t"+"o(╥﹏╥)o";
    }

Access request: http://localhost:8001/payment/hystrix/timeout/1

The business method threw an exception, the service was degraded, and the fallback method was executed as well.

Summary: whenever the current service is unavailable, whether through a timeout or a runtime exception, the service is degraded and the fallback paymentInfo_TimeOutHandler is executed.

  • If service provider 8001 times out, service consumer 80 must not stay stuck waiting; there must be service degradation
  • If service provider 8001 is down, service consumer 80 must not stay stuck waiting; there must be service degradation

5.2 Service degradation on service consumer 80

1. In the cloud-provider-hystrix-payment8001 service provider, set the sleep in paymentInfo_TimeOut to 3 s and its timeout limit to 5 s, so that 8001 itself runs normally.

2. Enable Hystrix in the YML configuration file of the cloud-consumer-feign-hystrix-order80 module

feign:
  hystrix:
    enabled: true

3. Add the @EnableHystrix annotation to the main startup class

4. Modify the controller of the service consumer 80 module

package com.cheng.springcloud.controller;

import com.cheng.springcloud.service.PaymentHystrixService;
import com.netflix.hystrix.contrib.javanica.annotation.HystrixCommand;
import com.netflix.hystrix.contrib.javanica.annotation.HystrixProperty;
import feign.hystrix.FallbackFactory;
import lombok.extern.slf4j.Slf4j;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;

import javax.annotation.Resource;
@RestController
@Slf4j
public class PaymentHystrixController {

    @Resource
    private PaymentHystrixService paymentHystrixService;


    @GetMapping(value = "/consumer/payment/hystrix/ok/{id}")
    public String paymentInfo_Ok(@PathVariable("id") Integer id){
        return paymentHystrixService.paymentInfo_Ok(id);
    }

    @GetMapping(value = "/consumer/payment/hystrix/timeout/{id}")
    @HystrixCommand(fallbackMethod = "paymentInfo_TimeOutFallbackMethod", commandProperties = {
            // Consumer 80 waits at most 1.5 s
            @HystrixProperty(name = "execution.isolation.thread.timeoutInMilliseconds", value = "1500")
    })
    public String paymentInfo_TimeOut(@PathVariable("id") Integer id){
        return paymentHystrixService.paymentInfo_TimeOut(id);
    }

    public String paymentInfo_TimeOutFallbackMethod(@PathVariable("id") Integer id){
        return "From service consumer 80, the opposing payment system is busy, please try again later or check yourself if anything is wrong, o(╥﹏╥)o";
    }
}

We set the timeout for consumer 80 to 1.5 seconds, but provider 8001 takes 3 s to execute the business method. The call therefore times out, and consumer 80 degrades the service by executing the fallback paymentInfo_TimeOutFallbackMethod.

Access request http://localhost/consumer/payment/hystrix/timeout/1 to test:

The fallback, paymentInfo_TimeOutFallbackMethod, was executed as expected.

Next, add int age = 10/0; to paymentInfo_TimeOut in consumer 80 so that 80 fails on its own.

Then access request http://localhost/consumer/payment/hystrix/timeout/1 again to test:

Conclusion:

Even when service provider 8001 is fine, service consumer 80 degrades on its own if it fails or has its own requirements (for example, its waiting time is shorter than the service provider's processing time).

5.3 Global service degradation with @DefaultProperties

In both 5.1 and 5.2, every business method was configured with its own fallback method, and the fallbacks sat alongside the business logic. This bloats the code, so a global approach to degradation is preferable.

Apart from a few core business methods that keep their own dedicated fallbacks, ordinary ones can share a unified result through @DefaultProperties(defaultFallback = "").

Separating the general fallback from the dedicated ones avoids code bloat and keeps the amount of code reasonable.

5.3.1 Solving code bloat

Design case

On client 80, when a runtime exception occurs, drop the custom fallback method and use the global fallback method instead.

1. Add a global fallback to the controller of client 80 using the @DefaultProperties annotation

package com.cheng.springcloud.controller;

@RestController
@Slf4j
@DefaultProperties(defaultFallback = "payment_Global_FallbackMethod") // Define the global fallback method
public class PaymentHystrixController {

    @Resource
    private PaymentHystrixService paymentHystrixService;


    @GetMapping(value = "/consumer/payment/hystrix/ok/{id}")
    public String paymentInfo_Ok(@PathVariable("id") Integer id){
        return paymentHystrixService.paymentInfo_Ok(id);
    }

    @GetMapping(value = "/consumer/payment/hystrix/timeout/{id}")
//    @HystrixCommand(fallbackMethod = "paymentInfo_TimeOutFallbackMethod", commandProperties = {
//            @HystrixProperty(name = "execution.isolation.thread.timeoutInMilliseconds", value = "1500") // consumer 80 waits at most 1.5 s
//    })
    @HystrixCommand // Without fallbackMethod, the globally defined fallback is used; with it, the specified fallback is used
    public String paymentInfo_TimeOut(@PathVariable("id") Integer id){
        int age = 10/0;
        return paymentHystrixService.paymentInfo_TimeOut(id);
    }

    public String paymentInfo_TimeOutFallbackMethod(@PathVariable("id") Integer id){
        return "From service consumer 80, the opposing payment system is busy, please try again later or check yourself if anything is wrong, o(╥﹏╥)o";
    }

    // Global fallback method
    public String payment_Global_FallbackMethod(){
        return "Global exception processing message, please try again later o(╥﹏╥)o";
    }
}

2. Test:

  1. Start the cloud-eureka-server7001 and cloud-eureka-server7002 modules
  2. Start the cloud-provider-hystrix-payment8001 module
  3. Start the cloud-consumer-feign-hystrix-order80 module

Access request: http://localhost/consumer/payment/hystrix/timeout/1

The result shows that the global fallback method was executed after paymentInfo_TimeOut threw an exception. Because paymentInfo_TimeOut does not specify a fallbackMethod, the globally defined fallback is used; had one been specified, that one would be used.

Now that we have solved the problem of code bloat, let's move on to the high coupling between fallbacks and business logic.

5.3.2 Solving high coupling in the code

Design case

This case also degrades on client 80. Suppose server 8001 suddenly goes down while client 80 is calling it; a global fallback handles the failure.

Solution: add a fallback implementation class for the interface defined by the Feign client, which decouples degradation handling from the business logic.

1. Create an implementation class of the Feign interface and handle the failures of all interface methods in that one class:

@Component
public class PaymentFallBackService implements PaymentHystrixService{
    @Override
    public String paymentInfo_Ok(Integer id) {
        return "-- The PaymentFallBackService class handles the paymentInfo_Ok method o(╥﹏╥)o--";
    }

    @Override
    public String paymentInfo_TimeOut(Integer id) {
        return "-- The PaymentFallBackService class handles the paymentInfo_TimeOut method o(╥﹏╥)o--";
    }
}

2. Bind the implementation class in the @FeignClient annotation of the Feign interface

Add the fallback attribute to @FeignClient: @FeignClient(value = "CLOUD-PROVIDER-HYSTRIX-PAYMENT", fallback = PaymentFallBackService.class)

3. Test

Start by testing a healthy service, paymentInfo_Ok, which has no fallback method of its own for degradation

Access request: http://localhost/consumer/payment/hystrix/timeout/1

Then shut down server 8001 to simulate downtime and access the request again: http://localhost/consumer/payment/hystrix/timeout/1

The global degradation takes effect

Although the server went down, the global service degradation means clients get a prompt when the server is unavailable, rather than continuing to hit the server.

== Creating an implementation class of the Feign interface and handling failures for all interface methods in that one class solves both code bloat and high coupling ==
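
To see why a separate fallback class decouples degradation from business code, here is a conceptual sketch using a JDK dynamic proxy. It is not how Feign wires fallbacks internally; FallbackProxyDemo, PaymentClient, DownRemoteClient, PaymentClientFallback, and withFallback are all hypothetical names made up for this example.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;

// Conceptual sketch only: a proxy routes failed calls to a fallback
// object that implements the same interface.
final class FallbackProxyDemo {

    // Hypothetical stand-in for the Feign interface
    interface PaymentClient {
        String paymentInfo(int id);
    }

    // Simulates the remote provider being down
    static final class DownRemoteClient implements PaymentClient {
        @Override
        public String paymentInfo(int id) {
            throw new IllegalStateException("connection refused");
        }
    }

    // All degradation logic lives here, away from controllers and services
    static final class PaymentClientFallback implements PaymentClient {
        @Override
        public String paymentInfo(int id) {
            return "fallback for id " + id;
        }
    }

    static <T> T withFallback(Class<T> type, T primary, T fallback) {
        Object proxy = Proxy.newProxyInstance(
                type.getClassLoader(),
                new Class<?>[] { type },
                (self, method, args) -> {
                    try {
                        return method.invoke(primary, args);
                    } catch (InvocationTargetException e) {
                        // The primary call threw: answer from the fallback instead
                        return method.invoke(fallback, args);
                    }
                });
        return type.cast(proxy);
    }

    public static void main(String[] args) {
        PaymentClient client = withFallback(
                PaymentClient.class, new DownRemoteClient(), new PaymentClientFallback());
        System.out.println(client.paymentInfo(1));
    }
}
```

The calling code only sees the interface, so adding or changing a fallback never touches the controller.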

6. Service circuit breaking

6.1 Circuit breaker mechanism

The circuit breaker mechanism is a microservice link protection mechanism for dealing with the avalanche effect. If a microservice on the fan-out link is unavailable or responds too slowly, the service is degraded, calls to that node are cut off, and an error response is returned quickly. When calls to the node are detected to be responding normally again, the call link is restored.

In the Spring Cloud framework, circuit breaking is implemented through Hystrix. Hystrix monitors calls between microservices and opens the circuit breaker when failures reach a threshold; by default, this means at least 20 requests within a 10-second window with 50% or more of them failing.

The annotation for the circuit breaker mechanism is @HystrixCommand.

Martin Fowler's article on the circuit breaker in microservice architectures: martinfowler.com/bliki/Circu…

6.2 Service circuit breaker case

1. Add the following to the service class of cloud-provider-hystrix-payment8001:

//=============== Service circuit breaker case ===============
@HystrixCommand(fallbackMethod = "paymentCircuitBreak_fallback", commandProperties = {
        @HystrixProperty(name = "circuitBreaker.enabled", value = "true"),                     // enable the circuit breaker
        @HystrixProperty(name = "circuitBreaker.requestVolumeThreshold", value = "10"),        // number of requests
        @HystrixProperty(name = "circuitBreaker.sleepWindowInMilliseconds", value = "10000"),  // time window
        @HystrixProperty(name = "circuitBreaker.errorThresholdPercentage", value = "60")       // failure rate that trips the breaker
        // Roughly: with 10 requests within 10 seconds, if the failure rate reaches 60%, the circuit breaker trips
})
public String paymentCircuitBreak(@PathVariable("id") Integer id){
    // A negative id throws an exception, which triggers the fallback method paymentCircuitBreak_fallback()
    if (id < 0) {
        throw new RuntimeException("Id cannot be negative");
    }
    String simpleUUID = IdUtil.simpleUUID();  // similar to UUID.randomUUID().toString(), without the hyphens
    return Thread.currentThread().getName()+"\t"+"Call successful, serial number:"+ simpleUUID;
}

// fallback method
public String paymentCircuitBreak_fallback(@PathVariable("id") Integer id){
    return "Id cannot be negative, please try again later o(╥﹏╥)o id:"+id;
}

2. Controller calls the newly added method:

@GetMapping(value = "/payment/circuit/{id}")
public String paymentCircuitBreak(@PathVariable("id") Integer id){
    String result = paymentService.paymentCircuitBreak(id);
    log.info("result:"+result);
    return result;
}

3. Test:

  1. Start the cloud-provider-hystrix-payment8001 module

  2. When the id passed in is positive, the call succeeds

  3. When a negative id is passed, the fallback method is executed

  4. Circuit breaker test: as configured above, with 10 requests within 10 seconds and a failure rate of 60% or more, the circuit breaker trips. Now keep sending failing requests with a negative id; after ten failed requests, the service trips the breaker

  5. Now send a request with a positive id and check whether access is normal:

  6. However, once the time window has passed, as normal requests come in and the success rate rises, the service becomes available again.

6.3 Summary of service circuit breaking

From Martin Fowler's circuit breaker article mentioned above: martinfowler.com/bliki/Circu…

Circuit breaker states:

  • Open: requests no longer call the current service. A timer is set internally, usually to the MTTR (mean time to repair); when the breaker has been open that long, it enters the half-open state
  • Closed: the service is not short-circuited and requests pass through
  • Half-open: some requests call the current service according to the rules. If they succeed and satisfy the rules, the service is considered recovered and the breaker is closed

When does the circuit breaker start to operate:

Three important parameters are related to the circuit breaker: the snapshot time window, the request volume threshold, and the error percentage threshold

  1. Snapshot time window: to decide whether to open, the circuit breaker collects request and error statistics over a range called the snapshot time window. The default is the most recent 10 seconds.
  2. Request volume threshold: within the snapshot time window, the total number of requests must reach this threshold before the breaker can open. The default is 20. If a Hystrix command is invoked fewer than 20 times within 10 seconds, the breaker does not open even if every request times out or fails for other reasons.
  3. Error percentage threshold: once the request volume threshold has been met in the snapshot time window, for example with 30 invocations, the breaker opens if the error percentage reaches the configured threshold. With the default of 50%, 15 failures out of 30 invocations open the breaker.

After the circuit breaker is turned on:

When a request is invoked again, the main logic will not be called, but the degraded fallback will be called directly. Through circuit breaker, error detection and logic degradation can be realized automatically, and response delay can be reduced.
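
The three states and three parameters fit into a small state machine. The sketch below is a deliberately simplified model, not Hystrix's implementation: SimpleCircuitBreaker is a hypothetical class, it uses one fixed statistics window instead of rolling buckets, and it holds a lock during the call for simplicity. The numbers in main mirror the case in 6.2 (10 requests, 60%, 10-second sleep window).

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Simplified, hypothetical circuit breaker for illustration only.
final class SimpleCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int requestVolumeThreshold;     // minimum requests before tripping
    private final int errorThresholdPercentage;   // failure rate that trips the breaker
    private final long windowMillis;              // statistics window
    private final long sleepWindowMillis;         // how long to stay open before a trial call
    private final LongSupplier clock;             // injectable clock for testing

    private State state = State.CLOSED;
    private long windowStart;
    private long openedAt;
    private int total;
    private int failures;

    SimpleCircuitBreaker(int requestVolumeThreshold, int errorThresholdPercentage,
                         long windowMillis, long sleepWindowMillis, LongSupplier clock) {
        this.requestVolumeThreshold = requestVolumeThreshold;
        this.errorThresholdPercentage = errorThresholdPercentage;
        this.windowMillis = windowMillis;
        this.sleepWindowMillis = sleepWindowMillis;
        this.clock = clock;
        this.windowStart = clock.getAsLong();
    }

    synchronized State state() {
        return state;
    }

    synchronized <T> T call(Supplier<T> primary, Supplier<T> fallback) {
        long now = clock.getAsLong();
        if (state == State.OPEN) {
            if (now - openedAt < sleepWindowMillis) {
                return fallback.get();        // short-circuit: the main logic is not called
            }
            state = State.HALF_OPEN;          // sleep window over: allow a trial call
        }
        boolean failed = false;
        T result = null;
        try {
            result = primary.get();
        } catch (RuntimeException e) {
            failed = true;
        }
        record(now, failed);
        return failed ? fallback.get() : result;
    }

    private void record(long now, boolean failed) {
        if (state == State.HALF_OPEN) {
            if (failed) {
                trip(now);                    // trial call failed: open again
            } else {
                state = State.CLOSED;         // trial call succeeded: close and start fresh
                reset(now);
            }
            return;
        }
        if (now - windowStart >= windowMillis) {
            reset(now);
        }
        total++;
        if (failed) {
            failures++;
        }
        if (total >= requestVolumeThreshold
                && failures * 100 >= errorThresholdPercentage * total) {
            trip(now);
        }
    }

    private void trip(long now) {
        state = State.OPEN;
        openedAt = now;
    }

    private void reset(long now) {
        windowStart = now;
        total = 0;
        failures = 0;
    }

    public static void main(String[] args) {
        SimpleCircuitBreaker breaker =
                new SimpleCircuitBreaker(10, 60, 10_000, 10_000, System::currentTimeMillis);
        Supplier<String> failing = () -> { throw new RuntimeException("Id cannot be negative"); };
        for (int i = 0; i < 10; i++) {
            breaker.call(failing, () -> "fallback");
        }
        String reply = breaker.call(() -> "ok", () -> "fallback");
        System.out.println(breaker.state() + " -> " + reply);
    }
}
```

While the breaker is open, even a valid request receives the fallback, which matches the behaviour seen in the test in 6.2.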

7. Hystrix workflow

Flow chart:

Process steps:

1. Wrap the request:

Business methods can be wrapped using HystrixCommand or HystrixObservableCommand;

2. Initiate the request:

Execute the business method call using the command's execute method;

In addition to execute, Hystrix provides three other methods for issuing a request:

K value = command.execute();
Future<K> fValue = command.queue();
Observable<K> ohValue = command.observe();         // hot observable
Observable<K> ocValue = command.toObservable();    // cold observable

As shown above:

execute() calls queue().get(), and queue() calls toObservable().toBlocking().toFuture();

So every call is ultimately built on the Observable; the only difference is whether a synchronous or an asynchronous call is needed.
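
The layering of a synchronous call on top of an asynchronous one can be shown with the JDK alone. This sketch mirrors only the execute()/queue() relationship; MiniCommand and HelloCommand are hypothetical classes, and CompletableFuture stands in for the Observable that Hystrix uses underneath.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

// Hypothetical mini command: the synchronous path is built on the asynchronous one.
abstract class MiniCommand<T> {

    // The business logic to wrap
    protected abstract T run();

    // Asynchronous: start the work and hand back a Future immediately
    Future<T> queue() {
        return CompletableFuture.supplyAsync(this::run);
    }

    // Synchronous: simply queue() and block on the result
    T execute() {
        try {
            return queue().get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }
}

final class HelloCommand extends MiniCommand<String> {
    private final String name;

    HelloCommand(String name) {
        this.name = name;
    }

    @Override
    protected String run() {
        return "Hello " + name;
    }

    public static void main(String[] args) {
        System.out.println(new HelloCommand("Hystrix").execute());
    }
}
```

Callers pick execute() when they need the value right away and queue() when they can continue with other work first.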

3. Cache processing:

When a request arrives, Hystrix checks whether request caching is enabled (it is by default) and whether the current request carries a cache key.

If the cache is hit, the cached result is returned; otherwise the remaining logic runs;
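
The cache check amounts to a keyed lookup before the command runs. A minimal sketch, assuming one cache instance per request scope; RequestCacheDemo is a hypothetical class, and a null key stands for a command that does not take part in caching.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical request-scoped cache, for illustration only.
final class RequestCacheDemo {
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    // cacheKey == null means the command carries no cache key
    String execute(String cacheKey, Supplier<String> command) {
        if (cacheKey == null) {
            return command.get();             // no key: always run the command
        }
        // Hit: return the stored result. Miss: run the command and store it.
        return cache.computeIfAbsent(cacheKey, key -> command.get());
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        RequestCacheDemo requestCache = new RequestCacheDemo();
        String first = requestCache.execute("payment-1", () -> "result-" + calls.incrementAndGet());
        String second = requestCache.execute("payment-1", () -> "result-" + calls.incrementAndGet());
        System.out.println(first + " / " + second + " / command ran " + calls.get() + " time(s)");
    }
}
```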

4. Check whether the circuit breaker is open:

The circuit breaker is at the heart of Hystrix's design and is the main means of failing fast (while the breaker is open, calls fail immediately).

After the breaker has been open for a certain time (5000 ms by default), a trial request is allowed through.

5. Determine whether to make the business request (whether the request needs to be isolated or degraded):

Before making a business request, Hystrix decides whether to send it based on the current processing capacity.

If capacity is exhausted (thread pool, queue, or semaphore full), the request fails immediately;

Thread pool or semaphore selection (the default is thread pool):

The main advantages of the thread pool are client-side isolation and timeout settings, but with a large number of low-latency requests the overhead of frequent thread switching can be significant. In that case, the semaphore strategy can be used instead.

The main disadvantage of semaphores is that the call runs on the caller's own thread, so a timeout cannot cut it short. If the request hangs after being sent to the client, the calling thread stays blocked until the underlying call returns.
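
The trade-off between the two isolation strategies can be illustrated with JDK concurrency primitives. This is a sketch under stated assumptions, not Hystrix code: `IsolationSketch`, `callInPool`, and `callWithSemaphore` are hypothetical names, and returning a fixed fallback string stands in for the degraded response.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

// Plain-Java sketch: all names here are hypothetical, not part of the Hystrix API.
final class IsolationSketch {

    /** Thread-pool style: the work runs on a pool thread, the caller waits at most timeoutMs. */
    static String callInPool(ExecutorService pool, Callable<String> work,
                             long timeoutMs, String fallback) {
        Future<String> future;
        try {
            future = pool.submit(work);
        } catch (RejectedExecutionException e) {
            return fallback; // pool cannot accept more work: fail immediately
        }
        try {
            return future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // the caller is released even though the work was slow
            return fallback;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return fallback;
        } catch (ExecutionException e) {
            return fallback; // the business logic threw an exception
        }
    }

    /** Semaphore style: the work runs on the caller's thread; only concurrency is limited. */
    static String callWithSemaphore(Semaphore permits, Supplier<String> work, String fallback) {
        if (!permits.tryAcquire()) {
            return fallback; // no permit left: fail immediately
        }
        try {
            return work.get(); // no timeout here: the caller waits for as long as this takes
        } finally {
            permits.release();
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        Callable<String> slowCall = () -> {
            Thread.sleep(2000);
            return "late result";
        };
        System.out.println(callInPool(pool, slowCall, 100, "fallback"));
        System.out.println(callWithSemaphore(new Semaphore(1), () -> "ok", "fallback"));
        pool.shutdownNow();
    }
}
```

In the sketch, only `callInPool` can give up on a slow call, because the waiting thread and the working thread are different; `callWithSemaphore` avoids the thread hand-off but has nothing to fall back on once the work has started.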

6. Execute business request:

If capacity is available, the request is submitted to the business service through one of:

HystrixObservableCommand.construct() or HystrixCommand.run()

7. Health monitoring:

Hystrix aggregates the results of past business method executions into health indicators for the service, which serve as the basis for deciding whether the circuit breaker should open.

8/9. Handle the failure or success response

Refer to the article: www.cnblogs.com/souyoulang/…

8. Service monitoring with Hystrix Dashboard

In addition to isolating calls to dependent services, Hystrix also provides near-real-time call monitoring (Hystrix Dashboard). Hystrix continuously records the execution information of all requests initiated through it and presents it to users as statistical reports and graphs, including how many requests are executed per second, how many succeed, and how many fail. Netflix monitors these metrics through the hystrix-metrics-event-stream project. Spring Cloud also integrates the Hystrix Dashboard, which turns the monitoring data into a visual interface.

1. Create the cloud-consumer-hystrix-dashboard9001 module

2. Import dependencies

<dependencies>
    <dependency>
        <groupId>org.springframework.cloud</groupId>
        <artifactId>spring-cloud-netflix-hystrix-dashboard</artifactId>
    </dependency>

    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>

    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-actuator</artifactId>
    </dependency>

    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-devtools</artifactId>
        <scope>runtime</scope>
        <optional>true</optional>
    </dependency>

    <dependency>
        <groupId>com.cheng.springcloud</groupId>
        <artifactId>cloud-api-commons</artifactId>
        <version>${project.version}</version>
    </dependency>

    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-test</artifactId>
    </dependency>

    <dependency>
        <groupId>org.projectlombok</groupId>
        <artifactId>lombok</artifactId>
        <optional>true</optional>
    </dependency>

</dependencies>

3. Write the YML configuration file

server:
  port: 9001

4. Main startup class

@SpringBootApplication
@EnableHystrixDashboard
public class HystrixDashboardMain9001 {
    public static void main(String[] args) {
        SpringApplication.run(HystrixDashboardMain9001.class, args);
    }
}

5. Test

Visit: localhost:9001/hystrix

The Hystrix Dashboard monitoring platform is now set up.

Next, use the Hystrix Dashboard to monitor how many requests the service provider receives, how many succeed, how many fail, and so on.

1. Ensure that the dependencies of the cloud-provider-hystrix-payment8001 module include:

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
</dependency>

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>

2. Work around a pitfall by adding the following method to the main startup class:

/*
 * This configuration is only needed for service monitoring and has nothing to do with service fault tolerance itself.
 * Because the default path in Spring Boot is not "/hystrix.stream",
 * register the following servlet in your own project.
 */
@Bean
public ServletRegistrationBean getServlet() {
    HystrixMetricsStreamServlet streamServlet = new HystrixMetricsStreamServlet();
    ServletRegistrationBean registrationBean = new ServletRegistrationBean(streamServlet);
    registrationBean.setLoadOnStartup(1);
    registrationBean.addUrlMappings("/hystrix.stream");
    registrationBean.setName("HystrixMetricsStreamServlet");
    return registrationBean;
}

3. Test

  1. Start the cloud-eureka-server7001 and cloud-eureka-server7002 modules
  2. Start the cloud-consumer-hystrix-dashboard9001 module
  3. Start the cloud-provider-hystrix-payment8001 module

Monitor module 8001 from the Hystrix Dashboard on port 9001, using the stream path registered above on module 8001 (http://localhost:8001/hystrix.stream):

Access the normal service on module 8001: http://localhost:8001/payment/circuit/1

Access it repeatedly, then view the monitoring page on the Hystrix Dashboard:

Then access the failing service on module 8001: http://localhost:8001/payment/circuit/-1

Access it repeatedly, then view the monitoring page on the Hystrix Dashboard:

As the results show, after repeated failing requests the service triggers the circuit breaker mechanism and the circuit breaker opens.

HystrixDashboard monitoring graphs