In a microservice architecture, a single application is split into several microservices. A service usually needs to call multiple other services over the network to complete its work, so its stability is constrained by the overall stability of the services it depends on. If one service fails, its consumers stop working properly, the impact is progressively magnified, and in the worst case the entire service cluster crashes. This is the service avalanche effect.

Common measures against service avalanches include traffic control, improved caching, automatic service scaling, service degradation, and circuit breakers. This article introduces the service circuit breaker and uses go-kit + Hystrix to implement a circuit-breaking solution for microservices.

Service circuit breaking

Circuit breaking means that when a caller finds the service provider responding slowly or unavailable, the caller stops invoking the target service and fails fast instead, to protect itself. After a period of time the caller tries again, giving the service provider a chance to recover. This is essentially an application of the "Circuit Breaker" pattern that Martin Fowler described in his article. The switching behavior is illustrated by the following circuit breaker state diagram:

  • The initial state is Closed. As long as requests keep succeeding, the state remains Closed. Failures whose count stays below the threshold within the configured window also leave the state Closed. Once the number of failures reaches the threshold, the breaker switches to the Open state.
  • In the Open state, the caller does not invoke the target service at all. When the configured retry time elapses, the state changes to Half Open, allowing a few trial requests through. If a trial succeeds, the breaker switches back to Closed; if it fails, the breaker returns to Open. (A minimal sketch of this switch logic follows the list.)
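
To make the state switching concrete, here is a minimal, self-contained Go sketch of the logic described above. It is an illustration only, not Hystrix's actual implementation; all names (Breaker, Call, the thresholds) are invented for the example.

package breaker

import (
	"errors"
	"sync"
	"time"
)

type State int

const (
	Closed State = iota
	Open
	HalfOpen
)

var ErrOpen = errors.New("circuit breaker is open")

// Breaker is a minimal circuit breaker following the state diagram above.
type Breaker struct {
	mu          sync.Mutex
	state       State
	failures    int           // failures seen so far
	threshold   int           // failure count that trips Closed -> Open
	retryAfter  time.Duration // how long Open waits before allowing a Half Open trial
	lastFailure time.Time
}

// Call runs fn through the breaker, failing fast while the breaker is Open.
func (b *Breaker) Call(fn func() error) error {
	b.mu.Lock()
	if b.state == Open {
		if time.Since(b.lastFailure) < b.retryAfter {
			b.mu.Unlock()
			return ErrOpen // fail fast without touching the target service
		}
		b.state = HalfOpen // retry window reached: let a trial request through
	}
	b.mu.Unlock()

	err := fn()

	b.mu.Lock()
	defer b.mu.Unlock()
	if err != nil {
		b.failures++
		b.lastFailure = time.Now()
		if b.state == HalfOpen || b.failures >= b.threshold {
			b.state = Open // trial failed, or failure threshold reached
		}
		return err
	}
	b.failures = 0
	b.state = Closed // a success closes the breaker again
	return nil
}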

One of the most widely used circuit breaker implementations in the industry is Hystrix, which Netflix open-sourced and which Spring Cloud integrates for the microservices built with it. Here's a quote from the official Hystrix introduction:

Hystrix is a latency and fault tolerance library designed to isolate points of access to remote systems, services and 3rd party libraries, stop cascading failure and enable resilience in complex distributed systems where failure is inevitable.


This example uses the Go version of Hystrix, afex/hystrix-go, to implement circuit-breaker governance.

Hands-on practice

This example is based on the arithmetic_trace_demo project: it adds a circuit-breaking governance policy to the gateway, and changes nothing in the register service other than temporarily adding fault-simulation code.
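
The fault-simulation code itself is not shown in this article. A common trick (a hypothetical sketch here, not the project's actual code) is to inject a delay longer than the hystrix timeout into one of the register service's endpoints, so that every request counts as a failure:

// Hypothetical fault injection: the method name and signature are
// illustrative, not taken from the actual register service.
func (s ArithmeticService) Add(a, b int) int {
	time.Sleep(2 * time.Second) // longer than the 1s hystrix timeout; remove after testing
	return a + b
}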

Step-0: Code preparation

Copy the arithmetic_trace_demo directory and rename it arithmetic_circuitbreaker_demo.

Download the Go dependencies required for this example:

go get github.com/afex/hystrix-go

Step-1: Modify docker/docker-compose.yml

Add a hystrix-dashboard instance so that the file looks as follows:

version: '2'

services:
  consul:
    image: progrium/consul:latest
    ports:
      - 8400:8400
      - 8500:8500
      - 8600:53/udp
    hostname: consulserver
    command: -server -bootstrap -ui-dir /ui

  zipkin:
    image: openzipkin/zipkin
    ports:
      - 9411:9411

  hystrix:
    image: mlabouardy/hystrix-dashboard:latest
    ports:
      - 8181:9002

Step-2: Add gateway/router.go

Before we begin: hystrix-go provides a Do method that runs the user's business logic synchronously, blocking until the logic succeeds or an error is returned. It is defined as follows (a usage sketch appears after the parameter list):

func Do(name string, run runFunc, fallback fallbackFunc) error 
  • name: the command name, usually set to the request or service name.
  • run: the business-logic function, wrapping the call to the service provider.
  • fallback: the callback invoked when run returns an error; it typically wraps the error information.
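
For illustration, a minimal call might look like the sketch below. The command name, the URL, and the use of the standard net/http, fmt, and log packages are assumptions for the example, not values from this project:

err := hystrix.Do("arithmetic", func() error {
	// run: invoke the service provider
	resp, err := http.Get("http://localhost:9000/calculate/10/5") // illustrative URL
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	return nil
}, func(err error) error {
	// fallback: wrap the error information
	return fmt.Errorf("service unavailable: %w", err)
})
if err != nil {
	log.Println(err) // the run failed, timed out, or the breaker is open
}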

To complete the hystrix-go integration, I wrapped the original reverse-proxy logic in the Do method and implemented ServeHTTP through a HystrixRouter type, which also encapsulates the tracing and service-discovery logic (for demonstration purposes only).

HystrixRouter is defined and created as follows:

// HystrixRouter is the hystrix-aware router
type HystrixRouter struct {
	svcMap       *sync.Map      // caches the services already being monitored by hystrix
	logger       log.Logger     // logging tool
	fallbackMsg  string         // fallback message returned on failure
	consulClient *api.Client    // consul client object
	tracer       *zipkin.Tracer // tracer
}

func Routes(client *api.Client, zikkinTracer *zipkin.Tracer, fbMsg string, logger log.Logger) http.Handler {
	return HystrixRouter{
		svcMap:       &sync.Map{},
		logger:       logger,
		fallbackMsg:  fbMsg,
		consulClient: client,
		tracer:       zikkinTracer,
	}
}

The main logic that follows is implemented in ServeHTTP. The main ideas are as follows:

  • Resolve the service name from the request path and check whether it has already been added to hystrix monitoring: if not, configure it and cache it in the service list; if it has, skip this step.
  • Encapsulate service discovery, reverse proxying, and tracing in the Do method. If service discovery fails, an error message is returned; if the reverse proxy fails, an error message is returned as well (an ErrorHandler callback is added to capture reverse-proxy errors).

The detailed code is shown below; the comments walk through each step:

func (router HystrixRouter) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	// Query the original request path, for example: /arithmetic/calculate/10/5
	reqPath := r.URL.Path
	if reqPath == "" {
		return
	}
	// Split by the '/' delimiter and take the service name
	pathArray := strings.Split(reqPath, "/")
	serviceName := pathArray[1]

	// Check whether the service has already been added to hystrix monitoring
	if _, ok := router.svcMap.Load(serviceName); !ok {
		// Register the service with hystrix, using a 1-second timeout,
		// and cache it in the service list
		hystrix.ConfigureCommand(serviceName, hystrix.CommandConfig{Timeout: 1000})
		router.svcMap.Store(serviceName, serviceName)
	}

	// Execute the call through hystrix's Do method
	err := hystrix.Do(serviceName, func() (err error) {
		// Query consul for instances of serviceName
		result, _, err := router.consulClient.Catalog().Service(serviceName, "", nil)
		if err != nil {
			router.logger.Log("ReverseProxy failed", "query service instance error", err.Error())
			return
		}

		if len(result) == 0 {
			router.logger.Log("ReverseProxy failed", "no such service instance", serviceName)
			return errors.New("no such service instance")
		}

		director := func(req *http.Request) {
			// Re-assemble the request path without the service name
			destPath := strings.Join(pathArray[2:], "/")

			// Randomly pick a service instance
			tgt := result[rand.Int()%len(result)]
			router.logger.Log("service id", tgt.ServiceID)

			// Set the address of the proxied service
			req.URL.Scheme = "http"
			req.URL.Host = fmt.Sprintf("%s:%d", tgt.ServiceAddress, tgt.ServicePort)
			req.URL.Path = "/" + destPath
		}

		var proxyError error = nil
		// Use a zipkin-instrumented RoundTripper instead of the default Transport
		roundTrip, _ := zipkinhttpsvr.NewTransport(router.tracer, zipkinhttpsvr.TransportTrace(true))

		// Capture reverse-proxy errors through the ErrorHandler callback
		errorHandler := func(ew http.ResponseWriter, er *http.Request, err error) {
			proxyError = err
		}

		proxy := &httputil.ReverseProxy{
			Director:     director,
			Transport:    roundTrip,
			ErrorHandler: errorHandler,
		}
		proxy.ServeHTTP(w, r)

		return proxyError
	}, func(err error) error {
		// The run failed; log it and return the fallback message
		router.logger.Log("fallback error description", err.Error())
		return errors.New(router.fallbackMsg)
	})

	// The Do method failed; respond with the error message
	if err != nil {
		w.WriteHeader(500)
		w.Write([]byte(err.Error()))
	}
}

Step-3: Modify gateway/main.go

Create the HystrixRouter object via Routes, passing arguments according to its parameter list, then wrap the returned handler in the zipkin server middleware:

hystrixRouter := Routes(consulClient, zipkinTracer, "Circuit Breaker:Service unavailable", logger)

handler := zipkinhttpsvr.NewServerMiddleware(
	zipkinTracer,
	zipkinhttpsvr.SpanName("gateway"),
	zipkinhttpsvr.TagResponseSize(true),
	zipkinhttpsvr.ServerTags(tags),
)(hystrixRouter)

To monitor services through hystrix-dashboard, you need to enable Hystrix's real-time monitoring stream. The code is as follows:

// Enable hystrix real-time monitoring, listening on port 9010
hystrixStreamHandler := hystrix.NewStreamHandler()
hystrixStreamHandler.Start()
go func() {
	errc <- http.ListenAndServe(net.JoinHostPort("", "9010"), hystrixStreamHandler)
}()

Now, the Gateway code is complete.

Step-4: Run & test

Run

Start the Docker containers, the register service, and the gateway, then use Postman's Runner tool to test.

# Launch Consul, Zipkin, and Hystrix-Dashboard
sudo docker-compose -f docker/docker-compose.yml up

# Launch the register service
./register/register -consul.host localhost -consul.port 8500 -service.host 192.168.192.146 -service.port 9000

# Launch the gateway
./gateway/gateway -consul.host localhost -consul.port 8500

Add a new collection in Postman named Circuitbreaker and create a POST request in it. Open Runner (top left), select the new collection, set the interval to 100 milliseconds and the number of iterations to 1000 (don't run it yet), as shown below:

Open a browser and go to http://localhost:8181/hystrix. Enter the hystrix monitoring address http://192.168.192.146:9010 in the input box (substitute your own host address), then start monitoring, as shown below:

Now that the preparation is complete, let’s begin the test.

Test

Click the “Run Circuitbreaker” button in the Postman Runner interface, then check the Hystrix monitoring panel. You will see the following interface, with the circuit breaker in the Closed state:

Then stop the register service (stop it directly in the terminal so it can be restarted quickly in a moment). You will see the circuit breaker status soon change to Open:

After the register service is brought back up, the circuit breaker status returns to Closed:

Analysis

The relevant Hystrix defaults are listed below (a sketch showing how to override them follows the list):

  • DefaultErrorPercentThreshold = 50: when the request failure rate reaches 50%, the circuit breaker switches to the Open state.
  • DefaultTimeout = 1000: a request that takes longer than this (in milliseconds) is treated as a failure. This is also the value (1 second) I set explicitly in the code.
  • DefaultSleepWindow = 5000: in the Open state, a retry is attempted every 5 seconds.
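
These defaults can be overridden per command through hystrix.ConfigureCommand, using hystrix-go's CommandConfig fields. A sketch (the command name is illustrative):

hystrix.ConfigureCommand("arithmetic", hystrix.CommandConfig{
	Timeout:                1000, // request timeout in milliseconds
	ErrorPercentThreshold:  50,   // failure rate (%) that trips the breaker to Open
	SleepWindow:            5000, // milliseconds to wait in Open before a Half Open trial
	RequestVolumeThreshold: 20,   // minimum requests per window before the rate is evaluated
})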

At the beginning, the register service is healthy and every request succeeds, so the breaker stays Closed. When register is stopped and the failure rate reaches the threshold, the breaker switches to Open. After the service is restored, the breaker switches back to Closed once Hystrix reaches the retry window and the trial requests succeed.

Conclusion

In this article, hystrix-go was used within go-kit to add a circuit-breaking governance scheme to the gateway service. By taking the register service through "normal - failed - recovered", we observed the circuit breaker's state changes in hystrix-dashboard.

In real development, because of the complex dependencies between services, it is necessary to add circuit-breaker governance to our services so that losses are cut in time and the failure of a single dependency does not drag down every business line. This article is only an introduction to service circuit breakers; Hystrix's circuit breaking and degradation deserve further study.

References

  • Source code for this article @ GitHub
  • Hystrix
  • hystrix-go
  • CircuitBreaker (Martin Fowler)
  • Hystrix official documentation translation

This article was first published on my WeChat official account [Xi Yi Ang bar]. Welcome to scan the QR code and follow!