The role of circuit breakers (fuses) in distributed systems has been emphasized many times.

To get a sense of their value, see this article on how Netflix uses circuit breakers to protect its distributed systems:

Blog.51cto.com/developeryc…

For the internal implementation mechanism, refer to:

Martinfowler.com/bliki/Circu…

This article describes how Go Chassis uses circuit breakers to isolate upstream services and protect downstream services.

How does Go Chassis ensure that upstream errors do not affect downstream systems?

Go Chassis wraps https://github.com/afex/hystrix-go to provide circuit breaking and fallback (degradation).

A circuit trips when the number of goroutines processing calls, the error rate, or the number of timeouts reaches its threshold. Users can set these thresholds through the circuit breaker configuration items.

Circuit breaker logic inside hystrix-go



Go Chassis uses a uniform Invocation abstraction to represent each remote invocation, while hystrix-go uses a Command abstraction to encapsulate any fragment of execution. Each Invocation is forcibly wrapped in a Command and executed inside a circuit.

Each circuit has a unique name and a ticket bucket that stores tickets. Initially the circuit is closed, meaning everything works normally.

Each call is forced to run in the circuit's own goroutine pool and must obtain a ticket first.

A Command ultimately ends in one of only two states: timeout or done. Whenever it reaches either state, the ticket is returned.

The ticket mechanism is very similar to the token bucket algorithm used in rate limiting.

If a timeout occurs or a ticket cannot be obtained, an error is recorded. When the errors reach a certain threshold, the circuit opens and refuses to send network requests.

Service-level isolation



Each service has multiple circuits, and each circuit corresponds to one upstream microservice. When Service3 runs into problems (such as a deadlock or a flood of concurrent requests), it is isolated: no further requests are sent to it, which keeps the rest of the system healthy. Service1 can still interact with Service2 and Service4, so most functionality remains available.

Ideally, a bad call to Service3 does not bring down Service1 (a deadlocked Service1 could easily bring down the whole four-service system). But is that always the case? Let's look at a more complex system.

Why is service-level isolation not enough?



Each service is built on Go Chassis.

Assume that API2 is completed by calling Service4, API1 by calling Service3, and API3 by calling Service5.

A deadlock inside Service4 causes API2 to fail and eventually trips the circuit breaker. Service1 then isolates the whole of Service2, so one small deadlock makes the entire system fail fast.

It looks as if the circuit breaker is doing harm here, so let's see what happens when there is no circuit breaker.

Without a circuit breaker




In this case, everything depends on which client enforces a timeout. The deadlock causes the entire invocation chain to hang, the client eventually runs out of ports, and the system then fails fast.

In a system that is not robust, a deadlock is a sure way to bring down the entire distributed system, and there is no remedy.

The effect is the same with or without a service-level circuit breaker: the system fails fast. So how do we solve this?

API-level circuit breaker



Each circuit is responsible for executing, monitoring, and isolating only one API.

When the call from Service2 to Service4 fails, only that one interface is isolated, and other API calls are not affected.
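The difference between the two granularities comes down to how a circuit is named. The following sketch compares one circuit per service with one circuit per service and API. The Invocation struct here is a simplified, hypothetical stand-in (not the Go Chassis Invocation type), and the "Service.API" key format is an assumption made for illustration.

```go
package main

import "fmt"

// Invocation is a simplified stand-in for a remote call description.
// It is hypothetical and not the Go Chassis Invocation type.
type Invocation struct {
	Service string // target microservice, e.g. "Service2"
	API     string // operation on that service, e.g. "api2"
}

// serviceKey names one circuit per target service.
func serviceKey(inv Invocation) string { return inv.Service }

// apiKey names one circuit per target service and API.
func apiKey(inv Invocation) string { return inv.Service + "." + inv.API }

// blocked reports whether inv is rejected, given the set of open circuits
// and the function used to derive a circuit name from an invocation.
func blocked(open map[string]bool, key func(Invocation) string, inv Invocation) bool {
	return open[key(inv)]
}

func main() {
	failing := Invocation{Service: "Service2", API: "api2"}
	healthy := Invocation{Service: "Service2", API: "api1"}

	// Service-level: the circuit opened by api2 also blocks api1.
	open := map[string]bool{serviceKey(failing): true}
	fmt.Println(blocked(open, serviceKey, healthy)) // true

	// API-level: only api2 is isolated, api1 keeps working.
	open = map[string]bool{apiKey(failing): true}
	fmt.Println(blocked(open, apiKey, healthy)) // false
	fmt.Println(blocked(open, apiKey, failing)) // true
}
```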

Conclusion

This article showed that error isolation at the service level is not enough. It is acceptable for a system with a simple structure, but once the system becomes complex, an entire service should not be isolated because of an error in one API; finer-grained isolation is needed. Go Chassis offers API-level circuit breakers to help developers quickly isolate problematic services.

A circuit breaker works through timeouts, concurrency limits, error rates, and similar measures. It forcibly protects every remote call, so developers do not have to write their own code to deal with timeouts, deadlocks, network errors, and so on, and can focus on business code rather than the complexities of distributed systems.

Project information

Go Chassis Development Framework: https://github.com/go-chassis/go-chassis

Circuit breaker documentation: https://go-chassis.readthedocs.io/en/latest/user-guides/cb-and-fallback.html

Go Chassis series:

https://juejin.cn/post/6844903682362834952

https://juejin.cn/post/6844903682736144392