Circuit is an efficient and feature complete Hystrix like Go implementation of the circuit breaker pattern. Learn more about the problems Hystrix and other circuit breakers solve on the Hystrix Wiki. A short summary of advantages are:
- A downstream service failed and all requests hang forever. Without a circuit, your service would also hang forever. Because you have a circuit, you detect this failure quickly and can return errors quickly while waiting for the downstream service to recover.
- Circuits make great monitoring and metrics boundaries, creating common metric names for the common downstream failure types. This package goes further to formalize this in a SLO tracking pattern.
- Circuits create a common place for downstream failure fallback logic.
- Downstream services sometimes fail entirely when overloaded. While in a degraded state, circuits allow you to push downstream services to the edge between absolute failure and mostly working.
- Open/Close state of a circuit is a clear early warning sign of downstream failures.
- Circuits allow you to protect your dependencies from abnormal rushes of traffic.
There are a large number of examples on the godoc that are worth looking at. They tend to be more up to date than the README doc.
- No forced goroutines
- recoverable panic()
- Integrated with context.Context
- Comprehensive metric tracking
- Efficient implementation with Benchmarks
- Low/zero memory allocation costs
- Support for Netflix Hystrix dashboards, even with custom circuit transition logic
- Multiple error handling features
- Expose circuit health and configuration on expvar
- SLO tracking
- Customizable state transition logic, allowing complex circuit state changes
- Live configuration changes
- Many tests and examples
- Good inline documentation
- Generatable interface wrapping support with https://github.com/twitchtv/circuitgen
- Support for Additive increase/multiplicative decrease
- Prometheus metrics collector.
See UPGRADE_GUIDE.md for upgrade instructions if you're upgrading from v3 to v4.
This example shows how to create a hello-world circuit from the circuit manager
// Manages all our circuits
h := circuit.Manager{}
// Create a circuit with a unique name
c := h.MustCreateCircuit("hello-world")
// Call the circuit
errResult := c.Execute(context.Background(), func(ctx context.Context) error {
return nil
}, nil)
fmt.Println("Result of execution:", errResult)
// Output: Result of execution: <nil>
This example shows how fallbacks execute to return alternate errors or provide logic when the circuit is open.
// You can create circuits without using the manager
c := circuit.NewCircuitFromConfig("hello-world-fallback", circuit.Config{})
errResult := c.Execute(context.Background(), func(ctx context.Context) error {
return errors.New("this will fail")
}, func(ctx context.Context, err error) error {
fmt.Println("Circuit failed with error, but fallback returns nil")
return nil
})
fmt.Println("Execution result:", errResult)
// Output: Circuit failed with error, but fallback returns nil
// Execution result: <nil>
It is recommended to use circuit.Execute
and a context aware function. If, however, you want to exit
your run function early and leave it hanging (possibly forever), then you can call circuit.Go
.
h := circuit.Manager{}
c := h.MustCreateCircuit("untrusting-circuit", circuit.Config{
Execution: circuit.ExecutionConfig{
// Time out the context after a few ms
Timeout: time.Millisecond * 30,
},
})
errResult := c.Go(context.Background(), func(ctx context.Context) error {
// Sleep 30 seconds, way longer than our timeout
time.Sleep(time.Second * 30)
return nil
}, nil)
fmt.Printf("err=%v", errResult)
// Output: err=context deadline exceeded
All configuration parameters are documented in config.go. Your circuit open/close logic configuration is documented with the logic. For hystrix, this configuration is in closers/hystrix and well documented on the Hystrix wiki.
This example configures the circuit to use Hystrix open/close logic with the default Hystrix parameters
configuration := hystrix.Factory{
// Hystrix open logic is to open the circuit after an % of errors
ConfigureOpener: hystrix.ConfigureOpener{
// We change the default to wait for 10 requests, not 20, before checking to close
RequestVolumeThreshold: 10,
// The default values match what hystrix does by default
},
// Hystrix close logic is to sleep then check
ConfigureCloser: hystrix.ConfigureCloser{
// The default values match what hystrix does by default
},
}
h := circuit.Manager{
// Tell the manager to use this configuration factory whenever it makes a new circuit
DefaultCircuitProperties: []circuit.CommandPropertiesConstructor{configuration.Configure},
}
// This circuit will inherit the configuration from the example
c := h.MustCreateCircuit("hystrix-circuit")
fmt.Println("This is a hystrix configured circuit", c.Name())
// Output: This is a hystrix configured circuit hystrix-circuit
Dashboard metrics can be enabled with the MetricEventStream object. This example creates an event stream handler, starts it, then later closes the handler
// metriceventstream uses rolling stats to report circuit information
sf := rolling.StatFactory{}
h := circuit.Manager{
DefaultCircuitProperties: []circuit.CommandPropertiesConstructor{sf.CreateConfig},
}
es := metriceventstream.MetricEventStream{
Manager: &h,
}
go func() {
if err := es.Start(); err != nil {
log.Fatal(err)
}
}()
// ES is a http.Handler, so you can pass it directly to your mux
http.Handle("/hystrix.stream", &es)
// ...
if err := es.Close(); err != nil {
log.Fatal(err)
}
// Output:
If you wanted to publish hystrix information on Expvar, you can register your manager.
h := circuit.Manager{}
expvar.Publish("hystrix", h.Var())
Implement interfaces CmdMetricCollector or FallbackMetricCollector to know what happens with commands or fallbacks. Then pass those implementations to configure.
config := circuit.Config{
Metrics: circuit.MetricsCollectors{
Run: []circuit.RunMetrics{
// Here is where I would insert my custom metric collector
},
},
}
circuit.NewCircuitFromConfig("custom-metrics", config)
Code executed with Execute
does not spawn a goroutine and panics naturally go up the call stack to the caller.
This is also true for Go
, where we attempt to recover and throw panics on the same stack that
calls Go. This example will panic, and the panic can be caught up the stack.
h := circuit.Manager{}
c := h.MustCreateCircuit("panic_up")
defer func() {
r := recover()
if r != nil {
fmt.Println("I recovered from a panic", r)
}
}()
c.Execute(context.Background(), func(ctx context.Context) error {
panic("oh no")
}, nil)
// Output: I recovered from a panic oh no
Most configuration properties on the Hystrix Configuration page that say they are modifyable at runtime can be changed on the Circuit in a thread safe way. Most of the ones that cannot are related to stat collection.
This example shows how to update hystrix configuration at runtime.
// Start off using the defaults
configuration := hystrix.ConfigFactory{}
h := circuit.Manager{
// Tell the manager to use this configuration factory whenever it makes a new circuit
DefaultCircuitProperties: []circuit.CommandPropertiesConstructor{configuration.Configure},
}
c := h.MustCreateCircuit("hystrix-circuit")
fmt.Println("The default sleep window", c.OpenToClose.(*hystrix.Closer).Config().SleepWindow)
// This configuration update function is thread safe. We can modify this at runtime while the circuit is active
c.OpenToClose.(*hystrix.Closer).SetConfigThreadSafe(hystrix.ConfigureCloser{
SleepWindow: time.Second * 3,
})
fmt.Println("The new sleep window", c.OpenToClose.(*hystrix.Closer).Config().SleepWindow)
// Output:
// The default sleep window 5s
// The new sleep window 3s
If the context passed into a circuit function ends, before the circuit can
finish, it does not count the circuit as unhealthy. You can disable this
behavior with the IgnoreInterrupts
flag.
This example proves that terminating a circuit call early because the passed in context died does not, by default, count as an error on the circuit. It also demonstrates setting up internal stat collection by default for all circuits
// Inject stat collection to prove these failures don't count
f := rolling.StatFactory{}
manager := circuit.Manager{
DefaultCircuitProperties: []circuit.CommandPropertiesConstructor{
f.CreateConfig,
},
}
c := manager.MustCreateCircuit("don't fail me bro")
// The passed in context times out in one millisecond
ctx, cancel := context.WithTimeout(context.Background(), time.Millisecond)
defer cancel()
errResult := c.Execute(ctx, func(ctx context.Context) error {
select {
case <- ctx.Done():
// This will return early, with an error, since the parent context was canceled after 1 ms
return ctx.Err()
case <- time.After(time.Hour):
panic("We never actually get this far")
}
}, nil)
rs := f.RunStats("don't fail me bro")
fmt.Println("errResult is", errResult)
fmt.Println("The error and timeout count is", rs.ErrTimeouts.TotalSum() + rs.ErrFailures.TotalSum())
// Output: errResult is context deadline exceeded
// The error and timeout count is 0
Configuration factories are supported on the root manager object. This allows you to create dynamic configuration per circuit name.
You can use DefaultCircuitProperties to set configuration dynamically for any circuit
myFactory := func(circuitName string) circuit.Config {
timeoutsByName := map[string]time.Duration{
"v1": time.Second,
"v2": time.Second * 2,
}
customTimeout := timeoutsByName[circuitName]
if customTimeout == 0 {
// Just return empty if you don't want to set any config
return circuit.Config{}
}
return circuit.Config{
Execution: circuit.ExecutionConfig{
Timeout: customTimeout,
},
}
}
// Hystrix manages circuits with unique names
h := circuit.Manager{
DefaultCircuitProperties: []circuit.CommandPropertiesConstructor{myFactory},
}
h.MustCreateCircuit("v1")
fmt.Println("The timeout of v1 is", h.GetCircuit("v1").Config().Execution.Timeout)
// Output: The timeout of v1 is 1s
Most services have the concept of an SLA, or service level agreement. Unfortunantly, this is usually tracked by the service owners, which creates incentives for people to inflate the health of their service.
This Circuit implementation formalizes an SLO of the template "X% of requests will return faster than Y ms". This is a value that canont be calculated just by looking at the p90 or p99 of requests in aggregate, but must be tracked per request. You can define a SLO for your service, which is a time less than the timeout time of a request, that works as a promise of health for the service. You can then report per circuit not just fail/pass but an extra "healthy" % over time that counts only requests that resopnd quickly enough.
This example creates a SLO tracker that counts failures at less than 20 ms. You will need to provide your own Collectors.
sloTrackerFactory := responsetimeslo.Factory{
Config: responsetimeslo.Config{
// Consider requests faster than 20 ms as passing
MaximumHealthyTime: time.Millisecond * 20,
},
// Pass in your collector here: for example, statsd
CollectorConstructors: nil,
}
h := circuit.Manager{
DefaultCircuitProperties: []circuit.CommandPropertiesConstructor{sloTrackerFactory.CommandProperties},
}
h.CreateCircuit("circuit-with-slo")
Sometimes users pass invalid functions to the input of your circuit. You want to return
an error in that case, but not count the error as a failure of the circuit. Use SimpleBadRequest
in this case.
This example shows how to return errors in a circuit without considering the circuit at fault. Here, even if someone tries to divide by zero, the circuit will not consider it a failure even if the function returns non nil error.
c := circuit.NewCircuitFromConfig("divider", circuit.Config{})
divideInCircuit := func(numerator, denominator int) (int, error) {
var result int
err := c.Run(context.Background(), func(ctx context.Context) error {
if denominator == 0 {
// This error type is not counted as a failure of the circuit
return &circuit.SimpleBadRequest{
Err: errors.New("someone tried to divide by zero"),
}
}
result = numerator / denominator
return nil
})
return result, err
}
_, err := divideInCircuit(10, 0)
fmt.Println("Result of 10/0 is", err)
// Output: Result of 10/0 is someone tried to divide by zero
This implementation is more efficient than go-hystrix in every configuration. It has comparable efficiency to other implementations, faster for most when running with high concurrency.
The benchmarking code is available here. Run benchmarks with make bench
.
I benchmark the following alternative circuit implementations. I try to be fair and if there is a better way to benchmark one of these circuits, please let me know!
> make bench
cd benchmarking && go test -v -benchmem -run=^$ -bench=. . 2> /dev/null
goos: darwin
goarch: amd64
pkg: github.com/cep21/circuit/benchmarking
BenchmarkCiruits/cep21-circuit/Hystrix/passing/1-8 2000000 896 ns/op 192 B/op 4 allocs/op
BenchmarkCiruits/cep21-circuit/Hystrix/passing/75-8 3000000 500 ns/op 192 B/op 4 allocs/op
BenchmarkCiruits/cep21-circuit/Hystrix/failing/1-8 10000000 108 ns/op 0 B/op 0 allocs/op
BenchmarkCiruits/cep21-circuit/Hystrix/failing/75-8 20000000 82.5 ns/op 0 B/op 0 allocs/op
BenchmarkCiruits/cep21-circuit/Minimal/passing/1-8 10000000 165 ns/op 0 B/op 0 allocs/op
BenchmarkCiruits/cep21-circuit/Minimal/passing/75-8 20000000 87.7 ns/op 0 B/op 0 allocs/op
BenchmarkCiruits/cep21-circuit/Minimal/failing/1-8 20000000 64.4 ns/op 0 B/op 0 allocs/op
BenchmarkCiruits/cep21-circuit/Minimal/failing/75-8 100000000 19.6 ns/op 0 B/op 0 allocs/op
BenchmarkCiruits/cep21-circuit/UseGo/passing/1-8 1000000 1300 ns/op 256 B/op 5 allocs/op
BenchmarkCiruits/cep21-circuit/UseGo/passing/75-8 5000000 374 ns/op 256 B/op 5 allocs/op
BenchmarkCiruits/cep21-circuit/UseGo/failing/1-8 1000000 1348 ns/op 256 B/op 5 allocs/op
BenchmarkCiruits/cep21-circuit/UseGo/failing/75-8 5000000 372 ns/op 256 B/op 5 allocs/op
BenchmarkCiruits/GoHystrix/DefaultConfig/passing/1-8 200000 8146 ns/op 1001 B/op 18 allocs/op
BenchmarkCiruits/GoHystrix/DefaultConfig/passing/75-8 500000 2498 ns/op 990 B/op 20 allocs/op
BenchmarkCiruits/GoHystrix/DefaultConfig/failing/1-8 200000 6299 ns/op 1020 B/op 19 allocs/op
BenchmarkCiruits/GoHystrix/DefaultConfig/failing/75-8 1000000 1582 ns/op 1003 B/op 20 allocs/op
BenchmarkCiruits/rubyist/Threshold-10/passing/1-8 1000000 1834 ns/op 332 B/op 5 allocs/op
BenchmarkCiruits/rubyist/Threshold-10/passing/75-8 2000000 849 ns/op 309 B/op 4 allocs/op
BenchmarkCiruits/rubyist/Threshold-10/failing/1-8 20000000 114 ns/op 0 B/op 0 allocs/op
BenchmarkCiruits/rubyist/Threshold-10/failing/75-8 5000000 302 ns/op 0 B/op 0 allocs/op
BenchmarkCiruits/gobreaker/Default/passing/1-8 10000000 202 ns/op 0 B/op 0 allocs/op
BenchmarkCiruits/gobreaker/Default/passing/75-8 2000000 698 ns/op 0 B/op 0 allocs/op
BenchmarkCiruits/gobreaker/Default/failing/1-8 20000000 90.6 ns/op 0 B/op 0 allocs/op
BenchmarkCiruits/gobreaker/Default/failing/75-8 5000000 346 ns/op 0 B/op 0 allocs/op
BenchmarkCiruits/handy/Default/passing/1-8 2000000 1075 ns/op 0 B/op 0 allocs/op
BenchmarkCiruits/handy/Default/passing/75-8 1000000 1795 ns/op 0 B/op 0 allocs/op
BenchmarkCiruits/handy/Default/failing/1-8 1000000 1272 ns/op 0 B/op 0 allocs/op
BenchmarkCiruits/handy/Default/failing/75-8 1000000 1686 ns/op 0 B/op 0 allocs/op
BenchmarkCiruits/iand_circuit/Default/passing/1-8 10000000 119 ns/op 0 B/op 0 allocs/op
BenchmarkCiruits/iand_circuit/Default/passing/75-8 5000000 349 ns/op 0 B/op 0 allocs/op
BenchmarkCiruits/iand_circuit/Default/failing/1-8 100000000 20.4 ns/op 0 B/op 0 allocs/op
BenchmarkCiruits/iand_circuit/Default/failing/75-8 300000000 5.46 ns/op 0 B/op 0 allocs/op
PASS
ok github.com/cep21/circuit/benchmarking 59.518s
Limiting to just high concurrency passing circuits (the common case).
BenchmarkCiruits/cep21-circuit/Minimal/passing/75-8 20000000 87.7 ns/op 0 B/op 0 allocs/op
BenchmarkCiruits/GoHystrix/DefaultConfig/passing/75-8 500000 2498 ns/op 990 B/op 20 allocs/op
BenchmarkCiruits/rubyist/Threshold-10/passing/75-8 2000000 849 ns/op 309 B/op 4 allocs/op
BenchmarkCiruits/gobreaker/Default/passing/75-8 2000000 698 ns/op 0 B/op 0 allocs/op
BenchmarkCiruits/handy/Default/passing/75-8 1000000 1795 ns/op 0 B/op 0 allocs/op
BenchmarkCiruits/iand_circuit/Default/passing/75-8 5000000 349 ns/op 0 B/op 0 allocs/op
Make sure your tests pass with go test
and your lints pass with golangci-lint run
.
You can run an example set of circuits inside the /example directory
go run example/main.go
The output looks something like this:
go run example/main.go
2017/12/19 15:24:42 Serving on socket :8123
2017/12/19 15:24:42 To view the stream, execute:
2017/12/19 15:24:42 curl http://localhost:8123/hystrix.stream
2017/12/19 15:24:42
2017/12/19 15:24:42 To view expvar metrics, visit expvar in your browser
2017/12/19 15:24:42 http://localhost:8123/debug/vars
2017/12/19 15:24:42
2017/12/19 15:24:42 To view a dashboard, follow the instructions at https://github.com/Netflix/Hystrix/wiki/Dashboard#run-via-gradle
2017/12/19 15:24:42 git clone [email protected]:Netflix/Hystrix.git
2017/12/19 15:24:42 cd Hystrix/hystrix-dashboard
2017/12/19 15:24:42 ../gradlew jettyRun
2017/12/19 15:24:42
2017/12/19 15:24:42 Then, add the stream http://localhost:8123/hystrix.stream
If you load the Hystrix dasbhoard (following the above instructions), you should see metrics for all the example circuits.