Fortify is a production-grade resilience and fault-tolerance library for Go 1.23+. It provides a comprehensive suite of battle-tested patterns including circuit breakers, retries, rate limiting, timeouts, and bulkheads with zero external dependencies for core functionality.
go get github.com/felixgeelhaar/fortify
Requirements: Go 1.23 or higher
package main
import (
"context"
"fmt"
"log"
"time"
"github.com/felixgeelhaar/fortify/circuitbreaker"
"github.com/felixgeelhaar/fortify/retry"
)
func main() {
// Create a circuit breaker
cb := circuitbreaker.New[string](circuitbreaker.Config{
MaxRequests: 100,
Interval: time.Second * 10,
ReadyToTrip: func(counts circuitbreaker.Counts) bool {
return counts.ConsecutiveFailures >= 3
},
})
// Create a retry strategy
r := retry.New[string](retry.Config{
MaxAttempts: 3,
InitialDelay: time.Millisecond * 100,
BackoffPolicy: retry.BackoffExponential,
})
// Use them together: the circuit breaker wraps the retry loop
result, err := cb.Execute(context.Background(), func(ctx context.Context) (string, error) {
return r.Do(ctx, func(ctx context.Context) (string, error) {
return callExternalService(ctx)
})
})
if err != nil {
log.Fatalf("call failed: %v", err)
}
fmt.Println(result)
}
// callExternalService is a placeholder; replace it with your own remote call.
func callExternalService(ctx context.Context) (string, error) {
return "ok", nil
}
Prevents cascading failures by temporarily stopping requests to failing services.
import "github.com/felixgeelhaar/fortify/circuitbreaker"
cb := circuitbreaker.New[Response](circuitbreaker.Config{
MaxRequests: 100,
Interval: time.Second * 60,
Timeout: time.Second * 30, // Half-open timeout
ReadyToTrip: func(counts circuitbreaker.Counts) bool {
failureRatio := float64(counts.TotalFailures) / float64(counts.Requests)
return counts.Requests >= 10 && failureRatio >= 0.5
},
OnStateChange: func(from, to circuitbreaker.State) {
log.Printf("Circuit breaker: %s -> %s", from, to)
},
})
result, err := cb.Execute(ctx, func(ctx context.Context) (Response, error) {
return makeRequest(ctx)
})
States: Closed → Open → Half-Open → Closed
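To make the state cycle concrete, here is a minimal, standard-library-only sketch of a breaker that trips after a number of consecutive failures and lets a probe call through once a cooldown has passed. It is not Fortify's implementation: the `Breaker` type, its fields, and `ErrOpen` are hypothetical names chosen for illustration, and the sketch is not safe for concurrent use.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// State is a hypothetical enum for this sketch, not Fortify's type.
type State int

const (
	Closed State = iota
	Open
	HalfOpen
)

func (s State) String() string {
	return [...]string{"closed", "open", "half-open"}[s]
}

// ErrOpen is returned while the circuit rejects calls.
var ErrOpen = errors.New("circuit open")

// Breaker trips after maxFailures consecutive failures and allows a
// probe call once cooldown has elapsed. Not safe for concurrent use.
type Breaker struct {
	state       State
	failures    int
	maxFailures int
	cooldown    time.Duration
	openedAt    time.Time
}

func NewBreaker(maxFailures int, cooldown time.Duration) *Breaker {
	return &Breaker{maxFailures: maxFailures, cooldown: cooldown}
}

// Call runs fn unless the circuit is open, then records the outcome.
func (b *Breaker) Call(fn func() error) error {
	if b.state == Open {
		if time.Since(b.openedAt) < b.cooldown {
			return ErrOpen
		}
		b.state = HalfOpen // cooldown elapsed: let one probe through
	}
	if err := fn(); err != nil {
		b.failures++
		// A failed probe, or too many consecutive failures, opens the circuit.
		if b.state == HalfOpen || b.failures >= b.maxFailures {
			b.state = Open
			b.openedAt = time.Now()
		}
		return err
	}
	b.state = Closed
	b.failures = 0
	return nil
}

func main() {
	b := NewBreaker(2, 50*time.Millisecond)
	fail := func() error { return errors.New("boom") }
	ok := func() error { return nil }

	b.Call(fail)
	b.Call(fail)
	fmt.Println(b.state)               // open
	fmt.Println(b.Call(ok) == ErrOpen) // true: rejected without running ok
	time.Sleep(60 * time.Millisecond)
	b.Call(ok) // probe succeeds
	fmt.Println(b.state) // closed
}
```

A failed probe in the half-open state reopens the circuit immediately, which is why the sketch checks `b.state == HalfOpen` before the failure threshold.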
Use Cases:
Automatically retries failed operations with configurable backoff strategies.
import "github.com/felixgeelhaar/fortify/retry"
r := retry.New[Response](retry.Config{
MaxAttempts: 5,
InitialDelay: time.Millisecond * 100,
MaxDelay: time.Second * 10,
BackoffPolicy: retry.BackoffExponential,
Multiplier: 2.0,
Jitter: true,
ShouldRetry: func(err error) bool {
return isTransientError(err)
},
})
result, err := r.Do(ctx, func(ctx context.Context) (Response, error) {
return makeRequest(ctx)
})
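The interaction between `InitialDelay`, `Multiplier`, `MaxDelay`, and `Jitter` in the configuration above can be sketched with the standard library alone. This is an illustration of exponential backoff in general, not Fortify's internal algorithm; `backoffDelay` and `withJitter` are hypothetical helpers, and the jitter range of half to full delay is an assumption.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// backoffDelay returns the base wait before retry number attempt (1-based):
// initial * multiplier^(attempt-1), capped at maxDelay.
func backoffDelay(attempt int, initial, maxDelay time.Duration, multiplier float64) time.Duration {
	d := float64(initial)
	for i := 1; i < attempt && d < float64(maxDelay); i++ {
		d *= multiplier
	}
	if d > float64(maxDelay) {
		return maxDelay
	}
	return time.Duration(d)
}

// withJitter picks a random delay in [d/2, d] so that many clients
// retrying at once do not all wake up at the same instant.
func withJitter(d time.Duration) time.Duration {
	half := d / 2
	return half + time.Duration(rand.Int63n(int64(d-half)+1))
}

func main() {
	for attempt := 1; attempt <= 5; attempt++ {
		base := backoffDelay(attempt, 100*time.Millisecond, 10*time.Second, 2.0)
		fmt.Printf("retry %d: base=%v jittered=%v\n", attempt, base, withJitter(base))
	}
}
```

With these values the base delays are 100ms, 200ms, 400ms, 800ms, and 1.6s; the cap only takes effect from the eighth retry onward.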
Backoff Policies:
- BackoffConstant: Fixed delay between retries
- BackoffLinear: Linearly increasing delay
- BackoffExponential: Exponentially increasing delay

Use Cases:
Controls the rate of operations using a token bucket algorithm.
import "github.com/felixgeelhaar/fortify/ratelimit"
rl := ratelimit.New(ratelimit.Config{
Rate: 100, // 100 requests
Burst: 200, // burst of 200
Interval: time.Second, // per second
})
// Non-blocking check
if rl.Allow(ctx, "user-123") {
handleRequest()
}
// Blocking wait
if err := rl.Wait(ctx, "user-123"); err == nil {
handleRequest()
}
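For readers unfamiliar with the token bucket algorithm mentioned above, the following is a minimal single-key sketch using only the standard library. It is not Fortify's implementation; the `Bucket` type and its methods are hypothetical. A per-key limiter like the one above could, under this model, keep one bucket per key in a map.

```go
package main

import (
	"fmt"
	"math"
	"sync"
	"time"
)

// Bucket is a hypothetical token bucket: it refills at rate tokens per
// second up to burst, and each allowed call consumes one token.
type Bucket struct {
	mu     sync.Mutex
	rate   float64 // tokens added per second
	burst  float64 // maximum tokens the bucket can hold
	tokens float64 // tokens currently available
	last   time.Time
}

// NewBucket starts with a full bucket so an initial burst is allowed.
func NewBucket(rate float64, burst int) *Bucket {
	return &Bucket{rate: rate, burst: float64(burst), tokens: float64(burst), last: time.Now()}
}

// Allow refills the bucket for the time elapsed since the last call and
// reports whether a token was available, consuming it if so.
func (b *Bucket) Allow() bool {
	b.mu.Lock()
	defer b.mu.Unlock()

	now := time.Now()
	b.tokens = math.Min(b.burst, b.tokens+now.Sub(b.last).Seconds()*b.rate)
	b.last = now

	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

func main() {
	b := NewBucket(10, 2) // 10 tokens per second, burst of 2
	fmt.Println(b.Allow(), b.Allow(), b.Allow()) // true true false
	time.Sleep(150 * time.Millisecond)           // about 1.5 tokens refill
	fmt.Println(b.Allow())                       // true
}
```

The burst size bounds how many calls can pass back to back, while the rate bounds the long-run average.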
Use Cases:
Enforces time limits on operations with context propagation.
import "github.com/felixgeelhaar/fortify/timeout"
tm := timeout.New[Response](timeout.Config{
DefaultTimeout: time.Second * 30,
OnTimeout: func(duration time.Duration) {
log.Printf("Operation timed out after %v", duration)
},
})
// Use specific timeout
result, err := tm.Execute(ctx, 5*time.Second, func(ctx context.Context) (Response, error) {
return makeRequest(ctx)
})
// Use default timeout
result, err = tm.ExecuteWithDefault(ctx, func(ctx context.Context) (Response, error) {
return makeRequest(ctx)
})
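The idea of enforcing a time limit through context propagation can be sketched with `context.WithTimeout` from the standard library. This is a simplified illustration, not Fortify's code; `runWithTimeout`, `outcome`, and `timedOut` are hypothetical names. The operation is expected to watch `ctx.Done()` so that it stops promptly when the deadline passes.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// outcome carries the result of the operation across the channel.
type outcome[T any] struct {
	val T
	err error
}

// runWithTimeout runs fn with a context that is cancelled after d and
// returns context.DeadlineExceeded if fn does not finish in time.
func runWithTimeout[T any](ctx context.Context, d time.Duration, fn func(context.Context) (T, error)) (T, error) {
	ctx, cancel := context.WithTimeout(ctx, d)
	defer cancel()

	ch := make(chan outcome[T], 1) // buffered so the goroutine never blocks on send
	go func() {
		v, err := fn(ctx)
		ch <- outcome[T]{v, err}
	}()

	select {
	case r := <-ch:
		return r.val, r.err
	case <-ctx.Done():
		var zero T
		return zero, ctx.Err()
	}
}

// timedOut reports whether work lasting workMs exceeds a limit of limitMs.
func timedOut(limitMs, workMs int) bool {
	_, err := runWithTimeout(context.Background(), time.Duration(limitMs)*time.Millisecond,
		func(ctx context.Context) (string, error) {
			select {
			case <-time.After(time.Duration(workMs) * time.Millisecond):
				return "done", nil
			case <-ctx.Done(): // stop early when the deadline passes
				return "", ctx.Err()
			}
		})
	return errors.Is(err, context.DeadlineExceeded)
}

func main() {
	fmt.Println(timedOut(10, 200)) // true: the work outlives the limit
	fmt.Println(timedOut(200, 1))  // false: the work finishes in time
}
```

If the operation ignores its context, the caller still returns on time, but the goroutine keeps running until the operation finishes on its own.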
Use Cases:
Limits concurrent operations to prevent resource exhaustion.
import "github.com/felixgeelhaar/fortify/bulkhead"
bh := bulkhead.New[Response](bulkhead.Config{
MaxConcurrent: 10, // Max concurrent operations
MaxQueue: 20, // Max queued operations
QueueTimeout: time.Second * 5, // Queue wait timeout
OnRejected: func() {
log.Println("Request rejected: bulkhead full")
},
})
result, err := bh.Execute(ctx, func(ctx context.Context) (Response, error) {
return makeRequest(ctx)
})
// Get statistics
stats := bh.Stats()
log.Printf("Active: %d, Queued: %d, Rejected: %d",
stats.ActiveRequests, stats.QueuedRequests, stats.RejectedRequests)
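The core of a bulkhead is a counting semaphore. The sketch below uses a buffered channel for that purpose and rejects immediately when all slots are taken. It is an illustration under simplifying assumptions, not Fortify's implementation: there is no queue or queue timeout, and `Bulkhead`, `Try`, and `ErrFull` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrFull is returned when every slot is in use.
var ErrFull = errors.New("bulkhead full")

// Bulkhead limits concurrency with a buffered channel used as a semaphore.
type Bulkhead struct {
	slots chan struct{}
}

func NewBulkhead(maxConcurrent int) *Bulkhead {
	return &Bulkhead{slots: make(chan struct{}, maxConcurrent)}
}

// Try runs fn if a slot is free and rejects immediately otherwise.
func (b *Bulkhead) Try(fn func()) error {
	select {
	case b.slots <- struct{}{}: // acquire a slot
		defer func() { <-b.slots }() // release it when fn returns
		fn()
		return nil
	default:
		return ErrFull
	}
}

func main() {
	b := NewBulkhead(1)

	// The nested call runs while the only slot is held, so it is rejected.
	var inner error
	outer := b.Try(func() {
		inner = b.Try(func() {})
	})
	fmt.Println(outer) // <nil>
	fmt.Println(inner) // bulkhead full

	// The slot was released, so a new call is accepted.
	fmt.Println(b.Try(func() {})) // <nil>
}
```

A queue with a wait timeout, as configured above, could be approximated by replacing the `default` case with a timer case.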
Use Cases:
Provides graceful degradation with automatic fallback on errors.
import "github.com/felixgeelhaar/fortify/fallback"
fb := fallback.New[Response](fallback.Config{
Primary: func(ctx context.Context) (Response, error) {
return primaryService.Call(ctx)
},
Fallback: func(ctx context.Context, err error) (Response, error) {
log.Printf("Primary failed: %v, using fallback", err)
return fallbackService.Call(ctx)
},
ShouldFallback: func(err error) bool {
return isServiceError(err) // Only fallback on service errors
},
OnFallback: func(err error) {
metrics.IncFallbackCount()
},
})
result, err := fb.Execute(ctx)
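The decision logic behind the configuration above (run the primary, check whether the error qualifies, then run the fallback) can be sketched in a few lines of plain Go. The names `withFallback`, `fetch`, `errService`, and `errValidation` are hypothetical, and this is not Fortify's implementation.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical error values used to show selective fallback.
var (
	errService    = errors.New("service unavailable")
	errValidation = errors.New("invalid request")
)

// withFallback runs primary and, only if it fails with an error that
// shouldFallback accepts, returns the result of fallback instead.
func withFallback[T any](
	primary func() (T, error),
	fallback func(error) (T, error),
	shouldFallback func(error) bool,
) (T, error) {
	v, err := primary()
	if err == nil || !shouldFallback(err) {
		return v, err
	}
	return fallback(err)
}

// fetch simulates a primary call that fails with primaryErr (if non-nil)
// and a fallback that serves a cached value on service errors only.
func fetch(primaryErr error) (string, error) {
	return withFallback(
		func() (string, error) {
			if primaryErr != nil {
				return "", primaryErr
			}
			return "live", nil
		},
		func(error) (string, error) { return "cached", nil },
		func(err error) bool { return errors.Is(err, errService) },
	)
}

func main() {
	fmt.Println(fetch(nil))           // live <nil>
	fmt.Println(fetch(errService))    // cached <nil>
	fmt.Println(fetch(errValidation)) // empty value and "invalid request"
}
```

Restricting the fallback to service errors keeps caller mistakes, such as validation failures, visible instead of masking them with stale data.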
Use Cases:
Combine multiple patterns into a single execution chain:
import "github.com/felixgeelhaar/fortify/middleware"
chain := middleware.New[Response]().
WithBulkhead(bh).
WithRateLimit(rl, "user-key").
WithTimeout(tm, 5*time.Second).
WithCircuitBreaker(cb).
WithRetry(r)
result, err := chain.Execute(ctx, func(ctx context.Context) (Response, error) {
return makeRequest(ctx)
})
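Why ordering matters in a chain like the one above is easiest to see with plain function wrappers. The sketch below composes wrappers so that the first one listed is the outermost. Whether Fortify's `middleware` package applies its patterns in this same direction is an assumption here; `Op`, `Wrapper`, `Chain`, `label`, and `order` are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// Op is the operation being protected; Wrapper decorates an Op.
type Op func() error
type Wrapper func(Op) Op

// Chain composes wrappers so that the first one listed is the outermost,
// meaning it sees the call first and the result last.
func Chain(ws ...Wrapper) Wrapper {
	return func(op Op) Op {
		for i := len(ws) - 1; i >= 0; i-- {
			op = ws[i](op)
		}
		return op
	}
}

// label returns a wrapper that records its name when it is entered.
func label(name string, log *[]string) Wrapper {
	return func(next Op) Op {
		return func() error {
			*log = append(*log, name)
			return next()
		}
	}
}

// order reports the sequence in which the named wrappers are entered.
func order(names ...string) string {
	var log []string
	ws := make([]Wrapper, len(names))
	for i, n := range names {
		ws[i] = label(n, &log)
	}
	op := Chain(ws...)(func() error {
		log = append(log, "call")
		return nil
	})
	_ = op()
	return strings.Join(log, " > ")
}

func main() {
	fmt.Println(order("bulkhead", "timeout", "retry"))
	// bulkhead > timeout > retry > call
}
```

Under this composition, a timeout placed outside a retry bounds all attempts together, while a timeout placed inside bounds each attempt separately.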
Order matters:
Integrate resilience patterns with standard http.Handler:
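import (
"net/http"
"time"
"github.com/felixgeelhaar/fortify/circuitbreaker"
fortifyhttp "github.com/felixgeelhaar/fortify/http"
"github.com/felixgeelhaar/fortify/ratelimit"
"github.com/felixgeelhaar/fortify/timeout"
)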
// Create patterns
cb := circuitbreaker.New[*http.Response](/* config */)
rl := ratelimit.New(/* config */)
tm := timeout.New[*http.Response](/* config */)
// Apply middleware
handler := fortifyhttp.RateLimit(rl, fortifyhttp.KeyFromIP)(
fortifyhttp.Timeout(tm, 5*time.Second)(
fortifyhttp.CircuitBreaker(cb)(
http.HandlerFunc(myHandler),
),
),
)
http.Handle("/api", handler)
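The nested calls above follow the usual `func(http.Handler) http.Handler` middleware shape. As a standard-library-only sketch of that shape, the example below keys requests by client IP and answers 429 once a limit is reached. It is not Fortify's code: `rateLimit`, `fixedLimit`, and `lastStatus` are hypothetical, and the fixed counter stands in for a real limiter and is not safe for concurrent use.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"net/http/httptest"
)

// rateLimit returns middleware that asks allow whether the client IP may
// proceed and responds with 429 Too Many Requests when it may not.
func rateLimit(allow func(key string) bool) func(http.Handler) http.Handler {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			host, _, err := net.SplitHostPort(r.RemoteAddr)
			if err != nil {
				host = r.RemoteAddr // fall back to the raw address
			}
			if !allow(host) {
				http.Error(w, "rate limit exceeded", http.StatusTooManyRequests)
				return
			}
			next.ServeHTTP(w, r)
		})
	}
}

// fixedLimit allows each key a fixed number of requests in total.
// It is a stand-in for a real limiter and is not concurrency-safe.
func fixedLimit(limit int) func(string) bool {
	seen := map[string]int{}
	return func(key string) bool {
		seen[key]++
		return seen[key] <= limit
	}
}

// lastStatus sends the given number of requests from one client and
// returns the status code of the final response.
func lastStatus(requests, limit int) int {
	h := rateLimit(fixedLimit(limit))(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "ok")
	}))
	code := 0
	for i := 0; i < requests; i++ {
		rec := httptest.NewRecorder()
		h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/api", nil))
		code = rec.Code
	}
	return code
}

func main() {
	fmt.Println(lastStatus(2, 2)) // 200
	fmt.Println(lastStatus(3, 2)) // 429
}
```

Because each middleware returns a handler, the outermost wrapper runs first, so a rejected request never reaches the inner layers.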
Key Extractors:
- KeyFromIP - Extract client IP
- KeyFromHeader(name) - Extract from HTTP header

Status Codes:
- 503 Service Unavailable - Circuit breaker open
- 429 Too Many Requests - Rate limit exceeded
- 504 Gateway Timeout - Request timeout

Integrate with gRPC services:
import (
fortifygrpc "github.com/felixgeelhaar/fortify/grpc"
"google.golang.org/grpc"
)
// Unary and stream interceptors
server := grpc.NewServer(
grpc.UnaryInterceptor(
fortifygrpc.UnaryCircuitBreakerInterceptor(cb),
),
grpc.StreamInterceptor(
fortifygrpc.StreamRateLimitInterceptor(rl,
fortifygrpc.StreamKeyFromMetadata("x-api-key")),
),
)
Interceptors:
- UnaryCircuitBreakerInterceptor
- UnaryRateLimitInterceptor
- UnaryTimeoutInterceptor
- StreamCircuitBreakerInterceptor
- StreamRateLimitInterceptor
- StreamTimeoutInterceptor

import (
"log/slog"
"os"
fortifyslog "github.com/felixgeelhaar/fortify/slog"
)
logger := slog.New(slog.NewJSONHandler(os.Stdout, nil))
fortifyslog.LogPatternEvent(logger, fortifyslog.PatternCircuitBreaker, "state_change",
slog.String("from", "closed"),
slog.String("to", "open"),
)
import (
fortifyotel "github.com/felixgeelhaar/fortify/otel"
"go.opentelemetry.io/otel/attribute"
"go.opentelemetry.io/otel/sdk/trace"
)
provider := trace.NewTracerProvider(/* config */)
tracer := fortifyotel.NewTracer(provider, "my-service")
ctx, span := tracer.StartSpan(ctx, fortifyotel.PatternCircuitBreaker, "execute")
defer span.End()
tracer.SetAttributes(span,
attribute.Int("requests", 100),
attribute.String("state", "closed"),
)
Export detailed metrics for all resilience patterns:
import (
"github.com/felixgeelhaar/fortify/metrics"
"github.com/prometheus/client_golang/prometheus"
)
// Register Fortify metrics with Prometheus
metrics.MustRegister(prometheus.DefaultRegisterer)
// Use the default collector
collector := metrics.DefaultCollector()
// Record circuit breaker metrics
collector.RecordCircuitBreakerRequest("api-client", "closed")
collector.RecordCircuitBreakerSuccess("api-client")
// Record retry metrics
collector.RecordRetryAttempts("database-query", 2)
collector.RecordRetrySuccess("database-query")
Available Metrics:
Fortify is designed for production use with minimal overhead:
| Pattern | Overhead | Allocations |
|---|---|---|
| Circuit Breaker | ~30ns | 0 |
| Retry | ~25ns | 0 |
| Rate Limiter | ~45ns | 0 |
| Timeout | ~50ns | 0 |
| Bulkhead | ~39ns | 0 |
Benchmarks on Apple M1, Go 1.23
Comprehensive examples are available in the examples/ directory:
Run examples:
go run examples/basic/circuit_breaker.go
go run examples/http/server.go
go run examples/composition/chain.go
ReadyToTrip based on your error budget

Recommended order for combining patterns:

OnStateChange, OnRetry, OnTimeout, and OnRejected callbacks

Run tests with race detection:
# All tests
go test -v -race ./...
# Specific package
go test -v -race ./circuitbreaker
# With coverage
go test -v -race -coverprofile=coverage.out ./...
go tool cover -html=coverage.out
Test resilience with built-in chaos utilities:
import fortifytesting "github.com/felixgeelhaar/fortify/testing"
// Inject errors with configurable probability
injector := fortifytesting.NewErrorInjector(0.3, errors.New("service unavailable"))
// Add network latency
latency := fortifytesting.NewLatencyInjector(10*time.Millisecond, 50*time.Millisecond)
// Simulate timeouts
timeout := fortifytesting.NewTimeoutSimulator(100*time.Millisecond, 0.5)
// Create a flaky service combining all three
service := fortifytesting.NewFlakeyService(0.3, 10*time.Millisecond, 30*time.Millisecond)
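The idea behind probabilistic error injection can be sketched with the standard library. The example below is not Fortify's `ErrorInjector`; `faultInjector`, `newFaultInjector`, and `countFailures` are hypothetical names, and the seeded random source is an assumption made so that test runs are reproducible.

```go
package main

import (
	"errors"
	"fmt"
	"math/rand"
)

// faultInjector fails a call with the configured probability instead of
// running it. It is a hypothetical stand-in for a chaos utility.
type faultInjector struct {
	probability float64
	err         error
	rng         *rand.Rand
}

// newFaultInjector uses a seeded source so runs are reproducible.
func newFaultInjector(probability float64, err error, seed int64) *faultInjector {
	return &faultInjector{probability: probability, err: err, rng: rand.New(rand.NewSource(seed))}
}

// Call returns the injected error with the configured probability and
// otherwise runs fn and returns its result.
func (f *faultInjector) Call(fn func() error) error {
	if f.rng.Float64() < f.probability {
		return f.err
	}
	return fn()
}

// countFailures runs n always-successful calls through an injector with
// the given probability and counts how many were failed.
func countFailures(probability float64, n int) int {
	inj := newFaultInjector(probability, errors.New("service unavailable"), 1)
	failed := 0
	for i := 0; i < n; i++ {
		if inj.Call(func() error { return nil }) != nil {
			failed++
		}
	}
	return failed
}

func main() {
	fmt.Println(countFailures(0, 1000)) // 0
	fmt.Println(countFailures(1, 1000)) // 1000
	fmt.Println(countFailures(0.3, 1000))
}
```

Wrapping a dependency this way lets a test confirm, for example, that a retry policy recovers from an injected failure rate without touching the real service.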
Chaos Utilities:
- ErrorInjector: Simulate failures with probability
- LatencyInjector: Add realistic network delays
- TimeoutSimulator: Create timeout scenarios
- FlakeyService: Combine errors, latency, and timeouts

Automated benchmark tracking and regression detection:
# Run benchmarks with automation
./scripts/benchmark.sh run
# Generate performance baseline
./scripts/benchmark.sh generate-baseline
# Check for regressions
./scripts/benchmark.sh check
# Complete workflow
./scripts/benchmark.sh all
Features:
See Performance Testing Guide for details.
Run benchmarks:
go test -bench=. -benchmem ./...
Contributions are welcome! Please:
MIT License - see LICENSE file for details.
Fortify is inspired by resilience libraries from other ecosystems:
Built with ❤️ by Felix Geelhaar