Mert Tosun
Retry, Timeout, and Circuit Breaker: A Reliability Playbook

Mert Tosun · Reliability

Retries, timeouts, and circuit breakers often fail when each is configured in isolation. Unbounded retries, overly long timeouts, and circuit breakers that rarely trip can amplify an outage instead of containing it.

Treat them as one control system.

Timeout budgeting first

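Start from the end-to-end request SLO and split its latency budget across the downstream calls. Retries must fit inside this budget: the time spent on every attempt, plus the backoff between attempts, cannot exceed what the call was allotted.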
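As a rough illustration, the budget can be carried through a request as a deadline, with each downstream call given a timeout clipped to whatever time is left. This is a minimal sketch, not a definitive implementation: the `Deadline` class, its method names, and the example numbers are hypothetical and only show the idea.

```python
import time


class Deadline:
    """Tracks how much of an end-to-end time budget is left (hypothetical helper)."""

    def __init__(self, budget_s: float, clock=time.monotonic):
        self._clock = clock
        self._expires_at = clock() + budget_s

    def remaining(self) -> float:
        """Seconds left in the budget, never negative."""
        return max(0.0, self._expires_at - self._clock())

    def timeout_for(self, per_call_cap_s: float) -> float:
        """Timeout for the next call: the per-call cap, clipped to what is left."""
        remaining = self.remaining()
        if remaining <= 0:
            raise TimeoutError("request budget exhausted")
        return min(per_call_cap_s, remaining)


if __name__ == "__main__":
    # Assumed numbers: a 1 s request budget and a 300 ms cap per downstream call.
    deadline = Deadline(budget_s=1.0)
    print(deadline.timeout_for(0.3))
```

Passing the same `Deadline` object to every downstream call (and to the retry loop around it) keeps the sum of attempts inside the request budget, rather than giving each call its own independent timeout.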

Safe retry policy

  • Retry only transient failures
  • Use exponential backoff with jitter
  • Set a maximum number of attempts and a cap on total retry time
  • Never retry non-idempotent operations blindly
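The rules above can be sketched as a small retry helper. Everything here is illustrative: `TransientError` is a hypothetical marker exception standing in for whatever your client treats as transient (timeouts, 503s), the default values are placeholders, and the "full jitter" variant of backoff is one common choice among several. The helper assumes the caller only passes operations that are safe to repeat.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying."""


def retry(op, max_attempts=3, base_delay_s=0.1, max_delay_s=2.0,
          total_cap_s=5.0, sleep=time.sleep, clock=time.monotonic,
          rng=random.random):
    """Call op(), retrying only TransientError, with capped attempts and time.

    op must be idempotent; this helper cannot check that for you.
    """
    start = clock()
    for attempt in range(1, max_attempts + 1):
        try:
            return op()
        except TransientError:
            # Any other exception type propagates immediately, unretried.
            if attempt == max_attempts:
                raise
            # Exponential backoff with full jitter:
            # uniform in [0, min(max_delay, base * 2^(attempt - 1))).
            ceiling = min(max_delay_s, base_delay_s * 2 ** (attempt - 1))
            delay = rng() * ceiling
            # Give up if waiting would push us past the total retry time cap.
            if clock() - start + delay > total_cap_s:
                raise
            sleep(delay)
```

The `sleep`, `clock`, and `rng` parameters exist only so the behavior can be tested without real waiting; in production code the defaults would normally be left alone.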

Circuit breaker role

A circuit breaker protects a dependency under sustained failure by moving between three states:

  • Closed: normal flow
  • Open: fail fast
  • Half-open: limited probe traffic
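A minimal sketch of that state machine follows. It makes several simplifying assumptions that a real breaker would likely revisit: it trips on a count of consecutive failures (rather than an error rate over a window), treats every exception as a failure, allows a single probe in the half-open state, and is not thread-safe. The class name, parameters, and defaults are hypothetical.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without trying the dependency."""


class CircuitBreaker:
    def __init__(self, failure_threshold=5, reset_timeout_s=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout_s = reset_timeout_s
        self._clock = clock
        self.state = "closed"
        self._failures = 0      # consecutive failures seen while closed
        self._opened_at = 0.0   # clock reading when the breaker last opened

    def call(self, op):
        if self.state == "open":
            if self._clock() - self._opened_at < self.reset_timeout_s:
                raise CircuitOpenError("failing fast")
            # Reset timeout elapsed: let this call through as a probe.
            self.state = "half-open"
        try:
            result = op()
        except Exception:
            self._on_failure()
            raise
        self._on_success()
        return result

    def _on_failure(self):
        self._failures += 1
        # A failed probe reopens immediately; otherwise wait for the threshold.
        if self.state == "half-open" or self._failures >= self.failure_threshold:
            self.state = "open"
            self._opened_at = self._clock()

    def _on_success(self):
        self._failures = 0
        self.state = "closed"
```

Wrapping the retry loop inside `breaker.call(...)`, rather than the other way around, means that an open breaker stops retries as well as first attempts.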

Anti-pattern to avoid

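If every service retries aggressively at once, you get retry storms and queue growth: each layer multiplies the load on a dependency that is already struggling. Enforce retry budgets per client and per dependency, so that retries can only ever be a bounded fraction of normal traffic.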
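One way to express a retry budget is a token bucket in which ordinary requests earn retry credit and each retry spends it. The sketch below is an assumption about how such a budget might look, not a reference design: `RetryBudget`, `requests_per_retry`, and `max_banked_retries` are hypothetical names, and integer tokens are used to avoid floating-point drift.

```python
class RetryBudget:
    """Allows roughly one retry per `requests_per_retry` ordinary requests."""

    def __init__(self, requests_per_retry=10, max_banked_retries=10):
        self._cost = requests_per_retry
        self._cap = requests_per_retry * max_banked_retries
        self._tokens = 0

    def record_request(self):
        """Call once per first attempt; earns one token, up to the cap."""
        self._tokens = min(self._cap, self._tokens + 1)

    def try_acquire_retry(self) -> bool:
        """Spend tokens for one retry; False means the budget is exhausted."""
        if self._tokens >= self._cost:
            self._tokens -= self._cost
            return True
        return False


if __name__ == "__main__":
    # One budget per dependency, held by each client (names are made up).
    budgets = {"payments": RetryBudget(), "search": RetryBudget()}
    budgets["payments"].record_request()
    print(budgets["payments"].try_acquire_retry())
```

With the default of 10 requests per retry, retries add at most about 10% to the load a client sends to a dependency, no matter how many calls are failing.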

Conclusion

Reliability improves when retries, timeouts, and circuit breakers are designed together, observed with clear metrics, and tuned against real latency and error profiles.