Idempotency Keys and the Exactly-Once Myth in Distributed Systems
Teams often say "exactly once" when they really mean "at-least-once delivery plus deduplication." In real systems, retries, network timeouts, and client reconnects make duplicate requests normal behavior, not an edge case.
Idempotency keys are the practical way to keep side effects safe: a retried request that carries the same key takes effect only once.
Why duplicates happen
- The client times out, but the server still completes the write
- A load balancer retries after a transient 502
- A user taps the same payment button twice
- A message broker redelivers after a consumer crash
Core design
Use a stable idempotency key per business operation (for example, one checkout submission). Store the key together with the operation's outcome, and return the same stored response on duplicate attempts.
Request(idempotency_key=abc123)
-> lookup key
-> if exists: return stored result
-> if not exists: execute + persist result atomically
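The flow above can be sketched in a few lines of Python. This is a minimal in-memory illustration, not a production design: the class name IdempotentExecutor, the charge_card function, and the response shape are all hypothetical, and a single lock stands in for the atomic "execute + persist" step that a real service would get from a database transaction.

```python
import threading
from typing import Callable, Dict


class IdempotentExecutor:
    """Runs an operation at most once per idempotency key (in-memory sketch)."""

    def __init__(self) -> None:
        self._results: Dict[str, dict] = {}
        self._lock = threading.Lock()

    def execute(self, idempotency_key: str, operation: Callable[[], dict]) -> dict:
        # One lock covers lookup, execution, and persistence, so two
        # concurrent requests with the same key cannot both run the operation.
        with self._lock:
            if idempotency_key in self._results:
                # Duplicate attempt: replay the stored result.
                return self._results[idempotency_key]
            result = operation()
            self._results[idempotency_key] = result
            return result


charges = []  # records every real side effect, for demonstration only


def charge_card() -> dict:
    charges.append(4200)
    return {"status": "charged", "amount_cents": 4200}


executor = IdempotentExecutor()
first = executor.execute("abc123", charge_card)
second = executor.execute("abc123", charge_card)  # retry with the same key
```

Both calls return the same response, and the card is charged once. Serializing every request through one lock is only acceptable in a sketch; a per-key lock or a database constraint scales better.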
Implementation rules
- Scope the key by tenant/user to avoid cross-account collisions.
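- Persist both successes and known business failures, so a retry replays the original outcome instead of re-running the operation.
- Set an expiration window at least as long as the longest period over which clients or brokers may retry.
- Enforce a unique constraint on the scoped key in storage, so concurrent duplicates cannot both insert.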
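These rules can be sketched with SQLite from the Python standard library. Everything named here is an assumption made for illustration: the table and column names, the 24-hour RETENTION_SECONDS window, and the create_order function. The side effect is a local insert, which is what lets it share one transaction with the key record.

```python
import json
import sqlite3
import time

RETENTION_SECONDS = 24 * 60 * 60  # assumed retry window; tune to your clients


def open_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE idempotency_keys (
            tenant_id       TEXT NOT NULL,
            idempotency_key TEXT NOT NULL,
            response_json   TEXT NOT NULL,
            expires_at      REAL NOT NULL,
            -- Scoped uniqueness: the same key under two tenants never collides.
            PRIMARY KEY (tenant_id, idempotency_key)
        );
        CREATE TABLE orders (
            id        INTEGER PRIMARY KEY AUTOINCREMENT,
            tenant_id TEXT NOT NULL,
            item      TEXT NOT NULL
        );
    """)
    return conn


def create_order(conn, tenant_id, key, item, now=None):
    now = time.time() if now is None else now
    with conn:  # one transaction: commits on success, rolls back on error
        conn.execute("DELETE FROM idempotency_keys WHERE expires_at <= ?", (now,))
        row = conn.execute(
            "SELECT response_json FROM idempotency_keys "
            "WHERE tenant_id = ? AND idempotency_key = ?",
            (tenant_id, key),
        ).fetchone()
        if row is not None:
            return json.loads(row[0])  # duplicate: replay the stored outcome
        if not item:
            # A known business failure is stored and replayed like a success.
            response = {"status": "rejected", "reason": "empty item"}
        else:
            cur = conn.execute(
                "INSERT INTO orders (tenant_id, item) VALUES (?, ?)",
                (tenant_id, item),
            )
            response = {"status": "created", "order_id": cur.lastrowid}
        # If a concurrent duplicate got here first, the primary key raises
        # sqlite3.IntegrityError and the order insert above is rolled back.
        conn.execute(
            "INSERT INTO idempotency_keys VALUES (?, ?, ?, ?)",
            (tenant_id, key, json.dumps(response), now + RETENTION_SECONDS),
        )
        return response


conn = open_db()
a = create_order(conn, "tenant-a", "k1", "book", now=1000.0)
b = create_order(conn, "tenant-a", "k1", "book", now=1001.0)  # replayed
c = create_order(conn, "tenant-b", "k1", "pen", now=1002.0)   # other tenant
```

Here a and b are the same stored response, while c creates a second order because the key is scoped by tenant. Once the window has passed, the same key is treated as a new operation, which is why the window should outlast every retry path.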
Common mistakes
- Generating a new key for each retry attempt, which makes every retry look like a new operation
- Saving the key after the side effect instead of atomically with it
- Returning 409 without replaying the original result body
- Treating non-idempotent downstream calls as safe to repeat
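The first mistake is easiest to see from the client side. The sketch below is hypothetical throughout: FlakyServer simulates a server that completes the write but loses the first response, and submit_with_retries generates the key once, outside the retry loop, so every attempt carries the same key.

```python
import uuid


class FlakyServer:
    """Hypothetical server: the first call commits the write, then times out."""

    def __init__(self) -> None:
        self.results = {}
        self.writes = 0
        self.calls = 0

    def submit(self, idempotency_key: str, payload: dict) -> dict:
        self.calls += 1
        if idempotency_key not in self.results:
            self.writes += 1  # the real side effect happens here
            self.results[idempotency_key] = {"status": "ok", "echo": payload}
        if self.calls == 1:
            raise TimeoutError("response lost after the write completed")
        return self.results[idempotency_key]


def submit_with_retries(server: FlakyServer, payload: dict, max_attempts: int = 3) -> dict:
    # Generated once per business operation, NOT once per attempt.
    key = str(uuid.uuid4())
    for attempt in range(max_attempts):
        try:
            return server.submit(key, payload)
        except TimeoutError:
            if attempt == max_attempts - 1:
                raise
    raise RuntimeError("max_attempts must be at least 1")


server = FlakyServer()
response = submit_with_retries(server, {"cart": 7})
```

The server is called twice but writes once. Moving the uuid4 call inside the loop would give each attempt a fresh key and produce two writes.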
Conclusion
Exactly-once delivery is usually a protocol claim, not an end-to-end guarantee. Idempotency keys give you realistic protection against duplicates and make write APIs predictable under failure conditions.