Design an API Gateway
Edge proxy with auth, rate-limiting, routing, observability, request transformation, and circuit breakers — at < 5 ms p99 added latency.
Intro
An API Gateway is the front door to a service-oriented backend. It handles auth, rate limiting, routing to N upstream services, request transformation, and observability. The hard part is staying invisible — adding < 5 ms p99 latency while doing all of that at hundreds of thousands of QPS.
Functional
- Route requests to upstream services by path / host.
- Authenticate (JWT / mTLS / API key) + authorize.
- Rate-limit per scope (IP, user, key).
- Request / response transformation (path rewrite, header injection).
- Circuit breaker on upstream failures.
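Routing by path / host can be sketched as a longest-prefix match. This is a minimal illustration, not how any particular proxy does it: the `Route` type, the `"*"` wildcard host, and the tie-break rule (exact host beats wildcard at equal prefix length) are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Route:
    host: str          # exact host, or "*" for any host (assumed convention)
    path_prefix: str   # e.g. "/orders"
    upstream: str      # hypothetical upstream name


def match_route(routes: List[Route], host: str, path: str) -> Optional[Route]:
    """Longest path prefix wins; at equal length an exact host beats "*"."""
    best = None
    best_key = (-1, -1)
    for r in routes:
        if r.host not in (host, "*"):
            continue
        # Match on segment boundaries so "/orders" does not match "/ordersfoo".
        prefix = r.path_prefix.rstrip("/")
        if not (path == prefix or path.startswith(prefix + "/")):
            continue
        key = (len(prefix), 1 if r.host == host else 0)
        if key > best_key:
            best, best_key = r, key
    return best
```

A linear scan is fine for a handful of routes; a large route table would more likely use a trie or a precompiled matcher to stay inside the latency budget.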
Non-functional
- Added latency p99 < 5 ms.
- Throughput ≥ 200 k QPS per region.
- Availability ≥ 99.99%.
- Hot-reloadable config — no restarts.
Components
Edge proxy fleet
Stateless; horizontally scaled (Envoy / NGINX / custom).
Auth service
Verifies JWT signatures locally; caches introspection results for tokens that need a remote check.
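A minimal sketch of the introspection cache, assuming a fixed TTL per entry. The `introspect` callable stands in for the remote call to the auth service, and the class name, default TTL, and injectable clock are hypothetical choices for illustration.

```python
import time
from typing import Callable, Dict, Tuple


class IntrospectionCache:
    def __init__(self, introspect: Callable[[str], dict], ttl_s: float = 30.0,
                 clock: Callable[[], float] = time.monotonic):
        self._introspect = introspect      # slow remote call (assumed interface)
        self._ttl_s = ttl_s
        self._clock = clock
        # token -> (expiry timestamp, introspection result)
        # Unbounded here; a production cache would also cap its size.
        self._entries: Dict[str, Tuple[float, dict]] = {}

    def lookup(self, token: str) -> dict:
        now = self._clock()
        hit = self._entries.get(token)
        if hit is not None and hit[0] > now:
            return hit[1]                  # fast path: no upstream call
        result = self._introspect(token)   # miss: pay the auth round trip
        self._entries[token] = (now + self._ttl_s, result)
        return result
```

The TTL is the trade-off knob: a longer TTL means fewer misses on the request path, but a revoked token stays accepted for up to that long.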
Rate limiter
Redis-backed counters; a Lua script makes each check-and-update atomic.
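The per-key logic a rate limiter runs can be sketched as a token bucket. The notes do not name an algorithm, so token bucket is an assumption, and this version keeps state in a local dict rather than Redis. In the Redis-backed design, the read, refill, and decrement inside `allow` would run as a single Lua script so that concurrent proxies cannot interleave between steps.

```python
import time
from typing import Callable, Dict, Tuple


class TokenBucketLimiter:
    def __init__(self, rate_per_s: float, burst: float,
                 clock: Callable[[], float] = time.monotonic):
        self._rate = rate_per_s    # tokens refilled per second
        self._burst = burst        # bucket capacity
        self._clock = clock
        # scope key (IP, user, API key) -> (tokens left, last update time)
        self._state: Dict[str, Tuple[float, float]] = {}

    def allow(self, key: str, cost: float = 1.0) -> bool:
        now = self._clock()
        tokens, last = self._state.get(key, (self._burst, now))
        # Refill for the elapsed time, capped at the bucket capacity.
        tokens = min(self._burst, tokens + (now - last) * self._rate)
        allowed = tokens >= cost
        if allowed:
            tokens -= cost
        self._state[key] = (tokens, now)
        return allowed
```

A rejected request would typically be answered with HTTP 429 at the edge, without touching the upstream.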
Service discovery
Upstream service registry (Consul / DNS / k8s).
Config plane
Routes + policies; pushed via xDS / pub-sub.
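One way to apply a pushed config without restarts or torn state is to build a complete immutable snapshot and swap a single reference. The sketch below assumes that approach; `Snapshot`, `make_snapshot`, and `ConfigHolder` are hypothetical names, and the version check is an assumed guard against stale or duplicate pushes.

```python
import threading
from dataclasses import dataclass
from types import MappingProxyType
from typing import Dict, Mapping


@dataclass(frozen=True)
class Snapshot:
    version: int
    routes: Mapping[str, str]  # path prefix -> upstream name


def make_snapshot(version: int, routes: Dict[str, str]) -> Snapshot:
    # Copy and wrap read-only so a published snapshot can never be mutated.
    return Snapshot(version, MappingProxyType(dict(routes)))


class ConfigHolder:
    def __init__(self, initial: Snapshot):
        self._current = initial
        self._write_lock = threading.Lock()  # serializes writers only

    def current(self) -> Snapshot:
        # Readers grab the reference once per request and use it throughout.
        return self._current

    def publish(self, new: Snapshot) -> bool:
        with self._write_lock:
            if new.version <= self._current.version:
                return False       # stale or duplicate push; ignore
            self._current = new    # single reference swap
            return True
```

A request that started under version 1 keeps seeing version 1 until it finishes, while requests that start after the swap see version 2; no request observes a mix of the two.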
Observability sink
Metrics + logs + traces.
Trade-offs
Centralized gateway vs service mesh
Pros
- Gateway: one entry point for external traffic.
- Mesh: per-pod sidecar gives fine-grained, per-service control.
Cons
- Gateway: covers external traffic only; service-to-service calls bypass it.
- Mesh: a sidecar in every pod adds deployment and operational overhead.
Off-the-shelf (Envoy/Kong) vs custom
Pros
- Off-the-shelf: battle-tested.
- Custom: tuned for one workload.
Cons
- Off-the-shelf: customization limited to what its plugin or filter model allows.
- Custom: maintenance burden.
Scale concerns
- Auth introspection cache miss: the request waits on a round trip to the auth service.
- Hot route: give it a dedicated proxy pool so it cannot starve other routes.
- Config push — atomic swap; no torn state.
- Cascading failure on upstream slowdown: contain it with timeouts and circuit breakers.
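The circuit breaker that limits cascading failure can be sketched as a small state machine: closed, open, and half-open after a cool-down. This is a simplified illustration under assumed parameters (consecutive-failure threshold, fixed open interval); the class and method names are hypothetical.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 5, open_s: float = 10.0,
                 clock: Callable[[], float] = time.monotonic):
        self._threshold = failure_threshold  # consecutive failures before opening
        self._open_s = open_s                # cool-down before probing again
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None  # None means closed

    def allow_request(self) -> bool:
        if self._opened_at is None:
            return True
        # Half-open once the cool-down has elapsed: let probe traffic through.
        # Simplification: this does not limit the probe to a single request.
        return self._clock() - self._opened_at >= self._open_s

    def record_success(self) -> None:
        self._failures = 0
        self._opened_at = None               # close the circuit

    def record_failure(self) -> None:
        self._failures += 1
        if self._failures >= self._threshold:
            self._opened_at = self._clock()  # open, or re-open after a failed probe
```

While the circuit is open the proxy fails fast instead of queueing requests behind a slow upstream, which is what keeps one degraded service from exhausting gateway connections for everyone else.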