Jarviix
HLD · 10 min read

Design an API Gateway

An edge proxy that handles auth, rate limiting, routing, observability, request transformation, and circuit breaking while adding less than 5 ms of p99 latency.

hld · system-design · infra

Intro

An API Gateway is the front door to a service-oriented backend. It handles auth, rate limiting, routing to N upstream services, request transformation, and observability. The hard part is staying invisible: adding less than 5 ms of p99 latency while doing all of that at hundreds of thousands of QPS.

Functional

  • Route requests to upstream services by path / host.
  • Authenticate (JWT / mTLS / API key) and authorize requests.
  • Rate-limit per scope (IP, user, key).
  • Request / response transformation (path rewrite, header injection).
  • Circuit breaker on upstream failures.
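
Routing by path and host, the first item above, could be sketched as follows. This is a minimal illustration, assuming "exact host beats wildcard, then longest path prefix wins" as the precedence rule; the `Route` type, the `match_route` function, and the upstream names are hypothetical and not taken from any particular proxy.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Route:
    host: str          # exact host, or "*" for any host (assumed convention)
    path_prefix: str   # e.g. "/orders"
    upstream: str      # hypothetical upstream service name


def _prefix_matches(prefix: str, path: str) -> bool:
    """Match on segment boundaries so "/orders" does not match "/orders-v2"."""
    if prefix == "/":
        return True
    prefix = prefix.rstrip("/")
    return path == prefix or path.startswith(prefix + "/")


def match_route(routes: list[Route], host: str, path: str) -> Optional[Route]:
    """Return the most specific route: exact host first, then longest prefix."""
    candidates = [
        r for r in routes
        if r.host in (host, "*") and _prefix_matches(r.path_prefix, path)
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda r: (r.host != "*", len(r.path_prefix)))


ROUTES = [
    Route("*", "/", "web-frontend"),
    Route("*", "/orders", "orders-svc"),
    Route("api.example.com", "/orders/bulk", "bulk-orders-svc"),
]
```

A production proxy would typically compile routes into a trie or radix tree rather than scanning a list on every request; the linear scan here only keeps the precedence rule easy to read.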

Non-functional

  • Added latency p99 < 5 ms.
  • Throughput ≥ 200 k QPS per region.
  • Availability ≥ 99.99%.
  • Hot-reloadable config — no restarts.
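
One way to get hot-reloadable config without restarts is to treat each config version as an immutable snapshot and swap a single reference. The sketch below assumes a simplified config shape (a prefix-to-upstream map) and hypothetical names `ConfigSnapshot` and `ConfigHolder`. In-flight requests keep the snapshot they started with, so they never observe a half-applied update.

```python
import threading
from dataclasses import dataclass
from types import MappingProxyType
from typing import Mapping


@dataclass(frozen=True)
class ConfigSnapshot:
    version: int
    routes: Mapping[str, str]  # path prefix -> upstream (hypothetical shape)


class ConfigHolder:
    def __init__(self, initial: ConfigSnapshot) -> None:
        self._current = initial
        self._write_lock = threading.Lock()  # serializes writers only

    def get(self) -> ConfigSnapshot:
        # A request reads the reference once and uses that snapshot throughout.
        return self._current

    def publish(self, version: int, routes: dict[str, str]) -> bool:
        """Build a complete new snapshot, then swap it in with one assignment."""
        with self._write_lock:
            if version <= self._current.version:
                return False  # stale or duplicate push; keep the current snapshot
            snapshot = ConfigSnapshot(version, MappingProxyType(dict(routes)))
            self._current = snapshot
            return True
```

Readers take no lock; the only shared mutation is the reference assignment in `publish`, and the version check discards pushes that arrive out of order.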

Components

  • Edge proxy fleet

    Stateless; horizontally scaled (Envoy / NGINX / custom).

  • Auth service

    Verifies JWTs and caches token introspection results.

  • Rate limiter

    Redis-backed; counters are updated atomically via Lua scripts.

  • Service discovery

    Upstream service registry (Consul / DNS / k8s).

  • Config plane

    Routes + policies; pushed via xDS / pub-sub.

  • Observability sink

    Metrics + logs + traces.
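
The rate limiter component is commonly built as a token bucket. The sketch below is an assumption about the algorithm, not something the design above prescribes, and it keeps bucket state in an in-process dict for readability. In the Redis-backed version, the same read, refill, and decrement steps would have to run as one atomic Lua script so that concurrent proxies cannot both spend the same token. The class and parameter names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Bucket:
    tokens: float
    updated_at: float


class TokenBucketLimiter:
    def __init__(self, rate_per_sec: float, burst: float) -> None:
        self.rate = rate_per_sec  # tokens added per second
        self.burst = burst        # maximum tokens a bucket can hold
        self._buckets: dict[str, Bucket] = {}

    def allow(self, key: str, now: float, cost: float = 1.0) -> bool:
        """Refill the bucket for `key` up to `now`, then try to spend `cost`."""
        bucket = self._buckets.get(key)
        if bucket is None:
            # New scopes start with a full bucket.
            bucket = Bucket(tokens=self.burst, updated_at=now)
            self._buckets[key] = bucket
        elapsed = now - bucket.updated_at
        if elapsed > 0:
            bucket.tokens = min(self.burst, bucket.tokens + elapsed * self.rate)
            bucket.updated_at = now
        if bucket.tokens >= cost:
            bucket.tokens -= cost
            return True
        return False
```

The `key` encodes the scope, for example `"ip:203.0.113.7"` or `"user:42"`, so one limiter can serve per-IP, per-user, and per-key limits. Passing `now` explicitly keeps the logic deterministic and easy to test.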

Trade-offs

Centralized gateway vs service mesh

Pros

  • Gateway: one entry point for external traffic.
  • Mesh: per-pod sidecar, fine-grained.

Cons

  • Gateway: covers external traffic only; service-to-service calls bypass it.
  • Mesh: deployment overhead.

Off-the-shelf (Envoy/Kong) vs custom

Pros

  • Off-the-shelf: battle-tested.
  • Custom: tuned for one workload.

Cons

  • Off-the-shelf: less customization.
  • Custom: maintenance burden.

Scale concerns

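  • Auth introspection cache miss: each miss adds a round trip to the auth service on the request path.
  • Hot route: isolate high-traffic routes onto a dedicated proxy pool so they cannot starve other routes.
  • Config push: apply as an atomic swap so no request sees torn state.
  • Cascading failure: a slow upstream ties up gateway connections; bound it with timeouts and circuit breakers.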
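
A circuit breaker is the usual guard against the cascading-failure case. The following is a minimal single-threaded sketch, assuming a consecutive-failure threshold and a fixed reset timeout; the names `CircuitBreaker` and `CircuitOpenError` and the default values are hypothetical, and production breakers often use rolling error rates instead.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without contacting the upstream."""


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 5, reset_timeout: float = 10.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None  # None means closed

    @property
    def state(self) -> str:
        if self._opened_at is None:
            return "closed"
        if self._clock() - self._opened_at >= self.reset_timeout:
            return "half-open"  # allow a probe request through
        return "open"

    def call(self, fn: Callable[[], T]) -> T:
        if self.state == "open":
            raise CircuitOpenError("upstream circuit is open")
        try:
            result = fn()
        except Exception:
            self._record_failure()
            raise
        # Any success closes the circuit and resets the failure count.
        self._failures = 0
        self._opened_at = None
        return result

    def _record_failure(self) -> None:
        self._failures += 1
        # A failed half-open probe, or too many consecutive failures, opens the circuit.
        if self._opened_at is not None or self._failures >= self.failure_threshold:
            self._opened_at = self._clock()
```

While the circuit is open the gateway fails fast instead of holding connections to a slow upstream, which is what stops the slowdown from spreading. The injectable `clock` exists only to make the state transitions testable.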
