Design a Distributed Cache
Consistent hashing, replication for read-heavy workloads, and the cache-stampede problem.
Intro
A distributed cache is rarely the most novel design, but it's the most common — almost every other system uses one. The interesting decisions are sharding, eviction, and what happens when many requests miss the same key at once.
Functional
- GET / SET / DEL with TTL.
- Bulk MGET.
- Optional pub/sub for invalidation.
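The functional surface above is small; a minimal single-node sketch (class and method names are illustrative, with lazy expiry on read) looks like:

```python
import time


class TTLCache:
    """Sketch of the functional API: GET / SET / DEL with TTL, bulk MGET."""

    def __init__(self):
        self._store = {}  # key -> (value, expiry deadline or None)

    def set(self, key, value, ttl=None):
        deadline = time.monotonic() + ttl if ttl is not None else None
        self._store[key] = (value, deadline)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, deadline = entry
        if deadline is not None and time.monotonic() >= deadline:
            del self._store[key]  # lazy expiry: drop the key on read
            return None
        return value

    def delete(self, key):
        return self._store.pop(key, None) is not None

    def mget(self, keys):
        # Bulk read; in a real client this batches per target node.
        return {k: self.get(k) for k in keys}
```

Real systems also expire keys in the background so untouched dead keys don't hold memory; lazy expiry alone only reclaims on access.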
Non-functional
- p99 read < 1 ms.
- Tolerate single-node loss without all keys becoming cold.
- Hot key rebalancing without downtime.
Components
Cache nodes
Memcached / Redis with limited memory per node.
Client library
Holds the consistent-hash ring; routes ops to the right node.
Coordinator
Tracks cluster membership and ring changes, via gossip or ZooKeeper.
Replication layer
Async replicas per shard for read offload + failover.
Trade-offs
Consistent hashing vs. mod-n
Pros
- Adding a node moves only K/N keys, not all keys.
Cons
- Hash hot-spots on small clusters — virtual nodes mitigate.
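A sketch of the ring with virtual nodes, as held by the client library; the hash function and vnode count here are illustrative choices, not prescriptions:

```python
import bisect
import hashlib


def _hash(s: str) -> int:
    # Stable 64-bit point on the ring; md5 used only for illustration.
    return int.from_bytes(hashlib.md5(s.encode()).digest()[:8], "big")


class HashRing:
    """Consistent-hash ring with virtual nodes to smooth small-cluster skew."""

    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (point, node)
        for n in nodes:
            self.add(n)

    def add(self, node):
        for i in range(self.vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove(self, node):
        self._ring = [(p, n) for p, n in self._ring if n != node]

    def node_for(self, key):
        if not self._ring:
            raise RuntimeError("empty ring")
        points = [p for p, _ in self._ring]
        # First vnode clockwise of the key's point, wrapping at the end.
        i = bisect.bisect(points, _hash(key)) % len(self._ring)
        return self._ring[i][1]
```

Adding a fourth node to a three-node ring remaps roughly a quarter of keys; the rest keep their assignments, which is the K/N property above.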
Eviction: LRU vs. LFU vs. ARC
Pros
- LRU is simple and ubiquitous.
- LFU keeps hot keys warm under scans.
Cons
- LRU thrashes on scans.
- LFU adds per-key frequency-counter overhead and needs aging so stale counts decay.
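For reference, the LRU side of that trade-off fits in a few lines; this is a single-node sketch (capacity counted in entries here, in bytes in practice):

```python
from collections import OrderedDict


class LRUCache:
    """Minimal LRU eviction: OrderedDict keeps entries oldest-first."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._od = OrderedDict()

    def get(self, key):
        if key not in self._od:
            return None
        self._od.move_to_end(key)  # mark as most recently used
        return self._od[key]

    def set(self, key, value):
        if key in self._od:
            self._od.move_to_end(key)
        self._od[key] = value
        if len(self._od) > self.capacity:
            self._od.popitem(last=False)  # evict least recently used
```

The scan-thrashing con is visible here: a one-pass scan of cold keys pushes every hot key to the old end and evicts it.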
Scale concerns
- Cache stampede on expiry → single-flight, jittered TTLs, or pre-refresh.
- Thundering herd on cold start → warm caches before traffic.
- Network: on small-value workloads, per-op CPU cost saturates before a 10 GbE NIC does; large values flip this and can saturate the link first.
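The first two mitigations can be sketched together: single-flight collapses concurrent misses on one key into one backend call, and jitter spreads expiries so a hot key set doesn't expire in lockstep. This is an assumed shape, not a library API; `loader` is any function from key to value:

```python
import random
import threading


class SingleFlight:
    """Collapse concurrent misses for the same key into one loader call."""

    def __init__(self):
        self._lock = threading.Lock()
        self._inflight = {}  # key -> (done event, one-slot result list)

    def do(self, key, loader):
        with self._lock:
            entry = self._inflight.get(key)
            if entry is None:
                entry = (threading.Event(), [])
                self._inflight[key] = entry
                leader = True
            else:
                leader = False
        event, result = entry
        if leader:
            try:
                result.append(loader(key))  # only the leader hits the backend
            finally:
                with self._lock:
                    del self._inflight[key]
                event.set()
        else:
            event.wait()  # followers reuse the leader's result
        return result[0]


def jittered_ttl(base_ttl, spread=0.1):
    """Randomize TTL by +/- spread so expiries don't align."""
    return base_ttl * (1 + random.uniform(-spread, spread))
```

Note single-flight only helps requests that overlap in time; pre-refresh (reloading shortly before expiry) is what removes the miss entirely.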