Design Twitter (Timeline)
Fan-out on write vs. read, with a hybrid for celebrity accounts. The classic timeline trade-off.
Intro
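Twitter's home timeline is the canonical fan-out problem. Fan-out on write copies each new tweet into the inboxes of all N followers; fan-out on read assembles each user's timeline at request time. Neither scales alone: write fan-out is too expensive for accounts with millions of followers, and read fan-out is too slow for users who follow many accounts. Production therefore runs a hybrid keyed on follower count.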
Functional
- Post a tweet (text up to 280 characters, originally 140, plus optional media).
- Read home timeline: most recent tweets from the accounts you follow.
- Read user timeline: most recent tweets posted by a given user.
Non-functional
- Read p95 < 200 ms.
- 500 M users, 200 M DAU. ~100 M tweets/day ≈ 1.2k write QPS on average (100 M / 86,400 s); ~100k read QPS at peak.
- Average fan-out: ~200 followers per author; celebrity outliers exceed 50 M followers.
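The numbers above can be checked with a few lines of arithmetic. The sketch below uses only the figures from this list; the variable names are illustrative, and the results are averages, so real peaks would be higher.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

tweets_per_day = 100_000_000
avg_followers = 200
celebrity_followers = 50_000_000

# Average tweet write rate.
write_qps = tweets_per_day / SECONDS_PER_DAY

# With pure fan-out on write, every tweet becomes one inbox write per follower.
inbox_writes_per_sec = write_qps * avg_followers

# Cost of pushing one celebrity tweet, measured in seconds of the
# system-wide average inbox-write throughput.
celebrity_push_seconds = celebrity_followers / inbox_writes_per_sec

print(f"tweet writes/s (avg): {write_qps:,.0f}")
print(f"inbox writes/s (avg): {inbox_writes_per_sec:,.0f}")
print(f"one 50 M-follower tweet: {celebrity_push_seconds:,.0f} s of average fan-out work")
```

Roughly 1.2k tweets per second turn into about 230k inbox writes per second, and a single 50 M-follower tweet equals several minutes of that average load, which is why the celebrity case gets its own path.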
Components
Write API
Persist tweet to user-timeline store; emit event.
Fan-out worker
Consumes tweet events and pushes each new tweet id onto every follower's home-timeline cache.
Home-timeline cache
Per-user list of recent tweet ids in Redis / Memcached.
User-timeline store
Append-only per-user store (Cassandra).
Tweet store
Tweet content keyed by id (Manhattan / KV).
Search index
Inverted index for keyword + hashtag.
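To show how the write API, fan-out worker, home-timeline cache, and tweet store fit together, here is a minimal in-memory sketch. Every name in it (`post_tweet`, `run_fanout_worker`, `home_cache`, and so on) is hypothetical; plain dicts and a deque stand in for Cassandra, Redis, and the event queue, and the search index is left out.

```python
from collections import defaultdict, deque
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Tweet:
    tweet_id: int
    author: str
    text: str


# In-memory stand-ins for the real stores (all hypothetical).
tweet_store = {}                    # tweet id -> Tweet (tweet store)
user_timeline = defaultdict(list)   # author -> tweet ids, oldest first (user-timeline store)
home_cache = defaultdict(list)      # user -> tweet ids, newest first (home-timeline cache)
followers = defaultdict(set)        # author -> follower names
events = deque()                    # pending tweet ids (event queue)
_next_id = count(1)


def post_tweet(author, text):
    """Write API: persist the tweet, then emit an event for the fan-out worker."""
    tweet = Tweet(next(_next_id), author, text)
    tweet_store[tweet.tweet_id] = tweet
    user_timeline[author].append(tweet.tweet_id)
    events.append(tweet.tweet_id)
    return tweet.tweet_id


def run_fanout_worker():
    """Fan-out worker: drain the queue, pushing each tweet id to every follower."""
    while events:
        tweet = tweet_store[events.popleft()]
        for follower in followers[tweet.author]:
            home_cache[follower].insert(0, tweet.tweet_id)


def read_home_timeline(user, limit=20):
    """Read path: fetch ids from the cache, then hydrate them from the tweet store."""
    return [tweet_store[tweet_id] for tweet_id in home_cache[user][:limit]]


followers["alice"].update({"bob", "carol"})
post_tweet("alice", "first")
post_tweet("alice", "second")
run_fanout_worker()
print([t.text for t in read_home_timeline("bob")])
```

The cache holds only tweet ids, so a read is one list fetch plus a batch lookup in the tweet store; the content itself is stored once no matter how many followers receive it.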
Trade-offs
Fan-out on write vs. on read
Pros
- On write → reads are O(1): one lookup of a precomputed inbox.
- On read → writes are cheap (a single append) and need no inbox storage.
Cons
- On write blows up for celebrities: one tweet triggers millions of inbox writes.
- On read does the fan-out at request time, so latency grows with the number of accounts followed.
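For contrast with the push model, fan-out on read can be sketched as a k-way merge over the user timelines of everyone the reader follows. The data and the `home_timeline_on_read` name below are made up for illustration; each timeline is assumed to be a list of `(timestamp, tweet_id)` pairs already sorted newest first.

```python
import heapq
from itertools import islice

# Hypothetical user timelines: (timestamp, tweet_id), newest first.
user_timelines = {
    "alice": [(105, "a3"), (101, "a2"), (90, "a1")],
    "bob": [(104, "b2"), (100, "b1")],
    "carol": [(103, "c1")],
}
following = {"dave": ["alice", "bob", "carol"]}


def home_timeline_on_read(user, limit):
    """Merge the followed users' timelines at request time, newest first."""
    sources = [user_timelines.get(author, []) for author in following.get(user, [])]
    # heapq.merge with reverse=True expects inputs sorted largest to smallest.
    merged = heapq.merge(*sources, reverse=True)
    return [tweet_id for _, tweet_id in islice(merged, limit)]


print(home_timeline_on_read("dave", 4))
```

The merge itself is lazy, but every read still touches one timeline per followed account, which is where the request-time latency comes from.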
Hybrid
Pros
- Push tweets from normal accounts, pull tweets from celebrity accounts at read time: each path handles the case it is cheap for.
Cons
- Two code paths to maintain, plus a merge of pushed and pulled tweets on every read.
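The hybrid read path can be sketched as follows. The `CELEBRITY_THRESHOLD` value, the sample data, and the function names are assumptions made for illustration, not figures from a real deployment. Authors below the threshold are assumed to have been pushed into the reader's inbox already; authors above it are pulled and merged at read time.

```python
import heapq
from itertools import islice

CELEBRITY_THRESHOLD = 1_000_000  # hypothetical follower-count cutoff

follower_count = {"alice": 120, "star": 52_000_000}
following = {"dave": ["alice", "star"]}

# (timestamp, tweet_id), newest first.
home_inbox = {"dave": [(104, "a2"), (100, "a1")]}          # filled by fan-out on write
user_timelines = {"star": [(105, "s2"), (102, "s1")]}      # read directly for celebrities


def should_push(author):
    """Decide at write time whether this author's tweets are fanned out."""
    return follower_count.get(author, 0) < CELEBRITY_THRESHOLD


def read_home_timeline(user, limit):
    """Merge the pushed inbox with timelines pulled from followed celebrities."""
    pushed = home_inbox.get(user, [])
    pulled = [
        user_timelines.get(author, [])
        for author in following.get(user, [])
        if not should_push(author)
    ]
    merged = heapq.merge(pushed, *pulled, reverse=True)
    return [tweet_id for _, tweet_id in islice(merged, limit)]


print(read_home_timeline("dave", 3))
```

Because most users follow only a handful of celebrity accounts, the pull side of the merge stays small even though the pushed side covers everyone else.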
Scale concerns
- A celebrity tweet would mean writing to millions of inboxes; skip the push for those accounts and pull their tweets lazily at read time.
- Inbox size cap (e.g., 800 tweets) keeps Redis memory bounded.
- Backpressure on the fan-out queue during traffic spikes.
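The inbox cap and queue backpressure can both be illustrated with standard-library containers. This is a sketch under stated assumptions: `INBOX_CAP` reuses the 800 example from above, the queue size of 3 is deliberately tiny so the demo overflows, and `enqueue_fanout` is a hypothetical helper. In Redis, one common way to get the same capped-list behaviour is an `LPUSH` followed by an `LTRIM`.

```python
import queue
from collections import deque

INBOX_CAP = 800  # example cap from the list above

# A deque with maxlen drops the oldest entry from the right once it is full.
inbox = deque(maxlen=INBOX_CAP)
for tweet_id in range(1000):
    inbox.appendleft(tweet_id)  # newest tweet id sits at the left

print(len(inbox), inbox[0], inbox[-1])

# A bounded queue models backpressure: when the fan-out workers fall behind,
# producers are told to slow down instead of growing the queue without limit.
fanout_queue = queue.Queue(maxsize=3)  # tiny size chosen only for the demo


def enqueue_fanout(tweet_id):
    """Return True if the event was accepted, False if the queue is full."""
    try:
        fanout_queue.put_nowait(tweet_id)
        return True
    except queue.Full:
        return False


accepted = [enqueue_fanout(tweet_id) for tweet_id in range(5)]
print(accepted)
```

What the caller does on a rejected enqueue (retry with delay, shed load, or fall back to pull at read time) is a policy choice this sketch leaves open.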