HLD9 min read

Design Top-K / Trending (real-time heavy hitters)

Approximate top-K at billions of events/day using Count-Min Sketch + heap, with multi-window leaderboards (1m, 1h, 1d).

hldsystem-designstreaming

Intro

Top-K trending (Twitter trends, YouTube trending, Spotify charts) is the canonical 'heavy-hitters at scale' problem. Naive `GROUP BY ... ORDER BY count DESC LIMIT k` doesn't scale past tens of thousands of QPS. The right answer is probabilistic data structures (Count-Min Sketch) + a small min-heap of top-K candidates per partition, merged at query time.

Functional

Ingest events (term, ts) at billions/day.
Compute top-K trending terms in last 1m / 1h / 1d windows.
Allow per-region / per-language slicing.
Query latency for current top-K under 200 ms.

Non-functional

Approximation error ≤ 1% on counts (CMS bounds).
Window staleness ≤ 1 minute.
Memory: bounded sketch per partition (single-digit MB).
Survive partition failure with bounded loss.

Components

Event ingester
Validates + writes to Kafka.
Stream processor
Per-partition CMS + min-heap of top-K.
Aggregator
Merges per-partition heaps into global top-K.
Window manager
Tumbling 1m / 1h / 1d windows.
Query API
Reads merged top-K from cache.

Trade-offs

Exact (hash map) vs CMS

Pros

CMS: bounded memory.
Exact: 0% error.

Cons

Exact: blows up on cardinality.
CMS: tunable error.

Per-partition heap merge vs central heap