Design Ad Click Aggregation
Sub-minute aggregation of billions of click events with idempotent dedup, fraud filtering, and real-time + batch reconciliation.
Intro
Ad click aggregation is the canonical 'large-volume event aggregation' interview question. Billions of clicks per day must be counted accurately (every click is money), deduplicated against double-clicks and retries, filtered for bots and fraud, aggregated by ad / campaign / publisher, and made queryable in near-real-time. The classic answer is a lambda architecture: a streaming speed layer for fresh counts plus a batch layer for reconciliation.
Functional
- Ingest click events at billions/day.
- Dedupe and filter: double-clicks, bots, and fraud.
- Aggregate by (ad_id, campaign_id, publisher_id) at multiple time granularities.
- Query aggregates with low latency (advertiser dashboards).
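One way to serve the multiple time granularities listed above is to store only the finest granularity and derive the coarser ones by re-bucketing. The sketch below assumes per-minute counts keyed by (window_start, dims), where dims is the (ad_id, campaign_id, publisher_id) tuple; the function name and sample values are illustrative, not part of the original design.

```python
from collections import defaultdict

def roll_up(minute_counts, bucket_seconds):
    """Re-bucket per-minute counts into coarser windows (e.g. 3600 for hourly)."""
    rolled = defaultdict(int)
    for (window_start, dims), count in minute_counts.items():
        # Align each minute window to the start of its coarser bucket.
        bucket_start = window_start - window_start % bucket_seconds
        rolled[(bucket_start, dims)] += count
    return dict(rolled)

# dims is the (ad_id, campaign_id, publisher_id) tuple; values are illustrative.
dims = ("ad1", "camp1", "pub1")
minute_counts = {(0, dims): 2, (60, dims): 3, (3600, dims): 4}

hourly = roll_up(minute_counts, 3600)
daily = roll_up(minute_counts, 86400)
```

This works because plain click counts are additive. Metrics such as unique users are not, and would need a different representation at each granularity.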
Non-functional
- Ingest p99 < 100 ms (the write path must not be the bottleneck).
- Query p99 < 200 ms on dashboards.
- Aggregation freshness ≤ 1 minute.
- Reconciliation accuracy: speed-layer counts agree with the batch source of truth to ≥ 99.99%.
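As a rough sizing aid for the targets above, the sketch below converts a daily click volume into an average per-second rate and turns the 99.99% target into an absolute count tolerance. The 1 billion and 10 billion daily volumes are assumptions picked for illustration (the requirement only says "billions/day"), and the function names are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

def avg_clicks_per_second(clicks_per_day):
    """Average ingest rate; real traffic peaks will be higher."""
    return clicks_per_day / SECONDS_PER_DAY

def allowed_drift(batch_count, errors_per_10k=1):
    """Max absolute speed-vs-batch difference under a 99.99% target
    (1 miscounted click per 10,000). Integer math avoids float rounding."""
    return batch_count * errors_per_10k // 10_000

def within_target(speed_count, batch_count):
    """True if the speed-layer count is within tolerance of the batch count."""
    return abs(speed_count - batch_count) <= allowed_drift(batch_count)

for daily_volume in (1_000_000_000, 10_000_000_000):  # assumed volumes
    print(daily_volume,
          round(avg_clicks_per_second(daily_volume)),
          allowed_drift(daily_volume))
```

Under these assumptions, 1 billion clicks/day averages roughly 11.6 thousand clicks/sec, and a 99.99% target leaves room for at most 100,000 miscounted clicks per billion.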
Components
Click ingest
Validates each click and writes the raw event to Kafka.
Stream processor
Flink job that deduplicates clicks and aggregates them within time windows.
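The dedupe-then-aggregate step can be sketched in plain Python. This is not Flink API code; it only illustrates the logic a windowed job would implement. It assumes each click carries a unique click_id assigned at ingest (a hypothetical field) and uses 60-second tumbling windows keyed on event time.

```python
from collections import defaultdict
from dataclasses import dataclass

WINDOW_SECONDS = 60  # assumed window size, in line with the 1-minute freshness target

@dataclass(frozen=True)
class Click:
    click_id: str      # hypothetical unique id assigned at ingest
    ad_id: str
    campaign_id: str
    publisher_id: str
    event_time: int    # epoch seconds

class MinuteAggregator:
    def __init__(self):
        # A real job would keep this in keyed state with a TTL, not an unbounded set.
        self.seen_ids = set()
        self.counts = defaultdict(int)

    def process(self, click):
        """Count a click once; return False if it was a duplicate."""
        if click.click_id in self.seen_ids:
            return False  # idempotent: retries and double-sends are ignored
        self.seen_ids.add(click.click_id)
        window_start = click.event_time - click.event_time % WINDOW_SECONDS
        key = (window_start, click.ad_id, click.campaign_id, click.publisher_id)
        self.counts[key] += 1
        return True

agg = MinuteAggregator()
first = Click("c1", "ad1", "camp1", "pub1", 125)
results = [
    agg.process(first),
    agg.process(first),                                    # network retry
    agg.process(Click("c2", "ad1", "camp1", "pub1", 130)),
    agg.process(Click("c3", "ad1", "camp1", "pub1", 185)),  # next window
]
```

Deduplicating on a stable id is what makes replays safe: reprocessing the same Kafka offsets after a failure does not inflate the counts.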
Speed table
Latest-minute aggregates (Druid / ClickHouse).
Batch processor
Daily reconciliation job (Spark) that recomputes aggregates from the raw click log.
Query API
Reads from the speed and batch tables and merges the results.
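A minimal sketch of one possible merge rule, assuming the batch table is authoritative for every window before a batch_cutoff timestamp and the speed table fills in everything after it. The cutoff concept, the function name, and the dict-based tables are assumptions made for illustration.

```python
def merged_count(batch_rows, speed_rows, batch_cutoff, start, end):
    """Sum per-window counts over [start, end).

    Windows before batch_cutoff come from the batch table (reconciled);
    windows at or after it come from the speed table (fresh but approximate).
    """
    total = 0
    for window_start, count in batch_rows.items():
        if start <= window_start < end and window_start < batch_cutoff:
            total += count
    for window_start, count in speed_rows.items():
        if start <= window_start < end and window_start >= batch_cutoff:
            total += count
    return total

# Illustrative tables keyed by window start (epoch seconds) for a single ad.
batch_rows = {0: 10, 60: 20}
speed_rows = {0: 11, 60: 19, 120: 5, 180: 7}

total = merged_count(batch_rows, speed_rows, batch_cutoff=120, start=0, end=240)
```

Picking exactly one source per window avoids double counting, and the speed-layer values for windows 0 and 60 are ignored once batch has covered them.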
Fraud filter
ML scoring both in real time (in the stream) and offline.
Trade-offs
Lambda (speed + batch) vs kappa (stream-only)
Pros
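- Lambda: the batch layer absorbs late events and corrects speed-layer counts.
- Kappa: simpler operations (a single pipeline).
Cons
- Lambda: two pipelines to build, operate, and keep consistent.
- Kappa: reprocessing means replaying the entire log.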
Pre-aggregate vs raw
Pros
- Pre-agg: cheap reads.
- Raw: flexible.
Cons
- Pre-agg: query shapes are limited to the dimensions chosen up front.
- Raw: every query scans events, so reads are expensive.
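The pre-aggregate vs raw trade-off can be seen in a toy example. The events and the "country" field are hypothetical; the point is only that a pre-aggregated table answers the questions it was keyed for, while raw events can be grouped by anything at the cost of a full scan.

```python
from collections import Counter

# Illustrative raw events; "country" is a hypothetical extra field.
raw_clicks = [
    {"ad_id": "ad1", "publisher_id": "pubA", "country": "US"},
    {"ad_id": "ad1", "publisher_id": "pubB", "country": "DE"},
    {"ad_id": "ad2", "publisher_id": "pubA", "country": "US"},
]

# Pre-aggregated: computed once at write time, keyed by fixed dimensions.
pre_agg = Counter((c["ad_id"], c["publisher_id"]) for c in raw_clicks)

def clicks_for_ad(ad_id):
    """Cheap read: sums a handful of pre-aggregated rows."""
    return sum(n for (ad, _pub), n in pre_agg.items() if ad == ad_id)

def count_by(field):
    """Flexible read: scans every raw event, so any field can be grouped on."""
    return Counter(c[field] for c in raw_clicks)

per_ad = clicks_for_ad("ad1")
per_country = count_by("country")
```

The pre-aggregated table cannot answer the per-country question at all, because that dimension was dropped when the rows were rolled up.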
Scale concerns
- Late-arriving events (mobile clients may upload events minutes after the click).
- Duplicate clicks (network retries).
- Bots and fraud: ML model in the stream plus offline review.
- Reconciliation drift between the speed and batch tables.
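For the late-arrival concern in the list above, a common approach is a watermark with an allowed-lateness bound: events whose window has already closed are routed aside for the batch job to account for. The sketch below is plain Python rather than any stream-processor API, and the 120-second lateness bound and the class name are assumptions.

```python
WINDOW_SECONDS = 60
ALLOWED_LATENESS = 120  # seconds; assumed bound, tune to observed client delays

class LateEventRouter:
    def __init__(self, allowed_lateness=ALLOWED_LATENESS):
        self.allowed_lateness = allowed_lateness
        self.max_event_time = 0
        self.on_time = []
        self.late = []   # left for batch reconciliation to pick up

    def route(self, event_time):
        """Classify an event as 'on_time' or 'late' against the watermark."""
        self.max_event_time = max(self.max_event_time, event_time)
        # Watermark: how far event time has progressed, minus the lateness bound.
        watermark = self.max_event_time - self.allowed_lateness
        window_end = event_time - event_time % WINDOW_SECONDS + WINDOW_SECONDS
        if window_end <= watermark:
            self.late.append(event_time)
            return "late"
        self.on_time.append(event_time)
        return "on_time"

router = LateEventRouter()
decisions = [router.route(t) for t in (1000, 900, 700, 1200, 1000)]
```

A larger lateness bound catches more stragglers in the speed layer but delays window finalization, which works against the 1-minute freshness target.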