Design a Notification Service (Push, Email, SMS)
Multi-channel delivery with templating, user preferences, batching, and provider failover at billions of sends/day.
Intro
A notification service takes events from many internal callers, applies user preferences and templating, and delivers through the appropriate channel (push, email, SMS, in-app). The hard parts are: (1) provider reliability: APNs, FCM, and SES each have their own failure modes; (2) preference enforcement: a single missed unsubscribe can become a regulatory issue; (3) scale: 10 B notifications/day at peak.
Functional
- Send notifications via push (mobile), email, SMS, in-app.
- Respect per-user channel preferences + global unsubscribe.
- Templating with localisation.
- Batching (digests) and quiet-hours suppression.
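A minimal sketch of quiet-hours suppression, assuming a fixed 22:00 to 09:00 local window. The window, the function names, and the choice to defer rather than drop are illustrative assumptions, not part of the requirements above.

```python
from datetime import datetime, time, timedelta, timezone, tzinfo

# Assumed defaults; a real service would read these from user preferences.
QUIET_START = time(22, 0)  # 22:00 local
QUIET_END = time(9, 0)     # 09:00 local


def in_quiet_hours(now_utc: datetime, user_tz: tzinfo) -> bool:
    """True if the user's local wall-clock time falls inside the quiet window."""
    local_time = now_utc.astimezone(user_tz).time()
    # The window wraps past midnight, so it is the union of two ranges.
    return local_time >= QUIET_START or local_time < QUIET_END


def next_send_time(now_utc: datetime, user_tz: tzinfo) -> datetime:
    """Return now if sending is allowed, else the next QUIET_END, in UTC."""
    if not in_quiet_hours(now_utc, user_tz):
        return now_utc
    local = now_utc.astimezone(user_tz)
    release = local.replace(hour=QUIET_END.hour, minute=QUIET_END.minute,
                            second=0, microsecond=0)
    if local.time() >= QUIET_START:
        # Evening side of the window: release tomorrow morning.
        release += timedelta(days=1)
    return release.astimezone(timezone.utc)
```

In practice `user_tz` would likely be a `zoneinfo.ZoneInfo` built from the user's IANA zone name so that daylight saving is handled; a fixed UTC offset works for experimenting.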
Non-functional
- 10 B notifications/day (≈ 116 k sends/s on average); 100 k peak QPS at the ingest API, assuming each request fans out to several sends.
- Delivery latency p95 < 5 s for transactional, < 1 hr for digests.
- ≥ 99.95% delivery success on healthy providers.
- Idempotent — retries never double-send.
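A quick back-of-envelope check of these targets may help when sizing queues and workers. The peak-to-average factor below is an assumed illustration value, not a figure from the requirements.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Assumption for illustration only: peak traffic is 3x the flat average.
PEAK_TO_AVERAGE = 3


def average_rate(daily_sends: int) -> float:
    """Sends per second if traffic were perfectly flat over 24 hours."""
    return daily_sends / SECONDS_PER_DAY


def failure_budget(daily_sends: int, success_per_10k: int) -> int:
    """Maximum failed sends per day for a success target given per 10,000."""
    return daily_sends * (10_000 - success_per_10k) // 10_000


daily = 10_000_000_000
avg = average_rate(daily)            # about 115,741 sends/s
peak = avg * PEAK_TO_AVERAGE         # about 347,222 sends/s under the assumed factor
budget = failure_budget(daily, 9_995)  # 99.95% success leaves 5,000,000 failures/day
```

Because the flat average already exceeds 100 k/s, the 100 k peak QPS figure is easiest to read as ingest requests that each fan out to several sends; that reading is an interpretation rather than something the requirements state.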
Components
Ingest API
Receives notification requests from internal callers and deduplicates them by idempotency key, so a retried request is accepted only once.
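One way to make ingest idempotent is a caller-supplied idempotency key checked before enqueueing. The sketch below is in-memory and single-process; `IngestAPI`, `submit`, and the field names are hypothetical, and a real deployment would presumably need a shared store with an atomic set-if-absent and a TTL.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class IngestAPI:
    # idempotency_key -> notification_id; stands in for a shared key-value store.
    seen: dict = field(default_factory=dict)
    # Stands in for the queue feeding the preference checker.
    queue: list = field(default_factory=list)

    def submit(self, idempotency_key: str, user_id: str,
               template_id: str, params: dict) -> tuple:
        """Return (notification_id, accepted). A retry gets the original id back."""
        existing = self.seen.get(idempotency_key)
        if existing is not None:
            return existing, False  # duplicate: do not enqueue again
        notification_id = uuid.uuid4().hex
        self.seen[idempotency_key] = notification_id
        self.queue.append({
            "id": notification_id,
            "user_id": user_id,
            "template_id": template_id,
            "params": params,
        })
        return notification_id, True
```

Returning the original id on a duplicate lets a caller that timed out and retried still track the notification it created the first time.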
Preference checker
Reads user preferences and suppression lists, and drops any send the user has opted out of.
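The check can be sketched as a pure function over the user's preferences and the suppression list. The types and names here are hypothetical, and failing closed when preferences cannot be loaded is an assumed policy, chosen because a missed unsubscribe is the costlier error.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Channel(Enum):
    PUSH = "push"
    EMAIL = "email"
    SMS = "sms"
    IN_APP = "in_app"


@dataclass
class UserPrefs:
    global_unsubscribe: bool = False
    # Assumed default: every channel enabled until the user opts out.
    enabled: set = field(default_factory=lambda: set(Channel))


def allowed_channels(prefs: Optional[UserPrefs], requested: list,
                     suppressed: set) -> list:
    """Filter requested channels by preferences and suppression.

    `suppressed` holds channels whose address for this user is on a
    suppression list (for example after a hard bounce).
    """
    if prefs is None or prefs.global_unsubscribe:
        return []  # fail closed: unknown or unsubscribed means no send
    return [c for c in requested if c in prefs.enabled and c not in suppressed]
```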
Template renderer
Substitutes variables, localises the content, and outputs the final payload.
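A small sketch of rendering with locale fallback, using the standard library's `string.Template`. The template ids, the in-memory `TEMPLATES` table, and the fallback order (exact locale, then base language, then a default) are assumptions for illustration.

```python
from string import Template

# Hypothetical template store keyed by (template_id, locale).
TEMPLATES = {
    ("order_shipped", "en"): Template("Hi $name, your order $order_id has shipped."),
    ("order_shipped", "fr"): Template("Bonjour $name, votre commande $order_id a été expédiée."),
}
DEFAULT_LOCALE = "en"


def render(template_id: str, locale: str, params: dict) -> str:
    """Render the best-matching localised template.

    Fallback order: exact locale ("fr-CA"), base language ("fr"), default.
    A missing variable raises KeyError rather than sending a broken message.
    """
    for candidate in (locale, locale.split("-")[0], DEFAULT_LOCALE):
        template = TEMPLATES.get((template_id, candidate))
        if template is not None:
            return template.substitute(params)
    raise LookupError(f"no template {template_id!r} for locale {locale!r}")
```

Using `substitute` instead of `safe_substitute` is deliberate in this sketch: a message with a literal `$name` left in it is arguably worse than a failed render that can be alerted on.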
Channel dispatchers
Per-channel workers (push/email/SMS) with provider clients.
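Provider failover inside a dispatcher might look like the sketch below: retry a provider on retryable errors, then move to the next one, and give up immediately on permanent errors. `ProviderError`, `send_with_failover`, and the provider callables are hypothetical stand-ins, not any vendor's client API; backoff between attempts is omitted for brevity.

```python
from typing import Callable


class ProviderError(Exception):
    """Raised by a provider client; `retryable` separates outages from bad input."""

    def __init__(self, message: str, retryable: bool):
        super().__init__(message)
        self.retryable = retryable


def send_with_failover(providers: list, payload: dict,
                       max_attempts_per_provider: int = 2) -> str:
    """Try each (name, send) pair in order; return the name that succeeded.

    `send` is any Callable[[dict], None] that raises ProviderError on failure.
    """
    for name, send in providers:
        for _ in range(max_attempts_per_provider):
            try:
                send(payload)
                return name
            except ProviderError as exc:
                if not exc.retryable:
                    # Invalid token or address: another provider will not help.
                    raise
                # Retryable (timeout, throttling): try again, then fail over.
    raise ProviderError("all providers failed", retryable=True)
```

Failing over after a timeout can double-send if the first provider actually accepted the message, so this would normally be paired with a provider-side idempotency or deduplication key where the provider offers one.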
Tracker
Records delivery, opens, clicks, unsubscribes.
Trade-offs
Push at write vs. pull at user-open
Pros
- Push is real-time.
- Pull avoids waste for rarely-active users.
Cons
- Push wastes sends on inactive users.
- Pull misses time-sensitive transactional moments (e.g. a one-time passcode is useless if the user only sees it on next open).
Scale concerns
- Provider rate limits: push providers throttle per app (treat ~10 k/sec per app for APNs as a planning estimate rather than a documented limit); SMS providers are tighter.
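- Quiet hours: notifications held overnight are released when quiet hours end, producing a spike at 9 AM local time that rolls through each timezone.
- Email reputation: too many bounces damage sender reputation and deliverability, so pre-flight checks (address validation, suppression-list lookup) are mandatory.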
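Staying under a provider's rate limit is commonly done with a token bucket in front of each provider client. The sketch below is a single-process version with an injectable clock; the class name and parameters are illustrative, and the rate you configure would come from the provider's actual quota.

```python
import time


class TokenBucket:
    """Allow up to `rate` sends per second with bursts of up to `capacity`."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # start full
        self.clock = clock
        self.last = clock()

    def try_acquire(self, n: int = 1) -> bool:
        """Take n tokens if available; otherwise leave the bucket unchanged."""
        now = self.clock()
        # Refill in proportion to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= n:
            self.tokens -= n
            return True
        return False
```

A dispatcher would call `try_acquire()` before each send and requeue or delay the message on `False`. The same mechanism can smooth the morning release of held notifications by draining the backlog at a bounded rate instead of all at once.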