Jarviix

Design Instagram (Photo feed)

Object storage for media, CDN-fronted feed, hybrid fan-out timeline.

Intro

Instagram looks like Twitter with photos, but its bandwidth profile is different: a single image is ~200 KB, and video is far larger. As a result, storage tiering and CDN strategy dominate the architecture.

Functional

  • Upload photo / video with caption.
  • Feed: photos from people you follow, ranked.
  • Profile: your own posts.
  • Like / comment / DM (out of scope here).

Non-functional

  • Hot photos served at p95 < 100 ms via CDN.
  • 1 B users, 500 M DAU. ~100 M photos/day at ~200 KB each = ~20 TB/day raw.
  • 5 years of raw uploads = ~36.5 PB; storing ~150 PB with copies implies roughly a 4× overhead for replicas and derived renditions.
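
The capacity numbers above can be reproduced with a few lines of arithmetic. This is a minimal sketch: the upload rate, photo size, and retention period come from the list, while `OVERHEAD_FACTOR` is an assumed multiplier (replicas plus renditions) picked so the total lands near the stated ~150 PB.

```python
# Back-of-envelope storage estimate from the figures above.
PHOTOS_PER_DAY = 100_000_000   # ~100 M uploads/day (from the text)
AVG_PHOTO_BYTES = 200_000      # ~200 KB per photo (from the text)
RETENTION_YEARS = 5            # from the text
OVERHEAD_FACTOR = 4            # ASSUMPTION: replicas + renditions combined

TB = 10**12
PB = 10**15


def raw_bytes_per_day(photos_per_day: int, avg_bytes: int) -> int:
    """Raw bytes uploaded per day, before any copies."""
    return photos_per_day * avg_bytes


def stored_bytes(daily_bytes: int, years: int, overhead: int) -> int:
    """Total bytes retained after `years`, scaled by a copy overhead."""
    return daily_bytes * 365 * years * overhead


daily = raw_bytes_per_day(PHOTOS_PER_DAY, AVG_PHOTO_BYTES)
daily_tb = daily / TB
raw_pb = stored_bytes(daily, RETENTION_YEARS, 1) / PB
total_pb = stored_bytes(daily, RETENTION_YEARS, OVERHEAD_FACTOR) / PB

print(f"raw per day:   {daily_tb:.1f} TB")   # 20.0 TB
print(f"raw, 5 years:  {raw_pb:.1f} PB")     # 36.5 PB
print(f"with overhead: {total_pb:.1f} PB")   # 146.0 PB
```

Video is not included here, so a real estimate would be higher.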

Components

  • API gateway

    Auth, rate-limit, request routing.

  • Upload service

    Multi-part upload to object store; emits processing job.

  • Media processor

    Generates thumbnails + transcodes video.

  • Object store

    S3 / GCS / blob.

  • CDN

    Caches photos at the edge. Target cache-hit ratio > 95%.

  • Feed service

    Hybrid fan-out: push on write for normal accounts, pull on read for celebrity accounts.

  • Ranking model

    Lightweight ranker applied to candidates from the feed store.
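
The hybrid fan-out in the feed service can be sketched as an in-memory toy. Everything here is hypothetical (the `HybridFeed` class, the follower-count threshold, the dict-based stores); a real system would back these with a feed store and a post store, and would rank the candidates instead of sorting by timestamp.

```python
from collections import defaultdict


class HybridFeed:
    """Toy hybrid fan-out: push for normal authors, pull for celebrities."""

    def __init__(self, celebrity_threshold: int = 3):
        # ASSUMPTION: "celebrity" means follower count >= threshold.
        self.threshold = celebrity_threshold
        self.followers = defaultdict(set)   # author -> users following them
        self.following = defaultdict(set)   # user -> authors they follow
        self.posts = defaultdict(list)      # author -> [(ts, post_id)]
        self.inbox = defaultdict(list)      # user -> [(ts, post_id)] pushed

    def follow(self, user: str, author: str) -> None:
        self.followers[author].add(user)
        self.following[user].add(author)

    def is_celebrity(self, author: str) -> bool:
        return len(self.followers[author]) >= self.threshold

    def publish(self, author: str, post_id: str, ts: int) -> None:
        self.posts[author].append((ts, post_id))
        if not self.is_celebrity(author):
            # Push path: write into every follower's inbox at post time.
            for user in self.followers[author]:
                self.inbox[user].append((ts, post_id))

    def timeline(self, user: str, limit: int = 10) -> list:
        # Start from the pushed inbox, then pull celebrity posts at read time.
        # A set removes duplicates if an author crossed the threshold after
        # some of their posts had already been pushed.
        candidates = set(self.inbox[user])
        for author in self.following[user]:
            if self.is_celebrity(author):
                candidates.update(self.posts[author])
        newest_first = sorted(candidates, reverse=True)
        return [post_id for _, post_id in newest_first[:limit]]


feed = HybridFeed(celebrity_threshold=3)
feed.follow("alice", "bob")
for fan in ("alice", "carol", "dave"):
    feed.follow(fan, "star")          # "star" reaches the threshold

feed.publish("bob", "b1", ts=1)       # pushed to alice's inbox
feed.publish("star", "s1", ts=2)      # not pushed; pulled on read
feed.publish("bob", "b2", ts=3)

print(feed.timeline("alice"))         # ['b2', 's1', 'b1']
```

The push path keeps reads cheap for most users, while the pull path avoids writing one post into millions of inboxes.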

Trade-offs

Pre-generated thumbnails vs. on-the-fly

Pros

  • Pre-gen → cheap reads.
  • On-the-fly → no storage waste for unviewed images.

Cons

  • Pre-gen multiplies storage by ~3×.
  • On-the-fly needs an image proxy fleet.
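
The trade-off above can be made concrete with two toy stores. This is a sketch under stated assumptions: `RENDITION_WIDTHS` is a made-up size ladder, and `fake_resize` stands in for a real image library call. It counts stored objects, not bytes, so it illustrates the shape of the trade-off and does not reproduce the ~3× figure.

```python
RENDITION_WIDTHS = (150, 640, 1080)  # ASSUMPTION: hypothetical size ladder


def fake_resize(original: bytes, width: int) -> bytes:
    """Stand-in for a real resize; just tags the bytes with the width."""
    return f"{width}:".encode() + original


class PreGenStore:
    """Renders every rendition at upload time; reads are plain lookups."""

    def __init__(self):
        self.blobs = {}

    def upload(self, photo_id: str, original: bytes) -> None:
        self.blobs[(photo_id, "orig")] = original
        for width in RENDITION_WIDTHS:
            self.blobs[(photo_id, width)] = fake_resize(original, width)

    def get(self, photo_id: str, width: int) -> bytes:
        return self.blobs[(photo_id, width)]


class LazyStore:
    """Stores originals only; renders on first request, then caches."""

    def __init__(self):
        self.originals = {}
        self.cache = {}
        self.renders = 0  # how much work the image proxy fleet had to do

    def upload(self, photo_id: str, original: bytes) -> None:
        self.originals[photo_id] = original

    def get(self, photo_id: str, width: int) -> bytes:
        key = (photo_id, width)
        if key not in self.cache:
            self.cache[key] = fake_resize(self.originals[photo_id], width)
            self.renders += 1
        return self.cache[key]


pregen, lazy = PreGenStore(), LazyStore()
for pid in ("p1", "p2", "p3"):
    pregen.upload(pid, pid.encode())
    lazy.upload(pid, pid.encode())

# Only one photo is ever viewed, at one size, twice.
pregen.get("p1", 640)
lazy.get("p1", 640)
lazy.get("p1", 640)

pregen_objects = len(pregen.blobs)
lazy_objects = len(lazy.originals) + len(lazy.cache)
print(pregen_objects, lazy_objects, lazy.renders)  # 12 4 1
```

With mostly unviewed photos the lazy store holds far fewer objects, at the cost of doing render work on the read path.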

Scale concerns

  • Origin shielding to prevent CDN miss storms.
  • Cold storage tier (S3 IA / Glacier) for old media.
  • Feed staleness — rank locally to balance freshness vs. compute.
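
One way to picture origin shielding is a mid-tier cache that coalesces concurrent misses for the same key into a single origin fetch. The sketch below is an illustration of that idea, not how any particular CDN implements it; `ShieldCache` and `slow_origin` are hypothetical names.

```python
import threading
import time


class ShieldCache:
    """Coalesces concurrent misses so the origin sees one fetch per key."""

    def __init__(self, fetch_origin):
        self._fetch_origin = fetch_origin
        self._lock = threading.Lock()
        self._cache = {}
        self._inflight = {}  # key -> Event set when the fetch finishes

    def get(self, key):
        with self._lock:
            if key in self._cache:
                return self._cache[key]
            event = self._inflight.get(key)
            leader = event is None
            if leader:
                event = threading.Event()
                self._inflight[key] = event

        if leader:
            try:
                value = self._fetch_origin(key)
                with self._lock:
                    self._cache[key] = value
            finally:
                # Always release waiters, even if the origin fetch failed.
                with self._lock:
                    del self._inflight[key]
                event.set()
            return value

        event.wait()
        with self._lock:
            if key in self._cache:
                return self._cache[key]
        return self.get(key)  # the leader failed; retry as a new request


origin_calls = []


def slow_origin(key):
    origin_calls.append(key)
    time.sleep(0.05)  # simulate a slow object-store read
    return f"bytes-of-{key}"


shield = ShieldCache(slow_origin)
results = []
threads = [
    threading.Thread(target=lambda: results.append(shield.get("photo-1")))
    for _ in range(20)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(origin_calls), len(results))  # 1 20
```

Twenty simultaneous misses reach the origin once, which is the behavior that keeps a burst of edge misses from turning into a storm at the object store.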
