Jarviix
HLD · 8 min read

Design Pastebin (Code-paste / Snippet sharing)

A URL-shortener-class read path with text bodies in object storage, plus expiry, syntax highlighting, and abuse controls.

hld · system-design · storage

Intro

Pastebin lets users paste a code snippet or text and get a short URL to share it. Architecturally it's a URL shortener with the long URL replaced by an arbitrary text blob (≤ ~10 MB). The interesting parts are body storage (object store keyed by paste id), syntax-highlight rendering, expiry / privacy semantics, and abuse controls (malware, doxing, leaked secrets).

Functional

  • POST /paste { text } → short id.
  • GET /{paste_id} → render with syntax highlighting.
  • Optional expiry (10 min, 1 day, 30 days, never).
  • Privacy levels: public, unlisted, password-protected.
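
The id and expiry requirements above can be sketched in a few lines. This is a minimal illustration, not a prescribed scheme: the base62 alphabet, the 8-character length, and the names `new_paste_id`, `expires_at`, and `EXPIRY_OPTIONS` are all assumptions made for the example.

```python
import secrets
import string
from datetime import datetime, timedelta, timezone
from typing import Optional

ALPHABET = string.ascii_letters + string.digits  # base62 (assumed encoding)
ID_LENGTH = 8  # assumed length; 62**8 is roughly 2.2e14 possible ids

# The four expiry options from the functional list; None means "never".
EXPIRY_OPTIONS = {
    "10min": timedelta(minutes=10),
    "1day": timedelta(days=1),
    "30days": timedelta(days=30),
    "never": None,
}


def new_paste_id(length: int = ID_LENGTH) -> str:
    """Return a random, hard-to-guess base62 id."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def expires_at(option: str, now: datetime) -> Optional[datetime]:
    """Map an expiry option to an absolute timestamp, or None for 'never'."""
    ttl = EXPIRY_OPTIONS[option]  # raises KeyError for unknown options
    return None if ttl is None else now + ttl


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(new_paste_id(), expires_at("1day", now))
```

Random ids (rather than a counter) matter for unlisted pastes, where the id is the only thing keeping the paste hard to find. A real write path would still need to handle the rare collision, for example by retrying on a unique-key violation.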

Non-functional

  • Read p95 < 100 ms on hot pastes (CDN-cacheable).
  • Write p95 < 500 ms.
  • Storage: average paste 50 KB; the top 1% exceed 1 MB.
  • Durability ≥ 11 nines (99.999999999%) on bodies.
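
One way to make hot reads CDN-cacheable without serving expired or protected pastes is to derive the `Cache-Control` header from the paste's visibility and remaining lifetime. The sketch below is an assumption-laden illustration: the one-hour ceiling (`MAX_EDGE_TTL`), the function name `cache_control`, and the choice to treat unlisted pastes like public ones at the edge are not from the original design.

```python
from typing import Optional

MAX_EDGE_TTL = 3600  # assumed ceiling: cache at the edge for at most one hour


def cache_control(visibility: str, expires_at: Optional[float], now: float) -> str:
    """Pick a Cache-Control value for a paste response.

    visibility: "public", "unlisted", or "password" (assumed labels).
    expires_at: Unix seconds, or None if the paste never expires.
    """
    if visibility == "password":
        # Password-protected bodies should never sit in a shared cache.
        return "private, no-store"
    ttl = MAX_EDGE_TTL
    if expires_at is not None:
        remaining = int(expires_at - now)
        if remaining <= 0:
            return "no-store"
        # Never let the CDN hold a copy past the paste's own expiry.
        ttl = min(ttl, remaining)
    return f"public, max-age={ttl}"


if __name__ == "__main__":
    print(cache_control("public", None, 0.0))
    print(cache_control("public", 600.0, 0.0))
```

Capping `max-age` by the time left until expiry keeps the edge from serving a paste after it should have disappeared, at the cost of more origin hits for short-lived pastes.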

Components

  • Write API

    Validates + persists the body to object storage.

  • Read API

    Resolves id → body URI; returns body or rendered HTML.

  • Object store

    Bodies keyed by paste_id.

  • Metadata DB

    Postgres for paste meta (lang, expiry, ACL).

  • Renderer

    Server-side syntax highlight + paste-of-the-day cache.

  • Abuse pipeline

    Async secret-scan + safe-browsing-style classification.
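
To show how the Write API, Read API, object store, and metadata DB fit together, here is a minimal in-memory sketch. The two dictionaries stand in for the object store and Postgres; `PasteService`, `PasteMeta`, the `pastes/{id}` key layout, and the 10 MB `MAX_BODY_BYTES` cap are hypothetical names and values chosen for illustration.

```python
import time
from dataclasses import dataclass
from typing import Dict, Optional

MAX_BODY_BYTES = 10 * 1024 * 1024  # hard cap, assuming the ~10 MB limit


@dataclass
class PasteMeta:
    paste_id: str
    lang: str
    expires_at: Optional[float]  # Unix seconds; None means never
    visibility: str              # "public", "unlisted", or "password"
    body_key: str                # where the body lives in the object store


class PasteService:
    def __init__(self) -> None:
        self.objects: Dict[str, bytes] = {}   # stand-in for the object store
        self.meta: Dict[str, PasteMeta] = {}  # stand-in for the metadata DB

    def write(self, paste_id: str, text: str, lang: str = "text",
              ttl_seconds: Optional[float] = None,
              visibility: str = "public",
              now: Optional[float] = None) -> PasteMeta:
        body = text.encode("utf-8")
        if not body:
            raise ValueError("empty paste")
        if len(body) > MAX_BODY_BYTES:
            raise ValueError("paste exceeds size cap")
        now = time.time() if now is None else now
        key = f"pastes/{paste_id}"
        # Body first, metadata second: a crash in between leaves an orphaned
        # object to clean up, never a metadata row pointing at nothing.
        self.objects[key] = body
        expires = None if ttl_seconds is None else now + ttl_seconds
        meta = PasteMeta(paste_id, lang, expires, visibility, key)
        self.meta[paste_id] = meta
        return meta

    def read(self, paste_id: str, now: Optional[float] = None) -> Optional[str]:
        now = time.time() if now is None else now
        meta = self.meta.get(paste_id)
        if meta is None:
            return None
        if meta.expires_at is not None and now >= meta.expires_at:
            return None  # expired pastes behave like missing ones
        return self.objects[meta.body_key].decode("utf-8")
```

Checking expiry at read time means a paste stops being served the moment it expires, even if the background job that deletes the body runs later.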

Trade-offs

Render server-side vs. client-side

Pros

  • Server-side: cacheable HTML, accessible by curl/wget.
  • Client-side: cheaper compute.

Cons

  • Server-side: one rendered version per (paste, theme).
  • Client-side: needs JS, breaks scrapers.
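
The "one rendered version per (paste, theme)" cost of server-side rendering corresponds to a cache keyed on that pair. The sketch below assumes pastes are immutable once written, and uses a placeholder renderer that only HTML-escapes the body; `RenderCache` and `plain_render` are hypothetical names, and a real renderer would call an actual highlighting library.

```python
import html
from typing import Callable, Dict, Tuple


class RenderCache:
    """Cache rendered HTML per (paste_id, theme); assumes immutable pastes."""

    def __init__(self, render: Callable[[str, str], str]) -> None:
        self._render = render
        self._cache: Dict[Tuple[str, str], str] = {}
        self.misses = 0  # counts how often the renderer actually ran

    def get(self, paste_id: str, theme: str, body: str) -> str:
        key = (paste_id, theme)
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = self._render(body, theme)
        return self._cache[key]


def plain_render(body: str, theme: str) -> str:
    # Placeholder renderer: escapes the body, does no real highlighting.
    safe_theme = html.escape(theme, quote=True)
    return f'<pre class="theme-{safe_theme}">{html.escape(body)}</pre>'


if __name__ == "__main__":
    cache = RenderCache(plain_render)
    print(cache.get("p1", "dark", "<b>hello</b>"))
```

Each additional theme multiplies the number of cached entries per paste, which is the storage side of the trade-off listed under Cons.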

Scale concerns

  • Hot pastes (Hacker-News-front-page) need CDN absorption.
  • Body size variance: enforce a hard cap at write time.
  • Abuse: phishing pages, leaked credentials, and malware can go viral.
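
As a rough illustration of the leaked-credential side of the abuse problem, a scanner can start as a handful of regular expressions run asynchronously after the write succeeds. The two patterns below (the well-known `AKIA` prefix for AWS access key ids and PEM private-key headers) are only examples, and `PATTERNS` and `scan_for_secrets` are hypothetical names; a production scanner would need a much larger rule set and some handling of false positives.

```python
import re
from typing import List

# Illustrative patterns only; real scanners carry many more rules.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(r"-----BEGIN (?:[A-Z]+ )*PRIVATE KEY-----"),
}


def scan_for_secrets(text: str) -> List[str]:
    """Return the sorted names of all patterns that match the paste body."""
    return sorted(name for name, pat in PATTERNS.items() if pat.search(text))


if __name__ == "__main__":
    sample = "aws_key = AKIAIOSFODNN7EXAMPLE"
    print(scan_for_secrets(sample))
```

Running the scan off the write path keeps write latency within budget; a hit can then flip the paste to a quarantined state and purge it from the CDN.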
