Do I need a CDN for an API?

If most responses are uncacheable, user-specific data, less so. If you serve reads that return the same response to every caller (product catalogs, public data, infrequently changing pages), a CDN can cut origin load substantially and improve latency.

What's the difference between a CDN and a reverse proxy?

A CDN is a globally distributed network of reverse proxies. The functionality is similar; the geographic distribution and the operational model differ. Providers such as Cloudflare, Fastly, and Akamai handle the global distribution for you.

Should I serve HTML through a CDN?

If your HTML is static or near-static (marketing pages, blogs), absolutely. For per-user dynamic HTML, the answer is more nuanced — Edge Side Includes or per-user fragments at the edge can still help.

How CDNs Actually Work: Edge, Origin, and the Magic in Between

CDNs are one of those pieces of infrastructure that have quietly absorbed an enormous fraction of internet traffic. For the big sites you visit every day, most of the bytes never come from origin servers; they come from a server a few network hops from your house.

This post walks through what CDNs actually do, the cache headers that drive them, and the patterns that turn a CDN from a static-asset hoster into a real performance tool.

What a CDN actually is

A Content Delivery Network is a globally distributed set of caching reverse proxies. When a user requests a URL:

DNS or anycast routing directs the request to a nearby CDN edge location.
The edge checks its cache.
Hit: serve from the edge. Latency = the trip from user to nearest edge, often 5-20ms.
Miss: fetch from origin, cache the response, serve to the user. Subsequent users in that region get the cached copy.
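The hit/miss flow above can be sketched as a toy edge node. This is a minimal illustration, not how any particular CDN is implemented: the `EdgeCache` class and the `origin` function are hypothetical, and TTLs and eviction are ignored.

```python
from typing import Callable, Dict


class EdgeCache:
    """Toy edge node: serve from memory on a hit, fetch from origin on a miss."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch_from_origin = fetch_from_origin
        self._store: Dict[str, bytes] = {}
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> bytes:
        if url in self._store:
            self.hits += 1
            return self._store[url]
        # Miss: go to origin once and keep the copy for later requests.
        self.misses += 1
        body = self._fetch_from_origin(url)
        self._store[url] = body
        return body


origin_calls = []


def origin(url: str) -> bytes:
    """Hypothetical origin server; records every request it receives."""
    origin_calls.append(url)
    return f"body of {url}".encode()


edge = EdgeCache(origin)
for _ in range(10):
    edge.get("https://site.com/page")

print(edge.hits, edge.misses, len(origin_calls))  # 9 1 1
```

Ten requests for the same URL produce a single origin fetch, which is the origin offload described in the next list.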

The wins:

Latency. Edge-to-user is much faster than origin-to-user.
Origin offload. A 90% cache hit rate means your origin sees 1/10th the traffic.
DDoS protection. The CDN absorbs volumetric attacks before they reach your infrastructure.
TLS termination. The TLS handshake completes at a nearby edge instead of a distant origin, and modern CDNs keep it cheap with shared session caches.

What controls caching

The CDN doesn't decide what to cache; you do, via HTTP headers.

Cache-Control

The most important header. Tells the CDN (and the browser) how to cache.

Cache-Control: public, max-age=3600, s-maxage=86400

public — anyone (CDN, browser) can cache.
private — only the user's browser, not shared caches.
max-age=3600 — fresh for 1 hour in any cache; here that means browsers, because s-maxage overrides it for shared caches.
s-maxage=86400 — fresh for 1 day for shared caches (CDN).
no-cache — may be stored, but must be revalidated with origin before each reuse.
no-store — don't cache at all.
immutable — the response will not change while it is fresh, so browsers skip revalidation even on reload.
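To make the directives concrete, here is a minimal sketch of how a cache might turn a Cache-Control value into a freshness lifetime. The function names are made up for illustration, and the logic is simplified: it ignores heuristic freshness, the Expires header, and malformed values.

```python
from typing import Dict, Optional


def parse_cache_control(header: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into {directive: value-or-None}."""
    directives: Dict[str, Optional[str]] = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, value = part.partition("=")
        directives[name.strip().lower()] = value.strip().strip('"') if sep else None
    return directives


def freshness_lifetime(header: str, shared: bool) -> int:
    """Seconds a response may be reused without revalidation; 0 if it may not."""
    d = parse_cache_control(header)
    if "no-store" in d or "no-cache" in d:
        return 0
    if shared and "private" in d:
        return 0
    # s-maxage only applies to shared caches and takes priority there.
    if shared and d.get("s-maxage") is not None:
        return int(d["s-maxage"])
    if d.get("max-age") is not None:
        return int(d["max-age"])
    return 0


header = "public, max-age=3600, s-maxage=86400"
print(freshness_lifetime(header, shared=False))  # 3600 (browser)
print(freshness_lifetime(header, shared=True))   # 86400 (CDN)
```

The same header yields two different lifetimes depending on who is reading it, which is the point of pairing max-age with s-maxage.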

Vary

Tells the cache that responses differ based on certain request headers.

Vary: Accept-Encoding, Accept-Language

Important and easy to get wrong. Vary: User-Agent effectively defeats caching (every browser is a different cache key). Be specific.
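A rough sketch of why this matters: a cache effectively appends the value of every header named in Vary to its cache key. The `cache_key` function below is hypothetical and skips the value normalization that real CDNs often apply.

```python
from typing import Dict, Tuple


def cache_key(url: str, request_headers: Dict[str, str], vary: str) -> Tuple[str, ...]:
    """Build a cache key from the URL plus each request header named in Vary."""
    # Header names are case-insensitive, so compare in lower case.
    lowered = {name.lower(): value for name, value in request_headers.items()}
    key = [url]
    for name in sorted(h.strip().lower() for h in vary.split(",") if h.strip()):
        key.append(f"{name}={lowered.get(name, '')}")
    return tuple(key)


url = "https://site.com/page"
chrome = {"Accept-Encoding": "gzip", "User-Agent": "Chrome/126"}
firefox = {"Accept-Encoding": "gzip", "User-Agent": "Firefox/127"}

# Same key: both browsers share one cached copy.
print(cache_key(url, chrome, "Accept-Encoding") == cache_key(url, firefox, "Accept-Encoding"))  # True
# Different keys: each User-Agent string gets its own copy.
print(cache_key(url, chrome, "User-Agent") == cache_key(url, firefox, "User-Agent"))  # False
```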

ETag and Last-Modified

Validators that let the cache check freshness without re-downloading.

ETag: "abc123"

# next request
If-None-Match: "abc123"
# response: 304 Not Modified (no body)

Useful for browsers. CDNs can also revalidate against the origin this way once a TTL expires, but TTL-based caching tends to dominate on that hop.
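On the origin side, the exchange above might look like the following sketch. It assumes a strong ETag derived from a hash of the body and ignores weak validators (the `W/` prefix); `make_etag` and `respond` are illustrative names, not a framework API.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    """Strong ETag derived from the body bytes (quoted, as the header requires)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, if_none_match: Optional[str]) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body), answering 304 when the validator matches."""
    etag = make_etag(body)
    headers = {"ETag": etag}
    if if_none_match is not None:
        # If-None-Match may carry several comma-separated tags, or "*".
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        if "*" in candidates or etag in candidates:
            return 304, headers, b""
    return 200, headers, body


status, headers, payload = respond(b"hello", None)
print(status, len(payload))  # 200 5

status2, _, payload2 = respond(b"hello", headers["ETag"])
print(status2, len(payload2))  # 304 0
```

The second response carries no body, so revalidation costs a round trip but no transfer.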

Cache levels in a typical stack

A request can be served from any of these layers:

Browser cache. Per-user, on disk.
ISP cache. Sometimes; less common in the HTTPS era.
CDN edge. Geographically near the user.
CDN regional/shield. Larger cache that the edges fall back to.
Origin. Your servers.

Hit rates compound. 70% browser, 70% on remaining at edge, 70% on remaining at shield → only ~3% of requests reach your origin.
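The arithmetic is just the product of the miss rates at each layer. A small sketch, assuming each layer's hit rate applies independently to whatever traffic reaches it:

```python
from typing import Iterable


def origin_fraction(hit_rates: Iterable[float]) -> float:
    """Fraction of requests that miss every cache layer and reach origin."""
    remaining = 1.0
    for rate in hit_rates:
        remaining *= 1.0 - rate
    return remaining


# Browser, edge, shield at 70% each: 0.3 * 0.3 * 0.3
print(f"{origin_fraction([0.7, 0.7, 0.7]):.1%}")  # 2.7%
```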

Cache keys

By default, the cache key is roughly the URL. Same URL, same response. But:

Query strings might be part of the key (or not — configurable).
Vary headers add dimensions.
Cookies can either bust the cache or be ignored (very configurable).

A common mistake: passing analytics or affiliate query params (?utm_source=...) and not normalizing them. The cache treats every UTM as a different page; hit rates plummet. Fix: configure your CDN to ignore tracking params for cache key purposes.
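Most CDNs expose this as configuration, but the underlying normalization is simple enough to sketch. The parameter list below is an assumption; adjust it to the tracking parameters your traffic actually carries. Only the cache key is normalized here; the request forwarded to origin can keep its original URL.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed list of tracking parameters; extend it for your own traffic.
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "utm_term",
                   "utm_content", "gclid", "fbclid"}


def normalize_for_cache_key(url: str) -> str:
    """Drop tracking params and sort the rest so equivalent URLs share a key."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k.lower() not in TRACKING_PARAMS]
    kept.sort()
    # The fragment is never sent to the server, so it is dropped as well.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


print(normalize_for_cache_key("https://site.com/page?utm_source=news&b=2&a=1"))
# https://site.com/page?a=1&b=2
print(normalize_for_cache_key("https://site.com/page?utm_campaign=spring"))
# https://site.com/page
```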

Invalidation

The hard problem. Three approaches:

TTL expiry

Set a TTL; wait for it. Simple. The default for most sites — set a 5-minute TTL on dynamic pages and accept brief staleness.

Purge

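Tell the CDN "forget this URL". Most CDNs support this via an API; the endpoint below is a placeholder, and real purge APIs also require authentication:

curl -X POST https://api.cdn.com/purge \
  -H 'Content-Type: application/json' \
  -d '{ "urls": ["https://site.com/page", "https://site.com/asset.js"] }'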

Cost: some CDNs charge per purge or rate-limit them.

Surrogate keys

Tag responses with one or more keys; purge by key.

Surrogate-Key: product-42 catalog

Now you can purge "everything tagged product-42" or "everything tagged catalog". Fastly is the original implementer; Cloudflare's "cache tags" (set with a Cache-Tag response header) are equivalent.

This is the right answer for content sites: when you publish or update an article, you purge by its article key, and every page that includes it (homepage, category, sitemap) is invalidated by that one call.
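Conceptually, the CDN keeps an index from each key to the URLs tagged with it. The toy `TaggedCache` below is a hypothetical illustration of that index, not any vendor's implementation, and the paths are made up.

```python
from collections import defaultdict
from typing import Dict, List, Set


class TaggedCache:
    """Toy cache that indexes entries by surrogate key for group purges."""

    def __init__(self) -> None:
        self._bodies: Dict[str, bytes] = {}
        self._urls_by_key: Dict[str, Set[str]] = defaultdict(set)

    def store(self, url: str, body: bytes, surrogate_key_header: str) -> None:
        self._bodies[url] = body
        # The header value is a space-separated list of keys.
        for key in surrogate_key_header.split():
            self._urls_by_key[key].add(url)

    def purge_key(self, key: str) -> List[str]:
        """Evict every URL tagged with `key`; return the evicted URLs."""
        purged = sorted(self._urls_by_key.pop(key, set()))
        for url in purged:
            self._bodies.pop(url, None)
        return purged

    def __contains__(self, url: str) -> bool:
        return url in self._bodies


cache = TaggedCache()
cache.store("/products/42", b"product page", "product-42 catalog")
cache.store("/", b"homepage", "product-42 homepage")
cache.store("/about", b"about page", "static")

print(cache.purge_key("product-42"))  # ['/', '/products/42']
print("/about" in cache)              # True
```

One purge call removes both pages that embed product 42 and leaves unrelated pages cached.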

Patterns that elevate a CDN from "static" to "real performance tool"

Stale-while-revalidate

Cache-Control: max-age=60, stale-while-revalidate=600

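Serve from cache for 60 seconds. For the next 600 seconds, keep serving the stale version while asynchronously fetching a fresh copy from origin, so the user never waits for revalidation. Once that window has also passed, the next request waits for origin again.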

This is one of the highest-leverage caching directives in the spec. Use it everywhere staleness is acceptable.
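The directive splits a cached response's life into three phases. A minimal sketch of that decision, with `swr_decision` as an illustrative name and ages in seconds:

```python
from typing import Literal

Decision = Literal["fresh", "stale_revalidate_in_background", "fetch_from_origin"]


def swr_decision(age: float, max_age: int, stale_while_revalidate: int) -> Decision:
    """Classify a cached response by its age in seconds."""
    if age < max_age:
        return "fresh"
    if age < max_age + stale_while_revalidate:
        # Serve the stale copy now; refresh it without blocking the user.
        return "stale_revalidate_in_background"
    # Too old even for the stale window: the request has to wait for origin.
    return "fetch_from_origin"


# Cache-Control: max-age=60, stale-while-revalidate=600
for age in (30, 120, 700):
    print(age, swr_decision(age, max_age=60, stale_while_revalidate=600))
# 30 fresh
# 120 stale_revalidate_in_background
# 700 fetch_from_origin
```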

Stale-if-error

Cache-Control: max-age=60, stale-if-error=86400

If the origin is down, keep serving the stale cached version (up to a day) instead of returning 5xx. Free uptime improvement.
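A sketch of the fallback logic, under simplifying assumptions: an origin failure is modeled as a raised exception (a real cache would also treat 5xx responses as errors), and the function and type names are hypothetical.

```python
from typing import Callable, Optional, Tuple

CachedEntry = Tuple[bytes, float]  # (body, age in seconds)


def serve_with_stale_if_error(
    cached: Optional[CachedEntry],
    fetch_from_origin: Callable[[], bytes],
    max_age: int,
    stale_if_error: int,
) -> Tuple[int, bytes]:
    """Return (status, body), falling back to a stale copy if origin fails."""
    if cached is not None and cached[1] < max_age:
        return 200, cached[0]  # still fresh, no origin call needed
    try:
        return 200, fetch_from_origin()
    except Exception:
        # Origin failed: a stale copy is acceptable inside the error window.
        if cached is not None and cached[1] < max_age + stale_if_error:
            return 200, cached[0]
        return 502, b"origin unavailable"


def broken_origin() -> bytes:
    raise ConnectionError("origin down")


# Cache-Control: max-age=60, stale-if-error=86400, cached copy is one hour old.
print(serve_with_stale_if_error((b"old page", 3600.0), broken_origin, 60, 86400))
# (200, b'old page')
```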

Edge SSR / fragment caching

For per-user pages that are mostly the same:

Cache the static shell aggressively.
Inject per-user fragments via Edge Side Includes (ESI) or via client-side hydration.

Most of the bytes are cached; only the personalized pieces hit your origin or edge function.
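As a rough illustration of the ESI approach, the edge takes a cached shell and replaces each include tag with a fragment fetched per user. The sketch below handles only the simplest self-closing `<esi:include src="..."/>` form; the fragment path and the `assemble` function are made up.

```python
import re
from typing import Callable

ESI_INCLUDE = re.compile(r'<esi:include\s+src="([^"]+)"\s*/>')


def assemble(shell_html: str, fetch_fragment: Callable[[str], str]) -> str:
    """Replace each <esi:include src="..."/> tag with the fetched fragment."""
    return ESI_INCLUDE.sub(lambda match: fetch_fragment(match.group(1)), shell_html)


# The shell is identical for every user and can be cached aggressively.
shell = '<header><esi:include src="/fragments/greeting"/></header><main>Catalog</main>'

# Hypothetical per-user fragment source; in practice an origin or edge function.
fragments = {"/fragments/greeting": "Hello, Ada"}

print(assemble(shell, fragments.__getitem__))
# <header>Hello, Ada</header><main>Catalog</main>
```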

Edge functions

Modern CDNs (Cloudflare Workers, Fastly Compute, Vercel Edge Functions) let you run code at the edge:

Geo-aware redirects.
A/B test bucketing without origin involvement.
Auth checks that 401 unauthorized requests before they reach origin.
Personalization at the edge.

For high-traffic sites this is a real architectural option, not just a toy.
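A/B bucketing is a good example of logic that needs no origin state: hashing a stable visitor identifier gives the same bucket on every request. The sketch below shows only that logic, in Python for consistency with the other examples; it is not the API of any edge runtime, and the identifiers are hypothetical.

```python
import hashlib


def ab_bucket(visitor_id: str, experiment: str, treatment_percent: int = 50) -> str:
    """Deterministically map a visitor to 'control' or 'treatment'."""
    digest = hashlib.sha256(f"{experiment}:{visitor_id}".encode()).digest()
    # First 8 bytes as an integer, reduced to a slot in 0..99.
    slot = int.from_bytes(digest[:8], "big") % 100
    return "treatment" if slot < treatment_percent else "control"


# The same visitor always lands in the same bucket for a given experiment.
first = ab_bucket("visitor-123", "new-checkout")
second = ab_bucket("visitor-123", "new-checkout")
print(first == second)  # True
```

Including the experiment name in the hash keeps bucket assignments independent across experiments.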

Things that catch people out

Cookies kill caching. A page with a Set-Cookie is often considered private and not shared-cached. Strip unnecessary cookies before responding.
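Auth headers block shared caching. By default a shared cache will not reuse a response to a request that carried an Authorization header unless the response opts in (for example with public or s-maxage). Consider serving public versions to anonymous users.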
Mixed HTTP and HTTPS. A cache layer that serves HTTP for some users and HTTPS for others is a security incident waiting to happen. Use HSTS; redirect at the edge.
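Geo content gotchas. A page that varies by user country must include the country in the cache key, or you'll serve French content to Indian users. There is no standard Country request header, so either Vary on the geo header your CDN adds to the request (Cloudflare's CF-IPCountry and CloudFront's CloudFront-Viewer-Country are examples) or configure the cache key directly.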
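The cookie and auth rules can be folded into one conservative check. This is a sketch of a cautious policy, not the exact behavior of any CDN: refusing every response that carries Set-Cookie is an assumption, and vendors differ on it.

```python
from typing import Dict


def shared_cacheable(request_headers: Dict[str, str], response_headers: Dict[str, str]) -> bool:
    """Conservative check: may a shared cache store and reuse this response?"""
    req = {k.lower(): v for k, v in request_headers.items()}
    resp = {k.lower(): v for k, v in response_headers.items()}
    directives = {d.strip().split("=")[0].lower()
                  for d in resp.get("cache-control", "").split(",") if d.strip()}
    if "no-store" in directives or "private" in directives:
        return False
    if "set-cookie" in resp:
        return False  # assumed policy: never share a response that sets a cookie
    if "authorization" in req:
        # Only reusable when the origin opts in explicitly.
        return bool(directives & {"public", "s-maxage", "must-revalidate"})
    return True


print(shared_cacheable({}, {"Cache-Control": "max-age=300"}))  # True
print(shared_cacheable({}, {"Cache-Control": "max-age=300", "Set-Cookie": "sid=1"}))  # False
print(shared_cacheable({"Authorization": "Bearer t"}, {"Cache-Control": "max-age=300"}))  # False
```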

Three rules

Decide on TTLs intentionally per route. Hero images: months. CSS/JS bundles with hash filenames: a year. HTML pages: minutes. APIs: seconds, sometimes none.
Use stale-while-revalidate aggressively. It's the most under-used caching directive in the spec.
Watch hit ratios per route, not just overall. A 95% overall hit rate that's actually 0% on your homepage and 99% on assets is hiding a problem.
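For the third rule, a per-route breakdown is a small aggregation over CDN logs. The sketch assumes each log record has already been reduced to a (route, cache status) pair with "HIT" marking a cache hit; real log formats and status values vary by CDN.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def hit_ratio_by_route(log: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Compute the hit ratio per route from (route, cache_status) records."""
    hits: Dict[str, int] = defaultdict(int)
    total: Dict[str, int] = defaultdict(int)
    for route, status in log:
        total[route] += 1
        if status == "HIT":
            hits[route] += 1
    return {route: hits[route] / total[route] for route in total}


# 95% overall hit rate that hides a homepage which never hits.
log = [("/", "MISS")] * 5 + [("/assets/*", "HIT")] * 95
ratios = hit_ratio_by_route(log)
overall = sum(1 for _, status in log if status == "HIT") / len(log)

print(overall)  # 0.95
print(ratios)   # {'/': 0.0, '/assets/*': 1.0}
```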

What to read next

CDNs are one layer of caching; load balancing is the layer where edge traffic eventually reaches your origin. The YouTube/Netflix HLD writeup is the canonical applied example — almost every byte streamed by these platforms is served from edge caches you'll never see. System design basics covers how all these pieces fit together.