Tech · 6 min read
Caching Strategies: Cache-Aside, Write-Through, Write-Back, and When to Use Each
Caching is one of the highest-leverage performance optimizations. The patterns, the consistency trade-offs, and the invalidation strategies that actually work in production.
By Jarviix Engineering · Apr 19, 2026
Caching is one of the highest-leverage performance optimizations available to backend engineers. A well-placed cache can deliver 100-1000x speedups on hot data paths and reduce database load by 90%+. A poorly-managed cache creates subtle correctness bugs that take weeks to debug.
This post covers the four major caching patterns, the trade-offs each makes, and the practical strategies for cache invalidation, consistency, and avoiding the famous "two hard problems in computer science" trap.
What caching does
A cache stores frequently-accessed data in fast, expensive memory (usually RAM) to avoid querying slower, cheaper storage (disk-based databases). Cache hits are typically 100-1000x faster than database queries.
The key trade-offs:
- Speed vs. cost: RAM is expensive; you cache only the hot subset
- Speed vs. correctness: cache may be stale; staleness must be tolerable
- Speed vs. complexity: caching adds invalidation logic, failure modes, capacity management
Cache-Aside (lazy loading)
The application is responsible for cache management. On a read:
- Check cache for the key
- If hit: return cached value
- If miss: query database, populate cache, return value
On a write:
- Update database
- Either invalidate the cache key or update it
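The read and write paths above can be sketched in a few lines of Python. This is a minimal illustration using plain dicts as stand-ins for the cache and the database (a real setup would use something like Redis and a SQL store):

```python
# In-memory stand-ins for a real cache (e.g. Redis) and a database.
cache: dict = {}
database = {"user:1": {"name": "Ada"}}

def get(key):
    """Cache-aside read: check cache, fall back to DB, populate cache."""
    if key in cache:                 # 1. check cache for the key
        return cache[key]            # 2. hit: return cached value
    value = database.get(key)        # 3. miss: query the database
    if value is not None:
        cache[key] = value           # 4. populate cache for next time
    return value

def update(key, value):
    """Cache-aside write: update the DB first, then invalidate the key."""
    database[key] = value
    cache.pop(key, None)             # invalidate; the next read repopulates
```

Invalidating (rather than updating) the cache on write is the safer default: it avoids racing a concurrent read that might overwrite the cache with older data.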
Pros:
- Simple, application-controlled
- Works with any cache and database
- Only cache what's actually requested
- Resilient to cache failures (just falls back to DB)
Cons:
- Stale data window between DB write and cache invalidation
- Initial reads always hit DB (cold cache)
- Cache stampede when popular keys expire simultaneously
When to use: Default choice for most read-heavy applications. Best when stale-data tolerance is reasonable.
Read-Through
The cache itself queries the database on miss. Application sees a uniform interface — just queries the cache.
Application → Cache → (on miss) Database
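A read-through cache can be sketched as a small wrapper that owns its own loader function, so the application only ever talks to the cache (hypothetical class and names, just to show the shape):

```python
from typing import Any, Callable

class ReadThroughCache:
    """Cache that loads from the backing store itself on a miss."""

    def __init__(self, loader: Callable[[str], Any]):
        self._store: dict = {}
        self._loader = loader        # e.g. a database query function

    def get(self, key: str) -> Any:
        if key not in self._store:
            # The cache, not the application, fills itself on a miss.
            self._store[key] = self._loader(key)
        return self._store[key]
```

Usage might look like `users = ReadThroughCache(lambda k: db.query_user(k))` — application code never contains DB-fallback logic, which is exactly the con listed below: if the cache layer fails, reads fail with it.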
Pros:
- Application code is simpler (no DB fallback logic)
- Centralized cache logic
- Often built into ORMs and cache libraries
Cons:
- Cache becomes critical path; cache failure breaks reads
- Less flexibility per-request
- Initial reads still slow (cold cache)
When to use: When using ORM/framework with native read-through support and you want to abstract cache management.
Write-Through
Application writes to cache; cache synchronously writes to database.
Application → Cache → Database
Pros:
- Cache always consistent with DB
- No stale data issues
- Reads after writes always fresh
Cons:
- Slow writes (must wait for DB)
- Cache pressure from all writes (even rarely-read data)
- Cache becomes critical write path
When to use: Workloads where consistency is critical and written data is read again soon after — the slower writes buy guaranteed freshness on those reads.
Write-Behind (Write-Back)
Application writes to cache; cache asynchronously writes to database in batches.
Application → Cache → (async) Database
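A write-behind sketch that makes the trade-off explicit: `put` only touches the cache, and a separate `flush` (which production systems run on a timer or background worker) batches the dirty keys to the database. Until `flush` runs, a cache crash loses those writes:

```python
class WriteBehindCache:
    """Writes hit the cache immediately; DB persistence is deferred."""

    def __init__(self, db: dict):
        self._store: dict = {}
        self._dirty: set = set()     # keys written but not yet persisted
        self._db = db

    def put(self, key, value):
        self._store[key] = value     # fast: only the cache is touched
        self._dirty.add(key)

    def flush(self):
        """Persist all dirty keys in one batch (run async in production)."""
        for key in self._dirty:
            self._db[key] = self._store[key]
        self._dirty.clear()
```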
Pros:
- Very fast writes (only cache update is synchronous)
- Database write batching reduces load
- Smooths write spikes
Cons:
- Risk of data loss if cache fails before persisting
- Eventual consistency between cache and DB
- Complex error handling on async writes
When to use: Write-heavy workloads where some data loss is tolerable (analytics events, metrics) and database write throughput is the bottleneck.
Cache Invalidation Strategies
The hardest part of caching. Several approaches:
Time-based (TTL)
Each entry has a time-to-live; expires automatically.
Pros: Simple, no application coordination needed. Cons: Stale during TTL window; thundering herd when popular keys expire.
TTL choice matters:
- Too short: minimal cache benefit, high DB load
- Too long: significant staleness
- Standard: 5 minutes to 1 hour for most data; longer for slow-changing data
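A TTL cache is simple to sketch: store an expiry timestamp alongside each value and evict lazily on read. The injectable `clock` parameter here is an assumption for testability, not a standard API:

```python
import time

class TTLCache:
    """Each entry expires `ttl` seconds after it is written."""

    def __init__(self, ttl: float, clock=time.monotonic):
        self._ttl = ttl
        self._clock = clock          # injectable for deterministic tests
        self._store: dict = {}       # key -> (value, expiry_time)

    def put(self, key, value):
        self._store[key] = (value, self._clock() + self._ttl)

    def get(self, key, default=None):
        entry = self._store.get(key)
        if entry is None:
            return default
        value, expiry = entry
        if self._clock() >= expiry:
            del self._store[key]     # lazily evict expired entries
            return default
        return value
```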
Event-based (active invalidation)
On data mutation, explicitly invalidate the cache key.
Pros: Cache stays fresh. Cons: All write paths must remember to invalidate; missed invalidations cause subtle bugs.
Version-based
Cache key includes a version number. Increment version on update; old cache entries become "garbage" and eventually expire.
Pros: No race conditions; reads always get correct version. Cons: Memory waste from old versions; requires versioning system.
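One way to sketch version-based keys: store the current version under its own cache key and embed it in the data key. Bumping the version is the invalidation — old entries are never touched, they just stop being read and eventually expire. (The key layout here is illustrative, not a standard.)

```python
cache: dict = {}

def versioned_key(entity: str, entity_id: int) -> str:
    """Build a cache key that embeds the entity's current version."""
    version = cache.get(f"{entity}:{entity_id}:version", 1)
    return f"{entity}:v{version}:{entity_id}"

def bump_version(entity: str, entity_id: int):
    """Invalidate by incrementing the version; old keys become garbage."""
    vkey = f"{entity}:{entity_id}:version"
    cache[vkey] = cache.get(vkey, 1) + 1
```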
Hybrid
TTL + event invalidation. TTL handles fallback; explicit invalidation handles correctness for critical paths.
This is the most common production pattern.
Cache stampede prevention
When a popular key expires, many concurrent requests miss simultaneously, all hitting the database. Prevention:
Probabilistic early expiration
Refresh cache before actual TTL with probability that increases as expiration approaches. Spreads refresh load over time.
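One common formulation of this idea (sometimes called "XFetch") decides per-request whether to refresh early, with probability rising as expiry approaches. A sketch, where `delta` is an estimate of how long recomputation takes and `beta` tunes eagerness:

```python
import math
import random

def should_refresh(now: float, expiry: float, delta: float,
                   beta: float = 1.0, rng=random.random) -> bool:
    """Probabilistic early expiration: refresh with probability that
    grows as `expiry` approaches. `math.log(rng())` is negative, so the
    subtraction adds a random amount of lead time before the real TTL."""
    return now - delta * beta * math.log(rng()) >= expiry
```

Because each request draws its own random lead time, refreshes spread out over the window before expiry instead of all landing at the TTL boundary.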
Lock-based regeneration
First request to miss acquires a lock and regenerates; others wait for the lock or serve stale data.
Stale-while-revalidate
Serve stale data while triggering async refresh in background. Users see slight staleness; database load stays manageable.
Cache warming
Proactively refresh popular keys before they expire. Used for predictable high-traffic patterns.
Multi-level caching
Production caching often involves multiple layers:
- Browser cache: HTTP cache headers, service workers
- CDN: edge caching for static and semi-static content
- Application cache (in-memory): fastest, ephemeral, per-instance
- Distributed cache (Redis/Memcached): shared across instances
- Database caches: buffer/page caches built into the database itself (e.g. PostgreSQL's shared buffers; note MySQL's query cache was removed in MySQL 8.0)
Each layer reduces load on the next. Hot data may be served entirely from CDN or application cache, never reaching Redis.
Cache key design
Key naming matters more than people think. Best practices:
- Namespace by entity type: `user:123:profile`, `order:456`
- Include a version: `user:v2:123:profile` enables blue-green schema changes
- Avoid encoding sensitive data: keys may show up in logs
- Keep keys a reasonable length: extremely long keys waste memory
- Use consistent separators: `:` is conventional in Redis
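A small helper enforces these conventions in one place instead of scattering f-strings across the codebase (the function name and layout are illustrative):

```python
def cache_key(entity: str, entity_id, *parts, version: int = 1) -> str:
    """Build a namespaced, versioned cache key with ':' separators,
    e.g. cache_key("user", 123, "profile", version=2)."""
    segments = [entity, f"v{version}", str(entity_id), *map(str, parts)]
    return ":".join(segments)
```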
Common cache failures
Cache poisoning
Malformed data written to cache; subsequent reads return bad data. Mitigation: validate data before caching.
Cache penetration
Queries for non-existent data always miss cache and hit DB. Attacker exploits with random key queries. Mitigation: cache empty responses with short TTL; bloom filter for existence checks.
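Negative caching is straightforward to sketch: cache the miss itself under a sentinel so repeated lookups for nonexistent keys stop reaching the database (in production the sentinel entry would carry a short TTL, omitted here for brevity):

```python
MISSING = object()   # sentinel: "we checked, the DB has nothing"
cache: dict = {}
database = {"user:1": {"name": "Ada"}}

def get(key):
    """Negative caching: cache DB misses too, so repeated lookups for
    nonexistent keys no longer hammer the database."""
    if key in cache:
        value = cache[key]
        return None if value is MISSING else value
    value = database.get(key)
    cache[key] = MISSING if value is None else value  # short TTL in prod
    return value
```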
Cache stampede
Many concurrent misses on hot key. Mitigation: techniques above.
Cold start
Cache empty after restart; all queries hit DB. Mitigation: cache warming, gradual traffic ramp-up.
Memory eviction surprises
Cache evicts entries you assumed would persist. Mitigation: monitor eviction rates; size cache appropriately.
Common mistakes
- Caching everything indiscriminately: cache the hot, slow, rarely-changing things; not everything
- No monitoring of hit rate: hit rate below 80-90% suggests poor caching strategy
- Forgetting to invalidate on update: subtle bugs surface days later
- Using cache as primary storage: cache is best-effort; don't rely on it for critical data
- Ignoring serialization cost: large objects are slow to serialize/deserialize; cache size matters
- TTL of 0 or infinite: a zero TTL gives no caching benefit; an infinite TTL guarantees eventual staleness
- Caching at the wrong layer: caching DB queries when API responses would be better, or vice versa
What to read next
- System design basics — caching in broader context.
- Database sharding explained — caching often delays sharding need.
- Eventual consistency — what caching forces you to accept.
- Load balancers deep dive — caching at the edge.
Caching is the rare engineering technique with order-of-magnitude impact. Done well, it makes systems feel snappy and reduces infrastructure costs significantly. Done poorly, it creates correctness bugs that take months to surface and longer to debug. Master the patterns, design cache keys deliberately, and monitor everything — the rewards are worth the discipline.
Frequently asked questions
What's the most common caching pattern in production?
Cache-aside (lazy loading) is by far the most common — application checks cache first, falls back to database on miss, then populates cache. It's simple, doesn't require special database support, and gives the application full control over what gets cached. The downside: stale data risk and the 'thundering herd' problem when popular cache entries expire simultaneously. Despite these issues, it's the default for most teams.
How do I prevent cache stampede when entries expire?
Multiple techniques: (1) Probabilistic early expiration — refresh cache before actual TTL with some probability, distributing refresh load. (2) Lock-based regeneration — only one request triggers DB query and cache repopulation; others wait. (3) Stale-while-revalidate — serve stale data while async refresh happens in background. (4) Sharded caches with jittered TTLs to spread expiration. Most production systems combine 2-3 of these for high-traffic keys.
Should I use Redis or Memcached?
Redis for almost all new use cases. Reasons: rich data structures (lists, sets, sorted sets, hashes), persistence options, pub/sub, Lua scripting, cluster mode for horizontal scaling, atomic operations. Memcached is simpler and slightly faster for pure key-value workloads but lacks features. Memcached makes sense for very specific high-throughput, simple-key-value cache layers; Redis is the right default for everything else.