Tech · 7 min read
Database Sharding Explained: When, Why, and How to Do It Right
Sharding is the most common — and most misunderstood — way to scale databases. The strategies, the trade-offs, and the cases where you absolutely should not shard.
By Jarviix Engineering · Apr 19, 2026
Database sharding is one of those topics where the conventional wisdom — "shard your database to scale!" — leads to most teams making expensive mistakes. Sharding is a sledgehammer that solves real scaling problems but introduces operational complexity that compounds over years.
This post covers when sharding is actually justified, the major sharding strategies and their trade-offs, and the engineering decisions that determine whether your sharded system stays maintainable or becomes a quagmire.
What sharding actually is
Sharding splits a logical database across multiple physical machines (shards). Each shard contains a subset of the total data, distinguished by a sharding key. Queries route to the appropriate shard based on the key.
For example, sharding by user_id with 4 shards:
- Shard 0: user_ids 0-249,999
- Shard 1: user_ids 250,000-499,999
- Shard 2: user_ids 500,000-749,999
- Shard 3: user_ids 750,000+
A query for user 327,489 routes to Shard 1.
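The routing step above is just a sorted-boundary lookup. A minimal sketch, assuming the four hypothetical shard ranges from the example (boundary values are illustrative, not a recommendation):

```python
# Range-based routing for the 4-shard example above.
# SHARD_UPPER_BOUNDS holds the exclusive upper end of shards 0-2;
# shard 3 takes everything from 750,000 up.
import bisect

SHARD_UPPER_BOUNDS = [250_000, 500_000, 750_000]

def route_to_shard(user_id: int) -> int:
    """Return the index of the shard that owns user_id."""
    return bisect.bisect_right(SHARD_UPPER_BOUNDS, user_id)

print(route_to_shard(327_489))  # -> 1, matching the example above
```

The same table-of-boundaries approach generalizes to any number of shards; the lookup stays O(log n) in the shard count.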
This achieves three things:
- Horizontal scaling: write load splits across machines
- Data locality: data for related queries lives on the same machine
- Failure isolation: an outage on one shard doesn't take down the entire system
It also introduces:
- Distributed query complexity: cross-shard queries become hard
- Operational overhead: 4 databases to manage instead of 1
- Resharding pain: changing shard topology requires significant data movement
The pre-sharding scaling ladder
Before sharding, exhaust simpler options:
1. Vertical scaling
Buy a bigger machine. Modern cloud instances offer 96 vCPUs, 1.5TB RAM, NVMe SSDs. PostgreSQL on a beefy instance can handle 50,000+ TPS for many workloads. The simplicity advantage is enormous.
2. Read replicas
Add 2-5 read replicas. Route reads to replicas, writes to primary. This handles 80% of scaling needs because most workloads are read-heavy. Eventual consistency on replicas is usually acceptable.
3. Query optimization
Profile slow queries with EXPLAIN ANALYZE. Add missing indexes. Rewrite N+1 query patterns. A 10x query speed improvement is often equivalent to 10x more hardware.
4. Caching
Add Redis/Memcached for frequently-accessed data. A cache hit can be 100x faster than a database query, and caching often eliminates 70-90% of read load.
5. Schema denormalization
Pre-compute aggregates, join tables, materialized views. Trade some write complexity for dramatic read efficiency.
6. Connection pooling and database-level optimizations
PgBouncer, statement-level config tuning, autovacuum tuning. These can yield 2-5x throughput on existing hardware.
Only after these are exhausted should you consider sharding.
Sharding strategies
Range-based sharding
Data is partitioned by ranges of the shard key. Example: user_id 0-1M on shard 1, 1M-2M on shard 2, etc.
Pros:
- Simple to understand
- Range queries within a shard are efficient
- Easy to add capacity by appending new ranges at the top end
Cons:
- Hot shards (recent users get all the action)
- Manual rebalancing if data is uneven
- New writes concentrate on the latest shard
When to use: Time-series data, append-only workloads, or when range queries are common and even distribution is achievable.
Hash-based sharding
Apply a hash function to the shard key, modulo the number of shards. Example: shard = hash(user_id) % 4.
Pros:
- Even data distribution
- No hotspots from sequential keys
- Simple, stateless routing (shard = hash(key) % N, no lookup table)
Cons:
- Range queries require fan-out to all shards
- Adding/removing shards requires resharding (every key may change shard)
- Cross-shard queries are common
When to use: User-centric workloads where evenness matters more than range queries.
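The modulo scheme above is easy to implement, but it makes the resharding con concrete: changing the shard count re-routes most keys. A sketch (using md5 rather than Python's built-in `hash()`, which is randomized per process; the key format is made up for illustration):

```python
# Hash-based routing, and the resharding problem it creates.
import hashlib

def shard_for(key: str, num_shards: int) -> int:
    """Stable hash of the key, modulo the shard count."""
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % num_shards

keys = [f"user:{i}" for i in range(10_000)]

# Growing from 4 shards to 5 re-routes the large majority of keys:
moved = sum(1 for k in keys if shard_for(k, 4) != shard_for(k, 5))
print(f"{moved / len(keys):.0%} of keys changed shards")  # ~80% in expectation
```

With a uniform hash, a key keeps its shard going from 4 to 5 only when its hash has the same residue mod 4 and mod 5, which happens about 1 time in 5 — hence roughly 80% of the data moves. This is exactly the cost consistent hashing was designed to avoid.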
Consistent hashing
A hash-based variant that minimizes data movement when adding/removing shards. Each shard owns a slice of the hash ring; only data near the new shard's slice moves.
Pros:
- Adding shards is much cheaper than naive hash sharding
- Failure of one shard affects only its slice
- Used by Cassandra, DynamoDB, Redis Cluster
Cons:
- Slightly more complex to implement and debug
- Hot spots possible if hash distribution is uneven
- Requires virtual nodes for balanced load
When to use: When dynamic resharding is a likely operational need.
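A minimal consistent-hash ring with virtual nodes, to make the "only data near the new slice moves" claim concrete. This is a sketch, not production code — real systems (Cassandra, Redis Cluster) add replication, gossip, and careful vnode counts:

```python
# Consistent-hash ring with virtual nodes. Each physical shard gets
# VNODES points on the ring; a key routes to the first point clockwise
# from its hash.
import bisect
import hashlib

VNODES = 100

def _hash(s: str) -> int:
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

class Ring:
    def __init__(self, shards):
        # One (hash, shard) point per virtual node, sorted around the ring.
        self._points = sorted(
            (_hash(f"{shard}#{v}"), shard)
            for shard in shards for v in range(VNODES)
        )
        self._keys = [h for h, _ in self._points]

    def route(self, key: str) -> str:
        # First point clockwise from the key's hash, wrapping at the top.
        idx = bisect.bisect(self._keys, _hash(key)) % len(self._keys)
        return self._points[idx][1]

ring4 = Ring(["shard-a", "shard-b", "shard-c", "shard-d"])
ring5 = Ring(["shard-a", "shard-b", "shard-c", "shard-d", "shard-e"])

keys = [f"user:{i}" for i in range(10_000)]
moved = sum(1 for k in keys if ring4.route(k) != ring5.route(k))
print(f"{moved / len(keys):.0%} of keys moved")  # roughly 1/5, not ~80%
```

Only the keys that land on the new shard's slices move — about 1/N of the data when growing to N shards, versus nearly all of it under naive modulo hashing.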
Directory-based sharding
A lookup service maps shard keys to physical shards. Total flexibility — each key can be on any shard.
Pros:
- Maximum flexibility
- Easy to move individual users/tenants between shards
- Custom routing logic possible (e.g., big customers on dedicated shards)
Cons:
- Lookup service is a critical dependency
- Lookup latency on every query
- Lookup service must be highly available
When to use: Multi-tenant applications where tenants vary dramatically in size, or where you need tenant-level shard isolation.
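The core of a directory-based router is just an explicit mapping table. In a real deployment this lives in a highly available service with caching in front; the class and shard names below are hypothetical:

```python
# Directory-based routing sketch: an explicit tenant -> shard map,
# which allows per-tenant placement decisions.
class ShardDirectory:
    def __init__(self, default_shard: str = "shard-0"):
        self._map = {}                 # tenant_id -> shard name
        self._default = default_shard  # where unmapped tenants go

    def assign(self, tenant_id: str, shard: str) -> None:
        """Pin a tenant to a specific shard (e.g. a dedicated one)."""
        self._map[tenant_id] = shard

    def route(self, tenant_id: str) -> str:
        return self._map.get(tenant_id, self._default)

directory = ShardDirectory()
directory.assign("big-customer", "dedicated-shard-7")
print(directory.route("big-customer"))    # dedicated-shard-7
print(directory.route("small-customer"))  # shard-0
```

The flexibility is obvious from the code — and so is the operational risk: every query now depends on this lookup being fast and available.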
Geographic / locality-based sharding
Data is sharded by user location to minimize cross-region latency. Example: EU users on EU shards, US users on US shards.
Pros:
- Compliance with data residency laws (GDPR)
- Lower latency for users
- Failure isolation by geography
Cons:
- Cross-geography queries expensive
- Complex if users move regions
- Duplicated infrastructure costs in every region
When to use: Global applications with regulatory requirements or strong locality patterns.
Choosing a shard key
The shard key determines almost everything about your sharding implementation. Bad shard keys cause most sharding failures.
A good shard key:
- High cardinality: many distinct values to distribute across shards
- Even distribution: no hot keys (avoid sharding by country if 80% are in one country)
- Used in most queries: minimizes cross-shard fan-out
- Stable: doesn't change frequently (re-sharding moves data)
Bad shard keys:
- Boolean fields: only 2 values, only 2 shards possible
- Timestamps for active workloads: latest shard becomes hot
- Sequential IDs without hashing: writes concentrate on latest shard
- Mutable fields: changes require data migration
Common good choices:
- user_id (for user-centric apps)
- tenant_id (for B2B SaaS)
- hash(user_id) (for even distribution)
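Before committing to a shard key, it is worth measuring how evenly it actually spreads. A quick sanity check on synthetic keys (the key format and shard count are illustrative):

```python
# Sanity-check a candidate shard key: hash a sample of keys and
# compare the busiest shard to the quietest one.
import hashlib
from collections import Counter

def shard_for(key: str, num_shards: int) -> int:
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % num_shards

counts = Counter(shard_for(f"user:{i}", 4) for i in range(100_000))
spread = max(counts.values()) / min(counts.values())
print(f"max/min shard load ratio: {spread:.2f}")  # close to 1.0 is good
```

Run the same check against real key samples from production; a ratio far above 1 means hot shards before you've even launched.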
Cross-shard queries: the hidden tax
Once sharded, queries that span shards become expensive:
- Aggregations across all users (must query every shard)
- Joins between users on different shards (rarely possible without expensive scatter-gather)
- Global counts/sums (require fan-out + aggregation)
Strategies to mitigate:
- Denormalize across shards: duplicate data so queries can be answered from one shard
- Pre-compute aggregates: maintain a separate, non-sharded analytics database
- Avoid cross-shard joins: redesign data model so related data is co-located
- Accept fan-out for rare queries: letting infrequent queries run slower is a reasonable trade
The discipline: design your sharding so 95%+ of queries hit a single shard.
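The fan-out pattern itself is simple to sketch; the cost is that latency is governed by the slowest shard. A toy scatter-gather, with in-memory dicts standing in for per-shard databases:

```python
# Scatter-gather sketch: a global count must query every shard and
# merge the partial results.
shards = [
    {"alice": 3, "bob": 5},   # shard 0: user -> order count
    {"carol": 2},             # shard 1
    {"dave": 7, "erin": 1},   # shard 2
]

def global_order_count() -> int:
    # One query per shard (scatter), then a merge step (gather).
    return sum(sum(shard.values()) for shard in shards)

print(global_order_count())  # 18
```

In a real system the per-shard queries would run in parallel, but the shape is the same: N round trips, one aggregation, and tail latency set by the worst shard — which is why the 95%+ single-shard discipline matters.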
Resharding: the inevitable pain
Eventually you'll need to:
- Add shards as data grows
- Remove shards if traffic shrinks
- Rebalance hot shards
- Change shard key (worst case)
Resharding requires:
- Provisioning new shards
- Copying data from old shards to new
- Updating routing logic
- Cutover (often with downtime or careful coordination)
- Decommissioning old shards
This is one of the most operationally expensive activities in distributed systems. Many companies plan their initial shard count to be 4-10x current needs to delay resharding for years.
Common mistakes
- Sharding too early: paying operational tax for years before scale justifies it
- Bad shard key: data ends up unevenly distributed, hot shards form
- Cross-shard joins everywhere: queries fan out, performance worsens after sharding
- No resharding plan: when growth forces resharding, no playbook exists
- Single point of failure in routing: directory service goes down, entire system unreachable
- Inconsistent transactions across shards: distributed transactions are slow and complex; many teams underestimate this
What to read next
- System design basics — context before sharding decisions.
- SQL vs NoSQL — sharding implications differ between paradigms.
- Database isolation levels — distributed transactions deep-dive.
- Eventual consistency — what you trade for scale.
Sharding is a powerful tool but a heavy one. The best sharding decisions come from teams that have first exhausted simpler scaling options, then sharded with a clear understanding of the trade-offs they're accepting. Premature sharding burdens engineering teams for years; well-executed sharding solves real scale problems durably.
Frequently asked questions
When do I actually need to shard?
Almost never as your first scaling step. Try in order: vertical scaling (bigger machine), read replicas, query optimization, caching, schema denormalization, then sharding. Sharding is appropriate when (a) a single primary cannot handle write load even after optimization, (b) data volume exceeds what one node can store, or (c) latency requirements demand co-location of user data. Most teams shard 2-3 years too early and pay a heavy operational tax as a result.
What's the difference between sharding and partitioning?
Often used interchangeably but technically different. Partitioning is splitting a single table into smaller pieces — could be on the same machine (e.g., PostgreSQL table partitions). Sharding is partitioning data across multiple physical machines, each running its own database. Sharding always implies partitioning; partitioning doesn't always mean sharding.
Can I un-shard if it was a mistake?
Possible but painful. Consolidating shards involves data migration, schema reconciliation, and downtime risks. Plan for the possibility from day one — keep cross-shard queries minimal, don't rely on shard-specific behaviors, and document migration paths. Many companies have un-sharded successfully (e.g., when a previously growth-justified shard count became unnecessary after architectural improvements), but it requires the same engineering investment as the original sharding.