Tech · 7 min read
Database Sharding Explained: When, Why, and How to Do It Right
Sharding is the most common — and most misunderstood — way to scale databases. The strategies, the trade-offs, and the cases where you absolutely should not shard.
By Jarviix Engineering · Apr 19, 2026
Database sharding is one of those topics where the conventional wisdom — "shard your database to scale!" — leads to most teams making expensive mistakes. Sharding is a sledgehammer that solves real scaling problems but introduces operational complexity that compounds over years.
This post covers when sharding is actually justified, the major sharding strategies and their trade-offs, and the engineering decisions that determine whether your sharded system stays maintainable or becomes a quagmire.
What sharding actually is
Sharding splits a logical database across multiple physical machines (shards). Each shard contains a subset of the total data, distinguished by a sharding key. Queries route to the appropriate shard based on the key.
For example, sharding by user_id with 4 shards:
- Shard 0: user_ids 0-249,999
- Shard 1: user_ids 250,000-499,999
- Shard 2: user_ids 500,000-749,999
- Shard 3: user_ids 750,000+
A query for user 327,489 routes to Shard 1.
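The routing step above is just a sorted-boundary lookup. A minimal sketch, assuming the four hypothetical shard ranges from the example (boundary values are illustrative, not a recommendation):

```python
# Range-based routing for the 4-shard example above.
# SHARD_UPPER_BOUNDS holds the exclusive upper end of shards 0-2;
# shard 3 takes everything from 750,000 up.
import bisect

SHARD_UPPER_BOUNDS = [250_000, 500_000, 750_000]

def route_to_shard(user_id: int) -> int:
    """Return the index of the shard that owns user_id."""
    return bisect.bisect_right(SHARD_UPPER_BOUNDS, user_id)

print(route_to_shard(327_489))  # -> 1, matching the example above
```

The same table-of-boundaries approach generalizes to any number of shards; the lookup stays O(log n) in the shard count.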
This achieves three things:
- Horizontal scaling: write load splits across machines
- Data locality: data for related queries lives on the same machine
- Failure isolation: an outage on one shard doesn't take down the entire system
It also introduces:
- Distributed query complexity: cross-shard queries become hard
- Operational overhead: 4 databases to manage instead of 1
- Resharding pain: changing shard topology requires significant data movement
The pre-sharding scaling ladder
Before sharding, exhaust simpler options:
1. Vertical scaling
Buy a bigger machine. Modern cloud instances offer 96 vCPUs, 1.5TB RAM, NVMe SSDs. PostgreSQL on a beefy instance can handle 50,000+ TPS for many workloads. The simplicity advantage is enormous.
2. Read replicas
Add 2-5 read replicas. Route reads to replicas, writes to primary. This handles 80% of scaling needs because most workloads are read-heavy. Eventual consistency on replicas is usually acceptable.
3. Query optimization
Profile slow queries with EXPLAIN ANALYZE. Add missing indexes. Rewrite N+1 query patterns. A 10x query speed improvement is often equivalent to 10x more hardware.
4. Caching
Add Redis/Memcached for frequently-accessed data. A cache hit can be 100x faster than a database query, and caching often eliminates 70-90% of read load.
5. Schema denormalization
Pre-compute aggregates, join tables, materialized views. Trade some write complexity for dramatic read efficiency.
6. Connection pooling and database-level optimizations
PgBouncer, statement-level config tuning, autovacuum tuning. These can yield 2-5x throughput on existing hardware.
Only after these are exhausted should you consider sharding.
Sharding strategies
Range-based sharding
Data is partitioned by ranges of the shard key. Example: user_id 0-1M on shard 1, 1M-2M on shard 2, etc.
Pros:
- Simple to understand
- Range queries within a shard are efficient
- Easy to add capacity by appending new ranges at the top end
Cons:
- Hot shards (recent users get all the action)
- Manual rebalancing if data is uneven
- New writes concentrate on the latest shard
When to use: Time-series data, append-only workloads, or when range queries are common and even distribution is achievable.
Hash-based sharding
Apply a hash function to the shard key, modulo the number of shards. Example: shard = hash(user_id) % 4.
Pros:
- Even data distribution
- No hotspots from sequential keys
- Simple, stateless routing (shard = hash(key) % N, no lookup table)
Cons:
- Range queries require fan-out to all shards
- Adding/removing shards requires resharding (every key may change shard)
- Cross-shard queries are common
When to use: User-centric workloads where evenness matters more than range queries.
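The modulo scheme above is easy to implement, but it makes the resharding con concrete: changing the shard count re-routes most keys. A sketch (using md5 rather than Python's built-in `hash()`, which is randomized per process; the key format is made up for illustration):

```python
# Hash-based routing, and the resharding problem it creates.
import hashlib

def shard_for(key: str, num_shards: int) -> int:
    """Stable hash of the key, modulo the shard count."""
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % num_shards

keys = [f"user:{i}" for i in range(10_000)]

# Growing from 4 shards to 5 re-routes the large majority of keys:
moved = sum(1 for k in keys if shard_for(k, 4) != shard_for(k, 5))
print(f"{moved / len(keys):.0%} of keys changed shards")  # ~80% in expectation
```

With a uniform hash, a key keeps its shard going from 4 to 5 only when its hash has the same residue mod 4 and mod 5, which happens about 1 time in 5 — hence roughly 80% of the data moves. This is exactly the cost consistent hashing was designed to avoid.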
Consistent hashing
A hash-based variant that minimizes data movement when adding/removing shards. Each shard owns a slice of the hash ring; only data near the new shard's slice moves.
Pros:
- Adding shards is much cheaper than naive hash sharding
- Failure of one shard affects only its slice
- Used by Cassandra, DynamoDB, Redis Cluster
Cons:
- Slightly more complex to implement and debug
- Hot spots possible if hash distribution is uneven
- Requires virtual nodes for balanced load
When to use: When dynamic resharding is a likely operational need.
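A minimal consistent-hash ring with virtual nodes, to make the "only data near the new slice moves" claim concrete. This is a sketch, not production code — real systems (Cassandra, Redis Cluster) add replication, gossip, and careful vnode counts:

```python
# Consistent-hash ring with virtual nodes. Each physical shard gets
# VNODES points on the ring; a key routes to the first point clockwise
# from its hash.
import bisect
import hashlib

VNODES = 100

def _hash(s: str) -> int:
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

class Ring:
    def __init__(self, shards):
        # One (hash, shard) point per virtual node, sorted around the ring.
        self._points = sorted(
            (_hash(f"{shard}#{v}"), shard)
            for shard in shards for v in range(VNODES)
        )
        self._keys = [h for h, _ in self._points]

    def route(self, key: str) -> str:
        # First point clockwise from the key's hash, wrapping at the top.
        idx = bisect.bisect(self._keys, _hash(key)) % len(self._keys)
        return self._points[idx][1]

ring4 = Ring(["shard-a", "shard-b", "shard-c", "shard-d"])
ring5 = Ring(["shard-a", "shard-b", "shard-c", "shard-d", "shard-e"])

keys = [f"user:{i}" for i in range(10_000)]
moved = sum(1 for k in keys if ring4.route(k) != ring5.route(k))
print(f"{moved / len(keys):.0%} of keys moved")  # roughly 1/5, not ~80%
```

Only the keys that land on the new shard's slices move — about 1/N of the data when growing to N shards, versus nearly all of it under naive modulo hashing.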
Directory-based sharding
A lookup service maps shard keys to physical shards. Total flexibility — each key can be on any shard.
Pros:
- Maximum flexibility
- Easy to move individual users/tenants between shards
- Custom routing logic possible (e.g., big customers on dedicated shards)
Cons:
- Lookup service is a critical dependency
- Lookup latency on every query
- Lookup service must be highly available
When to use: Multi-tenant applications where tenants vary dramatically in size, or where you need tenant-level shard isolation.
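The core of a directory-based router is just an explicit mapping table. In a real deployment this lives in a highly available service with caching in front; the class and shard names below are hypothetical:

```python
# Directory-based routing sketch: an explicit tenant -> shard map,
# which allows per-tenant placement decisions.
class ShardDirectory:
    def __init__(self, default_shard: str = "shard-0"):
        self._map = {}                 # tenant_id -> shard name
        self._default = default_shard  # where unmapped tenants go

    def assign(self, tenant_id: str, shard: str) -> None:
        """Pin a tenant to a specific shard (e.g. a dedicated one)."""
        self._map[tenant_id] = shard

    def route(self, tenant_id: str) -> str:
        return self._map.get(tenant_id, self._default)

directory = ShardDirectory()
directory.assign("big-customer", "dedicated-shard-7")
print(directory.route("big-customer"))    # dedicated-shard-7
print(directory.route("small-customer"))  # shard-0
```

The flexibility is obvious from the code — and so is the operational risk: every query now depends on this lookup being fast and available.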
Geographic / locality-based sharding
Data is sharded by user location to minimize cross-region latency. Example: EU users on EU shards, US users on US shards.
Pros:
- Compliance with data residency laws (GDPR)
- Lower latency for users
- Failure isolation by geography
Cons:
- Cross-geography queries expensive
- Complex if users move regions
- Duplicated infrastructure costs in every region
When to use: Global applications with regulatory requirements or strong locality patterns.
Choosing a shard key
The shard key determines almost everything about your sharding implementation. Bad shard keys cause most sharding failures.
A good shard key:
- High cardinality: many distinct values to distribute across shards
- Even distribution: no hot keys (avoid sharding by country if 80% are in one country)
- Used in most queries: minimizes cross-shard fan-out
- Stable: doesn't change frequently (re-sharding moves data)
Bad shard keys:
- Boolean fields: only 2 values, only 2 shards possible
- Timestamps for active workloads: latest shard becomes hot
- Sequential IDs without hashing: writes concentrate on latest shard
- Mutable fields: changes require data migration
Common good choices:
- user_id (for user-centric apps)
- tenant_id (for B2B SaaS)
- hash(user_id) (for even distribution)
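Before committing to a shard key, it is worth measuring how evenly it actually spreads. A quick sanity check on synthetic keys (the key format and shard count are illustrative):

```python
# Sanity-check a candidate shard key: hash a sample of keys and
# compare the busiest shard to the quietest one.
import hashlib
from collections import Counter

def shard_for(key: str, num_shards: int) -> int:
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % num_shards

counts = Counter(shard_for(f"user:{i}", 4) for i in range(100_000))
spread = max(counts.values()) / min(counts.values())
print(f"max/min shard load ratio: {spread:.2f}")  # close to 1.0 is good
```

Run the same check against real key samples from production; a ratio far above 1 means hot shards before you've even launched.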
Cross-shard queries: the hidden tax
Once sharded, queries that span shards become expensive:
- Aggregations across all users (must query every shard)
- Joins between users on different shards (rarely possible without expensive scatter-gather)
- Global counts/sums (require fan-out + aggregation)
Strategies to mitigate:
- Denormalize across shards: duplicate data so queries can be answered from one shard
- Pre-compute aggregates: maintain a separate, non-sharded analytics database
- Avoid cross-shard joins: redesign data model so related data is co-located
- Accept fan-out for rare queries: letting infrequent queries run slower is a reasonable trade
The discipline: design your sharding so 95%+ of queries hit a single shard.
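The fan-out pattern itself is simple to sketch; the cost is that latency is governed by the slowest shard. A toy scatter-gather, with in-memory dicts standing in for per-shard databases:

```python
# Scatter-gather sketch: a global count must query every shard and
# merge the partial results.
shards = [
    {"alice": 3, "bob": 5},   # shard 0: user -> order count
    {"carol": 2},             # shard 1
    {"dave": 7, "erin": 1},   # shard 2
]

def global_order_count() -> int:
    # One query per shard (scatter), then a merge step (gather).
    return sum(sum(shard.values()) for shard in shards)

print(global_order_count())  # 18
```

In a real system the per-shard queries would run in parallel, but the shape is the same: N round trips, one aggregation, and tail latency set by the worst shard — which is why the 95%+ single-shard discipline matters.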
Resharding: the inevitable pain
Eventually you'll need to:
- Add shards as data grows
- Remove shards if traffic shrinks
- Rebalance hot shards
- Change shard key (worst case)
Resharding requires:
- Provisioning new shards
- Copying data from old shards to new
- Updating routing logic
- Cutover (often with downtime or careful coordination)
- Decommissioning old shards
This is one of the most operationally expensive activities in distributed systems. Many companies plan their initial shard count to be 4-10x current needs to delay resharding for years.
Common mistakes
- Sharding too early: paying operational tax for years before scale justifies it
- Bad shard key: data ends up unevenly distributed, hot shards form
- Cross-shard joins everywhere: queries fan out, performance worsens after sharding
- No resharding plan: when growth forces resharding, no playbook exists
- Single point of failure in routing: directory service goes down, entire system unreachable
- Inconsistent transactions across shards: distributed transactions are slow and complex; many teams underestimate this
What to read next
- System design basics — context before sharding decisions.
- SQL vs NoSQL — sharding implications differ between paradigms.
- Database isolation levels — distributed transactions deep-dive.
- Eventual consistency — what you trade for scale.
Sharding is a powerful tool but a heavy one. The best sharding decisions come from teams that have first exhausted simpler scaling options, then sharded with a clear understanding of the trade-offs they're accepting. Premature sharding burdens engineering teams for years; well-executed sharding solves real scale problems durably.
Frequently asked questions
When do I actually need to shard?
Almost never as your first scaling step. Try in order: vertical scaling (bigger machine), read replicas, query optimization, caching, schema denormalization, then sharding. Sharding is appropriate when (a) a single primary cannot handle write load even after optimization, (b) data volume exceeds what one node can store, or (c) latency requirements demand co-location of user data. Most teams shard 2-3 years too early and pay a heavy operational tax as a result.
What's the difference between sharding and partitioning?
Often used interchangeably but technically different. Partitioning is splitting a single table into smaller pieces — could be on the same machine (e.g., PostgreSQL table partitions). Sharding is partitioning data across multiple physical machines, each running its own database. Sharding always implies partitioning; partitioning doesn't always mean sharding.
Can I un-shard if it was a mistake?
Possible but painful. Consolidating shards involves data migration, schema reconciliation, and downtime risks. Plan for the possibility from day one — keep cross-shard queries minimal, don't rely on shard-specific behaviors, and document migration paths. Many companies have un-sharded successfully (e.g., when a previously growth-justified shard count became unnecessary after architectural improvements), but it requires the same engineering investment as the original sharding.