Tech · 5 min read
SQL vs NoSQL: A Decision Framework, Not a Religion
When relational makes sense, when document or wide-column wins, and how to pick a database without falling into the 'we use Mongo because it's web-scale' trap.
By Jarviix Engineering · Apr 19, 2026
The "SQL vs NoSQL" debate is mostly noise. Both work. The interesting question is which database fits the workload you actually have — and that question gets answered with engineering, not vibes.
This post walks through how to decide without joining a tribe.
What "SQL" and "NoSQL" actually mean
"SQL" databases are relational stores — Postgres, MySQL, SQL Server, Oracle. They share three things: the relational model (tables, rows, foreign keys), strong consistency by default, and transactional guarantees (ACID).
"NoSQL" is a marketing umbrella covering at least four very different families:
- Document stores. MongoDB, DynamoDB documents, Couchbase. JSON-like records keyed by ID.
- Wide-column stores. Cassandra, ScyllaDB, Bigtable. Sorted key-value structures where each row can hold many columns, grouped under a partition key.
- Key-value stores. Redis, DynamoDB key-value, RocksDB. Pure GET/SET on opaque values.
- Graph databases. Neo4j, Amazon Neptune. Optimized for traversing relationships.
Treating all of these as one thing ("NoSQL") is the source of half the bad decisions in this space. They're as different from each other as they are from Postgres.
What relational really gives you
The thing relational databases do best — and the thing that's underrated until you lose it — is enforcing constraints and letting you ask arbitrary questions.
- Schema enforcement. Bad data doesn't get in. Bugs surface at write time, not three months later when reports are wrong.
- Foreign keys + constraints. Referential integrity, uniqueness, check constraints. These are bug-prevention infrastructure.
- Joins. When you don't know in advance how the data will be queried, joins let you ask any question without re-modeling.
- Transactions. Money-moving operations either fully happen or fully don't. With ACID this is one line of code; without it, it's an entire architectural concern.
- Mature tooling. 40 years of query optimizers, monitoring, backup tools, ORMs, education. Don't undervalue this.
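The constraint and transaction guarantees above fit in a few lines. A minimal sketch using Python's built-in sqlite3 (the table names and amounts are made up for illustration; the same ideas apply to Postgres or any ACID store):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite requires opting in to FK enforcement
conn.execute("""
    CREATE TABLE accounts (
        id      INTEGER PRIMARY KEY,
        balance INTEGER NOT NULL CHECK (balance >= 0)
    )""")
conn.execute("""
    CREATE TABLE transfers (
        id      INTEGER PRIMARY KEY,
        from_id INTEGER NOT NULL REFERENCES accounts(id),
        to_id   INTEGER NOT NULL REFERENCES accounts(id),
        amount  INTEGER NOT NULL CHECK (amount > 0)
    )""")
conn.execute("INSERT INTO accounts VALUES (1, 100), (2, 0)")
conn.commit()

def transfer(conn, from_id, to_id, amount):
    # Either every statement commits, or none do: the with-block commits on
    # success and rolls back on any exception.
    with conn:
        conn.execute("UPDATE accounts SET balance = balance - ? WHERE id = ?",
                     (amount, from_id))
        conn.execute("UPDATE accounts SET balance = balance + ? WHERE id = ?",
                     (amount, to_id))
        conn.execute("INSERT INTO transfers (from_id, to_id, amount) VALUES (?, ?, ?)",
                     (from_id, to_id, amount))

transfer(conn, 1, 2, 30)        # succeeds: balances become 70 / 30

try:
    transfer(conn, 1, 2, 999)   # would violate CHECK (balance >= 0)
except sqlite3.IntegrityError:
    pass                        # whole transfer rolled back, nothing half-applied

balances = dict(conn.execute("SELECT id, balance FROM accounts"))
# balances == {1: 70, 2: 30}
```

Note what didn't require application code: the overdraft was rejected by the database itself, and the partial update it would have left behind was undone automatically.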
The cost: relational databases historically scaled vertically (bigger machines), and re-modeling takes migrations. Both are less true than they used to be — Postgres on a $1000/month managed instance handles 95% of products comfortably.
What document stores really give you
Document stores nail one specific problem: store and retrieve flexible records by key, fast, at scale.
The wins:
- Flexible schema. Records of different shapes coexist in one collection. Useful when the data model is genuinely heterogeneous.
- Single-record reads at scale. "Get user 42's full profile" is one fast lookup, no joins.
- Horizontal scaling. The shard key is built in; large document stores scale to petabytes routinely.
- Aggregation pipelines. MongoDB's aggregation framework and DynamoDB's global secondary indexes cover many query needs.
The trade-offs:
- Joins are awkward. You denormalize (duplicate data), or do app-side joins (slow), or both.
- Consistency models vary. Some document stores are eventually consistent by default; you have to opt in to stronger guarantees.
- Schema-less is a lie. Your application has a schema regardless; without DB enforcement, every read defends against malformed data.
Reach for document stores when the dominant access pattern is "lookup by ID", you have a clear shard key, and the data shape is naturally hierarchical.
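The denormalization trade-off above is easiest to see in code. An illustrative sketch with plain dicts standing in for collections (names and shapes are made up, not any store's actual API):

```python
# Normalized (relational-style): one copy of each fact, joins at read time.
users  = {1: {"name": "Ada"}}
orders = {10: {"user_id": 1, "total": 99}}

def order_with_user(order_id):
    # App-side "join": two lookups (two round-trips in a real store)
    # instead of one SQL JOIN.
    order = orders[order_id]
    return {**order, "user": users[order["user_id"]]}

# Denormalized (document-style): one fast lookup, duplicated data.
order_docs = {
    10: {"total": 99, "user": {"id": 1, "name": "Ada"}},
}

doc = order_docs[10]   # a single key lookup, no join needed

# ...but changing a user's name now means rewriting every embedded copy.
def rename_user(user_id, new_name):
    users[user_id]["name"] = new_name     # one write, normalized
    for d in order_docs.values():         # N writes, denormalized
        if d["user"]["id"] == user_id:
            d["user"]["name"] = new_name
```

The read path gets faster; the write path inherits the job of keeping every duplicate consistent. That's the deal you're signing.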
What wide-column gives you
Cassandra and friends are designed for one thing: massive write throughput across many machines, with predictable read latency for known query patterns.
- Writes are extremely cheap. LSM-tree storage means writes are essentially appends. Tens of thousands of writes per second per node is routine, and throughput grows with the cluster.
- Linear horizontal scaling. Add nodes, get more capacity, no resharding pain.
- Tunable consistency. Per-query, you choose how many replicas must acknowledge.
The cost: you must design the schema around your queries, not the other way around. Want a new query pattern? You build a new table for it. Joins, ad-hoc filters, secondary indexes: limited or expensive.
Reach for wide-column when you have known, fixed access patterns at very high write volume — time-series, event logs, IoT telemetry, messaging fanout.
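"Design the schema around your queries" is concrete enough to sketch. A toy wide-column layout in Python, with rows grouped by partition key and kept sorted by clustering key (roughly a hypothetical CQL `PRIMARY KEY ((device_id), ts)`; this is a model of the idea, not how Cassandra is implemented):

```python
from bisect import insort
from collections import defaultdict

# partition key -> sorted list of (clustering key, payload)
table = defaultdict(list)

def write(device_id, ts, payload):
    insort(table[device_id], (ts, payload))   # cheap, append-like write

def latest(device_id, n):
    return table[device_id][-n:]              # one partition, one range read

for ts in range(100):
    write("sensor-1", ts, {"temp": 20 + ts % 3})

# Fast: the table was designed for exactly this query.
recent = latest("sensor-1", 3)   # events with ts 97, 98, 99

# Slow: "all readings above 21 across all devices" scans everything --
# in a real cluster, this query pattern would get its own table.
hot = [(d, ts) for d, rows in table.items() for ts, p in rows if p["temp"] > 21]
```

The query the table was built for touches one partition and reads a contiguous range. The query it wasn't built for touches every partition, which is exactly why wide-column modeling starts from the query list, not the entity list.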
What key-value gives you
The simplest model: GET/SET on opaque keys. Redis, Memcached, DynamoDB-as-KV.
- Sub-millisecond latency. In-memory or memory-resident.
- Trivial scaling for simple workloads. Sharding by key is straightforward.
- Atomic primitives. Redis especially: counters, lists, sorted sets, streams.
Pure key-value stores are rarely the primary system of record. They're caches, session stores, rate limiters, queue cores: the boring, high-impact roles.
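The rate-limiter role is a good example of how little the key-value model needs. A fixed-window limiter sketched over a plain dict; with Redis you'd use INCR plus EXPIRE on a per-user, per-window key instead (the limits and window here are made-up parameters):

```python
import time

counters = {}  # (user, window bucket) -> request count

def allow(user, limit=5, window=60, now=None):
    # Fixed-window limiting: bucket time into windows, count per bucket,
    # reject once the count exceeds the limit.
    now = time.time() if now is None else now
    bucket = int(now // window)
    key = (user, bucket)
    counters[key] = counters.get(key, 0) + 1
    return counters[key] <= limit

# Five requests at t=0 with a limit of 3 per 60s window:
results = [allow("u1", limit=3, now=0) for _ in range(5)]
# the first three pass, the rest are throttled until the window rolls over
```

Everything here is a GET/SET/increment on a key. That's why key-value stores dominate this niche: the model matches the job exactly, and latency is all that's left to optimize.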
How to actually choose
A short decision flow:
- Default to Postgres. If the data fits a relational model and the scale is "millions of rows" not "billions", you'll regret almost any other choice.
- Need extreme write throughput on known patterns? Wide-column (Cassandra, ScyllaDB).
- Lookup by key, hierarchical records, schema flexibility? Document store (MongoDB, DynamoDB).
- Caching, sessions, counters, queues? Redis.
- Heavily traversal-based relationships (recommendations, social graphs, fraud rings)? Graph database.
- Need multiple of the above? Pick a primary, layer the others. Polyglot persistence is normal.
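The decision flow above is almost executable, so here it is as a function. The workload fields and thresholds are illustrative placeholders, not a real taxonomy:

```python
def pick_database(workload: dict) -> str:
    """Sketch of the decision flow; thresholds are made-up round numbers."""
    if workload.get("traversal_heavy"):
        return "graph database (Neo4j, Neptune)"
    if workload.get("cache_or_session"):
        return "Redis"
    if workload.get("write_qps", 0) > 100_000 and workload.get("known_patterns"):
        return "wide-column (Cassandra, ScyllaDB)"
    if workload.get("key_lookup_dominant") and workload.get("hierarchical"):
        return "document store (MongoDB, DynamoDB)"
    return "Postgres"   # the default, on purpose

pick_database({"write_qps": 500, "rows": 10_000_000})   # -> "Postgres"
```

Note the ordering: the specialized stores only win when a specific condition is met, and the fall-through case is relational. That's the framework in one screen.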
"We use Mongo because we'll be web-scale"
The most common bad reason to pick a NoSQL database: imagined future scale.
The honest math: Postgres on modest hardware handles tens of thousands of QPS and several terabytes. By the time your product genuinely needs more, you'll either have the revenue to staff a real data platform, or you'll have already pivoted three times. Don't pre-optimize for a load you don't have, with a database whose trade-offs you don't yet feel.
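To make "the honest math" concrete, a back-of-envelope with deliberately round, assumed numbers (the QPS figure and row size are illustrative, not benchmarks):

```python
qps = 10_000                       # assumed sustained load, well within Postgres's range
requests_per_day = qps * 86_400    # = 864,000,000 requests/day

avg_row_bytes = 500                # assumed average row size
rows_in_2tb = (2 * 10**12) // avg_row_bytes   # = 4,000,000,000 rows
```

Nearly a billion requests a day and billions of rows before the default choice runs out of road. Most products never get there.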
Pick what makes development fast today. Re-architect when usage proves you need to.
Polyglot in practice
A typical production stack at moderate scale:
- Postgres for the OLTP heart — users, orders, billing.
- Redis for caching, sessions, rate limiting, and short-lived state.
- Elasticsearch / OpenSearch for full-text and complex filtering.
- S3 + Parquet + DuckDB / ClickHouse for analytics.
- DynamoDB / Cassandra for one or two specific high-throughput append paths.
This isn't "we couldn't decide" — it's "each database is doing the thing it's actually best at". The cost is operational complexity; the benefit is each workload runs on a system that suits it.
What to read next
If you've decided on relational, database indexing and isolation levels are the next two articles to internalize. If you're going polyglot, eventual consistency explains what you're trading off. The Instagram HLD writeup is a great applied example of polyglot persistence — relational for users and relationships, NoSQL for the feed, blob storage for media, search index for discovery.
Frequently asked questions
Is Postgres really 'good enough' for everything?
For most products with under a few terabytes of data and reasonable QPS, yes. Postgres is so feature-rich (JSONB, full-text search, GIS, listen/notify, logical replication) that it's a credible answer to most early-stage data questions.
When should I split into multiple databases?
When one workload's needs (latency, schema, scale) actively hurt the others. A single Postgres serving OLTP, analytics, and search will eventually do all three badly.
Is NoSQL faster than SQL?
Sometimes for specific shapes (single-key lookups at huge scale), often not. The honest comparison is workload-by-workload, not database-by-database.
Read next
Apr 19, 2026 · 5 min read
Database Indexing Explained: B-trees, Hash Indexes, and When to Add One
How database indexes actually work — B-trees vs hash, covering and partial indexes, the cost they impose on writes, and a practical rulebook for when to add one.
Apr 19, 2026 · 6 min read
Database Isolation Levels: From Read Committed to Serializable, Without the Confusion
What isolation levels actually do, the anomalies each one prevents, and which level your real-world workload should use — explained without the textbook fog.
Apr 19, 2026 · 7 min read
Database Sharding Explained: When, Why, and How to Do It Right
Sharding is the most common — and most misunderstood — way to scale databases. The strategies, the trade-offs, and the cases where you absolutely should not shard.