SQL vs NoSQL
Every application stores data. The question is how. Two fundamentally different philosophies dominate production systems — relational databases (SQL) and non-relational databases (NoSQL). They're not competitors fighting for the same use case. They solve different problems, and knowing which to reach for is one of the most consequential early decisions in any system design.
How the same data is stored
SQL — Relational Databases
Relational databases organise data into tables — rows and columns, like a spreadsheet. Every row conforms to a fixed schema. Every column has a defined type. Relationships between tables are expressed through foreign keys and resolved at query time with JOIN operations.
SELECT users.name, orders.amount
FROM users
JOIN orders ON orders.user_id = users.id
WHERE users.id = 'u1';
This structured approach comes with strong guarantees — the ACID properties:
- Atomicity — a transaction either fully succeeds or fully fails. No partial writes.
- Consistency — every write leaves the database in a valid state, respecting all constraints.
- Isolation — concurrent transactions don't interfere with each other.
- Durability — once committed, data survives crashes.
These guarantees are why relational databases have been the default for fifty years. When you need correctness — banking, healthcare, e-commerce — you need ACID.
The most widely used SQL databases: PostgreSQL, MySQL, SQLite, and managed versions like Amazon RDS, Cloud SQL, and Azure SQL.
The Schema Contract
The rigid schema is both SQL's superpower and its constraint. Every piece of data has a defined shape before a single row is inserted. This means:
- Invalid data can't enter the database — the schema enforces it
- Your codebase has a clear, queryable contract for what data looks like
- Adding or changing a column requires a migration — a planned change applied to the whole table
For large tables, migrations can be slow and risky. A schema change on a 500-million-row table requires careful planning to avoid downtime.
NoSQL — Non-Relational Databases
NoSQL is an umbrella term for databases that don't use the relational model. They trade some of the structure and guarantees of SQL for flexibility, scale, or specialisation. There are four main types:
Document Stores
Store data as JSON-like documents. Each document can have a different shape — no fixed schema required. Related data is often embedded directly in the document rather than split across tables.
{
"id": "u1",
"name": "Ada",
"email": "ada@example.com",
"orders": [
{ "product": "Book", "amount": 29 },
{ "product": "Pen", "amount": 5 }
]
}
One document, one read. No joins. For read-heavy workloads where you always fetch the same shape of data together, this is dramatically faster. MongoDB, Firestore, and CouchDB are common examples.
Key-Value Stores
The simplest NoSQL model — a dictionary at scale. You store a value under a key and retrieve it by that key. Extremely fast, but limited query capability. Perfect for caching, sessions, and rate-limiting counters. Redis and DynamoDB (in simple use cases) are the dominant tools.
Wide-Column Stores
Data is stored in tables, but columns are flexible per-row and grouped into families. Designed for massive write throughput and time-series data. Apache Cassandra and HBase are the main examples — used by Netflix, Instagram, and Discord at massive scale.
Graph Databases
Model data as nodes and edges — entities and the relationships between them. Ideal when the relationships are as important as the data itself. Social networks, recommendation engines, and fraud detection are natural fits. Neo4j is the most widely used.
The Core Tradeoffs
Schema Flexibility
SQL enforces a schema. NoSQL usually doesn't. For rapidly evolving products where the data model changes frequently, NoSQL's flexibility is an advantage. For mature systems where data integrity is paramount, SQL's schema is a safety net.
Scalability
SQL databases scale vertically well — give them more RAM and CPU and they handle more. Horizontal scaling (splitting data across many machines) is harder. Sharding a relational database is complex and often painful.
NoSQL databases are frequently designed for horizontal scalability from the start. Cassandra and DynamoDB can distribute data across hundreds of nodes with no single point of failure.
Query Power
SQL is a remarkably powerful query language. Ad-hoc queries, complex aggregations, multi-table joins — SQL handles all of it. NoSQL databases usually have more limited query capabilities. What they sacrifice in query flexibility, they gain in speed for the queries they do support.
Consistency Guarantees
SQL gives you ACID. Most NoSQL databases opt for eventual consistency — writes propagate across nodes eventually, not immediately. For many applications this is fine. For financial transactions, it is not.
Comparison
| SQL | NoSQL | |
|---|---|---|
| Data model | Tables, rows, columns | Documents, key-value, graph, wide-column |
| Schema | Fixed, enforced | Flexible or schema-less |
| Relationships | JOINs across tables | Embedding or application-level |
| Scaling | Vertical (primarily) | Horizontal (primarily) |
| Consistency | ACID | Eventual (typically) |
| Query language | SQL — powerful, flexible | Varies — often limited |
| Best for | Structured, relational data | High-volume, flexible, or specialised data |
When to Use SQL
Reach for a relational database when:
- Your data is structured and relationships between entities matter
- You need strong consistency — financial transactions, inventory, anything where partial writes are dangerous
- Your team needs to run ad-hoc queries and reports
- You're building a product where the data model is reasonably stable
- You don't yet know your access patterns — SQL's flexibility lets you query any way you need
PostgreSQL is the default choice for most new applications. It's open source, battle-tested, supports JSON columns (giving you some document-store flexibility), and has excellent managed hosting on every major cloud.
When to Use NoSQL
Reach for NoSQL when:
- You have very high write throughput that a single relational database can't handle
- Your data is naturally document-shaped and you always access it as a whole
- You need flexible schemas — different records with different shapes
- You're building a caching layer, session store, or leaderboard (reach for Redis)
- You're storing time-series data or event logs at massive scale (reach for Cassandra)
- You need sub-millisecond reads on simple key lookups
The most common real-world pattern: use PostgreSQL as your primary database for core business data, and Redis alongside it for caching and sessions. You rarely need to choose just one.
Key Takeaways
- SQL databases store data in structured tables with fixed schemas and strong ACID consistency — the right default for most applications
- NoSQL databases trade schema rigidity and full consistency for flexibility, scale, or specialised access patterns
- Document stores embed related data together for fast reads; key-value stores give you ultra-fast simple lookups; wide-column handles massive write throughput; graph databases model relationships natively
- Most production systems use both — PostgreSQL for core data, Redis for caching and sessions
- Start with PostgreSQL and reach for NoSQL when you have a specific problem it solves better, not because it sounds more scalable
What's Next
Now that you understand where data lives, the next question is how to avoid hitting the database at all. In the next lesson we'll look at caching — one of the highest-leverage tools in a system designer's toolkit.
Enjoyed this breakdown?
Get new lessons in your inbox.