Data & Storage
A choice between relational databases with strict schemas and ACID guarantees and non-relational databases optimized for scale, flexibility, or specialized workloads.
SQL (relational) databases — PostgreSQL, MySQL, SQL Server, Oracle — store data in tables with strict schemas, enforce ACID transactions, and support powerful joins and aggregations via SQL. They are the right default for most applications because schemas catch bugs at write time, transactions make multi-step operations safe, and decades of tooling exist.
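As a rough illustration of schemas catching bugs at write time and of joins with aggregation, here is a minimal sketch using Python's built-in sqlite3 module. The customers and orders tables and the try_insert helper are hypothetical names chosen for this example, and SQLite stands in for any relational database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign-key checks off by default
conn.executescript("""
    CREATE TABLE customers (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        amount      INTEGER NOT NULL CHECK (amount > 0)
    );
""")
conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Ada'), (2, 'Grace')")
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(1, 30), (1, 20), (2, 5)],
)


def try_insert(customer_id, amount):
    """Return True if the row was accepted, False if the schema rejected it."""
    try:
        conn.execute(
            "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
            (customer_id, amount),
        )
        return True
    except sqlite3.IntegrityError:
        return False


rejected_negative = not try_insert(1, -10)  # violates the CHECK constraint
rejected_orphan = not try_insert(99, 10)    # violates the foreign key

# Join plus aggregation: total order amount per customer.
totals = conn.execute("""
    SELECT c.name, SUM(o.amount)
    FROM customers c JOIN orders o ON o.customer_id = c.id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()
```

Both bad rows are refused by the database itself rather than by application code, so every writer gets the same checks.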
NoSQL is an umbrella for everything else: key-value stores (Redis, DynamoDB), document stores (MongoDB, Couchbase), wide-column stores (Cassandra, HBase), and graph databases (Neo4j). Each NoSQL flavor optimizes for a specific workload — write throughput, schema flexibility, single-digit-ms key lookup, or graph traversal — and trades away features the relational model assumes.
The modern reality is that most large systems use both: a relational database for the transactional core, and one or more specialized NoSQL stores for caching, search, analytics, time-series data, or graph queries. The interesting question is rarely "SQL or NoSQL" but "what is the right store for this specific access pattern".
Default to SQL unless you have a clear reason (a proven scale problem, schema-less data, a key-value workload, or graph traversal) to pick a specific NoSQL store.
SQL databases typically give up easy horizontal scaling and peak write throughput in exchange for safety. NoSQL stores typically give up joins, multi-record transactions, and ad hoc query flexibility in exchange for scale or speed on a specific access pattern.
Atomicity, Consistency, Isolation, Durability — the four properties that traditional database transactions guarantee.
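Atomicity is the easiest of the four properties to demonstrate. The sketch below assumes a hypothetical accounts table and transfer function, and uses SQLite through Python's sqlite3 module. A transfer that would overdraw the source account is rolled back as a whole, including the credit that had already been applied.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
conn.commit()


def transfer(src, dst, amount):
    """Move money between two accounts; both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The debit broke the CHECK constraint; the earlier credit is undone too.
        return False


def balances():
    return dict(conn.execute("SELECT name, balance FROM accounts"))


ok = transfer("alice", "bob", 40)       # succeeds
failed = transfer("alice", "bob", 500)  # would overdraw alice, so it is rolled back
```

After both calls the balances reflect only the first transfer. The CHECK constraint also gives a small taste of consistency: the database refuses to commit a state that breaks a declared rule.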
Splitting a large dataset across multiple machines so that each shard holds a subset of the data and handles a subset of the load.
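One simple way to realize sharding is hash-based routing, sketched below with in-memory dictionaries standing in for separate machines. ShardedStore and its methods are hypothetical names for this example, and real systems often prefer consistent hashing or range partitioning so that adding a shard moves less data.

```python
import hashlib


class ShardedStore:
    """Toy key-value store that spreads keys across N in-memory shards."""

    def __init__(self, num_shards):
        self.shards = [{} for _ in range(num_shards)]

    def shard_for(self, key):
        # Use a stable hash: Python's built-in hash() for strings is
        # randomized per process, which would break routing across restarts.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.shards)

    def put(self, key, value):
        self.shards[self.shard_for(key)][key] = value

    def get(self, key):
        return self.shards[self.shard_for(key)].get(key)


store = ShardedStore(4)
for i in range(1000):
    store.put(f"user:{i}", {"id": i})

# Each shard ends up holding only a subset of the keys.
shard_sizes = [len(shard) for shard in store.shards]
```

Every read or write touches exactly one shard, which is why this scheme scales well for key lookups and poorly for queries that need data from all shards.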
Maintaining multiple copies of the same data across different nodes for fault tolerance, read scalability, and lower latency.
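The idea can be sketched as a primary node that copies every write to its replicas, with reads served from any live copy. Node and ReplicatedStore are hypothetical classes for illustration only. This toy version replicates synchronously and has no catch-up mechanism, so a replica that was down during a write would stay stale after it came back.

```python
class Node:
    """One machine holding a full copy of the data."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.alive = True


class ReplicatedStore:
    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = replicas
        self._next = 0  # round-robin counter for spreading reads

    def write(self, key, value):
        # All writes go through the primary, then are copied to live replicas.
        self.primary.data[key] = value
        for replica in self.replicas:
            if replica.alive:
                replica.data[key] = value

    def read(self, key):
        # Spread reads across live replicas; fall back to the primary.
        live = [r for r in self.replicas if r.alive]
        if not live:
            return self.primary.data.get(key)
        node = live[self._next % len(live)]
        self._next += 1
        return node.data.get(key)


store = ReplicatedStore(Node("primary"), [Node("replica-1"), Node("replica-2")])
store.write("greeting", "hello")

store.replicas[0].alive = False           # simulate a node failure
value_after_failure = store.read("greeting")
```

The read still succeeds after one replica fails, which is the fault-tolerance benefit. The read-scalability benefit comes from the round-robin spread across replicas.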
A data structure (typically a B-tree or hash table) that lets a database find rows matching a query without scanning the entire table.
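To see an index change how a query is executed, the sketch below asks SQLite for its query plan before and after creating an index. The users table, the idx_users_email index, and the plan helper are hypothetical names for this example, and the exact plan wording varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)


def plan(sql):
    """Return SQLite's human-readable query plan for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column is the detail text


query = "SELECT name FROM users WHERE email = 'user500@example.com'"

plan_before = plan(query)  # without an index: a scan of the whole table
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = plan(query)   # with the index: a search using idx_users_email

print(plan_before)
print(plan_after)
```

SQLite indexes are B-trees, so the lookup cost after the index exists grows logarithmically with table size instead of linearly.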
Intentionally duplicating data across tables to avoid expensive joins and improve read performance, at the cost of write complexity.
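The trade-off can be sketched with a hypothetical orders table that carries a copy of the customer's name. Reads no longer need a join, but every rename must now update two tables. Table, column, and function names here are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id            INTEGER PRIMARY KEY,
        customer_id   INTEGER NOT NULL,
        customer_name TEXT NOT NULL,  -- denormalized copy of customers.name
        amount        INTEGER NOT NULL
    );
    INSERT INTO customers VALUES (1, 'Ada');
    INSERT INTO orders VALUES (10, 1, 'Ada', 30), (11, 1, 'Ada', 20);
""")


def order_summary_joined():
    """Normalized read path: needs a join to find the customer name."""
    return conn.execute("""
        SELECT o.id, c.name, o.amount
        FROM orders o JOIN customers c ON c.id = o.customer_id
        ORDER BY o.id
    """).fetchall()


def order_summary_denormalized():
    """Denormalized read path: a single-table read, no join."""
    return conn.execute(
        "SELECT id, customer_name, amount FROM orders ORDER BY id"
    ).fetchall()


def rename_customer(customer_id, new_name):
    """The write-side cost: every copy of the name must change together."""
    with conn:  # one transaction so the copies cannot drift apart
        conn.execute(
            "UPDATE customers SET name = ? WHERE id = ?", (new_name, customer_id)
        )
        conn.execute(
            "UPDATE orders SET customer_name = ? WHERE customer_id = ?",
            (new_name, customer_id),
        )


rename_customer(1, "Ada L.")
```

If rename_customer forgot the second UPDATE, the two read paths would start returning different names, which is the kind of bug denormalization makes possible.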
A storage architecture that manages data as objects (file + metadata + ID) in a flat namespace, optimized for huge amounts of unstructured data.
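A minimal in-memory sketch of that model follows. ObjectStore, StoredObject, and their methods are hypothetical names and do not correspond to any real service's API. The point is that keys such as "photos/2024/cat.jpg" are opaque identifiers in one flat map, and "folders" are only a prefix filter over those keys.

```python
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    key: str                      # the object's ID in the flat namespace
    data: bytes                   # the payload (the "file")
    metadata: dict = field(default_factory=dict)


class ObjectStore:
    """Toy object store: one flat key -> object map, no directory tree."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data, **metadata):
        # Objects are written whole; there is no partial in-place update.
        self._objects[key] = StoredObject(key, data, dict(metadata))

    def get(self, key):
        return self._objects[key]

    def list_keys(self, prefix=""):
        # "Directories" are just a naming convention over flat keys.
        return sorted(k for k in self._objects if k.startswith(prefix))


store = ObjectStore()
store.put("photos/2024/cat.jpg", b"\xff\xd8...", content_type="image/jpeg")
store.put("photos/2024/dog.jpg", b"\xff\xd8...", content_type="image/jpeg")
store.put("logs/app.log", b"started\n", content_type="text/plain")

photo_keys = store.list_keys("photos/")
cat = store.get("photos/2024/cat.jpg")
```

Because there is no hierarchy to maintain and objects are replaced whole, this layout is straightforward to spread across many machines.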