Reference · 44 terms

System Design Glossary

Concise, opinionated definitions of every concept that shows up in real system design work — from caching and sharding to consensus and CQRS. Each entry covers what the term means, when to reach for it, and the tradeoffs that come with it.

Scalability & Performance

Caching

Storing copies of frequently accessed data in fast memory so that subsequent requests can be served without recomputing or refetching.
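
As a rough illustration, a read-through cache with a time-to-live can be sketched as below. The class name `TTLCache`, its `loader` callback, and the injectable `clock` are assumptions made for this example, not part of any particular library.

```python
import time


class TTLCache:
    """Tiny read-through cache: entries expire after ttl_seconds (illustrative only)."""

    def __init__(self, loader, ttl_seconds, clock=time.monotonic):
        self._loader = loader      # hypothetical slow function: key -> value
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so expiry can be tested
        self._entries = {}         # key -> (expires_at, value)
        self.hits = 0
        self.misses = 0

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            self.hits += 1
            return entry[1]
        # Miss or expired: refetch from the source and store a fresh copy.
        self.misses += 1
        value = self._loader(key)
        self._entries[key] = (now + self._ttl, value)
        return value
```

The tradeoff shows up in `ttl_seconds`: a longer TTL means more hits but staler data.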

CDN (Content Delivery Network)

A globally distributed network of edge servers that cache static content close to end users to minimize latency and origin load.

Horizontal Scaling

Adding more machines to a system to handle increased load, as opposed to making a single machine more powerful.

Latency

The time delay between a request being sent and a response being received — typically measured in milliseconds.

Load Balancer

A component that distributes incoming network traffic across multiple backend servers to maximize throughput, minimize response time, and avoid overload.
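
One of the simplest distribution strategies is round-robin. The sketch below assumes a fixed list of healthy backends; the class name `RoundRobinBalancer` is made up for the example, and real load balancers typically add health checks and weighting.

```python
import itertools


class RoundRobinBalancer:
    """Hands out backends in rotation (illustrative; no health checks)."""

    def __init__(self, backends):
        backends = list(backends)
        if not backends:
            raise ValueError("at least one backend is required")
        self._cycle = itertools.cycle(backends)

    def pick(self):
        # Each call returns the next backend, wrapping around at the end.
        return next(self._cycle)
```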

Rate Limiting

A control mechanism that caps the number of requests a client can make in a given time window to protect a service from abuse and overload.
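
A fixed-window counter is one straightforward way to implement such a cap. The sketch below is a minimal in-memory version; `FixedWindowLimiter` and its parameters are hypothetical names, and other algorithms (token bucket, sliding window) make different tradeoffs around bursts at window boundaries.

```python
import time


class FixedWindowLimiter:
    """Allows at most `limit` requests per client per fixed time window."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self._limit = limit
        self._window = window_seconds
        self._clock = clock
        self._state = {}  # client_id -> (window_index, count)

    def allow(self, client_id):
        window = int(self._clock() // self._window)
        seen_window, count = self._state.get(client_id, (window, 0))
        if seen_window != window:
            count = 0  # a new window has started; reset the counter
        if count >= self._limit:
            return False
        self._state[client_id] = (window, count + 1)
        return True
```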

Throughput

The number of operations a system can handle per unit of time, often measured in requests per second (RPS) or queries per second (QPS).

Vertical Scaling

Increasing the capacity of a single machine — more CPU, memory, or disk — to handle more load.

Distributed Systems

CAP Theorem

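A principle stating that a distributed data store can guarantee at most two of three properties at once: Consistency, Availability, and Partition tolerance. Because network partitions cannot be ruled out in practice, the real choice arises during a partition, when the system must give up either consistency or availability.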

Consensus

The process by which a group of distributed nodes agree on a single value or sequence of values, even in the presence of failures.

Consistent Hashing

A hashing technique that minimizes the amount of data that needs to be moved when nodes are added to or removed from a distributed system.
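
A common way to realize this is a hash ring with virtual nodes. The sketch below is a simplified version; the `HashRing` class, the `vnodes` parameter, and the choice of MD5 for placement are assumptions for illustration only.

```python
import bisect
import hashlib


def _hash(key):
    # MD5 is used only to spread points around the ring, not for security.
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Consistent-hash ring with virtual nodes (illustrative sketch)."""

    def __init__(self, nodes=(), vnodes=100):
        self._vnodes = vnodes
        self._ring = []  # sorted list of (point, node)
        for node in nodes:
            self.add(node)

    def add(self, node):
        for i in range(self._vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove(self, node):
        self._ring = [(p, n) for p, n in self._ring if n != node]

    def lookup(self, key):
        if not self._ring:
            raise LookupError("ring is empty")
        # First point at or after the key's hash, wrapping to the start.
        idx = bisect.bisect_left(self._ring, (_hash(key),))
        if idx == len(self._ring):
            idx = 0
        return self._ring[idx][1]
```

When a node is added, the only keys that change owner are the ones that move to the new node; every other key stays where it was.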

Eventual Consistency

A consistency model where, given enough time and no new updates, all replicas of a piece of data will converge to the same value.
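
One simple way replicas can converge is last-write-wins merging. The sketch below assumes each value carries a timestamp that is unique per write; the `merge` function and the data layout are hypothetical, and real systems often use more robust schemes such as version vectors.

```python
def merge(local, remote):
    """Merge two replica states of the form key -> (timestamp, value).

    Assumes timestamps are unique per write; ties are not resolved here.
    """
    merged = dict(local)
    for key, (ts, value) in remote.items():
        if key not in merged or ts > merged[key][0]:
            merged[key] = (ts, value)
    return merged
```

Because the merge gives the same result in either direction, two replicas that exchange state end up with identical data once updates stop.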

Leader Election

A protocol by which a group of nodes selects one node as the coordinator or "leader" responsible for a given task.

Quorum

The minimum number of nodes that must agree on an operation for it to be considered successful in a distributed system.
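
Two common rules of thumb can be written down directly: a majority quorum is more than half the nodes, and read and write quorums overlap when R + W > N. The helper names below are made up for this sketch.

```python
def majority(n):
    """Smallest number of nodes that is more than half of n."""
    return n // 2 + 1


def quorums_overlap(n, r, w):
    """True if every read quorum of size r shares a node with every write quorum of size w."""
    return r + w > n
```

For example, with N = 3 replicas, R = 2 and W = 2 overlap, while R = 1 and W = 1 do not, so a read could miss the latest write.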

Replication

Maintaining multiple copies of the same data across different nodes for fault tolerance, read scalability, and lower latency.

Sharding

Splitting a large dataset across multiple machines so that each shard holds a subset of the data and handles a subset of the load.
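
A minimal sketch of hash-based shard routing follows. The function name `shard_for` is hypothetical, and the modulo scheme shown here remaps most keys when the shard count changes, which is one reason systems reach for consistent hashing instead.

```python
import hashlib


def shard_for(key, num_shards):
    """Map a string key to a shard index in [0, num_shards)."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    # A stable hash is used so every process routes the same key identically
    # (Python's built-in hash() is randomized per process for strings).
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards
```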

Data & Storage

ACID

Atomicity, Consistency, Isolation, Durability — the four properties that traditional database transactions guarantee.

Database Indexing

A data structure (typically a B-tree or hash table) that lets a database find rows matching a query without scanning the entire table.
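
The difference between a full scan and an index lookup can be shown with an in-memory stand-in. The sketch below models a hash index as a dictionary from column value to row positions; the function names are hypothetical and no real database engine is involved.

```python
from collections import defaultdict


def scan(rows, column, value):
    """Full table scan: inspects every row."""
    return [row for row in rows if row[column] == value]


def build_index(rows, column):
    """Hash index: column value -> list of row positions."""
    index = defaultdict(list)
    for position, row in enumerate(rows):
        index[row[column]].append(position)
    return index


def lookup(rows, index, value):
    """Index lookup: touches only the matching rows."""
    return [rows[position] for position in index.get(value, [])]
```

The index has to be updated on every insert or change to the indexed column, which is the write-side cost of faster reads.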

Denormalization

Intentionally duplicating data across tables to avoid expensive joins and improve read performance, at the cost of write complexity.

Object Storage

A storage architecture that manages data as objects (file + metadata + ID) in a flat namespace, optimized for huge amounts of unstructured data.

SQL vs NoSQL

A choice between relational databases with strict schemas and ACID guarantees and non-relational databases optimized for scale, flexibility, or specialized workloads.

Time-Series Database

A database optimized for storing and querying timestamped data points — ideal for metrics, sensor data, financial ticks, and events.

Communication & APIs

GraphQL

A query language for APIs that lets clients request exactly the fields they need in a single request, reducing the over- and under-fetching common with fixed-shape endpoints.

gRPC

A high-performance RPC framework using HTTP/2, Protocol Buffers, and code generation for type-safe, low-latency service-to-service communication.

Message Queue

A buffer that holds messages between producers and consumers, enabling asynchronous processing and decoupling of services.
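
As an in-process stand-in for a real broker, the standard library's `queue.Queue` shows the same producer/consumer decoupling. The `run_pipeline` function and the doubling "work" are placeholders invented for this sketch.

```python
import queue
import threading


def run_pipeline(items):
    """Producer enqueues items; a single consumer thread processes them."""
    q = queue.Queue(maxsize=10)  # bounded buffer: producer blocks when full
    results = []

    def consumer():
        while True:
            item = q.get()
            if item is None:      # sentinel: no more work
                break
            results.append(item * 2)  # placeholder for real processing

    worker = threading.Thread(target=consumer)
    worker.start()
    for item in items:
        q.put(item)
    q.put(None)
    worker.join()
    return results
```

The producer never calls the consumer directly; it only needs the queue to accept the message.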

Pub/Sub

A messaging pattern where publishers emit messages to topics without knowing who consumes them, and subscribers receive messages from topics they care about.
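
The pattern can be sketched with a tiny in-memory broker. The `Broker` class and its methods are hypothetical and deliver synchronously; production systems typically deliver asynchronously and persist messages.

```python
from collections import defaultdict


class Broker:
    """Minimal in-memory topic broker (illustrative only)."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of handlers

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        # The publisher does not know who, if anyone, is listening.
        handlers = list(self._subscribers.get(topic, []))
        for handler in handlers:
            handler(message)
        return len(handlers)
```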

REST API

An architectural style for web APIs based on HTTP verbs (GET, POST, PUT, DELETE) acting on resources identified by URLs.

Webhook

An HTTP callback that one system sends to another to notify it of an event, enabling push-style integrations between services.

WebSocket

A persistent, bidirectional communication channel between client and server over a single TCP connection, widely used for real-time web features.

Reliability & Resilience

Circuit Breaker

A pattern that stops calls to a failing downstream service for a cool-off period to prevent cascading failures and give the service time to recover.
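
A minimal sketch of the pattern follows, assuming a consecutive-failure threshold and a fixed cool-off. The names `CircuitBreaker` and `CircuitOpenError` and the default values are illustrative assumptions, not taken from any specific library.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is skipped because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, cooldown_seconds=30.0, clock=time.monotonic):
        self._threshold = failure_threshold
        self._cooldown = cooldown_seconds
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self._cooldown:
                raise CircuitOpenError("circuit open; call skipped")
            # Cool-off elapsed: fall through and allow one trial call.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._opened_at is not None or self._failures >= self._threshold:
                self._opened_at = self._clock()  # open, or re-open after a failed trial
            raise
        # Success closes the circuit and resets the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```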

Graceful Degradation

Designing a system so that when a component fails, the rest of the system continues to operate with reduced functionality rather than failing completely.

Idempotency

A property of operations such that performing them multiple times has the same effect as performing them once — essential for safe retries.
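
One common way to make a non-idempotent operation safe to retry is an idempotency key supplied by the caller. The sketch below uses a hypothetical `PaymentService` with an in-memory store; a real service would persist keys and results durably.

```python
class PaymentService:
    """Charges at most once per idempotency key (illustrative only)."""

    def __init__(self):
        self._results = {}   # idempotency_key -> receipt
        self.charges = []    # side effects actually performed

    def charge(self, idempotency_key, account, amount):
        # A repeated key returns the stored result instead of charging again.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        self.charges.append((account, amount))
        receipt = {"charge_id": len(self.charges), "account": account, "amount": amount}
        self._results[idempotency_key] = receipt
        return receipt
```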

Retry & Backoff

A reliability pattern that re-attempts failed operations after progressively longer delays, optionally with jitter, to ride out transient failures.
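
The sketch below shows exponential backoff with "full jitter", where each delay is drawn uniformly between zero and an exponentially growing cap. The `retry` function, its defaults, and the injectable `sleep` and `rng` parameters are assumptions made for this example.

```python
import random
import time


def retry(func, max_attempts=5, base_delay=0.1, max_delay=5.0,
          sleep=time.sleep, rng=random.random):
    """Call func(), retrying on any exception with jittered exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            cap = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(cap * rng())  # full jitter: uniform in [0, cap)
```

Retrying is only safe when the operation is idempotent, and in practice it is usually limited to errors known to be transient rather than every exception.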

SLA, SLO, SLI

Service Level Indicator (the metric), Service Level Objective (the target), and Service Level Agreement (the contract with consequences).
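
The relationship between the three can be made concrete with a little arithmetic. The helper names below are hypothetical, and the SLI shown (fraction of successful requests) is just one possible choice of metric.

```python
def availability_sli(good_requests, total_requests):
    """SLI: the measured fraction of requests that succeeded."""
    if total_requests == 0:
        return 1.0
    return good_requests / total_requests


def meets_slo(sli, slo_target):
    """SLO check: is the measured SLI at or above the target?"""
    return sli >= slo_target


def error_budget_requests(slo_target, total_requests):
    """How many requests may fail before the SLO is missed."""
    return (1.0 - slo_target) * total_requests
```

With a 99.9% SLO over one million requests, the error budget is roughly 1,000 failed requests; an SLA would then spell out what happens if that budget is exceeded.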

Architecture Patterns

API Gateway

A single entry point that routes external requests to internal services, handling concerns like authentication, rate limiting, and request transformation in one place.

CQRS

A pattern that separates the model used for writing data (commands) from the model used for reading data (queries), allowing each to be optimized independently.
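
A toy version of the split might look like the following. The class names are invented for this sketch, and the read model is updated synchronously for simplicity; many real CQRS systems update it asynchronously, which makes reads eventually consistent.

```python
class CustomerSpendReadModel:
    """Query side: a precomputed view optimized for one question."""

    def __init__(self):
        self._spend = {}  # customer -> running total

    def apply_order_placed(self, customer, total):
        self._spend[customer] = self._spend.get(customer, 0) + total

    def total_spend(self, customer):
        return self._spend.get(customer, 0)


class OrderWriteModel:
    """Command side: validates and records changes, then notifies the read side."""

    def __init__(self, read_model):
        self._orders = {}
        self._read_model = read_model

    def place_order(self, order_id, customer, total):
        if order_id in self._orders:
            raise ValueError(f"order {order_id} already exists")
        if total <= 0:
            raise ValueError("total must be positive")
        self._orders[order_id] = {"customer": customer, "total": total}
        self._read_model.apply_order_placed(customer, total)
```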

Event-Driven Architecture

An architectural style where services communicate primarily by emitting and reacting to events, rather than calling each other directly.

Microservices

An architectural style that structures an application as a collection of small, independently deployable services, each responsible for a specific business capability.

Monolith

A single deployable application containing all features and logic, sharing one codebase, one database, and one deployment unit.

Service Mesh

A dedicated infrastructure layer that handles service-to-service communication in a microservices architecture — encryption, retries, observability, traffic shaping — outside application code.
