Learn System Design — A Complete Guide

A structured path from zero to system design proficiency: the core concepts, the twelve building blocks, scalability patterns, a four-week study plan, and how to practice with AI-powered feedback on 122+ real problems.

What is system design?

System design is the discipline of defining the architecture, components, data flow, and trade-offs of a software system that must operate under real-world constraints — traffic spikes, partial failures, geographic latency, cost, and team scale. It sits between product requirements and code, and it determines whether a service can grow from one server to a billion users without being rewritten.

Why system design matters

Almost every senior software engineering interview at companies like Google, Meta, Amazon, Stripe, Uber, and Netflix includes a system design round. More importantly, the decisions you make in a design review — which database to pick, where to cache, how to partition data, how to handle retries — directly determine your service's reliability, cost, and the team's velocity for years afterward.

The 12 building blocks every engineer must know

Most large-scale systems are assembled from a small vocabulary of components. Master these and you can reason about almost any architecture:

Load balancers (L4 vs L7, sticky sessions, health checks)
Reverse proxies and API gateways
Stateless application servers and horizontal scaling
Relational databases (Postgres, MySQL) and ACID guarantees
NoSQL: key-value (Redis, DynamoDB), document (MongoDB), wide-column (Cassandra)
Caching tiers: client, CDN, edge, application, database
Message queues and event streams (Kafka, SQS, RabbitMQ)
Object storage (S3) and blob CDNs
Search infrastructure (Elasticsearch, OpenSearch)
Rate limiters, circuit breakers, and bulkheads
Observability: metrics, structured logs, distributed traces
Identity, authentication, and authorization layers
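As one illustration of the rate limiters named in the list above, here is a minimal token-bucket sketch. The class name TokenBucket, its parameters, and the injectable clock are illustrative assumptions, not any particular library's API; a production limiter would typically also need locking and shared state across servers.

```python
import time


class TokenBucket:
    """Token-bucket limiter: permits bursts up to `capacity` tokens,
    refilled continuously at `refill_rate` tokens per second."""

    def __init__(self, capacity, refill_rate, clock=time.monotonic):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.clock = clock              # injectable so the limiter can be tested
        self.tokens = float(capacity)   # start with a full bucket
        self.last_refill = clock()

    def allow(self, cost=1.0):
        """Return True and spend `cost` tokens if the request may proceed."""
        now = self.clock()
        elapsed = now - self.last_refill
        # Refill for the time that has passed, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


if __name__ == "__main__":
    bucket = TokenBucket(capacity=5, refill_rate=2.0)
    # The first 5 calls fit in the burst; the rest are rejected until refill.
    print([bucket.allow() for _ in range(7)])
```

The bucket size bounds the burst a client can send, while the refill rate bounds the sustained request rate; those are the two numbers worth stating explicitly in a design discussion.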

Scalability patterns you should be able to draw from memory

These are the recurring patterns that show up in nearly every advanced design problem:

Horizontal sharding by user ID, geography, or hash range
Read replicas with eventual consistency for read-heavy workloads
Write-through, write-back, and cache-aside caching strategies
CQRS: separating read and write models for performance
Event sourcing and the outbox pattern for reliable cross-service events
Consistent hashing for cache and shard rebalancing
Leader election and Raft / Paxos for coordination
Geo-replication, multi-region writes, and conflict resolution (CRDTs, last-write-wins)
Backpressure, retry with exponential backoff, and idempotent APIs
CDN edge compute and request collapsing
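Consistent hashing, from the list above, is easier to draw once you have seen it in code. The following is a minimal sketch that assumes MD5 as the placement hash and 100 virtual nodes per server; the HashRing class and the cache-a/cache-b/cache-c node names are hypothetical.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    # MD5 is used here only for its even spread, not for security.
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Consistent hash ring with virtual nodes."""

    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (position, node) pairs
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        # Each physical node owns `vnodes` positions to smooth the distribution.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove_node(self, node):
        self._ring = [(pos, n) for pos, n in self._ring if n != node]

    def get_node(self, key):
        """Return the first node clockwise from the key's position."""
        if not self._ring:
            raise LookupError("ring is empty")
        idx = bisect.bisect_left(self._ring, (_hash(key), ""))
        if idx == len(self._ring):
            idx = 0  # wrap around the ring
        return self._ring[idx][1]


if __name__ == "__main__":
    ring = HashRing(["cache-a", "cache-b", "cache-c"])
    print(ring.get_node("user:42"))
```

The property to point out when defending this design: removing a node reassigns only the keys that node owned, while every other key keeps its placement. A plain hash-modulo-N scheme would remap most keys when N changes.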

A 4-week study plan

If you only have a month before an interview or a real architectural decision at work, follow this plan:

Week 1 — Fundamentals: networking basics, HTTP/2 vs HTTP/3, TCP vs UDP, CAP theorem, ACID vs BASE, latency numbers every engineer should know.
Week 2 — Storage: choose the right database for the workload, indexing, sharding, replication, transaction isolation levels, and when to denormalize.
Week 3 — Scale: caching, queues, async processing, idempotency, rate limiting, and failure modes (timeouts, retries, dead-letter queues).
Week 4 — Practice: solve 10–15 problems on SystemCity end-to-end. Draw the architecture, defend each component choice, run AI evaluation, and iterate.
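The Week 3 topics of retries and idempotency fit together, and a small sketch can make the interaction concrete. Everything named below (TransientError, retry_with_backoff, PaymentService, the idempotency key scheme) is a hypothetical illustration, and the "full jitter" delay is one common choice among several.

```python
import random
import time
import uuid


class TransientError(Exception):
    """Hypothetical error type for failures worth retrying, such as a timeout."""


def retry_with_backoff(operation, max_attempts=5, base_delay=0.1,
                       max_delay=5.0, sleep=time.sleep):
    """Call `operation`, retrying transient failures with capped backoff."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the failure
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, cap))  # full jitter avoids synchronized retries


class PaymentService:
    """Toy server that deduplicates requests by idempotency key."""

    def __init__(self):
        self._results = {}
        self.charges_applied = 0

    def charge(self, idempotency_key, amount_cents):
        if idempotency_key in self._results:
            return self._results[idempotency_key]  # replay the stored result
        self.charges_applied += 1
        result = {"charge_id": self.charges_applied, "amount_cents": amount_cents}
        self._results[idempotency_key] = result
        return result


service = PaymentService()
key = str(uuid.uuid4())  # generated once, reused on every retry
calls = {"n": 0}


def flaky_charge():
    calls["n"] += 1
    result = service.charge(key, 1999)
    if calls["n"] < 3:
        # The server applied the charge, but the client never saw the response.
        raise TransientError("response lost")
    return result


# Sleeping is stubbed out so the demo finishes instantly.
result = retry_with_backoff(flaky_charge, sleep=lambda _seconds: None)
print(calls["n"], service.charges_applied, result)
```

The client calls the service three times, yet only one charge is applied, because every retry carries the same key. Without that key, each retry after a lost response would charge the customer again.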

How to actually practice

Reading books like Designing Data-Intensive Applications is necessary but not sufficient. You only internalize system design by repeatedly designing systems and getting feedback. Pick a problem, set a 45-minute timer, and produce: (1) clarifying questions, (2) functional and non-functional requirements, (3) capacity estimates, (4) high-level architecture, (5) data model, (6) API design, (7) deep dive on one component, (8) bottleneck analysis. Then submit your design to AI evaluation and review the gaps.

Frequently asked questions

How long does it take to learn system design?

A focused engineer with 2–3 years of backend experience can reach interview-ready proficiency in 4–8 weeks of consistent practice. Reaching senior staff-level depth — being able to design novel infrastructure from first principles — typically takes 2–3 years of operating real production systems alongside continuous study.

Do I need to memorize specific numbers like RPS or storage costs?

You need to memorize a small set of latency and throughput baselines: memory access ~100ns, SSD read ~100µs, datacenter round trip ~500µs, cross-region round trip ~150ms. As rough orders of magnitude, a single Postgres instance can serve ~10k QPS and a single Redis instance ~100k QPS, though both figures vary widely with query shape and hardware. Beyond that, you should be able to derive estimates from first principles using daily active users, requests per user, and average payload size.
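A derivation from daily active users, requests per user, and payload size can be sketched as follows. The input figures (10 million DAU, 20 requests per user per day, 2 KB payloads) and the 3x peak-to-average factor are made-up assumptions chosen only to show the arithmetic.

```python
SECONDS_PER_DAY = 86_400


def estimate(daily_active_users, requests_per_user_per_day, avg_payload_bytes,
             peak_factor=3.0):
    """Back-of-envelope load estimate from three product-level inputs."""
    requests_per_day = daily_active_users * requests_per_user_per_day
    avg_qps = requests_per_day / SECONDS_PER_DAY
    bytes_per_day = requests_per_day * avg_payload_bytes
    return {
        "avg_qps": avg_qps,
        "peak_qps": avg_qps * peak_factor,  # assumed peak-to-average ratio
        "gb_per_day": bytes_per_day / 1e9,
        "avg_mbps": bytes_per_day * 8 / SECONDS_PER_DAY / 1e6,
    }


if __name__ == "__main__":
    for name, value in estimate(10_000_000, 20, 2_000).items():
        print(f"{name}: {value:,.1f}")
    # avg_qps: 2,314.8
    # peak_qps: 6,944.4
    # gb_per_day: 400.0
    # avg_mbps: 37.0
```

Under these assumptions the peak load of roughly 7k QPS sits just under the ~10k QPS baseline quoted above for a single Postgres instance. In an interview, that is the kind of observation that justifies adding a cache or read replicas early.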

Is system design only for backend engineers?

No. Frontend engineers face system design questions about offline-first apps, real-time collaboration, micro-frontends, and CDN strategy. Mobile engineers deal with sync, conflict resolution, and battery-aware networking. ML engineers design feature stores and inference serving systems. The principles transfer across all of these.

What is the difference between high-level design and low-level design?

High-level design (HLD) is the architectural diagram: services, databases, queues, and how requests flow between them. Low-level design (LLD) is the class diagrams, API contracts, table schemas, and algorithms inside one component. Most senior interviews lead with HLD then drill into one LLD area.

Should I draw diagrams or write code?

For system design interviews, prioritize diagrams plus pseudo-code for tricky algorithms (rate limiter, consistent hash ring, leader election). On SystemCity the canvas lets you drag real components onto a board so you focus on the architecture rather than fighting drawing tools.

Start practicing now

Pick any problem to design end-to-end on the live canvas with AI evaluation.

Design a Nested Comments System
Design a Network Connection Path Analyzer
Design a URL Shortening Service
Design a Collaborative Online Spreadsheet
Design a Database Batch Auditing Service
Design an Employee Swap System
Design a Weather Reporting System
Design a Digital Distribution Platform
Design a Conference Room Booking System
Design an Efficient Parking Lot System
Design a Vending Machine System
Design an Airport Baggage Handling System

Browse all 122 problems
