SystemCity

AI-powered system design tutor. Learn architecture, ace interviews, build real systems.

© 2026 SystemCity. All rights reserved.


Scalability & Performance

CDN (Content Delivery Network)

Also known as: Content Delivery Network, Edge Network

A globally distributed network of edge servers that cache static content close to end users to minimize latency and origin load.

In depth

A Content Delivery Network is a layer of geographically distributed servers — called edge nodes or points of presence (PoPs) — that sit between your origin servers and end users. When a user requests a cacheable asset (image, video, JavaScript bundle, API response), the CDN serves it from the nearest edge instead of the origin.
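The edge-lookup flow above can be sketched in a few lines. This is a minimal illustration, not how any real CDN is implemented: the `EdgeNode` class, the dict-backed cache, and the `fetch_from_origin` callable are all assumptions made for the example, and real edge nodes add eviction, revalidation, and tiered caching on top.

```python
import time

class EdgeNode:
    """Toy edge node: serve from local cache if fresh, else fetch from origin."""

    def __init__(self, fetch_from_origin, default_ttl=300):
        self.fetch_from_origin = fetch_from_origin  # called only on a cache miss
        self.default_ttl = default_ttl              # seconds an entry stays fresh
        self.cache = {}                             # path -> (body, expires_at)

    def get(self, path):
        entry = self.cache.get(path)
        if entry is not None:
            body, expires_at = entry
            if time.monotonic() < expires_at:
                return body, "HIT"                  # served from the edge; origin untouched
        body = self.fetch_from_origin(path)          # miss or stale: go to origin
        self.cache[path] = (body, time.monotonic() + self.default_ttl)
        return body, "MISS"
```

The first request for a path is a MISS that populates the edge cache; every request for the same path within the TTL is a HIT that never touches the origin, which is the core latency and offload win described above.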

CDNs reduce latency because data travels a shorter physical distance. They also absorb traffic spikes, protect against DDoS attacks, and offload static traffic from your origin so it can focus on dynamic requests. Modern CDNs like Cloudflare, Fastly, and CloudFront also support edge compute, letting you run code at the edge for personalization, A/B testing, or auth checks.
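As a sketch of the edge-compute pattern, A/B assignment can be done entirely at the edge by hashing a user identifier, so no round trip to the origin is needed to decide which variant to serve. The experiment name, the 50/50 split, and the bucket labels here are assumptions for illustration, not any particular platform's API.

```python
import hashlib

def ab_bucket(user_id, experiment="homepage_v2"):
    """Deterministically assign a user to a bucket using a stable hash."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    # Even hash values go to the variant, odd to control: a fixed 50/50 split.
    return "variant" if int(digest, 16) % 2 == 0 else "control"
```

Because the hash is deterministic, the same user always lands in the same bucket on every edge node, with no shared state between PoPs.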

Popular use cases: serving images and videos for a media site, caching API responses with short TTLs, accelerating SaaS dashboards globally, and serving the static shell of a single-page application.

When to use

Use a CDN whenever your users are geographically dispersed and you serve any static or cacheable content. For most modern web apps, this is essentially always.

Tradeoffs

CDNs add another caching layer to invalidate. Misconfigured cache headers can serve stale or even private data publicly. There is also a per-GB cost that can become significant for video-heavy applications.
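The private-data risk comes down to `Cache-Control` values, which tell shared caches like CDN edges what they may store. The sketch below shows one way to pick headers per response; the directives are standard HTTP, but the path categories and TTL numbers are assumptions chosen for the example.

```python
def cache_control_for(path, is_personalized):
    """Pick a Cache-Control header so a shared cache never stores private data."""
    if is_personalized:
        # "private" forbids shared caches (CDN edges) from storing the response.
        return "private, no-store"
    if path.startswith("/static/"):
        # Fingerprinted static assets are safe to cache aggressively.
        return "public, max-age=31536000, immutable"
    # Short-TTL API responses: browsers revalidate every time (max-age=0),
    # while the CDN may serve its copy for 30 seconds (s-maxage=30).
    return "public, max-age=0, s-maxage=30"
```

The failure mode described above is exactly the first branch going missing: a personalized response marked `public` gets cached at an edge and served to the next user who requests the same URL.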

Related terms

Caching

Storing copies of frequently accessed data in fast memory so that subsequent requests can be served without recomputing or refetching.

Latency

The time delay between a request being sent and a response being received — typically measured in milliseconds.

Load Balancer

A component that distributes incoming network traffic across multiple backend servers to maximize throughput, minimize response time, and avoid overload.
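The simplest distribution strategy a load balancer can use is round robin, sketched below with a hypothetical backend list; production balancers layer health checks, weights, and connection counting on top of this.

```python
import itertools

class RoundRobinBalancer:
    """Minimal round-robin balancer: rotate through backends in order."""

    def __init__(self, backends):
        self._cycle = itertools.cycle(backends)  # endless rotation over backends

    def pick(self):
        return next(self._cycle)                 # each call yields the next backend
```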

Reverse Proxy

A server that sits in front of one or more backend servers and forwards client requests to them, often handling TLS, caching, compression, and load balancing.

Horizontal Scaling

Adding more machines to a system to handle increased load, as opposed to making a single machine more powerful.

Vertical Scaling

Increasing the capacity of a single machine — more CPU, memory, or disk — to handle more load.

Practice this concept

  • Design a Web Cache (Medium · Infrastructure)
  • Design YouTube (Medium · Streaming)
  • Design a Global Content Distribution Network (Hard · Networking)
  • Design a Cloud Storage Gateway (Advanced · Infrastructure)