Scalability & Performance
Also known as: Content Delivery Network, Edge Network
A globally distributed network of edge servers that cache static content close to end users to minimize latency and origin load.
A Content Delivery Network is a layer of geographically distributed servers — called edge nodes or points of presence (PoPs) — that sit between your origin servers and end users. When a user requests a cacheable asset (image, video, JavaScript bundle, API response), the CDN serves it from the nearest edge instead of the origin.
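The edge lookup described above can be sketched as a cache-then-origin fallback. This is a minimal illustration, not a real CDN API: `EdgeNode`, `origin`, and the TTL value are all assumed names for the sketch.

```python
import time

class EdgeNode:
    """Illustrative edge node: serve from local cache on a hit,
    otherwise fetch from the origin and keep a copy."""

    def __init__(self, fetch_from_origin):
        self._cache = {}                 # path -> (body, expires_at)
        self._fetch = fetch_from_origin  # callable standing in for the origin

    def get(self, path, ttl=60):
        entry = self._cache.get(path)
        if entry and entry[1] > time.time():
            return entry[0], "HIT"       # served from the edge
        body = self._fetch(path)         # cache miss: go to the origin
        self._cache[path] = (body, time.time() + ttl)
        return body, "MISS"

origin_hits = []

def origin(path):
    origin_hits.append(path)             # track how often the origin is touched
    return f"contents of {path}"

edge = EdgeNode(origin)
edge.get("/logo.png")   # first request: MISS, origin is contacted
edge.get("/logo.png")   # second request: HIT, origin is not touched again
```

The point of the sketch is the asymmetry: only the first request per PoP pays the full round trip to the origin; every subsequent request within the TTL is served locally.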
CDNs reduce latency because data travels a shorter physical distance. They also absorb traffic spikes, protect against DDoS attacks, and offload static traffic from your origin so it can focus on dynamic requests. Modern CDNs like Cloudflare, Fastly, and Amazon CloudFront also support edge compute, letting you run code at the edge for personalization, A/B testing, or authentication checks.
Popular use cases: serving images and videos for a media site, caching API responses with short TTLs, accelerating SaaS dashboards globally, and serving the static shell of a single-page application.
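The "API responses with short TTLs" case above can be sketched as a small TTL cache in front of an expensive handler. `ttl_cache` and `top_stories` are hypothetical names invented for this illustration, not a library API:

```python
import time
from functools import wraps

def ttl_cache(ttl_seconds):
    """Cache a function's results for ttl_seconds, keyed by its arguments --
    roughly what a CDN does when a response sets Cache-Control: max-age=N."""
    def decorator(fn):
        store = {}
        @wraps(fn)
        def wrapper(*args):
            now = time.time()
            hit = store.get(args)
            if hit and now - hit[1] < ttl_seconds:
                return hit[0]            # fresh cached copy: skip the handler
            result = fn(*args)
            store[args] = (result, now)
            return result
        return wrapper
    return decorator

calls = 0

@ttl_cache(ttl_seconds=5)
def top_stories(page):
    global calls
    calls += 1                           # counts how often the "origin" runs
    return [f"story-{page}-{i}" for i in range(3)]

top_stories(1)
top_stories(1)  # within the TTL: served from cache, handler not re-run
```

A short TTL (seconds, not hours) keeps the data acceptably fresh while still collapsing a burst of identical requests into a single origin hit.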
Use a CDN whenever your users are geographically dispersed and you serve any static or cacheable content. For most modern web apps, this is essentially always.
CDNs add another caching layer to invalidate. Misconfigured cache headers can serve stale content, or worse, expose private per-user data publicly. There is also a per-GB egress cost that can become significant for video-heavy applications.
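One way to avoid the stale- and private-data pitfalls is to choose `Cache-Control` values explicitly per response type rather than relying on defaults. The paths and policies below are illustrative assumptions for the sketch, not universal rules:

```python
def cache_headers(path, is_private):
    """Pick a Cache-Control header for a response (illustrative policy)."""
    if is_private:
        # Never let a shared cache (the CDN) store user-specific responses.
        return {"Cache-Control": "private, no-store"}
    if path.endswith((".js", ".css", ".png")):
        # Fingerprinted static assets can be cached aggressively and forever.
        return {"Cache-Control": "public, max-age=31536000, immutable"}
    # Everything else: a short shared-cache TTL keeps invalidation lag small.
    return {"Cache-Control": "public, max-age=60"}

cache_headers("/account/settings", is_private=True)
cache_headers("/static/app.8f3a.js", is_private=False)
```

The key distinction is `private` vs `public`: `private` tells shared caches like a CDN to stay out, which is exactly the directive that prevents one user's data from being served to another.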
Related terms:
Caching: Storing copies of frequently accessed data in fast memory so that subsequent requests can be served without recomputing or refetching.
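As a minimal illustration of serving repeat requests without recomputing, Python's `functools.lru_cache` memoizes a function's results in memory; `report` and the counter are made up for the example:

```python
from functools import lru_cache

calls = 0

@lru_cache(maxsize=None)
def report(day):
    global calls
    calls += 1                        # the expensive computation runs once per key
    return f"report for {day}"

report("2024-01-01")
report("2024-01-01")                  # second call is served from memory
```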
Latency: The time delay between a request being sent and a response being received — typically measured in milliseconds.
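Measured directly, latency is the elapsed wall-clock time between sending a request and receiving its response. The helper below is a hypothetical sketch using a simulated 10 ms "request":

```python
import time

def measure_latency_ms(request_fn):
    """Time a request callable and return the elapsed milliseconds."""
    start = time.perf_counter()
    request_fn()
    return (time.perf_counter() - start) * 1000.0

# A stand-in request that takes roughly 10 ms.
latency = measure_latency_ms(lambda: time.sleep(0.01))
```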
Load balancer: A component that distributes incoming network traffic across multiple backend servers to maximize throughput, minimize response time, and avoid overload.
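A minimal sketch of one common distribution strategy, round-robin, assuming three hypothetical backends:

```python
import itertools

class RoundRobinBalancer:
    """Illustrative load balancer: hand out backends in rotating order."""

    def __init__(self, backends):
        self._cycle = itertools.cycle(backends)

    def pick(self):
        return next(self._cycle)

lb = RoundRobinBalancer(["app-1", "app-2", "app-3"])
picks = [lb.pick() for _ in range(6)]
# Requests rotate evenly across the three backends.
```

Real load balancers layer health checks, weighting, and connection counting on top of this, but the even rotation is the core idea.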
Reverse proxy: A server that sits in front of one or more backend servers and forwards client requests to them, often handling TLS, caching, compression, and load balancing.
Horizontal scaling: Adding more machines to a system to handle increased load, as opposed to making a single machine more powerful.
Vertical scaling: Increasing the capacity of a single machine — more CPU, memory, or disk — to handle more load.