Architecture Patterns
A single entry point that routes external requests to internal services, handling concerns like authentication, rate limiting, and request transformation in one place.
An API gateway sits at the edge of a system and is the single entry point for all external clients. It routes incoming requests to the appropriate internal service, but it also centralizes cross-cutting concerns: TLS termination, authentication and token validation, rate limiting, request and response transformation, request logging, response caching, and aggregation of multiple internal calls into one client response.
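The routing and authentication behavior described above can be sketched in a few lines. This is a minimal in-process illustration, not how any particular gateway product works: the `Gateway` class, the stand-in service functions, the token set, and the longest-prefix routing rule are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Set


# Hypothetical stand-ins for internal services; a real gateway would
# forward the request over the network instead of calling a function.
def orders_service(path: str) -> str:
    return f"orders handled {path}"


def users_service(path: str) -> str:
    return f"users handled {path}"


@dataclass
class Response:
    status: int
    body: str


class Gateway:
    def __init__(self, routes: Dict[str, Callable[[str], str]], valid_tokens: Set[str]):
        self.routes = routes
        self.valid_tokens = valid_tokens

    def handle(self, path: str, token: Optional[str]) -> Response:
        # Cross-cutting concern: authenticate once, at the edge.
        if token not in self.valid_tokens:
            return Response(401, "unauthorized")
        # Routing: try the longest path prefix first so that more
        # specific routes win over shorter ones.
        for prefix in sorted(self.routes, key=len, reverse=True):
            if path.startswith(prefix):
                return Response(200, self.routes[prefix](path))
        return Response(404, "no route")


gateway = Gateway(
    routes={"/orders": orders_service, "/users": users_service},
    valid_tokens={"secret-token"},
)
```

The internal service functions contain no authentication logic; the check happens once in `handle`, which is the point of centralizing it at the edge.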
In a microservices architecture, the gateway lets internal services stay focused on their business logic without each one re-implementing auth, throttling, and logging. It also gives the team a place to enforce contract changes without coordinating across many services. Common implementations: Kong, AWS API Gateway, Apigee, Tyk, Envoy-based gateways, and NGINX with custom config.
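Throttling is one of the concerns a gateway can take over from individual services. The sketch below shows a fixed-window limiter, which is only one of several common algorithms (token bucket and sliding window are others). The class name, its parameters, and the injectable clock are assumptions for illustration, and the sketch never evicts old windows, which a production limiter would need to do.

```python
import time
from collections import defaultdict
from typing import Callable, Dict, Tuple


class FixedWindowRateLimiter:
    """Allow at most `limit` requests per client in each fixed time window."""

    def __init__(self, limit: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the behavior can be tested
        # (client_id, window index) -> requests seen in that window.
        # Old windows are never removed in this sketch.
        self._counts: Dict[Tuple[str, int], int] = defaultdict(int)

    def allow(self, client_id: str) -> bool:
        window = int(self.clock() // self.window_seconds)
        key = (client_id, window)
        if self._counts[key] >= self.limit:
            return False  # over the cap for this window
        self._counts[key] += 1
        return True
```

A gateway would typically call `allow` before routing and answer with HTTP 429 when it returns `False`.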
A related pattern is BFF (Backend For Frontend): a gateway-like layer per client type (one for web, one for iOS, one for Android) that tailors API responses to the specific client's needs.
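To make the idea of per-client tailoring concrete, here is a small sketch in which two BFF functions shape the same internal record differently. The product record, its field names, and the choice of which fields each client receives are hypothetical.

```python
from typing import Any, Dict

# Hypothetical record as an internal product service might return it.
PRODUCT: Dict[str, Any] = {
    "id": 7,
    "name": "Lamp",
    "description": "A long marketing description for the product page.",
    "price_cents": 2599,
    "image_urls": ["small.jpg", "large.jpg"],
}


def format_price(cents: int) -> str:
    # Render integer cents as a decimal string, e.g. 2599 -> "25.99".
    return f"{cents // 100}.{cents % 100:02d}"


def web_view(product: Dict[str, Any]) -> Dict[str, Any]:
    # The web client renders a full product page, so it gets everything.
    return {
        "id": product["id"],
        "name": product["name"],
        "description": product["description"],
        "price": format_price(product["price_cents"]),
        "images": list(product["image_urls"]),
    }


def mobile_view(product: Dict[str, Any]) -> Dict[str, Any]:
    # The mobile client shows a compact card: fewer fields, one image.
    return {
        "id": product["id"],
        "name": product["name"],
        "price": format_price(product["price_cents"]),
        "thumbnail": product["image_urls"][0],
    }
```

Each view can evolve with its client without changing the internal service's response.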
Add an API gateway as soon as you have more than one backend service exposed to clients, or when cross-cutting concerns are being duplicated across services.
The gateway is a single point of failure and a potential bottleneck, so it must be highly available and horizontally scaled. Over-aggregation in the gateway can recreate the monolith problem at the edge.
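Horizontal scaling usually means running several gateway instances and spreading traffic across them. A minimal sketch of round-robin selection, one simple distribution strategy, follows. The class name and instance names are hypothetical, and the sketch does no health checking, which a highly available setup would require.

```python
import itertools
from typing import Iterable


class RoundRobinBalancer:
    """Hand out backends in a fixed rotating order."""

    def __init__(self, backends: Iterable[str]):
        backend_list = list(backends)
        if not backend_list:
            raise ValueError("at least one backend is required")
        # itertools.cycle repeats the sequence endlessly.
        self._cycle = itertools.cycle(backend_list)

    def next_backend(self) -> str:
        return next(self._cycle)


# Hypothetical gateway instance names.
balancer = RoundRobinBalancer(["gateway-1", "gateway-2", "gateway-3"])
```

Successive calls to `next_backend` walk through the instances in order and then wrap around to the first.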
An architectural style that structures an application as a collection of small, independently deployable services, each responsible for a specific business capability.
A control mechanism that caps the number of requests a client can make in a given time window to protect a service from abuse and overload.
A dedicated infrastructure layer that handles service-to-service communication in a microservices architecture (encryption, retries, observability, traffic shaping) outside application code.
A component that distributes incoming network traffic across multiple backend servers to maximize throughput, minimize response time, and avoid overload.
A single deployable application containing all features and logic, sharing one codebase, one database, and one deployment unit.
An architectural style where services communicate primarily by emitting and reacting to events, rather than calling each other directly.
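The difference between emitting events and calling services directly can be shown with a small in-process publish/subscribe sketch. The `EventBus` class, the event name, and the handlers are assumptions for illustration; real systems usually place a message broker between publisher and subscribers and deliver asynchronously.

```python
from typing import Any, Callable, Dict, List

Handler = Callable[[Dict[str, Any]], None]


class EventBus:
    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = {}

    def subscribe(self, event_type: str, handler: Handler) -> None:
        self._handlers.setdefault(event_type, []).append(handler)

    def publish(self, event_type: str, payload: Dict[str, Any]) -> int:
        # The publisher does not know who is listening; it only emits.
        handlers = self._handlers.get(event_type, [])
        for handler in handlers:
            handler(payload)
        return len(handlers)  # number of handlers that were notified
```

The order service in such a setup would publish an event like `order.placed` and stay unaware of the billing or shipping services that react to it.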