Distributed Systems
Eventual Consistency: A consistency model where, given enough time and no new updates, all replicas of a piece of data will converge to the same value.
Eventual consistency is a relaxation of strong consistency. Instead of guaranteeing that every read returns the most recent write, the system promises only that, in the absence of new writes, all replicas will eventually converge. The convergence window is typically milliseconds in healthy systems but can stretch to seconds or longer during failures.
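As a concrete illustration of convergence, the sketch below simulates two replicas reconciling through a last-write-wins anti-entropy pass. The `Replica` class and its methods are hypothetical, not any particular system's API; real systems converge through mechanisms such as gossip, hinted handoff, or read repair.

```python
import time

class Replica:
    """A toy replica storing (value, timestamp) pairs per key."""
    def __init__(self, name):
        self.name = name
        self.store = {}  # key -> (value, timestamp)

    def write(self, key, value):
        self.store[key] = (value, time.time())

    def read(self, key):
        entry = self.store.get(key)
        return entry[0] if entry else None

    def sync_from(self, other):
        # Anti-entropy pass: adopt any entry with a newer timestamp.
        for key, (value, ts) in other.store.items():
            if key not in self.store or self.store[key][1] < ts:
                self.store[key] = (value, ts)

a, b = Replica("a"), Replica("b")
a.write("user:1", "alice")    # the write lands on replica a only
print(b.read("user:1"))       # None: b is stale for now
b.sync_from(a)                # a background anti-entropy pass runs
print(b.read("user:1"))       # "alice": the replicas have converged
```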
Many large-scale systems are built on eventual consistency because it provides much better availability and lower latency than strong consistency. Amazon DynamoDB, Cassandra, DNS, and most CDN caches are eventually consistent by default. The tradeoff is that clients must tolerate occasionally seeing stale data — a refresh might show the previous value before the new one appears.
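DynamoDB makes this tradeoff explicit per request: `GetItem` defaults to an eventually consistent read, and the `ConsistentRead` flag opts into a strongly consistent one at higher cost. A minimal boto3 sketch, assuming a hypothetical `products` table keyed by `sku` and already-configured AWS credentials:

```python
import boto3

# Hypothetical table named "products" with partition key "sku"; assumes
# AWS credentials and region are already configured for boto3.
table = boto3.resource("dynamodb").Table("products")

# Default GetItem: eventually consistent, cheaper, may return stale data.
maybe_stale = table.get_item(Key={"sku": "A-100"}).get("Item")

# Strongly consistent read: reflects the latest successful write, at
# roughly double the read-capacity cost of the default.
fresh = table.get_item(Key={"sku": "A-100"}, ConsistentRead=True).get("Item")
```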
There are stricter variants: read-your-writes consistency (a client always sees its own writes), monotonic reads (a client never sees data move backward in time), and causal consistency (operations that are causally related are seen in order by everyone).
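A minimal sketch of how a client can enforce monotonic reads by carrying a session version token; the `Replica` and `Session` classes here are hypothetical. The same token, attached to the client's own writes, also yields read-your-writes.

```python
class Replica:
    def __init__(self, data):
        self.data = data  # key -> (version, value)

    def get(self, key):
        return self.data[key]

class Session:
    """Client-side session enforcing monotonic reads via a version token."""
    def __init__(self):
        self.last_seen = 0  # highest version observed in this session

    def read(self, replica, key):
        version, value = replica.get(key)
        if version < self.last_seen:
            # This replica lags behind what the session has already seen:
            # reject it rather than let reads move backward in time.
            raise RuntimeError("replica too stale for this session")
        self.last_seen = version
        return value

fresh = Replica({"k": (2, "new")})
stale = Replica({"k": (1, "old")})
s = Session()
print(s.read(fresh, "k"))   # "new"; the session is now at version 2
print(s.read(stale, "k"))   # raises: would violate monotonic reads
```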
Use eventual consistency for caches, view counters, social-network feeds, product catalogs, recommendations, and any data where slight staleness is harmless.
Eventually consistent systems are easier to scale and remain available during partitions, but force application code to handle stale reads, conflicts, and reconciliation logic.
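A sketch of two common reconciliation strategies an application might apply when replicas diverge; both functions are illustrative, not a library API. Last-write-wins is simple but silently discards one side, while a merge that unions sets (the classic Dynamo shopping-cart example) preserves every addition, at the cost of needing tombstones to make deletes stick.

```python
def merge_last_write_wins(a, b):
    """Resolve by timestamp; each side is a (value, timestamp) pair."""
    return a if a[1] >= b[1] else b

def merge_cart(a, b):
    """Resolve by set union: never loses an added item."""
    return a | b

print(merge_last_write_wins(("blue", 10), ("green", 12)))  # ('green', 12)
print(merge_cart({"book", "pen"}, {"book", "mug"}))        # all three items
```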
CAP Theorem: A principle stating that a distributed data store can provide at most two of: Consistency, Availability, and Partition tolerance. Since network partitions cannot be ruled out in practice, the real choice a system makes during a partition is between consistency and availability.
Replication: Maintaining multiple copies of the same data across different nodes for fault tolerance, read scalability, and lower latency.
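A minimal sketch of asynchronous primary-follower replication, using hypothetical `Primary` and `Follower` classes: the primary acknowledges a write as soon as it is logged locally and ships it to followers, which apply it later; that lag is exactly what eventual consistency exposes to readers.

```python
from collections import deque

class Follower:
    def __init__(self):
        self.pending = deque()  # entries shipped but not yet applied
        self.log = []

    def receive(self, entry):
        self.pending.append(entry)

    def catch_up(self):
        while self.pending:
            self.log.append(self.pending.popleft())

class Primary:
    def __init__(self, followers):
        self.log = []
        self.followers = followers

    def write(self, entry):
        self.log.append(entry)      # acknowledged once logged locally
        for f in self.followers:    # shipped asynchronously; followers
            f.receive(entry)        # apply on their own schedule

f1, f2 = Follower(), Follower()
p = Primary([f1, f2])
p.write("x=1")
print(f1.log)      # []: the follower has received but not applied the entry
f1.catch_up()
print(f1.log)      # ['x=1']: the follower has caught up
```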
Consensus: The process by which a group of distributed nodes agree on a single value or sequence of values, even in the presence of failures.
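Real consensus protocols such as Paxos and Raft handle leader changes, retries, and message loss; the toy function below shows only the core decision rule they share: a value is decided once a strict majority of nodes accepts it, so any two deciding majorities must overlap.

```python
from collections import Counter

def decided(votes, cluster_size):
    """Return the decided value once a strict majority accepts it, else None."""
    if not votes:
        return None
    value, count = Counter(votes.values()).most_common(1)[0]
    return value if count > cluster_size // 2 else None

# Five nodes: three acceptances decide "v1", and the decision survives even
# if the other two nodes crash, since any future majority includes a voter.
print(decided({"n1": "v1", "n2": "v1", "n3": "v1", "n4": "v2"}, 5))  # v1
print(decided({"n1": "v1", "n2": "v2"}, 5))                          # None
```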
Leader Election: A protocol by which a group of nodes selects one node as the coordinator or "leader" responsible for a given task.
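A deliberately naive sketch: every node applies the same deterministic rule (the lowest ID among reachable nodes leads) to the same membership view. Production systems such as ZooKeeper, Raft, and etcd add epochs or terms plus quorum checks so that two nodes can never both believe they are the leader.

```python
def elect_leader(members, alive):
    """Deterministic rule: the lowest-ID reachable member becomes leader.
    Every node with the same view of 'alive' elects the same node."""
    candidates = sorted(m for m in members if m in alive)
    return candidates[0] if candidates else None

members = [3, 7, 12, 25]
print(elect_leader(members, alive={3, 7, 12, 25}))  # 3 leads
print(elect_leader(members, alive={7, 12, 25}))     # 7 takes over when 3 fails
```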
Quorum: The minimum number of nodes that must agree on an operation for it to be considered successful in a distributed system.
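Quorum sizes are usually chosen so that read and write quorums must intersect. A worked check of the standard rule, assuming N replicas, W write acknowledgements, and R read acknowledgements:

```python
def quorums_overlap(n, w, r):
    """With W write acks and R read acks out of N replicas, any read
    quorum intersects any write quorum exactly when W + R > N."""
    return w + r > n

# A common Dynamo-style setting: N=3, W=2, R=2. Every read overlaps the
# latest successful write because the two quorums must share a replica.
print(quorums_overlap(3, 2, 2))  # True
print(quorums_overlap(3, 1, 1))  # False: reads may miss the latest write
```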
Sharding: Splitting a large dataset across multiple machines so that each shard holds a subset of the data and handles a subset of the load.
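A minimal sketch of hash-based shard routing; the function name is illustrative. A stable hash (rather than Python's per-process salted `hash()`) keeps every client routing the same key to the same shard.

```python
import hashlib

def shard_for(key, num_shards):
    """Route a key to a shard via a stable hash of the key."""
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % num_shards

print(shard_for("user:42", 8))  # the same key always lands on the same shard
```

Note that modulo routing remaps most keys whenever the shard count changes; consistent hashing or range-based partitioning is typically used when shards must be added or removed smoothly.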