Distributed Systems
Leader Election: A protocol by which a group of nodes selects one node as the coordinator or "leader" responsible for a given task.
Leader election is the mechanism by which a distributed system designates one node as the leader for some scope of work. The leader typically handles writes, coordinates state changes, or schedules work, while followers replicate or stand by. Examples: a single MongoDB primary, the etcd cluster's leader, the active NameNode in an HDFS high-availability pair, and the leader replica of a Kafka partition.
Election is usually built on consensus: nodes propose themselves, a quorum agrees, and the winner holds the role for a bounded "term" or "lease". When the leader fails (detected by missed heartbeats), a new election kicks off and another node takes over. Lease-based election combined with fencing tokens guards against split brain, the scenario where two nodes both believe they are the leader: even if a stale leader keeps acting, downstream resources reject any request that carries an older fencing token.
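To make the fencing idea concrete, here is a minimal sketch (not from any particular system; FencedStore and its Write method are invented for illustration) of a shared resource that remembers the highest fencing token it has seen and refuses writes carrying an older one:

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// FencedStore is a hypothetical shared resource that accepts writes only
// from the holder of the newest fencing token it has observed.
type FencedStore struct {
	mu           sync.Mutex
	highestToken int64
	data         map[string]string
}

func NewFencedStore() *FencedStore {
	return &FencedStore{data: make(map[string]string)}
}

var ErrStaleLeader = errors.New("rejected: fencing token is older than one already seen")

// Write applies a key/value update tagged with the caller's fencing token.
// A token lower than the highest seen means the caller's lease has been
// superseded by a newer leader, so the write is refused.
func (s *FencedStore) Write(token int64, key, value string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if token < s.highestToken {
		return ErrStaleLeader
	}
	s.highestToken = token
	s.data[key] = value
	return nil
}

func main() {
	store := NewFencedStore()

	// Leader holding token 34 writes successfully.
	fmt.Println(store.Write(34, "config", "v1")) // <nil>

	// A new leader is elected and receives token 35.
	fmt.Println(store.Write(35, "config", "v2")) // <nil>

	// The old leader wakes up after a long pause and retries with token 34.
	fmt.Println(store.Write(34, "config", "v3")) // error: stale fencing token
}
```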
Leader election simplifies many distributed problems by reducing them to single-node problems: only one node accepts writes, so you avoid concurrent-write conflicts entirely.
Use leader election whenever you need a single coordinator: primary database replicas, scheduled job runners, cluster managers, distributed lock services.
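As one concrete route, etcd's Go client ships a concurrency package with a ready-made Election helper. The sketch below assumes an etcd endpoint at localhost:2379; the key prefix and candidate name are placeholders:

```go
package main

import (
	"context"
	"log"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
	"go.etcd.io/etcd/client/v3/concurrency"
)

func main() {
	// Connect to a local etcd cluster (endpoint is a placeholder).
	cli, err := clientv3.New(clientv3.Config{Endpoints: []string{"localhost:2379"}})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	// A session holds a lease; if this process dies, the lease expires
	// and leadership is released automatically.
	sess, err := concurrency.NewSession(cli, concurrency.WithTTL(10))
	if err != nil {
		log.Fatal(err)
	}
	defer sess.Close()

	// All candidates campaign on the same key prefix; exactly one wins.
	election := concurrency.NewElection(sess, "/elections/job-runner")

	ctx := context.Background()
	if err := election.Campaign(ctx, "node-1"); err != nil {
		log.Fatal(err)
	}
	log.Println("elected leader; running scheduled jobs")

	// ... do leader-only work here (e.g. run the cron scheduler) ...
	time.Sleep(30 * time.Second)

	// Step down cleanly so another candidate can take over immediately.
	if err := election.Resign(ctx); err != nil {
		log.Fatal(err)
	}
}
```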
A single leader is a write bottleneck (all writes funnel through it) and an availability concern (if a new election takes seconds, the system cannot accept writes during that window). Multi-leader and leaderless designs accept extra complexity, such as conflict resolution, in exchange for higher write availability.
Consensus: The process by which a group of distributed nodes agree on a single value or sequence of values, even in the presence of failures.
Quorum: The minimum number of nodes that must agree on an operation for it to be considered successful in a distributed system.
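For the common majority quorum, the required count is floor(N/2) + 1, which also determines how many simultaneous failures a cluster can survive. A small sketch (the helper name is ours):

```go
package main

import "fmt"

// majorityQuorum returns the smallest number of nodes that constitutes
// a strict majority of a cluster of n nodes.
func majorityQuorum(n int) int {
	return n/2 + 1
}

func main() {
	for _, n := range []int{3, 5, 7} {
		// A cluster of n nodes can tolerate n - quorum failures.
		q := majorityQuorum(n)
		fmt.Printf("cluster of %d: quorum %d, tolerates %d failures\n", n, q, n-q)
	}
}
```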
Replication: Maintaining multiple copies of the same data across different nodes for fault tolerance, read scalability, and lower latency.
CAP Theorem: A principle stating that a distributed data store can provide at most two of: Consistency, Availability, and Partition tolerance.
Eventual Consistency: A consistency model where, given enough time and no new updates, all replicas of a piece of data will converge to the same value.
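As a toy illustration of how replicas can converge, the sketch below merges two copies with a last-write-wins rule keyed on a timestamp; the Record type is invented for this example, and real systems often need richer conflict handling (vector clocks, CRDTs):

```go
package main

import "fmt"

// Record is a hypothetical replicated value tagged with the time it was written.
type Record struct {
	Value     string
	Timestamp int64 // e.g. milliseconds since epoch, or a logical clock
}

// merge applies a last-write-wins rule: whichever copy carries the newer
// timestamp survives, so all replicas converge once they exchange records.
func merge(a, b Record) Record {
	if b.Timestamp > a.Timestamp {
		return b
	}
	return a
}

func main() {
	replicaA := Record{Value: "blue", Timestamp: 100}
	replicaB := Record{Value: "green", Timestamp: 105}

	// After the replicas exchange records, both hold the same value.
	fmt.Println(merge(replicaA, replicaB)) // {green 105}
	fmt.Println(merge(replicaB, replicaA)) // {green 105}
}
```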
Sharding (Partitioning): Splitting a large dataset across multiple machines so that each shard holds a subset of the data and handles a subset of the load.
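A simple placement scheme hashes the key and takes it modulo the shard count, as sketched below with Go's standard FNV hash. Note that plain modulo placement reshuffles most keys when the shard count changes, which is why many systems prefer consistent hashing:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// shardFor maps a record key to one of numShards shards by hashing the key.
func shardFor(key string, numShards uint32) uint32 {
	h := fnv.New32a()
	h.Write([]byte(key))
	return h.Sum32() % numShards
}

func main() {
	keys := []string{"user:1001", "user:1002", "order:77"}
	for _, k := range keys {
		fmt.Printf("%s -> shard %d\n", k, shardFor(k, 4))
	}
}
```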