Understanding Multileader Replication Topologies in System Design
Chapter 1: Introduction to Multileader Architectures
In system design interviews, multileader architectures come up in several topologies. In a multileader setup, more than one node accepts writes, and every leader has to pass the writes it accepts on to the others. A replication topology describes the communication paths along which a local write (a write accepted at a specific leader) propagates to the other leader nodes.
Section 1.1: The Basics of Replication Topologies
With two leaders, only one topology is possible: each leader sends its writes to the other. With three or more leaders, several configurations become possible, and they differ in how many messages they generate and how well they tolerate node failures. This article covers the common topologies for exchanging writes among leaders.
Subsection 1.1.1: All-to-All Topology
In an all-to-all topology, a leader disseminates its writes to every other leader in the network. The advantage of this setup is that each leader node receives updates from all other leaders, ensuring that the failure of one leader does not disrupt the propagation of writes among the remaining leaders.
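The fault-tolerance claim above can be illustrated with a minimal in-memory sketch. The `MeshLeader` class, the `build_mesh` helper, and the `alive` flag are hypothetical names invented for this example, and replication is modeled as a direct dictionary update rather than a network call.

```python
from typing import Dict, List


class MeshLeader:
    """Hypothetical leader in an all-to-all (mesh) topology."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.peers: List["MeshLeader"] = []
        self.store: Dict[str, str] = {}
        self.alive = True
        self.sent = 0  # number of replication messages this leader has sent

    def local_write(self, key: str, value: str) -> None:
        # Apply the write locally, then send it to every other leader.
        self.store[key] = value
        for peer in self.peers:
            self.sent += 1
            if peer.alive:  # a failed peer simply misses the message
                peer.store[key] = value


def build_mesh(names: List[str]) -> List[MeshLeader]:
    leaders = [MeshLeader(n) for n in names]
    for leader in leaders:
        leader.peers = [p for p in leaders if p is not leader]
    return leaders


a, b, c, d = build_mesh(["A", "B", "C", "D"])
b.alive = False            # leader B fails
a.local_write("x", "1")    # A still reaches C and D directly
print(c.store, d.store, a.sent)
```

In this sketch a single write at A costs three messages (one per peer), and the failure of B does not stop C and D from receiving it.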
However, the downside is that each node must send its local writes to every other leader, so with n leaders every write is transmitted n - 1 times, which increases the volume of data sent across the network. A more significant issue is that writes travelling along different links can overtake one another, for example when a single client can submit writes to multiple leaders. For instance:
- A client sends a write request W1 to leader A.
- Simultaneously, the same client submits a second write request W2 to leader B.
- Leader B sends W2 to leader C for replication before leader A can send W1.
- Subsequently, leader A sends W1 to leader C, even after leader C has already processed W2 from leader B.
As a result, leader C may apply the two writes in a different order than other leaders do. This is a problem if both writes target the same record, because replicas can end up disagreeing on its final value. Potential solutions include routing all writes for a given record to a single leader, or using version vectors. A version vector keeps one counter per leader; comparing two vectors shows whether one update happened before the other or whether the two were concurrent. This lets replicas track causality instead of relying on arrival order.
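To make the version-vector idea concrete, here is a minimal sketch of vector comparison applied to the W1/W2 scenario above. The `compare` and `merge` functions and the dictionary representation are illustrative assumptions, not the API of any particular database.

```python
from typing import Dict

VersionVector = Dict[str, int]  # leader name -> number of writes seen from it


def compare(a: VersionVector, b: VersionVector) -> str:
    """Return 'before', 'after', 'equal', or 'concurrent' for a relative to b."""
    nodes = set(a) | set(b)
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"


def merge(a: VersionVector, b: VersionVector) -> VersionVector:
    """Element-wise maximum: the vector of a replica that has seen both."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in set(a) | set(b)}


# W1 is accepted at leader A, W2 at leader B; neither leader has seen the other write.
w1 = {"A": 1}
w2 = {"B": 1}
print(compare(w1, w2))  # concurrent

# A later write W3 at leader A, made after A has received W2.
w3 = merge(w1, w2)
w3["A"] += 1
print(compare(w1, w3))  # before
```

Because the vectors for W1 and W2 are incomparable, leader C can tell that neither write causally precedes the other, whichever arrives first, and can treat them as a conflict to resolve rather than silently letting the later arrival win.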
Chapter 2: Circular and Star Topologies
Section 2.1: Circular Topology
Leader nodes can be arranged in a ring. In this configuration, each node receives writes from its predecessor and forwards them, together with its own local writes, to its successor. To prevent endless loops, each write is tagged with the identifiers of the nodes it has already passed through. If a node receives a write that carries its own identifier, it discards it.
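The loop-prevention rule can be sketched as follows. This is a simplified in-memory model: `Write`, `RingLeader`, `build_ring`, and the `seen_by` set are hypothetical names, and forwarding is a direct method call standing in for a network message.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Write:
    key: str
    value: str
    seen_by: Set[str] = field(default_factory=set)  # ids of nodes that processed it


class RingLeader:
    """Hypothetical leader in a circular topology."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.next: Optional["RingLeader"] = None
        self.store: Dict[str, str] = {}
        self.discarded = 0

    def local_write(self, key: str, value: str) -> None:
        # Apply locally, tag the write with our id, and pass it to the successor.
        self.store[key] = value
        self.next.receive(Write(key, value, {self.name}))

    def receive(self, write: Write) -> None:
        if self.name in write.seen_by:
            # The write has travelled the full ring; drop it to stop the loop.
            self.discarded += 1
            return
        self.store[write.key] = write.value
        write.seen_by.add(self.name)
        self.next.receive(write)


def build_ring(names: List[str]) -> List[RingLeader]:
    leaders = [RingLeader(n) for n in names]
    for i, leader in enumerate(leaders):
        leader.next = leaders[(i + 1) % len(leaders)]
    return leaders


a, b, c = build_ring(["A", "B", "C"])
a.local_write("x", "1")
print(b.store, c.store, a.discarded)
```

The write travels A → B → C and returns to A, which recognizes its own identifier and discards it instead of forwarding it again.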
One drawback of the circular topology is that a failure in any single node can disrupt the flow of replication messages throughout the system. While the topology can be adjusted to bypass the failed node, it requires manual intervention.
Section 2.2: Star Topology
In the star topology, nodes are arranged radially around a central leader node, which forwards replication messages to all other leaders. Like the circular topology, the star configuration has a single point of failure in the communication path: if the central node fails, writes can no longer travel between the remaining leaders.
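A minimal sketch of the star layout, showing both normal relaying and what happens when the central node fails. `StarLeader`, its `relay` method, and the `alive` flag are hypothetical names for this illustration, with replication again modeled as direct dictionary updates.

```python
from typing import Dict, List, Optional


class StarLeader:
    """Hypothetical leader in a star topology; the hub is the node with hub=None."""

    def __init__(self, name: str, hub: Optional["StarLeader"] = None) -> None:
        self.name = name
        self.hub = hub
        self.spokes: List["StarLeader"] = []  # only populated on the hub
        self.store: Dict[str, str] = {}
        self.alive = True
        if hub is not None:
            hub.spokes.append(self)

    def local_write(self, key: str, value: str) -> bool:
        """Apply locally; return True if the write was propagated to other leaders."""
        self.store[key] = value
        if self.hub is None:
            self._fan_out(key, value, origin=self.name)
            return True
        return self.hub.relay(key, value, origin=self.name)

    def relay(self, key: str, value: str, origin: str) -> bool:
        if not self.alive:
            return False  # hub is down: the write cannot reach the other spokes
        self.store[key] = value
        self._fan_out(key, value, origin)
        return True

    def _fan_out(self, key: str, value: str, origin: str) -> None:
        for spoke in self.spokes:
            if spoke.name != origin:
                spoke.store[key] = value


hub = StarLeader("hub")
b = StarLeader("B", hub)
c = StarLeader("C", hub)

ok = b.local_write("x", "1")      # relayed through the hub to C
hub.alive = False                 # central node fails
failed = b.local_write("y", "2")  # stays on B only
print(ok, failed, c.store)
```

After the hub goes down, B still accepts the write locally, but C never sees it, which is the interruption described above.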