Distributed Computing Study Guide
📖 Core Concepts
Distributed Computing – Study of systems whose components run on different networked computers and cooperate by passing messages.
Distributed Program / Programming – A program that executes across multiple nodes; the act of writing such programs.
Autonomous Entities – Each node has its own local memory and can fail independently.
No Global Clock – Algorithms must work without synchronized time; they rely on message ordering instead.
Message vs. Event –
Message: command, event, or document sent to a specific recipient (e.g., ProcessPayment).
Event: a state‑change notification broadcast without a designated receiver (e.g., PaymentProcessed).
Architectural Patterns – Event‑driven, Service‑Oriented (SOA), Publish–Subscribe, Peer‑to‑Peer (P2P).
Coupling Styles – Loose: independent services; Tight: components tightly bound.
Parallel vs. Distributed – Parallel: shared memory, tightly coupled. Distributed: private memory, message‑passing, loosely coupled.
Complexity Measures – Time (rounds), message count, bit count; models: LOCAL (rounds), CONGEST (bits/message size).
Linearizability – Operations appear to occur instantaneously at some point between invocation and response.
Reactive Manifesto – Systems should be Responsive, Resilient, Elastic, Message‑driven.
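The message/event distinction above can be made concrete with a minimal sketch. All names here (`Bus`, `ProcessPayment`, `PaymentProcessed`) are illustrative, echoing the examples in the definitions, not any real framework's API:

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Bus:
    # Events: broadcast to every subscriber of a topic; sender names no receiver.
    subscribers: dict = field(default_factory=dict)
    # Messages: routed to exactly one named handler.
    handlers: dict = field(default_factory=dict)

    def subscribe(self, topic: str, fn: Callable) -> None:
        self.subscribers.setdefault(topic, []).append(fn)

    def publish(self, topic: str, payload) -> None:
        for fn in self.subscribers.get(topic, []):   # zero or many receivers
            fn(payload)

    def register(self, name: str, fn: Callable) -> None:
        self.handlers[name] = fn

    def send(self, name: str, payload) -> None:
        self.handlers[name](payload)                 # exactly one receiver

bus = Bus()
log = []
bus.register("ProcessPayment", lambda p: log.append(("command", p)))
bus.subscribe("PaymentProcessed", lambda p: log.append(("audit", p)))
bus.subscribe("PaymentProcessed", lambda p: log.append(("email", p)))

bus.send("ProcessPayment", 42)       # message: one designated handler runs
bus.publish("PaymentProcessed", 42)  # event: every subscriber runs
```

Note the asymmetry: removing all subscribers makes `publish` a silent no-op, while `send` to an unregistered name is an error, which is exactly the coupling difference the definitions describe.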
---
📌 Must Remember
Delivery Semantics – At‑least‑once, at‑most‑once, exactly‑once (apply to both events and messages).
Tight vs. Loose Coupling – Tight ⇒ shared‑memory/disk; Loose ⇒ message‑driven, microservices.
NC Class – Solvable in polylogarithmic time with polynomial processors.
LOCAL Model – Complexity measured in synchronous communication rounds; with unbounded message sizes, any computable graph problem is solvable in ≤ 2·D rounds by full information gathering (D = network diameter).
CONGEST Model – Each message ≤ B bits; total bits transmitted matter.
Leader Election Goal – Elect a unique coordinator, usually the node with the highest identifier; minimize messages and rounds.
Event‑driven Advantage – Perfect for propagating state changes and decoupled notifications.
Message Advantage – Suited for explicit commands, workflow orchestration, and tight coordination.
---
🔄 Key Processes
Leader Election (generic ring/graph)
Identifier Exchange – Each node broadcasts its unique ID to neighbors.
Comparison – Nodes compare received IDs with their own.
Winner Selection – Node holding the highest ID declares itself leader.
Announcement – Leader sends a “leader‑found” message to inform all nodes.
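The four steps above can be simulated on a unidirectional ring. This is a simplified synchronous sketch in the spirit of the LCR/Chang–Roberts algorithm, not a faithful message-level implementation; `elect_leader` and its round structure are assumptions for illustration:

```python
def elect_leader(ids):
    """Simulate highest-ID leader election on a unidirectional ring.

    ids: unique node identifiers in ring order. Each round, every node
    forwards the largest ID it has seen to its clockwise neighbor
    (identifier exchange + comparison). After n rounds the maximum ID
    has reached every node, and its owner is the leader.
    """
    n = len(ids)
    best = list(ids)                       # largest ID each node has seen
    for _ in range(n):                     # n synchronous rounds
        incoming = [best[(i - 1) % n] for i in range(n)]
        best = [max(b, m) for b, m in zip(best, incoming)]
    leader = max(ids)
    assert all(b == leader for b in best)  # announcement: all nodes agree
    return leader
```

Because the maximum ID travels one hop per round, n rounds suffice on a ring of n nodes; real ring algorithms such as Chang–Roberts achieve the same outcome with explicit messages, discarding IDs smaller than a node's own.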
Gallager‑Humblet‑Spira (MST) – high‑level steps
Fragment Creation – Each node starts as a singleton fragment.
Find Minimum Outgoing Edge – Every fragment identifies its cheapest edge to another fragment.
Merge Fragments – Fragments connect via selected edges, forming larger fragments.
Iterate – Repeat steps 2‑3 until a single fragment spanning all nodes remains (the MST).
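GHS is essentially a distributed Borůvka's algorithm, so its fragment-merge logic can be sketched sequentially. This is a centralized skeleton of steps 1–4 (using a union-find to stand in for fragment membership), not the distributed protocol itself:

```python
def boruvka_mst(n, edges):
    """Borůvka-style fragment merging: the sequential core of GHS.

    edges: list of (weight, u, v) with distinct weights (GHS assumes
    unique weights, which makes the MST unique).
    """
    parent = list(range(n))                # union-find over fragments

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    mst = []
    while len(mst) < n - 1:
        # Step 2: each fragment finds its minimum outgoing edge.
        cheapest = {}
        for w, u, v in edges:
            fu, fv = find(u), find(v)
            if fu == fv:
                continue                   # internal edge, not outgoing
            for f in (fu, fv):
                if f not in cheapest or w < cheapest[f][0]:
                    cheapest[f] = (w, u, v)
        # Step 3: merge fragments along the selected edges.
        for w, u, v in cheapest.values():
            fu, fv = find(u), find(v)
            if fu != fv:
                parent[fu] = fv
                mst.append((w, u, v))
    return mst                             # step 4: loop until one fragment
```

Each outer iteration at least halves the number of fragments, which is why both Borůvka and GHS finish in O(log n) merge phases.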
Event‑Driven Flow (publish‑subscribe)
Publish – Producer posts an event to a topic.
Broker – Receives event, matches it to subscriber interests.
Dispatch – Asynchronously delivers event to all matching subscribers.
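The publish → broker → dispatch flow can be sketched as a toy broker that matches topics against subscriber interests. Pattern-based matching via `fnmatch` is an assumption for illustration, and dispatch here is a synchronous loop for clarity where a real broker delivers asynchronously:

```python
from fnmatch import fnmatch

class Broker:
    """Toy broker: subscribers register topic patterns; publish matches
    each event against those interests and dispatches to every match."""

    def __init__(self):
        self.interests = []                # (pattern, callback) pairs

    def subscribe(self, pattern, callback):
        self.interests.append((pattern, callback))

    def publish(self, topic, event):
        for pattern, callback in self.interests:
            if fnmatch(topic, pattern):    # match event to interests
                callback(topic, event)     # dispatch to each subscriber

broker = Broker()
seen = []
broker.subscribe("orders.*", lambda t, e: seen.append(("billing", t)))
broker.subscribe("orders.created", lambda t, e: seen.append(("email", t)))
broker.publish("orders.created", {"id": 1})
broker.publish("inventory.low", {"sku": "x"})  # no matching interest: dropped
```

The producer never names a receiver, and subscribers can come and go without the producer changing, which is the decoupling the pattern is chosen for.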
---
🔍 Key Comparisons
Events vs. Messages
Events: broadcast, decoupled, state‑change notification.
Messages: directed, command‑oriented, explicit coordination.
Parallel vs. Distributed
Parallel: shared memory, tightly coupled, processors see the same address space.
Distributed: private memory, loosely coupled, communication via messages.
Tight vs. Loose Coupling
Tight: shared‑disk or memory, components depend on each other’s internal details.
Loose: services interact via well‑defined messages/events, can evolve independently.
LOCAL vs. CONGEST Models
LOCAL: counts rounds only; message size unlimited.
CONGEST: limits each message to B bits; both rounds and bits matter.
---
⚠️ Common Misunderstandings
“Events are just messages.” – Events are broadcast notifications; messages are point‑to‑point commands.
“Distributed systems have a global clock.” – They explicitly lack synchronized time; algorithms avoid clock reliance.
“Exactly‑once delivery is always achievable.” – Requires extra coordination; many systems settle for at‑least‑once or at‑most‑once.
“Linearizability = sequential consistency.” – Linearizability also respects real‑time ordering; sequential consistency does not.
“All parallel algorithms belong to NC.” – NC requires polylogarithmic time; many parallel algorithms are slower.
---
🧠 Mental Models / Intuition
Distributed System = Team of independent workers – each knows only its own notebook; they pass notes (messages) to finish a group puzzle.
Reactive Manifesto mnemonic – think of a Responsive, Elastic system that self‑heals (Resilient) via Messages (Message‑driven).
LOCAL rounds ≈ “how many times the group can shout across the room” – each round lets information travel one hop.
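The "one hop per round" intuition can be checked with a tiny synchronous flooding simulation; the graph and function names are illustrative:

```python
def rounds_to_inform_all(adj, source):
    """Synchronous flooding: each round, every informed node tells its
    neighbors. The round count equals the source's eccentricity, which
    is at most the network diameter D — the basis of LOCAL-model
    round bounds."""
    informed = {source}
    frontier = {source}
    rounds = 0
    while len(informed) < len(adj):
        frontier = {v for u in frontier for v in adj[u]} - informed
        informed |= frontier
        rounds += 1
    return rounds

# A path 0–1–2–3 has diameter 3: flooding from an endpoint needs 3
# rounds, but only 2 from an interior node.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
```
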
---
🚩 Exceptions & Edge Cases
Delivery Guarantees – Exactly‑once often needs idempotent processing or two‑phase commit; not guaranteed by default.
Network Diameter Bound – 2·D rounds works if you can gather all information; high‑diameter topologies may make this impractical.
CONGEST Limit – When B is very small (e.g., 1‑bit messages), algorithms may need many more rounds than LOCAL.
Leader Election in Symmetric Topologies – In a perfectly symmetric ring with identical IDs, symmetry breaking must rely on randomness or external input.
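The idempotency point above can be sketched: at-least-once delivery plus idempotent processing yields an exactly-once *effect*. `IdempotentConsumer`, the message IDs, and the `apply` callback are hypothetical names for illustration:

```python
class IdempotentConsumer:
    """Deduplicate redelivered messages so each ID is applied once.

    In a real system the `seen` set would live in durable storage so
    deduplication survives consumer crashes."""

    def __init__(self, apply):
        self.apply = apply
        self.seen = set()

    def handle(self, msg_id, payload):
        if msg_id in self.seen:
            return False          # duplicate delivery: drop silently
        self.seen.add(msg_id)
        self.apply(payload)       # side effect happens exactly once per ID
        return True

total = []
consumer = IdempotentConsumer(total.append)
for msg_id, amount in [(1, 10), (2, 5), (1, 10)]:  # ID 1 redelivered
    consumer.handle(msg_id, amount)
```
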
---
📍 When to Use Which
Choose Events when you need decoupled propagation of state changes (e.g., UI updates, cache invalidation).
Choose Messages for explicit actions, request‑reply patterns, or workflow steps (e.g., payment processing).
Pick Publish–Subscribe if many independent consumers may appear/disappear over time.
Use Peer‑to‑Peer when no central authority is desired (file sharing, blockchain).
Apply LOCAL model for algorithmic analysis where message size isn’t a bottleneck.
Apply CONGEST model when bandwidth per link is constrained (wireless sensor networks).
Select NC algorithms only when you can guarantee polylogarithmic parallel depth; otherwise consider P‑time parallel algorithms.
---
👀 Patterns to Recognize
“Broadcast → Multiple Consumers” → Event‑driven / Publish‑Subscribe pattern.
“Command → Single Receiver” → Direct message (RPC, queue).
“2·D Rounds” → Any problem solved by full information gathering in the LOCAL model.
“Highest ID wins” → Classic leader election symmetry‑breaking pattern.
“Loose coupling + asynchronous messaging” → Microservices or reactive architecture.
---
🗂️ Exam Traps
Trap: “Exactly‑once delivery is guaranteed by the publish‑subscribe model.”
Why wrong: Guarantees depend on broker implementation; most systems only promise at‑least‑once.
Trap: “If a system is distributed, it must be loosely coupled.”
Why wrong: Distributed systems can be tightly coupled (e.g., shared‑disk clusters).
Trap: “NC means any parallel algorithm runs fast on many processors.”
Why wrong: NC requires polylogarithmic time; many parallel algorithms exceed this bound.
Trap: “Linearizability ensures operations occur in the order they were issued globally.”
Why wrong: It only guarantees each operation appears instantaneously somewhere between its call and response, not globally ordered issuance.
Trap: “Leader election always finishes in O(log n) messages.”
Why wrong: Message complexity depends on topology; rings may need O(n) messages, while complete graphs can be faster.