Distributed computing - Applications Benefits and Example Systems
Learn the performance, reliability, and manageability benefits of distributed systems and see real-world examples such as scientific computing, reactive architectures, and blockchain.
Applications and Benefits of Distributed Systems
Introduction
A distributed system is a collection of independent computers that work together to achieve a common goal. Unlike traditional single-machine systems, distributed systems spread computation and data across multiple physical locations connected by a network. Understanding when and why to use distributed systems, and how they work, is fundamental to modern computing.
When Distribution Is Required
Distribution becomes necessary when the users or data a system serves are physically separated, so that data must travel between locations. Consider a company with offices in different cities, each needing real-time access to shared data, or a social media platform serving users worldwide. In these cases, a single centralized computer cannot efficiently serve all users: the physical distance itself makes distribution not just beneficial, but essential.
Performance Advantages
Distributed systems can achieve what single machines cannot. By pooling resources across multiple computers, you gain:
Larger storage capacity: Store more data than any single machine could hold
Higher memory capacity: Process larger datasets with aggregate RAM
Faster computation: Parallelize work across multiple processors simultaneously
Higher bandwidth: Combine network connections for greater data throughput
For example, a search engine like Google doesn't run on one supercomputer—it distributes queries across thousands of servers, allowing it to search billions of web pages while responding to millions of users simultaneously.
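The idea of spreading one query across many servers can be illustrated with a minimal sketch. It is not how any real search engine is implemented: the `SHARDS` list, `search_shard`, and `search_all` are hypothetical names, and threads on one machine stand in for separate servers.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical shards: each list stands in for the slice of an index
# held by one server.
SHARDS = [
    ["distributed systems", "cloud computing"],
    ["grid computing", "volunteer computing"],
    ["cluster computing", "peer-to-peer networks"],
]

def search_shard(shard, term):
    """Return the documents in one shard that contain the search term."""
    return [doc for doc in shard if term in doc]

def search_all(term):
    """Send the query to every shard in parallel, then merge the results."""
    with ThreadPoolExecutor(max_workers=len(SHARDS)) as pool:
        partials = pool.map(lambda shard: search_shard(shard, term), SHARDS)
    return sorted(doc for part in partials for doc in part)

print(search_all("computing"))
```

Each shard only scans its own slice, so adding shards reduces the work per server. The merge step at the end is the price paid for splitting the data.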
Reliability Advantages
A single machine has a critical weakness: when it fails, the entire system goes down. Distributed systems address this through redundancy, so that a well-designed system has no single point of failure.
If one server crashes, others continue operating. Data can be replicated across multiple machines so that losing one copy doesn't mean losing the data entirely. This resilience is why banks, hospitals, and other critical services rely on distributed systems.
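A small sketch can make replication and failover concrete. It assumes an in-memory model in which the hypothetical `Replica` and `ReplicatedStore` classes stand in for real machines and a real storage client, and it ignores harder problems such as keeping replicas consistent after a failed replica comes back.

```python
class Replica:
    """One machine holding a full copy of the data (hypothetical stand-in)."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.alive = True

    def get(self, key):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        return self.data[key]


class ReplicatedStore:
    """Writes go to every live replica; reads fall back to the next replica."""

    def __init__(self, replicas):
        self.replicas = replicas

    def put(self, key, value):
        # Write to every live replica so each keeps its own copy.
        for replica in self.replicas:
            if replica.alive:
                replica.data[key] = value

    def get(self, key):
        # Try replicas in order and skip any that are unreachable.
        for replica in self.replicas:
            try:
                return replica.get(key)
            except ConnectionError:
                continue
        raise RuntimeError("all replicas are down")


store = ReplicatedStore([Replica("a"), Replica("b"), Replica("c")])
store.put("balance", 100)
store.replicas[0].alive = False   # simulate a crash of the first server
print(store.get("balance"))       # still served by a surviving replica
```

The read succeeds as long as at least one replica that holds the data is reachable, which is the sense in which replication removes the single point of failure.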
Manageability Advantages
Distributed systems can be easier to expand and maintain than monolithic systems. When you need more capacity, you can add another machine to the network rather than replacing an entire system. Updates can be rolled out gradually across machines without bringing everything down at once. This flexibility makes distributed systems more practical for organizations that grow and change over time.
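A gradual rollout can be sketched as follows. This is an illustrative model only: servers are plain dictionaries, and `rolling_update` is a hypothetical helper, not the interface of any real deployment tool.

```python
def rolling_update(servers, new_version, batch_size=1):
    """Upgrade servers one batch at a time so the rest keep serving traffic.

    Returns the smallest number of servers that were in service at any
    point during the rollout.
    """
    min_in_service = len(servers)
    for start in range(0, len(servers), batch_size):
        batch = servers[start:start + batch_size]
        for server in batch:
            server["in_service"] = False      # drain before upgrading
        in_service = sum(1 for s in servers if s["in_service"])
        min_in_service = min(min_in_service, in_service)
        for server in batch:
            server["version"] = new_version
            server["in_service"] = True       # return to the pool
    return min_in_service


servers = [{"version": "1.0", "in_service": True} for _ in range(4)]
print(rolling_update(servers, "1.1"))  # lowest in-service count seen
```

With four servers and a batch size of one, at least three servers stay in service throughout. A larger batch finishes sooner but leaves less capacity during the update.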
Example Distributed Systems
Distributed systems appear throughout scientific and technical computing:
Cluster computing: Groups of similar machines working together on shared problems
Grid computing: Loosely coupled computers across organizations sharing computational resources
Cloud computing: On-demand access to distributed computing and storage resources
Volunteer computing: Distributed projects (like SETI@home) using volunteers' spare computing power
Distributed rendering: Animation and graphics studios split rendering work across many machines to speed up production
Reactive Distributed Systems
Reactive Manifesto Principles
As distributed systems have grown more complex, the "Reactive Manifesto" emerged to define best practices. A reactive system exhibits four key characteristics:
Responsive: The system quickly acknowledges requests and responds in a timely manner, even under load.
Resilient: When failures occur, the system recovers gracefully. Problems in one component don't cascade to bring down the entire system.
Elastic: The system can scale up or down dynamically based on demand, adding or removing resources as needed.
Message-driven: Components communicate through asynchronous message passing rather than synchronous calls, allowing loose coupling and better failure isolation.
These principles reflect the reality that distributed systems must be designed to expect failures, handle varying loads, and communicate reliably across unreliable networks.
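The message-driven and resilient principles can be sketched with Python's `asyncio.Queue`, which here stands in for a real message broker. The order format and the function names are assumptions made for illustration.

```python
import asyncio

async def producer(queue, orders):
    """Send each order as a message without waiting for it to be handled."""
    for order in orders:
        await queue.put(order)
    await queue.put(None)  # sentinel: no more messages

async def consumer(queue, processed, failed):
    """Handle messages one at a time; a bad message is isolated, not fatal."""
    while True:
        order = await queue.get()
        if order is None:
            break
        try:
            if order["qty"] <= 0:
                raise ValueError("quantity must be positive")
            processed.append(order["id"])
        except ValueError:
            failed.append(order["id"])

async def main(orders):
    queue = asyncio.Queue(maxsize=2)  # a bounded queue applies back-pressure
    processed, failed = [], []
    await asyncio.gather(
        producer(queue, orders),
        consumer(queue, processed, failed),
    )
    return processed, failed

def run_demo():
    orders = [{"id": 1, "qty": 3}, {"id": 2, "qty": 0}, {"id": 3, "qty": 1}]
    return asyncio.run(main(orders))

print(run_demo())
```

The producer and consumer never call each other directly. The invalid order is recorded as failed while the other two are still processed, which is a small-scale version of failure isolation.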
Data Management in Distributed Systems
Distributed Database Allocation
When data is spread across multiple machines, a critical question emerges: where should each piece of data live? Optimal database allocation distributes data fragments to minimize access latency—the time it takes to retrieve data.
The key insight is that allocation strategies must consider:
Communication cost: The network overhead of fetching data from distant machines
Query patterns: Which queries access which data, and from which locations
For instance, if a company's sales database is primarily queried from the New York office but also used in London, smart allocation might store the full dataset in New York with a read-only copy (or frequently-accessed subset) in London. This reduces network traffic and speeds up the most common queries.
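The trade-off in the example above can be put into numbers with a small sketch. The query rates and latency figures below are invented for illustration, and the cost model counts only read latency. It ignores the cost of keeping copies in sync, which a real allocation strategy would also weigh.

```python
# Hypothetical inputs: reads per hour issued from each office, and the
# round-trip latency in milliseconds between offices (0 for local access).
QUERIES_PER_HOUR = {"new_york": 900, "london": 100}
LATENCY_MS = {
    ("new_york", "new_york"): 0,
    ("new_york", "london"): 70,
    ("london", "new_york"): 70,
    ("london", "london"): 0,
}

def total_cost(placement):
    """Sum read latency per hour, with each site using its nearest copy."""
    cost = 0
    for site, rate in QUERIES_PER_HOUR.items():
        nearest = min(LATENCY_MS[(site, holder)] for holder in placement)
        cost += rate * nearest
    return cost

def best_single_site():
    """Pick the one site that minimizes total read latency."""
    return min(QUERIES_PER_HOUR, key=lambda site: total_cost([site]))

print(best_single_site())
print(total_cost(["new_york"]), total_cost(["london"]))
print(total_cost(["new_york", "london"]))
```

Under these assumed numbers, placing the only copy in New York costs 7000 ms of read latency per hour against 63000 ms for London, and adding a London copy brings the read cost to zero.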
Peer-to-Peer and Blockchain Technologies
Peer-to-Peer Computing Principles
Most computer networks follow a client-server model: clients request services from centralized servers. Peer-to-peer (P2P) systems invert this: each node acts as both client and server simultaneously, sharing resources directly with other nodes.
This architecture enables:
Decentralized file sharing: Users share files directly without a central server (BitTorrent is a well-known example)
Distributed computation: Work is distributed among peers who contribute processing power
Content distribution: Popular content spreads through the network as peers share it with each other
P2P systems are particularly valuable because they avoid the bottleneck of a centralized server and can add capacity as more participants join, since each new peer contributes resources as well as demand.
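The dual client and server role can be sketched with an in-memory model. The `Peer` class below is hypothetical. It uses direct method calls in place of network messages and only asks immediate neighbors, which is far simpler than a real protocol such as BitTorrent.

```python
class Peer:
    """A node that both serves files it holds and requests files it lacks."""

    def __init__(self, name, files=None):
        self.name = name
        self.files = dict(files or {})
        self.neighbors = []

    def connect(self, other):
        # Links are symmetric: there is no dedicated server role.
        self.neighbors.append(other)
        other.neighbors.append(self)

    def serve(self, filename):
        """Server role: return the file contents, or None if not held."""
        return self.files.get(filename)

    def fetch(self, filename):
        """Client role: ask neighbors, then keep a copy to serve to others."""
        if filename in self.files:
            return self.files[filename]
        for peer in self.neighbors:
            content = peer.serve(filename)
            if content is not None:
                self.files[filename] = content
                return content
        return None


a = Peer("a", {"notes.txt": "hello"})
b = Peer("b")
c = Peer("c")
a.connect(b)
b.connect(c)
print(c.fetch("notes.txt"))  # None: c's only neighbor does not hold it yet
print(b.fetch("notes.txt"))  # b downloads from a and keeps a copy
print(c.fetch("notes.txt"))  # now c can get it from b
```

Once `b` has fetched the file it becomes a source for `c`, which shows how popular content spreads without any central server.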
Cryptocurrency and Blockchain
Bitcoin introduced a decentralized ledger: a permanent, tamper-evident record of transactions maintained not by a single authority, but by thousands of independent computers.
Blockchain technology secures this distributed ledger through two key mechanisms:
Cryptographic hashing: Each block contains a cryptographic hash (a fixed-size fingerprint) of the previous block, linking the blocks into a chain. Changing any past transaction would change that block's hash, so it would no longer match the value stored in the next block, revealing the tampering.
Consensus mechanisms: Before a new block is added, the network must agree it's valid. This makes it very difficult for any single malicious actor to forge transactions. Bitcoin uses "proof of work," where participants must solve difficult computational puzzles to earn the right to add blocks.
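The hash-linking idea can be sketched in a few lines with Python's standard `hashlib`. The block layout (a `tx` string plus a `prev_hash` field) and the function names are assumptions made for illustration and do not match the format of any real blockchain.

```python
import hashlib
import json

def block_hash(block):
    """SHA-256 over a canonical JSON encoding of the block."""
    encoded = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()

def build_chain(transactions):
    """Create blocks in which each one stores the hash of its predecessor."""
    chain = []
    prev = "0" * 64  # placeholder hash for the first block
    for tx in transactions:
        block = {"tx": tx, "prev_hash": prev}
        chain.append(block)
        prev = block_hash(block)
    return chain

def is_valid(chain):
    """Check that every block stores the hash of the block before it."""
    for earlier, later in zip(chain, chain[1:]):
        if later["prev_hash"] != block_hash(earlier):
            return False
    return True


chain = build_chain(["alice->bob:5", "bob->carol:2", "carol->dave:1"])
print(is_valid(chain))            # True
chain[0]["tx"] = "alice->bob:500"  # tamper with an old transaction
print(is_valid(chain))            # False
```

This check only covers blocks that have a successor. Protecting the newest block, and deciding which chain counts as the real one, is the job of the consensus mechanism.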
Together, these mechanisms allow strangers on the internet to maintain a shared, trustworthy ledger without requiring a central authority. This innovation extends far beyond cryptocurrency to supply chain tracking, smart contracts, and other applications requiring distributed trust.
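A proof-of-work puzzle of the kind mentioned above can be sketched as a search for a nonce. This is a deliberately simplified model: it uses a single SHA-256 over a string and counts leading zero hex digits, and the `mine` and `verify` names are hypothetical. Real Bitcoin mining differs in its block format and difficulty rule.

```python
import hashlib

def _digest(data, nonce):
    """Hash the block data together with a candidate nonce."""
    return hashlib.sha256(f"{data}{nonce}".encode("utf-8")).hexdigest()

def mine(data, difficulty):
    """Search for the first nonce whose hash starts with `difficulty` zeros."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = _digest(data, nonce)
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1

def verify(data, nonce, difficulty):
    """Checking a claimed solution takes a single hash."""
    return _digest(data, nonce).startswith("0" * difficulty)


nonce, digest = mine("block-1", 3)
print(nonce, digest)
print(verify("block-1", nonce, 3))
```

Finding the nonce takes many attempts on average, while checking it takes one hash. That asymmetry is what lets other nodes cheaply confirm that the work was done.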
<extrainfo>
Additional Context: The accompanying figure contrasts different system architectures. Panel (b) shows a true distributed system where each processor has its own local memory and they communicate over a network. Panel (c) shows a shared-memory multiprocessor where all processors access common memory. Understanding this distinction helps clarify why distributed systems face unique challenges: without shared memory, all communication requires explicit message passing across networks.
</extrainfo>
Flashcards
What are the four principles of the Reactive Manifesto?
Responsive
Resilient
Elastic
Message-driven
What is the primary goal of optimal database allocation in a distributed system?
To minimize access latency by distributing data fragments.
How do nodes function in a peer-to-peer (P2P) system?
Nodes act as both clients and servers to share resources directly.
By what primary mechanisms does blockchain technology secure data?
Cryptographic hashing
Consensus mechanisms
Quiz
Quiz Question 1: What are the four principles defined in the Reactive Manifesto?
- Responsive, resilient, elastic, and message‑driven (correct)
- Scalable, secure, modular, and deterministic
- Fast, reliable, simple, and portable
- Stateless, synchronous, centralized, and monolithic
Quiz Question 2: Which of the following describes a performance advantage of distributed systems?
- Larger storage and faster compute than a single machine (correct)
- Lower network bandwidth compared to a single server
- Reduced memory capacity across all nodes
- Increased latency due to centralized processing
Quiz Question 3: Cluster computing, grid computing, cloud computing, volunteer projects, and distributed rendering are examples of which application area of distributed systems?
- Scientific computing (correct)
- E‑commerce platforms
- Social networking services
- Multimedia streaming services
Quiz Question 4: Which two factors are typically considered in allocation strategies for distributing data fragments?
- Communication cost and query patterns (correct)
- Processor speed and memory size
- User interface design and operating system version
- Security protocols and licensing fees
Quiz Question 5: In a peer‑to‑peer system, how can a node function?
- It can act as both client and server (correct)
- It only requests data from a central server
- It only provides storage without processing
- It requires a dedicated supernode to coordinate
Key Concepts
Distributed Computing Models
Distributed system
Cloud computing
Cluster computing
Grid computing
Volunteer computing
Peer-to-peer computing
Distributed database
Blockchain and Cryptocurrency
Blockchain
Cryptocurrency
Reactive Systems
Reactive system
Definitions
Distributed system
A network of independent computers that work together to appear as a single coherent system.
Cloud computing
Delivery of computing resources such as servers, storage, and applications over the internet on demand.
Cluster computing
A set of tightly coupled computers that function as a single high-performance system for parallel processing.
Grid computing
A distributed architecture that enables the sharing of heterogeneous resources across multiple administrative domains.
Volunteer computing
A model where individuals donate idle processing power of personal devices to support large-scale scientific projects.
Reactive system
Software designed to be responsive, resilient, elastic, and message‑driven, handling events in real time.
Peer-to-peer computing
A decentralized network where each node can act as both client and server, sharing resources directly.
Blockchain
A tamper‑evident, distributed ledger that records transactions using cryptographic hashing and consensus protocols.
Cryptocurrency
A digital or virtual currency that uses cryptography and a blockchain to secure transactions without a central authority.
Distributed database
A database whose storage devices are spread across multiple locations, allowing data fragmentation and allocation to reduce access latency.