Distributed computing - Applications, Benefits, and Example Systems

Learn the performance, reliability, and manageability benefits of distributed systems and see real-world examples such as scientific computing, reactive architectures, and blockchain.

Summary

Applications and Benefits of Distributed Systems

Introduction

A distributed system is a collection of independent computers that work together to achieve a common goal. Unlike traditional single-machine systems, distributed systems spread computation and data across multiple machines, often in different physical locations, connected by a network. Understanding when and why to use distributed systems, and how they work, is fundamental to modern computing.

When Distribution Is Required

Distribution becomes necessary when data must physically travel between separated locations. Consider a company with offices in different cities, each needing real-time access to shared data, or a social media platform serving users worldwide. In these cases, a single centralized computer cannot efficiently serve all users: the physical distance itself makes distribution not just beneficial but essential.

Performance Advantages

Distributed systems can exceed the limits of any single machine. By pooling resources across multiple computers, you gain:

Larger storage capacity: store more data than any single machine could hold
Higher memory capacity: process larger datasets with aggregate RAM
Faster computation: parallelize work across multiple processors
Higher bandwidth: combine network connections for greater data throughput

For example, a search engine like Google does not run on one supercomputer. It distributes queries across thousands of servers, allowing it to search billions of web pages while responding to millions of users simultaneously.

Reliability Advantages

A single machine has a critical weakness: when it fails, the entire system goes down. Distributed systems address this through redundancy, so that a well-designed system has no single point of failure. If one server crashes, others continue operating. Data can be replicated across multiple machines so that losing one copy does not mean losing the data entirely. This resilience is why banks, hospitals, and other critical services rely on distributed systems.
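The replication idea described under Reliability Advantages can be illustrated with a minimal sketch. The `Replica` and `ReplicatedStore` classes below are hypothetical, in-memory stand-ins invented for this example. Real systems also have to deal with consistency, partial writes, and recovery, none of which is modeled here.

```python
class Replica:
    """Hypothetical in-memory storage node that can be marked as failed."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.alive = True

    def put(self, key, value):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        self.data[key] = value

    def get(self, key):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        return self.data[key]


class ReplicatedStore:
    """Writes every value to all reachable replicas; reads from the first that answers."""

    def __init__(self, replicas):
        self.replicas = replicas

    def put(self, key, value):
        written = 0
        for replica in self.replicas:
            try:
                replica.put(key, value)
                written += 1
            except ConnectionError:
                continue  # skip failed nodes, keep writing to the rest
        if written == 0:
            raise ConnectionError("no replica accepted the write")
        return written

    def get(self, key):
        for replica in self.replicas:
            try:
                return replica.get(key)
            except (ConnectionError, KeyError):
                continue  # try the next copy
        raise KeyError(key)


store = ReplicatedStore([Replica("a"), Replica("b"), Replica("c")])
store.put("balance", 100)
store.replicas[0].alive = False      # simulate one server crashing
value_after_failure = store.get("balance")  # still served by a surviving replica
```

With three copies, the read still succeeds after one node fails. The value is lost only if every replica holding it is down.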
Manageability Advantages

Distributed systems can be easier to expand and maintain than monolithic systems. When you need more capacity, you can add another machine to the network rather than replacing an entire system. Updates can be rolled out gradually across machines without bringing everything down at once. This flexibility makes distributed systems more practical for organizations that grow and change over time.

Example Distributed Systems

Distributed systems appear throughout scientific and technical computing:

Cluster computing: groups of similar machines working together on shared problems
Grid computing: loosely coupled computers across organizations sharing computational resources
Cloud computing: on-demand access to distributed computing and storage resources
Volunteer computing: distributed projects (like SETI@home) using volunteers' spare computing power
Distributed rendering: animation and graphics studios split rendering work across many machines to speed up production

Reactive Distributed Systems

Reactive Manifesto Principles

As distributed systems have grown more complex, the "Reactive Manifesto" emerged to define best practices. A reactive system exhibits four key characteristics:

Responsive: The system acknowledges requests quickly and responds in a timely manner, even under load.
Resilient: When failures occur, the system recovers gracefully. Problems in one component do not cascade and bring down the entire system.
Elastic: The system can scale up or down dynamically based on demand, adding or removing resources as needed.
Message-driven: Components communicate through asynchronous message passing rather than synchronous calls, allowing loose coupling and better failure isolation.

These principles reflect the reality that distributed systems must be designed to expect failures, handle varying loads, and communicate reliably across unreliable networks.
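The message-driven principle can be sketched with Python's standard `asyncio.Queue`. This is only an illustration within a single process: the `producer` and `consumer` coroutines, the order values, and the "negative order is invalid" rule are assumptions made up for the example, not part of the Reactive Manifesto.

```python
import asyncio


async def producer(queue, orders):
    """Sends each order as a message and does not wait for it to be processed."""
    for order in orders:
        await queue.put(order)   # blocks only if the bounded queue is full
    await queue.put(None)        # sentinel: no more messages


async def consumer(queue, processed):
    """Handles messages one at a time; a bad message does not stop the loop."""
    while True:
        message = await queue.get()
        if message is None:
            break
        try:
            if message < 0:
                raise ValueError("invalid order")  # hypothetical validation rule
            processed.append(message * 2)
        except ValueError:
            processed.append("rejected")  # failure stays isolated to this message


async def main():
    queue = asyncio.Queue(maxsize=2)  # bounded queue gives simple back-pressure
    processed = []
    await asyncio.gather(producer(queue, [1, -1, 3]), consumer(queue, processed))
    return processed


result = asyncio.run(main())
print(result)  # [2, 'rejected', 6]
```

The sender and receiver share only the queue, so either side can be replaced or scaled without the other knowing. In a real distributed system a message broker or network channel would take the queue's place.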
Data Management in Distributed Systems

Distributed Database Allocation

When data is spread across multiple machines, a critical question emerges: where should each piece of data live? Optimal database allocation distributes data fragments to minimize access latency, the time it takes to retrieve data. Allocation strategies must consider:

Communication cost: the network overhead of fetching data from distant machines
Query patterns: which queries access which data, and from which locations

For instance, if a company's sales database is primarily queried from the New York office but also used in London, a sensible allocation might store the full dataset in New York with a read-only copy (or a frequently accessed subset) in London. This reduces network traffic and speeds up the most common queries.

Peer-to-Peer and Blockchain Technologies

Peer-to-Peer Computing Principles

Most computer networks follow a client-server model: clients request services from centralized servers. Peer-to-peer (P2P) systems invert this: each node acts as both client and server, sharing resources directly with other nodes. This architecture enables:

Decentralized file sharing: users share files directly without a central server (BitTorrent is a well-known example)
Distributed computation: work is distributed among peers who contribute processing power
Content distribution: popular content spreads through the network as peers share it with each other

P2P systems are particularly valuable because they avoid bottlenecks at centralized servers and allow capacity to grow as more participants join.

Cryptocurrency and Blockchain

Bitcoin introduced a decentralized ledger: a permanent, tamper-evident record of transactions maintained not by a single authority but by thousands of independent computers.
Blockchain technology secures this distributed ledger through two key mechanisms:

Cryptographic hashing: Each block contains a cryptographic hash (a fixed-size fingerprint) of the previous block, so every block depends on all the blocks before it. Changing any past transaction would change that block's hash, which would no longer match the value stored in the next block, revealing the tampering.
Consensus mechanisms: Before a new block is added, the network must agree that it is valid. This makes it hard for any single malicious actor to forge transactions. Bitcoin uses "proof of work," where participants must solve difficult computational puzzles to earn the right to add blocks.

Together, these mechanisms allow strangers on the internet to maintain a shared, trustworthy ledger without requiring a central authority. The same idea has been applied beyond cryptocurrency to supply chain tracking, smart contracts, and other applications requiring distributed trust.

<extrainfo> Additional Context: The accompanying figure illustrates different system architectures. Panel (b) shows a true distributed system where each processor has its own local memory and the processors communicate over a network. Panel (c) shows a shared-memory multiprocessor where all processors access common memory. This distinction helps clarify why distributed systems face unique challenges: without shared memory, all communication requires explicit message passing across the network. </extrainfo>
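Returning to the two blockchain mechanisms described above, the following is a toy sketch of hash chaining and of a proof-of-work style puzzle. The block layout (`index`, `tx`, `prev_hash`, `nonce`), the JSON serialization, and the "hash must start with N zeros" rule are simplifying assumptions for illustration. They are not Bitcoin's actual block format or difficulty rule.

```python
import hashlib
import json

GENESIS_PREV = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(block):
    """SHA-256 over a deterministic JSON encoding of the block."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def make_chain(transactions):
    """Builds a list of blocks, each storing the hash of the one before it."""
    chain, prev = [], GENESIS_PREV
    for index, tx in enumerate(transactions):
        block = {"index": index, "tx": tx, "prev_hash": prev}
        chain.append(block)
        prev = block_hash(block)
    return chain


def is_valid(chain):
    """Checks that every stored prev_hash matches the recomputed hash."""
    prev = GENESIS_PREV
    for block in chain:
        if block["prev_hash"] != prev:
            return False
        prev = block_hash(block)
    return True


def proof_of_work(block, difficulty=2):
    """Toy puzzle: find a nonce so the block's hash starts with `difficulty` zeros."""
    nonce = 0
    while True:
        candidate = dict(block, nonce=nonce)
        if block_hash(candidate).startswith("0" * difficulty):
            return nonce
        nonce += 1


chain = make_chain(["alice->bob:5", "bob->carol:2", "carol->dave:1"])
valid_before = is_valid(chain)          # True: links are consistent

chain[0]["tx"] = "alice->mallory:500"   # tamper with an old transaction
valid_after = is_valid(chain)           # False: block 1's prev_hash no longer matches
```

Note that this check only detects edits to a block that has a successor. In a real network, consensus among many nodes decides which chain tip is accepted, and the puzzle in `proof_of_work` makes rewriting history costly because every later block would have to be solved again.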

Flashcards

What are the four principles of the Reactive Manifesto?

Responsive, Resilient, Elastic, Message-driven

What is the primary goal of optimal database allocation in a distributed system?

To minimize access latency by distributing data fragments.

How do nodes function in a peer-to-peer (P2P) system?

Nodes act as both clients and servers to share resources directly.

By what primary mechanisms does blockchain technology secure data?

Cryptographic hashing and consensus mechanisms

Key Concepts

Distributed Computing Models

Distributed system

Cloud computing

Cluster computing

Grid computing

Volunteer computing

Peer-to-peer computing

Distributed database

Blockchain and Cryptocurrency

Blockchain

Cryptocurrency

Reactive Systems

Reactive system

Definitions

Distributed system

A network of independent computers that work together to appear as a single coherent system.

Cloud computing

Delivery of computing resources such as servers, storage, and applications over the internet on demand.

Cluster computing

A set of tightly coupled computers that function as a single high-performance system for parallel processing.

Grid computing

A distributed architecture that enables the sharing of heterogeneous resources across multiple administrative domains.

Volunteer computing

A model where individuals donate idle processing power of personal devices to support large-scale scientific projects.

Reactive system

Software designed to be responsive, resilient, elastic, and message‑driven, handling events in real time.

Peer-to-peer computing

A decentralized network where each node can act as both client and server, sharing resources directly.

Blockchain

A tamper‑evident, distributed ledger that records transactions using cryptographic hashing and consensus protocols.

Cryptocurrency

A digital or virtual currency that uses cryptography, and typically a blockchain, to secure transactions without a central authority.

Distributed database

A database whose storage devices are spread across multiple locations, allowing data fragmentation and allocation to reduce access latency.
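To make the allocation idea in the distributed database definition concrete, here is a small worked sketch using the New York and London example from the summary. The latency figures and access counts are invented for illustration, and the cost model (accesses multiplied by latency, single copy, reads only) is a deliberate simplification.

```python
def total_cost(host, access_counts, latency_ms):
    """Total latency if the fragment lives at `host`: sum of accesses x latency."""
    return sum(count * latency_ms[site][host]
               for site, count in access_counts.items())


def best_site(access_counts, latency_ms):
    """Picks the host site with the lowest total access latency."""
    return min(latency_ms, key=lambda host: total_cost(host, access_counts, latency_ms))


# Hypothetical numbers: latency_ms[from_site][to_site] in milliseconds.
latency_ms = {
    "new_york": {"new_york": 1, "london": 75},
    "london": {"new_york": 75, "london": 1},
}
# Hypothetical query pattern: accesses per hour originating at each office.
access_counts = {"new_york": 900, "london": 100}

cost_ny = total_cost("new_york", access_counts, latency_ms)    # 900*1 + 100*75 = 8400
cost_london = total_cost("london", access_counts, latency_ms)  # 900*75 + 100*1 = 67600
chosen = best_site(access_counts, latency_ms)                  # "new_york"
```

Under these assumed numbers, hosting the fragment in New York costs 8400 ms of total latency per hour against 67600 ms for London. A read-only replica in London would bring London's 100 accesses down to local latency as well, at the price of keeping the copy in sync.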