Distributed Algorithm
From CasGroup
A '''distributed algorithm''' is a decentralized algorithm that is executed in a [[Distributed System|distributed system]], on more than one machine, node or processor.

A distributed algorithm for a collection P of processes is, according to Gerard Tel in his book ''Introduction to Distributed Algorithms'', simply a collection of local algorithms, one for each process in P. In his earlier work ''Topics in Distributed Algorithms'' he defined it as follows: "a distributed algorithm executes as a collection of sequential processes, all executing their part of the algorithm independently, but coordinating their activity through communication." Nancy A. Lynch argues: "Distributed algorithms are algorithms designed to run on hardware consisting of many interconnected processors. The algorithms are supposed to work correctly, even if the individual processors and communication channels operate at different speeds and even if some of the components fail."

In sequential algorithms, steps are taken in a strict sequence and in a well-defined order. In distributed systems

be considered as a wave for a ring topology. The importance of waves is not surprising, since a wave is one of the most basic forms of [[Emergence|emergence]] in a system. If all nodes are visited sequentially, or an action like "inform all" or "query all" is required, a kind of wave must be used. Fundamental algorithms where all nodes of a network are visited are [[Total Algorithm]]s and [[Heart Beat Algorithm]]s.
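The effect of a wave that visits every node can be illustrated with a small flooding simulation (a minimal sketch; the graph, the ring topology and the simulation structure are illustrative assumptions, not a specific algorithm from the literature):

```python
from collections import deque

def flood(graph, initiator):
    """Simulate a flooding wave: the initiator sends a token to its
    neighbours, and every node that receives the token for the first
    time forwards it to all of its neighbours.
    Returns the set of visited nodes and the number of messages sent."""
    visited = {initiator}
    messages = 0
    queue = deque([initiator])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            messages += 1                # each forwarded token is one message
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(neighbour)
    return visited, messages

# An assumed example topology: a ring of four nodes.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
nodes, msgs = flood(ring, 0)             # every node is visited
```

Every node is reached, and the message count equals the number of directed edges, which is the typical cost of a simple wave.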
A common problem besides "inform all" and "query all" is "select one", "elect one", "admit one", etc. The resulting algorithms are named [[Election Algorithm|election algorithms]], where a single node or process (for example the leader) that is to play a distinguished role in a subsequent computation must
avoided.
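A classic election algorithm for ring topologies is the Chang-Roberts algorithm: every node sends its (unique) identifier clockwise, forwards received identifiers larger than its own, discards smaller ones, and the node that receives its own identifier back is elected. A minimal round-based simulation (a sketch; the identifiers and the synchronous round structure are illustrative assumptions):

```python
def chang_roberts(ids):
    """Simulate Chang-Roberts leader election on a unidirectional ring.
    ids[i] is the unique identifier of the node at ring position i.
    Returns the identifier of the elected leader."""
    n = len(ids)
    in_transit = list(ids)        # each node starts by sending its own id
    while True:
        nxt = [None] * n
        for pos in range(n):
            token = in_transit[pos]
            if token is None:
                continue
            receiver = (pos + 1) % n
            if token == ids[receiver]:
                return token            # own id came back: elected as leader
            if token > ids[receiver]:
                nxt[receiver] = token   # forward larger ids
            # smaller ids are silently discarded
        in_transit = nxt

leader = chang_roberts([3, 1, 4, 2])    # the largest id wins
```

Only the largest identifier survives a full trip around the ring, so its owner becomes the leader.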
Finally, it is a problem to determine and change the "global state" or the "global order"; the associated
distributed algorithms are named [[Termination Detection|termination detection]], where the end of a distributed computation has to be detected, and distributed [[garbage collection]], where unused memory and references must be released. A basic "toy" algorithm used to explain distributed algorithms in classes is the [[Distributed GCD Algorithm|distributed GCD algorithm]].
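One common way to formulate such a toy algorithm (a sketch under assumed semantics, not necessarily the exact algorithm behind the link above): every node holds a number, repeatedly sends it to all neighbours, and replaces its value with the gcd of its own value and the values it receives; the computation stabilizes when every node holds the gcd of all initial values, and detecting that stabilization is itself a termination detection problem.

```python
import math
from functools import reduce

def distributed_gcd(graph, values):
    """Round-based simulation: in each round every node sends its value
    to all neighbours, then combines the received values with gcd.
    Runs until no value changes and returns the stable values."""
    values = dict(values)
    while True:
        inbox = {node: [] for node in graph}
        for node, neighbours in graph.items():
            for nb in neighbours:
                inbox[nb].append(values[node])
        new = {node: reduce(math.gcd, inbox[node], values[node])
               for node in graph}
        if new == values:
            return values
        values = new

# An assumed example: three fully connected nodes.
final = distributed_gcd({0: [1, 2], 1: [0, 2], 2: [0, 1]},
                        {0: 12, 1: 18, 2: 30})
```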
=== Problem Fields ===
Termination detection is difficult, because a distributed system has no global state which can be detected instantly, and there is no global time which is valid for all computers, nodes or entities. In order to define a global state, some authors have proposed algorithms for consistent [[Global Snapshots|global snapshots]], for example the Chandy-Lamport algorithm. A snapshot is "consistent" if it appears as if it were taken at the same instant everywhere in the system, without any violation of causality. In order to define a global time, some authors have proposed methods for consistent global time (logical clocks by Lamport, which find their extension in vector clocks and vector time, etc.). The concept of "logical time" or timestamps introduced by Lamport allows an asynchronous system to simulate one in which the nodes have access to synchronized clocks. Both real and logical time are monotonically increasing, but real time is uniformly continuous, whereas logical time can have discontinuous jumps. These methods are nevertheless problematic, because they attempt to make distributed computing follow the model of local, centralized computing. As Waldo noticed in 1994, this approach ignores "the different failure modes and basic indeterminacy inherent in distributed computing" and leads to systems that are neither reliable nor scalable.

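Lamport's logical clocks mentioned above fit in a few lines (a minimal sketch; the class and method names are illustrative assumptions): each process keeps a counter, increments it on every local event, attaches it to outgoing messages, and on receipt jumps past the received timestamp.

```python
class LamportClock:
    """Minimal Lamport logical clock for a single process."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """Advance the clock for a local event."""
        self.time += 1
        return self.time

    def send(self):
        """Timestamp an outgoing message (sending is an event)."""
        return self.tick()

    def receive(self, msg_time):
        """Jump past the message's timestamp, then count the receive event."""
        self.time = max(self.time, msg_time)
        return self.tick()

# An assumed two-process exchange: the send gets a smaller timestamp
# than the receive, reflecting the happened-before relation.
p, q = LamportClock(), LamportClock()
t_send = p.send()
t_recv = q.receive(t_send)
```

The jump in `receive` is exactly the discontinuity that distinguishes logical time from real time.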
Thus we have roughly the following problem fields:
=== Impossibility Results ===
Given these difficulties, it is not surprising that the analysis and design of distributed algorithms that work in a general [[Distributed System|distributed system]] (where each node and link can fail at any time, and messages can have an arbitrary time delay) is very hard and sometimes even impossible. A single faulty process can already render any guarantee of reaching a common consensus impossible, as the famous "FLP impossibility result" says. The "FLP impossibility result" from Fischer, Lynch and Paterson says that it is impossible to guarantee consensus in a distributed, asynchronous system if even one process is faulty. More precisely, there is no guarantee that a common consensus can be reached if a faulty process exists. This is intuitively plausible, since a faulty process that no longer responds is indistinguishable from a process that merely answers slowly (given an arbitrary time delay in the connections of the asynchronous network).

Consensus and agreement problems are a fundamental challenge in [[Distributed System|distributed systems]]. The consensus problem is one of the most thoroughly investigated problems in [[Distributed Computing|distributed computing]]: several processes have to agree on a certain value or decision. Processes in a database system may need to agree whether or not a transaction should be committed or aborted. Processes in a control or monitoring system may need to agree whether or not a particular other process is faulty. Processes in a general distributed system may need to agree whether or not a message has been received. As Nancy Lynch says (in Chapter 12, "Consensus", of her book), "the impossibility result implies that there is no purely asynchronous algorithm that reaches the needed agreement and tolerates any failures at all."
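The impossibility result only concerns asynchronous systems; in a synchronous system with at most f crash failures, the classic flooding algorithm reaches consensus in f + 1 rounds. A minimal failure-free simulation (a sketch; the inputs and the decision rule of taking the minimum are illustrative assumptions):

```python
def flooding_consensus(initial_values, rounds):
    """Synchronous flooding consensus under the crash-failure model
    (no crash occurs in this simulated run): for a fixed number of
    rounds every process broadcasts the set of values it knows about;
    afterwards each process decides on the minimum value it has seen."""
    known = [{v} for v in initial_values]
    for _ in range(rounds):
        heard = set().union(*known)       # all broadcasts are delivered
        known = [k | heard for k in known]
    return [min(k) for k in known]

# Three processes propose different values and all decide the same one.
decisions = flooding_consensus([7, 3, 9], rounds=2)
```

With f possible crashes, f + 1 rounds guarantee one round in which no process crashes, after which all surviving processes know the same set of values.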
== Proof and Verification ==
The traditional definitions of liveness and safety are:
* '''Liveness''' means "something good will eventually occur" or "something good eventually happens"
* '''Safety''' means "something bad will never happen" or "no bad thing ever happens"

Liveness and safety are two complementary properties, one says that
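On a finite execution trace the asymmetry between the two properties can be sketched as follows (a rough sketch; formal definitions quantify over infinite executions, and the event names are illustrative assumptions): a finite trace can prove a safety violation, but can only ever witness, never refute, a liveness property.

```python
def violates_safety(trace, bad):
    """Safety ("no bad thing ever happens") is violated as soon as
    some finite prefix contains a bad event."""
    return any(bad(event) for event in trace)

def witnesses_liveness(trace, good):
    """Liveness ("something good eventually happens") can be witnessed
    by a finite trace, but never refuted: the good event might still
    occur in the unseen future."""
    return any(good(event) for event in trace)

# An assumed example trace of a mutual-exclusion run.
trace = ["request", "wait", "grant"]
safe_so_far = not violates_safety(trace, lambda e: e == "crash")
live = witnesses_liveness(trace, lambda e: e == "grant")
```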