ABSTRACT: This paper introduces an algorithm to solve the Approximate Agreement Problem in an asynchronous failure-by-omission (or crash-failure) system, and proves that the algorithm is optimal by considering the power of the "adversary" scheduler to disrupt processors' views. We show that the adversary need not cause any crashes or omissions to achieve its purpose.
Introduction
A fundamental problem in designing fault-tolerant distributed systems is how to eliminate or reduce differences between the information held by different processors. A classical abstract version of this is known as the Byzantine Agreement Problem [PSL], and has been studied extensively, using many models of computation, reflecting differing amounts of synchrony in the system, different degrees of maliciousness on the part of faulty processors, different power of computation of processors, and different requirements on the solution (see [Fi] for a survey of these results). The distressing fact that in a system with asynchronous communication (i.e. where messages can take arbitrarily long to arrive) there is no agreement protocol that can tolerate even one fault, was first proved in [FLP], and extended to more general system models in [DDS].
Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission.
© 1987 ACM 0-89791-239-X/87/0008/0064 75¢
Since reaching agreement is difficult even in synchronous systems, and impossible in asynchronous ones, [@SD]) and approximating a true value (e.g. a sensor) [MS]. An abstract formulation of the problem, which permits the use of techniques developed in studying Byzantine Agreement, is Approximate Agreement, introduced in [DLPSW], where algorithms were given for both synchronous and asynchronous systems assuming Byzantine (i.e. arbitrarily malicious) behaviour of faulty processors.
Those algorithms proceed in rounds, where in each round each processor receives the current value held by other processors and "averages" these values to obtain a new value for itself (the function used is not the mean, but a fault-tolerant measure of central tendency). In [DLPSW] the algorithms given are shown to be optimal (for Byzantine faults) among algorithms having the same form, that is, where the value chosen in each round depends only on values held by processors at the start of that round. The question is raised in [DLPSW] whether using information from other rounds permits better algorithms. This paper answers that question negatively for asynchronous systems in which processor failures are relatively benign (failure-by-omission), by giving an algorithm that proceeds in rounds and showing that this is optimal among deterministic algorithms.
The Approximate Agreement Problem is studied in this paper in the following form: there are n processors labelled 1, 2, ..., n which are linked by a completely connected, fault-free, point-to-point network which is the only means of interprocess communication.
A message submitted to the network will eventually reach its destination (where it will be delivered if the addressee asks to receive it), but no upper bound exists on the time from source to destination. We are interested in protocols that are resilient to t failures, so we consider only executions where at least n - t of the processors are correct, that is, they follow the given algorithm, and the remaining processors (the "faulty" ones, numbering at most t) are obliged to follow the protocol as well, except that they may neglect to send some messages the algorithm requires, and they may halt ("crash") at any time, without other processors being aware of the crash.
of the interval containing the initial values (so a good algorithm is one with a low K).
The results of [DLPSW] indicate that any value for K can be achieved (so long as n > 5t) if enough communication is used, so we will restrict our discussion to algorithms using at most S rounds of communication.
The algorithm given in [DLPSW] has performance
The algorithm given in this paper is very similar to that of [DLPSW], but it is valid whenever n > t, and it is able to exploit the fact that failed processes do not exhibit malicious behaviour to obtain performance
This accords nicely with the results in [Fe], where it was found that for synchronous systems, the failure-by-omission model permits twice the rate of convergence allowed by the Byzantine failure model.
We also prove the matching lower bound
\[ K \geq \left\lceil \frac{n-t}{t} \right\rceil^{-S} \]
for any deterministic t-resilient Approximate Agreement algorithm in an asynchronous system with failure-by-omission faults. This proof is the major original contribution of this paper, since the result is surprising at first, considering that in [Fe] it was found that for synchronous systems with failure-by-omission faults (or Byzantine faults) an S-round algorithm could do substantially better than an iterated round-by-round algorithm like those in [DLPSW] or this paper. The intuitive reason for this result is that the synchronous S-round algorithms of [Fe] exploit the fact that the same set of t processes have to account for the faulty behaviour in all the rounds. Thus the algorithms try to detect which processors are faulty, and then alter the information received from them to reduce the damage they can do. However in an asynchronous system with failure-by-omission, the worst damage a pro-
In the terminology of [MS], K is the ratio of final precision to initial precision.
[DLPSW], but for us it will be enough to say that a multiset (sometimes called a bag) is an unordered collection of values which need not be distinct. For each value v and multiset V we denote the number of occurrences of v in V (the multiplicity of v) by mult(v, V). The smallest interval containing all the values in V will be denoted by ρ(V), and its length, the
This is reflected in [FLP] in the fact that the executions constructed there do not involve processors failing.
6 This does not mean that an adversary without the power to cause crashes or omissions is as powerful as one with the power, but merely that (having the power) it need not exercise it!
The main change to that model is that we allow multiple channels between each pair of processors, and allow a processor to try to receive messages from only a subset of channels during a receive operation. This can be used to model the capacity, in languages like CSP, for message receipt to be guarded by the message type. We also assume that there are initial and decision states for every real number, not just for 0 and 1 as in [DDS]. This formal model is not needed to discuss the algorithm or the lower bound, but it is needed for a rigorous proof of the intuition about the power of the adversary, mentioned in the introduction.
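As a small illustration of the multiset notation, the following Python sketch computes the multiplicity of a value and the smallest interval containing the values of a multiset. Representing the multiset as a plain list, and the helper names `mult`, `interval` and `interval_length`, are assumptions made for this example and are not taken from the paper.

```python
from collections import Counter

def mult(v, V):
    """Number of occurrences of v in the multiset V (0 if v is absent)."""
    return Counter(V)[v]

def interval(V):
    """Smallest closed interval (lo, hi) containing every value of V."""
    return (min(V), max(V))

def interval_length(V):
    """Length of the smallest interval containing V."""
    lo, hi = interval(V)
    return hi - lo

# A multiset may repeat values; here 0.5 occurs twice.
V = [0.5, 2.0, 0.5, 1.25]
```

A list is used rather than a Python set precisely because repeated values must be kept.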
By results in [v?], the processors could be assumed to be synchronous, without altering the results.
Formally, a protocol of the set P of processors 1, 2, ..., n is described by the following data: a universe M of messages, a collection C of channels for communication, and for each processor p a set of states Z^p and functions β^p (describing message generation), γ^p (describing the guards on message receipt), and δ^p (describing the state transitions). The collection of channels has two associated functions begin : C → P and end : C → P. We put C^{out,p} = {c ∈ C : begin(c) = p} and C^{in,p} = {c ∈ C : end(c) = p}, respectively the sets of channels starting and ending at p. We define the set E^p of events at p to be {†, ∅} ∪ (C^{in,p} × M).
Thus no message is generated during a receiving step, and at most one message is generated in a sending step. The guard function γ^p : Z^p → 𝒫(C^{in,p}) is required to satisfy γ^p(z) = ∅ for z ∈ Z^p_S, modelling the fact that no messages are received during a sending step. The state transition function δ^p : Z^p × E^p → Z^p is required to satisfy the condition that δ^p(z, e) ∈ Z^{p,dec} for all z ∈ Z^{p,dec} and for all e, to reflect the irrevocable nature of a decision. We also require that δ^p(z, †) = z for all states z, since † reflects a place-holder for a step not taken because of a crash.
A configuration κ consists of a state for each processor and a multiset of messages for each channel. We write st(p, κ) for the state of processor p, and buff(c, κ) for the messages in transit on channel c, in the configuration κ. We say the event (p, e) where e ∈ E^p is applicable to the configuration κ, if either e = † or e = ∅ or e = (c, m) where c ∈ γ^p(st(p, κ)) and m ∈ buff(c, κ).
A multiset or bag is used, rather than a set, because the same message could be sent several times.
Suppose (p, e) is an event applicable to κ. If e = (c, m) we define the failure-free result of (p, e)
If e = ∅ we define the failure-free result of (p, e) in κ to be the unique configuration κ' with
for d ∈ C. If e = (c, m) we also define the failure result of (p, e) in κ to be the unique configuration κ' with
• for each i, (p_i, e_i) is applicable to κ_i;
• for each i, κ_{i+1} is either the failure-free result or the failure result of (p_i, e_i) in κ_i;
• if j > i, e_i = †, and p_j = p_i, then e_j = †.
An execution is an infinite alternating sequence κ_1, (p_1, e_1), κ_2, (p_2, e_2), κ_3, ... whose prefixes of odd length are partial executions.
Proof:
We inductively construct (p_i, e_i) and κ_{i+1} for i ≥ 1. Let p_i = (i mod n) + 1. Consider the multiset of messages {m : m ∈ buff(c, κ_i), c ∈ γ^{p_i}(st(p_i, κ_i))}, arranged in order, from the earliest sent to the latest sent. If the multiset is empty, let e_i = ∅, otherwise let e_i = (c_0, m_0) where m_0 is the member of the multiset that was sent earliest, and where m_0 ∈ buff(c_0, κ_i). Now, let κ_{i+1} be the failure-free result of (p_i, e_i) in κ_i.
We say that such a protocol uses only S rounds of communication
This definition explains why our model needs to include a mechanism for a protocol to be able to guard against receiving some messages too early: once a round r message has been received, a processor has lost the chance to send a round r message of its own.
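The round-robin construction used in the proof above can be sketched in Python. This is only an illustration under simplifying assumptions: buffers are dictionaries mapping a channel name to a list of (send_time, message) pairs, `guard(p)` stands in for the guard function applied to the current state of p, and `None` plays the role of the empty event. None of these names or representations come from the paper.

```python
def round_robin_event(i, n, buffers, guard):
    """Pick the i-th event of a failure-free round-robin schedule.

    buffers: dict mapping channel -> list of (send_time, message) in transit
    guard:   function p -> set of channels p is currently willing to receive on
    Returns (p, event), where event is None or a (channel, message) pair.
    """
    p = (i % n) + 1                       # processors take steps in turn
    pending = [(sent, c, m)
               for c in guard(p)
               for (sent, m) in buffers.get(c, [])]
    if not pending:
        return p, None                    # nothing deliverable: empty event
    # Deliver the earliest-sent message among the guarded channels.
    sent, c, m = min(pending, key=lambda item: item[0])
    buffers[c].remove((sent, m))          # delivery removes it from the channel
    return p, (c, m)
```

Because no crash or omission event is ever chosen, every step in such a schedule is a failure-free step; the explicit send times are a device of this sketch for recovering the "earliest sent" order.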
The Algorithm
Each processor p acts according to the following algorithm, which we give first informally, as in [DLPSW] : 
The Lower Bound
To simplify the analysis we will assume that any protocol considered has been put in a canonical form, as a full-information protocol. A full-information protocol is the following:
Different full-information protocols are given by using different functions of the history to determine the decision value. We refer to steps (i) and (ii) for r as forming "phase r" of the execution.
We briefly explain the reasons this form of protocol is completely general, in that any protocol can be implemented by a full-information protocol. As we only consider deterministic algorithms, each message is determined by the history of the sender at the time the message is sent, so we can assume that it is the history itself that is sent. Since the receiver need not pay any attention to messages it is not interested in, there is no loss of generality in sending each message to every processor, or in sending a message for each round. Once a round r message is received, a processor cannot send a round r message of its own, so it should only try to receive messages from rounds up to r - 1 until it has sent its round r message. In order to put as much information as possible in that message, it should not send the round r message till it is possible that the processor has received everything that it will ever get, among the messages of rounds 1, ..., r - 1. Thus, the processor should wait (trying to receive) as long as it knows that there are such messages still to come, but no longer. In the failure-by-omission asynchronous system, this means waiting until all messages of preceding rounds have been received from some set of n - t processors, since the remaining t processors could have omitted to send all the messages not yet received from them. Once a processor has entered a decision state, there is no point in trying to receive messages, as the final value is fixed, and any message sent after that point would violate the requirement that no message have round number greater than S. We also note that in an infinite admissible execution of a full-information protocol, every processor (unless it crashes) eventually decides on a final value.
8 The full-information protocol we give here is a natural generalization to asynchronous systems of a standard form used for reasoning about synchronous algorithms.
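The waiting rule described above (wait until all messages of preceding rounds have arrived from some set of n - t processors) can be written as a simple predicate. The sketch below is an illustration only: the representation of received messages as (sender, round) pairs and the name `ready_to_send` are assumptions of this example, and whether a processor counts its own messages is left to the caller.

```python
def ready_to_send(received, r, n, t):
    """Return True once some set of n - t processors has delivered all of
    its messages for rounds 1, ..., r - 1.

    received: set of (sender, round) pairs delivered so far
    r:        the round whose message the processor wants to send
    """
    earlier_rounds = range(1, r)
    # Processors from which every earlier-round message has arrived.
    complete = [q for q in range(1, n + 1)
                if all((q, k) in received for k in earlier_rounds)]
    return len(complete) >= n - t
```

For r = 1 there are no earlier rounds, so the predicate holds immediately and the round 1 message can be sent without waiting.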
To say that a protocol P has performance \( K \geq \lceil (n-t)/t \rceil^{-S} \) is to say that there is some run (determined
For notational convenience we put \( \nu = \lceil (n-t)/t \rceil \).
We first prove the theorem for S = 1.
We denote by p_α (α = 1, ..., ν) the processor αt + 1, and by p_{ν+1} processor 1. Now we will describe a chain of admissible executions ρ_0, ρ_1, ..., ρ_{ν+1} such that processor p_α has the same history in executions ρ_{α-1} and ρ_α, and thus the same decision value in those executions. We will construct ρ_0 with each processor having initial value 1, so the decision value of p_1 in that execution must be 1.
Similarly the decision value of p_{ν+1} in ρ_{ν+1} must be 0.
From these facts it follows by a standard argument (see the lower bounds in [Fe] for example) that for some α the processes p_α and p_{α+1} reach decision states with final values that differ by at least ν^{-1}, in the admissible execution ρ_α which satisfies all the conditions of the theorem.
The execution ρ_0 is one where every processor has initial value 1, no processors crash or omit to send, and each processor receives round 1 messages from processes t + 1, ..., n before entering its decision state. For α = 1, ..., ν - 1 the execution ρ_α has processors 1, 2, ..., αt with initial value 0, and processors αt + 1, ..., n with initial value 1. No processor crashes or omits to send, and p_α enters its decision state after receiving round 1 messages from processors 1, ..., (α - 1)t and αt + 1, ..., n, while every other processor (in particular p_{α+1}) receives round 1 messages from processors 1, ..., αt and (α + 1)t + 1, ..., n before entering its decision state. The execution ρ_ν has processors 1, 2, ..., νt with initial value 0, and processors νt + 1, ..., n with initial value 1. No processor crashes or omits to send, and p_ν enters its decision state after receiving round 1 messages from processors 1, ..., (ν - 1)t and νt + 1, ..., n, while every other processor (in particular p_{ν+1}) receives round 1 messages from processors 1, ..., n - t before entering its decision state. In the execution ρ_{ν+1} every processor has initial value 0 and each processor enters its decision state after receiving messages from processors 1, ..., n - t.
Now we suppose the theorem true for (S - 1)-round protocols, and prove it for the S-round protocol P.
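The chain of one-round executions described above for the case S = 1 can be checked mechanically. The Python sketch below encodes each execution by the initial values and by the set of processors each processor hears from, and confirms that consecutive executions look identical to the linking processor. It assumes ν = ⌈(n - t)/t⌉, which is a reading adopted for this sketch, and the function names are hypothetical; a "view" here is just the processor's own initial value together with the (sender, value) pairs it receives.

```python
def nu(n, t):
    return -(-(n - t) // t)               # integer ceiling of (n - t) / t

def initial_value(a, p, n, t):
    """Initial value of processor p in execution number a (0 .. nu + 1)."""
    zeros = n if a == nu(n, t) + 1 else a * t   # processors 1..zeros start at 0
    return 0 if p <= zeros else 1

def heard_from(a, p, n, t):
    """Processors whose round 1 messages p receives in execution number a."""
    v = nu(n, t)
    span = lambda lo, hi: set(range(lo, hi + 1))
    if a == 0:
        return span(t + 1, n)
    if a == v + 1:
        return span(1, n - t)
    if p == a * t + 1:                    # the distinguished processor p_a
        return span(1, (a - 1) * t) | span(a * t + 1, n)
    if a == v:
        return span(1, n - t)
    return span(1, a * t) | span((a + 1) * t + 1, n)

def view(a, p, n, t):
    return (initial_value(a, p, n, t),
            frozenset((q, initial_value(a, q, n, t))
                      for q in heard_from(a, p, n, t)))

def chain_is_indistinguishable(n, t):
    """True if p_a has the same view in executions a - 1 and a, for all a."""
    v = nu(n, t)
    linking = lambda a: 1 if a == v + 1 else a * t + 1
    return all(view(a - 1, linking(a), n, t) == view(a, linking(a), n, t)
               for a in range(1, v + 2))
```

For example, `chain_is_indistinguishable(7, 2)` walks a chain with ν = 3, in which the linking processors are 3, 5, 7 and 1. Since the first view in the chain contains only the value 1 and the last only the value 0, some adjacent pair of linking processors must differ in their decisions by at least 1/ν, which is the step the text attributes to a standard argument.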
From P we construct an (S - 1)-round full-information protocol Q. To describe Q we have to specify the decision value chosen by p after a given history h. This value will be the decision value chosen by p in protocol P after a history ĥ. We now describe ĥ. Let R_p denote a set of n - t processors such that p received every message from each processor in R_p during h (such a set exists since p entered a decision state). The history ĥ is the same as h during stages 1, ..., S - 2. During phase S - 1 of ĥ, p receives only those messages (among those it received in h during phase S - 1) that were from processors in R_p. Then in phase S of ĥ, p receives round S messages from the processors in R_p, and no other messages. We will have completed the description of ĥ when we give the contents of these round S messages. Since this is a full-information protocol, the message p receives from p' contains (beside the round number) a history of p'. This history is an extension of the history p received from p' in the round S - 1 message in h, with the extra events during phase S - 1 being the receipt by p' of any message (of rounds 1, ..., S - 1) from a processor in R_p that had not already been received by p'. We will see below that the history ĥ is actually a history of p that occurs in a run of protocol P. Note that p can compute the history ĥ, since it can compute the purported histories of processors p' (as it did receive during h all the messages it is appending to the history of p').
10 We will give the description in the same high-level terminology we have used to describe the protocols. For a complete description in the formal model of §2, we would also need to specify the order in which the messages are received during each phase. Any consistently applied choice would be suitable, for example, receiving a round i_1 message from processor p_1 before a round i_2 message from processor p_2 if p_1 < p_2 or (p_1 = p_2 and i_1 < i_2).
The theorem applied to the (S - 1)-round algorithm Q implies the existence of an admissible execution ρ' of Q and processors p' and q' which reach decision states with final values w(p') and w(q') satisfying
and there are at most [yj t processors from which both p' and q' receive round S - 1 messages. Lemma 4 from §2 implies that we can assume that no processor is faulty during ρ'.
Choose processors p_0, p_1, ..., p_ν such that p_0 = p', p_ν = q' and p_α ≠ p_{α+1} for α = 0, ..., ν - 1. We will describe, in the next paragraph, admissible executions ρ_α for α = 1, ..., ν of protocol P so that p_α has the same history in ρ_α and ρ_{α+1}. Furthermore, the history of p_0 = p' in execution ρ_1 is the same as the purported history constructed by p' during protocol Q in the execution ρ', and so during ρ_1, p_0 must decide on value w(p'). Similarly the history of p_ν = q' during execution ρ_ν will be the same as the history constructed by q' to determine its decision value during execution ρ' of protocol Q, so during ρ_ν, p_ν must enter a decision state with value w(q'). Just as in the case S = 1, a standard argument shows that for some α = 1, ..., ν, the execution ρ_α causes processors p_{α-1} and p_α to enter decision states with final values w(p_{α-1}) and
The construction of ρ_α also ensures that there are at most (ν - 1)t processors from which p_{α-1} and p_α both receive round S messages during ρ_α, so that the admissible execution ρ_α satisfies all the conditions in the theorem.
Each execution ρ_α will be identical to ρ' for each processor during phases 1, ..., S - 2. Also, in each execution
The construction of phase S of the execution will be given separately for the cases ν ≥ 2 and ν = 1. First suppose ν ≥ 2. In the execution ρ_1, during phase S, processor p_0 receives all the messages sent by processors 1, ..., n - t during phases 1, ..., S that it had not previously received. In the execution ρ_1 every other processor (in particular p_1) receives in round S all the messages sent by processors t + 1, ..., n during phases 1, ..., S, that it had not previously received. For α = 2, ..., ν - 1, in the execution ρ_α the messages received by p_{α-1} are those from processors 1, ..., (α - 2)t and those from processors (α - 1)t + 1, ..., n (except for those of these messages that had been received before), while all processors except p_{α-1} receive the messages from processors 1, ..., (α - 1)t and from αt + 1, ..., n that they had not received before. The execution ρ_ν has stage S where processor p_{ν-1} receives all outstanding messages from processors 1, ..., (ν - 2)t and from (ν - 1)t + 1, ..., n, while all the processors except p_{ν-1} receive the outstanding messages from processors 1, ..., n - 2t + $ and from n - t + $ + 1, ..., n. In the case ν = 1 we need to construct only the execution ρ_1, with phase S in which processor p_0 receives all the as yet undelivered messages from processors 1, ..., n - t and each other processor receives all the messages from processors t + 1, ..., n that it had not received before.
11 In the formal model of §2, we would also need to specify in which order the processors take steps, and in which order the various messages in a phase are received, in order to completely specify the execution. For example, we could choose to let processors take steps in round robin order 1, ..., n, 1, 2, ..., and similarly to let messages arrive in the order used for the formal description of protocol P.
executions form a subset of the failure-by-omission executions, it is obvious that any algorithm that solves the S-round approximate agreement problem in a failure-by-omission system will also solve the problem (with at least as good a performance) in a system where crashes are the only possible failures. However, it is (a priori) conceivable that there is some protocol that solves the problem in a crash-failure system, and that uses the special nature of the crash-failure system to obtain better performance than is possible for any algorithm in the more general failure-by-omission system. We show that this is not the case by converting any protocol for the crash-failure model into a general protocol, and then applying Theorem 2.
Thus the lower bound of §4 also applies to the crash-failure model, and so the protocol of §3 remains optimal in the more restricted crash-failure system.
Proof:
We first note that we may assume that P has the following form (which is a full-information protocol
there is a round rs message from q with rr > rr among the messages p received, or among the messages whose receipt is reported in the histories that are the messages p received), and until there is a set of n - t processors (including p itself) such that p has received r messages (one with each first component from 1 to r) from each of these processors during the execution.
• Finally decide on a value w(p) which is some function of p's history, and thereafter do not send or try to receive messages.
We construct from P a protocol Q that is a full-information protocol for the failure-by-omission model as described in §4. Thus we need only specify the decision value chosen by processor p in Q after a history h, and this will be the decision value chosen by p in P after a history ĥ = convert^p_S(h), where convert^p_S is a function that we will define in the next paragraph, that converts an S-round history of p in the full-information failure-by-omission protocol to a history in the full-information crash-failure protocol.
We define inductively convert^p_r to convert a history h of p up to the end of phase r in the full-information failure-by-omission protocol, into a history of p up to the end of phase r in the full-information crash-failure protocol. If r = 1 then h consists of an initial value u(p) for p, followed by receipt of n - t messages containing initial values of processors. We define convert^p_1(h) to be the history consisting of the same initial value u(p) followed by receipt of the same messages as in h and also (unless h already contains the receipt of a round 1 message from p) by the receipt of a round 1 message from p itself, containing its own initial value. Now if r > 1, we let h' denote the prefix of h up to the end of phase r - 1 of the full-information protocol. We will form convert^p_r(h) as an extension of convert^p_{r-1}(h'), with the additional events (forming phase r of the protocol) being receipt of certain messages described below. First, for any round r_1 message from p_1 (containing the history h_1 of p_1 up to the end of phase r_1 - 1) that p received during phase r in h, we include among the events of convert^p_r(h) the receipt by p of a round r_1 message from p_1 containing the history convert^{p_1}_{r_1-1}(h_1), except if convert^p_{r-1}(h') already contains the receipt by p of a round r_1 message from p_1. Next, we include in convert^p_r(h) the receipt by p of a round r message from itself containing the history convert^p_{r-1}(h'), unless a round r message from p is already among the messages whose receipt was added in the first step of the construction.
Finally, we examine all the messages received by p in phase r of h or whose receipt is reported 
