Randomized Wait-Free Consensus using An Atomicity Assumption by Cheung, L.
PDF hosted at the Radboud Repository of the Radboud University
Nijmegen
 
 
 
 
The following full text is a preprint version which may differ from the publisher's version.
 
 
http://hdl.handle.net/2066/32905
 
 
 
Please be advised that this information was generated on 2017-12-05 and may be subject to
change.
Randomized Wait-Free Consensus using An
Atomicity Assumption*
Ling Cheung**
Department of Computer Science, University of Nijmegen
P.O. Box 9010, 6500 GL Nijmegen, The Netherlands
Abstract. We present a randomized algorithm for asynchronous wait-
free consensus using multi-writer multi-reader shared registers. This algo-
rithm is based on earlier work by Chor, Israeli and Li (CIL) and is correct
under the assumption that processes can perform a random choice and
a write operation in one atomic step. The expected total work for our
algorithm is shown to be O(N log(log N)), compared with O(N^2) for the
CIL algorithm and O(N logN) for the best weak-adversary algorithm
previously known. We also model check instances of our algorithm using
the probabilistic model checking tool PRISM.
Keywords: Asynchronous Consensus, Randomized Algorithms, Wait-
Free Termination, Weak Adversary, Probabilistic Model Checking
AMS (2000): 68W15, 68W20, 68W40, 68Q25, 68Q60
CR (1998): C.2.2, C.4, G.3
1 Introduction
Distributed consensus refers to a class of problems in which a set of parallel
processes exchange messages in order to agree on a common preference. Initially,
each process is given an input value from a fixed, finite domain and, at the end
of the algorithm, each non-faulty process outputs a decision value. Correctness
requirements are typically formulated as follows.
– Validity : the output of any non-faulty process must have been the input of
some process.
– Agreement : all non-faulty processes decide on the same value.
– Termination: every non-faulty process decides after a finite number of steps.
As shown in [FLP85], there exists no deterministic algorithm that solves
distributed consensus in a setting of asynchronous communication with unde-
tected process failure. Nonetheless, many efficient solutions exist under stronger
assumptions (e.g. partial synchrony [DLS88] and failure detection [ACT00]) or
weaker correctness requirements (e.g. probabilistic termination [CIL87]).
* An extended abstract of this report appears in the Proceedings of the 9th International
Conference on Principles of Distributed Systems (OPODIS'05).
** Supported by DFG/NWO bilateral cooperation project Validation of Stochastic Systems
(VOSS2).
Our algorithm falls into the category of randomized consensus algorithms¹,
where processes may use coin tosses to determine their course of actions. In
this setting, termination is weakened to a probabilistic statement: the set of all
non-terminating executions has probability 0. The first randomized consensus
algorithm was proposed by Chor, Israeli and Li [CIL87,CIL94]. It satisfies the
following termination condition.
– Probabilistic wait-free termination: with probability 1, each non-faulty pro-
cess decides after a finite number of steps.
We adopt the same requirement. In fact, the logical structure of our algorithm
closely resembles that in [CIL94], while we borrow ideas from [Cha96] to reduce
the amount of shared and local data. We shall refer to [CIL94] as the original
CIL algorithm and our own as the modified CIL algorithm.
Adversary Models and Work Bounds. To prove probabilistic termination,
we must reason about probability distributions on the set of executions. This is
done by specifying the so-called adversaries, which are fictitious entities designed
to model scheduling uncertainties in a distributed environment. Mathematically,
an adversary is a function mapping each finite execution to an available next
operation. Such a function resolves all non-deterministic choices among parallel
processes, thereby inducing a probability distribution on the set of executions.
One can then ask if probabilistic termination is satisfied according to this distri-
bution. By quantifying over all possible adversaries (of a certain strength), we
obtain worst-case guarantees similar to those in a non-probabilistic setting.
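The view of an adversary as a function on finite executions can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: a history is reduced to the list of process ids scheduled so far (a real history would also carry the observed states), and the names `round_robin`, `starve_zero` and `run_schedule` are hypothetical, not taken from the paper.

```python
from typing import Callable, List, Sequence

# A finite execution, abstracted to the ids of the processes scheduled so far.
History = Sequence[int]
# An adversary maps a history and the set of enabled processes to the next one.
Adversary = Callable[[History, Sequence[int]], int]


def round_robin(history: History, enabled: Sequence[int]) -> int:
    """Schedule the enabled processes in cyclic order of their ids."""
    if not history:
        return min(enabled)
    later = [i for i in enabled if i > history[-1]]
    return min(later) if later else min(enabled)


def starve_zero(history: History, enabled: Sequence[int]) -> int:
    """Never schedule process 0 while some other process is enabled."""
    others = [i for i in enabled if i != 0]
    return min(others) if others else 0


def run_schedule(adversary: Adversary, enabled: Sequence[int], steps: int) -> List[int]:
    """Resolve `steps` scheduling choices by repeatedly asking the adversary."""
    history: List[int] = []
    for _ in range(steps):
        history.append(adversary(history, enabled))
    return history
```

Both example adversaries ignore random outcomes entirely, so they are far weaker than the adversaries considered in this paper; they only show the shape of the function being quantified over.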
The strength of an adversary varies according to the amount of information it
can extract from a finite history. The strong adversaries have access to the complete
history of all processes and shared registers. Some weaker forms, such as write-
oblivious and value-oblivious, delay the adversary’s knowledge of outcomes of
internal coin tosses. The oblivious adversaries are the weakest, unable to observe
any random outcomes and their subsequent effects on system dynamics.
Clearly, a stronger adversary model permits more possibilities and therefore
renders consensus more difficult. Consensus against strong adversaries is shown
to be Ω(N^2 / log^2 N) in expected total work, where N is the number of processes
participating in the algorithm [Asp98]. The best known algorithms achieve expected
O(N^2 log N) total work [BR91] and O(N log^2 N) per process [AW96].
Against write-oblivious adversaries, one can achieve expected O(log N) per process
work and O(N log N) total work [Aum97].
Our adversary model takes the form of an atomicity assumption²: processes
can perform a random choice and a write operation in one atomic step. In particular,
if the coin comes up heads, the process increments its round number and
writes to the shared memory; otherwise, nothing happens. This model is weaker
than the strong adversaries, but stronger than any other previously proposed
adversary model.
¹ We refer to [Asp03] for a comprehensive overview on this topic.
² Some authors refer to this assumption as local-oblivious, which we find misleading.
Local-oblivious implies that local operations based on local coin tosses are also hidden
from the adversary. This is stronger than we need, because in our case successful
tosses are immediately written to shared memory, thus revealed to the adversary.
The original CIL algorithm relies on the same atomicity assumption and
achieves expected O(N^2) total work [CIL94]. In the present paper, we replace
the single-writer multiple-reader (SWMR) registers of [CIL94] with multi-writer
multi-reader (MWMR) registers, thereby reducing the expected total work to
O(N log(logN)).
Since our adversaries are value-sensitive, every non-faulty process must per-
form at least one read operation, otherwise we can easily construct an execution
that violates the agreement property. Therefore, expected total work in this
model is Ω(N), which almost matches our upper bound of O(N log(logN)).
We have adopted the worst case expected total work as our complexity mea-
sure, mainly because it is more natural to reason about the collective effect of
all processes on the shared memory. In fact, per process work in our case is
comparable to total work: if all but one process suffer crash failures, the lone
survivor carries the total work burden and performs expected O(N) tosses in
order to pull far enough ahead for termination. In this sense, our algorithm
is less efficient than [Cha96,Aum97], where polylogarithmic upper bounds are
given for per process work. This is however a misleading comparison, because
our adversary model is strictly stronger than those used in [Cha96,Aum97]. It is
not known whether per process work is inevitably high in our particular setting.
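As a rough illustration of the O(N) figure for the lone survivor, assume (as in Line (7) of the algorithm in Section 3) that each toss succeeds independently with probability 1/(2N). The number T of tosses needed for one round advance is then geometrically distributed. The constant c below, standing for the number of advances needed to pull far enough ahead, is a symbol introduced here only for illustration.

```latex
\[
  \mathbb{E}[T]
  \;=\; \sum_{t \ge 1} t \cdot \frac{1}{2N}\Bigl(1 - \frac{1}{2N}\Bigr)^{t-1}
  \;=\; 2N,
  \qquad
  \mathbb{E}[\text{tosses for } c \text{ advances}] \;=\; 2cN \;=\; O(N).
\]
```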
Probabilistic Model Checking. We model check instances of our algorithm
using PRISM, which can check PCTL (Probabilistic Computation Tree Logic)
formulas against an MDP (Markov Decision Process) [PRI,BK98]. This tool has
been applied to many randomized algorithms, including the consensus algorithm
of Aspnes and Herlihy [AH90,KNS01] and Byzantine agreement [KN02].
Consensus algorithms are often hard to model check, because the state space
grows exponentially with the number of participating processes. In [KNS01],
PRISM is applied only to a shared-coin subroutine, while full correctness re-
lies on verification using Cadence SMV, as well as higher level manual proofs.
Unfortunately, the structure of our algorithm does not provide such convenient
isolation of probabilistic reasoning. Nevertheless, we are able to build models of
binary consensus with up to 4 processes and verify relevant properties.
In Section 6, we briefly describe these models and give a summary of PRISM
results. In Section 7, we discuss our learning experience with PRISM and some
prospects in improving feasibility of model checking.
Overview. Section 2 describes in greater detail our computational setting and
assumptions. Section 3 presents the algorithm and correctness proofs are given in
Sections 4 and 5. Section 6 is devoted to model checking and Section 7 contains
closing discussions.
2 System Model
We consider a system of N processes interacting asynchronously via shared memory
objects. Each process P_i is given as input an initial preference p_i^0, which belongs
to a fixed, finite domain. Without loss of generality, this preference domain
is assumed to be Z_K for some natural number constant K ≥ 2. (As a convention,
we write Z_K for {0, . . . , K − 1} and Z_K^+ for {1, . . . , K − 1}.)
We take a state-based view of our system. The local state of a process is
determined by a valuation of all of its local variables, plus a program counter
indicating the next line of code to be executed. In fact, it is trivial to include the
program counter as an explicit variable, so that local state is fully determined by
valuation of local variables. This is done in our PRISM models. The global state
is then determined by local states of all N processes, together with contents of
shared memory objects. These objects are MWMR registers allowing primitive
read and write operations. They are assumed to be linearizable [HW90], so that
each memory operation can be viewed as taking place at a particular instant in
time (as opposed to a time interval between invocation and response). Under
this assumption, each read access returns the value written by the last write
access in the linearized execution history.
A process executes a possibly infinite sequence of discrete steps, each con-
sisting of a memory operation and/or a change in local state. It may also exhibit
a limited form of non-deterministic behavior: crashing at any point of its execu-
tion. A crashed process may never recover and re-enter the algorithm.
An execution of the entire system is obtained by interleaving executions of
individual processes, where scheduling among processes is determined by an
adversary that satisfies the atomicity assumption stated in Section 1. That is, if
a process is scheduled to toss a coin, it must be allowed to write to the memory
(in case the coin lands heads) before another process is given a turn. The worst-
case complexity is measured in terms of the expected number of read and write
operations taken by all processes, quantifying over all admissible adversaries.
3 Modified CIL Algorithm
As in many other consensus algorithms [BO83,CIL94,AH90,Cha96], our pro-
cesses advance through a sequence of rounds. During a particular round, a pro-
cess goes through a possibly infinite sequence of phases, each of which is a com-
plete pass through the main while-loop.
In original CIL, the shared memory is configured into an array of N
SWMR registers, one for every process. Each register_i contains two pieces of
information: round number r_i and preference value v_i. At the beginning of each
phase, process P_i copies the contents of all register_j (i ≠ j) and stores them
locally. These entries are then examined to decide the next action of Pi: output
a decision value and terminate, toss a coin to advance to the next round, jump
to a higher round, etc.
This initial copying of each phase is the main source of inefficiency in original
CIL: a large portion of copied data is inessential for decision making. For
example, P_i need not know exactly which P_j is in a higher round, as long as
it knows some P_j is. This observation is precisely the motivation of our move
from SWMR memory to MWMR memory. By treating processes anonymously,
we reduce the number of read operations in the main loop from O(N) to O(1).
Following [Cha96], our MWMR shared memory is configured into K arrays
of bits, each of length R + 2, where R := 2⌈log N⌉. In other words, we have
mem : Z_{R+2} × Z_K → {0, 1}. (Recall that K is the size of the preference domain
and is a constant, while N is the number of participating processes.) These bits
can be interpreted as follows. For all r ∈ Z_{R+1}^+ and v ∈ Z_K, mem(r, v) = 1 if
and only if some process holds/held preference v while in round r. These entries
are thus initialized to 0. On the other hand, round-0 entries are initialized to
1 in order to avoid an initialization error³. Finally, round-(R + 1) entries are
initialized to 0 and are used for marking decision values. That is, if a process
decides on value v, it writes 1 to mem(R + 1, v).
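The memory layout just described can be written down as a small Python sketch. It assumes the logarithm in R is base 2 and represents mem as a list of R + 2 rows (one per round) with K bits each; the function name `make_mem` is hypothetical.

```python
def make_mem(n_processes, k_values):
    """Return (R, mem), where mem[r][v] is the bit for round r and preference v."""
    # R := 2 * ceil(log2 N); (N - 1).bit_length() equals ceil(log2 N) for N >= 1.
    big_r = 2 * (n_processes - 1).bit_length()
    # Row 0 starts at 1 for every preference value; rows 1..R (round markers)
    # and row R + 1 (decision markers) start at 0.
    mem = [[1] * k_values] + [[0] * k_values for _ in range(big_r + 1)]
    return big_r, mem
```

For example, `make_mem(8, 2)` gives R = 6 and eight rows of two bits, of which only row 0 is set.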
As the algorithm progresses, a slow process (i.e., one with lower round num-
ber) always adopts the preference of one of the fastest processes it sees. Intu-
itively, we say the preference value of a slow process is eliminated by that of a fast
process. (This notion is made precise in Definition 1 in Section 4.) Therefore the
number of contending preference values never increases and the algorithm ter-
minates when that number decreases to 1. If a process Pi sees itself at least two
rounds ahead of any disagreeing peer, the algorithm guarantees that the pref-
erence pi has eliminated every other contending value, therefore Pi can safely
terminate with pi.
Biased coin tosses are used to break ties in the lead pack in such a way that,
with probability 1, the number of contending preferences eventually reaches 1.
This technique is used in [CIL94] and is quite different from the more common
approach of shared coin subroutines, in which processes cast randomly generated
votes to obtain a weak shared-coin [AH90,BR91].
Although every non-faulty process is guaranteed (with probability 1) to ter-
minate after a finite number of steps, the round in which it terminates can
become arbitrarily high. This requires an unbounded number of registers and
is infeasible. Therefore we stop our algorithm when it reaches a certain round
without successful termination, in which case we switch to a slower algorithm
that requires bounded memory. We call this the exit algorithm. For convenience,
the original CIL algorithm is chosen for this purpose⁴. We will show that any
exit algorithm is invoked with probability at most 1/N, therefore the higher cost
of original CIL does not affect overall expected complexity.
³ As noted in [CH05], original CIL contains this error: a process may decide for its
own preference before a disagreeing process has a chance to write to mem.
⁴ Technically, original CIL requires registers with unbounded size. However, according
to [CIL94], the probability of non-termination is already extremely small (2^−56)
when each register is 128 bits.
Figure 1(a) contains the pseudocode for process P_i. The numbered lines can
be described informally as follows.
(1) Check if some process has decided.
(2) If so, decide for the same value.
(3) Check if a disagreeing process has reached round r_i − 1.
(4) If not, write 1 to mem(R + 1, p_i) and terminate with output p_i.
(5) Otherwise, if round R is reached, run the original CIL algorithm.
(6) Otherwise, check if some process has reached round r_i + 1.
(7) If not, advance to the next round with probability 1/(2N).
(8) Otherwise, run subroutine Jump to catch up with faster processes.
Notice that the atomicity assumption discussed in Section 1 applies to Line (7).
This prevents the adversary from selectively delaying write operations of processes
that are ready to advance to the next round.
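The eight steps above can be exercised with a small sequential simulation. This is a sketch under strong simplifying assumptions, not the algorithm as analysed in this paper: a whole phase runs as one indivisible step, phases are scheduled uniformly at random rather than by an adversary, there are at least two processes, Jump is replaced by a single-round step, and the exit algorithm is represented only by the placeholder `EXIT`. All function names are hypothetical.

```python
import random

EXIT = "exit"  # placeholder outcome: the exit algorithm is not modelled here


def read_mem(mem, r, p):
    """Return the first k != p with mem[r][k] == 1, or K if none exists."""
    k_values = len(mem[r])
    for k in range(k_values):
        if k != p and mem[r][k] == 1:
            return k
    return k_values


def phase(i, state, mem, big_r, n, rng):
    """Run one pass of the main loop for process i; return a decision or None."""
    k_values = len(mem[0])
    r, p = state[i]
    d = read_mem(mem, big_r + 1, k_values)        # (1)
    if d != k_values:
        return d                                  # (2)
    if r > 0:
        behind = read_mem(mem, r - 1, p)          # (3)
        if behind == k_values:
            mem[big_r + 1][p] = 1                 # (4)
            return p
        if r == big_r:
            return EXIT                           # (5)
    ahead = read_mem(mem, r + 1, k_values)        # (6)
    if ahead == k_values:
        if rng.random() < 1 / (2 * n):            # (7) toss and write together
            r += 1
            mem[r][p] = 1
    else:
        r, p = r + 1, ahead                       # (8) one-round stand-in for Jump
    state[i] = (r, p)
    return None


def simulate(inputs, k_values, seed=0, max_phases=100_000):
    """Schedule whole phases uniformly at random; return one outcome per process."""
    n = len(inputs)                               # assumes n >= 2
    big_r = 2 * (n - 1).bit_length()              # R = 2 * ceil(log2 N)
    mem = [[1] * k_values] + [[0] * k_values for _ in range(big_r + 1)]
    state = [(0, p) for p in inputs]
    out = [None] * n
    active = list(range(n))
    rng = random.Random(seed)
    for _ in range(max_phases):
        if not active:
            break
        i = rng.choice(active)
        result = phase(i, state, mem, big_r, n, rng)
        if result is not None:
            out[i] = result
            active.remove(i)
    return out
```

Because each phase is indivisible here, the simulation never produces the fine-grained interleavings that make the correctness proofs non-trivial; it is useful only for following the control flow of Lines (1) to (8).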
Figures 1(b) and 1(c) contain the subroutines ReadMem and Jump, respectively.
The former is used to read from the shared memory, while the latter is
used by a lagging process to catch up with the fastest processes. When called
with parameters r and p, ReadMem scans one-by-one the r-th entry of every bit
vector, except for the p-th. In other words, we would like to know if any process
has reached round r with preference other than p. It returns the first k such
that both k ≠ p and, at the time of read access, mem(r, k) = 1. If no such k is
encountered, ReadMem returns K.
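A minimal Python rendering of ReadMem, assuming mem is a list of rows indexed by round, each row holding K bits. In the shared-memory setting every access to `mem[r][k]` would be a separate read operation that other processes may interleave with; the sketch below is purely sequential.

```python
def read_mem(mem, r, p):
    """Return the first k != p with mem[r][k] == 1, or K if there is none."""
    k_values = len(mem[r])
    for k in range(k_values):
        # Skip the p-th bit vector; passing p = K therefore skips nothing.
        if k != p and mem[r][k] == 1:
            return k
    return k_values
```

Calling it with p = K, as Lines (1) and (6) of the main algorithm do, asks whether any preference at all is marked in round r.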
In every pass through the while-loop of Figure 1(a), ReadMem is called with
at most three round numbers: R + 1, ri − 1, and ri + 1. This does not reveal
the highest round ever reached by any process. Therefore, a separate subroutine
Jump is run when the process sees itself behind. This is a key difference between
our algorithm and original CIL: in exchange for fewer read operations in the
main loop, more work is needed for slower processes to catch up.
The subroutine Jump can be implemented in various ways. A simple solution
is to increment ri by 1 and then call ReadMem once to find the least k such
that mem(ri, k) = 1. This requires a constant amount of work per invocation of
Jump and is implemented in our PRISM models. However, a process lagging way
behind may need to step through the main loop as many as R times in order
to catch up. Hence we opt for a faster implementation, which is essentially a
binary search on mem. This involves O(log(logN)) operations per invocation of
Jump, but a lagging process can catch up with the leaders in one complete phase
(provided no further progress is made in the mean time).
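The binary-search idea can be illustrated as follows. Note that this is a standard textbook binary search over the rounds r, . . . , R, written for clarity; it is not a transcription of Figure 1(c). It assumes that mem does not change during the search and that the rounds with at least one marked bit form a prefix of 0, . . . , R, so that "some bit is marked in round l" is monotone in l.

```python
def read_mem(mem, r, p):
    """Return the first k != p with mem[r][k] == 1, or K if there is none."""
    k_values = len(mem[r])
    for k in range(k_values):
        if k != p and mem[r][k] == 1:
            return k
    return k_values


def jump(mem, big_r, r, p):
    """Return (round, preference) for the highest marked round in [r, R].

    Assumes mem[r][p] == 1 on entry, mirroring the precondition of Lemma 3.
    """
    k_values = len(mem[0])
    lo, hi, pref = r, big_r, p        # invariant: mem[lo][pref] == 1
    while lo < hi:
        mid = (lo + hi + 1) // 2      # round up so that lo always advances
        u = read_mem(mem, mid, k_values)
        if u != k_values:
            lo, pref = mid, u         # round mid is marked: search above it
        else:
            hi = mid - 1              # round mid is empty: search below it
    return lo, pref
```

Each iteration halves the interval [lo, hi], so the number of ReadMem calls is logarithmic in R − r, which is O(log(log N)) for R = 2⌈log N⌉.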
4 Validity and Agreement
In this section, we treat all coin tosses as non-deterministic choices. Let s_0 denote
the initial state of our system, where all N processes as well as the shared
memory have been properly initialized. A path of the system is a finite sequence
of states s_0 s_1 . . . s_m where, for all j ∈ Z_m, s_{j+1} can be obtained from s_j by
allowing exactly one non-faulty process to execute its next instruction. A state
s is reachable if there is a path ending in s. Finally, a value k ∈ Z_K is said to be
valid if there is i ∈ Z_N such that k equals the input p_i^0 to process P_i.
We use record notation to indicate valuation of variables. For example, s.r_i
denotes the round number of P_i in state s. If P_i is running a subroutine (e.g.
ReadMem), we add subscript i to variables of that subroutine (e.g. s.k_i and
s.v_i). This should not cause any confusion because each process runs at most
one instance of any subroutine at any given point of the execution.

ModifiedCIL(i, p_i^0)
  local variables
    // round number
    r_i ∈ Z_{R+2},
    // preference
    p_i ∈ Z_K,
    // decision value
    d_i ∈ Z_{K+1},
    // values read from memory
    ahead_i, behind_i ∈ Z_{K+1}
  begin
    p_i := p_i^0; r_i := 0;
    while r_i ≤ R do
(1)   d_i := ReadMem(R + 1, K);
(2)   if d_i ≠ K then return d_i;
      if r_i > 0 then {
(3)     behind_i := ReadMem(r_i − 1, p_i);
(4)     if behind_i = K then {
          mem(R + 1, p_i) := 1;
          return p_i
        }
(5)     elseif r_i = R then return OriginalCIL(i, p_i)
      }
(6)   ahead_i := ReadMem(r_i + 1, K);
      if ahead_i = K then
(7)     with probability 1/(2N) do {
          r_i := r_i + 1;
          mem(r_i, p_i) := 1
        }
(8)   else ⟨r_i, p_i⟩ := Jump(r_i + 1, ahead_i)
    od
  end
(a) Main Algorithm.

ReadMem(r, p)
  local variables
    // counter
    k ∈ Z_K,
    // preference value found
    v ∈ Z_{K+1},
  begin
    k := 0; v := K;
    while k < K and v = K do
      if mem(r, k) = 1 and k ≠ p then
        v := k;
      k := k + 1
    od
    return v
  end
(b) Subroutine ReadMem.

Jump(r, p)
  local variables
    // confirmed round and preference
    r′ ∈ Z_{R+1}, p′ ∈ Z_K,
    // current round and preference
    l ∈ Z_{R+1}^+, u ∈ Z_{K+1},
    // counter
    c ∈ Z_{R+1},
  begin
    if r ≥ R then return ⟨r, p⟩;
    r′ := r; p′ := p; c := ⌈log(R − r)⌉;
    while c > 0 do
      l := r′ + 2^(c−1);
      if l ≤ R then {
        u := ReadMem(l, K);
        if u ≠ K then {
          r′ := l; p′ := u
        }
      }
      c := c − 1
    od
    return ⟨r′, p′⟩
  end
(c) Subroutine Jump.

Fig. 1. Modified CIL Algorithm
First we state some properties about mem and subroutines ReadMem and
Jump. Lemma 1 says that an entry in mem never changes from 1 to 0. Lemma 2
says that the return value of ReadMem is correct (although it may be out-of-
date). Similarly, Lemma 3 states the correctness of Jump.
Lemma 1. Let r ∈ Z_{R+2}, v ∈ Z_K and a path s_0 . . . s_m be given. Suppose there
is j ∈ Z_{m+1} with s_j.mem(r, v) = 1. Then s_{j′}.mem(r, v) = 1 for all j ≤ j′ ≤ m.
Proof. A process writes to the shared memory only if it executes Lines (4) or (7)
in Figure 1(a). In either case, the value 1 is written. Therefore, once mem(r, v)
becomes 1, it retains that value in the rest of the path. □
Lemma 2. Let r ∈ Z_{R+2}, p, v ∈ Z_{K+1} and a path s_0 . . . s_m be given. If the last
step is ReadMem(r, p) returning v ≠ K, then s_m.mem(r, v) = 1.
Proof. The return value v of ReadMem is set to a value other than K only if
the if-then clause is executed. Let s_j (0 ≤ j ≤ m) be the state from which
this instance of ReadMem reads from mem(r, v). Clearly, s_j.mem(r, v) = 1. By
Lemma 1, this also holds in s_m. □
Lemma 3. Let r, r′′ ∈ Z_{R+1}, p, p′′ ∈ Z_K and a path s_0 . . . s_m be given. Suppose
the last step is Jump(r, p) returning ⟨r′′, p′′⟩. If mem(r, p) = 1 when Jump(r, p)
is called, then s_m.mem(r′′, p′′) = 1.
Proof. We prove that mem(r′, p′) = 1 is an invariant of the while-loop in Jump.
By assumption, the claim holds for initial values r′ = r and p′ = p. Notice
that ⟨r′, p′⟩ is updated only if the if-then clause is executed, in which case
u ≠ K. Since u is the return value of ReadMem(l, K), we have by Lemma 2
that mem(l, u) = 1, hence mem(r′, p′) = 1 still holds after the update. Let
s_j be the state immediately after the last update of ⟨r′, p′⟩. Then we know
s_j.mem(r′′, p′′) = 1. By Lemma 1, this also holds in s_m. □
Lemma 4 below states that mem correctly reflects the preference history of
participating processes. Validity is then proven to be an invariant (Theorem 1).
Lemma 4. Let a path s_0 . . . s_m be given.
(i) For all i ∈ Z_N, s_m.r_i ≤ R implies s_m.mem(s_m.r_i, s_m.p_i) = 1.
(ii) For all r ∈ Z_{R+2}^+ and v ∈ Z_K, s_m.mem(r, v) = 1 implies there exist i ∈ Z_N
and j ∈ Z_{m+1} such that s_j.p_i = v.
Proof. We proceed by induction on the length of paths. For the initial state s_0,
recall that round-0 entries are initialized to 1 and all other entries to 0, therefore
the two claims hold trivially.
Now we consider a path s_0 . . . s_m s_{m+1}. Suppose the last step is taken by
process P_i. Let r denote s_m.r_i and v denote s_m.p_i. Notice that only Lines (4), (7)
and (8) in Figure 1(a) update variables r_i, p_i and mem.
– Line (4). By Lemma 1, Item (i) is trivial because r_i is not updated. Item (ii)
holds because the only entry of interest is mem(R + 1, v) and by definition
v = s_m.p_i.
– Line (7). We may focus on the case in which the coin toss is successful. Recall
that the two updates in Line (7) are assumed to be atomic. If r + 1 ≤ R,
then r < R and mem(r + 1, v) is set to 1. This proves Item (i). On the other
hand, s_{m+1}.p_i = v, therefore Item (ii) also holds.
– Line (8). Item (i) follows from Lemma 3. Item (ii) follows from the induction
hypothesis.
□
Theorem 1. The following claims hold in every reachable state s.
(i) For every i ∈ Z_N, s.p_i is valid.
(ii) For every r ∈ Z_{R+2}^+ and v ∈ Z_K, s.mem(r, v) = 1 implies v is valid.
(iii) For every i ∈ Z_N, if s.d_i ≠ K then s.d_i is valid. Similarly for s.ahead_i and
s.behind_i.
Proof. We prove these claims simultaneously using induction on the length of
paths. First consider the initial state s_0. For Item (i), every s_0.p_i is valid because
it is set to the input value p_i^0. Item (ii) holds trivially because all but round-0
entries are initialized to 0. Item (iii) is also trivial because all relevant variables
are initialized to K.
Now we consider a path s_0 . . . s_m s_{m+1}. Suppose the last step is taken by process
P_i. We examine Figure 1(a) line by line for all possible actions of P_i.
– Line (1). In this case, one update is possible: d_i is set to the return value v
of ReadMem(R + 1, K). Suppose v is not K. Then we can apply Lemma 2 to
conclude that s_{m+1}.mem(R + 1, v) = 1. Since mem is not updated in the last
step, this also holds in s_m. Applying the induction hypothesis, we conclude
that s_{m+1}.d_i = v is valid.
– Line (2). In this case, P_i terminates by returning the value s_m.d_i and there
are no variable updates. We simply apply the induction hypothesis.
– Lines (3) and (6). Similar to Line (1).
– Line (4). In this case, mem(R + 1, s_m.p_i) is set to 1. By the induction
hypothesis, s_m.p_i is valid. Therefore Item (ii) holds in s_{m+1}. Item (iii) follows
from the induction hypothesis.
– Line (7). Similar to Line (4).
– Line (8). ⟨r_i, p_i⟩ is set to the return values of Jump. Notice that this update is
executed only if s_m.ahead_i ≠ K. Therefore, we can conclude that in Line (6)
ReadMem returned a value v other than K. Moreover, notice that from then
on r_i and ahead_i have not been updated. Therefore, by Lemmas 1 and 2,
we know that mem(s_m.r_i + 1, s_m.ahead_i) = 1 at the time Jump is called.
Applying Lemma 3, we have s_{m+1}.mem(s_{m+1}.r_i, s_{m+1}.p_i) = 1. Since mem
is not updated in the last step, we have s_m.mem(s_{m+1}.r_i, s_{m+1}.p_i) = 1.
Applying the induction hypothesis, we conclude that s_{m+1}.p_i is valid.
□
Corollary 1. The modified CIL algorithm in Figure 1 is valid, assuming the
exit algorithm (in this case, the original CIL algorithm) is valid.
Next we prove agreement. A key ingredient is a predicate Φ on global states.
Definition 1. Let v, v′ ∈ Z_K and r ∈ Z_{R+1}^+ be given. We say that v eliminates v′
in round r in global state s (denoted s |= Φ(v, v′, r)) just in case s.mem(r, v) = 1
and s.mem(r − 1, v′) = 0.
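Definition 1 translates directly into a predicate on the memory contents of a single state. The sketch below assumes mem is a list of rows indexed by round and that r ≥ 1; the function name `eliminates` is hypothetical.

```python
def eliminates(mem, v, v_other, r):
    """Phi(v, v', r): v is marked in round r and v' is not marked in round r - 1."""
    return mem[r][v] == 1 and mem[r - 1][v_other] == 0
```

For instance, with rows `[[1, 1], [1, 0], [1, 0], [0, 0]]`, value 0 eliminates value 1 in round 2 but not in round 1, since every round-0 bit is initialized to 1.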
We state a string of lemmas leading to the claim that no two processes
terminating by Line (4) do so with conflicting decision values (Lemma 8). First,
if an entry mem(r, v) is marked 1, then every entry mem(r′, v) with r′ ≤ r is
also marked 1 (Lemma 5). Second, if v′ is eliminated by v in round r, then no
process subsequently reaches round r with preference v′ (Lemma 6). Finally, if
a process P_i terminates by Line (4) with value v in round r, then every other v′
must have been eliminated by v in round r at some earlier state (Lemma 7).
Lemma 5. Let s be a reachable state. For all r ∈ Z_{R+1} and v ∈ Z_K, if
s.mem(r, v) = 1 then s.mem(r′, v) = 1 for all r′ ≤ r.
Proof. We proceed by induction on the length of paths. Clearly this holds at
the initial state s_0. Consider a path of the form s_0 . . . s_{m+1} and assume the
property holds for s_m. The only case of interest is when mem(r, v) changes from
0 to 1 as a result of some process P_i executing Line (7) from s_m. In that
case, we have s_m.r_i = r − 1 and s_m.p_i = v. By Lemma 4, we may infer that
s_m.mem(r − 1, v) = 1. By the induction hypothesis, s_m.mem(r′, v) = 1 for all
r′ ≤ r − 1. Using Lemma 1, we conclude that s_{m+1}.mem(r′, v) = 1 for all r′ ≤ r.
□
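The property stated in Lemma 5 can be checked on a concrete memory snapshot, for example as an assertion inside a simulation. The sketch below assumes mem is a list of rows indexed by round and inspects rounds 0 to R only, leaving out the decision row R + 1; the function name is hypothetical.

```python
def is_downward_closed(mem, big_r):
    """Check that mem[r][v] == 1 implies mem[r - 1][v] == 1 for 1 <= r <= R."""
    k_values = len(mem[0])
    return all(
        mem[r - 1][v] == 1
        for r in range(1, big_r + 1)
        for v in range(k_values)
        if mem[r][v] == 1
    )
```

Checking adjacent rounds suffices, because the implication for all r′ ≤ r follows by chaining the one-round steps.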
Lemma 6. Let v, v′ ∈ Z_K and r ∈ Z_{R+1}^+ be given. Consider a path s_0 . . . s_m
such that s_j |= Φ(v, v′, r) for some j ∈ Z_{m+1}. Then, for all j′ ∈ {j, . . . , m},
s_{j′}.mem(r, v′) = 0.
Proof. By the definition of Φ, we have s_j.mem(r − 1, v′) = 0 and s_j.mem(r, v) = 1.
Using Lemma 5, we infer that s_j.mem(r, v′) = 0. For contradiction, suppose that
the claim does not hold. We focus on the least j′ ≥ j with s_{j′}.mem(r, v′) = 1.
Then it must be the case that s_{j′−1}.mem(r, v′) = 0 and some process P_i executes
Line (7) from s_{j′−1}. Moreover, s_{j′−1}.r_i = r − 1 and s_{j′−1}.p_i = v′.
On the other hand, using Lemma 4 and the fact that s_j.mem(r − 1, v′) = 0,
we know either s_j.r_i < r − 1 or s_j.p_i ≠ v′. Therefore, P_i must have entered
the current phase after s_j. Since mem(r, v) is 1 in every state following s_j, the
invocation of ReadMem in Line (6) of the current phase of P_i must have returned
a value other than K. This contradicts the claim that P_i executes Line (7) in
the current phase. □
Lemma 7. Consider a path s_0 . . . s_{m+1}. Suppose that in the last step some process
P_i terminates by executing Line (4). Let r denote s_m.r_i and v denote s_m.p_i.
For every v′ ≠ v, there is j′ ∈ Z_{m+1} such that s_{j′} |= Φ(v, v′, r).
Proof. Let v′ ≠ v be given and let s_j denote the first state in which P_i enters
the current phase. Thus s_j.r_i = r and s_j.p_i = v. By Lemma 4 and Lemma 1, we
have s_{j′′}.mem(r, v) = 1 for all j′′ ∈ {j, . . . , m}.
Since Line (4) is executed, r must be non-zero and the invocation of ReadMem
in Line (3) must have returned K. Let s_{j′} be the state from which ReadMem reads
from mem(r − 1, v′). Since the return value of ReadMem is K, we may infer that
s_{j′}.mem(r − 1, v′) = 0. Moreover, we have j′ > j and hence s_{j′}.mem(r, v) = 1.
Therefore s_{j′} |= Φ(v, v′, r). □
Lemma 8. Let a path s_0 . . . s_m and j, j′ ∈ Z_{m+1} be given. Assume that process
P_i terminates by Line (4) with output v from state s_j and some other process
P_{i′} does the same with output v′ from state s_{j′}. Then v = v′.
Proof. For the sake of contradiction, suppose v ≠ v′. Let r and r′ denote the final
round numbers of P_i and P_{i′}, respectively. Without loss of generality, assume that
r ≤ r′. By Lemma 7, we know that s_j |= Φ(v, v′, r), therefore s_j.mem(r, v) = 1
and s_j.mem(r − 1, v′) = 0. On the other hand, s_{j′}.r_{i′} = r′ and s_{j′}.p_{i′} = v′, so by
Lemma 4 we have s_{j′}.mem(r′, v′) = 1.
First we consider the case in which j < j′. By Lemma 6, we know that
s_{j′}.mem(r, v′) = 0. Applying Lemma 5, we have s_{j′}.mem(r′, v′) = 0, which yields
a contradiction.
Next we consider the case in which j′ < j. By Lemma 1, we may infer that
s_j.mem(r′, v′) = 1. By Lemma 5, this implies s_j.mem(r − 1, v′) = 1, contradicting
the fact that s_j |= Φ(v, v′, r). □
It remains to consider termination by Line (2). Lemma 9 below implies that
every process terminating by Line (2) must be preceded by a process terminating
by Line (4) with the same decision.
Lemma 9. Let v ∈ Z_K and a path s_0 . . . s_m be given. Assume that
s_m.mem(R + 1, v) = 1. There is j ∈ Z_{m+1} such that some process P_i terminates
with decision value v by executing Line (4) from s_j.
Proof. Let j be the index for the first state in this path such that
s_j.mem(R + 1, v) = 1. Since mem(R + 1, v) is initialized to 0, we know that j > 0. Let P_i be
the process writing to mem(R + 1, v) from s_{j−1}. Then P_i must have terminated
with decision value v by executing Line (4) from s_{j−1}. □
Theorem 2. Let a path s_0 . . . s_m be given. Assume that process P_i terminates by
executing either Line (2) or Line (4) from state s_j (j ∈ Z_{m+1}) and its decision
value is v. Similarly for P_{i′}, s_{j′} and v′. Then v = v′.
Proof. We claim that there exist j′′ ∈ Z_{m+1} and i′′ ∈ Z_N such that P_{i′′} terminates
with decision value v by executing Line (4) from s_{j′′}. If P_i terminates
by Line (4), then we simply set i′′ := i and j′′ := j. Otherwise, P_i terminates
by Line (2) and the invocation of ReadMem in Line (1) of the last phase
of P_i must have returned v ≠ K. By Lemma 2 and Lemma 1, we know that
s_m.mem(R + 1, v) = 1. We can then choose j′′ and i′′ using Lemma 9.
The same claim also holds for v′. Now we apply Lemma 8 to infer that v = v′.
□
5 Probabilistic Termination and Expected Complexity
Let us first consider the amount of work required during each phase of the
algorithm. (Recall that a phase is an entire pass through the while-loop in
Figure 1(a)). Notice each phase involves at most (i) three calls to ReadMem,
(ii) one write operation and (iii) one call to Jump. Each call to ReadMem requires
O(1) read operations, because the size K of the preference domain is a constant
in our analysis. Therefore, aside from Jump, each phase involves constant work.
Consider the while-loop in Jump. Each pass through this loop involves at
most one call to ReadMem. Furthermore, this loop is executed at most log R + 1
times. Since R = 2⌈log N⌉ by definition, each call to Jump requires O(log(log N))
read operations. This is then also the cost of a complete phase. Later on, we will
prove that the expected number of complete phases until at least one process
terminates successfully is O(N) and hence the expected number of read/write
operations is O(N log(logN)) (Lemma 13).
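The bound on the number of loop iterations in Jump can be spelled out as follows, assuming base-2 logarithms and N ≥ 2 (neither is stated explicitly in the text above):

```latex
\[
  \log R + 1
  \;=\; \log\bigl(2\lceil \log N \rceil\bigr) + 1
  \;=\; 2 + \log \lceil \log N \rceil
  \;\le\; 2 + \log(\log N + 1)
  \;=\; O(\log\log N).
\]
```

Since each iteration issues at most one ReadMem call of O(1) reads, multiplying by the expected O(N) complete phases gives the stated O(N log(log N)) total.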
For any state s, let s.rmax denote the highest round reached by any process in
state s. In other words, s.rmax := max_{i∈ZN} s.ri. Since the two updates in Line (7)
of Figure 1(a) are performed in a single step, s.rmax is also the largest r such
that s.mem(r, v) = 1 for some value v ∈ {0, . . . ,K − 1}. Lemma 10 below says,
if no leader advances to round rmax + 1, a lagging process can catch up to round
rmax in one complete phase. Lemma 11 then shows, whenever rmax is at most
R−2, the probability of at least one process terminating successfully within the
next two rounds is bounded below by a constant. Moreover, this termination
takes place before 15N complete phases are executed. The proof of Lemma 11
strongly resembles the analysis given in [CIL94].
Lemma 10. Let s0 . . . sm . . . sm′ be a path with m < m′. Assume that sj .rmax =
sm.rmax for every j ∈ {m, . . . ,m′}. Moreover, assume that Pi completes a phase
between sm and sm′ without crashing, successfully terminating or switching to
the exit algorithm. Then sm′ .ri = sm.rmax.
Proof. For readability, write rmax for sm.rmax and r for sm.ri + 1. Consider
the first complete phase executed by Pi between sm and sm′. Without loss of
generality, assume that sm is the first state in that phase and that r ≤ rmax.
By assumption, Pi does not crash, terminate, or exit. Therefore it reaches
Line (6) in this phase. By Lemma 1 and Lemma 5, r ≤ rmax implies there is
v ∈ ZK such that sj .mem(r, v) = 1 for all j ∈ {m, . . . ,m′}. Hence the invocation
of ReadMem in Line (6) returns a value other than K and Pi executes Line (8).
It remains to show Jump returns rmax for the round number.
Note that Jump returns r if r ≥ R, in which case r = R = rmax. Otherwise, let
c denote ⌈log(R − r)⌉. The while-loop of Jump calculates the following sequence
{r′_0, . . . , r′_c} of natural numbers: r′_0 is r and r′_{i+1} is
– r′_i, if r′_i + 2^{c−i} > R or ReadMem(r′_i + 2^{c−i}, K) returns K;
– r′_i + 2^{c−i}, otherwise.
From this we obtain a nested sequence of intervals:
[r′_0, r′_0 + 2^c), . . . , [r′_i, r′_i + 2^{c−i}), . . . , [r′_c, r′_c + 2^0).
It is easy to see that rmax belongs to every one of these intervals and, since
the last is a singleton, we know r′_c = rmax. This is precisely the round number
returned by Jump. □
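The search performed by Jump can be illustrated with a small Python sketch. This is not the code of Figure 1: the function name jump, the predicate occupied (standing in for a ReadMem call that returns a value other than K) and the probe schedule 2^c, 2^(c−1), . . . , 2^0 are assumptions made for illustration. The sketch also assumes that the occupied rounds form a prefix 1, . . . , rmax of the rounds, as used in the proof above, and it issues at most c + 1 probes.

```python
def jump(r, R, occupied):
    """Return (highest occupied round >= r, number of probes).

    Assumptions (hypothetical helper, not the paper's Figure 1):
    - round r is occupied and the occupied rounds form a prefix 1..rmax <= R;
    - occupied(x) models a ReadMem call on round x that does not return K.
    """
    if r >= R:
        return r, 0
    # c = ceil(log2(R - r)), computed exactly with integer arithmetic.
    c = (R - r - 1).bit_length()
    probes = 0
    # Try steps 2^c, 2^(c-1), ..., 2^0; keep a step only if its target is occupied.
    for k in range(c, -1, -1):
        candidate = r + (1 << k)
        if candidate > R:
            continue  # beyond the last round, so no read is needed
        probes += 1
        if occupied(candidate):
            r = candidate
    return r, probes


# Usage: R = 8 rounds, leader in round 6, lagging process in round 1.
rmax = 6
print(jump(1, 8, lambda x: x <= rmax))  # prints (6, 3)
```

Because each probe is kept only when its target round is occupied, the returned round is the largest occupied one, which mirrors the nested-interval argument in the proof.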
Lemma 11. Suppose ModifiedCIL starts from a reachable state s. Let r denote
s.rmax and suppose r ≤ R− 2. Then, with probability greater than 0.511, at least
one process terminates successfully in a round no higher than r+2. Moreover, at
most 15N complete phases are executed between s and the successful termination.
Proof. By assumption, at least one process survives throughout the execution
of the algorithm. Therefore, if no successful termination ever takes place, the
algorithm stops only if all surviving processes reach round R and switch to
the exit algorithm. Without loss of generality, we assume that no successful
termination occurs in round r + 1 or lower.
Consider the first complete phase following s. There are two cases.
– It is executed by a process Pi in round < r. By Lemma 10, Pi reaches round
r by the end of this phase.
– It is executed by a process Pi in round r. Then Pi reaches Line (7) in this
phase and, with probability 1/(2N), Pi advances to round r + 1.
Suppose that either the first case applies, or the second case applies but Pi fails to
advance to round r+1. Then the same case distinction can be made on the next
complete phase. This repeats until all surviving processes have reached round r
and, after that point, every complete phase involves a coin toss to advance to
round r + 1 until a success occurs. Moreover, since a lagging process catches up
to round r in one complete phase, at most N complete phases following s are
executed by processes in round strictly lower than r.
Consider the event E1, in which “a success occurs before 5N attempts to
move from r to r + 1 are made” and “all subsequent attempts to move from r
to r + 1 fail.” Notice the first condition is equivalent to “it is not the case that
all of the first 5N attempts to move from r to r + 1 fail,” which occurs with
probability 1 − (1 − 1/(2N))^{5N}. By the reasoning above, this first success occurs
before 6N complete phases are executed following s.
Let Pi be the successful process, thus the first to reach round r + 1. By our
atomicity assumption, mem(r + 1, s.pi) is set to 1 as soon as Pi reaches round
r + 1. After that point, every other Pi′ tosses a coin at most once to advance
from r to r + 1. This is because in the subsequent phase Pi′ sees that it is no longer
leading and therefore does not execute Line (7). As a result, the probability of
“all subsequent attempts to move from r to r+1 fail” is at least (1 − 1/(2N))^{N−1} and
hence the probability of E1 is at least (1 − (1 − 1/(2N))^{5N})(1 − 1/(2N))^{N−1}. Moreover,
after Pi reaches round r + 1, at most 2N − 2 complete phases are executed by
processes in round r or lower: at most N − 1 failed coin tosses to move from r
to r + 1 and at most N − 1 phases to catch up to r + 1.
By assumption, no successful termination takes place until a process has
reached round r+ 2. Thus, every complete phase executed by a process in round
r+ 1 is a coin toss to move to round r+ 2, until a success occurs. Let E2 denote
the event that “a success occurs before 5N attempts to move from r+ 1 to r+ 2
are made.” The probability of E2 given E1 is then 1 − (1 − 1/(2N))^{5N}. Similarly,
this success occurs before 6N + (2N − 2) + 5N = 13N − 2 complete phases are
executed following s and, after this success, at most 2N − 2 complete phases are
executed by processes in round r + 1 or lower.
Therefore, given E1 and E2, at least one process executes a complete phase
in round r+ 2 before 15N complete phases are executed following s. Due to E1,
no process reaches round r + 1 with preference value other than s.pi. Therefore
the first process to complete a phase in round r+2 sees no disagreement in round
r+1 or higher. It then terminates successfully by Line (4). It remains to consider
the probability of both E1 and E2 occurring. Recall that {(1 − 1/n)^n}_{n≥2} is an
increasing sequence with limit 1/e. Therefore {(1 − 1/(2n))^{n−1}}_{n≥2} is a decreasing
sequence with limit 1/√e and {1 − (1 − 1/(2n))^{5n}}_{n≥2} is a decreasing sequence with
limit 1 − 1/e^{2.5}. Therefore, the probability of both E1 and E2 occurring is at least
\[
\left(1 - \left(1 - \frac{1}{2N}\right)^{5N}\right)^{2} \cdot \left(1 - \frac{1}{2N}\right)^{N-1}
\;\geq\; \left(1 - \frac{1}{e^{2.5}}\right)^{2} \cdot \frac{1}{\sqrt{e}} \;>\; 0.511.
\]
□
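As a numerical sanity check of the final inequality, the following Python sketch evaluates the lower bound on the probability of E1 and E2 for a few values of N and compares it with its limit. The function name success_lower_bound is introduced here for illustration only; the formula itself is the one derived in the proof.

```python
import math


def success_lower_bound(n):
    """Lower bound on Pr[E1 and E2] for n processes, as derived in the proof."""
    q = 1.0 - 1.0 / (2 * n)             # probability that a single coin toss fails
    first_success = 1.0 - q ** (5 * n)  # some success among the first 5n attempts
    others_fail = q ** (n - 1)          # the at most n - 1 later attempts all fail
    return first_success ** 2 * others_fail


# Limit of the bound as n grows: (1 - e^{-2.5})^2 * e^{-1/2}.
limit = (1.0 - math.exp(-2.5)) ** 2 * math.exp(-0.5)

for n in (2, 3, 4, 16, 1024):
    print(n, round(success_lower_bound(n), 4))
print("limit", round(limit, 4))
```

The margin over 0.511 is small (the limit is roughly 0.51104), so the constant in the lemma is close to the best this particular estimate can give.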
Notice Lemma 11 applies only to executions starting in round R−2 or lower.
The next lemma covers rounds R − 1 and R, assuming a decision is reached
before switching to the exit algorithm.
Lemma 12. Suppose ModifiedCIL starts from a reachable state s. Let r denote
s.rmax and suppose R− 2 < r ≤ R. Assuming the exit algorithm is not invoked,
the (conditional) probability that at least one process terminates successfully be-
fore 15N complete phases are executed after s is greater than 0.511.
Proof. We use arguments similar to those in the proof of Lemma 11. First
suppose r = R. Then at most N − 1 complete phases are executed before
a process completes a phase in round R. Suppose Pi is the first to do so. If
Pi does not terminate by Line (4) in that phase, it must be the case that
mem(R − 1, v1) = mem(R − 1, v2) = 1 for some v1 ≠ v2. Then Pi, as well
as every other process that reaches round R, invokes the exit algorithm. By as-
sumption, this is not the case and hence Pi terminates by Line (4) in that phase.
Therefore, with probability 1, at least one process terminates before N complete
phases are executed.
If r = R − 1, then at most N − 1 complete phases are executed before a
process completes a phase in round R− 1. Similar to the previous case (r = R),
if the first process completing a phase in round R − 1 does not terminate by
Line (4) in that phase, every process reaching round R − 1 will try to advance
to round R by Line (7), until one of them succeeds. The probability of at least
one success before 4N attempts are made is 1 − (1 − 1/(2N))^{4N}, which is bounded
below by (1 − 1/e^2) > 0.864. After that success, the problem reduces to the case
where r = R and successful termination is guaranteed before N complete phases.
Therefore, with probability at least 0.864, at least one process terminates before
6N complete phases are executed. □
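The bound 1 − 1/e^2 used in this proof can be spelled out as follows, using the standard fact that (1 − 1/n)^n < 1/e for every n ≥ 1, applied here with n = 2N:

```latex
\[
\left(1 - \frac{1}{2N}\right)^{4N}
  = \left(\left(1 - \frac{1}{2N}\right)^{2N}\right)^{2}
  < \left(\frac{1}{e}\right)^{2}
\quad\Longrightarrow\quad
1 - \left(1 - \frac{1}{2N}\right)^{4N} > 1 - \frac{1}{e^{2}} \approx 0.8647 > 0.864 .
\]
```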
Theorem 3. If the exit algorithm is wait-free and satisfies probabilistic termi-
nation, the same holds for ModifiedCIL.
Proof. By correctness of the exit algorithm, we may focus on the case in which
the exit algorithm is not invoked. Consider execution blocks of 15N complete
phases each. By Lemma 11 and Lemma 12, the probability of successful termina-
tion within each block is at least 0.511. Thus, with probability 1, the algorithm
terminates successfully after a finite number of blocks. Since we have made no
assumption on the number of surviving processes, the algorithm is wait-free. □
We now turn to complexity considerations. Again, we make a case distinction
based on whether the exit algorithm is invoked.
Lemma 13. Assume that the exit algorithm is not invoked. The expected num-
ber of elementary read/write operations until at least one process terminates
successfully is O(N log(logN)).
Proof. Again we consider execution blocks of 15N complete phases each. The
expected number of blocks is:
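\[
\begin{aligned}
\sum_{n=1}^{\infty} n \cdot 0.511 \cdot (1 - 0.511)^{n-1}
  &= \sum_{n=0}^{\infty} (n+1) \cdot 0.511 \cdot 0.489^{n} \\
  &= 0.511 \cdot \Bigl( \sum_{n=0}^{\infty} 0.489^{n} + \sum_{n=1}^{\infty} 0.489^{n} + \sum_{n=2}^{\infty} 0.489^{n} + \dots \Bigr) \\
  &= 0.511 \cdot \Bigl( \sum_{n=0}^{\infty} 0.489^{n} + 0.489 \sum_{n=0}^{\infty} 0.489^{n} + 0.489^{2} \sum_{n=0}^{\infty} 0.489^{n} + \dots \Bigr) \\
  &= 0.511 \cdot \Bigl( \sum_{n=0}^{\infty} 0.489^{n} \Bigr)^{2} \\
  &= 0.511 \cdot \Bigl( \frac{1}{0.511} \Bigr)^{2} = \frac{1}{0.511} < 2.
\end{aligned}
\]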
Thus the expected number of complete phases is at most 30N . Moreover, there
are at most N − 1 incomplete phases. Since each phase involves O(log(logN))
elementary operations, the expected number of elementary operations is at most
O(N log(logN)). □
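The series in this proof is the mean of a geometric distribution, which can be checked numerically. The Python sketch below sums the series up to a cutoff; the function name expected_blocks and the cutoff terms are illustrative assumptions. Since 0.511 is only a lower bound on the per-block success probability, the value computed is an upper bound on the true expectation.

```python
def expected_blocks(p, terms=10_000):
    """Partial sum of sum_{n>=1} n * p * (1-p)^(n-1), the mean of a geometric law."""
    return sum(n * p * (1.0 - p) ** (n - 1) for n in range(1, terms + 1))


p = 0.511                     # per-block success probability from Lemmas 11 and 12
blocks = expected_blocks(p)   # numerically close to 1 / p
print(blocks, 1.0 / p)
# Each block consists of 15N complete phases, hence fewer than 30N phases overall.
print("phases per process count N:", 15 * blocks, "< 30")
```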
Lemma 14. Suppose the exit algorithm is the original CIL algorithm and is
invoked. The expected number of elementary read/write operations until at least
one process terminates successfully is O(N^2 log(logN)).
Proof. In this case the algorithm steps through all R rounds without success-
ful termination. Using a similar calculation as in the proof of Lemma 13, the
expected number of coin tosses to move from r to r + 1 is
\[
\sum_{n=1}^{\infty} n \cdot \frac{1}{2N} \cdot \left(1 - \frac{1}{2N}\right)^{n-1} = 2N.
\]
Following each success, at most N − 1 phases are executed by processes lagging
behind. Therefore, the expected number of complete phases before switching
to original CIL is at most 3NR ≤ 6N(logN + 1). The expected number of
elementary operations before switching is then O(N(logN)(log(logN))).
In [CIL94], it is shown that the expected number of elementary operations
for the original CIL algorithm is O(N^2). Therefore, the overall expected number
of elementary operations is O(N^2 log(logN)). □
Lemma 15. Suppose ModifiedCIL starts from the initial state s0. The probability
of failing to reach a decision in or before round R is at most 1/N .
Proof. By Lemma 11, this probability is at most (1 − 0.511)^{R/2}. Since R =
2⌈log N⌉, we have
\[
(1 - 0.511)^{R/2} \leq (1 - 0.511)^{\log N} < (0.5)^{\log N} = \frac{1}{N}.
\]
□
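The chain of inequalities can be checked numerically for small N. The Python sketch below takes log to be the base-2 logarithm, which is what the step (0.5)^{log N} = 1/N requires; the function names rounds and failure_bound are introduced here for illustration only.

```python
def rounds(n):
    """R = 2 * ceil(log2 n), computed with integer arithmetic (n >= 1)."""
    return 2 * (n - 1).bit_length()


def failure_bound(n):
    """Bound (1 - 0.511)^(R/2) on the probability of reaching round R undecided."""
    return (1.0 - 0.511) ** (rounds(n) // 2)


for n in (2, 3, 4, 8, 100):
    print(n, rounds(n), failure_bound(n), 1.0 / n)
```

The inequality is strict only for N ≥ 2; for N = 1 both sides equal 1.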
Putting together Lemmas 13, 14, and 15, we conclude that the expected
complexity of ModifiedCIL is O(N log(logN)).
Theorem 4. Suppose ModifiedCIL starts from the initial state s0 and the exit
algorithm is original CIL. The expected number of elementary read/write oper-
ations until at least one process terminates successfully is O(N log(logN)).
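The way the three lemmas combine can be made explicit with the law of total expectation. In this sketch, W denotes the number of elementary operations until the first successful termination and X the event that the exit algorithm is invoked; both symbols are introduced here for illustration, and Lemma 15 is read as a bound on the probability of X.

```latex
\[
\begin{aligned}
\mathbb{E}[W]
  &= \Pr[\neg X] \cdot \mathbb{E}[W \mid \neg X] + \Pr[X] \cdot \mathbb{E}[W \mid X] \\
  &\leq 1 \cdot O(N \log\log N) + \frac{1}{N} \cdot O(N^{2} \log\log N)
     && \text{(Lemmas 13, 15 and 14)} \\
  &= O(N \log\log N).
\end{aligned}
\]
```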
6 Model Checking
It turns out to be quite straightforward to specify our algorithm in PRISM’s
state-based input language. Each process is modeled as a module and the shared
memory is modeled using global variables. Two more global variables are used
to keep track of process failures and the number of completed phases.
We consider binary consensus (i.e., K = 2) with N = 2, 3, 4 processes.
Processes are assumed to disagree initially, therefore validity is trivial. Agree-
ment is satisfied in all models constructed. For probabilistic termination, we
ask PRISM to compute the (exact) minimum probability of at least one process
terminating successfully, given an allowance of R = 2⌈log N⌉ rounds and
15N · R/2 = 15N⌈log N⌉ complete phases. This result is compared against our
analytic lower bound of 1 − 1/N.
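For concreteness, the round and phase allowances can be computed for the instance sizes considered here. The Python sketch below is only an illustration of the arithmetic; the function name model_parameters is hypothetical and log is taken to base 2.

```python
def model_parameters(n):
    """Round allowance R = 2*ceil(log2 n) and phase allowance 15*n*ceil(log2 n)."""
    ceil_log = (n - 1).bit_length()   # ceil(log2 n) for n >= 1
    return 2 * ceil_log, 15 * n * ceil_log


for n in (2, 3, 4):
    r, phases = model_parameters(n)
    print(f"N={n}: R={r}, phases={phases}, 1 - 1/N = {1 - 1 / n:.3f}")
```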
In the case of N = 4, the model becomes too complex (with 2⌈log N⌉ = 4
rounds and 15N⌈log N⌉ = 120 complete phases). However, we discover that the
analytic bound of 1 − 1/N = 0.750 is already met when we restrict to 40 complete
phases. This suggests that we have made some overly conservative estimates
while deriving the analytic bound.
The table below summarizes our results. We use PRISM version 2.1, running
on a 1.4 GHz Pentium M machine with 500 Mb memory under Linux 2.6. The
MTBDD engine is used with a CUDD memory limit of 400 Mb. Other parameters
remain at default settings. All relevant files, including model checking logs, can
be found in [Che05].
               Model                 Agreement  Termination
N  R  #Phases  #States      Time(s)  Time(s)    Time(s)  MinProb  AnalyticBd
2  2  30       42,320       4        0.025      6        0.745    0.511
3  4  90       12,280,910   213      0.094      2,662    0.971    0.667
4  2  60       45,321,126   429      0.078      602      0.755    0.511
4  4  40       377,616,715  5,224    3.926      55,795   0.765    0.750
7 Conclusions
We have given a relatively simple algorithm that solves asynchronous wait-free
consensus in expected O(N log(logN)) total work. To the best of our knowledge, this
is the most efficient algorithm (in terms of expected total work) for dynamic
adversaries. Moreover, our atomicity assumption is more benign than common
assumptions such as value-oblivious and write-oblivious, making our algorithm
more widely applicable.
We make use of MWMR memory in order to reduce global and local data.
This strategy, also adopted in [Cha96,Aum97], leads to more efficient consensus
algorithms. Interestingly, it also makes model checking significantly more feasi-
ble, for it helps to avoid the typical state explosion problem. MWMR memory
is often regarded as a stronger primitive than SWMR memory. Indeed, there
are optimal implementations of MWMR from SWMR physical registers using
linear time and logarithmic space [IS92]. However, dissenting opinions appear
in [BPSV00], where the authors argue that SWMR memory requires the hid-
den assumption of naming : existence of distinct identifiers known to all. In that
sense, MWMR is weaker than SWMR.
In theory, we can implement our MWMR algorithm using physical SWMR
registers via an O(N) emulation of MWMR from SWMR. Then the total number
of physical memory operations is raised by a factor of N to O(N^2 log(logN)),
but the total number of logical memory operations remains at O(N log(logN)).
This property can be quite useful in reducing network communication costs,
in case remote processes participate in the consensus protocol. In other words,
one can aim at an implementation in which the traffic to and from a remote
participant corresponds to the number of logical operations performed by that
participant, as opposed to the much higher number of physical operations.
For future work, we want to improve the per process work bound of our algo-
rithm. In [AW96], a similar improvement is achieved by allowing fast processes
to cast votes of increasing weights. However, their proofs rely on properties of
martingale processes, which are not directly applicable to our algorithm. It is
also possible that in our setting per process work is inherently high, e.g. Ω(N/f(N)),
where f is a polylogarithmic function.
Another possibility for future work is to consider contention cost, which mea-
sures the amount of conflict in memory access [AB04]. The contention cost for
ModifiedCIL is high because, in a roughly synchronous execution, all N pro-
cesses try to access a constant number of registers at the same time. It would
be interesting to modify the algorithm further to reduce contention.
Finally, we comment on model checking using PRISM. Although the current
limit seems to be 4 processes, we conjecture a vast improvement using a symmetry
reduction option, which is under development by the PRISM team. Until
symmetry reduction is available, manual abstraction can be used to increase feasibility.
That is, we manually construct an abstraction that captures core ideas
of an algorithm, while significantly decreasing the model size. We experimented
with such an abstraction of original CIL, by focusing on the shared memory and
filtering out local states of processes. Having done so, we were in fact able to
handle up to 10 processes. However, it is non-trivial to prove soundness of the
abstraction. Standard techniques such as probabilistic simulation are available
for this purpose, but substantial investment of time is required.
Overall, PRISM allows us to conduct experiments during the development
stage of an algorithm, with minimal learning effort. Although in most cases it
still cannot handle large instances of a full algorithm, it is perfectly feasible to
model check a subroutine or an abstract version. This already provides valuable
information, especially to those who simply wish to gain more insight into an
algorithm.
Acknowledgment. We thank James Aspnes for his inspiring article [Asp03]
and many helpful comments, as well as David Parker for support in using PRISM.
Also we thank the anonymous referees at OPODIS’05 for their useful suggestions.
References
[AB04] Y. Aumann and M.A. Bender. Efficient low-contention asynchronous con-
sensus with the value-oblivious adversary scheduler. Distributed Computing,
2004. Accepted in 2004.
[ACT00] M.K. Aguilera, W. Chen, and S. Toueg. Failure detection and consensus in
the crash-recovery model. Distributed Computing, 13(2):99–125, 2000.
[AH90] J. Aspnes and M. Herlihy. Fast randomized consensus using shared memory.
Journal of Algorithms, 11(3):441–461, 1990.
[Asp98] J. Aspnes. Lower bounds for distributed coin-flipping and randomized con-
sensus. Journal of the ACM, 45(3):415–450, 1998.
[Asp03] J. Aspnes. Randomized protocols for asynchronous consensus. Distributed
Computing, 16(2-3):165–175, 2003.
[Aum97] Y. Aumann. Efficient asynchronous consensus with the weak adversary
scheduler. In Proceedings of the Sixteenth Annual ACM Symposium on Prin-
ciples of Distributed Computing, pages 209–218, 1997.
[AW96] J. Aspnes and O. Waarts. Randomized consensus in expected O(n log^2 n)
operations per process. SIAM Journal on Computing, 25(5):1024–1044, 1996.
[BK98] C. Baier and M. Kwiatkowska. Model checking for a probabilistic branching
time logic with fairness. Distributed Computing, 11(3):125–155, 1998.
[BO83] M. Ben-Or. Another advantage of free choice: completely asynchronous
agreement protocols. In Proceedings of the Second Annual ACM Symposium
on Principles of Distributed Computing, pages 27–30, 1983.
[BPSV00] H. Buhrman, A. Panconesi, R. Silvestri, and P.M.B. Vitányi. On the importance
of having an identity or is consensus really universal? In Proceedings of
the 14th International Conference on Distributed Computing, volume 1914
of LNCS, pages 134–148. Springer-Verlag, 2000.
[BR91] G. Bracha and O. Rachman. Randomized consensus in expected O(n^2 log n)
operations. In Proceedings of the 5th International Workshop on Distributed
Algorithms, volume 579 of LNCS, pages 143–150, 1991.
[CH05] L. Cheung and M. Hendriks. Causal dependencies in parallel composition of
stochastic processes. Technical Report ICIS-R05020, Institute for Computing
and Information Sciences, University of Nijmegen, 2005.
[Cha96] T.D. Chandra. Polylog randomized wait-free consensus. In Proceedings of
the 15th Annual ACM Symposium on Principles of Distributed Computing,
pages 166–175, 1996.
[Che05] L. Cheung. Collection of PRISM models of the modified CIL algorithm,
2005. Available at http://www.niii.ru.nl/~lcheung/mcil/.
[CIL87] B. Chor, A. Israeli, and M. Li. On processor coordination using asynchronous
hardware. In Proceedings PODC’87, pages 86–97, 1987.
[CIL94] B. Chor, A. Israeli, and M. Li. Wait-free consensus using asynchronous
hardware. SIAM Journal on Computing, 23(4):701–712, 1994.
[DLS88] C. Dwork, N. Lynch, and L. Stockmeyer. Consensus in the presence of partial
synchrony. Journal of the ACM, 35(2):288–323, 1988.
[FLP85] M. Fischer, N.A. Lynch, and M.S. Paterson. Impossibility of distributed
consensus with one faulty process. Journal of the ACM, 32(2):374–382, 1985.
[HW90] M.P. Herlihy and J.M. Wing. Linearizability: a correctness condition for
concurrent objects. ACM TOPLAS, 12(3):463–492, 1990.
[IS92] A. Israeli and A. Shaham. Optimal multi-writer multi-reader atomic regis-
ter. In Proceedings of the 11th Annual ACM Symposium on Principles of
Distributed Computing, pages 71–82, 1992.
[KN02] M. Kwiatkowska and G. Norman. Verifying randomized Byzantine agree-
ment. In Proc. Formal Techniques for Networked and Distributed Systems
(FORTE’02), volume 2529 of LNCS, pages 194–209, 2002.
[KNS01] M. Kwiatkowska, G. Norman, and R. Segala. Automated verification of a
randomized distributed consensus protocol using Cadence SMV and PRISM.
In Proceedings CAV’01, volume 2102 of LNCS, pages 194–206, 2001.
[PRI] PRISM web site. http://www.cs.bham.ac.uk/~dxp/prism.
