On the possibility and impossibility of achieving clock synchronization  by Dolev, Danny et al.
JOURNAL OF COMPUTER AND SYSTEM SCIENCES 32, 230-250 (1986)
On the Possibility and Impossibility 
of Achieving Clock Synchronization* 
DANNY DOLEV 
Hebrew University, Givat Ram, 91904 Jerusalem, Israel
JOSEPH Y. HALPERN AND H. RAYMOND STRONG
IBM Research Laboratory, San Jose, California 95193
Received November 5, 1984 
It is known that clock synchronization can be achieved in the presence of faulty processors 
as long as the nonfaulty processors are connected, provided that some authentication techni- 
que is used. Without authentication the number of faults that can be tolerated has been an 
open question. Here we show that if we restrict logical clocks to running within some linear 
functions of real time, then clock synchronization is impossible without authentication when 
one-third or more of the processors are faulty. We also provide a lower bound on the 
closeness to which simultaneity can be achieved in the network as a function of the
transmission and processing delay properties of the network. © 1986 Academic Press, Inc.
1. INTRODUCTION 
The problem of achieving clock synchronization in the presence of faults has 
attracted much attention recently [LM, Ma, HSSD, LL1]. In [HSSD] an
algorithm is presented that uses authentication (the ability to generate unforgeable
signatures) and achieves synchronization with arbitrarily many faulty processors or
communication links, provided that correct processors are not disconnected. Lamport
and Melliar-Smith [LM] and Lundelius and Lynch [LL1] present algorithms
that do not require authentication, but work only if fewer than one-third of the 
processors are faulty. It is known that Byzantine agreement cannot be achieved 
without authentication if at least one-third of the processors are faulty [LSP, PSL]. 
Until now the corresponding question for clock synchronization has remained 
open. It has been conjectured that the answer would be the same [LM]. In his 
invited address at the PODC Symposium in Montreal, Leslie Lamport challenged 
his listeners to provide a proof of impossibility [La]. In this paper we provide 
proofs of both possibility and impossibility, and show how sensitive the proofs are 
to the precise definition of clock synchronization. For the specific question posed by
Lamport with respect to the model of [LM], we provide the expected proof of
impossibility.
* A preliminary version of this paper appeared in the Proceedings of the Sixteenth Annual ACM Symposium of Theory of Computation, Washington, D.C., 1984, and as IBM RJ 4218.
Copyright © 1986 by Academic Press, Inc. All rights of reproduction in any form reserved.
For simplicity, we assume that each processor has a physical clock or duration 
timer D, and a designated register TAR called the time adjustment register. Physical 
clocks of correct processors drift away from real time by at most a bounded rate. 
The physical clock is never altered by the processor, but the time adjustment 
register may be altered as a result of a processor’s internal operations, receipt of messages, or
an indication from its physical clock that a specific amount of time has elapsed. A
processor’s logical clock time C is the sum of TAR and D. Roughly speaking, an
algorithm achieves clock synchronization if at all times the logical clock times of all
correct processors are only a bounded distance apart. 
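As a minimal sketch of this model, a logical clock can be represented as an unalterable duration timer plus an adjustable register. The class name, the method names, and the linear form of the physical clock are illustrative assumptions, not notation from the paper.

```python
class LogicalClock:
    """Toy model of C(t) = D(t) + TAR, where D itself is never altered."""

    def __init__(self, rate=1.0):
        self.rate = rate  # hypothetical constant rate of the physical clock
        self.tar = 0.0    # time adjustment register (TAR)

    def physical(self, t):
        # D(t): a linear physical clock, assumed here for illustration only.
        return self.rate * t

    def logical(self, t):
        # C(t) = D(t) + TAR
        return self.physical(t) + self.tar

    def set_logical(self, t, value):
        # Adjust only TAR so that the logical clock reads `value` at time t.
        self.tar = value - self.physical(t)


def max_skew(clocks, t):
    """Largest difference between logical clock readings at real time t."""
    readings = [clock.logical(t) for clock in clocks]
    return max(readings) - min(readings)
```

In these terms, synchronization asks that `max_skew` over the correct processors stay below some fixed bound at all times.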
Note that there is a trivial algorithm for clock synchronization: namely, 
whenever a processor’s logical clock reads some predetermined value T, it is reset 
(by adjusting TAR) to read 0. We can eliminate this trivial solution by requiring 
that the range of a processor’s logical clock be unbounded. However, as we show in 
Theorem 1 below, there is still a clock synchronization algorithm where the range 
of every processor’s logical clock is unbounded that does not require any message 
passing. In this algorithm, a processor’s logical clock runs at a rate that is roughly 
the logarithm of that of its physical clock. We eliminate this solution by requiring 
that the logical clock stay within a linear envelope of the physical clock, i.e., by 
requiring that there be constants a, b, c, and d such that $aD(t) + b \le C(t) \le cD(t) + d$
for all times t. (This condition is satisfied by all the clock synchronization
algorithms mentioned above.) In this case we can show that clock synchronization 
is not achievable with one-third or more faulty processors (Theorem 2). 
Finally we consider (Theorems 4 and 5) the degree to which simultaneity can be 
achieved in a network. We show that for every network there is a lower bound A 
such that no algorithm for clock synchronization can ensure that the difference 
between the (real) times at which two correct processors read a given value on their 
clocks is less than A. A is a function of the uncertainty in the time required to trans- 
mit and process a message. Results of [HSSD] show that this bound can essentially 
be achieved. These theorems bound the degree to which coordination can be 
achieved in a given network. If we want two events to happen simultaneously in a
network, the best we can do is to guarantee that they happen at times separated by 
no more than A. 
2. THE MODEL 
Each processor is connected to others via links. We do not assume that our 
network is completely connected; all our results hold regardless of the network 
topology. Processors send messages to each other along the communication links, 
where a message is just a word over some fixed alphabet. As in most previous
papers, we assume that a processor can always tell from which immediate neighbor 
in the communication network a message has come. We also assume that
associated with a link from processor p to q are two numbers $H(p, q)$ and $L(p, q)$
that represent known upper and lower bounds on the time to transmit (and
process) a message from p to q. More formally, we take a communication network N
to be a triple $(G, H, L)$, where $G = (V, E)$ is a connected directed graph, and H and
L are functions from E to the nonnegative reals such that $H(p, q) \ge L(p, q)$.
$N = (G, H, L)$ is a network with uncertain message transmission times if for each link
$(p, q)$ in N, $H(p, q) > L(p, q)$.
We assume the existence of a real time frame, not directly observable. As we men- 
tioned in the Introduction, each processor has a physical clock D and a time 
adjustment register TAR. Both of these can be viewed as real-valued functions of 
real time. A correct physical clock is one that has a bounded rate of drift from real
time. More precisely, there exists a constant R > 1 such that a correct physical clock
satisfies, for all real times $v > u$,
$$(1/R)(v - u) \le D(v) - D(u) \le R(v - u).$$
Note that this inequality implies that a correct physical clock is a continuous, 
monotonic function of real time. 
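The drift condition can be checked mechanically on a finite set of clock samples. The sketch below is an illustration only: the function name, the sample format, and the floating-point tolerance `eps` are assumptions, and a finite sample can refute the condition but never prove it for all real times.

```python
from itertools import combinations


def satisfies_drift_bound(samples, R, eps=1e-12):
    """Check (1/R)(v - u) <= D(v) - D(u) <= R(v - u) on sampled points.

    samples: iterable of (real_time, clock_reading) pairs.
    R: assumed drift constant, R > 1.
    """
    ordered = sorted(samples)
    for (u, d_u), (v, d_v) in combinations(ordered, 2):
        elapsed = v - u      # real time that passed
        advance = d_v - d_u  # time that passed on the physical clock
        if advance < elapsed / R - eps or advance > R * elapsed + eps:
            return False
    return True
```

A clock that runs backward between two samples fails the lower bound, which reflects the monotonicity noted above.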
A correct processor is one that behaves according to its algorithmic specifications 
and possesses a correct physical clock. No assumption is made about the behavior 
of a faulty (not correct) processor. 
We view TAR(t) as a function which is continuous on left-open right-closed
intervals, where moreover, for all t, the limit of TAR(t’) as t’ approaches t from the
right exists. We denote this limit $\mathrm{TAR}(t^+)$. Intuitively, if TAR is reset at discrete
times, $\mathrm{TAR}(t^+)$ is the value of TAR just after it has been reset at time t. In all our
algorithms, TAR will in fact be reset only at discrete times, so that it will be constant
on left-open right-closed intervals. Altering TAR at these discrete times
corresponds to starting a new clock in the notation of [LM, LL1, HSSD]. By
definition, $C(t) = D(t) + \mathrm{TAR}(t)$; we take $C(t^+) = D(t) + \mathrm{TAR}(t^+)$. We think of
C(t) as representing a processor’s logical clock at real time t.
We assume that all processors start up with their logical clocks, physical clocks, 
and TAR all set to 0. We do not need to assume that all processors start up at the 
same time, but we will assume that the processors all start within a short time span. 
Moreover, we assume that there is some constant B, depending only on the 
network, such that physical clocks of correct processors differ by at most B at the 
first point when all correct processors are up. Thus, correct clocks start out being 
close together; the problem is to keep them close together. We do not use this 
assumption in any essential way in our proofs; it is just used to give some initial 
bound on the difference between logical clocks. Note that if we assume that a 
processor starts up either spontaneously or on receipt of a message from another 
processor, then there is a simple algorithm that shows we can take $B \le R(n - 1)H$,
where n is the number of processors in the network, and $H = \max_{(p,q)} H(p, q)$,
provided that any pair of correct processors is connected by a fault-free path of
links and processors in the network. We proceed as follows: as soon as a processor
starts up, it sends a message to all its neighbors. Clearly, within real time $(n - 1)H$,
every correct processor has received a startup message. In this interval, at most
$R(n - 1)H$ has passed on the physical clock of any processor.
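The startup flood can be simulated with a shortest-path computation. This is a sketch under stated assumptions: all processors are correct, one processor starts spontaneously at real time 0, and `delay(p, q)` is a hypothetical function returning the actual transmission time on the link from p to q (at most H).

```python
import heapq


def startup_times(neighbors, delay, first):
    """Real time at which each processor starts up.

    neighbors: dict mapping a processor to the list of its neighbors.
    delay: function (p, q) -> transmission time on the link from p to q.
    first: the processor that starts spontaneously at real time 0.
    """
    start = {first: 0.0}
    heap = [(0.0, first)]
    while heap:
        t, p = heapq.heappop(heap)
        if t > start[p]:
            continue  # stale queue entry; p already started earlier
        for q in neighbors[p]:
            arrival = t + delay(p, q)
            if arrival < start.get(q, float("inf")):
                start[q] = arrival  # q starts on first receipt of a message
                heapq.heappush(heap, (arrival, q))
    return start
```

On a line of four processors with every delay equal to H, the last processor starts at exactly (n - 1)H, so the real-time bound is attained in that case.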
We assume that processors perform actions such as sending messages and 
updating internal variables such as TAR according to some algorithm. An 
algorithm, in turn, is a function both of the messages sent and received and of the 
physical clock reading. To make this precise, we define processor p’s message 
history at real time t to consist of a finite sequence of tuples of the form 
(q, m, T, y) (intuitively, one tuple for each message that it has sent or received up 
to, but not including, time t), where q indicates that the message was sent to or 
received from processor q, m is the value of the message, T indicates that $D_p(t') = T$
at the real time t’ (<t) when the message left or arrived, and y is “sent” or 
“received.” Note that the messages marked “sent” in p’s message history are redun- 
dant, in the sense that they are determined by the algorithm run and the message 
input. We include them here so that it is clear from p’s message history exactly what 
messages p sent, even without knowing what algorithm p is running. This will be 
convenient when we describe our “two-faced” algorithm below. 
We use $H_p$ to denote processor p’s message history function; $H_p(t)$ is p’s message
history at time t. For the most part in this paper we concentrate on deterministic
algorithms; an action performed by processor p at real time t according to a deterministic
algorithm $\mathcal{A}$ is, by definition, a function of $D_p(t)$ and $H_p(t)$. We comment
in the concluding section on the extent to which our results hold for probabilistic 
algorithms, where the actions performed by a processor at time t may also depend 
on the result of a random coin toss. 
We define a scenario to be a specification of the functions $D_p$ and $H_p$ for each
processor p in the network and of the algorithm run by each processor. A run of an
algorithm $\mathcal{A}$ in a communication network N is just a scenario in which all the
processors in N run $\mathcal{A}$. We will say that two scenarios cannot be distinguished by
processor p if p is running the same algorithm in both scenarios, and at all physical
clock times T, it has the same message history in both scenarios. From our
definitions it follows that if p is running a deterministic algorithm in two scenarios
that it cannot distinguish, then it performs the same actions in both scenarios at all 
physical clock times T. This observation will be crucial in the proof of our 
impossibility results. 
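The definitions of message history and indistinguishability can be made concrete as follows. This is a sketch only: the type and function names are assumptions, the entries follow the (q, m, T, y) tuple form given above, and the comparison is made at a finite list of sample clock times rather than at all times.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    peer: str      # q: the processor the message was sent to or received from
    message: str   # m: the value of the message
    clock: float   # T: reading of p's physical clock at send or receipt
    kind: str      # y: "sent" or "received"


def history_before(entries, T):
    """Message history at physical clock time T (entries stamped before T)."""
    return [e for e in entries if e.clock < T]


def indistinguishable(entries_a, entries_b, clock_times):
    """True if the two histories agree at every sampled physical clock time.

    Assumes the processor runs the same algorithm in both scenarios.
    """
    return all(
        history_before(entries_a, T) == history_before(entries_b, T)
        for T in clock_times
    )
```

Note that only physical clock readings appear here; real time never enters the comparison, which is what lets the proofs rescale real time between scenarios.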
3. SYNCHRONIZATION WITHOUT AUTHENTICATION 
Consider the following definition of clock synchronization: 
Weak Clock Synchronization Condition (WCSC): There exist constants PER, 
DMAX, and ADJ such that, for correct processors, (a) TAR is constant except that 
it changes at logical clock times that are positive multiples of PER by an amount 
with absolute value less than ADJ, and (b) the difference between logical clocks is
always bounded by DMAX. 
As we mentioned in the Introduction, if we do not require that C(t) be unboun- 
ded, then there is a trivial algorithm to achieve WCSC. We choose PER arbitrarily. 
Then whenever C(t) = PER, we set $\mathrm{TAR}(t^+) = -D(t)$ (thus making $C(t^+) = 0$).
Clearly we have ADJ = DMAX = PER in this case, since C(t) is always between 0 
and PER for a correct processor. However, even if we require that C(t) have 
unbounded range, we still get: 
THEOREM 1. There is an algorithm that achieves WCSC, independent of the num- 
ber of faults, for which C(t) is unbounded. 
Proof. For ease of exposition, we assume that all correct processors start
simultaneously at time 0, so that $D_p(0) = 0$ for a correct processor p. The general
case (where processors start within some constant B of each other) works essen- 
tially the same way, and is left to the reader. 
The idea of the algorithm is to have a processor’s logical clock running at 
roughly the logarithm (base 2) of its physical clock. We proceed as follows. 
Choose PER = 2 units. (This choice is made simply to make some calculations 
easier, since our logarithms are to the base 2. Actually, any choice of PER would 
work.) For each time t such that $C(t) = 2i$ ($= i \cdot \mathrm{PER}$) for some positive integer i,
and $0 < \log(D(t)) \le 2i - 2$, we set $\mathrm{TAR}(t^+) = \log(D(t)) - D(t)$, thus making
$C(t^+) = \log(D(t))$. (Note that the purpose of checking that $\log(D(t)) \le 2i - 2$, or
equivalently, that $C(t) - \log(D(t)) \ge 2$, is simply to prevent TAR from being reset
infinitely often in a bounded amount of time. Without this clause, for example, we 
would have C(t) = 2 for infinitely many values of t.) 
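The reset rule can be traced with a small event-driven simulation. This is a sketch under stated assumptions: PER is fixed at 2, the simulation is parameterized by the physical clock reading D rather than by real time (the rule depends only on D), and the function name is hypothetical.

```python
import math


def simulate_log_clock(d_max):
    """Trace the logarithmic clock with PER = 2 up to physical time d_max.

    Returns (events, resets):
      events: (D, C) each time C reaches an even value 2i, before any reset;
      resets: (D, C) just after each reset, where C = log2(D).
    """
    d, c = 0.0, 0.0
    events, resets = [], []
    while True:
        next_mult = 2.0 * (math.floor(c / 2.0) + 1)  # next even value above C
        d += next_mult - c  # between resets, C advances in step with D
        if d > d_max:
            break
        c = next_mult
        events.append((d, c))
        log_d = math.log2(d)
        if 0 < log_d <= c - 2:  # the test 0 < log(D) <= 2i - 2
            c = log_d           # TAR is reset so that C reads log(D)
            resets.append((d, c))
    return events, resets
```

In this trace the first resets occur at D = 4 (C becomes 2) and D = 8 (C becomes 3), and at every event C stays between log D and 4 + log D.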
We now show that if p is a correct processor following this algorithm, then, as 
illustrated in Fig. 1, we have 
$$\max(0, \log(D_p(t))) \le C_p(t) \le \max(4, 4 + \log(D_p(t))). \qquad (*)$$
[FIG. 1. A logarithmic envelope allows synchronization without messages. Plot drawn for PER = 2, with axis label D(t).]
Suppose we can prove (*). Since, by assumption, for a correct processor p we
have $(1/R)t \le D_p(t) \le Rt$, it follows that $-\log(R) + \log(t) \le \log(D_p(t)) \le
\log(R) + \log(t)$. Thus, from (*) it immediately follows that if p and q are correct
processors, then $C_p(t)$ is unbounded and $|C_p(t) - C_q(t)| \le 4 + 2\log(R)$. Moreover,
note that every time TAR is changed, it is changed by an amount
$C_p(t) - \log(D_p(t))$. Since $\log(D_p(t)) > 0$ when TAR is changed, it follows from (*)
that TAR is changed by at most 4. Thus the algorithm achieves WCSC with
PER = 2, DMAX = 4 + 2 log(R), and ADJ = 4.
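The step from (*) to the bound on the difference between two correct clocks can be spelled out as follows. This is a worked derivation under the assumptions already stated (p and q correct, t > 0, R > 1, logarithms to the base 2); the intermediate steps are filled in here and are not taken from the paper.

```latex
\begin{align*}
C_p(t) - C_q(t)
  &\le \max(4,\, 4 + \log D_p(t)) - \max(0,\, \log D_q(t)) \\
  &=   4 + \max(0,\, \log D_p(t)) - \max(0,\, \log D_q(t)) \\
  &\le 4 + \max(0,\, \log D_p(t) - \log D_q(t)) \\
  &\le 4 + \max\bigl(0,\, (\log R + \log t) - (-\log R + \log t)\bigr) \\
  &=   4 + 2\log R .
\end{align*}
```

The third line uses the elementary inequality $\max(0, x) - \max(0, y) \le \max(0, x - y)$. Exchanging the roles of p and q gives the same bound for $C_q(t) - C_p(t)$.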
So now it only remains to prove (*). The lower bound is immediate from the
description of the algorithm, since $C_p(t)$ is set back to $\log(D_p(t))$ and then increases
along with $D_p(t)$ until it is next reset; the algorithm also guarantees that $C_p(t)$ is
always positive. For the upper bound, we will show that for all real times t such
that $D_p(t) \ge 4$, we have $C_p(t) \le 4 + \log(D_p(t))$; since clearly $C_p(t) \le D_p(t)$, this will
complete the proof of (*).
Note that $f_p(t) = C_p(t) - \log(D_p(t))$ is continuous and increasing in the intervals
in which TAR is constant. Moreover, $f_p(t^+) = 0$ at those times t when TAR is
changed. Suppose, to obtain a contradiction, that for some time $t_1$ such that
$D_p(t_1) \ge 4$, we have $f_p(t_1) > 4$. Then there must exist a real time $t_2 < t_1$ such that
$f_p(t_2) = 2$, and TAR is not changed in the interval between $t_2$ and $t_1$. Since $C_p$ is
continuous in this interval, there must be a latest real time t’ with $t_2 < t' < t_1$ such
that $C_p(t') = 2k$ for some integer $k > 0$. Since $f_p(t') \ge 2$, we must have
$\log(D_p(t')) \le 2k - 2$. And since $t' \ge 2$, we must have $\log(D_p(t')) > 0$, and, according
to our algorithm, TAR would be changed at real time t’, contradicting the initial
assumptions. □
Note that this algorithm achieves WCSC using no message exchanges (although, 
as noted in the previous section, we may need some message exchanges to ensure 
that at the first time $t_0$ when all the processors are up, we have $D_p(t_0) \le B$ for some
constant B depending only on the communication network).
In our algorithm, we essentially kept C(t) to within a constant of log(D(t)). It is 
easy to see that we could also have achieved WCSC by keeping C(t) to within a 
constant of any linear function of log(D(t)). To achieve an impossibility result, the
requirements for clock synchronization must be somewhat strengthened. This is
done by requiring that C(t) stay within a linear function of D(t). Before we formalize
this notion of Linear Envelope Synchronization, we give another condition
that implies it, which is more in the spirit of the clock synchronization condition of
[LM].
Note that in the algorithm given in Theorem 1, for fixed i, there may be several 
times t when C(t) = i PER and TAR is reset. If, for each i, we only allow changes in 
TAR the first time that C(t) reads i PER, then the time between changes in this 
algorithm can grow unboundedly large; thus there is no bound on the difference 
between the logical clocks of correct processors. This leads us to make the following 
definition: 
Clock Synchronization Condition (CSC): There exist constants PER, DMAX, 
and ADJ with ADJ < PER such that, for all correct processors, (a) TAR is constant 
except that it changes at logical clock times that are positive multiples of PER by
an amount with absolute value less than ADJ, (b) these changes can occur only the
first time C reads i PER (i.e., if a change occurs when $C(t) = i \cdot \mathrm{PER}$, then
$C(t') \ne i \cdot \mathrm{PER}$ for all $t' < t$), and (c) the difference between logical clocks is always
bounded by DMAX. 
We need to introduce one more general notion of synchronization in order to get 
a precise statement of our results: 
(U, L) Envelope Synchronization. Given two real-valued functions U(t) and 
L(t), (U, L) envelope synchronization is achieved if the logical clock C of a correct 
processor with duration timer D is bounded above by U(D(t)) and bounded below
by L(D(t)) (i.e., $L(D(t)) \le C(t) \le U(D(t))$), and there is a constant DMAX such
that logical clocks of correct processors differ by at most DMAX. A special case of
(U, L) envelope synchronization is linear envelope synchronization, which we
abbreviate as LES, where L and U are taken to be the linear functions at + b and
ct + d, respectively, with a > 0.
Linear envelope synchronization guarantees that the logical clock time of a 
correct processor is within a linear envelope of the physical clock time. But since we 
have assumed that physical clock time is within a linear envelope of real time 
(bounded by R and l/R), LES also implies that for a correct processor, logical 
clock time is within a linear envelope of real time. 
PROPOSITION 1. An algorithm that achieves CSC achieves LES. 
Proof. Suppose $\mathcal{A}$ is an algorithm that achieves CSC with parameters PER and
ADJ. For a correct processor p, TAR can be changed only when $C_p(t)$ is a multiple
of PER. If p sets its clock forward by the maximum allowable amount ADJ at every
opportunity, $C_p(t)$ will read the next multiple of PER every time PER - ADJ passes
on $D_p(t)$. If p sets its clock backward by ADJ at every opportunity, $C_p(t)$ will read
the next multiple of PER every time PER + ADJ has passed on $D_p(t)$. To see this,
note that if TAR is set back by ADJ when $C_p(t) = k \cdot \mathrm{PER}$, then
$C_p(t^+) = k \cdot \mathrm{PER} - \mathrm{ADJ}$. This means that $C_p(t^+) > (k - 1)\,\mathrm{PER}$, since by assumption ADJ < PER.
Thus the next allowable time to change TAR is when $C_p(t) = (k + 1)\,\mathrm{PER}$, after
PER + ADJ has passed on $D_p(t)$, since TAR can be changed only the first time
$C_p(t)$ reads k PER. Since it is possible that $C_p(0) = \mathrm{PER}$ (so that the first change
can be made at time 0), it can be shown that we must have, for every correct
processor p,
$$\frac{\mathrm{PER}}{\mathrm{PER} + \mathrm{ADJ}}\,(D_p(t) - D_p(0)) - \mathrm{ADJ} \le C_p(t) - C_p(0) \le \frac{\mathrm{PER}}{\mathrm{PER} - \mathrm{ADJ}}\,(D_p(t) - D_p(0)) + \mathrm{ADJ}.$$
It now easily follows that $\mathcal{A}$ achieves LES. □
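The two extreme behaviors in this proof can be traced numerically. The sketch below assumes $C_p(0) = D_p(0) = 0$ and uses adjustments of exactly ADJ, which is the limiting case (CSC requires them to be strictly smaller); the function names are hypothetical.

```python
def extreme_clock(per, adj, direction, periods):
    """(D, C) points for a clock that always adjusts by direction * adj.

    direction is +1 (always forward) or -1 (always backward).
    Two points are recorded per period: just before and just after
    the adjustment made the first time C reads k * per.
    """
    d, c = 0.0, 0.0
    points = []
    for k in range(1, periods + 1):
        d += k * per - c               # run until C first reads k * per
        points.append((d, k * per))    # just before the adjustment
        c = k * per + direction * adj
        points.append((d, c))          # just after the adjustment
    return points


def within_envelope(points, per, adj, eps=1e-9):
    """Check the linear envelope from the proof, with C(0) = D(0) = 0."""
    low_slope = per / (per + adj)
    high_slope = per / (per - adj)
    return all(
        low_slope * d - adj - eps <= c <= high_slope * d + adj + eps
        for d, c in points
    )
```

With PER = 10 and ADJ = 3, the always-forward clock reaches successive multiples of PER every 7 units of D and the always-backward clock every 13 units, and both stay inside the envelope.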
The proof of Theorem 1 above shows that (t, log(t)) envelope synchronization is 
achievable. Theorem 2 below states that LES is not achievable in a network with 
uncertain transmission times if at least one-third of the processors are faulty. And 
thus by Proposition 1, CSC is also not achievable if at least one-third of the 
processors are faulty. 
THEOREM 2. Linear envelope synchronization is impossible in a network with 
uncertain transmission times when one third or more of the processors are faulty. 
Before we can prove Theorem 2, we need to establish some notation. Suppose N
is a communication network with uncertain transmission times consisting of n
processors. Let $\rho = \min(R, \{H(p, q)/L(p, q) \mid (p, q) \text{ is a link in } N,\ L(p, q) \ne 0\})$.
Choose $r_1, \ldots, r_n$ with $1 \le r_i \le \rho$, $i = 1, \ldots, n$. We define a standard scenario determined
by $r_1, \ldots, r_n$ to be one where $D_i(t) = r_i t$ and, if there is a link in N from $p_i$ to $p_j$, all
messages from $p_i$ to $p_j$ along that link have transmission time $(\rho/r_j) L(p_i, p_j)$. (So,
in particular, if $L(p_i, p_j) = 0$, then the transmission time from $p_i$ to $p_j$ is 0.) Since
$1 \le \rho/r_j \le \rho$, and $\rho L(p_i, p_j) \le H(p_i, p_j)$, this is a legal transmission along this link.
Note that for all choices of algorithms $\mathcal{A}_1, \ldots, \mathcal{A}_n$ and all choices of $r_1, \ldots, r_n$ with
$1 \le r_i \le \rho$, there is a standard scenario determined by $r_1, \ldots, r_n$, where $p_i$ runs $\mathcal{A}_i$. For
the remainder of this section, we consider only standard scenarios.
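The quantities in a standard scenario are simple to compute. The following sketch uses hypothetical function names and a dictionary of links as its input format; the arrival formula is derived here from the definitions above (a message sent at sender clock T leaves at real time T/r_sender and takes real time (ρ/r_receiver)·L), not quoted from the paper.

```python
def rho(R, links):
    """min of R and the ratios H/L over links with L != 0.

    links: dict mapping (p, q) to a pair (H, L) of delay bounds.
    """
    ratios = [H / L for (H, L) in links.values() if L != 0]
    return min([R] + ratios)


def standard_delay(rho_value, r_receiver, L):
    """Transmission time (rho / r_j) * L used in a standard scenario."""
    return (rho_value / r_receiver) * L


def is_legal(delay, H, L):
    """A transmission time is legal if it lies between the bounds L and H."""
    return L <= delay <= H


def arrival_clock(T, r_sender, r_receiver, rho_value, L):
    """Receiver's physical clock reading when a message arrives.

    The message is sent when the sender's physical clock reads T.
    """
    return (r_receiver / r_sender) * T + rho_value * L
```

The receiver's clock reading at arrival depends only on the sender's clock reading and the rate ratio.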
We say that an algorithm $\mathcal{A}$ achieves LES in network N with parameters DMAX,
a (a > 0), b, c, and d, and tolerates f faults if whenever all the correct processors in
N run $\mathcal{A}$ and there are at most f faults, for all correct processors $p_i$ and $p_j$ we have
$$|C_i(t) - C_j(t)| \le \mathrm{DMAX},$$
$$aD_i(t) + b \le C_i(t) \le cD_i(t) + d.$$
We need one more definition before we can state the crucial lemma on which the 
proof of Theorem 2 depends. If $\mathcal{B}$ and $\mathcal{C}$ are algorithms, let $\mathcal{B} + \mathcal{C}[q]$ be the
algorithm which has the following properties: processor p, running this algorithm,
sends the same messages to all processors other than q as it would if it were running
$\mathcal{B}$, and it sends the same messages to q as it would if it were running $\mathcal{C}$.
However, we must be a little careful here when we say “as if it were running $\mathcal{B}$” or
$\mathcal{C}$, since p’s message history when running $\mathcal{B} + \mathcal{C}[q]$ will include the messages it
sent according to both $\mathcal{B}$ and $\mathcal{C}$. Thus, when computing what to do according to
algorithm $\mathcal{B}$, all the messages sent to q according to $\mathcal{C}$ should be deleted. Similar
remarks hold when computing what messages to send according to $\mathcal{C}$. Thus, by
running $\mathcal{B} + \mathcal{C}[q]$, processor p displays “two-faced” behavior, pretending to
processor q that it is running $\mathcal{C}$ and pretending to all other processors that it is running
$\mathcal{B}$. Processor p sends messages to q according to algorithm $\mathcal{C}$ if it is running
$\mathcal{B} + \mathcal{C}[q]$ for some algorithm $\mathcal{B}$.
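The two-faced composition can be sketched as a higher-order function. This is an illustration under assumptions: an algorithm is modeled as a function from a physical clock reading and a message history to a list of (recipient, message) pairs, history entries are tuples (peer, message, clock, kind), and the filtering implements only the deletion described above.

```python
def two_faced(alg_b, alg_c, q):
    """Combine two algorithms: behave as alg_c toward q, as alg_b otherwise."""

    def composite(T, history):
        # History as seen by the first face: drop what was sent to q,
        # since those messages were produced by the second face.
        view_b = [e for e in history if not (e[3] == "sent" and e[0] == q)]
        # History as seen by the second face: drop what was sent to others.
        view_c = [e for e in history if not (e[3] == "sent" and e[0] != q)]
        to_others = [(r, m) for (r, m) in alg_b(T, view_b) if r != q]
        to_q = [(r, m) for (r, m) in alg_c(T, view_c) if r == q]
        return to_others + to_q

    return composite
```

Received messages are left in both views, so each face reacts to everything the processor actually received.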
The following lemma provides the key step of our proof of Theorem 2 in the case 
of three processors, one of which is faulty. Roughly speaking, it says that, for all n, 
there is an algorithm that the faulty processor can run which forces the logical 
clock of one of the correct processors to run at a rate greater than $a\rho^n$, no matter
what messages the other correct processor sends. For ease of exposition, we have 
done the three-processor case first: we consider the general case in Lemma 2 below. 
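To see why a rate of $a\rho^n$ matters, recall that LES also caps the logical clock rate by the upper slope c. The sketch below only illustrates how quickly the forced rate outgrows that cap; the function name is an assumption and the computation is not part of the paper's proof.

```python
def rounds_to_exceed(a, c, rho):
    """Smallest integer n >= 0 with a * rho**n > c.

    Assumes a > 0 and rho > 1, as in the definitions of LES and of rho.
    """
    if a <= 0 or rho <= 1:
        raise ValueError("requires a > 0 and rho > 1")
    n = 0
    while a * rho ** n <= c:
        n += 1
    return n
```

For example, with a = 0.5, c = 2 and ρ = 1.5, the forced rate first exceeds c at n = 4.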
LEMMA 1. Let N be a communication network with uncertain transmission times
consisting of three processors, and let $\mathcal{A}$ be an algorithm which achieves LES in N
with parameters DMAX, a, b, c, d and tolerates one fault. For all integers $n \ge 0$, all
permutations ijk of {1, 2, 3}, and all $r_i, r_j, r_k$ with $1 \le r_i, r_j, r_k \le \rho$, there is an
algorithm $\mathcal{F}_n(ijk, r_i, r_j, r_k)$ such that in the standard scenario with $D_h(t) = r_h t$,
h = 1, 2, 3, where $p_i$ sends messages to $p_j$ according to $\mathcal{F}_n(ijk, r_i, r_j, r_k)$, and $p_j$ and $p_k$
are correct and run $\mathcal{A}$, we have, for all real times t, $C_j(t) \ge a\rho^n D_j(t) + b - n\,\mathrm{DMAX}$.
Proof. Fix N and $\mathcal{A}$ as in the hypothesis of the lemma. We define
$\mathcal{F}_n(ijk, r_i, r_j, r_k)$ for all choices of $r_i, r_j, r_k$, by induction on n. For the case n = 0, for
all choices of $r_i, r_j, r_k$, we take $\mathcal{F}_0(ijk, r_i, r_j, r_k) = \mathcal{A}$. Since $\mathcal{A}$ achieves LES with
parameters DMAX, a, b, c, d, and tolerates one fault, then as long as $p_j$ and $p_k$ both
run $\mathcal{A}$, $C_j(t) \ge aD_j(t) + b$ must hold in all scenarios, no matter what algorithm $p_i$
runs.
Suppose we have proved the result for n = m. We first give an informal proof of
the result for n = m + 1. When running $\mathcal{F}_{m+1}$ (we omit the parameters to $\mathcal{F}$ in this
informal description), $p_i$ pretends that it is running $\mathcal{A}$, with its physical clock running
at a rate $\rho$ times the rate of $p_j$, and that $p_k$ is sending it messages that result in
$C_i(t) \ge a\rho^m D_i(t) + b - m\,\mathrm{DMAX}$ (by running $\mathcal{F}_m$). Now $p_j$ cannot distinguish a
scenario where $p_i$ is running $\mathcal{F}_{m+1}$ and $p_j$ and $p_k$ are both running $\mathcal{A}$ from one
where $p_i$ is correctly running $\mathcal{A}$ with its physical clock at a rate $\rho$ times that of $p_j$,
$p_j$ is running $\mathcal{A}$, and $p_k$ is sending $p_i$ messages according to $\mathcal{F}_m$. In the latter
scenario, by inductive hypothesis and the fact that $\mathcal{A}$ achieves LES and tolerates
one fault, we have
$$C_j(t) \ge C_i(t) - \mathrm{DMAX}$$
$$\ge a\rho^m D_i(t) + b - (m + 1)\,\mathrm{DMAX} \quad \text{(by inductive hypothesis)}$$
$$= a\rho^{m+1} D_j(t) + b - (m + 1)\,\mathrm{DMAX} \quad \text{(since } D_i(t) = \rho D_j(t)\text{)}.$$
Since $p_j$ cannot distinguish these two scenarios, this inequality must also hold in the
former scenario.
The problem with our informal proof above is that it does not suffice to say, for 
example, “$p_i$ pretends . . . that $p_k$ is . . . running $\mathcal{F}_m$.” We must also specify the rest of
the scenario, and in particular, what $p_k$’s message history is when it runs $\mathcal{F}_m$. In
order to formalize these notions, in particular the idea of constructing an algorithm 
where a processor “pretends” that another processor is trying to “fool” it, we need 
two operators on message histories and physical clocks. 
(1) If $\Sigma$ is a scenario and $\beta$ is a real-valued function, let $\mathrm{mimic}(\beta, \Sigma)$ be the
algorithm that, when applied by processor p to a pair (T, H) consisting of a
physical clock reading T and a message history H, sends message m to q if and only
if ($\beta(T)$, m, p, “received”) is in the message history of q in $\Sigma$. If $\mathcal{S}$ is a mapping
which takes a message history and returns a scenario, let $\mathrm{mimic}^*(\beta, \mathcal{S})$ be the mapping
that returns $\mathrm{mimic}(\beta, \mathcal{S}(H))(T, H)$ when applied to a pair (T, H). Note that
for mimic* to compute what message to send q at physical clock time T according
to algorithm $\mathrm{mimic}(\beta, \Sigma)$, processor p must simulate $\Sigma$ until the simulated physical
clock of processor q reads $\beta(T)$.
It is easy to see that $\mathrm{mimic}(\beta, \Sigma)$ has the following crucial property: Suppose $\Sigma'$
is a scenario such that any message sent by p at time T on p’s physical clock in $\Sigma'$
will be received by q at time $\beta(T)$ on q’s physical clock. Then if p runs $\mathrm{mimic}(\beta, \Sigma)$
in $\Sigma'$, q receives the same messages from p at the same physical clock times in both
$\Sigma$ and $\Sigma'$. We will be making frequent use of this property throughout our proof.
(2) Let output be a mapping from message histories to algorithms such that, 
if H is a message history, then output(H) is an algorithm that, when applied to a 
pair (T, H’) by processor p, sends message m to processor q at time T on its 
physical clock if and only if (T, m, q,“sent”) is in H. Thus, when p runs output(H), 
it sends exactly the same messages to all processors as in message history H, 
regardless of what messages it gets from the other processors. Note that this means 
that a processor running output(H) will stop sending messages after the greatest 
timestamp on any of the messages sent in history H. 
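Both operators can be sketched as functions that build algorithms. This is an illustration under assumptions: histories are lists of tuples (clock, message, peer, kind) as written in the two definitions above, a scenario is reduced to a dictionary from processor names to histories, the sender p and receiver q are passed explicitly, and clock readings are compared with exact equality.

```python
def output(H):
    """Algorithm that replays the 'sent' entries of history H.

    The history passed in at run time is ignored, as in the definition.
    """
    def algorithm(T, _history):
        return [(q, m) for (T0, m, q, kind) in H
                if kind == "sent" and T0 == T]
    return algorithm


def mimic(beta, scenario_histories, p, q):
    """Algorithm for p that reproduces what q received from p in a scenario.

    At physical clock time T, p sends m to q if and only if
    (beta(T), m, p, "received") is in q's history in the scenario.
    """
    def algorithm(T, _history):
        return [(q, m) for (T0, m, sender, kind) in scenario_histories[q]
                if kind == "received" and sender == p and T0 == beta(T)]
    return algorithm
```

If every message p sends at its clock time T arrives when q's clock reads beta(T), then q's receipts from p match those in the mimicked scenario, which is the crucial property used in the proof.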
We now construct $\mathcal{F}_{m+1}(ijk, r_i, r_j, r_k)$ so that for each algorithm $\mathcal{B}$, there exists
an algorithm $\mathcal{D}$ such that the standard scenarios $\Sigma$ and $\Phi$ described in the diagram
below cannot be distinguished by $p_j$. These scenarios correspond to the two
scenarios in our informal description above. Note that in scenario $\Phi$, $p_i$’s physical
clock is running at $\rho$ times the rate of $p_j$’s and that $p_k$ is sending $p_i$ messages
according to $\mathcal{F}_m(kij, 1, \rho, 1)$.
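Scenario $\Sigma$
Processor        i                                                          j               k
Algorithm        $\mathcal{B} + \mathcal{F}_{m+1}(ijk, r_i, r_j, r_k)[j]$   $\mathcal{A}$   $\mathcal{A}$
Physical clock   $r_i t$                                                    $r_j t$         $r_k t$

Scenario $\Phi$
Processor        i               j               k
Algorithm        $\mathcal{A}$   $\mathcal{A}$   $\mathcal{D} + \mathcal{F}_m(kij, 1, \rho, 1)[i]$
Physical clock   $\rho t$        $t$             $t$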
In $\Phi$ by inductive hypothesis we have $C_i(t) \ge a\rho^m D_i(t) + b - m\,\mathrm{DMAX}$. The
inequalities in the informal argument above show that in $\Phi$ we must have
$C_j(t) \ge a\rho^{m+1} D_j(t) + b - (m + 1)\,\mathrm{DMAX}$, so this inequality also holds for $\Sigma$, as
desired, since these two scenarios cannot be distinguished by $p_j$.
Thus it suffices to find $\mathcal{F}_{m+1}(ijk, r_i, r_j, r_k)$ with the desired property. In order to
give a precise definition of $\mathcal{F}_{m+1}(ijk, r_i, r_j, r_k)$, we first need to define two auxiliary
functions, $\Phi^*$ and $\Sigma^*$, from message histories to standard scenarios. (The numbers
$r_i$, $r_j$, and $r_k$ that appear as parameters to $\mathcal{F}_{m+1}$ will also be parameters of $\Phi^*$ and
$\Sigma^*$, but we suppress them here.) Given a message history H (which in the proof will
be taken to be $p_i$’s message history at various times in scenario $\Sigma$), $\Sigma^*(H)$ and
$\Phi^*(H)$ are defined by the diagram
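Scenario $\Sigma^*(H)$
Processor        i                      j               k
Algorithm        $\mathrm{output}(H)$   $\mathcal{A}$   $\mathcal{A}$
Physical clock   $r_i t$                $r_j t$         $r_k t$

Scenario $\Phi^*(H)$
Processor        i               j               k
Algorithm        $\mathcal{A}$   $\mathcal{A}$   $\mathrm{mimic}(\beta, \Sigma^*(H)) + \mathcal{F}_m(kij, 1, \rho, 1)[i]$
Physical clock   $\rho t$        $t$             $t$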
where $\beta(T) = T + \rho L(p_k, p_j)$.
Now let $\mathcal{F}_{m+1}(ijk, r_i, r_j, r_k) = \mathrm{mimic}^*(\alpha, \Phi^*)$, where $\alpha(T) = (r_j/r_i)T + \rho L(p_i, p_j)$,
and let the algorithm $\mathcal{D}$ of scenario $\Phi$ be defined via $\mathcal{D} = \mathrm{mimic}(\beta, \Sigma)$. We now
show that $\Sigma$ and $\Phi$ cannot be distinguished by $p_j$. Since a message sent by $p_k$ to $p_j$
(if there is a link between them) in scenario $\Phi$ at time T on $p_k$’s physical clock is, by
the definition of transmission times in standard scenarios, received by $p_j$ at time
$\beta(T)$ on its physical clock, the crucial property of mimic guarantees that $p_j$ receives
the same messages from $p_k$ at the same physical clock times in both $\Sigma$ and $\Phi$.
Thus, in order to show that $\Sigma$ and $\Phi$ cannot be distinguished by $p_j$, it only
remains to show that $p_j$ gets the same messages at the same physical clock times
from $p_i$ in both scenarios. Let $H_i(t)$ be $p_i$’s message history in scenario $\Sigma$ at real
time t. It is easy to see that, by construction, scenarios $\Sigma$ and $\Sigma^*(H_i(t))$ are identical
up to real time t. Using this fact we can also show that scenarios $\Phi$ and
$\Phi^*(H_i(t))$ are identical up to real time t (since everything is identical in the two
scenarios except that in $\Phi$, $p_k$ sends messages to $p_j$ according to $\mathrm{mimic}(\beta, \Sigma)$, while
in $\Phi^*(H_i(t))$, $p_k$ sends messages to $p_j$ according to $\mathrm{mimic}(\beta, \Sigma^*(H_i(t)))$). Finally,
we claim that $p_j$ receives the same messages at the same physical clock time from $p_i$
in scenarios $\Phi^*(H_i(t))$ and $\Sigma$ up to real time t. Note that once we prove this claim,
it follows immediately by transitivity that $p_j$ receives the same messages at the same
physical clock times in scenarios $\Phi$ and $\Sigma$.
To prove the claim, suppose that $p_j$ receives a message m from $p_i$ at real time
$t' < t$ in scenario $\Sigma$, and thus at physical clock time $r_j t'$. Message m must have been
sent by $p_i$ at real time $t'' = t' - (\rho/r_j) L(p_i, p_j)$, and thus at physical clock time
$T'' = r_i t''$. Since $p_i$ sends messages to $p_j$ in scenario $\Sigma$ according to mimic*($\alpha$, $\Phi^*$),
this means that $p_j$ receives m at $\alpha(T'') = r_j t'$ in scenario $\Phi^*(H_i(t''))$. Now to
complete the proof we must still show that $p_j$ also receives m at physical clock time $r_j t'$
in scenario $\Phi^*(H_i(t))$. We will prove this by proving a more general fact: namely,
    if $u < u'$, then $p_i$ sends the same messages in $\Phi^*(H_i(u))$ and $\Phi^*(H_i(u'))$ up
    to real time $r_j u$. (†)
From (†) it follows that $p_i$ sends the same messages in $\Phi^*(H_i(t''))$ and $\Phi^*(H_i(t))$ up
to real time $r_j t''$. Now $p_j$ receives m at physical clock time (and hence also real time)
$r_j t'$ in $\Phi^*(H_i(t''))$; thus $p_i$ sent m in $\Phi^*(H_i(t''))$ at real time $r_j t' - \rho L(p_i, p_j) = r_j t''$.
From (†) we get that $p_i$ also sent m at real time $r_j t''$ in scenario $\Phi^*(H_i(t))$, and
hence that $p_j$ received m at real time (and physical clock time) $r_j t'$ in $\Phi^*(H_i(t))$.
Thus it only remains to prove (†).
Clearly $\Phi^*(H_i(u))$ and $\Phi^*(H_i(u'))$ are identical except that $p_k$ sends $p_j$ messages
according to mimic($\beta$, $\Sigma^*(H_i(u))$) in one case, and mimic($\beta$, $\Sigma^*(H_i(u'))$) in the
other. From the definition of mimic, it follows that $p_k$ sends the same messages to $p_j$
in both scenarios up to real time $r_j u - \rho L(p_k, p_j)$. We leave it to the reader to
check that this means that all processors will receive the same messages in both
scenarios up to real time $r_j u$. Since $p_i$ is running the same algorithm in both
$\Phi^*(H_i(u))$ and $\Phi^*(H_i(u'))$, it must send the same messages in both scenarios up to
real time $r_j u$. This completes the proof of (†) and of the lemma. ∎
We now generalize Lemma 1 to arbitrary networks. The generalization intuitively 
says that if one-third or more of the processors in the network are faulty, then for 
all n, there is an algorithm that the faulty processors can run which forces the 
logical clocks of a non-empty subset of correct processors to run at a rate greater
than $a\rho^n$, no matter what messages the other correct processors send.
LEMMA 2. Let N be a communication network with uncertain transmission times
consisting of m processors, and let $\mathcal{A}$ be an algorithm which achieves LES in N with
parameters DMAX, a, b, c, d and tolerates f faults, where $3f \geq m$. For all integers
$n > 0$, all partitions $P_1, P_2, P_3$ of the processors in N into three disjoint sets, with
$1 \leq |P_h| \leq f$, $h = 1, 2, 3$, all permutations ijk of $\{1, 2, 3\}$, and all $r_i, r_j, r_k$ with
$1 \leq r_i, r_j, r_k \leq \rho$, there is an algorithm $\mathcal{F}_n(ijk, r_i, r_j, r_k)$ such that in the standard
scenario with $D_p(t) = r_h t$ for all processors $p \in P_h$, $h = 1, 2, 3$, where all processors in
$P_i$ send messages to the processors in $P_j$ according to $\mathcal{F}_n(ijk, r_i, r_j, r_k)$, and the
processors in $P_j$ and $P_k$ are correct and run $\mathcal{A}$, we have, for all real times t,
$C_p(t) > a\rho^n D_p(t) + b - n\,\mathrm{DMAX}$ for all processors p in $P_j$.
Proof. The proof is a straightforward generalization of that of Lemma 1. We
construct scenarios $\Sigma$ and $\Phi$ just as above. The only added point that must be
checked is that all the processors in $P_j$ get the same messages from the other
processors in $P_j$ in both $\Sigma$ and $\Phi$. The processors in $P_j$ are following $\mathcal{A}$ in both
scenarios, but in scenario $\Sigma$, their physical clocks are running at $r_j$ times the rate of
real time, while in $\Phi$ they are running at exactly the rate of real time. However,
because we are considering standard scenarios, messages to processors in $P_j$ take $r_j$
times as much real time to arrive in scenario $\Phi$ as in scenario $\Sigma$. Thus, in both
scenarios, the message transmission time as measured on the physical clocks of
processors in $P_j$ is the same. The result easily follows. ∎
Proof of Theorem 2. Suppose by way of contradiction that N is a network with
uncertain transmission times consisting of m processors, and $\mathcal{A}$ is an algorithm that
achieves LES in N with parameters DMAX, a, b, c, and d, and tolerates f faults,
with $3f \geq m$. Let $P_1, P_2, P_3$ be a partition of N such that $1 \leq |P_h| \leq f$ for $h = 1, 2, 3$,
and choose n such that $a\rho^n > c$. Consider a standard scenario where all physical
clocks are running at the rate of real time, all the processors in $P_1$ are faulty and
running algorithm $\mathcal{F}_n(123, 1, 1, 1)$, while the processors in $P_2$ and $P_3$ are all correct
and running $\mathcal{A}$. By Lemma 2, we have $C_p(t) > a\rho^n D_p(t) + b - n\,\mathrm{DMAX}$, for all
processors $p \in P_2$ and all times t. Since $a\rho^n > c$ and $D_p(t) = t$, there must be some $t'$
such that $a\rho^n D_p(t) + b - n\,\mathrm{DMAX} > c D_p(t) + d$ for all $t > t'$. This contradicts the
assumption that $\mathcal{A}$ achieves LES in N with parameters DMAX, a, b, c, and d, and
tolerates f faults. ∎
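The last step of this proof is simple arithmetic, and the following Python sketch illustrates it numerically. It picks the smallest n whose slope exceeds c and then a time after which the lower bound of Lemma 2 lies above the upper envelope. The function names and the sample parameter values are assumptions made here for illustration only; they do not come from the paper.

```python
def smallest_n(a, rho, c):
    """Smallest integer n > 0 with a * rho**n > c (assumes a > 0 and rho > 1)."""
    n = 1
    while a * rho ** n <= c:
        n += 1
    return n


def crossover_time(a, b, c, d, rho, dmax):
    """Return (n, t_prime) such that, taking D_p(t) = t, every t > t_prime has
    a * rho**n * t + b - n * dmax > c * t + d."""
    n = smallest_n(a, rho, c)
    slope_gap = a * rho ** n - c          # strictly positive by choice of n
    t_prime = max(0.0, (d - b + n * dmax) / slope_gap)
    return n, t_prime


# Hypothetical envelope a*t + b <= C(t) <= c*t + d and drift bound rho.
a, b, c, d, rho, dmax = 0.9, -1.0, 1.1, 1.0, 1.05, 0.5
n, t_prime = crossover_time(a, b, c, d, rho, dmax)
t = t_prime + 1.0
lower = a * rho ** n * t + b - n * dmax   # lower bound forced by Lemma 2
upper = c * t + d                         # upper envelope required by LES
```

With these sample values the forced lower bound exceeds the permitted upper envelope at every time after `t_prime`, which is the contradiction the proof relies on.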
We have just shown that even in a completely connected network, if one-third or 
more of the processors are faulty, LES is not achievable. But in general, networks 
are not completely connected. Dolev [Do] has shown that Byzantine agreement is 
not achievable if the connectivity of the network is less than 2f + 1 and there are f
faulty processors. By combining the techniques of the proof of Theorem 2 with
some of the techniques of [Do], we can also show that linear envelope clock syn- 
chronization is impossible if the connectivity of the network is less than 2f + 1 and 
there are f or more faulty processors. A proof of this fact (as well as another elegant 
proof of Theorem 2) can be found in [FLM]. 
The reader may wonder if we really need our assumption that the network has 
uncertain transmission times. That is, would our results still hold in a network N 
where H(p, q) = L(p, q) for all links (p, q), so that there is no uncertainty in transmission
times? It can be shown that Theorem 2 still holds in a network where
H( p, q) = L( p, q) = 0 (essentially the same proof goes through), but, as shown by 
Fischer, Lynch, and Merritt [Me], the result does not hold if H( p, q) = L( p, q) > 0. 
THEOREM 3 (Fischer, Lynch, Merritt). Let N be a network with no uncertainty in
transmission times, such that H(p, q) = L(p, q) > 0 for all links (p, q). Then there is
an algorithm $\mathcal{A}$ that achieves linear envelope synchronization and tolerates f faults iff
N has indegree 2f + 1 (i.e., each processor can receive directly from at least 2f + 1
other processors).
Proof sketch. For the upper bound, suppose N is a network with no uncertainty
and indegree 2f + 1. Assume for ease of exposition that $C_p(0) = 0$ for all correct
processors p; the general case is similar and left to the reader. We show that there
exists a constant DMAX such that if p is a correct processor, we can keep $C_p$ within
DMAX of real time, i.e., $|C_p(t) - t| < \mathrm{DMAX}$. Thus clearly we can achieve LES.
The basic idea of the algorithm is that in a network with no uncertainty, a
processor can know exactly how much real time has passed by measuring the time
required to send a message to its neighbor and receive a response. We present the
algorithm from the point of view of processor p. Initially p sends an INIT
message to each neighbor q. Whenever p receives a message from a neighbor, it
immediately returns the message. Note that in particular, the INIT message will
keep being passed back and forth between p and q. We will call a subset S of processors of
size f + 1 consisting of neighbors of p good at real time t if the order in which
INIT messages are received from the processors $q \in S$ is exactly what it would be if all the
processors in S were correct. Processor p can easily check if S is good at time t,
since it knows what the message transmission time should be for each message.
Since there is a subset of size f + 1 which consists of only correct processors, there
must be at least one good subset according to p at all times t. Moreover, any good
subset contains at least one correct processor (since there are at most f faulty
processors).
Assume that p has ordered all the subsets of its neighbors of size f + 1 in some
way, and, for each such subset S, it has chosen some processor $q_S \in S$ to “trust.”
For each subset S, p constructs a logical clock $C_{p,S}$ by initializing $C_{p,S}$ to 0 and then
setting $C_{p,S}$ to $k\,\delta(p, q_S)$ when it receives the kth INIT message from $q_S$, where
$\delta(p, q)$ is the real time required for a message to make the round trip from p to q
and then back from q to p. We take $C_p(t)$ to be $C_{p,S}(t)$ where S is the first subset in
the order that is good at time t.
We claim that $|C_p(t) - t| \leq 2\Delta$, where $\Delta$ is the maximum round trip message time
from p to any of its neighbors. To prove the claim, we now show that if S is good at
time t, then $|C_{p,S}(t) - t| \leq 2\Delta$. Since S is good at time t, there must be at least one
correct processor in S, say $q'$. Of course, the processor $q_S$ in S that p has chosen to
“trust” may be faulty. Suppose that p has received k INIT messages from $q_S$ by
real time t, and thus has set $C_{p,S}$ to $k\,\delta(p, q_S)$. Further suppose that p has received
$k'$ INIT messages from $q'$ by time t. Since $q'$ is correct, this means that
    (1) $k'\,\delta(p, q') \leq t < (k' + 1)\,\delta(p, q')$.
And we must have
    (2) $(k + 1)\,\delta(p, q_S) \geq k'\,\delta(p, q')$,
since otherwise the (k + 1)st INIT message would have been received before the
$k'$th INIT message, contradicting the fact that S is good at time t. From (1)
we have that $k'\,\delta(p, q') > t - \delta(p, q')$, which together with (2) gives
    (3) $C_{p,S}(t) = k\,\delta(p, q_S) \geq t - \delta(p, q_S) - \delta(p, q')$.
We must also have
    (4) $k\,\delta(p, q_S) \leq (k' + 1)\,\delta(p, q')$,
since otherwise the kth INIT message would have been received after the
$(k' + 1)$st INIT message, again contradicting the fact that S is good at time t.
From this and (1) it follows that
    (5) $C_{p,S}(t) \leq k'\,\delta(p, q') + \delta(p, q') \leq t + \delta(p, q')$.
Since, by choice, $\Delta \geq \max\{\delta(p, q_S), \delta(p, q')\}$, from (3) and (5) it follows that
$|C_{p,S}(t) - t| \leq 2\Delta$, as desired.
The lower bound can be proved using techniques similar to those of Theorem 2,
or by using the techniques developed in [FLM]. We leave details to the reader. ∎
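The bound on the round-trip clock in the upper-bound argument can be checked by brute force. The Python sketch below is an illustration under simplifying assumptions, not the algorithm of Fischer, Lynch, and Merritt itself: time is measured in integer ticks, `delta_s` stands for the round-trip time to the trusted processor, `delta_c` for the round-trip time to a correct processor in the same good subset, and all function names are hypothetical. For each real time t it enumerates every count k of INIT messages from the trusted processor that is consistent with the ordering constraints in the proof sketch (k + 1 round trips to the trusted processor take at least as long as k' round trips to the correct one, and k round trips take at most as long as k' + 1), and records the largest clock error.

```python
def ceil_div(a, b):
    """Ceiling of a / b for non-negative a and positive b, using integers only."""
    return -(-a // b)


def admissible_counts(t, delta_s, delta_c):
    """Counts k of INIT messages from the trusted processor that are consistent
    with the subset being good at real time t."""
    k_prime = t // delta_c                                 # messages from the correct processor
    lo = max(0, ceil_div(k_prime * delta_c, delta_s) - 1)  # (k+1)*delta_s >= k'*delta_c
    hi = ((k_prime + 1) * delta_c) // delta_s              # k*delta_s <= (k'+1)*delta_c
    return range(lo, hi + 1)


def worst_error(delta_s, delta_c, horizon):
    """Largest |k * delta_s - t| over all real times 0..horizon and all admissible k."""
    return max(abs(k * delta_s - t)
               for t in range(horizon + 1)
               for k in admissible_counts(t, delta_s, delta_c))
```

For any choice of the two round-trip times, `worst_error` should stay within twice the larger of them, matching the claimed bound on the logical clock.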
It is interesting to note that the upper bound of Theorem 3 can be achieved even 
if the correct processors do not have physical clocks. (This follows immediately 
from the observation that nowhere in the algorithm did we need to assume the 
existence of physical clocks.) 
4. LOWER BOUNDS ON SYNCHRONIZATION 
Suppose we have an algorithm that guarantees that logical clocks of correct
processors are no more than DMAX apart at any real time. It is easy to see that,
for all $\varepsilon > 0$, we can modify this algorithm to obtain an algorithm that guarantees
that logical clocks of correct processors are no more than $\varepsilon\,\mathrm{DMAX}$ apart at all
times, simply by slowing down all clocks by a factor of $\varepsilon$. Of course, the slope of
the linear envelope limits the choice of $\varepsilon$.
These observations suggest that if we want to get lower bounds on how tightly
logical clocks can be synchronized, it may not be appropriate to consider the difference
between logical clock readings at a given real time. We therefore turn our
attention from the tightness of synchronization along the logical clock time axis to
the tightness of synchronization along the real time axis. We show that there is a
lower bound $\Delta$, which depends on the uncertainty of transmission delay, such that
no clock synchronization algorithm can guarantee that the difference between the
real times at which clocks read a given value is less than $\Delta$. In fact, we prove an
even stronger result: we show that there is no algorithm that can guarantee that any
action can be performed by two processors within less than $\Delta$ of each other, for an
appropriately defined notion of action. These results thus give lower bounds on the
degree of synchronization achievable in a network. We call $\Delta$ the essential temporal
imprecision, or just imprecision, of the network.
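The difference between the two axes can be seen in a toy example. The Python sketch below assumes two idealized logical clocks that read eps * t and eps * (t - skew) at real time t; the function names and this particular clock model are assumptions made for illustration. Slowing the clocks shrinks the gap between their readings at a fixed real time, but leaves unchanged the gap between the real times at which they show the same reading.

```python
from fractions import Fraction


def logical_gap(eps, skew, t):
    """Difference between the readings eps*t and eps*(t - skew) at real time t."""
    return eps * t - eps * (t - skew)


def real_time_gap(eps, skew, value):
    """Difference between the real times at which the two clocks read `value`."""
    t_first = value / eps             # solves eps * t = value
    t_second = value / eps + skew     # solves eps * (t - skew) = value
    return t_second - t_first


# Exact rational arithmetic avoids rounding noise in the comparison.
full_speed = Fraction(1)
slowed = Fraction(1, 10)
skew = Fraction(2)
```

Here `logical_gap(slowed, skew, t)` is one tenth of `logical_gap(full_speed, skew, t)`, while `real_time_gap` returns the same value for both speeds.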
To make these notions precise, consider a communication network
N = (G, H, L). For the purposes of this section, we will assume that G is an undirected
graph (so that there is a link (p, q) iff there is a link (q, p)) and that H(p, q) =
H(q, p) and L(p, q) = L(q, p). (The case of directed networks is considered in
[HMM]. It is shown in [HMM] that there is a precise sense in which for
calculating imprecision, there is no loss of generality in considering only undirected
networks where H and L are symmetric.) If p and q are joined by a link, define
V(p, q), the variation in uncertainty between p and q, via $V(p, q) =
H(p, q) - L(p, q)$. We extend H, L, and V so that they apply to all pairs of
processors by setting $H(p, q) = L(p, q) = V(p, q) = \infty$ for processors p, q that are
not connected by a direct link. We now extend V so that it also applies to sequences
of processors. For a sequence of processors $\pi = p_0, p_1, \ldots, p_n$, let $V(\pi)$ be the
sum of the values $V(p_i, p_{i+1})$, for i from 0 to n - 1. Finally, let $U_N(p, q)$, the uncertainty
in transmission time from p to q in N, be defined via
$U_N(p, q) = \min\{V(\pi) \mid \pi \text{ is a sequence of processors in } N \text{ starting with } p \text{ and ending with } q\}$,
and let $U_N = \max\{U_N(p, q) \mid p, q \text{ are processors in } N\}$.
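Because the uncertainty between two processors is a minimum of summed link variations over all connecting sequences, it is an all-pairs shortest-path quantity. The Python sketch below computes it with the Floyd-Warshall relaxation, which is a choice made here and not a procedure given in the paper. The function name, the dictionary layout mapping each undirected link to a pair (L, H), the convention that a processor has uncertainty 0 to itself (the one-element sequence), and the sample network are all assumptions for illustration.

```python
import math


def uncertainty(nodes, links):
    """Return (U, u_n): U[p][q] is the least total variation H - L along any
    sequence of links from p to q, and u_n is the largest U[p][q]."""
    U = {p: {q: (0 if p == q else math.inf) for q in nodes} for p in nodes}
    for (p, q), (low, high) in links.items():
        # Undirected link: the variation applies in both directions.
        U[p][q] = U[q][p] = min(U[p][q], high - low)
    for k in nodes:                       # Floyd-Warshall relaxation
        for p in nodes:
            for q in nodes:
                U[p][q] = min(U[p][q], U[p][k] + U[k][q])
    u_n = max(U[p][q] for p in nodes for q in nodes)
    return U, u_n


# Hypothetical three-processor network; each link maps to (L, H).
nodes = ["a", "b", "c"]
links = {("a", "b"): (2, 5), ("b", "c"): (3, 4), ("a", "c"): (1, 10)}
U, u_n = uncertainty(nodes, links)
```

In this sample network the direct link between a and c has variation 9, but the route through b has total variation 3 + 1 = 4, so the uncertainty between a and c is 4, which is also the largest uncertainty in the network.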
For ease of exposition in what follows, we assume that each processor has a 
special register that initially contains the value 0. At some point the value must be 
changed to 1. The problem is to obtain an algorithm that guarantees that all 
processors change the value to 1 at as close to the same real time as possible. 
Clearly the time at which a processor changes the value in its special register 
depends on the scenario. The essential temporal imprecision inherent in a particular
algorithm $\mathcal{A}$ is the worst case difference in the times that two processors change the
value in their special register, where the difference is taken over all possible
scenarios in which all processors run algorithm $\mathcal{A}$. Note that since we are assuming
that all processors run $\mathcal{A}$, we are implicitly ignoring the possibility of faults. Clearly
any lower bounds we obtain under the assumption that there are no faults also
hold if there are faults. Recall that a run of algorithm $\mathcal{A}$ in network N is a particular
scenario in which all processors in N run algorithm $\mathcal{A}$.
The essential temporal imprecision in a network is the minimum essential temporal
imprecision over all possible algorithms. More formally, given an algorithm
$\mathcal{A}$, processors p and q in N, and run $\sigma$, define
    $\Delta_{N,\mathcal{A}}(p, q, \sigma)$ = the absolute value of the difference of the real times at which
        processors p and q change the value of their special register in
        run $\sigma$ of algorithm $\mathcal{A}$,
    $\Delta_{N,\mathcal{A}}(p, q) = \max_{\sigma}\{\Delta_{N,\mathcal{A}}(p, q, \sigma)\}$,
    $\Delta_N(p, q) = \min_{\mathcal{A}}\{\Delta_{N,\mathcal{A}}(p, q)\}$,
    $\Delta_{N,\mathcal{A}} = \max_{p,q}\{\Delta_{N,\mathcal{A}}(p, q)\}$,
    $\Delta_N = \min_{\mathcal{A}}\{\Delta_{N,\mathcal{A}}\}$.
Thus $\Delta_{N,\mathcal{A}}(p, q)$ is the closest real time synchronization that can be guaranteed
between p and q when running algorithm $\mathcal{A}$, $\Delta_N(p, q)$ is the closest real time
synchronization that can be guaranteed between p and q, no matter what algorithm is
run, $\Delta_{N,\mathcal{A}}$ is the imprecision in the network (the worst case real time difference
between when two processors in N perform a given action) when running $\mathcal{A}$, and
$\Delta_N$ is the imprecision in the network, the tightest coordination in N that can be
guaranteed by any algorithm.
THEOREM 4. For all communication networks N and all processors p, q in N, we
have $\Delta_N(p, q) \geq U_N(p, q)/2$; i.e., the imprecision is at least as great as half the
uncertainty.
Proof. Fix a communication network N = (G, H, L), an algorithm $\mathcal{A}$, and
processors p and q in N. We consider two runs of $\mathcal{A}$ which, as we shall show, are
indistinguishable from the point of view of every processor. In the first run, all
processors are started at the same time, with their physical clocks running at the
rate of real time. For definiteness, we take $D_r(t) = t$ for all processors r. If there is a
link from processor r to processor r' in N, then messages from r to r' take time
$L(r, r') + \max(U_N(p, r) - U_N(p, r'), 0)$. We leave it to the reader to check that this
expression is at most H(r, r'), so a message from r to r' can indeed take this length
of time.
In the second run, processor p starts first, each processor r starts at real time
$U_N(p, r)$ later than p, and all processors' physical clocks run at the rate of real time.
Again, for definiteness, we take $D_r(t) = t - U_N(p, r)$ for all processors r. If r and r'
are joined by a link in N, then messages from processor r to r' take time
$L(r, r') + \max(U_N(p, r') - U_N(p, r), 0)$ to arrive. Again it is easy to check that this is at most
H(r, r'), and that no message reaches a processor before it has been started. We
now show that these two runs are indistinguishable from the point of view of every
processor; i.e., they produce the same message history.
Suppose r and r' are joined by a link in N, and r sends r' a message when r's
physical clock reads T. First consider the case where $U_N(p, r) > U_N(p, r')$. In the
first run, this message arrives at r' in time $L(r, r') + U_N(p, r) - U_N(p, r')$, when the
physical clock of r' reads $T + L(r, r') + U_N(p, r) - U_N(p, r')$. In the second run, this
message arrives at r' in time L(r, r'), but again the physical clock of r' reads
$T + L(r, r') + U_N(p, r) - U_N(p, r')$, since r' is started $U_N(p, r) - U_N(p, r')$ ahead of
r. The argument in the case where $U_N(p, r) \leq U_N(p, r')$ is similar and is omitted
here.
Because messages are being sent and received at the same time on each
processor's physical clock in both runs, processors perform the same action at a
given time on their physical clocks in both runs. Suppose processor p changes the
value of its special register at time $T_1$ on its physical clock, while processor q
changes the value at time $T_2$ on its physical clock. Let $t_1$ and $t_2$ be the real times
that the physical clocks of p and q read $T_1$ and $T_2$, respectively, in the first run.
Note that in the second run, processor p's physical clock still reads $T_1$ at $t_1$,
but processor q's physical clock reads $T_2$ at $t_2 + U_N(p, q)$ (since processor q was
started $U_N(p, q)$ later in the second run). By definition, $\Delta_{N,\mathcal{A}}(p, q) \geq
\max(|t_2 - t_1|, |t_2 + U_N(p, q) - t_1|)$. But it can be straightforwardly checked that this
latter quantity is $\geq U_N(p, q)/2$. Since $\mathcal{A}$ was chosen arbitrarily, we get
$\Delta_N(p, q) \geq U_N(p, q)/2$, as desired. ∎
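The two runs in this proof can be checked mechanically on a small network. The Python sketch below is an illustration only: the function names, the sample network, and the precomputed uncertainties from processor a (stored in `u_p`) are assumptions made here. For every directed link it computes the message delay used in each run, checks that the delay lies within the allowed bounds, and computes the reading of the receiver's physical clock on arrival in both runs.

```python
def run_delays(u_p, links):
    """Delays (first run, second run) on each directed link (r, s); links maps
    (r, s) to (L, H) and u_p[r] is the uncertainty from the fixed processor p to r."""
    delays = {}
    for (r, s), (low, _high) in links.items():
        first = low + max(u_p[r] - u_p[s], 0)
        second = low + max(u_p[s] - u_p[r], 0)
        delays[(r, s)] = (first, second)
    return delays


def arrival_readings(u_p, links, send_clock):
    """Receiver's physical clock reading on arrival, in both runs, for a message
    sent when the sender's physical clock reads send_clock."""
    readings = {}
    for (r, s), (first, second) in run_delays(u_p, links).items():
        run1 = send_clock + first                 # run 1: every clock reads t at real time t
        real_send = send_clock + u_p[r]           # run 2: clock of r reads t - u_p[r]
        run2 = (real_send + second) - u_p[s]      # convert arrival real time to s's clock
        readings[(r, s)] = (run1, run2)
    return readings


def worst_gap(t1, t2, u):
    """Larger of the two real-time gaps between p and q across the two runs."""
    return max(abs(t2 - t1), abs(t2 + u - t1))


# Hypothetical network: each undirected link (L, H) is usable in both directions.
undirected = {("a", "b"): (2, 5), ("b", "c"): (3, 4), ("a", "c"): (1, 10)}
links = {}
for (r, s), bounds in undirected.items():
    links[(r, s)] = bounds
    links[(s, r)] = bounds
u_p = {"a": 0, "b": 3, "c": 4}                    # uncertainties from p = "a"
delays = run_delays(u_p, links)
readings = arrival_readings(u_p, links, send_clock=100)
```

On this network every delay falls between L and H, the arrival readings agree in the two runs on every link, and `worst_gap` is never smaller than half the uncertainty between the two processors, whatever the real times `t1` and `t2` are.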
Remarks. 1. The lower bound holds even if there are no faults in the network,
and physical clocks run at exactly the rate of real time (i.e., there is no clock drift).
2. Even if processors' physical clocks are initially synchronized, we can also
prove a version of this result in the presence of clock drift. We again consider two
runs of $\mathcal{A}$ which we can show are indistinguishable from the point of view of any
processor. The first run is defined just as in the proof of Theorem 4; in the second
run, processors start with their physical clocks synchronized, but the physical
clocks of processors drift until, for all r, p's physical clock is $U_N(p, r)$ ahead of that
of r. From this point we can essentially repeat the proof above.
3. We can also extend this result to probabilistic algorithms (where
processors can base their decisions on the results of coin tosses) in the following
way. Consider the two runs described in the proof above, and suppose each of them
holds with probability 1/2. Then an argument essentially identical to the one above
shows that even for a probabilistic algorithm, the mean difference between the times
that processors p and q change the value of the special register in these two runs is
at least $U_N/2$. (See [HMM] for further details and a generalization of this result.)
COROLLARY 1. For all communication networks N, we have $\Delta_N \geq U_N/2$.
Proof. Note that $\Delta_N \geq \max_{p,q} \Delta_N(p, q)$, since we cannot do worse by allowing
different algorithms to synchronize different pairs of processors rather than using
the same algorithm to synchronize all pairs. The result now follows immediately
from Theorem 4 and the definitions. ∎
THEOREM 5. For all $\varepsilon > 0$, there exists a communication network N such that
$\Delta_N > U_N - \varepsilon$.
Proof. Fix $\varepsilon > 0$. Choose $\eta > 0$ and n such that $\eta/n < \varepsilon$. Let N = (G, H, L), where G
is a completely connected graph with n nodes, $H(p, q) = \eta$, and $L(p, q) = 0$, for all
nodes p, q in G. Note we then have $U_N = \eta$. In [LL2], it is shown that
$\Delta_N = ((n - 1)/n)\eta$. Since $((n - 1)/n)\eta > U_N - \varepsilon$, the desired result immediately
follows. ∎
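The final inequality of this proof is a one-line computation, written out below as a worked equation. The numerical values in the comment are chosen here purely for illustration and do not appear in the paper.

```latex
\Delta_N \;=\; \frac{n-1}{n}\,\eta \;=\; \eta - \frac{\eta}{n} \;>\; \eta - \varepsilon \;=\; U_N - \varepsilon .
% Illustration: \eta = 1, \varepsilon = 0.01, n = 101 gives \eta/n \approx 0.0099 < \varepsilon,
% so \Delta_N = 100/101 \approx 0.9901 > 0.99 = U_N - \varepsilon.
```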
Theorem 5 is essentially the best we can do, as the following theorem shows. 
THEOREM 6. There exists a clock synchronization algorithm $\mathcal{A}$ such that for all
communication networks N and processors p and q in N, $\Delta_{N,\mathcal{A}}(p, q) \leq U_N(p, q)$; i.e.,
the imprecision is no greater than the uncertainty.
Proof. The clock synchronization algorithm of [HSSD] can be used to guarantee
that for all T, there is a $T' > T$ such that the real times at which the logical
clocks of processors p and q read $T'$ differ by at most $U_N(p, q)$ in every run. (It
follows from the algorithm of [HSSD] that the time $T'$ is the local time when a
resynchronization occurs, and is of the form k PER.) Each processor can thus use
its logical clock to decide when to change the value in its special register. ∎
COROLLARY 2. For all communication networks N, $\Delta_N \leq U_N$.
We remark that the results of Theorems 4, 5, and 6, and their corollaries have 
recently been extended in [HMM], where an algorithm is given for computing the 
precise imprecision of any network. Since the algorithm of [HSSD] works even in 
the presence of faults (provided authentication is allowed) and clock drift, the 
upper bound of Theorem 6 also holds in this case. The interested reader may also 
wish to consult [HMM] for further results on achieving optimal precision in the 
presence of faults. 
We present one final result considering tightness of synchronization for 
algorithms that achieve LES. 
THEOREM 7. If algorithm $\mathcal{A}$ achieves linear envelope synchronization in communication
network N with parameters DMAX, a, b, c, and d, then
$\mathrm{DMAX} \geq aRU_N/2$.
Proof. Suppose $\mathcal{A}$ is an algorithm that achieves LES with parameters DMAX,
a, b, c, and d. Note that this implies that for any constant U and any processor p in
the network, there must be a time $t_0$ such that $C_p(t_0 + U) - C_p(t_0) \geq
a(D_p(t_0 + U) - D_p(t_0))$ (otherwise it is easy to see that we must have
$C_p(t) < aD_p(t) + e$ for all times t and for some constant e, contradicting the fact that
$\mathcal{A}$ achieves LES).
Consider two processors p, q in N such that $U_N(p, q) = U_N$. Now consider two
runs of $\mathcal{A}$ identical to those described in the proof of Theorem 4, except that there
is clock drift with a factor of R; i.e., in the first run $D_r^1(t) = Rt$ for all processors r,
while in the second run $D_r^2(t) = R(t - U_N(p, r))$, or equivalently,
$D_r^2(t + U_N(p, r)) = Rt$ (note we are using superscripts to distinguish the physical
clock functions in the two runs). Take message transmission times to be exactly the
same as in the proof of Theorem 4 as measured in real time (i.e., there is no factor
of R). The same proof as that in Theorem 4 now shows that these two runs are
indistinguishable from the point of view of every processor. Let $C_r^i(t)$ denote
processor r's logical clock in run i, i = 1, 2. Because of the indistinguishability of
these runs, processors must perform the same actions at the same time on their
physical clocks. Thus we have that $C_p^1(t) = C_p^2(t)$ for all t, and $C_q^1(t) = C_q^2(t + U_N)$.
By the observations in the first paragraph, we know there is some time $t_0$ such
that $C_p^1(t_0 + U_N) - C_p^1(t_0) \geq a(D_p^1(t_0 + U_N) - D_p^1(t_0)) = aRU_N$ (since $D_p^1(t) = Rt$).
Thus we get
    $aRU_N \leq C_p^1(t_0 + U_N) - C_p^1(t_0)$
    $= (C_p^1(t_0 + U_N) - C_q^1(t_0)) + (C_q^1(t_0) - C_p^1(t_0))$
    $= (C_p^2(t_0 + U_N) - C_q^2(t_0 + U_N)) + (C_q^1(t_0) - C_p^1(t_0))$
    $\leq 2\,\mathrm{DMAX}$.
The desired result immediately follows. ∎
5. CONCLUSIONS 
We have presented two flavors of possibility and impossibility results regarding 
clock synchronization. The first considers the question of when synchronization is 
achievable in the presence of faults when authentication techniques are not used. 
We have shown that while certain weak notions of clock synchronization are 
achievable, regardless of the number of faults (without any communication at all!), 
linear envelope synchronization does indeed require that less than one-third of the 
processors be faulty, just as Lamport conjectured [La]. 
The second type of result considers how tightly we can synchronize, as a function 
of the uncertainty in message transmission time. We have shown that even without 
faults and without clock drift, the essential temporal imprecision is always at most 
equal to the uncertainty, but is always at least half the uncertainty. 
The first type of result suggests that in some ways clock synchronization has very 
much the same flavor as Byzantine agreement: with authentication it can be 
achieved in the presence of arbitrarily many faults, provided these faults do not dis- 
connect the network (cf. [HSSD]), and without authentication, to achieve LES 
and to tolerate f faulty processors, there must be at least 3f+ 1 processors in the 
network, and the connectivity of the network must be at least 2f + 1. 
It is well known that randomness can help in achieving Byzantine agreement, or
in reducing the number of rounds required to attain it (see, for example, [Be]). It
would be interesting to know if randomness can also help in achieving clock
synchronization. For example, can we achieve LES (in some reasonable probabilistic
sense) using a probabilistic algorithm if one-third or more of the processors are
faulty? To get a sense of the problem here, consider the following simple algorithm
in the case of three processors. Initially, each processor picks another processor
(other than itself). Then, at specified intervals, it asks that other processor for the
time on its clock, and sets its own clock to that time. We leave it to the reader to
check that this algorithm guarantees LES if there are no faults, and guarantees LES
with probability 1/4 if there is one fault (since with probability 1/4 the two correct
processors will pick each other at the initial step). We conjecture that there is no
probabilistic algorithm that guarantees LES if there are no faults, and guarantees
LES with probability greater than 1/2 if there is one fault.
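The probability 1/4 quoted for the three-processor algorithm can be confirmed by enumerating the initial choices. The Python sketch below assumes that each processor picks one of the other two uniformly at random and independently, which the text implies but does not state; the function name and the numbering of processors are likewise assumptions for illustration.

```python
from fractions import Fraction
from itertools import product


def probability_correct_pair_lock(faulty=2):
    """Probability that the two correct processors among 0, 1, 2 pick each
    other, when every processor picks one of the other two uniformly."""
    processors = [0, 1, 2]
    a, b = [p for p in processors if p != faulty]
    # choices[p] lists the processors that p may pick (anyone but itself).
    choices = [[q for q in processors if q != p] for p in processors]
    outcomes = list(product(*choices))    # 2 * 2 * 2 equally likely outcomes
    good = sum(1 for pick in outcomes if pick[a] == b and pick[b] == a)
    return Fraction(good, len(outcomes))
```

The result does not depend on which of the three processors is the faulty one, since the choice made by the faulty processor plays no role in the count.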
ACKNOWLEDGMENTS 
We gratefully acknowledge the two referees of this paper, Jennifer Lundelius and Mike Merritt, for
catching a number of errors in a previous version and for their numerous suggestions on improving the
readability of the paper. Mike, in particular, convinced us of the need for the assumption of uncertainty
of transmission times in Theorem 2, and of the incorrectness of a theorem claimed in a previous version
of this paper, namely that LES is achievable if there is a bound on the rate at which a processor can
generate messages. We also thank Benny Chor, Ron Fagin, Nancy Lynch, Yoram Moses, and Barbara
Simons for a number of helpful comments and criticisms on earlier drafts of this paper.
REFERENCES 
[Be] M. BEN-OR, Another advantage of the free choice: Completely asynchronous agreement
protocols, in “Proceedings of the 2nd ACM Conference on Distributed Computing,” 1983,
pp. 27-30.
[Do] D. DOLEV, The Byzantine generals strike again, J. Algorithms 3 (1982), 14-30.
[DS] D. DOLEV AND H. R. STRONG, Authenticated algorithm for Byzantine agreement, SIAM J.
Comput. 12 (1983), 656-666.
[FLM] M. FISCHER, N. A. LYNCH, AND M. MERRITT, Easy impossibility proofs for distributed
consensus problems, in “Proceedings of the 4th ACM Conference on Distributed Computing,
1985,” pp. 59-70.
[HMM] J. Y. HALPERN, N. MEGIDDO, AND A. MUNSHI, Optimal precision in the presence of uncer- 




[HSSD] J. Y. HALPERN, B. B. SIMONS, H. R. STRONG, AND D. DOLEV, Fault-tolerant clock syn- 
chronization, in “Proceedings of the 3rd ACM Conference on Principles of Distributed Com- 
puting, 1984,” pp. 89-102. 
[La] L. LAMPORT, Unsolved problems, solved problems, and non-problems in concurrency, invited
address, in “2nd ACM Conference on Principles of Distributed Computing, 1983”; an edited
transcript appears in “Proceedings of the 3rd ACM Conference on Principles of Distributed
Computing, 1984,” pp. 1-11.
[LM] L. LAMPORT AND P. M. MELLIAR-SMITH, Byzantine clock synchronization, in “Proceedings of
the 3rd ACM Conference on Principles of Distributed Computing, 1984,” pp. 68-74; a revised
and expanded version appears as: Synchronizing clocks in the presence of faults, J. Assoc.
Comput. Mach. 32 (1985), 52-78.
[LSP] L. LAMPORT, R. SHOSTAK, AND M. PEASE, The Byzantine generals problem, ACM Trans. Prog.
Lang. and Systems 4 (1982), 382-401.
[LL1] J. LUNDELIUS AND N. LYNCH, A new fault-tolerant algorithm for clock synchronization, in
“Proceedings of the 3rd ACM Conference on Principles of Distributed Computing, 1984,”
pp. 75-88.
[LL2] J. LUNDELIUS AND N. LYNCH, An upper and lower bound for clock synchronization, Inform.
and Control 62 (1984), 190-204.
[Ma] K. MARZULLO, “Loosely-Coupled Distributed Services: A Distributed Time System,” Ph.D.
dissertation, Stanford University, 1983.
[Me] M. MERRITT, private communication, 1985.
[PSL] M. PEASE, R. SHOSTAK, AND L. LAMPORT, Reaching agreement in the presence of faults, J.
Assoc. Comput. Mach. 27 (1980), 228-234.
