Linearizable read/write objects  by Mavronicolas, Marios & Roth, Dan
Theoretical Computer Science 220 (1999) 267-319 
www.elsevier.com/locate/tcs
Linearizable read/write objects^1
Marios Mavronicolas^{a,2}, Dan Roth^{b,*,3}
a Department of Computer Science, University of Cyprus, Nicosia CY-1678, Cyprus 
b Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA 
Abstract 
We study the cost of using message passing to implement linearizable read/write objects for
shared-memory multiprocessors under various assumptions on the available timing information. 
We take as cost measures the worst-case response times for performing read and write operations 
in distributed implementations of virtual shared memory consisting of such objects, and the sum 
of these response times. It is assumed that processes have clocks that run at the same rate as 
real time and are within $\delta$ of each other, for some known precision constant $\delta \ge 0$. All messages
incur a delay in the range $[d-u, d]$ for some known constants $u$ and $d$, $0 \le u \le d$.
For the perfect clocks model, where clocks are perfectly synchronized, i.e., $\delta = 0$, and every
message incurs a delay of exactly $d$, we present a linearizable implementation which achieves
worst-case response times for read and write operations of $\beta d$ and $(1-\beta)d$, respectively; $\beta$ is
a trade-off parameter, $0 \le \beta \le 1$, which may be tuned to account for the relative frequencies
of read and write operations. This implementation is optimal with respect to the sum of the 
worst-case response times for read and write operations. 
We next turn to the approximately synchronized clocks model, where clocks are only
approximately synchronized, i.e., $\delta > 0$, and message delays can vary, i.e., $u > 0$. Our first major
result is the first known linearizable implementation for this model which achieves worst-case
response times of less than $\beta d + 3u + \min\{\delta, u\} + \varepsilon$ and $(1-\beta)d + 3u$ for read and write operations,
respectively, under a mild restriction on the trade-off parameter $\beta$, $0 \le \beta < 1 - u/d$; $\varepsilon$ is any
arbitrary constant such that $0 < \varepsilon < \min\{2u, d-u\}$. This implementation employs a novel use of
approximately synchronized clocks in order to utilize the lower bound on message delay time 
and achieve bounds on worst-case response times that depend on the message delay uncertainty 
u. For a wide range of values of u, these bounds improve upon previously known ones for 
implementations that support consistency conditions even weaker than linearizability.
* Corresponding author. E-mail: danr@cs.uiuc.edu. 
1 This paper combines, unifies and extends results that appear in preliminary form in [46,47].
2 Currently at AT&T Labs - Research, NJ, as a visitor to the Special Year on Networks, organized by
the DIMACS Center for Discrete Mathematics and Theoretical Computer Science, Rutgers University, NJ. 
Part of the work of this author was performed while at Aiken Computation Laboratory, Harvard University, 
supported by ONR contract N00014-91-J-1981, at Department of Computer Science, University of Crete,
and at Institute of Computer Science, Foundation for Research and Technology - Hellas. Partially supported 
by funds for the promotion of research at University of Cyprus. 
3 Part of the work of this author was performed while at Aiken Computation Laboratory, Harvard 
University, supported by NSF grant CCR-89-02500, and at Department of Applied Mathematics and 
Computer Science, The Weizmann Institute of Science. 
0304-3975/99/$ - see front matter © 1999 Elsevier Science B.V. All rights reserved
PII: SO304-3975(98)00244-S 
Our next major result is a lower bound of $d + \min\{\delta, u\}/2$ on the sum of the worst-case
response times for read and write operations, for the approximately synchronized clocks model. 
This bound applies to linearizable implementations possessing some natural symmetry properties; 
the bound is shown using the technique of “shifting” executions. Corresponding lower bounds, 
but with no symmetry assumptions, are shown on the individual worst-case response times for 
read and write operations. 
Our bounds for the approximately synchronized clocks model extend naturally to the imperfect
clocks model, where clocks may be arbitrarily far from each other, i.e., $\delta = \infty$. © 1999 Elsevier
Science B.V. All rights reserved.
1. Introduction 
The shared-memory model has been proven a useful model of logically shared data 
in concurrent computation. Perhaps this is so because it allows processes to access 
local and remote information in a transparent and uniform way, which results in sim- 
plifying the programming of distributed applications. Thus, the shared-memory model 
is an attractive paradigm of an interprocessor communication model, as it provides the 
programmers the illusion of a global shared memory across distributed processes. 
Shared-memory implementations must allow user programs to run “concurrently”, 
i.e., to access shared data by interleaving steps or truly in parallel. Many such imple- 
mentations have employed the technique of caching, i.e., maintaining multiple copies 
of the same logical piece of shared data; the performance of such implementations can 
be measured in terms of, e.g., the worst-case time to access a piece of data, availability 
of data to processes, or tolerance to process faults. Even in the simplest cases, how- 
ever, problems arise since concurrent data accesses cannot be executed instantaneously, 
while their interleaving causes additional “correctness” problems. 
Thus, a need arises for a consistency mechanism to support the illusion of atomic 
operations on single copies of memory objects. Such a mechanism may allow opera- 
tions to be executed concurrently on multiple copies of objects but must still guarantee 
that the operations will appear as if executed atomically in some sequential order con- 
sistent with the order in which individual processes “observe” them to occur. If, in 
addition, this order is required to respect the order of non-overlapping operations at 
processes, the consistency mechanism is said to guarantee linearizability [32];4 other- 
wise, it is said to guarantee sequential consistency [38]. Clearly, linearizability implies 
sequential consistency. It has been argued quite convincingly [32] that linearizability 
is the correctness condition that best guarantees “acceptable” concurrent behavior; in- 
deed, linearizability enjoys a number of nice properties such as compositionality; 5 this 
has made it quite attractive for different applications, such as concurrent programming, 
4 Also called atomicity in [31, 39, 48] for the case of read/write objects.
5 Roughly speaking, a consistency condition is said to be compositional if the system as a whole satisfies
the condition whenever each individual object does. 
multiprocessor operating systems, distributed file systems, etc., where concurrency is 
of primary interest. 
Attiya and Welch [15] initiated a comparative study of the impact of the strength
of correctness guarantees provided by sequential consistency and linearizability on the 
cost of supporting them. In more detail, they considered caching implementations of 
read/write objects in non-bused distributed systems; they took as cost measures the 
worst-case response times for performing read and write operations on such objects, 
and the sum of these times, in the best possible implementation supporting each of the 
consistency conditions. In this paper, we continue this study and present new lower 
and upper bounds on these costs for sequentially consistent and linearizable implemen- 
tations. We attach some particular emphasis on the costs of supporting linearizability, 
since our motivation is to further illuminate the advantages of linearizability over other, 
seemingly “cheaper”, correctness conditions, such as sequential consistency. In particu- 
lar, we are interested in understanding the dependence of the relation between lineariz- 
ability and sequential consistency on timing assumptions made by different models of 
distributed computation. 
We follow Attiya and Welch [15] and consider a model consisting of a collection
of application programs running concurrently and communicating through virtual
shared memory, which consists of a collection of read/write objects. These programs 
are running in a distributed system consisting of a collection of processes located at 
the nodes of a complete communication network. 6 The shared memory abstraction is 
implemented by a memory consistency system (MCS), which uses local memory at 
each process node. Each MCS process executes a protocol, which defines the actions 
it takes on operation requests by the application programs. Specifically, each appli- 
cation program may submit requests to access shared data to a corresponding MCS 
process; the MCS process responds to such a request, based, possibly, on information 
from messages it receives from other MCS processes. In doing so, the MCS must, 
throughout the network, provide the proper read/write semantics with respect to the 
values returned to application programs. Fig. 1 (directly adapted from [15, Section 2])
illustrates a node on which an application program and the corresponding MCS process
are running. The model we consider captures characteristics of existing shared-memory
multiprocessor architectures, such as the Reflective Memory System architecture
in the Encore 92 Series [23], which provides efficient coupling of multiple processor 
nodes for time-critical applications. 
We make the following timing assumptions about the system. At each node, there 
is a real-time clock, readable by the MCS process at the node, which runs at the same 
rate as real time. It is assumed that the maximum difference between local times of 
any two processes in the system at the same real time is at most $\delta$, for some precision
constant $\delta \ge 0$; moreover, all message delays are in the range $[d-u, d]$, for some known
constants $u$ and $d$, $0 \le u \le d$. It turns out that the timing information available in the
6 The assumption of a complete communication network is made only for simplicity and can be removed. 
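[Fig. 1. System architecture. At node $i$, the application program $P_i$ issues Call events to, and receives Response events from, the MCS process $p_i$; $p_i$ sends messages to the network and has messages delivered from it.]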
system has a critical impact on the efficiency of implementing sequential consistency 
and linearizability. 
We start with the perfect clocks model, where processes have perfectly synchronized
clocks, i.e., $\delta = 0$, and message delays are constant, i.e., $u = 0$. We present a
linearizable implementation, parameterized by some constant $\beta$, $0 \le \beta \le 1$; the worst-case
response times for read and write operations are $\beta d$ and $(1-\beta)d$, respectively, both
dependent on the network's latency $d$; the parameter $\beta$ precisely determines these
dependencies and may be appropriately chosen in order to degrade the less frequently
occurring operation. Roughly speaking, a read operation returns after time $\beta d$, while
a write operation returns after time $(1-\beta)d$. This implementation naturally generalizes
those in [15, Theorems 3.2 and 3.3], which are but the special cases with $\beta = 0$ and
$\beta = 1$, respectively. Lipton and Sandberg [41] show a lower bound of $d$ on the sum of
the worst-case response times for read and write operations in any sequentially consistent
implementation, and for any model assuming an upper bound of $d$ on end-to-end
message delay; thus, our implementation is optimal with respect to this measure.
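As a small numerical illustration of this trade-off, the following Python sketch tabulates the stated worst-case response times $\beta d$ and $(1-\beta)d$ for a few values of $\beta$. The function name and the sample value of $d$ are illustrative assumptions; the sketch only evaluates the formulas quoted above and does not implement the protocol itself.

```python
def perfect_clock_response_times(d: float, beta: float) -> tuple[float, float]:
    """Return the stated (read, write) worst-case response times beta*d and (1-beta)*d."""
    if d <= 0:
        raise ValueError("the message delay d must be positive")
    if not 0.0 <= beta <= 1.0:
        raise ValueError("the trade-off parameter beta must lie in [0, 1]")
    return beta * d, (1.0 - beta) * d


if __name__ == "__main__":
    d = 10.0  # hypothetical message delay, in arbitrary time units
    for beta in (0.0, 0.25, 0.5, 1.0):
        read, write = perfect_clock_response_times(d, beta)
        # The two times always add up to d, the lower bound on the sum cited above.
        print(f"beta={beta}: read={read}, write={write}, sum={read + write}")
```

Choosing a small $\beta$ favours reads and a large $\beta$ favours writes; the sum stays at $d$ in every case.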
We continue to present the first known linearizable implementation of read/write
objects for the more realistic approximately synchronized clocks model, where clocks are
only approximately synchronized, i.e., $\delta > 0$, and message delays can vary, i.e., $u > 0$.
As for the case of the perfect clocks model, the worst-case response times achieved by
our implementation are parameterized by the tunable constant $\beta$; this constant satisfies
the mild restriction $0 \le \beta < 1 - u/d$. More specifically, the worst-case response times for
read and write operations are less than $\beta d + 3u + \min\{\delta, u\} + \varepsilon$ and $(1-\beta)d + 3u$, respectively;
the constant $\varepsilon > 0$ is arbitrarily small and no more than $\min\{2u, d-u\}$. Roughly
speaking, a read operation first waits for time $\beta d$; following this, it returns as soon as
a value has resided for time at least $u$ in the local memory of the corresponding MCS
process. For a write operation, a “time-slicing” technique is used. Once it reaches an
appropriate “time slice”, the MCS process broadcasts the value to be written; following
this, it waits for an additional time $(1-\beta)d$ before returning. Naturally, the specific
details of the “time-slicing” technique directly or indirectly determine the worst-case
response times for both write and read operations. However, a major ingredient of our
implementation is that the value returned in a read operation need not be the one to
which the local memory of the reading process was most recently updated; instead, 
the value to be returned is chosen among values of write operations on the same ob- 
ject performed by processes within a recent, small time interval. The specific choice 
is based on information shown to be shared by all MCS processes. This turns out to 
result not only in preserving the relative order of values returned by different reading 
processes, but also in maintaining consistent copies of local memory throughout the 
network; the latter result is shown to imply linearizability. 
Our linearizable implementation for the approximately synchronized clocks model 
relies heavily on the provided finite clock precision in order to exploit the known 
lower bound of d-u on message delay time and achieve better bounds on worst-case 
response times which, unlike previous ones, depend on the message delay uncertainty 
$u$. Although we assumed that this precision is a parameter of our model, in practice,
it can be externally controlled by software protocols (see the many works on clock 
synchronization, e.g., [30,36,44], or [51] for a survey). It is known that the externally 
achievable precision depends critically on the timing uncertainty inherent to the sys- 
tem. For the specific system model we consider, Lundelius and Lynch [44] have shown 
that $(1 - 1/n)u$ is the optimal achievable precision and provided a clock synchronization
protocol achieving it. We present a significantly simpler protocol which achieves
a precision of u that is only slightly inferior. This protocol only uses messages of 
constant size, in contrast to those in [44] that carry explicit timing information, and 
is of independent interest. Plugging in this precision of $u$, our bounds on the worst-case
response times for read and write operations become $\beta d + 4u + \varepsilon$ and $(1-\beta)d + 3u$,
respectively. In case the message delay uncertainty $u$ is sufficiently small, these last
bounds significantly improve those in [15] that correspond to an even weaker correctness
condition, namely sequential consistency. (For a more detailed description of the
results in [15], see Section 7.) 
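To make the dependence on $u$ and $\delta$ concrete, the following Python sketch evaluates the stated upper bounds for given parameters. The function name and the sample values are illustrative assumptions, and the constraint on $\varepsilon$ is taken as a strict inequality, which satisfies both readings of the restriction; the sketch evaluates the formulas only and says nothing about how the implementation attains them.

```python
def approx_clock_upper_bounds(d, u, delta, beta, eps):
    """Return the stated (read, write) bounds for the approximately synchronized model.

    The read value is a strict upper bound: the text states that the read
    response time is *less than* beta*d + 3u + min(delta, u) + eps.
    """
    if not 0 < u <= d:
        raise ValueError("expected 0 < u <= d")
    if delta < 0:
        raise ValueError("the clock precision delta must be non-negative")
    if not 0 <= beta < 1 - u / d:
        raise ValueError("expected 0 <= beta < 1 - u/d")
    if not 0 < eps < min(2 * u, d - u):
        raise ValueError("expected 0 < eps < min(2u, d - u)")
    read = beta * d + 3 * u + min(delta, u) + eps
    write = (1 - beta) * d + 3 * u
    return read, write


if __name__ == "__main__":
    # Hypothetical parameters; delta = u corresponds to plugging in a precision of u.
    read, write = approx_clock_upper_bounds(d=10.0, u=1.0, delta=1.0, beta=0.5, eps=0.5)
    print(read, write)
```

With $\delta = u$ the read bound reduces to $\beta d + 4u + \varepsilon$, as stated above.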
Moreover, we support optimality of our implementation for the approximately syn- 
chronized clocks model by presenting corresponding lower bounds under general and 
mild assumptions on the pattern of sharing properties of processes. Our main negative 
result is a lower bound of $d + \min\{\delta, u\}/2$ on the sum of the worst-case response times
for any sequentially consistent implementation in which processes handle operations on
each object identically and independently of operations on other objects. This implies
a corresponding lower bound for linearizable implementations. We also show lower
bounds of $\min\{\delta, u\}/2$ on the individual worst-case response times for read and write
operations, in any linearizable implementation. For the case where $u < \delta$, the lower
bound for the read operation improves on a result of Attiya and Welch [15] showing a
lower bound of u/4. Our bounds are shown using the technique of shifting executions, 
introduced in [44] for showing a lower bound on the precision achievable by clock 
synchronization algorithms. 
The dependence on d of the upper bounds achieved by our implementation for 
the approximately synchronized clocks model is minimal: the sum of the worst-case 
response times for read and write operations contains only a single additive term of 
d, which, by our lower bound, is inherent. Furthermore, although the analysis of our 
implementation is technically challenging, the implementation itself is fairly simple, it 
does not use complicated control mechanisms, and it is message-economical. It can 
be also considered as a natural generalization of the one for the perfect clocks model 
with $\beta = 0$, since, as $u$ tends to 0, it almost “coincides” with it and achieves almost
identical worst-case response times. 
Our result for the approximately synchronized clocks model, in particular, the upper 
bound of $d + O(u)$ on the sum of the worst-case response times for read and write operations
in a linearizable implementation, along with the lower bound of $d + O(\min\{\delta, u\})$
on this sum, may suggest that sequential consistency and linearizability are actually 
“closer” than thought before in the specific system models we consider. All of these, 
even the imperfect clocks model, assume that all processor clocks move at exactly 
the same speed and that there is a known bound on message delays. Given that the 
primary difference between sequential consistency and linearizability is with respect to 
timing, it is perhaps not too surprising that the two concepts would tend to converge 
in models with strong synchrony. These bounds imply that it is more cost-effective to 
support linearizability in systems with low message delay uncertainty. 
The rest of the paper is organized as follows. Section 2 presents our formal def- 
initions, and surveys some preliminary facts and related background. Bounds for the 
perfect clocks model are included in Section 3. Sections 4 and 5 contain our upper and 
lower bounds, respectively, for the approximately synchronized clocks model. Bounds 
for the related imperfect clocks model are stated in Section 6. We conclude, in Sec- 
tion 7, with a discussion of our results, a survey of related work, and some open 
problems. 
2. Definitions, preliminaries and background 
In this section, we present the formal system model and its various timing aspects; 
we also introduce the memory objects, the consistency conditions, and the costs of their 
message-passing implementations. Towards the end, we review the shifting technique. 
Our definitions are patterned after those in [15], which they somewhat refine and extend.
For any real vector $s$, denote by $\|s\|_M$ and $\|s\|_m$ the maximum and minimum entries of $s$,
respectively.
2.1. System model 
We consider a collection of application programs running concurrently and communicating
through virtual shared memory; the latter consists of a collection $\mathcal{X}$ of
read/write objects, or objects for short. Each object $X \in \mathcal{X}$ attains values from a domain,
a set $V$ of values that includes a special “undefined” value $\perp$; a total order $<_V$
is defined on $V$. We assume a system consisting of a collection $N$ of nodes, connected
via a communication network; take $|N| = n$.
The shared memory abstraction is implemented by a memory consistency system 
(MCS) consisting of a collection of MCS processes, one at each node; these processes 
use local memory, execute some local protocol, and communicate through exchanging 
messages, drawn from some message alphabet M, along the network. Each MCS process 
$p_i$, located at node $i$, is associated with an application program $P_i$; $p_i$ and $P_i$ interact
by using call and response events. Formally, the following external events may occur
at the MCS process $p_i$.
Call events: They represent initiation of operations by the application program $P_i$;
they are $Read_i(X)$ and $Write_i(X, v)$, for all objects $X \in \mathcal{X}$ and values $v \in V$.
Response events: They represent responses by $p_i$ to operations initiated by the application
program $P_i$; they are $Return_i(X, v)$ and $Ack_i(X)$, for all objects $X \in \mathcal{X}$ and
values $v \in V$.
Message-send events: They represent sending of a message by $p_i$ to any other MCS
process; they are $Send_i(m, j)$ for all messages $m \in M$ and MCS processes $p_j$, $j \ne i$.
Message-deliver events: They represent delivery of a message from any other MCS
process to $p_i$; they are $Del_i(m, j)$, for all messages $m \in M$ and MCS processes $p_j$, $j \ne i$.
For each index $i$, $1 \le i \le n$, there is a physical, real-time clock at node $i$, readable
by MCS process $p_i$ but not under its control, that runs at the same rate as real time.
Formally, the local clock of process $p_i$, denoted $\gamma_i$, is a monotonically increasing
function from $\mathbb{R}$ (real time) to $\mathbb{R}$ (clock time) of the form $\gamma_i(t) = t + g_i$; $g_i$ is a real
number called the local clock parameter of $p_i$. 7 (The local clock parameters are fixed
for each “run” of the system, but they are unknown to the processes.) The local clocks
at various nodes may be initially “out-of-phase”; this happens whenever $g_i \ne g_j$ for
any process indices $i$ and $j$. Moreover, the local clocks cannot be modified by the
processes.
Processes do not have access to real time; instead, each process obtains its only 
information about (real) time from its local clock. The local clock reliably measures 
how much real time has elapsed, although its actual value is not equal to real time. 
Moreover, process $p_i$ may use its local clock for “timing” itself. Formally, this is done
through the following internal events:
• Timer-set events: They represent setting of a timer by $p_i$ to “go off” after a specified
amount of local clock time elapses and return a message; they are $TimerSet_i(T, m)$
for all real numbers $T > 0$ and messages $m \in M$.
• Timer-expire events: They represent a timer expiration returning a message at $p_i$;
they are $TimerExpire_i(m)$ for all messages $m \in M$.
The call, message-deliver, and timer-expire events are called interrupt events; the 
response, message-send, and timer-set events are called react events. 
Each MCS process pi is modeled as a state machine with a (possibly infinite) set 
of states, including an initial state, and a transition function. Each interrupt event at 
7 Although it is possible to make the local clock of each process a part of its (local) state, which we 
will soon introduce, we chose to keep local clocks separate from states so that we would not need to put 
restrictions on how those parts of states may be modified. 
MCS process pi causes an application of its transition function; thus, computations of 
the system are “interrupt-driven”. More specifically, the transition function is a function 
from tuples of a state, a local clock time, and an interrupt event to tuples of a state and 
sets of react events; in more detail, the transition function takes as input the current 
state, the local clock time, and an interrupt event, and returns a new state, a set of 
response events to the corresponding application program, a set of messages to be sent 
to other MCS processes, and a set of timer-set events. Formally, a computation step of 
process $p_i$ is a pair of tuples $((q, \gamma, \iota), (q', \mathcal{R}, \mathcal{S}, \mathcal{T}))$, where $q$ and $q'$ are states, $\gamma$ is a
real number, called the local clock time, $\iota$ is an interrupt event, $\mathcal{R}$ is a set of response
events, $\mathcal{S}$ is a set of message-send events, and $\mathcal{T}$ is a set of timer-set events, so that
$q'$, $\mathcal{R}$, $\mathcal{S}$, and $\mathcal{T}$ result from the application of $p_i$'s transition function on $q$, $\gamma$ and $\iota$.
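The shape of a transition function and of the computation steps it produces can be sketched in Python as follows. This is only an illustration of the interface described above: the names `Reaction`, `toy_transition` and `run`, and the encoding of events as tuples, are our assumptions, and the toy protocol is not a correct memory consistency system.

```python
from dataclasses import dataclass
from typing import Hashable


@dataclass(frozen=True)
class Reaction:
    """Output of one application of the transition function."""
    new_state: Hashable
    responses: tuple = ()   # response events to the application program
    sends: tuple = ()       # message-send events
    timer_sets: tuple = ()  # timer-set events


def toy_transition(state, clock_time, interrupt):
    """Toy transition function: count interrupts and acknowledge every Write call."""
    if interrupt[0] == "Write":
        _, obj, _value = interrupt
        return Reaction(new_state=state + 1, responses=(("Ack", obj),))
    return Reaction(new_state=state + 1)


def run(transition, initial_state, interrupts):
    """Apply the transition function to (clock_time, interrupt) pairs in order.

    Each computation step pairs the input (old state, clock time, interrupt)
    with the resulting reaction; the old state of each step is the new state
    of the previous one, as required of a history sequence.
    """
    steps, state = [], initial_state
    for clock_time, interrupt in interrupts:
        reaction = transition(state, clock_time, interrupt)
        steps.append(((state, clock_time, interrupt), reaction))
        state = reaction.new_state
    return steps


if __name__ == "__main__":
    for step in run(toy_transition, 0, [(1.0, ("Write", "X", 5)), (2.0, ("Deliver", "m", 2))]):
        print(step)
```

The computation is interrupt-driven in the sense that `run` does nothing between interrupts; all react events are produced by applying the transition function.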
A history for MCS process $p_i$ with clock $\gamma_i$ is a mapping $h_i$ from $\mathbb{R}$ (real time) to
finite sequences of computation steps by $p_i$ such that:
1. for each real time $t$, there is only a finite number of (real) times $t' < t$ such that the
corresponding sequence of computation steps $h_i(t')$ is non-empty; thus, the concatenation
of all such sequences in real-time order is also a sequence, called the history
sequence;
2. the old state for the first computation step in the history sequence is $p_i$'s initial
state;
3. the old state for each subsequent computation step is the new state for the previous
computation step in the history sequence;
4. for each real time $t$, the local clock time of every computation step in the sequence
$h_i(t)$ is equal to $\gamma_i(t)$;
5. for each real time $t$, there is at most one computation step whose interrupt event is
a timer-expire event, and this step is ordered last in the sequence $h_i(t)$;
6. there is a one-to-one correspondence between timer-set and timer-expire events
appearing in computation steps of the history sequence; moreover, each timer-expire
event occurs at local clock time $T$ later than the corresponding timer-set event,
where $T$ is the real number specified in the timer-set event;
7. at most one call event at $p_i$ is “pending” at a time; 8
8. there is a one-to-one correspondence between call and response events appearing in
computation steps of the history sequence. For each call event, the corresponding
response event appears later in the history sequence; moreover, for each call event
$Read_i(X)$, the corresponding response event is an event $Return_i(X, v)$ for some value
$v \in V$, while for each call event $Write_i(X, v)$, the corresponding response event is
an event $Ack_i(X)$.
Each pair of matching call and response events forms an operation. The call event
marks the start of the operation, while the response event marks its end. An operation
op is invoked when the application program issues the appropriate call event for op;
op terminates when the MCS process issues the appropriate response for op.
8 This outlaws pipelining or prefetching at the interface between an application program and the corre- 
sponding MCS process. 
For a given MCS, an execution $\sigma$ is a tuple of histories $(h_1, h_2, \ldots, h_n)$, one for
each MCS process $p_i$ with a corresponding local clock $\gamma_i$, such that for any pair of
MCS processes $p_i$ and $p_j$, there is a one-to-one correspondence between the messages
sent by $p_i$ to $p_j$, and those delivered at $p_j$ that were sent by $p_i$. Use this message
correspondence to define the delay of any message in the execution $\sigma$ to be the real time
of delivery minus the real time of sending. Execution $\sigma$ is admissible if every message
in $\sigma$ incurs a delay in the range $[d-u, d]$, for some fixed and known constants $d$ and $u$,
$0 \le u \le d$; $d$ is the message delay latency, while $u$ is the message delay uncertainty.
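The admissibility condition can be checked mechanically once the send and delivery times of all messages are known. The following Python sketch does so under simplifying assumptions of ours: messages are identified by hypothetical string keys, real times are floats, and the delay range is taken to be the closed interval $[d-u, d]$ used elsewhere in the paper.

```python
def message_delays(send_times, deliver_times):
    """Delay of each message: real time of delivery minus real time of sending."""
    return {m: deliver_times[m] - send_times[m] for m in send_times}


def is_admissible(send_times, deliver_times, d, u):
    """Check that sent and delivered messages correspond and all delays lie in [d - u, d]."""
    # One-to-one correspondence: every sent message is delivered, and vice versa.
    if send_times.keys() != deliver_times.keys():
        return False
    delays = message_delays(send_times, deliver_times)
    return all(d - u <= delay <= d for delay in delays.values())


if __name__ == "__main__":
    sends = {"m1": 0.0, "m2": 1.0}       # hypothetical real send times
    delivers = {"m1": 9.5, "m2": 11.0}   # hypothetical real delivery times
    print(is_admissible(sends, delivers, d=10.0, u=1.0))
    print(is_admissible(sends, delivers, d=10.0, u=0.25))
```

In the example, the delays are 9.5 and 10.0, so the execution is admissible for $u = 1$ but not for $u = 0.25$, where the first message arrives too early.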
2.2. Timing assumptions and clock synchronization 
Fix a (known) constant $\delta$, called clock precision, such that $0 \le \delta < \infty$. Say that an
execution $\sigma$ is a $\delta$-execution if for all pairs of MCS processes $p_i$ and $p_j$ and all real
times $t$, $|\gamma_i(t) - \gamma_j(t)| \le \delta$; notice that, by definition of local clocks, this happens if and
only if $|g_i - g_j| \le \delta$. In particular, a 0-execution will be called an in-phase execution.
The inverse local clock of process $p_i$, denoted $\gamma_i^{-1}$, is the inverse function of $p_i$'s
local clock. By definition of local clock, the inverse clock is a monotonically increasing
function from $\mathbb{R}$ (local clock time) to $\mathbb{R}$ (real time) of the form $\gamma_i^{-1}(c) = c - g_i$; hence,
for any pair of MCS processes $p_i$ and $p_j$, for all real times $t$ and local clock times $c$,
$\gamma_i^{-1}(c) - \gamma_j^{-1}(c) = g_j - g_i = (t + g_j) - (t + g_i) = \gamma_j(t) - \gamma_i(t)$. Hence, it follows:
Proposition 2.1. Fix any $\delta$-execution. Then, for any pair of MCS processes $p_i$ and
$p_j$, and for all local clock times $c$,
$|\gamma_i^{-1}(c) - \gamma_j^{-1}(c)| \le \delta$.
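The clock model and Proposition 2.1 can be illustrated by a short Python sketch. The class `LocalClock` and the function `is_delta_execution` are illustrative names of ours; the sketch assumes only that a local clock has the form $t + g_i$ and its inverse the form $c - g_i$, as defined above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LocalClock:
    g: float  # local clock parameter g_i, fixed per run and unknown to the process

    def read(self, t: float) -> float:
        """Local clock time at real time t: t + g_i."""
        return t + self.g

    def inverse(self, c: float) -> float:
        """Real time at which the local clock reads c: c - g_i."""
        return c - self.g


def is_delta_execution(clocks, delta: float) -> bool:
    """All pairwise clock differences are at most delta iff max(g) - min(g) <= delta."""
    parameters = [clock.g for clock in clocks]
    return max(parameters) - min(parameters) <= delta


if __name__ == "__main__":
    clocks = [LocalClock(0.0), LocalClock(0.5), LocalClock(-0.25)]  # hypothetical offsets
    print(is_delta_execution(clocks, 0.75), is_delta_execution(clocks, 0.5))
    # Proposition 2.1 for one pair: the inverse clocks differ by at most delta.
    print(abs(clocks[0].inverse(4.0) - clocks[1].inverse(4.0)) <= 0.75)
```

Since all clocks run at the rate of real time, the difference between two clocks is constant over time, which is why a single comparison of the parameters suffices.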
The next simple claim relates the difference between local clock times at which 
message-send events occur in a $\delta$-execution, with the difference between real times at
which corresponding message-deliver events occur in the same execution. 
Lemma 2.2. Consider message-send events $Send_{i_1}(m_1, j_1)$ and $Send_{i_2}(m_2, j_2)$ in a
$\delta$-execution $\sigma$, occurring at (real) times $t_1$ and $t_2$, respectively. Let $Del_{j_1}(m_1, i_1)$ and
$Del_{j_2}(m_2, i_2)$ be the corresponding message-deliver events occurring at (real) times $t_1'$
and $t_2'$, respectively, in $\sigma$. Assume that $\gamma_{i_2}(t_2) - \gamma_{i_1}(t_1) > \delta'$. Then, $t_2' - t_1' > \delta' - \delta - u$.
Proof. Clearly,
$\gamma_{i_2}(t_2) - \gamma_{i_1}(t_1) = t_2 + g_{i_2} - (t_1 + g_{i_1})$ (by definition of local clocks)
$= g_{i_2} - g_{i_1} + t_2 - t_1$
$\le \delta + t_2 - t_1$ (since $\sigma$ is a $\delta$-execution);
thus,
$t_2 - t_1 \ge \gamma_{i_2}(t_2) - \gamma_{i_1}(t_1) - \delta$
$> \delta' - \delta$ (by assumption).
Transition Relation:
Pre: -
Eff: $TimerSet_i(d, synch)$
     $Broadcast_i(synch)$
Pre: $Del_i(synch)$
Eff: $Corr_i \leftarrow -\gamma_i$
Pre: $TimerExpire_i(d, synch)$
Eff: $Corr_i \leftarrow -\gamma_i$
Fig. 2. The algorithm $\mathcal{A}^{synch}$: precondition-effect code for process $p_i$.
Since $\sigma$ is admissible, $t_2' \ge t_2 + d - u$ and $t_1' \le t_1 + d$, so that
$t_2' - t_1' \ge t_2 + d - u - (t_1 + d)$
$= t_2 - t_1 - u$
$> \delta' - \delta - u$,
as needed. □
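The inequality of Lemma 2.2 can be checked numerically on randomly generated send and delivery times. The Python sketch below is a sanity check under assumptions of ours, not a substitute for the proof: parameter values, the sampling ranges and the function names are illustrative, and a small tolerance absorbs floating-point rounding. It verifies the non-strict form, namely that the gap between delivery times is at least the gap between local send times minus $\delta + u$.

```python
import random


def lemma_2_2_gaps(g1, g2, t1, t2, delay1, delay2):
    """Return (real delivery-time gap, local send-time gap) for two messages."""
    local_gap = (t2 + g2) - (t1 + g1)        # gamma_{i2}(t2) - gamma_{i1}(t1)
    real_gap = (t2 + delay2) - (t1 + delay1)  # t2' - t1'
    return real_gap, local_gap


def check_lemma_2_2(trials=10_000, d=10.0, u=2.0, delta=1.5, seed=0):
    """Sample delta-executions with delays in [d - u, d] and test the inequality."""
    rng = random.Random(seed)
    for _ in range(trials):
        g1 = rng.uniform(-5.0, 5.0)
        g2 = g1 + rng.uniform(-delta, delta)  # clock parameters within delta
        t1, t2 = rng.uniform(0.0, 100.0), rng.uniform(0.0, 100.0)
        delay1, delay2 = rng.uniform(d - u, d), rng.uniform(d - u, d)
        real_gap, local_gap = lemma_2_2_gaps(g1, g2, t1, t2, delay1, delay2)
        if real_gap < local_gap - delta - u - 1e-9:
            return False
    return True


if __name__ == "__main__":
    print(check_lemma_2_2())
```

The worst case arises when the second sender's clock is ahead by $\delta$, the first message takes the full delay $d$ and the second takes only $d - u$.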
Although we shall treat the clock precision $\delta$ as a fixed parameter, it is possible to
have each process obtain a logical clock that is “closer” to those of other processes
by computing an additive “software” correction to its local clock time through a clock
synchronization algorithm (CSA); see, e.g., [30,36,44,51] or [16, Section 6.3]. Say
that a CSA achieves clock precision $\Delta$ if the maximum difference between the logical
clock times of any two processes at any real time after all processes have terminated
executing the algorithm is at most $\Delta$. There are, however, known limitations on the
best achievable clock precision, as a function of the number of processes n and the 
message delay uncertainty u. 
Proposition 2.3 (Lundelius and Lynch [44]). No CSA achieves clock precision less
than $(1 - 1/n)u$.
We proceed to present a simple clock synchronization algorithm $\mathcal{A}^{synch}$ that achieves
clock precision $u$. We start with an informal description of $\mathcal{A}^{synch}$. Each process $p_i$
broadcasts a special synchronization message synch, and sets a timer for time $d$ thereafter.
On either the first receipt of some synch message from some other process, or
on expiration of its own timer, whichever happens first, $p_i$ sets its (logical) clock time
to 0. In more detail, if first receipt or expiration occurs at (real) time $t$, $p_i$ adopts an
additive correction of $-\gamma_i(t)$ to its local clock, so that its logical
clock time is 0 at time $t$. In all future discussion, we will use local clock time to refer to
logical clock time. Fig. 2 presents the code for process $p_i$ in a precondition-effect style
that is commonly used to describe I/O automata [45]. We show:
Proposition 2.4. $\mathcal{A}^{synch}$ achieves clock precision $u$.
Proof. Fix any admissible execution $\sigma$. For each process $p_l$, let $t_l$ be the minimum
among all (real) times $t$ such that either $TimerExpire_l(synch)$ or $Del_l(synch)$ occurs
at time $t$. Denote $t_{\max} = \max_{l \in [n]} t_l$; thus, $t_{\max}$ is the time at which the last process
completes the execution of $\mathcal{A}^{synch}$. Let $p_i$ be the last process to complete the execution
of $\mathcal{A}^{synch}$, so that $t_{\max} = t_i$. We start by showing:
Lemma 2.5. For any process $p_l$, $Broadcast_l(synch)$ occurs no earlier than time
$t_{\max} - d$.
Proof. Assume, by way of contradiction, that for some process $p_l$, $Broadcast_l(synch)$
occurs at real time less than $t_{\max} - d$ in $\sigma$. Since $\sigma$ is admissible, $Del_i(synch)$ occurs
at real time less than $t_{\max} - d + d = t_{\max}$. Thus, $t_i < t_{\max}$. A contradiction. □
We continue to show:
Lemma 2.6. For any process $p_l$, $Del_l(synch)$ occurs no earlier than time $t_{\max} - u$.
Proof. Since $\sigma$ is admissible, $Del_l(synch)$ occurs at real time which is at least $d - u$
later than the real time at which a broadcast event occurs. By Lemma 2.5, it follows
that $Del_l(synch)$ occurs no earlier than time $t_{\max} - d + d - u = t_{\max} - u$, as needed. □
We finally show:
Lemma 2.7. For any process $p_l$, $TimerExpire_l(synch)$ occurs no earlier than
time $t_{\max}$.
Proof. By the algorithm, $TimerSet_l(d, synch)$ and $Broadcast_l(synch)$ occur at the
same real time. Thus, by Lemma 2.5, $TimerSet_l(d, synch)$ occurs no earlier than time
$t_{\max} - d$. It follows that $TimerExpire_l(synch)$ occurs no earlier than time $t_{\max} - d +
d = t_{\max}$, as needed. □
Consider any process $p_l$. If $p_l$ completes the execution of $\mathcal{A}^{synch}$ on
$Del_l(synch)$, then, by Lemma 2.6 and definition of $t_{\max}$,
$t_{\max} - u \le t_l \le t_{\max}$. If $p_l$ completes the execution of $\mathcal{A}^{synch}$
on $TimerExpire_l(synch)$, then, by Lemma 2.7 and definition of $t_{\max}$,
$t_{\max} \le t_l \le t_{\max}$, so that $t_l = t_{\max}$. This implies:
Lemma 2.8. For any process $p_l$, $t_{\max} - u \le t_l \le t_{\max}$.
Consider any real time $t \ge t_{\max}$, and any pair of processes $p_j$ and $p_k$,
$j \ne k$. Clearly,
\[ \gamma_k(t) - \gamma_j(t) = (t - t_k) - (t - t_j) = t_j - t_k \quad \text{(by the algorithm)}. \]
By Lemma 2.8, $t_{\max} - u \le t_k \le t_{\max}$ and $t_{\max} - u \le t_j \le t_{\max}$, so
that $|t_j - t_k| \le u$. Thus $|\gamma_k(t) - \gamma_j(t)| \le u$. It follows that
$\mathcal{A}^{synch}$ achieves clock precision $u$, as needed. □
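The argument above can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes that each process broadcasts at an arbitrary real time, that every message delay is drawn independently from $[d-u, d]$, and that a process completes at the earlier of its own timer expiry and its first delivery. The names synch_completion_times and precision are hypothetical.

```python
import random

def synch_completion_times(broadcast_times, d, u, rng):
    """Real time t_l at which each process resets its logical clock to 0."""
    n = len(broadcast_times)
    completion = []
    for i in range(n):
        # The timer is set at broadcast time and expires d later.
        timer_expiry = broadcast_times[i] + d
        # Each synch message from another process arrives after a delay in [d-u, d].
        deliveries = [
            broadcast_times[j] + rng.uniform(d - u, d)
            for j in range(n)
            if j != i
        ]
        completion.append(min([timer_expiry] + deliveries))
    return completion

def precision(completion):
    # After completion the logical clock of p_l reads t - t_l, so two clocks
    # differ by t_j - t_k; the worst case is the spread of completion times.
    return max(completion) - min(completion)
```

For example, with d = 10 and u = 3, the value precision(synch_completion_times(starts, 10.0, 3.0, random.Random(0))) should stay within 3 for any list starts, in line with Proposition 2.4.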
We remark that Lundelius and Lynch [44] have shown that clock precision of
$(1 - 1/n)u$ is indeed achievable, which is slightly better than the precision $u$ achieved
in Proposition 2.4. Lundelius and Lynch [44, Section 4] present a clock synchronization
algorithm that carries explicit timing information, i.e., local clock values, in all messages
exchanged between processes; in that algorithm, each process also needs to "count" the number
of messages it receives from other processes. In contrast, the messages sent by our clock
synchronization algorithm are of constant size and carry no timing information, and processes
do not need to "count". In these respects, our clock synchronization algorithm is
more efficient in both message size and space overhead than the one of Lundelius and
Lynch. Thus, we choose to use our own clock synchronization algorithm in some of
our later algorithms in order to keep those correspondingly efficient as well.
In the perfect clocks model, MCS processes have perfectly synchronized (perfect)
clocks, i.e., $\delta = 0$. This is modeled by assuming that for each MCS process $p_i$,
$\gamma_i(t) = t$. Attiya and Welch [15] note that the assumption of perfect clocks is
equivalent to the assumption of constant (and known) message delays, which, in our formal
model, can be modeled by assuming $u = 0$. If clocks are perfect and there is a constant and
known upper bound $d$ on message delay, then constant message delays can be simulated by
time-stamping each message with the local clock time of the sender at sending time,
and having each recipient delay any message that arrives with a delay smaller than $d$
until the delay is exactly $d$. If the message delay is constant and known, then a simple
clock synchronization algorithm can synchronize the clocks perfectly; each message is
time-stamped with the local clock time of the sender at sending time, which allows
the recipient to exactly synchronize its local clock to that of the sender.
In the more realistic approximately synchronized clocks model, MCS processes have
local clocks with finite clock precision; that is, $0 < \delta < \infty$. Proposition 2.4
implies that we can assume a clock precision of $\min\{\delta, u\}$ for all
$\delta$-executions in the approximately synchronized clocks model.
In the imperfect clocks model, clocks may be arbitrarily far from each other, i.e.,
$\delta = \infty$. Proposition 2.4 implies that we can assume a clock precision of
$\min\{\infty, u\} = u$ for all executions in the imperfect clocks model.
2.3. Memory objects
Each object $X$ has a serial specification [32] which describes its behavior in the
absence of concurrency and failures. Formally, it defines:
• A set $OP(X)$ of operations on $X$, which are ordered pairs of call and response
events. Each operation $op \in OP(X)$ has a value $val(op)$ associated with it.
• A set of legal operation sequences for $X$, which are the allowable sequences of
operations on $X$.
The set $OP(X)$ contains a read operation $[Read_i(X), Return_i(X, v)]$ on $X$, and a write
operation $[Write_i(X, v), Ack_i(X)]$ on $X$, for each index $i \in [n]$ and value
$v \in \mathcal{V}$; $v$ is the value associated with each of these operations. The set of
legal operation sequences for $X$ contains all sequences of operations on $X$ for which, for
any read operation $rop$ in the sequence, either $val(rop) = \bot$ and there is no preceding
write operation in the sequence, or $val(rop) = val(wop)$, where $wop$ is the latest preceding
write operation. Thus, each legal operation sequence obeys the usual read/write semantics:
every read operation on $X$ returns the value of the latest preceding write operation on $X$,
if there is one, or, otherwise, an "undefined" value.
Let $\tau$ be a sequence of operations. Denote by $\tau|i$ the restriction of $\tau$ to
operations at the MCS process $p_i$; similarly, denote by $\tau|X$ the restriction of $\tau$
to operations on the object $X$. A sequence of operations $\tau$ for a collection of processes
and objects is legal if, for every object $X \in \mathcal{X}$, $\tau|X$ is in the set of legal
operation sequences for $X$.
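The legality condition can be made concrete with a short sketch. The encoding below is an assumption made for illustration only: an operation is a tuple (proc, obj, kind, value), the undefined value $\bot$ is represented by None, and the helper names restrict, is_legal_for_object and is_legal are hypothetical.

```python
BOTTOM = None  # stands for the "undefined" initial value

def restrict(seq, proc=None, obj=None):
    """Restriction of an operation sequence to one process and/or one object."""
    return [
        op for op in seq
        if (proc is None or op[0] == proc) and (obj is None or op[1] == obj)
    ]

def is_legal_for_object(seq):
    """Read/write semantics for a sequence of operations on a single object."""
    latest = BOTTOM
    for _proc, _obj, kind, value in seq:
        if kind == "write":
            latest = value
        elif value != latest:
            # A read must return the latest preceding written value, or BOTTOM.
            return False
    return True

def is_legal(seq):
    """A sequence is legal if its restriction to every object is legal."""
    objects = {op[1] for op in seq}
    return all(is_legal_for_object(restrict(seq, obj=x)) for x in objects)
```

Checking each object separately mirrors the definition: operations on one object never affect the legality of the restriction to another object.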
We often speak informally of an operation on an object as in “the read operation 
on the object X”. An operation in our formal model is intended to represent a single 
“execution” of an operation as used in the informal sense. 
2.4. Correctness conditions
Correctness conditions are specified at the interface between the application programs
(written by the users), and the MCS processes (supplied by the system).
Given an execution $\sigma$, let $ops(\sigma)$ be the sequence of call and response events
appearing in $\sigma$ in real-time order, breaking ties for each real time $t$ as follows.
First, order all response events whose matching call events occur before time $t$, using
process identification numbers (ids) to break any remaining ties. Then, order all
operations whose call and response events both occur at time $t$. Preserve the relative
ordering of operations for each process, and break any remaining ties using process
ids. Finally, order all call events whose matching response events occur after time
$t$, using process ids to break any remaining ties. For an execution $\sigma$, the definitions
of $\tau|i$ and $\tau|X$ can be extended in the natural way to yield $ops(\sigma)|i$ and
$ops(\sigma)|X$, respectively.
An execution $\sigma$ specifies a partial order $\xrightarrow{\sigma}$ on the operations
appearing in $\sigma$ as follows. For any operations $op_1$ and $op_2$ appearing in $\sigma$,
$op_1 \xrightarrow{\sigma} op_2$ if the response for $op_1$ precedes the call for $op_2$ in
$ops(\sigma)$; that is, $op_1 \xrightarrow{\sigma} op_2$ if $op_1$ completely
precedes $op_2$ in $ops(\sigma)$.
Given an execution $\sigma$, an operation sequence $\tau$ is a serialization of $\sigma$ if it
is a permutation of $ops(\sigma)$. A serialization $\tau$ of $\sigma$ is a linearization of
$\sigma$ if it extends $\xrightarrow{\sigma}$; that is, if $op_1 \xrightarrow{\sigma} op_2$,
then $op_1 \xrightarrow{\tau} op_2$. Roughly speaking, the definitions for sequential
consistency and linearizability involve, for each execution $\sigma$, the existence of a
serialization $\tau$ of $\sigma$ that possesses certain properties. The formal definitions for
sequential consistency and linearizability follow.
Definition 2.1 (Sequential Consistency, Lamport [38]). An execution $\sigma$ is sequentially
consistent if there exists a legal serialization $\tau$ of $\sigma$ such that for each MCS
process $p_i$, $ops(\sigma)|i = \tau|i$.
Definition 2.2 (Linearizability, Herlihy and Wing [32]). An execution $\sigma$ is
linearizable if there exists a legal linearization $\tau$ of $\sigma$ such that for each MCS
process $p_i$, $ops(\sigma)|i = \tau|i$.
Intuitively, $\sigma$ is sequentially consistent if the sequence of operations in $\sigma$ can
be permuted to yield an operation sequence $\tau$ that is legal and maintains the order of call
and response events seen at each process; if, in addition, $\tau$ preserves the order of any
two non-overlapping operations in $\sigma$, $\sigma$ is said to be linearizable (see
footnote 9).
An MCS is a sequentially consistent implementation of $\mathcal{X}$ if every admissible
execution of the MCS is sequentially consistent; similarly, an MCS is a linearizable
implementation of $\mathcal{X}$ if every admissible execution of the MCS is linearizable.
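For very small executions on a single object, the two definitions can be tested by exhaustive search over serializations. The sketch below is illustrative only and assumes a hypothetical encoding: each operation is a dict with keys proc, kind, value, call and resp, the undefined value is None, and ties between a response time and a call time are not treated as precedence. The search is exponential in the number of operations.

```python
from itertools import permutations

def is_legal(seq):
    """Every read returns the latest preceding written value (None if none)."""
    latest = None
    for op in seq:
        if op["kind"] == "write":
            latest = op["value"]
        elif op["value"] != latest:
            return False
    return True

def keeps_process_order(seq):
    """Operations of each process appear in the order in which they were called."""
    last_call = {}
    for op in seq:
        if op["call"] < last_call.get(op["proc"], float("-inf")):
            return False
        last_call[op["proc"]] = op["call"]
    return True

def keeps_real_time_order(seq):
    """No operation is placed before one that completed before it was called."""
    for i, first in enumerate(seq):
        for second in seq[i + 1:]:
            if second["resp"] < first["call"]:
                return False
    return True

def is_sequentially_consistent(ops):
    return any(is_legal(p) and keeps_process_order(p) for p in permutations(ops))

def is_linearizable(ops):
    return any(
        is_legal(p) and keeps_process_order(p) and keeps_real_time_order(p)
        for p in permutations(ops)
    )
```

A write of 1 that completes before a read by another process is called, where the read returns the undefined value, is sequentially consistent under this check but not linearizable.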
A correctness condition is compositional (or local) [32] if the combination of memory
objects each of which individually satisfies the condition yields an implementation
that satisfies the condition as well. An important distinction holds between sequential
consistency and linearizability with respect to compositionality.
Proposition 2.9 (Herlihy and Wing [32]). Linearizability is local; sequential
consistency is not.
Proposition 2.9 implies that to give a linearizable implementation of $\mathcal{X}$, it
suffices to give a linearizable implementation of a single object $X \in \mathcal{X}$. In
contrast, for sequential consistency, all objects must be implemented together. (This causes
development costs to increase and makes it hard to apply separate optimizations to different
objects; see [32] for an expanded discussion.)
Footnote 9. Linearizability may be viewed as a special case of strict serializability (see,
e.g., [18,49]), a basic correctness condition for concurrent computations on databases, where
transactions are restricted to appear to be a single operation on a single object.
2.5. Cost measures
In general, the efficiency of an implementation $\mathcal{A}$ of $\mathcal{X}$ is measured by
the worst-case response time for any operation on an object $X \in \mathcal{X}$. Given a
particular MCS $\mathcal{A}$ and a read/write object $X$ implemented by it, the time
$|op_{\mathcal{A}}(X, \sigma)|(\delta)$ taken by an operation $op$ on $X$ in an admissible
$\delta$-execution $\sigma$ of $\mathcal{A}$ is the maximum difference between
the times at which the response and call events of $op$ occur in $\sigma$, where the maximum
is taken over all occurrences of $op$ in $\sigma$. In particular, we denote by
$|R_{\mathcal{A}}(X, \sigma)|(\delta)$ and $|W_{\mathcal{A}}(X, \sigma)|(\delta)$ the maximum
time taken by a read and a write operation, respectively, on $X$ in $\sigma$, where the
maximum is taken over all occurrences of the corresponding operations in $\sigma$.
Define $|R_{\mathcal{A}}(X)|(\delta)$ (resp., $|W_{\mathcal{A}}(X)|(\delta)$) as the maximum
of $|R_{\mathcal{A}}(X, \sigma)|(\delta)$ (resp., $|W_{\mathcal{A}}(X, \sigma)|(\delta)$) over
all $\delta$-executions $\sigma$ of $\mathcal{A}$. Define $|R_{\mathcal{A}}|(\delta)$ (resp.,
$|W_{\mathcal{A}}|(\delta)$) as the maximum of $|R_{\mathcal{A}}(X)|(\delta)$ (resp.,
$|W_{\mathcal{A}}(X)|(\delta)$), over all read/write objects $X$ implemented by the
MCS $\mathcal{A}$. Let also $|R|(\delta)$ and $|W|(\delta)$ denote the minimum, over all
implementations $\mathcal{A}$ of $\mathcal{X}$, of $|R_{\mathcal{A}}|(\delta)$ and
$|W_{\mathcal{A}}|(\delta)$, respectively.
Finally, let $|R|$ and $|W|$ be the minimum of $|R|(\delta)$ and $|W|(\delta)$, respectively,
over all achievable precisions $\delta$. It follows from Proposition 2.3 that
$|R| \ge |R|((1 - 1/n)u)$ and $|W| \ge |W|((1 - 1/n)u)$. The sum $|R| + |W|$ is also
considered as a measure of efficiency.
2.6. Shifting executions and clocks
Our presentation closely follows a corresponding one in [15].
In our later proofs of lower bounds (Section 5) we use the technique of shifting,
originally introduced by Lundelius and Lynch [44] to prove lower bounds on the clock
precision achievable by clock synchronization algorithms. Shifting is used to change
the timing and the ordering of events in an execution of the system, while preserving
the "local views" of the processes.
Roughly speaking, given an execution, if for each process $p_i$, $p_i$'s history is changed
so that the real times at which the events at $p_i$ occur are shifted by some amount, and if
$p_i$'s clock is shifted by the same amount, then the result is another execution in which
every process still "sees" the same events happening at the same local clock time.
The intuition is that the changes in the real times at which events at a process occur
cannot be detected by the process because its clock has changed by a corresponding
amount.
More precisely, the view of process $p_i$ in execution $\sigma = \{h_1, h_2, \ldots, h_n\}$,
denoted $view_i(\sigma)$, is the history sequence defined by the history $h_i$ in $\sigma$.
Note that the real times of occurrences of events at $p_i$ are not represented in the view of
$p_i$.
Say that executions $\sigma_1$ and $\sigma_2$ are equivalent if, for each MCS process $p_i$,
$view_i(\sigma_1) = view_i(\sigma_2)$. Intuitively, equivalent executions are
indistinguishable to the processes; only an "outside observer" with access to real time can
tell them apart.
Given a history $h_i$ of MCS process $p_i$ with clock $\gamma_i$ and a real number $s$, a new
history $h_i' = shift(h_i, s)$ is defined by $h_i'(t) = h_i(t + s)$ for all real times $t$.
That is, all sequences of computation steps are shifted earlier in $h_i$ by $s$ if $s$ is
positive, and later by $-s$ if $s$ is negative. Given a clock $\gamma_i$ for MCS process $p_i$
and a real number $s$, a new clock $\gamma_i' = shift(\gamma_i, s)$ is defined by
$\gamma_i'(t) = \gamma_i(t) + s$ for all real times $t$. That is, the clock is shifted forward
by $s$ if $s$ is positive, and backward by $-s$ if $s$ is negative. The
following claim observes that simultaneously shifting a process's history and clock by
the same amount yields another process history.
Lemma 2.10. Let $h_i$ be a history of MCS process $p_i$ with clock $\gamma_i$, and let $s$ be
a real number. Then, $shift(h_i, s)$ is a history of $p_i$ with clock $shift(\gamma_i, s)$.
Given an execution $\sigma$ and a real vector $s = (s_1, s_2, \ldots, s_n)$, a new execution
$\sigma' = shift(\sigma, s)$ is defined by replacing, for each MCS process $p_i$, the history
$h_i$ of $p_i$ in $\sigma$ by (the history) $shift(h_i, s_i)$, while retaining the same
correspondence between sent and delivered messages. (Technically, the correspondence is
redefined so that a pairing in $\sigma$ that involves a message-send or message-deliver event
for an MCS process $p_i$ at time $t$ involves, in $\sigma'$, the event for $p_i$ occurring at
time $t - s_i$.)
Given a tuple of clocks $\Gamma = \{\gamma_i \mid 1 \le i \le n\}$, and a real vector
$s = (s_1, s_2, \ldots, s_n)^T$, a new tuple of clocks $\Gamma' = shift(\Gamma, s)$ is defined
by replacing, for each MCS process $p_i$, local clock $\gamma_i$ by local clock
$shift(\gamma_i, s_i)$.
The following claim observes that shifting each process's history and clock by
the same amount in an execution yields another execution that is equivalent to the
original.
Lemma 2.11 (Lundelius and Lynch [44]). Let $\sigma$ be an execution with clocks $\Gamma$, and
consider any real vector $s$. Then, $shift(\sigma, s)$ is an execution with clocks
$shift(\Gamma, s)$ that is equivalent to $\sigma$ with clocks $\Gamma$.
The following claim quantifies how message delays change when an execution is
shifted.
Lemma 2.12 (Lundelius and Lynch [44]). Let $s$ be a real vector. For any pair of
MCS processes $p_i$ and $p_j$, if the delay of a message $m$ from $p_i$ to $p_j$ in the
execution $\sigma$ with clocks $\Gamma$ is equal to $\Delta$, then the delay of $m$ in the
execution $shift(\sigma, s)$ is equal to $\Delta + s_i - s_j$.
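The delay change stated in Lemma 2.12 can be traced on a single message. The sketch below is an illustration under stated assumptions: it uses the fact, from the definition of shift, that an event occurring at real time t in a history occurs at time t - s after the history is shifted by s; the function names are hypothetical.

```python
def shift_event_time(t, s):
    """Real time of an event after its process's history is shifted by s."""
    return t - s

def shifted_delay(t_send, t_deliver, s_sender, s_receiver):
    """Delay of a message once sender and receiver histories are shifted."""
    return shift_event_time(t_deliver, s_receiver) - shift_event_time(t_send, s_sender)

def is_admissible_delay(delay, d, u):
    """Delays must lie in the range [d - u, d]."""
    return d - u <= delay <= d
```

For a message sent at time 10 and delivered at time 14 (delay 4), shifting the sender by 1 and the receiver by 3 gives a delay of 4 + 1 - 3 = 2. With d = 5 and u = 2 the original delay lies in [3, 5] while the shifted one does not.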
Lemma 2.12 implies that the result of shifting an admissible execution is not
necessarily admissible. The next simple claim precisely determines the change in clock
precision due to shifting an execution.
Lemma 2.13. Assume $\sigma$ is a $\Delta$-execution with clocks $\Gamma$. Then, for any real
vector $s$, the execution $shift(\sigma, s)$ with clocks $\Gamma' = shift(\Gamma, s)$ is a
$(\Delta + |\,\|s\|_{\infty} - \|s\|_{-\infty}|)$-execution.
Proof. Clearly, for any MCS processes $p_i$ and $p_j$ and real time $t$,
\[ |\gamma_i'(t) - \gamma_j'(t)| = |\gamma_i(t) + s_i - (\gamma_j(t) + s_j)| \]
\[ \le |\gamma_i(t) - \gamma_j(t)| + |s_i - s_j| \quad \text{(by triangle inequality)} \]
\[ \le \Delta + |s_i - s_j| \quad \text{(since $\sigma$ is a $\Delta$-execution)} \]
\[ \le \Delta + |\,\|s\|_{\infty} - \|s\|_{-\infty}|, \]
which implies that the execution $shift(\sigma, s)$ with clocks $\Gamma' = shift(\Gamma, s)$
is a $(\Delta + |\,\|s\|_{\infty} - \|s\|_{-\infty}|)$-execution, as needed. □
2.7. Notation
In this section, we introduce some notation that will be used in the sequel. Consider
any execution $\sigma$, and let $op = [Call(op), Response(op)]$ be any operation in $\sigma$.
We denote by $t_c^{(\sigma)}(op)$ and $t_r^{(\sigma)}(op)$ the (real) times at which
$Call(op)$ and $Response(op)$, respectively, occur in $\sigma$. When $\sigma$ is not clear
from context, we use $val^{(\sigma)}(op)$ to denote the value associated with the "execution"
of operation $op$ in $\sigma$.
For any real numbers $x_1$ and $x_2$, $x_1 \ge 0$ and $x_2 > 0$, $fmod(x_1, x_2)$ denotes the
remainder of the division of $x_1$ by $x_2$, i.e.,
$fmod(x_1, x_2) = x_1 - x_2 \lfloor x_1 / x_2 \rfloor$. For a real interval
$I = [i_1, i_2]$, $\lfloor I \rfloor = i_1$ and $\lceil I \rceil = i_2$; the length
$i_2 - i_1$ of $I$ is denoted by $|I|$.
For any index $i$ and message $m \in \mathcal{M}$, we use $Broadcast_i(m)$ to denote the set
of message-send events $\{Send_i(m, j) : j \in [n]\}$.
3. Perfect clocks
In this section, we consider the perfect clocks model, where $\delta = 0$ and $u = 0$. We
show:
Theorem 3.1. For the perfect clocks model, there exists a linearizable implementation
$\mathcal{A}^{per}$ of read/write objects such that $|R_{\mathcal{A}^{per}}|(0) = \beta d$,
and $|W_{\mathcal{A}^{per}}|(0) = (1 - \beta)d$, for any constant $\beta$, $0 < \beta < 1$.
By Proposition 2.9, it suffices to provide an implementation of a single object
$X \in \mathcal{X}$. In Section 3.1, we describe the implementation $\mathcal{A}^{per}$, while
a correctness proof and complexity analysis for $\mathcal{A}^{per}$ are presented in
Sections 3.2 and 3.3, respectively.
3.1. The algorithm
We start with an informal description of $\mathcal{A}^{per}$. Each process $p_i$ keeps a local
copy $X_i$ of object $X$; denote by $val(X_i)$ the value currently held by $X_i$, initially
$\bot$. Upon a $Read_i(X)$ event, $p_i$ waits for time $\beta d$ and issues
$Return_i(X, val(X_i))$. Upon a $Write_i(X, v)$ event, $p_i$ sends update messages
$update(X, v)$ to all processes; after time $(1 - \beta)d$ passes, $p_i$ issues $Ack_i(X)$ and
waits for an additional time of $\beta d$ to set $X_i$ to $v$. Furthermore, upon
receipt of an update message for $X$ from another process, $p_i$ immediately updates $X_i$
to the value being written (see footnote 10).
We remark that $\mathcal{A}^{per}$ guarantees that all local memories of processes undergo
"identical" changes with respect to each write operation; that is, all processes
simultaneously update their local copies to the value being written.
The code for process $p_i$ appears in Fig. 3 in the same style as Fig. 2.
Footnote 10. If $p_i$ receives several such update messages simultaneously, it updates $X_i$
to the minimal (with respect to $<_{\mathcal{V}}$) of the corresponding values.

    Local State:
      X_i: The local copy of object X, initially ⊥

    Transition Relation:

    [Read_i(X), Return_i(X, v)]:
      Read_i(X)        Pre: Read_i(X)
                       Eff: TimerSet_i(βd, read(X))
      Return_i(X, v)   Pre: TimerExpire_i(read(X))
                       Eff: Return_i(X, val(X_i))

    [Write_i(X, v), Ack_i(X)]:
      Write_i(X, v)    Pre: Write_i(X, v)
                       Eff: Broadcast_i(update(X, v));
                            TimerSet_i((1 - β)d, write(X));
                            TimerSet_i(d, update(X, v))
      Ack_i(X)         Pre: TimerExpire_i(write(X))
                       Eff: Ack_i(X)
      X_i ← v          Pre: TimerExpire_i(update(X, v))
                       Eff: X_i ← v

    Update of X_i:
                       Pre: Del_i(update(X, v), j)
                       Eff: X_i ← v

Fig. 3. The algorithm $\mathcal{A}^{per}$: precondition-effect code for process $p_i$.
3.2. Correctness proof
Fix any admissible 0-execution $\sigma$ of $\mathcal{A}^{per}$. We construct a legal
linearization $\tau$ of $\sigma$ such that, for each MCS process $p_i$,
$ops(\sigma)|i = \tau|i$; read and write operations are "serialized" to occur at their times
of call and response in $\sigma$, respectively, breaking ties by ordering all write operations
before read ones that are "serialized" together and then using $<_{\mathcal{V}}$.
Formally, we assign a time $\mathcal{T}(op)$ to each operation
$op = [Call(op), Response(op)]$ in $\sigma$ as follows. Define $\mathcal{T}(op)$ to be either
$t_c^{(\sigma)}(op)$ if $op$ is a read operation, or $t_r^{(\sigma)}(op)$ if $op$ is a write
operation. We construct $\tau$ as follows:
1. for any pair of operations $op_1$ and $op_2$ in $\sigma$ such that
$\mathcal{T}(op_1) < \mathcal{T}(op_2)$, $op_1 \xrightarrow{\tau} op_2$;
2. for any pair of operations $op_1$ and $op_2$ in $\sigma$ such that
$\mathcal{T}(op_1) = \mathcal{T}(op_2)$,
(a) if $op_1$ is a write operation and $op_2$ is a read operation, then
$op_1 \xrightarrow{\tau} op_2$;
(b) if $op_1$ and $op_2$ are either both read operations or both write operations, then, if
$val(op_1) <_{\mathcal{V}} val(op_2)$, then $op_1 \xrightarrow{\tau} op_2$, else
($val(op_1) = val(op_2)$) $op_1$ and $op_2$ are ordered arbitrarily in $\tau$.
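The construction can be exercised on a toy execution. The following is a minimal sketch under stated assumptions, not the paper's code: there is a single object, every message delay is exactly d, no two writes are called at the same time (so the tie-breaking rule 2(b) is never needed), an update that occurs at the same time as a read's return is seen by the read, and operations are dicts with the hypothetical keys proc, kind, call and value.

```python
import bisect

def simulate_per(ops, d, beta):
    """Response time, serialization time T and returned value of each operation."""
    # With delay exactly d and a local timer of d, every copy X_i changes at
    # the same real time call + d, so one shared update log suffices.
    updates = sorted(
        ((op["call"] + d, op["value"]) for op in ops if op["kind"] == "write"),
        key=lambda pair: pair[0],
    )
    update_times = [t for t, _ in updates]
    done = []
    for op in ops:
        if op["kind"] == "read":
            resp = op["call"] + beta * d
            k = bisect.bisect_right(update_times, resp)  # updates at time <= resp
            value = updates[k - 1][1] if k else None
            # Reads are serialized at their call time.
            done.append({**op, "resp": resp, "value": value, "T": op["call"]})
        else:
            resp = op["call"] + (1 - beta) * d
            # Writes are serialized at their response time.
            done.append({**op, "resp": resp, "T": resp})
    return done

def linearize(done):
    """Order by T; on ties, writes come before reads (rules 1 and 2(a))."""
    return sorted(done, key=lambda op: (op["T"], op["kind"] == "read"))

def is_legal(seq):
    latest = None
    for op in seq:
        if op["kind"] == "write":
            latest = op["value"]
        elif op["value"] != latest:
            return False
    return True
```

Choosing d = 8 and beta = 0.25 keeps all times exactly representable, so ties between a write's response time and a read's call time are detected reliably.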
We start by showing:
Lemma 3.2. $\tau$ is a linearization of $\sigma$.
Proof. Let $op_1$ and $op_2$ be any operations in $\sigma$ such that
$op_1 \xrightarrow{\sigma} op_2$. By definition of $\xrightarrow{\sigma}$,
$t_r^{(\sigma)}(op_1) \le t_c^{(\sigma)}(op_2)$. By definition of $\mathcal{T}$,
$\mathcal{T}(op_1) \le t_r^{(\sigma)}(op_1)$, and
$\mathcal{T}(op_2) \ge t_c^{(\sigma)}(op_2)$. It follows that
$\mathcal{T}(op_1) \le \mathcal{T}(op_2)$. By construction of $\tau$, the only non-trivial
case occurs when $\mathcal{T}(op_1) = \mathcal{T}(op_2)$. This happens only if
$\mathcal{T}(op_1) = t_r^{(\sigma)}(op_1)$ and $\mathcal{T}(op_2) = t_c^{(\sigma)}(op_2)$.
Then, by definition of $\mathcal{T}$, $op_1$ is a write operation, while $op_2$ is a read
operation. Hence, by construction of $\tau$, $op_1 \xrightarrow{\tau} op_2$, as needed. □
We continue to prove:
Lemma 3.3. For each MCS process $p_i$, $ops(\sigma)|i = \tau|i$.
Proof. Fix any MCS process $p_i$. For any operations $op_1$ and $op_2$, say
$op_1 \xrightarrow{\tau|i} op_2$ (resp., $op_1 \xrightarrow{\sigma|i} op_2$) if $op_1$
precedes $op_2$ in $\tau|i$ (resp., $\sigma|i$). To show that $\tau|i = \sigma|i$, it
suffices to show that the order of any two operations in $\tau|i$ is the same as their order
in $\sigma|i$.
Consider any pair of operations $op_1$ and $op_2$ such that
$op_1 \xrightarrow{\sigma|i} op_2$. Clearly, $op_1 \xrightarrow{\sigma} op_2$. Lemma 3.2
implies that $op_1 \xrightarrow{\tau} op_2$. It follows that
$op_1 \xrightarrow{\tau|i} op_2$, as needed. □
We continue to show that $\tau$ is a legal operation sequence. We define a relation $\mapsto$
between the set of write operations in $\sigma$ and the set of read operations in $\sigma$, as
follows. For any pair of write and read operations $wop$ and $rop$, respectively, in $\sigma$,
$wop \mapsto rop$ if $val(wop) = val(rop)$ and the most recent update (in $\sigma$) of the
local copy of $X$ by the reading process, before it returns on $rop$, is to $val(wop)$ as a
result of either receipt of an update message $update(X, val(wop))$ from the writing process,
or a timer expiration event $TimerExpire_i(update(X, v))$. Roughly speaking, $\mapsto$
captures causality and relates each read operation in $\sigma$ to the most "recent" write
operation in $\sigma$ writing the returned value. We start with a simple claim.
Lemma 3.4. Consider any pair $wop = [Write_i(X, v), Ack_i(X)]$ and
$rop = [Read_k(X), Return_k(X, v)]$ of write and read operations, respectively, in $\sigma$,
for some value $v \in \mathcal{V}$ and indices $i, k \in [n]$, such that $wop \mapsto rop$.
Then, $wop \xrightarrow{\tau} rop$.
Proof. Since all message delays are exactly $d$ and, by the algorithm, each local update
is performed time $d$ later than the invocation of the corresponding write operation,
it follows that $t_r^{(\sigma)}(rop) \ge t_c^{(\sigma)}(wop) + d$. Since
$\mathcal{T}(rop) = t_c^{(\sigma)}(rop) = t_r^{(\sigma)}(rop) - \beta d$, and
$\mathcal{T}(wop) = t_r^{(\sigma)}(wop) = t_c^{(\sigma)}(wop) + (1 - \beta)d$, it follows that
$\mathcal{T}(rop) \ge \mathcal{T}(wop)$. We proceed by case analysis. If
$\mathcal{T}(rop) > \mathcal{T}(wop)$, then, by definition of $\tau$ (case 1),
$wop \xrightarrow{\tau} rop$; furthermore, if $\mathcal{T}(rop) = \mathcal{T}(wop)$, then, by
definition of $\tau$ (case 2), $wop \xrightarrow{\tau} rop$. Thus,
in every case, $wop \xrightarrow{\tau} rop$, as needed. □
Note that Lemma 3.4 implies that whenever a read operation in $\tau$ returns a
value "out of order", that is, a value other than that of the write operation immediately
preceding it in $\tau$, the read operation is related through $\mapsto$ to a write
operation that still precedes it in $\tau$. Thus, Lemma 3.4 "restricts", in a sense, the way
in which $\tau$ may violate legality. We finally show:
Lemma 3.5. $\tau$ is a legal operation sequence.
Proof. An informal outline of our proof follows. We assume that some read operation
returns a value other than that of the write operation immediately preceding it; we
derive a contradiction by showing that the superseding written value is "known" to the
reading process before the read operation returns. We now present the details of the
formal proof.
Assume, by way of contradiction, that $\tau$ is not legal. It follows, by Lemma 3.4,
that there exist operations $wop_1 = [Write_i(X, v_1), Ack_i(X)]$,
$wop_2 = [Write_j(X, v_2), Ack_j(X)]$ and $rop = [Read_k(X), Return_k(X, v_1)]$, for some
indices $i, j, k \in [n]$ and values $v_1, v_2 \in \mathcal{V}$, such that
$wop_1 \xrightarrow{\tau} wop_2$, $wop_2 \xrightarrow{\tau} rop$, and there is no write
operation $wop$ in $\tau$ such that $wop_2 \xrightarrow{\tau} wop \xrightarrow{\tau} rop$;
that is, $wop_2$ is the most "recent" write operation in $\tau$ that precedes $rop$.
By construction of $\tau$,
$\mathcal{T}(wop_1) \le \mathcal{T}(wop_2) \le \mathcal{T}(rop)$; thus,
$t_r^{(\sigma)}(wop_1) \le t_r^{(\sigma)}(wop_2) \le t_c^{(\sigma)}(rop)$. In fact, we prove:
Claim 3.6. $\mathcal{T}(wop_1) < \mathcal{T}(wop_2)$.
Proof. Assume, by way of contradiction, that $\mathcal{T}(wop_1) = \mathcal{T}(wop_2)$. By
construction of $\tau$, $v_2 <_{\mathcal{V}} v_1$. Moreover, by definition of $\mathcal{T}$,
$t_r^{(\sigma)}(wop_1) = t_r^{(\sigma)}(wop_2)$, which implies that
$t_c^{(\sigma)}(wop_1) = t_c^{(\sigma)}(wop_2)$. Since all message delays equal $d$, $p_k$
receives update messages simultaneously from $p_i$ and $p_j$; since it later returns $v_1$,
it must have set $X_k$ to $v_1$. Hence, by the algorithm, $v_1 <_{\mathcal{V}} v_2$. A
contradiction. □
Note that Claim 3.6 implies that $t_c^{(\sigma)}(wop_1) < t_c^{(\sigma)}(wop_2)$. Since all
message delays equal $d$, and, by the algorithm, a writing process waits for time $d$ to
update its local copy to the value being written, it follows that each process sets its local
copy of $X$ to $v_1$ strictly before it sets it to $v_2$. Moreover,
\[ t_r^{(\sigma)}(rop) = t_c^{(\sigma)}(rop) + \beta d \]
\[ \ge t_r^{(\sigma)}(wop_2) + \beta d \]
\[ = t_c^{(\sigma)}(wop_2) + (1 - \beta)d + \beta d \]
\[ = t_c^{(\sigma)}(wop_2) + d; \]
thus, $p_k$ updates $X_k$ to $v_2$ no later than time $t_r^{(\sigma)}(rop)$. It follows that
$rop$ returns $v_2$. A contradiction. □
By Lemmas 3.2, 3.3, and 3.5, it follows that $\tau$ is a legal linearization of $\sigma$ such
that, for each MCS process $p_i$, $\tau|i = \sigma|i$. Since $\sigma$ was chosen arbitrarily,
this implies that $\mathcal{A}^{per}$ is a linearizable implementation, as needed.
3.3. Complexity analysis
Clearly, in any admissible 0-execution of $\mathcal{A}^{per}$, the response time for every
read operation is $\beta d$, and the response time for every write operation is
$(1 - \beta)d$, implying that $|R_{\mathcal{A}^{per}}|(0) = \beta d$ and
$|W_{\mathcal{A}^{per}}|(0) = (1 - \beta)d$, as needed.
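As a short worked check of the trade-off, the two response times always add up to $d$, whatever value of $\beta$ is chosen; the numbers in the second line are an illustrative choice and do not come from the paper.

```latex
\[
  |R_{\mathcal{A}^{per}}|(0) + |W_{\mathcal{A}^{per}}|(0)
    = \beta d + (1 - \beta) d = d .
\]
\[
  \text{For instance, } d = 8,\ \beta = \tfrac{1}{4}:
  \qquad \beta d = 2, \qquad (1 - \beta) d = 6, \qquad 2 + 6 = 8 .
\]
```

Tuning $\beta$ therefore moves cost between reads and writes without changing the sum.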
4. Approximately synchronized clocks: upper bound
In this section, we present our upper bound for the approximately synchronized
clocks model.
Fix throughout any arbitrary constant $\varepsilon$ subject to the constraint
$0 < \varepsilon < \min\{2u, d - u\}$. We show:
Theorem 4.1. For the approximately synchronized clocks model, there exists a linearizable
implementation $\mathcal{A}^{as}$ of read/write objects such that
$|R_{\mathcal{A}^{as}}|(\delta) < \beta d + 3u + \min\{\delta, u\} + \varepsilon$, and
$|W_{\mathcal{A}^{as}}|(\delta) < (1 - \beta)d + 3u$, for any constant $\beta$ such that
$0 < \beta < 1 - u/d$.
By Proposition 2.9, it suffices to provide a linearizable implementation of a single
object $X \in \mathcal{X}$. In Section 4.1, we describe one such implementation
$\mathcal{A}^{as}$, while some of its preliminary timing properties are shown in Section 4.2.
A correctness proof and complexity analysis for $\mathcal{A}^{as}$ are presented in
Sections 4.3 and 4.4, respectively.
4.1. The algorithm
We start with an informal description of $\mathcal{A}^{as}$. Each process $p_i$ keeps a local
copy $X_i$ of object $X$; denote by $val(X_i)$ the value currently held by $X_i$, initially
$\bot$. In addition, $p_i$ keeps a register $LCT_i(X)$ holding the local clock time at the
most recent update of $X_i$, or $\bot$ if this time is at least $u$ earlier than the current
local clock time; finally, $p_i$ maintains a set $Pend_i(X)$ of "pending" update messages for
object $X$. Each update message has the form $(update(X, v), c)$ for some value
$v \in \mathcal{V}$ and a real number $c$, which represents the local time of some process.
We now describe the "timings" of $\mathcal{A}^{as}$.
• Upon a $Read_i(X)$ event, $p_i$ first sets a timer to expire at time $\beta d$ thereafter,
where $0 < \beta < 1 - u/d$; then, $p_i$ waits to return until time $u$ has passed without
any update of $X_i$;
• a "time-slicing" technique is used for handling writes; roughly speaking, $p_i$ "slices"
each time interval of length $3u + \varepsilon$ into a "write-prohibited" interval of length
$3u$, in which actions on a write request may not be initiated by a writing process,
followed by an interval of length $\varepsilon$ in which they may. Upon a $Write_i(X, v)$
event, and when outside a "write-prohibited" time interval, $p_i$ broadcasts an
$(update(X, v), c)$ message, where $c$ is the local time of $p_i$ at the time of
broadcasting. Then, $p_i$ waits for an additional time of $(1 - \beta)d$ to set $X_i$ to $v$
and issue $Ack_i(X)$.
• On receipt of $(update(X, v), c)$ from a different (writing) process, $p_i$ immediately
sets $X_i$ to $v$.
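The time-slicing rule for writes can be sketched as a small helper. This is an illustration under stated assumptions: the local clock is non-negative, each slice of length 3u + eps starts with the write-prohibited part (as in the precondition of Fig. 4, which tests whether fmod of the local clock is at most 3u), and the function name write_wait is hypothetical.

```python
import math

def write_wait(clock, u, eps):
    """Time a writer waits before broadcasting, given its local clock time."""
    # Position of the local clock inside the current slice of length 3u + eps.
    phase = math.fmod(clock, 3 * u + eps)
    if phase <= 3 * u:
        # Inside the write-prohibited part: wait until it ends.
        return 3 * u - phase
    # Inside the trailing part of length eps: broadcast immediately.
    return 0.0
```

With u = 1 and eps = 0.5 the slices have length 3.5, so a write arriving at local time 1.0 waits 2.0, while one arriving at local time 3.25 is broadcast at once.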
We now describe the mechanism by which $p_i$ "selects" the value to be returned in
a read operation; candidate values are found in the set $Pend_i(X)$. More specifically,
$p_i$ considers only values to which it previously set $X_i$, whose local broadcasting time
(accompanying the update message) is within $2u$ of that of the update message with
the currently maximal local broadcasting time. (As we will show, the most recently
received value is one of the values considered.) The set $Pend_i(X)$ is maintained by $p_i$
as follows. Whenever $p_i$ updates $X_i$ to $v$, on receipt of $(update(X, v), t)$ as a
result of a write operation by another process or by itself, it adds $(v, t)$ to $Pend_i(X)$
(see footnote 11). At the time of return, $p_i$ returns the maximal (with respect to
$<_{\mathcal{V}}$) of the value components of elements of $Pend_i(X)$.
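The bookkeeping for the pending set can be sketched as follows. This is a minimal illustration, assuming values are comparable with Python's built-in ordering in place of the order on values, that pruning keeps exactly the pairs whose time is less than 2u below the current maximum (as in Fig. 4), and that the class and method names are hypothetical.

```python
class PendingUpdates:
    """Set of (value, broadcast_time) pairs kept by one process for object X."""

    def __init__(self, u):
        self.u = u
        self.pending = set()

    def add(self, value, broadcast_time):
        """Record an update, then drop pairs too far below the maximal time."""
        self.pending.add((value, broadcast_time))
        tmax = max(t for _, t in self.pending)
        self.pending = {
            (v, t) for v, t in self.pending if tmax - t < 2 * self.u
        }

    def value_to_return(self):
        """Maximal value among the remaining candidates (None if empty)."""
        if not self.pending:
            return None
        return max(v for v, _ in self.pending)
```

Pruning on every update keeps the set small, since only updates broadcast within 2u of the latest one can still be candidates for a read.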
The code for process $p_i$ appears in Fig. 4. Process $p_i$ uses the messages $waitread(X)$
and $read(X)$, and $write(X)$ for implementing the timers needed for the read and write
operations, respectively.
For the rest of this section, fix any admissible $\delta$-execution $\sigma$ of
$\mathcal{A}^{as}$. For any write operation $wop = [Write_i(X, v), Ack_i(X)]$ in $\sigma$,
denote by $t_b^{(\sigma)}(wop)$ and $t_d^{(\sigma)}(wop)$ the (real) times at which the
writing process $p_i$ broadcasts a message $update(X, v)$ (together with its local
broadcasting time) and the message $update(X, v)$ is delivered at a process, respectively.
Footnote 11. To keep the size of $Pend_i(X)$ small, at each update $p_i$ removes from
$Pend_i(X)$ all elements $(v', t')$ such that $t'$ is not within $2u$ of the currently maximum
time component of elements of $Pend_i(X)$.
4.2. Timing properties
We start by showing that every process "hears" about a value currently being written
no later than time $\beta d$ after the corresponding write operation acknowledges.
Proposition 4.2. For any write operation $wop$ in $\sigma$,
$t_r^{(\sigma)}(wop) \ge t_d^{(\sigma)}(wop) - \beta d$.
Proof. Clearly,
\[ t_r^{(\sigma)}(wop) = t_b^{(\sigma)}(wop) + (1 - \beta)d \quad \text{(by the algorithm for writes)} \]
\[ \ge t_d^{(\sigma)}(wop) - d + (1 - \beta)d \]
\[ = t_d^{(\sigma)}(wop) - \beta d, \]
as needed. □

    Local State:
      γ_i:        The local clock component
      X_i:        The local copy of object X, initially ⊥
      LCT_i(X):   The local clock time at the most recent change of X_i,
                  or ⊥, if this time is > u
      Pend_i(X):  A set of "pending" update messages (v', t') for object X
      tmax_i(X):  max{t' : (v', t') ∈ Pend_i(X)}

    Transition Relation:

    [Read_i(X), Return_i(X, v)]:
      Read_i(X)        Pre: Read_i(X)
                       Eff: TimerSet_i(βd, waitread(X))
                       Pre: TimerExpire_i(waitread(X)) & LCT_i(X) ≠ ⊥ &
                            γ_i - LCT_i(X) < u
                       Eff: TimerSet_i(u - γ_i + LCT_i, read(X))
      Return_i(X, v)   Pre: (TimerExpire_i(waitread(X)) & LCT_i(X) = ⊥) or
                            TimerExpire_i(read(X))
                       Eff: X_i ← max_{<_V} {v : (v, t) ∈ Pend_i(X)};
                            Return_i(X, val(X_i))

    [Write_i(X, v), Ack_i(X)]:
      Write_i(X, v)    Pre: Write_i(X, v) & fmod(γ_i, 3u + ε) ≤ 3u
                       Eff: TimerSet_i(3u - fmod(γ_i, 3u + ε), write(X, v))
                       Pre: (Write_i(X, v) & fmod(γ_i, 3u + ε) > 3u) or
                            TimerExpire_i(write(X, v))
                       Eff: Broadcast_i(update(X, v));
                            TimerSet_i((1 - β)d, update(X, v))
      Ack_i(X)         Pre: TimerExpire_i(update(X, v))
                       Eff: X_i ← v;
                            Pend_i(X) ← Pend_i(X) ∪ {(v, γ_i - (1 - β)d)};
                            Pend_i(X) ← {(v', t') : tmax_i - t' < 2u};
                            Ack_i(X)

    Update of X_i:
                       Pre: Del_i((update(X, v), t))
                       Eff: Pend_i(X) ← Pend_i(X) ∪ {(v, t)};
                            Pend_i(X) ← {(v', t') : tmax_i - t' < 2u};
                            LCT_i(X) ← γ_i

Fig. 4. The algorithm $\mathcal{A}^{as}$: precondition-effect code for process $p_i$.
We define a relation $\mapsto$ between write and read operations in $\sigma$ as follows. For
any write and read operations $wop$ and $rop$, respectively, in $\sigma$, $wop \mapsto rop$ if
$val(wop) = val(rop)$ and the latest update (in $\sigma$) of the local copy of $X$ by the
reading process, before it returns on $rop$, is to $val(wop)$, as a result of either receipt
of an update message $update(X, val(wop))$ from the writing process, or a timer expiration
event $TimerExpire_i(update(X, v))$. Roughly speaking, $\mapsto$ captures causality and
relates each read operation in $\sigma$ to the most "recent" write operation in $\sigma$
writing the returned value. We show that each write operation in $\sigma$ returns no later
than a related read operation in $\sigma$.
Proposition 4.3. Assume $wop \mapsto rop$. Then,
$t_r^{(\sigma)}(rop) \ge t_r^{(\sigma)}(wop)$.
Proof. If $wop$ and $rop$ occur at the same process, the claim follows trivially from the
definition of history sequence. So assume that $wop$ and $rop$ occur at different processes.
Clearly,
\[ t_r^{(\sigma)}(rop) \ge t_d^{(\sigma)}(wop) + u \quad \text{(by the algorithm for reads)} \]
\[ \ge t_b^{(\sigma)}(wop) + d - u + u \quad \text{(since $\sigma$ is admissible)} \]
\[ = t_r^{(\sigma)}(wop) - (1 - \beta)d + d \quad \text{(by the algorithm for writes)} \]
\[ = t_r^{(\sigma)}(wop) + \beta d \]
\[ \ge t_r^{(\sigma)}(wop), \]
as needed. □
We continue with timing properties of the slicing technique. We show that for each
process $p_i$, there exists a sequence of “quiet” (update-free) time intervals $quiet_i(k)$,
one for each integer $k \geq 1$, with the following properties:
• $p_i$ receives no update messages in $quiet_i(k)$;
• $|quiet_i(k)| \geq 2u - \min\{\delta, u\}$;
• any two consecutive intervals, $quiet_i(k)$ and $quiet_i(k + 1)$, are separated by a time
interval of length at most $2u + \varepsilon$.
These properties are shown formally in the next two claims.
Proposition 4.4. For each process $p_i$, there exists, for each integer $k \geq 1$, a time interval
$quiet_i(k)$ in which $p_i$ receives no update messages. Furthermore, $|quiet_i(k)| \geq u$.
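Proof. Consider any writing process $p_j$. For any integer $k \geq 1$, any (update) message
sent from $p_j$ to $p_i$ while $\gamma_j < k(3u + \varepsilon)$ is delivered to $p_i$ while $\gamma_j < k(3u + \varepsilon) + d$; on
the other hand, any message sent from $p_j$ to $p_i$ while $\gamma_j \geq k(3u + \varepsilon) + 3u$ is delivered
to $p_i$ while $\gamma_j \geq k(3u + \varepsilon) + 3u + d - u = k(3u + \varepsilon) + d + 2u$. (Recall that, by the algorithm,
$p_j$ cannot send any update messages while $k(3u + \varepsilon) \leq \gamma_j < k(3u + \varepsilon) + 3u$.) Thus, no
message from $p_j$ is delivered to $p_i$ while $k(3u + \varepsilon) + d < \gamma_j < k(3u + \varepsilon) + d + 2u$. It
follows that for each $j \in [n]$, no update message from $p_j$ is delivered to $p_i$ in the time
interval $[\gamma_j^{-1}(k(3u + \varepsilon)) + d, \gamma_j^{-1}(k(3u + \varepsilon)) + d + 2u]$. Hence, no message from any
process is delivered to $p_i$ in the time interval $quiet_i(k)$, where
$$\begin{aligned}
quiet_i(k) &= \bigcap_{j \in [n]} \left[\gamma_j^{-1}(k(3u + \varepsilon)) + d,\ \gamma_j^{-1}(k(3u + \varepsilon)) + d + 2u\right] \\
&= \left[\max_{j \in [n]} \gamma_j^{-1}(k(3u + \varepsilon)) + d,\ \min_{j \in [n]} \gamma_j^{-1}(k(3u + \varepsilon)) + d + 2u\right].
\end{aligned}$$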
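Hence,
$$\begin{aligned}
|quiet_i(k)| &= \min_{j \in [n]} \gamma_j^{-1}(k(3u + \varepsilon)) + d + 2u - \max_{j \in [n]} \gamma_j^{-1}(k(3u + \varepsilon)) - d \\
&= 2u + \min_{j \in [n]} \gamma_j^{-1}(k(3u + \varepsilon)) - \max_{j \in [n]} \gamma_j^{-1}(k(3u + \varepsilon)) \\
&= 2u - \max_{j, l \in [n]} \left(\gamma_j^{-1}(k(3u + \varepsilon)) - \gamma_l^{-1}(k(3u + \varepsilon))\right) \\
&\geq 2u - \min\{\delta, u\} && \text{(by Proposition 2.1, with $\min\{\delta, u\}$ for $\delta$)} \\
&\geq 2u - u = u,
\end{aligned}$$
as needed. □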
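The computation in the proof above can be tried out numerically. The sketch below is illustrative only: it assumes each local clock has the form $\gamma_j(t) = t + c_j$ for a constant offset $c_j$ (consistent with clocks running at the rate of real time), and the function name and the list-of-offsets representation are hypothetical.

```python
from typing import List, Tuple


def quiet_interval(offsets: List[float], k: int, u: float,
                   eps: float, d: float) -> Tuple[float, float]:
    """Real-time endpoints of quiet_i(k), assuming gamma_j(t) = t + offsets[j]."""
    boundary = k * (3 * u + eps)
    # gamma_j^{-1}(k(3u + eps)): real time at which clock j reads the slice boundary.
    real_times = [boundary - c for c in offsets]
    start = max(real_times) + d          # max_j gamma_j^{-1}(...) + d
    end = min(real_times) + d + 2 * u    # min_j gamma_j^{-1}(...) + d + 2u
    return start, end
```

With offsets spread over at most $\min\{\delta, u\}$, the returned interval has length at least $2u - \min\{\delta, u\}$, as in the derivation.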
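We continue to show an upper bound on the “gap” between consecutive quiet intervals.
For each integer $k \geq 1$, define $gap_i(k) = [\lceil quiet_i(k) \rceil, \lfloor quiet_i(k + 1) \rfloor]$. Note that
$gap_i(k) \neq \emptyset$. We show:
Proposition 4.5. For each integer $k \geq 1$, $|gap_i(k)| < \min\{\delta, u\} + u + \varepsilon$.
Proof. Clearly,
$$\begin{aligned}
|gap_i(k)| &= \lfloor quiet_i(k + 1) \rfloor - \lceil quiet_i(k) \rceil \\
&= \max_{j \in [n]} \gamma_j^{-1}((k + 1)(3u + \varepsilon)) + d - \min_{j \in [n]} \gamma_j^{-1}(k(3u + \varepsilon)) - d - 2u \\
&\leq \max_{j \in [n]} \gamma_j^{-1}(k(3u + \varepsilon)) + 3u + \varepsilon - \min_{j \in [n]} \gamma_j^{-1}(k(3u + \varepsilon)) - 2u \\
&= \max_{j \in [n]} \gamma_j^{-1}(k(3u + \varepsilon)) - \min_{j \in [n]} \gamma_j^{-1}(k(3u + \varepsilon)) + u + \varepsilon \\
&< \min\{\delta, u\} + u + \varepsilon && \text{(by Proposition 2.1, with $\min\{\delta, u\}$ for $\delta$)}
\end{aligned}$$
as needed. □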
We continue with a crucial property of the “slicing” intervals. Roughly speaking,
we prove that local broadcasting times that are within $2u$ fall within the “same” time
slice. Formally, we show:
Proposition 4.6. Consider write operations $wop_1$ and $wop_2$ at processes $p_i$ and $p_j$,
respectively, such that
$$(3u + \varepsilon)k_1 - \varepsilon < \gamma_i(t^{br}_{\sigma}(wop_1)) \leq (3u + \varepsilon)k_1,$$
and
$$(3u + \varepsilon)k_2 - \varepsilon < \gamma_j(t^{br}_{\sigma}(wop_2)) \leq (3u + \varepsilon)k_2,$$
for some positive integers $k_1$ and $k_2$. Then,
$$|\gamma_i(t^{br}_{\sigma}(wop_1)) - \gamma_j(t^{br}_{\sigma}(wop_2))| < 2u$$
if and only if $k_1 = k_2$.
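Proof. By assumption,
$$(3u + \varepsilon)k_1 - \varepsilon - (3u + \varepsilon)k_2 < \gamma_i(t^{br}_{\sigma}(wop_1)) - \gamma_j(t^{br}_{\sigma}(wop_2)) < (3u + \varepsilon)k_1 - ((3u + \varepsilon)k_2 - \varepsilon),$$
so that
Claim 4.7. $|\gamma_i(t^{br}_{\sigma}(wop_1)) - \gamma_j(t^{br}_{\sigma}(wop_2)) - (3u + \varepsilon)(k_1 - k_2)| < \varepsilon$.
Assume first that $k_1 \neq k_2$; without loss of generality, take $k_1 \geq k_2 + 1$. Clearly,
$$\begin{aligned}
\gamma_i(t^{br}_{\sigma}(wop_1)) - \gamma_j(t^{br}_{\sigma}(wop_2)) &> (3u + \varepsilon)(k_1 - k_2) - \varepsilon && \text{(by Claim 4.7)} \\
&\geq (3u + \varepsilon) - \varepsilon && \text{(since $k_1 \geq k_2 + 1$)} \\
&= 3u > 0.
\end{aligned}$$
Hence,
$$|\gamma_i(t^{br}_{\sigma}(wop_1)) - \gamma_j(t^{br}_{\sigma}(wop_2))| = \gamma_i(t^{br}_{\sigma}(wop_1)) - \gamma_j(t^{br}_{\sigma}(wop_2)) > 3u > 2u,$$
as needed. Assume now that $k_1 = k_2$. By Claim 4.7,
$$\begin{aligned}
|\gamma_i(t^{br}_{\sigma}(wop_1)) - \gamma_j(t^{br}_{\sigma}(wop_2))| &< \varepsilon \\
&\leq \min\{2u, d - u\} && \text{(by assumption on $\varepsilon$)} \\
&\leq 2u,
\end{aligned}$$
as needed. □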
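A small numerical illustration of Proposition 4.6 follows. It is a sketch under stated assumptions: the helper names are hypothetical, local broadcasting times are plain floating-point numbers, and each time is assumed to lie in a window $((3u + \varepsilon)k - \varepsilon, (3u + \varepsilon)k]$ as the proposition requires.

```python
import math


def slice_index(local_broadcast_time: float, u: float, eps: float) -> int:
    """Smallest integer k with local_broadcast_time <= k * (3u + eps)."""
    return math.ceil(local_broadcast_time / (3 * u + eps))


def same_slice(t1: float, t2: float, u: float, eps: float) -> bool:
    """True if the two local broadcasting times carry the same slice index k."""
    return slice_index(t1, u, eps) == slice_index(t2, u, eps)
```

For instance, with $u = 1$ and $\varepsilon = 0.5$, the times 3.25 and 3.5 share slice $k = 1$ and differ by less than $2u$, while 3.25 and 6.75 lie in different slices and differ by more than $2u$.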
4.3. Correctness proof
We construct a legal linearization $\tau$ of $\sigma$ such that, for each MCS process $p_i$,
$ops(\sigma) \mid i = \tau \mid i$. We start with an informal outline of the construction of $\tau$ and the
main ideas used in proving its properties.
The construction proceeds in two phases. In the first phase, each read or write
operation in $\sigma$ is “serialized” to occur at the time of its response in $\sigma$, breaking ties
by ordering all write operations before read ones that are “serialized” together and
then using $<_{\mathcal{V}}$. Call $\tau'$ the resulting operation sequence. Clearly, by construction, $\tau'$
preserves both the order of operations at each MCS process and the order of
non-overlapping operations. However, $\tau'$ might not be legal.
In the second phase, we trace all legality violations in $\tau'$, and inductively fix each of
them. The fix still guarantees that $\tau'$ is a linearization of $\sigma$ which preserves the order
of operations at each process. Roughly speaking, we scan $\tau'$ and fix each violation
of legality by “locally” permuting operations. We show that the index of the first
operation “witnessing” a legality violation strictly grows after each fix, as we proceed;
thus, inductively, this results in a legal linearization $\tau$ of $\sigma$ which preserves the order
of operations at each process. We now present the details of the formal proof.
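Formally, we construct $\tau'$ as follows. For any operations $op_1$ and $op_2$ in $\sigma$, if
$t^{r}_{\sigma}(op_1) < t^{r}_{\sigma}(op_2)$, then $op_1 \xrightarrow{\tau'} op_2$; if $t^{r}_{\sigma}(op_1) = t^{r}_{\sigma}(op_2)$, then, if $op_1$ and $op_2$ are write
and read operations, respectively, then $op_1 \xrightarrow{\tau'} op_2$, else ($op_1$ and $op_2$ are either both
reads or both writes), if $val(op_1) <_{\mathcal{V}} val(op_2)$, then $op_1 \xrightarrow{\tau'} op_2$.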
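The first phase amounts to sorting operations by a composite key. The following Python sketch is an illustration only: the `Op` record and `serialize` function are hypothetical names, and the value order $<_{\mathcal{V}}$ is modelled by integer comparison, which is an assumption of this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Op:
    kind: str             # "write" or "read"
    value: int            # value written or returned; integers stand in for the value order
    response_time: float  # real time of the operation's response


def serialize(ops: List[Op]) -> List[Op]:
    """Order by response time; on ties, writes before reads, then by value."""
    return sorted(
        ops,
        key=lambda op: (op.response_time, 0 if op.kind == "write" else 1, op.value),
    )
```

The resulting sequence plays the role of $\tau'$; as noted in the outline, it need not be legal, which is what the second phase repairs.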
We now elaborate on the second phase of the construction. We scan $\tau'$ till a read
operation rop is reached such that $wop_1 \xrightarrow{\tau'} wop_2 \xrightarrow{\tau'} rop$ for some write operations
$wop_1$ and $wop_2$ in $\tau'$ such that $val(wop_1) = val(rop)$, $val(wop_2) \neq val(rop)$, and there
is no write operation wop in $\tau'$ such that $wop_2 \xrightarrow{\tau'} wop \xrightarrow{\tau'} rop$; call it a non-admissible
triple. Let $i_{viol}$ be the index of $rop_1$ in $\tau'$. We permute $wop_2$ to immediately precede
$wop_1$ in $\tau'$. Let $\tau_1$ be the resulting sequence.
Our proof proceeds in two steps. First, we show that a non-admissible triple is the
only cause of a legality violation; we next prove that $i_{viol}(\tau_1) > i_{viol}(\tau')$, by showing
that the prefix of $\tau_1$ ending with $rop_1$ is a legal sequence of operations; induction
implies, then, the correctness of our construction.
Our first simple claim characterizes a legality violation, and it implies that legality
may only be violated because of a non-admissible triple. In all of our discussion, $wop_i$
and $rop_i$ will denote write and read operations on object $X$ such that $v_i$ is the associated
value with each of them. Since, by construction, write operations precede in $\tau'$ read
operations that occur at the same time, Proposition 4.3 implies that $wop_i$ precedes $rop_i$
in $\tau'$. It follows that a non-admissible triple is, indeed, the only possible form of a
legality violation. We show that the values of the involved write operations must have
been broadcast “very close” in time.
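Lemma 4.8. Assume that $wop_1 \xrightarrow{\tau'} wop_2 \xrightarrow{\tau'} rop_1$. Then,
$$|\gamma_i(t^{br}_{\sigma}(wop_1)) - \gamma_j(t^{br}_{\sigma}(wop_2))| \leq 2u.$$
Proof. Assume, by way of contradiction, that
$$|\gamma_i(t^{br}_{\sigma}(wop_1)) - \gamma_j(t^{br}_{\sigma}(wop_2))| > 2u.$$
We proceed by case analysis on the sign of $\gamma_i(t^{br}_{\sigma}(wop_1)) - \gamma_j(t^{br}_{\sigma}(wop_2))$.
1. Assume first that $\gamma_i(t^{br}_{\sigma}(wop_1)) - \gamma_j(t^{br}_{\sigma}(wop_2)) > 0$. It follows that $\gamma_i(t^{br}_{\sigma}(wop_1)) - \gamma_j(t^{br}_{\sigma}(wop_2)) > 2u$.
By Lemma 2.2, it follows that $t^{br}_{\sigma}(wop_1) - t^{br}_{\sigma}(wop_2) > 2u - \min\{\delta, u\} \geq 2u - u = u > 0$.
By the algorithm for writes, for each $i \in \{1, 2\}$, $t^{r}_{\sigma}(wop_i) = t^{br}_{\sigma}(wop_i) + (1 - \beta)d$.
It follows that $t^{r}_{\sigma}(wop_1) > t^{r}_{\sigma}(wop_2)$. By construction of $\tau'$, $wop_2 \xrightarrow{\tau'} wop_1$. A contradiction.
2. Assume now that $\gamma_j(t^{br}_{\sigma}(wop_2)) - \gamma_i(t^{br}_{\sigma}(wop_1)) > 2u$. By the algorithm and the
way $\tau'$ was constructed, $t^{br}_{\sigma}(wop_2) < t^{r}_{\sigma}(wop_2) \leq t^{r}_{\sigma}(wop_1)$. Proposition 4.2 implies
that $t^{del}_{\sigma}(wop_2) < t^{r}_{\sigma}(wop_1)$. Thus, at time $t^{r}_{\sigma}(rop_1)$, both $v_1$ and $v_2$ reside in the
memory of the reading process. It follows, however, that
$tmax_i - \gamma_i(t^{br}_{\sigma}(wop_1)) \geq \gamma_j(t^{br}_{\sigma}(wop_2)) - \gamma_i(t^{br}_{\sigma}(wop_1)) > 2u$.
This contradicts the fact that $rop_1$ returns $v_1$. □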
We continue to show a simple property of $\tau'$.
Proposition 4.9. Consider read operations rop and rop' such that $wop_1 \xrightarrow{\tau'} rop \xrightarrow{\tau'} wop_2$,
and $wop_1 \xrightarrow{\tau'} rop' \xrightarrow{\tau'} wop_2$. Assume there is no write operation wop in $\tau'$ such
that $wop_1 \xrightarrow{\tau'} wop \xrightarrow{\tau'} wop_2$. Then, $val(rop) = val(rop')$.
Proof. By construction, $t^{r}_{\sigma}(rop), t^{r}_{\sigma}(rop') < t^{r}_{\sigma}(wop_2)$. Hence, it follows by
Proposition 4.3 that $val(rop), val(rop') \neq val(wop_2)$. By Proposition 4.2, every process receives
$val(wop_1)$ and all values of preceding write operations in $\tau'$ by time $t^{r}_{\sigma}(wop_1) + \beta d$.
Since there is no write operation in the interval of operations $(wop_1, wop_2)_{\tau'}$, no
process modifies its $Pend(X)$ set except on receipt of $val(wop_2)$ in the time interval
$(t^{r}_{\sigma}(wop_1), t^{r}_{\sigma}(wop_2))$. Notice, however, that a process that modifies $Pend(X)$ on receipt
of an update message for $wop_2$ may return for a read operation no earlier than $t^{r}_{\sigma}(wop_2)$,
and, by construction, such a read operation is not included in $(wop_1, wop_2)_{\tau'}$. This
implies that every read operation in $(wop_1, wop_2)_{\tau'}$ returns the same value, as needed. □
Proposition 4.9 implies that we may assume, without loss of generality, that at most
one read operation may be completed between any two successive completions of
write operations in $\tau'$. The next claim argues that once a value is returned by a read
operation, no (later) read operation in $\tau'$ may return a value of a preceding (in $\tau'$)
write operation.
Proposition 4.10. Consider a write operation $wop_2$ such that $wop_1 \xrightarrow{\tau'} wop_2 \xrightarrow{\tau'} rop_1$.
Then, there exists no read operation $rop_2$ such that $wop_2 \xrightarrow{\sigma} rop_2$.
Proof. Assume, by way of contradiction, that there exists a read operation $rop_2$ such
that $wop_2 \xrightarrow{\sigma} rop_2$. We proceed by case analysis.
1. Assume first that $t^{r}_{\sigma}(rop_2) > t^{r}_{\sigma}(rop_1)$. It follows, by Proposition 4.9, that there must
be at least one write operation on $X$ in the interval of operations $(rop_1, rop_2)_{\tau'}$;
let $wop_3$ be one with the maximal broadcasting time among all such write operations.
We consider the intervals $(wop_1, rop_1)_{\tau'}$ and $(wop_2, rop_2)_{\tau'}$; it follows from
Proposition 4.6 and Lemma 4.8 that the local broadcasting times of $val(wop_1)$ and
$val(wop_3)$ are in the same time slice, as are those of $val(wop_2)$ and $val(wop_3)$.
Since every process receives both $val(wop_1)$ and $val(wop_2)$ by time $t^{r}_{\sigma}(wop_2) + \beta d$, it
follows, by the algorithm, that all values $v_1$, $v_2$ and $v_3$ were considered in both $rop_1$
and $rop_2$ as candidate values to be returned. Thus, both $v_1 <_{\mathcal{V}} v_2$ and $v_2 <_{\mathcal{V}} v_1$.
A contradiction.
2. Assume now that $t^{r}_{\sigma}(rop_2) < t^{r}_{\sigma}(rop_1)$. We apply an identical reasoning to the
intervals $(wop_1, rop_2)_{\tau'}$ and $(wop_1, rop_1)_{\tau'}$. Let $tmax_1$ and $tmax_2$ be the maximal time
components of elements of $Pend(X)$ of the processes performing $rop_1$ and $rop_2$,
respectively, at the time they return. Clearly, by time $t^{r}_{\sigma}(rop_2)$, each process
modifies its $Pend(X)$ set as a result of a write operation on $X$ completed by time
$t^{r}_{\sigma}(rop_2)$. Thus, any modification of $Pend(X)$ at time $> t^{r}_{\sigma}(rop_2)$ corresponds to
a write operation returning at time $> t^{r}_{\sigma}(rop_2)$; hence, the broadcasting time of
such an operation is greater than the broadcasting time of any write operation
completed by time $t^{r}_{\sigma}(wop_2)$, and the addition of its value to $Pend(X)$ of any process
can only increase $tmax_2$. Hence, $tmax_1 \geq tmax_2$. Clearly, $tmax_2 - \gamma_j(t^{br}_{\sigma}(wop_2)) \leq 2u$,
and $tmax_1 - \gamma_i(t^{br}_{\sigma}(wop_1)) < 2u$. This implies that $tmax_1 - \gamma_j(t^{br}_{\sigma}(wop_2)) \leq 2u$. By the
algorithm and the way $wop_1$ returns, $v_1 <_{\mathcal{V}} v_2$. Hence, by the way $wop_2$ returns,
$tmax_2 - \gamma_i(t^{br}_{\sigma}(wop_1)) > 2u$. A contradiction. □
Clearly, Proposition 4.10 implies that, after the reordering, the prefix of $\tau'$ ending
with $rop_1$ is a legal operation sequence. We next prove that this prefix is also a
linearization of $\sigma$ by showing that the reordered operations $wop_1$ and $wop_2$ (in the
non-admissible triple $wop_1$, $wop_2$, $rop_1$) “overlap” in $\sigma$.
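Proposition 4.11. The reordered prefix of $\tau'$ ending with $rop_1$ is a linearization of $\sigma$.
Proof. By Proposition 4.6, the local broadcasting times of $wop_1$ and $wop_2$ fall in the
same slice. It follows by Proposition 2.4 that $|t^{br}_{\sigma}(wop_1) - t^{br}_{\sigma}(wop_2)| < u + \varepsilon$. Hence,
$$t^{c}_{\sigma}(wop_2) \leq t^{br}_{\sigma}(wop_2) < t^{br}_{\sigma}(wop_1) + \varepsilon + u = t^{r}_{\sigma}(wop_1) - d + \varepsilon + u < t^{r}_{\sigma}(wop_1),$$
by assumption on $\varepsilon$, as needed. □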
Since, in permuting $\tau'$, we reordered only $wop_1$ and $wop_2$, which, by Proposition 4.11,
“overlap”, $wop_1$ and $wop_2$ may not be performed by the same process. This implies that
our reordering yields an operation sequence preserving the order of operations at each
process. Hence, $i_{viol}(\tau_1) > i_{viol}(\tau')$. By induction, it follows that $\mathcal{A}_{as}$ is a linearizable
implementation.
4.4. Complexity analysis
The upper bound of $(1 - \beta)d + 3u$ on $|W_{\mathcal{A}_{as}}|(\delta)$ is obvious since, by the algorithm,
$p_i$ first waits for time at most $3u$ till it exits a “write-prohibited” interval, and then
for an additional time $(1 - \beta)d$ to issue an acknowledgment. We proceed to show that
$|R_{\mathcal{A}_{as}}|(\delta) < \beta d + 3u + \min\{\delta, u\} + \varepsilon$.
Consider a $Read_i(X)$ event that occurs at time $t$ in $\sigma$. We show that a matching
response occurs by time $t' < t + \beta d + 3u + \min\{\delta, u\} + \varepsilon$. Observe that such a response
may only be prevented if an update message is delivered to $p_i$. It seems as if a
starvation may occur due to successive update events; however, the “slicing” technique
assures that this is not the case. Clearly, it is possible that $Read_i(X)$ occurs (at time $t$)
within some interval $quiet_i(k)$, for some integer $k$, but $p_i$ enters $gap_i(k)$ by receiving
some update message before it may issue a response to $Read_i(X)$. Such an update
message must be received no earlier than time $t + u$, since, otherwise, $LCT_i(X)$ would
have attained the value $\bot$. By Proposition 4.5, $p_i$ enters $quiet_i(k + 1)$ by time $< t + u +
2u + \varepsilon = t + 3u + \varepsilon$. Since, by Proposition 4.4, $|quiet_i(k + 1)| \geq 2u - \min\{\delta, u\}$, $LCT_i(X)$
attains the value $\bot$ within $quiet_i(k + 1)$; hence, $p_i$ issues $Return_i(X, val(X_i))$ by time
$< t + \beta d + 3u + \min\{\delta, u\} + \varepsilon$, as needed.
Since $\sigma$ was chosen arbitrarily, this implies that $|R_{\mathcal{A}_{as}}|(\delta) < \beta d + 3u + \min\{\delta, u\} + \varepsilon$,
as needed.
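For concreteness, the two upper bounds can be evaluated for sample parameters. The sketch below only restates the arithmetic of the bounds; the function names are hypothetical, and the sample values used with them are arbitrary rather than taken from the paper.

```python
def write_upper_bound(d: float, u: float, beta: float) -> float:
    """(1 - beta) * d + 3u, the bound on write response time."""
    return (1 - beta) * d + 3 * u


def read_upper_bound(d: float, u: float, delta: float,
                     eps: float, beta: float) -> float:
    """beta * d + 3u + min(delta, u) + eps, the (strict) bound on read response time."""
    return beta * d + 3 * u + min(delta, u) + eps
```

Whatever the value of $\beta$, the two bounds add up to $d + 6u + \min\{\delta, u\} + \varepsilon$, so $\beta$ only shifts cost between reads and writes.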
In addition, it does not seem that the better clock precision achieved by the clock
synchronization algorithm of Lundelius and Lynch [44, Section 4] can considerably
improve our results.
5. Approximately synchronized clocks: lower bounds 
In this section, we present lower bounds for the approximately synchronized clocks 
model. 
This section is organized as follows. In Section 5.1, we present a lower bound 
on the sum of worst-case response times for read and write operations; this bound 
applies to a certain class of sequentially consistent implementations, and it implies a 
corresponding lower bound for linearizable implementations. In Sections 5.2 and 5.3,
we present lower bounds on individual worst-case response times for read and write 
operations, respectively; these bounds apply to any linearizable implementation. 
5.1. Read and write operations 
Our lower bound on the sum of the worst-case response times for read and write op- 
erations applies to a certain class of implementations of objects, called object-separable 
and object-symmetric; roughly speaking, such implementations satisfy the following 
conditions. 
1. Each process handles activity involving a certain object independently of all activity, 
concurrent or even previous, involving other objects; hence, the sequence of actions 
taken by the process on this object is completely separated from and not affected 
by the presence or absence of events involving other objects. 
2. Each process handles activity involving a certain object in precisely the same way 
it handles activity on any other object. 
Our formal definitions follow. 
An implementation $\mathcal{A}$ is object-separable if for each process $p_i$, every state $s$ of $p_i$
contains $|\mathcal{X}|$ components $s_1, s_2, \ldots, s_{|\mathcal{X}|}$, one for each object, so that if an interrupt event
$i_k$ involves object $X_k$ and $((q, \gamma, i_k), (q', \mathcal{B}, \mathcal{S}, \mathcal{T}))$ is a computation step of process $p_i$,
then (i) $q_l = q'_l$ for every $l \neq k$; (ii) $q'_k$, $\mathcal{B}$, $\mathcal{S}$, and $\mathcal{T}$ result from the application
of $p_i$'s transition function on $q_k$, $\gamma$ and $i_k$, and (iii) each of $\mathcal{B}$, $\mathcal{S}$ and $\mathcal{T}$ contains
events that involve only object $X_k$. Thus, the transition function of a process in an
object-separable implementation may be regarded as the “parallel composition” of $|\mathcal{X}|$
transition functions, one for each state component associated with a specific object. If,
in addition, these $|\mathcal{X}|$ transition functions are “identical”, the implementation is said to
be object-symmetric.
Formally, an object-separable implementation $\mathcal{A}$ is object-symmetric if for each
process $p_i$, for any identical up to object interrupt events $i_k$ and $i_l$ involving objects
$X_k$ and $X_l$, respectively, if $((q, \gamma, i_k), (q', \mathcal{B}, \mathcal{S}, \mathcal{T}))$ and $((q, \gamma, i_l), (\hat{q}', \hat{\mathcal{B}}, \hat{\mathcal{S}}, \hat{\mathcal{T}}))$ are
computation steps of $p_i$ such that $q_k$ and $q_l$ are identical, then each of the pairs $\mathcal{B}$ and
$\hat{\mathcal{B}}$, $\mathcal{S}$ and $\hat{\mathcal{S}}$, and $\mathcal{T}$ and $\hat{\mathcal{T}}$ are identical up to object.
We start with two properties which will later be used in the proof of the lower 
bound on the sum of the worst-case response times for read and write operations; these 
are simple properties of sequentially consistent, object-separable and object-symmetric 
implementations, which may be of independent interest. 
Throughout this section, assume that $\mathcal{A}$ is any sequentially consistent,
object-separable and object-symmetric implementation of read/write objects.
5.1.1. First property 
Loosely speaking, we establish that in any execution of $\mathcal{A}$, objects “identically
written” by processes “respond identically” to read operations. This property is inspired
by and generalizes a result of Lipton and Sandberg [41, Theorem 1], formalized and
strengthened by Attiya and Welch [15, Theorem 3.1] for the perfect clocks model
where u = d, to the approximately synchronized clocks model.
Formally, consider objects $X$ and $Y$; by the serial specifications of $X$ and $Y$, there
exists an admissible $\delta$-execution $\sigma_1$ of $\mathcal{A}$ consisting of the following operations at
processes $p_i$ and $p_j$:
• $p_i$ performs a write operation $wop_i$ on $Y$ with $val(wop_i) = v$ and $t^{c}_{\sigma_1}(wop_i) = 0$,
immediately followed by a read operation $rop_i$ on $X$ with $t^{c}_{\sigma_1}(rop_i) = t^{r}_{\sigma_1}(wop_i)$;
• $p_j$ performs a write operation $wop_j$ on $X$ with $val(wop_j) = v$ and $t^{c}_{\sigma_1}(wop_j) = 0$,
immediately followed by a read operation $rop_j$ on $Y$ with $t^{c}_{\sigma_1}(rop_j) = t^{r}_{\sigma_1}(wop_j)$.
We assume that message delays in $\sigma_1$ are as follows. Each message from $p_l$ to
$p_j$, $l \neq j$, incurs a delay of $d$; each message from $p_j$ to $p_l$, $l \neq j$, incurs a delay of
$d - \min\{\delta, u\}$; any other message incurs a delay of $d - \min\{\delta, u\}/2$. Furthermore, we
assume that for each $l \neq j$, $\gamma_l(t) = t$, while $\gamma_j(t) = t - \min\{\delta, u\}/2$. We show:
Proposition 5.1. $val_{\sigma_1}(rop_i) = val_{\sigma_1}(rop_j) = v$.
Proof. We start with an informal outline of our proof. By “perturbing” $\sigma_1$, we obtain
an execution $\sigma_1'$, which appears “symmetric” with respect to objects $X$ and $Y$, and has
the following properties: (i) each process “sees” each event happening at the same
(local) time in both $\sigma_1$ and $\sigma_1'$; (ii) each of the objects $X$ and $Y$ undergoes the same
“changes” at the same (local) time in $\sigma_1'$. By (i), it suffices to show that both read
operations return $v$ in $\sigma_1'$, which follows from (ii) and object-symmetry. We now present
the details of the formal proof.
We describe how to “perturb” $\sigma_1$ in order to obtain another admissible $\delta$-execution $\sigma_1'$
of $\mathcal{A}$. Consider the real vector $s = (s_0, s_1, \ldots, s_{n-1})$, where $s_l = \min\{\delta, u\}/2$ if $l = j$, and
0 otherwise. Then, $\sigma_1' = shift(\sigma_1, s)$ with clocks $\Gamma' = shift(\Gamma, s)$. That is, each event at
process $p_j$ that occurs at real time $t$ in $\sigma_1$ will occur at real time $t - \min\{\delta, u\}/2$ in $\sigma_1'$,
while times of events at all other processes remain unchanged; $p_j$'s clock is shifted
forward by $\min\{\delta, u\}/2$, while all other clocks remain unchanged. By Lemma 2.11, it
follows:
Lemma 5.2. $\sigma_1'$ is an execution of $\mathcal{A}$ with clocks $\Gamma'$ that is equivalent to $\sigma_1$ with
clocks $\Gamma$.
We proceed to show:
Lemma 5.3. $\sigma_1'$ is an admissible $\delta$-execution of $\mathcal{A}$.
Proof. We first show:
Claim 5.4. $\sigma_1'$ is a $\delta$-execution of $\mathcal{A}$.
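Proof. Fix any processes $p_l$ and $p_m$. We proceed by case analysis.
1. Assume that none of $p_l$ and $p_m$ is $p_j$. Then, for any real time $t$,
$$\begin{aligned}
|\gamma'_l(t) - \gamma'_m(t)| &= |\gamma_l(t) - \gamma_m(t)| && \text{(by construction of $\Gamma'$)} \\
&= |t - t| && \text{(by construction of $\Gamma$)} \\
&= 0 \leq \delta,
\end{aligned}$$
as needed.
2. Assume now that some of $p_l$ and $p_m$, say $p_l$, is $p_j$. Then, for any real time $t$,
$$\begin{aligned}
|\gamma'_l(t) - \gamma'_m(t)| &= |\gamma'_j(t) - \gamma'_m(t)| \\
&= \left|\gamma_j(t) + \frac{\min\{\delta, u\}}{2} - \gamma_m(t)\right| && \text{(by construction of $\Gamma'$)} \\
&= \left|t - \frac{\min\{\delta, u\}}{2} + \frac{\min\{\delta, u\}}{2} - t\right| && \text{(by construction of $\Gamma$)} \\
&= 0 \leq \delta,
\end{aligned}$$
as needed. □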
We continue to show that all delays are in the range $[d - u, d]$. Fix any MCS
processes $p_l$ and $p_m$, and let $d_{lm}$ be the delay of any message from $p_l$ to $p_m$ in $\sigma_1$.
By Lemma 2.12, the delay $d'_{lm}$ of that message in $\sigma_1'$ is $d_{lm} + s_l - s_m$. We proceed by case analysis.
1. Assume that both $l \neq j$ and $m \neq j$, so that $d_{lm} = d - \min\{\delta, u\}/2$, and $s_l = s_m = 0$.
Then, $d'_{lm} = d - \min\{\delta, u\}/2 + 0 - 0 = d - \min\{\delta, u\}/2$.
2. Assume now that $l = j$, so that $d_{lm} = d - \min\{\delta, u\}$, $s_l = \min\{\delta, u\}/2$, and $s_m = 0$. Then,
$d'_{lm} = d - \min\{\delta, u\} + \min\{\delta, u\}/2 - 0 = d - \min\{\delta, u\}/2$.
3. Assume now that $m = j$, so that $d_{lm} = d$, $s_l = 0$, and $s_m = \min\{\delta, u\}/2$. Then, $d'_{lm} =
d + 0 - \min\{\delta, u\}/2 = d - \min\{\delta, u\}/2$.
Notice that since $\min\{\delta, u\}/2 \leq u$, $d - \min\{\delta, u\}/2 \geq d - u/2 \geq d - u$; hence, $d'_{lm} \in
[d - u, d]$. This implies that $\sigma_1'$ is an admissible execution. By Claim 5.4, it follows
that $\sigma_1'$ is an admissible $\delta$-execution of $\mathcal{A}$, as needed. □
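The three cases above can be checked mechanically. The following Python sketch is illustrative only: processes are represented by integer indices, and both function names are hypothetical. It builds the delays assumed for $\sigma_1$ and applies the rule $d'_{lm} = d_{lm} + s_l - s_m$ used in the case analysis.

```python
from typing import Dict, Tuple

Delays = Dict[Tuple[int, int], float]  # (sender, receiver) -> message delay


def sigma1_delays(n: int, j: int, d: float, delta: float, u: float) -> Delays:
    """Delays assumed for sigma_1: d into p_j, d - min{delta,u} out of p_j, else d - min{delta,u}/2."""
    skew = min(delta, u)
    delays: Delays = {}
    for sender in range(n):
        for receiver in range(n):
            if sender == receiver:
                continue
            if receiver == j:
                delays[(sender, receiver)] = d
            elif sender == j:
                delays[(sender, receiver)] = d - skew
            else:
                delays[(sender, receiver)] = d - skew / 2
    return delays


def shifted_delays(delays: Delays, shift: Dict[int, float]) -> Delays:
    """Apply d'_{lm} = d_{lm} + s_l - s_m to every (sender, receiver) pair."""
    return {(l, m): delay + shift[l] - shift[m] for (l, m), delay in delays.items()}
```

With the shift vector that is $\min\{\delta, u\}/2$ at index $j$ and 0 elsewhere, every shifted delay comes out equal to $d - \min\{\delta, u\}/2$, which lies in $[d - u, d]$.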
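Lemma 5.2 implies that $val_{\sigma_1'}(rop_i) = val_{\sigma_1}(rop_i)$ and $val_{\sigma_1'}(rop_j) = val_{\sigma_1}(rop_j)$. Thus,
it suffices to show that $val_{\sigma_1'}(rop_i) = val_{\sigma_1'}(rop_j) = v$.
Notice that in $\sigma_1'$, by construction, $\gamma'_i(0) = 0$, while $\gamma'_j(0) = 0 - \min\{\delta, u\}/2 +
\min\{\delta, u\}/2 = 0$. Thus, local clocks of $p_i$ and $p_j$ are identical in $\sigma_1'$. Since all
message delays are equal, object symmetry implies that $val_{\sigma_1'}(rop_i) = val_{\sigma_1'}(rop_j)$. Notice that
$val_{\sigma_1'}(rop_i) = val_{\sigma_1'}(rop_j) = \bot$ contradicts sequential consistency. Therefore,
$val_{\sigma_1'}(rop_i) = val_{\sigma_1'}(rop_j) = v$, as needed. □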
5.1.2. Second property 
Loosely speaking, we consider an execution of $\mathcal{A}$ with “conflicting” write operations
on some object, and “late” read operations on this object, performed after processes
“hear” about the write operations; we establish that the “late” read operations must
return the same value.
Formally, consider an object $X$, holding the value $\bot$ at time 0. By the serial
specification of $X$, there exists an admissible $\delta$-execution $\sigma_2$ of $\mathcal{A}$ consisting of the following
operations at processes $p_i$, $p_j$, $p_k$ and $p_l$:
• $p_i$ performs a write operation $wop_i$ on $X$ with $val(wop_i) = v_i$ and $t^{c}_{\sigma_2}(wop_i) = 0$;
• $p_j$ performs a write operation $wop_j$ on $X$ with $val(wop_j) = v_j$ and $t^{c}_{\sigma_2}(wop_j) = 0$;
• $p_k$ performs a read operation $rop_k$ on $X$ with $t^{c}_{\sigma_2}(rop_k) > d + |W_{\mathcal{A}}|(\delta)$;
• $p_l$ performs a read operation $rop_l$ on $X$ with $t^{c}_{\sigma_2}(rop_l) > d + |W_{\mathcal{A}}|(\delta)$.
Furthermore, we assume that message delays in $\sigma_2$ are all equal, and that all local
clocks are perfectly synchronized.
We show:
Proposition 5.5. $val_{\sigma_2}(rop_k) = val_{\sigma_2}(rop_l)$.
Proof. Assume, by way of contradiction, that $val_{\sigma_2}(rop_k) \neq val_{\sigma_2}(rop_l)$. We construct
an admissible $\delta$-execution $\sigma_2'$ of $\mathcal{A}$ that is not sequentially consistent.
We start with an informal outline of our proof. We obtain an admissible $\delta$-execution
$\sigma_2'$ by “augmenting” $\sigma_2$ as follows. Each of $p_k$ and $p_l$ performs an additional later
read operation on $X$, preceded by a pair of a write and a read operation on two other
objects $Y$ and $Z$. We use object symmetry to argue that the operations on $Y$ and $Z$ must
be “interleaved” in any legal serialization of $\sigma_2'$. This will prevent all read operations
on $X$ by one of $p_k$ and $p_l$ to precede all such operations by the other in any legal
serialization. Since $\sigma_2'$ is an “augmentation” of $\sigma_2$, the “early” read operations on $X$
in $\sigma_2'$ must return different values, as in $\sigma_2$. We use object separability to argue that
each “later” read operation on $X$ returns the same value as the corresponding earlier
read operation by the same process. Since read operations on $X$ by $p_k$ and $p_l$ must
be “interleaved”, this contradicts sequential consistency. We now present the details of
the formal proof.
Consider objects $Y$ and $Z$. By the serial specifications of $X$, $Y$ and $Z$, there exists
an admissible $\delta$-execution $\sigma_2'$ of $\mathcal{A}$ consisting of the following operations at processes
$p_i$, $p_j$, $p_k$ and $p_l$:
• $p_i$ performs a write operation $wop_i$ on $X$ with $val_{\sigma_2'}(wop_i) = x_i$ and $t^{c}_{\sigma_2'}(wop_i) = 0$;
• $p_j$ performs a write operation $wop_j$ on $X$ with $val_{\sigma_2'}(wop_j) = x_j$ and $t^{c}_{\sigma_2'}(wop_j) = 0$;
• $p_k$ performs a read operation $rop_k$ on $X$ with $t^{c}_{\sigma_2'}(rop_k) = t^{c}_{\sigma_2}(rop_k)$, followed by a
write operation $wop_k$ on $Y$ with $val_{\sigma_2'}(wop_k) = y$ and $t^{c}_{\sigma_2'}(wop_k) = t^{r}_{\sigma_2'}(rop_k)$, followed
by a read operation $rop_k^{(1)}$ on $Z$ with $t^{c}_{\sigma_2'}(rop_k^{(1)}) = t^{r}_{\sigma_2'}(wop_k)$, and finally followed by
a read operation $rop_k^{(2)}$ on $X$ with $t^{c}_{\sigma_2'}(rop_k^{(2)}) = t^{r}_{\sigma_2'}(rop_k^{(1)})$;
• $p_l$ performs a read operation $rop_l$ on $X$ with $t^{c}_{\sigma_2'}(rop_l) = t^{c}_{\sigma_2}(rop_l)$, followed by a
write operation $wop_l$ on $Z$ with $val_{\sigma_2'}(wop_l) = z$ and $t^{c}_{\sigma_2'}(wop_l) = t^{r}_{\sigma_2'}(rop_l)$, followed
by a read operation $rop_l^{(1)}$ on $Y$ with $t^{c}_{\sigma_2'}(rop_l^{(1)}) = t^{r}_{\sigma_2'}(wop_l)$, and finally followed by
a read operation $rop_l^{(2)}$ on $X$ with $t^{c}_{\sigma_2'}(rop_l^{(2)}) = t^{r}_{\sigma_2'}(rop_l^{(1)})$.
Furthermore, we assume that all message delays in $\sigma_2'$ are equal, and that all local
clocks are perfectly synchronized.
By object separability, $val_{\sigma_2'}(rop_k^{(2)}) = val_{\sigma_2'}(rop_k)$ and $val_{\sigma_2'}(rop_l^{(2)}) = val_{\sigma_2'}(rop_l)$.
Since all message delays are equal, object symmetry implies that either $val(rop_k^{(1)}) =
val(wop_l)$ and $val(rop_l^{(1)}) = val(wop_k)$, or $val(rop_k^{(1)}) = val(rop_l^{(1)}) = \bot$. However,
notice that $val(rop_k^{(1)}) = val(rop_l^{(1)}) = \bot$ violates sequential consistency. It follows that
$val(rop_k^{(1)}) = val(wop_l)$ and $val(rop_l^{(1)}) = val(wop_k)$.
Since $\sigma_2'$ is sequentially consistent, there exists a legal serialization $\tau$ of $\sigma_2'$ such that
for each MCS process $p_i$, $ops(\sigma_2') \mid i = \tau \mid i$. Clearly, either $rop_k^{(2)} \xrightarrow{\tau} rop_l$ or $rop_l^{(2)} \xrightarrow{\tau}
rop_k$; without loss of generality, assume the former. Since $\sigma_2' \mid l = \tau \mid l$, $rop_l \xrightarrow{\tau} wop_l$.
By the serial specification of $Z$, $wop_l \xrightarrow{\tau} rop_k^{(1)}$. Since $\sigma_2' \mid k = \tau \mid k$, $rop_k^{(1)} \xrightarrow{\tau} rop_k^{(2)}$.
It follows that $rop_l \xrightarrow{\tau} rop_k^{(2)}$. A contradiction. □
We now present our main lower bound result. 
Theorem 5.6. For the approximately synchronized clocks model, in any sequentially
consistent, object-separable and object-symmetric implementation $\mathcal{A}$ of at least three
objects accessed by at least four processes,
$$(|R_{\mathcal{A}}| + |W_{\mathcal{A}}|)(\delta) \geq d + \frac{\min\{\delta, u\}}{2}.$$
Proof. Assume, by way of contradiction, that there exists a sequentially consistent,
object-separable and object-symmetric implementation $\mathcal{A}$ of such objects for which
$(|R_{\mathcal{A}}| + |W_{\mathcal{A}}|)(\delta) < d + \min\{\delta, u\}/2$. We construct an admissible $\delta$-execution of $\mathcal{A}$ that
is not sequentially consistent.
We start with an informal outline of our proof. We construct an admissible
$\delta$-execution $\sigma$ of $\mathcal{A}$ in which each of two MCS processes $p_k$ and $p_l$ performs an
“early” and a “late” read operation on an object $X$; we use object-symmetry to “force”
$p_k$ and $p_l$ to either return different values in different order, which, clearly, violates
sequential consistency, or to maintain “inconsistent” copies of the same object, also
shown to violate sequential consistency. These different values are written by “conflicting”
write operations on $X$ by processes $p_i$ and $p_j$. We appropriately choose message
delays in $\sigma$ so that, under the assumption $(|R_{\mathcal{A}}| + |W_{\mathcal{A}}|)(\delta) < d + \min\{\delta, u\}/2$, $p_k$
“gathers” fast information about the write operation by $p_i$, but cannot “hear” about
the write operation by $p_j$ till late. (The roles of delays of messages from $p_i$ and $p_j$
are reversed for $p_l$.) Thus, by object-symmetry, read operations by $p_k$ and $p_l$ return
different values in different order, establishing the contradiction. We now present the
details of the formal proof.
Consider objects $X$ and $Y$, each holding the value $\bot$ at time 0. By the serial
specifications of $X$ and $Y$, there exists an admissible $\delta$-execution $\sigma$ of $\mathcal{A}$ consisting of the
following operations at processes $p_i$, $p_j$, $p_k$ and $p_l$:
• $p_i$ performs a write operation $wop_i$ on $X$ with $val(wop_i) = v_i$ and $t^{c}_{\sigma}(wop_i) = \min\{\delta, u\}/2$,
followed by a read operation $rop_i$ on $Y$ with $t^{c}_{\sigma}(rop_i) = t^{r}_{\sigma}(wop_i)$;
• $p_j$ performs a write operation $wop_j$ on $X$ with $val(wop_j) = v_j$ and $t^{c}_{\sigma}(wop_j) = \min\{\delta, u\}/2$,
followed by a read operation $rop_j$ on $Y$ with $t^{c}_{\sigma}(rop_j) = t^{r}_{\sigma}(wop_j)$;
• $p_k$ performs a write operation $wop_k$ on $Y$ with $val(wop_k) = v_k$ and $t^{c}_{\sigma}(wop_k) = 0$,
followed by two consecutive read operations $rop_k^{(1)}$ and $rop_k^{(2)}$ on $X$, with $t^{c}_{\sigma}(rop_k^{(1)}) =
t^{r}_{\sigma}(wop_k)$ and $t^{c}_{\sigma}(rop_k^{(2)}) > |W_{\mathcal{A}}|(\delta) + \min\{\delta, u\}/2$;
• $p_l$ performs a write operation $wop_l$ on $Y$ with $val(wop_l) = v_l$ and $t^{c}_{\sigma}(wop_l) = 0$,
followed by two consecutive read operations $rop_l^{(1)}$ and $rop_l^{(2)}$ on $X$, with $t^{c}_{\sigma}(rop_l^{(1)}) =
t^{r}_{\sigma}(wop_l)$ and $t^{c}_{\sigma}(rop_l^{(2)}) > |W_{\mathcal{A}}|(\delta) + \min\{\delta, u\}/2$.
We assume that the message delays in $\sigma$ are as follows. Each message from $p_i$ to $p_m$,
$m \neq i$, incurs a delay of either $d$ if $m = l$, or $d - u$ if $m \neq l$; each message from $p_j$ to
$p_m$, $m \neq j$, incurs a delay of either $d$ if $m = k$, or $d - u$ if $m \neq k$; any other message
incurs a delay of $d - u/2$. Furthermore, we assume that in $\sigma$, $\gamma_m(t) = t$ if $m \notin \{i, j\}$,
or $t - \min\{\delta, u\}/2$ if $m \in \{i, j\}$. We remark that any message sent by $p_i$ or $p_j$ while
performing write operations on $X$ is delivered before the late read operations on $X$ by
$p_k$ and $p_l$ are invoked.
Since $(|R_{\mathcal{A}}| + |W_{\mathcal{A}}|)(\delta) < d + \min\{\delta, u\}/2$, it follows that $t^{r}_{\sigma}(rop_k^{(1)}) < d + \min\{\delta, u\}/2$;
hence, the assumed message delays imply that $p_k$ may not receive a message from $p_j$
till after time $t^{r}_{\sigma}(rop_k^{(1)})$. Thus, Proposition 5.1 applies on the prefixes of $\sigma \mid i$ and
$\sigma \mid j$ consisting of all events at $p_i$ and $p_j$ occurring no later than time $t^{r}_{\sigma}(rop_k^{(1)})$
in $\sigma$ to yield that $val_{\sigma}(rop_k^{(1)}) = v_i$. A symmetric argument establishes that
$val_{\sigma}(rop_l^{(1)}) = v_j$.
By the symmetry in delays of messages sent by processes $p_i$ and $p_j$ (writing to
$X$) to processes $p_k$ and $p_l$, there are two possibilities: either $val_{\sigma}(rop_k^{(2)}) = v_j$ and
$val_{\sigma}(rop_l^{(2)}) = v_i$, or $val_{\sigma}(rop_k^{(2)}) = v_i$ and $val_{\sigma}(rop_l^{(2)}) = v_j$. Clearly, the first possibility
immediately contradicts sequential consistency. On the other hand, the second
possibility contradicts, by object-separability, Proposition 5.5. Thus, in every case, a
contradiction is reached. □
We remind the reader that although, apparently, the assumption of at least three 
objects is not explicitly used in the proof of Theorem 5.6, this assumption is necessary 
since it is used in the proof of Proposition 5.5. 
Since linearizability implies sequential consistency, it immediately follows: 
Corollary 5.7. For the approximately synchronized clocks model, in any linearizable,
object-separable and object-symmetric implementation $\mathcal{A}$ of at least three objects
accessed by at least four processes,
$$(|R_{\mathcal{A}}| + |W_{\mathcal{A}}|)(\delta) \geq d + \frac{\min\{\delta, u\}}{2}.$$
5.2. Read operations
We prove a lower bound on the worst-case response time for a read operation; 
this applies to any linearizable implementation of read/write objects, under reasonable 
assumptions on the sharing pattern of processes. More specifically, we consider any 
linearizable implementation of read/write objects including one with at least two readers 
and a distinct writer; we show that the worst-case response time for a read operation 
on this object is no less than $\min\{\delta, u\}/2$. The proof constructs an execution for which,
if read operations are too short, then linearizability can be violated by appropriately
shifting processes' histories. We show:
Theorem 5.8. Assume that $\mathcal{A}$ is a linearizable implementation of read/write objects
including an object $X$ with at least two readers and a distinct writer. Then, for the
approximately synchronized clocks model,
$$|R_{\mathcal{A}}(X)|(\delta) \geq \min\{\delta, u\}/2.$$
Proof. Assume, by way of contradiction, that there exists a linearizable implementation
$\mathcal{A}$ of X for which $|R_{\mathcal{A}}(X)|(\delta) < \min\{\delta, u\}/2$. We construct an admissible $\delta$-execution
of $\mathcal{A}$ which is not linearizable.
Let $p_i$ and $p_j$ be two processes that read X, and let $p_k$ be a process that writes
X. An informal outline of our proof follows. We start with an execution in which $p_i$
reads $\bot$ from X, then $p_j$ and $p_i$ alternate reading from X while $p_k$ is writing x to X,
and finally $p_j$ reads x from X. Thus, there exists a read operation $rop_i$, say by $p_i$, that
returns $\bot$ and is immediately followed by a read operation $rop_j$ by $p_j$ that returns x.
If $p_i$'s process history is shifted later by $\min\{\delta, u\}/2$, while $p_j$'s process history is
shifted earlier by $\min\{\delta, u\}/2$, the result is an execution in which $rop_j$ precedes $rop_i$;
in the meanwhile, processes' clocks are appropriately shifted so that $p_i$ and $p_j$ still
“see” the same events occurring at the same local time in the new execution. Since
$rop_j$ returns x, while $rop_i$ returns $\bot$, this contradicts linearizability. We now present
the details of the formal proof. 
Let $b = \lceil |W_{\mathcal{A}}(X)|(\delta) / \min\{\delta, u\} \rceil$. By the serial specification of X, there exists an
admissible $\delta$-execution $\sigma$ of $\mathcal{A}$ consisting of the following operations at processes
$p_i$, $p_j$, and $p_k$:
• for each $l$, $0 \le l \le b$, $p_i$ performs a read operation $rop_i^{(2l)}$ on X with
  $t^c_\sigma(rop_i^{(2l)}) = l \min\{\delta, u\}$;
• for each $l$, $0 \le l \le b$, $p_j$ performs a read operation $rop_j^{(2l+1)}$ on X with
  $t^c_\sigma(rop_j^{(2l+1)}) = l \min\{\delta, u\} + \min\{\delta, u\}/2$;
• $p_k$ performs a write operation $wop_k$ on X with $t^c_\sigma(wop_k) = \min\{\delta, u\}/2$ and
  $val(wop_k) = x$.
We assume that the message delays in $\sigma$ are as follows. Each message from $p_i$ to
$p_l$, $l \ne i$, incurs a delay of either $d$ if $l = j$ or $d - \min\{\delta, u\}/2$ if $l \ne j$; each
message from $p_j$ to $p_l$, $l \ne j$, incurs a delay of either $d - \min\{\delta, u\}$ if $l = i$ or
$d - \min\{\delta, u\}/2$ if $l \ne i$; each message from $p_l$ to $p_i$ or $p_j$, $l \notin \{i, j\}$, incurs a delay of
$d - \min\{\delta, u\}/2$. Moreover, we assume that all local clocks are perfectly synchronized in
execution $\sigma$.
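The construction of this execution can be made concrete with a small sketch. The Python fragment below is only an illustration under assumed numeric inputs; the function name `build_sigma` and its parameter names are hypothetical and are not part of the paper. It computes b and the call times of the alternating reads at p_i and p_j and of the write at p_k.

```python
from fractions import Fraction
from math import ceil


def build_sigma(delta, u, w_time):
    """Call times of the operations in the constructed execution (illustrative).

    delta, u : clock precision and delay uncertainty, as positive Fractions
    w_time   : an assumed value for the worst-case write response time
    """
    m = min(delta, u)
    b = ceil(w_time / m)  # b = ceil(|W| / min{delta, u})
    calls = {}
    for l in range(b + 1):
        calls[("p_i", 2 * l)] = l * m              # even-numbered reads, at p_i
        calls[("p_j", 2 * l + 1)] = l * m + m / 2  # odd-numbered reads, at p_j
    write_call = m / 2  # p_k calls its write at min{delta, u}/2
    return b, calls, write_call
```

Consecutive reads in this schedule are called exactly min{δ,u}/2 apart, which is what the later claims rely on when read operations are assumed to be shorter than that.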
Fig. 5(a) depicts the execution $\sigma$, where time runs from left to right, each horizontal
line represents events at a single process and time points that are used in the proof are 
marked at the bottom. 
Since $\mathcal{A}$ is linearizable, there exists a legal linearization $\tau$ of $\sigma$ such that for each
MCS process $p_l$, $ops(\sigma)|l = \tau|l$. The following sequence of simple claims describes
the sequence $\tau$.
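Claim 5.9. $rop_i^{(0)} \xrightarrow{\tau} wop_k$.
Proof. Clearly,
\[
\begin{aligned}
t^r_\sigma(rop_i^{(0)}) &\le t^c_\sigma(rop_i^{(0)}) + |R_{\mathcal{A}}(X)|(\delta) && \text{(by definition of } |R_{\mathcal{A}}(X)|(\delta)\text{)} \\
&< 0 + \frac{\min\{\delta, u\}}{2} && \text{(by construction of } \sigma \text{ and assumption on } |R_{\mathcal{A}}(X)|(\delta)\text{)} \\
&= t^c_\sigma(wop_k) && \text{(by construction of } \sigma\text{)}.
\end{aligned}
\]
Hence, by definition of $\xrightarrow{\sigma}$, $rop_i^{(0)} \xrightarrow{\sigma} wop_k$, which implies, by definition of
linearization, that $rop_i^{(0)} \xrightarrow{\tau} wop_k$, as needed. □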
We continue by showing: 
[Figure 5: timing diagrams of (a) the execution $\sigma$ and (b) the execution $\sigma'$. Each
horizontal line shows the events at one of the processes $p_i$, $p_j$ and $p_k$ (the read
operations at $p_i$ and $p_j$, and the write operation $wop_k$ at $p_k$); the time axis is marked
0, 1, 2, ..., 2b+1.]
Fig. 5. The executions $\sigma$ and $\sigma'$. Time is measured in units of $\min\{\delta, u\}/2$.
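Claim 5.10. $wop_k \xrightarrow{\tau} rop_j^{(2b+1)}$.
Proof. Clearly,
\[
\begin{aligned}
t^r_\sigma(wop_k) &\le t^c_\sigma(wop_k) + |W_{\mathcal{A}}(X)|(\delta) && \text{(by definition of } |W_{\mathcal{A}}(X)|(\delta)\text{)} \\
&\le \frac{\min\{\delta, u\}}{2} + b \min\{\delta, u\} && \text{(by construction of } \sigma \text{ and definition of } b\text{)} \\
&= t^c_\sigma(rop_j^{(2b+1)}) && \text{(by construction of } \sigma\text{)}.
\end{aligned}
\]
Hence, by definition of $\xrightarrow{\sigma}$, $wop_k \xrightarrow{\sigma} rop_j^{(2b+1)}$, which implies, by definition of
linearization, that $wop_k \xrightarrow{\tau} rop_j^{(2b+1)}$, as needed. □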
For each $r$, $0 \le r \le 2b + 1$, let $rop^{(r)} = rop_i^{(r)}$ if $r$ is even, or $rop_j^{(r)}$ if $r$ is odd. We
show:
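Claim 5.11. For each $r$, $0 \le r \le 2b$, $rop^{(r)} \xrightarrow{\tau} rop^{(r+1)}$.
Proof. Clearly, for any $r$, $0 \le r \le 2b$,
\[
\begin{aligned}
t^r_\sigma(rop^{(r)}) &\le t^c_\sigma(rop^{(r)}) + |R_{\mathcal{A}}(X)|(\delta) && \text{(by definition of } |R_{\mathcal{A}}(X)|(\delta)\text{)} \\
&< t^c_\sigma(rop^{(r)}) + \frac{\min\{\delta, u\}}{2} && \text{(by assumption on } |R_{\mathcal{A}}(X)|(\delta)\text{)} \\
&= t^c_\sigma(rop^{(r+1)}) - \frac{\min\{\delta, u\}}{2} + \frac{\min\{\delta, u\}}{2} && \text{(by construction of } \sigma\text{)} \\
&= t^c_\sigma(rop^{(r+1)}).
\end{aligned}
\]
Hence, by definition of $\xrightarrow{\sigma}$, $rop^{(r)} \xrightarrow{\sigma} rop^{(r+1)}$, which implies, by definition of
linearization, that $rop^{(r)} \xrightarrow{\tau} rop^{(r+1)}$, as needed. □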
It follows by Claims 5.9-5.11 that there exists an index $r_0$, $0 \le r_0 \le 2b$, such that
$rop^{(r_0)} \xrightarrow{\tau} wop_k \xrightarrow{\tau} rop^{(r_0+1)}$. Since $\tau$ is a legal sequence of operations, it follows
that $val_\tau(rop^{(r_0)}) = \bot$ and $val_\tau(rop^{(r_0+1)}) = x$. Assume, without loss of generality, that
$r_0$ is even, so that $rop^{(r_0)}$ is a read operation by process $p_i$.
We now show how to “perturb” $\sigma$ to obtain another admissible $\delta$-execution $\sigma'$ of $\mathcal{A}$
that is not linearizable. Define the real vector $s = (s_0, s_1, \ldots, s_{n-1})$ as follows. For each
index $l \in [n]$, $s_l$ is equal to $-\min\{\delta, u\}/2$ if $l = i$, $\min\{\delta, u\}/2$ if $l = j$, and 0 otherwise.
Then, $\sigma' = shift(\sigma, s)$ with clocks $r' = shift(r, s)$. That is, each event at process $p_i$ that
occurs at real time $t$ in $\sigma$ will occur at real time $t + \min\{\delta, u\}/2$ in $\sigma'$, each event at
process $p_j$ that occurs at real time $t$ in $\sigma$ will occur at real time $t - \min\{\delta, u\}/2$ in $\sigma'$,
and times of events at all other processes remain unchanged; $p_i$'s local clock is shifted
backward by $\min\{\delta, u\}/2$, $p_j$'s local clock is shifted forward by $\min\{\delta, u\}/2$, and all
other clocks remain unchanged. The execution $\sigma'$ is depicted in Fig. 5(b), using the
same conventions as in Fig. 5(a).
By Lemma 2.11, it follows that:
Lemma 5.12. $\sigma'$ is an execution of $\mathcal{A}$ with clocks $r'$ that is equivalent to $\sigma$ with
clocks $r$.
We proceed to show:
Lemma 5.13. $\sigma'$ is an admissible $\delta$-execution of $\mathcal{A}$.
Proof. Since all local clocks are perfectly synchronized in execution $\sigma$ and
\[
\bigl|\, \|s\|_{\max} - \|s\|_{\min} \bigr| = \left| \frac{\min\{\delta, u\}}{2} - \left( -\frac{\min\{\delta, u\}}{2} \right) \right| = 2\,\frac{\min\{\delta, u\}}{2} = \min\{\delta, u\} \le \delta,
\]
Lemma 2.13 immediately implies that $\sigma'$ is a $\delta$-execution. We continue to show that
all delays are in the range $[d - u, d]$. Fix any MCS processes $p_l$ and $p_m$. Let $d_{lm}$ be
the delay of any message $m$ from $p_l$ to $p_m$ in $\sigma$. By Lemma 2.12, the delay $d'_{lm}$ of
$m$ in $\sigma'$ is $d_{lm} + s_l - s_m$. Clearly, if $l \notin \{i, j\}$ and $m \notin \{i, j\}$, so that $s_l = s_m = 0$, then
$d'_{lm} = d_{lm}$. We proceed to consider all remaining cases.
1. Assume that $l = i$ and $m = j$, so that $d_{lm} = d$, $s_l = -\min\{\delta, u\}/2$, and $s_m = \min\{\delta, u\}/2$.
   Then, $d'_{lm} = d - \min\{\delta, u\}$.
2. Assume that $l = j$ and $m = i$, so that $d_{lm} = d - \min\{\delta, u\}$, $s_l = \min\{\delta, u\}/2$, and
   $s_m = -\min\{\delta, u\}/2$. Then, $d'_{lm} = d$.
3. Assume that $l = i$ and $m \ne j$, so that $d_{lm} = d - \min\{\delta, u\}/2$, $s_l = -\min\{\delta, u\}/2$, and
   $s_m = 0$. Then, $d'_{lm} = d - \min\{\delta, u\}$.
4. Assume that $l = j$ and $m \ne i$, so that $d_{lm} = d - \min\{\delta, u\}/2$, $s_l = \min\{\delta, u\}/2$, and
   $s_m = 0$. Then, $d'_{lm} = d$.
5. Assume that $m = i$ and $l \ne j$, so that $d_{lm} = d - \min\{\delta, u\}/2$, $s_l = 0$, and
   $s_m = -\min\{\delta, u\}/2$. Then, $d'_{lm} = d$.
6. Assume that $m = j$ and $l \ne i$, so that $d_{lm} = d - \min\{\delta, u\}/2$, $s_l = 0$, and
   $s_m = \min\{\delta, u\}/2$. Then, $d'_{lm} = d - \min\{\delta, u\}$.
Since $d - u \le d - \min\{\delta, u\} \le d$, it follows that in all cases $d'_{lm} \in [d - u, d]$, as
needed. This completes the proof that $\sigma'$ is an admissible $\delta$-execution of $\mathcal{A}$. □
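The case analysis in this proof can be replayed mechanically. The sketch below is a minimal illustration, assuming the delay rule quoted from Lemma 2.12 (new delay = old delay + s_l - s_m) and exact rational inputs; the names `shifted_delay` and `delays_after_shift` are hypothetical, and "other" stands for any process different from p_i and p_j.

```python
from fractions import Fraction


def shifted_delay(d_lm, s_l, s_m):
    # Delay of a message from p_l to p_m after shifting, as quoted from Lemma 2.12.
    return d_lm + s_l - s_m


def delays_after_shift(d, u, delta):
    """Delays in the shifted execution for the six cases (inputs as Fractions)."""
    m = min(delta, u)
    half = m / 2
    shift = {"i": -half, "j": half, "other": Fraction(0)}
    # Delays assumed in the original execution, keyed by (sender, receiver).
    before = {
        ("i", "j"): d,
        ("j", "i"): d - m,
        ("i", "other"): d - half,
        ("j", "other"): d - half,
        ("other", "i"): d - half,
        ("other", "j"): d - half,
    }
    return {
        (l, r): shifted_delay(delay, shift[l], shift[r])
        for (l, r), delay in before.items()
    }
```

Every resulting delay is either d or d - min{δ,u}, so it stays inside [d - u, d] as long as min{δ,u} is at most u, which always holds.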
Since $\mathcal{A}$ is linearizable, there exists a legal linearization $\tau'$ of $\sigma'$ such that for each
MCS process $p_l$, $ops(\sigma')|l = \tau'|l$. We show:
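Claim 5.14. $rop^{(r_0+1)} \xrightarrow{\tau'} rop^{(r_0)}$.
Proof. Clearly,
\[
\begin{aligned}
t^r_{\sigma'}(rop^{(r_0+1)}) &\le t^c_{\sigma'}(rop^{(r_0+1)}) + |R_{\mathcal{A}}(X)|(\delta) && \text{(by definition of } |R_{\mathcal{A}}(X)|(\delta)\text{)} \\
&< t^c_{\sigma'}(rop^{(r_0+1)}) + \frac{\min\{\delta, u\}}{2} && \text{(by assumption on } |R_{\mathcal{A}}(X)|(\delta)\text{)} \\
&= t^c_\sigma(rop^{(r_0+1)}) - \frac{\min\{\delta, u\}}{2} + \frac{\min\{\delta, u\}}{2} && \text{(by construction of } \sigma'\text{)} \\
&= t^c_\sigma(rop^{(r_0+1)}) \\
&= t^c_\sigma(rop^{(r_0)}) + \frac{\min\{\delta, u\}}{2} && \text{(by construction of } \sigma\text{)} \\
&= t^c_{\sigma'}(rop^{(r_0)}) && \text{(by construction of } \sigma'\text{)}.
\end{aligned}
\]
Hence, by definition of $\xrightarrow{\sigma'}$, $rop^{(r_0+1)} \xrightarrow{\sigma'} rop^{(r_0)}$, which implies, by definition of
linearization, that $rop^{(r_0+1)} \xrightarrow{\tau'} rop^{(r_0)}$, as needed. □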
However, Lemma 5.12 implies that $val_{\tau'}(rop^{(r_0+1)}) = x$ and $val_{\tau'}(rop^{(r_0)}) = \bot$; since
$\tau'$ is a legal operation sequence, this implies that $rop^{(r_0)} \xrightarrow{\tau'} rop^{(r_0+1)}$. A contradiction.
□
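The timing argument behind Claim 5.14 can be checked numerically. The sketch below is an illustration only: it assumes the convention implied by the delay formula of Lemma 2.12, namely that an event at real time t at process p_l moves to real time t - s_l, and it uses hypothetical names (`shift_time`, `read_flip`). It returns the call times of the two critical reads in the shifted execution and the latest possible response time of the read by p_j.

```python
from fractions import Fraction


def shift_time(t, s_l):
    # Assumed convention: an event at real time t at p_l moves to t - s_l.
    return t - s_l


def read_flip(delta, u, r0, read_time):
    """Timing of rop^(r0) (at p_i) and rop^(r0+1) (at p_j) after the shift.

    r0 must be even; read_time is an assumed bound on read response time,
    strictly below min{delta, u}/2 (all times as Fractions).
    """
    m = min(delta, u)
    half = m / 2
    if r0 % 2 != 0 or not read_time < half:
        raise ValueError("need an even r0 and read_time < min{delta, u}/2")
    call_i = Fraction(r0, 2) * m       # call of rop^(r0) in the original execution
    call_j = call_i + half             # call of rop^(r0+1) in the original execution
    call_i_new = shift_time(call_i, -half)  # p_i's history moves later
    call_j_new = shift_time(call_j, half)   # p_j's history moves earlier
    resp_j_new = call_j_new + read_time     # latest response of rop^(r0+1)
    return call_i_new, call_j_new, resp_j_new
```

Whenever the read bound is below min{δ,u}/2, the returned response time of the read by p_j is strictly smaller than the new call time of the read by p_i, so the read that returned x now precedes the read that returned ⊥.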
We remark that the general structure of the proof of Theorem 5.8 follows the one of
[15, Theorem 3.1] showing a lower bound of u/4 for the imperfect clocks model. However,
due to the more delicate timing assumptions in the approximately synchronized
clocks model, our proof has required more careful timing arguments. Our improvement
over [15, Theorem 3.1] is achieved by carefully choosing message delays in shifting
process histories.
5.3. Write operations 
We finally show that, under reasonable assumptions on the sharing pattern of pro- 
cesses, in any linearizable implementation of read/write objects including one with at 
least two writers and a distinct reader, the worst-case response time for a write operation
is at least $\min\{\delta, u\}/2$. The proof constructs an execution for which if write
operations are too short, then linearizability can be violated by appropriately shifting 
process histories. 
Theorem 5.15. Assume X is an object with at least two writers and a distinct reader.
Then, for the approximately synchronized clocks model, in any linearizable implementation
$\mathcal{A}$ of X, $|W_{\mathcal{A}}(X)|(\delta) \ge \min\{\delta, u\}/2$.
Proof. Let $p_i$ and $p_j$ be two processes that write X, and let $p_k$ be a process that
reads X. Assume, by way of contradiction, that there exists a linearizable implementation
$\mathcal{A}$ of X for which $|W_{\mathcal{A}}(X)|(\delta) < \min\{\delta, u\}/2$. We construct an admissible
$\delta$-execution of $\mathcal{A}$ that is not linearizable.
An informal outline of our proof follows. We start with an execution in which $p_i$
writes $x_i$ to X, then $p_j$ writes $x_j$ to X, and finally $p_k$ reads $x_j$ from X. If $p_i$'s process
history is shifted later by $\min\{\delta, u\}/2$, while $p_j$'s process history is shifted earlier by
$\min\{\delta, u\}/2$, the result is an execution in which the write operation by $p_j$ precedes
the write operation by $p_i$, while $p_k$ still “sees” the same events occurring at the same
local time; thus, $p_k$ still reads $x_j$ from X, which contradicts linearizability. We now
present the details of the formal proof.
By the serial specification of X, there exists an admissible, synchronized execution
$\sigma$ of $\mathcal{A}$ consisting of the following operations at processes $p_i$, $p_j$ and $p_k$:
• $p_i$ performs a write operation $wop_i$ on X with $t^c_\sigma(wop_i) = 0$ and $val(wop_i) = x_i$;
• $p_j$ performs a write operation $wop_j$ on X with $t^c_\sigma(wop_j) = \min\{\delta, u\}/2$ and
  $val(wop_j) = x_j$;
• $p_k$ performs a read operation $rop_k$ on X with $t^c_\sigma(rop_k) = \min\{\delta, u\}$.
We assume that the message delays in $\sigma$ are as follows. Each message from $p_i$
to $p_l$, $l \ne i$, incurs a delay of either $d$ if $l = j$ or $d - \min\{\delta, u\}/2$ if $l \ne j$; each message
from $p_j$ to $p_l$, $l \ne j$, incurs a delay of either $d - \min\{\delta, u\}$ if $l = i$ or $d - \min\{\delta, u\}/2$
if $l \ne i$; each message from $p_l$ to $p_i$ or $p_j$, $l \notin \{i, j\}$, incurs a delay of $d - \min\{\delta, u\}/2$.
Moreover, we assume that all local clocks are perfectly synchronized in execution $\sigma$.
Fig. 6(a) depicts the execution $\sigma$, where time runs from left to right, each horizontal
line represents events at a single process and time points that are used in the proof are 
marked at the bottom. 
Since $\mathcal{A}$ is linearizable, there exists a legal linearization $\tau$ of $\sigma$ such that for each
MCS process $p_l$, $ops(\sigma)|l = \tau|l$. The following sequence of simple claims describes the
sequence $\tau$.
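Claim 5.16. $wop_i \xrightarrow{\tau} wop_j$.
Proof. Clearly,
\[
\begin{aligned}
t^r_\sigma(wop_i) &\le t^c_\sigma(wop_i) + |W_{\mathcal{A}}(X)|(\delta) && \text{(by definition of } |W_{\mathcal{A}}(X)|(\delta)\text{)} \\
&< 0 + \tfrac{1}{2}\min\{\delta, u\} && \text{(by construction of } \sigma \text{ and assumption on } |W_{\mathcal{A}}(X)|(\delta)\text{)} \\
&= t^c_\sigma(wop_j) && \text{(by construction of } \sigma\text{)}.
\end{aligned}
\]
Hence, by definition of $\xrightarrow{\sigma}$, $wop_i \xrightarrow{\sigma} wop_j$, which implies, by definition of
linearization, that $wop_i \xrightarrow{\tau} wop_j$, as needed. □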
We continue to show: 
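Claim 5.17. $wop_j \xrightarrow{\tau} rop_k$.
Proof. Clearly,
\[
\begin{aligned}
t^r_\sigma(wop_j) &\le t^c_\sigma(wop_j) + |W_{\mathcal{A}}(X)|(\delta) && \text{(by definition of } |W_{\mathcal{A}}(X)|(\delta)\text{)} \\
&< \tfrac{1}{2}\min\{\delta, u\} + \tfrac{1}{2}\min\{\delta, u\} && \text{(by construction of } \sigma \text{ and assumption on } |W_{\mathcal{A}}(X)|(\delta)\text{)} \\
&= t^c_\sigma(rop_k) && \text{(by construction of } \sigma\text{)}.
\end{aligned}
\]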
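[Figure 6: timing diagrams of (a) the execution $\sigma$ and (b) the execution $\sigma'$. Each
horizontal line shows the events at one of the processes $p_i$, $p_j$ and $p_k$ (the write
operations $wop_i$ and $wop_j$, and the read operation $rop_k$); the time axis is marked 0, 1, 2.]
Fig. 6. The executions $\sigma$ and $\sigma'$. Time is measured in units of $\min\{\delta, u\}/2$.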
Hence, by definition of $\xrightarrow{\sigma}$, $wop_j \xrightarrow{\sigma} rop_k$, which implies, by definition of
linearization, that $wop_j \xrightarrow{\tau} rop_k$, as needed. □
Since $\tau$ is a legal operation sequence, it follows by Claims 5.16 and 5.17 that
$val_\tau(rop_k) = val(wop_j) = x_j$.
We now show how to “perturb” $\sigma$ to obtain an admissible $\delta$-execution $\sigma'$ of $\mathcal{A}$
that is not linearizable. Define the real vector $s = (s_0, s_1, \ldots, s_{n-1})$ as follows. For each
index $l \in [n]$, $s_l$ is equal to $-\min\{\delta, u\}/2$ if $l = i$, $\min\{\delta, u\}/2$ if $l = j$ and 0 otherwise.
Then, $\sigma' = shift(\sigma, s)$ with clocks $r' = shift(r, s)$. That is, each event at process $p_i$ that
occurs at real time $t$ in $\sigma$ will occur at real time $t + \min\{\delta, u\}/2$ in $\sigma'$, each event at
process $p_j$ that occurs at real time $t$ in $\sigma$ will occur at real time $t - \min\{\delta, u\}/2$ in $\sigma'$,
and times of events at all other processes remain unchanged; $p_i$'s local clock is shifted
backward by $\min\{\delta, u\}/2$, $p_j$'s local clock is shifted forward by $\min\{\delta, u\}/2$, and all
other clocks remain unchanged. The execution $\sigma'$ is depicted in Fig. 6(b), using the
same conventions as in Fig. 6(a). By Lemma 2.11, it follows that:
Lemma 5.18. $\sigma'$ is an execution with clocks $r'$ that is equivalent to $\sigma$ with clocks $r$.
We proceed to show:
Lemma 5.19. $\sigma'$ is an admissible $\delta$-execution of $\mathcal{A}$.
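Proof. Since all local clocks are perfectly synchronized in execution $\sigma$ and
\[
\bigl|\, \|s\|_{\max} - \|s\|_{\min} \bigr| = \left| \frac{\min\{\delta, u\}}{2} - \left( -\frac{\min\{\delta, u\}}{2} \right) \right| = \min\{\delta, u\} \le \delta,
\]
Lemma 2.13 immediately implies that $\sigma'$ is a $\delta$-execution. We continue to show that
all delays are in the range $[d - u, d]$. Fix any MCS processes $p_l$ and $p_m$. Let $d_{lm}$ be
the delay of any message $m$ from $p_l$ to $p_m$ in $\sigma$. By Lemma 2.12, the delay $d'_{lm}$ of
$m$ in $\sigma'$ is $d_{lm} + s_l - s_m$. Clearly, if $l \notin \{i, j\}$ and $m \notin \{i, j\}$, so that $s_l = s_m = 0$, then
$d'_{lm} = d_{lm}$, and $d'_{lm}$ is in the range $[d - u, d]$ since $d_{lm}$ is. We proceed to consider all
remaining cases.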
1. Assume that $l = i$ and $m = j$, so that $d_{lm} = d$, $s_l = -\min\{\delta, u\}/2$, and
   $s_m = \min\{\delta, u\}/2$. Then, $d'_{lm} = d - \min\{\delta, u\}$.
2. Assume that $l = j$ and $m = i$, so that $d_{lm} = d - \min\{\delta, u\}$, $s_l = \min\{\delta, u\}/2$, and
   $s_m = -\min\{\delta, u\}/2$. Then, $d'_{lm} = d$.
3. Assume that $l = i$ and $m \ne j$, so that $d_{lm} = d - \min\{\delta, u\}/2$, $s_l = -\min\{\delta, u\}/2$, and
   $s_m = 0$. Then, $d'_{lm} = d - \min\{\delta, u\}$.
4. Assume that $l = j$ and $m \ne i$, so that $d_{lm} = d - \min\{\delta, u\}/2$, $s_l = \min\{\delta, u\}/2$, and
   $s_m = 0$. Then, $d'_{lm} = d$.
5. Assume that $m = i$ and $l \ne j$, so that $d_{lm} = d - \min\{\delta, u\}/2$, $s_l = 0$, and
   $s_m = -\min\{\delta, u\}/2$. Then, $d'_{lm} = d$.
6. Assume that $m = j$ and $l \ne i$, so that $d_{lm} = d - \min\{\delta, u\}/2$, $s_l = 0$, and
   $s_m = \min\{\delta, u\}/2$. Then, $d'_{lm} = d - \min\{\delta, u\}$.
Since $d - u \le d - \min\{\delta, u\} \le d$, it follows that in all cases $d'_{lm} \in [d - u, d]$. This
completes the proof that $\sigma'$ is an admissible $\delta$-execution of $\mathcal{A}$. □
Since $\mathcal{A}$ is linearizable, Lemma 5.19 implies that there exists a legal linearization
$\tau'$ of $\sigma'$ such that for each MCS process $p_l$, $ops(\sigma')|l = \tau'|l$. The following sequence
of simple claims describes the operation sequence $\tau'$.
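Claim 5.20. $wop_j \xrightarrow{\tau'} wop_i$.
Proof. Clearly,
\[
\begin{aligned}
t^r_{\sigma'}(wop_j) &\le t^c_{\sigma'}(wop_j) + |W_{\mathcal{A}}(X)|(\delta) && \text{(by definition of } |W_{\mathcal{A}}(X)|(\delta)\text{)} \\
&< t^c_{\sigma'}(wop_j) + \tfrac{1}{2}\min\{\delta, u\} && \text{(by assumption on } |W_{\mathcal{A}}(X)|(\delta)\text{)} \\
&= t^c_\sigma(wop_j) - \tfrac{1}{2}\min\{\delta, u\} + \tfrac{1}{2}\min\{\delta, u\} && \text{(by construction of } \sigma'\text{)} \\
&= t^c_\sigma(wop_j) \\
&= t^c_\sigma(wop_i) + \tfrac{1}{2}\min\{\delta, u\} && \text{(by construction of } \sigma\text{)} \\
&= t^c_{\sigma'}(wop_i) && \text{(by construction of } \sigma'\text{)}.
\end{aligned}
\]
Hence, by definition of $\xrightarrow{\sigma'}$, $wop_j \xrightarrow{\sigma'} wop_i$, which implies, by definition of
linearization, that $wop_j \xrightarrow{\tau'} wop_i$, as needed. □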
We continue to show: 
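Claim 5.21. $wop_i \xrightarrow{\tau'} rop_k$.
Proof. Clearly,
\[
\begin{aligned}
t^r_{\sigma'}(wop_i) &\le t^c_{\sigma'}(wop_i) + |W_{\mathcal{A}}(X)|(\delta) && \text{(by definition of } |W_{\mathcal{A}}(X)|(\delta)\text{)} \\
&< t^c_{\sigma'}(wop_i) + \tfrac{1}{2}\min\{\delta, u\} && \text{(by assumption on } |W_{\mathcal{A}}(X)|(\delta)\text{)} \\
&= t^c_\sigma(wop_i) + \tfrac{1}{2}\min\{\delta, u\} + \tfrac{1}{2}\min\{\delta, u\} && \text{(by construction of } \sigma'\text{)} \\
&= \min\{\delta, u\} && \text{(since } t^c_\sigma(wop_i) = 0\text{)} \\
&= t^c_\sigma(rop_k) && \text{(by construction of } \sigma\text{)} \\
&= t^c_{\sigma'}(rop_k) && \text{(by construction of } \sigma'\text{, since } p_k \text{ is not shifted)}.
\end{aligned}
\]
Hence, by definition of $\xrightarrow{\sigma'}$, $wop_i \xrightarrow{\sigma'} rop_k$, which implies, by definition of
linearization, that $wop_i \xrightarrow{\tau'} rop_k$, as needed. □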
Since $\tau$ is a legal operation sequence, it follows by Claims 5.16 and 5.17 that
$val_\tau(rop_k) = val(wop_j) = x_j$. Since $\tau'$ is a legal operation sequence, it follows by
Claims 5.20 and 5.21 that $val_{\tau'}(rop_k) = val(wop_i) = x_i$. However, Lemma 5.18 implies
that $val_\tau(rop_k) = val_{\tau'}(rop_k)$. A contradiction. □
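The contradiction can be illustrated with a small sketch. It is not part of the paper: the names `shift_calls` and `forced_read_value` are hypothetical, and the shift convention (an event at real time t moves to t - s) is assumed from the delay formula of Lemma 2.12. When writes are shorter than min{δ,u}/2, the three operations do not overlap, so real-time order alone forces the value that a legal linearization assigns to the read.

```python
from fractions import Fraction


def shift_calls(calls, s):
    # Assumed convention: an event at real time t moves to t - s[op].
    return {op: t - s.get(op, 0) for op, t in calls.items()}


def forced_read_value(calls, values, write_time):
    """Value the read must return when no two operations overlap.

    calls      : call times for 'wop_i', 'wop_j' and 'rop_k'
    values     : value written by each write operation
    write_time : assumed bound on write response time
    """
    order = sorted(calls, key=calls.get)
    if order[-1] != "rop_k":
        raise ValueError("the read is expected to be called last")
    for op, nxt in zip(order, order[1:]):
        # Each write must have responded before the next operation is called.
        if not calls[op] + write_time < calls[nxt]:
            raise ValueError("operations overlap; the order is not forced")
    return values[order[-2]]  # the last write preceding the read
```

With call times 0, min{δ,u}/2 and min{δ,u} the read is forced to return x_j, while after shifting p_i later and p_j earlier it is forced to return x_i, even though p_k cannot distinguish the two executions.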
We remark that the general structure of the proof of Theorem 5.15 follows the one of 
[15, Theorem 3.2] showing a corresponding lower bound of u/2 for the imperfect clocks
model. However, due to the more delicate timing assumptions in the approximately 
synchronized clocks model, our proof has required more careful timing arguments. 
We can show that the algorithm in Theorem 3.1 for the perfect clocks model still 
works for the imperfect clocks model, and, hence, for the approximately synchronized 
clocks model too, if there is either a single reader and more than one writer or a
single writer and more than one reader. (See [15, Section 3.1.1] for a corresponding
observation.) This implies that the assumptions about the numbers of readers and
writers made in Theorems 5.8 and 5.15, respectively, are necessary. 
6. Imperfect clocks 
In this section, we state our upper and lower bounds for the imperfect clocks model. 
6.1. Upper bound 
Fix any arbitrarily small constant $\varepsilon$ subject to the constraint $0 < \varepsilon < \min\{2u, d - u\}$.
Since the imperfect clocks model can be simulated by the approximately synchronized
clocks model with $\delta = u$, Theorem 4.1 immediately implies:
Theorem 6.1. For the imperfect clocks model, there exists a linearizable implementation
$\mathcal{A}$ of read/write objects that achieves $|R_{\mathcal{A}}|(\infty) < \beta d + 4u + \varepsilon$ and
$|W_{\mathcal{A}}|(\infty) \le (1 - \beta)d + 3u$ for any constant $\beta$ such that $0 \le \beta < 1 - u/d$.
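To see how the trade-off parameter moves cost between reads and writes, the bounds of Theorem 6.1 can be evaluated for sample values. The helper below is a hypothetical sketch (the function name and the sample numbers are assumptions, not from the paper); it only evaluates the stated formulas after checking the stated constraints on β and ε.

```python
from fractions import Fraction


def imperfect_clock_upper_bounds(beta, d, u, eps):
    """Evaluate the read and write bounds of Theorem 6.1 (inputs as Fractions)."""
    if not (0 <= beta < 1 - u / d):
        raise ValueError("beta must satisfy 0 <= beta < 1 - u/d")
    if not (0 < eps < min(2 * u, d - u)):
        raise ValueError("eps must satisfy 0 < eps < min{2u, d - u}")
    read_bound = beta * d + 4 * u + eps      # reads take strictly less than this
    write_bound = (1 - beta) * d + 3 * u     # bound stated for writes
    return read_bound, write_bound
```

For example, with d = 10, u = 1, β = 1/2 and ε = 1/10 the two values are 91/10 and 8, and their sum equals d + 7u + ε regardless of the choice of β.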
6.2. Lower bound
Theorem 5.6 immediately implies: 
Theorem 6.2. For the imperfect clocks model, in any sequentially consistent, object-separable
and object-symmetric implementation $\mathcal{A}$ of at least three objects accessed
by at least four processes,
\[ (|R_{\mathcal{A}}| + |W_{\mathcal{A}}|)(\infty) \ge d + \frac{u}{2}. \]
Since linearizability implies sequential consistency, Theorem 6.2 immediately im- 
plies: 
Corollary 6.3. For the imperfect clocks model, in any linearizable, object-separable
and object-symmetric implementation $\mathcal{A}$ of at least three objects accessed by at least
four processes,
\[ (|R_{\mathcal{A}}| + |W_{\mathcal{A}}|)(\infty) \ge d + \frac{u}{2}. \]
Theorem 5.8 immediately implies: 
Theorem 6.4. Assume X is an object with at least two readers and a distinct writer.
Then, for the imperfect clocks model, in any linearizable implementation $\mathcal{A}$
of X, $|R_{\mathcal{A}}(X)|(\infty) \ge u/2$.
Finally, Theorem 5.15 immediately implies: 
Theorem 6.5. Assume X is an object with at least two writers and a distinct reader.
Then, for the imperfect clocks model, in any linearizable implementation $\mathcal{A}$
of X, $|W_{\mathcal{A}}(X)|(\infty) \ge u/2$.
7. Discussion and future research 
In this section, we provide a review of our results, a survey of related work, and 
directions for further research. 
7.1. Review 
We have shown a collection of lower and upper bounds for linearizable implementa- 
tions of shared memory consisting of read/write objects, in models of perfect, imperfect, 
and approximately synchronized clocks. For the perfect clocks model, we presented a
parameterized linearizable implementation, achieving worst-case response times of $\beta d$
and $(1 - \beta)d$ for read and write operations, respectively, where $\beta$ is a trade-off parameter,
$0 \le \beta \le 1$. For the approximately synchronized clocks model, our linearizable implementation
achieves worst-case response times of less than $\beta d + 3u + \min\{\delta, u\} + \varepsilon$,
and of $(1 - \beta)d + 3u$ for read and write operations, respectively, where $\varepsilon > 0$ is an arbitrarily
small constant. For the approximately synchronized clocks model, we also
showed a lower bound of $d + \min\{\delta, u\}/2$ on the sum of these worst-case response
times, assuming certain symmetry properties for the implementations, and a lower
bound of $\min\{\delta, u\}/2$ on the worst-case response times for read and write operations.
Although there remains a gap between our upper and lower bounds for the approx- 
imately synchronized clocks model, we feel that our work substantially answers the 
question of how the time requirements for read and write operations depend on the 
timing uncertainties of this model, as measured by the parameters $d$, $u$ and $\delta$.
In particular, we have shown that only a single “long communication” (i.e., a com- 
munication requiring time d) is required for both read and write operations, and this 
communication cannot be avoided. 
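As a quick worked check of the remaining gap for the approximately synchronized clocks model, the bounds quoted in this review can be combined as follows. This is only arithmetic on the stated formulas, and the lower bound on the sum holds only under the symmetry assumptions noted in the text.

```latex
\begin{align*}
\text{upper bound on the sum:}\quad
  & (\beta d + 3u + \min\{\delta,u\} + \varepsilon) + ((1-\beta)d + 3u)
    = d + 6u + \min\{\delta,u\} + \varepsilon, \\
\text{lower bound on the sum:}\quad
  & d + \tfrac{1}{2}\min\{\delta,u\}, \\
\text{gap:}\quad
  & 6u + \tfrac{1}{2}\min\{\delta,u\} + \varepsilon .
\end{align*}
```

In both bounds the coefficient of d is exactly 1, which matches the remark that a single long communication is both sufficient and unavoidable.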
This paper continues the complexity-theoretic study of the cost of implementing 
memory objects in a message-passing system, under various correctness conditions and 
timing assumptions, which was initiated in [10, 15, 41]. Although our model ignores
several important practical issues, like, e.g., limitations on local memory size, clock 
drift, and “hot spots”, we believe that our algorithms can be adapted to work in more 
realistic systems. We also believe that our results contribute to the understanding of 
the fine and intrinsic relation between sequential consistency and linearizability. 
7.2. Related work and comparison 
In this section, we review works that present time bounds for message-passing im- 
plementations of read/write objects under sequential consistency and linearizability. As 
those works are directly related to our work, we comment in detail on the relation 
between the results they provide and our results. 
7.2.1. Attiya and Welch [15] 
For the perfect clocks model, Attiya and Welch [15, Theorems 3.2 and 3.3] present
a fast read linearizable implementation that guarantees time 0 for a read and time d for 
a write, and another fast write linearizable implementation that guarantees the reverse. 
These implementations are the special cases of our implementation where $\beta = 0$ and
$\beta = 1$, respectively. Both the implementations in [15] and ours rely heavily on using
timers and use messages that carry explicit timing information. 
Attiya and Welch next consider the imperfect clocks model; they present 
[15, Theorems 4.5 and 4.6] a sequentially consistent implementation that guarantees
time 0 for a read and time 2d for a write, and another sequentially consistent imple- 
mentation that guarantees the reverse. Both of these implementations use as a subroutine 
a fast atomic broadcast algorithm they devise, but their modularity allows the use of 
any atomic broadcast algorithm like e.g., the one in [19]. They also show lower bounds 
of u/4 and u/2 for read and write operations, respectively. 
7.2.2. Attiya and Friedman [12]
Attiya and Friedman [12] introduced a new hybrid condition for shared memory multiprocessors,
called hybrid consistency, which combines the expressiveness of strong
consistency conditions, like, e.g., sequential consistency and linearizability, and the ef- 
ficiency of weak consistency conditions, like, e.g., pipelined RAM [41] and causal 
memory [7]. In hybrid consistency, memory access operations are classified as either 
weak or strong. Attiya and Friedman defined two versions of hybrid consistency, one 
based on sequential consistency and another based on linearizability; they presented a 
completely asynchronous message-passing implementation of hybrid consistency based 
on linearizability allowing for instantaneous weak operations while the response for 
strong operations is linear in the network delay d. 
7.2.3. Chaudhuri, Gawlick and Lynch [20] 
Building on our work, Chaudhuri et al. [20, Section 61 show how to simulate our 
algorithm for the perfect clocks model (Section 3) in the imperfect clocks model 
and obtain a linearizable algorithm for that model which is simpler than ours. Their
algorithm achieves worst-case response times of $u + c$ and $d + u - c$ for read and write
operations, respectively, where $c$ is a trade-off constant between 0 and $d$. For purpose
of comparison, set $c = \beta d$, where $0 \le \beta \le 1$, so that these bounds can be written as
$\beta d + u$ and $(1 - \beta)d + u$, respectively. These bounds are tighter than ours in terms
of the number of additive multiples of the message delay uncertainty $u$. However,
since our algorithm for the perfect clocks model uses messages that carry explicit 
timing information, the simulation algorithm in [20] does so too; in this aspect, the 
simulation algorithm in [20] is inferior to a fairly obvious modification of our algorithm 
for the imperfect clocks model that uses messages of bounded size. Furthermore, we 
believe that the time-slicing technique used by our main algorithm not only provides 
more insight into the inherent difficulties of implementing linearizability in message- 
passing environments, but will also prove useful to efficiently solving other problems 
in distributed computing. 
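For a concrete sense of the difference in additive multiples of u, the bounds quoted in this comparison can be subtracted term by term. The block below is only a worked calculation on the formulas stated in this paper for the imperfect clocks model and on those attributed to [20].

```latex
\begin{align*}
\text{read:}\quad  & (\beta d + 4u + \varepsilon) - (\beta d + u) = 3u + \varepsilon, \\
\text{write:}\quad & ((1-\beta)d + 3u) - ((1-\beta)d + u) = 2u, \\
\text{sum:}\quad   & (d + 7u + \varepsilon) - (d + 2u) = 5u + \varepsilon .
\end{align*}
```

The difference does not depend on d or on the trade-off parameter, so it is confined entirely to the uncertainty term.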
7.2.4. Kosa [37] 
Kosa [37] considers the worst-case response time for operations on abstract data 
types and studies the combined effect of the amount of synchrony, the strength of the 
consistency guarantee and algebraic properties of the operations on this response time. 
For a wide variety of algebraic properties, Kosa extends the following results, already 
shown for read/write objects by results in this paper and in [15]:
• sequential consistency and linearizability are equally costly in the perfect clocks
  model;
• linearizability is more expensive in the imperfect clocks model than in the perfect
  clocks model ([37, Theorems 4.1 and 4.2] are shown using the same techniques as
  Theorems 5.15 and 5.8, respectively, in this paper);
• sequential consistency is cheaper than linearizability in the imperfect clocks model.
For sake of completeness and comparison, we summarize in Table 1 our main results 
and other related results known to us that provide similar bounds for message-passing 
implementations of sequential consistency and linearizability. 
7.3. Future research 
Our work leaves open several interesting questions. Most obviously, it would be 
interesting to see if our bounds for the approximately synchronized clocks model, and, 
hence, for the imperfect clocks model, can be further improved. (Partial improvements 
have been presented in [20] for the case of upper bounds for the imperfect clocks 
model; see Section 7.2 for a description.) 
Our results assume that clocks are available to processes; what if processes have no 
timing information at all and computations are completely asynchronous? What is the 
tightest coefficient of $d$ bounding $|R| + |W|$ for sequentially consistent or linearizable
implementations of read/write objects in this case? Also, it will be very interesting 
to obtain bounds on the worst-case response times of implementing other memory 
objects like, e.g., atomic snapshots [3], under sequential consistency and linearizability. 
How does strengthening of the shared memory primitives affect the worst-case response 
times? (Partial answers have been provided by Friedman [25].) 
It would be interesting to examine the benefits of using timing information for im- 
plementing hybrid consistency [12] and see how the time requirements for performing 
weak and strong operations depend on the timing uncertainties of the model studied in 
Table 1 
Summary of time bounds for message-passing implementations of read/write objects under sequential
consistency and linearizability. Results marked by * are shown in this paper; references for other results are also
given. An arrow ↑ (resp., ↓) indicates that the result follows from the corresponding result for the stronger
(resp., weaker) timing model.
Timing 
model 
Correctness 
condition 
cost 
measure 
Lower bounds Upper bounds 
Perfect Sequential 
clocks consistency, 
(&GO) linearizability 
Approximately 
synchronized 
clocks 
(O<S<m,u>O) 
Imperfect 
clocks 
(6=c)3, 
u>O) 
No clocks 
Sequential 
consistency 
Linearizability 
Sequential 
consistency 
Linearizability 
Linearizability 
PI 
PI 
IRI+IW 
IR! 
/WI 
/R/+/W 
PI 
IWI 
lR/+IW dftc/2* 
/RI 
IWI 
IRI+IW 
/RI 
WI 
/RI+/W 
PI 5d [I21 
WI 5d [12] 
P/+/WI 10d [12] 
0 
0 
d[15,41] 
0 
0 
7 
mini&, u}/2* 
min{&u}/2* 
0 
0 
T 
t 
u/4 ll51 
T (also in [15]) 
t 
@d*,O < fi < 1 
(@=O and 1 also in [15]) 
(I - B)d*,O G/i G 1 
(/j=O and 1 also in [15]) 
d(als0 in[ 151) 
1 
1 
I 
bd+3u+ min{& u}+c* 
O<fl<l-u/dand 
O<c<min{2u,d-21) 
1 
(l-/Qd+3u* 
O<fi< I-u/d 
d+6u+min{&u}+s*, 
O<a<min{tu,d-u} 
1 
0 [I51 
0 [I51 
2d [I51 
/!d+4u+r:, * 
O<fi<I-u/d and 
Otcbmin{Zu,d-u} 
fid+u,O<b<l [20] 
(1 -B)d+3u* 
O<fl< l-u/d 
(l-jQd+u,O<:p<l [20] 
d+7u&, 
Oi~<min{&d-u} 
df2u [20] 
- 
this paper. Another interesting open question related to hybrid consistency is whether 
hybrid consistency based on sequential consistency allows for more efficient implemen- 
tations than hybrid consistency based on linearizability. Lower and upper bounds shown 
in [12] imply that as far as fast implementations are considered, i.e., implementations 
for which the response times for weak read and weak write operations are both strictly 
less than d/2, there is no significant improvement in performance for hybrid consis- 
tency based on sequential consistency over hybrid consistency based on linearizability. 
However, results in this paper and in [15] suggest that a higher gain in performance
might be possible if the implementation is not required to be fast. 
A wide avenue for further research suggested by our work is the study of the costs of 
implementing sequentially consistent and linearizable objects in the presence of partial 
synchrony. The assumption of clocks that advance at the same rate, that of real time,
is crucial for the results in this paper. It would be interesting to see what might be 
achieved if there were a known bound on the relative speeds of processors’ clocks, 
or if no such bound existed. Some preliminary steps in this direction have been taken 
by Eleftheriou and Mavronicolas [22], in the context of the drifting clocks model and 
under different assumptions on message delays. 
For additional work on memory consistency conditions and related issues of complex- 
ity, implementations, performance, verification and programming, the reader is referred 
to a substantial body of recent research [1, 2, 5-9, 11-14, 24-29, 33-35, 37, 40, 42,
43, 50, 53].
Acknowledgements 
Our work has been inspired and heavily influenced by the earlier pioneering work of 
Hagit Attiya and Jennifer Welch on a quantitative comparison of sequential consistency 
and linearizability [10, 15] under various timing assumptions. In particular, we would
like to thank Hagit Attiya for making early versions of [10, 15] available to us and
for helpful discussions. We owe special thanks to Harry Lewis for conjecturing the 
existence of implementations for the perfect clocks model falling between the extreme 
ones of Attiya and Welch [15]; these implementations subsequently led us to discover
those for the approximately synchronized clocks model that trade the network latency 
cost between read and write operations. We are also thankful to Soma Chaudhuri, Maria 
Eleftheriou, Maurice Herlihy and Nancy Lynch for helpful discussions and comments, 
and to Roy Friedman, Martha Kosa, Yishay Mansour and the WDAG’92 Program 
Committee members and referees for their comments on earlier versions of our papers. 
References 
[1] S. Adve, M. Hill, A unified formalization of four shared-memory models, IEEE Trans. Parallel Distrib.
Systems 4 (6) (1993) 613-624.
[2] S. Adve, K. Gharachorloo, Shared memory consistency models: a tutorial, Computer 29 (12) (1996)
66-76.
[3] Y. Afek, H. Attiya, D. Dolev, E. Gafni, M. Merritt, N. Shavit, Atomic snapshots of shared memory,
J. ACM 40 (4) (1993) 873-890.
[4] Y. Afek, G. Brown, M. Merritt, Lazy caching, ACM Trans. Programming Languages Systems 15 (1)
(1993) 182-205.
[5] D. Agrawal, M. Choy, H.V. Leong, A.K. Singh, Mixed consistency: a model for parallel programming, 
Proc. 13th Annu. ACM Symp. on Principles of Distributed Computing, August 1994, pp. 101-110.
[6] M. Ahamad, R. Bazzi, R. John, P. Kohli, G. Neiger, The power of processor consistency, Proc. 5th
Annu. ACM Symp. on Parallel Algorithms and Architectures, June/July 1993, pp. 251-260. 
[7] M. Ahamad, J. Burns, P. Hutto, G. Neiger, Causal memory, Distrib. Comput. 9 (1995) 37-49.
[8] M. Ahamad, P. Hutto, R. John, Implementing and programming causal distributed shared memory, Proc. 
11th Internat. Conf. on Distributed Computing Systems, May 1991, pp. 274-281.
[9] R. Alur, K. McMillan, D. Peled, Model-checking of correctness conditions for concurrent objects, Proc.
11th Annual IEEE Symp. on Logic in Computer Science, July 1996, pp. 219-228.
[10] H. Attiya, Implementing FIFO queues and stacks, in: S. Toueg, P.G. Spirakis, L. Kirousis (Eds.), Proc.
5th Internat. Workshop on Distributed Algorithms (WDAG’91), Lecture Notes in Computer Science,
vol. 579, Springer, Berlin, 1991, pp. 80-94. 
[11] H. Attiya, S. Chaudhuri, R. Friedman, J.L. Welch, Shared memory consistency conditions for non-
sequential execution: definitions and programming strategies, SIAM J. Comput. 27 (1) (1998) 65-89.
[12] H. Attiya, R. Friedman, A correctness condition for high-performance multiprocessors, Proc. 24th Annu.
ACM Symp. on Theory of Computing, May 1992, pp. 679-690. 
[13] H. Attiya, R. Friedman, Programming DEC-alpha based multiprocessors the easy way, Proc. 6th Annu.
ACM Symp. on Parallel Algorithms and Architectures, June 1994, pp. 157-166. 
[14] H. Attiya, R. Friedman, Limitations of fast consistency conditions for distributed shared memory, Inform. 
Process. Lett. 57 (5) (1996) 243-248. 
[15] H. Attiya, J.L. Welch, Sequential consistency versus linearizability, ACM Trans. Comput. Systems 12
(2) (1994) 91-122. Preliminary version: Proc. 3rd Annu. ACM Symp. on Parallel Algorithms and
Architectures, July 1991, pp. 304-315.
[16] H. Attiya, J.L. Welch, Distributed Computing: Fundamentals, Simulations and Advanced Topics,
McGraw-Hill, New York, 1998. 
[17] H. Bal, M. Kaashoek, A. Tanenbaum, Orca: a language for parallel programming of distributed systems,
IEEE Trans. Software Eng. 18 (3) (1992) 190-205. 
[18] P. Bernstein, V. Hadzilacos, N. Goodman, Concurrency Control and Recovery in Database Systems,
Addison-Wesley, Reading, MA, 1987. 
[19] K. Birman, T. Joseph, Reliable communication in the presence of failures, ACM Trans. Comput. Systems
5 (1) (1990) 47-76.
[20] S. Chaudhuri, R. Gawlick, N. Lynch, Designing algorithms for distributed systems using partially 
synchronized clocks, Proc. 12th Annu. ACM Symp. on Principles of Distributed Computing, August 
1993, pp. 121-132. 
[21] B. Coan, G. Thomas, Agreeing on a leader in real time, Proc. 11th IEEE Real-Time Systems Symp.,
December 1990, pp. 166-172. 
[22] M. Eleftheriou, M. Mavronicolas, Linearizability in the presence of partial synchrony and under different
delay assumptions, Preprint, Department of Computer Science, University of Cyprus, February 1998. 
[23] Encore 91 Series Technical Summary, 1991. 
[24] A. Fekete, F. Kaashoek, N. Lynch, Providing sequentially consistent shared objects using group and 
point-to-point communication, J. ACM 45 (1) (1998) 35-69.
[25] R. Friedman, Implementing high-level synchronization operations in hybrid consistency, Distrib. Comput. 
9 (3) (1995) 119-129. 
[26] K. Gharachorloo, P. Gibbons, Detecting violations of sequential consistency, Proc. 3rd Annu. ACM
Symp. on Parallel Algorithms and Architectures, June 1991, pp. 316-326. 
[27] P. Gibbons, E. Korach, On testing cache-coherent shared memories, Proc. 3rd Annu. ACM Symp. on 
Parallel Algorithms and Architectures, June 1994, pp. 177-188. 
[28] P. Gibbons, M. Merritt, Specifying non-blocking shared memories, Proc. 3rd Annu. ACM Symp. on 
Parallel Algorithms and Architectures, July 1992, pp. 292-303. 
[29] P. Gibbons, M. Merritt, K. Gharachorloo, Proving sequential consistency of high-performance shared
memories, Proc. 3rd Annu. ACM Symp. on Parallel Algorithms and Architectures, June 1991, 
pp. 292-303. 
[30] J. Halpern, N. Megiddo, A. Munshi, Optimal precision in the presence of uncertainty, J. Complexity 1
(1985) 170-196. 
[31] M. Herlihy, Wait-free implementations of concurrent objects, Proc. 7th Annu. ACM Symp. on Principles
of Distributed Computing, August 1988, pp. 276-290. 
[32] M. Herlihy, J. Wing, Linearizability: a correctness condition for concurrent objects, ACM Trans. 
Programming Languages Systems 12 (3) (1990) 463-492. 
[33] P. Hutto, M. Ahamad, Slow memory: weakening consistency to enhance concurrency in distributed 
shared memories, Proc. 10th Internat. Conf. on Distributed Computing Systems, May 1990,
pp. 302-311.
[34] M. Inoue, T. Masuzawa, N. Tokura, Efficient linearizable implementation of shared FIFO queues and
general objects on a distributed system, IEICE Trans. Fundamentals of Electronics, Communications, 
and Computer Sciences E81A (5) (1998) 768-775. 
[35] J. James, A.K. Singh, Fault tolerance bounds for memory consistency, in: M. Mavronicolas, 
Ph. Tsigas (Eds.), Proc. 11th Internat. Workshop on Distributed Algorithms (WDAG-97), Lecture
Notes in Computer Science, vol. 1320, Springer, Saarbrücken, Germany, September 1997, pp. 200-214.
[36] H. Kopetz, W. Ochsenreiter, Clock synchronization in distributed real-time systems, IEEE Trans. 
Computers C-36 (8) (1987) 933-939. 
[37] M.J. Kosa, Making operations of concurrent data types fast, Proc. 13th Annu. ACM Symp. on Principles 
of Distributed Computing, August 1994, pp. 32-41. 
[38] L. Lamport, How to make a multiprocessor computer that correctly executes multiprocess programs, 
IEEE Trans. Computers C-28 (9) (1979) 690-691. 
[39] L. Lamport, On interprocess communication, parts I and II, Distrib. Comput. 1 (2) (1986) 77-101.
[40] L. Lamport, How to make a correct multiprocess program execute correctly on a Multiprocessor, DEC 
SRC Research Report # 96, February 14, 1993. 
[41] R. Lipton, J. Sandberg, A scalable shared memory, Technical Report CS-TR-180-88, Princeton 
University, September 1988. 
[42] B. Liskov, Practical uses of synchronized clocks in distributed systems, Distrib. Comput. 6 (1993) 
211-219. 
[43] V. Luchangco, Precedence-based memory models, in: M. Mavronicolas, Ph. Tsigas (Eds.), Proc. 
11th Internat. Workshop on Distributed Algorithms (WDAG-97), Lecture Notes in Computer Science,
vol. 1320, Springer, Saarbrücken, Germany, September 1997, pp. 215-229.
[44] J. Lundelius, N. Lynch, An upper and lower bound for clock synchronization, Inform. and Control 
62 (2/3) (1984) 190-204. 
[45] N. Lynch, M. Tuttle, An introduction to input/output automata, CWI Quarterly 2 (3) (1989) 219-246. 
[46] M. Mavronicolas, D. Roth, Sequential consistency and linearizability: read/write objects, Proc. 29th 
Annu. Allerton Conf. on Communication, Control and Computing, October 1991, pp. 683-692. 
[47] M. Mavronicolas, D. Roth, Efficient strongly consistent implementations of shared memory, in:
A. Segall, S. Zaks (Eds.), Proc. 6th Internat. Workshop on Distributed Algorithms (WDAG’92), Lecture
Notes in Computer Science, vol. 647, Springer, Berlin, November 1992, pp. 346-361.
[48] J. Misra, Axioms for memory access in asynchronous hardware systems, ACM Trans. Programming
Languages Systems 8 (1) (1986) 142-153. 
[49] C. Papadimitriou, The serializability of concurrent database updates, J. ACM 26 (4) (1979) 631-653. 
[50] F. Pong, M. Dubois, The verification of cache coherence protocols, Proc. 5th Annu. ACM Symp. on 
Parallel Algorithms and Architectures, June/July 1993, pp. 11-20.
[51] B. Simons, J. Welch, N. Lynch, An overview of clock synchronization, IBM Technical Report RJ 6505,
October 1988. 
[52] R. Strong, D. Dolev, F. Cristian, New latency bounds for atomic broadcast, Proc. 11th IEEE Real-Time
Systems Symp., December 1990, pp. 156-165.
[53] R.N. Zucker, J.-L. Baer, A performance study of memory consistency models, Proc. 19th ACM Internat.
Symp. on Computer Architecture, May 1992, pp. 2-12.
