CoVer-ability: Consistent Versioning for Concurrent Objects by Nicolaou, Nicolas et al.
CoVer-ability: Consistent Versioning for Concurrent Objects ∗
Nicolas Nicolaou † Antonio Ferna´ndez Anta † Chryssis Georgiou ‡
October 28, 2018
Abstract
An object type characterizes the domain space and the operations that can be invoked on an object of
that type. In this paper we introduce a new property for concurrent objects, we call coverability, that aims to
provide precise guarantees on the consistent evolution of an object. This new property is suitable for a va-
riety of distributed objects including concurrent file objects that demand operations to manipulate the latest
version of the object. We propose two levels of coverability: (i) strong coverability and (ii) weak coverabil-
ity. Strong coverability requires that only a single operation can modify the latest version of the object, i.e.
“covers” the latest version with a new version, imposing a total order on object modifications. Weak cov-
erability relaxes the strong requirements of strong coverability and allows multiple operations to modify the
same version of an object, where each modification leads to a different version. Weak coverability preserves
consistent evolution of the object, by demanding any subsequent operation to only modify one of the newly
introduced versions. Coverability combined with atomic guarantees yield to coverable atomic read/write
registers. We also show that strongly coverable atomic registers are equivalent in power to consensus. Thus,
we focus on weakly coverable registers, and we demonstrate their importance by showing that they cannot
be implemented using similar types of registers, like ranked-registers. Furthermore we show that weakly
coverable registers may be used to implement basic (weak) read-modify-write and file objects. Finally, we
implement weakly coverable registers by modifying an existing MWMR atomic register implementation.
Submission Type: Regular paper.
∗Supported in part by FP7-PEOPLE-2013-IEF grant ATOMICDFS No:629088, Ministerio de Economia y Competitividad grant
TEC2014- 55713-R, Regional Government of Madrid (CM) grant Cloud4BigData (S2013/ICE-2894, co- funded by FSE & FEDER),
NSF of China grant 61520106005, and European Commission H2020 grants ReCred and NOTRE.
†IMDEA Networks Institute, Madrid, Spain, nicolas.nicolaou@imdea.org,antonio.fernandez@imdea.org
‡Dept. of Computer Science, University of Cyprus, Nicosia, Cyprus, chryssis@cs.ucy.ac.cy
ar
X
iv
:1
60
1.
07
35
2v
3 
 [c
s.D
C]
  1
1 M
ar 
20
16
1 Introduction
Motivation and Prior Work. A concurrent system allows multiple processes to interact with a single object
at the same time. A long string of research work [1, 5, 14–16] has been dedicated to explain the behavior of
concurrent objects, defining the order and the outcomes of operations when those are invoked concurrently on
the object. Lamport in [15,16] presented three different incremental semantics, safety, regularity, and atomicity
that characterize the behavior of read/write objects (registers) when those are modified or read concurrently
by multiple processes. The strongest, and most difficult to provide in a distributed system, is atomicity which
provides the illusion that the register is accessed sequentially. Herlihy and Wing presented linearizability
in [14], an extension of atomicity to general concurrent objects. More recent developments have proposed
abortable operations in the event of concurrency [1], and ranked registers [5] that allow operations to abort in
case a higher “ranked” operation was previously or concurrently executed in the system.
With the advent of cloud computing, emerging families of more complex concurrent objects, like files,
distributed databases, and bulleting boards, demand precise guarantees on the consistent evolution of the object.
For example, in concurrent file objects one would expect that if a write operation ω2 is invoked after a write
operation ω1 is completed, then ω2 modifies either the version of the file written by ω1 or a version of the file
newer than the one written by ω1. So is it possible to provide such guarantees using simpler objects as building
blocks?
In existing atomic read/write distributed shared register implementations, write operations are usually al-
lowed to modify the value of the register, even when they are unaware of the value written by the latest preceding
write operation. In systems that assume a single writer [2, 7, 11, 12], the problem may be diminished by having
the sole writer compute the next value to be written in relation to the previous values it wrote. The problem
becomes more apparent when multiple writers may alter the value of a single register concurrently [8, 19]. In
such cases, atomic read/write register implementations appear unsuitable to directly implement objects that
demand evolution guarantees. Closer candidates to build such objects are the bounded [3] and ranked [5] regis-
ters. These objects take into account the “rank” or sequence number of previous operations to decide whether
to allow a read/write operation to commit or abort. These approaches do not prevent, however, the use of an
arbitrarily higher rank, and thus an arbitrarily higher version, than the previous operations. This affects the
consistent evolution of the object, as intermediate versions of the object maybe ignored.
Contributions. In this paper we propose a formalism to extend a concurrent object in such a way that the
evolution of its state satisfies certain guarantees. To this end, we extend an object state with a version, and
introduce the concept of coverability, that defines how the versions of an object can evolve (Section 3).
In particular, we first introduce a new class of a concurrent read/write register type, which we call versioned
register. A concurrent register is of a versioned type, if the state of the register and any operation (read or write)
that attempts to modify the state of the register, are associated with a version. An operation may modify the
state and the version of the register, or it may just retrieve its state-version pair.
Coverability defines the exact guarantees that a versioned register provides when it is accessed concurrently
by multiple processes with respect to the evolution of its versions. We define two levels of coverability: strong
and weak coverability. Strong coverability ensures that only a single operation may change a given version
(and thus the state) of the register, resulting in a lineal evolution of the versions (and the states) of the register.
Weak coverability relaxes this rule and allows multiple operations to change a version, generating in this way
a tree with possibly multiple version branches that can grow in parallel. This shares similarities with fork
linearizability presented in [20]. However, in contrast to [20], weak coverability allows processes, that change
the same version of the object, to see the changes of each other in subsequent operations. In particular, by
weak coverability, when all the operations that extend a particular version of the object terminate, there is one
version ver that was generated by one of those operations, which is the ancestor of any version extended by any
subsequent operation. Thus, only a single branch in the tree is extended and that branch denotes the evolution of
the register. Combining strong/weak coverability with atomic guarantees we obtain strongly/weakly coverable
atomic read/write registers. While strongly coverable atomic registers are very desirable objects, we show that
they are in fact very strong. In particular, we argue that these object types are as powerful as consensus objects
(the details are given in Appendix B). Hence, it is challenging to implement these objects in some distributed
1
systems, and impossible in an asynchronous system prone to failures (from the FLP result [10]).
The good news is that even weakly coverable atomic registers have very interesting features. On the one
hand, they can be implemented in message passing asynchronous distributed systems where processes can
fail. To show this, we describe how algorithms that implement atomic R/W registers can easily be modified to
implement these objects (Section 6). On the other hand, we show that weakly coverable atomic registers cannot
be implemented using other previously defined register types such as ranked registers (Section 4).
One of the main motivation for introducing coverable registers are file objects, which can be seen as a
special case of register objects in which each new value is a revision of the previous value. In essence, each
modification of a file can be seen as an atomic read-modify-write (RMW) operation. Strongly coverable atomic
registers provide the desired strong guarantees for files, since they are powerful enough to support atomic RMW
operations. However, we show that even weakly coverable atomic registers can be used to provide interesting
weak RMW guarantees that can be used to implement files with a good level of consistency (Section 5).
2 Model
We consider a distributed system composed of n asynchronous processes, with identifiers from a set I =
{p1, . . . , pn}, each of which represents a sequential thread of control. Processes may interact with a set of
shared objects O. Each object in O represents a data structure shared among the processes, and has a type
which defines the possible set of object states and the set of operations that provide the means to manipulate
the object. A subset of processes may fail by crashing.
Processes can be modeled in terms of I/O Automata [18]. An automaton A (which combines the automata
Ai for each process pi ∈ I) is defined over a set of states and a set of actions. An execution ξ of A is an
alternating sequence of states and actions ofA. An execution fragment is a finite prefix of an execution. We say
that an execution fragment ξ′ extends an execution fragment ξ, if ξ is a prefix of ξ′. A history of an automaton
A, denoted by Hξ, is the subsequence of actions occurring in some execution fragment ξ. An automaton A
invokes an operation when an invocation action occurs in an execution ξ, and receives a response to an action
when a response action occurs. An operation pi is complete in an execution ξ, ifHξ contains both the invocation
and the matching response actions for pi; otherwise pi is incomplete. A history Hξ of the automaton Ai of a
process pi is well formed if it begins with an invocation event and alternates between matching invocation
and response events. (This demonstrates the assumption that each process is a single thread of control.) Each
history Hξ includes a precedence relation →Hξ on its operations. An operation pi1 precedes an operation pi2
(or pi2 succeeds pi1) in Hξ if the response of pi1 appears before the invocation of pi2 in Hξ. This is denoted by
pi1 →Hξ pi2. If pi1 6→Hξ pi2 and pi2 6→Hξ pi1 in Hξ, then pi1 and pi2 are concurrent. A process pi crashes in an
execution ξ if the event failpi appears and is the last action of pi in Hξ; otherwise pi is correct.
3 Coverable Atomic Read/Write Registers
In this section we define a new type of R/W register, the versioned register. Next we provide new consistency
properties for concurrent versioned registers called (strong/weak) coverability. We show how coverability can
be combined with atomic guarantees to yield a coverable atomic register.
Versioned register. Let Versions be a totally ordered set of versions. A versioned register is a type of
read/write register where each value written is assigned with a version from the set Versions . Moreover,
each write operation pi that attempts to change the value of the register is also associated with a version, say
verpi, denoting that it intends to overwrite the value of the register associated with the version verpi. More
precisely, an implementation of a R/W register offers two operations: read and write. A process pi ∈ I
invokes a write (resp. read) operation when it issues a write(val)pi (resp. readpi) request. The versioned
variant of a R/W register also offers two operations: (i) cvr-write(val, ver)pi , and (ii) cvr-read()pi . A process
pi invokes a cvr-write(val, ver)pi operation when it performs a write operation that attempts to change the
value of the object. The operation returns the value of the object and its associated version, along with a flag
2
informing whether the operation has successfully changed the value of the object or failed. We say that a write
is successful if it changes the value of the register; otherwise the write is unsuccessful. The read operation
cvr-read()pi involves a request to retrieve the value of the object. The response of this operation is the value of
the register together with the version of the object that this value is associated with.
Read operations do not incur any change on the value of the register, whereas write operations attempt to
modify the value of the register. More formally, let ∆T be the set of transitions for the versioned register. Then,
each δ ∈ ∆T is a tuple 〈σ, pi, pi, σ′, res〉, denoting that the register moves from state σ to state σ′, and responds
with res, as a result of operation pi invoked by process pi ∈ I. The state of a versioned register is essentially its
value, drawn from a set V , and its version, drawn from the set Versions . We assume that ∆T is total, that is,
for every pi ∈ {cvr-write(val, ver)pi , cvr-read()pi}, pi ∈ I, and σ = (val, ver) ∈ V × Versions , there exists
σ′ = (val′ , ver′) ∈ V × Versions and res such that 〈σ, pi, pi, σ′, res〉 ∈ ∆T . As such, the transitions of the
versioned register type can be written as follows:
1. 〈(val, ver), cvr-write(val′, verω), pi, (val′, ver′), (val′, ver′, chg)〉, for verω = ver,
2. 〈(val, ver), cvr-write(val′, verω), pi, (val, ver), (val, ver, unchg)〉, for verω 6= ver
3. 〈(val, ver), cvr-read(), pi, (val, ver), (val, ver)〉.
Notice that write operations may or may not modify the value/version of the register. In the transitions above,
verω denotes the version of the register which the write operation tries to modify. The relationship of ver
with ver′ may vary depending on the application that uses this register (but seems natural to assume that
ver′ > ver). A read operation does not make any changes on the value or the version of the object. To simplify
notation, in the rest of the paper we avoid any reference to the value of the register. Additionally we only use
the flag when its value is unchg. Thus, cvr-write(v, ver)(v, ver′, chg)pi is denoted as cvr-ω(ver)[ver′]pi , and
cvr-write(v, ver)(v′, ver′, unchg)pi is denoted as cvr-ω(ver)[ver′, unchg]pi .
We say that, a write operation revises a version ver of the versioned register to a version ver′ (or produces
ver′) in an execution ξ, if cvr-ω(ver)[ver′]pi completes in Hξ. Let the set of successful write operations on a
history Hξ be defined as:
Wξ,succ = {pi : pi = cvr-ω(ver)[ver′]pi completes in Hξ}
The set now of produced versions in the history Hξ is defined by:
Versionsξ = {veri : cvr-ω(ver)[veri]pi ∈ Wξ,succ} ∪ {ver0}
where ver0 is the initial version of the object. Observe that the elements of Versionsξ are totally ordered. In the
rest of the text we use ‘∗’ in the place of some parameter to denote that any legal value for that parameter can
be used. Now we present the validity property which defines explicitly the set of executions that are considered
to be valid executions.
Definition 1 (Validity) An execution ξ (resp. its history Hξ) is a valid execution (resp. history) on a versioned
object, ifWξ and for any pi, pj ∈ I:
• ∀cvr-ω(ver)[ver′]pi ∈ Wξ,succ, ver < ver′,
• for any operations cvr-ω(∗)[ver′]pi and cvr-ω(∗)[ver′′]pj inWξ,succ, ver′ 6= ver′′, and
• for each verk ∈ V ersionsξ there is a sequence of versions ver0, ver1, . . . , verk, such that cvr-ω(veri)[veri+1]
∈ Wξ,succ, for 0 ≤ i < k.
Validity makes it clear that an operation changes the version of the object to a larger version, according to the
total ordering of the versions. Also validity specifies that versions are unique, i.e. no two operations associate
two states with the same version. This can be easily achieved by, for example, recording a counter and the id
of the invoking process in the version of the object. Finally, validity requires that each version we reach in an
execution is derived (through a chain of operations) from the initial version of the register ver0. From this point
onward we fix ξ to be a valid execution and Hξ to be its valid history.
3
Coverability. We can now define the strong and weak coverability properties over a valid execution ξ of
versioned registers with respect to some total order >ξ on the operations of ξ.
Definition 2 (Strong Coverability) Let ver0 < ver1 < . . . < ver|Wξ,succ| be the versions in Versionsξ.
A valid execution ξ is strongly coverable with respect to a total order <ξ on operations inWξ,succ if:
• cvr-ω(veri−1)[veri] ∈ Wξ,succ, for 1 ≤ i ≤ |Wξ,succ|,
• cvr-ω(veri−1)[veri] <ξ cvr-ω(veri)[veri+1], for 1 ≤ i < |Wξ,succ| , and
• if pi1, pi2 ∈ Wξ,succ, and pi1 →Hξ pi2 then pi1 <ξ pi2.
By Definition 2, all successful write operations are totally ordered with respect to the versions they modify.
Notice than only a single write operation modifies each version veri−1 to the next version veri. Thus, strong
coverability defines an object type which is difficult to provide in an asynchronous distributed setting. In fact it
can be shown that strongly coverable registers can be used to solve consensus among asynchronous fail-prone
processes (see Appendix B). However, as shown by Fischer, Lynch and Paterson [10], solving consensus in
such a system is impossible in the existence of a single crash failure, unless some powerful object is used.
Hence the interest in defining a weaker version of coverability.
Definition 3 (Weak Coverability) A valid execution ξ is weakly coverable with respect to a total order <ξ on
operations inWξ,succ if:
• (Consolidation) If pi1 = cvr-ω(∗)[veri], pi2 = cvr-ω(verj)[∗] ∈ Wξ,succ, and pi1 →Hξ pi2 in Hξ, then
veri ≤ verj and pi1 <ξ pi2.
• (Continuity) if pi2 = cvr-ω(ver)[veri] ∈ Wξ,succ, then there exists pi1 ∈ Wξ,succ s.t. pi1 = cvr-ω(∗)[ver]
and pi1 <ξ pi2, or ver = ver0.
• (Evolution) let ver, ver′, ver′′ ∈ V ersionsξ. If there are sequences of versions ver′1, ver′2, . . . , ver′k
and ver′′1 , ver′′2 , . . . , ver′′` , where ver = ver
′
1 = ver
′′
1 , ver
′
k = ver
′, and ver′′` = ver
′′ such that
cvr-ω(ver′i)[ver
′
i+1] ∈ Wξ,succ, for 1 ≤ i < k, and cvr-ω(ver′′i )[ver′′i+1] ∈ Wξ,succ, for 1 ≤ i < `,
and k < `, then ver′ < ver′′.
By Definition 3, weak coverability allows multiple write operations to revise the same version veri of the
register, each to a unique version verj . Given the set of successful operationsWξ,succ and the set of versions
Versionsξ, Definitions 1 and 3 define a connected rooted tree T s.t.:
• The set of nodes of T is Versionsξ,
• The initial version ver0 of the object is the root of T ,
• A node veri is the parent of a node verj in T iff ∃pi(veri)[verj ] ∈ Wξ,succ,
• If pi1 = cvr-ω(∗)[veri] ∈ Wξ,succ, s.t. pi1 is not concurrent with any other operation, then ∀pi2 ∈ Wξ,succ,
s.t. pi1 →ξ pi2 and pi2 = pi(verz)[∗], then veri is an ancestor of verz in T , or veri = verz (by Consolidation,
Continuity, and Validity)
• if veri is an ancestor of verj in T , then cvr-ω(∗)[veri] <ξ cvr-ω(∗)[verj ] (by Continuity).
• if veri is at level k of T and verj is at level ` of T s.t. k < `, then veri < verj (by Evolution).
Observe that without the properties imposed by weak coverability, validity allows the creation of a tree of
versions and does not prevent operations from being applied on an old version of the register. Continuity,
Consolidation, and Evolution explicitly specify the conditions that reduce the branching of the generated tree,
and in the case of not concurrency lead the operations to a single path on this tree. Consolidation specifies
that write operations may revise the register with a version larger than any version modified by a preceding
write operation, and may lead to a version newer than any version introduced by a preceding write operation.
4
Figure 1: Tree Illustration from Weak Coverable Execution
Continuity defines that a write operation may revise a version that was introduced by a preceding write operation
according to the given total order. Finally, Evolution limits the relative increment on the version of a register
that can be introduced by any operation. Figure 1 provides an illustration of a tree created from a coverable
execution ξ. We box sample instances of the execution and we indicate the coverability properties they satisfy.
Atomic coverability. We now combine coverability with atomic guarantees to obtain coverable atomic
read/write registers. A register is linearizable [14], or equivalently atomic (as defined specifically for regis-
ters by [16, 17]) if the following conditions are satisfied by any execution ξ of an implementation of the object.
Definition 4 (Atomicity) [17, Section 13.4] An execution ξ of an automaton A is atomic if every read and
write operation in ξ is complete and there is a partial ordering≺Hξ on all operations Π in Hξ such that: A1. For
any pair of operations pi1, pi2 ∈ Π, if pi1 →Hξ pi2 then it cannot hold that pi2 ≺Hξ pi1, A2. If pi ∈ Π is a write
operation and pi′ any operation in Π, then either pi ≺Hξ pi′ or pi′ ≺Hξ pi, and A3. If v is the value returned by a
read ρ then v is the value written by the last preceding write according to ≺Hξ (or the initial value v0 if there is
no such a write).
In the context of versioned registers, in Definition 4, a write refers to a successful write (cvr-ω(∗)[∗, chg])
operation on the versioned register. Therefore, all the write operations in an execution ξ are the ones that
appear inWξ,succ. A read refers to a versioned read (cvr-ρ()[∗]) or an unsuccessful write (cvr-ω(∗)[∗, unchg])
operation that does not modify the value (nor the version) of the register.
Definition 5 (Coverable atomic register) A versioned register is (strongly/weakly) coverable and atomic, re-
ferred as (strongly/weakly) coverable atomic register, if any execution ξ on the register satisfies: (i) atomicity,
and (ii) strong/weak coverability (Definition 2/3) with respect to the total order imposed by A2 onWξ,succ.
Note that in a coverable atomic register, the ordering of read operations follows the ordering from atomicity.
From this point onward, when clear from context, we refer to a coverable atomic register, as simply coverable
register.
4 Weakly Coverable Atomic Registers vs Ranked Registers.
A type of registers that at first might resemble coverable registers are ranked-registers [5]. As we show here,
ranked-registers are weaker than weakly coverable registers. In particular, we show that it is impossible to
implement weakly coverable registers using ranked-registers; we begin by providing a formal definition of
ranked-registers.
5
Definition 6 (Ranked-Registers [5]) Let Ranks be a totally ordered set of ranks with r0 the initial rank. A
ranked register is a MWMR shared object that offers the following operations: (i) rr-read(r), with r ∈ Ranks
and returns (r, v) ∈ Ranks × V alues, and (ii) rr-write(〈r, v〉), with (r, v) ∈ Ranks × V alues and returns
commit or abort. A ranked register satisfies the following properties: (i) Safety. Every rr-read operation
returns a value and a rank that was written in some rr-write invocation or (r0, v0). Additionally, if W =
rr-write(〈r1, v〉) a write operation which commits and R = rr-read(r2) such that r2 > r1, then R returns (r, v)
where r ≥ r1. (ii) Non-Triviality. If a rr-write operation W invoked with a rank r1 aborts, then there exists an
operation with rank r2 > r1 which returns before W is invoked, or is concurrent with W (iii) Liveness. if an
operation is invoked by a correct process then eventually it returns.
We want to use rank-registers to implement the operations of a weakly coverable register. As in Section 2,
we denote by cvr-ω(ver)[ver′, f lag] the coverable write operation that tries to revise version ver, and returns
version ver′ with a flag ∈ {chg, unchg}. Similarly we denote by rr-ω(r)[rh, res] a write operation on a
ranked-register that uses rank r and tries to modify the value of the register. The rank rh is the highest rank
observed by an operation and res ∈ {abort, commit}. In the following results we assume that a weakly
coverable register is implemented using a set of ranked-registers. We begin with a lemma that shows that
a coverable write operation revises the coverable register only if it invokes a write operation on some rank
register and that write operation commits. Omitted proofs can be found in Appendix A.
Lemma 7 Suppose there exists an algorithm A that implements a weakly coverable register using ranked-
registers. In any execution ξ of A, if a process pi invokes a coverable write operation cvr-ω(ver)[ver′, chg]pi ,
then pi performs a write rr-ω(r)[rh, commit]pi,j on some shared ranked-register j.
Next we show that if pi1, pi2 are two non-concurrent write operations on the weakly coverable register, then
pi2 performs a ranked write (that commits or aborts) on at least a single ranked register on which pi1 performed
a committed ranked write operation. For the sake of the lemma Ri is the set of ranked registers on which pii
writes, and cRi a subset of them on which the write commits.
Lemma 8 Let pi1 = cvr-ω(ver)[ver1, chg]pi and pi2 = cvr-ω(ver1)[ver2, ∗]pz , i 6= z, be two write operations
that appear in an execution ξ s.t. pi1 →ξ pi2. There exists some shared register j ∈ R2∩cR1 with a highest rank
rj before the invocation of pi1, such that pi performs an rr-ω(r)[∗, commit]pi,j during pi1, and pz performs an
rr-ω(r′)[∗, ∗]pz ,j during pi2.
Thus far we showed that a successful coverable write operation needs to commit on at least a single ranked
register (Lemma 7), and two non-concurrent coverable write operations need to invoke a ranked write operation
on a common rank register (Lemma 8). Using now Lemma 8 we can show that a coverable write operation that
changes the version of the coverable register must use a rank higher than any previously successful coverable
write operation.
Lemma 9 In any execution ξ if pi1 = cvr-ω(ver)[ver1, chg]pi and pi2 = cvr-ω(ver1)[ver2, chg]pz ,z 6= i, s.t.
pi1 →ξ pi2, then there exists some shared register j such that pi performs an rr-ω(r)[∗, commit]pi,j during pi1,
and pz performs an rr-ω(r′)[∗, commit]pz ,j during pi2, and r′ > r.
Now we prove our main result stating that a weakly coverable register cannot be implemented with ranked
registers as those were defined in [5].
Theorem 10 There is no algorithm that implements a weakly coverable register using a set of ranked registers.
Proof. The theorem follows from Lemmas 7, 8, and 9, and the fact that a ranked register allows a write operation
to commit even if it uses a rank smaller than the highest rank of the register. As by Lemma 7 a successful write
must commit, then by ranked registers it can commit with a rank smaller than the highest rank of the accessed
6
register. This, however, by Lemma 9 may lead to violation of the consolidation and continuity properties and
thus violation of weak coverability. 
Observe that the key fact that makes ranked registers weaker than weakly coverable registers is that the
former allow write operations to commit even if their ranks are out of order. In particular, note that the Non-
Triviality property does not force a write operation invoked with a rank r1 to abort, even if there exists a
completed prior operation with rank r2 > r1. As shown in [5] non-fault-tolerant ranked registers may preserve
the total order of the ranks, and thus be used to implement consensus. As we show in Appendix B such ranked
registers (i.e., that implement consensus) could be used to implement strongly coverable registers.
5 Applications of Weakly Coverable Atomic Read/Write Registers
Weak RMW registers. A shared object satisfies atomic read-modify-write (RMW) semantics if a process can
atomically read and modify the value of the object using some function F , and then write the new value on the
object. Weakly coverable atomic R/W registers can be used to implement a weak version of RMW semantics.
In a weak RMW object not all operations may successfully modify the value of the object. In case that a RMW
operation is not concurrent with any other operation then this operation satisfies the RMW semantics. In case
where two or more operations invoke RMW concurrently, at least one of them will satisfy the RMW semantics.
Finally, weak RMW allow multiple RMW operations to modify successfully the same value.
Figure 2 presents an implementation of a weak RMW object using weakly coverable atomic R/W registers.
We assume that the object offers a rmw(F) action that accepts a function and tries to apply that function on the
value of the object. The object returns the initial value of the object and a flag indicating whether the value of
the object was modified successfully.
At each process i ∈ I
Local Variables: lcver ∈ V ersions, oldval, lcval, newv ∈ V alues, flag ∈ {chg, unchg}
function RMW(F)
〈oldval, lcver〉 ← cvr-read()
newv ← F(oldval)
〈lcval, lcver, flag〉 ← cvr-write(lcver, newv)
if flag == chg then return 〈lcval, success〉
else return 〈lcval, fail〉
Figure 2: Weak RMW using Weakly Coverable Atomic R/W Registers
Theorem 11 The construction in Figure 2 implements a weak RMW object.
Proof. Consider an execution ξ of the algorithm. We begin the proof by studying the case where an operation
rmw(F) is not concurrent with any other operation in ξ. The atomic nature of the register ensures that cvr-read
returns the latest value and version, say 〈ver, val〉, written on the register. When the cvr-write operation is
invoked, the write operation tries to modify the value associated with version ver. As there is no concurrent
operation, the version of the register remains ver and thus according to consolidation and continuity, the write
operation successfully writes the new value completing the RMW operation.
Consider now the case of two operations, pi1 and pi2, invoking rmw concurrently. Each of these operations
involve a cvr-read followed by a cvr-write operation. Let ρpii (resp. ωpii) denote the read (resp. write) operation
invoked during pii, for i ∈ [1, 2]. We have the following cases wrt the order of these operations: (i) ωpi1 → ρpi2 ,
(ii) ωpi2 → ρpi1 , (iii) ρpi2 → ωpi1 → ωpi2 , (iv) ρpi1 → ωpi2 → ωpi1 , or (v) ωpi1 is concurrent with ωpi2 . In
case (i), both read and write operations of pi1 complete before the read and write operations of pi2 are invoked.
In this case notice that the version of the object remains the same from the read to the write operation of both
operations. Thus, according to consolidation and continuity, both write operations will successfully change the
value of the register. The same holds for case (ii), where pi2’s ops complete before the invocation of pi1’s ops.
In case (iii) the write operation of pi1 completes before the write operation of pi2. Let ρpi2 in this case complete
7
At each process i ∈ I
Local Variables: lcver ∈ V ersions, initially ver0
lcval, newv ∈ V alues, initially ⊥ flag ∈ {chg, unchg}, ini-
tially chg
function REVISE(v, ver)
〈lcval, lcver, flag〉 ← cvr-write(ver, v)
if flag == chg then
return OK
return 〈lcval, lcver〉
function GET()
〈lcval, lcver〉 ← cvr-read()
return 〈lcval, lcver〉
Figure 3: File Object using Weakly Coverable Atomic R/W Registers
before ωpi1 . Both read operations ρpi1 and ρpi2 discover by atomicity the same version, say ver. So both write
operations will be invoked as cvr-write(ver, v). Since no operation changes the version of the register before
ωpi1 is invoked, then by consolidation and continuity, ωpi1 changes the version of the object to, say, verpi1 .
Notice that by validity, verpi1 > ver. When ωpi2 is invoked it fails by consolidation to change the value of the
object as ωpi1 → ωpi2 and it tries to change the version ver < verpi1 (the version of ωpi1). Hence, only pi1 will
manage to preserve RMW semantics. Similarly, we can show that only pi2 will preserve RMW semantics in
case (iv). Finally, in case (v) if both writes try to change the version ver, both may succeed and preserve RMW
semantics. Since, however, their versions are unique and comparable, then by consolidation any subsequent
operation will RMW the highest of the two versions. So in all cases at least a single operation satisfies the
RMW semantics, as desired. 
From the proof we can extract that weakly coverable registers may allow multiple writes to change the
same version of the register, but consolidation ensures that at least one write satisfies RMW semantics for each
version. Finally, consolidation and continuity ensure that eventually RMW operations diverge in a single path
in the constructed tree.
Concurrent File Objects A file object can be implemented directly using RMW semantics since one can
retrieve, revise, and write back the new version of the file. As RMW semantics can be used to solve consensus
[13], they are impossible to be implemented in an asynchronous system with a single crash failure. Therefore,
we consider file objects that comply to the weak RMW semantics as those were given in the paragraph above.
In particular, we consider concurrent file objects that allow two fundamental operations, revise and get to be
invoked concurrently by multiple processes. The revise operation is used to change the contents of the file
object, whereas the get action is analogous to a read operation and facilitates the retrieval of the contents of the
file. Semantically, a file object requires that a revise operation is applied on the latest version of the file and a
get operation returns the file associated with the latest written version. Depending on the implementation, the
values written and returned by these operations can be the complete file object, a fragment of the file object, or
just the journal containing the operations to be applied on a file (similar to a journaled file system).
Figure 3 presets the algorithm that implements the two operations. The revise operation specifies the version
of the file to be revised along with the new value of the shared object. The cvr-write operation attempts to
perform the write with the given version and returns the value and version of the register, and whether the write
succeeded or not. If the write succeeded then the operation informs the application for the proper completion
of the revise operation; otherwise the latest discovered value-version pair is returned. From Theorem 11 and
Figure 3 we may conclude the following theorem.
Theorem 12 The construction in Figure 3 implements a file object.
6 Implementing Weakly Coverable Atomic Read/Write Registers
We now show how we can implement weak coverable atomic registers. We do so by enhancing the Multi-Writer
version of algorithm ABD [2, 19] (referred as MWABD) to preserve the properties of weak coverability. The
presented technique can be applied to implementations of atomic R/W objects that utilize a 〈tag, value〉 pair
to order the write operations and where each write performs two phases before completing: a query phase to
obtain the latest value of the atomic object and a propagation phase to write the new value on the object. We
8
cvr-write(val, ver = maxtag)
query-phase: Send query request to all the replicas and wait to receive (tag, value) responses from a majority of them. Select the
(tag, value) among the collected replies with the largest tag; let the 〈τ, v〉 be this pair and the integer component of τ be z. Then:
– If ver == τ then: Create a new tag τnew = 〈z+1, wid〉wherewid is the unique identifier of the writer and set valnew = val.
– If ver 6= τ then: Set 〈τnew, valnew〉 = 〈τ, v〉.
propagation-phase: Send 〈τnew, valnew〉 to all the replicas and wait to receive responses from a majority of them. if ver == τ
then respond with 〈valnew, τnew, chg〉, otherwise respond with 〈valnew, τnew, unchg〉 to the process.
cvr-read()
query-phase: Send query request to all the replicas and wait to receive (tag, value) responses from a majority of them. Select the
(tag, value) among the collected replies with the largest tag; let the 〈τ, v〉 be this pair and the integer component of τ be z.
propagation-phase: Send 〈τ, v〉 to all replicas and wait for responses from a majority of them. Respond with 〈v, τ〉 to the process.
at-replica
On receipt of query message: Send the tag-value pair 〈τr, vr〉 stored locally.
On receipt of propagation message: Let 〈τm, vm〉 be the tag-value pair enclosed in the received message and 〈τr, vr〉 the local pair
on the replica. Compare the tags τm and τr . If τm > τr then store 〈τm, vm〉 locally. Reply with “ack”.
Figure 4: The operations of algorithm vMWABD.
could also adopt implementations of stronger objects like the ones presented in [3–6] but we preferred to show
the simplest modification in a fundamental algorithm. To capture the semantics of a coverable atomic register
we modify the operations of algorithm MWABD to comply with the versioned variant of the R/W register. We
use cvr-write(ver, v) and cvr-read() as the write and read operations respectively. A cvr-write(ver, v) operation
may impact differently the state of the object, depending on the version of the shared object: it may appear as
a read not modifying the value nor the version of the register or as a write changing both the value and the
version of the register.
In brief, the original MWABD replicates an object to a set of hosts S ⊂ I and it uses 〈tag, value〉 pairs
to order the read and write operations. A tag consists of a non-negative integer number and a writer identifier
which is used to break the ties among concurrent write operations. Both the read and write protocols have
two phases: a query and a propagation phase. During the query phase the invoking process broadcasts a
query message to all the replica hosts (replicas) and waits for a majority of them to reply with their tag-value
pairs. Once those replies are received the process discovers the largest tag-value pair among the replies. In the
second phase, a read operation propagates the discovered tag-value pair to the majority of the replicas. A write
operation increments the largest tag, associates the new tag with the value to be written, and propagates the new
tag-value pair to the majority of the replicas.
In the versioned MWABD, vMWABD for short, we use the tags associated with each value to denote the
version of the register. The pseudocode of each operation of vMWABD is described in Figure 4. The cvr-read
operation is similar to the read operation of MWABD with the difference that it returns both the value and the
version of the register. A cvr-write operation differs from the original write by utilizing a condition before its
propagation phase and depending whether the condition holds it changes the state of the register (value and
version) or not, as detailed in Figure 4. Note that the version parameter of the write operation is equal to the
maximum tag that the invoking process witnessed.
Theorem 13 Algorithm vMWABD implements weak coverable atomic registers.
Proof. It is clear that vMWABD still satisfies properties A1-A3. Any write operation that is not successful can
be mapped to a read operation that performs two phases and propagates the latest value/version of the register
to a majority of replicas before completing. It remains to show that vMWABD also satisfies the properties of
validity and weak coverability.
Validity is satisfied since each tag is unique, as it is composed by an integer and the id of a process. The
tag is monotonically incrementing at each replica, as according to the algorithm a replica updates its local copy
only if a higher tag is received. A writer process discovers the maximum tag maxtag among the replicas
9
and in the second phase it generates a tag 〈maxtag + 1, wid〉. As the tag at each replica is monotonically
incrementing then each writer never generates the same tag twice. Also, for every write cvr-ω(tag)[tag′, chg],
tag′ = 〈tag.ts + 1, wid〉 ⇒ tag′ > tag. Finally, since every tag is generated by extending the initial tag and
each write operation extends a tag that obtains during its query phase then there is a sequence of tags leading
from the initial tag to the tag used by the write operation.
For consolidation we need to show that for two write operations ω1 = cvr-ω(∗)[tag1, chg] and ω2 =
cvr-ω(tag2)[∗, chg], if ω1 →ξ ω2 then tag1 ≤ tag2. According to the algorithm ω1 propagates tag1 to the
majority of replicas before completing. In the query phase, ω2 receives messages from the majority of replicas.
So there is one replica s that received tag1 from ω1 before replying to ω2. Since the tag in s is monotonically
incrementing, then s replies to ω2 with a tag tags ≥ tag1. So ω2 receives a maxtag ≥ tag1. Since ω2 also
changes the value and version of the register it means that its local tag tag2 is equal to maxtag. This shows
immediately that tag2 ≥ tag1, completing the proof.
Continuity is preserved as a write operation first queries the replicas for the latest tag before proceeding
to the propagation phase to write a new value. Since the tags are generated and propagated only by write
operations then if a write changes the value of the system then it appends a tag already written, or the initial tag
of the register.
Finally, to show that evolution is preserved, we observe that the version of a register is given by its tag,
where tags are compared lexicographically (first the number tag.ts and then the writer identifier to break ties).
A successful write pi1 = cvr-ω(tag)[tag′] generates a new tag tag′ from tag such that tag′.ts = tag.ts + 1.
Consider sequences of tags tag1, tag2, . . . , tagk and tag′1, tag′2, . . . , tag′` such that tag1 = tag
′
1. Assume that
cvr-ω(tagi)[tagi+1], for 1 ≤ i < k, and cvr-ω(tag′i)[tag′i+1], for 1 ≤ i < `, are successful write operations. If
tag1.ts = tag
′
1.ts = z, then tagk.ts = z + k and tag
′
`.ts = z + `, and if k < ` then tagk < tag
′
`. 
Supporting Large Versioned Objects. Fan and Lynch [9], using algorithm MWABD as a building block,
showed how large atomic R/W objects can be efficiently replicated. The main idea of their algorithm, called
LDR, is to have two distinguished sets of servers: Replicas and Directories. Replica servers are the ones that
actually store the object’s data (value), while Directories keep track of the tags of the object and the associated
Replicas that store the data of the object. A reader or writer first runs algorithm MWABD on the Directories
to obtain the highest tag of the object, and the identity of the Replicas that have the associated value (aka, the
most recent value of the object). A read operation, then contacts a subset of the Replicas to obtain the value
of the object. A write sends the new value to a majority of the Replicas, while ensuring that Directories are
updated (see [9] for details). By replacing algorithm MWABD with algorithm vMWABD and performing a few
modifications to the Replicas, we can turn algorithm LDR into an algorithm that can handle large versioned
R/W objects, such as large files. See Appendix C for the modified LDR.
7 Conclusion
In this paper we have introduced versioned registers and a new property for concurrent versioned registers, we
call coverability. A versioned register associates a version with its value, and with each operation that wants
to modify its value. An operation may modify the value and the version of the register, or it may just retrieve
its value-version pair. Coverability defines the exact guarantees that a versioned register provides when it is
accessed concurrently by multiple processes with respect to the evolution of its versions, over a total order of
its operations. We introduce two levels of coverability: strong and weak. Strong coverability requires that only
a single operation modifies each version of the register, whereas weak coverability is more relaxed allowing
multiple concurrent operations to modify the same version.
We combine coverability with atomicity to obtain (strongly/weakly) coverable atomic registers. The suc-
cessful writes on the register follow the total order of atomicity, while preserving the properties required by
coverability. We note that a different total ordering could be used with coverability to obtain other types of
“coverable objects”. In fact, we believe it would be interesting to investigate further the use of coverable ob-
jects for the introduction of distributed algorithms for various applications. The fact that each operation is
enhanced by the version of the object provides the flexibility to manipulate the effect of an operation under
some conditions on the version of the object with respect to the version of the operation.
10
References
[1] AGUILERA, M. K., AND HORN, S. L. Abortable and query-abortable objects and their efficient imple-
mentation. In In Proceedings of the 26th Symposium on Principles of Distributed Computing (2007).
[2] ATTIYA, H., BAR-NOY, A., AND DOLEV, D. Sharing memory robustly in message passing systems.
Journal of the ACM 42(1) (1996), 124–142.
[3] BOICHAT, R., DUTTA, P., FRØLUND, S., AND GUERRAOUI, R. Deconstructing paxos. SIGACT News
34, 1 (Mar. 2003), 47–67.
[4] CHOCKLER, G., DOBRE, D., AND SHRAER, A. Brief announcement: Consistency and complexity
tradeoffs for highly-available multi-cloud store. In The International Symposium on Distributed Comput-
ing (DISC) (2013).
[5] CHOCKLER, G., AND MALKHI, D. Active disk paxos with infinitely many processes. Distributed
Computing 18, 1 (2005), 73–84.
[6] DOBRE, D., VIOTTI, P., AND VUKOLIC´, M. Hybris: Robust hybrid cloud storage. In Proceedings of the
ACM Symposium on Cloud Computing (New York, NY, USA, 2014), SOCC ’14, ACM, pp. 12:1–12:14.
[7] DUTTA, P., GUERRAOUI, R., LEVY, R. R., AND CHAKRABORTY, A. How fast can a distributed atomic
read be? In Proceedings of the 23rd ACM symposium on Principles of Distributed Computing (PODC)
(2004), pp. 236–245.
[8] ENGLERT, B., GEORGIOU, C., MUSIAL, P. M., NICOLAOU, N., AND SHVARTSMAN, A. A. On the
efficiency of atomic multi-reader, multi-writer distributed memory. In Proceedings 13th International
Conference On Principle Of DIstributed Systems (OPODIS 09) (2009), pp. 240–254.
[9] FAN, R., AND LYNCH, N. Efficient replication of large data objects. In Distributed algorithms (Oct
2003), F. E. Fich, Ed., vol. 2848/2003 of Lecture Notes in Computer Science, pp. 75–91.
[10] FISCHER, M. J., LYNCH, N. A., AND PATERSON, M. S. Impossibility of distributed consensus with one
faulty process. Journal of ACM 32, 2 (1985), 374–382.
[11] GEORGIOU, C., NICOLAOU, N. C., AND SHVARTSMAN, A. A. On the robustness of (semi) fast quorum-
based implementations of atomic shared memory. In DISC ’08: Proceedings of the 22nd international
symposium on Distributed Computing (Berlin, Heidelberg, 2008), Springer-Verlag, pp. 289–304.
[12] GEORGIOU, C., NICOLAOU, N. C., AND SHVARTSMAN, A. A. Fault-tolerant semifast implementations
of atomic read/write registers. Journal of Parallel and Distributed Computing 69, 1 (2009), 62–79.
[13] HERLIHY, M. Wait-free synchronization. ACM Trans. Program. Lang. Syst. 13, 1 (1991), 124–149.
[14] HERLIHY, M. P., AND WING, J. M. Linearizability: a correctness condition for concurrent objects. ACM
Transactions on Programming Languages and Systems (TOPLAS) 12, 3 (1990), 463–492.
[15] LAMPORT, L. How to make a multiprocessor computer that correctly executes multiprocess progranm.
IEEE Trans. Comput. 28, 9 (1979), 690–691.
[16] LAMPORT, L. On interprocess communication, part I: Basic formalism. Distributed Computing 1, 2
(1986), 77–85.
[17] LYNCH, N. Distributed Algorithms. Morgan Kaufmann Publishers, 1996.
[18] LYNCH, N., AND TUTTLE, M. An introduction to input/output automata. CWI-Quarterly (1989), 219–
246.
11
[19] LYNCH, N. A., AND SHVARTSMAN, A. A. Robust emulation of shared memory using dynamic quorum-
acknowledged broadcasts. In Proceedings of Symposium on Fault-Tolerant Computing (1997), pp. 272–
281.
[20] MAZIE`RES, D., AND SHASHA, D. Building secure file systems out of byzantine storage. In Proceedings
of the Twenty-first Annual Symposium on Principles of Distributed Computing (New York, NY, USA,
2002), PODC ’02, ACM, pp. 108–117.
12
Appendix
A Impossibility of Implementing Weakly CoVerable Registers using Ranked
Registers
In this section we provide the proofs to the lemmas presented in Section 4. Before proceeding to the proofs let
us introduce some notation we use throughout this section.
Let R be a set of ranked registers. Let Rx ⊆ R denote the set of ranked registers on which a process pi
performs a rr-ω(r)[∗, ∗]pi,∗ during a coverable write operation pix in an execution ξ. Rx = cRx ∪ aRx, where
cRx is the set of ranked register such that pi performs a rr-ω(r)[∗, commit]pi,∗ that commits during pix, and
aRx the set of ranked register such that pi performs a rr-ω(r)[∗, abort]pi,∗ that aborts during pix. For any pair
of write operations pix →ξ piy, let the set Rx,y = Rx ∩ Ry be the set of ranked registers such that both pix and
piy perform a ranked write. We finally denote by cRx,y = cRx ∩Ry and aRx,y = aRx ∩Ry, the set of registers
where pi committed (or aborted resp.) during pix and they were also written during operation piy.
Proof of Lemma 7. Let the weakly coverable register be implemented by k ranked registers each with a
highest rank r1, r2, . . . , rk respectively at the end of some execution fragment ξ. For the rest of the proof we
will construct extensions of ξ. Also, let the state of the coverable object be (v, ver) at the end of ξ.
Assume to derive contradiction that we extend ξ with a write operation ω1 = cvr-ω(ver)[ver′, chg]pi that
revises the coverable register, and all the write operations performed during ω1 on the ranked registers abort.
From that it follows that for each write operation rr-ω(r)[rj , abort]pi,j performed by pi on some register j,
rj > r. Let the new execution be ξ′.
We extend ξ′ with another write operation ω2 = cvr-ω(ver′′)[ver′′′, chg]pz by process pz to obtain exe-
cution ξ′′. Since ω1 →ξ ω2 then by consolidation, ver′′ ≥ ver′, and ω1 <ξ ω2. Moreover, since ω1 is not
concurrent with any other operation, then by consolidation ver′ is the largest version introduced in ξ. Since, by
continuity, ver′′ has to be equal to a version introduced by a preceding operation, and since ver′′ ≥ ver′ (the
largest version introduced), then ω2 revises ver′′ = ver′ to a newer version ver′′′. Note however that for any
write operation rr-ω(r′)[rj , ∗]pz ,j performed on any of the ranked registers, for 1 ≤ j ≤ k, the highest rank for
j at the time of the write was rj .
Finally consider the execution ∆ξ′′ that is similar to ξ′′ without containing ω1. In other words ∆ξ′′ extends
ξ with the write operation ω2. Observe that any write operation rr-ω(r′)[rj , ∗]pz ,j performed by pz on ranked
register j during ω2 observes a highest rank rj as in ξ′′. So if such a write committed (or aborted) in ξ′′ will
also commit (or abort) in ∆ξ′′ as well. Therefore, since ω2 revised the value of the coverable register in ξ′′ will
revise the value of the coverable register in ∆ξ′′ as well. However, the last proceeding write operation is of
the form cvr-ω(∗)[ver, ∗] for ver 6= ver′. Thus ∆ξ′′ violates the continuity property and hence contradicts our
initial assumption. 
Proof of Lemma 8. We will assume to derive contradiction that pi2 does not write on any ranked register that
pi wrote (and committed) during pi1. More formally, let cR1 be the set of ranked registers s.t. for all j ∈ cR1,
rr-ω(r)[∗, commit]pi,j for some rank r during pi1. Let R2 be the set of ranked registers s.t. for all q ∈ R2,
pz invokes rr-ω(r′)[∗, ∗]pz ,q during pi2. Note that since pi1 revises the version of the object, then according to
Lemma 7, |cR1| ≥ 1. According to our assumption cR1 ∩R2 = ∅.
Let us now construct an execution that contains the two operations pi1 and pi2. Consider an execution
fragment ξ that ends with a state associated with a version ver. Let us assume that there exists an algorithm
A that uses k ranked registers each with a highest rank r1, r2, . . . , rk respectively at the end of ξ. We extend ξ
with operation pi1 and obtain ξ1. Since pi1 changes the version of the object, by Lemma 7, there exists a ranked
register j ∈ cR1 such that, pi invokes an operation that commits on j, rr-ω(r)[∗, commit]pi,j , during pi1.
Next we extend ξ1 by pi2 and obtain ξ2. Since according to our assumption, cR1 ∩R2 = ∅, then it must be
the case that the highest rank observed by pi2 in any j ∈ R2 is rj , i.e. the highest rank of j at the end of ξ. So
it returns either rj or r′ the rank used by pz . That includes also the ranked registers that pi tried to modify and
aborted during pi1.
13
Consider now the execution ∆ξ2 which is similar to ξ2, without operation pi1. In particular, ∆ξ2 is obtained
by extending ξ with pi2. Notice that since pz does not communicate with pi, then pz appears in the same state
in both ξ2 and ∆ξ2 before invoking pi2. Thus, pz attempts to write on the same set of ranked registers R2 in
both executions. Since ξ is extended by pi2 alone, then any write operation on the ranked registers j ∈ R2 is
rj (as in ξ2). So pi2 cannot distinguish ξ2 from ∆ξ2 and thus revises ver1 in ∆ξ2 as well. However, ∆ξ2 does
not contain a cvr-ω(∗)[ver1, chg] operation, therefore pi2 violates the continuity property of weak coverability.
This contradicts our assumption. 
Proof of Lemma 9. Consider again an execution fragment ξ that ends with a state associated with a version
ver. Let us assume that there exists an algorithm A that uses a set |R| = k of ranked registers each with a
highest rank r1, r2, . . . , rk respectively at the end of ξ. We know by Lemma 7, that each operation pix that
changes the version of the weakly coverable register performs a write that commits on at least a single ranked
register in R.
We extend ξ with the following non-concurrent operations (listed in the order they take place) to obtain
execution ξ1:
• operation pi1 = pi(ver)[ver1, chg]p1
• operation pi2 = pi(ver1)[ver2, chg]p2
By Lemma 8, cR1∩R2 6= ∅. Assume to derive contradiction that ∀j ∈ cR1∩R2, p1 performs a committed
write with a rank rpi1 > rj and p2 performs a write with a rank rpi2 < rpi1 (that may commit or not). Since
according to our assumption, ∀j ∈ cR1 ∩ R2, the rank of p2 has to be smaller than the rank used by p1, we
assume w.l.o.g. that p2 uses the same rank r2 for all the ranked writes.
By the order of operations in ξ1 it follows that for all j ∈ cR1 ∩ R2, rr-ω(rpi1)[rj , commit]p1,j appears
before rr-ω(rpi2)[r1, ∗]p2,j in ξ1. Moreover, observe that, by Definition 6, for each register i ∈ R1 − cR1,
ri > rpi1 since the write from pi1 aborted. Since pi2 changes the version of the weakly coverable register, then
by Lemma 7, cR2 6= ∅. Notice that, even though we assume that r2 < r1, the operations in ξ1 may commit
without violating the ranked register properties of Definition 6 (as a write operation with a smaller rank does
not have to abort). In order to preserve weak coverability, pi2 changes the version ver1 to ver2.
Consider now the execution ∆ξ1 that contains the same operations but with pi1 and pi2 in reverse order. In
particular ∆ξ1 extends ξ with operations:
• operation pi2 = pi(ver′1)[ver′2, chg]p2
• operation pi1 = pi(ver)[ver1, chg]p1
Since there is no communication assumed between the processes then pi2 uses rank rpi2 in ∆ξ1 as well. It is easy
to see that for any register i ∈ R2 − R1, pi2 observes the same highest rank ri in both executions ξ1 and ∆ξ1.
So if the rank write of pi2 on those registers commits in ξ1 then it also commits in ∆ξ1. So the only registers
that may allow pi2 to differentiate between the two executions are the ones in the intersection cR1 ∩R2. There
are two cases to consider: (i) ∀j ∈ cR1 ∩R2, rj > rpi2 , and (ii) ∃j ∈ cR1 ∩R2, and rj ≤ rpi2 .
Case (i): In case (i), pi2 witnesses a higher rank from all the registers in cR1 ∩R2 as in ξ1. So for each register
j ∈ cR1 ∩ R2, if rr-ω(rpi2)[rj , ∗]p2,j committed in ξ1 then the write commits in ∆ξ1 as well. Thus, pi2 will
not be able to distinguish the two operations and it extends ver′1 = ver1 in ∆ξ1 as well. However, ver1 is not
written in ∆ξ1, thus pi2 violates continuity property and contradicts our assumption.
Case (ii): So it remains to examine the second case were ∃j ∈ cR1 ∩ R2, and rj ≤ rpi2 . In this case
rr-ω(rpi2)[rj , ∗]p2,j has to commit in ∆ξ1. If the same operation committed in ξ1 as well then pi2 cannot
distinguish the two executions and thus violates coverability as shown before. Let us assume that pi2 did not
commit in ξ1. Hence, pi2 distinguishes ∆ξ1 from ξ1. To preserve weak coverability, pi2 has to extend version
ver′1 = ver to a version ver′2 > ver. At the end of pi2 the highest rank of rj = rpi2 . When pi1 is invoked it
performs rank writes using rank rpi1 , since there is no communication between the processes. Since, according
14
to our assumption rpi2 < rpi1 , it follows that rr-ω(rpi1)[∗, ∗]p1,j commits in both ξ1 and ∆ξ1. Moreover, since
for all the rest registers i ∈ cR1, ri > rpi2 , pi1 will witness the same highest rank ri from each of those registers,
in both executions. Thus, all the write operations on those registers pi1 will commit on all those registers, and
thus, pi1 will not be able to distinguish ∆ξ1 from ξ1. Since, however ,it extended ver in ξ1, then it extends ver
in ∆ξ1 as well. However, as pi2 →ξ pi1, then by consolidation, pi needs to extend a version larger or equal to
ver′1. Since ver < ver′1 then consolidation is violated. And this completes the proof. 
B Strong Coverability vs Consensus
Consensus [17] is defined as the problem where a set of fail-prone processes try to agree on a single value
for an object. A consensus protocol must specify two operations: (i) propose(v)pi , used by the process pi to
propose a value v for the object, and (ii) decide()pi , used by the process pi to decide the value of the object.
Any implementation of consensus must satisfy the following three properties:
• (1) CTermination: Every correct process decides a value;
• (2) CValidity: Every correct process decides at most one value, and if it decides some value v, then v must
have been proposed by some process;
• (3) CAgreement: All correct process must decide the same value.
We show that a strongly coverable atomic register is equivalent to a consensus object. To support this
statement we first present an implementation of a consensus object using a strongly coverable register, and then
we describe an implementation of a strongly coverable register assuming the existence of a consensus object. In
the implementation of consensus that follows we assume that all the processes propose a value and they decide
by the end of the propose operation. Thus we combine the two actions in one operation. Figure 5 presents the
pseudocode of the implementation of a consensus object using a strongly coverable atomic register.
At each process i ∈ I
Local Variables: lcver ∈ V ersions, lcval ∈ V alues, flag ∈ {chg, unchg}
function PROPOSE(v)
lcval← v
(lcval, lcver, flag)← cvr-write(lcval, ver0)
return lcval
Figure 5: Consensus using Strongly Coverable Atomic Registers
We assume that ver0 is the initial version of the coverable register. When each process begins executing
the algorithm it issues a write operation trying to revise ver0 and propose its own local value as the value to
be decided. According to strong coverability only a single write operation cvr-write(v, ver0)(v, ver1, chg) is
going to succeed proposing its value, say v, and change the version of the register from ver0 to some version
ver1. All the rest of the write operations will be of the form cvr-write(v′, ver0)(v, ver1, unchg) and thus will
fail to change the value and version of the register. The write operation will return (lcval, lcver, flag) =
(v, ver1, unchg) no matter what value they tried to propose, and each will be able to agree on value lcval = v
reaching this way agreement. This discussion yields the following theorem.
Theorem 14 The construction in Figure 5 implements a consensus object.
Figure 6 shows the implementation of a strongly coverable atomic register using a consensus object. For our
implementation of consensus we assume that the consensus oracle runs a separate instance of consensus on each
version of the object. Thus, the oracle accepts as inputs the version we want to revise as well as the 〈v, ver′〉
tuple that consists of the value we propose. When that value is not specified, the oracle returns the tuple decided
on the instance associated with the given version. If no consensus was reached for a given version then the
15
oracle returns the tuple 〈⊥,⊥〉. To generate a new version a process calls the function generate-version(ver).
This procedure produces a unique version larger than any previous version, each time is executed. A trivial
implementation of this function is to append the given version with the unique id of the invoking process.
At each process i ∈ I
Local Variables: lcver, vernew ∈ V ersions initially ver0; lcval ∈ V alues; P ∈ V alues× V ersions
function cvr-write(v, ver)
vernew ← generate-version(ver)
P ← propose(ver, 〈v, vernew〉)
if P.ver == vernew then
lcver ← vernew
return 〈P, chg〉
else
while P.ver 6= ⊥ do
〈lcval, lcver〉 ← 〈P.val, P.ver〉
P ← propose(lcver,⊥)
end while
return 〈lcval, lcver, unchg〉
function cvr-read()
P ← propose(⊥, lcver)
while P.ver 6= ⊥ do
〈lcval, lcver〉 ← 〈P.val, P.ver〉
P ← propose(⊥, lcver)
end while
return 〈lcval, lcver〉
Figure 6: Strongly Coverable Atomic Registers using Consensus
Theorem 15 The construction in Figure 6 implements a strongly coverable atomic register.
Proof. We show that the algorithm satisfies two properties: (i) strong coverability and (ii) atomicity.
Strong coverability requires that only a single write operation changes each version of the register. Let
us assume to derive contradiction that there exists a version ver of the object s.t. two operations pi1 =
cvr-write(v, ver)(v, ver1, chg) and pi2 = cvr-write(v′, ver)(v, ver2, chg) both revise ver leading to two po-
tentially different versions ver1 and ver2. For this to be possible it means that P.ver = ver1 for pi1 and
P.ver = ver2 for pi2. P however is the value decided by the consensus oracle. Since both pi1 and pi2 revise the
same version ver then they both invoked the consensus oracle on the same instance of the version ver. Since
the consensus oracle reaches agreement on a single value then it must be the case that P is the same for both pi1
and pi2, and hence P.ver = ver1 = ver2. This however contradicts our assumption. Thus, only a single write
operation is able to modify each version and this preserves strong coverability.
Atomicity is trivially preserved by the write operations as they follow the total order imposed by the versions
they change. Read operations are ordered in terms of the write operations since they invoke the consensus
oracle until they reach the latest version of the object. A read operation ρ1 does not return an older value than a
preceding read ρ2, since ρ2 would reach an earlier or at most the same version as ρ1 before completing. Thus,
ρ2 will return the same or an older value as desired. Finally, a write operation that does not change the version
of the register it must be ordered with respect to the rest of the read operations. Such write also discovers the
latest accepted version and thus, as before, it will return the same or a newer value than the one returned by a
preceding read or unsuccessful write operation. 
16
C Supporting Large Files
Figure 7 depicts a modified version of the LDR algorithm [9], that implements versioned large objects.
tr-write(val, ver = maxtag)
get-metadata: Send query request to directory servers and wait for (tag, location) responses from a majority of them. Select the
(tag, location) among the collected replies with the largest tag; let 〈τ,S〉 be this pair and the integer component of τ be z. Then:
If τ 6= ver then do the following:
put-metadata: Send 〈τ,S〉 to the directory servers and wait for a majority of them to reply. Once those replies are received
set 〈τnew,Snew〉 = 〈τ,S〉.
get: Send get object request to f + 1 replica servers in S for the τ version of the object and wait for a single server to reply
with x. Return 〈x, τ, unchg〉.
If τ = ver then do the following:
put: Create a new tag τnew = 〈z+1, wid〉 where wid is the unique identifier of the writer. Send 〈τnew, val〉 to 2f +1 replica
servers and wait for f + 1 replies. Collect the identifiers of the servers that replied in a set Snew.
put-metadata: Send 〈τnew,Snew〉 to all the directory servers and wait for the majority of them to reply. Return
〈val, τnew, chg〉.
tr-read()
get-metadata: Send query request to directory servers and wait for (tag, location) responses from a majority of them. Select the
(tag, location) among the collected replies with the largest tag; let 〈τ,S〉 be this pair and the integer component of τ be z.
put-metadata: Send 〈τ,S〉 to the directory servers and wait for a majority of them to reply
get: Send get object request to f + 1 replica servers in S for the τ version of the object and wait for a single server to reply with
x. Return 〈x, τ〉.
directory-server
On receipt of get-metadata message: Send the tag-locations pair 〈τs,S〉 stored locally.
On receipt of put-metadata message: Let 〈τm,Sm〉 be the tag-location pair enclosed in the received message and 〈τs,S〉 the local
pair on the server. Compare the tags τm and τs. If τm > τs and |Sm| ≥ f + 1 then store 〈τm,Sm〉 locally.
replica-server
On receipt of put message: Add the 〈τm, value〉 pair enclosed in the message to the local set of available pairs and send an
acknowledgement.
On receipt of get message: If the value associated with the requested tag is in the set of pairs stored locally, respond with the value.
Otherwise ignore the message.
Figure 7: Operations of the modified LDR algorithm
17
