How to construct shared data objects is a fundamental issue in asynchronous concurrent systems, since these objects provide the means for communication and synchronization between processes in these systems. Constructions which guarantee that concurrent access to the shared object by processes is free from waiting are of particular interest, since they may help to increase the amount of parallelism in such systems. The problem of constructing a k-valued wait-free shared register out of binary subregisters of the same type where each write access consists of one subwrite (constructions with one-write) has received some attention, since it lies at the heart of studying lower bounds of the complexities of register constructions and trade-offs between them. The first such construction was for the safe register case; it uses k binary safe registers and exploits the properties of a rainbow coloring function of a hypercube. The best known construction for the regular/atomic case uses $\binom{k}{2}$ binary regular/atomic registers. In this work we show how the rainbow coloring function can be extended to simulate a handshaking mechanism between the reader and the writer of the register, thus offering a solution for the atomic register case with one reader, which uses only $3k - 2$ binary registers. The lower bound for such a construction is $k - 1$.
Introduction
In all forms of communication in a concurrent system the problem of sharing data between multiple processes must be faced at some level. The traditional way to share data among processes which either read or write them is to require that a write have exclusive access to the data, thus making only concurrent reading possible [5]. The requirement that some actions happen in an exclusive manner implies waiting by some process for another. In an asynchronous system, where some processors may be inherently faster than others, this approach would slow a fast process down to the speed of a slow one. A natural property to require from an implementation of a shared data object in an asynchronous concurrent system is therefore that any process can complete any access to the object in a finite number of steps, regardless of the execution speeds of the other processes. Such an implementation is called wait-free. Wait-free shared data objects not only help in taking advantage of the inherent parallelism in concurrent systems, but also guarantee resiliency to halting failures, since a process that crashes while accessing the object cannot block the progress of any other process intending to access the same object.
A shared variable that supports concurrent read and write operations by a number of processes in a wait-free manner is also called a wait-free shared register; from now on we adopt the convention of calling it a register. Registers can be classified according to the strength of the consistency guarantees they provide in the presence of concurrent operations. Three kinds of consistency guarantees, namely safeness, regularity and atomicity, have been defined by Lamport in [14] and have become of fundamental importance in the study of shared registers. According to those definitions:
(i) A register is called safe if it guarantees only that a read operation which does not happen concurrently with any write always returns the most recent value written to the register. The safeness property ensures nothing for the value returned by a read which overlaps with writes; this value may equal any possible value of the register.
(ii) A register is called regular if, besides ensuring safeness, it guarantees that a read that happens concurrently with one or more writes returns a "reasonable" value, which might be either the old one or one of the values written by one of the overlapping writes.
(iii) A register is called atomic if, besides ensuring safeness, it guarantees that although read and write operations may overlap, there exists a way to "shrink" each one of them to an atomic grain of time which lies within its respective time duration, in such a way that the value returned by each read equals the value written by the most recent write according to the sequence of "shrunk" operations on the time axis.
Apart from the above classification, registers are also distinguished by the number of readers that may concurrently read the register, the number of writers that may concurrently write the register, as well as the number of values the register can take on. All these dimensions imply a hierarchy on registers, with the single-reader, single-writer boolean safe register at the lowest level and multireader, multiwriter, multivalued atomic variables at the highest level.
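The difference between safeness and regularity can be made concrete for a single read. The following is a minimal Python sketch, not part of the original construction; the function name `legal_read_values` and its parameters are hypothetical and chosen only for illustration. Atomicity is omitted because it constrains a whole execution rather than one read in isolation.

```python
def legal_read_values(kind, domain, old_value, overlapping_writes):
    """Values a single read may return under the safe/regular guarantees.

    kind               -- "safe" or "regular" (illustrative labels)
    domain             -- every value the register can hold
    old_value          -- value of the last write completed before the read began
    overlapping_writes -- values of the writes concurrent with the read
    """
    if not overlapping_writes:
        # With no concurrent write, both kinds return the most recent value.
        return {old_value}
    if kind == "safe":
        # A safe register promises nothing when a write overlaps the read.
        return set(domain)
    if kind == "regular":
        # A regular register returns the old value or an overlapping one.
        return {old_value, *overlapping_writes}
    raise ValueError(f"unknown kind: {kind}")
```

For a 4-valued register holding 0 while a write of 3 is in progress, the sketch allows any of the four values for a safe register but only 0 or 3 for a regular one.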
Despite the fact that there has been a great deal of research on developing implementations of stronger registers out of weaker ones [1, 2, 9, 10, 12, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25], to the best of our knowledge, comparatively few results have appeared studying the costs incurred by such implementations [3, 4, 11, 21]. Chaudhuri and Welch in [4] summarize the issues involved in the study of the intrinsic complexity of register constructions: since registers may differ in several dimensions, the inherent complexity of constructing a strong register out of weaker ones is a problem with multiple cases to be examined. They then focus on two parameters of interest: the number of values and the consistency guarantees of the register. Thus, they propose the problem of studying the inherent cost of constructing multireader, single-writer, k-valued safe, regular and atomic registers out of multireader, single-writer, 2-valued (binary) safe, regular and atomic registers, respectively. Following their abbreviation, we refer to such constructions as k-valued from 2-valued safe/regular/atomic register constructions. The cost measures considered are the number of subregisters used in the register construction and the number of subreads and subwrites performed during each read or write operation of the registers, respectively. In that paper the particular classes examined are those of k-valued from 2-valued safe and regular constructions. First they prove that for the case in which the writer performs only one write suboperation, $k - 1$ subregisters are necessary. As a second step they give an algorithm which implements a safe register, using as an encoding function a rainbow coloring function of the $(k-1)$-dimensional hypercube with k colors, after proving that such a coloring exists if and only if k is a power of 2.
Kant and van Leeuwen [8] have independently shown the same result for the rainbow coloring of a hypercube and applied it to the file distribution problem. A rainbow coloring of a hypercube with k colors is a coloring such that each node of the hypercube has a neighbor with each one of the $k - 1$ colors other than its own. In a subsequent paper [3], Chaudhuri, Kosa and Welch give a construction of a k-valued regular/atomic register out of binary regular/atomic ones in which each write performs only one subwrite to one of them. This algorithm requires $\binom{k}{2}$ subregisters. It must be pointed out that both algorithms, in [4] and in [3], are for the multi-reader case, while they have the writer perform only one suboperation per write, namely the one subwrite.
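The rainbow property defined above can be checked exhaustively for small k. The sketch below is an illustration in Python; it assumes one standard coloring (the XOR of the 1-based positions of the bits set in a node), which is not necessarily the exact function used in [4] or [8]. The names `color` and `is_rainbow` are hypothetical.

```python
from itertools import product

def color(node):
    """Assumed coloring: XOR of the 1-based positions of the bits set in node."""
    c = 0
    for i, bit in enumerate(node, start=1):
        if bit:
            c ^= i
    return c

def is_rainbow(k):
    """Check that every node of the (k-1)-dimensional hypercube has a
    neighbor of each of the k-1 colors other than its own."""
    n = k - 1
    for node in product((0, 1), repeat=n):
        neighbor_colors = set()
        for i in range(n):
            neighbor = list(node)
            neighbor[i] ^= 1  # flip one bit to move along one hypercube edge
            neighbor_colors.add(color(neighbor))
        others = set(range(k)) - {color(node)}
        if neighbor_colors != others:
            return False
    return True
```

Under this assumed coloring, flipping bit i changes the color by XOR with i, so the check succeeds when k is a power of 2 (for example k = 4 or k = 8). It fails for k = 3, but that only shows that this particular coloring is not rainbow there; the general non-existence result is the one proved in [4].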
In this paper we show how a rainbow coloring (which has been used in order to construct the weakest of the k-valued from 2-valued registers, namely the safe one) can be extended to simulate a powerful handshaking mechanism (see Tromp [21], Spirakis, Kirousis, Tsigas [13] and Dwork et al. [6, 7]). Using this simulation we can get a k-valued from 2-valued atomic construction where the writer performs one write suboperation per operation and there are no overlapping reads. Our construction uses $3 \cdot 2^{\lceil \log k \rceil} - 2$ subregisters (when k is a power of 2 this is $3k - 2$), while the lower bound for such a construction is $k - 1$. To the best of our knowledge this is the first linear one-write construction.
Model of a Register Construction
A shared register is an abstract data object shared by a number of concurrent processes which may either read or write it. A construction of a register comprises i) a data structure consisting of memory cells called subregisters, ii) a set of initial values for the subregisters and iii) a set of read and write procedures which provide the means for the processes to access the register; these procedures are also referred to as the protocol. When a process needs to perform either a read or a write operation on the register it must invoke the respective procedure. We call this process either reader or writer, respectively. Each operation execution, or operation for short, is a sequential execution of a procedure's statements (steps), which may be either read or write suboperations on the subregisters or some local computations of the procedure. In order to avoid confusion between operations on the constructed register and operations on the subregisters used in the construction, the term operations is used only for the former and suboperations is used for the latter. A construction C is called wait-free if any operation will complete in a finite number of steps. Roughly speaking, the wait-free condition rules out unbounded busy waiting as well as conditional waiting. The reason for the former is obvious, while the latter holds because otherwise a process might execute an infinite number of steps waiting for a condition to be made true by a crashed process.
In a global time model each operation q is assumed to have a time interval $[s_q, f_q]$ on one linear time axis ($s_q < f_q$). Think of $s_q$ and $f_q$ as the starting and finishing time instants of q. During this time interval the operation is said to be pending. There is a precedence relation on operations (denoted by '$\to$'), which is a strict partial order. $q_1 \to q_2$ means that operation $q_1$ ends before operation $q_2$ starts. If two operations are incomparable under $\to$, they are said to overlap. If $q_1 \to q_2$, then for any suboperations $op_1$ and $op_2$ of $q_1$ and $q_2$, respectively, it holds that $op_1 \to op_2$.
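The precedence and overlap relations translate directly into interval comparisons. The following Python sketch is illustrative only; the class `Op` and the functions `precedes` and `overlap` are hypothetical names, and real-valued timestamps are assumed for the global time axis.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Op:
    """An operation q with its interval [s_q, f_q] on the global time axis."""
    name: str
    start: float   # s_q
    finish: float  # f_q

def precedes(q1, q2):
    """q1 -> q2: q1 ends before q2 starts."""
    return q1.finish < q2.start

def overlap(q1, q2):
    """Two distinct operations overlap when neither precedes the other."""
    return not precedes(q1, q2) and not precedes(q2, q1)
```

For instance, a write on [0, 2] and a read on [1, 3] overlap, while the same write precedes a read on [4, 5].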
A reading function $\pi$ for a register construction C is a function that assigns a write operation w to each read operation r on the register, such that the value returned by r (according to the read procedure invoked) is the value written by w. It is assumed that there exists a write operation, which initializes the register, that precedes all other operations on it.
Let A denote the set of read and write procedures of the construction. A triple $\sigma = (A, \to, \pi)$ is called a system execution of C.
A construction is said to implement a regular register if all its system executions are regular.
A system execution $\sigma = (A, \to, \pi)$ is regular if for any read operation r of $\sigma$ i) not $r \to \pi(r)$ and ii) there is no write w such that $\pi(r) \to w \to r$. A construction is said to implement an atomic register if all its system executions are atomic. A system execution $\sigma = (A, \to, \pi)$ is atomic if $\to$ can be extended to a total order $\Rightarrow$ such that for any read operation r of $\sigma$ i) $\pi(r) \Rightarrow r$ and ii) there is no write w such that $\pi(r) \Rightarrow w \Rightarrow r$. From the respective definitions it can be noticed that an atomic construction is also regular. Finally, the cost measures for the computation of space and time complexities of a construction C are: i) the number of subregisters used by C and ii) the maximum number of suboperations on the subregisters performed during any read and any write operation in any system execution of C.
In order to ensure atomicity, the construction employs handshaking. This mechanism implies that there are two "virtual places", also called modes, where the reader and the writer may be during each access to the register; the reader tries to be at the same place as the writer, while the latter tries to avoid it, by "moving" to the other virtual place when it sees that it has been "followed". By having disjoint sets of subregisters that can be accessed in each virtual place, the handshaking mechanism guarantees the existence of a piece of information that can be accessed by each communicating party without collision on the physical level.
The controller of the game is the writer, who, in each write operation, has to: 1) determine the reader's mode by reading the subregister RM and 2) assign the new value to the register and change place if it has been followed by the reader. From the particular rainbow property of the coloring function as described above, it follows that the writer has the capability of changing the value of the register by modifying a single one of the construction's subregisters; moreover, in order to do so, it has two options: to modify either one of the High Order Bits
On the other hand, the reader, in each read operation, first assigns to its local variable wm the mode in which the writer is. This can be determined from the values of the subregisters of H, using a parity function, as was explained in the previous paragraph. If the writer has moved (changed mode) since the previous read, the reader reads the subregisters in $L_{wm}$, in order to find which was the last configuration of that set when the writer had to move to virtual place wm. (Notice that, if the writer has not "moved" since the previous read, the information in $L_{wm}$ remains intact since it was last read.) After that, the reader updates RM in order to show to the writer that it has followed it to its new virtual place. Subsequently,
The protocol is formally described in figure 1. There, bin(i) denotes the binary representation of i in $\log k$ bits, $\oplus$ represents exclusive-or and $\cdot$ represents multiplication (bin(i) multiplied by bit 0 is the zero-vector of length $\log k$ and bin(i) multiplied by bit 1 is bin(i) itself). An example of the coloring function f for a 6-dimensional hypercube is given in 
Correctness Proof of the Construction
First we prove that the encoding adopted using f is correct:
Lemma 1 The function f as defined in figure 1 has the property that for all $x \in \{0,1\}^{2(k-1)}$ and for all $v \in \{0, \dots, k-1\}$, if $v \neq f(x)$ then there exist $y_1$ and $y_2$, both in $\{0,1\}^{2(k-1)}$, such that $v = f(y_1) = f(y_2)$, $y_1 \neq y_2$, and both $y_1$ and $y_2$ differ from x in exactly one bit.
Proof. For all $x, y \in \{0,1\}^{2(k-1)}$ such that x and y differ in only one bit (assume w.l.o.g. either bit $i$ or bit $k-1+i$, where $1 \le i \le k-1$), it holds that $f(x) \neq f(y)$, because $f(x) \oplus f(y) = \mathrm{bin}(i)$, which is not zero. Moreover, given x and y as above, there exists another $y' \in \{0,1\}^{2(k-1)}$ such that $y' \neq y$ and $y'$ differs from x in only one bit, either bit $k-1+i$ or bit $i$, respectively, and $f(y) = f(y')$. This is because $f(y) \oplus f(y') = 0$. On the other hand, given again x and y as above, for any other $y'' \in \{0,1\}^{2(k-1)}$ which differs from x in only one bit, excluding bits $i$ and $k-1+i$ (x and $y''$ differ either in bit $j$ or in bit $k-1+j$, where $1 \le j \le k-1$ and $j \neq i$), it holds that $f(y) \neq f(y'')$, because $f(y) \oplus f(y'') = \mathrm{bin}(i) \oplus \mathrm{bin}(j)$, which is not zero because $i \neq j$. Hence, for a given $v \neq f(x)$, the index $i$ with $\mathrm{bin}(i) = f(x) \oplus v$ yields exactly the two required neighbors. $\Box$
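Lemma 1 can be verified exhaustively for small k. The Python sketch below rests on an assumption: figure 1 is not reproduced in this text, so the function `f` is reconstructed from the property used in the proof (flipping bit $i$ or bit $k-1+i$ changes the value by XOR with bin(i)). It may differ in detail from the authors' definition. The helper names `one_bit_neighbors` and `check_lemma_1` are hypothetical.

```python
from itertools import product

def f(x, k):
    """Assumed encoding on 2(k-1) bits: XOR of every index i (1-based)
    for which bit i and bit k-1+i of x differ."""
    n = k - 1
    value = 0
    for i in range(1, n + 1):
        if x[i - 1] ^ x[n + i - 1]:
            value ^= i
    return value

def one_bit_neighbors(x):
    """Yield (position, y) for every y that differs from x in exactly one bit."""
    for pos in range(len(x)):
        y = list(x)
        y[pos] ^= 1
        yield pos, tuple(y)

def check_lemma_1(k):
    """For every x and every v != f(x), exactly two one-bit neighbors of x
    encode v, namely those obtained by flipping bit i or bit k-1+i."""
    n = k - 1
    for x in product((0, 1), repeat=2 * n):
        for v in range(k):
            if v == f(x, k):
                continue
            hits = [pos for pos, y in one_bit_neighbors(x) if f(y, k) == v]
            i = f(x, k) ^ v  # index whose bits must be flipped to reach v
            if hits != [i - 1, n + i - 1]:  # 0-based positions of bits i and k-1+i
                return False
    return True
```

With k = 4 the check runs over the 64 nodes of the 6-dimensional hypercube. The two neighbors it finds for each target value correspond to the two options the writer has when changing the register's value with a single subwrite.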
From now on we concentrate on proving the atomicity of our construction. First we introduce some auxiliary terminology, which will help the presentation of our arguments:
- For a read operation r, put(r) denotes its subwrite to RM, mode(r) is the value it writes to RM, while view(r) is the $2(k-1)$-tuple of values that it uses as input in its invocation of f. For a write operation w, get(w) denotes its subread from RM, mode(w) is the value of the writer's
- A phase of writes W is a sequence of write operations $w_1, \dots, w_n$ such that $w_1 \to \dots \to w_n$ and $mode(w_1) = \dots = mode(w_n) = m$ and for which there exist $w_0$ (if $w_1$ is not the first write operation of the respective execution) and $w_{n+1}$ such that $w_0$ directly precedes $w_1$, $w_{n+1}$ is directly preceded by $w_n$ and $mode(w_0) = mode(w_{n+1}) = \bar{m}$, the complement of m.
- For a read operation r, let each one of its read suboperations be mapped to the most recent write operation which modified the respective subregister (according to the total order defined on the actions of the atomic subregister). We define $\rho(r)$ to be the write operation of this set such that every other operation of this set precedes it. This function is well defined because the write operations are totally ordered, since there are no overlapping writes. A read operation r is called related to a phase of writes W if $\rho(r)$ is one of the writes in W.
Lemma 2 For any read r and for any write w such that $put(r) \to get(w)$ and ($\neg\exists$ read $r'$: $put(r) \to put(r') \to get(w)$), it holds that mode(r) = mode(w).
Proof. Since there are no overlapping reads, the lemma hypothesis implies that r is the last read to modify RM before w reads it. Thus w will read from RM into its local variable rm the value that r wrote; either this value will be complementary to the value of w's local variable wm, or w will complement wm. In both cases, due to the definitions of mode(r) and mode(w), the lemma follows. $\Box$
Lemma 3 Let W be the phase of writes related to a read operation r. Then there is no write operation w in W such that $put(r) \to get(w)$.
Proof. Since the unique subwrite operation of any write operation is also its last suboperation, $\rho(r)$ either precedes or overlaps r. Let m = mode(r). Lemma 2 implies that each write w which overlaps r and for which $put(r) \to get(w)$ writes in one of the subregisters in $L_m$ or H. But after put(r), r will read $L_m$. Thus, it cannot be that $put(r) \to get(\rho(r))$, and neither can it be that $put(r) \to get(w)$ for any other w in W, since the first write occurring after put(r) initiates a new phase.
Proof. Since there are no overlapping reads, we can use induction on the number of reads that occur in a system execution.
Let $r_i$ denote the ith read and $w_i$ denote the ith write (there are no overlapping writes either) in a system execution. We prove the induction basis by showing that for $r_1$, $r_2$ it holds that between the first subaction of $r_1$ and $put(r_2)$ at most one write to the subregisters of H can occur. Suppose, towards a contradiction, that there exist write operations $w_x, w_{x+1}, \dots, w_{x+q}$ ($q \ge 1$) whose write suboperations modify subregisters in H and occur between the first subaction of $r_1$ and $put(r_2)$. Then the following conditions hold:
(i) $put(r_1) \to get(w_x)$. This is because, due to the initialization, each write w such that $get(w) \to put(r_1)$ sees that RM = 1 and its local variable wm = 0; therefore, according to the writer's protocol, w will not write in H.
(ii) $mode(w_x) = 1$ and $mode(w_{x+j}) \neq mode(w_{x+j+1})$ ($0 \le j \le q-1$). This follows from the definition of mode(w), because initially the writer's static variable wm equals $\bigoplus_{i=1,\dots,k-1} H_i$ (= 0) and it is modified iff one of the subregisters of H is modified.
But from (i) and from our assumption it follows that $put(r_1) \to get(w_x) \to \dots \to get(w_{x+q}) \to put(r_2)$, which implies (from lemma 2) that it should be $mode(w_x) = mode(w_{x+1}) = \dots = mode(w_{x+q})$. Thus, we have a contradiction to our assumption; therefore, the induction basis is true.
The induction step is proven with similar reasoning and the additional argument (to substitute the initializing conditions) that for each write w it holds that $mode(w) = \bigoplus_{i=1,\dots,k-1} H_i$ after the write suboperation of w.
In order to complete the proof of the atomicity of our construction we will use the following atomicity criterion for single-writer registers (Lamport [14]).
Atomicity Criterion: A register construction is atomic if for any system execution $\sigma$ the following three conditions are satisfied:
No-Future: For any read operation r of $\sigma$ it is not the case that $r \to \pi(r)$.
No-Past: For any read operation r of $\sigma$ there is no write operation w such that $\pi(r) \to w \to r$.
No-New-Old-Inversion: For any two read operations $r_1$ and $r_2$ of $\sigma$ it is not the case that ($r_1 \to r_2$ and $\pi(r_2) \to \pi(r_1)$).
Lemma 6 The protocol satisfies the above atomicity criterion.
Proof. Lemma 5 implies that for any read r of a system execution $\sigma$ of the protocol, $\pi(r) = \rho(r)$. Therefore, it suffices to prove that the three conditions of the criterion hold using $\rho(r)$ instead of $\pi(r)$.
No-Future: From the definition of $\rho(r)$ it follows that the last suboperation of $\rho(r)$ occurs before the last suboperation of r.
No-Past: Suppose, towards a contradiction, that there exist a read r and a write w of $\sigma$ such that $\rho(r) \to w \to r$. From lemma 5 we have that $mode(\rho(r)) = mode(r) = m$, where $m \in \{0, 1\}$. There are two cases to be considered:
(1) mode(w) = m: Then w and $\rho(r)$ are in the same phase of writes, which implies that w will write on a subregister in $L_m$. This contradicts the definition of $\rho(r)$.
(2) mode(w) = $\bar{m}$, the complement of m: From the protocol we have that there exists a write $w'$ with $\rho(r) \to w' \to r$ ($w'$ may equal w) such that $w'$ writes in one of the subregisters in H. This contradicts the definition of $\rho(r)$, since r reads the subregisters in H.
No-New-Old-Inversion: Suppose, towards a contradiction, that there exist reads $r_1$, $r_2$ in $\sigma$ such that $r_1 \to r_2$ and $\rho(r_2) \to \rho(r_1)$. From the definition of $\rho(r_1)$ it follows that the last suboperation of $\rho(r_1)$ occurs before the last suboperation of $r_1$. This implies that $\rho(r_2) \to \rho(r_1) \to r_2$, since $r_1 \to r_2$. But this is a contradiction to the No-Past condition, which has already been shown to hold.
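The three conditions of the atomicity criterion can be checked mechanically on a finite, recorded execution. The Python sketch below is an illustration, not part of the proof. It assumes operations are given as (start, finish) intervals on the global time axis and that the reading function is supplied as a dictionary; the names `precedes`, `satisfies_criterion` and `pi` are hypothetical.

```python
def precedes(a, b):
    """a -> b for intervals (start, finish): a ends before b starts."""
    return a[1] < b[0]

def satisfies_criterion(reads, writes, pi):
    """reads, writes: dicts mapping an operation name to its interval.
    pi: dict mapping each read name to the name of the write it reads from."""
    for r, r_iv in reads.items():
        w_iv = writes[pi[r]]
        # No-Future: r must not precede the write it reads from.
        if precedes(r_iv, w_iv):
            return False
        # No-Past: no write may lie entirely between pi(r) and r.
        if any(precedes(w_iv, w) and precedes(w, r_iv) for w in writes.values()):
            return False
    # No-New-Old-Inversion: a later read must not return an older write.
    for r1, iv1 in reads.items():
        for r2, iv2 in reads.items():
            if precedes(iv1, iv2) and precedes(writes[pi[r2]], writes[pi[r1]]):
                return False
    return True
```

As a usage example, take writes w0 on (0, 1) and w1 on (2, 5), and reads r1 on (3, 4) and r2 on (4.5, 4.8), both overlapping w1. Mapping r1 to w1 and r2 to w0 passes No-Future and No-Past, so it would be acceptable for a regular register, yet it violates No-New-Old-Inversion and is rejected by the check.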
For the case where k is not a power of 2, the protocol can use $3l - 2$ subregisters, where $l = 2^{\lceil \log k \rceil}$, i.e. l is the smallest power of 2 larger than k. In this way the protocol will in fact implement an l-valued atomic register ($k < l$), which can also serve as a k-valued one.
Theorem 1 The construction correctly implements a wait-free k-valued atomic register using $3 \cdot 2^{\lceil \log k \rceil} - 2$ atomic binary subregisters. The maximum number of suboperations performed during any read r is $3 \cdot 2^{\lceil \log k \rceil} - 3$, while each write w performs one read and one write suboperation.
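The subregister count of Theorem 1 is easy to tabulate. The following Python sketch only evaluates the formula $3 \cdot 2^{\lceil \log k \rceil} - 2$ stated above and the $k - 1$ lower bound; the function names are hypothetical.

```python
def padded_size(k):
    """Smallest power of 2 that is >= k, i.e. 2 ** ceil(log2(k))."""
    if k < 2:
        raise ValueError("a register needs at least two values")
    return 1 << (k - 1).bit_length()

def subregister_count(k):
    """Binary subregisters used by the construction for a k-valued register."""
    return 3 * padded_size(k) - 2

def lower_bound(k):
    """Lower bound on subregisters for a one-write construction."""
    return k - 1

for k in (2, 4, 5, 8, 9):
    print(k, subregister_count(k), lower_bound(k))
```

For k = 4 the count is 10, and for k = 5 the value is padded to l = 8, giving 22 subregisters, the same as for k = 8.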
Conclusions
In this work we have shown how a simple "encoding" function can be used in order to simulate a powerful wait-free mechanism. It would be useful to examine whether more sophisticated encodings can be used in order to gain efficiency in wait-free constructions for various other objects.
