We address the problem of reading more than one variable (component) $X_1, \ldots, X_c$, all in one atomic operation, by a single process called the reader, while each of these variables is being written by a set of writers. All operations (i.e., both reads and writes) are assumed to be totally asynchronous and wait-free. For this problem, only algorithms that require at best quadratic time and space complexity can be derived from the existing literature (the time complexity of a construction is the number of sub-operations of a high-level operation, and its space complexity is the number of atomic shared variables it needs). In this paper, we provide a deterministic protocol that has linear (in the number of processes) space complexity, linear time complexity for a read operation, and constant time complexity for a write. Our solution does not make use of time-stamps. Rather, it is the memory location where a write writes that differentiates it from the other writes. Also, by introducing randomness in the location where the reader gets the value it returns, we get a conceptually very simple probabilistic algorithm. This algorithm has an overwhelmingly small, controllable probability of error. Its space complexity and the time complexity of a read operation are both sublinear. The time complexity of a write is constant. Finally, under the Archimedean time assumption, we get a protocol whose time and space complexity do not depend on the number of writers but are linear in the number of components only (the time complexity of a write operation is still constant).
Introduction
A shared register is an abstract data structure shared by a number of asynchronous concurrent processes which perform either read or write operations. We adopt the model where each process is assumed to execute either only read operations (it is then called a reader) or only write operations (it is then called a writer). Operations by the same process are assumed to be executed sequentially. An implementation (construction) of a register consists of: (i) protocols for the execution of an operation (read or write) by a process, (ii) a data structure consisting of memory cells, called subregisters, and (iii) a set of initial values of the subregisters. The execution of a protocol by a process involves a number of both read and write operations, called sub-operations, on the subregisters. To distinguish operations on the register from sub-operations on the subregisters, we sometimes call the former high-level operations. An implementation is wait-free if it guarantees that any process will complete an operation in a finite number of steps (i.e., sub-operations) independent of the execution speeds of the other processes. Obviously, the wait-free condition rules out many conventional algorithmic techniques, such as busy-waiting, conditional waiting, or critical sections. The basic correctness condition for such an implementation is linearizability, i.e., although concurrent operations by processes may overlap in time, each one of them appears to take effect instantaneously, in an order that preserves the operations' semantics (see [7]). Such an implementation is called atomic.
In this paper we study a type of register called a composite register. A composite register is a register partitioned into a number of components, $X_1, \ldots, X_c$. A high-level operation on such a register either writes a value to one of the components or reads the values of all the components (the components should not be confused with the subregisters used in an implementation). A composite register is characterized by the number of readers that may concurrently read the composite register, the number of writers that may concurrently write to the same component, the number of components, and the number of bits a component is allowed to hold. The general case of a composite register is the n-reader, m-writer per component, c-component, b-bit composite register. The problem that we study in this paper is the implementation of a single-reader, m-writer per component, c-component, b-bit composite register using as subregisters atomic, single-component registers that are allowed to hold a bounded (i.e., independent of the number of operations) number of bits. By our assumption, no concurrent high-level read operations are allowed. There is an extensive literature on implementing our building blocks, the single-component, atomic registers, from simpler (i.e., single-bit, single-writer, single-reader, single-component) subregisters satisfying only minimal correctness conditions. See, e.g., [3] for a reference list.
The time complexity of a construction implementing a composite register is the number of sub-operations of an operation, while the space complexity is the number of atomic, single-component subregisters used by the construction. We measure them as a function of the number of processes that share the composite register.
Afek et al. [2] and Anderson [3] have previously given wait-free constructions for the multireader, multiwriter, c-component, b-bit composite register. Their time complexity as well as their space complexity are at best quadratic as a function of the number of processes. The complexities do not improve if we assume the existence of only one reader.
In this paper, we first give a conceptually very simple wait-free construction for the single-reader, multiwriter per component case, assuming that we have an unbounded (but linearly dependent on the total number of operations performed) number of memory locations (subregisters). In this first construction, the number of sub-operations of a high-level operation is unbounded and, moreover, one subregister is assumed to hold an unbounded (but logarithmically dependent on the total number of operations performed) number of bits.
We then show how to "recycle" the memory locations in order to obtain a deterministic protocol that uses only a bounded number of subregisters. Our bounded construction has linear space complexity, linear time complexity for a read operation, and constant time complexity for a write operation (the building blocks, in the bounded case, are assumed to be single-component, atomic registers, each of which may hold either a value from the domain of values allowed to appear on the components of the register or, alternatively, an integer not exceeding a constant multiple of the number of writers per component). We believe that the tool of using unboundedly many memory locations is stronger than the method of unbounded time-stamps (for constructions with unbounded time-stamps see [5] and [8]).
Moreover, by introducing randomness in the choice of the memory location recycled by a read, we obtain a conceptually very simple probabilistic protocol. If $m$ is the number of writers per component, $c$ is the number of components, and $l$ and $q$ are constants that can be chosen by the algorithm designer, then the space complexity of our probabilistic algorithm is $O(lc)$. The time complexity of a read operation is $O(lc)$, whereas the time complexity of a write operation is $O(q)$. Finally, there is an $O(mc(q/l)^q)$ probability of error. Our randomized protocol works even if the adversary is assumed to observe a random bit the moment it is generated (this is the strong model for an adversary assumed in [1] and [4]). Randomized algorithms that, like the one in this paper, allow the possibility of error (i.e., Monte Carlo algorithms) may have important drawbacks when applied to shared-memory data structures. However, we believe that they might be interesting not only because the probability of error is overwhelmingly small and controllable (an important factor per se) but also because they may pave the way for new Las Vegas randomizations (i.e., no-error randomizations, where, however, complexity bounds are probabilistic). For error-free, randomized protocols with finite expected time complexity see [6].
Finally, under the Archimedean Time assumption, i.e., assuming that there are fixed, known upper and lower bounds for the ratio of the execution rates of the processes (limited asynchrony), we give a protocol with space and time complexities that do not depend on the number of processes but are linear in the number of components only (the time complexity of a write is still constant). Notice that this assumption does not imply any restriction on the idle time intervals between operations.
We assume a precedence relation on operations, which is a strict partial order (denoted by '$\to$'). For two operations $a$ and $b$, $a \to b$ means that operation $a$ ends before operation $b$ starts. If two operations are incomparable under $\to$, they are said to overlap. Since we have assumed that there is only one reader, all read operations are comparable under $\to$. The protocols, apart from the shared variables, make use of local variables as well (these cannot be shared by concurrent processes). The local variables are assumed to retain their values between invocations of the corresponding procedures (in the programming languages literature, the term static is sometimes used for such variables). We adopt the convention of denoting shared variables with capital letters and local variables with lower case letters.
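The precedence relation can be illustrated with a minimal Python sketch. It assumes each operation is tagged with a start and an end instant on a hypothetical global clock; the class `Op` and the functions `precedes` and `overlap` are illustrative names, not part of the protocols.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    """A high-level operation with a (hypothetical) global-time duration interval."""
    name: str
    start: float
    end: float


def precedes(a: Op, b: Op) -> bool:
    """a -> b: operation a ends before operation b starts."""
    return a.end < b.start


def overlap(a: Op, b: Op) -> bool:
    """Two operations overlap when they are incomparable under ->."""
    return not precedes(a, b) and not precedes(b, a)


w1 = Op("w1", 0.0, 2.0)
w2 = Op("w2", 1.0, 3.0)
r1 = Op("r1", 4.0, 5.0)
```

Here `w1` and `w2` overlap, while both precede `r1`.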
A reading function $\pi_k$ for a component $X_k$ is a function that assigns to each high-level read operation $r$ a high-level write operation $w$ on component $X_k$ such that the value returned by $r$ for the component $X_k$ is the value written by $w$. Similarly, for a subregister $R$, a reading function $\pi_R$ is a function that assigns to each read sub-operation $r$ on $R$ a write sub-operation $w$ on $R$ such that the value returned by $r$ is the value written by $w$. It is assumed that for each subregister there exists a write sub-operation which initializes the subregister, i.e., precedes all other sub-operations on the subregister.
A run (or history) is an execution of an arbitrary number of operations according to the respective protocols. Formally, a run is atomic if the partial order $\to$ on its operations can be extended to a total strict order $\Rightarrow$ and if for each component $X_k$ there is a reading function $\pi_k$ such that for all high-level reads $r$: (i) $\pi_k(r) \Rightarrow r$ and (ii) there is no write $w$ on $X_k$ such that $\pi_k(r) \Rightarrow w \Rightarrow r$. A construction is atomic if all its runs are atomic. We assume all subregisters to be atomic; therefore we can assume that the precedence relation $\to$ is total when restricted to sub-operations on a single subregister (alternatively, we assume that all sub-operations are instantaneous, i.e., their duration intervals are singletons).
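As an illustration of conditions (i) and (ii), the following Python sketch checks whether a given candidate total order witnesses atomicity for given reading functions. The function name and the data layout are hypothetical, and the sketch assumes that the candidate order already extends the precedence relation (this is not checked).

```python
def is_atomic_witness(order, component, reads_from):
    """Check conditions (i) and (ii) for one candidate total order.

    order:      list of operation names, assumed to extend the precedence relation
    component:  dict mapping each write name to the component it writes
    reads_from: dict mapping (read, k) to the write assigned by the reading function
    """
    pos = {op: i for i, op in enumerate(order)}
    for (r, k), w in reads_from.items():
        # (i) the write that r reads from must come before r
        if pos[w] >= pos[r]:
            return False
        # (ii) no write on component k may fall strictly between that write and r
        for v, kv in component.items():
            if kv == k and pos[w] < pos[v] < pos[r]:
                return False
    return True


component = {"w1": 1, "w2": 1, "v1": 2}
order = ["w1", "v1", "w2", "r"]
fresh = {("r", 1): "w2", ("r", 2): "v1"}   # r returns the latest write on each component
stale = {("r", 1): "w1", ("r", 2): "v1"}   # r returns an overwritten value on component 1
```

With this order, the assignment `fresh` passes the check and `stale` fails it, because `w2` lies between `w1` and `r`.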
One obviously necessary condition for a composite register to be atomic is that for any read $r$ and for any component $X_k$, it is not the case that $r \to \pi_k(r)$ (indeed, otherwise the extension of $\to$ to a total order respecting the reads would be impossible). All our constructions satisfy this condition for reasons that are trivial to check. For notational convenience, we call registers satisfying this condition normal.
An Atomicity Criterion
For the case of a single reader (where we do not have overlapping high-level reads) we have the following criterion for atomicity of a composite register:

Lemma 1. A construction of a normal composite register is atomic if and only if for each component $X_k$, the write operations to it can be serialized by a strict total order $\Rightarrow_k$ compatible with the precedence relation $\to$ and such that the following two conditions hold:

1. Each $\Rightarrow_k$ is compatible with the respective reading function $\pi_k$, i.e., for each read $r$, it is not the case that there is a write operation $w$ on $X_k$ such that $\pi_k(r) \Rightarrow_k w \to r$. Moreover, for any two reads $r$ and $s$ and for any component $X_k$, it is not the case that $r \to s$ and $\pi_k(s) \Rightarrow_k \pi_k(r)$.

2. For any two different components $X_k$ and $X_l$ and for any read $r$, it is not the case that there are write operations $v$ and $w$ on $X_k$ and $X_l$, respectively, such that $\pi_k(r) \Rightarrow_k v \to w \underline{\Rightarrow}_l \pi_l(r)$, where $w \underline{\Rightarrow}_l \pi_l(r)$ means that either $w \Rightarrow_l \pi_l(r)$ or $w = \pi_l(r)$.
This lemma is essentially the restriction to the single-reader case of the atomicity criteria mentioned in [3], and so we omit its proof. Based on the above, we obtain the following basic lemma, which gives sufficient conditions for atomicity that refer to each component separately. All our algorithms satisfy the conditions of this basic lemma. Therefore, our algorithms not only implement an atomic register but also have the (in general stronger) properties described by these conditions. Intuitively, the sufficient conditions of the basic lemma require that the write operations of each component can be consistently serialized so that Condition 1 of Lemma 1 is satisfied and, moreover, that if a write operation $w$ starts after the start of a read operation $r$, or if $w$ follows (in the serialization of the operations of its component) another write operation that starts after the start of $r$, then $r$ does not read the value written by $w$. Formally:

Basic Lemma. A construction of a normal composite register is atomic if for each component $X_k$, the write operations to it can be serialized by a strict total order $\Rightarrow_k$ that is compatible with the precedence relation $\to$ and such that the following two conditions hold:

1. Each $\Rightarrow_k$ is compatible with the respective reading function $\pi_k$, i.e., for each read $r$, it is not the case that there is a write operation $w$ to $X_k$ such that $\pi_k(r) \Rightarrow_k w \to r$. Moreover, for any two reads $r$ and $s$ and for any component $X_k$, it is not the case that $r \to s$ and $\pi_k(s) \Rightarrow_k \pi_k(r)$.

2. For any read $r$ and for any component $X_l$, if $a$ is either the first sub-operation of $\pi_l(r)$ or the first sub-operation of any write operation $w$ to $X_l$ for which $w \Rightarrow_l \pi_l(r)$, and if $b$ is the first sub-operation of $r$, then $a$ and $b$ take place on the same (atomic) subregister and $a$ precedes $b$.
Proof. It suffices to prove the second condition of Lemma 1. Indeed, let $X_k$ and $X_l$ be two distinct components. Suppose, towards a contradiction, that there are a read $r$ and write operations $v$ and $w$ on $X_k$ and $X_l$, respectively, such that $\pi_k(r) \Rightarrow_k v \to w$ and either $w \Rightarrow_l \pi_l(r)$ or $w = \pi_l(r)$. Then, since by hypothesis the first sub-operation of $w$ precedes the first sub-operation of $r$, we get that $\pi_k(r) \Rightarrow_k v \to r$, a contradiction. □

The Deterministic Approach
Unbounded Memory-Space
In this subsection we describe a single-reader, multiwriter per component construction that uses unbounded memory space (i.e., the number of subregisters used may be equal to the number of operations to be performed). In the next subsection, we show how to "recycle" the memory space in order to obtain a construction with bounded space, i.e., independent of the number of operations (actually, the space will be linear in the number of writers). Let us point out that in the unbounded memory-space construction, there is a subregister whose values are addresses of memory locations. Therefore, this subregister must be assumed to hold an unbounded number of bits. This is not the case in the bounded memory-space construction, where there is only a bounded number of addresses. In the unbounded space construction, the number of sub-operations of a high-level read operation is unbounded.
The architecture of our unbounded construction is as follows: For each component $k = 1, \ldots, c$ [...] an integer (a pointer to a memory location). This subregister, PTR, can be written to by the reader and can be read by all writers. It is initialized with the value 0.
In the protocol the reader is the controller: it is the one who determines where the writers must write. All that a writer has to do is to write its value to the memory location forwarded by the reader through a pointer. More specifically, the protocol works as follows: A writer first reads PTR and then writes its value to the memory location of the corresponding component that is pointed to by PTR. The reader, on the other hand, first increments PTR by one, stores its new value into a local variable ptr, and then, for each component $k = 1, \ldots, c$, gets the value to be returned by reading ML[k][ptr - 1], ..., ML[k][0] in this order until it gets a value which is not nil. The protocol is given formally in Figure 1. The reader, by forwarding to the writer, with its very first sub-operation, a new subregister that it does not use again during the current read, avoids reading values written by write operations that started after its own starting point. Moreover, by scanning the subregisters in the reverse of the order in which they were forwarded in previous operations and by returning the first "non-empty" value, the reader returns values that have not been overwritten.
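The following Python sketch is a sequential simulation of the unbounded protocol described above, not the formal protocol of Figure 1. It assumes that address 0 of each component holds an initial value (so that every scan terminates), that components are numbered from 0, and that `None` plays the role of nil; the class and method names are hypothetical. A write is split into its two sub-operations so that an interleaving with a read can be replayed by hand.

```python
NIL = None  # hypothetical marker for an empty memory location


class UnboundedComposite:
    """Sequential simulation of the unbounded construction (one reader)."""

    def __init__(self, c, initial):
        self.c = c
        self.PTR = 0  # shared pointer, written only by the reader
        # ML[k] maps an address to a value; address 0 holds the initial value
        # (an assumption, so that every scan finds a non-nil entry).
        self.ML = [{0: initial} for _ in range(c)]

    def write_begin(self):
        """First sub-operation of a write: read PTR."""
        return self.PTR

    def write_end(self, k, addr, value):
        """Second sub-operation of a write: store the value at ML[k][addr]."""
        self.ML[k][addr] = value

    def read(self):
        """Increment PTR, then scan each component downwards from ptr - 1."""
        self.PTR += 1
        ptr = self.PTR
        result = []
        for k in range(self.c):
            for addr in range(ptr - 1, -1, -1):
                value = self.ML[k].get(addr, NIL)
                if value is not NIL:
                    result.append(value)
                    break
        return result


reg = UnboundedComposite(c=2, initial="init")
addr = reg.write_begin()       # the writer reads PTR = 0
first = reg.read()             # the read completes before the write stores its value
reg.write_end(0, addr, "a")    # the write lands at address 0 of component 0
second = reg.read()            # scans address 1 (nil), then address 0
```

In this replay the first read returns the initial values and the second read returns the value "a" for component 0.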
Correctness Proof. We will show that the above construction satisfies the two conditions of the Basic Lemma. To show that Condition 1 of the Basic Lemma is satisfied, define the relation $\Rightarrow_k$ between writes on the same component as follows: $w \Rightarrow_k v$ if and only if either $w$ and $v$ write their value to the same memory location and the last sub-operation of $w$ precedes the last sub-operation of $v$, or $w$ writes its value to a memory location with address less than the address of the corresponding memory location of $v$. It is clear that Condition 1 is thus satisfied. Condition 2 of the Basic Lemma is also satisfied. Indeed, both $\pi_k(r)$ and $r$ have their first sub-operation performed on PTR. If the first sub-operation of $\pi_k(r)$ followed that of $r$, then $\pi_k(r)$ would write its value to a memory location not visited by $r$. We get a similar contradiction if there is a $w$ such that $w \Rightarrow_k \pi_k(r)$ and the first sub-operation of $r$ precedes that of $w$. □
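The serialization order used in the proof amounts to a lexicographic comparison: first by address, then by the instant of the last sub-operation. A minimal Python sketch, assuming each write is recorded as a dictionary with the hypothetical fields `addr` and `last_subop_time`:

```python
def serialization_key(write):
    """Sort key for the order: lower address first, then earlier last sub-operation."""
    return (write["addr"], write["last_subop_time"])


writes = [
    {"name": "w3", "addr": 2, "last_subop_time": 9.0},
    {"name": "w1", "addr": 0, "last_subop_time": 12.0},
    {"name": "w2", "addr": 2, "last_subop_time": 7.5},
]
serialized = [w["name"] for w in sorted(writes, key=serialization_key)]
```

Note that `w1` is serialized first although it finishes last: it obtained its address from an earlier value of PTR.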
Bounded Memory-Space
In this subsection, we show how to transform the unbounded space protocol of the previous subsection into one that uses bounded space only. First observe that both conditions of the Basic Lemma, through which we prove the correctness of our protocols, refer to each component separately (i.e., no reference to two components is made in any one of the conditions, as is the case, e.g., in Condition 2 of Lemma 1). This property of the Basic Lemma allows us, without loss of generality and for reasons of simplicity, to present our protocols considering only one component. (Notice that the conditions that the protocol must satisfy for each component in order to comply with the requirements of the Basic Lemma are stronger than what is required from a single-component atomic register; therefore, although we assume the existence of only one component, it is not the case that we reduce the problem of multiple components to the single-component case.) So, indices of variables referring to component numbers are not used in this and the next subsection. However, for reasons of completeness, in the description of the formal protocol given in Figure 3, we assume that there is an arbitrary number of components.
In the bounded space protocol as well, we keep the role of the reader as the controller of the game. It is still the one who determines the subregister where the writer is going to write. However, because the number of subregisters must be bounded, instead of forwarding a new subregister each time, the reader has to find an obsolete subregister, which will be forwarded to the writer after its contents are erased. We call this procedure of erasing the contents of a subregister and forwarding it to the writer recycling of the subregister.
We keep the techniques used in the previous algorithm, that is: (i) The writer writes to the memory location forwarded by the reader. (ii) The reader, by forwarding with its very first sub-operation a recycled subregister that it is not going to use again during the current read, avoids reading component values written by write operations which start after its own starting point. (iii) The reader in each read operation reads the remaining subregisters (i.e., the entries of the array ML except for the entry corresponding to the subregister ML[i] currently forwarded) in the reverse of the order in which they had been previously forwarded to the writer.
Thus, the problem of designing a correct algorithm that uses a bounded number of subregisters is reduced to the problem of having the reader choose each time a provably obsolete subregister for recycling. That means that we have to make sure that the following two conditions are satisfied:
Condition A: A read operation, when recycling, does not erase the component value that it returns (this is required because this value must be available to the next read as well).
Condition B: A read operation $r$, when recycling, does not erase a component value written by a subwrite (of a write operation) that follows the first subread of $r$ from an entry of ML of the corresponding component (again, in order to avoid the possibility of erasing a value that must be available to the next read).

The way to guarantee the above conditions is described in the next two subsections. To make the presentation more understandable, we present first the case of a single writer and then the case of multiple writers.
The Single-Writer Case
The formal protocol for the single-writer bounded case is given in Figure 2. The initializations of the variables are given at the end of the current Subsection 3.2.1.
The reader, in its local memory, maintains an array ma[1..dim] of addresses of memory locations (i.e., the entries of ma are pointers to entries of the array ML). These are the addresses of the memory locations recycled in the last dim read operations, in the order they appear in the array. In other words, by the local array ma, the reader "remembers" the order in which dim memory locations are recycled and forwarded to the writer. For reasons to be explained below, the value of dim must be at least 5. To guarantee Condition A, a read operation never recycles the location where it obtained the value it returns. Thus, the value last obtained by the reader remains available for possible future use (otherwise, the next read operation might be left with nothing to read).
To guarantee Condition B, a read $r$ must "know" which are the memory locations where a component value by the writer might appear during $r$ and after the first subread of $r$ from ML. These locations should not be recycled. The reader stores the addresses of these not-for-recycling (forbidden) locations into local variables denoted by $vb_0$ and $vb_1$. In the next paragraphs we explain how the reader decides which addresses should be stored into $vb_0$ and $vb_1$.
First, the shared variable PTR (where, in the unbounded case, the reader writes the memory address it forwards to the writer) now has two fields: one, called PTR.flagfield, is a boolean flag; the other, called PTR.ptrfield, is a two-entry array storing two memory addresses, both of which are forwarded to the writer for possible use (however, at each read operation, only one of the two entries of PTR.ptrfield gets a possibly new value). In order to write its component value, the writer chooses one of the two entries in PTR.ptrfield according to the value of PTR.flagfield it reads. Moreover, the writer maintains a shared boolean array WFLAG that is read by the reader.
To be more specific, the writer first reads PTR and copies the value of PTR.flagfield to WFLAG. Then it re-reads PTR and moves on to the memory address PTR.ptrfield[WFLAG], where it writes the component value.
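The writer's steps can be sketched in Python as follows. This is a sequential illustration under stated assumptions, not the formal protocol of Figure 2: the classes `Ptr` and `Shared`, their default initial values, and the choice of five memory locations are hypothetical, and each statement of `write` stands for one atomic sub-operation.

```python
from dataclasses import dataclass, field


@dataclass
class Ptr:
    flagfield: int = 0                                       # boolean flag set by the reader
    ptrfield: list = field(default_factory=lambda: [0, 1])   # two forwarded addresses


@dataclass
class Shared:
    PTR: Ptr = field(default_factory=Ptr)
    WFLAG: int = 0
    ML: list = field(default_factory=lambda: [None] * 5)


def write(shared: Shared, value):
    """Single-writer write: a constant number of sub-operations (hypothetical sketch)."""
    flag = shared.PTR.flagfield          # sub-op 1: read PTR and keep its flag
    shared.WFLAG = flag                  # sub-op 2: copy the flag to WFLAG
    addr = shared.PTR.ptrfield[flag]     # sub-op 3: re-read PTR and select an entry
    shared.ML[addr] = value              # sub-op 4: write the component value
    return addr


s = Shared()
s.PTR = Ptr(flagfield=1, ptrfield=[3, 4])   # state left behind by some earlier read
used = write(s, "x")
```

The write always performs four sub-operations, which is consistent with the constant time complexity claimed for writes.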
On the other hand, the reader, during an operation $r$, first updates the variable PTR. In PTR.flagfield, it writes the complement of the value obtained from WFLAG in its previous read operation (at the starting point of $r$, this complement is available through the reader's local variable flag). In PTR.ptrfield[flag], it writes the address that it decided to recycle in its previous operation (i.e., the address in ma[dim]). The other entry of PTR.ptrfield keeps the value it had before. Then the reader stores the value of WFLAG in its local variable flag and moves on to scan the addresses stored in ma in order to decide, as explained above, which value to return. Finally, it executes the 'recycle' procedure and returns. During the procedure 'recycle', the reader, as explained above, chooses the address ma[j] to be recycled, erases the value of the corresponding memory location, and cyclically rotates the array ma[j], ..., ma[dim].

The alternation of the values of the boolean variables, and the consequent alternation between the two entries of PTR.ptrfield from which the writer gets the address it uses to write its component value, guarantees that the reader has correct knowledge of the forbidden addresses that must be stored in $vb_0$ and $vb_1$. Indeed, suppose, w.l.o.g., that a read operation $r$ reads the boolean value 0 at its 'read from WFLAG' sub-operation, and suppose that this value was written to WFLAG by a write operation $w$. Let $w^+$ be the write operation immediately following $w$, and let $r'$ be the last read operation preceding $r$ that reads the boolean value 1 at its 'read from WFLAG' sub-operation. Notice that, according to the protocol, $r$ chooses for recycling a memory address not in $vb_0 \cup \{ma[i], ma[dim]\}$. Also, $vb_0$ is last updated during $r'$. Since the first subread of $r$ from ML (i.e., the subread from ML[ma[dim - 1]]) follows its subread from WFLAG, it can be easily seen that a write operation that writes a component value during $r$ and after the first subread of $r$ from ML must be either (i) $w$, or (ii) $w^+$, or (iii) a write operation that started after the starting point of $r$. Also, the write operations (i)-(iii) (given that they finish before the end of $r$) choose to write their respective component values in addresses obtained from the variable PTR.ptrfield at an instant when this variable carries values written to it by a read operation between $r'$ and $r$ ($r'$ and $r$ included). This is so because, since $r'$ reads the value 1 from WFLAG, the 'read from WFLAG' sub-operation of $r'$ must precede the 'write to WFLAG' sub-operation of $w$, and hence it must precede $w$'s second reading of PTR as well.
Using this last fact and by an easy case analysis, it follows that the write operations (i)-(iii) (given that they finish before the end of $r$) choose addresses that are either in $vb_0$ or are equal to the value that ma[dim] has at the start of $r$. Therefore the component values of the write operations (i)-(iii) are not erased by $r$, and so Condition B is satisfied. Notice that, according to the protocol, both $vb_0$ and $vb_1$ have at most two elements. Since the address to be recycled must be chosen not to be in the set $vb_{flag} \cup \{ma[i], ma[dim]\}$ and since this set has at most four elements, the value of dim should be at least five.
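The counting argument (at most four blocked addresses, hence dim at least five) can be illustrated by a Python sketch of the recycling step. The sketch makes several assumptions that are not fixed by the text: indices are 0-based (so `ma[-1]` plays the role of ma[dim]), `i` is taken to be the index of the location where the current read obtained its value, the first eligible address is the one chosen, and the rotation moves the recycled address to the end of `ma`.

```python
def recycle(ma, ML, i, forbidden):
    """Pick an obsolete address, erase it, and rotate it to the end of ma.

    ma:        reader-local list of addresses (0-based; ma[-1] stands for ma[dim])
    ML:        mapping from addresses to stored values
    i:         index in ma of the location where the current read got its value
    forbidden: the active forbidden set vb_flag (at most two addresses)
    """
    blocked = set(forbidden) | {ma[i], ma[-1]}
    for j, addr in enumerate(ma):
        if addr not in blocked:
            ML[addr] = None              # erase the recycled location
            ma.append(ma.pop(j))         # cyclic rotation of ma[j], ..., ma[dim]
            return addr
    raise RuntimeError("no spare location: dim is too small")


ma = [10, 11, 12, 13, 14]
ML = {a: f"v{a}" for a in ma}
chosen = recycle(ma, ML, i=3, forbidden={10, 11})
```

With five addresses and four of them blocked, exactly one candidate remains; with only four addresses the same call would find nothing to recycle.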
We have proved that both Conditions A and B are satisfied. Since we have assumed that there is only one writer per component, the write operations on each component are linearly [...] writer separately, the reader must know which are the memory locations where this writer might write, so that it does not recycle them. This is implemented by having the reader keep, for each writer separately, two sets, each having as elements at most two possibly forbidden-to-be-recycled memory addresses, exactly as in the single-writer case. During an invocation of the procedure 'recycle', for each writer, again only one of the two forbidden sets is considered active, according to the value of the corresponding flag. It follows that in order to always have a spare memory location to recycle, at least $2m + 3$ memory locations should be kept in ma. We give the formal protocol for this case in Figure 3 (in the formal protocol, we assume that the number of components is arbitrary). The initialization of the variables is, for each component and for each writer, analogous to the single-component, single-writer case (with $2m + 3$ in place of 5).
Proof of Correctness
Conditions A and B mentioned in the introductory paragraphs of Subsection 3.2 remain true, because the communication (other than reading and writing component values) of the reader with each writer is through separate variables. However, in the multiwriter case, in order to show that the conditions of the Basic Lemma are satisfied, we also need to define a total order $\Rightarrow$ among the write operations of each component. Towards this, we first define the tag of a read $r$ to be an integer whose value is equal to the number of read operations that precede $r$. Whenever a memory location is recycled by a read $r$ whose tag is $t$, we say that this memory location gets associated with the tag $t$. This association is kept active until the location is recycled anew; the association is then updated to hold with the tag of the new read invoking the 'recycle' subroutine. Also, we say that a write $w$ is associated with a tag $t$ if the subregister where $w$ is going to write its component value is associated with the tag $t$ at the moment when the write of this value takes place. Now, for each pair of write operations $w$ and $w'$, define $w \Rightarrow w'$ if $w$ and $w'$ are associated with tags $t$ and $t'$, respectively, and either (a) $t < t'$ or (b) $t = t'$ and (consequently) $w$ and $w'$ write [...]

In this subsection, we describe a randomized protocol which satisfies the atomicity requirements, except that for each high-level read $r$ there is an overwhelmingly small and controllable probability that $r$ will erase a value of a write that, otherwise, might have been read by a later read (a run is atomic if we ignore such erased writes).
The idea is (again) to recycle the memory space (which is assumed bounded). The protocol works essentially as in the deterministic case, except that the value to be written to PTR is chosen randomly rather than through the subroutine 'recycle'. To avoid, with high probability, recycling a memory location where a pending write operation may write, we assume that there are sufficiently many locations that are candidates for recycling. The number of these locations, $l$, is determined by the algorithm designer. The protocol is formally given in Figure 4.
Analysis of the protocol's behaviour
The protocol, in order to be correct, must guarantee that Conditions A and B of Subsection 3.2 are satisfied. Indeed, for the protocol under examination, note that a read operation never recycles the location where it got the value it returns. Thus, Condition A is satisfied. However, Condition B is not deterministically satisfied, because it is possible for a read operation $r$ to erase a component value written after the first subread of $r$ from ML by a write operation that started before the start of $r$ (we call such write operations overlapping erased writes). A run following our randomized protocol is atomic if we ignore the overlapping erased writes. Notice, however, that in the case of a single component and a single writer on it, for each read there is at most one overlapping write that starts before the start of the read. The reader scans $l - 1$ memory locations, among which it chooses the one it recycles. It does not choose for recycling the location where it obtains the value that it returns. Therefore the probability of erasing the value of an overlapping write is at most $1/(l - 2)$, where $l$ is the number of memory locations (the value of $l$ is decided by the algorithm designer). Moreover, observe that our randomized protocol works even if the adversary can observe a random bit the moment it is generated. This is so because if the choice of the memory location where a write operation $w$ will write its component value is made after the starting point of a read operation $r$, then the memory location that $w$ will choose does not depend on the random number generated by the read operation $r$. If, on the other hand, the choice of $w$ is made before the starting point of $r$, then, obviously, the random number to be generated by $r$ is not known to the operation $w$ at the moment the choice is made.
We can further improve our probabilistic algorithm as follows: the reader, instead of keeping one memory address (ptr[k]) for each component, keeps a sequence $ptr_1[k], \ldots, ptr_q[k]$ of them, where $q$ is a constant to be chosen by the algorithm designer. These addresses are chosen randomly and uniformly so that they are all different from the location where the read gets the value it returns and from the previous values of the ptr[k]'s. On the other hand, a writer of the component $k$ writes its value on the $q$ memory locations it reads from PTR. Now observe that a write can be erased by the one or more reads that overlap its subwrites. By an easy counting argument, it can be proved that for any particular component, the probability for such an error to take place is $O(m(q/l)^q)$ ($l$ is the number of memory locations). Therefore, by the Bernoulli Inequality, the probability not to erase an overlapping write on any component is $\Omega(1 - mc(q/l)^q)$. From that we get:

Theorem 3. The space complexity of our improved probabilistic protocol as well as the time complexity for a read operation are $O(lc)$. The time complexity for a write is $O(q)$, and the probability for a read to erase an overlapping write is $O(mc(q/l)^q)$ ($q$ and $l$ are chosen by the algorithm designer). A run is atomic if we ignore the overlapping erased writes.
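The Bernoulli step can be spelled out as follows. This is a sketch under assumptions: $K$ denotes the constant hidden in the $O$-bound for a single component (a symbol introduced here only for illustration), $x$ denotes the resulting per-component bound with $0 \le x \le 1$, and the product form treats the components as independent.

```latex
\[
  x = K\, m \left(\frac{q}{l}\right)^{q}, \qquad
  \Pr[\text{no overlapping write is erased on any of the } c \text{ components}]
  \;\ge\; (1 - x)^{c} \;\ge\; 1 - c\,x
  \;=\; 1 - K\, m c \left(\frac{q}{l}\right)^{q}.
\]
```

The second inequality is Bernoulli's inequality $(1 - x)^c \ge 1 - cx$ for $0 \le x \le 1$ and integer $c \ge 1$. A union bound over the $c$ components yields the same final bound without any independence assumption.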
As a remark, notice that if we choose $q = l/m$ (which makes a read operation mc times slower than a write operation, something to be expected since there is only one reader versus mc writers), and if moreover we want a total error expectation of at most $\varepsilon$ in a number N of read operations ($\varepsilon$ and N are given), then it is enough to choose $l \ge m + m \log_m(cN/\varepsilon)$. For example, if we have 100 components with 10 writers each, and if we want a total error expectation of 0.5% in, say, $5 \times 10^{14}$ read operations (if a read operation needs a microsecond to be executed, then $5 \times 10^{14}$ of them, by the same reader, need at least fifteen years), then l must be at least 200, so the space complexity of our construction is of the order of 20,000 registers (as can easily be verified, the constants in the O complexity computations are very small).
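The arithmetic of this example can be reproduced with exact rational numbers. The sketch below assumes that the constant hidden in the O notation is 1 and that $q = l/m$ is an integer, so it searches for the smallest integer q with $N \cdot mc \cdot (1/m)^q \le \varepsilon$ and returns $l = qm$. The function name `min_locations` and its parameter names are hypothetical.

```python
from fractions import Fraction


def min_locations(m, c, n_reads, eps):
    """Smallest l = q*m (q integer) with n_reads * m * c * (1/m)**q <= eps.

    Assumes the hidden constant in O(mc(q/l)^q) is 1 and q = l/m,
    so that (q/l)^q = (1/m)^q. Pass eps as a Fraction for exact results.
    """
    if m < 2:
        raise ValueError("the bound needs at least 2 writers per component")
    target = Fraction(n_reads) * m * c / Fraction(eps)  # need m**q >= target
    q, power = 1, m
    while power < target:
        power *= m
        q += 1
    return q * m


# Example from the text: 100 components, 10 writers each,
# error expectation 0.5% over 5 * 10**14 reads.
l = min_locations(m=10, c=100, n_reads=5 * 10**14, eps=Fraction(1, 200))
print(l, l * 100)  # memory locations, and the l*c space estimate
```

Under these assumptions the search stops at $q = 20$, which agrees with $l \ge m + m \log_m(cN/\varepsilon) = 10 + 10 \cdot 19$.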
An Approach under the Archimedean Assumption
It has been pointed out (see, e.g., [10] or [11]) that in real distributed systems it is reasonable to assume that the ratio of the rates of execution of elementary instructions for arbitrary pairs of processes is bounded by a fixed constant. In other words, it is assumed that the running rates of the clocks of any two processes differ by at most a bounded factor. Systems complying with such a restriction are called Archimedean (this assumption does not imply any restriction on the idle time intervals between two high-level operations by the same process). In this section, we give a protocol for a composite register under the Archimedean assumption. Our construction has the interesting property that both its space and time complexity are independent of the number of processes and are linear only in the number of components. Moreover, the time complexity of a write operation is an absolute constant.
To formalize the above notions, we assume that there is a global time-reference system, which however is not known to the processes (this is not an essential restriction; it is proved in [9] that under some quite general assumptions any system has such a global-time model). Therefore, with every operation (low- or high-level) there is associated a finite time interval, its duration. Now, our assumption of Archimedean time states that there is a fixed integer $A_0$ such that for any two high-level operations a and b and for any time interval I within which a completes the execution of $A_0$ elementary instructions, if b starts before I does, then b completes the execution of at least one elementary instruction before the end of I. It must be pointed out that by elementary instructions we mean instructions at the lowest level (e.g., assignments of variables, tests, calculations of logical or arithmetical expressions, etc.). Observe, however, that for any particular implementation of subregisters, a constant A can be found (depending on $A_0$ and this implementation) such that if a executes A elementary instructions within I, then b will complete at least one sub-operation (subread or subwrite) before the end of I.
The idea of our construction is the following. As explained in the previous subsection, a basic difficulty for a read r in selecting a memory location to be recycled is to avoid an ML[x] where x is an old value of PTR that a "slow" write w (i.e., one that overlaps r and started before the start of r) read in the past. If such an x is chosen, then the value of ML[x] can be erased after w writes on it, and thus the next high-level read may miss values. Notice, however, that such a w must have started before the start of r. So, by the Archimedean-time assumption, if we require r to busy-wait for a sufficient number of its clock ticks after it has written on PTR and before it starts reading the ML[x]'s, we can guarantee that r will see all writes that are to write on an ML[x] with $x \neq ptr$. We give the formal protocol for the reader in Figure 5. Notice that although there is a busy-wait instruction, the length of this wait is constant (independent of the length of any other operation); therefore the protocol is "wait-free". The protocol for the writer is exactly the same as in the probabilistic case. So, we have:
Theorem 4 Under the Archimedean assumption, a single-reader, c-component, m-writers-per-component composite register can be constructed with time and space complexities independent of the number of processes. Specifically, the number of subregisters is $3c + 1$, the number of sub-operations of a read operation is at worst $2c + 1$, while a write has only two sub-operations.
Conclusion
We have dealt with the problem of designing objects (data structures) shared by asynchronous, wait-free readers and writers. We examined the case of a shared array that must be atomically read by a single reader while each entry of the array is written by a set of writers. Constructions for the more general problem of multiple readers were known. However, the complexity of the extant solutions, even for the case of a single reader, is at best quadratic. In this paper, we gave a solution that, for the single-reader case, has linear space complexity, linear time complexity for a read, and constant time complexity for a write. Moreover, again for the single-reader, multi-writer-per-component case, we gave probabilistic algorithms with a very small, controllable probability of error. These algorithms have sublinear space complexity and also sublinear time complexity for a read. The time complexity of a write is still constant. Finally, we examined a model of limited asynchrony known as Archimedean. For this model, we gave a protocol whose time and space complexity do not depend on the number of processes; both are linear in the number of entries of the shared array (the time complexity of a write is still constant).
