Abstract. A distributed shared memory protocol is called memory-adaptive if all writes to MWMR registers are "close to the beginning of shared memory", that is, if the indices of all MWMR registers that processes write to when executing the protocol are functions of the contention. The notion of memory-adaptiveness captures what it means for a distributed protocol to make the most efficient use of its shared memory. We previously considered a store/release protocol where processes are required to store a value in shared MWMR memory so that it cannot be overwritten until it has been released by the process. We showed that there do not exist uniformly wait-free store/release protocols using only the basic operations read and write that are memory-adaptive to point contention. We further showed that there exists a uniformly wait-free store/release protocol using only the basic operations read, write, and read-modify-write that is memory-adaptive to interval contention and time-adaptive to total contention. This left a significant gap, which we close in this paper. We show that no uniform store/release protocol can exist that is memory-adaptive to interval contention and uses only read/write (no read-modify-write) registers. We furthermore illustrate the validity and practicality of the concept of memory-adaptiveness by providing a uniform store/release protocol for Network Attached Disks that is memory-adaptive to interval contention.
Introduction
Shared memory algorithms such as collect or renaming provide essential building blocks for many applications. Most often collect or renaming are designed based on a priori knowledge of an upper bound n on the number of participating processes or of an upper bound N on the ids of participating processes. Algorithms such as collect or renaming, however, become inefficient if only few of the n processes are actually participating. This motivated researchers to look for adaptive algorithms whose step complexity depends only on the number of participating processes. Besides a possibly inefficient use of time, inefficient use of space is also a potential drawback of many distributed algorithms. In particular, many shared memory algorithms require memory space whose size is a function of N (or n) even if only few of the processes are actually participating. Hence, to truly improve the efficiency of distributed algorithms, the step complexity should be made adaptive to the number of participating processes, i.e. the contention, and the space requirements should (if possible) depend on the number of participating processes or the contention. Following this approach one obtains two possible kinds of adaptive algorithms. Algorithms where the step complexity adapts to the contention are traditionally called adaptive. We called such algorithms time-adaptive to distinguish them from algorithms where the memory space consumption adapts to the contention, which we called memory-adaptive [27]. In memory-adaptive algorithms processes are only allowed to write to a shared MWMR register whose index is a function of the contention (possibly point, interval, or total contention) during the process's previous shared memory access.
Time-adaptive algorithms have a worst-case step complexity that is bounded by a function of the number of concurrently participating, or actually active, processes [6]. Motivated by Lamport's MX algorithm [34], many such time-adaptive algorithms have since been designed [3, 4, 6-9, 11, 17, 18, 20-22, 26, 35].
With respect to memory consumption of time-adaptive renaming or collect algorithms, Afek, Boxer and Touitou [5] showed that the number of Multi-Writer Multi-Reader (MWMR) registers used must be a function of N. They specifically show that for any constant d there is a large enough $N_d$ such that every long-lived time-adaptive (to interval contention, and hence, point contention as well) read/write implementation of collect (and renaming) with $N_d$ processes must use at least d MWMR registers. In their paper they use a simple object called weak test and set [15] to derive their impossibility results. More recently Attiya, Fich and Kaplan [19] significantly improved on [5]. They showed that if a collect algorithm is time-adaptive to total contention, namely, its step complexity is f(k), where k is the number of processes that ever became active during the current execution, then it uses $\Omega(f^{-1}(N))$ MWMR registers, where N is the total number of processes in the system.
In this paper we will remove the assumption of a known upper bound on the number of participating processes and consider uniform protocols [16, 29, 33] , i.e., protocols that do not require a priori knowledge of or an upper bound on the number of processes that may participate. At the same time we will assume that the number of participating processes is always finite.
The notion of memory-adaptiveness [27] requires that each(!) write operation that a process makes must be close to the "front" of shared memory. The idea here is that if protocols allow processes to write to registers whose index depends, say, on the process's id, and no upper bound on the number of participating processes is known in advance, then memory must be unpredictably large. On the other hand, if we can guarantee that the memory required by each protocol that runs on a shared memory system is a bounded function of the contention, then a distributed operating system can allocate large memory blocks to each protocol on an ad hoc basis and, on rare occasions when needed, increase or decrease the individual allocations as necessary.
Also, if processes are allowed to write to registers with arbitrary indices in time-adaptive protocols they eventually must move the values they wrote close to the beginning of memory for the protocol to stay time-adaptive during solo executions. Hence ideally processes will want to register in a fixed finite subset of the infinite set of MWMR registers that we called "close to the beginning of shared memory" [27] .
Consider the renaming problem [3, 4, 6, 7, 10, 20, 36] for example: processes are allowed to use any shared MWMR register during the execution of the protocol, even a register with an extremely large index, but the final result must lie within a bounded distance from the front of shared memory. In the definition of memory-adaptiveness, to capture the notion of having to write close to the front of shared memory every time, we require processes to write to a MWMR register whose index is a function of the contention during the previous operation of the same process.
In [27] we investigated simple tasks, store and release, that require a given process to store a value in shared MWMR memory that cannot be overwritten by any other process and then to erase the value when no longer needed, freeing the memory for other processes to use.
We studied whether these simple commands can be implemented memory-adaptively under different assumptions about the contention of the protocol.
We showed that in a system with infinitely many MWMR registers and infinitely many SWMR registers:
1. There is no uniform, long-lived, memory-adaptive to point contention implementation of store/release that uses only read/write registers.
2. There does exist a uniform, long-lived implementation of store and release in the read-write model that is memory-adaptive to total contention.
3. Allowing write-plus (read-modify-write), there exists a uniform, long-lived implementation of store and release in the read-write model that is memory-adaptive to interval contention.
The question remained, however, whether in this setting there exists a uniform, long-lived, memory-adaptive to interval contention store/release protocol that uses only read/write registers. This is of particular interest because one could argue that "true adaptiveness" really starts with adaptiveness to interval contention, since adaptiveness to total contention allows the memory requirements to grow independently of the contention during operations. Moreover, even though we were able to show that there exists a protocol memory-adaptive to interval contention using read-modify-write registers, we were not able to justify the use of these stronger primitives. In this paper we will close this gap and hence significantly strengthen our previous results.
Our Approach
To prove our impossibility result we will use a covering construction as in the impossibility proof for memory-adaptiveness to point contention in [27]. Our proof, however, is more complex and requires greater care. In [27] we were able to select all potentially participating processes in advance and construct the run that produced the contradiction as a single run. Using the pigeonhole principle we simply reduced the set of participating processes in pseudo-solo runs more and more until all MWMR registers at the beginning of shared memory (that are accessed in solo runs) were covered. All processes that were covering one of the MWMR registers at the end of the construction had been participating from the beginning. If we want to obtain a contradiction to memory-adaptiveness to interval contention, however, we cannot proceed in this manner. Every time a new register is covered we must choose a new set of processes that never acted before, since otherwise processes could receive information about the increasing interval contention, allowing them to write to more ("new") MWMR registers and making it impossible for us to get a contradiction. In other words, when we extend the covering from the first j to the first j + 1 MWMR registers that processes write to during their pseudo-solo runs, we first eliminate the traces of the process that now covers the (j + 1)st register. This is done by releasing covering writes to overwrite the traces of this process. In [27] we were simply able to cover each of the required MWMR registers with infinitely many processes and then release covering writes as needed. Here we are not able to do this anymore. Instead, to ensure that our construction is memory-adaptive to interval contention, we must rebuild our covering each time the covering writes have been released.
Otherwise it might be possible for processes to detect that some of the processes involved in these covering writes were concurrently active with them and hence allow them to write to MWMR registers outside the bounded (by a constant) range of MWMR registers at the "beginning of shared memory". As we would cover more registers, processes would be able to memory-adaptively write to more "new" MWMR registers outside the "beginning of shared memory" making it impossible to get a contradiction.
As a result, our construction is similar to the construction in [5]. We note, however, that the result there does not directly imply our result, since it assumes a finite number of available MWMR registers while we allow for infinitely many available MWMR registers. Therefore we must always ensure that at all important steps of our construction processes are only able to write to the bounded set of registers at the "beginning of shared memory". We will do this by making these processes believe that they execute the protocol solo. Also, in [5] algorithms are assumed to be time-adaptive, which allows the execution length of each participating process to be bounded, since any time-adaptive protocol is by definition wait-free. Here we do not assume that our protocols are time-adaptive. Instead, to bound the execution length (otherwise processes could simply keep reading each other's SWMR registers), we assume the protocol to be uniformly wait-free, that is, that the length of all executions is uniformly bounded. The covering techniques used in our impossibility proofs first appeared in [24] to show some bounds on the number of registers necessary for mutual exclusion. Similar covering arguments were used in many recent papers to prove space and time lower bounds. For example, see the survey by Fich and Ruppert [28].
Network Attached Disks
In the second part of the paper we will consider an important and natural application of memory-adaptive algorithms. Recent advances in storage technology [32] have enabled systems like Storage Area Networks [13, 14, 23, 37, 39], which have network attached disks or NADs. A NAD is a simple device that just executes requests to read and write blocks of data. It can be accessed by any process in the system, so that the NADs effectively become a shared storage medium that can be used to solve distributed problems such as consensus, as in Disk Paxos [1, 25, 30]. Unlike message-passing systems, which typically require a majority of processes to be correct to avoid partitioning, NADs allow protocols that can withstand the crash of any number of processes. Therefore, like conventional shared-memory models, the model allows uniform protocols. One difficulty of this model is that a NAD can fail by crashing and thereby become inaccessible.
In [12] Aguilera, Englert and Gafni showed that one cannot uniformly implement a MWMR register on a NAD with a finite number of fail-prone base registers, even if the implementation need not be wait-free. Therefore, one cannot use the standard technique of implementing a MWMR register by first implementing SWMR registers: doing so would blow up the space complexity.
This implies the need for infinitely many base registers. Since this is, however, an unrealistic assumption for real NADs, memory-adaptive algorithms are of particular practical interest on such disks. If uniform protocols that require infinitely many base registers and that run on NADs are memory-adaptive to interval contention, they will remain practical since they will allow us to efficiently bound the memory requirements based on this contention.
Based on our impossibility result, we will show how store/release can be implemented on Active Disks uniformly and memory-adaptively to interval contention. This is possible since read-modify-write objects are available on Active Disks [2, 38].
Related Work
Uniform protocols have been studied (e.g., [16, 33]), particularly in the context of ring protocols. Adaptive protocols, i.e. protocols whose step complexity is a function of the size of the participating set, have been studied in [6-8, 26, 35]. Long-lived adaptive protocols that assume some huge upper bound N on the number of processes, but require the complexity of the algorithm to be a function of the concurrency, have been studied in [3, 4, 9-11, 20, 21, 36].
Contributions
We summarize the contributions of our paper.
1. Interval contention (Theorem 1): We show that in a system with infinitely many MWMR registers and infinitely many SWMR registers, for any constant d, there exists a number $N_d$ such that if $N_d$ processes are allowed to participate then there does not exist a memory-adaptive (to INTERVAL contention) implementation of store/release. In other words, we show that under these conditions processes cannot memory-adaptively store a value in shared memory. This closes a gap that remained open in [27] and implies the impossibility of uniform memory-adaptive to point contention algorithms [27] with all its consequences. Moreover, it justifies the use of read-modify-write registers and similar stronger primitives to uniformly implement memory-adaptive to interval contention store/release [27].
2. We present a uniform implementation of memory-adaptive to interval contention store/release on Active Disks. While it was shown [12] that one cannot uniformly implement a MWMR register on a NAD with a finite number of fail-prone base registers, this result provides a realistic and practical building block for algorithms on NADs where an upper bound on the interval contention can be enforced.
Paper Organization: We first review our model in Section 2, followed by our impossibility proof in Section 3. We conclude with a transfer of our algorithm to NADs (Section 4) and some final remarks (Section 5).
Model and Preliminaries
For our impossibility result we use the standard shared-memory model of distributed computation. There are infinitely many processes, each modeled by an infinite-state machine that is capable of unbounded computation. Processes participate in a distributed deterministic asynchronous protocol and are indexed by the positive natural numbers so that each process "knows" its own "name." There are two areas of memory: the single-writer multi-reader (SWMR) space and the multi-writer multi-reader (MWMR) space. In the SWMR space each register is associated with a distinct process so that only this process is allowed to write to this register while all other processes are able to read it. Each SWMR register can store an unbounded number of bits. The MWMR registers have all the same properties as the SWMR registers with the exception that any process may both read and write to any register.
Processes access the memory space using basic atomic operations. The atomic operations we will allow in this paper are read, write, and read-modify-write.
-READ: To execute a read command, a process specifies a register to be read; upon completion of the read, the process has obtained a snapshot of the contents of the specified register.
-WRITE: A process specifies which register to write to (in either private or shared memory) and the data to be written. Upon completion of the write command, all previous data is overwritten with the new data specified by the process. (Note that we do not allow a process to overwrite "part" of a register.)
-READ-MODIFY-WRITE (RMW): The RMW command allows the unbreakable execution of the following code (where X is a shared variable and f is a mapping):
function RMW(X, f)
begin
    temp ← X;
    X ← f(X);
    return(temp);
end
A protocol is an algorithm that accomplishes a task using basic operations. An adaptive protocol is one in which the resources consumed by the protocol are functions of the number of processes that actually participate in the protocol (a.k.a. active processes) rather than the total number of processes. In an adaptive protocol, the amount of resources (time or space) consumed is a function of the contention. The contention can be measured in three different ways, affecting the strength of adaptiveness of a protocol. Total contention refers to the total number of processes that become active during the entire execution of the protocol. Interval contention during a given process's protocol is defined to be the total number of processes that become active during the execution interval of that process's protocol. Finally, point contention during a given process's protocol refers to the maximum number of processes that are simultaneously active at any point during the execution interval of that process's protocol.
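The three contention measures can be illustrated with a minimal Python sketch. It makes simplifying assumptions that are not part of the model above: each process is active during exactly one half-open time interval [start, end), a process counts towards its own contention, and interval contention counts every process whose activity overlaps the given interval. The function names and the input format are hypothetical.

```python
def total_contention(intervals):
    """Number of processes that become active during the whole execution."""
    return len(intervals)


def interval_contention(intervals, pid):
    """Number of processes active at some time during pid's interval."""
    start, end = intervals[pid]
    return sum(1 for (a, b) in intervals.values() if a < end and start < b)


def point_contention(intervals, pid):
    """Maximum number of simultaneously active processes during pid's interval."""
    start, end = intervals[pid]
    events = []
    for a, b in intervals.values():
        if a < end and start < b:
            # Clip each overlapping interval to pid's own interval.
            events.append((max(a, start), 1))
            events.append((min(b, end), -1))
    # At equal times, process departures (-1) before arrivals (+1),
    # which matches the half-open interval convention.
    events.sort()
    active = best = 0
    for _, delta in events:
        active += delta
        best = max(best, active)
    return best


# Hypothetical execution: q and r each overlap p, but not each other.
intervals = {"p": (0, 10), "q": (1, 3), "r": (4, 6), "s": (12, 14)}
print(total_contention(intervals))          # 4
print(interval_contention(intervals, "p"))  # 3
print(point_contention(intervals, "p"))     # 2
```

In this example the point contention of p is smaller than its interval contention, which in turn is smaller than the total contention; this is the sense in which the three measures affect the strength of adaptiveness.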
A protocol is time-adaptive to a particular type of contention if the maximum number of basic operations executed during the protocol by any given process is a bounded function of the contention type. This type of definition has been studied extensively [3, 6, 4, 7-9, 11, 17, 18, 20-22, 26, 35] .
We say that a basic operation is memory-adaptive [27] to a type of contention if and only if the following is true. Whenever a process executes a basic operation, if the next basic operation changes the state of a shared memory register, the index of the register at which this change occurs is a bounded function of the contention (point, interval or total) at the time of the previous basic operation. (In the asynchronous model, without loss of generality, we may assume that the first basic operation in any protocol is a read, which does not change the state of any register.) In other words, a process can read wherever it wants, but it can only write to places that are as close to the "front" of shared memory as possible. Most time-adaptive algorithms that were presented [3, 6, 4, 7-9, 11, 17, 18, 20-22, 26, 35] are not memory-adaptive in this sense. They might however force that the final result of a computation lies within a bounded distance of the "front" of shared memory.
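As an illustration of this definition, the following Python sketch checks a recorded trace of one process against a candidate bound f. The trace format is an assumption made for this example only: each entry pairs the contention observed at the previous basic operation with the index of the register changed by the next operation, or None if that operation was a read. The names is_memory_adaptive, trace and f are hypothetical.

```python
def is_memory_adaptive(trace, f):
    """Return True if every write in the trace respects the bound f.

    trace: list of (contention_at_previous_op, written_index_or_None).
    f:     bound mapping a contention value to the largest allowed index.
    """
    return all(index is None or index <= f(k) for k, index in trace)


# Hypothetical bound: with contention k, only the first 2k registers may be written.
bound = lambda k: 2 * k

ok_trace = [(1, None), (1, 2), (3, 5)]   # read, write to 2 <= 2, write to 5 <= 6
bad_trace = [(1, None), (1, 3)]          # write to 3 > 2 while running solo

print(is_memory_adaptive(ok_trace, bound))   # True
print(is_memory_adaptive(bad_trace, bound))  # False
```

Reads never violate the condition in this sketch, which mirrors the remark that a process can read wherever it wants.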
The three protocols that we will focus on in this paper are store, release and Weak Test and Set.
-STORE: A data value is specified in advance by the process. The goal is for the process to store the data value in some shared register in such a way that upon completion the process knows that the value will not be moved or erased by any other process until the register is explicitly released.
-RELEASE: Assumes execution of a previous store protocol. Upon completion, the shared register occupied by the process is released.
-WEAK TEST AND SET (WT&S): Based on a test&set object. A WT&S object guarantees safety: no two processes are able to concurrently "set the bit". However, liveness is only guaranteed in solo executions: if two or more processes access a WT&S object concurrently, it is possible that none of them captures the bit (i.e., none of the participating processes reads 0 as the bit's value). We model the behavior of a weak test and set object with the following program (Figure 1). Each process is in one of four possible states: thinking, WT & Set, eating and RESET. A WT&S object satisfies the following two properties:
• Exclusion: At most one process is eating at any system state of the execution.
• If a process becomes hungry, that is, leaves the thinking state, while all other processes are thinking, and only this process takes steps, then it must eventually start eating.
Note that STORE and RELEASE are fundamental building blocks useful for many distributed protocols (e.g. collect, mutual exclusion, consensus, approximate agreement, and so on). WEAK TEST AND SET on the other hand is useful in the proof of lower bounds. We call a protocol uniformly wait-free if there exists a uniform bound applicable to all processes on the number of basic operations that the protocol requires before termination. All protocols considered in this paper will be uniformly wait-free. We make the following definitions:
-A system state consists of the state of all processes and the value of all registers in the system. A system has one or more initial system states in which the system starts its execution. 
Interval Contention
In [27] we showed that there is no uniform, long-lived and memory-adaptive to point contention store and release protocol. We will now strengthen this result by showing that there is no uniform and memory-adaptive to interval contention weak test and set protocol. We begin by showing that we can implement WT&S from memory-adaptive store and release. The reduction uses the fact that store/release is uniformly wait-free.
Reduction from memory-adaptive and uniformly wait-free store and release to memory-adaptive and uniformly wait-free WT&S: In the uniform memory-adaptive store and release problem, processes repeatedly store and release values in shared memory. The index of the MWMR registers to which they write must be in the range {1, ..., f(k)}, where k is the number of processes that are active concurrently with the process that is trying to store or release a value. So when a process runs solo, the index of the MWMR registers it writes to must be bounded by some constant f(1). In an implementation of WT&S from memory-adaptive store and release we use one copy of the memory-adaptive store and release object. To perform the WT&S operation a process first attempts to memory-adaptively store the value "active" in shared memory. If at any point in time during the execution of the algorithm it writes to a MWMR register with an index greater than f(1), it fails the WT&S and, if it already stored a value, releases the value it stored. Otherwise it reads all other f(1) MWMR registers to see if any other process was able to concurrently store a value in shared memory. If it sees any other process as active, it fails WT&S and releases the value. Otherwise it wins the WT&S object. To release the WT&S object a process releases the value it stored in shared memory. This clearly satisfies the required properties and implements the desired object.
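The control flow of this reduction can be sketched in Python. The sketch is sequential and purely illustrative: ToyStoreRelease is a hypothetical stand-in that stores a value in the first free register and reports which indices it wrote; it is not the store/release protocol of [27] and it does not model concurrency or atomicity. The parameter f1 plays the role of the constant f(1).

```python
class ToyStoreRelease:
    """Hypothetical sequential stand-in for a store/release object."""

    def __init__(self, size):
        self.registers = [None] * (size + 1)  # registers are 1-indexed

    def store(self, pid, value):
        """Store value in the first free register; return (slot, written indices)."""
        for index in range(1, len(self.registers)):
            if self.registers[index] is None:
                self.registers[index] = (pid, value)
                return index, [index]
        raise RuntimeError("toy memory exhausted")

    def release(self, slot):
        self.registers[slot] = None

    def read(self, index):
        return self.registers[index]


class WeakTestAndSet:
    """WT&S built on one store/release object, following the reduction in the text."""

    def __init__(self, store_release, f1):
        self.sr = store_release
        self.f1 = f1          # bound on indices written in a solo run
        self.slots = {}       # pid -> register currently held

    def wt_and_set(self, pid):
        slot, written = self.sr.store(pid, "active")
        # Writing beyond index f(1) means the run was not solo: fail.
        if any(index > self.f1 for index in written):
            self.sr.release(slot)
            return False
        # Read all other registers among the first f(1); fail if anyone is stored.
        for index in range(1, self.f1 + 1):
            if index != slot and self.sr.read(index) is not None:
                self.sr.release(slot)
                return False
        self.slots[pid] = slot
        return True

    def reset(self, pid):
        self.sr.release(self.slots.pop(pid))


wts = WeakTestAndSet(ToyStoreRelease(size=8), f1=2)
print(wts.wt_and_set(1))  # True: process 1 runs solo and wins
print(wts.wt_and_set(2))  # False: process 2 sees process 1 as active
wts.reset(1)
print(wts.wt_and_set(2))  # True: the object is free again
```

Note that a losing process always releases the value it stored, so the first f(1) registers are not permanently occupied by failed attempts.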
We furthermore assume that each process has only one single-writer multi-reader (SWMR) register: all SWMR registers of a process can always be replaced by a single SWMR register.
A condition or property holds in a run if it holds at the end of that run (unless we state otherwise).
It hence suffices to show that there is no uniform memory-adaptive to interval contention (uniformly wait-free) Weak Test And Set implementation using only read/write registers.
The Theorem
Clearly, if an implementation of a weak test and set object is memory-adaptive then there exists a constant i such that no solo run segment writes to a MWMR register with an index greater than i. We say that the algorithm is "i-solo-memory-adaptive".
Theorem 1. For any constant i there is no long-lived, uniformly wait-free, i-solo-memory-adaptive to interval contention implementation of Weak Test & Set in a system with infinitely many processes and infinitely many MWMR and SWMR read/write registers.
Note that in contrast to [5] we do not require our algorithms to be time-adaptive. Hence the result in [5] does not immediately imply our result. Moreover, the number of available MWMR registers is now unbounded, that is, when covering writes are released, processes that detect contention can possibly write to more than the first k MWMR registers. After addressing such issues our proof proceeds in a similar manner as [5].
The proof is by way of contradiction. First, assume that there is a memory-adaptive to interval contention WT&S implementation with infinitely many MWMR registers for a system with infinitely many processes. Then we show that under these conditions there is a run in which two processes p and q are in the critical section, i.e. are eating at the same time.
1. We construct a run prefix α such that the state at the end of α is transparent with respect to p, and every MWMR register that p writes in its pseudo-solo run starting at α is covered. As in [5] we construct this cover inductively.
2. Let $solo_p$ be the pseudo-solo run segment of process p starting after α. Hence p is eating in $\alpha \cdot solo_p$. Let $\{r_1, \ldots, r_{i'}\}$ be the set of MWMR registers written by p in $solo_p$, where $i' \le i$.
3. We now enable the covering writes and wait until all processes reach a thinking state. This is guaranteed by the fact that we are dealing with a uniformly wait-free WT&S implementation.
4. We ensure that processes that are active do not detect each other by selecting them in such a way that they do not read each other's SWMR registers. (This also follows from the protocol being uniformly wait-free. We show later in detail that this is possible.)
5. We select a process q that does not read the SWMR register of p. This process will enter the critical section together with p, a contradiction.
Sketch of the Proof of the Main Lemma
The proof is based on [5]. We construct α by first, for explanatory reasons, making strong assumptions. We then remove these assumptions to obtain the claimed result. We use the following notation. For an infinite set $R_{MW}$ of MWMR registers, we consider W to be an i-solo-memory-adaptive implementation of WT&S in the Read/Write shared memory model. Note that by the definition of memory-adaptiveness i must be a constant.
Phase 1:
Assumption A: There are no write operations to SWMR registers in all legal runs. That is, we assume for the moment that there is a uniformly wait-free WT&S protocol that is i-solo-memory-adaptive to interval contention and uses no SWMR registers.
Assumption B: If G is a set of processes and s is a state that is transparent with respect to G, then during their pseudo-solo runs starting at s all processes in G write in the same MWMR registers in the same order.
These assumptions will later be removed. We will be able to remove assumption B because of the i-solo-memory-adaptiveness of the algorithm, that is processes can only write to a fixed number of MWMR registers in pseudo-solo runs and the fact that our protocol is uniformly wait-free. Hence using a Ramsey theoretic argument we can find a large enough set of processes that will write in the same order into these registers.
In the following lemma α is denoted by s · β and satisfies the properties of α. Property 1: the state at the end of s · β is transparent with respect to some set of processes called $G_e$. Property 2: there is a cover on all the MWMR registers written by processes in $G_e$ in their pseudo-solo run segments starting after s · β. The size of $G_e$ is a parameter and is determined in the full proof.
Proof. In the full proof we show that $n_e = n_{e,i}$ is a function of i, j and e. The proof is similar to [5] and can be found in the full version of the paper.
Phase 2:
We now relax assumption A. To do this we use techniques developed in [5] . The run constructed in the previous lemma may not be valid anymore, as processes are allowed to write to their SWMR registers. The argument presented above may collapse in one of the following two ways:
1. The participating processes in any clean run segment may read the SWMR registers of other active processes. In particular, they may read the SWMR register of the processes whose traces their writes are supposed to eliminate. They would then leave the system in a non-transparent state by writing about the value they read.
2. After a clean run segment, a process q might start its q-segment execution and may read the SWMR register of another concurrently active process p. Hence, q will not perform a pseudo-solo run anymore, that is, it may write to a MWMR register with an index greater than i and it may stop without covering the MWMR registers. Moreover, q may decide "on the spot" to write into different MWMR registers than what we originally planned.
As in [5] or [12] , we will avoid the two dangerous situations by not allowing processes, whose SWMR registers are later read to take part in the constructed run. So, if in any given state in the run, if process q reads the SWMR register of process p and p is active, we construct another run in which p is replaced by another process p . Process q will still read the same SWMR registers. The behavior of p and p is in some sense "equivalent". They both write and cover the same MWMR registers. All we need to do is to show that a process like p always exists since (1): There is a large enough set of processes to select p from s.t., p did not participate in the run before and has the same general properties as p. We can do this since at any give point in time at most finitely many processes participate in the execution while infinitely many processes are available. (2): Process q can perform only a constant number of read operations, since the number of concurrently active processes in the run is a function of d and k and since the algorithm is uniformly wait-free.
We maintain a large enough set of 'equivalent' runs, which allows us to replace at any point in time at which we fail to reach a transparent state. This set will shrink as the construction progresses. In our construction, whenever a process p that was previously selected to participate in the run is discovered by a covering process, we need to replace it with some other process p that cannot be discovered. We achieve this by considering an equivalent run in which p takes steps instead of p. This allows us to restate the central inductive lemma as follows: 
Note that we also need to modify the proof of the Main Theorem along the lines of the proof of this lemma.
Proof. Similar to [5] . For lack of space we leave it to the full paper. It remains to remove Assumption B. During a WT&SET operation processes are now allowed to write to different MWMR registers in different orders. This means that the cover we constructed earlier might not be on the "correct" registers anymore since two processes p and q may write into the MWMR registers in different orders.
To overcome this difficulty we first recall that we are only interested in pseudo-solo runs. We know, however, that processes executing such runs are only allowed to write to the first i MWMR registers in shared memory. Hence in pseudo-solo runs the number of MWMR registers under consideration is a constant. Second, we recall that our algorithm is uniformly wait-free, that is, the length of every pseudo-solo run is bounded by a constant. Hence we can consider the different sequences of write operations to MWMR registers by the different pseudo-solo run segments of processes in G. The number of these sequences is bounded by a function of i and m, where m is the uniform bound on the length of a solo execution. Each such sequence defines an equivalence class in G. Since G is infinite and the number of classes is finite, we can always find an infinite subset of processes that in pseudo-solo runs perform the same sequence of writes to MWMR registers starting at s.
But since in two different states s and s′ that are transparent with respect to G the sequence of MWMR registers that processes in G write to in pseudo-solo runs need not be the same, the required subset of processes cannot be computed in advance. Instead it is computed iteratively in rounds as in [5].
We restate the main inductive lemma with Assumption B removed.
Proof. We leave the complete proof to the full paper. It is similar to [5].
4 Uniform Memory-adaptive algorithms for NAD's
We now discuss what our results imply for the design of memory-adaptive algorithms (e.g. store/release) for NAD's. Earlier in this paper we showed that there is no uniformly wait-free, uniform store/release protocol that is memory-adaptive to interval contention and uses only read/write registers. In [12] it was shown that one cannot uniformly implement a MWMR register on a NAD with a finite number of fail-prone base registers, even if the implementation need not be wait-free. Uniform implementations therefore need infinitely many base registers. Since this is an unrealistic assumption, memory-adaptive algorithms are of particular practical interest in the uniform setting on NAD's: if a uniform protocol that nominally requires infinitely many base registers is memory-adaptive, it remains practical, because the memory it actually uses is bounded by a function of the contention. Hence memory-adaptive algorithms are not only attractive but essential for uniform algorithms on NAD's.
In [27] we provided a uniform store/release implementation that is memory-adaptive to interval contention and uses a stronger primitive, namely an operation we called write-plus, which is weaker than the standard read-modify-write. The write-plus operation is obtained by requiring that the function f in the definition of read-modify-write (see the model section) be a constant independent of X (the value read).
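The relation between the two primitives can be sketched as follows. This is a minimal single-threaded illustration, assuming the usual convention that read-modify-write stores f(X) and returns the old value X; the model section's exact definition may differ, and the class `Register` and its method names are hypothetical.

```python
from typing import Any, Callable


class Register:
    """Toy register; each method stands for one atomic operation."""

    def __init__(self, initial: Any = None) -> None:
        self.value = initial

    def read_modify_write(self, f: Callable[[Any], Any]) -> Any:
        # Assumed semantics: store f(X) and return the value X that was read.
        old = self.value
        self.value = f(old)
        return old

    def write_plus(self, constant: Any) -> Any:
        # write-plus: f is a constant function, independent of the value read.
        return self.read_modify_write(lambda _old: constant)


r = Register(0)
first = r.read_modify_write(lambda x: x + 1)  # new value depends on X
second = r.write_plus(7)                      # new value ignores X
```

Under this convention write-plus still returns what it overwrote, but the value it stores cannot depend on it.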
Active Disks [32], on the other hand, are capable of supporting stronger semantics that are not normally provided by disk drives. In particular they can provide read-modify-write operations. Our results imply that read/write registers are not sufficient to run realistic uniform algorithms on a NAD, that is, algorithms that are memory-adaptive to interval contention. Our results thus justify the use of Active Disks in the uniform setting. We now show how to implement store/release that is memory-adaptive to interval contention on Active Disks when disks, and hence registers, may fail.
Proof. We first recall our algorithm from [27], which is memory-adaptive to interval contention and uses read-modify-write registers:
We assume that memory is arranged in the form of a two-dimensional grid, this time indexed by N × N . Whenever a process executes a read-modify-write into shared memory, it keeps a copy of what was previously written there in its private memory space along with whatever it writes into the register. As a result the process always has a complete record of all of its operations starting from the beginning till the current time in its private space along with the values that it overwrites. During each store and with each write, the process keeps track of the number of times it has stored a value in shared memory. Each write will contain a field with this parameter. The algorithm uses splitters [36] . We assume that splitters are able to hold values. Each process when executing the algorithm attempts to capture a splitter so that it can store its value in this splitter.
Using these assumptions we showed in [27] that a process can tell whether a splitter is "clean" or "dirty". In other words, given a splitter, the process can tell whether there exists another process that has previously written into the splitter's slot #1 and has since neither written into slot #2 nor written into some other shared register. Based on this, processes execute the following protocol. Whenever a process executes a store, it begins at splitter (i, j) = (1, 1). If the splitter is taken and holds a value, the process moves to (i + 1, 1). If the splitter is dirty, it moves to (i, j + 1). If the splitter is clean, it competes: it writes its name into slot #1 and checks slot #2. If there is a "new" name in slot #2 (i.e. a name that has been written in the splitter after the process started competing), the process moves to (i + 1, 1). If there is no new name, the process writes its name into slot #2 and checks slot #1. If there is a new name in slot #1, the process moves to (i, j + 1). If the process's own name is still written in slot #1, the process has won the splitter and the right to use its value register. It notes this in the register and writes its value.
In order to execute a release, the process simply indicates that the splitter is now clean. Also in [27] we showed that this protocol is time-adaptive to total contention and memory-adaptive to interval contention.
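The traversal of the splitter grid described above can be sketched as a sequential simulation. This is only an illustration under strong assumptions, not the algorithm of [27]: steps run one at a time in a single thread, the clean/dirty test is replaced by a hypothetical `dirty` flag, a "new" name is detected by comparing against a snapshot rather than by the store counters, and release is modeled by resetting the splitter. All class and function names are introduced here.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional, Tuple


@dataclass
class Splitter:
    slot1: Optional[str] = None
    slot2: Optional[str] = None
    owner: Optional[str] = None  # set once a process has won the splitter
    value: Any = None
    dirty: bool = False          # hypothetical stand-in for the test of [27]


class Grid:
    """Unbounded grid of splitters, created lazily on first access."""

    def __init__(self) -> None:
        self.cells: Dict[Tuple[int, int], Splitter] = {}

    def at(self, i: int, j: int) -> Splitter:
        return self.cells.setdefault((i, j), Splitter())


def store(grid: Grid, name: str, value: Any) -> Tuple[int, int]:
    i, j = 1, 1
    while True:
        s = grid.at(i, j)
        if s.owner is not None:      # taken with a value: next row
            i, j = i + 1, 1
            continue
        if s.dirty:                  # dirty: next column
            j += 1
            continue
        seen2 = s.slot2              # snapshot used to spot a "new" name
        s.slot1 = name
        if s.slot2 != seen2:         # new name in slot #2 (needs concurrency)
            i, j = i + 1, 1
            continue
        s.slot2 = name
        if s.slot1 != name:          # new name in slot #1 (needs concurrency)
            j += 1
            continue
        s.owner, s.value = name, value   # won the splitter
        return (i, j)


def release(grid: Grid, pos: Tuple[int, int]) -> None:
    # Modeled as a reset; the paper only says the splitter is marked clean.
    grid.cells[pos] = Splitter()


g = Grid()
pos_p = store(g, "p", "a")
pos_q = store(g, "q", "b")
release(g, pos_p)
pos_r = store(g, "r", "c")
```

In this sequential model the two "new name" branches never fire; they are kept to mirror the protocol's control flow under concurrency.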
We now transfer this algorithm to Active Disks. To do so we use Active Disks that provide read-modify-write registers. Active Disks, however, may fail. To make the algorithm fault-tolerant, assuming that at most t disks may fail, we simply let each process execute each store on 2t + 1 Active Disks. Since at most t of the 2t + 1 disks fail, each process is guaranteed to receive responses from at least t + 1 disks, i.e. a majority, so it suffices to wait for these responses when executing either store or release.
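The replication step can be sketched as follows. This is a simplified sequential simulation, assuming crash failures modeled as disks that never respond; an asynchronous implementation would return as soon as t + 1 responses arrive instead of visiting every disk. The names `ActiveDisk` and `replicated_op` are hypothetical.

```python
from typing import Any, List, Optional


class ActiveDisk:
    """Toy disk: records operations unless it has failed."""

    def __init__(self, failed: bool = False) -> None:
        self.failed = failed
        self.log: List[Any] = []

    def apply(self, op: Any) -> Optional[str]:
        if self.failed:
            return None          # a failed disk never responds
        self.log.append(op)
        return "ack"


def replicated_op(disks: List[ActiveDisk], t: int, op: Any) -> int:
    """Send op to all 2t + 1 disks and require a majority of responses."""
    if len(disks) != 2 * t + 1:
        raise ValueError("expected exactly 2t + 1 disks")
    acks = sum(1 for d in disks if d.apply(op) == "ack")
    if acks < t + 1:
        raise RuntimeError("more than t disks failed")
    return acks


t = 2
disks = [ActiveDisk() for _ in range(3)] + [ActiveDisk(failed=True) for _ in range(2)]
acks = replicated_op(disks, t, ("store", "p", "a"))
```

With t = 2 and two failed disks, the three remaining responses form a majority of the five disks.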
