We prove a lower bound of Ω(log n / log log n) for the remote memory reference (RMR) complexity of abortable test-and-set (leader election) in the cache-coherent (CC) and the distributed shared memory (DSM) models. This separates the complexities of abortable and non-abortable test-and-set, as the latter has constant RMR complexity [25]. Golab, Hendler, Hadzilacos and Woelfel [27] showed that compare-and-swap can be implemented from registers and TAS objects with constant RMR complexity. We observe that a small modification to that implementation is abortable, provided that the TAS objects it uses are abortable.
Introduction
In this paper, we study the remote memory reference (RMR) complexity of abortable test-and-set. Test-and-set (TAS) is a fundamental shared memory primitive that has been widely used as a building block for classical problems such as mutual exclusion and renaming, and for the construction of stronger synchronization primitives [35, 39, 19, 14, 8, 7, 6, 27]. We consider a standard asynchronous shared memory system in which n processes with unique IDs communicate by reading and writing shared registers. A TAS object stores a bit that is initially 0, and provides two methods: TAS(), which sets the bit and returns its previous value, and read(), which returns the current value of the bit. TAS is closely related to mutual exclusion [17]: a TAS object can be viewed as a one-time mutual exclusion algorithm, where only one process (the one whose TAS() returned 0) can enter the critical section [18].
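The TAS interface described above can be sketched as follows. This is only an illustration of the sequential specification: the class name `TASObject`, the helper `run_once`, and the use of a lock to model atomicity are assumptions made for this sketch, and the lock says nothing about RMR cost.

```python
import threading


class TASObject:
    """Illustrative test-and-set bit; each method is made atomic by a lock."""

    def __init__(self):
        self._bit = 0                    # the bit is initially 0
        self._lock = threading.Lock()    # models atomicity only

    def tas(self):
        """Set the bit and return its previous value."""
        with self._lock:
            previous = self._bit
            self._bit = 1
            return previous

    def read(self):
        """Return the current value of the bit."""
        with self._lock:
            return self._bit


def run_once(num_threads=4):
    """Each thread calls tas() once; return the sorted results and the final bit."""
    obj = TASObject()
    results = []
    results_lock = threading.Lock()

    def worker():
        value = obj.tas()
        with results_lock:
            results.append(value)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return sorted(results), obj.read()
```

Exactly one caller observes 0, which is what makes a TAS object usable as a one-time mutual exclusion (leader election) primitive.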
TAS objects have consensus-number two, and therefore they have no wait-free implementations. In particular, in deterministic TAS implementations, processes may have to wait indefinitely, by spinning (repeatedly reading) variables. It is common to predict the performance of such blocking algorithms by bounding remote memory references (RMRs). These are memory accesses that traverse the processor-to-memory interconnect. Local-spin algorithms achieve low RMR complexity by spinning on locally accessible variables. Two models are common: In distributed shared memory (DSM) systems, each shared variable
Abortable Compare-And-Swap in the CC Model
In this section we consider the cache-coherent (CC) model. Each process obtains a cache copy with each read of a register, and the cache copy is invalidated only if some process later writes to the same register. Writes as well as reads of non-cached registers incur RMRs, while reads of cached registers do not.
A CAS object provides two operations, CAS(cmp, new), and read(). Operation read() returns the current value of the object. Operation CAS(cmp, new) writes new to the object, if the current value is cmp, and otherwise does not change the value of the object. In either case it returns the old value of the object.
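As a point of reference, the sequential behaviour of the two operations can be written down directly. The class below is a hypothetical illustration of the specification only (the name `CASObject` is an assumption of this sketch); it is not the register-and-TAS implementation discussed in this section and it is not thread-safe.

```python
class CASObject:
    """Sequential specification of a compare-and-swap object (illustrative)."""

    def __init__(self, initial):
        self._value = initial

    def cas(self, cmp, new):
        """Write new if the current value equals cmp; always return the old value."""
        old = self._value
        if old == cmp:
            self._value = new
        return old

    def read(self):
        """Return the current value of the object."""
        return self._value
```

A caller can tell whether its CAS() took effect by comparing the returned old value with `cmp`.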
Golab et al. [26] gave an implementation of CAS from TAS and registers, which has constant RMR complexity in the CC model, i.e., each CAS() and each read() operation incurs only O(1) RMRs. In this section we show how to make that implementation abortable, provided that we have access to abortable TAS objects. The pseudocode is in Figure 1. The original (non-abortable) version of the code is shown in black and our additional code to make it abortable in red (lines 6 and 20).
Figure 1: Implementation of (abortable) NameDecide() and CAS(). Without lines 6 and 20 the algorithms are equivalent to the non-abortable implementations in [26].
From TAS to Name Consensus
The implementation in [26] first constructs a name consensus object from a single TAS object T . This implementation provides a method NameDecide(), which each process is allowed to call at most once. All NameDecide() calls return the same value (agreement), which is the ID of a process calling NameDecide() (validity).
The non-abortable implementation in [26] uses a TAS object T and a register leader that is initially ⊥. In a NameDecide() call, a process p first calls T.TAS(). If the TAS() returns 0, then p wins, and writes p to leader. Otherwise, p loses, and so it repeatedly reads leader until leader ≠ ⊥, upon which p can return the value of leader. It is easy to see (and was formally proved in [26]) that this is a correct name consensus algorithm.
We now show how this implementation can be made abortable, assuming the TAS object T is abortable. We assume that when a process receives the abort signal, a static process-local variable abort, which is initially false, changes to true.
Recall that abortability requires that the return value of a TAS() operation indicates whether it failed or succeeded. We assume a failed TAS() simply returns ⊥. In NameDecide(), processes only wait for leader to change. If a process receives the abort signal while waiting for leader to change, then it can also simply return ⊥. The rest of the algorithm is the same as the original name consensus algorithm.
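The following Python sketch mirrors the description of abortable NameDecide() given in the text. It is not the pseudocode of Figure 1. The names `NameConsensus`, `name_decide`, and `BOTTOM` are assumptions of this sketch, the abort flag is modelled by a `threading.Event`, and the abortable TAS object T is replaced by a lock-protected stand-in that never needs to fail because it never waits.

```python
import threading

BOTTOM = None  # stands for the symbol ⊥


class NameConsensus:
    """Sketch of abortable name consensus from one TAS object and one register."""

    def __init__(self):
        self._bit = 0
        self._bit_lock = threading.Lock()
        self.leader = BOTTOM             # register leader, initially ⊥

    def _tas(self, abort):
        # Stand-in for the abortable TAS object T (assumption of this sketch).
        # A real abortable TAS may return BOTTOM when aborted while waiting.
        with self._bit_lock:
            previous = self._bit
            self._bit = 1
            return previous

    def name_decide(self, pid, abort):
        result = self._tas(abort)
        if result is BOTTOM:             # the TAS() call failed after an abort
            return BOTTOM
        if result == 0:                  # pid won the TAS and announces itself
            self.leader = pid
            return pid
        while self.leader is BOTTOM:     # loser: wait until leader is written
            if abort.is_set():           # added abort check while waiting
                return BOTTOM
        return self.leader
```

In this sketch a failed call returns `BOTTOM` without writing `leader`, which matches the argument that a failed operation can be removed from the execution without affecting other processes.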
Clearly, the new code (line 6) does not affect RMR complexity, and following an abort the code is wait-free. Moreover, correctness (validity and agreement) in case of no failed NameDecide() operations follows immediately from correctness of the original algorithm. If a NameDecide() operation fails (i.e., returns ⊥), then it did not change any shared memory object (its TAS() must have either failed, or returned 1). Hence, removing an aborted and failed NameDecide() operation from the execution does not affect any other processes, and therefore the resulting execution must be correct.
From Name Consensus to Compare-And-Swap
We now show how the abortable name consensus algorithm can be used to obtain abortable CAS. Consider the implementation of CAS(cmp, new) on the right-hand side in Figure 1. The black code is logically identical to the one in [26]. It uses a register D that points to a page, which stores two registers, value and flag, as well as a name consensus object N. Register value at the page pointed to by D stores the current value of the object. (Thus, a read() operation, for which we omit the pseudocode, simply returns D→value.) The CAS() operation assumes a wait-free method getNewPage(), which returns an unused page from a pool of pages (for simplicity assume this pool has infinitely many pages, but there are methods for wait-free memory management that allow using a bounded pool [27, 3]).
For a description of how the algorithm CAS(cmp, new) works, we refer to [26] . We can prove that the abortable version presented here is correct, provided that the non-abortable version (with line 20 removed) is: First of all, obviously line 20 does not change the RMR complexity. Moreover, if a process receives the abort-signal, then its abortable NameDecide() call terminates within a finite number of steps, and the process also does not wait in the while-loop, so its CAS() call completes within a finite number of its steps. Finally, notice that a CAS() call returns ⊥ only if an abort signal was received, and in that case no shared memory objects are affected (the process cannot have won the NameDecide() call). Hence, all aborted and failed operations can be removed from the execution without changing anything for the remaining operations. As a result we obtain Theorem 1.
RMR Lower Bound for Abortable Leader Election
In this section, we give an overview of the RMR lower bound proof for abortable leader election (and thus TAS) as stated in Theorem 3. First, we define some notation, the system model, RMR complexity, and the abortable leader election problem.
Lower Bound Preliminaries
System Model and Notation. For a set Q and a non-negative integer k, $Q^k$ denotes the set of all sequences of length k that contain only elements of Q. Furthermore, $Q^*$ denotes the set of all sequences that contain only elements of Q.
For the lower bound we assume a set $\mathcal{P}$ of n processes, and an arbitrarily large but finite set R of shared registers. Processes are infinite state machines. In each shared memory step (corresponding to a state transition), a process either reads or writes a register in R. At an arbitrary point, a process may also receive an abort signal which does not result in a shared memory access, but in a state change of that process, provided the process has not earlier received the abort signal. Once a process has reached a halting state, it will remain in that state forever, and does not execute any further shared memory steps.
For each process $p \in \mathcal{P}$, we define a special abort symbol $p^\top$. For a set $P \subseteq \mathcal{P}$ let $P^\top = \{p^\top \mid p \in P\}$ and $P^\Delta = P \cup P^\top$. A configuration is a sequence that describes the state of each process in $\mathcal{P}$ and each register in R. A schedule is a sequence σ over $\mathcal{P}^\Delta$. Thus, any schedule σ is in $(\mathcal{P}^\Delta)^*$. The length of a schedule σ is denoted by |σ|. Let $\sigma_1$ and $\sigma_2$ be two schedules. Then $\sigma_1 \bullet \sigma_2$ is the schedule obtained by concatenating $\sigma_2$ to the end of $\sigma_1$, without changing the order within $\sigma_1$ and $\sigma_2$. Let Proc(σ) denote the set of processes $p \in \mathcal{P}$ that occur in σ at least once, not counting symbols in $\mathcal{P}^\top$. A configuration C and a schedule $\sigma \in \mathcal{P}^\Delta$ of length one result in a new configuration Conf(C, σ), obtained from C by process p taking its next step, if $\sigma = p \in \mathcal{P}$, or by process p receiving the abort signal, if $\sigma = p^\top \in \mathcal{P}^\top$.
To specify that an execution starting in C and running by schedule σ is running algorithm A, we use $\mathrm{Exec}_A(C, \sigma)$. The length of an execution E is denoted by |E|. We call $s_i$ an abort step by process p, if in $s_i$ process p receives the abort signal. Let $E_1$ and $E_2$ be two executions. Then $E_1 \bullet E_2$ is the execution obtained by concatenating the steps of $E_2$ after the steps of $E_1$, without changing the order of steps within $E_1$ and $E_2$.
The initial configuration is denoted by Γ. A configuration C is reachable, if there exists a schedule σ such that Conf(Γ, σ) = C. Since only reachable configurations are important in our algorithms and proofs, we use configuration instead of reachable configuration from this point on. For a configuration C we let $\sigma_{\to C}$ denote an arbitrary but unique schedule such that $\mathrm{Conf}(\Gamma, \sigma_{\to C}) = C$, and we define $E_{\to C} = \mathrm{Exec}(\Gamma, \sigma_{\to C})$.
The projection of a schedule σ to a set Q ⊆ P ∆ is denoted by σ|Q. For an execution E and a set Q of processes, E|Q denotes the sub-sequence of E that contains all (abort and shared memory) steps by processes in Q.
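The schedule notation can be made concrete with a small sketch. Here a schedule is represented as a tuple of symbols, where a symbol is a pair (process, is_abort); this representation and the helper names `step`, `abort`, `delta`, `proc`, and `project` are assumptions of the sketch, not notation from the paper. Concatenation of schedules is plain tuple concatenation.

```python
def step(p):
    """Symbol for process p taking its next shared memory step."""
    return (p, False)


def abort(p):
    """Abort symbol for process p (p receives the abort signal)."""
    return (p, True)


def delta(procs):
    """All step and abort symbols of the processes in procs."""
    return {step(p) for p in procs} | {abort(p) for p in procs}


def proc(sigma):
    """Processes occurring in sigma at least once, not counting abort symbols."""
    return {p for (p, is_abort) in sigma if not is_abort}


def project(sigma, symbols):
    """Sub-sequence of sigma consisting of the symbols in the given set."""
    return tuple(s for s in sigma if s in symbols)
```

For example, with `sigma = (step("a"), abort("b"), step("b"), step("a"))`, the call `project(sigma, delta({"b"}))` keeps only the abort symbol and the step of b, in their original order.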
Recall that a configuration C determines the state of each process. I.e., for any two executions E and E′ resulting in the same configuration C, each process is in the same state at the end of E as at the end of E′, and in particular E|p = E′|p. Therefore, we associate the state of a process in configuration C with $E_{\to C}|p$. (But note that if two executions E and E′ are indistinguishable to each process in $Q \subseteq \mathcal{P}$, then this does not in general imply that E|Q = E′|Q.) The value of register r in configuration C is denoted by $\mathrm{val}_C(r)$. Configurations C and D are indistinguishable to some process p, if $E_{\to C}|p = E_{\to D}|p$ and $\mathrm{val}_C(r) = \mathrm{val}_D(r)$ for every register r ∈ R. For a set $Q \subseteq \mathcal{P}$, we write $C \sim_Q D$ to denote that configurations C and D are indistinguishable to each process in Q; for a set consisting of a single process p we write
RMR Complexity. Our lower bound applies to both the standard asynchronous distributed shared memory (DSM) model and the cache-coherent (CC) model. In fact, we use a model that combines both: caches as well as locally accessible registers for each process.
We assume that the set of registers, R, is partitioned into disjoint memory segments $R_p$, for $p \in \mathcal{P}$. The registers in $R_p$ are local to process p and remote to each process q ≠ p. We say that at the end of execution E a process p has a valid cache copy of register r, if in E process p reads or writes r at some point, and no other process writes r after that. Note that the configuration obtained at the end of an execution starting in Γ uniquely determines whether p has a valid cache copy of a register r. The reason is that the state of p in configuration C determines the value that was written to or read from r when p accessed r last, and p has a valid cache copy of r if and only if $\mathrm{val}_C(r)$ equals that value. Let $\mathrm{Cache}_p(C)$ denote the union of $R_p$ and the set of registers of which process p has a valid cache copy in configuration C if p has not terminated in C, and the empty set if p has terminated in C.
A step in an execution E is either local or remote (we say it incurs an RMR if it is remote). All abort steps are local. A non-abort step by process p is local if and only if it is either a read or a write of a register in $R_p$, or it is a read of a register of which p has a valid cache copy.
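The local/remote classification in the combined DSM/CC model can be traced on a small example. The sketch below assumes an execution is given as a list of triples (process, op, register) with op one of "read", "write", or "abort", and a dictionary `owner` mapping each register to the process whose memory segment contains it; the function name `count_rmrs` and this encoding are assumptions of the sketch.

```python
def count_rmrs(execution, owner):
    """Return a dict mapping each process to its number of remote steps."""
    cached = {}   # register -> processes currently holding a valid cache copy
    rmrs = {}
    for p, op, r in execution:
        rmrs.setdefault(p, 0)
        if op == "abort":
            continue                      # abort steps are always local
        holders = cached.setdefault(r, set())
        is_local = owner[r] == p or (op == "read" and p in holders)
        if not is_local:
            rmrs[p] += 1                  # the step incurs an RMR
        if op == "write":
            holders.clear()               # a write invalidates other copies
        holders.add(p)                    # reader or writer now has a valid copy
    return rmrs
```

Note that, following the definition in the text, a write to a register outside the writer's own segment is remote even if the writer holds a valid cache copy of it.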
For an execution E and a process p, $\mathrm{RMR}_p(E)$ is the number of RMR steps by process p in execution E. For $Q \subseteq \mathcal{P}$ we define $\mathrm{RMR}_Q(E) = \sum_{q \in Q} \mathrm{RMR}_q(E)$, which is equal to the total number of RMRs incurred by processes in Q in E. For the sake of conciseness, we write RMR(E) instead of $\mathrm{RMR}_{\mathcal{P}}(E)$, i.e., RMR(E) is the number of RMR steps incurred by all processes in execution E.
Abortable Leader Election. An algorithm solves abortable leader election, if for any schedule σ, in Exec(Γ, σ) each process that terminates returns win or lose, at most one process returns win, and if all processes in Proc(σ) return lose, then all processes in Proc(σ) receive the abort signal.
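The three conditions of this definition can be checked mechanically on the outcome of a single execution. The checker below is a hypothetical helper for illustration: `returns` maps each terminated process to its return value, `participants` plays the role of Proc(σ), and `aborted` is the set of processes that received the abort signal.

```python
def satisfies_abortable_le(returns, participants, aborted):
    """Check the abortable leader election conditions for one execution."""
    # Every process that terminates returns win or lose.
    if any(v not in ("win", "lose") for v in returns.values()):
        return False
    # At most one process returns win.
    if sum(1 for v in returns.values() if v == "win") > 1:
        return False
    # If all participants return lose, all of them must have been aborted.
    all_lose = all(returns.get(p) == "lose" for p in participants)
    if all_lose and not participants <= aborted:
        return False
    return True
```

A process that has not terminated has no entry in `returns`, so an execution in which some participant is still running never triggers the third condition.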
We usually assume without explicitly saying so that an abortable leader election algorithm satisfies deadlock-freedom and bounded abort, defined as follows: Bounded abort means that after a process has received the abort signal it terminates within a finite number of its own steps. An infinite schedule σ is P-fair for $P \subseteq \mathcal{P}$, if each process in P appears infinitely many times in σ. An infinite execution E is P-fair for $P \subseteq \mathcal{P}$, if for some configuration C and a P-fair schedule σ, it holds that E = Exec(C, σ). We use fair schedule and fair execution, instead of P-fair, when $P = \mathcal{P}$. An algorithm is deadlock-free if for any schedule σ all processes terminate in Exec(Γ, σ), provided this execution is fair.
Properties of Abortable Leader Election
In this section we derive the critical property that distinguishes non-abortable from abortable leader election for the purpose of the lower bound. We consider algorithms in which each process returns either win or lose upon termination. We call such algorithms binary. Note that any (abortable) leader election algorithm is a binary algorithm.
Several results in this section will concern only two arbitrarily selected processes in the n-process system for n ≥ 2. For ease of notation, we will call these processes a and b.
For an execution E of a binary algorithm in which a returns x and b returns y, let (x, y) denote the outcome vector of E. For a binary algorithm A and a configuration C, let V A (C) denote the set of all outcome vectors of {a, b}-only executions starting in C, in which processes a and b terminate.
First we observe that the outcome vectors of two indistinguishable configurations are equal. Proof. Since C and D are indistinguishable to processes a and b, E →C |a = E →D |a, E →C |b = E →D |b, and for any register r, val C (r) = val D (r). Thus, for any x in {a, b}
|b, and for any register r, val Conf (C,x) (r) = val Conf (D,x) (r). So by induction, for any {a, b}-only schedule σ, Conf (C, σ) ∼ {a,b} Conf (D, σ). Therefore, if in Exec(C, σ) process p ∈ {a, b} terminates, it also terminates in Exec(D, σ) and it returns the same value in both executions. Hence, the outcome vector
. This definition of bivalency refers to two fixed but arbitrarily chosen processes, a and b. In a system with more than two processes, we may write {a, b}-bivalent to indicate the two processes a and b to which this definition applies. A configuration is strongly bivalent (or strongly {a, b}-bivalent) if it is bivalent and a solo-run by any process p ∈ {a, b}, starting in C, results in p winning.
A similar argument to the FLP Theorem [20] implies that for any deadlock-free binary algorithm and for any reachable bivalent configuration, there exists an infinite execution, where no process terminates. ◮ Lemma 5. Let A be a deadlock-free binary algorithm and C an {a, b}-bivalent configuration. There exists an infinite schedule σ ∈ {a, b} * , such that in Exec A (C, σ) none of a and b terminate.
To prove this lemma we first prove Claim 6 and use the fact that none of a and b can be terminated in an {a, b}-bivalent configuration.
◮ Claim 6. In any deadlock-free binary algorithm A, if configuration C is {a, b}-bivalent, then either one of Conf (C, a) and Conf (C, b) is {a, b}-bivalent, or there exists an infinite {a, b}-only execution, where none of a and b terminates.
(1)
We now distinguish two cases. Case 1: In C, processes a and b are poised to access different registers or poised to read the same register. Thus,
By (1)
But this contradicts deadlock-freedom, as in a fair schedule starting in Conf (C, b • a) both processes must terminate and output something.
Case 2: In configuration C, both processes are poised to access the same register r, and at least one of them is poised to write r. Without loss of generality, assume that a is poised to write register r. If a takes its write step after b's step, then a's state and shared register values are no different than if only a takes its write step and b does not take its step. So Conf (C, a) ∼ a Conf (C, b • a). If process a does not terminate in a solo-run starting in Conf (C, a), then the claim is true, because there exists an infinite execution starting in C that neither a nor b terminates. However, if process a terminates in a solo-run starting in Conf (C, a), by (1), we can conclude that (
Any deadlock-free (non-abortable) 2-process leader election algorithm has a bivalent initial configuration. But in any fair schedule, both processes terminate. Therefore, the infinite execution that is guaranteed by the above corollary cannot be fair; in particular, it requires one of the two processes to run solo at some point. However, one can construct a deadlock-free (non-abortable) leader election algorithm in which one process never takes an infinite number of steps, no matter what the schedule is. The lemma below shows that this is not true for abortable two-process leader election algorithms. Proof. Let Γ be the initial configuration of A. For the purpose of contradiction, assume there is a fixed process, a, that terminates within a finite number of its own steps in all executions. Let b be the other process.
By the safety property of abortable leader election, there is no execution in which both processes win, i.e.,
Let algorithm A ′ be the same as A except that during any execution,
(1) if any of the two processes receives the abort signal, the abort signal is ignored; and (2) if in step s process b reads (a, x), where x ≠ ⊥, then b continues its program as if it had received the abort signal immediately after step s.
In any execution of A ′ , a and b can only both lose, if they both receive the abort signal. Since both ignore the abort signals (and only b possibly simulates having received an abort signal), there is no execution of A ′ in which a and b both lose. Thus, for the initial
Consider any execution
We now create an execution E = Exec(Γ, σ) of A starting in Γ, by scheduling the processes in exactly the same order as in E′, but removing all abort signals. Moreover, when for the first time b reads a value of (a, x) in E, where x ≠ ⊥ (if that happens), then we send process b the abort signal. By construction of A′, processes a and b execute exactly the same shared memory steps in execution E of algorithm A as in execution E′ of algorithm A′. Thus, for every schedule σ′ there is a schedule σ such that processes a and b execute in $\mathrm{Exec}_{A'}(\Gamma', \sigma')$ the same shared memory steps as in $\mathrm{Exec}_A(\Gamma, \sigma)$. This implies
Note that in the construction above, if σ ′ is fair, then so is σ. Hence, the fact that A is deadlock-free implies
In algorithm A, in a sufficiently long solo-run by a, in which a does not receive the abort-signal, process a terminates (by deadlock-freedom) and returns win (by the safety property of abortable leader election). Hence, in A ′ process a also terminates and returns win after a sufficiently long solo-run, because it takes exactly the same steps as in A. Since A ′ is deadlock-free by (6) , process b terminates after a sufficiently long solo-run following a's solo-run, and by (3) process b returns lose. With a symmetric argument, for algorithm A ′ , in a sufficiently long solo-run by b followed by a sufficiently long solo-run of a, process b returns win and process a returns lose. Hence, {(win, lose), (lose, win)} ⊆ V A ′ (Γ ′ ). Using (3) and (4) we conclude
We will now show that A ′ is wait-free. This together with (7) contradicts Lemma 5, and thus proves the lemma.
Recall that in every execution of algorithm A process a terminates within a finite number of its own steps. As a result, the same is true for A′. Hence, it suffices to show that b terminates within a finite number of its own steps. Suppose there is an execution E* of A′ in which b executes an infinite number of steps. Then b never reads a value of (a, x), where x ≠ ⊥, as otherwise it would simulate having received the abort signal in A, and then terminate after a finite number of steps. Since b never reads a value of (a, x), where x ≠ ⊥, it cannot distinguish E* from a solo-run starting in Γ′. Hence, b does not terminate in such an infinite solo-run. This contradicts (6). ◭
One of the core properties of the abortable leader election problem that allows us to prove the lower bound is that there are no reachable strongly bivalent configurations.
◮ Lemma 8. Let A be an abortable n-process leader election algorithm with bounded aborts for n ≥ 2. Further, let C be a reachable configuration and a, b two distinct processes that terminate in any {a, b}-fair execution starting in C. For any schedule
Proof. Suppose C is strongly {a, b}-bivalent. Then it is {a, b}-bivalent, so
and if a or b runs solo in C, then that process wins. Because σ ∈ P * , neither a nor b receives the abort-signal in Exec(Γ, σ). By the assumption that aborts are bounded, processes a and b both terminate in sufficiently long solo runs starting in Conf (C, a ⊤ ) and Conf (C, b ⊤ ), respectively. Let x and y be the return values of a in Exec(C, a
Similarly, since
We distinguish the following cases.
Applying a symmetric argument to a sufficiently long solo-run by a following
Hence, using (8), we get (win, lose), 
We get a contradiction for the same reasons as in Case 1.
Case 3: {x, y} = {win, lose}: Without loss of generality, assume x = win. Then in $\mathrm{Exec}(C, a^\top a^{k_a})$ process a wins. On the other hand, since C is strongly bivalent, b wins in a sufficiently long solo-run starting in C. Since $C \sim_b \mathrm{Conf}(C, a^\top)$, process b also wins in a long enough solo-run starting in $\mathrm{Conf}(C, a^\top)$. Hence, we have shown that each of the two processes in {a, b} wins in a solo-run starting in $\mathrm{Conf}(C, a^\top)$. By deadlock-freedom and (8) the other process loses, if it performs a long enough solo-run afterwards. This shows that $\mathrm{Conf}(C, a^\top)$ is strongly bivalent. Now let A′ be the 2-process algorithm in which a and b act exactly as in algorithm A, but the initial configuration is $\Gamma' = \mathrm{Conf}(C, a^\top)$. Then A′ is a deadlock-free abortable 2-process leader election algorithm with bounded aborts: The bounded abort property is inherited from A. Deadlock-freedom follows from the assumption that a and b terminate in any fair execution starting in C. The safety property of abortable leader election follows from (8) and the fact that each process wins in a long enough solo-run starting in the initial configuration $\mathrm{Conf}(C, a^\top)$ (because that configuration is strongly bivalent). Moreover, in A′ process a always terminates within a finite number of its own steps. This follows from the bounded abort property of A and the fact that both processes simulate A starting in configuration $\mathrm{Conf}(C, a^\top)$, in which a has already received the abort signal. This contradicts Lemma 7. ◭
Properties of Executions and Safe Configurations

Additional Assumptions
We make the following assumptions that do not restrict the generality of our results. Recall that processes are state machines, each using some infinite state space Q. We assume that during an execution a process never enters the same state twice. Further, we assume that each register stores a pair in $\mathcal{P} \times (Q \cup \{\bot\})$, where ⊥ ∉ Q. The initial value of each register in $R_p$ is (p, ⊥), and when a process p writes to any register, it writes a pair (p, x), where x is p's state before its write operation. I.e., we are using a full information model, where processes write all information they have observed in the past. As a result, no two writes in an execution write the same value. Each process's first shared memory step is a read outside of its local shared memory segment, which we call the invocation read, and which thus incurs an RMR. Adding such a step to the beginning of each process's program does not affect the asymptotic RMR complexity of the algorithm. We will assume that at the end of its execution, each process p reads all registers in $R_p$ once. Since those reads do not incur any RMRs, this assumption can be made without loss of generality. We call p's last read of register $r \in R_p$ the terminating read of r, and we assume that after p's last terminating read, p will immediately enter a halting state.
Terminology and Notation
We define some additional terms and notation.
We say process p is visible on register r in configuration C if $\mathrm{val}_C(r) = (p, x)$, for some x ∈ Q. Let L(C) be the set of processes that have lost in configuration C.
When we construct our high RMR execution, we need to make sure that whenever a process gains information about some other process that has not yet lost, someone pays for that with an RMR. To keep track of who knows who, we define a set K(C) that contains pairs (p, q) of processes. Informally, (p, q) is in K(C) if p has already gained information about process q in the execution leading to configuration C, or p can gain such information for "free" (i.e., without an RMR being paid for that). Gaining information does not only mean that p reads a register that q has written; it means anything that might affect p's execution, e.g., p's cache copies being invalidated. K(C) is the union of three sets K 1 (C), K 2 (C), and K 3 (C), defined as follows:
$K_1(C)$ is the set of all pairs (p, q), p ≠ q, such that in $E_{\to C}$ process p reads a register while process q is visible on that register. I.e., p reads a value of (q, x), where x ∈ Q. Informally: p has learned about q in $E_{\to C}$.
$K_2(C)$ is the set of all pairs (p, q), p ≠ q, such that in $E_{\to C}$ process q takes at least one shared memory step and process p reads a register in $R_q$. Informally: Process p may have a valid cache copy of a register $r \in R_q$, and by writing to r process q can invalidate that cache copy without incurring an RMR.
$K_3(C)$ is the set of all pairs (p, q), p ≠ q, such that in $E_{\to C}$ process p takes at least one shared memory step, and q writes to a register $r \in R_p$ before p's terminating read of r. Informally: p may learn about q without incurring an RMR by scanning all its registers in $R_p$.
Recall that in our inductive construction of an RMR expensive execution, we will sometimes erase processes from the constructed execution. For that reason, if p knows about q, i.e., (p, q) ∈ K(C), then we will not remove a process q from the execution E →C . We achieve this by ensuring that whenever (p, q) ∈ K(C), q ∈ L(C), and as discussed earlier no lost processes will be erased.
However, we have to be careful about cases in which p does not know directly about q. For example, suppose process q writes to register r in execution E, and later some process z overwrites r and finally p becomes poised to read r. In our inductive construction we may want to remove either z or p from the execution, because we do not want z to be discovered by p. However, removing z reveals q on register r, and so now p may discover q. To account for that we introduce the concept of hidden processes.
In particular, for a configuration C and a register r we define a set $H_r(C)$ of processes hidden on r as follows:
(H1) For $r \notin R_p$, $p \in H_r(C)$ if and only if either p does not access r in $E_{\to C}$, or p accesses r in $E_{\to C}$ at some point t, and either no process writes r after t, or at least one process that writes r after t is in L(C). Idea: If p's write to r was overwritten by some processes, then at least one of them has lost and thus will not be erased from the execution. Hence, erasing a process does not reveal p's write to any other process.
(H2) For $r \in R_p$, $p \in H_r(C)$ if and only if any process other than p that writes to r in $E_{\to C}$ is in L(C). Idea: If a process q wrote to a register r in p's local memory segment, then q has lost. Therefore, q will not be erased from the execution. This is important because p can read r for free and we have to assume that it does so frequently, so erasing q from the execution might change what p observes in the execution.
Let $H(C) = \bigcap_{r \in R} H_r(C)$. We say process p is hidden in configuration C, if p ∈ H(C).
We finally define the concept of a safe configuration as follows. Configuration C is safe, if
(S1) for each pair (p, q) ∈ K(C), q ∈ L(C); and
(S2) for each process p, p ∈ H(C), or p ∈ L(C), or p takes no shared memory step in $E_{\to C}$.
The first property ensures that no process p knows another process q that has not yet lost, and the second property says that all processes that are not hidden must have lost, or not even started participation. As a result, in an execution leading to a safe configuration, we can erase all processes that do not lose, without affecting any other processes. Formally, we will prove for a schedule σ, a safe configuration C = Conf(Γ, σ) and a set of processes
is also safe.
Forcing Processes to Lose
Lemma 8 is a core lemma in the construction of an RMR-expensive execution: it lets us force two processes to lose starting in a reachable configuration from which the two processes terminate in any fair execution of those two processes and each wins in its solo execution. Proof. For the purpose of contradiction, assume that for any execution Exec(C, σ), where $\sigma \in (\{a, b\}^\Delta)^*$, in which a and b both terminate, one of the processes wins. Then (lose, lose) ∉ $V_A(C)$. Since a solo-run by either a or b, starting in C, results in that process winning, C is strongly {a, b}-bivalent. This contradicts Lemma 8. ◭
Projections
We continue by proving properties of the projection operation. First, the projection of a schedule to a superset of lost processes, P , does not change the execution of those processes, if any process that is known by a process in P is lost.
Proof. We prove the claim by induction on the length of σ. If σ is the empty schedule, then the claim is trivially true. Now suppose that σ = σ ′ λ, where λ ∈ P ∆ is a schedule of length one, and the inductive hypothesis is true for σ ′ , i.e.,
Let D = Conf (Γ, σ ′ ), and
Then it follows from (14) that
, which completes the inductive step.
If $\lambda \notin P^\Delta$, then each of the two executions on the left and right hand side of (15) is the empty execution, so (15) is true. Now suppose $\lambda \in \{p, p^\top\}$ for some process p ∈ P. Then in Exec(D, λ) = Exec(D, λ)|P, either process p receives the abort signal or process p executes a shared memory operation. First assume that p receives the abort signal or writes some value x to a shared register r in that step. By (14) process p is in the same state in D as in D′, so p receives the abort signal or writes x to register r, respectively, in
contains a write to r by some process q. Since only processes in P take steps in that execution, q ∈ P . But since q does not write in (14) . Now assume Exec(Γ, σ ′ ) contains a write to r, and let w be the last such write, executed by some process q. Thus, val D (r) = (q, x) for some value x ∈ Q. Since in Exec(D, λ)
by the assumption of the claim that C is safe. Because L(C) ⊆ P, it follows that q ∈ P. Therefore, by (14), q's write w, with value (q, x), also occurs in $\mathrm{Exec}(\Gamma, \sigma'|P^\Delta)$, and q does not write to r again after w. By the assumption that $\mathrm{val}_D(r) \neq \mathrm{val}_{D'}(r)$, $\mathrm{Exec}(\Gamma, \sigma'|P^\Delta)$ must contain another write w′ that is executed after w by some process q′ ≠ q. All steps in that execution are performed by processes in P, so q′ ∈ P. But then by (14), w and w′ are executed in the same order in Exec(Γ, σ′), contradicting that w is the last write to r in that execution. ◭ If C is a safe configuration, then by (S1) q ∈ L(C) for each pair (p, q) ∈ K(C). Hence, from Claim 10 we immediately get:
The projection of a schedule leading to a safe configuration to a superset of lost processes does not change the cached values of those processes.
◮ Claim 12. Let σ be a schedule, P ⊆ P, C = Conf (Γ, σ), and
Proof. Let E = Exec(Γ, σ), and E ′ = Exec(Γ, σ|P ∆ ). Since C is safe, and L(C) ⊆ P , by Theorem 11,
Fix a process
′ some process q writes to r after step s. Since only processes in P take steps in E ′ , q ∈ P . But then by (17) process q also writes to r after step s in E|P and thus in E-a contradiction.
We now prove
in E ′ process p accesses r and no process writes to r after p's last access.
By (17), p also accesses r in E|P , and thus in E. For the purpose of a contradiction assume r ∉ Cache p (C). Therefore, some process writes to r in E after p's last access of r. Since C is safe, p ∉ L(C), and p takes at least one shared memory step in E →C , we obtain from (S2) that p ∈ H(C). Thus, by the assumption that r ∉ R p , by (H1) at least one process, q, that writes to r in E after p's last access of r, must be in L(C). Therefore, q ∈ P . Since p ∈ P , by (17), q writes r after p's last access in E ′ . This contradicts (18). ◭
Removing a winning process from a schedule that leads to a safe configuration does not affect the state and cache values of other processes.
Proof. Because p wins in Exec(Γ, σ), we have L(C) ⊆ P ⊆ P. Now the claim follows immediately from the fact that C is safe and Theorem 11 and Claim 12. ◭
Safe Configurations
The following claims and lemmas describe the properties of safe configurations. First we show that if starting in a safe configuration, a process that has not yet received the abort signal takes a step which does not incur an RMR, then the resulting configuration is also safe.
◮ Claim 14. Let C be a safe configuration and x ∈ P roc(σ →C ), such that
Proof. Let s be the single step Exec(C, x), and r the register accessed in s. Since x takes at least one shared memory step in E →C (because x ∈ P roc(σ →C ) and x ⊤ does not appear in σ →C ),
Suppose s does not incur an RMR. To prove that C ′ is safe, we will first show that C ′ satisfies (S1). Suppose not. Then there exists a pair (p, q)
Since C is safe, (p, q) ∉ K(C), i.e.,
By Claim 22,
By (21) there is an index j ∈ {1, 2, 3} such that
For each of j ∈ {1, 2, 3} we will show that this is impossible.
If
Since by (19), s is not q's first shared memory step in E →C • s, in step s process p reads r ∈ R q , and p does not read any register in R q throughout E →C . Hence, q takes at least one shared memory step in E →C .
Since s does not incur an RMR, r ∈ Cache p (C), and so p reads or writes r in E →C , and no other process writes r after that. If p reads r ∈ R q during E →C , then by (23) (p, q) ∈ K 2 (C), which is a contradiction. Hence, in E →C process p writes r, and no other process writes r after that. Since r ∈ R q and p ∉ L(C) (as in C process p is poised to execute step s), we have q ∉ H r (C) according to (H2), and thus, q ∉ H(C). By (20), q ∉ L(C) and by the claim assumption q takes at least one step in E →C . Therefore, (S2) is not satisfied, which contradicts the assumption that C is safe.
, then either s is a write by process q and r ∈ R p , or s is p's first shared memory step. The latter is not possible because of (19) . And if the former is the case, then s incurs an RMR, which contradicts the assumption that RMR Exec(C, x) = 0. Thus, we have shown that C ′ satisfies (S1). We will now prove that C ′ also satisfies (S2). Suppose not. Then there exists a process
and p takes at least one shared memory
∈ H(C), then by (S2) process p takes no shared memory steps in E →C . As p takes a shared memory step in E →C ′ = E →C • s we have x = p, and in particular s is x's first shared memory step. This contradicts (19) .
, and so since p ∈ H(C), by (H2) process z does not write v in E →C . Hence, Exec(C, x) is a write to v ∈ R p by z ≠ p, and this write incurs an RMR. This contradicts the claim assumption, RMR(Exec(C, x)) = 0.
Now suppose v ∉ R p . Let q ′ ≠ p be the process such that v ∈ R q ′ . Because p ∈ H v (C), there is a non-empty set Z of processes that write v after p's last access of v during E →C , and 
We now show that a process p, which executes a solo-run starting from a safe configuration, must eventually either terminate or incur an RMR.
◮ Claim 15. Let C be a safe configuration, and let p be an arbitrary process in P roc(σ →C ) \ L(C), such that p ⊤ does not appear in σ →C . There exists a non-negative integer k, such that in Exec(C, p k ), process p terminates or incurs an RMR.
Proof. Assume that there exists a process p that does not terminate and does not incur any RMRs in an infinite solo-run starting in C. Let P = L(C) ∪ {p} and σ = σ →C . Since C is safe, p ∈ P roc(σ →C ), and p incurs no RMRs in its solo-run starting in C, the conditions of Claim 14 are met. Hence, for any non-negative integer t, by applying Claim 14 t times,
Since only p takes steps in Exec(C, p t ), and p does not terminate in its solo-run starting in C, we obtain L(Conf (C, p t )) = L(C) ⊆ P . This together with (24) allows us to apply Theorem 11 to obtain
Therefore, if process p does not terminate or incur any RMRs in its t-step solo-run starting in C, then p does not terminate or incur any RMRs in its t-step solo-run starting in Conf (Γ, σ|P ∆ ). Since this is true for all t ≥ 0, in the infinite execution Exec(Γ, σ ′ ), where
.., process p does not terminate. But schedule σ ′ is fair, because each process in P roc(σ ′ ) \ {p} is in L(C) and thus loses in Exec(Γ, σ ′ ), and p performs infinitely many shared memory steps. This contradicts deadlock-freedom. ◭
If at the end of an execution that starts in a safe configuration, a terminating process knows the same set of processes as at the beginning of that execution, then that process returns win.
◮ Claim 16. Let C be a safe configuration, p ∈ P \ L(C), and σ a schedule, such that p ⊤ does not appear in σ →C • σ, and
If p terminates in Exec(C, σ), then p wins. (26) , and since C is safe, q ∈ L(C) according to (S1). Thus, we can apply Claim 10 to configuration C ′ and obtain 
Then process p wins in its solo-run starting in C.
Proof. We prove that p terminates in Exec(C, p k ), for some positive integer k. Then by (27) , and Claim 16, p wins in its solo-run starting in C, and the lemma follows.
Let P = L(C) ∪ {p}. Since C is safe, by Theorem 11,
We will show by induction for all k ≥ 0 that
Note that in Exec(Γ, (σ →C |P ∆ ) • p k ) all processes in P \ {p} = L(C) lose. Hence, by deadlock-freedom, there is an integer k 0 ∈ N such that p terminates in Exec(Γ, (σ →C |P ∆ ) • p k0 ). Then by (29) p also terminates in Exec(Γ, σ →C • p k0 ), and by Claim 16 it wins in that execution. Thus, p wins in a solo-run starting in C.
It remains to prove the inductive hypothesis (29). By (28) the hypothesis is true for k = 0. Now assume (29) is true for some integer k ≥ 0. Let x be the last step in Exec(Γ, σ →C • p k+1 ), and y the last step in Exec(Γ, (σ →C |P ∆ ) • p k+1 ). To complete the inductive step, it suffices to show that x = y. By the inductive hypothesis, p is in the same state in
Thus, either x and y are both read steps, or they are both write steps, and in the latter case, the value written in step x also gets written in step y. Thus, if x and y are both write steps, then x = y.
Hence, assume x and y are both read steps. In that case, p reads the same register r in x as in y. Let (a, b) be the value p reads in x, and (c, d) the value p reads in y. It suffices to show that (a, b) = (c, d).
First assume that r gets written in the last k steps of Exec(Γ, σ →C • p k ). Then it must be p that writes (a, b) to r itself (i.e., a = p), and by the inductive hypothesis (29) Hence, suppose r gets written in Exec(Γ, σ →C ), and thus the last process writing to r in that execution is a. Recall that in step x process p reads (a, b) from register r, so (p, a) ∈ K 1 (C 1 ) ⊆ K(C 1 ). Then (p, a) ∈ K(C) by (27) . Since C is safe and by (S1), a ∈ L(C) ⊆ P . Since a ∈ P is the last process to write to r in Exec(Γ, σ →C ), by (28) , it is also the last process to write r in Exec(Γ, σ →C |P ∆ ), and in both executions it writes the value (a, b). Hence, r has the same value (a, b) in configuration C as in D.
◭ Starting in a safe configuration, if the executions of two schedules from two disjoint sets of processes do not incur any RMRs, then the execution made up of the concatenation of those schedules does not incur any RMRs and the ordering does not matter.
◮ Claim 18. Let C be a safe configuration, and Q 0 , Q 1 ⊆ P roc(σ →C ) two disjoint sets of processes, such that for any j ∈ {0, 1} there exists σ j ∈ (Q 
First, assume that
This proves Part (a) for j = 1.
Let s be the last step in Exec(C, σ 0 • σ 1 ), and s ′ the last step in Exec(C, σ 1 ). We will show:
step s incurs no RMR in execution Exec(C, σ 0
Then Part (b) follows immediately from (32) and (34), and Part (a) for j = 1 from
First note that using (31) and because p ∈ Q 1 we have
We separately consider the case that s is a read and that s is a write.
Step s is a write. By (35) process p writes the same value to the same register in s as in s ′ . This implies (33) . Moreover,
where the last equality follows from the claim's assumption. Hence, s ′ does not incur an RMR, which is only possible if in s ′ process p writes a register in R p . Because s = s ′ , s does not incur an RMR either, and so (34) follows.
Case 2:
Step s is a read. Let r be the register process p reads in step s, and thus by (35), also in s ′ .
We first prove (33) . To that end we will show that the value of r is the same in
. As a result, in step s process p reads the same value from r as in step s ′ , and so s = s ′ .
All writes to r in Exec(C, σ Since p ∈ Q 1 , we have p = q, and thus r ∈ R p . Process p reads r during Exec(C, σ 1 ) at least once (in its last step s). By the claim's assumption no such read by p incurs an RMR, so r ∈ Cache p (C). But then in E →C process p reads or writes register r ∈ R q before q's terminating read (because q writes r in Exec(C, σ 0 ). If p reads r in E →C , then (p, q) ∈ K 2 (C), and if p writes r in E →C , then (q, p) ∈ K 3 (C). Hence, we have either
But neither is possible, as q takes a step in Exec(C, σ 0 ) (its write to r) and p a step in Exec(C, σ 1 ) (step s). This is a contradiction, and completes the proof of (33 
Because p is the only process that takes steps in E, it is true that the set of all pairs (a, b) , a = b, such that in E →C process a takes at least one shared memory step, and b writes to a register r ∈ R a before a's terminating read of r). Since p does not incur any RMRs in E, if p reads some register r during E, then either r ∈ R p , or r ∈ Cache p (C). Thus,
Lemma 17, p wins in E. ◭
As long as the set of knowing relations does not change during an execution starting from a safe configuration, at most one process terminates.
◮ Claim 20. Let C be a safe configuration, such that if p ⊤ ∈ P ⊤ appears in σ →C , then p ∈ L(C). Then for any schedule σ ∈ P * , when K(C) = K Conf (C, σ) , at most one process terminates in Exec(C, σ).
Assume that in E = Exec(C, σ) two distinct processes, p and q, terminate. Since we assumed that p terminates in E, process p is not terminated in C, and hence, p ∈ P \ L(C). 
Proof. For the purpose of contradiction assume that C ′ is not safe. First assume there exists a process p ∉ H(C ′ ), such that p takes at least one shared memory step in E →C ′ and p ∉ L(C ′ ). Because p takes at least one shared memory step in E →C ′ , p ∈ P . Since C is safe, for any pair (p, q) ∈ K(C), process q is in L(C). Hence, by Claim 10, Exec(Γ, σ)|P = Exec(Γ, σ|P ∆ ). Therefore, p takes at least one shared memory step in E →C and p ∉ L(C).
, there exists a register r ∈ R, such that p ∉ H r (C ′ ). If r ∈ R p , then at least one process that writes to r in E →C ′ is not in L(C ′ ). Let q be one of the processes that write to r in E →C ′ and are not in L(C ′ ). Since q takes a step in E →C ′ , process q is in P , and by Claim 10, takes the same write step to r and is not in L(C). Therefore, p ∉ H r (C), which contradicts C being safe. If r ∉ R p , then in E →C ′ process p writes to r, and at least one process, q, writes to r after p's write, such that q ∉ L(C ′ ). Since q takes a step in E →C ′ , process q is in P , and by Claim 10, takes the same write step to r and is not in L(C). Therefore, p ∉ H r (C), which contradicts C being safe. Now assume that for any p ∉ H(C ′ ), either p ∈ L(C ′ ) or p does not take any shared memory steps in E →C ′ . Then there exists a pair (p, q)
, then both p and q take steps in E →C ′ (p takes at least a read step, and q takes at least a write step), and thus, are in
both p and q take steps in E →C ′ (p takes at least a read step, and q takes at least a shared memory step), and thus, are in P . If (p, q) ∈ K 3 (C ′ ) \ K 3 (C), then both p and q take steps in E →C ′ (p takes at least a shared memory step, and q takes at least a write step), and thus, are in P . Hence, by Claim 10, p and q take the same steps in E →C and E →C ′ . This
Auxiliary Claims
We now show that during an execution, the knowing relations can only change as a result of a shared memory step by one of the processes, that is in the difference of the relation sets.
◮ Claim 22. Let σ ∈ P ∆ , C a configuration, and C ′ = Conf (C, σ). If there exists a pair (p, q) in the symmetric set difference of K(C ′ ) and K(C), then Exec(C, σ) is a shared memory step by p or by q.
Proof. Let s = Exec(C, σ)
, and (p, q) be a pair in the symmetric set difference of K(C) and K(C ′ ).
Step s causes the difference between
, then in step s process p reads a register on which q is visible. If K 2 (C) ≠ K 2 (C ′ ), then either s is q's first shared memory step, or in s process p reads a register in R q . Finally, if
, then s is p's first shared memory step, or in s process q writes to a register r ∈ R p . In all cases, s is a shared memory step by p or q. ◭
If two executions are equal when projected to a set of processes, P , then each process in P takes the same number of RMR steps and knows the same set of processes in P at the end of the execution.
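The RMR part of this statement rests on an observation used in the proof below: assuming no value is written twice in the same execution, a read of value v from register r by p incurs no RMR if and only if p's preceding access of r read or wrote that same value v. The following minimal sketch applies this rule to the local trace of a single process and counts the reads that incur an RMR. The trace encoding and the name `count_read_rmrs` are hypothetical, and the sketch deliberately does not classify write steps, whose RMR status depends on the memory segment of the register.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding of one step of a single process p:
# (kind, register, value) with kind in {"read", "write"}.
Access = Tuple[str, str, object]

_MISSING = object()  # sentinel: register not accessed by p so far


def count_read_rmrs(trace: List[Access]) -> int:
    """Count reads that incur an RMR, using only p's own steps.

    Rule (assuming each value is written at most once per execution):
    a read of v from r is local iff p's preceding access of r read or
    wrote the same value v.
    """
    last_seen: Dict[str, object] = {}
    rmrs = 0
    for kind, reg, val in trace:
        if kind == "read" and last_seen.get(reg, _MISSING) != val:
            rmrs += 1
        last_seen[reg] = val  # both reads and writes refresh p's view of reg
    return rmrs


trace = [
    ("read", "r", ("q", 1)),   # first access of r: RMR
    ("read", "r", ("q", 1)),   # same value as preceding access: local
    ("write", "r", ("p", 2)),  # write: not classified by this sketch
    ("read", "r", ("p", 2)),   # p reads back its own value: local
    ("read", "r", ("q", 3)),   # value changed since p's last access: RMR
    ("read", "s", ("q", 4)),   # first access of s: RMR
]
read_rmrs = count_read_rmrs(trace)
```

Because the function takes only p's own steps as input, two executions with the same projection onto p necessarily yield the same count.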
◮ Claim 23. Let P be a set of processes, σ and σ ′ schedules, and define E = Exec(Γ, σ),
, for any process p ∈ P , and
Proof. Recall that we assume without loss of generality, that a value does not get written twice in the same execution. Hence, if p reads a value v from register r in execution E, then that read incurs no RMR if and only if p accessed r earlier, and in its preceding access of r process p either read or wrote the same value v. Therefore, E|p uniquely determines which of p's steps are RMRs, and in particular RMR p (E). This proves Part (a). We will show that K(C)
If (a, b) ∈ K 1 (C), then in some step of execution E process a reads a value of (b, x), where x ∈ Q, from some register r.
, then in E process a reads a register r ∈ R b , and b takes at least one shared memory step. As E|P = E ′ |P , process b takes at least one shared memory step in
, then in E process a takes at least one shared memory step, and b writes a register, r ∈ R a , before a's terminating read of r. Since E|P = E ′ |P , process a takes at least one shared memory step in E ′ , and b writes r in E ′ , before a's terminating read of r. Hence, (a,
Constructing an RMR-Expensive Execution
We now consider an abortable leader election algorithm. We will construct a schedule such that in an execution starting in the initial configuration at least one process takes Ω(log n/ log log n) RMR steps, where n is the number of processes.
Overview of the Construction
Let n ≥ 4, ℓ = ⌊log n/(c log log n)⌋ for some sufficiently large constant c (which we determine in the appendix). We inductively construct a schedule σ i and a set of processes P i ⊆ P, for all i ∈ {0, ..., ℓ}. For the sake of conciseness, let
The construction will satisfy the following invariants for i ∈ {0, ..., ℓ}:
Hence, by (I3) there are at least two processes that each incur Ω(ℓ) = Ω(log n/ log log n) RMRs. Theorem 3 follows.
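To get a feel for how slowly ℓ grows, the following minimal sketch evaluates ℓ = ⌊log n/(c log log n)⌋ for a few values of n. It assumes base-2 logarithms and the value c = 10 used in the detailed construction; both choices, as well as the function name `rounds`, are assumptions made for illustration only.

```python
import math


def rounds(n: int, c: int = 10) -> int:
    """Number of rounds ell = floor(log n / (c * log log n)) for n >= 4.

    Assumes base-2 logarithms; the base is not fixed by the text.
    """
    if n < 4:
        raise ValueError("the construction requires n >= 4")
    log_n = math.log2(n)  # >= 2 for n >= 4, so log2(log_n) >= 1
    return math.floor(log_n / (c * math.log2(log_n)))


ell_small = rounds(4)          # log n = 2,    log log n = 1
ell_word = rounds(2 ** 64)     # log n = 64,   log log n = 6
ell_huge = rounds(2 ** 1024)   # log n = 1024, log log n = 10
```

Under these assumptions the values are 0, 1 and 10 respectively, which underlines that the bound is an asymptotic statement.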
We now sketch how we construct σ i and P i inductively so that the invariants are satisfied. We start with P 0 = P and the initial configuration C 0 . We then schedule processes in rounds. In round i, we choose a subset P i+1 of the processes in P i \ L i and remove all processes in P \ (P i+1 ∪ L i ) from the execution constructed so far. This does not affect any of the remaining processes, because C i is safe. Then we schedule the processes in P i+1 in such a way that each of them incurs an RMR, and only a small fraction of them lose.
To decide which processes to remove and to schedule the remaining processes, we proceed as follows: First we let each process in P i \ L i take sufficiently many steps until it is poised to incur an RMR. It is not hard to see that in an execution in which no process incurs an RMR, processes do not learn about each other, so the resulting configuration, D i , is again safe. Moreover, in a safe configuration processes only know about lost processes, so they cannot lose.
We then distinguish between a high contention write case, where a majority of processes are poised to write to few registers, and a low contention write case, where either many registers are poised to be accessed or a majority of processes are poised to read. Let S i be the set of registers that processes in P i \ L i are poised to access in configuration D i . The high contention write case occurs if there are few such registers and a majority of processes are poised to write, i.e.,
, and otherwise the low contention write case occurs.
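The case distinction can be sketched as follows. The thresholds are taken from the precondition of Lemma 25 (at least half of the processes poised to read, or at least |P |/(10ℓ) distinct registers poised to be accessed); whether the overview uses exactly these constants is an assumption, and the dictionary encoding and the name `contention_case` are hypothetical.

```python
from typing import Dict, Tuple


def contention_case(poised: Dict[str, Tuple[str, str]], ell: int) -> str:
    """Classify a configuration as "low" or "high" contention.

    `poised` maps each surviving process to (kind, register), where kind
    is "read" or "write" and register is what the process will access.
    Thresholds follow the precondition of Lemma 25 (assumed here).
    """
    n = len(poised)
    readers = sum(1 for kind, _ in poised.values() if kind == "read")
    registers = {reg for _, reg in poised.values()}
    # Low contention: at least half read, or at least n/(10*ell) registers.
    if 2 * readers >= n or 10 * ell * len(registers) >= n:
        return "low"
    return "high"


# 20 writers on one register vs. 20 writers on 20 distinct registers.
same_register = {f"p{i}": ("write", "r") for i in range(20)}
distinct_registers = {f"p{i}": ("write", f"r{i}") for i in range(20)}
```

With ℓ = 1, the first configuration is classified as high contention and the second as low contention.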
In the low contention write case, we choose a set Q i of processes, which contains for each register r ∈ S i at most one process poised to write to r in D i . We consider the step s p each process p ∈ Q i is poised to take. We then create a directed graph G with processes as vertices, and an edge from p to q if in the resulting configuration (I) due to s p or s q process p knows q, or (II) due to step s p process q is not hidden. Each application of rule (I) must be paid for by RMRs in the execution, and for each application of (II) a process p must overwrite some process q. As a result, graph G is sufficiently sparse, and by Turán's theorem [42] we obtain a large independent set J. We let each process p ∈ J take one step, s p , and erase all remaining processes that have not lost yet from the execution. It is not hard to see that no process loses in any of the steps added, the resulting configuration is safe (this follows from how we added edges to G) and, because of the sparsity of the graph, a sufficiently large number of processes survive. From that we obtain Invariants (I1) and (I2). Since each process p performs an RMR in step s p and only local steps before that, we get (I3) and (I4). Moreover, we do not abort any processes, so (I5) is true.
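The proof only needs the existence of a large independent set, but such a set can also be found constructively. The following minimal sketch uses the standard minimum-degree greedy heuristic on the undirected version of the graph; this heuristic is known to return an independent set of size at least k/(d + 1) for a graph with k vertices and average degree d, matching the bound quoted from Turán's theorem. It is a swapped-in illustration, not the argument used in the paper, and the name `greedy_independent_set` is hypothetical.

```python
from typing import Dict, Set


def greedy_independent_set(adj: Dict[int, Set[int]]) -> Set[int]:
    """Minimum-degree greedy independent set.

    `adj` must describe an undirected graph (symmetric adjacency sets),
    i.e. edge directions are ignored.
    """
    remaining = {v: set(nb) for v, nb in adj.items()}  # work on a copy
    chosen: Set[int] = set()
    while remaining:
        # Pick a vertex of minimum remaining degree (ties broken by id).
        v = min(remaining, key=lambda u: (len(remaining[u]), u))
        chosen.add(v)
        removed = remaining[v] | {v}  # v and all its remaining neighbours
        for u in removed:
            remaining.pop(u, None)
        for nb in remaining.values():
            nb -= removed
    return chosen


# Example: a cycle on 6 vertices (k = 6, average degree d = 2).
cycle = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
independent = greedy_independent_set(cycle)
```

For the 6-cycle the guarantee is 6/(2 + 1) = 2 vertices, and the greedy choice returns the three vertices 0, 2 and 4.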
In the high contention write case, we erase all readers from the execution. For each register r ∈ S i , let W r denote the set of processes poised to write to r. Since this is a high contention case, |W r | is large for most registers r. For each register r with sufficiently large |W r |, we choose two distinct processes a, b ∈ W r .
We then argue that, after erasing some O(log n) processes, we obtain a configuration D ′ i
and an {a, b}-only schedule σ such that in execution Exec(D ′ i , σ) processes a and b both lose and see no process other than those in L i , which have lost already. The argument is based on Lemma 8, but quite involved. We now let, starting from D ′ i , all processes in W r \ {a, b} execute one step, in which they write to r. After that we schedule a and b as prescribed by σ. Then a and b will both first write to r, and thus overwrite the writes by all other processes in W r , then continue to take steps and lose without seeing any processes that have not lost yet. As a result, all processes in W r \ {a, b} have taken a step but are now hidden, two processes (a and b) have lost, and O(log n) processes have been removed. It is not hard to see that the resulting configuration is safe again. We repeat this for all registers r for which |W r | is large enough. Then, we let P i+1 denote the set of all surviving processes and C i+1 the resulting configuration.
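The write ordering described above can be illustrated with a toy model in which register r simply remembers its last writer. This is only a sketch of the ordering: the name `schedule_high_contention` is hypothetical, the remaining steps of a and b prescribed by σ are omitted, and the relative order of the writes by a and b (here a first) is an assumption, since in the paper it is determined by σ.

```python
from typing import List, Optional, Tuple


def schedule_high_contention(
    writers: List[str], a: str, b: str
) -> Tuple[List[str], Optional[str]]:
    """Return (order of writes to r, process visible on r afterwards).

    All processes in W_r other than a and b write first; a and b write
    last, so every earlier write is overwritten.
    """
    order = [p for p in writers if p not in (a, b)] + [a, b]
    visible: Optional[str] = None
    for p in order:
        visible = p  # each write replaces the previous (writer, value) pair
    return order, visible


order, visible = schedule_high_contention(["p1", "p2", "a", "p3", "b"], "a", "b")
overwritten = set(order) - {"a", "b"}  # took a step, no longer visible on r
```

In this toy run only b remains visible on r, while p1, p2 and p3 have each performed their write without leaving a trace that a later reader of r could observe.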
Configuration C i+1 is safe, and sufficiently few processes are removed or have lost so that (I1) and (I2) remain true. Moreover, each process that does not lose performs exactly one RMR, so (I3) and (I4) are true. (I5) is true because all processes that received the abort signal lost.
Partial Execution Constructions
One of the critical properties that allows us to construct a long enough execution is that we can keep many processes running while keeping them from gaining information. What follows are the formal description and proofs of this property.
First, we claim that the information exchanged during specific executions is bounded.
◮ Claim 24. Let C be a safe configuration, P = P roc(σ →C ) \ L(C), and σ ∈ P * , such that in C each process in P is poised to perform an RMR step, and in Exec(C, σ) each process takes at most one step and each register gets written at most once. Then
Proof. Since C is safe, by (S1), K(C) ∩ (P × P ) is the empty set. Thus, to prove Part (a) it is sufficient to show that each step in Exec(C, σ) adds at most two pairs of processes to
Let σ ′ be a proper prefix of σ, and p a process so that σ ′ • p is also a prefix of σ. Since p's state is the same in Conf (C, σ ′ ) as in C, and p is poised to perform an RMR step in C, the step Exec Conf (C, σ ′ ), p incurs an RMR. Now let p) is a read from some register r ∈ R q2 , q 2 ∈ P . Let (q 1 , x) = val C1 (r) (if r is in its initial state, then x = ⊥ and q 1 = q 2 ). We prove that no pair other than (p, q 1 ) and (p,
, and 
, then since p ′ takes at least one shared memory step in E →C1 , process q ′ writes a register in R p ′ during Exec (C 1 , p) . This contradicts Exec(C 1 , p) being a read step. Now assume step Exec (C 1 , p) is a write to register r ∈ R q . We prove no pair other than p) is a write step, no process reads a register in that step and thus, (q
, then in E →C2 process p ′ takes at least one shared memory step and q ′ reads a register in R p ′ . Since Exec (C 1 , p) is a write step, it must be the first shared memory step by p
′ writes a register in R q ′ . Thus, p ′ = p, and since r ∈ R q , we have
In order to prove Part (b), we map each pair in M to an RMR step in E →C • Exec(C, σ) in such a way that the mapping is injective. Consider a pair (p, q) ∈ M . That is, during Exec(C, σ), process q writes to a register r ∈ R p ∪ Cache p (C). If r ∈ R p , then we map (p, q) to q's write step to r. Recall that in Exec(C, σ) each process executes at most one step, and that step incurs an RMR. So (p, q) is mapped to a unique RMR step. Now suppose r ∉ R p , so r ∈ Cache p (C). Then there exists a step in E →C or in Exec(C, σ), prior to q's write, in which p caches r. Let (p, q) be mapped to the last such step. That step incurs an RMR, so it suffices to show that the mapping is injective. First note that if (p, q) is mapped to a step s, then in its unique step in Exec(C, σ) process q writes to the register that is accessed in step s. Suppose two distinct pairs, (p 1 , q 1 ) and (p 2 , q 2 ), are mapped to the same step s. Let r be the register accessed in s. Then in their steps in Exec(C, σ), processes q 1 and q 2 must both write to r. Since only one process writes to r during Exec(C, σ), we have q 1 = q 2 . Therefore, p 1 ≠ p 2 , and so r ∉ R pj for some j ∈ {1, 2}. Without loss of generality assume j = 1. Then r ∈ Cache p1 (C), and step s is by p 1 . If r ∉ R p2 , then (p 2 , q 2 ) would not be mapped to s (it would be mapped to a step by p 2 ). Thus, r ∈ R p2 , so (p 2 , q 2 ) is mapped to q 2 's step in Exec(C, σ). This means that step s is performed by process q 2 . Hence, p 1 = q 2 = q 1 , which contradicts the definition of M . ◭
Next, we construct an execution for the low-contention write case (where either most processes are poised to read, or many registers are poised to be accessed) and prove its properties.
◮ Lemma 25. Let ℓ be a positive integer, C a safe configuration, and P = P roc(σ →C ) \ L(C), such that in E →C each process in P takes at most ℓ RMR steps and does not receive the abort signal, and in C each process in P is poised to perform an RMR step. If in C at least half of the processes in P are poised to read or at least |P |/(10ℓ) different registers are poised to be accessed by processes in P , then there exists a set of processes Q ⊆ P and a
and (d) no process in Q receives the abort signal in Exec(Γ, σ).
Proof. Let V = {x 1 , ..., x m } be a maximal subset of P such that for each register r, set V contains none of the processes that are poised to write to r in C, or V contains at most one process that is poised to access r in configuration C.
Hence, by Theorem 11 processes in V are in the same state in C ′ as they are in C, and by Claim 12 have the same cache. Therefore, all processes in V are poised to perform an RMR step and access the same registers in C ′ as in C. Create a directed graph G, where each process in V forms a vertex, and where there is an edge from p to q, p ≠ q, if one of the following is true:
, process q writes to a register r ∈ R p ∪ Cache p (C ′ ). Let M be the set of edges in G because of (ii). Since each process in V is poised to perform an RMR step in C ′ , each process takes at most one step, and each register gets written at most once in Exec(
, by Claim 24 Part (a), the number of edges in G from
′ be a largest independent set in graph G, where the direction of edges is ignored. 
Because at most one process wins in a leader election algorithm, |X| ≤ 1, and thus, |Q| ≥ |Q ′ | − 1. By Turán's theorem [42], the size of the largest independent set in a graph with average degree d and k vertices is at least k/(d + 1). The number of edges in G is at most
incurs an RMR and each process takes at most ℓ RMR steps during E →C ′ , the number of edges in G is at most 3m + mℓ. Because |V | = m, the average degree of G is at most 2(3m + mℓ)/m = 2ℓ + 6. Hence, the size of Q ′ is at least m/(2ℓ + 7). (37)
The assumption is that in C either at least |P |/2 processes are poised to read, or at least |P |/(10ℓ) registers are poised to be accessed. Hence, m ≥ min{|P |/2, |P |/(10ℓ)} = |P |/(10ℓ) (for ℓ ≥ 2), and so by (37)
Since |Q| ≥ |Q ′ | − 1, Part (a) is proven. First, we observe that C ′ is safe by Claim 21. Hence, for each process p ∈ Q, we have
∆ |p (this is true by C ′ being safe and Theorem 11). Further by Claim 12, process p has the same cache in Conf Γ,
Hence, each process in Q is poised to take the exact same step that incurs an RMR in
∆ . Therefore, since no two processes that satisfy (i) or (ii) are in Q, we have
We prove Part (b) by contradiction. Assume that D is not safe. Hence, at least one of (S1) or (S2) is violated. First assume that (S1) is not true for D. Thus, there exists a pair (p,
, and since C ′′ is safe, we have (p, q) ∉ K(C ′′ ). Hence, p gets to know q in E ′ . From (39) and (i), there is an edge between p and q in G, which contradicts p and q both being in an independent set of graph G. Now assume that (S2) is not true for D. Hence, there exists a process p ∈ O,
, and p takes at least one shared memory step in E →D . Since
, process p takes at least one shared memory step in E →C ′′ . Hence, because process p ∉ L(D), it holds that p ∈ H(C ′′ ). Since any process that takes at least one shared memory step in E →D and is not in L(D) is in Q, it holds that p ∈ Q. Hence, since there is no edge from p to any process in Q, by (39) and (i) no process in Q \ {p} writes to a register in R p during E ′ . Thus, if (S2) is not satisfied for D, then (H2) is true for p. Therefore, p fails to be hidden in D because of (H1). Hence, there exists a register r ∉ R p , such that process p accesses r in E →D at some point t and at least one other process writes to r after t, but none of the processes that write to r after t are in L(D). If p's last access to r is in E →C ′′ , then since C ′′ is safe and
, all the processes that write to r after t write during E ′ . Therefore, r ∈ Cache p (C ′′ ), and if there exists a process q ∈ Q that is poised to write to r in C ′′ , then there is an edge from p to q in G, which contradicts p, q ∈ Q. If p's last access to r is during E ′ , then no other process writes r after that. This completes the proof of Part (b). Since each process in Q has the same cache in C ′′ as in C and in E ′ each process in Q takes the step that it is poised to take in C, each process in Q performs an RMR step in E ′ . Hence, E ′ incurs |Q| RMRs, which proves Part (c). Since processes in Q do not receive the abort signal in E →C ′′ , and no process receives the abort signal in E ′ , Part (d) is true. ◭
For a high-contention write case on a specific register (where many processes are poised to write to it), we present a way to construct an execution that can be used to build our desired execution.
◮ Claim 26. Let C be a safe configuration, such that for a fixed register r each process in P r ⊆ P roc(σ →C ) \ L(C) is poised to perform an RMR write step to r in C, for any execution E starting in C, no process incurs more than ℓ RMRs during E →C • E, and any process that receives the abort signal in E →C is in L(C). There exists a set of processes
Proof. By Theorem 11, processes in Q are in the same state in C ′ as in C. Further, by Claim 21, configuration C ′ is safe.
least one register in R in configuration C or R ∩ R q = ∅. Since C is safe, for any process
Hence, in any {a, b}-only execution starting in C, for each register r ∈ R on which a process q ∈ Y is visible or r ∈ R q , the first read by each process in {a, b} from r incurs an RMR. Thus, because a and b incur at most ℓ RMRs in Exec Conf (C, a) , λ , it is true that |Y | ≤ 4ℓ.
Let w be the process such that r ∈ R w . Note that since all processes in P r are poised to perform an RMR step on r, we have w ∉ P r . Further let X = Z a ∪ Z b ∪ Y ∪ {w}, and
Since C is safe, by Theorem 11, processes in P r \ X are in the same state in D ′ as they are in C, which means they are poised to write to r. Let
We now show that (S1) and (S2) are satisfied for Conf (C ′ , σ). For (S1) we need to show for any pair (p, q)
, then since any visible process on a register read by a or b in Exec(C ′ , σ) is lost (otherwise, it is a process in Y , which does not take any steps in
. Since each process in P r is poised to perform an RMR step in C ′ and both a and b are poised to write to r in C ′ , we have r ∉ R a ∪ R b . Thus, p ∉ {a, b}. Since the process w with r ∈ R w satisfies w ∉ Q, we have p ∉ Q. Hence, (S1) is satisfied. Since C ′ is safe, for any p ∉ H(C ′ ), either p does not take any shared memory steps in E →C ′ , or p ∈ L(C ′ ). Thus, because any register that is accessed in Exec(C ′ , σ) is last accessed by either a or b, and by (44), (S2) is also satisfied. This proves Part (a).
From Q = (P ∪ {a, b}) \ X we get
Proof. Let P 0 ⊆ P be the set of processes that are poised to write in configuration C, and C 0 = Conf(Γ, σ →C |(P 0 ∪ L(C)) ∆ ). Let {r 1 , ..., r k } be the set of registers that are poised to be written in C. We inductively construct schedule σ i , for i ∈ {1, ..., k}. Our inductive hypothesis is that for i ∈ {1, ..., k},
(IH1) configuration C i = Conf (Γ, σ i ) is safe,
(IH2) |P i | ≥ |P | − i(8ℓ − 1),
(IH3) any process that receives the abort signal in E →Ci is in L(C i ).
Since C is safe, by Claim 21, configuration C 0 is safe. By Claim 10, it holds E →C |P = E →C0 |P . Therefore, any process that receives the abort signal in E →C0 is in L(C 0 ). Thus, by (IH3) it holds that any process in P roc(C i ) \ L(C i ), for i ∈ {0, ..., k}, does not receive the abort signal in E →Ci . Hence, by (IH1) and the fact that no process takes more than ℓ RMR steps in any execution starting in Γ, we can apply Claim 26 to C i−1 , where r i is the fixed register. For i ∈ {1, ..., k}, let σ ′ and P i be the schedule and set of processes achieved by applying Claim 26 to configuration C i−1 and the fixed register r i . Then let
By 
Detailed Construction
Let n ≥ 4, c = 10, and ℓ = ⌊log n/(c log log n)⌋. We inductively construct a schedule σ i and a set of processes P i ⊆ P, for all i ∈ {0, ..., ℓ}. For the sake of conciseness, let E i = Exec(Γ, σ i ),
The following invariants are satisfied for i ∈ {0, ..., ℓ}:
⊤ does not appear in σ i .
We now describe our inductive construction in detail.
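To get a feel for the number of rounds ℓ = ⌊log n/(c log log n)⌋ fixed above, the following minimal sketch evaluates it for a few system sizes. It assumes base-2 logarithms (the text does not state the base) and takes log n as input, so that very large n can be handled; the function name `ell_from_log_n` is hypothetical and not part of the construction.

```python
import math


def ell_from_log_n(log_n: float, c: int = 10) -> int:
    """Return floor(log n / (c * log log n)), given log_n = log2(n).

    Assumption: logarithms are base 2. The condition n >= 4 from the text
    corresponds to log_n >= 2, which keeps log log n >= 1.
    """
    if log_n < 2:
        raise ValueError("need n >= 4, i.e. log2(n) >= 2")
    return math.floor(log_n / (c * math.log2(log_n)))


# Number of rounds for n = 2^2, 2^64, 2^1024 and 2^(2^20).
rounds = {log_n: ell_from_log_n(log_n) for log_n in (2, 64, 1024, 2**20)}
for log_n, ell in rounds.items():
    print(f"log2(n) = {log_n}: ell = {ell}")
```

With c = 10 the parameter stays at 0 or 1 for all practical n, which is consistent with the bound being asymptotic.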
Base Case:
Schedule σ 0 is a schedule in which each process scans its own shared memory segment, and P 0 = P. Note that Proc(σ 0 ) = P.
Inductive Step:
In C i , we let each process in P i \ L i that does not win in a solo-run take solo-steps until it is poised to perform an RMR. By Claim 19, there is at most one process that wins in a solo-run starting in C i , so in our execution all but at most one process participate. By Claim 10, each process performs the same steps in the solo-run starting in C i as in the constructed execution, and by Claim 15, each process will eventually become poised to perform an RMR. If there is a process that wins in a solo-run starting from C i , we remove that process from the entire execution constructed so far.
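The step just described can be illustrated in a toy model. The sketch below is not the paper's formalism: it assumes each process's solo-run from the current configuration is given as a finite list of step kinds ("local", "rmr", "done"), treats a process as winning exactly when its solo-run terminates, and uses hypothetical names (`advance_to_rmr`, `steps_until_poised`). It returns, for every non-winning process, the number of solo-steps it takes before it is poised to perform its first RMR, together with the process (if any) that is removed.

```python
from typing import Dict, List, Tuple

LOCAL, RMR, DONE = "local", "rmr", "done"


def wins_solo(steps: List[str]) -> bool:
    """Toy assumption: a process wins iff its solo-run terminates."""
    return DONE in steps


def steps_until_poised(steps: List[str]) -> int:
    """Number of steps taken before the first RMR of the solo-run."""
    for t, kind in enumerate(steps):
        if kind == RMR:
            return t
    # The toy model assumes every non-winner eventually reaches an RMR.
    raise ValueError("solo-run never becomes poised to perform an RMR")


def advance_to_rmr(
    solo_runs: Dict[str, List[str]]
) -> Tuple[Dict[str, int], List[str]]:
    """Advance each non-winner to its first RMR; report the removed winner."""
    winners = [p for p, steps in solo_runs.items() if wins_solo(steps)]
    if len(winners) > 1:
        raise ValueError("at most one process may win in a solo-run")
    poised = {
        p: steps_until_poised(steps)
        for p, steps in solo_runs.items()
        if p not in winners
    }
    return poised, winners


# Hypothetical solo-runs of three processes.
runs = {
    "p1": [LOCAL, LOCAL, RMR, LOCAL],
    "p2": [RMR],
    "p3": [LOCAL, RMR, DONE],
}
poised, winners = advance_to_rmr(runs)
print(poised, winners)
```

In this example p3 terminates in its solo-run and is removed, while p1 and p2 stop just before their first RMR.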
More precisely, let {q 1 , ..., q k } = P i \ L i and, for j ∈ {1, ..., k}, let t j be the largest integer such that RMR(Exec(C i , q_j^{t_j})) = 0 and q j does not terminate in Exec(C i , q_j^{t_j}) (since by (I1), C i is safe, and by (I5) q j does not receive the abort signal in E →Ci , such an integer t j exists according to Claim 15). By (I5), no process in P i \ L i receives the abort signal in E →Ci . Thus, by Claim 19, any process that terminates in its solo-run starting in C i wins. Hence, by the safety property of leader election, at most one process terminates in its solo-run starting in C i . If such a process does not exist, then let λ i = q .
Hence, by (47) and the construction of λ i , each process in R i (r) ∪ W i (r) is poised to perform an RMR step in D i . Therefore, the register that each process p ∈ R i (r) ∪ W i (r) is poised to access is not in its own memory segment. ◭
Let X i = ⋃_{r ∈ S i} W i (r) and Y i = ⋃_{r ∈ S i} R i (r). We distinguish the following cases to complete the inductive step of our construction:
