Model checking software transactional memories (STMs) is difficult because of the unbounded number, length, and delay of concurrent transactions and the unbounded size of the memory. We show that, under certain conditions, the verification problem can be reduced to a finite-state problem, and we illustrate the use of the method by proving the correctness of several STMs, including two-phase locking, DSTM, TL2, and optimistic concurrency control. The safety properties we consider include strict serializability and opacity; the liveness properties include obstruction freedom, livelock freedom, and wait freedom.
Introduction
With the advent of multi-core processors, there is a new urgency for concurrent programming models that give the programmer the illusion of sequentiality and the compiler maximal flexibility. A model that has enjoyed particular recent success is software transactional memory (STM), which allows the programmer to think in coarse-grained code blocks that appear to be executed atomically and, at the same time, minimally constrains the compiler. Inspired by how databases manage concurrency, transactional memory was first introduced by Herlihy and Moss [HM93] in multi-processor design. Later, Shavit and Touitou [ST95] introduced STM, a software-based variant of the concept, which enables a new way of looking at concurrent programming. An extensive overview of STM can be found in [LR07]. In this paper, we consider the following STM algorithms: two-phase locking, DSTM [HLMS03], TL2 [DSS06], and optimistic concurrency control [KR81].
Precisely because STM algorithms encapsulate the difficulty of handling concurrency, the potential for subtle errors is enormous. This makes STM a ripe and important proving ground for formal verification. While there have been initial steps in this direction [COP+07], the challenge remains daunting for several reasons. First, there is no generally agreed upon formal notion of correctness for STM. Scott [Sco06] was the first to provide a formal semantics for STM. However, his weakest correctness criterion requires the order of commits to be preserved. Thus, the popular STM algorithm TL2 [DSS06], which does not preserve the order of commits, falls outside the semantic classification by Scott. Guerraoui and Kapalka [GK08] discussed various alternatives to precisely capture the safety aspect of STM and highlighted the subtle differences with database transactions. Second, while model checking is the verification technique that is best equipped to find concurrency bugs, model checking is severely handicapped by several sources of unbounded state in STM: memory size, thread count, and transaction length cannot be bounded, and neither can the delay until a transaction commits, nor the number of times that a transaction aborts. As with relaxed memory models, special care is needed in formulating a verification problem that is both relevant and solvable, as some problems about sequentializing concurrent systems are undecidable [AMP00].
Third, the specification of an STM universally quantifies over all possible application programs, requiring the desired safety and liveness conditions for all programs that are executed on the STM. In this sense, STM verification resembles the problem of checking that a processor implements an instruction set architecture, where the executed programs are also universally quantified. In both cases, the key is to define (and check) a suitable implementation relation [BD94]. While in processor verification the implementation relation needs to handle pipelines and out-of-order execution, in STM we need to handle aborted transactions.
We present in this paper a new technique for verifying STM safety and liveness properties. Our technique addresses the three issues above as follows.
First, the safety requirements we consider are strict serializability [Pap79] and opacity [GK08]. (We consider a single-version read/write restriction of the general notion of opacity.) Strict serializability preserves the order of conflicting operations between transactions, and the order of non-overlapping transactions. Opacity ensures, in addition, that aborting transactions do not see an inconsistent state of the memory, which can be disastrous in STMs (due to infinite loops or exceptions). We study opacity because it provides the programmer with the full sequentiality illusion and is satisfied by most STM protocols that claim that illusion [LR07]. Strict serializability is considered here for pedagogical reasons, as it is intuitive and captures the main technical difficulties behind verifying opacity. The liveness requirements we consider are the standard notions of obstruction freedom [HLM03], livelock freedom [AKH03], and wait freedom [Her91].
Second, we exploit the structural symmetries that are inherent in STM algorithms to reduce the verification of unbounded STM state spaces to a problem that involves only a small number of threads and shared variables. Specifically, we show that every STM that enjoys certain structural properties either violates any of the considered safety and liveness requirements on some program with two threads and two shared variables, or satisfies the requirement on all programs. The structural properties, which expect all threads to be treated equally, are fulfilled by most transactional algorithms, including, for instance, two-phase locking, DSTM, TL2, and optimistic concurrency control. Similar techniques for reducing unbounded instances of model-checking tasks to small, characteristic instances have been used for verifying protocols with an unbounded number of identical processes [BCG89] and cache-coherence protocols [HQR99].
Third, and perhaps most importantly, we define two finite-state transition systems that generate exactly the strictly serializable (resp. opaque) executions of programs with two threads and two shared variables. These transition systems can be viewed as most liberal reference STM algorithms guaranteeing strict serializability (resp. opacity). To our knowledge, the transition systems presented in this paper provide the first finite-state representation of the language of strictly serializable (resp. opaque) executions for transactions that may abort. The finite size of the transition systems is achieved by a careful choice of state, which encompasses for every thread a set of read variables (at most two), a set of written variables (at most two), a set of variables not allowed to be read (at most two), a set of variables not allowed to be written (at most two), and a set of threads with overlapping, preceding transactions (at most one). We show that an STM algorithm is strictly serializable (resp. opaque) iff for a specific, most general program with two threads and two variables, all executions are permitted by the reference STM algorithm. Then, instead of checking language containment between a given STM algorithm and the reference algorithm, we check for the existence of a simulation relation between both transition systems [Mil71]. The existence of a simulation relation is a commonly used, efficient sufficient condition for language containment.
Putting all steps together, we reduce the problem of verifying the safety of an STM algorithm, which is unbounded in many dimensions (memory size, thread count, transaction delay, etc.), to a simulation check between two finite-state systems. For two-phase locking, DSTM, TL2, and optimistic concurrency control, we obtain transition systems with up to 12,000 states, and the reference transition systems have about 12,500 states. We implemented a simulation checker that automatically verifies strict serializability for optimistic concurrency control and opacity for two-phase locking, DSTM, and TL2 in less than 30 minutes. It should be noted that the methodology is applicable to any other STM algorithm that satisfies the structural properties. Our simulation checker finds that correctness is not self-evident in many STM algorithms. For example, we found an ambiguity in the ordering of two particular operations in the published TL2 algorithm [DSS06]. One of the orderings makes TL2 unsafe. In this case, the simulation check provides as counterexample an execution that is not strictly serializable (and thus not opaque). We therefore expect our verification tool to be useful to STM designers when they develop or modify STM algorithms.
On the liveness side, we again prove a structural reduction theorem to check the desired liveness requirement on the finite-state transition system that results from a given STM algorithm applied to a most general program with two threads and one variable. We built a model checking tool to verify the different liveness properties. In the case of obstruction freedom, this amounts to checking a Streett condition. The check goes through for DSTM. For two-phase locking, TL2, and optimistic concurrency control, the model checker automatically generates counterexamples to obstruction freedom, as it does for DSTM and livelock freedom.
Safety in transactional memories
We introduce a few notions about transactions, and then formalize the correctness of transactional memories.
Let V be a set {1, . . . , k} of k variables, and let C = {commit} ∪ ({read, write} × V) be the set of commands on the variables V. Also, let Ĉ = C ∪ {abort}. Let T = {1, . . . , n} be a set of n threads. Let Ŝ = Ĉ × T be the set of statements. Also, let S = C × T. A word w ∈ Ŝ* is a finite sequence of statements. Given a word w ∈ Ŝ*, we define the thread projection w|t of w on thread t ∈ T as the subsequence of w consisting of all statements s in w such that s ∈ Ĉ × {t}. Given a thread projection w|t = s0 . . . sm of a word w on thread t, a statement si is finishing in w|t if it is a commit or an abort. A statement si is initiating in w|t if it is the first statement in w|t, or the previous statement si−1 is a finishing statement.
Given a thread projection w|t of a word w on thread t, a consecutive subsequence x = s0 . . . sm of w|t is a transaction of thread t in w if (i) s0 is initiating in w|t, and (ii) sm is either finishing in w|t, or sm is the last statement in w|t, and (iii) no other statement in x is finishing in w|t. The transaction x is committing in w if sm is a commit. The transaction x is aborting in w if sm is an abort. Otherwise, the transaction x is unfinished in w. Given a word w and two transactions x and y in w (possibly of different threads), we say that x precedes y in w, written as x <_w y, if the last statement of x occurs before the first statement of y in w. A word w is sequential if for every pair x, y of transactions in w, either x <_w y or y <_w x. We define a function com : Ŝ* → S* such that for all words w ∈ Ŝ*, the word com(w) is the subsequence of w that consists of every statement in w that is part of a committing transaction.
A transaction x of a thread t writes to a variable v if x contains a statement ((write, v), t). A statement s = ((read, v), t) in x is a global read of a variable v if there is no statement ((write, v), t) before s in the transaction x. A transaction x of a thread t globally reads a variable v if there exists a global read of variable v in transaction x. A word w is transaction equivalent to a word w′ if for every thread t ∈ T, we have w|t = w′|t. Note that two transaction equivalent words have the same order of commands for all threads.
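To make the definitions above concrete, the following minimal Python sketch computes thread projections, splits a finite word into its transactions, and builds com(w). The encoding of a statement as a (command, thread) pair, the string and tuple forms of commands, and all function names are assumptions made for illustration; they are not part of the formal development.

```python
# Illustrative encoding (an assumption, not taken from the paper):
# a statement is (command, thread); a command is "commit", "abort",
# ("read", v) or ("write", v).

def thread_projection(w, t):
    """Return w|t: the subsequence of statements issued by thread t."""
    return [s for s in w if s[1] == t]

def transactions(w):
    """Split w into transactions, each given as a list of positions in w.

    A transaction of a thread starts at an initiating statement and ends at
    the next commit or abort of that thread (or at the end of w|t).
    """
    open_txn = {}   # thread -> positions of its current unfinished transaction
    result = []     # transactions, ordered by their first statement
    for i, (c, t) in enumerate(w):
        if t not in open_txn:
            open_txn[t] = []
            result.append(open_txn[t])
        open_txn[t].append(i)
        if c in ("commit", "abort"):   # finishing statement
            del open_txn[t]
    return result

def com(w):
    """Return com(w): the statements that belong to committing transactions."""
    keep = {i for x in transactions(w) if w[x[-1]][0] == "commit" for i in x}
    return [s for i, s in enumerate(w) if i in keep]

w = [(("read", "v1"), "t1"), (("write", "v1"), "t2"), ("commit", "t2"),
     ("abort", "t1"), (("read", "v1"), "t1")]
print(transactions(w))   # [[0, 3], [1, 2], [4]]
print(com(w))            # the write and the commit of t2
```

In this small word, thread t1 has an aborting transaction followed by an unfinished one, and only the transaction of t2 survives in com(w).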
Safety criteria
Conflict serializability [EGLT76] is a commonly used correctness criterion for concurrent systems and, in particular, for transactional systems. Conflict serializability allows us to omit the values of read and write commands, since the consistency of the values follows from preserving the order of conflicts. In the context of transactional memories, a stronger property, called strict serializability, is considered. Strict serializability also preserves the order of non-overlapping transactions. We note that strict serializability does not place any restrictions on the operations of the aborting transactions. In the scope of STMs, an even stronger notion of correctness, referred to as opacity, has been suggested [HLMS03, GK08] to avoid unexpected side effects, like infinite loops or array bound violations. Opacity requires that a word be strictly serializable, and that even aborting transactions do not read inconsistent values.

Now, we formalize these correctness criteria. We start with the notion of a conflict. Transactional memories use direct update semantics (every transaction modifies the shared variables in place and restores them upon abort), or deferred update semantics (every transaction modifies a local copy, and changes the shared copy upon a commit). We choose to define conflicts under the deferred update semantics. A statement s1 of transaction x and a statement s2 of transaction y (where x is different from y) conflict in a word w if (i) s1 is a global read of some variable v, and s2 is a commit, and y writes to v, or (ii) s1 and s2 are both commits, and x and y write to the same variable v. A word w = s0 . . . sm is conflict equivalent to a word w′ if (i) w is transaction equivalent to w′, and (ii) for every pair si, sj of statements in w, if si and sj conflict and i < j, then si occurs before sj in w′. Note that transaction equivalence ensures that conflict equivalence is a symmetric relation, since w′ is a permutation of w.
A word w = s0 . . . sm is strictly equivalent to a word w′ if (i) w is conflict equivalent to w′, and (ii) for every pair x, y of transactions in w, where x is a committing or an aborting transaction, if x <_w y, then it is not the case that y <_w′ x. A word w ∈ Ŝ* is strictly serializable if there exists a sequential word w′ such that w′ is strictly equivalent to com(w). Furthermore, a word w is opaque if there exists a sequential word w′ such that w′ is strictly equivalent to w. We note that given a word w, if w is opaque, then w is strictly serializable. An infinite word w ∈ Ŝ^ω is strictly serializable (resp. opaque) if every finite prefix of w is strictly serializable (resp. opaque).
Example. Consider a word w = ((read, v1), t1), ((write, v1), t2), ((write, v2), t2), (commit, t2), ((read, v2), t1), (abort, t1). w has two transactions: (i) an aborting transaction of t1, and (ii) a committing transaction of t2. The following pairs of statements conflict: (((read, v1), t1),(commit, t2)) and (((read, v2), t1), (commit, t2)). The word w is strictly serializable because com(w) = ((write, v1), t2), ((write, v2), t2), (commit, t2). On the other hand, w is not opaque since t1 reads the old value of v1 (before t2 commits) and the new value of v2 (committed by t2).
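The example can be replayed mechanically. The sketch below is a brute-force check for very short finite words only: it enumerates all sequential orders of the transactions and tests whether one of them respects every conflict and every precedence required by strict equivalence. It is not the finite-state method of this paper, and the statement encoding and all function names are assumptions made for illustration.

```python
from itertools import permutations

# Assumed encoding: a statement is (command, thread); a command is
# "commit", "abort", ("read", v) or ("write", v).

def transactions(w):
    """Split w into transactions, each a list of positions in w."""
    open_txn, result = {}, []
    for i, (c, t) in enumerate(w):
        if t not in open_txn:
            open_txn[t] = []
            result.append(open_txn[t])
        open_txn[t].append(i)
        if c in ("commit", "abort"):
            del open_txn[t]
    return result

def com(w):
    """Statements of committing transactions only."""
    keep = {i for x in transactions(w) if w[x[-1]][0] == "commit" for i in x}
    return [s for i, s in enumerate(w) if i in keep]

def is_write(c):
    return isinstance(c, tuple) and c[0] == "write"

def is_read(c):
    return isinstance(c, tuple) and c[0] == "read"

def global_reads(w, x):
    """(position, variable) of reads in x with no earlier write in x."""
    written, out = set(), []
    for i in x:
        c = w[i][0]
        if is_write(c):
            written.add(c[1])
        elif is_read(c) and c[1] not in written:
            out.append((i, c[1]))
    return out

def must_precede(w):
    """Pairs (a, b): transaction a has to come before b in a sequential witness."""
    txns = transactions(w)
    wsets = [{w[i][0][1] for i in x if is_write(w[i][0])} for x in txns]
    commit_at = [x[-1] if w[x[-1]][0] == "commit" else None for x in txns]
    before = set()
    for a, x in enumerate(txns):
        for b, y in enumerate(txns):
            if a == b:
                continue
            # real-time order: a finished x that precedes y stays before y
            if w[x[-1]][0] in ("commit", "abort") and x[-1] < y[0]:
                before.add((a, b))
            if commit_at[b] is None:
                continue
            # conflict (i): global read of x versus commit of a writer y
            for i, v in global_reads(w, x):
                if v in wsets[b]:
                    before.add((a, b) if i < commit_at[b] else (b, a))
            # conflict (ii): both commit and write a common variable
            if commit_at[a] is not None and wsets[a] & wsets[b]:
                before.add((a, b) if commit_at[a] < commit_at[b] else (b, a))
    return len(txns), before

def has_sequential_witness(w):
    n, before = must_precede(w)
    return any(all(p.index(a) < p.index(b) for a, b in before)
               for p in permutations(range(n)))

def strictly_serializable(w):
    return has_sequential_witness(com(w))

def opaque(w):
    return has_sequential_witness(w)

w = [(("read", "v1"), "t1"), (("write", "v1"), "t2"), (("write", "v2"), "t2"),
     ("commit", "t2"), (("read", "v2"), "t1"), ("abort", "t1")]
print(strictly_serializable(w), opaque(w))   # True False
```

For the word of the example, the two conflicts force the transaction of t1 both before and after the transaction of t2, so no sequential witness exists for w itself, while com(w) contains a single transaction and is trivially sequential.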
Transactional memories
We consider thread programs as our basic sequential unit of computations. We express thread programs as infinite binary trees on commands. This makes the representation independent of specific control flow statements, such as exceptions for handling aborts of transactions. For every command of a thread, we define two successor commands, one if the command is successfully executed, and another if the command fails due to an abort of the transaction. Note that this definition allows us to capture easily different retry mechanisms of TMs, e.g., retry the same transaction until it succeeds or try another transaction after an abort. We use a set of thread programs to define a multithreaded program. Formally, a thread program θ on a set C of commands is a function θ : B* → C. We write Θ for the set of thread programs. A (multithreaded) program p on n threads and k variables is an n-tuple p = ⟨θ1, . . . , θn⟩ of thread programs on C. Figure 1(a) shows an example program on two threads and two variables. Let P_{n,k} be the set of all programs on n threads and k variables. Let P be the set of all programs.
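As an illustration of a thread program as a function from binary words to commands, the following minimal sketch encodes the "retry the same transaction until it succeeds" policy. It assumes that a location is the string of responses received so far (1 for a completed command, 0 for an abort); the transaction body BODY and the name theta_retry are hypothetical.

```python
# Hypothetical transaction body; commands use an assumed encoding:
# ("read", v), ("write", v) or "commit".
BODY = [("read", "v1"), ("write", "v2"), "commit"]

def theta_retry(location: str):
    """Thread program that retries the same transaction after every abort.

    `location` is a word over {"0", "1"}: the responses seen so far.
    The number of trailing 1s tells how far the current attempt has got;
    a 0 (abort) or a completed commit restarts the body from its start.
    """
    trailing_ones = len(location) - len(location.rstrip("1"))
    return BODY[trailing_ones % len(BODY)]

print(theta_retry(""))      # ('read', 'v1')
print(theta_retry("11"))    # commit
print(theta_retry("110"))   # ('read', 'v1'), the retry after an abort
```

A program that tries a different transaction after an abort would instead inspect the position of the last 0 in the location and select another body.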
We define a transactional memory as an abstract function that takes as input a program, and produces a set of infinite words. Formally, a transactional memory (TM) is a function M : P → 2^(Ŝ^ω). A transactional memory M ensures strict serializability (resp. opacity) for all programs with n threads and k variables if for every program p ∈ P_{n,k}, every word w ∈ M(p) is strictly serializable (resp. opaque). Moreover, a transactional memory M ensures strict serializability (resp. opacity) if it ensures strict serializability (resp. opacity) for all programs with an arbitrary number of threads and variables.

Figure 1. Our framework of transactional memory.
Transactional memory algorithms
We use state transition systems to define TM. A TM algorithm is a family of TM transition systems, one for n threads and k variables, for every n and k. A TM transition system consists of a set of states, an initial state, an extended set of commands depending on the underlying TM, a pending function, and a transition relation between the states. The extended commands include the set C of commands, and TM specific additional commands. For example, a given TM may require that a thread locks a variable before writing to the variable, or that a thread validates the variables read in a transaction, before accessing a new variable. Every extended command is assumed to execute atomically. The pending function represents the pending command of a thread in a state, and ensures that if a thread has not finished the execution of a particular command, then no other command is executed by the thread.
A TM algorithm interacts with a program and a scheduler (see Fig. 1(b)). The scheduler chooses a thread, which determines the next command to be executed. The TM transition system decides whether the command can be executed in a single atomic step, or in several atomic steps (using additional extended commands), or has to be aborted. The TM algorithm gives back to the program a response. The response is ⊥ if the TM algorithm needs additional steps to complete the command, 0 if the TM algorithm needs to abort the transaction, and 1 if the TM algorithm has completed the command. Given a program, a scheduler, and a TM transition system, we get a run. Projecting the run to the set of successful statements (that is, aborts, and statements that get response 1) gives a word in Ŝ^ω. We describe the language of a TM transition system as the set of words in Ŝ^ω that it can produce for any program and any scheduler.
Formally, a scheduler σ on T is a function σ : ℕ → T. We define a TM algorithm A as a family of TM transition systems A^{n,k} = ⟨Q, qinit, D, π, δ⟩ for each n and k, where Q is a set of states, qinit is the initial state, D is the set of extended commands with C ⊆ D, the function π : Q × T → C ∪ {⊥} represents the pending command in a state for a thread, and δ ⊆ Q × C × Ŝ_D × Resp × Q is the deterministic or nondeterministic transition relation, where Ŝ_D = (D ∪ {abort}) × T and Resp = {⊥, 0, 1}. The transition relation δ and the pending function π obey the following rules:
1. For all threads t ∈ T, we have π(qinit, t) = ⊥.
2. For all states q, q′ ∈ Q such that there exists an incoming transition (q, c, (d, t), r, q′) ∈ δ to q′, if r = ⊥, then π(q′, t) = c, otherwise π(q′, t) = ⊥.
3. For all states q, q′ ∈ Q such that there exists an incoming transition (q, c, (d, t), r, q′) ∈ δ to q′, we have π(q′, u) = π(q, u) for all threads u ≠ t.
4. For all states q and all threads t, if π(q, t) = c where c ≠ ⊥, then for all outgoing transitions (q, c1, (d, t), r, q′) ∈ δ from q, we have c1 = c.
5. For all states q and all threads t, if π(q, t) = ⊥, then there exists an outgoing transition (q, c, (d, t), r, q′) ∈ δ from q for every command c ∈ C.
6. For all q ∈ Q, for all transitions (q, c, (d, t), r, q′) ∈ δ, we have d = abort if and only if r = 0.
Note that the rules above restrict the transition relation and the pending function π such that π is unique. A command c is enabled in a state q for thread t if π(q, t) ∈ {⊥, c} (i.e., either no command is pending, or c itself is pending). In a deterministic transition relation δ, a command c is abort enabled in a state q for thread t if c is enabled in q for thread t and there is no transition (q, c, (d, t), r, q′) ∈ δ such that d ∈ D. A transition relation δ is deterministic if for all q ∈ Q and (c, t) ∈ S, if (q, c, (d1, t), r1, q1) ∈ δ and (q, c, (d2, t), r2, q2) ∈ δ, then d1 = d2, r1 = r2, and q1 = q2. Unless otherwise stated, TM transition systems have deterministic transition relations. We shall use nondeterministic TM transition systems later to describe reference TM algorithms.
Let p = ⟨θ1, . . . , θn⟩ be a program in P_{n,k}. Let σ be a scheduler on n threads. A run ρ = ⟨q0, l0, (d0, t0), r0⟩ ⟨q1, l1, (d1, t1), r1⟩ . . . of a TM transition system A^{n,k} with scheduler σ on program p is an infinite sequence of tuples of states, program locations, statements, and responses, where lj = ⟨lj^1, . . . , lj^n⟩ ∈ (B*)^n for all j ≥ 0 and the following hold: (i) q0 = qinit and l0 = ⟨ε, . . . , ε⟩, and (ii) for all j ≥ 0, there exists a transition (qj, cj, (dj, tj), rj, qj+1) ∈ δ such that tj = σ(j) and cj = θ_{tj}(lj^{tj}), and for all t ∈ T, we have lj+1^t = lj^t if either t ≠ tj or rj = ⊥, and lj+1^t = lj^t · rj otherwise. Given a scheduler and a program, there is exactly one run for a deterministic TM transition system A^{n,k}, whereas there is at least one run for a nondeterministic TM transition system. We say that a statement sj ∈ Ŝ is successful in the run ρ = ⟨q0, l0, s0, r0⟩ ⟨q1, l1, s1, r1⟩ . . . if (i) rj ∈ {0, 1}, or (ii) rk = 1 with j < k and rj+1 . . . rk−1 are all equal to ⊥. We define the language L(A^{n,k}) of A^{n,k} as the set of all infinite words w ∈ Ŝ^ω such that w is the sequence of all successful statements in a run of A^{n,k} with some scheduler on n threads, on some program on n threads and k variables. For a TM algorithm A, we require that for n ≤ n′ and k ≤ k′, the language L(A^{n,k}) is a subset of L(A^{n′,k′}).
A TM algorithm A defines a transactional memory M such that for all n and k, for every program p in P_{n,k} and every word w ∈ Ŝ^ω, we have w ∈ M(p) iff there exists a scheduler σ on T and a corresponding run ρ of A^{n,k} with σ on p such that w is the sequence of all successful statements in ρ. It follows that a TM defined by a TM algorithm A ensures strict serializability (resp. opacity) for all programs with n threads and k variables iff all words in L(A^{n,k}) are strictly serializable (resp. opaque). In the following sections, we describe different transactional memories as TM algorithms. To simplify the description, we view a state q of the corresponding TM transition systems as an n-tuple ⟨q^1, . . . , q^n⟩, where each component q^t corresponds to a thread t and is called the thread state of t.
The sequential TM
To keep our first example simple, we describe a sequential TM. The sequential TM executes the transactions sequentially (as ideally suited for a uniprocessor). We define the sequential TM Mseq using a sequential TM algorithm Aseq. The sequential TM transition system A^{n,k}_seq for n threads and k variables is given by the tuple ⟨Q, qinit, D, π, δ⟩. The thread state q^t of thread t is in {T, F}. If a thread t has an unfinished transaction in a state q, then the thread state q^t is T, and F otherwise. The initial state is qinit = ⟨F, . . . , F⟩. The set of extended commands is D = C. A transition (q1, c, (d, t), r, q2) is in δ if c is enabled in q1 for thread t and one of the following holds: 1. Read (resp. write). (i) c = (read, v) (resp. c = (write, v)) and d = c and r = 1, and (ii) q1^u = F for all u ≠ t, and (iii) q [...]

A transition (q1, c, (abort, t), 0, q2) is in δ if c is abort enabled in q1 for thread t and q2 = q1.
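A minimal executable sketch of the sequential TM follows. The abort rule (q2 = q1) and the condition that all other threads are in state F are taken from the text. The state update for successful commands (set the thread state to T on a read or write, and to F on a commit) is an assumption derived from the informal description of the thread state, since the precise rules are not fully reproduced above. Threads are numbered from 0 and the function name is hypothetical.

```python
def step_seq(q, c, t):
    """One transition of a sequential-TM sketch.

    q: tuple of booleans, q[u] is True iff thread u has an unfinished
       transaction (T in the text), False otherwise (F in the text).
    c: command, one of ("read", v), ("write", v) or "commit".
    Returns (d, r, q2): executed statement kind, response, next state.
    """
    others_idle = all(not busy for u, busy in enumerate(q) if u != t)
    if not others_idle:
        # abort transition from the text: response 0 and q2 = q1
        return "abort", 0, q
    q2 = list(q)
    # ASSUMPTION: a read/write opens (or continues) the transaction of t,
    # and a commit closes it.
    q2[t] = (c != "commit")
    return c, 1, tuple(q2)

q0 = (False, False)
d, r, q1 = step_seq(q0, ("read", "v1"), 0)
print(d, r, q1)                              # ('read', 'v1') 1 (True, False)
print(step_seq(q1, ("write", "v1"), 1))      # ('abort', 0, (True, False))
print(step_seq(q1, "commit", 0))             # ('commit', 1, (False, False))
```

Since D = C for this TM, every command completes in one atomic step, so the response ⊥ never occurs in the sketch.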
The two-phase locking TM
Our second example of a TM algorithm is based on the two-phase locking (2PL) protocol, commonly used in database transactions. Every transaction locks the variables it reads or writes before accessing them, and releases all acquired locks during the commit. A shared lock is acquired for reading, and an exclusive lock is acquired for writing. We define the 2PL TM M2PL using a 2PL TM algorithm A2PL. The 2PL TM transition system A^{n,k}_2PL for n threads and k variables is given by the tuple ⟨Q, qinit, D, π, δ⟩. The thread state q^t of thread t is a pair ⟨rs, ws⟩, where rs ⊆ V is the set of variables locked by t in shared mode, and ws ⊆ V is the set of variables locked in exclusive mode. For every thread, the initial thread state of thread t is q [...] When a thread commits, the read set and the write set are changed to empty.
A transition (q1, c, (abort, t), 0, q2) is in δ if c is abort enabled in q1 for thread t and rs 
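The locking discipline described in this section can be sketched as follows. This is a simplified illustration under stated assumptions, not the transition system A2PL itself: lock acquisition and the access are collapsed into one atomic step (the extended lock commands are omitted), a transaction is assumed to abort when a lock is unavailable, and an abort is assumed to release all locks of the thread. Threads are numbered from 0 and the function name is hypothetical.

```python
EMPTY = frozenset()

def step_2pl(q, c, t):
    """One step of a simplified 2PL sketch.

    q: tuple of thread states (rs, ws); rs holds shared locks and ws holds
       exclusive locks, both as frozensets of variables.
    c: ("read", v), ("write", v) or "commit".
    Returns (d, r, q2) with response r = 1 (done) or r = 0 (abort).
    """
    rs, ws = q[t]
    others = [q[u] for u in range(len(q)) if u != t]
    if c == "commit":
        ok, new = True, (EMPTY, EMPTY)        # commit releases all locks
    else:
        kind, v = c
        if kind == "read":
            # a shared lock is compatible with other shared locks only
            ok = all(v not in other_ws for _, other_ws in others)
            new = (rs | {v}, ws)
        else:
            # an exclusive lock tolerates no other lock on v
            ok = all(v not in other_rs and v not in other_ws
                     for other_rs, other_ws in others)
            new = (rs, ws | {v})
    if not ok:
        new = (EMPTY, EMPTY)                  # ASSUMPTION: abort releases locks
    q2 = q[:t] + (new,) + q[t + 1:]
    return (c, 1, q2) if ok else ("abort", 0, q2)

q0 = ((EMPTY, EMPTY), (EMPTY, EMPTY))
_, _, q1 = step_2pl(q0, ("read", "v1"), 0)    # thread 0 takes a shared lock
print(step_2pl(q1, ("write", "v1"), 1)[:2])   # ('abort', 0)
print(step_2pl(q1, ("read", "v1"), 1)[:2])    # (('read', 'v1'), 1)
```

Two readers of the same variable coexist, whereas a writer is aborted as long as another thread holds any lock on the variable.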
The dynamic software transactional memory
Dynamic software transactional memory (DSTM) [HLMS03] is one of the most popular STM algorithms. The algorithm exists in several flavors. In this work, we focus on one of them, called invisible read DSTM, where the transactions require ownership of variables only for writing. The readers are not visible to the writers. Upon reading, a transaction validates its read set in order to ensure opacity. In our work, we ignore optimizations like early release possible in DSTM. We model the situation of a transaction aborting another transaction by allowing each transaction to set an abort flag for other transactions, and requiring that a transaction aborts whenever the abort flag is set for the thread. We define the DSTM TM Mdstm using a DSTM TM algorithm Adstm. The DSTM TM transition system A^{n,k}_dstm for n threads and k variables is given by ⟨Q, qinit, D, π, δ⟩. A thread state q^t of thread t is defined as a 3-tuple ⟨status^t, rs^t, os^t⟩, where status^t ∈ {aborted, validated, invalid, finished} is the status of thread t, rs^t ⊆ V is the read set of thread t, and os^t ⊆ V is the ownership set of thread t. For every thread, the initial thread state of thread t is q [...] When a thread t commits, if the status is finished, then the status is changed to validated, and for all threads u whose ownership set intersects with the read set of t, the status of u is changed to aborted. When a thread t commits, if the status is validated, then the ownership set and read set of t are set to empty and the status is set to finished, and the status of all threads whose read set intersects with the ownership set of t is set to invalid.
A transition (q1, c, (abort, t), 0, q2) is in δ if the command c is abort enabled in q1 for thread t, and status 
The TL2 transactional memory
Transactional locking 2 (TL2) [DSS06] is a TM that uses global version numbers to ensure correctness. Version numbers allow efficient read set validation in a distributed setting. We model version numbers using modified sets for each thread. When a transaction commits, it adds its write set to the modified set of every thread with an unfinished transaction. We define the TL2 TM MTL2 using the TL2 TM algorithm ATL2. The TL2 TM transition system A^{n,k}_TL2 for n threads and k variables is given by the tuple ⟨Q, qinit, D, π, δ⟩. A thread state [...] q^t_init = ⟨finished, ∅, ∅, ∅, ∅⟩ for all threads t ∈ T. The set of extended commands is [...] When a thread t commits, if the status is validated, then the status is changed to finished, and the read, write, lock, and modified sets are set to empty, and all variables written by t are added to the modified sets of all threads that have an unfinished transaction.
A transition (q1, c, (abort,
The optimistic concurrency control TM
We now discuss a common concurrency protocol used in databases. It was proposed by Kung and Robinson [KR81], and is called optimistic concurrency control (OCC). OCC executes the transactions of the threads without any synchronization. Before committing, every transaction chooses a sequence number and validates its read set. Transactions commit in the order of sequence numbers, which we model using precedence sets, similar to the way we modeled version numbers using modified sets in the TL2 TM algorithm.
We define the OCC TM Mocc using an OCC TM algorithm Aocc. We refer to the OCC TM transition system with n threads and k variables as A^{n,k}_occ. The formal definition of the transition system can be obtained from the original algorithm, as we did in the previous examples. Table 1 shows runs with different schedules on the program in Figure 1(a), for each TM algorithm described above.
Reduction theorem for safety
We present a reduction theorem for strict serializability and opacity. The theorem states that if a TM ensures strict serializability (resp. opacity) for all programs on two threads and two variables, then the TM ensures strict serializability (resp. opacity). The reduction theorem relies on certain structural properties of transactional memories. These properties are satisfied by all TMs that we discussed in the previous section. For every property, we also give more details on why the mentioned TMs satisfy these properties. Note that the properties are sufficient (and not necessary) conditions for the reduction theorem to hold. We define four structural properties for TMs. Let M be a transactional memory. Let p be a program on n threads and k variables. Let w be a finite prefix of a word in M (p).
P1. Transaction projection. Aborting and unfinished transactions can influence other transactions only by forcing them to abort. Thus, removing all aborting transactions and some of the unfinished transactions does not change the response of the TM to the remaining statements. Formally, let X be the set of transactions in w. We define the transaction projection of w on X′ ⊆ X as the subsequence of w that contains every statement of all transactions in X′. The property P1 states that the transaction projection of w on X′, where X′ contains all committing transactions, no aborting transactions, and any subset of the unfinished transactions in w, is in M(p′) for some program p′. For instance, a TM satisfies P1 if for every thread t: (i) whenever a statement of an aborting or unfinished transaction of thread t changes the state of another thread u, then u cannot commit, and (ii) upon an abort, the state of t is reset to the initial thread state of t.
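The transaction projection used in P1 can be illustrated with a short sketch. The statement encoding (command, thread), the function name, and the parameter keep_unfinished, which selects the unfinished transactions to retain by naming their threads, are assumptions made for illustration.

```python
def transaction_projection(w, keep_unfinished=()):
    """Project w on X': all committing transactions, no aborting ones, and
    the unfinished transaction of each thread in keep_unfinished.

    A statement is (command, thread); a command is "commit", "abort",
    ("read", v) or ("write", v)  (assumed encoding).
    """
    ends = [None] * len(w)   # how the transaction of each statement ends
    pending = {}             # thread -> positions of its open transaction
    for i, (c, t) in enumerate(w):
        pending.setdefault(t, []).append(i)
        if c in ("commit", "abort"):
            for j in pending.pop(t):
                ends[j] = c
    return [s for i, s in enumerate(w)
            if ends[i] == "commit"
            or (ends[i] is None and s[1] in keep_unfinished)]

w = [(("read", "v1"), "t1"), (("write", "v1"), "t2"), ("abort", "t1"),
     (("write", "v2"), "t3"), ("commit", "t2"), (("read", "v2"), "t1")]
print(transaction_projection(w, {"t3"}))
# keeps the committing transaction of t2 and the unfinished one of t3
```

Because a thread has at most one unfinished transaction in a finite word, naming the thread is enough to identify that transaction.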
P2. Thread symmetry. For non-overlapping transactions, the TM is oblivious to the identity of the thread executing the transaction. The property P2 states that if (i) w has no aborting transactions, and (ii) there exist two threads u and t such that for all committing transactions x of u and y of t in the word w, either x <_w y or y <_w x, then the word w′ obtained by renaming all transactions of thread u to be from thread t is a finite prefix of a word in M(p′) for some program p′ on n − 1 threads and k variables. For instance, a TM satisfies P2 if (i) the thread state is set to the initial thread state upon a commit, and (ii) the transition relation is identical for all threads.
P3. Variable projection. If a transaction can commit, then removing all statements that involve some particular variables does not cause the transaction to abort. We define the variable projection of w on V′ ⊆ V as the subsequence of w that contains all commit and abort statements, and all read and write statements to variables in V′. The property P3 states that if w has no aborting transactions, then for all V′ ⊆ V, the variable projection of w on V′ is in M(p′), where p′ is obtained by removing all read and write statements to variables in V \ V′ from all thread programs in p. For instance, a TM satisfies P3 if reading or writing a variable does not remove a conflict on other variables. All TMs we know of satisfy P3 as they track every variable accessed by every thread independently.

P4. Monotonicity. If a word is allowed by the TM, then more sequential forms of the word are also allowed. Formally, let F ⊆ S* be the set of opaque (resp. strictly serializable) words with exactly one unfinished transaction. We define a function seq : F → 2^F such that if w2 ∈ seq(w1) and y is the unfinished transaction in w1, then (i) com(w2) is sequential and strictly equivalent to com(w′1), and (ii) all statements of y in w′1 occur in w2 in some order such that the order of all conflicts of global reads in y with other transactions in w′1 is preserved, where w′1 is obtained from w1 by adding, for every transaction x that commits before y in w, a write of an auxiliary variable vxy to x, and a read of vxy to y. (These variables are introduced to maintain the order of transactions.) The monotonicity property for opacity (resp. strict serializability) states that if w = w′ · s, where w′ ∈ F, and s is not an abort, and s is a statement of the unfinished transaction in w′, then for every word w2 ∈ seq(w′), the word w2 · s is a finite prefix of a word in M(p′) for some program p′. For instance, a TM satisfies P4 if it is unfinished commutative and commit commutative.
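The variable projection that appears in P3 is a simple filter on words, as the following sketch shows. The statement encoding (command, thread) and the function name are assumptions made for illustration.

```python
def variable_projection(w, keep_vars):
    """Keep all commits and aborts, and the reads and writes to keep_vars.

    A statement is (command, thread); a command is "commit", "abort",
    ("read", v) or ("write", v)  (assumed encoding).
    """
    return [(c, t) for (c, t) in w
            if c in ("commit", "abort") or c[1] in keep_vars]

w = [(("read", "v1"), "t1"), (("write", "v1"), "t2"), (("write", "v2"), "t2"),
     ("commit", "t2"), (("read", "v2"), "t1"), ("commit", "t1")]
print(variable_projection(w, {"v1"}))
# drops the two statements on v2 and keeps both commits
```

Note that a transaction may become a lone commit under this projection; commit and abort statements are always retained.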
A TM is unfinished commutative if for all words wp, wq, ws ∈ S*, if the word wp · wq · s · ws is a finite prefix of a word in M(p), where s is a global read and no statement in wq conflicts with s, then wp · s · wq · ws is a finite prefix of a word in M(p') for some program p'. A TM is commit commutative if for all words wp, wq, ws ∈ S*, if wp · wq · s · ws is a finite prefix of a word in M(p), where s is a commit of some transaction x and no statement in wq conflicts with s, then the word wp · x · w'q · ws is a finite prefix of a word in M(p') for some program p', where w'q is the word obtained by removing transaction x from wq. The idea is that with these commutativity rules, an interleaved word can be made sequential. The TMs 2PL, DSTM, TL2, and OCC are unfinished commutative and commit commutative, and thus satisfy monotonicity.
Theorem 1. If a TM M ensures strict serializability (resp. opacity) for all programs on two threads and two variables, and satisfies the properties P1, P2, P3, and P4 for strict serializability (resp. opacity), then M ensures strict serializability (resp. opacity).
Proof. We prove the theorem for strict serializability; a similar proof holds for opacity. The proof is by contradiction. Let p be a program in P n,k. Let w be a word in M(p) such that w is not strictly serializable. Let wp be the longest finite prefix of w such that wp is strictly serializable and let w1 = wp · s, where s = (c, t) is a statement of transaction x. Let X be the set of committed transactions in wp. By property P1, there exists a word w2 generated by projecting w1 to X ∪ {x} such that w2 is a finite prefix of a word in M(p2) for some program p2. We note that w2 = w'p · s, where w'p is strictly serializable and w2 is not strictly serializable. So, using property P4 for strict serializability, there exists a word w''p ∈ seq(w'p) such that the word w3 = w''p · s is a finite prefix of a word in M(p2). In w3 only one transaction, x, does not execute sequentially. Using property P2, we rename the threads for the transactions in w3: we let all transactions except x be executed by thread u. Let this renaming give the word w4. We note that the last statement of x is a commit. As w4 is not strictly serializable, we know (by the definition of conflict) that one of the following holds: (i) s1 = ((read, v1), t) and s2 = ((read, v2), t) are global reads of transaction x such that some transaction y of thread u writes to v1 and some transaction y' of u with y = y' or y <w4 y' writes to v2, and both commit between s1 and s2 (note that y and y' cannot overlap due to the structure of w4), or (ii) s1 = ((read, v1), t) is a global read of transaction x such that some transaction y of thread u writes to v1 and commits after s1, and there is a committing transaction y' with y = y' or y <w4 y' which has a command (read, v2) or (write, v2), and x also writes to v2. (Note that v1 may be the same as v2.) Let w5 be the variable projection of w4 on {v1, v2}. By property P3, we know that w5 is a finite prefix of a word in M(p5) for some program p5 on two threads and two variables.
Also, we note that w5 is not strictly serializable. As M ensures strict serializability for all programs on two threads and two variables, no such program p5 can exist, which is a contradiction.
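The two syntactic operations used in the proof, thread renaming (P2) and variable projection (P3), can be illustrated with a small sketch. The encoding is an assumption made for illustration only: a statement is a pair (command, thread), where a command is either the string "commit" or "abort", or a pair such as ("read", "v1"). The function names are hypothetical and the sketch only manipulates words; it does not check membership in M(p).

```python
def variable_projection(word, keep):
    """Return the subsequence of `word` with all commit and abort statements
    and only those reads and writes that access a variable in `keep`."""
    result = []
    for command, thread in word:
        if command in ("commit", "abort"):
            result.append((command, thread))
        elif command[1] in keep:  # command is (kind, variable)
            result.append((command, thread))
    return result


def rename_thread(word, old, new):
    """Return `word` with every statement of thread `old` assigned to `new`."""
    return [(command, new if thread == old else thread)
            for command, thread in word]


# A word in the shape of w4: transaction x runs on thread "t",
# every other transaction runs on thread "u".
w4 = [
    (("read", "v1"), "t"),
    (("write", "v1"), "u"),
    (("write", "v3"), "u"),
    ("commit", "u"),
    (("read", "v2"), "t"),
    ("commit", "t"),
]

# Projection on {v1, v2} drops the write to v3 and nothing else.
w5 = variable_projection(w4, {"v1", "v2"})
print(w5)
```

The projected word mentions two threads and at most two variables, which is the shape required by the hypothesis of Theorem 1.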
The reference TM algorithms
To verify the safety properties of a transactional memory, we take the following approach. We construct a reference TM algorithm for strict serializability (RSS TM algorithm), whose language is exactly the set of all strictly serializable words. Similarly, we construct a reference TM algorithm for opacity (RO TM algorithm), whose language is exactly the set of all opaque words. Then, we show that a given TM defined by a TM algorithm A ensures strict serializability (resp. opacity) iff for all n and k, all words in L(A n,k ) are in the language of the RSS (resp. RO) TM transition system for n threads and k variables. If the given TM satisfies the structural properties presented in the previous section, it is sufficient to check that all words in L(A 2,2 ) are in the language of the RSS (resp. RO) TM transition system for 2 threads and 2 variables.
The key insight that makes our technique work is that the reference TM algorithms for strict serializability and opacity for two threads and two variables can be defined as finite-state transition systems. This is not obvious, as threads may be delayed arbitrarily, transactions may contain arbitrarily many statements and may be aborted arbitrarily often. We present the RSS TM transition system first, because it provides the basis for defining the RO TM transition system. Suitable finite-state reference TM transition systems can also be defined for stronger notions of safety, such as the notions described by Scott [Sco06] , by modifying the semantics of conflict.
The reference TM algorithm for strict serializability
The classical approach to checking whether a word is strictly serializable is to construct a directed graph G = (V, E), called the conflict graph [Pap79], of the committing transactions in the word. The conflict graph captures the precedence of the committing transactions based on the conflicts. Given a word w = s0s1 . . ., the transactions in w form the set V of vertices in the conflict graph. There exists an edge from a vertex v1 to a vertex v2 if v2 commits or aborts before v1 starts, or a statement si of v1 conflicts with a statement sj of v2 and i > j. The conflict graph G is acyclic iff the word w is strictly serializable. We note that the size of this construction is unbounded. The following parameterized word illustrates the point: wm = ((read, v1), t1), (((write, v1), t2), (commit, t2))^m, (commit, t1). The number of vertices in the conflict graph of wm is m + 1. Thus, we cannot aim to create a finite transition system for the RSS TM algorithm using conflict graphs. We give a first finite-state representation for the language of strictly serializable words when transactions may abort. The idea of maximal serializability was also addressed earlier [FR85] for a bounded number of non-aborting transactions with a bounded number of statements per transaction. The idea was built upon a notion of transitive conflicts, which does not hold when transactions may abort.
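The conflict graph construction and the growth of its vertex set can be sketched as follows. This is a simplified illustration, not the paper's exact definition: it assumes the classical notion of conflict (two statements of different transactions access the same variable and at least one is a write), whereas the paper's definition of conflict is given in an earlier section. Statements are encoded as (command, thread) pairs, and all function names are hypothetical.

```python
from collections import defaultdict


def transactions(word):
    """Split a word into finished transactions (unfinished ones are dropped)."""
    open_txn, done = {}, []
    for i, (command, thread) in enumerate(word):
        txn = open_txn.setdefault(thread, {"start": i, "ops": []})
        if command in ("commit", "abort"):
            txn["end"] = i
            txn["committed"] = command == "commit"
            done.append(open_txn.pop(thread))
        else:
            kind, var = command
            txn["ops"].append((i, kind, var))
    return done


def conflict_graph(word):
    """Return (number of vertices, edges) over the committing transactions."""
    txns = [t for t in transactions(word) if t["committed"]]
    edges = defaultdict(set)
    for a, ta in enumerate(txns):
        for b, tb in enumerate(txns):
            if a == b:
                continue
            if tb["end"] < ta["start"]:  # tb finishes before ta starts
                edges[a].add(b)
            for i, kind_i, var_i in ta["ops"]:
                for j, kind_j, var_j in tb["ops"]:
                    # Assumed classical conflict: same variable, one write.
                    if i > j and var_i == var_j and "write" in (kind_i, kind_j):
                        edges[a].add(b)
    return len(txns), edges


def is_acyclic(n, edges):
    """Depth-first search for a cycle; 0 = unvisited, 1 = on stack, 2 = done."""
    state = [0] * n

    def visit(v):
        state[v] = 1
        for u in edges.get(v, ()):
            if state[u] == 1 or (state[u] == 0 and not visit(u)):
                return False
        state[v] = 2
        return True

    return all(state[v] != 0 or visit(v) for v in range(n))


def w(m):
    """The parameterized word w_m from the text."""
    middle = [(("write", "v1"), "t2"), ("commit", "t2")] * m
    return [(("read", "v1"), "t1")] + middle + [("commit", "t1")]


n, edges = conflict_graph(w(3))
print(n, is_acyclic(n, edges))
```

Increasing m adds one vertex per repetition, so any checker that stores the graph explicitly needs unbounded memory, which is the obstacle the prohibited read and write sets are designed to avoid.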
The key idea to get around the problem of infinite states is to maintain sets called prohibited read and write sets for every thread. These sets allow us to handle unbounded delay between transactions, as committing transactions store the required information in the sets of other threads. Once a transaction commits or aborts, we need not remember it (unlike conflict graphs). Thus, we need to store information about at most one transaction per thread. The RSS TM transition system is based on the following observation: Every committing transaction serializes at some point during its execution. The RSS TM transition system makes a non-deterministic guess of when a transaction serializes. Depending upon the guess, the transition system checks upon the commit of a transaction whether the commit can be executed or the transaction needs to abort.
Formally, we define the RSS TM algorithm Ass as a family of RSS TM transition systems. The RSS TM transition system A n,k ss for n threads and k variables is given by the tuple ⟨Q, qinit, D, π, δ⟩. The set of commands is D = C ∪ {serialize}. The transition relation δ is nondeterministic. A transition (q1, c, (d, t), r, q2) is in δ if c is enabled in q1 for thread t and one of the following holds. When a thread t reads v globally, v is added to the read set, and if the status of t is finished, then the status of t is changed to started, else if the status of t is serialized and v is in the prohibited read set, then the status of t is changed to invalid. When a thread t writes to v, the variable v is added to the write set, and if the status of t is finished, then the status of t is changed to started, else if the status of t is serialized and v is in the prohibited write set, then the status of t is changed to invalid. When a thread t commits, if the current status of t is serialized or finished, then the following happen: The status of t is set to finished. For every predecessor thread u of t, all variables in the write set of t are added to the prohibited read set and the prohibited write set of u, and all variables in the read set of t are added to the prohibited write set of u. For all predecessor threads u of t such that the write set of u intersects with the read set or write set of t, the status of u is set to invalid. For all threads u that are not predecessors of t such that the read set of u intersects with the write set of t, the status of u is set to invalid.
Figure 2. We use the same notation as in Table 1. The commits inside ovals are disallowed by the RSS algorithm. Each condition shows various cases. The arrows represent different possible positions for a command to occur in a given condition.
Serialize. (i)
For every state q1 ∈ Q, a transition (q1, c, (abort, t), 0, q2) is in δ if c ∈ C is enabled in q1 for thread t, and rs Note that the non-determinism in the transition relation comes from the serialize command, and the fact that abort is allowed in every state where a command is enabled.
Theorem 2. Given a word w on n threads and k variables, the word w is strictly serializable if and only if w ∈ L(A n,k ss).
Proof. Consider a run ρ of A n,k ss. Let w1 be an arbitrary finite prefix of the sequence of all successful statements in ρ, and let X be the set of finished transactions in w1. Let w' be the sequential word such that w' is transaction equivalent to w1 and x <w' y if the serialize command of transaction x comes before that of transaction y in ρ. (Note that every non-empty transaction has the serialize command exactly once.) Then, com(w') is strictly equivalent to com(w1) if for every transaction x ∈ X, either the transaction x does not commit in w1, or none of the following conditions holds for x (graphically shown in Figure 2):
C1. There exists a transaction y such that x serializes before y and y writes to a variable v and commits, and then x globally reads v.
C2. There exists a transaction y such that x serializes before y and x writes to v and y reads v before x commits, and y commits.
C3. There exists a transaction y such that x serializes before y and both x and y write to a variable v, and y commits before x does.
C4. There exists a transaction y such that x serializes after y and y writes to v and x reads v before y commits, and then y commits.
The RSS TM transition system A n,k ss guarantees by construction that a transaction x does not commit iff one of the conditions C1-C4 holds. Hence, every word in L(A n,k ss) is strictly serializable. Conversely, let w be strictly serializable. Consider an arbitrary finite prefix w2 of w. As w2 is strictly serializable, there is a sequential word w' such that com(w') is strictly equivalent to com(w2). Let the committing transactions in the sequential word w' be given by the sequence x1x2 . . . of transactions. Consider a run ρ of the RSS TM transition system A n,k ss such that w2 is a finite prefix of all successful statements of ρ, and for all i and j such that i < j, the transaction xi serializes before xj in ρ. The run ρ exists because (i) the RSS TM transition system guesses every possible serialization for every transaction during its execution, and (ii) given that w2 is strictly serializable, there is no transaction x in the sequence x1x2 . . . that satisfies any of the conditions C1-C4 and commits in w2. Thus, the word w is in the language L(A n,k ss).
The reference TM algorithm for opacity
Apart from the requirements of the above mentioned reference TM algorithm for strict serializability, opacity requires that even global reads of aborting transactions observe consistent values. It turns out that we can obtain a finite-state representation of the RO TM transition system by slightly modifying our RSS TM transition system.
The RO TM transition system is based on the following observation: Every committing and aborting transaction should serialize at some point during its execution. Like the RSS TM transition system, the RO TM transition system makes a non-deterministic guess of when a transaction serializes. In this case, the transition system checks upon every global read and every commit of a transaction whether the command can be executed or the transaction needs to be aborted. The formalism for the RO TM algorithm Aop and the RO TM transition system A n,k op is identical to that of the RSS TM algorithm and the RSS TM transition system. The only difference comes in the transition relation δ, on a global read, and on a serialize command. We obtain the transition relation for A for all threads u = t. When a thread t reads v globally, if v is not in the prohibited read set, then the following happen: v is added to the read set. If the status of t is finished, then it is changed to started. For every other thread u with status serialized such that t is not a predecessor of u, the variable v is added to the prohibited write set of u, and if v is in the write set of u, then the status of u is set to invalid.
Table 2. Time for simulation checking for TM algorithms on a quad dual core 2.8 GHz server with 16 GB RAM. In case simulation holds, we write YES followed by the time required for the simulation. Otherwise, we write NO followed by the counterexample produced, followed by the time required to prove that no simulation exists, followed by the time required to find the counterexample. A '*' for the search for simulation relation means that it does not complete in 2 hours, and we try to find a counterexample.
Upon any command of thread t, if the current status of t is started, and if the thread chooses to serialize, then the following happen: If there is a thread u with status started such that the read set of u intersects with the write set of t, then the status of t is set to invalid, else the status of t is set to serialized. All variables in read sets of threads with status started are added to the prohibited write set of t. All threads with status serialized are added to the predecessor set of t. For every other thread u, if the status of u is serialized and the write set of u intersects with the read set of t, then the status of u is set to invalid. For every thread u with status serialized, the read set of t is added to the prohibited write set of u.
Theorem 3. Given a word w on n threads and k variables, the word w is opaque if and only if w ∈ L(A n,k op ).
Implementation and simulation checking
A TM defined by a TM algorithm A ensures strict serializability
op )). As checking language inclusion is PSPACE-hard, we use the common technique of checking for the existence of a simulation relation between both transition systems. The existence of a simulation relation is a sufficient condition for language inclusion. We write A 2,2 1 ≺ A 2,2 2 to denote that there exists a simulation relation between A 2,2 1 and A 2,2 2 . For a TM M defined by a TM algorithm A which satisfies the structural properties of the reduction theorem (Theorem 1), M ensures strict serializability (resp. opacity) if A 2,2 ≺ A 2,2 ss (resp. A 2,2 ≺ A 2,2 op ). We built an automatic verification tool in C for checking the existence of simulation relations using the quadratic algorithm by Henzinger et al. [HHK95] . The tool is conceived as a platform for the automatic verification of TMs that satisfy the structural properties. We mention that simulation checking requires extra technical care in this scenario due to different extended alphabets in different TMs. The tool takes as input two TM algorithms A1 and A2, and checks whether A 2 ). In certain cases, it is possible that although language inclusion holds, the tool cannot find a simulation relation. Thus, our decision procedure is sound but not complete. For all TM transition systems we considered, our tool terminates after finding a simulation relation, or a counterexample.
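The idea of establishing language inclusion through a simulation relation can be illustrated with a naive fixpoint computation. This sketch is not the algorithm of [HHK95] and is not the authors' tool: it starts from all state pairs and repeatedly removes pairs that violate the simulation condition. It also ignores the technical issue of differing extended alphabets mentioned above. Transition systems are assumed to be dictionaries mapping each state to a list of (label, successor) pairs; all names are hypothetical.

```python
def simulates(ts1, init1, ts2, init2):
    """Return True if state init2 of ts2 simulates state init1 of ts1.

    Every state must appear as a key of its dictionary, possibly with an
    empty list of outgoing transitions.
    """
    relation = {(p, q) for p in ts1 for q in ts2}
    changed = True
    while changed:
        changed = False
        for p, q in list(relation):
            for label, p_next in ts1[p]:
                matched = any(
                    l == label and (p_next, q_next) in relation
                    for l, q_next in ts2[q]
                )
                if not matched:
                    # q cannot mimic this step of p, so drop the pair.
                    relation.discard((p, q))
                    changed = True
                    break
    return (init1, init2) in relation


# impl performs a, then chooses between b and c.
impl = {"p0": [("a", "p1")], "p1": [("b", "p2"), ("c", "p3")],
        "p2": [], "p3": []}
# spec chooses between a.b and a.c already at the first step.
spec = {"q0": [("a", "q1"), ("a", "q2")], "q1": [("b", "q3")],
        "q2": [("c", "q4")], "q3": [], "q4": []}

print(simulates(spec, "q0", impl, "p0"))
print(simulates(impl, "p0", spec, "q0"))
```

The two systems generate the same finite words, yet spec does not simulate impl because spec commits to b or c one step too early. This is the kind of situation in which language inclusion holds but no simulation relation is found, which is why the procedure is sound but not complete.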
The results of our simulation checks are presented in Table 2. Our results demonstrate that all TMs discussed in Section 3 (sequential, 2PL, DSTM, and TL2) are simulated by both reference TM transition systems. As for the OCC TM, it is simulated by the RSS TM transition system, but not by the RO TM transition system. The tool gives a counterexample in the latter case. Our results establish the following theorem.
Theorem 4. The sequential TM, two-phase locking, DSTM, and TL2 ensure opacity. The optimistic concurrency control ensures strict serializability, but not opacity.
Our tool discovered a subtle point in TL2. In the description of the published TL2 algorithm, we found the order of two operations, validating the read set (rvalidate) and checking whether a variable in the read set is locked (chklock), to be ambiguous. We modeled these operations as two separate atomic operations, such that chklock happens after rvalidate, to obtain a modified TL2 TM algorithm. The tool found that the modified TL2 TM algorithm is not simulated by the RSS TM transition system, and the tool provided a counterexample. Thus, we conclude that the modified TL2 TM does not ensure strict serializability, and thus does not ensure opacity. In the published TL2 algorithm, the authors maintain the version number and the lock bit of every variable in the same memory word. This ensures that the two operations chklock and rvalidate execute atomically, and thus they can be executed in any order. So, our experiments show that the correctness of TL2 is based on the subtle fact that either the version number and the lock bit have to be accessed atomically, or rvalidate has to occur after chklock.
Verifying liveness
We define two different notions of liveness, obstruction freedom and livelock freedom, as discussed in the transactional memory literature. A third notion, wait freedom [Her91] , implies livelock freedom. Since we will show that none of our example TMs satisfy livelock freedom, they do not satisfy wait freedom either.
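A word w ∈ Ŝ^ω is obstruction free [HLM03] if for all threads t, if the word w has an infinite number of aborts of t, then w has an infinite number of commits of t or there are infinitely many statements of some thread u ≠ t. Formally, w is obstruction free if $\bigwedge_{t} \big( \Box\Diamond(\mathit{abort}, t) \rightarrow ( \Box\Diamond(\mathit{commit}, t) \vee \bigvee_{u \neq t} \Box\Diamond(\cdot, u) ) \big)$, where $(\cdot, u)$ stands for an arbitrary statement of thread u. This is a Streett condition.
A word w ∈ Ŝ^ω is livelock free [AKH03] if the word has an infinite number of commits, or there is a thread t such that t has infinitely many statements and finitely many aborts in w. Formally, w is livelock free if $\bigvee_{t} \Box\Diamond(\mathit{commit}, t) \vee \bigvee_{t} \big( \Diamond\Box\neg(\mathit{abort}, t) \wedge \Box\Diamond(\cdot, t) \big)$. Note that livelock freedom implies obstruction freedom.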
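For ultimately periodic words of the form prefix · loop^ω, both definitions reduce to a check on the repeated part, since exactly the statements of the loop occur infinitely often. The following sketch makes this concrete under that assumption; statements are encoded as (command, thread) pairs, and the function names are hypothetical.

```python
def obstruction_free(loop):
    """Check obstruction freedom of prefix . loop^omega (prefix irrelevant)."""
    for t in {thread for _, thread in loop}:
        aborts = any(c == "abort" and th == t for c, th in loop)
        commits = any(c == "commit" and th == t for c, th in loop)
        others = any(th != t for _, th in loop)
        # Infinitely many aborts of t require infinitely many commits of t
        # or infinitely many statements of some other thread.
        if aborts and not (commits or others):
            return False
    return True


def livelock_free(loop):
    """Check livelock freedom of prefix . loop^omega (prefix irrelevant)."""
    if any(c == "commit" for c, _ in loop):
        return True
    # Otherwise some thread must take steps forever without aborting.
    return any(
        all(not (c == "abort" and th == t) for c, th in loop)
        for t in {thread for _, thread in loop}
    )


solo_abort = [(("read", "v1"), "t1"), ("abort", "t1")]
mutual_abort = [(("read", "v1"), "t1"), ("abort", "t1"),
                (("read", "v1"), "t2"), ("abort", "t2")]

print(obstruction_free(solo_abort), livelock_free(solo_abort))
print(obstruction_free(mutual_abort), livelock_free(mutual_abort))
```

The second example separates the two notions: two threads that keep aborting each other satisfy obstruction freedom, because no thread runs in isolation, but they violate livelock freedom.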
A TM M ensures obstruction freedom (resp. livelock freedom) for all programs with n threads and k variables if for every program p ∈ P n,k , every word w ∈ M (p) is obstruction free (resp. livelock free). M ensures obstruction freedom (resp. livelock freedom) if M ensures obstruction freedom (resp. livelock freedom) for all programs with an arbitrary number of threads and variables. We use the formalism of TM algorithms to verify liveness properties of TMs. We define a loop l in a TM transition system A n,k as a finite word s0 . . . sm such that there exists a run q0, l0, s0, r0 . . . qm, lm, sm, rm of A n,k such that q0 = qm. Note that every word w that is not obstruction free violates at least one of the conjuncts of the Streett condition stated above. Each conjunct (Streett pair) corresponds to one thread. A word w can violate the condition for thread t only if, from some point on, w has only statements of t. Note that in this case w trivially satisfies the Streett pairs for other threads. This fact allows us to use a simple model checking procedure, even though obstruction freedom is formally a Streett condition.
In particular, a TM defined by a TM algorithm A ensures obstruction freedom for all programs with n threads and k variables iff there is no loop l in A n,k such that all statements in l are from the same thread, and l contains no commit, and l contains an abort. Similarly, a TM ensures livelock freedom for all programs with n threads and k variables iff there is no loop l in A n,k such that l contains no commit, and every thread that has a statement in l has an abort in l.
Reduction theorem for liveness
As we did for safety, we state a reduction theorem that proves that it is sufficient to verify liveness of a TM on programs with two threads and one variable to generalize the result to all programs. For this purpose, we describe two more structural properties of TMs. These properties are again satisfied by all TMs that we have discussed. Let w = w1 · w2 be an infinite word such that w is in M(p) for some program p, and no unfinished transaction in w1 has a statement in w2, and all statements in w2 are from the same thread, and there is no commit command in w2. For i ∈ {1, 2}, let Vi be the variables accessed in wi.
P5. Transaction projection. A thread t running in isolation (no interleaved step from other threads) shall abort repeatedly only if it conflicts with some unfinished transaction. As the number of threads is finite, and a thread can have at most one unfinished transaction, there are infinitely many aborts of t due to a particular thread. The property P5 states that (i) the word w'1 · w2 is in M(p') for some program p', where w'1 is obtained by taking the transaction projection of w1 on non-aborting transactions, and (ii) if w1 has no aborting transactions and w2 reads or writes only one variable, then there exists a word w' = w'1 · w2 ∈ M(p), where w'1 is obtained by projecting w1 to transactions of some thread t that has statements in w1. For instance, a TM satisfies P5 if the state of a thread is reset to the initial state upon an abort command, and every variable accessed by every thread is tracked independently.
P6. Variable projection. A thread t running in isolation shall abort repeatedly only if some commands corresponding to some variables are not allowed. As the number of variables is finite, there are infinitely many aborts of t due to a particular variable. The property P6 states that (i) there exists a word w1 · w'2 ∈ M(p') for some program p', where w'2 is the variable projection of w2 on {v} for some variable v ∈ V2, and (ii) if w1 has no aborting transactions, then the word w' = w'1 · w2 is in M(p') for some program p', where w'1 is the variable projection of w1 on V2. For instance, a TM satisfies P6 if the TM tracks every variable accessed by every thread independently.
Theorem 5. If a TM M satisfies properties P5 and P6, and M ensures obstruction freedom for two threads and one variable, then M ensures obstruction freedom.
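Proof. Let w ∈ M(p) be a word on an arbitrary number of threads and variables such that w is not obstruction free. As w is not obstruction free, it can be written in the form w1 · w2 as required by the properties P5 and P6. We can then use these properties to obtain a word w' on two threads and one variable such that w' ∈ M(p') for some program p'. The projections keep an infinite suffix of aborting statements of a single thread without a commit, so w' is not obstruction free either, which contradicts the assumption that M ensures obstruction freedom for two threads and one variable.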
Model checking liveness
We built a verification tool to check obstruction freedom and livelock freedom properties for transactional memories defined by TM algorithms. To check obstruction freedom, our tool tries to find a loop l in the TM transition system such that all statements in l are from the same thread, and l has no commit, and l has an abort. If the tool finds such a loop, the loop is a counterexample to obstruction freedom. If the tool does not find such a loop, we know that the TM ensures obstruction freedom. Similarly, to check livelock freedom, our tool tries to find a loop l in the TM transition system such that there is no commit in l, and every thread that has a statement in l has an abort in l.
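The loop search for obstruction freedom can be sketched as a reachability check on a restricted graph. This is an illustration under simplifying assumptions, not the authors' tool: a transition is a triple (source, (command, thread), target), the extended commands and responses of the TM transition system are omitted, and the list is assumed to contain only transitions between states reachable from the initial state. All names are hypothetical.

```python
from collections import defaultdict


def reachable(graph, source, target):
    """Return True if `target` can be reached from `source` in `graph`."""
    stack, seen = [source], {source}
    while stack:
        state = stack.pop()
        if state == target:
            return True
        for nxt in graph[state]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False


def violates_obstruction_freedom(transitions):
    """Look for a loop of one thread that has an abort and no commit."""
    threads = {thread for _, (_, thread), _ in transitions}
    for t in threads:
        graph = defaultdict(list)  # non-commit steps of thread t only
        aborts = []
        for src, (command, thread), dst in transitions:
            if thread != t or command == "commit":
                continue
            graph[src].append(dst)
            if command == "abort":
                aborts.append((src, dst))
        # An abort step lies on a loop iff its source is reachable
        # from its target inside the restricted graph.
        for src, dst in aborts:
            if reachable(graph, dst, src):
                return True
    return False


# Thread t1 can read and abort forever without any other thread moving.
bad_tm = [
    ("q0", (("read", "v"), "t1"), "q1"),
    ("q1", ("abort", "t1"), "q0"),
]
# Here t1 returns to q0 after an abort only through a step of t2.
good_tm = [
    ("q0", (("read", "v"), "t1"), "q1"),
    ("q1", ("commit", "t1"), "q0"),
    ("q1", ("abort", "t1"), "q2"),
    ("q2", (("read", "v"), "t2"), "q0"),
]

print(violates_obstruction_freedom(bad_tm))
print(violates_obstruction_freedom(good_tm))
```

A livelock freedom check along the same lines would have to consider loops with statements of several threads and require an abort of each participating thread, for example by inspecting strongly connected components of the commit-free graph.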
In this way, our tool provides a platform for TM designers to check which liveness properties are ensured by their TMs. If the liveness property fails, then the tool provides feedback in the form of a run that represents a counterexample. Our results are shown in Table 3 . The next theorem follows.
Theorem 6. DSTM ensures obstruction freedom and does not ensure livelock freedom. Sequential TM, two-phase locking, TL2, and optimistic concurrency control do not ensure obstruction freedom.
Related Work
There has been recent independent work on the formal verification of STM algorithms [COP + 07] . Cohen et al. model checked STMs applied to programs with a small number of threads and variables against the strong safety criteria of Scott [Sco06] . They do not offer a reduction theorem and do not consider liveness properties.
Our construction of the reference TM algorithms is related to the work of Fle and Roucairol [FR85] . They investigated the set of concurrent traces that are generated by a finite set of iterating transactions. They proved that the language consisting of all traces that are conflict equivalent to a sequential trace is regular. Their results cannot be applied in the presence of aborting transactions, as they require the transitivity of conflicts, which does not hold when transactions may abort.
There has been much research on the formal verification of relaxed memory models and cache-coherence protocols for modern multi-processors, e.g., [HQR99, Qad03, GYS04, BAM07] . In this work, the semantics of a shared memory is generally given by a memory consistency model, which defines the possible outcomes of executing a concurrent program. Since our approach specifically targets STM, we use a deferred update semantics rather than a memory consistency model.
Conclusion
We presented a new technique for verifying STM safety and liveness properties. The cornerstones of our technique are a finite-state representation for the languages of strictly serializable and opaque executions, and an automated verification tool for STMs. Our method applies to all STM protocols that satisfy certain structural properties, and we successfully verified opacity for 2PL, DSTM, and TL2, and the obstruction freedom of DSTM.
Currently, our framework does not apply when transactions help each other. For instance, we cannot model Fraser's STM [FH07] where threads help each other in order to ensure livelock freedom. For efficient performance during contention, many STM protocols rely on a contention manager, like the Polite or Karma contention manager of Scherer and Scott [SS05] . In this work, we do not handle some of these contention managers. We plan to extend our work by modeling different contention managers as nondeterministic transition systems. Also, our liveness properties capture deterministic notions. It will be interesting to account for probabilistic means to deal with contention, such as random exponential backoff.
We also assumed that the commands in the extended alphabet, like read, write, validate, and commit, are executed atomically. So, STM algorithms have to guarantee this level of atomicity to ensure correctness using our methodology. We are currently extending our work to reason about correctness when the lower-level primitives are not atomic.
