Modular Transactions: Bounding Mixed Races in Space and Time by Dongol, Brijesh et al.
Modular Transactions:
Bounding Mixed Races in Space and Time
Brijesh Dongol
University of Surrey, UK
b.dongol@surrey.ac.uk
Radha Jagadeesan
DePaul University, USA
rjagadeesan@cs.depaul.edu
James Riely
DePaul University, USA
jriely@cs.depaul.edu
Abstract
We define local transactional race freedom (LTRF), which provides
a programmer model for software transactional memory.
LTRF programs satisfy the SC-LTRF property, thus allowing
the programmer to focus on sequential executions in which
transactions execute atomically. Unlike previous results,
SC-LTRF does not require global race freedom. We also provide a
lower-level implementation model to reason about quiescence
fences and validate numerous compiler optimizations.
CCS Concepts • Theory of computation → Parallel
computing models; Abstraction;
1 Introduction
For concurrent programs communicating via a shared-mem-
ory subsystem that includes locks, the SC-DRF property
states that a Data Race Free program can be fully under-
stood by considering only executions that are Sequentially
Consistent, meaning that the shared-memory subsystem can
be modeled as a standard sequential store [3].
For programs that use transactions to augment or replace
locking, the analogous SC-TRF property states that for
Transactionally Race-Free programs, it suffices to consider
executions that are SC and where transactions are executed
atomically. For TRF programs, SC-TRF implies opacity [15, 16],
which generalizes SC-DRF to include aborted and live
transactions. SC-TRF is a conditional form of operational
refinement: for TRF programs, “every behavior a user can observe
of a program using a TM implementation can also be
observed when the program uses an abstract TM that executes
each block atomically” [22].
Dongol is supported by EPSRC grant EP/R032556/1. Jagadeesan and Riely
are supported by National Science Foundation CCR-1617175.
PPoPP ’19, February 16–20, 2019, Washington, DC, USA
© 2019 Association for Computing Machinery.
ACM ISBN 978-1-4503-6225-2/19/02. $15.00
https://doi.org/10.1145/3293883.3295708
Reasoning with SC-TRF is powerful, particularly for mixed-mode
access, where a single location is accessed both
transactionally and nontransactionally. A common idiom is
privatization, shown in the following program.
    atomic_a { if !y then x:=1 } || atomic_b { y:=1 }; x:=2
Here, there are two threads, separated by parallel composition.
Transactions are denoted by atomic blocks, with transaction
names as subscripts to facilitate discussion. The first
thread atomically reads y and updates x if y is 0 (the initial
value). The second thread atomically writes y, then executes
a plain (nontransactional) write to x.
Reasoning sequentially and assuming all transactions commit,
it is impossible for the program to terminate with x = 1
since the atomic blocks must appear to occur in some serial
order. Suppose a serializes first: then the write of 1 to x,
denoted $\langle Wx1\rangle$, must precede $\langle Wx2\rangle$, and the final result
is 2. Suppose b serializes first: then there will be no $\langle Wx1\rangle$,
since a must read the value 1 that b wrote to y.
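The case analysis above can be checked mechanically. The following is a minimal sketch, assuming an idealized semantics in which each atomic block runs as one indivisible step over a sequentially consistent store; the function names and the step labels "a", "b" and "plain" are illustrative and do not come from the paper.

```python
from itertools import permutations

def run_privatization(order):
    """Execute the steps atomically in the given order; return the final x."""
    mem = {"x": 0, "y": 0}
    for step in order:
        if step == "a":        # atomic_a { if !y then x:=1 }
            if mem["y"] == 0:
                mem["x"] = 1
        elif step == "b":      # atomic_b { y:=1 }
            mem["y"] = 1
        else:                  # plain write x:=2
            mem["x"] = 2
    return mem["x"]

def privatization_outcomes():
    """Final values of x over all orders that keep b before the plain write."""
    outcomes = set()
    for order in permutations(["a", "b", "plain"]):
        # Program order in the second thread: b precedes the plain write.
        if order.index("b") < order.index("plain"):
            outcomes.add(run_privatization(order))
    return outcomes

print(privatization_outcomes())
```

Under these assumptions every serial order ends with x = 2, which matches the claim that x = 1 is not a possible final value.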
Thus, the atomic blocks are used to synchronize threads.
In the case that x:=2 is replaced with some costly computa-
tion, the privatization idiom can be used to reduce computa-
tional costs inside atomic blocks.
The reverse idiom is publication, exemplified by:
    x:=1; atomic_a { y:=1 } || atomic_b { z:=2; if y then z:=x }
Reasoning as before, it is impossible for the program to
terminate with z = 0. Suppose transaction a serializes first: then
b must see both $\langle Wx1\rangle$ and $\langle Wy1\rangle$ and therefore end by
writing $\langle Wz1\rangle$. Suppose b serializes first: then there will be
no second write to z, since the only available value for y is
the initial value 0, and thus the last write to z is $\langle Wz2\rangle$.
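The same style of check applies to publication. This is a minimal sketch under the same assumptions as any sequential reading of the program: atomic blocks are indivisible steps over a sequentially consistent store. The step labels and function names are hypothetical.

```python
from itertools import permutations

def run_publication(order):
    """Execute the steps atomically in the given order; return the final z."""
    mem = {"x": 0, "y": 0, "z": 0}
    for step in order:
        if step == "plain":    # plain write x:=1
            mem["x"] = 1
        elif step == "a":      # atomic_a { y:=1 }
            mem["y"] = 1
        else:                  # atomic_b { z:=2; if y then z:=x }
            mem["z"] = 2
            if mem["y"] != 0:
                mem["z"] = mem["x"]
    return mem["z"]

def publication_outcomes():
    """Final values of z over all orders that keep x:=1 before transaction a."""
    outcomes = set()
    for order in permutations(["plain", "a", "b"]):
        # Program order in the first thread: the plain write precedes a.
        if order.index("plain") < order.index("a"):
            outcomes.add(run_publication(order))
    return outcomes

print(publication_outcomes())
```

The final value of z is 1 or 2 in every serial order, so z = 0 does not arise sequentially.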
It is a direct consequence of sequential reasoning that
these outcomes must be forbidden. In the implementation of
Software Transactional Memory (STM), many performance
enhancements, such as optimistic execution, can result in a
failure of SC-TRF, allowing behaviors such as those above.
This has led to a tension between the programmer model and
the implementation of STMs, resulting in a large literature
on the subject, with many competing notions of transactional
race that abstract away implementation details to a greater
or lesser degree [1, 17, 24, 28].
In this paper, we emphasize the programmer model,
developing a high-level definition of a transactional race that
makes mixed-mode idioms safe by definition (§2 and §4). We
attempt to make the programmer model as broadly applicable
as possible by adapting the notion of local data race
developed by Dolan et al. [9]. At the same time, we show that
our model is efficiently implementable, in that it avoids
common pitfalls that overly constrain the STM implementation,
such as publication by antidependency or global lock atomicity
(§3). Our programmer model disables common compiler
optimizations; so, we develop a slightly more concrete
implementation model that supports compiler optimizations (§5).
We describe how to compile our model to x86 and ARMv8
(§6). We are inspired by Khyzha et al. [22], who followed the
same agenda for global races using a model similar to our
implementation model; we discuss related work in §7.
In addition to providing a novel programmer model, this
paper extends existing work in several ways.
Local Race Freedom. The SC-TRF property is a global property:
a race anywhere in the program is sufficient to nullify
the TRF property, typically resulting in undefined semantics.
Recently, Dolan et al. [9] proposed local DRF as an alternative
to global DRF for programs that synchronize via Java-like
volatiles. We propose the first local TRF property.
Local TRF is strictly more expressive than the global TRF
models considered in prior work. As a result, we are able
to provide an SC-LTRF guarantee, which applies to many
additional programs. Consider the variant of the well-known
independent reads of independent writes example below.
    atomic { x:=1 } || atomic { y:=1 }
    || atomic { r1:=x }; z:=1; atomic { r2:=y }
    || atomic { q1:=y }; z:=2; atomic { q2:=x }        (IRIW)
The following outcome cannot occur sequentially.
    Thread 1: Wx1
    Thread 2: Wy1
    Thread 3: Rx1, Wz1, Ry0
    Thread 4: Ry1, Wz2, Rx0
If the writes to z are removed, then SC-TRF reasoning allows
a programmer to conclude that this sequence of reads cannot
occur. However, with the writes to z included, SC-TRF
reasoning says nothing about this program, since, by any
definition, there is a race on z. SC-LTRF allows us to ignore
this race. Since no transactional variable is involved in a
race, we are guaranteed that every execution of this program
behaves as though the transactional portion were executed
sequentially with no interleaving of transactions. This example
illustrates spatial locality.
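The sequential claim about IRIW can be confirmed by brute force. The sketch below assumes a sequentially consistent store in which every transaction and every plain write is a single step; the tuple encoding of steps and the helper names are illustrative choices, not the paper's formalism.

```python
def interleavings(threads):
    """Yield every merge of the per-thread step lists that keeps program order."""
    if all(not t for t in threads):
        yield []
        return
    for i, t in enumerate(threads):
        if t:
            rest = threads[:i] + [t[1:]] + threads[i + 1:]
            for tail in interleavings(rest):
                yield [t[0]] + tail

def run(schedule):
    """Execute a schedule on a sequential store; return the register values."""
    mem = {"x": 0, "y": 0, "z": 0}
    regs = {}
    for kind, loc, arg in schedule:
        if kind == "w":
            mem[loc] = arg          # arg is the value written
        else:
            regs[arg] = mem[loc]    # arg is the destination register
    return regs

# One step per transaction or plain write, following the IRIW program.
IRIW = [
    [("w", "x", 1)],
    [("w", "y", 1)],
    [("r", "x", "r1"), ("w", "z", 1), ("r", "y", "r2")],
    [("r", "y", "q1"), ("w", "z", 2), ("r", "x", "q2")],
]

def forbidden_outcome_reachable():
    """Check whether r1=1, r2=0, q1=1, q2=0 arises in some interleaving."""
    target = {"r1": 1, "r2": 0, "q1": 1, "q2": 0}
    return any(run(s) == target for s in interleavings(IRIW))

print(forbidden_outcome_reachable())
```

The racy writes to z appear in the schedules but never influence the registers, which is the sequential counterpart of the spatial locality argument.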
To understand the temporal flavor of locality, consider the
following program that uses IRIW as a parallel component.
    x:=1; atomic { F++ } || x:=2; atomic { F++ }
    || atomic { r:=F }; if r = 2 then IRIW
Again, standard SC-TRF reasoning says nothing about this
program, since there are races on x. But there is no race on x
or y after the guard r = 2 becomes true; SC-LTRF allows us
to reason sequentially from that point, ensuring that IRIW
behaves as expected.
Thus, by adapting the notion of locality from [9], we
enable modular reasoning with transactions by isolating
transactional races from other data races, in both space and time.
Defined Behavior for Racy Programs. Most prior models
based on SC-TRF either give undefined semantics for
programs with races or assume that the underlying memory
model is sequentially consistent. We define the semantics of
programs using the relaxed memory model of [9], and thus
give a defined semantics for racy programs using a realistic
memory model.
Implementation-Level Reasoning. Most prior work relies
on programmers to place quiescence fences to guarantee
safety [22, 27, 34–36]. We connect our high-level model to
this previous work by developing a lower-level implementation
model that includes explicit fences. Our lower-level model
assumes only that the underlying transactional machinery
provides order between transactions that have a direct
dependency, e.g., as in the publication idiom. We note that
hardware transactions [5, 6, 10] support the ordering
assumptions of our lower-level model. Fences are necessary
only to provide order when there is no direct dependency, as
in the privatization idiom. We provide a correctness criterion
to realize our abstract programming model, and compare the
fences required to realize our high-level model to previous
approaches.
In addition to building on these aspects of prior work on
SC-TRF, we prove that common compiler optimizations are
sound under LTRF. Beyond the optimizations
validated by LDRF [9], we also validate some optimizations
specific to transactions, inspired by optimizations that are
sound with respect to locks [27]. For example, we show that
empty transactions can be elided, that the scope of
transactions can be increased, and that adjacent transactions can be
combined.
2 Programmer Model
Dolan et al. [9] give a semantics for a language using Java-like
volatiles for synchronization. We adapt their semantics to
isolated transactions [13, 26] (where plain actions may not be
causally interleaved with transactional actions). Transactions
are more general than volatiles in several ways:
• A transaction may abort.
• A transaction may both read and write.
• A transaction may access more than one location.
• The same location may be used in both transactional
and plain accesses.
We give the semantics of a program as a set of traces, each of
which is a sequence of actions (e.g., read, write, transaction
begin). Dolan et al. [9] give both an operational semantics
generating traces and an axiomatic semantics defined over
event graphs. We concentrate on the axiomatic treatment,
treating actions as events in an event graph and deriving
orders over these actions. We use the words trace and execution
interchangeably, preferring “trace” when the exact sequence
of actions is relevant and “execution” when it suffices to
consider the derived relations.
We have designed the semantics so that transactions be-
have exactly like the volatiles of [9] for degenerate traces in
which each transaction contains a single read or write action,
transactional and nontransactional locations are disjoint, and
each transaction is committed and contiguous.
In this section, we present a programmer model that vali-
dates mixed-mode idioms such as privatization, but fails to
validate common compiler optimizations. In §5, we give a
low-level model that validates compiler optimizations, but
only conditionally validates mixed-mode idioms.
Actions. The syntax of actions is as follows.
    $a, b, c \in \mathit{Act}$ (Action Id)
    $s, t \in \mathit{Thrd}$ (Thread Id)
    $x, y \in \mathit{Loc}$ (Location)
    $v, w \in \mathbb{Z}$ (Value)
    $q, p \in \mathbb{Q}$ (Timestamp)
    $\alpha ::= \langle a{:}sWxvq\rangle$ (Write)
        $\mid \langle a{:}sRxvq\rangle$ (Read)
        $\mid \langle a{:}sB\rangle$ (Begin)
        $\mid \langle a{:}sCb\rangle$ (Commit)
        $\mid \langle a{:}sAb\rangle$ (Abort)
Action ids are unique identifiers for actions. Thread ids
include the reserved thread id init, used for initialization. To
simplify the definition of initialization, we assume that the
set of locations is finite. We take values to be integers and
timestamps to be rationals, as in [9].
The write action $\langle a{:}sWxvq\rangle$ denotes a write of $v$ to $x$ by
thread $s$, with action name $a$. Likewise, $\langle a{:}sRxvq\rangle$ denotes a
read. The timestamp $q$ is used to encode relations between
these actions, as detailed below.
The begin action $\langle b{:}sB\rangle$ denotes the begin of a transaction
by thread $s$, with action name $b$. We also use $b$ as the
transaction name. The commit action $\langle a{:}sCb\rangle$ denotes the commit
of the transaction named $b$. Likewise, $\langle a{:}sAb\rangle$ denotes the
abort of $b$. We refer to commits and aborts collectively as
resolution actions.
We often drop components of the action syntax that are
not interesting for the discussion at hand, e.g., we may write
$\langle a{:}sWxvq\rangle$ as either $\langle a\rangle$, $\langle a{:}s\rangle$, $\langle Wx\rangle$, $\langle Wxv\rangle$, or $\langle Wxq\rangle$.
Traces and Transactions. A trace is a finite sequence of
actions $\alpha_1 \alpha_2 \cdots \alpha_n$. We use $\sigma$, $\tau$ to range over traces. We
only consider well-formed traces (defined below), which
begin with an initializing transaction of the form
$\langle b{:}\mathit{init}B\rangle \langle \mathit{init}Wx_1 v_1 0\rangle \cdots \langle \mathit{init}Wx_n v_n 0\rangle \langle \mathit{init}Cb\rangle$, which contains
exactly one write for each location, at timestamp 0. Here init
is a reserved thread name. In examples, we usually omit
this initializing transaction, assuming that all locations are
initialized to 0.
Each trace $\sigma = \alpha_1 \alpha_2 \cdots \alpha_n$ generates a total order $\xrightarrow{index}_\sigma$,
where $\alpha_i \xrightarrow{index}_\sigma \alpha_j$ iff $i < j$. Usually, the trace is clear
from context and we drop the subscript, preferring $\xrightarrow{index}$ to
$\xrightarrow{index}_\sigma$. We adopt this convention throughout, dropping the
subscript in definitions as well as examples.
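We derive several other relations from a trace, including
initialization order, program order, write-to-write order (aka
coherence) and write-to-read order (aka reads-from).
• $\langle a{:}s\rangle \xrightarrow{init} \langle b{:}t\rangle$ iff $s = \mathit{init} \neq t$.
• $\langle a{:}s\rangle \xrightarrow{po} \langle b{:}t\rangle$ iff $a \xrightarrow{index} b$ and $s = t$.
• $\langle a{:}Wxq\rangle \xrightarrow{ww} \langle b{:}Wyp\rangle$ iff $x = y$ and $q < p$.
• $\langle a{:}Wxvq\rangle \xrightarrow{wr} \langle b{:}Rywp\rangle$ iff $x = y$, $v = w$ and $q = p$.
All of these relations are irreflexive. $\xrightarrow{po}$ and $\xrightarrow{ww}$ are
transitive. The domain and range are disjoint for $\xrightarrow{init}$ and $\xrightarrow{wr}$.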
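The four derived relations can be computed directly from a trace. The following is a minimal sketch, assuming a simplified action record that holds only reads and writes (begin, commit and abort actions are omitted) and integer timestamps in place of rationals; the `Action` class and `derive` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Action:
    name: str     # action id
    thread: str   # thread id; "init" is the reserved initializing thread
    kind: str     # "W" for write, "R" for read
    loc: str      # location
    val: int      # value
    ts: int       # timestamp (the paper uses rationals)

def derive(trace):
    """Return the init, po, ww and wr relations as sets of name pairs."""
    idx = {a.name: i for i, a in enumerate(trace)}

    def pairs(cond):
        return {(a.name, b.name) for a in trace for b in trace if cond(a, b)}

    return {
        "init": pairs(lambda a, b: a.thread == "init" and b.thread != "init"),
        "po": pairs(lambda a, b: idx[a.name] < idx[b.name] and a.thread == b.thread),
        "ww": pairs(lambda a, b: a.kind == "W" and b.kind == "W"
                    and a.loc == b.loc and a.ts < b.ts),
        "wr": pairs(lambda a, b: a.kind == "W" and b.kind == "R"
                    and a.loc == b.loc and a.val == b.val and a.ts == b.ts),
    }

# A small trace: the initial write, two writes by thread s, one read by thread t.
trace = [
    Action("i", "init", "W", "x", 0, 0),
    Action("a", "s", "W", "x", 1, 1),
    Action("c", "s", "W", "x", 2, 2),
    Action("b", "t", "R", "x", 1, 1),
]
rels = derive(trace)
print(rels["po"], rels["wr"])
```

In this trace the read b takes its value from a because the location, value and timestamp all match, while ww orders the three writes by timestamp.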
In the context of a trace, we often refer to actions by name.
For example, we prefer “$a \xrightarrow{po} b$” to “$\langle a\rangle \xrightarrow{po} \langle b\rangle$”. We also
write “$a = \langle sWxvq\rangle$” rather than “$\exists i.\ \alpha_i = \langle a{:}sWxvq\rangle$”.
We take the name of the begin action to be the unique id for
each transaction. We say that action $a$ belongs to transaction
$b$ if $\langle b{:}B\rangle \xrightarrow{po} a$ and there is no commit or abort action $c$
such that $b \xrightarrow{po} c \xrightarrow{po} a$. We say that $a$ is transactional if it
belongs to some transaction, and plain otherwise.
Each trace induces an equivalence over action names,
relating actions that belong to the same transaction:
    $a \stackrel{tx}{\sim} b$ iff $a = b$ or $a$ and $b$ belong to the same transaction.
Note that plain actions are included in $\stackrel{tx}{\sim}$, although they only
relate to themselves.
There are three possible states for transactions: committed,
aborted and live. Committed and aborted transactions are
resolved. Committed and live transactions are nonaborted.
We use the same terminology to refer to all of the actions in a
transaction; thus, we may use “aborted write action” to refer
to a write action that belongs to an aborted transaction.
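We visualize traces as graphs. For example, the trace
$\langle a{:}\mathit{init}B\rangle \langle \mathit{init}Wx00\rangle \langle \mathit{init}Wy00\rangle \langle \mathit{init}Ca\rangle \langle b{:}sB\rangle \langle sWy11\rangle$
$\langle sWx11\rangle \langle sCb\rangle \langle c{:}tB\rangle \langle tRy11\rangle \langle tAc\rangle \langle d{:}tWx22\rangle$ is visualized
as follows.
[Figure: two renderings of the same execution. In the detailed rendering, transaction b contains Wy1 and Wx1, transaction c contains Ry1, and the plain action d:Wx2 follows c in program order, with a wr edge into c and a ww edge into d. The compact rendering shows only the names b, c and d with the same wr and ww edges.]
To avoid clutter, we drop the label on $\xrightarrow{po}$ and elide the
initializing transaction. Instead of including explicit begin and
resolution actions, we visualize transactions using boxes.
Committed and live transactions are drawn in solid boxes, colored
blue. Aborted transactions are drawn in dashed boxes,
colored red.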
Well-Formedness. A trace is well-formed if each of the
following holds:
WF1. The trace starts with an initializing transaction.
WF2. Action names are unique: if $a \xrightarrow{index} b$, then $a \neq b$.
WF3. Write timestamps are per-location unique:
     if $a = \langle Wxq\rangle$ and $b = \langle Wxq\rangle$, then $a = b$.
WF4. Each begin action has at most one resolution, and each
     resolution has exactly one begin action.
WF5. Each resolution follows its begin in $\xrightarrow{po}$, without an
     intervening begin or resolution.
WF6. If $b$ is a read, then there is some $a$ such that $a \xrightarrow{wr} b$.
WF7. If $a \xrightarrow{wr} b$ and $a$ is aborted or live, then $a \stackrel{tx}{\sim} b$.
WF8. If $a \xrightarrow{wr} b$, then $a \xrightarrow{index} b$.
WF9. If $b$ is transactional, then there is no committed or live
     $c \xrightarrow{index} b$ such that $b \xrightarrow{ww} c$.
WF10. If $b$ is transactional and there is some transactional
     $a \xrightarrow{wr} b$, then there is no committed or live $c \xrightarrow{index} b$
     such that $a \xrightarrow{ww} c$.
WF11. If $b$ is transactional and there is some $a \xrightarrow{wr} b$, then
     there is no $c \stackrel{tx}{\sim} b$ such that $c \xrightarrow{index} b$ and $a \xrightarrow{ww} c$.
WF1 ensures that locations are initialized. WF2–WF3
ensure that action names and timestamps are unique. WF4–WF5
ensure proper bracketing for transactions. These conditions
also preclude nesting of transactions; we leave the
treatment of nested transactions to future work. WF6 ensures
that all reads are fulfilled. WF7 ensures that aborted and live
writes are not visible outside the transaction.
WF8–WF11 constrain the interleavings allowed in a trace.
For the most part, we view traces as abstract execution
graphs, where transactions are expressed as multiple
$\xrightarrow{po}$-contiguous actions. In execution graphs, time is relative: it
is expressed as the happens-before relation, which captures
causal relations between actions. At the concrete level of a
trace, time is absolute: it is expressed by order in the sequence.
Viewed as execution graphs, WF8–WF11 are redundant with
respect to consistency criteria given below. These conditions,
instead, constrain the concrete representation of the
execution graph as a trace, enabling inductive reasoning that
mirrors the operational reasoning of [9].
WF8 ensures that reads only see the absolute past: reads are
not allowed to “see the future”. This condition is guaranteed
by the operational semantics of [9], but here must be stated
explicitly. There is no similar requirement that writes respect
absolute time. They may appear out of order. For example,
we allow the trace $\langle Wx\,2\,2\rangle \langle Wx\,1\,1\rangle$.
WF9–WF11 constrain the interleaving of the actions from
different transactions. There is no analogue of these rules
in [9] since volatiles are expressed as a single action. WF9
forbids $\langle c\,Wx\,2\,2\rangle \langle b\,Wx\,1\,1\rangle$ when both are transactional;
we ignore aborted writes because they are not visible to
other transactions. WF10 forbids $\langle a\,Wx\,1\,1\rangle \langle c\,Wx\,2\,2\rangle \langle b\,Rx\,1\,1\rangle$
when all three are transactional. WF11 forbids $\langle a\,Wx\,1\,1\rangle$
$\langle c\,Wx\,2\,2\rangle \langle b\,Rx\,1\,1\rangle$ when $c \stackrel{tx}{\sim} b$.
Antidependencies. An antidependency relates a read to any
write that cannot precede it. We use $\xrightarrow{rw}$ to represent
antidependency as read-to-write order (aka from-read). Ignoring
transactions, $b \xrightarrow{rw} c$ whenever $a \xrightarrow{wr} b$ and $a \xrightarrow{ww} c$, for
some $a$.
As we shall see, antidependencies are not allowed to
contradict the happens-before order, which defines causality.
The end result is that stale reads are precluded. For example,
consider the trace $\langle a{:}sWx1\rangle \langle c{:}sWx2\rangle \langle b{:}sRx1\rangle$. This trace
should not be allowed, since it reads 1 after writing 1 and
then 2 in the same thread. Because $c \xrightarrow{po} b \xrightarrow{rw} c$, this trace,
shown on the left below, will not be considered consistent:
[Figure: two executions over a:Wx1, c:Wx2 and b:Rx1. On the left, the edges are a ww c, a wr b and b rw c. On the right, c is part of an aborted transaction and only the ww and wr edges are present.]
Aborted transactions complicate the definition of
antidependency. For example, if $c$ is part of an aborted transaction, as
shown on the right, then the outcome should be allowed.
Note that if $b$ and $c$ belonged to the same aborted transaction,
then the execution would be disallowed by condition WF11
in the definition of well-formed trace.
Thus we arrive at the following definition:
    $b \xrightarrow{rw} c$ iff $a \xrightarrow{wr} b$ and $a \xrightarrow{ww} c$, for some $a$, and
    $c$ is either plain or nonaborted.
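The definition of read-to-write order translates into a short set comprehension. This is a minimal sketch, assuming relations are represented as sets of pairs of action names and that the set of actions belonging to aborted transactions is given; the function name `read_to_write` is hypothetical.

```python
def read_to_write(wr, ww, aborted):
    """b rw c iff a wr b and a ww c for some a, and c is plain or nonaborted.

    wr, ww:  sets of (source, target) pairs of action names.
    aborted: names of actions that belong to aborted transactions.
             Plain actions are never in this set.
    """
    return {
        (b, c)
        for (a, b) in wr
        for (a2, c) in ww
        if a == a2 and c not in aborted
    }

# The trace <a:Wx1> <c:Wx2> <b:Rx1> discussed above.
wr = {("a", "b")}
ww = {("a", "c")}

# If c is plain or nonaborted, the stale read b is antidependent on c.
print(read_to_write(wr, ww, aborted=set()))

# If c belongs to an aborted transaction, no antidependency arises.
print(read_to_write(wr, ww, aborted={"c"}))
```

Excluding aborted writes in this way is what lets the right-hand execution in the discussion above remain consistent.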
Lifted Relations. A common technique to enforce transactional
atomicity is to lift orders from individual actions to
the level of transactions [6, 10, 32]. Notationally, we indicate
a lifted relation by prefixing “l.” For example, the lifting of
$\xrightarrow{wr}$ is written $\xrightarrow{lwr}$. We also use two variants.
• $\xrightarrow{lR}$ is the lifting of relation $\xrightarrow{R}$.
• $\xrightarrow{xR}$ restricts $\xrightarrow{lR}$ to transactions.
• $\xrightarrow{cR}$ restricts $\xrightarrow{xR}$ to nonaborted transactions.
For any relation $\xrightarrow{R}$, the definitions are as follows.
• $a \xrightarrow{lR} b$ iff $a \xrightarrow{R} b$, or $a' \xrightarrow{R} b'$ for some $a' \stackrel{tx}{\sim} a \not\stackrel{tx}{\sim} b \stackrel{tx}{\sim} b'$.
• $a \xrightarrow{xR} b$ iff $a \xrightarrow{lR} b$ and $a$, $b$ are transactional.
• $a \xrightarrow{cR} b$ iff $a \xrightarrow{xR} b$ and $a$, $b$ are committed or live.
Consider the following execution, where we label the
individual actions of b.
[Figure: transaction b contains b1:Wy1 and b2:Wx1; transaction c contains c:Ry1 and is followed by the plain action d:Wx2; edges b1 wr c and b2 ww d.]
We have $b_1 \xrightarrow{wr} c$ but not $b_2 \xrightarrow{wr} c$. In the lifted relation both
of these hold; in particular, we have $b_2 \xrightarrow{lwr} c$. Similarly, we
have $b_1 \xrightarrow{lww} d$ but not $b_1 \xrightarrow{ww} d$. The “x” variants exclude
d. The “c” variants exclude both c and d.
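The lifting operators can be written out over finite relations. The sketch below assumes that each action name is mapped to a transaction key, with a plain action mapped to a key of its own, and that relations are sets of name pairs; `lift`, `restrict` and the variable names are illustrative, not the paper's notation.

```python
def lift(rel, tx):
    """Lift rel to transactions: keep rel, and relate every pair of actions
    whose (distinct) transactions contain some rel-related pair."""
    lifted = set(rel)
    for (a1, b1) in rel:
        for a in tx:
            for b in tx:
                if tx[a] == tx[a1] and tx[b] == tx[b1] and tx[a] != tx[b]:
                    lifted.add((a, b))
    return lifted

def restrict(rel, allowed):
    """Keep only the pairs whose endpoints are both in the allowed set."""
    return {(a, b) for (a, b) in rel if a in allowed and b in allowed}

# The execution above: b1, b2 in transaction b; c in an aborted transaction; d plain.
tx = {"b1": "b", "b2": "b", "c": "c", "d": "d"}
transactional = {"b1", "b2", "c"}
nonaborted_tx = {"b1", "b2"}          # transactional actions that are committed or live

wr = {("b1", "c")}
ww = {("b2", "d")}

lwr, lww = lift(wr, tx), lift(ww, tx)
xwr, xww = restrict(lwr, transactional), restrict(lww, transactional)
cwr = restrict(xwr, nonaborted_tx)
print(lwr, lww, xww, cwr)
```

Here the x variant of ww is empty because d is plain, and the c variant of wr is empty because c is aborted, matching the remark that the “x” variants exclude d and the “c” variants exclude both c and d.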
Summarizing the relations defined thus far, we have:
• $\xrightarrow{index}$ is the absolute order of events in a trace.
• $\xrightarrow{init}$ relates initialization events to other events.
• $\xrightarrow{po}$ restricts $\xrightarrow{index}$ to events of the same thread.
• $\xrightarrow{ww}$ is write-to-write order, derived from timestamps.
• $\xrightarrow{wr}$ is write-to-read order, derived from timestamps.
• $\xrightarrow{rw}$ is read-to-write order, derived from $\xrightarrow{ww}$ and $\xrightarrow{wr}$.
Lifting is only applied to the last three relations.
Happens-Before. The happens-before order, $\xrightarrow{hb}$, is a partial
order that captures dependency, or causality, between
actions. It serves a crucial role in understanding distributed
systems. In the next subsection, happens-before is used to
define consistent executions that obey the intended notion
of causality. In §4, happens-before is also used to define
data races. By varying the definition of happens-before, we
change the definition of both consistency and raciness.
We define $\xrightarrow{hb}$ to be the least relation that is closed with
respect to the following.
    $a \xrightarrow{hb} c$ if $a\ (\xrightarrow{init} \cup \xrightarrow{po} \cup \xrightarrow{cwr} \cup \xrightarrow{cww})\ c$    (HBbase)
    $a \xrightarrow{hb} c$ if $a \xrightarrow{hb} b \xrightarrow{hb} c$    (HBtrans)
    $a \xrightarrow{hb} c$ if $c$ is plain, $a \xrightarrow{lww} c$ and $a \xrightarrow{crw} b \xrightarrow{hb} c$    (HBww)
We discuss variations of HBww at the end of this section.
We discuss an alternative model without HBww in §5.
By HBbase, happens-before includes initialization order,
program order, lifted write-to-write order and lifted
write-to-read order. HBtrans says that happens-before is transitive.
These rules are adapted from the analogous rules in [9]. The
only subtlety of these rules lies in the choice of lifted relation
in HBbase; note that we restrict HBbase to include order only
from committed and live transactions. We discuss the reason
for this in the next subsection.
HBww is designed to ensure that privatization is considered
race-free. Roughly, two actions are racing if they touch
a common location, neither is aborted, one is a write, and
they are not ordered by $\xrightarrow{hb}$. HBww only applies when $a$ and
$b$ are live or committed. If $c$ is also live or committed, then
this rule adds nothing: HBbase already gives us $a \xrightarrow{hb} c$ since
$a \xrightarrow{cww} c$.
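Example 2.1. Recall the privatization example from §1.
    atomic_a { if !y then x:=1 }
    || atomic_b { y:=1 }; x:=2
[Figure: transaction a contains Ry0 and Wx1; transaction b contains Wy1 and is followed by the plain write c:Wx2; edges a crw b and a lww c.]
Without HBww, there would be a race between $\langle Wx1\rangle$ and
$\langle Wx2\rangle$. By including $a \xrightarrow{lww} c$ in happens-before, we ensure
that this execution is considered race free.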
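The effect of the privatization rule on this example can be reproduced with a small fixpoint computation. The following is a minimal sketch, assuming relations are finite sets of name pairs, that the base order (init, po, cwr, cww) is supplied precomputed, and that the initializing transaction is elided; the names a1, a2 and b1 for the individual actions and the function `happens_before` are hypothetical.

```python
def happens_before(base, lww, crw, plain):
    """Least relation containing base, closed under transitivity and the rule:
    a hb c if c is plain, a lww c, and a crw b hb c for some b."""
    hb = set(base)
    changed = True
    while changed:
        changed = False
        # Transitivity: a hb b hb c implies a hb c.
        new = {(a, c) for (a, b) in hb for (b2, c) in hb if b == b2}
        # Privatization rule, restricted to plain targets c.
        new |= {(a, c) for (a, c) in lww if c in plain
                for (a2, b) in crw if a2 == a and (b, c) in hb}
        if not new <= hb:
            hb |= new
            changed = True
    return hb

# Example 2.1: a1:Ry0 and a2:Wx1 in transaction a, b1:Wy1 in transaction b, c:Wx2 plain.
base = {("a1", "a2"), ("b1", "c")}          # program order only
lww = {("a1", "c"), ("a2", "c")}            # lifted from a2 ww c
crw = {("a1", "b1"), ("a2", "b1")}          # lifted from a1 rw b1
plain = {"c"}

hb = happens_before(base, lww, crw, plain)
hb_without_rule = happens_before(base, set(), set(), plain)
print(("a2", "c") in hb, ("a2", "c") in hb_without_rule)
```

With the rule, the two writes to x are ordered by happens-before; without it they are unordered, which is the race described above.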
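Order from HBww can cascade, as in the following.
    atomic_a { if !y then x:=1 }
    || atomic_b { y:=1 }; atomic_a' { if !y' then x':=1 }
    || atomic_b' { y':=1 }; x':=2; x:=2
[Figure: transaction a contains Ry0 and Wx1; transaction b contains Wy1 and is followed by transaction a' containing Ry'0 and Wx'1; transaction b' contains Wy'1 and is followed by the plain writes c':Wx'2 and c:Wx2; crw and lww edges link a to b and c, and a' to b' and c'.]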
Consistency. We say that an execution is consistent iff it is
well-formed and the following hold.
    $(\xrightarrow{hb} \cup \xrightarrow{lwr} \cup \xrightarrow{xrw})$ is acyclic.    (Causality)
    $(\xrightarrow{hb} ; \xrightarrow{lww})$ is irreflexive.    (Coherence)
    $(\xrightarrow{hb} ; \xrightarrow{lrw})$ is irreflexive.    (Observation)
    $(\xrightarrow{crw} ; \xrightarrow{hb} ; \xrightarrow{lww})$ is irreflexive.    (Antiww)
Causality, Coherence and Observation all appear in [9].
We discuss Antiww below and in Example 3.5.
Example 2.2. Consider the variant of Example 2.1, in which
the writes on x are given the reverse order in $\xrightarrow{lww}$.
    atomic_a { if !y then x:=2 }
    || atomic_b { y:=1 }; x:=1
[Figure: transaction a contains Ry0 and Wx2; transaction b contains Wy1 and is followed by the plain write c:Wx1; edges a crw b and c lww a.]
Intuitively, this execution should be disallowed since $\xrightarrow{lww}$
seems to order the writes incorrectly. Antiww forbids it.
Technically, this execution must be disallowed in order
to establish the SC-LTRF theorem, which states that any
race can be discovered in a sequential execution. To see the
issue, note that the two writes on x are not ordered by $\xrightarrow{hb}$
(HBww does not apply here); thus they are in a race. SC-LTRF
requires, therefore, that we find a sequential execution of
this program that also exhibits a race. But this is impossible:
any sequential execution must have a before b, and therefore
before c, and thus $a \xrightarrow{lww} c$. But in this case, HBww adds
order between a and c, eliminating the race.
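The antidependency axiom can be tested on this example with plain relational composition. This is a minimal sketch, assuming relations are finite sets of name pairs with the initializing transaction elided; the helper names and the action names a1, a2 and b1 are hypothetical.

```python
def compose(*rels):
    """Relational composition r1; r2; ...; rn over sets of pairs."""
    result = rels[0]
    for nxt in rels[1:]:
        result = {(a, c) for (a, b) in result for (b2, c) in nxt if b == b2}
    return result

def irreflexive(rel):
    return all(a != b for (a, b) in rel)

def acyclic(rel):
    """A relation is acyclic iff its transitive closure is irreflexive."""
    closure = set(rel)
    while True:
        step = closure | compose(closure, closure)
        if step == closure:
            return irreflexive(closure)
        closure = step

# Shared by both examples: a1:Ry0 and a2 in transaction a, b1:Wy1 in b, c plain.
crw = {("a1", "b1"), ("a2", "b1")}

# Example 2.2: the plain write c is ww-before a2, and hb is program order only.
hb_22 = {("a1", "a2"), ("b1", "c")}
lww_22 = {("c", "a1"), ("c", "a2")}
anti_ww_22 = irreflexive(compose(crw, hb_22, lww_22))

# Example 2.1: a2 is ww-before c, and hb also contains the privatization edges.
hb_21 = {("a1", "a2"), ("b1", "c"), ("a1", "c"), ("a2", "c")}
lww_21 = {("a1", "c"), ("a2", "c")}
anti_ww_21 = irreflexive(compose(crw, hb_21, lww_21))

print(anti_ww_22, anti_ww_21)
```

The composition is reflexive for Example 2.2 and empty for Example 2.1, so only the former is rejected. The `acyclic` helper is included because the first consistency condition is an acyclicity check rather than an irreflexivity check.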
As noted in [9], since $\xrightarrow{po} \subseteq \xrightarrow{hb}$ and $\xrightarrow{wr} \subseteq \xrightarrow{lwr}$, the
inclusion of $\xrightarrow{lwr}$ in Causality forbids “load buffering,” shown
on the left below, which is allowed by many other models.
[Figure: three executions. Left (Forbidden): threads Rx1; Wy1 and Ry1; Wx1 with two wr edges. Middle (Allowed): threads Wx1; Ry1 and Wy1; Rx1 with two rw edges. Right (Allowed): threads Wx1; Wy1 and Ry1; Rx0 with one rw edge and one xwr edge.]
On the other hand, the model does allow “store buffering,”
shown in the middle above, since plain antidependencies
only have an irreflexivity requirement in Observation, not
an acyclicity requirement.
We do not include aborted transactions in HBbase; in
conjunction with Observation, this would cause publication
through aborted reads. To see this, consider the execution on
the right above, which is allowed by our model, but would
be disallowed if $\xrightarrow{hb}$ included $\xrightarrow{xwr}$ rather than $\xrightarrow{cwr}$.
Were we to use $\xrightarrow{crw}$ in Causality, the execution on the
left below would be allowed. But this execution violates
opacity, which requires a total order among all transactions
(including aborted transactions) that is consistent with
happens-before order [15, 16]. Therefore the execution must be
forbidden. If the writes are plain, however, this execution is similar
to the store buffering example, and should be allowed. Thus,
it would be too strong to use $\xrightarrow{lrw}$ in Causality, or to
require acyclicity of $(\xrightarrow{hb} \cup \xrightarrow{lrw})$ in Observation. Similarly,
we cannot use $\xrightarrow{lww}$ in Causality or require acyclicity of
$(\xrightarrow{hb} \cup \xrightarrow{lww})$ in Coherence. In either case, we would rule
out the execution on the right.
[Figure: two executions. Left (Forbidden): Wx1; Wy1; Rx1 Ry0; Ry1 Rx0, with two xwr edges and two xrw edges. Right (Allowed): threads Wx2; Wy1 and Wy2; Wx1, with two ww edges.]
As noted in [9], the notion of coherence in LTRF is stronger
than Java, which allows the execution on the left below. On
the other hand, LTRF coherence is not as strong as coherence
in hardware models and C++ atomics, which forbid the
execution on the right; allowing such executions is
necessary to support compiler optimizations, such as common
subexpression elimination [9, 31].
[Figure: two executions. Left (Forbidden): actions Wx1, Wy1, Wx2, Ry1, Rx2, Rx1 with ww, cwr, wr and rw edges. Right (Allowed): writes Wx1 and Wx2 related by ww, and reads Rx2, Rx1, Rx2 with wr and rw edges.]
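Anti-Dependence vs Happens-Before. HBww adds to $\xrightarrow{hb}$
the minimal order needed to validate privatization. There is
a design space of choices for additional constraints that can
be imposed on the compositions of $\xrightarrow{crw}$ and $\xrightarrow{hb}$.
Example 2.3. There are six variants, each of which we
illustrate with an example. For completeness, we include
HBww with a variant of Example 2.1. Following Example 2.2,
many of these require an additional antidependency axiom.
The exceptions involve $\xrightarrow{lwr}$, for which Causality suffices.

$a \xrightarrow{hb} c$ if $c$ is plain, $a \xrightarrow{lww} c$ and $a \xrightarrow{crw} b \xrightarrow{hb} c$    (HBww)
$(\xrightarrow{crw} ; \xrightarrow{hb} ; \xrightarrow{lww})$ is irreflexive.    (Antiww)
    atomic_a { r:=y; x:=1 }
    || atomic_b { y:=1 }; x:=2
[Execution: a: Ry0, Wx1; b: Wy1; c: Wx2; edges crw and lww.]

$a \xrightarrow{hb} c$ if $c$ is plain, $a \xrightarrow{lrw} c$ and $a \xrightarrow{crw} b \xrightarrow{hb} c$    (HBrw)
$(\xrightarrow{crw} ; \xrightarrow{hb} ; \xrightarrow{lrw})$ is irreflexive.    (Antirw)
    atomic_a { r:=y; q:=x }
    || atomic_b { y:=1 }; x:=1
[Execution: a: Ry0, Rx0; b: Wy1; c: Wx1; edges crw and lrw.]

$a \xrightarrow{hb} c$ if $c$ is plain, $a \xrightarrow{lwr} c$ and $a \xrightarrow{crw} b \xrightarrow{hb} c$    (HBwr)
    atomic_a { r:=y; x:=1 }
    || atomic_b { y:=1 }; q:=x
[Execution: a: Ry0, Wx1; b: Wy1; c: Rx1; edges crw and lwr.]

$a \xrightarrow{hb} c$ if $a$ is plain, $a \xrightarrow{lww} c$ and $a \xrightarrow{hb} b \xrightarrow{crw} c$    (HB'ww)
$(\xrightarrow{hb} ; \xrightarrow{crw} ; \xrightarrow{lww})$ is irreflexive.    (Anti'ww)
    x:=1; atomic_b { r:=y }
    || atomic_c { x:=2; y:=1 }
[Execution: a: Wx1; b: Ry0; c: Wx2, Wy1; edges crw and lww.]

$a \xrightarrow{hb} c$ if $a$ is plain, $a \xrightarrow{lrw} c$ and $a \xrightarrow{hb} b \xrightarrow{crw} c$    (HB'rw)
$(\xrightarrow{hb} ; \xrightarrow{crw} ; \xrightarrow{lrw})$ is irreflexive.    (Anti'rw)
    q:=x; atomic_b { r:=y }
    || atomic_c { x:=1; y:=1 }
[Execution: a: Rx0; b: Ry0; c: Wx1, Wy1; edges crw and lrw.]

$a \xrightarrow{hb} c$ if $a$ is plain, $a \xrightarrow{lwr} c$ and $a \xrightarrow{hb} b \xrightarrow{crw} c$    (HB'wr)
    x:=1; atomic_b { r:=y }
    || atomic_c { q:=x; y:=1 }
[Execution: a: Wx1; b: Ry0; c: Rx1, Wy1; edges crw and lwr.]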
3 STM Design
We consider several examples from the literature to argue
that the ordering required by our model does not impair
efficient implementations of Software Transactional Memory.
Example 3.1. In accordance with [27, Figure 12], our model
does not enforce publication by antidependence: The final
outcome r = q = 0 is permitted in the program (left), as
shown by the allowable execution (right).
    x:=1; atomic_a { r:=y }
    || atomic_b { q:=x; y:=1 }
[Execution: Wx1; a: Ry0; b: Rx0, Wy1; edges crw and lrw.]
Note that if $\xrightarrow{hb}$ were to include $\xrightarrow{crw}$, then this execution
would be forbidden by Observation. Note also that this
execution is forbidden by any model that enforces Anti'rw,
from Example 2.3.
Example 3.2. In accordance with [27, Figure 11], our model
does not enforce global lock atomicity: The final outcome
r = q = 0 is possible in the program below.
    x:=1; atomic_a { y:=1 }; r:=z
    || atomic_b { q:=x; z:=1 }
[Execution: Wx1; a: Wy1; Rz0 in the first thread; b: Rx0, Wz1 in the second thread; two lrw edges.]
This execution is allowed by all variants discussed in
Example 2.3, including Anti'rw.
Example 3.3. We now consider the limitations of our
approach. Menon et al. [27] describe an idiom for benign racy
publication. This outcome is considered desirable, yet our
model forbids it: The final outcome q = 0 is not possible for
the following program.
    x:=1; atomic_a { y:=1 }
    || q:=2; atomic_b { r:=x; if y then q:=r }
[Execution: Wx1; a: Wy1; b: Rx0, Ry1; edges cwr and lrw.]
The outcome is only allowed if b reads 0 for x and 1 for y,
but this execution is disallowed by Observation.
Note that, in accordance with the name, the program is
not race-free: the execution in which b reads 0 for y has a
race on x. Thus, there is no canonical answer as to whether
this execution is indeed benign and should be allowed.
Example 3.4. The literature describes a class of STMs that
implement eager versioning: they create an undo log for
each write and perform writes as they are encountered (as
opposed to during commits). If the transaction aborts, the
updates are rolled back to their original logged values.
Shpeisman et al. [34] describe potential issues with eager versioning
in a mixed-mode SC setting. In our relaxed memory setting,
we show that these have natural explanations.
Consider the following program.
    atomic_a { if !y then x:=1; abort };
    atomic_b { if !y then x:=1 }; r:=x
    || x:=2; y:=1; q:=x
Under SC, the final value r = 0 is considered to be problematic
[34, Figure 3a] since it follows from a scenario in which the
non-transactional write $\langle Wx2\rangle$ is lost, known as a speculative
lost update. Assuming SC, suppose transaction a executes its
write to x, then the second thread executes its first two writes.
Since transaction a aborts, the write to x would be rolled
back to 0. Transaction b would then skip over the update to
x (because it now observes y = 1). This allows r = q = 0.
In our setting, the final value q = 0 is immediately
disallowed by HBbase and Coherence. Moreover, the first thread
may read either 0 or 2 for x, whereas the second thread must
read 2 for x, i.e., the non-transactional write $\langle Wx2\rangle$ is not lost.
[Execution: first thread a: Ry0, Wx1; b: Ry1; Rx0. Second thread Wx2; Wy1; Rx2. Edges lww and lwr.]
The scenario above may also result in executions such as:
[Execution: first thread a: Ry0, Wx1; b: Ry0, Wx1; Rx2. Second thread Wx2; Wy1; Rx1. Edges lww, lwr and lwr.]
where transaction a successfully writes $\langle Wx1\rangle$. Again, the
non-transactional write $\langle Wx2\rangle$ is available for the final reads
in both threads.
Example 3.5. Analogous to eager versioning is a class of
STMs that implement lazy versioning: they cache writes
locally within a transaction and update shared memory during
the transaction’s commit operation. Shpeisman et al. [34]
discuss potential problems with lazy versioning in a
mixed-mode setting. We consider the most interesting of these
below.
Suppose z is an array in the program below.
atomic_a { r:=x; x:=42 }; r1:=z[r]; r2:=z[r]; z[r]:=0
|| atomic_b { q:=x; if q ≠ 42 then z[q]:=z[q] + 1 }
The first thread atomically caches x and privatizes it by
setting it to a special value (denoted here by 42). From a
programmer’s perspective z[r] should not be read by other
threads. However, in a lazy-versioning STM, transaction b
may have been serialized before transaction a, yet contain
a buffered write to z[q]. Thus, the reads of z[r] may race
with the buffered write to z[q]. A consequence of this is the
execution below, where the two reads of z[0] return different
values.
  Thread 1:  a:⟨Rx 0⟩ ⟨Wx 42⟩   ⟨Rz[0] 0⟩ ⟨Rz[0] 1⟩ ⟨Wz[0] 0⟩
  Thread 2:  b:⟨Rx 0⟩ ⟨Rz[0] 0⟩ ⟨Wz[0] 1⟩
  (order edges shown in the figure: crw, lww, lrw, lwr)
The final outcome r1 ≠ r2 is considered problematic in [34].
This outcome is disallowed by any variant of our model that
includes A￿￿￿￿￿ (Example 2.3).
By A￿￿￿￿￿, the execution becomes inconsistent if we
reverse the lww order above. Thus, the outcome z[0] ≠ 0
is forbidden by our model. This outcome is also considered
problematic in [34].
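To make the lazy-versioning hazard of Example 3.5 concrete, the sketch below replays one interleaving in which transaction b is serialized first but writes back its buffer late. It assumes a toy STM in which a commit is a serialization point followed by a separate write-back step; the names `buffer_b` and `lazy_privatization_race` are illustrative only.

```python
def lazy_privatization_race():
    """Replay the interleaving in which r1 and r2 disagree."""
    mem = {"x": 0, "z": [0]}

    # Transaction b runs and buffers its write instead of updating memory.
    q = mem["x"]
    buffer_b = {}
    if q != 42:
        buffer_b[q] = mem["z"][q] + 1
    # b is serialized here, but its write-back is still pending.

    # Transaction a privatizes: r := x; x := 42.
    r = mem["x"]
    mem["x"] = 42

    # First plain read of the (supposedly private) cell.
    r1 = mem["z"][r]

    # b's delayed write-back lands between the two plain reads.
    for index, value in buffer_b.items():
        mem["z"][index] = value

    # Second plain read, then the plain write.
    r2 = mem["z"][r]
    mem["z"][r] = 0
    return r1, r2
```

In this replay the two reads return 0 and 1 respectively, which is the outcome r1 ≠ r2 discussed above.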
4 Local Transactional Race Freedom
We introduce the concepts behind localising data race free-
dom (LDRF [9]) by example. Consider the program:
x:=1; y:=1; atomic_a { F:=1 }; z:=1
|| y:=2; atomic_b { r:=F }; z:=2; if r then w:=x + y − y
Consider the case where b reads F from a, as depicted below.
We leave the write-to-write orders and the values of the last
four actions of the second thread unspecified.
  Thread 1:  ⟨Wx 1⟩ ⟨Wy 1⟩ a:⟨WF 1⟩ ⟨Wz 1⟩
  Thread 2:  ⟨Wy 2⟩ b:⟨RF 1⟩ ⟨Wz 2⟩ ⟨Rx⟩ ⟨Ry⟩ ⟨Ry⟩ ⟨Ww⟩
  (order edge shown in the figure: cwr)
There are write-write races between ⟨Wy 1⟩ and ⟨Wy 2⟩, and
between ⟨Wz 1⟩ and ⟨Wz 2⟩. By some definitions of race, the
write ⟨Wy 1⟩ is also racing with the two reads of y. Thus, a
global notion of race-freedom does not allow one to conclude
anything about this program. A localised notion, however,
would allow one to deduce that ⟨Wx 1⟩ is correctly published
to the second thread. Moreover, the two reads of y must see
the same value and hence, the value written to w must be 1.
LDRF is defined relative to (1) a set Σ of traces, generated
by the semantics of a program, (2) a set L of locations, and (3)
a trace σ ∈ Σ, denoting a partial execution. For the example,
Σ is fixed by the program. Let L = {x, y, F}. A race is an
L-race if it involves a location in L; thus the race between
⟨Wz 1⟩ and ⟨Wz 2⟩ is not considered an L-race.
Now consider the trace σ = ⟨Wx 1⟩⟨Wy 1⟩⟨a:B⟩⟨WF 1⟩
⟨Ca⟩⟨Wy 2⟩⟨b:B⟩⟨RF 1⟩⟨Cb⟩ that linearizes the execution
above. This σ contains an L-race between ⟨Wy 1⟩ and ⟨Wy 2⟩.
Nonetheless, σ is L-stable for Σ because there is no σρ ∈ Σ
that includes an L-race between any action of σ and an action
of ρ. It is important to note that the definition of stability is
relative to the set Σ. Trace σ is stable for this program, but
would not be stable if, for example, the program were modified
so that the first thread reads y after writing z:=1.
Having fixed σ, we now consider the L-sequential extensions
of this prefix. These extensions are constrained to obey
the sequential semantics for locations in L. Extensions that
do not touch L, such as the writes to z, are unconstrained.
The SC-LDRF theorem says that either every extension
of σ is L-sequential, or there is some L-sequential extension
with an L-race. Since no L-sequential extension has a race,
the program must behave sequentially from σ, guaranteeing
that the read of x sees 1, that the two reads of y see the same
value, and thus that the value written for w is 1.
The use of L in the definitions serves as an obvious spatial
bound on races. The temporal bounds are less direct: by
semantic fiat, future races can be ignored, since reads cannot
see the future. By L-stability, past races are also excluded.
From D to T. Locations used to store data are often disjoint
from locations used to perform synchronization. In TRF, a
single location may serve both purposes. This is the chief
difficulty in extending LDRF to LTRF. Consider the program
x:=1; atomic { x:=2 } || atomic { r:=x } with executions:
(1)  a:⟨Wx 1⟩  b:⟨Wx 2⟩
     c:⟨Rx 1⟩
     (order edges shown in the figure: wr, crw)
(2)  a′:⟨Wx 1⟩  b′:⟨Wx 2⟩
     c′:⟨Rx 2⟩
     (order edge shown in the figure: cwr)
Since wr only creates happens-before order between committed
transactions, there is a race in execution (1) but not (2).
Consider the linearizations in which the read occurs last in
the trace. We analyze by setting L = {x}. In trace abc, c is
not L-sequential, whereas in a′b′c′, c′ is L-sequential. In the
SC-DRF theorem of [9], it is required that whenever there is
a nonsequential racy read at the end of a trace, such as c, we
must be able to find a trace with a sequential read, such as
c′, that preserves the race. But here, this is impossible.
Note, however, that ac is L-sequential and has an L-race.
In generalizing the SC-DRF theorem of [9] to mixed accesses,
we must consider such prefixes. When transactional and
plain accesses are disjoint this is not necessary, since
well-formedness already guarantees sequential order between
transactions. But well-formedness does not constrain
interactions between transactional and plain accesses.
Intuitively, [9] proves that data races can be discovered
by sequential reasoning. In the case of transactions, this is
not enough. We must also have that all data races can be
discovered by executing transactions one at a time. To achieve
this, we generalize the theorem to allow permutations that
preserve order while ensuring that all actions of a transaction
are contiguous in the trace.
L-Races. Two actions are in L-conflict if they both access
the same x ∈ L, at least one is plain, at least one is a write,
and neither is aborted.
We say that (b, c) is an L-race if b and c are in L-conflict
and b -[index]-> c, but not b -[hb]-> c. Two transactional
actions cannot be in a race.
In global DRF, conflicting actions must be ordered by -[hb]->;
local DRF additionally constrains the direction of the order.
This captures one form of temporal locality: future actions
cannot causally affect the past.
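The definitions of L-conflict and L-race can be read as a direct check over a trace. The sketch below is one possible encoding, not the paper's formalization: the `Action` record, the representation of happens-before as a set of name pairs (assumed already transitively closed), and the example trace are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Action:
    name: str
    kind: str                  # "R" or "W"
    loc: str
    tx: Optional[str] = None   # None marks a plain (non-transactional) action
    aborted: bool = False


def l_conflict(b, c, L):
    """Same location in L, one plain, one write, neither aborted."""
    return (b.loc == c.loc and b.loc in L
            and (b.tx is None or c.tx is None)
            and "W" in (b.kind, c.kind)
            and not (b.aborted or c.aborted))


def l_races(trace, hb, L):
    """Pairs (b, c) with b earlier in index order, in L-conflict, not hb-ordered."""
    return [(b, c)
            for i, b in enumerate(trace) for c in trace[i + 1:]
            if l_conflict(b, c, L) and (b.name, c.name) not in hb]


# A prefix of the publication example: Wx1 Wy1 a:WF1 || Wy2 b:RF1.
wx1, wy1 = Action("Wx1", "W", "x"), Action("Wy1", "W", "y")
wf1 = Action("WF1", "W", "F", tx="a")
wy2 = Action("Wy2", "W", "y")
rf1 = Action("RF1", "R", "F", tx="b")
trace = [wx1, wy1, wf1, wy2, rf1]
hb = {("Wx1", "Wy1"), ("Wx1", "WF1"), ("Wy1", "WF1"), ("Wy2", "RF1"),
      ("WF1", "RF1"), ("Wx1", "RF1"), ("Wy1", "RF1")}
```

With L = {x, y, F} the only L-race reported for this trace is the write-write race on y; dropping y from L leaves none, which illustrates the spatial bound.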
L-Sequentiality and L-Stability. For L ⊆ Loc, we say that
c is L-sequential if c does not touch any location in L, or if c
is a B, C, or A action, or if we have both of the following:
1. there is no b -[index]-> c such that c ww b, and
2. if a wr c then there is no b -[index]-> c such that a ww b.
Condition (1) applies when c is a write; it ensures that the
timestamp chosen for c is larger than all preceding timestamps.
Condition (2) applies when c is a read; it ensures that
c reads from the preceding write with the largest timestamp.
An action that is not L-sequential is L-weak. Any L-weak
action participates in an L-race: for writes, this follows from
C￿￿￿￿￿￿￿￿; for reads, from O￿￿￿￿￿￿￿￿￿￿.
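The two conditions for L-sequentiality translate into a short check. The following sketch assumes that actions are identified by strings, that `loc` maps each read or write to its location (B, C, and A actions have no entry), and that `ww` and `wr` are given as sets of pairs; these representations are illustrative choices.

```python
def is_l_sequential(trace, i, L, loc, ww, wr):
    """Check whether the action at position i of trace is L-sequential."""
    c = trace[i]
    if loc.get(c) not in L:
        return True          # B/C/A actions, or a location outside L
    earlier = trace[:i]
    # Condition (1): no earlier action is ww-after c.
    if any((c, b) in ww for b in earlier):
        return False
    # Condition (2): if a wr c, no earlier action is ww-after a.
    for a, reader in wr:
        if reader == c and any((a, b) in ww for b in earlier):
            return False
    return True


# Execution (1) of the "From D to T" discussion: a:Wx1, b:Wx2, c:Rx1.
loc = {"a": "x", "b": "x", "c": "x"}
ww = {("a", "b")}
wr = {("a", "c")}
```

On this data the read c is L-weak at the end of the trace abc, because b overwrites the write it reads from, but it is L-sequential at the end of the prefix ac.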
Let Σ be a set of traces. A trace σ is L-stable for Σ if for
every L-sequential ρ such that σρ ∈ Σ, there is no a ∈ σ and
b ∈ ρ such that (a, b) is an L-race.
Transactional L-Sequentiality and L-Stability. Transaction
b is contiguous if ⟨b:sB⟩ -[index]-> ⟨c:t⟩ and s ≠ t imply that
either ⟨Cb⟩ -[index]-> c, ⟨Ab⟩ -[index]-> c, or there are no actions
of s after c, i.e., c -[index]-> ⟨d:s′⟩ implies s ≠ s′.
Note that contiguity allows multiple live transactions.
A trace is transactionally L-sequential if every action is
L-sequential and every transaction is contiguous.
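Contiguity can likewise be checked by a single pass over the trace. The sketch below assumes each trace entry is a triple (thread, kind, transaction), where kind is one of "B", "C", "A", "R", "W" and the transaction is None for plain actions; this encoding is an assumption of the example.

```python
def is_contiguous(trace, tx):
    """Check contiguity of transaction tx in a trace of (thread, kind, tx) triples."""
    begin = next(i for i, (_, kind, t) in enumerate(trace)
                 if kind == "B" and t == tx)
    s = trace[begin][0]
    resolved = next((i for i, (_, kind, t) in enumerate(trace)
                     if kind in ("C", "A") and t == tx), None)
    for j in range(begin + 1, len(trace)):
        if trace[j][0] == s:
            continue                    # same thread: no constraint
        if resolved is not None and resolved < j:
            continue                    # tx committed or aborted before action j
        if any(thread == s for thread, _, _ in trace[j + 1:]):
            return False                # thread s acts again after a foreign action
    return True


interleaved = [("s", "B", "a"), ("s", "W", "a"), ("t", "W", None), ("s", "C", "a")]
serial = [("s", "B", "a"), ("s", "W", "a"), ("s", "C", "a"), ("t", "W", None)]
live = [("s", "B", "a"), ("s", "W", "a"), ("t", "B", "b"), ("t", "R", "b")]
```

The `live` trace shows why multiple live transactions are permitted: transaction a is never resolved, yet it is contiguous because its thread performs no further actions after b begins.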
A trace   is transactionally L-stable for   if it is L-stable
for  , every transaction is both contiguous and resolved, and
there is no   2  ,    2  , and   2   such that   touches a
variable in L and   xrw   .
The last condition ensures that a stable state is “future
proof” by making all new conflicting transactions serialize
afterwards.
Closure Conditions on Programs. The SC-LTRF theorem
requires that we relate an arbitrary execution to one that
is transactionally L-sequential. To ensure that such an
execution exists, we assume that the semantics of programs is
closed under certain operations.
We first give some preliminary definitions.
Let act⇠ relate actions with the same thread and location:
ha:sWx qi act⇠ ha0:s 0Wx 0  0q0i if a = a0, s = s 0 and x = x 0
ha:sRx qi act⇠ ha0:s 0Rx 0  0q0i if a = a0, s = s 0 and x = x 0
A set   of traces is sequentially-closed if whenever a trace
   2   includes a Loc-weak action   , there exists a Loc-
sequential action   0 act⇠   such that    0 2  .
For a ∈ σ, let σ ↓ a be the subsequence of σ obtained by
removing all the events that causally follow a:
b ∉ (σ ↓ a) iff a ( -[hb]-> ∪ lwr ∪ xrw )+ b
We say that a set of traces Σ is causally closed iff for any
σ ∈ Σ, for any a ∈ σ, σ ↓ a ∈ Σ.
Intuitively, σ ↓ a removes the “causal upclosure” of a from σ.
Significantly, if (b, γ) is an L-race in σγ, then b ∈ σγ ↓ γ.
This property does not hold for the “causal downclosure.”
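The causal-removal operation amounts to deleting everything reachable from a in the union of the three relations. As a rough sketch, assuming actions are strings and the union of hb, lwr, and xrw is supplied as one set of pairs named `edges` (an illustrative name):

```python
def causal_cut(trace, a, edges):
    """Remove from trace every action reachable from a via one or more edges."""
    successors = {}
    for u, v in edges:
        successors.setdefault(u, set()).add(v)
    removed, stack = set(), [a]
    while stack:
        u = stack.pop()
        for v in successors.get(u, ()):
            if v not in removed:
                removed.add(v)
                stack.append(v)
    return [e for e in trace if e not in removed]


example_trace = ["w1", "r1", "w2", "r2"]
example_edges = {("w1", "r1"), ("r1", "w2")}
```

Because the closure is transitive but not reflexive, a itself survives the cut unless it lies on a cycle; in the example, cutting at `w1` keeps `w1` and the unrelated `r2`.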
For any consistent trace σ, we say that ρ is an order-preserving
permutation of σ if ρ is a well-formed permutation
of σ and the -[po]-> order of ρ equals the -[po]-> order of σ.
If a trace is consistent, then any order-preserving permutation
is also consistent, since the derived orders coincide.
In addition, any consistent trace has an order-preserving
permutation with contiguous transactions. We say that Σ is
valid as the semantics of a program if (1) every σ ∈ Σ is
consistent, (2) Σ is sequentially closed, (3) Σ is causally closed,
and (4) Σ is closed under order-preserving permutation.
SC-LTRF. With these definitions, our theorem is as follows.
The theorem establishes that any L-race can be discovered
by a sequential trace with contiguous transactions.
Theorem 4.1 (SC-LTRF). Fix Σ to be the semantics of a program.
Fix σργ ∈ Σ such that
• σ is transactionally L-stable,
• ρ is transactionally L-sequential in σρ,
• ρ has no L-races in σρ, and
• γ is L-weak in σργ.
Then, there are b ∈ ρ, γ′ act∼ γ and σρ′γ′ ∈ Σ such that:
• ρ′γ′ is transactionally L-sequential in σρ′γ′, and
• (b, γ′) is an L-race in σρ′γ′.
With respect to the SC-LDRF theorem in [9], the SC-LTRF
result differs in that we allow ρ′ ≠ ρ and use the transactional
variants of L-stability and L-sequentiality, which require
that we only consider traces with contiguous transactions.
In an L-stable trace, all transactions must also be resolved.
In the degenerate case, with only contiguous committed
singleton transactions, the definitions of SC-LDRF and
SC-LTRF coincide.
For example, consider the (IRIW) program from the intro-
duction. Reasoning sequentially, we know that we cannot
read 1 followed by 0 in both threads. SC-LDRF validates this
reasoning for concurrent executions. Likewise, the publica-
tion and privatization examples from the introduction have
the expected behavior. As a further example in this vein,
consider the following program.
atomic_a { if !y then while x do skip }
|| atomic_b { y:=1 }; x:=1
If it is possible for a to read 0 for y and then 1 for x, then
a becomes a doomed transaction, which can never commit.
By sequential reasoning, this is impossible, and therefore, by
SC-LTRF, it is impossible in our model.
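The sequential reasoning behind the doomed-transaction example can be spelled out by enumerating where transaction a may run atomically relative to the second thread. This is a small sketch under the assumption that all locations start at 0 and that only the first loop test of a matters; the function names are illustrative.

```python
def run_a(mem):
    """Reads performed by transaction a when it runs atomically on mem."""
    reads = [("y", mem["y"])]
    if not mem["y"]:
        reads.append(("x", mem["x"]))   # first evaluation of the loop test
    return reads


def sequential_outcomes():
    """Run a atomically at each point of thread 2: atomic_b { y:=1 }; x:=1."""
    thread2 = [("y", 1), ("x", 1)]
    outcomes = []
    for position in range(len(thread2) + 1):
        mem = {"x": 0, "y": 0}
        for location, value in thread2[:position]:
            mem[location] = value
        outcomes.append(run_a(mem))
    return outcomes
```

None of the three placements lets a read 0 for y followed by 1 for x, which is the sequential fact that SC-LTRF lifts to the relaxed model.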
It is worth emphasizing that the SC-LTRF theorem in-
cludes aborted and live transactions, and thus guarantees
opacity. In addition, the following result shows that aborted
transactions can be ignored.
Theorem 4.2. If σ is consistent then so is σ with aborted
transactions removed.
5 Implementation Model
An optimization is valid as long as it creates no new be-
haviors. As noted in §2, LDRF disables reads from being
reordered with later writes. Thus we cannot transform r:=z;
x:=1 to x:=1; r:=z. Unfortunately, the reverse transformation
also fails in our programmer model, due to the order created
by HB￿￿. Consider the following variant of privatization:
z:=1; atomic_a { if !y then x:=1 }
|| atomic_b { y:=1 }; x:=2; r:=z
The second thread must read ⟨Rz 1⟩. If not, we would obtain
the following execution, which is disallowed by C￿￿￿￿￿￿￿￿.
  Thread 1:  ⟨Wz 1⟩ a:⟨Ry 0⟩ ⟨Wx 1⟩
  Thread 2:  b:⟨Wy 1⟩ ⟨Wx 2⟩ ⟨Rz 0⟩
  (order edges shown in the figure: crw, lww)   (‡)
Note that ⟨Wx 2⟩ ww ⟨Wx 1⟩ is ruled out by A￿￿￿￿￿,
and so we must have ⟨Wx 1⟩ ww ⟨Wx 2⟩, as shown. By
HB￿￿, we have ⟨Wx 1⟩ -[hb]-> ⟨Wx 2⟩, and thus by transitivity,
⟨Wz 1⟩ -[hb]-> ⟨Rz 0⟩. C￿￿￿￿￿￿￿￿ rules out the execution, since
⟨Rz 0⟩ rw ⟨Wz 1⟩.
However, if we replace “x:=2; r:=z” by “r:=z; x:=2” in
the program above, then the second thread may read ⟨Rz 0⟩,
since we no longer have ⟨Wz 1⟩ -[hb]-> ⟨Rz 0⟩. The resulting
allowed execution shows that the optimization is not valid:
  Thread 1:  ⟨Wz 1⟩ a:⟨Ry 0⟩ ⟨Wx 1⟩
  Thread 2:  b:⟨Wy 1⟩ ⟨Rz 0⟩ ⟨Wx 2⟩
  (order edges shown in the figure: crw, lww)
In this section, we consider an “implementation” model
that removes HB￿￿. Since HB￿￿ is designed to allow non-racy
privatization, it should not be surprising that privatization
is racy in the implementation model. To enable the
removal of such races, we add the new action ⟨sQx⟩ to model
a quiescence fence [36] for thread s on location x.
Note that our implementation model is still fairly abstract.
We assume that the underlying transactional machinery pro-
vides order between transactions that have a direct depen-
dency, as in the publication idiom. Quiescence fences are
necessary only to provide order when there is no direct de-
pendency, as in the privatization idiom.
  ::= · · · | ha:sQxi (Quiesence fence)
A quiescence fence hQxi may not be interleaved with a
transaction that touches x . We therefore add the following
requirement to well-formedness:
WF12. If hb:Bi index    ! hQxi then either hCbi index    ! hQxi,
hAbi index    ! hQxi or b neither reads not writes x .
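WF12 can be checked with one scan that tracks which transactions are live when a fence is reached. The sketch below assumes a trace of (kind, transaction, location) triples with kinds "B", "C", "A", "R", "W", "Q", and it treats a transaction as touching x if it accesses x anywhere in the trace; both are assumptions of this illustration.

```python
def satisfies_wf12(trace):
    """No fence on x may occur while a transaction touching x is unresolved."""
    touched = {}
    for kind, tx, loc in trace:
        if tx is not None and kind in ("R", "W"):
            touched.setdefault(tx, set()).add(loc)
    live = set()
    for kind, tx, loc in trace:
        if kind == "B":
            live.add(tx)
        elif kind in ("C", "A"):
            live.discard(tx)
        elif kind == "Q" and any(loc in touched.get(t, ()) for t in live):
            return False
    return True


fence_after_commit = [("B", "b", None), ("W", "b", "x"), ("C", "b", None), ("Q", None, "x")]
fence_inside_tx = [("B", "b", None), ("W", "b", "x"), ("Q", None, "x"), ("C", "b", None)]
fence_unrelated = [("B", "b", None), ("W", "b", "y"), ("Q", None, "x"), ("C", "b", None)]
```

Only the second trace violates the requirement: the fence on x is interleaved with a live transaction that writes x.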
In addition, quiescence fences create order. In the definition
of happens-before, we replace HB￿￿ by the following.
⟨a:Cb⟩ -[hb]-> ⟨c:Qx⟩ if a -[index]-> c and b touches x (HBCQ)
⟨c:Qx⟩ -[hb]-> ⟨b:B⟩ if c -[index]-> b and b touches x (HBQB)
Because we have removed HB￿￿, we also drop A￿￿￿￿￿ from
the definition of a consistent execution. The remaining
definitions are unchanged in the implementation model.
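The two fence rules can be read as a generator of happens-before edges. The following sketch uses trace positions as action identifiers and (kind, transaction, location) triples as events; the encoding and the example trace, a privatization-style trace with a fence on x, are illustrative assumptions.

```python
def fence_hb_edges(trace):
    """Edges (i, j) between trace positions contributed by HBCQ and HBQB."""
    touched = {}
    for kind, tx, loc in trace:
        if tx is not None and kind in ("R", "W"):
            touched.setdefault(tx, set()).add(loc)
    edges = set()
    for i, (kind_i, tx_i, loc_i) in enumerate(trace):
        for j in range(i + 1, len(trace)):
            kind_j, tx_j, loc_j = trace[j]
            if kind_i == "C" and kind_j == "Q" and loc_j in touched.get(tx_i, ()):
                edges.add((i, j))       # HBCQ: commit before fence
            if kind_i == "Q" and kind_j == "B" and loc_i in touched.get(tx_j, ()):
                edges.add((i, j))       # HBQB: fence before begin
    return edges


privatization = [
    ("B", "a", None), ("R", "a", "y"), ("W", "a", "x"), ("C", "a", None),
    ("B", "b", None), ("W", "b", "y"), ("C", "b", None),
    ("Q", None, "x"), ("W", None, "x"),
    ("B", "c", None), ("R", "c", "x"), ("C", "c", None),
]
```

In this trace the fence at position 7 is ordered after the commit of a, which touches x, but not after the commit of b, which touches only y; it is also ordered before the begin of the later transaction c.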
Relating implementation and programmer models. The
implementation model allows executions that are not allowed
by the programmer model. Since A￿￿￿￿￿ is removed,
Example 2.2 is allowed in the implementation model; however,
there is no matching execution in the programmer
model: If a precedes b, then the read of a is invalidated by
O￿￿￿￿￿￿￿￿￿￿. If b precedes a, the write-to-write order is
invalidated by C￿￿￿￿￿￿￿￿. Since HB￿￿ is removed, (‡) is
allowed in the implementation model; however, there is no
matching execution in the programmer model: If a precedes
b, then the read of z is invalidated by O￿￿￿￿￿￿￿￿￿￿. If b
precedes a, then the read of y is invalidated by WF10.
We say that σ has a mixed race if there is some L ⊆ Loc
such that σ includes an action in an L-race between a
transactional write and a plain write.
The following lemma establishes that the implementa-
tion and programmer models coincide for programs without
mixed races. Therefore, for mixed-race free programs in the
implementation model, SC-LTRF holds. Khyzha et al. [22]
establish a similar result for global TRF.
Lemma 5.1. Let   be an execution in the implementation
model without mixed races. Let   be the induced execution in
the programmer model obtained by dropping all the quiescence
fences in  . If   is consistent, then so is  .
Suborders. The quiescence fence ⟨Qx⟩ has the same ordering
properties as a committed transaction that writes x:
⟨a:B⟩⟨Qx⟩⟨Ca⟩. For the purpose of studying compiler
optimizations, we therefore encode quiescence fences as writing
transactions. With this convention, we do not mention ⟨Qx⟩
explicitly in the following development. The treatment follows
[9] closely, including much of the notation and proofs.
We need only adapt their definitions to work up to tx∼.
Let TAct = {⟨B⟩, ⟨C⟩, ⟨A⟩}. Define the following subsets
of -[po]-> \ (Act × TAct ∪ TAct × Act), i.e., the portion of -[po]->
that does not involve the transactional boundaries. In the
following definitions, we quantify universally over a, b ∈
Act \ TAct; all other actions are quantified existentially.
We say action a conflicts with b iff they access the same
location and at least one of a or b is a write.
a -[po-T]-> b iff a -[po]-> b, a tx≁ b, b tx∼ ⟨B⟩, and b tx∼ ⟨W⟩
a -[poT-]-> b iff a -[po]-> b, a tx≁ b, and a tx∼ ⟨B⟩
a -[poTT]-> b iff a -[poT-]-> b and a -[po-T]-> b
a -[poRW]-> b iff a -[po]-> b, a = ⟨R⟩, and b = ⟨W⟩
a -[poCon]-> b iff a -[po]-> b and a conflicts with b
The relations -[po-T]->, -[poT-]->, and -[poTT]-> do not relate actions
from the same transaction. -[po-T]-> is the subset of -[po]-> that ends
in a transactional action of a writing transaction; -[poT-]-> is the
subset of -[po]-> that begins in a resolved transactional action;
whereas -[poTT]-> is the subset of -[po]-> that begins and ends in
transactional actions, with the target being a writing transaction.
The targets of relations -[poTT]-> and -[po-T]-> are restricted
to transactions that contain a write action; this restriction
mirrors the treatment of read actions of volatiles in [9] and
ensures that read-only transactions have greater flexibility
in commuting earlier in program order. -[poRW]-> is the subset
of -[po]-> between reads and writes, not necessarily of the same
location. -[poCon]-> restricts -[po]-> to conflicting actions.
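The two non-transactional suborders are simple filters on program order. As an illustration only, the sketch below represents program order as a set of pairs of action names, with dictionaries `kind` and `loc` describing each action; these names and the two tiny programs are assumptions of the example.

```python
def po_rw(po, kind):
    """Program-order pairs from a read to a later write (any locations)."""
    return {(a, b) for a, b in po if kind[a] == "R" and kind[b] == "W"}


def po_con(po, kind, loc):
    """Program-order pairs between conflicting actions."""
    return {(a, b) for a, b in po
            if loc[a] == loc[b] and "W" in (kind[a], kind[b])}


kind = {"Rz": "R", "Wx": "W"}
loc = {"Rz": "z", "Wx": "x"}
po_before = {("Rz", "Wx")}   # r:=z; x:=1
po_after = {("Wx", "Rz")}    # x:=1; r:=z
```

The read-then-write program has a nonempty poRW suborder that disappears when the statements are swapped, while poCon is empty in both cases because the locations differ.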
In the supplementary material for this paper, we describe
an equivalent definition of consistency that uses only these
suborders instead of the full -[po]->. This characterization of
consistency is useful for proving the correctness of the
optimizations enumerated in the next subsection.
Compiler optimizations. Consider a program transforma-
tion P ⇤ Q , where Q is a program obtained from P by re-
ordering its statements. To validate the transformation, for
any execution   of Q , we must associate a corresponding
execution   of P . We consider three ￿avors.
In the ￿rst method, the transformation is correct if there
is no change in transactional actions, and
( po-T   ! , poT-   ! , poTT   ! , poRW    ! , poCon     !  )
= ( po-T   !  , poT-   !  , poTT   !  , poRW    !  , poCon     !  )
This allows, for example, the reordering of independent
writes and of independent reads. Dolan et al. [9] show how
to prove the validity of some peephole optimizations using
this flexibility: redundant load, store forwarding, dead store
elimination, common subexpression elimination, constant
propagation and loop invariant code motion. We show that:
P; atomic {Q } ⇤ atomic {Q }; P
if Q is read-only, P is write-only and there are no conflicts
between P and Q. For correctness, note that the -[poTT]-> and
-[po-T]-> relations do not target read-only transactions. The absence
of conflict between P and Q ensures the preservation of -[poCon]->.
Moreover, -[poRW]-> is preserved because P is write-only.
Secondly, we validate transformations, such as the roach
motel optimization, where the only change is an increase in the
scope of transactions, i.e., when P and Q are nontransactional:
P; atomic { R };Q ⇤ atomic { P;R;Q }.
Given   from atomic { P;R;Q }, we establish the consistency
of the corresponding   from P; atomic { R };Q by showing
that all relevant orders of   are contained in those of  .
Thirdly, we validate the fusion of adjacent transactions:
atomic { P }; atomic {Q } ⇤ atomic { P;Q }.
Given   from atomic { P;Q }, we build   for atomic { P };
atomic {Q } by adding two adjacent transactional events. On
the other hand, the converse transformation is not validated.
This is because we need to remove the two extra events to
build a witness execution of atomic { P;Q } from a given
execution of atomic { P }; atomic {Q }. These events are not
necessarily adjacent; so, the validity of the constructed exe-
cution cannot be established in general.
We can similarly establish that empty transactions can be
elided, i.e.,
P; atomic{};Q ⇤ P;Q .
6 Compilation
Dolan et al. [9] show that the LDRF memory model can be
compiled efficiently to both x86-TSO and AArch64/ARMv8.
Compilation of LDRF to x86-TSO requires no additional
fencing. Therefore non-volatile reads/writes execute with
native performance.
Because ARMv8 allows load buffering (which is disallowed
by LDRF), compilation to ARMv8 requires some fencing,
even for non-volatile reads/writes. [9] discusses two
compilation schemes and studies their performance on several
benchmarks with differing patterns of access. The performance
penalty is 2.5% for one compilation strategy and 0.6%
for the other. These results demonstrate that non-volatile
access is not appreciably slowed by the insertion of fences
to prevent load buffering.
The compilation results for plain variables carry over to
our model, which differs from [9] primarily in the style of
synchronization: [9] uses volatile variables, whereas we use
transactions. In both x86-TSO and ARMv8 models, there are
fences before and after successful transactions (see [6]), making
the fencing behavior similar to that of volatile variables.
Both x86-TSO and ARMv8 validate our implementation
model, assuming we include fences to prevent load buffering
in ARMv8, as described above.
In x86-TSO, crw order is included in -[hb]->. Thus, it is
straightforward to establish that x86-TSO validates even the
strongest variant of our programmer model, which includes
HB￿￿, HB￿￿, HB￿￿ and their prime variants. Like our programmer
model, x86-TSO validates privatization (Example
2.1). Like models that include A￿￿￿′￿￿, x86-TSO imposes publication
by antidependence (Example 3.1). Neither of these
examples requires quiescence fences on x86-TSO.
It is not immediately obvious whether ARMv8 realizes our
programmer model. In ARMv8, -[ob]-> plays the role of -[hb]->. The
crw relation is included in -[ob]-> when the source and target
come from different threads, known as external from-read.
As a result, ARMv8 gives the same strong result as x86-TSO
for Examples 2.1 and 3.1.
We expect that software transactional memories will realize
the implementation model of §5, rather than the programmer
model. As a result, it will be necessary for either
the programmer or compiler to insert quiescence fences in
order to realize our programmer model. Our results provide
a correctness criterion: they determine when there are sufficient
fences to guarantee the absence of data races in the implementation
model. As we discuss in §7, our work on placing quiescence
fences is compatible with, and builds on, the extensive
literature exploring this topic.
7 Related Work and Conclusions
Transactions [12, 18, 33] are motivated by the issues that
arise with lock-based programming. See [14, 16, 17, 23] for
textbook-style presentations. Hardware transactional models
that integrate with relaxed memory are available for Pentium,
Power and ARMv8 (in design) [5, 6, 10]. Software transactional
memory achieves transactional guarantees without the limitations
of the “bounded” and “best-effort” hardware transactional
model, e.g., the C++ design of transactions [29] in C11 [4],
Haskell transactions in GHC 6.4, and experimental designs for
Java [20] and C# [2].
Inspired by Dalessandro et al. [7] and Grossman et al. [13],
we use memory orders to integrate transactions into the
relaxed memory model of Dolan et al. [9].
In order to permit compiler optimizations, the LDRF model
of [9] is more liberal than sequential consistency. Yet it eschews
the speculative reads found in many models [19, 21,
25]. There is a rich design space for such “intermediate” models.
Ou and Demsky [30] include a survey of this work.
Transactional sequential consistency is similar to the
strong semantics [1], StrongBasic semantics [28], strong isolation
[17], and transactional memory with store atomicity
[24]. Opacity [15, 16] and TMS2 [8] treat aborted transactions
in this context (see [11] for a survey).
Our model of SC-TDRF replaces the global real-time order
by memory orders. We exploit the LDRF framework [9] to
achieve a modular form of LTRF that is insensitive to races
that are spatially and temporally isolated from the transactions
under consideration. LDRF is defined operationally
in [9], using machine states. We give an axiomatic account.
The two approaches are equivalent if every machine state is
derivable from the initial state.
Our results in §5 show that our model does not suffer from
“optimization obstruction” [35]. Prior work, e.g., [22, 34, 35],
requires that programmers place quiescence fences in order
to guarantee safe privatization. Our low-level model illustrates
the correctness criteria for such techniques.
In Spear et al. [35], transactions can optionally be marked
with annotations corresponding to publishing/privatizing
transactions. The weakest ordering -[sfs]-> in [35] is the smallest
transitive relation that includes transactional ordering and
ensures that a -[sfs]-> c in the cases when: (1) a is an acquire
transaction, a -[po]-> c, and a tx≁ c, or (2) there is some release
transaction b such that a -[po]-> b and either b lwr c or a
is transactionally ordered before c. There are two kinds of
fences in the implementation-level model of §5, namely the
explicit quiescence fences ⟨Qx⟩, and the implicit memory
fences arising from our transactional abstraction. In each
case, we can deduce a -[sfs]-> c, thus showing that our requirements
for synchronization are no stronger than those of [35].
Our treatment of the implementation model is inspired by
Khyzha et al. [22]. They divide actions into request/response
pairs such that transactional response actions may abort. Our
treatment is more abstract. We record all failed requests using
a single abort action. Our commit action corresponds to the
commit request in [22]. All of our other actions correspond
References
[1] M. Abadi, A. Birrell, T. Harris, and M. Isard. 2011. Semantics of
Transactional Memory and Automatic Mutual Exclusion. ACM Trans. Program.
Lang. Syst. 33, 1, Article 2 (Jan. 2011), 50 pages.
[2] M. Abadi, T. Harris, and M. Mehrara. 2009. Transactional memory
with strong atomicity using off-the-shelf memory protection hardware.
In PPoPP, D. A. Reed and V. Sarkar (Eds.). ACM, 185–196.
[3] S. V. Adve and M. D. Hill. 1990. Weak Ordering - a New Definition. In
Proceedings of the 17th Annual International Symposium on Computer
Architecture (ISCA ’90). ACM, New York, NY, USA, 2–14.
[4] H.-J. Boehm and S. V. Adve. 2008. Foundations of the C++ concurrency
memory model. In PLDI. ACM, 68–78.
[5] H. W. Cain, M. M. Michael, B. Frey, C. May, D. Williams, and H. Q. Le.
2013. Robust architectural support for transactional memory in the
Power architecture. In ISCA, A. Mendelson (Ed.). ACM, 225–236.
[6] N. Chong, T. Sorensen, and J. Wickerson. 2018. The semantics of
transactions and weak memory in x86, Power, ARM, and C++. In PLDI.
ACM, 211–225.
[7] L. Dalessandro, M. L. Scott, and M. F. Spear. 2010. Transactions As
the Foundation of a Memory Consistency Model. In DISC (LNCS).
Springer-Verlag, Berlin, Heidelberg, 20–34.
[8] S. Doherty, L. Groves, V. Luchangco, and M. Moir. 2013. Towards
formally specifying and verifying transactional memory. Formal Asp.
Comput. 25, 5 (2013), 769–799.
[9] S. Dolan, K. C. Sivaramakrishnan, and A. Madhavapeddy. 2018. Bounding
data races in space and time. In PLDI, Jeffrey S. Foster and Dan
Grossman (Eds.). ACM, 242–255.
[10] B. Dongol, R. Jagadeesan, and J. Riely. 2018. Transactions in relaxed
memory architectures. PACMPL 2, POPL (2018), 18:1–18:29.
[11] D. Dziuma, P. Fatourou, and E. Kanellou. 2015. Consistency for Trans-
actional Memory Computing. Springer International Publishing, Cham,
3–31.
[12] J. E. Gottschlich and H.-J. Boehm. 2013. Generic programming needs
transactional memory. In The 8th ACM SIGPLAN Workshop on Trans-
actional Computing.
[13] D. Grossman, J. Manson, and W. Pugh. 2006. What do high-level mem-
ory models mean for transactions?. In Memory System Performance
and Correctness. ACM, New York, NY, USA, 62–69.
[14] D. Grossman, V. Menon, S. Srinivas, and C. Zilles. 2007. Transactional
Memory in Managed Runtimes - Hardware/Software View.
https://www.microarch.org/micro40
[15] R. Guerraoui and M. Kapalka. 2008. On the Correctness of Transac-
tional Memory. In PPoPP. ACM, New York, NY, USA, 175–184.
[16] R. Guerraoui and M. Kapalka. 2010. Principles of Transactional Memory.
Morgan & Claypool Publishers.
[17] T. Harris, J. Larus, and R. Rajwar. 2010. Transactional Memory, 2nd
edition. Morgan & Claypool Publishers.
[18] M. Herlihy and J. E. B. Moss. 1993. Transactional Memory: Architec-
tural Support for Lock-Free Data Structures. In ISCA, A. J. Smith (Ed.).
ACM, 289–300.
[19] R. Jagadeesan, C. Pitcher, and J. Riely. 2010. Generative Operational
Semantics for Relaxed Memory Models. In ESOP. 307–326.
[20] S. Jagannathan, J. Vitek, A. Welc, and A. Hosking. 2005. A transactional
object calculus. Science of Computer Programming 57, 2 (2005),
164–186.
[21] J. Kang, C.-K. Hur, O. Lahav, V. Vafeiadis, and D. Dreyer. 2017. A promis-
ing semantics for relaxed-memory concurrency. In POPL, Giuseppe
Castagna and Andrew D. Gordon (Eds.). ACM, 175–189.
[22] A. Khyzha, H. Attiya, A. Gotsman, and N. Rinetzky. 2018. Safe privati-
zation in transactional memory. In PPOPP. ACM, 233–245.
[23] J. Larus and C. Kozyrakis. 2008. Transactional Memory. Commun.
ACM 51, 7 (July 2008), 80–88.
[24] J.-W. Maessen and Arvind. 2007. Store Atomicity for Transactional
Memory. Electronic Notes in Theoretical Computer Science 174, 9 (2007),
117–137. Proceedings of the Thread Verification Workshop (TV
2006).
[25] J. Manson, W. Pugh, and S. V. Adve. 2005. The Java memory model. In
POPL. 378–391.
[26] M. Martin, C. Blundell, and E. Lewis. 2006. Subtleties of Transactional
Memory Atomicity Semantics. IEEE Comput. Archit. Lett. 5, 2 (July
2006), 17–17.
[27] V. Menon, S. Balensiefer, T. Shpeisman, A.-R. Adl-Tabatabai, R. L. Hud-
son, B. Saha, and A. Welc. 2008. Practical Weak-atomicity Semantics
for Java Stm. In SPAA. ACM, New York, NY, USA, 314–325.
[28] K. F. Moore and D. Grossman. 2008. High-level small-step operational
semantics for transactions. In POPL, G. C. Necula and P. Wadler (Eds.).
ACM, 51–62.
[29] Y. Ni, A. Welc, A.-R. Adl-Tabatabai, M. Bach, S. Berkowits, J. Cownie,
R. Geva, S. Kozhukow, R. Narayanaswamy, J. Olivier, S. Preis, B. Saha,
A. Tal, and X. Tian. 2008. Design and Implementation of Transactional
Constructs for C/C++. SIGPLAN Not. 43, 10 (Oct. 2008), 195–212.
[30] Peizhao Ou and Brian Demsky. 2018. Towards Understanding the
Costs of Avoiding Out-of-thin-air Results. Proc. ACM Program. Lang.
2, OOPSLA, Article 136 (Oct. 2018), 29 pages.
[31] W. Pugh. 1999. Fixing the Java Memory Model. In Proceedings of the
ACM 1999 Conference on Java Grande (JAVA ’99). ACM, New York, NY,
USA, 89–98.
[32] A. Raad, O. Lahav, and V. Vafeiadis. 2018. On Parallel Snapshot Isolation
and Release/Acquire Consistency. In ESOP (Lecture Notes in Computer
Science), Vol. 10801. Springer, 940–967.
[33] N. Shavit and D. Touitou. 1995. Software Transactional Memory. In
PODC. ACM, New York, NY, USA, 204–213.
[34] T. Shpeisman, V. Menon, A.-R. Adl-Tabatabai, S. Balensiefer, D. Gross-
man, R. L. Hudson, K. F. Moore, and B. Saha. 2007. Enforcing isolation
and ordering in STM. In PLDI, J. Ferrante and K. S. McKinley (Eds.).
ACM, 78–88.
[35] M. F. Spear, L. Dalessandro, V. J. Marathe, and M. L. Scott. 2008.
Ordering-Based Semantics for Software Transactional Memory. In
OPODIS, T. P. Baker, A. Bui, and S. Tixeuil (Eds.). Springer Berlin
Heidelberg, Berlin, Heidelberg, 275–294.
[36] M. F. Spear, V. J. Marathe, L. Dalessandro, and M. L. Scott. 2007. Pri-
vatization Techniques for Software Transactional Memory. In PODC.
ACM, New York, NY, USA, 338–339.
Supplementary Material For “Modular Transactions:
Bounding Mixed Races in Space and Time”
A Proof of SC-LTRF Theorem
We begin with an example to explain the last condition in
the definition of transactional L-stability.
Example A.1. Recall the de￿nition of transactionally L-sta-
bility: A trace is transactionally L-stable for   if it is L-stable
for  , every transaction is both contiguous and resolved, and
there are no    2  ,   2  , and   2   such that   touches a
variable in L and   xrw   .
To see the need for the last requirement, consider the
following consistent execution:
[Figure: execution diagram. Top thread: Wx1, a:Wx2. Bottom thread: b:Rx1, W 1. Edges labelled wr and crw.]
Take L = { } and consider the decomposition of the execution in which   contains the
top thread,   contains the read of the bottom thread, and  
is the write. Ignoring initialization, we have   = $\langle sWx1 \rangle$
$\langle a{:}sB \rangle$ $\langle sWx2 \rangle$ $\langle sCa \rangle$,   = $\langle b{:}tB \rangle$ $\langle tRx1 \rangle$, and   = $\langle$tW 1$\rangle$.
This particular decomposition invalidates the theorem,
since we must remove a from   in order to linearize b, yet a
occurs in  .
The last requirement forbids this decomposition. In considering
the trace where a occurs before b, we must include
a in  , not  .
In order to prove the theorem, we first establish several
lemmas. The first two concern causal closure. Recall that
  $\downarrow$ a is the subtrace of   that discards all causal dependents
of a, defined as:
b $\notin$ (  $\downarrow$ a) iff a $(\xrightarrow{hb} \cup\, \mathit{lwr} \cup \mathit{xrw})^{+}$ b
In the rest of the appendix, we use the notation   $\downarrow$  to stand
for:
b $\notin$ (  $\downarrow$  ) iff ($\forall$a $\in$  ) a $(\xrightarrow{hb} \cup\, \mathit{lwr} \cup \mathit{xrw})^{+}$ b
It is immediate that   $\downarrow$   is invariant under permutations of
 .
Note that a $\in$   $\downarrow$ a. In the case where a is transactional,
the effect of   $\downarrow$ a is to remove all the dependent transactions
that read from the transaction, and also the anti-dependent
transactions. Thus, for any transactional b $\stackrel{tx}{\sim}$ a that is a read,
there are no transactional conflicting writes in   $\downarrow$ a with a
later timestamp.
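The causal closure with respect to a single action can be sketched as a reachability computation. The following is a minimal illustration only, assuming a hypothetical encoding in which a trace is a list of event names and the union of hb, lwr and xrw is given as a set of (source, target) pairs; the function name `causal_closure` is ours and does not appear in the formal development.

```python
from collections import defaultdict


def causal_closure(trace, edges, a):
    """Drop from `trace` every event reachable from `a` in one or more steps.

    trace: list of event names, in trace order (hypothetical encoding).
    edges: iterable of (src, dst) pairs standing for the union of the
           hb, lwr and xrw relations; assumed acyclic.
    """
    succ = defaultdict(set)
    for src, dst in edges:
        succ[src].add(dst)

    # Collect the strict causal dependents of `a` (transitive closure from `a`).
    dependents = set()
    stack = list(succ[a])
    while stack:
        event = stack.pop()
        if event not in dependents:
            dependents.add(event)
            stack.extend(succ[event])

    # `a` itself survives: an acyclic relation never leads back to `a`.
    return [event for event in trace if event not in dependents]


trace = ["w0", "a", "b", "c", "d"]
edges = [("w0", "a"), ("a", "b"), ("b", "c")]
print(causal_closure(trace, edges, "a"))  # ['w0', 'a', 'd']
```

In this toy trace, b and c depend on a and are discarded, while the unrelated action d and the predecessor w0 are kept.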
The first lemma shows that   is included in     $\downarrow$  whenever
it is stable.
Lemma A.2. Suppose   is transactionally L-stable,   is transactionally
L-sequential in   , and   touches a location in L.
Then   is a prefix of     $\downarrow$   .
Proof. We show that for all b $\in$  
$\neg$(  $(\xrightarrow{hb} \cup\, \mathit{lwr} \cup \mathit{xrw})^{+}$ b)
It suffices to prove $\forall$c $\in$     , if a $\in$   is such that
c $(\xrightarrow{hb} \cup\, \mathit{lwr} \cup \mathit{xrw})$ a,
then c $\in$  . We proceed by cases:
• c xrw a. The result follows from the assumption that   is transactionally
L-sequential. (This requires the assumption that
  touches a location in L.)
• c lwr a. Since   is a prefix of     , the result follows
by WF8.
• c $\xrightarrow{hb}$ a. There are two subcases.
– c $\xrightarrow{hb}$ a by item HB￿￿￿￿. The required result follows
by WF9–WF11.
– c $\xrightarrow{hb}$ a by item HB￿￿. The required result follows
by transactional L-stability of  . $\Box$
The next lemma establishes that causal closure preserves
transactional L-sequentiality.
Lemma A.3. Suppose   is transactionally L-stable,   is transactionally
L-sequential in   , and   touches a location in L.
Then     $\downarrow$   =    ′  , where   ′  is transactionally L-sequential
in   .
Proof. If   $\notin$  , then the result is trivial, since  $\downarrow$  =  . Thus,
assume   $\in$  . Using Lemma A.2, we know that     $\downarrow$  
includes  . Thus we can fix   ′ so that     $\downarrow$   =    ′  .
First, we show that    ′  is well-formed. WF1–WF5 and
WF7–WF8 follow from the well-formedness of  . WF6 follows
since   $\downarrow$   is closed under the predecessors of lwr, for any
  . WF9 and WF10 follow since   \ (  $\downarrow$  ) is closed under $\stackrel{tx}{\sim}$.
WF11 follows, since it is preserved under removal of actions.
Consistency of    ′  follows from the consistency of    
since all relations on    ′  are subrelations of     .
Transactional L-sequentiality of    ′  follows from transactional
L-sequentiality of     . $\Box$
The next lemma establishes that any L-weak action
participates in an L-race. The proof mirrors the last two paragraphs
of the proof of Theorem 13 of [1]. Instead of reasoning
operationally, we use the consistency axioms C￿￿￿￿￿￿￿￿
and C￿￿￿￿￿￿￿￿.
Lemma A.4. Suppose   is L-stable,   is L-sequential and
  $\langle c \rangle$ $\in$  . If c is L-weak then there exists some b $\in$   such
that (b, c) is an L-race.
Proof. Suppose c is L-weak. Then, there exists a write action
b $\xrightarrow{index}$ c such that either
• c ww b, or
• a wr c and a ww b; thus, c rw b.
If b $\xrightarrow{hb}$ c, we have a contradiction, either because
• c ww b contradicts the irreflexivity of ($\xrightarrow{hb}$; lww), or
• c rw b contradicts the irreflexivity of ($\xrightarrow{hb}$; lrw).
So, (b, c) is an L-race. Further, b cannot be in   since   is
L-stable; therefore, b must be in  , as required. $\Box$
The next lemma says that every execution has an order-
preserving permutation with contiguous transactions. The
proof formalizes the following argument: All the ordering
between transactional actions is reflected in the causality
order. Furthermore, two actions in the same transaction are
treated identically by the causality order. Consequently, since
the causality order is acyclic, we can use a linearization of it
to achieve contiguity in transactions.
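The argument can be sketched operationally as a topological sort over transactions rather than over individual actions. The code below is only an illustration under stated assumptions: events are strings, `tx_of` maps each transactional event to a (hypothetical) transaction name that is distinct from every event name, each nontransactional event forms its own singleton class, and `edges` contains every ordering that must be preserved. It uses Kahn's algorithm with a heap so that ties are broken by first occurrence in the original trace; it is not the construction used in the formal proof.

```python
from collections import defaultdict
import heapq


def contiguous_linearization(trace, tx_of, edges):
    """Reorder `trace` so that each transaction's events are adjacent.

    trace: list of events in index order.
    tx_of: dict mapping transactional events to a transaction name.
    edges: (src, dst) pairs of the causality order to be respected.
    """
    cls = {e: tx_of.get(e, e) for e in trace}  # singleton class if not in a transaction
    first = {}                                  # first index at which each class occurs
    members = defaultdict(list)                 # events of each class, in index order
    for i, e in enumerate(trace):
        first.setdefault(cls[e], i)
        members[cls[e]].append(e)

    # Lift the order to classes; edges inside one transaction are ignored.
    succ = defaultdict(set)
    indegree = {c: 0 for c in members}
    for src, dst in edges:
        cs, cd = cls[src], cls[dst]
        if cs != cd and cd not in succ[cs]:
            succ[cs].add(cd)
            indegree[cd] += 1

    ready = [(first[c], c) for c in members if indegree[c] == 0]
    heapq.heapify(ready)
    result = []
    while ready:
        _, c = heapq.heappop(ready)
        result.extend(members[c])  # emit the whole transaction contiguously
        for d in succ[c]:
            indegree[d] -= 1
            if indegree[d] == 0:
                heapq.heappush(ready, (first[d], d))

    if len(result) != len(trace):
        raise ValueError("the order lifted to transactions is cyclic")
    return result


trace = ["a1", "x", "a2", "y"]
tx_of = {"a1": "A", "a2": "A"}
print(contiguous_linearization(trace, tx_of, [("a1", "a2"), ("x", "y")]))
```

The `ValueError` branch corresponds to the situation the lemma rules out: if two actions of one transaction were treated differently by the causality order, the lifted order could contain a cycle.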
Lemma A.5. Let   be the semantics of a program, and fix
   $\in$  . Suppose that all transactions of   are contiguous and
no transactions of   are live in  . Then there exists an order-preserving
permutation    of    such that    $\in$   and   
has contiguous transactions.
Proof. By Causality, ($\xrightarrow{hb} \cup\, \mathit{lwr} \cup \mathit{xrw}$) is acyclic. Thus,
we can extend $(\xrightarrow{hb} \cup\, \mathit{lwr} \cup \mathit{xrw})^{*}$ to a total order over the
actions of   . Fix such a total order (with the initializing
begin transaction as minimal element), and let R be the suborder
that includes only nontransactional actions and begin
actions. We extend R to a total order over the actions of   
as follows. Define a E b when one of the following holds:
a $\in$   $\wedge$ b $\in$   (1)
a $\stackrel{tx}{\sim}$ a′ R b′ $\stackrel{tx}{\sim}$ b (2)
a $\stackrel{tx}{\sim}$ b $\wedge$ a $\xrightarrow{index}$  b (3)
Condition (1) ensures that the actions in   are ordered before
those in  . Condition (2) ensures that the actions in a
transaction of    are treated identically by E with respect to
actions outside the transaction; recall that $\stackrel{tx}{\sim}$ relates each
nontransactional action to itself. Condition (3) forces order
within a transaction of    to coincide with the order from
$\xrightarrow{index}$  .
It is clear that E induces a total order on the actions of   
with contiguous transactions. Supposing that the trace ordered
by E is well-formed, it is trivial to show that it is consistent,
since no orders are changed. Because the semantics
of a program must be closed with respect to order-preserving
permutation, we further have that the trace belongs to  .
Thus, to prove the lemma it suffices to show that the trace
ordered by E is well-formed. We consider each of the well-formedness
criteria given in §2.
WF1 follows from the choice of R.
WF2–WF4 and WF6–WF7 follow from the well-formedness
of   .
WF5 holds due to well-formedness of    and (3).
If both actions are nontransactional, WF8 follows from
well-formedness of   . If both are transactional, it follows
because cwr is included in $\xrightarrow{hb}$. Suppose the write is transactional
and the read is not. Then the begin is ordered with respect
to the read in the lifted relation lwr. Using (2) and (3),
the result holds. The argument is symmetric for the case
where the read is transactional and the write is not.
For WF9, if a, b are conflicting transactional writes, then
a $\xrightarrow{hb}$ b or b $\xrightarrow{hb}$ a. In the former case, a E b, by definition
of R. The case for b $\xrightarrow{hb}$ a is symmetric.
For WF10, let a, b be conflicting transactional writes such
that a ww b and let a wr c. Thus, c rw b. Since c is also
transactional we have c xrw b. Thus, c E b by the definition
of R.
For WF11, let b be transactional and a wr b and a ww c
and c $\stackrel{tx}{\sim}$ b. If c E b, then c $\xrightarrow{index}$ b, contradicting WF11 on
  . $\Box$
The next lemma shows that races are preserved by delay-
ing the timestamp of writes. The intuition is that delaying
the timestamp of a write can only decrease happens before.
Note that only the timestamp of the last write is increased,
and since timestamps are rationals, it is straightforward to
change a timestamp so that the execution under considera-
tion remains consistent.
The key step in the proof is the inductive case for HB￿￿,
which requires A￿￿￿￿￿.
Lemma A.6. Let   =    be a consistent execution such that
( , ) is an L-race in   between two writes. Let   =    ′ where
  ′ $\stackrel{act}{\sim}$   and   ′ has a later timestamp.
Then   is consistent and ( ,  ′) is an L-race in  .
Proof. Since   =    ′ and a $\xrightarrow{hb}$  c implies a $\xrightarrow{index}$  c, it is
not possible that   ′ $\xrightarrow{hb}$  c or   $\xrightarrow{hb}$  c. We call this property
Terminal.
We show that a $\xrightarrow{hb}$  c implies a $\xrightarrow{hb}$  c, for any a, c.
The proof proceeds by induction on the definition of $\xrightarrow{hb}$.
The empty relation satisfies the hypothesis. For the inductive
step, we have three cases. If a $\xrightarrow{hb}$  c and   ′ $\neq$ c, then a $\xrightarrow{hb}$ 
c and therefore a $\xrightarrow{hb}$  c. Thus, we need only consider cases
where   = c.
• For HB￿￿￿￿, note that $\xrightarrow{init}$  = $\xrightarrow{init}$  and $\xrightarrow{po}$  = $\xrightarrow{po}$  .
If   ′ is transactional, then, by construction,   must
also be transactional. In this case, using Terminal, we
deduce that cwr   = cwr   . Since   and   ′ are transactional
writes on the same variable, using Terminal, we
deduce that cww   = cww   . If   ′ is nontransactional,
then modifying the timestamp of   has no effect on
any of the relations in HB￿￿￿￿.
• HB￿￿￿￿￿ follows immediately by induction.
• For HB￿￿, suppose that a and b are nonaborted and
transactional,   ′ is plain, a lww     ′, a crw   b and
b $\xrightarrow{hb}$    ′.
We have a crw   b and therefore a crw   b.
Since   $\stackrel{act}{\sim}$   ′, we know that   and   ′ have the same
name. Applying the induction hypothesis to b $\xrightarrow{hb}$    ′,
we have b $\xrightarrow{hb}$    .
Note that the timestamp of   cannot be less than that
of a. If this were the case, then we would also have
that the timestamp of   is less than that of a, and we
would have   lww   a and a crw   b $\xrightarrow{hb}$    . Thus,  
would fail to be consistent by A￿￿￿￿￿.
Since the timestamp of   must be greater than that
of a, we have a lww     . Thus, by HB￿￿, we have
a $\xrightarrow{hb}$    , as required.
Well-formedness of   is immediate.
Consistency of   follows from $\xrightarrow{hb}$  $\subseteq$ $\xrightarrow{hb}$  , using the
consistency of  .
The raciness of ( ,  ′) in   also follows from $\xrightarrow{hb}$  $\subseteq$ $\xrightarrow{hb}$  ,
using the fact that ( , ) is an L-race in  . $\Box$
The next lemma shows that races are preserved by delay-
ing the timestamp of some reads. The intuition again is that
delaying the timestamp can only decrease happens before.
The sole case when delaying the timestamp of a read can
actually increase happens before is when the read is transac-
tional and the newly matched write is also transactional. The
hypothesis of the following lemma rules out this problematic
case.
Lemma A.7. Let   =    be a consistent execution such that
( , ) is an L-race in  ,   is a write and   is a read. Let   =    ′
where   ′ $\stackrel{act}{\sim}$   and   ′ has a later timestamp.
Suppose that the writes satisfying   and   ′ are nontransactional
when   is transactional (and therefore   ′ is transactional).
Then   is consistent and ( ,  ′) is an L-race in  .
Proof. The proof is similar to the proof of Lemma A.6.
For HB￿￿￿￿, the result follows since the writes matching
 ,  ′ are not transactional when   and   ′ are transactional.
For rule HB￿￿, the result follows since   and   ′ are not
writes. $\Box$
Lemma A.8. Fix   to be the semantics of a program. Fix
    $\in$   such that
•   is transactionally L-stable,
•   is transactionally L-sequential in   , and
•   has no L-races in   .
Then, there is    ′  $\in$   such that
•   ′ is transactionally L-sequential in    ′,
•   ′  has contiguous transactions, and
•   ′ is an order-preserving permutation of a subsequence
of  .
Proof. If   is nontransactional or a begin action, setting
  =   ′ meets the requirements. Thus we suppose that   is
transactional, belonging to transaction a of thread s.
Let   =     $\downarrow$ a. Let   ′ be derived from   by permuting
the events of the open transaction a to the end.
This order-preserving permutation establishes contiguity
of transactions in   ′  .
Next, we show that    ′  is well-formed. WF1–WF7 are
immediate. WF8 follows because the writes of a are only read
by actions of a by WF7. WF9 and WF10 follow because   ′ is
derived from   . WF11 is inherited from well-formedness of
    .
Finally, we show that   ′ is L-sequential in    ′. We proceed
by contradiction. There are two cases to consider. Let c be
an arbitrary action in   ′.
• c touches a location in L and there is a b $\xrightarrow{index}$ c such
that c ww b. Since     is well-formed, this can only
happen if c is in open transaction a and c was before
b in     .
We reason by cases based on whether b is transactional.
– If b is transactional, c $\xrightarrow{hb}$ b; so b $\notin$   .
– If b is not transactional, then, since b $\notin$   , we
deduce that $\neg$(c $\xrightarrow{hb}$ b). So, since there are no data
races in  , we deduce that b $\xrightarrow{hb}$ c, which contradicts
Coherence of   .
• c touches a location in L, a wr c, and there is b $\xrightarrow{index}$
c such that a ww b. We reason by cases based on
whether b is transactional.
– If b is transactional, c xrw b; so b $\notin$   .
– If b is not transactional, then, since b $\notin$   , we
deduce that $\neg$(c $\xrightarrow{hb}$ b). So, since there are no data
races in  , we deduce that b $\xrightarrow{hb}$ c, which contradicts
Observation of   . $\Box$
We now turn to the theorem.
Theorem 4.1. Fix   to be the semantics of a program. Fix
    $\in$   such that
•   is transactionally L-stable,
•   is transactionally L-sequential in   ,
•   has no L-races in   , and
•   is L-weak in     .
Then, there are b $\in$  ,   ′ $\stackrel{act}{\sim}$   and    ′  ′ $\in$   such that
•   ′  ′ is transactionally L-sequential in    ′  ′, and
• (b,  ′) is an L-race in    ′  ′.
Proof. By Lemma A.8, we can assume without loss of generality
that     has contiguous transactions.
Choose b as follows. Since   is L-weak, by Lemma A.4, we
know that there is some b such that (b, ) is an L-race. By
the definition of stability, we know that b must occur in  .
Choose   ′ as follows. Since   is sequentially-closed, there
must be an L-sequential action   ′ $\stackrel{act}{\sim}$   such that     ′ $\in$  .
Choose   ′ as follows. By Lemma A.3, there is some   ′
such that     $\downarrow$   ′ =    ′  ′. Since   is causally-closed, we
know that    ′  ′ $\in$  . Since   ′ is a subsequence of  , all
transactions of   ′ are contiguous. By construction, using
Lemma A.3, we know that   ′  ′ is L-sequential in   ′  ′. Thus,
  ′  ′ is transactionally L-sequential in    ′  ′.
We need only show that (b,  ′) is an L-race. We proceed
by cases.
• Suppose that   is a B, C, or A action. This is not
possible since these actions are always L-sequential.
• Suppose that   is a write. The result follows from
Lemma A.6.
• Suppose that   is a nontransactional read. The result
follows from Lemma A.7.
• Finally, suppose that   is a transactional read.
The write matching   must be nontransactional. Otherwise
WF10 guarantees that   would be L-sequential.
The write matching   ′ must be nontransactional. Otherwise
it would follow   in xrw, and thus must have
been removed from the causal closure. (This case corresponds
to executions illustrated at the beginning of
the paragraph labelled “From D to T” on page 8.)
Given that the fulfilling writes for   and   ′ are not
transactional, the hypotheses of Lemma A.7 are satisfied,
yielding the required result. $\Box$
B Aborted Transactions
Theorem 4.2. If   is consistent then so is   with aborted
transactions removed.
Proof. Let   be any well-formed and consistent trace. Then:
•   without a is well-formed in the case that a = $\langle R \rangle$ or
a = $\langle W \rangle$ and a is not the source of an xwr edge.
• By WF7, if a = $\langle W \rangle$ is in an aborted transaction, any
read of a is also in the same aborted transaction.
•   with $\langle B \rangle$ (and any matching $\langle end \rangle$) removed is also
well-formed.
Let   be a well-formed and consistent trace. Let us write
  $\setminus$ A for   with aborted transactions removed. By the above
observations,   $\setminus$ A is well-formed. Consistency of   $\setminus$ A follows
from the consistency of   because the relations on   $\setminus$ A are
merely the restriction of those in   to a subset of events. $\Box$
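The construction in this proof amounts to filtering the trace and restricting each relation to the surviving events. A minimal sketch follows, assuming a hypothetical encoding where `tx_of` maps transactional events to transaction names and `aborted` is the set of aborted transaction names; the helper names are ours.

```python
def drop_aborted(trace, tx_of, aborted):
    """Return the trace without the events of aborted transactions."""
    return [e for e in trace if tx_of.get(e) not in aborted]


def restrict(relation, events):
    """Restrict a relation (set of pairs) to the given events."""
    kept = set(events)
    return {(src, dst) for (src, dst) in relation if src in kept and dst in kept}


# Transaction a aborts after writing and reading x; transaction b commits.
trace = ["Ba", "Wa", "Ra", "Aa", "Bb", "Rb", "Cb"]
tx_of = {"Ba": "a", "Wa": "a", "Ra": "a", "Aa": "a",
         "Bb": "b", "Rb": "b", "Cb": "b"}
wr = {("Wa", "Ra")}            # the aborted write is only read inside a
po = {("Bb", "Rb"), ("Rb", "Cb"), ("Ba", "Wa")}

remaining = drop_aborted(trace, tx_of, {"a"})
print(remaining)
print(restrict(wr, remaining), restrict(po, remaining))
```

Since each restricted relation is a subset of the original one, any cycle or reflexive pair in the restricted relations would already be present in the original execution.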
C Technical Development for §5
The intuition behind the proof of Lemma 5.1 is that the extra
explicit ordering in an implementation race free execution
compensates for the specified extra HB￿￿ and A￿￿￿￿￿ axioms
in the programmer model.
Lemma 5.1. Let   be an execution in the implementation
model without mixed races. Let   be the induced execution in
the programmer model obtained by dropping all the quiescence
fences in  . If   is consistent, then so is  .
Proof. Well-formedness of   is immediate.
Consistency of   follows if we can show that the orders
in   agree with those in  . Thus, it suffices to show that  
satisfies HB￿￿ and A￿￿￿￿￿. We proceed as follows.
To show HB￿￿, let c be plain, a lww c, and a crw b $\xrightarrow{hb}$ c
in  . Then, by implementation race freedom, we must have
a $\xrightarrow{hb}$ c, otherwise a and c would be racing.
To show A￿￿￿￿￿, suppose a crw b $\xrightarrow{hb}$ c lww a in  .
By implementation race freedom, we must have c $\xrightarrow{hb}$ a.
However, this leads to a cycle in crw $\cup$ $\xrightarrow{hb}$, contradicting
the observation axiom of  . $\Box$
Suborders We follow [1] in providing an alternate characterization
of $\xrightarrow{hb}$ in the implementation model. Recall that the
$\xrightarrow{hb}$ relation in the implementation model does not include
HB￿￿.
Let $\xrightarrow{swe} = (\mathit{cwr} \cup \mathit{cww}) \setminus \xrightarrow{po}$ be the external transactional
communication relation, which captures the basic
ingredients in the $\xrightarrow{hb}$ relation across threads, namely external
transactional reads-from and external transactional
coherence.
Let $\xrightarrow{hbe} = \xrightarrow{\text{po-T}}; (\xrightarrow{swe}; \xrightarrow{\text{poTT}})^{\star}; \xrightarrow{swe}; \xrightarrow{\text{poT-}}$ be the external
component of $\xrightarrow{hb}$, which captures how synchronization
propagates across different threads.
These definitions provide a clean decomposition of hb.
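Lemma C.1 (Characterizing hb). $\xrightarrow{hb} = \xrightarrow{init} \cup \xrightarrow{hbe} \cup \xrightarrow{po}$
Proof. The inclusion $\xrightarrow{init} \cup \xrightarrow{hbe} \cup \xrightarrow{po} \subseteq \xrightarrow{hb}$ is immediate.
For the converse direction, the following calculations are
immediate.
$\xrightarrow{init}; \xrightarrow{hb} \subseteq \xrightarrow{init}$
$\xrightarrow{\text{poT-}}; \xrightarrow{\text{po-T}} \subseteq \xrightarrow{\text{poTT}}$
$\xrightarrow{po}; \xrightarrow{hbe}; \xrightarrow{po} \subseteq \xrightarrow{hbe}$
Thus we are able to deduce that $\xrightarrow{hbe}; \xrightarrow{hbe} \subseteq \xrightarrow{hbe}$ as follows:
\begin{align*}
\xrightarrow{hbe}; \xrightarrow{hbe}
&= \xrightarrow{\text{po-T}}; (\xrightarrow{swe}; \xrightarrow{\text{poTT}})^{\star}; \xrightarrow{swe}; \xrightarrow{\text{poT-}}; \xrightarrow{\text{po-T}}; (\xrightarrow{swe}; \xrightarrow{\text{poTT}})^{\star}; \xrightarrow{swe}; \xrightarrow{\text{poT-}} \\
&\subseteq \xrightarrow{\text{po-T}}; (\xrightarrow{swe}; \xrightarrow{\text{poTT}})^{\star}; \xrightarrow{swe}; \xrightarrow{\text{poTT}}; (\xrightarrow{swe}; \xrightarrow{\text{poTT}})^{\star}; \xrightarrow{swe}; \xrightarrow{\text{poT-}} \\
&\subseteq \xrightarrow{\text{po-T}}; (\xrightarrow{swe}; \xrightarrow{\text{poTT}})^{\star}; \xrightarrow{swe}; \xrightarrow{\text{poT-}} \\
&= \xrightarrow{hbe}
\end{align*}
Hence, $\xrightarrow{init} \cup \xrightarrow{hbe} \cup \xrightarrow{po}$ is transitive. The proof is completed
by noting that $\mathit{cwr} \cup \mathit{xrw} \subseteq \xrightarrow{hbe} \cup \xrightarrow{po}$. $\Box$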
They also provide an alternative characterization of consistency
in the implementation model.¹
¹ We include $\xrightarrow{init}$ to be consistent with [1]. It can be removed since the
initializing transaction has only one write per location; thus, initialization
actions are not the target of any of our relations.
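Let $\xrightarrow{wre} = \mathit{lwr} \setminus \xrightarrow{po}$ be the external portion of the write-to-read
relation, and $\xrightarrow{xrwe} = \mathit{xrw} \setminus \xrightarrow{po}$ be the external
portion of the transactional read-to-write relation.
Lemma C.2. An execution is consistent in the implementation
model iff the following hold.
($\xrightarrow{hbe} \cup \xrightarrow{\text{poT-}} \cup \xrightarrow{\text{po-T}} \cup \xrightarrow{poRW} \cup \xrightarrow{wre} \cup \xrightarrow{xrwe}$) is acyclic.
($\xrightarrow{init} \cup \xrightarrow{hbe} \cup \xrightarrow{poCon}$); lww is irreflexive.
($\xrightarrow{init} \cup \xrightarrow{hbe} \cup \xrightarrow{poCon}$); lrw is irreflexive.
Proof. For causality, we need that ($\xrightarrow{hb} \cup\, \mathit{lwr} \cup \mathit{xrw}$) is acyclic.
We deduce:
$\xrightarrow{hb} \cup\, \mathit{lwr} \cup \mathit{xrw}$ is acyclic.
$\Leftrightarrow$ $\xrightarrow{init} \cup \xrightarrow{hbe} \cup \xrightarrow{po} \cup\, \mathit{lwr} \cup \mathit{xrw}$ is acyclic.
$\Leftrightarrow$ $\xrightarrow{hbe} \cup \xrightarrow{po} \cup\, \mathit{lwr} \cup \mathit{xrw}$ is acyclic.
$\Leftrightarrow$ $\xrightarrow{hbe} \cup \xrightarrow{po} \cup \xrightarrow{wre} \cup \xrightarrow{xrwe}$ is acyclic.
The first step follows from Lemma C.1; the second since $\xrightarrow{init}$
is acyclic, and the last from the definitions of $\xrightarrow{wre}$, $\xrightarrow{xrwe}$.
Consider a cycle in the last relation above. Without loss
of generality, assume that every two adjacent elements of
the cycle are in different threads. All the relations other than
$\xrightarrow{wre}$ use transactional events. So, if we have two adjacent
events a $\xrightarrow{po}$ b, neither of which is transactional, the cycle
contains c1 $\xrightarrow{wre}$ a $\xrightarrow{po}$ b $\xrightarrow{wre}$ c2. Thus, we deduce that
a $\xrightarrow{poRW}$ b.
In the last three items, we use Lemma C.1 for the alternative
characterization of $\xrightarrow{hb}$. We can replace $\xrightarrow{po}$ by $\xrightarrow{poCon}$
by the following reasoning. If a (lww $\cup$ lrw) b, then a, b
access the same location and at least one is a write. $\Box$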
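The three conditions of Lemma C.2 can be checked mechanically on a finite execution. The sketch below is an illustration under assumptions of our own: each relation is a Python set of pairs stored in a dictionary keyed by the relation names used above, and the suborders (hbe, poRW, poCon, and so on) are taken as given inputs rather than computed from a trace.

```python
from collections import defaultdict


def compose(r, s):
    """Relational composition r;s of two sets of pairs."""
    by_source = defaultdict(set)
    for b, c in s:
        by_source[b].add(c)
    return {(a, c) for a, b in r for c in by_source[b]}


def is_irreflexive(r):
    return all(a != b for a, b in r)


def is_acyclic(r):
    """Kahn's algorithm: acyclic iff every node can be removed."""
    succ, indegree, nodes = defaultdict(set), defaultdict(int), set()
    for a, b in r:
        nodes.update((a, b))
        if b not in succ[a]:
            succ[a].add(b)
            indegree[b] += 1
    ready = [n for n in nodes if indegree[n] == 0]
    removed = 0
    while ready:
        n = ready.pop()
        removed += 1
        for m in succ[n]:
            indegree[m] -= 1
            if indegree[m] == 0:
                ready.append(m)
    return removed == len(nodes)


def consistent(rel):
    """Check the three conditions of Lemma C.2 on a dict of relations."""
    causal = set().union(rel["hbe"], rel["poT-"], rel["po-T"],
                         rel["poRW"], rel["wre"], rel["xrwe"])
    ordered = set().union(rel["init"], rel["hbe"], rel["poCon"])
    return (is_acyclic(causal)
            and is_irreflexive(compose(ordered, rel["lww"]))
            and is_irreflexive(compose(ordered, rel["lrw"])))


names = ["hbe", "poT-", "po-T", "poRW", "wre", "xrwe",
         "init", "poCon", "lww", "lrw"]
empty = {n: set() for n in names}
ok = dict(empty, hbe={("a", "b")})
print(consistent(ok), consistent(dict(ok, lww={("b", "a")})))
```

In the second call, a is ordered before b by hbe while b precedes a in lww, so the composition contains the reflexive pair (a, a) and the check fails.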
The following lemma addresses the infrastructure needed
for reordering transformations.
Lemma C.3. Let  ,   be well-formed executions with the same
events that agree on the $\xrightarrow{init}$, ww, wr, and $\stackrel{tx}{\sim}$ relations and
satisfy:
($\xrightarrow{\text{po-T}}$  , $\xrightarrow{\text{poT-}}$  , $\xrightarrow{\text{poTT}}$  , $\xrightarrow{poRW}$  , $\xrightarrow{poCon}$  , $\xrightarrow{swe}$  )
= ($\xrightarrow{\text{po-T}}$  , $\xrightarrow{\text{poT-}}$  , $\xrightarrow{\text{poTT}}$  , $\xrightarrow{poRW}$  , $\xrightarrow{poCon}$  , $\xrightarrow{swe}$  )
Then,   is consistent iff   is consistent.
Proof. We first show that the happens-before relations of
 ,   coincide. Since $\xrightarrow{swe}$ coincides for  ,  , $\xrightarrow{hbe}$ coincides
for  ,  . The result is immediate using Lemma C.1.
Since  ,   also agree on all the base relations $\xrightarrow{init}$, ww,
wr, and $\stackrel{tx}{\sim}$, they also agree on all the derived lifted relations.
The result follows. $\Box$
The following lemma addresses the infrastructure needed
for roach-motel transformations.
Lemma C.4. Let  ,   be well-formed executions with the same
events that agree on the $\xrightarrow{init}$, ww, wr and $\xrightarrow{po}$. Let the $\stackrel{tx}{\sim}$
relation of   be a superset of the $\stackrel{tx}{\sim}$ relation of  .
Then, if   is consistent, so is  .
Proof. Since the $\stackrel{tx}{\sim}$ relation of   is a subset of the $\stackrel{tx}{\sim}$ relation
of  , and  ,   agree on all the base relations $\xrightarrow{init}$, ww, wr,
and $\xrightarrow{po}$, we deduce that all lifted relations of   are a subset
of the lifted relations of   and $\xrightarrow{hb}$  $\subseteq$ $\xrightarrow{hb}$  .
Consistency of   follows from the consistency of  . $\Box$
The following lemma addresses the infrastructure needed
for fusion transformations.
Lemma C.5. Let   be a consistent, well-formed execution
with transaction a in s. Let b be a new name. Let   be derived
from   by:
• introducing $\langle a{:}sC \rangle \langle b{:}sB \rangle$ between the begin and end of
transaction a
• replacing the end (commit/abort) of a, if any, by an end
(commit/abort) of b
Then,   is well-formed and consistent.
Proof. Well-formedness of   follows from the well-formedness
of  . WF1–WF8 are unaffected by the changes. Any violation
of WF9–WF11 in   induces a violation of the same in  .
All orders in   restricted to the actions from   are contained
in the corresponding orders on  . Any simple cycle
in any of the consistency criteria on   induces a simple
cycle in   with the new actions replaced by $\langle a{:}sB \rangle$. Thus,
consistency of   follows from consistency of  . $\Box$
The following lemma addresses the infrastructure needed
for removing empty transactions.
Lemma C.6. Let   =   ′    ″ be a consistent, well-formed
execution, where   is an action of s that is not part of any
transaction.
Let b be a new name. Let   =   ′  $\langle b{:}sB \rangle \langle b{:}sC \rangle$  ″.
Then,   is well-formed and consistent.
Proof. Well-formedness of   follows immediately from the
well-formedness of  . WF1–WF8 are unaffected by the changes.
Any violation of WF9–WF11 in   induces a violation of the
same in  .
The new actions in   only participate in the $\xrightarrow{po}$ order,
where they have a unique predecessor and successor. All
orders in   restricted to the actions from   are contained in
the corresponding orders on  . Any simple cycle in any of
the consistency criteria on   induces a simple cycle in  
with the new actions replaced by    .
$\Box$
D Additional Examples
The next two examples discuss aborted transactions.
Example D.1 (Opaque writes). Final outcome r = 1 is not
permitted in the program below.
atomic_a { x:=1; abort } || atomic_b { r:=x }
This is trivial to justify by well-formedness (condition 7)
since wr cannot originate from an aborted transaction.
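The check used to reject this outcome can be sketched as a predicate on candidate reads-from edges, following the reading of WF7 given in the proof of Theorem 4.2: a write in an aborted transaction may only be read inside that same transaction. The encoding below (event names, the `tx_of` map, the function name `wf7_ok`) is hypothetical and only illustrates the condition.

```python
def wf7_ok(wr, tx_of, aborted):
    """True iff no reads-from edge leaves an aborted transaction.

    wr: set of (write, read) pairs.
    tx_of: dict mapping transactional events to transaction names.
    aborted: set of aborted transaction names.
    """
    for write, read in wr:
        tx = tx_of.get(write)
        if tx in aborted and tx_of.get(read) != tx:
            return False
    return True


# Example D.1: transaction a writes x:=1 and aborts; transaction b reads x.
tx_of = {"Wx1": "a", "Rx": "b"}
aborted = {"a"}

reads_from_aborted = {("Wx1", "Rx")}   # would justify r = 1
reads_from_init = {("init_x", "Rx")}   # r = 0

print(wf7_ok(reads_from_aborted, tx_of, aborted))
print(wf7_ok(reads_from_init, tx_of, aborted))
```

Only the candidate execution in which b reads the initial value of x passes the check, which matches the claim that r = 1 is not a permitted outcome.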
Example D.2 (Race-free speculation). The only permitted
final outcome is r = 2.
atomic_a { x++;  ++ }
|| atomic_b { if x ≠   then { z:=1; abort } } || z:=2; r:=z
Since the guard of transaction b will never hold, the program
is race free, and hence it will never execute z:=1. This means
that there is no danger that the abort will undo the nontransactional
write to z. In particular, in every execution $\langle Wz2 \rangle$
obscures the read of z in the third thread.
Example D.3 (Dirty reads). Final result x = 0 and   = 1 is
forbidden.
atomic_a { if !  then x:=1; abort }; atomic_b { if !  then x:=1 }
|| if x = 1 then  :=1
The result would be possible if the second thread observes the
write of x in transaction a, then updates  . Since a rolls back,
it will restore x's value back to 0, causing transaction b to skip
over the update to x on re-execution. However, in our model
such an execution is not possible since nontransactional
events cannot read from live or aborted transactions.
Example D.4 (No overlapped writes). Final result r = 0 is
forbidden in the program below, where z is an array. The
result would be possible if transaction a initializes z[ ] and
then publishes it by writing it to shared volatile variable x.
Since lazy version copies cached values in any order, the
second thread may see the update to x before it sees the
update to z[ ]. In our model, this results in the execution
below.
atomic_a {  :=4; z[ ]:=1; x:=4 }
|| r:=1; atomic { q:=x };
if q ≠ 0 then r:=z[q]
[Figure: execution diagram. Transaction a: R 4, Wz[4]1, Wx4. Second thread: Rx4, Rz[4]0. Edges labelled cwr and lrw.]
Since we model volatile accesses as a singleton committed
transaction, we obtain an edge cwr to the read of x in the
second thread, which violates axiom (Observation).
References
[1] S. Dolan, K. C. Sivaramakrishnan, and A. Madhavapeddy. 2018. Bounding
data races in space and time. In PLDI, Jeffrey S. Foster and Dan
Grossman (Eds.). ACM, 242–255.
