




The Dissertation Committee for Ian Anthony Wehrman
certifies that this is the approved version of the following dissertation:
Weak-Memory Local Reasoning
Committee:
Warren A. Hunt, Jr., Supervisor




C. A. R. Hoare
Weak-Memory Local Reasoning
by
Ian Anthony Wehrman, B.S.; M.S.C.S.
Dissertation
Presented to the Faculty of the Graduate School of
The University of Texas at Austin
in Partial Fulfillment
of the Requirements
for the Degree of
Doctor of Philosophy
The University of Texas at Austin
December 2012
For my mother, Sally Marie Lorino, from whom I learned to learn.
Acknowledgments
This dissertation would not have been possible without the help of many people. I
thank Warren Hunt, who has been a wellspring of support in favor of my doctoral
degree. I am certain that I would not have completed this project without his
encouragement and generous technical, moral and financial support. I also thank
Josh Berdine, who has been a true intellectual mentor to me. If I have contributed
anything to our field of interest then for that I have him to thank. I am privileged
to have had the opportunity to work with and learn from Tony Hoare, whose in-
sight into the science of programming and the broader scientific process has been
invaluable. I am in debt to J Moore for accepting me as an advisee at a difficult
moment and for guiding me back to a successful path. I thank also Allen Emerson
and Don Fussell for their participation on my dissertation committee. Finally, I
warmly acknowledge the support of my family and friends. My mother has always
been my unflagging champion; I do not know how I could have made it without her
love through good and bad times. While in Austin, my wonderful friends—in par-
ticular Joel Brandt, Richard Chang, Benjamin Delaware, Anne Proctor and Chelsea
Weathers—have broadened my horizons and brightened my life. Their compassion,
camaraderie and loyalty mean more to me than they likely know.
Ian Anthony Wehrman





Ian Anthony Wehrman, Ph.D.
The University of Texas at Austin, 2012
Supervisors: Warren A. Hunt, Jr. and J Strother Moore
Program logics are formal logics designed to facilitate specification and correctness
reasoning for software programs. Separation logic, a recent program logic for C-like
programs, has found great success in automated verification due in large part to its
embodiment of the principle of local reasoning, in which specifications and proofs
are restricted to just those resources (variables, shared memory addresses, locks,
etc.) used by the program during execution.
Existing program logics make the strong assumption that all threads agree
on the values of shared memory at all times. But, on modern computer architec-
tures, this assumption is unsound for certain shared-memory concurrent programs:
namely, those with races. Typically races are considered to be errors, but some pro-
grams, like lock-free concurrent data structures, are necessarily racy. Verification
of these difficult programs must take into account the weaker models of memory
provided by the architectures on which they execute.
This dissertation project seeks to explicate a local reasoning principle for
x86-like architectures. The principle is demonstrated with a new program logic for
concurrent C-like programs that incorporates ideas from separation logic. The goal
of the logic is to allow verification of racy programs like concurrent data structures






List of Figures
Chapter 1 Introduction
1.1 An Illustrative Example
1.2 Project Description
1.2.1 Components and Dependencies
1.2.2 Contributions and Status
Chapter 2 Background
2.1 Mathematical Preliminaries
2.1.1 Relations
2.1.2 Functions
2.1.3 Lists
2.1.4 Universes
2.2 Program Logics
2.2.1 Hoare Logic
2.2.2 Separation Logic
2.2.3 Concurrent Program Logics
2.3 Memory Consistency Models
2.3.1 Sequential Consistency
2.3.2 The x86-TSO Memory Model
2.3.3 Data-Race-Freedom Guarantees
Chapter 3 A Sequential Program Logic
3.1 An Example Sequential Proof
3.2 Expressions and Stacks
3.3 Sequential Programs
3.4 Locality and Separation
3.4.1 Spatial Separation
3.4.2 Temporal Separation
3.4.3 Spatiotemporal Separation
3.4.4 Flushing Closure
3.5 Sequential Assertions
3.5.1 Sequential Satisfaction
3.5.2 Sequential Assertion Abbreviations
3.5.3 Separating Implications
3.5.4 Sequential Algebra
3.6 Sequential Specifications
3.6.1 Sequential Proof Theory
3.6.2 Semantics of Sequential Specifications
Chapter 4 A Concurrent Program Logic
4.1 An Example Concurrent Proof
4.2 Concurrent Programs
4.3 Separation
4.3.1 Spatial Separation
4.3.2 Temporal Separation
4.3.3 Spatiotemporal Separation
4.4 Concurrent Assertions
4.4.1 Concurrent Satisfaction
4.4.2 Additional Concurrent Assertions
4.4.3 Concurrent Algebra
4.4.4 Flushing Closure
4.5 Concurrent Specifications
4.5.1 Concurrent Proof Theory
4.5.2 Semantics of Concurrent Specifications
4.5.3 Soundness
Chapter 5 Loose Ends
5.1 Top Assertions
5.2 Additive Barrier Assertions
5.3 Permissions
5.3.1 A Use for Splitting Permissions
5.3.2 A Use for Counting Permissions
5.4 Shared Variables
5.5 Invariant Expansion
Chapter 6 Related Work
6.1 Weak-Memory Program Transformations
6.2 Program Logics for Weak-Memory Reasoning
6.3 Algebraic Models of Concurrency
Chapter 7 Conclusion
Appendix A Additional Lemmas, Proofs and Conjectures
A.1 Flushing Closure Proofs
A.2 Soundness Proofs
A.2.1 Soundness of the Axioms
A.2.2 Soundness of the Inference Rules





1.1 Dekker’s algorithm
1.2 Dependencies among the components of the project
2.1 Universal Sets
3.1 Semantics of sequential primitive commands
3.2 Semantics of sequential commands
3.3 Sequential command abbreviations
3.4 Sequential satisfaction relation
3.5 Sequential assertion semantics example
3.6 Sequential semantic equivalences
3.7 Sequential semantic entailments
3.8 Sequential axioms
3.9 Sequential inference rules
4.1 Semantics of concurrent primitive commands
4.2 Semantics of concurrent commands
4.3 Concurrent command abbreviations
4.4 Concurrent satisfaction relation
4.5 Concurrent semantic equivalences
4.6 Concurrent semantic entailments
4.7 Concurrent axioms
4.8 Concurrent inference rules




Chapter 1 Introduction
Most concurrent software verification techniques rely on a surprisingly strong as-
sumption: namely, that all processes agree on the value of shared memory at all
times. This is, of course, not generally true, but it is often a safe assumption be-
cause of implicit guarantees provided by the memory models of modern computer
architectures, which guarantee that programs without races will not observe such
inconsistencies. The soundness of most concurrent software verification techniques
therefore relies on race-freedom of the program under study. This is not considered
a major shortcoming though because races usually indicate a program error.
There are, however, useful and interesting programs for which races do not
indicate an error. For example, concurrent data structures, which optimize for speed
and throughput by using locks and memory fence instructions sparingly, are often
racy by design. Their correctness is demonstrated by relating the executions of the
comparatively daring implementation to those of its simpler, abstract counterpart.
Constructing such a relation therefore requires a technique that is tolerant of races.
But that requirement comes with a serious consequence: any technique that tolerates
races soundly must also admit that processes may observe the inconsistencies in
the value of shared memory that result from the peculiarities of the architecture’s
memory model.
The verification literature offers little insight into the problem of verifying
concurrent data structures and other inherently racy programs. This is partly because a
model of a contemporary memory adds serious complication to an already difficult
problem, but also because until recently formal specifications of common architectures’
memory models did not exist publicly. (Or, perhaps, privately.) Fortunately,
the latter problem has been alleviated with recent safety specifications for the x86,
Power and ARM memory models [38, 41]. So, for these architectures, the path to-
ward a solution to the correctness problem of concurrent data structures and other
important programs now lies essentially unimpeded.
1.1 An Illustrative Example
// initially: f0 ↦ 0 ∗ f1 ↦ 0
Process P0                      Process P1
[f0] := 1;                      [f1] := 1;
if ([f1] == 0) then             if ([f0] == 0) then
    // critical section             // critical section
Figure 1.1: Dekker’s algorithm
To motivate study of a local reasoning principle for weak memory models,
we consider the problem of reasoning about programs executing on such models.
Some of the issues involved can be illustrated by considering the small pseudocode
program in Figure 1.1.¹
The initial condition states that the two pointer variables, f0 and f1, have
distinct values, each of which is an address into shared memory at which the value
is 0. The program is a concurrent composition: process Pi, for i ∈ {0, 1}, sets its
flag by storing the value 1 at address fi, then optionally enters its critical section if
the result of loading from the address of the other flag, f1−i, is 0.²
¹ The result of dereferencing a pointer variable x is indicated by [x].
This is a simplification of Dekker’s algorithm for mutual exclusion; it should
not be possible for both processes to enter their critical sections simultaneously. An
informal correctness argument can be made that relies on a widely assumed property
of the underlying memory model, sequential consistency, defined by Lamport [30]
to mean that,
the result of any execution is the same as if the operations of all the
processors were executed in some sequential order, and the operations of
each individual processor appear in this sequence in the order specified
by its program.
The informal correctness argument for the program in Figure 1.1 is as follows.
In any execution, either process P0 sets flag f0 before process P1 sets flag f1, or
conversely, because we may assume all events are totally (sequentially) ordered. In
the first case, f0 is set before f1, which happens before P1 loads f0 because the total
order of events respects the program orders. So, if P0 sets its flag first, the load of
f0 returns 1 and P1 will not enter its critical section. A symmetric argument shows
that if P1 sets f1 before P0 sets f0, then P0 will not enter its critical section. In
both cases, at most one of the two processes may enter its critical section.³
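The case analysis above can also be checked by brute force. The following is a minimal sketch, not part of the formal development: it assumes a hypothetical encoding of each process as a list of store and load events, enumerates every total order of the four events that respects both program orders, and replays each order against a single shared memory. The names P0, P1, sc_orders and run are illustrative only.

```python
from itertools import permutations

# Each process is a list of events in program order (hypothetical encoding).
# ("store", a) writes 1 at address a; ("load", a) reads address a.
P0 = [("store", "f0"), ("load", "f1")]
P1 = [("store", "f1"), ("load", "f0")]
PROGRAMS = (P0, P1)


def sc_orders(programs):
    """Yield each total order of all events that respects every program order."""
    events = [(pid, i) for pid, prog in enumerate(programs) for i in range(len(prog))]
    for order in permutations(events):
        respects_program_order = all(
            [i for (p, i) in order if p == pid] == list(range(len(prog)))
            for pid, prog in enumerate(programs)
        )
        if respects_program_order:
            yield order


def run(order, programs):
    """Replay one total order on a single shared memory; report who enters."""
    memory = {"f0": 0, "f1": 0}
    enters = [False] * len(programs)
    for pid, i in order:
        kind, addr = programs[pid][i]
        if kind == "store":
            memory[addr] = 1
        else:
            # A process enters its critical section iff it loads 0.
            enters[pid] = memory[addr] == 0
    return tuple(enters)


outcomes = [run(order, PROGRAMS) for order in sc_orders(PROGRAMS)]
both_enter = (True, True) in outcomes
print(len(outcomes), both_enter)
```

Under these assumptions there are six admissible total orders, and in none of them do both processes enter their critical sections, which agrees with the informal argument.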
Sequential consistency is crucial to this argument. If the events are not
totally ordered, the case split is not exhaustive. If the program orders are not
included in the total order, we may not conclude that the first store precedes the
other process’ load, despite the fact that the first store precedes the second store (in
the total order) and the second store precedes the load (in the program order). In
either case, mutual exclusion may fail.
² Load and store are assumed to be atomic operations when operating on integer-valued data.
³ This program does not, of course, preclude deadlock.
Unfortunately, common multiprocessor architectures do not generally guarantee
sequential consistency, and so neither the informal argument above nor more
rigorous arguments based on formalizations of sequential consistency are valid. And
although the memory models of various architectures all seem to be strictly weaker
than sequentially consistent models, they are not individually comparable. Roughly,
the Power and ARM architectures guarantee the first part of sequential consistency
(a total order on memory events), but not the second (that this order includes the
program orders) [41, 13]. Such memory models are called weakly consistent. Con-
versely, the x86 architecture does not guarantee a total order on memory events,
but does guarantee that the observed partial order is consistent with the program
orders [38]. These memory models are said to have the total-store ordering (TSO)
property.
There are, however, conditions under which these architectures guarantee
a program’s executions to be sequentially consistent—namely, in the absence of
data races. These so-called “data-race free” (DRF) guarantees provide sufficient
conditions under which sequential consistency can be recovered and used for a cor-
rectness argument. By guarding the memory-accessing commands in the program in
Figure 1.1 with synchronization primitives like locks to eliminate races, the previous
correctness argument again becomes valid. Such a program transformation might
make sense for a conservative programmer concerned with correctness, but it does
not constitute a helpful verification strategy because the transformation does not
preserve the original program’s semantics.
A less drastic semantics-altering transformation is to add fence instructions
directly after the processes’ store operations; this too results in an implementation of
Dekker’s algorithm that preserves mutual exclusion. The fences ensure that the pro-
cesses do not attempt their loads until after their respective stores have completed.
But proving that this modified program is correct requires an argument quite dif-
ferent from the one above: the loads in this program race with their opposite stores,
and so DRF guarantees cannot be applied to recover sequential consistency. Hence,
any correctness argument for this modified program must be cognizant of the pe-
culiarities of the underlying memory model. Indeed, correctness arguments for an
x86-like memory model are completely different from correctness arguments for an
ARM-like memory model.
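The effect of the fences can be explored on a toy machine. The sketch below is an illustration under stated assumptions, not the machine model developed later in this dissertation: each process is assumed to own a FIFO store buffer, a load reads the newest matching entry in the process's own buffer and otherwise reads shared memory, buffered writes are flushed to memory at arbitrary times, and a fence can complete only when the issuing process's buffer is empty. The names explore, replace, UNFENCED and FENCED are hypothetical.

```python
def replace(tup, index, value):
    """Return a copy of tup with position index set to value."""
    return tup[:index] + (value,) + tup[index + 1:]


def explore(programs):
    """Return every reachable (P0 enters, P1 enters) outcome of the toy machine."""
    # state = (program counters, store buffers, memory, values loaded)
    start = ((0, 0), ((), ()), (("f0", 0), ("f1", 0)), (None, None))
    seen, stack, outcomes = set(), [start], set()
    while stack:
        state = stack.pop()
        if state in seen:
            continue
        seen.add(state)
        pcs, bufs, mem, loaded = state
        if all(pcs[p] == len(programs[p]) for p in (0, 1)):
            # Both loads have happened; a process enters iff it read 0.
            outcomes.add((loaded[0] == 0, loaded[1] == 0))
        for p in (0, 1):
            if bufs[p]:
                # Flush the oldest buffered write of process p to shared memory.
                addr, val = bufs[p][0]
                new_mem = dict(mem)
                new_mem[addr] = val
                stack.append((pcs, replace(bufs, p, bufs[p][1:]),
                              tuple(sorted(new_mem.items())), loaded))
            if pcs[p] == len(programs[p]):
                continue
            instr = programs[p][pcs[p]]
            next_pcs = replace(pcs, p, pcs[p] + 1)
            if instr[0] == "store":
                # Stores are buffered rather than written to memory directly.
                buf = bufs[p] + ((instr[1], 1),)
                stack.append((next_pcs, replace(bufs, p, buf), mem, loaded))
            elif instr[0] == "load":
                # Read the newest own buffered write to the address, else memory.
                own = [v for (a, v) in bufs[p] if a == instr[1]]
                val = own[-1] if own else dict(mem)[instr[1]]
                stack.append((next_pcs, bufs, mem, replace(loaded, p, val)))
            elif instr[0] == "fence" and not bufs[p]:
                # A fence can only complete once the process's buffer is empty.
                stack.append((next_pcs, bufs, mem, loaded))
    return outcomes


UNFENCED = ([("store", "f0"), ("load", "f1")],
            [("store", "f1"), ("load", "f0")])
FENCED = ([("store", "f0"), ("fence",), ("load", "f1")],
          [("store", "f1"), ("fence",), ("load", "f0")])

print((True, True) in explore(UNFENCED), (True, True) in explore(FENCED))
```

In this simplified model the unfenced program admits an execution in which both processes enter, because both stores may still sit in their buffers when the loads execute; with a fence after each store, no such execution is reachable. The check is exhaustive only for this toy machine and makes no claim about other memory models.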
1.2 Project Description
This dissertation project does not, by far, solve the verification problem for racy concurrent
programs executing on weak memory models generally. Its more modest
goal is instead to step toward this by developing a program logic for the verification,
by proof, of partial correctness properties of certain programs that interact with a
particular weak model of memory. Implicit in this is the goal to explicate a local
reasoning principle for a weak memory model. Traditional correctness reasoning
and specification is global: the entire system must be accounted for, which makes
scaling to large programs difficult. Local reasoning dictates instead that reasoning
and specification be restricted to just those resources—program variables, shared-
memory addresses, locks, etc.—that are accessed or modified by the program during
execution.
The particular programs studied in this dissertation are structured, C-like
programs with pointers and pointer arithmetic, and concurrency constructs, includ-
ing memory fences. The language was chosen based on the level of detail with
which racy concurrent programs are typically described in the literature: i.e., with
structured imperative control-flow constructs like loops and if-statements, but with
memory fences and locking explicitly specified.
The particular weak memory model explored in this dissertation is x86-like;
in particular, based on the x86-TSO memory model as defined by Owens, Sarkar
and Sewell [38]. This model was chosen for a variety of reasons. First, x86 multipro-
cessors are now in extremely widespread use, commonly found in servers, desktops,
laptops, tablets, smartphones and many other small computing appliances. Sec-
ond, unlike most other modern multiprocessor architectures (e.g., ARM and Power
[2, 50]), the memory model is both well understood and has, thanks to Owens et al.,
a clear, simple and formal specification.
In this project, local reasoning is explored in the context of an x86-like memory
model in particular by developing two program logics that embody such a
principle: a logic for purely sequential programs that execute on a single processor;
and a more general logic for concurrent programs that execute on multiple proces-
sors in parallel. The sequential fragment of this logic is loosely based on separation
logic, a recent Hoare-style logic which has spurred a revolution in high-level pro-
gram reasoning due to the simplicity with which it handles pointers using a local
reasoning principle. The concurrent extension of the logic is similarly based on a
concurrent extension to separation logic.
1.2.1 Components and Dependencies
The various components of the program logics described in this dissertation and their
explicit dependencies are pictured in the (transitively reduced) graph of Figure 1.2.
The components are represented by shapes that indicate their approximate type:
semantic objects by octagons; formal languages by squares; semantic relationships
by ovals; deduction systems by hexagons; and key properties by trapezoids.
The memory model is shown in a double-lined octagon, which reflects the
assumption in this project that it is complete and correct, and is not further modified
from its operational definition in [38], which is summarized in Section 2.3.
The machine model—in particular, the notion of machine state—depends on,
but is distinct from, the memory model. For example, the memory model dictates
that each processor has a private set of named registers, whereas in the machine






Figure 1.2: Dependencies among the components of the project
also take the liberty in the machine model to relax other restrictions on the notion
of state from the memory model, such as the requirement that writes buffered by a
single processor are totally ordered.
The programming language is a structured C-like language with concurrent
composition. It does not explicitly depend on any other components of the project.
The programming language semantics relates the programming language to the ma-
chine model, and hence depends on them both. The uniprocessor machine model and
sequential programming language are described in Section 3.3; the multiprocessor
machine model and concurrent programming language are described in Section 4.2.
The assertion language also does not explicitly depend on any other components
of the project.⁴ The assertion language semantics associates the assertion
language to sets of machine states (as defined by the machine model) with a par-
ticular structure. Assertions about uniprocessor machine states are described in
Section 3.5; assertions about multiprocessor machine states are described in Sec-
tion 4.4.
⁴ Implicitly, of course, it depends significantly on the memory and machine models.
Ideally there would also be a proof theory of assertions and a corresponding
soundness theorem. We have chosen not to focus on a proof theory of assertions
in this project, but will indicate some semantic implications and equivalences that
would be relevant to that end in Section 3.5.4 for the uniprocessor case, and in
Section 4.4.3 for the multiprocessor case.
The specification language encompasses the programming and assertion languages,
and its semantics is given in terms of the semantics of programs and assertions.
The proof theory of specifications relies on the existence of a suitable proof
theory of assertions for determining entailments. The soundness of the specification
logic relies on the soundness of the proof theory of assertions as well as the semantics
of programs and assertions. Specifications of sequential programs are described in
Section 3.6; specifications of concurrent programs are described in Section 4.5.
1.2.2 Contributions and Status
The significant contributions of this dissertation project are as follows:
1. An x86-like operational semantics based on state transitions for a C-like pro-
gramming language with pointers and pointer arithmetic, memory fences and
concurrency constructs. The model is novel, and is expressive enough to de-
scribe both processor-parallel and interleaved thread executions.
2. An assertion language for describing, naturally and concisely, x86-like system
configurations, and a formal semantics in terms of the aforementioned states.
Both the language and the model are novel.
3. A program specification logic—i.e., a formal language of specifications and a
proof system—for describing and deducing partial correctness of the afore-
mentioned C-like programs in terms of the aforementioned assertions, as well
as a formal semantics of specifications that relates the x86-like execution of
C-like programs among states described by assertions. The specification logic
additionally embodies an x86-specific principle of local reasoning, which allows
proofs to be constructed by describing the interaction between the program
and the small portion of system resources relevant to its behavior; and subse-
quently generalizing the proof to describe the interaction of the program with
more complete system descriptions. This is the first known local reasoning
principle for a weak memory model.
The languages, models and deduction system described above are completely
and formally defined, having undergone hundreds of revisions. There are however
two major components left unfinished:
1. There is no proof system for the assertion language. Omitting this compo-
nent of the project was an early, intentional decision for two reasons. First, a
proof system is necessary for practical application of the specification logic—
and in particular automation of the specification logic—but is not critical for
the study of the specification language. In practice, syntactic entailment is an
important approximation of semantic entailment, but in principle
simply having a well defined notion of semantic entailment is sufficient. Sec-
ond, it is hypothesized that the traditional inference rules of first-order logic
are sound for the assertion language defined here—as is the case for related
theories of similar logics—and also that there is no complete set of rules—as
this language encompasses, e.g., arithmetic. The task of finding a suitable
proof system for the assertion language can thus be seen as a purely practical issue.
2. The soundness proof for the specification logic is incomplete. Although the
logic is hypothesized to be sound w.r.t. the model described later in the doc-
ument, this is of course a serious drawback. Most of the relevant lemmas
have been proved previously with various, preliminary versions of the model.
There are no known problems with the model that are blocking a soundness
proof beyond sheer size of the proof as derived from the complexity of the
model. Indeed, the proofs are mostly straightforward inductions. Throughout
the document, key intermediate properties (marked as propositions) that are
expected to hold are noted along with, in some cases, proof sketches.
Finally, beyond these technical omissions, it must be admitted that this
dissertation gives little indication of how proofs ought to proceed in the logic; only
a handful of small examples will be given later on. This is because the logic has
evolved with soundness with respect to the x86 memory model foremost in mind, instead
of as a logical system of interest independent of its potential models. Although the
examples presented do indicate that the logic is capable of highly non-trivial program
reasoning, the extent of its capability is not yet well understood, and remains as




Chapter 2 Background
This chapter describes background material necessary to understand the technical
content of this dissertation, beginning with mathematical preliminaries and frequently
used notation in Section 2.1, followed by overviews of research on program
logics in Section 2.2 and memory consistency models in Section 2.3.
2.1 Mathematical Preliminaries
Terms are defined throughout this document with the following meta-notation:
object =df definition.
We additionally use the following meta-notation for defining predicates:
predicate ≡df definition.
For example, for a set S and object o ∉ S we define
\[ S_o =_{\mathrm{df}} S \uplus \{o\} \]
as the extension of set S by object o. Using this notation, a domain S can be lifted
to its optional domain by writing S⊥ as shorthand for S ⊎ {⊥}, assuming ⊥ ∉ S.
2.1.1 Relations
For any set A, we write IdA for the identity relation on A. For a binary relation R
on A and n ∈ N, we write
n



























2.1.2 Functions
For a (possibly partial) function f : A ⇀ B and a ∈ A and b ∈ B, we write f[a ↦ b]
for the updated function:
\[ f[a \mapsto b] =_{\mathrm{df}} \lambda x .\ \begin{cases} b & \text{if } x = a \\ f(x) & \text{otherwise.} \end{cases} \]
We write f(a) = ⊥ if the partial function f is not defined at point a, i.e.
if a ∉ dom(f), and def(f(a)) otherwise. The everywhere-undefined function is
indicated by ∅. The partial sum f ⊎ g of partial functions f and g is defined, when
dom(f) ∩ dom(g) = ∅, as follows:
\[ f \uplus g =_{\mathrm{df}} \lambda x .\ \begin{cases} f(x) & \text{if } x \in \mathrm{dom}(f) \\ g(x) & \text{else if } x \in \mathrm{dom}(g) \\ \bot & \text{otherwise.} \end{cases} \]
When convenient, we also write f_a as shorthand for f(a). a ↦ b is shorthand
for the unique partial function f such that f(a) = b and that is undefined elsewhere. For
A′ ⊆ A, f|A′ is the restriction of f to domain A′.
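For (possibly partial) functions f, g : A ⇀ B, we write f\\g for the result
of overriding f with g:
\[ f \mathbin{\backslash\backslash} g =_{\mathrm{df}} \lambda x .\ \begin{cases} g(x) & \text{if } x \in \mathrm{dom}(g) \\ f(x) & \text{otherwise.} \end{cases} \]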
An obvious property is that, if g(a) = b, for some b ∈ B, then also (f\\g)(a) = b.
Some additional properties follow in Proposition 1.
Proposition 1. Let f, g, h : A⇀B.
1. f\\∅ = ∅\\f = f.
2. f\\(g\\h) = (f\\g)\\h.
3. dom(f\\g) = dom(f) ∪ dom(g).
4. If dom(f) ∩ dom(g) = ∅ then f\\g = f ⊎ g, and hence f\\g = g\\f.
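The operations on partial functions can be made concrete with finite maps. The following is a small sketch, assuming partial functions with finite domains are represented as Python dictionaries, with a missing key standing for ⊥. The function names update, partial_sum and override are hypothetical, and the checks at the end only spot-check the four parts of Proposition 1 on sample inputs; they do not prove it.

```python
def update(f, a, b):
    # Function update: agrees with f everywhere except at a, where it yields b.
    g = dict(f)
    g[a] = b
    return g


def partial_sum(f, g):
    # Partial sum of f and g, defined only when their domains are disjoint.
    if f.keys() & g.keys():
        raise ValueError("partial sum is undefined: domains overlap")
    return {**f, **g}


def override(f, g):
    # Overriding f with g: g takes precedence wherever it is defined.
    return {**f, **g}


f = {1: "a", 2: "b"}
g = {2: "c", 3: "d"}
h = {3: "e", 4: "f"}
k = {5: "z"}  # domain disjoint from dom(f)

# Part 1: the everywhere-undefined function is a unit for overriding.
assert override(f, {}) == override({}, f) == f
# Part 2: overriding is associative.
assert override(f, override(g, h)) == override(override(f, g), h)
# Part 3: the domain of an override is the union of the domains.
assert set(override(f, g)) == set(f) | set(g)
# Part 4: on disjoint domains, overriding coincides with the partial sum.
assert override(f, k) == partial_sum(f, k) == override(k, f)
```

Note that overriding is not commutative in general: with the sample maps above, override(f, g) maps 2 to "c" while override(g, f) maps 2 to "b".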
2.1.3 Lists
The empty list is denoted by ε, the literal list by [o, . . . , o′], list construction by o :: l,
and list concatenation by l++ l′, for objects o and lists l. We write T list to indicate
lists of elements drawn from the set T , and E for the function (λx . ε).
For a list l : (A × B) list, we write l for the corresponding partial lookup
function:
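\[ l =_{\mathrm{df}} \lambda x .\ \begin{cases} b & \text{if } l = l' \mathbin{+\!\!+} [(x, b)] \\ l'(x) & \text{if } l = l' \mathbin{+\!\!+} [(y, b)] \text{ with } x \neq y \\ \bot & \text{otherwise.} \end{cases} \]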
For A′ ⊆ A, l|A′ is the sublist restriction of l to domain A′.
For convenience, we lift these function definitions pointwise to sets of lists.
For example, for a set L of lists, a ::L =df {a :: l | l ∈ L}.
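The list notation can be illustrated with a short sketch. It assumes that lists of pairs are represented as Python sequences of 2-tuples and that lookup functions are represented as dictionaries, with a missing key standing for ⊥. The names lookup, restrict and cons_all are hypothetical.

```python
def lookup(pairs):
    # Lookup function of a list of pairs. Following the definition, the
    # rightmost pair for a given key determines the result.
    table = {}
    for key, value in pairs:
        table[key] = value  # a later pair shadows any earlier pair for key
    return table


def restrict(pairs, keys):
    # Sublist restriction: keep, in order, the pairs whose key lies in keys.
    return [(a, b) for (a, b) in pairs if a in keys]


def cons_all(a, lists):
    # Pointwise lifting of list construction to a set of lists (as tuples).
    return {(a,) + l for l in lists}


l = [("x", 1), ("y", 2), ("x", 3)]
print(lookup(l))
print(restrict(l, {"x"}))
print(cons_all(0, {(1,), (2, 3)}))
```

Here the lookup of "x" yields 3 rather than 1, because the pair ("x", 3) occurs further to the right in the list.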
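The set of interleavings of lists m, n, written m ⊎ n, is defined by recursion
on the structure of m and n:
\[ \begin{aligned} m \uplus \varepsilon &=_{\mathrm{df}} \{m\} \\ \varepsilon \uplus n &=_{\mathrm{df}} \{n\} \\ (a :: m') \uplus (b :: n') &=_{\mathrm{df}} a :: (m' \uplus (b :: n')) \;\cup\; b :: ((a :: m') \uplus n'). \end{aligned} \]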
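The recursion can be transcribed almost literally. The following sketch assumes lists are represented as Python tuples so that sets of lists can be formed; the name interleavings is hypothetical.

```python
def interleavings(m, n):
    # Set of all interleavings of the tuples m and n, by recursion on both.
    if not n:
        return {tuple(m)}
    if not m:
        return {tuple(n)}
    a, b = m[0], n[0]
    # Either the head of m comes first, or the head of n comes first.
    return ({(a,) + l for l in interleavings(m[1:], n)}
            | {(b,) + l for l in interleavings(m, n[1:])})


print(sorted(interleavings((1, 2), (3,))))
```

Each interleaving preserves the relative order of the elements of m and of n, and the concatenation of m and n is always among the results.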
We now define a subset of the interleavings of lists of pairs from A × B that
plays an analogous role to function overriding. The result of overriding a list m with
another n, written m\\n, is defined as follows:
l ∈ m\\n ≡df l = m\\n.
As with function overriding, list overriding has the property that, if n(a) = b, for
some b ∈ B, and l ∈ m\\n, then l(a) = b. Because all elements of m\\n have the
same lookup function, we may safely extend the list lookup notation as follows:
m\\n =df λx . l(x),
for arbitrary l ∈ m\\n.
As for function overriding, list overriding has the basic property that, if a ∈
dom(n), then n(a) = (m\\n)(a). It also has the following analogous properties,
as noted by Proposition 2.
Proposition 2. Let l, m, n : (A × B) list.
1. l\\∅ = ∅\\l = l.
2. l\\(m\\n) = (l\\m)\\n.
3. dom(m\\n) = dom(m) ∪ dom(n).
4. If dom(m) ∩ dom(n) = ∅ then m\\n = m ⊎ n, and hence m\\n = n\\m.
5. (m++n) ∈ (m\\n) ⊆ (m ⊎ n).
2.1.4 Universes
The various universal sets are declared and in some cases defined in Figure 2.1. Note
that, in the case of memory locations (i.e., addresses into memory) and processor
identifiers, 0 is excluded. Also note that we shall later use the single set of identifiers
I to represent both program variables and logical variables.
Set          Description
I            Identifiers
V = Z        Values
L ⊆ N+       Memory locations
P ⊆ N+       Processor identifiers
Figure 2.1: Universal Sets
2.2 Program Logics
Given that the motivating problem is to reason about concurrent programs exe-
cuting on a particular memory model, how might one approach the correctness of
the program in Figure 1.1 and others like it such as concurrent data structures?
One solution—perhaps, for now, the best—is to reason directly about the program
semantics in the following way:
1. formalize the semantics of the programming language using a general purpose
logic (e.g., first-order logic, higher-order logic, type theory);
2. prove that the semantics agrees with the memory model;
3. characterize the intended program property using the general logic;
4. prove using the general logic that the semantic object which represents the
program at hand possesses this property.
This is a perfectly reasonable strategy and, by using a proof assistant for a selected
general purpose logic (e.g., ACL2 [29], Isabelle/HOL [33], or Coq [5]), is within the
realm of feasibility for many programs and some experts.1 But, due to the generality
of the logic and complexity of the semantics of programs under study, one expects
such formalizations and proofs to be exceptionally complex. And although experts
are certainly able to develop methodologies and abstractions to tame this complexity,
the desire to reason at a higher and more intuitive level is manifest.
This is just the purpose of a program logic, which allows high-level formal
reasoning that codifies the programmer’s intuition about the behavior and correct-
ness of the program under study. Ideally, the program logic incorporates those
methodologies and abstractions that have been most useful to expert users reason-
ing directly about semantic objects in more general logics.
The situation is analogous to the use of temporal logics for studying reactive
systems. Though it is technically possible to reason about such systems using
1As an alternative to defining a high-level programming language semantics that comports with
a high-level specification of a memory model, it is also possible to formally define the machine
that implements the memory directly, and give the semantics of programs in terms of possible
executions of this machine. This tack was taken, e.g., in work on reasoning about the behavior of
multi-threaded Java programs w.r.t. a detailed model of the Java Virtual Machine [32].
a general-purpose logic, both human reasoning (e.g., Unity [12]) and automation
(e.g., model checking [14]) were facilitated by specialized logics.
2.2.1 Hoare Logic
Hoare introduced the first program logic for an Algol-like language in his seminal
1969 paper [22]. A program c is specified with a pair of assertions P and Q, written in
first-order logic, that describe pre- and post-execution system states, respectively.
Hoare described two related logics, which differ in the style of specification: in
the total correctness logic specifications are written 〈P 〉 c 〈Q〉 and require program
termination as a necessary condition for satisfaction; in the partial correctness logic
specifications are written {P} c {Q} and allow divergent executions of the program
to satisfy any specification. In the sequel, we focus on logics of partial correctness.
The axioms and inference rules are either structural, directed by the program
syntax, or logical, directed by the logical operations of the assertion language. The
choice rule, for reasoning about the nondeterministic choice command c + c′, is an
example of a structural rule:
\[
\frac{\{P\}\; c\; \{Q\} \quad\text{and}\quad \{P\}\; c'\; \{Q\}}{\{P\}\; c + c'\; \{Q\}}\;(\textsc{choice})
\]
According to this rule, in order to prove a specification {P} c+ c′ {Q}, it suffices
to prove the same specification of each of the constituent commands: {P} c {Q}
and {P} c′ {Q}. This is because the nondeterministic choice command may exe-
cute either c or c′, but not both; and so if each constituent command satisfies the
specification, then so must the composed command.
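The semantic argument behind the choice rule can be checked by brute force on a toy model. The sketch below is an illustration under assumptions made here: a state is a single integer, a command maps a state to the set of states it may terminate in, and the names `holds` and `choice` are hypothetical.

```python
def holds(pre, cmd, post, states):
    """Partial correctness of {pre} cmd {post} over a finite set of start states.

    A command maps a state to the set of its possible final states; a
    divergent run contributes no final state and so satisfies any post.
    """
    return all(post(t) for s in states if pre(s) for t in cmd(s))

def choice(c1, c2):
    """Nondeterministic choice: the result of running c1 or c2, but not both."""
    return lambda s: c1(s) | c2(s)

states = range(-5, 6)
inc = lambda s: {s + 1}       # models x := x + 1
dbl = lambda s: {2 * s}       # models x := 2 * x
pre = lambda s: s >= 1
post = lambda s: s >= 2

# Both premises hold, and so does the conclusion of the rule.
premises = holds(pre, inc, post, states) and holds(pre, dbl, post, states)
conclusion = holds(pre, choice(inc, dbl), post, states)
```

Because the final states of the composed command are exactly the union of the final states of its constituents, any post-condition established for both constituents is established for their choice.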
The disj rule, for reasoning about specifications in which the primary con-
nective of the pre-condition is a logical disjunction, is an example of a logical rule:
\[
\frac{\{P\}\; c\; \{Q\} \quad\text{and}\quad \{P'\}\; c\; \{Q\}}{\{P \vee P'\}\; c\; \{Q\}}\;(\textsc{disj})
\]
According to this rule, to prove a specification of an arbitrary command in which
the pre-condition is a disjunction, it suffices to prove the specifications in which the
pre-conditions consist of each of the disjuncts. Intuitively, regardless of whether the
command is executed in a state that satisfies P or P ′ it shall terminate in a state
that satisfies Q, or diverge.
The structural and logical rules of Hoare logic make proof construction par-
tially mechanical. But determining whether a program meets its specifications re-
mains generally undecidable due to the expressiveness of the assertion language. For
example, the rule of consequence allows for the relaxation of specifications by the
arbitrary strengthening of pre-conditions and weakening of post-conditions:
\[
\frac{P' \Rightarrow P \qquad \{P\}\; c\; \{Q\} \qquad Q \Rightarrow Q'}{\{P'\}\; c\; \{Q'\}}\;(\textsc{cons})
\]
For correct application of the rule, validity of the first-order logic implications must
be proved. But first-order validity is, of course, undecidable in general, so while
Hoare logic does ease some of the pain of proof construction, it is not a panacea.
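As a small worked instance of the rule of consequence, consider the assignment x := x + 1. The command and the four assertions below are chosen here purely for illustration and do not come from the text.

```latex
\[
\frac{x = 3 \Rightarrow x \geq 0
      \qquad \{x \geq 0\}\; x := x + 1\; \{x \geq 1\}
      \qquad x \geq 1 \Rightarrow x \neq 0}
     {\{x = 3\}\; x := x + 1\; \{x \neq 0\}}\;(\textsc{cons})
\]
```

Here the pre-condition has been strengthened from x ≥ 0 to x = 3 and the post-condition weakened from x ≥ 1 to x ≠ 0. Both side implications are simple facts of integer arithmetic; in general, it is obligations of this kind that may be undecidable.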
2.2.2 Separation Logic
As successful as Hoare Logic has been, a significant drawback is its inability to
soundly cope with pointer variables and dynamically allocated memory, thus severely
complicating its application to low-level systems programs. To illustrate the prob-
lem, consider a simple program that updates the value at an address x by writing
to a dereferenced pointer: [x] := 2. To be clear, this program does not update the
value x, but instead the value in memory at the address whose value is x. If we
write heap(x) = 2 to mean that the value in memory address x is equal to 2, then
the following specification is clearly true:
{heap(x) = 1} [x] := 2 {heap(x) = 2} .
We might then wish to prove a stronger specification, which describes the behavior
of this program in a larger memory, with two allocated addresses x and y:
{heap(x) = 1 ∧ heap(y) = 1} [x] := 2 {heap(x) = 2 ∧ heap(y) = 1} .
This is easily derivable in Hoare logic using the rule of constancy, which allows an
arbitrary2 assertion F—called a frame—to be conjoined uniformly onto the pre- and
post-conditions of a derived specification:
\[
\frac{\{P\}\; c\; \{Q\}}{\{P \wedge F\}\; c\; \{Q \wedge F\}}\;(\textsc{const})
\]
The problem, of course, is that this strengthened specification is not true. In case
the pointer variables x and y are aliased—i.e., if x = y—after the pointer update it
shall certainly not be the case that heap(y) = 1, for the update to the memory at x
also implicitly updated the memory at y.
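The aliasing problem can be reproduced concretely. The following sketch models the heap as a Python dictionary from addresses to values; the function names and the concrete addresses 10 and 20 are assumptions made for illustration.

```python
def store(heap, addr, value):
    """Model the command [addr] := value; fails on an unallocated address."""
    if addr not in heap:
        raise KeyError("memory error: address %r is not allocated" % addr)
    updated = dict(heap)
    updated[addr] = value
    return updated

def pre(heap, x, y):
    """heap(x) = 1 and heap(y) = 1"""
    return heap.get(x) == 1 and heap.get(y) == 1

def post(heap, x, y):
    """heap(x) = 2 and heap(y) = 1"""
    return heap.get(x) == 2 and heap.get(y) == 1

# Case 1: x and y denote distinct addresses; the specification holds.
distinct_ok = pre({10: 1, 20: 1}, 10, 20) and post(store({10: 1, 20: 1}, 10, 2), 10, 20)

# Case 2: x and y are aliased (x = y = 10); the pre-condition still holds,
# but the post-condition fails because heap(y) is now 2.
aliased_pre = pre({10: 1}, 10, 10)
aliased_post = post(store({10: 1}, 10, 2), 10, 10)
```

The second case is a counterexample to the strengthened specification: the pre-condition is satisfied, the command terminates, and the post-condition is false.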
In the presence of pointer variables, the rule of constancy—which is crucial
to the scalability of Hoare logic—is only sound in case the memory locations referred
to by the frame are disjoint or separate from those in the footprint of the command;
i.e., the part of the system state referenced by the program during its execution. To
maintain soundness, therefore, proofs in Hoare logic about such programs require
a variety of ad hoc extensions and onerous side conditions. (See, e.g., Richard
Bornat’s 2000 paper [6], which incorporates many sophisticated ideas into Hoare
logic.) But after more than forty years of research, a major breakthrough finally
2Actually, there is a simple syntactic restriction on F : namely, that fv(F ) ∩ mod(c) = ∅; i.e.,
that the free variables of the frame are not modified by the command. This holds in the example
because no variables are modified by dereferencing assignment.
came with the invention of separation logic [47], generally credited to John Reynolds
and Peter O’Hearn. Separation Logic is a Hoare-style program logic insofar as
the axioms and inference rules are similar; the crucial difference between it and
Hoare logic is the choice of assertion language. Instead of the classical first-order
logic assertions used by Hoare logic specifications, separation logic makes use of a
different logic—a theory of the logic of bunched implications pioneered by O’Hearn,
Pym and others [35, 45]—for describing system states with a notion of a heap—a
finite partial function, into which pointers point, that represents part of memory—
and disjointness of said states. The salient formulas that capture these notions are
the points-to assertion, ℓ ↦ v, and the separating conjunction, P ∗ Q. Models of
the points-to assertion ℓ ↦ v are heaps with exactly one address allocated, given by ℓ,
and with value v stored at that address: i.e., h |= ℓ ↦ v iff h = {(ℓ, v)}. Models of
the separating conjunction P ∗ Q are heaps that can be partitioned by address into
two subheaps, one of which models the formula P and the other Q: i.e., h |= P ∗ Q
iff h = hP ⊎ hQ such that hP |= P and hQ |= Q.
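The two satisfaction clauses above can be sketched directly. In this minimal illustration, heaps are Python dictionaries and assertions are predicates on heaps; the names `points_to` and `sep` are hypothetical, and the exhaustive search over partitions is only practical for very small heaps.

```python
from itertools import combinations

def points_to(loc, val):
    """h satisfies (loc points to val) iff h is exactly the heap {loc: val}."""
    return lambda h: h == {loc: val}

def sep(p, q):
    """h satisfies P * Q iff h splits by address into two disjoint subheaps,
    one satisfying P and the other satisfying Q."""
    def satisfied(h):
        addrs = list(h)
        for k in range(len(addrs) + 1):
            for chosen in combinations(addrs, k):
                hp = {a: h[a] for a in chosen}
                hq = {a: h[a] for a in h if a not in chosen}
                if p(hp) and q(hq):
                    return True
        return False
    return satisfied

two_cells = sep(points_to(1, 5), points_to(2, 6))
same_cell_twice = sep(points_to(1, 5), points_to(1, 5))
```

Note that `same_cell_twice` has no models at all: a single cell cannot be placed in both halves of a partition, which is how the separating conjunction rules out aliasing.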
Besides soundness w.r.t. a sequential C-like language with pointers and dy-
namic memory management, separation logic is important because it embodies the
principle of local reasoning. Unlike with Hoare logic, reasoning may be restricted
to a program component’s footprint, from which one may generalize to complete
system states. O’Hearn, Reynolds and Yang informally describe local reasoning in
the context of sequential pointer programs as follows [36]:
To understand how a program works, it should be possible for reason-
ing and specification to be confined to the [memory addresses] that the
program actually accesses. The value of any other [addresses] will auto-
matically remain unchanged.
Separation logic embodies the principle of local reasoning with its small ax-
ioms and its frame rule. The small axioms describe the programming language’s
primitive commands, specifying only their respective footprints. For example, the
small axiom3 for the pointer assignment command (i.e., store command) requires
with its pre-condition only that the relevant location be allocated with some value,4
and the resulting post-condition describes only the result of updating this location:
{e ↦ −} [e] := e′ {e ↦ e′} (store)
The local specification can then be generalized to a global specification using the
frame rule:5
\[
\frac{\{P\}\; c\; \{Q\}}{\{P \ast F\}\; c\; \{Q \ast F\}}\;(\textsc{frame})
\]
The frame rule of separation logic replaces the rule of constancy of Hoare logic.
It can be used to soundly derive the desired specification for the aforementioned
program as follows:
{x ↦ −} [x] := 2 {x ↦ 2}   (store)
{x ↦ 1} [x] := 2 {x ↦ 2}   (cons)
{x ↦ 1 ∗ y ↦ 1} [x] := 2 {x ↦ 2 ∗ y ↦ 1}   (frame)
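The last step of this derivation can be mirrored on concrete heaps. The sketch below is an illustration under assumptions made here: heaps are dictionaries, the addresses 10 and 20 stand in for x and y, and `disjoint_union` is a hypothetical helper that plays the role of combining a footprint with a frame.

```python
def store(heap, addr, value):
    """Model [addr] := value; fails on an unallocated address."""
    if addr not in heap:
        raise KeyError("memory error: address %r is not allocated" % addr)
    return {**heap, addr: value}

def disjoint_union(h1, h2):
    """Combine two heaps, insisting that their domains are disjoint."""
    if h1.keys() & h2.keys():
        raise ValueError("heaps overlap, so no separated state exists")
    return {**h1, **h2}

x, y = 10, 20
footprint = {x: 1}                    # a model of the local pre-condition
frame = {y: 1}                        # a model of the frame
local_post = store(footprint, x, 2)   # running on the footprint alone
global_post = store(disjoint_union(footprint, frame), x, 2)

# Running in the larger state changes the footprint and leaves the frame alone.
frame_preserved = global_post == disjoint_union(local_post, frame)
```

If x and y were aliased, `disjoint_union` would raise an error, reflecting the fact that the separated pre-condition has no models in that case, so the problematic aliased state from the rule of constancy never needs to be considered.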
Besides having been used to give human-readable proofs to a variety of al-
gorithms that manipulate complex pointer-based data structures (e.g., the Schorr-
Waite graph-marking algorithm [60]), useful fragments of separation logic have been
automated as part of program verifiers and static analyses, which have been success-
fully applied to programs with tens of thousands of lines of source code [3, 61, 4].
3Actually, an axiom schema parametrized by the expressions e and e′.
4e ↦ − is shorthand for ∃v . e ↦ v.
5As in the rule of constancy, the frame rule also requires that fv(F ) ∩ mod(c) = ∅.
2.2.3 Concurrent Program Logics
Research into logics for concurrent programs has progressed independently from
research into logics for increasingly expressive sequential programs. Major efforts
are summarized below.
The Owicki-Gries Logic Early attempts at handling concurrency within a pro-
gram logic culminated in an extension of Hoare logic by Owicki and Gries [39],
which adds a rule for parallel composition of two program components with the
cumbersome side condition that every assertion in one component’s specification
be invariant under the operation of each atomic command executed by the other
component. While elegant in its simplicity, the side condition restricts the logic’s
usefulness. First, every application of the parallel composition rule requires a num-
ber of invariant proofs quadratic in the size of the components, making scalability
difficult. Second, the side condition creates dependencies on the proofs of the com-
ponent specifications, not just on the specifications themselves. This effectively
rules out independent proof construction for the individual components and yields
a highly non-compositional logic.
Rely/Guarantee Some relief from these problems was provided by Jones in his
rely/guarantee logic [28], in which Hoare-style specifications are augmented with
two additional assertions: the rely condition, which bounds the interference from
the environment that a component can tolerate while still meeting its pre- and
post-specifications; and the guarantee condition, which bounds the interference the
program itself may inflict upon the environment. Application of the rely/guarantee
parallel composition rule requires proofs that each process’ rely condition subsumes
the others’ guarantee conditions.
While a considerable improvement over the Owicki-Gries logic (subsumption
proofs linear in the number of components versus quadratic in their size, and
dependence only among components’ specifications instead of their proofs), rely/guarantee
still has shortcomings. The logic cannot be considered truly compositional because
each component may be specified with a variety of interference conditions, and it is
not clear which are appropriate until attempting the parallel composition. Further-
more, it can be laborious to specify sufficiently strong guarantee conditions. For
example, the guarantee condition for a component with three variables (x, y, z) that
updates only one (x := x + 1) must describe not just the relevant variable change
(x′ = x + 1), but also that all others remain the same (. . . ∧ y′ = y ∧ z′ = z)—the
latter condition being difficult because it suggests a sort of quantification over
variable names not possible in first-order logic, instead requiring an explicit enumeration
of state variables.
RG-Sep A partial solution to this problem of specification has recently appeared
via separation logic. A new concurrent program logic from Vafeiadis and Parkinson,
dubbed RG-Sep [55] (“a marriage of rely/guarantee and separation logic”), mates
a generalization of separation logic’s assertion language and frame rules with the
rely/guarantee logic. Besides inheriting the local reasoning features of separation
logic, RG-Sep eases the pain of specifying environmental interference by semantically
partitioning the system state into private and shared parts. A new class of
boxed assertions \(\boxed{P}\) is used to describe shared state, and the logical operations are
adjusted so that, e.g., a separating conjunction of boxed assertions \(\boxed{P} \ast \boxed{Q}\) allows
their footprints to overlap.6 The proof rules are then modified so that, e.g., parallel
composition requires only that each component be tolerant of environmental
interference to shared state, not private state. This could be used in the previous
example to obviate the explicit enumeration needed to describe invariance of the
irrelevant state variables.
Embodied within RG-Sep (along with its successors [17]) are some of the most
6An interesting consequence is that \(\boxed{P} \ast \boxed{Q}\) is logically equivalent to \(\boxed{P \wedge Q}\).
advanced ideas about high-level reasoning techniques for fine-grained concurrent
shared-memory programs, including local reasoning. Vafeiadis’ 2008 dissertation [53]
includes correctness proofs for a variety of complex, racy concurrent data structures
using the logic. RG-Sep is by no means simple, but does yield relatively concise,
readable proofs about difficult algorithms. Indeed, the only significant criticism
leveled here is that, as with all the other concurrent logics discussed, RG-Sep is not
sound w.r.t. weak memory models for racy programs.
Concurrent Separation Logic A simplified variant of the original Owicki-Gries
logic does away with the complicated interference side condition, instead restricting
its scope to well locked7 and race-free programs. For this smaller (but still large
and useful) class of programs, it is possible to use the Owicki-Gries logic to perform
invariant-based reasoning, in which threads’ interaction with shared state may rely
only on a specified invariant, and must always preserve that same invariant.
As with Jones’ original rely/guarantee logic, this simplified Owicki-Gries
logic suffers from the difficulty of constantly having to specify, at each step of the
proof, not just what has changed in the program state, but also what has not
changed. Once again, the solution is to incorporate the concurrent reasoning of
the simplified Owicki-Gries logic in a separation logic-style logic. This logic, called
Concurrent Separation Logic (CSL), originally developed by Peter O’Hearn and
Stephen Brookes [34, 9], does just that. Hoare-style specifications are elaborated
with an invariant I:
I ⊢ {P} c {Q},
which indicates informally that, from a state which can be partitioned into a shared
part, which satisfies I, and a private part, which satisfies P: 1) the program c
7The exact definition of “well locked” is technical and rather complicated, but purely syntactic,
and hence preferable to the logical interference side conditions imposed by the full Owicki-Gries
logic.
does not ever encounter a memory error; 2) throughout execution some portion
of the state always satisfies I; and 3) if the program terminates, then it does so in a
state which can be partitioned finally into a shared part that satisfies I and a private
part that satisfies Q. The par rule for reasoning about the parallel composition of
program components c and c′ is, by comparison with the other approaches, extremely
simple:
\[
\frac{I \vdash \{P\}\; c\; \{Q\} \quad\text{and}\quad I \vdash \{P'\}\; c'\; \{Q'\}}{I \vdash \{P \ast P'\}\; c \parallel c'\; \{Q \ast Q'\}}\;(\textsc{par})
\]
This rule asserts that if c and c′ can be proved to maintain the same invariant along
with their own private specifications, then the parallel combination c || c′ must also
maintain that invariant and, assuming the private portions of state accessed by the
two components are disjoint, maintains their private specifications as well.
2.3 Memory Consistency Models
A memory consistency model (or just memory model) is a specification of a concrete
implementation of a shared memory system. Memory consistency models bound the
possible results of reading an address in shared memory according to the history of
writes and reads that have already taken place or are in progress. For the sake of
reasoning about the behavior of concurrent software programs interacting with a
shared memory, memory consistency models are a crucial abstraction: not only are
concrete implementations of shared memory in modern computer systems incredibly
complex, but their details may additionally be considered proprietary and secret.8
Many memory consistency models have been defined and studied [52, 21, 1].
Below, we discuss (informally) two of particular interest: sequential consistency,
8As a consequence of this complexity and secrecy, it can be a challenge to determine whether
or not a memory implementation is soundly described by a particular memory consistency model.
This interesting problem is beyond the scope of this project.
the most commonly used abstraction of memory for well-behaved programs; and
the x86-TSO memory model, which describes the interaction between memory and
arbitrary software programs on x86-like multiprocessor computers.
2.3.1 Sequential Consistency
Sequential consistency was originally defined by Leslie Lamport in 1979 [30]. For a
sequentially consistent program, he wrote:
the result of any execution is the same as if the operations of all the
processors were executed in some sequential order, and the operations of
each individual processor appear in this sequence in the order specified
by its program.
As an example, consider again the program in Figure 1.1, in which thread 0
writes to f0 and then reads from f1; and thread 1 writes to f1 and reads from f0. If
we write (p, o, ℓ, v) to indicate that thread p performs operation o (where o ∈ {r, w}
is either a read or a write) on location ℓ with value v (indicating the value written or
the value read), then:
(0, w, f0, 1), (1, w, f1, 1), (1, r, f0, 1), (0, r, f1, 1)
is a sequentially consistent execution, because the operations of each thread occur,
within the execution, in the same order as the operations appear in their respective
programs (with the writes preceding the reads), and the results of the loads clearly
are consistent with the total order of operations given.
On the other hand, the following two executions are not sequentially consis-
tent:
(0, r, f1, 0), (1, w, f1, 1), (1, r, f0, 0), (0, w, f0, 1)
(0, w, f0, 1), (0, r, f1, 0), (1, w, f1, 1), (1, r, f0, 0),
In the first execution, it is not the case that the operations of thread 0 take place in
the order specified by the program: the read precedes the write. In the second exe-
cution, the order of operations is consistent with the order specified by the program,
but the result is not the same as if the operations were executed in that order: the
final read of f0 by thread 1 should not result in value 0, because it succeeds the
write of value 1 to f0 by thread 0.
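The two conditions in Lamport's definition can be checked mechanically for a given trace. The sketch below is an illustration under assumptions made here: the program of Figure 1.1 is taken to be as described in the text (each thread writes its own flag and then reads the other), all locations initially hold 0, and the function name is hypothetical.

```python
def is_sequentially_consistent(trace, programs, initial=0):
    """Check a complete trace of (thread, op, location, value) events.

    programs maps each thread to its list of (op, location) pairs in
    program order.
    """
    # Condition 1: each thread's events appear in its program order.
    for thread, program in programs.items():
        projected = [(op, loc) for (p, op, loc, _) in trace if p == thread]
        if projected != program:
            return False
    # Condition 2: each read returns the latest value written before it
    # in the trace, as if the events ran one at a time in that order.
    memory = {}
    for (_, op, loc, val) in trace:
        if op == "w":
            memory[loc] = val
        elif memory.get(loc, initial) != val:
            return False
    return True

programs = {0: [("w", "f0"), ("r", "f1")], 1: [("w", "f1"), ("r", "f0")]}
consistent = [(0, "w", "f0", 1), (1, "w", "f1", 1), (1, "r", "f0", 1), (0, "r", "f1", 1)]
bad_order = [(0, "r", "f1", 0), (1, "w", "f1", 1), (1, "r", "f0", 0), (0, "w", "f0", 1)]
bad_value = [(0, "w", "f0", 1), (0, "r", "f1", 0), (1, "w", "f1", 1), (1, "r", "f0", 0)]
```

The trace `bad_order` fails the first condition and `bad_value` fails the second, matching the informal explanation above.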
Operationally, this constraint on the possible execution traces of a program
can be realized by modeling memory as a simple location-value map, in which reads
and writes to memory happen atomically. That is, memory is represented by a
(possibly partial) function L⇀V; and the semantics of the program is defined such
that no more than one thread may read or update the memory at once. This, of
course, is the most widespread model of memory used in the semantics of imperative,
shared-memory programs, and consequently the vast majority of program reasoning
and verification techniques tacitly assume that programs are sequentially consistent;
i.e., that all their executions can be described by their interactions with this simple model
of memory.
2.3.2 The x86-TSO Memory Model
After a great deal of investigation, experimentation and discussion with manufac-
turers, Owens, Sarkar and Sewell published a specification [38] of the x86 memory
model. This model has become widely accepted as accurate 9 and forms the basis
for a great deal of related research—e.g., the Comp-Cert project-in-progress [31],
which attempts to construct a realistic (i.e., which performs realistic optimizations)
and formally verified compiler for C and C++.
The model identified by Owens et al. is essentially a total-store order (TSO)
model which, as the name implies, indicates that there is a single total order, agreed
upon by all processors, that organizes store events. Their x86-TSO model is de-
scribed as a collection of legal traces of memory events, defined both axiomatically
and operationally. The latter, informally, is described in terms of write buffers:
per-processor FIFO queues of “writes” (i.e., location-value pairs). The informal se-
mantics of store events entails buffering a new write in the processor’s write buffer;
the semantics of load events entails returning the value of the most recent buffered
write to the intended location in the processor’s write buffer or, if no such buffered
write exists, of the shared memory.
Formally, the operational model is given with a labeled transition relation
between machine states. These states are four-tuples (R,m,B, l) in which:
• R : P→ I⇀V represents a register file for each processor;
• m : L⇀V represents a shared memory;
• B : P→(L× V) list represents a write buffer for each processor;
• l : P⊥ represents a global lock.
Transitions (labeled by memory events) between states indicate the possi-
bility and effect of those memory events. For example, in any state (R,m,B, l),
processor p may write a value v into its register i; i.e., it may update the state such
9An earlier paper [51] suggested that the x86 memory model more closely resembled causal
consistency—a model in which there is a single agreed-upon order not just for the store events, but
for all causally related memory events—but this was later contradicted by counterexamples that
led to the currently proposed TSO model.
that Rp(i) = v. Similarly, if Rp(i) = v then p may read value v from its register i.
A summary of the other events processor p may perform is as follows:
• it may load from its write buffer the most recent value of a location—or, if
a write to that location is not found in its write buffer, from memory—if the
lock is either available (i.e., the lock value is ⊥) or is held by p, but not if
some other processor q ≠ p holds the lock;
• it may store a value to a memory location by adding a new write to the head
of its write buffer regardless of the status of the lock;
• it may flush (or, synonymously in this document, commit) the least recent
write in its buffer to memory if it holds the lock or the lock is available;
• it may fence, flushing all writes buffered on p to memory, resulting in an empty
write buffer;
• it may acquire the lock (i.e., change the lock value in the current state to p)
if the lock is available;
• it may release the lock (i.e., change the lock value in the current state to ⊥)
if it holds the lock.
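The events listed above can be sketched as methods on a small machine state. This is a simplified illustration, not the formal model: the register file R is omitted, the class and method names are hypothetical, and the fence is implemented without a lock condition because the summary above states none.

```python
class TSOMachine:
    def __init__(self, processors, memory):
        self.memory = dict(memory)                   # shared memory m
        self.buffers = {p: [] for p in processors}   # write buffers B, newest write first
        self.lock = None                             # global lock l; None models the free lock

    def _require_unblocked(self, p):
        if self.lock is not None and self.lock != p:
            raise RuntimeError("processor %r is blocked by the lock" % p)

    def load(self, p, loc):
        """Most recent buffered write of p to loc, else the value in memory."""
        self._require_unblocked(p)
        for (l, v) in self.buffers[p]:
            if l == loc:
                return v
        return self.memory[loc]

    def store(self, p, loc, val):
        """Buffer a new write at the head of p's buffer, regardless of the lock."""
        self.buffers[p].insert(0, (loc, val))

    def flush(self, p):
        """Commit the least recent buffered write of p to memory."""
        self._require_unblocked(p)
        loc, val = self.buffers[p].pop()
        self.memory[loc] = val

    def fence(self, p):
        """Commit all of p's buffered writes, oldest first."""
        while self.buffers[p]:
            loc, val = self.buffers[p].pop()
            self.memory[loc] = val

    def acquire(self, p):
        if self.lock is not None:
            raise RuntimeError("lock is not available")
        self.lock = p

    def release(self, p):
        if self.lock != p:
            raise RuntimeError("processor %r does not hold the lock" % p)
        self.lock = None
```

A store followed by a load of the same location on the same processor returns the buffered value, while other processors continue to see the old value in memory until the write is flushed.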
The x86-TSO memory model includes all the sequentially consistent execu-
tions, as well as others. For example, the sequentially consistent execution from the
previous section:
(0, w, f0, 1), (1, w, f1, 1), (1, r, f0, 1), (0, r, f1, 1),
is shown valid (informally) under the x86-TSO model as follows. Assume in the
sequel that no processor holds the lock. The first thread may perform its store
operation by enqueuing a write (f0, 1) in its buffer, and then immediately flushing
that write to memory; and then similarly for the second thread, which enqueues a
write (f1, 1) and then immediately flushes it to memory. If the second thread then
reads location f0 it finds value 1 (because it has no other writes to f0 in its buffer);
and similarly if the first thread then reads location f1 it finds value 1 for the same
reason.
Next, consider again the two non-sequentially consistent executions of Fig-
ure 1.1 from the previous section:
(0, r, f1, 0), (1, w, f1, 1), (1, r, f0, 0), (0, w, f0, 1)
(0, w, f0, 1), (0, r, f1, 0), (1, w, f1, 1), (1, r, f0, 0),
The first non-sequentially consistent execution is invalid w.r.t. the x86-TSO model
as well, because the operations of a single thread again do not take place in the
order specified by the program. The second non-sequentially consistent execution,
however, is valid under the x86-TSO model: the first process enqueues a write (f0, 1)
in its buffer, and then reads 0 from location f1 in memory (because its buffer contains
no writes to f1); then the second process enqueues a write (f1, 1) in its buffer, and
then reads 0 from location f0 in memory (because its own buffer has no writes to f0;
only the other processor's buffer does, and that write has not yet been flushed).
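The difference between the two models on this program can also be found by exhaustive search. The sketch below is an illustration under assumptions made here: the program is the one described for Figure 1.1, both flags start at 0, the lock is omitted, and buffers are stored oldest write first. With `buffered=False` stores go straight to memory, which models sequential consistency.

```python
def put(t, i, x):
    """Return a copy of tuple t with position i replaced by x."""
    return t[:i] + (x,) + t[i + 1:]

def write(mem, loc, val):
    """Return memory (a sorted tuple of pairs) updated at loc."""
    return tuple(sorted({**dict(mem), loc: val}.items()))

def outcomes(buffered):
    """All reachable pairs (value read by thread 0, value read by thread 1)."""
    progs = ((("w", "f0"), ("r", "f1")), (("w", "f1"), ("r", "f0")))
    init = ((0, 0), ((), ()), (("f0", 0), ("f1", 0)), (None, None))
    seen, stack, results = set(), [init], set()
    while stack:
        state = stack.pop()
        if state in seen:
            continue
        seen.add(state)
        pcs, bufs, mem, regs = state
        if pcs == (2, 2):
            results.add(regs)
        for p in (0, 1):
            if bufs[p]:  # flush the oldest buffered write of p
                loc, val = bufs[p][0]
                stack.append((pcs, put(bufs, p, bufs[p][1:]), write(mem, loc, val), regs))
            if pcs[p] == 2:
                continue
            op, loc = progs[p][pcs[p]]
            pcs2 = put(pcs, p, pcs[p] + 1)
            if op == "w" and buffered:
                stack.append((pcs2, put(bufs, p, bufs[p] + ((loc, 1),)), mem, regs))
            elif op == "w":
                stack.append((pcs2, bufs, write(mem, loc, 1), regs))
            else:  # read own newest buffered write, else memory
                own = [v for (l, v) in bufs[p] if l == loc]
                val = own[-1] if own else dict(mem)[loc]
                stack.append((pcs2, bufs, mem, put(regs, p, val)))
    return results
```

Under these assumptions the buffered search reaches every outcome of the unbuffered one, plus the outcome in which both threads read 0.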
When combined with the earlier claim that the x86-TSO memory model
also includes all sequentially consistent executions, this example shows that the
x86-TSO memory model is strictly weaker than sequential consistency. Consequently,
x86-TSO is considered to be a weak memory model.
Note that, despite the completely informal examples above, this specification
of the x86-TSO memory model only bounds the sort of memory events that can
occur in program executions; it does not give meaning to the programs of any
particular language, like x86 assembly or C programs. It can, however, be used to
give semantics to programs in those languages. Owens et al. use their model as a
guide in assigning semantics to a significant subset of the x86 assembly language,
for example. The semantics of a simple C-like programming language used in this
dissertation project, described in Sections 3.3 and 4.2, is also guided by the x86-
TSO memory model. And although it should be possible to prove that this semantics
respects the bounds of the model, that is not the focus of the project. But even
without such a correspondence proof, it should be clear that the semantics of the
programming language to be described later is manifestly weak.
2.3.3 Data-Race-Freedom Guarantees
Although the memory models of most modern computer architectures—including,
as we have seen, x86-like computer architectures—are relatively weak, allowing ad-
ditional executions compared to a sequentially consistent memory model, the tacit
assumption that programs are sequentially consistent made by most verification
techniques is actually, in many cases, perfectly reasonable. This is because modern
computer architectures, including x86, guarantee sequentially consistent execution
for a large class of programs: namely, the data-race free programs.10 Because of
these so-called data-race-freedom (DRF) guarantees, verification techniques which
assume sequential consistency are perfectly sound in case the program under con-
sideration is race-free. And because races are generally considered to be errors, a
reasonable verification workflow is to first check that a program is race-free before
proceeding with the verification of more elaborate properties; or, alternatively, to
leverage techniques which succeed only for race-free programs. But not all cor-
rect programs are race-free; for these programs, verification techniques may need to
account for the full complexities of the underlying memory model.
10An x86-specific notion of data-race, as well as theorems about data-race-freedom guarantees
for the x86, are described in [37].
Chapter 3
A Sequential Program Logic
In this chapter we describe a program logic for a sequential, single-processor, weak-
memory programming model. The semantics of programs and assertions will be
given in terms of system states with a single write buffer, and the logic is tailored
to reason about the behavior of sequential programs w.r.t. these states. The logic
developed in this chapter is not especially useful; the behavior of sequential program
execution on this model is essentially the same as on the typical, strong-memory
model (in which memory is modeled as a single array of addresses), and existing
logics for reasoning about this behavior are certainly simpler than the one developed
here. But the sequential program logic is a pedagogical stepping stone toward
the concurrent program logic, for which the behavior of programs may well be
significantly different from that of a strong-memory model. The development of the
sequential logic is vastly simpler than the weak-memory, multi-processor logic, and
many of the more difficult issues in that task can be introduced and explained more
easily in a single-processor setting.
A detailed description and soundness proof of an earlier single-processor,
weak-memory program logic was given previously [56].
3.1 An Example Sequential Proof
Consider the following simple sequential program c, which loads the value at an
address x into variable t, writes the value t+ 1 into address y, and then flushes that
write back to memory with a fence instruction:
c =df t := [x] ; [y] := t+ 1 ; fence .
This sequential program will fail with a memory error unless the value of x, from
which the program loads, is a properly allocated memory address, as well as the
value y, to which the program writes. Any provable specification of this program
must require this of x and y initially. If the values of x and y are indeed allocated
addresses, and the value in memory at address x is initially some value z, then upon
termination the write buffer will be empty and the value in memory at location y
will be z + 1.
In the program logic we develop throughout this chapter, we express this
informal specification of c with the following formal specification:
⊢ {x ↦ z ∗ y ↦ −} c {x ↦ z ∗ y ↦ z + 1}. (3.1)
We write e ↦ f to mean that the value in memory at the address given by the
integer-valued expression e is equal to the value given by expression f , or e ↦ −
if the value in memory at e is irrelevant. We also write the separating conjunction
e ↦ f ∗ e′ ↦ f′ to mean that e and e′ are distinct, allocated memory addresses, with
values in memory given by f and f′ respectively. Hence, the pre-condition on the
left asserts that x and y are allocated memory locations, the former with value z,
and the post-condition on the right asserts that, upon executing the command c,
the value at x remains z and the value at y has been set to z + 1.
Note that the specification in Equation 3.1 above would not be true if the
traditional additive conjunction ∧ were used instead of the separating conjunction.
That is, the following specification:
⊢ {x ↦ z ∧ y ↦ −} c {x ↦ z ∧ y ↦ z + 1}
is not true because it allows for x and y to be aliased—i.e., to denote the same
memory address. For in that case the store to y would change the value in memory
at y as well as at x, leaving x 7→z+1 instead of x 7→z as specified. Also note that the
specification in Equation 3.1 would not be true without the trailing fence command
because the buffered write will not necessarily have committed to memory at the
moment the commands have completed their execution.
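The role of the trailing fence can be illustrated with a small simulation. The following Python sketch is not part of the formal development. It assumes a hypothetical encoding in which the heap is a dict, the write buffer is a list of (address, value) pairs with the oldest write first, and a fence simply commits every buffered write before completing. The name run_c and the addresses 10 and 20 are illustrative only.

```python
def lookup(heap, buf, addr):
    """Value seen by a load: the newest buffered write, else the heap."""
    for a, v in reversed(buf):
        if a == addr:
            return v
    return heap[addr]  # a KeyError here models a memory error


def run_c(heap, x, y, with_fence=True):
    """Run t := [x] ; [y] := t + 1 ; fence and return (stack, heap, buffer)."""
    heap, buf, stack = dict(heap), [], {}
    stack["t"] = lookup(heap, buf, x)      # t := [x]
    lookup(heap, buf, y)                   # the store requires y to be allocated
    buf.append((y, stack["t"] + 1))        # [y] := t + 1, buffered only
    if with_fence:
        # Assumption: the fence completes only once every buffered write
        # has been committed to the heap, oldest first.
        while buf:
            a, v = buf.pop(0)
            heap[a] = v
    return stack, heap, buf


z = 41
_, heap_fenced, buf_fenced = run_c({10: z, 20: 0}, x=10, y=20)
_, heap_unfenced, buf_unfenced = run_c({10: z, 20: 0}, x=10, y=20,
                                       with_fence=False)
```

Under these assumptions the fenced run ends with an empty buffer and z + 1 in memory at y, whereas the unfenced run leaves the old value in memory and the new write in the buffer.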
A proof sketch of the program specification in Equation 3.1 is as follows, the
details of which will be explained shortly:
{x ↦ z ∗ y ↦ −}
t := [x]
{x ↦ z ∗ (y ↦ − ∧ t = z)}
[y] := t + 1
{x ↦ z ∗ (y ↦ − ◁ y ⇝ z + 1)}
fence
{x ↦ z ∗ y ↦ z + 1}
As the name indicates, a proof sketch provides a skeleton of a complete proof of a







which indicates that a complete proof will include proofs of the sub-specifications
⊢ {P} c1 {R} and ⊢ {R} c2 {Q} that will be combined, in a final step, using a rule
of inference (seq) for composing specifications of sequentially combined commands:

      ⋮                ⋮
⊢ {P} c1 {R}     ⊢ {R} c2 {Q}
------------------------------ seq
      ⊢ {P} c1 ; c2 {Q}

where the vertical dots represent the remaining proofs to be supplied.
Returning to the proof sketch of the program specification in Equation 3.1,
we see that to complete the proof we must derive three sub-specifications, one for
each primitive command:
1. ⊢ {x ↦ z ∗ y ↦ −} t := [x] {x ↦ z ∗ (y ↦ − ∧ t = z)}
2. ⊢ {x ↦ z ∗ (y ↦ − ∧ t = z)} [y] := t + 1 {x ↦ z ∗ (y ↦ − ◁ y ⇝ z + 1)}
3. ⊢ {x ↦ z ∗ (y ↦ − ◁ y ⇝ z + 1)} fence {x ↦ z ∗ y ↦ z + 1}
The first specification asserts that the result of loading the value z in memory at
address x into variable t results in a state that is otherwise unchanged except that
t = z. The second specification asserts that the result of storing the value t + 1
to the address y results in the addition of a new buffered write to the state, which
may later be committed to memory, overwriting the current, unspecified value. The
third specification asserts that the result of explicitly flushing this write to memory
overwrites the existing value in memory at y with the value z + 1.
We shall now derive these three specifications. For the first specification,
we start by using an instance of the load axiom scheme to show that the result
of evaluating the load command t := [x] from pre-condition x ↦ z yields the
post-condition x ↦ z ∧ t = z.
⊢ {x ↦ z} t := [x] {x ↦ z ∧ t = z} load
We then use the spatial frame rule frame-sp to extend the pre- and post-condition
with a description of the value of an additional, distinct memory address y:
⊢ {x ↦ z} t := [x] {x ↦ z ∧ t = z} load
⊢ {y ↦ − ∗ x ↦ z} t := [x] {y ↦ − ∗ (x ↦ z ∧ t = z)} frame-sp
Finally, we observe that the derived pre- and post-conditions are not syntactically
equal to those of the desired specification, but are logically equivalent:
y ↦ − ∗ x ↦ z ≡ x ↦ z ∗ y ↦ −
y ↦ − ∗ (x ↦ z ∧ t = z) ≡ x ↦ z ∗ (y ↦ − ∧ t = z)
More generally, the derived pre-condition is logically implied by the desired
pre-condition, and the derived post-condition logically implies the desired post-condition.
Hence, we may strengthen the pre-condition and weaken the post-condition
accordingly with the rule of consequence cons:
⊢ {x ↦ z} t := [x] {x ↦ z ∧ t = z} load
⊢ {y ↦ − ∗ x ↦ z} t := [x] {y ↦ − ∗ (x ↦ z ∧ t = z)} frame-sp
⊢ {x ↦ z ∗ y ↦ −} t := [x] {x ↦ z ∗ (y ↦ − ∧ t = z)} cons
This completes the derivation of the first specification.
For the second specification, we begin with an instance of the axiom scheme
for the store command in which the pre-condition describes just the value of y in
memory:
⊢ {y ↦ −} [y] := t + 1 {y ↦ − ◁ y ⇝ t + 1} store
The post-condition describes, with the leads-to assertion y ⇝ t + 1, the addition
of a new, buffered write to address y with value t + 1. The temporal separating
conjunction y ↦ − ◁ y ⇝ t + 1 indicates that the writes described on the left side
precede those on the right side.
Next, we again apply the spatial frame rule frame-sp to extend the
specification with a description of the value of memory at x and the value of variable t,
namely (x ↦ z ∧ t = z), which we abbreviate for space reasons as F2 below:
⊢ {y ↦ −} [y] := t + 1 {y ↦ − ◁ y ⇝ t + 1} store
⊢ {F2 ∗ y ↦ −} [y] := t + 1 {F2 ∗ (y ↦ − ◁ y ⇝ t + 1)} frame-sp
The derived pre-condition is logically equivalent to (and, hence, is implied
by) the desired pre-condition, and the derived post-condition logically implies the
desired post-condition:
(x ↦ z ∧ t = z) ∗ y ↦ − ≡ x ↦ z ∗ (y ↦ − ∧ t = z)
(x ↦ z ∧ t = z) ∗ (y ↦ − ◁ y ⇝ t + 1) ⊨ x ↦ z ∗ (y ↦ − ◁ y ⇝ z + 1)
Hence, we finish the derivation of the second specification with an application of the
rule of consequence cons:
⊢ {y ↦ −} [y] := t + 1 {y ↦ − ◁ y ⇝ t + 1} store
⊢ {F2 ∗ y ↦ −} [y] := t + 1 {F2 ∗ (y ↦ − ◁ y ⇝ t + 1)} frame-sp
⊢ {x ↦ z ∗ (y ↦ − ∧ t = z)} [y] := t + 1 {x ↦ z ∗ (y ↦ − ◁ y ⇝ z + 1)} cons
For the third and final specification, we begin with the axiom for the fence
command:
⊢ {emp} fence {bar} fence
This small axiom allows for the introduction, from an empty system description,
of the bar assertion. Intuitively, this assertion describes the effect of a barrier on
the system state, in the sense that any writes that precede bar must necessarily be
committed to memory. For example, we have the following logical equivalence:
y ↦ t + 1 ≡ y ⇝ t + 1 ◁ bar
This means that the description of y having value t + 1 in memory is equivalent to
the description of a buffered write to address y with value t + 1 followed by a flush of
that write to memory.¹ We wish to describe the effect of bar on the post-condition
from the previous specification, so we extend the specification with the temporal
frame rule frame-tm and the frame assertion x ↦ z ∗ (y ↦ − ◁ y ⇝ z + 1), which
we abbreviate for space below as F3:
⊢ {emp} fence {bar} fence
⊢ {F3 ◁ emp} fence {F3 ◁ bar} frame-tm
The empty assertion is a unit w.r.t. both the spatial and temporal separating
conjunctions, so the pre-condition assertion is equivalent to one without the final
temporal conjunct emp. And the post-condition is equivalent to one in which the buffered
write has been flushed to memory, replacing the previous value in memory:
(x ↦ z ∗ (y ↦ − ◁ y ⇝ z + 1)) ◁ bar ≡ x ↦ z ∗ y ↦ z + 1
So, once again, the derivation can be completed by an application of the rule of
consequence:
⊢ {emp} fence {bar} fence
⊢ {F3 ◁ emp} fence {F3 ◁ bar} frame-tm
⊢ {x ↦ z ∗ (y ↦ − ◁ y ⇝ z + 1)} fence {x ↦ z ∗ y ↦ z + 1} cons
This completes the proof of the program specification of Equation 3.1. In
the following sections, the notions of program, system state, assertion, model,
specification and proof are formally defined and discussed.
¹In fact, the points-to assertion shall later on in the description of the sequential logic be made
definitionally equal to the temporal conjunction of the analogous leads-to assertion and bar.
3.2 Expressions and Stacks
Expressions are terms that denote values, which in this development are just inte-
gers. Hence, they are also used later on to denote memory locations and processor
identifiers.
The language of expressions, written Expr, is given by the following gram-
mar:
Expr e ::= v | x | (e+ e′) | (e− e′) | . . . ,
where v ∈ V and x ∈ I. The possibility of additional operations is left open, as
knowledge of the complete set of expressions is not particularly important for the
purpose of describing the program logic.
The semantics of expressions is given w.r.t. stacks, which are total functions
from I to V. The collection of functions I→V is abbreviated Stack. The interpre-
tation of an expression w.r.t. a stack s is given by the extension of a stack, written
ŝ, which is a total function from Expr to V defined as follows:
ŝ(v) =df v
ŝ(x) =df s(x)
ŝ(e+ e′) =df ŝ(e) + ŝ(e′)
ŝ(e− e′) =df ŝ(e)− ŝ(e′)
Boolean expressions are terms that denote truth values. Their language is
given by the following grammar:
BExpr b ::= false | true | (!b) | (e = e′) | . . . ,
where e, e′ ∈ Expr. For convenience, we represent truth values by the set {0, 1} so
the extension of a stack can also be used to interpret boolean expressions:
ŝ(false) =df 0
ŝ(true) =df 1
ŝ(!b) =df 1 − ŝ(b)
ŝ(e = e′) =df 1 if ŝ(e) = ŝ(e′), and 0 otherwise.
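The interpretation of expressions and boolean expressions can be sketched as a small recursive evaluator. The encoding below is an assumption made for illustration: expressions are Python ints, identifier strings, or tuples tagged with an operator, and a stack is a dict (which is partial, unlike the total functions used in the text).

```python
def ev(s, e):
    """Extension of stack s to expressions and boolean expressions."""
    if isinstance(e, bool):      # test bool first: bool is a subclass of int
        return 1 if e else 0
    if isinstance(e, int):       # a value v
        return e
    if isinstance(e, str):       # an identifier x
        return s[e]
    op = e[0]
    if op == "+":
        return ev(s, e[1]) + ev(s, e[2])
    if op == "-":
        return ev(s, e[1]) - ev(s, e[2])
    if op == "!":
        return 1 - ev(s, e[1])
    if op == "=":
        return 1 if ev(s, e[1]) == ev(s, e[2]) else 0
    raise ValueError(f"unknown operator {op!r}")


s = {"x": 3, "y": 4}
sum_value = ev(s, ("+", "x", 1))                        # x + 1
truth_value = ev(s, ("!", ("=", ("+", "x", 1), "y")))   # !(x + 1 = y)
```

Representing truth values by 0 and 1 is what lets negation be computed as 1 − ŝ(b) with the same evaluator.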




fv(e+ e′) =df fv(e) ∪ fv(e′)




fv(e = e′) =df fv(e) ∪ fv(e′)
For stacks s, s′ and X ⊆ I, we write s ∼_X s′ if, for all x ∈ X, s(x) = s′(x).
The following basic lemma uses this relation to connect the static and dynamic
semantics of expressions.
Lemma 1. If s ∼_{fv(e)} s′ then ŝ(e) = ŝ′(e).
Proof. By induction on the structure of the expression e.
3.3 Sequential Programs
In this chapter, programs are identified with sequential commands, which consist of
compositions of primitive commands for testing, accessing and modifying state.
The syntax of primitive commands is given by the following grammar:
PComm p ::= skip | assume(b) | assert(b) | x := e | x := [e] |
[e] := e′ | fence
The informal meaning of the primitive commands is as follows.
• skip takes no evaluation steps;
• assume(b) evaluates to skip if b holds and becomes stuck otherwise;
• assert(b) evaluates to skip if b holds and aborts otherwise;
• x := e assigns the value of e to identifier x;
• x := [e] assigns the value at memory address e to identifier x;
• [e] := e′ stores the value of e′ to memory address e; and
• fence commits any buffered writes to memory.
The formal semantics of the successful execution of a primitive command p
is given as a transition relation between machine states σ, σ′
p : σ → σ′.
Informally, a triple p, σ, σ′ belongs to this relation if the successful execution of p in
state σ may yield state σ′. (A formal interpretation will be given later in the context
of a formal semantics of full commands.) A primitive command may alternatively
execute unsuccessfully, indicated as follows
p : σ → ↯ .
Executions may be unsuccessful as a result of failed assertions (e.g., assert(false)
is unsuccessful from any state) or attempts to access or modify a memory location
outside a command’s address space. We say that a primitive command aborts in
such unsuccessful executions.
To define these relations formally, we must first define the notion of a unipro-
cessor memory system and a machine state.
Definition 1. A uniprocessor memory system is a pair (h, b), where:
• h : L⇀V is a heap, i.e., a partial function that represents the allocated loca-
tions of shared memory and their values; and
• b : (L × V) list is a write buffer.
The set of uniprocessor memory systems is abbreviated as Mem. The pair
that consists of a stack s, as defined in Section 3.2, which assigns values to identifiers,
and a memory system µ is called a state, typically abbreviated by σ. The collection of
states is written State. We often abuse notation by interchanging memory systems
and states in definitions for which the stack is irrelevant.
Note that the notion of machine state given here differs from that used to
define the memory model. First and most obviously, there is only a single write
buffer, because in this section we are discussing only the execution of sequential
programs on uniprocessor machines. Second, the global lock value from the memory
model is omitted because there is only one write buffer, and hence commands have
no need to claim sole ownership of the memory system. Third, the set of names
(i.e., “registers,” “variables,” “identifiers,” etc.) is global instead of local to each
processor. This is for convenience only, and is not a technical restriction. The
specification logic will be restricted to programs for which the names are partitioned
among processes, except for those that are never modified. Another reasonable
choice would have been to use local names only, and to share read-only values
among processes in the shared memory. This has the advantage of codifying the
above healthiness condition on programs directly into the model of the language
and logic; it has the disadvantage of perhaps making the description of access to
shared values slightly more awkward.
The definition of the semantic relation for primitive commands is given in
Figure 3.1. By p-assume, the primitive assume(b) takes an evaluation step only
if the boolean expression b evaluates to 1 in the current state. Otherwise, it is
effectively stuck. Later, when we describe the specification logic, we will see that
such stuck executions are irrelevant to the truth of specifications. In fact, stuck
executions are equivalent to diverging executions w.r.t. the specification logic. By
p-assert and p-assert-a, assert(b) evaluates normally if the boolean expression
b evaluates to 1 in the current state, and aborts otherwise. An aborting command
will later be shown to not satisfy any specification. By p-assign, the assignment
primitive x := e always evaluates successfully by assigning to x the value of the
expression e in the current state. By p-load, the load primitive x := [e] assigns
to x the value of the most recent buffered write to the address that is the value
of expression e, if such a write exists, and otherwise gives the value in the current
heap at that address. If the address that is the value of expression e neither has
any buffered writes nor is defined in the heap (i.e., if ŝ(e) ∉ dom(h\\b)) then by
p-load-a the load primitive aborts. By p-store, the store primitive [e] := e′ appends to
the write buffer a new write with address the value of expression e and value the
value of e′, but only if the value of e is an address that is already allocated (i.e.,
if ŝ(e) ∈ dom(h\\b)). Otherwise, by p-store-a the store primitive aborts. Finally,
by p-fence, the fence primitive evaluates without changing the state if the write
buffer is empty. Later, when we describe the semantics of full commands, we will
explain how the fence primitive can be thought to flush a non-empty buffer.
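The rules just described can be sketched as a single step function. This is a minimal illustration under assumed encodings, not the formal definition: a state is a tuple (s, h, b) of dicts and a list, expressions are Python callables from stacks to integers, primitive commands are tagged tuples, the string "abort" stands for the erroneous pseudo-state, and None stands for a stuck execution.

```python
ABORT = "abort"   # stands for the erroneous pseudo-state
STUCK = None      # no transition is possible


def lookup(h, b, addr):
    """Newest buffered write to addr if any, else the heap; None if undefined."""
    for a, v in reversed(b):
        if a == addr:
            return v
    return h.get(addr)


def step(p, state):
    """One transition of primitive command p from state (s, h, b)."""
    s, h, b = state
    kind = p[0]
    if kind == "assume":                      # p-assume, stuck when false
        return state if p[1](s) == 1 else STUCK
    if kind == "assert":                      # p-assert / p-assert-a
        return state if p[1](s) == 1 else ABORT
    if kind == "assign":                      # p-assign
        _, x, e = p
        return ({**s, x: e(s)}, h, b)
    if kind == "load":                        # p-load / p-load-a
        _, x, e = p
        v = lookup(h, b, e(s))
        return ABORT if v is None else ({**s, x: v}, h, b)
    if kind == "store":                       # p-store / p-store-a
        _, e, e2 = p
        addr = e(s)
        if lookup(h, b, addr) is None:
            return ABORT
        return (s, h, b + [(addr, e2(s))])
    if kind == "fence":                       # p-fence, stuck on a non-empty buffer
        return state if not b else STUCK
    raise ValueError(kind)


s0 = ({"t": 0}, {10: 41, 20: 0}, [])
s1 = step(("load", "t", lambda s: 10), s0)
s2 = step(("store", lambda s: 20, lambda s: s["t"] + 1), s1)
s3 = step(("load", "t", lambda s: 20), s2)
bad = step(("load", "t", lambda s: 99), s0)
fence_now = step(("fence",), s2)
```

Note that the fence is stuck in s2 because the buffer is non-empty. In the full semantics it is the silent flushing transitions that eventually empty the buffer and let the fence proceed.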
Structured commands consist of either a primitive command; a sequential
composition of commands; a nondeterministic (internal) choice between commands;
or an iteration of a command. The language of commands is defined by the following
grammar:
Comm c ::= p | (c ; c′) | (c + c′) | c∗,
where p is a primitive command.
The formal semantics of the successful execution of a command is given as a
binary transition relation between command-state pairs:
c, σ → c′, σ′.
But a command’s execution may abort unsuccessfully as well, as with primitive
commands. Unsuccessful executions are modeled as a transition relation between
command-state pairs and an erroneous pseudo-state, as for primitive commands:
c, σ → ↯ .
We refer collectively to command-state pairs and the erroneous state ↯ as
configurations, and use C to indicate a configuration. A configuration C is considered safe
if it does not abort: C ↛ ↯ .
The semantics of commands also encompasses “silent” transitions, which
represent the flushing of buffered writes to the shared memory as allowed by the
(p-assume)    assume(b) : (s, h, b) → (s, h, b)                      if ŝ(b) = 1
(p-assert)    assert(b) : (s, h, b) → (s, h, b)                      if ŝ(b) = 1
(p-assert-a)  assert(b) : (s, h, b) → ↯                              if ŝ(b) = 0
(p-assign)    x := e : (s, h, b) → (s[x ← ŝ(e)], h, b)
(p-load)      x := [e] : (s, h, b) → (s[x ← v], h, b)                if (h\\b)(ŝ(e)) = v
(p-load-a)    x := [e] : (s, h, b) → ↯                               if (h\\b)(ŝ(e)) = ⊥
(p-store)     [e] := e′ : (s, h, b) → (s, h, b ++ [(ŝ(e), ŝ(e′))])   if ŝ(e) ∈ dom(h\\b)
(p-store-a)   [e] := e′ : (s, h, b) → ↯                              if ŝ(e) ∉ dom(h\\b)
(p-fence)     fence : (s, h, b) → (s, h, b)                          if b = ε
Figure 3.1: Semantics of sequential primitive commands




(h, b ++ [(ℓ, v)]) →τ (h[ℓ ← v], b)
We write ⇝ as shorthand for the converse of the reflexive-transitive closure of →τ:




The complete relation that defines the semantics of commands is given in
Figure 3.2 below. The semantics of primitive commands is lifted directly to the
level of commands by c-prim and c-prim-a. The silent transitions as defined by
the →τ relation are also lifted directly to the level of commands by c-tau.
Sequential compositions c1 ; c2 evaluate from left-to-right: by c-seq and c-seq-a if
the left command c1 evaluates or aborts, then so does the sequential composition;
and by c-seq-s, once the left command c1 has evaluated fully to skip it is dropped
so that evaluation of the right side c2 may proceed. By c-ch-1 and c-ch-2, the
nondeterministic choice command c1 + c2 may evaluate to either c1 or c2, after-
ward continuing evaluation of the chosen command. Finally, by c-loop, a looping
command c∗ may always evaluate without modifying the state by expanding to a
nondeterministic choice between doing nothing (i.e., exiting the loop) and sequential
composition that consists of evaluating the loop body c once and then continuing
with the loop.
The reflexive-transitive closure of the command evaluation relation, written
c, σ →* C, is defined as usual. The range of a configuration C, written range(C), is
defined as the set of states σ for which C →* skip, σ.
²As noted earlier, we abuse notation by interchanging the concept of state and memory system in
definitions for which the stack is irrelevant. Hence, the definition of the relation between memory
(c-prim)    p, σ → skip, σ′                 if p : σ → σ′ and σ′ ∈ State
(c-prim-a)  p, σ → ↯                        if p : σ → ↯
(c-tau)     c, σ → c, σ′                    if σ →τ σ′
(c-seq)     (c ; c′), σ → (c0 ; c′), σ′     if c, σ → c0, σ′
(c-seq-a)   (c ; c′), σ → ↯                 if c, σ → ↯
(c-seq-s)   (skip ; c′), σ → c′, σ
(c-ch-1)    (c + c′), σ → c, σ
(c-ch-2)    (c + c′), σ → c′, σ
(c-loop)    c∗, σ → (skip + (c ; c∗)), σ
Figure 3.2: Semantics of sequential commands
Sequential Command Abbreviations A few standard command abbreviations
are shown in Figure 3.3. Some would benefit greatly from local variable declarations,
which have not yet been added to the language.
if b then c else c′ =df (assume(b) ; c) + (assume(!b) ; c′)
if b then c =df (assume(b) ; c) + (assume(!b) ; skip)
while b do c =df (assume(b) ; c)∗ ; assume(!b)
Figure 3.3: Sequential command abbreviations
Static Semantics The static semantics of expressions, primitive commands and
commands, embodied here by functions fv(−) and mod(−) associating these objects
to their sets of free and modified variables, respectively, is completely standard.
(Especially so because there are no name-hiding operations in the language,
like the aforementioned missing local variable declaration command.) For example,
fv(x := [y + 1]) = {x, y} and mod(x := [y + 1]) = {x}.
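The abbreviations of Figure 3.3 and the functions fv and mod can be sketched together. The tuple encoding of commands and expressions below is hypothetical and chosen only for illustration; the helper names (seq, assume, if_then_else, while_do) are likewise assumptions.

```python
# Commands:   ("prim", p), ("seq", c1, c2), ("choice", c1, c2), ("star", c)
# Primitives: ("skip",), ("fence",), ("assume", b), ("assert", b),
#             ("assign", x, e), ("load", x, e), ("store", e1, e2)
# Expressions: ints, identifier strings, or (op, arg, ...) tuples.

def seq(c1, c2):
    return ("seq", c1, c2)

def assume(b):
    return ("prim", ("assume", b))

def if_then_else(b, c1, c2):
    return ("choice", seq(assume(b), c1), seq(assume(("!", b)), c2))

def while_do(b, c):
    return seq(("star", seq(assume(b), c)), assume(("!", b)))

def fv_expr(e):
    if isinstance(e, (bool, int)):
        return set()
    if isinstance(e, str):
        return {e}
    return set().union(*(fv_expr(a) for a in e[1:]))

def fv(c):
    if c[0] == "prim":
        p = c[1]
        if p[0] in ("skip", "fence"):
            return set()
        if p[0] in ("assume", "assert"):
            return fv_expr(p[1])
        if p[0] in ("assign", "load"):
            return {p[1]} | fv_expr(p[2])
        if p[0] == "store":
            return fv_expr(p[1]) | fv_expr(p[2])
        raise ValueError(p[0])
    return set().union(*(fv(sub) for sub in c[1:]))

def mod(c):
    if c[0] == "prim":
        p = c[1]
        return {p[1]} if p[0] in ("assign", "load") else set()
    return set().union(*(mod(sub) for sub in c[1:]))

load = ("prim", ("load", "x", ("+", "y", 1)))     # x := [y + 1]
loop = while_do(("=", "t", 0), load)              # while t = 0 do x := [y + 1]
```

Because the abbreviations expand into the core command forms, fv and mod need no special cases for conditionals or loops.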
The following lemma notes some basic facts about the relationship between
the step relation → and the static semantics of commands.
Lemma 2. If c, s, µ → c′, s′, µ′ then:
• fv(c′) ⊆ fv(c)
• mod(c′) ⊆ mod(c)
• s ∼_{I\mod(c)} s′.
Proof. By a straightforward induction on the derivation of c, s, µ → c′, s′, µ′.
3.4 Locality and Separation
This section gives an informal introduction to the notion of local reasoning, including
a rough sketch of the idea of a local command. The latter will be codified formally
later in the semantics of program specifications in Section 3.6.
The idea of local reasoning is as follows. Perhaps we wish to show that the
result of evaluating a command c in a particular state σ is always included in a set of
states S, i.e., that the configuration c, σ does not abort, and that if c, σ →* skip, σ′
then σ′ ∈ S. Suppose the elements of the set S are all composed of elements
constructed from some other sets S0 and S1 (i.e., σ ∈ S iff, for some σ0 ∈ S0
and σ1 ∈ S1, σ = σ0 • σ1, where • indicates some unspecified state-constructing
function) and that the initial state is σ = σ0 • σ1 with σ0 ∈ S0. In that case, we may
wish to reduce the original problem to the potentially simpler task of showing that
the result of evaluating c in state σ1 is always included in S1, i.e., showing that the
configuration c, σ1 does not abort and that if c, σ1 →* skip, σ′1 then σ′1 ∈ S1.
Under what circumstances is this reduction sound? First, it should be the
case that if the command does not abort in the local state σ1 then neither does it in
the global state σ = σ0 • σ1. Or, contrapositively, if c, (σ0 • σ1) → ↯ then c, σ1 → ↯ .
This is called the safety monotonicity property. Second, it should be the case that if
c, (σ0 • σ1) →* skip, σ′, then there exists σ′1 such that σ′ = σ0 • σ′1 and c, σ1 →* skip, σ′1.
For then, by assumption σ0 ∈ S0, by the reduction σ′1 ∈ S1, and by the structure
of S, σ0 • σ′1 = σ′ ∈ S. This is called the frame property. A command that satisfies
both the safety monotonicity and frame properties is called a local command [62].
For example, c is perhaps a command that reads and writes a particular
set of memory addresses, and the starting state σ perhaps describes the value of
some addresses that are superfluous to the execution of c. We may decompose
σ by partitioning the memory addresses it describes into states σ0 and σ1 such
that σ = σ0 • σ1, with σ1 describing just the memory locations accessed by c and σ0
the remainder. Because (σ0 • σ1) contains all the memory addresses of σ1 and
more, if c can execute successfully (i.e., without aborting) from state σ1 then it
can also execute successfully from σ. And because the addresses of σ0 are irrelevant
to the evaluation of c, σ, those addresses will remain unchanged in the resultant
state σ′, and hence c, σ1 also evaluates to some σ′1 with σ′ = σ0 • σ′1.
This is the essence of local reasoning: leveraging some local property of a
command in a local state to a related global property of the command in a global
state. Of course, local reasoning is not possible for all properties—it relies crucially
on the notion of decomposition •—but when it is possible, it offers an especially
direct path to showing the desired property. If the commands of a programming
language are local, then the principle of local reasoning can be codified into a pro-
gram logic by way of a frame rule, which allows inference from local to global
program specifications. This will be considered in more detail in Section 4.5.
In the example above, the state was separated by memory address. But in
the programming model described in Section 3.3, memory systems consist of both
committed and buffered writes. How should we decompose (or separate) a memory
system with buffered writes? Or, alternatively, how and when can we compose two
partial memory systems? The goal of the next few sections is to carefully define a
handful of different, useful notions of separation for which the sequential commands
of the programming language are local. Each notion of separation will eventually
yield a different frame rule, and hence a different principle of local reasoning about
program specifications.
3.4.1 Spatial Separation
In this section we carefully define a notion of separation for uniprocessor states
analogous to the one described in the previous section, in which states are decom-
posed according to memory address. This is accomplished by defining a notion of
separation for memory systems, and then lifting that function to states that have
identical stacks: i.e., given a definition of µ •µ′, the lifted partial function on states
(s, µ) • (s′, µ′) is defined as follows
(s, µ) • (s′, µ′) =df (s, µ • µ′) if s = s′ and µ • µ′ is defined.
In the sequel, we shall ignore the distinction between the function on memory sys-
tems and the lifted function on states.
States are typically separated by the resources they describe. In the
traditional, strong-memory heap model of separation logic, the resource is a flat shared
memory and the heaps are separated according to address:
h0 ∗̃ h1 =df h0 ⊎ h1 if dom(h0) ∩ dom(h1) = ∅, and ⊥ otherwise.
Note that the partial function is defined only when the heaps have disjoint domains.
It is also, very obviously, commutative.
We wish to define an analogous notion of separation for memory systems,
which consist of heap-write buffer pairs. As in the case of separation logic, we
add heaps with disjoint domains. But how should we combine the write buffers?
To ensure commutativity of the operation, a natural choice is to interleave writes
of the buffers:
(h0, b0) ∗̃ (h1, b1) =df ⋃_{b ∈ b0 ⊎ b1} {(h0 ⊎ h1, b) | dom(h0, b0) ∩ dom(h1, b1) = ∅}
Above, we write dom(h, b) as shorthand for dom(h) ∪ dom(b). The set µ0 ∗̃ µ1 is
called the spatial separation of µ0 and µ1 because it requires disjointness of the
constituent domains, and does not constrain the order of the buffered writes. (In
a later section, we will weaken the disjointness requirement to yield a weaker notion
of separation.)
Interleaving the write buffers results in a notion of separation in which the
relative ordering between the writes in the constituent buffers, which necessarily
have distinct memory locations, is irrelevant. For example, for µ0 = (∅, [(ℓ, v)]) and
µ1 = (∅, [(m, u)]), with ℓ ≠ m, we have both (∅, [(ℓ, v), (m, u)]) ∈ (µ0 ∗̃ µ1) and
(∅, [(m, u), (ℓ, v)]) ∈ (µ0 ∗̃ µ1).
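Spatial separation can be sketched by enumerating buffer interleavings. In this illustration a memory system is assumed to be a pair of a dict (the heap) and a list of (address, value) pairs (the buffer), and the resulting set is represented as a Python list. The function names are not taken from the text.

```python
def interleavings(b0, b1):
    """All merges of b0 and b1 that preserve the order within each buffer."""
    if not b0:
        return [list(b1)]
    if not b1:
        return [list(b0)]
    return ([[b0[0]] + rest for rest in interleavings(b0[1:], b1)] +
            [[b1[0]] + rest for rest in interleavings(b0, b1[1:])])


def dom(mu):
    """Addresses described by a memory system: heap and buffer together."""
    h, b = mu
    return set(h) | {a for a, _ in b}


def spatial(mu0, mu1):
    """Memory systems in the spatial separation of mu0 and mu1 (possibly none)."""
    if dom(mu0) & dom(mu1):
        return []
    (h0, b0), (h1, b1) = mu0, mu1
    return [({**h0, **h1}, b) for b in interleavings(b0, b1)]


mu0 = ({}, [("l", "v")])
mu1 = ({}, [("m", "u")])
result = spatial(mu0, mu1)
```

For the example memory systems of the text, the result contains exactly the two orderings of the two buffered writes, and overlapping domains yield the empty collection.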
Unlike for the heap separation function, the heap-buffer separation function
maps into the power-domain³ of memory systems: for compatible pairs of memory
systems, the separation yields a memory system for each possible interleaving of the
individual write buffers. The resulting set is non-empty if and only if the domains
of the constituent memory systems are disjoint.⁴ This necessitates a slight change
to the definition of a local command given in Section 3.4. Given a notion of
separation that yields a set of memory systems, the safety monotonicity and frame
properties must hold for each possible result: i.e., for every µ ∈ µ0 ∗̃ µ1, if c, µ1 is safe
then so is c, µ; and if c, µ →* skip, µ′ then there exists µ′1 such that c, µ1 →* skip, µ′1
with µ′ ∈ µ0 ∗̃ µ′1.
³Such algebras are called non-deterministic monoids [19].
We can informally observe locality as follows. For safety monotonicity, note
that the domain of each memory system in (µ0 ∗̃ µ1) is a superset of the domain of
µ1, and so if, e.g., a load command does not abort with a memory error in µ1 then
neither will it in any element of (µ0 ∗̃ µ1). For the frame property, a load in µ0 ∗̃ µ1
does not change the memory system, and since we have assumed that it does not abort in
µ1 alone, it will have the same result in µ1 as in (µ0 ∗̃ µ1). For example, with
µ0 = (∅, [(ℓ, v)]), µ1 = (∅, [(m, u)]) and the load command c = x := [m], it is clear that,
for any µ ∈ µ0 ∗̃ µ1, if c, (s, µ) →* skip, (s′, µ′) then also c, (s, µ1) →* skip, (s′, µ1).
It is useful to lift the definition of spatial separation from a partial function
on states up to a total function on sets of states as follows:





S0 ∗̃ S1 =df ⋃ {σ0 ∗̃ σ1 | σ0 ∈ S0, σ1 ∈ S1} .
Overloading notation in this way allows us to write, e.g., σ0 ∗̃ (σ1 ∗̃ σ2) as
shorthand for ⋃ {σ0 ∗̃ σ12 | σ12 ∈ σ1 ∗̃ σ2}. It also allows us to assert associativity
of the lifted function:
σ0 ∗̃ (σ1 ∗̃ σ2) = (σ0 ∗̃ σ1) ∗̃ σ2
Spatial separation and its lifting are also commutative; they have as units
(∅, ε) and {(∅, ε)}, respectively.
⁴This is more convenient than defining a partial function into the power-domain. In that case,
⊥ and ∅ would apparently both indicate incompatibility.
3.4.2 Temporal Separation
Spatial separation results in a set of states that encompasses all possible interleavings
of the composed write buffers. Consequently, it is unsuitable for composing states
with a particular interleaving in mind. Temporal separation does just this. Given
memory systems µ0 and µ1, the strong temporal separation µ0 ◀̃ µ1 is the element
of the set µ0 ∗̃ µ1 in which the writes of µ0 all precede the writes of µ1. For example,
for µ0 = (∅, [(ℓ, v)]) and µ1 = (∅, [(m, u)]), the strong temporal separation µ0 ◀̃ µ1
is given by:
µ0 ◀̃ µ1 = (∅, [(ℓ, v), (m, u)]).
Instead of interleaving the constituent write buffers as in spatial separation, they
are concatenated by temporal separation. For another example, let µ′0 = (ℓ ↦ v, ε).
Then µ′0 ◀̃ µ1 = (ℓ ↦ v, [(m, u)]). Again the writes of µ′0 (which are committed)
precede the writes of µ1 in the composed state µ′0 ◀̃ µ1. As a final example, consider
µ′1 = (m ↦ u, ε) and the temporal separation µ0 ◀̃ µ′1. The presumed result of
this composition is (m ↦ u, [(ℓ, v)]). But this violates the property of having the
writes of the left-hand side precede the writes of the right-hand side because, in
the composition, the committed write m ↦ u implicitly precedes the buffered write
(ℓ, v). Consequently, in the definition of ◀̃ we explicitly rule out this case by
requiring either that the left-side buffer or right-side heap be empty.
The complete definition of µ0 ◀̃ µ1 is as follows:
(h, b) ◀̃ (h′, b′) =df (h ⊎ h′, b ++ b′) if dom(h, b) ∩ dom(h′, b′) = ∅ and (h′ = ∅ ∨ b = ε), and ⊥ otherwise.
Clearly µ0 ◀̃ µ1 is defined only when µ0 ∗̃ µ1 is non-empty and, because b ++ b′ ∈ (b ⊎ b′),
µ0 ◀̃ µ1 ∈ (µ0 ∗̃ µ1) whenever it is defined. The argument for locality w.r.t. temporal
separation is consequently similar to that for spatial separation.
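The definition of strong temporal separation can be sketched directly, under the same assumed encoding of memory systems as (dict, list) pairs, with None standing for ⊥. The function name strong_temporal is illustrative only.

```python
def dom(mu):
    """Addresses described by a memory system: heap and buffer together."""
    h, b = mu
    return set(h) | {a for a, _ in b}


def strong_temporal(mu0, mu1):
    """Strong temporal separation of mu0 and mu1, or None when undefined."""
    (h0, b0), (h1, b1) = mu0, mu1
    if dom(mu0) & dom(mu1):
        return None          # the constituent domains must be disjoint
    if h1 and b0:
        return None          # neither the right heap nor the left buffer is empty
    return ({**h0, **h1}, b0 + b1)


mu0 = ({}, [("l", "v")])
mu1 = ({}, [("m", "u")])
mu0_committed = ({"l": "v"}, [])     # plays the role of µ′0
mu1_committed = ({"m": "u"}, [])     # plays the role of µ′1

both_buffered = strong_temporal(mu0, mu1)
left_committed = strong_temporal(mu0_committed, mu1)
right_committed = strong_temporal(mu0, mu1_committed)
```

The three results mirror the three examples above: the first two are defined by concatenating the buffers, and the last is undefined because a committed write on the right would precede a buffered write on the left.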
The requirement that the constituent domains of the temporal separation
µ0 ◀̃ µ1 be disjoint is rather strong, however, and it is possible to eliminate this
condition entirely. We refer, in the sequel, to µ0 ◀̃ µ1 as the strong temporal
separation of µ0 and µ1, and now define a more relaxed operation µ0 ◁̃ µ1, which we
call a weak temporal separation. As before, we illustrate this separation with a few
examples before showing the complete definition.
First, consider µ0 = (∅, [(ℓ, v)]) and µ1 = (∅, [(ℓ, u)]). Their weak temporal
separation µ0 ◁̃ µ1 simply concatenates the constituent write buffers, giving
(∅, [(ℓ, v), (ℓ, u)]). Note that the semantics of load ensures that the result of loading
ℓ in the context of µ1 is the same as for the context of µ0 ◁̃ µ1 because only the
value of the most recent write to a particular location is returned.
Next consider also µ′0 = (ℓ ↦ v, ε) and µ′1 = (ℓ ↦ u, ε). The weak temporal
separation µ′0 ◁̃ µ1 is formed as in the analogous example for the strong variant:
(ℓ ↦ v, [(ℓ, u)]). And the weak temporal separation µ0 ◁̃ µ′1 is undefined, as in the
analogous example for the strong variant, because the ostensibly more recent
committed write from µ′1 would precede the buffered write of µ0.
Finally consider the weak temporal separation µ′0 ◁̃ µ′1. For the strong variant
as well as for spatial separation this is, of course, undefined because the
constituent domains are not disjoint. This is necessary because the result of adding the
maps that represent heaps is undefined in this case; for what would be the result of
applying the hypothetical map (ℓ ↦ v) ⊎ (ℓ ↦ u) to ℓ? Neither u nor v seems like
a suitable answer in general, but in the case of temporal separation, we can answer
confidently: the result should be u, because the committed u write is more recent
than the committed v write. Consequently we use the map overriding operation to
combine heaps in the definition of weak temporal separation.
(h, b) ◁̃ (h′, b′) =df (h\\h′, b ++ b′) if h′ = ∅ ∨ b = ε, and ⊥ otherwise.
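Weak temporal separation differs from the strong variant only in dropping the disjointness requirement and combining heaps by overriding. A minimal sketch, again assuming memory systems encoded as (dict, list) pairs and None for ⊥, with the hypothetical name weak_temporal:

```python
def weak_temporal(mu0, mu1):
    """Weak temporal separation of mu0 and mu1, or None when undefined."""
    (h0, b0), (h1, b1) = mu0, mu1
    if h1 and b0:
        return None                      # right heap and left buffer both non-empty
    return ({**h0, **h1}, b0 + b1)       # entries of h1 override those of h0


mu0 = ({}, [("l", "v")])
mu1 = ({}, [("l", "u")])
mu0_committed = ({"l": "v"}, [])         # plays the role of µ′0
mu1_committed = ({"l": "u"}, [])         # plays the role of µ′1

buffers_only = weak_temporal(mu0, mu1)
heap_then_buffer = weak_temporal(mu0_committed, mu1)
buffer_then_heap = weak_temporal(mu0, mu1_committed)
heap_then_heap = weak_temporal(mu0_committed, mu1_committed)
```

The four results correspond to the four examples in the text, all of which concern the same location.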
One way to understand the choice of the overriding operation on heaps is
w.r.t. an alternative, more concrete state model in which the heap is represented
by a list of writes l instead of a partial function. The list is intended to capture
the complete history of committed writes in the same way that the buffer captures
the history of uncommitted writes. The model of state given in Section 3.3 uses a
partial function h instead of a list of committed writes because only the most recent
committed write to a particular location is relevant to the operational semantics.
We can think of this model of state as an abstraction of the more concrete model in
which committed writes are represented by lists. The abstraction function α that
maps a concrete memory system (l, b) into an abstract memory system α(l, b) is
defined as follows:
α(l, b) =df (l̄, b),
where l̄ is the lookup function for list l, as defined in Section 2.1.3.
Let us again consider the definition of weak temporal separation. In the
context of concrete states, the definition is completely natural:
(l, b) ◁̃γ (l′, b′) =df (l ++ l′, b ++ b′) if l′ = ε ∨ b = ε, and ⊥ otherwise.
This definition and the abstraction function given above provide a correctness
criterion for a candidate definition of weak temporal separation on abstract states,
namely that:
α((l, b) ◁̃γ (l′, b′)) = α(l, b) ◁̃ α(l′, b′).
It is easy to see that the definition given above for weak temporal separation for
abstract states satisfies this criterion:
α((l, b) C̃ γ (l′, b′))
= {definition of C̃ γ}
α(l ++ l′, b ++ b′)
= {definition of α(−)}
(\overline{l ++ l′}, b ++ b′)
= {\overline{l ++ l′} ∈ l̄\\l̄′, and m ∈ l̄\\l̄′ ≡df m = l̄\\l̄′}
(l̄\\l̄′, b ++ b′)
= {definition of C̃ }
(l̄, b) C̃ (l̄′, b′)
= {definition of α(−)}
α(l, b) C̃ α(l′, b′).
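The correctness criterion can also be checked mechanically on small instances. The sketch below is only an illustration under stated assumptions: lists of committed writes are ordered oldest first, the lookup function returns the most recent write to each location (our reading of Section 2.1.3), and all function names are hypothetical.

```python
from itertools import product

def lookup(l):
    """Abstract a list of committed writes to a heap: the latest write wins."""
    h = {}
    for loc, val in l:
        h[loc] = val
    return h

def alpha(l, b):
    """Abstraction function: replace the write list by its lookup function."""
    return (lookup(l), list(b))

def sep_concrete(c0, c1):
    (l0, b0), (l1, b1) = c0, c1
    if l1 and b0:                      # requires l' = ε or b = ε
        return None
    return (l0 + l1, b0 + b1)

def sep_abstract(m0, m1):
    (h0, b0), (h1, b1) = m0, m1
    if h1 and b0:                      # requires h' = ∅ or b = ε
        return None
    return ({**h0, **h1}, b0 + b1)

def criterion_holds(c0, c1):
    """alpha(c0 sep c1) == alpha(c0) sep alpha(c1), including definedness."""
    conc = sep_concrete(c0, c1)
    abst = sep_abstract(alpha(*c0), alpha(*c1))
    if conc is None or abst is None:
        return conc is None and abst is None
    return alpha(*conc) == abst

# Exhaustive check over all write lists of length at most two.
WRITES = [("l", 1), ("l", 2), ("m", 1)]
LISTS = [list(p) for n in range(3) for p in product(WRITES, repeat=n)]
STATES = [(l, b) for l in LISTS for b in LISTS]
all_ok = all(criterion_holds(c0, c1) for c0 in STATES for c1 in STATES)
print(all_ok)
```

A bounded exhaustive check of this kind is no substitute for the calculation above, but it exercises both the defined and the undefined cases of the two separators.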
It is easy to see that both temporal separators are associative and, as for
spatial separation, have (∅, ε) as a unit. Furthermore, the weak variant is defined
whenever the strong variant is defined; and when both are defined, they are equal.
We may also lift these functions up to the power domain, as we did with spatial
separation:
S0 J̃ S1 =df {σ0 J̃ σ1 | σ0 ∈ S0 ∧ σ1 ∈ S1 ∧ def(σ0 J̃ σ1)}
S0 C̃ S1 =df {σ0 C̃ σ1 | σ0 ∈ S0 ∧ σ1 ∈ S1 ∧ def(σ0 C̃ σ1)}
Finally, we note the fact that the strong temporal separation can be defined
in terms of spatial separation and weak temporal separation:
µ0 J̃ µ1 = µ ⇔ µ0 C̃ µ1 = µ ∧ µ ∈ µ0 ∗̃ µ1.
We will leverage this fact in the sequel to simplify the rest of the development.
3.4.3 Spatiotemporal Separation
Both spatial and temporal separation are restrictions of a more general, unifying notion of
separation. We write µ0 #̃ µ1 for the spatiotemporal separation of memory systems
µ0 and µ1, defined as follows:
(h0, b0) #̃ (h1, b1) =df ⋃_{b ∈ b0\\b1} {(h0\\h1, b) | dom(b0) ∩ dom(h1) = ∅} .
For example, consider µ0 = (∅, [(`, 1), (m, 2)]) and µ1 = (∅, [(`, 3), (n, 4)]).
Note that µ0 and µ1 are not strongly disjoint, and hence µ0 ∗̃ µ1 and µ0 J̃ µ1 are
undefined. The weak temporal separation µ0 C̃ µ1 is, on the other hand, defined
and equal to (∅, [(`, 1), (m, 2), (`, 3), (n, 4)]). The spatiotemporal separation µ0 #̃ µ1
additionally includes the following states:
• (∅, [(`, 1), (`, 3), (m, 2), (n, 4)])
• (∅, [(`, 1), (`, 3), (n, 4), (m, 2)])
Note in particular that the later write to `, with value 3, never precedes the
earlier write to `, with value 1; all other interleavings are included.
This is crucial to a locality argument because it ensures that a load in state µ1, if
safe, will have the same result as a load in the separated state (µ0 #̃ µ1).
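The example can be reproduced with a short enumeration. This is a sketch under an explicit assumption about the buffer operation b0\\b1, whose definition lies outside this section: we take it to be the set of interleavings that preserve the internal order of each buffer and never place a write from b1 before a write from b0 to the same location. The function names are hypothetical.

```python
def interleavings(b0, b1):
    """Merges of b0 and b1 that keep each buffer's order and never place a
    b1 write before a b0 write to the same location (assumed reading)."""
    if not b0:
        return [list(b1)]
    if not b1:
        return [list(b0)]
    out = [[b0[0]] + rest for rest in interleavings(b0[1:], b1)]
    # b1[0] may come next only if no remaining b0 write targets its location.
    if all(loc != b1[0][0] for loc, _ in b0):
        out += [[b1[0]] + rest for rest in interleavings(b0, b1[1:])]
    return out

def spatiotemporal_sep(m0, m1):
    """Set (here: list) of memory systems in the spatiotemporal separation."""
    (h0, b0), (h1, b1) = m0, m1
    if {loc for loc, _ in b0} & set(h1):   # dom(b0) and dom(h1) must be disjoint
        return []
    return [({**h0, **h1}, b) for b in interleavings(b0, b1)]

mu0 = ({}, [("l", 1), ("m", 2)])
mu1 = ({}, [("l", 3), ("n", 4)])
result = spatiotemporal_sep(mu0, mu1)
for state in result:
    print(state)
```

Under this reading the enumeration yields three memory systems: the weak temporal separation itself and the two additional interleavings listed above.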
It is easy to see that spatiotemporal separation generalizes both spatial and
temporal separation. In the spatial case, the strongly disjoint definedness condition
obviously implies the weakly disjoint definedness condition for spatiotemporal sep-
aration. And when the memory systems are strongly disjoint h0\\h1 = h0 ] h1 by
Lemma 1 and b0\\b1 = b0 ] b1 by Lemma 2. For the strong temporal case, the de-
finedness conditions are again obviously stronger, and b0 ++ b1 ∈ b0\\b1. In the weak
case, b0 = ε ∨ h1 = ∅ implies dom(b0) ∩ dom(h1) = ∅, and again b0 ++ b1 ∈ b0\\b1.
As with spatial separation, we lift spatiotemporal separation up to a function
on the power domain of memory systems (and states), and abuse notation to refer
to whichever function is appropriate in context:









Spatiotemporal separation is associative and has (∅, ε) as a unit as for the other
separators, but it is not commutative.
3.4.4 Flushing Closure
In Section 3.4 we considered the task of showing, for some configuration C and set of
states S, that C is safe and, if it evaluates to a configuration skip, σ then σ ∈ S. The
second part of this task is equivalently restated as requiring that range(C) ⊆ S. The
set range(C) has a special structure worth noting: namely, it is down-closed w.r.t.
the flushing order. That is, if σ ∈ range(C) and σ →τ σ′ then σ′ ∈ range(C) as well.
This is a consequence of the fact that the nondeterministic flushing of buffered writes
is incorporated in the evaluation semantics of programs; from c-tau, if σ →τ σ′ then
C ∗→ skip, σ → skip, σ′.
By transitivity, C ∗→ skip, σ′ and so by definition of the range of a configuration,
σ′ ∈ range(C).
For example, the set S0 =df {(s, ∅, [(`, v)]) | s ∈ Stack}, which consists of
states that have a single buffered write, is not closed because that write, according
to the definition of the flushing relation →τ in Section 3.3, may nondeterministically
commit to memory as follows:
(∅, [(`, v)]) →τ (` 7→ v, ε),
but there is no stack s such that (s, ` 7→ v, ε) ∈ S0. On the other hand, the set
S1 =df {(s, ` 7→ v, ε) | s ∈ Stack} is closed because each included state is completely
flushed, and so none of the included states may take additional flushing steps beyond
the bounds of S1. The set S0∪S1 is also closed because the elements of S0 may step
to elements of S1. Furthermore, it is easy to see that both the empty set and the
set of all states are closed, and that closure is preserved by union and intersection.
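These closure claims are easy to test on finite sets. The sketch below makes two simplifying assumptions that are not taken from the text: stacks are omitted, and a flushing step commits the oldest buffered write to the heap (which agrees with the single-write example above). Heaps are encoded as sorted tuples so that memory systems can be stored in Python sets; all names are hypothetical.

```python
def flush_step(mu):
    """One flushing step: commit the oldest buffered write (assumed FIFO).

    A memory system is (heap, buffer), both tuples of (location, value) pairs.
    Returns None when the buffer is empty and no step is possible."""
    heap, buf = mu
    if not buf:
        return None
    (loc, val), rest = buf[0], buf[1:]
    new_heap = dict(heap)
    new_heap[loc] = val
    return (tuple(sorted(new_heap.items())), rest)

def is_closed(states):
    """A set is closed if every flushing step from a member stays in the set."""
    return all(flush_step(mu) is None or flush_step(mu) in states
               for mu in states)

S0 = {((), (("l", 1),))}          # a single buffered write
S1 = {((("l", 1),), ())}          # the same write, committed
print(is_closed(S0), is_closed(S1), is_closed(S0 | S1), is_closed(set()))
```

Checking single steps suffices, since closure under one step implies closure under any finite sequence of steps.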
Because the sets range(C) are closed, we may focus our attention on showing
the correctness of C w.r.t. sets S that are also closed. For if S does not have this
special structure, then either it will not be the case that range(C) ⊆ S (if, e.g.,
C ∗→ skip, σ → skip, σ′ with σ ∈ S but σ′ ∉ S), or it will be possible to demonstrate
a stronger property of the configuration; namely that range(C) ⊂ S′, for some closed
set of states S′ ⊆ S.
In Section 3.4 we also described a particular strategy for showing range(C) ⊆
S that we referred to as local reasoning. That strategy relied on the ability to
consider the set S as a composition of sets S0 and S1; i.e., for some notion of
separation •, it must be the case that S = S0 • S1. Because we choose to restrict
our attention to closed sets, it should be the case that the notion of separation
preserves the property of being closed: if S0 and S1 are down-closed w.r.t. the
flushing order, then S ought to be as well. And, in fact, for the three notions of
separation introduced in Sections 3.4.1, 3.4.2 and 3.4.3, this turns out to be the case.
Proposition 3. If S0 and S1 are closed w.r.t. the flushing order, then so are:
1. S0 ∗̃ S1;
2. S0 C̃ S1; and
3. S0 #̃ S1.
In each case, the proof follows from the fact that if µ ∈ µ0 • µ1 and µ′ lies below µ in the
flushing order, then there exist µ′0 and µ′1, lying below µ0 and µ1 respectively in the
flushing order, such that µ′ ∈ µ′0 • µ′1.
3.5 Sequential Assertions
Sequential assertions denote sets of uniprocessor machine states, and will later be
used to express the pre- and post-conditions of commands in the specification logic.
The language of sequential assertions is given by the following grammar:
Asrt P ::= b | (P ∨ P ′) | (P ∧ P ′) | (∃x : P ) | (∀x : P ) |
emp | e ⇝ e′ | bar | (P ∗ P ′) | (P C P ′)
The informal meanings of the assertions above are as follows. The lifting of a boolean
expression (true, false, x = y, etc.) to an atomic formula, disjunction, conjunction
and the quantifiers have the same basic meaning as in first-order logic: the models for
which the boolean expression b evaluates to 1; the models that satisfy either P or
P ′, etc. The assertion emp describes states with an empty memory system (i.e.,
both an empty heap and an empty write buffer). The assertion e ⇝ e′ describes a single write
to location e with value e′, either buffered or flushed to memory. The assertion bar describes
empty states in which preceding writes must have been committed to memory.
The assertion (P ∗ P ′) describes spatial separation of the states of P and P ′, as
described in Section 3.4.1. The assertion (P C P ′) describes temporal separation of
the states of P and P ′, as described in Section 3.4.2. Note that the assertion language




The meaning of sequential assertions is given by a satisfaction relation M |= P ,
relating uniprocessor models M to sequential assertions P . A model is a triple
(s, µ, γ), in which (s, µ) is a uniprocessor state, as defined in Section 3.3, and γ is a
boolean value. Informally, γ can be interpreted as indicating whether or not writes
earlier than those explicitly described by the memory system µ must necessarily
have been committed to memory. Note that the boolean γ is used in models of
assertions but not in the definition of states because it is irrelevant to the execution
of programs, and is only used to give meaning to assertions. We furthermore require
that if the heap component of the memory system is non-empty—i.e., describes some
writes that have flushed to memory—then γ = t. This is because any writes that
precede those explicitly described by the state must be flushed to memory, because
they precede writes that have been flushed to memory. Because it deals with the
flushing status of writes that precede those of a given state, the boolean γ, which
we refer to as the buffer-completeness flag, is particularly important for modeling
the bar assertion and the temporal separating conjunction. We call the pairs (µ, γ)
that are part of a uniprocessor model M generalized uniprocessor memory systems,
and refer to them by the symbol ν.
Definition 2. A generalized uniprocessor memory system (typically indicated by
the symbol ν) is a triple, (h, b, γ), where
• (h, b) is a uniprocessor memory system; and
• γ is a boolean buffer-completeness flag;
such that if γ = f then also h = ∅.
The set of all models is abbreviated Model, and the satisfaction relation
between models and assertions P is defined by recursion on the structure of P , as
shown in Figure 3.4.
s, ν |= b          ≡df  ŝ(b) = 1
s, ν |= P ∨ Q      ≡df  s, ν |= P ∨ s, ν |= Q
s, ν |= P ∧ Q      ≡df  s, ν |= P ∧ s, ν |= Q
s, ν |= ∃x : P     ≡df  ∃v ∈ V : (s[x 7→ v] , ν) |= P
s, ν |= ∀x : P     ≡df  ∀v ∈ V : (s[x 7→ v] , ν) |= P
s, ν |= emp        ≡df  ν = (∅, ε, f) ∨ ν = (∅, ε, t)
s, ν |= bar        ≡df  ν = (∅, ε, t)
s, ν |= e ⇝ e′     ≡df  ν = (∅, [(ŝ(e), ŝ(e′))] , f) ∨
                        ν = (∅, [(ŝ(e), ŝ(e′))] , t) ∨
                        ν = (ŝ(e) 7→ ŝ(e′), ε, t)
s, ν |= P1 ∗ P2    ≡df  ∃ν1, ν2 : ν ∈ (ν1 ∗̂ ν2) ∧
                        s, ν1 |= P1 ∧ s, ν2 |= P2
s, ν |= P1 C P2    ≡df  ∃ν1, ν2 : ν = (ν1 Ĉ ν2) ∧
                        s, ν1 |= P1 ∧ s, ν2 |= P2
Figure 3.4: Sequential satisfaction relation
The meaning of the standard first-order logic formulas is as usual (i.e., classical).
A model M satisfies emp if its memory system µ is empty. A model satisfies
bar if its memory system is empty and its buffer-completeness flag γ is set. A
model satisfies the leads-to assertion e ⇝ e′ either if its memory system consists of
an empty heap and a write buffer with a single write (ŝ(e), ŝ(e′)); or if its buffer-
completeness flag is set and its memory system consists of the single-point heap
ŝ(e) 7→ ŝ(e′) and an empty buffer. The two classes of model represent a write that
is either buffered or committed. For the case in which the write has flushed, any
previous writes must also have flushed, and so the buffer-completeness flag is set.
A model satisfies the temporal (resp. spatial) separating conjunction if it admits a
model-theoretic temporal (resp. spatial) separation, defined below, into models that
satisfy the constituent conjuncts.
(h1, b1, γ1) ∗̂ (h2, b2, γ2) =df {(µ, γ1 ∨ γ2) | µ ∈ (h1, b1) ∗̃ (h2, b2)}
(h1, b1, γ1) Ĉ (h2, b2, γ2) =df   (µ, γ1 ∨ γ2)   if µ = ((h1, b1) C̃ (h2, b2)) ∧ (b1 = ε ∨ γ2 = f)
                                   ⊥              otherwise
Note that the above model-theoretic separation functions lift the memory-system-
theoretic separation functions defined in Section 3.4. In both cases, the buffer-
completeness flag is set in the combined model iff it is set in either of the constituent
models. Furthermore, the temporal function requires that the left-side buffer be
empty whenever the right-side buffer-completeness flag is set, which corresponds
to the informal requirement discussed above that writes which precede a buffer-
complete state must be flushed.
The set of free variables of an assertion P , written fv(P ), is defined as usual:
fv(P ∨ P ′) =df fv(P ) ∪ fv(P ′)      fv(P ∧ P ′) =df fv(P ) ∪ fv(P ′)
fv(∃x : P ) =df fv(P ) \ {x}          fv(∀x : P ) =df fv(P ) \ {x}
fv(emp) =df ∅                         fv(e ⇝ e′) =df fv(e) ∪ fv(e′)
fv(P ∗ P ′) =df fv(P ) ∪ fv(P ′)      fv(P C P ′) =df fv(P ) ∪ fv(P ′)
The following lemma, analogous to Lemma 1, relates the set of free variables
of an assertion and the stack-component of the states that satisfy it:
Lemma 3. If s ∼fv(P ) s′ then s, ν |= P iff s′, ν |= P .
Proof. By induction on the structure of P , using Lemma 1 in the base cases.
We write [[P ]] for the set of all models that satisfy an assertion P ,
[[P ]] =df {M | M |= P} ,
and also P |= P ′ and P ≡ P ′ for semantic entailment and equivalence, respectively:
P |= P ′ ≡df [[P ]] ⊆ [[P ′]]
P ≡ P ′ ≡df [[P ]] = [[P ′]].
Assertions can thus be thought of as syntactic constructs that denote sets of
uniprocessor models, and hence sets of uniprocessor machine states. We now extend
the flushing order on memory systems described in Section 3.4.4 to generalized
memory systems:
Definition 3. For generalized memory systems ν1 = (µ1, γ1) and ν2 = (µ2, γ2),
ν2 →bτ ν1 ≡df γ1 = t ∧ µ2 →τ µ1
ν1 ≤ ν2 ≡df ν2 ∗→bτ ν1
It is easy to see that this defines a partial order on generalized memory
systems. Furthermore, the set of models denoted by assertions is closed with respect
to this order.
Proposition 4. If s, ν |= P and ν ′ ≤ ν then s, ν ′ |= P .
A corollary of this proposition is that the set of states denoted by an assertion
is closed w.r.t. the flushing order.
Corollary 1. If s, µ, γ |= P and µ′ lies below µ in the flushing order then s, µ′, t |= P .
Proof. (µ′, t) ≤ (µ, γ) because µ′ lies below µ in the flushing order and t = t; the result
then follows from Proposition 4.
An effect of this is that assertions are oblivious to the nondeterministic flush-
ing of buffered writes to memory because they denote all possible memory systems
that may result from such flushing. Intuitively, assertions may be thought to describe
only the “initial” states, in which no nondeterministic flushing of writes has taken
place, though the semantics encompasses all states reachable as a result of these steps.
We consider this property to be an important feature of the assertion language—and,
hence, of the specification language.
Consider, as a significant example, the set of states that satisfy the atomic
formula e ⇝ e′. These states may be classified as follows:
1. states that describe a single buffered write, and
2. states in which that write has been committed to memory.
Consider, as a less trivial example, the states that satisfy the compound assertion
1 ⇝ 3 C 1 ⇝ 4. The intuitive “initial” state is one with two successive buffered
writes to location 1. The states of the earlier left-side write assertion include ones
in which the buffered write has and has not committed, and similarly for the later
right-side write assertion. When these two classes of states are combined with the
temporal separation function, three classes of states result: those in which neither
write has flushed, those in which only the earlier write has flushed, and those in
which both have flushed. Crucially, the definition of weak temporal separation
rules out the case in which the later write has flushed but not the earlier write. This
is summarized in Figure 3.5. The result is a set of states that is closed w.r.t. the
flushing relation.
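The combinations summarized in Figure 3.5 can be recomputed from the model-theoretic definitions. The following sketch assumes generalized memory systems are encoded as triples (heap, buffer, flag), with dictionaries for heaps, lists for buffers and Python booleans for the buffer-completeness flag; None stands for ⊥, and the function names are hypothetical.

```python
def leads_to_models(loc, val):
    """The three generalized memory systems satisfying the leads-to assertion."""
    return [({}, [(loc, val)], False),
            ({}, [(loc, val)], True),
            ({loc: val}, [], True)]

def temporal_sep(n1, n2):
    """Model-theoretic weak temporal separation; None stands for ⊥."""
    (h1, b1, g1), (h2, b2, g2) = n1, n2
    if h2 and b1:          # memory-system separation is undefined
        return None
    if b1 and g2:          # requires b1 = ε or γ2 = f
        return None
    return ({**h1, **h2}, b1 + b2, g1 or g2)

# Rows in the order of Figure 3.5: the right-side model varies slowest.
rows = [(n1, n2, temporal_sep(n1, n2))
        for n2 in leads_to_models(1, 4)
        for n1 in leads_to_models(1, 3)]
for n1, n2, combined in rows:
    print(n1, n2, combined)

defined = [c for _, _, c in rows if c is not None]
```

No defined result has the later write committed while the earlier write is still buffered, which is the case that the side conditions are designed to exclude.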
The requirement that assertions denote sets of states that are closed w.r.t.
flushing also explains why we have chosen atomic formulas that describe an empty
buffer, bar, as well as an empty heap and empty buffer, emp. A conceivable al-
ternative might be to use one atomic formula to describe empty heaps, say with
1 ⇝ 3               1 ⇝ 4               1 ⇝ 3 C 1 ⇝ 4
(∅, [(1, 3)] , f)   (∅, [(1, 4)] , f)   (∅, [(1, 3), (1, 4)] , f)
(∅, [(1, 3)] , t)   (∅, [(1, 4)] , f)   (∅, [(1, 3), (1, 4)] , t)
(1 7→ 3, ε, t)      (∅, [(1, 4)] , f)   (1 7→ 3, [(1, 4)] , t)
(∅, [(1, 3)] , f)   (∅, [(1, 4)] , t)   ⊥
(∅, [(1, 3)] , t)   (∅, [(1, 4)] , t)   ⊥
(1 7→ 3, ε, t)      (∅, [(1, 4)] , t)   (1 7→ 3, [(1, 4)] , t)
(∅, [(1, 3)] , f)   (1 7→ 4, ε, t)      ⊥
(∅, [(1, 3)] , t)   (1 7→ 4, ε, t)      ⊥
(1 7→ 3, ε, t)      (1 7→ 4, ε, t)      (1 7→ 4, ε, t)
Figure 3.5: Sequential assertion semantics example
arbitrary buffers, say heapemp, and another to describe empty buffers with arbi-
trary heaps. (Then the assertion emp could be defined as a simple conjunction.)
But heapemp is unsuitable because it does not describe a set of states that is closed
under flushing. For if any write is flushed from a buffer in a state with an empty
heap, the resulting state would have a heap that is nonempty.
The closure requirement also explains why we have omitted negation (and
implication) from the assertion language. In a naive semantics, a state might be
said to satisfy ¬P if and only if it does not satisfy P ; i.e., as a complement:
[[¬P ]] =df Model \ [[P ]].
But the complement of a down-closed set is not generally closed itself, which would
violate the predicate requirement.
Let us call the complement of a closed set an open set.5 Open sets are rather
curious; for example, an open set of memory systems that contains (1 7→ 2, ε, t) also
contains every possible memory system that, when flushed, results in this memory
system, e.g., (∅, [(1, 2)]), (∅, [(1, 3), (1, 2)]), (∅, [(1, 4), (1, 3), (1, 2)]), etc. Open sets
are thus not appropriate for describing individual writes. Sets that are both closed
and open might well yield an elegant algebra, allowing full negation and implication,
but it is not clear how such an algebra could be used for general program reasoning.
5In fact, the open and closed sets described here appear to form a particular kind of topological
space, called an Alexandrov space, which is characterized by the fact that openness
is preserved by arbitrary intersections, and not just finite intersections.
Also note that it is important to define the satisfaction relation so that
closure of atomic formulas is immediate and closure is preserved by each connective,
as opposed to closing the entire relation at once. For example, if [[e ⇝ e′]] were not
closed w.r.t. flushing, then
[[e ⇝ e′ C bar]] = [[e ⇝ e′]] Ĉ [[bar]] = ∅,
and hence so too would be its flushing closure.
3.5.2 Sequential Assertion Abbreviations
In this section we extend the language of assertions with additional constructs whose
meaning can be defined in terms of the assertions already described.
The following abbreviation, analogous to the points-to formula of separation
logic, describes the result of flushing a single write to memory:
e 7→ e′ =df e ⇝ e′ C bar
That is, we may describe the value of a location in memory by describing a buffered
write to that location followed by a barrier assertion.
As indicated at the end of Section 3.4.2, we also introduce a strong temporal
separating conjunction as the (additive) conjunction of spatial and weak-temporal
conjunctions:
P J P ′ =df (P C P ′) ∧ (P ∗ P ′).
Because the spatial separating conjunction P ∗ Q is commutative, there is
no need to define its converse operation. The temporal separating conjunction,
however, is not. Hence, we define P B Q as shorthand for Q C P , and similarly for
P I Q:
P B Q =df Q C P
P I Q =df Q J P
3.5.3 Separating Implications
Although not required for the sequential logic, the concept of a separating implication
will be useful later. Thus, we shall introduce the concept first in the comparatively
simple sequential setting to ease its discussion later.
Separating implications are related to separating conjunctions in the same
way that the additive implication (P ⇒ Q) is related to the additive conjunction
(P ∧ Q), namely:
(P ⇒ Q) ∧ P |= Q.
For example, the separating implication for spatial separation, written P −∗ Q,
describes states that, when spatially conjoined with another state that satisfies P ,
satisfy Q. Or, less formally, P −∗ Q describes Q states with P -shaped holes. For
example, if Q describes a particular kind of record and P a field in that record, then
P −∗ Q describes records that lack the P field. Consequently, adding such a field
using the spatial separating conjunction, (P −∗ Q) ∗ P , yields a Q record:
(P −∗ Q) ∗ P |= Q.
The meaning of the spatial implication P −∗ Q is given as follows:
(s, µ) |= P −∗ Q ≡df ∀µ0, µ1 : s, µ0 |= P ∧ µ1 = µ0 ∗ µ ⇒ s, µ1 |= Q
Note that, if defined directly, the converse relation P ∗− Q would be equiva-
lent to Q −∗ P because spatial separation is commutative. Consequently, we simply
define P ∗− Q to be shorthand for Q −∗ P . On the other hand, temporal separa-
tion is not commutative, and so we may define two notions of temporal implication:
P −C Q and Q C− P . The former describes states that, when temporally conjoined
on the left side (i.e., the past) with a state that satisfies P , satisfy Q. The latter
describes states that, when temporally conjoined on the right side (i.e., the future)
with a state that satisfies P , satisfy Q. Formally:
(s, µ) |= P −C Q ≡df ∀µ0, µ1 : s, µ0 |= P ∧ µ1 = µ0 C µ ⇒ s, µ1 |= Q
(s, µ) |= Q C− P ≡df ∀µ0, µ1 : s, µ0 |= P ∧ µ1 = µ C µ0 ⇒ s, µ1 |= Q
The entailments that characterize the relationship between these implications and
temporal conjunction are as follows:
P C (P −C Q) |=Q
(Q C− P ) C P |=Q
As with spatial implication, we simply define the converse relations as shorthand:
P B− Q =df Q −C P
P −B Q =df Q C− P
The right-side temporal implication, and its characterizing entailment, is
especially useful when used with the bar assertion. In particular, the formula
Q C− bar describes states which, if flushed, would satisfy Q. For example,
x 7→ 1 C− bar describes states with any number of buffered writes to address x, with
the final one having value 1. One consequence is that:
(x ⇝ 2) C (x ⇝ 1) |= (x 7→ 1) C− bar.
The ability to express such sets of states will be of crucial importance to the program
logic for concurrent programs described in the following chapter.
3.5.4 Sequential Algebra
A few additional semantic equivalences and entailments are shown in Figures 3.6
and 3.7, respectively. If a formula contains instances of •, then that is short-hand
for the same formula in which the • has been consistently replaced by any of the
four separating conjunctions.
P • emp ≡ P
emp • P ≡ P
(P • P ′) • P ′′ ≡ P • (P ′ • P ′′)
P ∗ P ′ ≡ P ′ ∗ P
bar • bar ≡ bar
(P • P ′) C bar ≡ (P C bar) • (P ′ C bar)
P J bar ≡ P C bar
Figure 3.6: Sequential semantic equivalences
Each of the separating conjunctions is associative and has emp as a unit.
Furthermore, the spatial separating conjunction is also commutative. The sepa-
rating conjunctions are additive w.r.t. the barrier assertion bar, and bar also dis-
tributes fully through the separating conjunctions. Finally we note that the strong
and weak temporal separating conjunction of a barrier assertion are equivalent be-




bar |= emp
P • P ′ |= P ′′ • P ′                              if P |= P ′′
P • P ′ |= P • P ′′                                if P ′ |= P ′′
e 7→ e′ |= e ⇝ e′
P J P ′ |= P ∗ P ′
P J P ′ |= P C P ′
P • (P ′ ◦ P ′′) |= (P • P ′) ◦ P ′′                for P • P ′ |= P ◦ P ′
(P ◦ P ′) • P ′′ |= P ◦ (P ′ • P ′′)                for P • P ′ |= P ◦ P ′
(P ∗ P ′) J (P ′′ ∗ P ′′′) |= (P J P ′′) ∗ (P ′ J P ′′′)
Figure 3.7: Sequential semantic entailments
The first three entailments—that bar strengthens emp and monotonicity of
the separating conjunctions—follow directly from the definition of the satisfaction
relation. The next three entailments follow from their respective abbreviation ex-
pansions and by monotonicity. The three separating conjunctions naturally form
a sort of lattice, and they satisfy (the second- and third-to-last entailments) what
are known as small exchange laws [23]; the full exchange law (the final entailment)
only holds for the spatial and strong temporal conjunctions. The full exchange law
does not hold for the other combinations of separating conjunctions because, e.g.,
it implies commutativity of the main connective in the consequent, and the other
conjunctions are not commutative.
3.6 Sequential Specifications
A sequential program specification is a three-tuple, written as follows:
` {P} c {Q} ,
where c is a command and P,Q are assertions, referred to as the pre-condition
and post-condition, respectively. The specifications considered here assert partial
correctness of a command, which means that any specification of a nonterminating
command is true. Their informal meaning is roughly analogous to that of separation
logic: if c is evaluated in a state that satisfies P then: 1) it does not abort, and 2)
if it evaluates fully, it terminates in a state that satisfies Q.
3.6.1 Sequential Proof Theory
We now present a proof theory for deriving true sequential program specifications.
The axioms and inference rules are strongly inspired by Separation Logic; the axioms
and inference rules of Hoare Logic are included verbatim.
` {P} skip {P}                                         (skip)
` {!b ∨ P} assume(b) {P}                               (assume)
` {b ∧ P} assert(b) {P}                                (assert)
` {P [e/x]} x := e {P}                                 (assign)
` {e ⇝ e′ J P} x := [e] {(e ⇝ e′ J P ) ∧ x = e′}       (load)
` {e ⇝ e′′ J P} [e] := e′ {(e ⇝ e′′ J P ) C e ⇝ e′}    (store)
` {emp} fence {bar}                                    (fence)
Figure 3.8: Sequential axioms
The axioms of the logic are given in Figure 3.8.6 The axiom for skip indi-
cates its evaluation preserves any pre-condition into a post-condition. The fact that
assertions denote down-closed sets of states is crucial for the soundness of this rule,
because a configuration skip, σ may well have non-trivial transitions. That is, it may
be the case, for some σ |= P , that skip, σ → skip, σ′ with σ′ ≠ σ. In this case
it is clear from the operational semantics of commands that σ′ lies below σ in the
flushing order, and so σ′ |= P
as well because assertions denote sets that are closed w.r.t. the flushing relation.
6The axioms and inference rules of the proof theory are of course correctly referred to as axiom
schemas and rule schemas, because they must be instantiated metalogically with particular formulas,
commands and expressions.
The axioms for both the assume and assert commands ensure that the prop-
erty P holds if the commands either evaluate successfully or diverge. In the case
of assume(b), the command will execute successfully if the boolean expression b
evaluates to 1 in the current state and P holds to begin with, and will diverge if b
evaluates to 0. Hence, the pre-condition of assume(b) requires (!b ∨ P ), which means
that either the command will diverge or the property P holds to begin with. In the
case of assert(b), if b evaluates to false then the command will abort, which would
invalidate the axiom. Hence, the pre-condition requires both that b evaluate to true
and that P holds to begin with; i.e., that (b ∧ P ) is true of the starting state.
The assignment axiom, as in Hoare logic, effectively transforms a formula
with occurrences of the expression to be assigned into one with that expression
replaced by the variable to which the expression was assigned. For example,
` {y ≤ x + 1} x := x + 1 {y ≤ x} can be derived from the assignment axiom by
instantiating P by y ≤ x, given that (y ≤ x) [x + 1/x] = y ≤ x + 1.
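The instance above can be sanity-checked by brute force over a small range of integers. This is merely an illustrative sketch: predicates are modelled as Python functions of the stack variables x and y, the helper triple_holds is hypothetical, and a bounded check of this kind is evidence rather than a proof.

```python
def triple_holds(pre, command, post, values=range(-5, 6)):
    """Check {pre} command {post} on every stack (x, y) drawn from values."""
    for x in values:
        for y in values:
            if pre(x, y):
                x_after, y_after = command(x, y)
                if not post(x_after, y_after):
                    return False
    return True

P = lambda x, y: y <= x                    # post-condition P
pre = lambda x, y: P(x + 1, y)             # P[x+1/x], i.e. y <= x + 1
assign = lambda x, y: (x + 1, y)           # the command x := x + 1

print(triple_holds(pre, assign, P))                       # True
# A pre-condition that is too weak fails the check.
print(triple_holds(lambda x, y: y <= x + 2, assign, P))   # False
```

Substituting x + 1 for x in the post-condition is exactly what evaluating P on the updated stack computes, which is why the first check succeeds.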
The axiom for load requires, in the pre-condition, that the value of the most
recent write e ⇝ e′ be specified, along with any other writes P that temporally succeed
it. These additional writes must be to locations other than e, for otherwise the
specified write e ⇝ e′ would not be the most recent write. Consequently, we use the
strong temporal separating conjunction to partition these additional writes in both
space and time: e ⇝ e′ J P . Note that if P does assert the existence of succeeding
writes to location e, then the pre-condition is inconsistent and the specification
becomes vacuously true. Evaluating the load command does not explicitly change
the heap or buffer of the current state, just the value of the variable into which the
value is loaded. Consequently, the post-condition is just the additive conjunction of
the pre-condition and an equality x = e′ relating the loaded variable and the value
of the most recent write.
The write described in the pre- and post-conditions of the load axiom is
specified with the leads-to assertion. This assertion is used to describe writes that
may or may not be buffered. If the write is buffered, it could flush to memory before
or after the load has completed. But, again, because assertions denote sets of states
that are closed w.r.t. flushing, the soundness of the rule is preserved.
Like the load command, the store command requires that the address
to be updated is already allocated. In this model, this simply means that there
is an existing write to that address, either buffered or flushed. The pre-condition
for the store axiom requires, as a witness to the allocation status of the address to
be updated, specification of the most recent write e ⇝ e′′. Of course, there may be
other, more recent, buffered writes, which do not affect the behavior of the store
command. These are described by the assertion P , and are combined using the
strong temporal separating conjunction with the witness:7 e ⇝ e′′ J P . Because e is
asserted by the pre-condition to be an allocated address, the store command [e] := e′ is safe,
and as a result simply appends a new write after all others: (e ⇝ e′′ J P ) C e ⇝ e′.
Finally, the axiom for fence specifies an empty pre-condition, because the
fence command may execute without any particular assumptions about the state.
The result of the fence command, as specified by the post-condition, is simply to
introduce the bar assertion. This specification is particularly useful when used
in conjunction with a frame rule for a temporal separating conjunction, described
below.
The inference rules of the sequential logic are given in Figure 3.9. The logical
rules disj and ex allow the pre-condition to be weakened. Dually, the logical rules
7Use of the weak temporal separating conjunction here instead of the strong variant would also
be sound, as the leads-to assertion in the pre-condition serves only as a witness to the allocation
status of the location to which the command will store. The given axiom asks for the most recent
write to that location, but in fact any previous write would be suitable as a witness.
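` {P} c {Q}    ` {P ′} c {Q}
---------------------------- (disj)
` {P ∨ P ′} c {Q}

` {P} c {Q}    x ∉ fv(c,Q)
---------------------------- (ex)
` {∃x : P} c {Q}

` {P} c {Q}    ` {P} c {Q′}
---------------------------- (conj)
` {P} c {Q ∧ Q′}

` {P} c {Q}    x ∉ fv(c, P )
---------------------------- (all)
` {P} c {∀x : Q}

` {P} c {Q}    mod(c) ∩ fv(R) = ∅
--------------------------------- (frame-sp)
` {R ∗ P} c {R ∗ Q}

` {P} c {Q}    mod(c) ∩ fv(R) = ∅
--------------------------------- (frame-tm)
` {R C P} c {R C Q}

` {P} c {Q}    mod(c) ∩ fv(R) = ∅
--------------------------------- (frame-stm)
` {R J P} c {R J Q}

P |= P ′    ` {P ′} c {Q′}    Q′ |= Q
------------------------------------- (cons)
` {P} c {Q}

` {P} c {R}    ` {R} c′ {Q}
--------------------------- (seq)
` {P} c ; c′ {Q}

` {P} c {Q}    ` {P} c′ {Q}
--------------------------- (choice)
` {P} c + c′ {Q}

` {P} c {P}
------------ (loop)
` {P} c∗ {P}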
Figure 3.9: Sequential inference rules
conj and all allow the post-condition to be strengthened.
The next three inference rules are variants of the frame rule of separation
logic; one for each of the three separating conjunctions: spatial, temporal and strong
temporal. Note that, although the strong temporal conjunction is defined as the
additive conjunction of spatial and temporal conjunctions, its frame rule is not deriv-
able from the other rules, which justifies the rule’s inclusion. The spatial separating
conjunction is commutative, but the others—which have a temporal aspect—are
not. We have left-side frame rules only for these conjunctions.
The rule of consequence, as in Hoare Logic, allows arbitrary strengthening
of the pre-condition and weakening of the post-condition. Typically the side condi-
tions, which ensure that the pre-condition is strengthened and the post-condition is
weakened, are described using syntactic entailment among assertions. But because
in this project we have not developed a proof system for syntactic entailment—
though some rules can be gleaned from Section 3.5.4—we describe these conditions
using semantic entailment instead.
The remaining structural rules are exactly as in Hoare Logic. The rule for the
sequential composition c ; c′ requires identifying an intermediate assertion used as a
post-condition for the former command and a pre-condition for the latter. A speci-
fication can be proved of the nondeterministic choice c+ c′ if the same specification
can be proved of both commands individually. Finally, an invariant specification
holds for a loop if the body of the loop maintains the invariant.
Derived and Alternative Axioms and Inference Rules
Some other useful rules may be derived from the set given in Figure 3.9. For example,
it is possible to derive modified load and store axioms that describe flushed writes,
with the points-to assertion, in the pre- and post-condition. Because the points-to
assertion e 7→ e′ is defined as e ⇝ e′ C bar, we may instantiate the placeholder
formula P in either axiom schema with a leading barrier assertion. Here is a derived
load-mem axiom schema:
------------------------------------------------------------------ load
` {e ⇝ e′ J (bar J P )} x := [e] {(e ⇝ e′ J (bar J P )) ∧ x = e′}
------------------------------------------------------------------ cons
` {e 7→ e′ J P} x := [e] {(e 7→ e′ J P ) ∧ x = e′}
And a derived store-mem axiom schema:
------------------------------------------------------------------ store
` {e ⇝ e′ J (bar J P )} [e] := f {(e ⇝ e′ J (bar J P )) C e ⇝ f}
------------------------------------------------------------------ cons
` {e 7→ e′ J P} [e] := f {(e 7→ e′ J P ) C e ⇝ f}
The rule of consequence is applied properly above because the strong temporal
conjunction is associative, and because
e ⇝ e′ J bar ≡ e ⇝ e′ C bar =df e 7→ e′,
according to the equivalence from Figure 3.6 and the definition of the points-to
assertion.
Another example of a useful derived rule is a “global” fence axiom, which
describes the result of a fence operation on a potentially complete system state
description:
    ⊢ {emp} fence {bar}                 (fence)
    ----------------------------------- (tm-frame)
    ⊢ {P ◁ emp} fence {P ◁ bar}
    ----------------------------------- (cons)
    ⊢ {P} fence {P ◁ bar}
We can also use the sequential separating implication to give an alternative
axiom for the fence command for backwards reasoning:
    ⊢ {P ◁− bar} fence {P}    (backwards-fence)

This axiom describes a sufficiently weak⁸ pre-condition for the operation of the fence
command in terms of an arbitrary post-condition P.
3.6.2 Semantics of Sequential Specifications
Following Vafeiadis [54], the formal semantics of specifications is given by a family of
predicates, safeₙ(c, s, µ, Q), parametrized by n ∈ ℕ, that relate a command c, state
(s, µ) and post-condition Q according to the informal explanation above. Once these
predicates are defined, we define truth of specifications as follows:

    ⊨ {P} c {Q} ≡df ∀(s, µ) ∈ State, n ∈ ℕ : (s, µ, t) ⊨ P ⇒ safeₙ(c, s, µ, Q).

Note that we only consider models that satisfy the pre-condition with complete write
buffers, i.e., with γ = t.
The formal definition of safeₙ(c, s, µ, Q) is given by natural number induction
on n. safe₀(c, s, µ, Q) holds always. And for n ∈ ℕ, safeₙ₊₁(c, s, µ, Q) holds iff the
following conditions are true:
1. If c = skip then (s, µ, t) ⊨ Q.
2. For all µ0, µF such that µ0 ∈ (µF #̃ µ) it is the case that c, (s, µ0) ↛  .
3. For all µ0, µF, c′, s′, µ1 such that
   (i) µ0 ∈ (µF #̃ µ),
   (ii) c, (s, µ0) → c′, (s′, µ1),
   there exist µ′F, µ′ such that
   (a) µ1 ∈ (µ′F #̃ µ′),
   (b) µ′F  µF, and
   (c) safeₙ(c′, s′, µ′, Q).
⁸It seems likely that P ◁− bar is in fact the weakest pre-condition, but this has not been proved.
The first part ensures that if a command is fully evaluated (i.e., is skip) after
n + 1 steps of evaluation, then the state satisfies the post-condition. The second
part ensures that the command, when evaluated in any more completely described
state—as defined by the spatiotemporal notion of separation—does not abort after
n + 1 steps of evaluation. The third part ensures that the command, if safely
evaluated for n steps, preserves safety for one additional step. The definition of
the predicate above differs significantly from Vafeiadis’ in that the frame state µF
is allowed to change from one step of evaluation to the next, but only by making
silent transitions. This represents the fact that the flushing of buffered writes is
nondeterministic and not controlled by the command. If the frame memory system
µF contains buffered writes, they may well flush to memory during evaluation; and
if those writes succeed buffered writes described by the local memory system µ, then
those writes too must have been flushed.
Note that the two conditions for locality—the safety monotonicity and frame
properties, as described in Section 3.4—are implicit in this definition. Safety mono-
tonicity requires that if the command does not abort in a memory system µ then
it also does not abort in a more completely defined memory system µF #̃ µ. The
second part requires that the command not abort in µF #̃ µ for any memory sys-
tem µF . If the command did abort in memory system µ, the specification would
simply not be true. Otherwise, the second part ensures that it also does not abort in
any more completely defined memory system. The third part is actually somewhat
weaker than the frame property, requiring only that if the command can take a step
in a more complete state then the command remains safe for the local state, instead
of the stronger requirement that the command may take an analogous step from the
local state. But this condition is sufficient to show the soundness of the spatiotem-
poral frame rule. It is also important to note that we use only the spatiotemporal
notion of separation in the definition of the safety predicate. The soundness of the
frame rules for the stronger (spatial and temporal) conjunctions will later be shown
to follow from this more general definition.
Chapter 4
A Concurrent Program Logic
This chapter describes a program logic for a parallel programming language modeled
w.r.t. a weak-memory multiprocessor system model. This is a significant general-
ization of the program logic for a sequential language described in Chapter 3.
4.1 An Example Concurrent Proof
Consider the following simple program, cs, which updates a single memory location
while holding a global memory lock:
cs =df lock₀ ; ([d] := 1)₀ ; unlock₀ .
Each primitive command in this program is annotated with the processor identifier
0, which indicates that, on a multiprocessor machine, it will be executed on the 0th
processor. The program first acquires a global lock with the lock primitive, then
stores the value 1 to the location given by d, and then finally releases the lock with
the unlock primitive.
Next consider the program cr, which reads the same memory location while
holding the global memory lock:
cr =df lock₁ ; (x := [d])₁ ; unlock₁ ; if x = 1 then (([d] := 2)₁ ; fence₁).
Here, the value in memory at address d is loaded into x after acquiring the lock and
before releasing it; if the result of the load was 1 (i.e., if x = 1) the value 2 is written
to address d and flushed back to memory. The primitive commands are annotated
with the processor identifier 1, which indicates that the command will be executed
on the 1st processor on a multiprocessor machine.
The parallel composition of these commands, cs || cr, is a very simple message-
passing program. The sending thread cs communicates to the receiving thread by
setting a flag at address d. If the receiving thread loads the address d and observes
that the flag has been set, then it knows something about the progress of the sending
thread: namely, that it completed execution up to and including the location of the
flag setting in the program order. In this particular program, if the receiver observes
that the flag has been set then the receiver may claim sole ownership of the address
d and may read and write to it without concern of future interference, and hence
without first having acquired the global memory lock.
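The informal argument above can be sanity-checked by exhaustively exploring a small model of the two threads. The sketch below is illustrative only: it assumes a two-processor machine with the single location d, encodes each thread as a hypothetical tuple of primitives, and follows the prose reading of lock, unlock and fence (each enabled only when the issuing processor's buffer is empty). The names SENDER, RECEIVER, successors and final_states are assumptions made for this example and do not come from the dissertation.

```python
from collections import deque

# Thread k runs on processor k. Buffers hold values only, since d is the
# only memory location in this toy model.
SENDER = (("lock",), ("store", 1), ("unlock",))
RECEIVER = (("lock",), ("load",), ("unlock",),
            ("skip_unless_x_is_1",), ("store", 2), ("fence",))
PROGS = (SENDER, RECEIVER)


def successors(state):
    """Yield every state reachable in one step (silent flush or primitive)."""
    pcs, x, d, bufs, holder = state
    # Silent transitions: an unblocked processor commits its oldest write.
    for i in (0, 1):
        if bufs[i] and holder in (None, i):
            rest = list(bufs)
            rest[i] = bufs[i][1:]
            yield (pcs, x, bufs[i][0], tuple(rest), holder)
    # Primitive steps of either thread.
    for i in (0, 1):
        if pcs[i] == len(PROGS[i]):
            continue
        op = PROGS[i][pcs[i]]
        npcs, nx, nb, nh = list(pcs), x, list(bufs), holder
        npcs[i] += 1
        if op[0] == "lock":
            if holder is not None or bufs[i]:
                continue                      # not enabled
            nh = i
        elif op[0] == "unlock":
            if holder != i or bufs[i]:
                continue                      # implicit fence
            nh = None
        elif op[0] == "fence":
            if bufs[i]:
                continue                      # wait for the buffer to drain
        elif op[0] == "load":
            if holder not in (None, i):
                continue                      # blocked processors cannot load
            nx = bufs[i][-1] if bufs[i] else d
        elif op[0] == "store":
            nb[i] = bufs[i] + (op[1],)        # stores are buffered, never blocked
        elif op[0] == "skip_unless_x_is_1":
            if x != 1:
                npcs[i] = len(PROGS[i])
        yield (tuple(npcs), nx, d, tuple(nb), nh)


def final_states():
    """Return the set of (x, d) pairs observed once both threads have finished."""
    init = ((0, 0), 0, 0, ((), ()), None)
    seen, todo, finals = {init}, deque([init]), set()
    while todo:
        state = todo.popleft()
        pcs, x, d, bufs, _ = state
        if all(pcs[i] == len(PROGS[i]) for i in (0, 1)) and not any(bufs):
            finals.add((x, d))
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return finals
```

Under these assumptions, every pair (x, d) returned by final_states() satisfies d = x + 1: either the receiver ran first (x = 0, d = 1) or it observed the flag and took over the location (x = 1, d = 2).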
We might like to prove such a specification: i.e., assuming d ↦ 0 and x = 0
hold initially, that the composition is safe and that d ↦ x + 1 holds upon termination:
⊢ {x = 0 ∧ d ↦ 0} cs || cr {d ↦ x + 1}    (4.1)
In the concurrent program logic, however, specifications distinguish between private
state, which is only accessible to the specified command, and shared state, which
is accessible to the environment at large. Private state is still described with pre-
and post-conditions, but the shared state is described using a single assertion, which
must be maintained as an invariant throughout the specified command’s execution.
We write J ⊢ {P} c {Q} for the specification of a (possibly) concurrent command
c with shared state described by invariant assertion J, local pre-condition P and
local post-condition Q. The specification in Equation 4.1 is completely private to
the parallel command cs || cr, and so the shared invariant is simply emp, an assertion
that describes no memory addresses. Thus, the revised specification is:
emp ⊢ {x = 0 ∧ d ↦ 0} cs || cr {d ↦ x + 1}    (4.2)
From the perspective of the constituent commands cs and cr, however, the
memory at address d must be considered shared because neither command makes
sole use of the memory at that address. Hence, to prove specifications of the con-
stituent commands, we must describe the evolution of the memory at address d using
an invariant assertion and then show that both commands maintain that invariant.
To aid this description, it can help to augment the commands with auxiliary
program variables that track the progress of their evaluation. In this case, it helps
to augment the sender thread cs with a variable assignment that indicates that the
memory update has taken place. In the augmented thread c′s below, s is an
auxiliary variable, the assignment to s is an auxiliary assignment,
and that assignment does not appear in the original thread cs:
c′s =df lock₀ ; ([d] := 1)₀ ; (s := 1)₀ ; unlock₀
We can now provide an invariant that is maintained by both the augmented
sender c′s and the given receiver cr. Informally, the evolution of the value and
ownership of the memory at address d can be described as follows: the memory at
address d is shared between the threads and has value s until the value of x becomes
1, when the receiver takes sole ownership of d. And, crucially, it is impossible for
x = 1 and s = 0 simultaneously because the receiver will only take ownership if
the sender has set the flag. So, more formally, either the receiver has not taken
ownership and d is shared with value s (i.e., x = 0 ∧ d ↦ s), or the receiver has taken
sole ownership with x = 1 and s = 1 and the address is no longer shared (i.e.,
x = 1 ∧ emp). Hence, we define the invariant I as follows:
I =df (x = 0 ∧ (s = 0 ∨ s = 1) ∧ d ↦ s) ∨ (x = 1 ∧ s = 1 ∧ emp).
We may then try to prove the following constituent specifications, which
maintain the invariant I:
I ⊢ {s = 0 ∧ emp} c′s {s = 1 ∧ emp}
I ⊢ {x = 0 ∧ emp} cr {(x = 0 ∧ emp) ∨ (x = 1 ∧ d ↦ 2)}
The first specification asserts that the invariant I is maintained by c′s and that,
if evaluated in a state with s = 0 with no thread-private memory, the command
also finishes evaluation with no private memory addresses and s = 1. The second
specification asserts that the invariant I is maintained by cr and that, if evaluated in
a state with x = 0 and with no thread-private memory, the command either finishes
with no thread-private memory (when x = 0) or with private ownership of d (when
x = 1).
If we are successful in proving these specifications, then we may use an infer-
ence rule (par) to derive a specification for the parallel composition of commands
whose individual specifications agree to maintain the same invariant, as do the above
specifications of c′s and cr:
            ⋮                          ⋮
    I ⊢ {Ps} c′s {Qs}          I ⊢ {Pr} cr {Qr}
    -------------------------------------------- (par)
    I ⊢ {Ps ∗ Pr} c′s || cr {Qs ∗ Qr}
For space reasons, we abbreviate the pre-conditions of c′s and cr as Ps and Pr
respectively, and the post-conditions as Qs and Qr respectively, repeated below:
Ps =df s = 0 ∧ emp
Pr =df x = 0 ∧ emp
Qs =df s = 1 ∧ emp
Qr =df (x = 0 ∧ emp) ∨ (x = 1 ∧ d ↦ 2).
Next, an inference rule (share) may be applied that allows shared state to be
considered as local to the specified commands:
            ⋮                          ⋮
    I ⊢ {Ps} c′s {Qs}          I ⊢ {Pr} cr {Qr}
    -------------------------------------------- (par)
    I ⊢ {Ps ∗ Pr} c′s || cr {Qs ∗ Qr}
    -------------------------------------------- (share)
    emp ⊢ {I ∗ Ps ∗ Pr} c′s || cr {I ∗ Qs ∗ Qr}
The pre-condition can then be strengthened and the post-condition weakened
using the rule of consequence (cons) with the following semantic entailments:
s = 0 ∧ x = 0 ∧ d ↦ 0 ⊨ I ∗ Ps ∗ Pr
I ∗ Qs ∗ Qr ⊨ d ↦ (x + 1)
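The second entailment can be checked by a case split on the disjuncts of I and Qr. The following is only a sketch: it assumes that, as in standard separation logic, pure (stack-only) conjuncts can be collected across the separating conjunction and that emp is its unit; neither law is restated in this section.

```latex
\begin{align*}
&\text{Case } x = 0 \text{ in both } I \text{ and } Q_r: \\
&\quad (x = 0 \land d \mapsto s) * (s = 1 \land \mathsf{emp}) * (x = 0 \land \mathsf{emp})
   \;\Rightarrow\; x = 0 \land d \mapsto 1
   \;\Rightarrow\; d \mapsto (x + 1) \\
&\text{Case } x = 1 \text{ in both } I \text{ and } Q_r: \\
&\quad (x = 1 \land s = 1 \land \mathsf{emp}) * (s = 1 \land \mathsf{emp}) * (x = 1 \land d \mapsto 2)
   \;\Rightarrow\; x = 1 \land d \mapsto 2
   \;\Rightarrow\; d \mapsto (x + 1) \\
&\text{Mixed cases } (x = 0 \text{ in one disjunct},\ x = 1 \text{ in the other}): \\
&\quad x = 0 \land x = 1 \;\Rightarrow\; \mathsf{false}
   \;\Rightarrow\; d \mapsto (x + 1)
\end{align*}
```

The first entailment is simpler: with s = 0 and x = 0 the left disjunct of I holds with d ↦ s instantiated to d ↦ 0, and Ps and Pr contribute only emp.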
With a final inference rule (aux)¹ we remove any mention of auxiliary variables
from the commands and assertions, giving the following derivation of the desired
specification from Equation 4.2:

            ⋮                          ⋮
    I ⊢ {Ps} c′s {Qs}          I ⊢ {Pr} cr {Qr}
    --------------------------------------------------- (par)
    I ⊢ {Ps ∗ Pr} c′s || cr {Qs ∗ Qr}
    --------------------------------------------------- (share)
    emp ⊢ {I ∗ Ps ∗ Pr} c′s || cr {I ∗ Qs ∗ Qr}
    --------------------------------------------------- (cons)
    emp ⊢ {s = 0 ∧ x = 0 ∧ d ↦ 0} c′s || cr {d ↦ x + 1}
    --------------------------------------------------- (aux)
    emp ⊢ {x = 0 ∧ d ↦ 0} cs || cr {d ↦ x + 1}

¹No attempt is made here to formalize the notion of auxiliary variable or the auxiliary variable
elimination rule.
It remains to show that the sequential commands c′s and cr satisfy their indi-
vidual specifications, each maintaining the invariant I along the way. The specifica-
tion of c′s follows from an application of the inference rule (atomic) for well-locked
commands:
    emp ⊢ {(lock₀ ∗ I ∗ Ps) ◁ bar₀} c {(lock₀ ∗ I ∗ Qs) ◁− bar₀}
    ------------------------------------------------------------ (atomic)
    I ⊢ {Ps} lock₀ ; c ; unlock₀ {Qs}

where we abbreviate ([d] := 1)₀ ; (s := 1)₀ as c above.
This rule, in essence, asserts that if a command c can be shown to main-
tain an invariant as part of its local specification, then the corresponding locked
command must also maintain the invariant as part of the shared state. In particular,
because the introductory lock₀ command acquires the global lock and the final
unlock₀ command releases it, the pre- and post-conditions both assert ownership of
the lock with the lock₀ assertion. While the lock is held the command may temporarily
violate the invariant, so long as it is repaired by the time the command has
finished executing and before releasing the lock with the unlock primitive.
Similarly, the pre- and post-condition both assert private ownership of the invariant
I, along with but distinct from the local pre-condition Ps and post-condition
Qs. The lock₀ command implicitly fences, flushing any pending writes described by
the invariant or local state, and so the entire pre-condition in the antecedent is
succeeded by bar₀. The unlock₀ command also implicitly fences, so the post-condition
of the antecedent is generalized to include states that would satisfy the invariant and
local post-condition if they were succeeded by bar₀. This generalization is accomplished
with the temporal separating implication, P ◁− Q, which describes states
that satisfy P if they are temporally combined on the right with states that satisfy
Q. For example, the assertion d ↦ 0 ◁− bar₀ describes states which, when processor
0 is flushed, satisfy d ↦ 0; that is, states which include any number of writes to
d, followed by a write to d with value 0. A proof sketch of the antecedent follows
below:
{(lock₀ ∗ I ∗ Ps) ◁ bar₀} ∴
{(lock₀ ∗ (x = 0 ∧ d ↦ s) ∗ (s = 0 ∧ emp)) ◁ bar₀} ∴
{(lock₀ ∗ (x = 0 ∧ emp) ∗ d ↦ −) ◁ bar₀} ∴
{lock₀ ∗ (x = 0 ∧ emp) ∗ d ↦ −}
    ([d] := 1)₀
{lock₀ ∗ (x = 0 ∧ emp) ∗ d ↦ − ◁ d ⇝₀ 1}
    s := 1
{lock₀ ∗ (x = 0 ∧ s = 1 ∧ emp) ∗ d ↦ − ◁ d ⇝₀ 1} ∴
{(lock₀ ∗ x = 0 ∧ s = 1 ∧ d ↦ 1) ◁− bar₀} ∴
{(lock₀ ∗ (x = 0 ∧ d ↦ s) ∗ (s = 1 ∧ emp)) ◁− bar₀} ∴
{(lock₀ ∗ I ∗ Qs) ◁− bar₀}
In the proof sketch above, we separate successive assertions with ∴ to indicate
implication (strengthening the pre-conditions and weakening the post-conditions
found in the triples), as allowed by the rule of consequence. In the first triple, the
pre-condition describes that the memory address d is allocated but has an unknown
value (d ↦ −), while the post-condition of the store command ([d] := 1)₀ describes the
addition of a buffered write on processor 0 (d ↦ − ◁ d ⇝₀ 1). The post-condition of
the assignment command s := 1 simply asserts the equality. The final sequence of
implications indicates that, once a barrier is performed, the buffered write will be
flushed, yielding the correct value in memory.
The specification for cr can be proved similarly to that of c′s. The locked
portion of the command is proved using the atomic rule; and the rest by basic
sequential reasoning. A proof sketch of the specification of cr follows below:
{x = 0 ∧ emp}
    lock₁ ; (x := [d])₁ ; unlock₁
{(x = 0 ∧ emp) ∨ (x = 1 ∧ d ↦ 1)}
    if x = 1 then
        {x = 1 ∧ d ↦ 1}
        ([d] := 2)₁
        {(x = 1 ∧ d ↦ 1) ◁ d ⇝₁ 2}
        fence₁
        {x = 1 ∧ d ↦ 2}
{(x = 0 ∧ emp) ∨ (x = 1 ∧ d ↦ 2)}
This completes the proof of the concurrent program specification of Equation 4.2.
4.2 Concurrent Programs
In this section we describe a simplified C-like structured programming language
with concurrency. The primitive commands closely resemble the basic memory
events described by the memory model, while the composite commands are typical
for high-level languages. In particular, note that this is not an assembly language.
This particular language of commands was chosen to be simple to reason about, but
also at a suitable level of detail for describing concurrent data structures. Such algo-
rithms are typically expressed using high level constructs like loops and if-then-else
statements, along with basic atomic constructs like compare-and-swap, indication
of where fencing is required, etc.
The concurrent language differs from the sequential language described in
Section 3.3 in two ways. First, commands may be composed in “parallel”, which
indicates that there are no program-order dependencies between the primitives of
the constituent commands. Second, each primitive must be annotated with an
expression that indicates the processor on which the primitive will be executed. As
is discussed below, this will allow us to express both concurrent and interleaving
parallelism uniformly.
In this setting, programs are identified with commands, which consist of
various compositions of primitive commands for accessing and modifying state. In
order to restrict the scope of the project, dynamic memory management commands
(e.g., memory allocation and disposal) have been omitted.²
²These commands were considered in earlier iterations of this project [56, 57], and are discussed
in Section 5.3.
The primitive commands for the concurrent language are, roughly, a superset
of those for the sequential language, as described in Section 3.3:
PComm p ::= skip | assume(b) | assert(b) | x := e | x := [e] |
            [e] := e′ | fence | lock | unlock
Compared to the primitives in the sequential language, the lock and unlock commands
are new to the concurrent language, and serve to manipulate a single, global
lock, which can either be free (available) or busy (held by a particular processor).
The formal semantics of a successful execution step by a primitive command
p is given as a transition relation, parametrized by a processor identifier i, between
some machine states σ and σ′:
p : σ →_i σ′.
Such a quadruple may be informally interpreted to mean that when primitive p is
executed on processor i in machine state σ, it may evaluate in a single, atomic step
to yield state σ′. (A formal interpretation will be given later in the context of a
formal semantics of full commands.) This differs from the relation used to describe
the semantics of the primitives in the sequential language, described in Section 3.3,
by requiring that the processor on which the primitive is to execute be specified
explicitly.
A primitive command, in state σ on processor i, may alternatively abort
upon execution. This is indicated as follows:
p : σ →_i  .
For example, an assertion may fail or a process may attempt to access an unallocated
(from the process’s perspective) memory address.
To define the combined primitive evaluation relation formally, we must first
define the notions of memory system and machine state.
Definition 4. A multiprocessor memory system is a triple (h,B,K), where:
• h : L⇀V is a heap, i.e., a partial function that represents the allocated loca-
tions of shared memory and their values;
• B : P→(L× V) list is an array of write buffers; and
• K : P(P) is a set of blocked processors.
The set of multiprocessor memory systems is abbreviated as Mem.
A pair that consists of a stack s, as defined in Section 3.2, which assigns
values to identifiers, and a memory system µ is called a multiprocessor machine
state, typically abbreviated by σ. The collection of states is written State. We
often abuse notation by interchanging memory systems and states in definitions for
which the stack is irrelevant.
The notion of machine state given here differs from the structure used to
define the memory model, as described in Section 2.3.2, in the following ways. First,
names (i.e., “registers,” “variables,” “identifiers,” etc.) are global instead of local to
a particular processor. This is for convenience only, and is not an important technical
restriction. The specification logic we describe later will be restricted to those
programs for which the names are partitioned among processes, except for those
that are never modified. Another reasonable choice would have been to use local
names only, and to share read-only values among processes in the shared memory.
This has the advantage of codifying the above healthiness condition on programs
directly into the model of the language and logic, but it has the disadvantage of
perhaps making the description of access to shared values (stored, e.g., in the heap)
and local names (which must be parametrized by a processor name) slightly more
awkward.
Second, and more significant, the global lock from the memory model descrip-
tion is replaced in the machine model by a “blocked” set K of processor identifiers.
This set describes the processors that are not allowed, in the given state, to access
main memory. In the memory model, an available lock corresponds to an empty
blocked set K = ∅ in which no processors are blocked; a lock that is held by proces-
sor i corresponds to a blocked set K = P \ {i} that includes all processor identifiers
except for i, because all processors other than i are blocked.
The machine model described here is thus more general than the model used
to describe the memory model in that it allows, via the blocked set K, an arbitrary
subset of the available processors to be blocked from accessing memory. This is as
opposed to the use of a traditional lock object, which can only indicate that either
no processors are blocked (when the lock is available) or all but a single processor
i are blocked (when the lock is held by i). The blocked sets thus represent partial
information about the status of the true machine state. In particular, if the blocked
set is non-empty then, according to the memory model, the global lock must be
held by some processor, although it may not be clear which processor specifically.
We call a memory system lock-complete when its set of blocked processors is a valid
representation of a global lock—insofar as no processors are blocked when the lock
is free, and all processors but i are blocked when i holds the lock:
lock-complete(h, B, K) ≡df K = ∅ ∨ ∃i ∈ P : K = P \ {i} .
If a memory system is not necessarily complete, then it is called partial.
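The lock-completeness condition is easy to state executably. The sketch below is an illustration under stated assumptions: the set of processors P is replaced by a small finite set (the hypothetical constant PROCESSORS), and only the blocked set K is passed in, since the heap and buffers play no part in the definition.

```python
from typing import FrozenSet

# Hypothetical finite stand-in for the set of processor identifiers P.
PROCESSORS: FrozenSet[int] = frozenset({0, 1, 2})


def lock_complete(blocked: FrozenSet[int],
                  procs: FrozenSet[int] = PROCESSORS) -> bool:
    """True when K is empty, or K = P \\ {i} for some processor i."""
    return not blocked or any(blocked == procs - {i} for i in procs)
```

With three processors, frozenset() (lock free) and frozenset({1, 2}) (lock held by processor 0) are lock-complete, whereas frozenset({1}) is only a partial description: some processor holds the lock, but the set does not say which.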
The semantic relation for primitive commands, which is defined
w.r.t. machine states, is given in Figure 4.1 below. The memory systems are in
general partial, though some primitives require completeness. The semantics of the
primitives that are shared with the sequential language is similar to the description
given in Section 3.3. The assume, assert and assignment primitives are exactly the
same; the load, store and fence primitives operate w.r.t. the specified buffer within
the buffer array B, as opposed to the single buffer b present in the uniprocessor state
used to describe the sequential semantics. The lock command, w.r.t. processor i,
changes an empty blocked set to one in which every processor except for i is blocked
(i.e., P \ {i}), which indicates that i alone holds the global lock. Furthermore, the
lock command is only enabled when the buffer of the processor on which it executes
is empty, which is analogous to an implicit fence instruction. Conversely, the unlock
primitive changes a state in which every processor but i is blocked to one in which
no processor is blocked (i.e., ∅). And, like the lock command, the unlock primitive
includes an implicit fence.
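The prose description of the memory primitives can be turned into a small step function. This is a sketch, not the definition from Figure 4.1: expressions are assumed to be already evaluated to concrete locations and values, the marker ABORT and the names view and step are hypothetical, and a disabled lock, unlock or fence is modelled by returning None (waiting), which is one reading of "only enabled when the buffer is empty".

```python
ABORT = "abort"  # hypothetical marker for the erroneous pseudo-state


def view(h, buf):
    """Heap as seen through one write buffer: later buffered writes win."""
    out = dict(h)
    for loc, val in buf:
        out[loc] = val
    return out


def step(prim, i, s, h, B, K, procs):
    """Run primitive `prim` on processor i.

    Returns the successor (s, h, B, K), ABORT, or None when the
    primitive is not enabled in this state.
    """
    kind = prim[0]
    if kind == "load":                       # ("load", x, loc)
        _, x, loc = prim
        if i in K:
            return None                      # blocked processors cannot load
        seen = view(h, B[i])
        if loc not in seen:
            return ABORT                     # unallocated location
        return ({**s, x: seen[loc]}, h, B, K)
    if kind == "store":                      # ("store", loc, val)
        _, loc, val = prim
        if loc not in view(h, B[i]):
            return ABORT
        return (s, h, {**B, i: B[i] + [(loc, val)]}, K)
    if kind == "fence":
        return (s, h, B, K) if not B[i] else None
    if kind == "lock":
        return (s, h, B, procs - {i}) if not K and not B[i] else None
    if kind == "unlock":
        ok = K == procs - {i} and not B[i]
        return (s, h, B, frozenset()) if ok else None
    raise ValueError(kind)
```

For example, after processor 0 stores 1 to d, a load of d on processor 0 sees its own buffered write, a load on processor 1 still sees the heap value, and lock on processor 0 is disabled until the write has been flushed.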
Structured commands consist of either a primitive command; a sequential
or concurrent composition of commands; a nondeterministic choice between com-
mands; or an iteration of a command. We assume that, as commands, primitive
commands are annotated with the name of the processor on which they are to be
executed. Presumably, the components of a sequential command will all be scheduled
to execute on the same processor, but this is not required and the semantics handles
all cases uniformly. This generality is not likely to be practically useful for sequential
composition, but is important to the semantics of concurrent composition, as will
be discussed shortly.

    if ŝ(b) = 1
        …
    if ŝ(b) = 1
        …
    if ŝ(b) = 0
        …
    x := e : (s, h, B, K) →_i (s[x   ŝ(e)], h, B, K)                       (p-assign)
    if (h\\B(i))(ŝ(e)) = v and i ∉ K
        x := [e] : (s, h, B, K) →_i (s[x   v], h, B, K)                    (p-load)
    if (h\\B(i))(ŝ(e)) = ⊥ and i ∉ K
        …
    if ŝ(e) ∈ dom(h\\B(i))
        [e] := e′ : (s, h, B, K) →_i (s, h, B[i   B(i) ++ [ŝ(e), ŝ(e′)]], K)   (p-store)
    if ŝ(e) ∉ dom(h\\B(i))
        …
    if B(i) = ε
        …
    if B(i) = ε
        lock : (s, h, B, ∅) →_i (s, h, B, P \ {i})                         (p-lock)
    if B(i) = ε
        …

Figure 4.1: Semantics of concurrent primitive commands
The formal language of commands is defined by the following grammar:
Comm c ::= pₑ | (c ; c′) | (c + c′) | c∗ | (c || c′),
where p is a primitive command and e is an expression that indicates a processor
identifier.
The formal semantics of a single, atomic, successful step of execution by a
command is given as a transition relation between command-state pairs:
c, σ → c′, σ′.
Such an entry may be interpreted to mean that a command c in state σ may take
a step of evaluation, transforming to command c′ in an updated state σ′. But
the evaluation of a command may abort unsuccessfully as well, as for primitive
commands. Unsuccessful executions are modeled by transitions from command-
state pairs to the erroneous pseudo-state  , again as for primitive commands:
c, σ →  .
We refer collectively to command-state pairs and the erroneous state  as configu-
rations, and use C to indicate a configuration.
The semantics of commands also encompasses “silent” transitions, which
represent the flushing of buffered writes to the shared memory as allowed by the
(h, B, K) →_{τ,i} (h[ℓ   v], B[i   b], K) ≡df B(i) = [(ℓ, v)] ++ b ∧ i ∉ K
µ →_τ µ′ ≡df ∃i ∈ P : µ →_{τ,i} µ′
We write  for the reflexive-transitive closure of the converse of →_τ:
The complete relation that defines the semantics of commands is given in
Figure 4.2 below.
The reflexive-transitive closure of the command evaluation semantics, written
c, σ →∗ C, is defined as usual.
³As noted earlier, we abuse notation by interchanging the concept of state and memory system in
definitions for which the stack is irrelevant. Hence, the definition of the relation between memory

    if p : σ →_{ŝ(e)} σ′ and σ′ ∈ State
        pₑ, σ → skip, σ′                         (c-prim)
    if p : σ →_{ŝ(e)}  
        …
        …
        c, σ → c, σ′                             (c-tau)

        c, σ → c0, σ′
        ---------------------------              (c-seq)
        (c ; c′), σ → (c0 ; c′), σ′

        c, σ →  
        ---------------------------              (c-seq-a)
        (c ; c′), σ →  

        (skip ; c′), σ → c′, σ                   (c-seq-s)
        (c + c′), σ → c, σ                       (c-ch-1)
        (c + c′), σ → c′, σ                      (c-ch-2)

        c, σ → c0, σ′
        ---------------------------              (c-par-1)
        (c || c′), σ → (c0 || c′), σ′

        c, σ →  
        ---------------------------              (c-par-1a)
        (c || c′), σ →  

        (skip || c′), σ → c′, σ                  (c-par-1s)

        c′, σ → c0, σ′
        ---------------------------              (c-par-2)
        (c || c′), σ → (c || c0), σ′

        c′, σ →  
        ---------------------------              (c-par-2a)
        (c || c′), σ →  

        (c || skip), σ → c, σ                    (c-par-2s)
        c∗, σ → (skip + (c ; c∗)), σ             (c-loop)

Figure 4.2: Semantics of concurrent commands

Command Abbreviations A few standard command abbreviations are shown
in Figure 4.3. Some would benefit greatly from local variable declarations, which
have not yet been added to the language.
It is interesting to note that the execution of the “locked” command 〈c〉ₑ is
not atomic. Threads on other processors may concurrently read and write variables
(i.e., registers) during the execution of 〈c〉ₑ, and also perform store operations, which
add new writes to the write buffer. What threads on other processors may not do
is load from memory (whether from their respective buffers or from the shared
memory) or update main memory by flushing their buffers. It is conceivable that
a theorem could be proved that shows the operational semantics presented in this
section is equivalent to one in which locked commands have a truly atomic semantics,
but this has not been attempted.
    if b then c else c′ =df (assume(b) ; c) + (assume(!b) ; c′)
    if b then c =df (assume(b) ; c) + (assume(!b) ; skip)
    while b do c =df (assume(b) ; c)∗ ; assume(!b)
    …
    casₑ(f, g, g′) =df 〈(x := [f] ; if x = g then [f] := g′ else r := 0)〉ₑ

Figure 4.3: Concurrent command abbreviations

Stability Consider a state σ0 = (s, h, B, ∅) in which h = ℓ ↦ 0, B(j) = [(ℓ, 1)]
and B(x) = ε for all x ≠ j. From this state, a load on processor i may evaluate as
follows:
(y := [ℓ])ᵢ , (s, h, B, ∅) → skip, (s[y   0], h, B, ∅).
Because j is not blocked, it is also possible for a flushing operation to take place:
(y := [ℓ])ᵢ , (s, h, B, ∅) → (y := [ℓ])ᵢ , (s, h[ℓ   1], B[j   ε], ∅),
and afterward for the load to evaluate as follows:
(y := [ℓ])ᵢ , (s, h[ℓ   1], B[j   ε], ∅) → skip, (s[y   1], h[ℓ   1], B[j   ε], ∅).
Note that in the first evaluation the load resolves ℓ to 0, and in the second evaluation
it resolves ℓ to 1, with the distinguishing characteristic of the latter being the
preceding nondeterministic flushing operation.
By contrast, from the state σ1 = (s, h, B, {j}), where h and B are defined as
in σ0, the only reduction of (y := [ℓ])ᵢ is:
(y := [ℓ])ᵢ , (s, h, B, {j}) → skip, (s[y   0], h, B, {j}).
This is because processor j is blocked, and so its buffered write may not commit to
memory. As a consequence, it is not possible for the load on processor i to observe
the write buffered on processor j.
We say that location ` is unstable in state σ0 for processor i because the
result of loading ` is determined by the relative ordering of the flushing operations.
On the other hand, ` is stable in σ1 for i because the load of ` is oblivious to the
flushing operations.
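The two situations can be reproduced by enumerating every sequence of silent flushes that may precede the load. The sketch below is an illustration only: load_outcomes is a hypothetical helper, processors i and j are represented by the integers 0 and 1, and a location is treated as stable for a processor when the returned set has a single element.

```python
def load_outcomes(h, B, K, i, loc):
    """Collect every value a load of `loc` on processor i may return,
    over all sequences of flushes by unblocked processors before it."""
    outcomes, seen = set(), set()
    start = (tuple(sorted(h.items())),
             tuple((j, tuple(b)) for j, b in sorted(B.items())))
    todo = [start]
    while todo:
        key = todo.pop()
        if key in seen:
            continue
        seen.add(key)
        heap, bufs = dict(key[0]), dict(key[1])
        if i not in K:
            visible = dict(heap)
            visible.update(bufs[i])          # own buffered writes win
            if loc in visible:
                outcomes.add(visible[loc])
        for j, buf in bufs.items():
            if buf and j not in K:           # flush the oldest write of j
                (l, v), rest = buf[0], buf[1:]
                new_heap = {**heap, l: v}
                new_bufs = {**bufs, j: rest}
                todo.append((tuple(sorted(new_heap.items())),
                             tuple(sorted(new_bufs.items()))))
    return outcomes
```

With h = {"l": 0} and B = {0: [], 1: [("l", 1)]}, an empty blocked set yields both 0 and 1 for a load on processor 0 (unstable), whereas blocking processor 1 leaves 0 as the only possible result (stable).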
A state is called coherent if each location has writes buffered by at most one
processor:
∀i, j ∈ P \ K : i ≠ j ⇒ dom(B(i)) ∩ dom(B(j)) = ∅ .
The memory locations in a coherent state may be partitioned among the processors,
such that the locations of a partition are all stable for their respective processor.
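The coherence condition translates directly into a check on the buffer array. This is a minimal sketch assuming buffers are lists of (location, value) pairs keyed by processor; it follows the displayed formula, which quantifies over unblocked processors only, and the function name coherent is chosen for this example.

```python
def coherent(B, K):
    """No location is buffered by two distinct unblocked processors."""
    free = [i for i in B if i not in K]
    locs = {i: {loc for loc, _ in B[i]} for i in free}
    return all(locs[i].isdisjoint(locs[j])
               for i in free for j in free if i != j)
```

For instance, buffers {0: [("a", 1)], 1: [("b", 2)]} are coherent, while {0: [("a", 1)], 1: [("a", 2)]} are not unless one of the two processors is blocked.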
Interleaving versus Parallelism A pleasant property of this semantics is the
uniform description of both interleaving and parallel concurrency. Let ci be a se-
quential command c in which each primitive has processor annotation i. Then, e.g.,
(c1 || c′1) describes the interleaving concurrent execution of commands c and c′ on
processor 1, while (c1 || c′2) describes the parallel concurrent execution of c and c′ on
processors 1 and 2, respectively. But one does not typically have control over the
particular processor on which a command executes (e.g., c1 instead of c2). Thus,
(cx || c′x) describes the interleaving concurrent execution of commands c and c′ on
some individual processor, denoted by the free variable x. For x ≠ y, (cx || c′y)
describes the parallel concurrent execution of c and c′ on distinct processors given by
x and y respectively. Furthermore, without any assumptions about the relationship
between x and y, (cx || c′y) describes both interleaving and parallel executions of c
and c′. This presumably is the most common situation with concurrent composition:
it is up to the operating system to assign processors to threads, and correctness of
a program ought to encompass any such assignment.
Static Semantics The static semantics of expressions, primitive commands and
commands, embodied here by functions fv(−) and mod(−) associating these ob-
jects to their sets of free and modified variables, respectively, are completely
standard. (Especially so because there are no name-hiding operations in the language,
like the aforementioned missing local variable declaration command.) For example,
fv(x := [y + 1]z) = {x, y, z} and mod(x := [y + 1]z) = {x}.
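A small sketch of fv(−) and mod(−) for primitives makes the example concrete. The encoding is an assumption of this illustration: expressions are nested tuples ("var", name), ("const", n) or (operator, e1, e2), and each primitive is a tuple whose last component is the processor expression.

```python
def fv_expr(e):
    """Free variables of an expression given as a nested tuple."""
    if e[0] == "var":
        return {e[1]}
    if e[0] == "const":
        return set()
    return set().union(*(fv_expr(sub) for sub in e[1:]))


def fv_prim(p):
    """Free variables of an annotated primitive.

    ("assign", x, e, proc) and ("load", x, e, proc) bind a variable name
    in position 1; every other component is an expression.
    """
    if p[0] in ("assign", "load"):
        return {p[1]} | set().union(*(fv_expr(e) for e in p[2:]))
    return set().union(*(fv_expr(e) for e in p[1:]))


def mod_prim(p):
    """Variables modified by an annotated primitive."""
    return {p[1]} if p[0] in ("assign", "load") else set()
```

Encoding the load from the example as ("load", "x", ("add", ("var", "y"), ("const", 1)), ("var", "z")) gives free variables {x, y, z} and modified variables {x}, matching the text.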
4.3 Separation
In this section we define a series of notions of separation, analogous to those defined
in Chapter 3, but for multiprocessor states. As in the sequential, uniprocessor case,
the goal for these notions of separation is to allow for local reasoning, by modeling
separating conjunctions that have sound frame rules.
Unlike in the previous chapter, however, in which we defined separation
in terms of memory systems and then later lifted those definitions to generalized
memory systems, here we wish to define the notions of separation directly in terms
of generalized multiprocessor memory systems, which are defined as follows.
Definition 5. A generalized multiprocessor memory system (typically indicated by
the symbol ν) is a four-tuple, (h,B,K,Γ), where:
• (h,B,K) is a multiprocessor memory system; and
• Γ : P(P) is a buffer-completeness set.
We furthermore require that if Γ = ∅ then h = ∅.
Generalized multiprocessor memory systems are related to multiprocessor
memory systems, which model parallel programs, in the same way that general-
ized sequential memory systems are related to uniprocessor memory systems, which
model purely sequential programs. In particular, the buffer-completeness set Γ in
the generalized multiprocessor memory system plays a role analogous to the buffer-
completeness flag γ in the generalized uniprocessor memory system: inclusion of a
processor identifier i in the set Γ means that the ith buffer is complete, and there-
fore there may exist no previous buffered writes on the ith write buffer; they must
instead have been flushed to memory.
In the uniprocessor case, we required that if γ = f then h = ∅, because
flushing can only occur when the buffer is complete. In the multiprocessor case, we
require that h = ∅ only when none of the buffers are complete. Alternately, the
heap may be non-empty as long as at least some of the buffers are complete.
We call a generalized multiprocessor memory system (h,B,K,Γ) buffer-
complete when Γ = P; i.e., when each buffer is complete. The semantics of assertions
(given later in Section 4.4) shall be described in terms of generalized multiprocessor
memory systems which are not necessarily buffer-complete, but the specifications
(given later in Section 4.5) shall be described in terms of only buffer-complete gen-
eralized multiprocessor memory systems.
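As a minimal sketch of Definition 5, the four-tuple and its side condition can be modeled as a small record. The two-processor set PROCS, the class name GenMemSys and the field names are assumptions made only for illustration; the text leaves P abstract.

```python
from dataclasses import dataclass

# Assumption for illustration: a machine with two processors, P = {0, 1}.
PROCS = frozenset({0, 1})


@dataclass
class GenMemSys:
    heap: dict           # h: location -> committed value
    buffers: dict        # B: processor -> tuple of (location, value), oldest first
    blocked: frozenset   # K: blocked processors
    complete: frozenset  # Γ: processors whose buffers are complete

    def __post_init__(self):
        # Definition 5 side condition: if Γ = ∅ then h = ∅.
        if not self.complete and self.heap:
            raise ValueError("a non-empty heap needs at least one complete buffer")

    def is_buffer_complete(self):
        # A system is buffer-complete when Γ = P.
        return self.complete == PROCS


EMPTY_BUFFERS = {i: () for i in PROCS}
partial = GenMemSys({1: 1}, {0: ((1, 2),), 1: ()}, frozenset(), frozenset({0}))
total = GenMemSys({}, EMPTY_BUFFERS, frozenset(), PROCS)
print(partial.is_buffer_complete(), total.is_buffer_complete())  # False True
```

Checking the side condition in the constructor makes it impossible to build a tuple with a non-empty heap and no complete buffer.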
We now proceed with the definition of notions of separation in terms of
generalized multiprocessor memory systems.
4.3.1 Spatial Separation
As in the uniprocessor case, described in Section 3.4.1, we begin with the simplest
notion of separation, spatial separation, which describes decompositions of memory
systems with disjoint sets of allocated locations. We write µ ∈ µ0 ∗̃ µ1 when mem-
ory system µ may be spatially separated in terms of memory systems µ0 and µ1.
Similarly, we write ν ∈ ν0 ∗̂ ν1 when generalized memory system ν may be spatially
separated in terms of generalized memory systems ν0 and ν1. The latter operation
on generalized memory systems will be defined in terms of the former operation on
memory systems; hence we proceed to first define µ0 ∗̃ µ1.
Informally, the spatial separation µ0 ∗̃ µ1 denotes memory systems in which
the write buffers of µ0 and µ1 are pointwise-interleaved: i.e., the ith buffer is some
interleaving of the ith buffer of µ0 and the ith buffer of µ1. The formal definition
µ0 ∗̃ µ1 is as follows, where µ0 = (h0, B0,K0) and µ1 = (h1, B1,K1):
µ0 ∗̃ µ1 =df {(h0 ⊎ h1, B, K0 ∪ K1) | B ∈ B0 ⊎ B1 ∧ µ0^∗ µ1} ,
where the spatial compatibility relation between multiprocessor memory systems
µ0^∗ µ1 is defined as follows:
µ0^∗ µ1 ≡df (dom(h0) ∪ dom(B0)) ∩ (dom(h1) ∪ dom(B1)) = K0 ∩K1 = ∅ .
In both definitions, we lift operations on lists to the analogous operations on functions
into lists. That is, the domain of a buffer array B is the union of the domains
of its buffers,
dom(B) =df ∪{dom(B(i)) | i ∈ P} ,
and the interleavings of buffer arrays B0 and B1 are buffer arrays for which all
buffers are interleavings of their respective buffers in arrays B0 and B1:
B ∈ (B0 ⊎ B1) ≡df ∀i ∈ P : B(i) ∈ B0(i) ⊎ B1(i).
It is easy to check that µ0 ∗̃ µ1 yields a set of multiprocessor memory systems:
because the compatibility relation requires disjointness of the domains, h0 ⊎ h1 is well
defined according to the definition of partial function summation in Section 2.1.2.
Next, we define the spatial separation of generalized memory systems ν0 ∗̂ ν1
by lifting the spatial separation of the included memory systems and unioning the
buffer-completeness sets. That is, for ν0 = (µ0,Γ0) and ν1 = (µ1,Γ1), the set ν0 ∗̂ ν1
is defined as follows:
ν0 ∗̂ ν1 =df {(µ,Γ0 ∪ Γ1) | µ ∈ µ0 ∗̃ µ1} .
Again, it is easy to check that ν0 ∗̂ ν1 yields a set of generalized multiprocessor
memory systems: h0 ∪ h1 = ∅ if Γ0 ∪ Γ1 = ∅, as required by the definition.
The blocking sets and buffer-completeness sets in the separation are defined
by union because, intuitively, the properties they describe can only accumulate with
increasingly complete descriptions of memory systems. If a processor is blocked
in some memory system, then it ought also to be blocked in a more complete de-
scription of a memory system—along with, perhaps, the requirement that some
additional processors be blocked. Similarly, if a processor is buffer-complete in some
partial description of a memory system, then it ought also to be buffer-complete in a
more thorough description of that memory system. The requirement that blocking
sets be disjoint is later used to give especially “small” axioms for the
lock-manipulation primitives.
Let us consider a small example. In the sequel, let E =df λx . ε, the array
of empty write buffers. Let ν0 = (1 ↦ 1, E[0 ↦ [(1, 2)]], ∅, {0}) and ν1 = (2 ↦ 1,
E[1 ↦ [(2, 2)]], ∅, {1}) be generalized multiprocessor memory systems. Here, ν0 consists
of a single-point heap describing the memory address 1 with value 1; a write buffer
array that is everywhere empty except for processor 0, which has a single write
to address 1 with value 2; no blocked processors; and with processor 0 declared
to be buffer-complete. Similarly, ν1 consists of a single-point heap describing the
memory address 2 with value 1; a write buffer array that is everywhere empty except
for processor 1, which has a single write to address 2 with value 2; no blocked
processors; and with processor 1 declared to be buffer-complete. Note that ν0^∗ ν1
holds because the domains of the memory systems are disjoint, as well as the blocked
sets. Consequently, the set ν0 ∗̂ ν1 is non-empty; indeed, it is a singleton:
ν0 ∗̂ ν1 = {(1 ↦ 1 ⊎ 2 ↦ 1, E[0 ↦ [(1, 2)], 1 ↦ [(2, 2)]], ∅, {0, 1})} ,
in which the only resulting memory system has a two-point heap with addresses 1
and 2 defined; a buffer array that is empty everywhere except for processors 0 and
1; an empty blocking set; and in which processors 0 and 1 are buffer-complete. The
set ν0 ∗̂ ν1 is a singleton because there is only a single possible interleaving of the
memory systems’ respective buffer arrays:
E[0 ↦ [(1, 2)]] ⊎ E[1 ↦ [(2, 2)]] = {E[0 ↦ [(1, 2)], 1 ↦ [(2, 2)]]} .
Note that the result of a load in memory system ν1 is the same as in any of the more
completely defined memory systems of ν0 ∗̂ ν1. Note also that neither ν0^∗ ν0 nor
ν1^∗ ν1 holds because in each case the domains overlap, and so ν0 ∗̂ ν0 = ν1 ∗̂ ν1 = ∅.
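The example can be replayed with a small executable sketch of spatial separation. The encoding is an assumption made for illustration: two processors, a generalized memory system written as a tuple (heap dict, tuple of per-processor buffers, blocked set, complete set), and the hypothetical helper names interleavings, domain and spatial_sep.

```python
from itertools import combinations, product

PROCS = (0, 1)  # assumption: two processors


def interleavings(a, b):
    """All merges of tuples a and b that keep each tuple's own order."""
    n, m = len(a), len(b)
    merges = set()
    for chosen in combinations(range(n + m), n):  # positions taken by a's elements
        slots, ia, ib = set(chosen), iter(a), iter(b)
        merges.add(tuple(next(ia) if k in slots else next(ib) for k in range(n + m)))
    return merges


def domain(heap, bufs):
    """Locations mentioned by the heap or by any buffered write."""
    return set(heap) | {loc for buf in bufs for (loc, _) in buf}


def spatial_sep(nu0, nu1):
    """Members of the spatial separation of nu0 and nu1 (empty if incompatible)."""
    (h0, B0, K0, G0), (h1, B1, K1, G1) = nu0, nu1
    if domain(h0, B0) & domain(h1, B1) or K0 & K1:
        return set()                                  # compatibility fails
    heap = tuple(sorted({**h0, **h1}.items()))        # disjoint union of the heaps
    per_proc = [interleavings(B0[i], B1[i]) for i in PROCS]
    return {(heap, bufs, K0 | K1, G0 | G1) for bufs in product(*per_proc)}


nu0 = ({1: 1}, (((1, 2),), ()), frozenset(), frozenset({0}))
nu1 = ({2: 1}, ((), ((2, 2),)), frozenset(), frozenset({1}))
both = spatial_sep(nu0, nu1)
print(len(both), len(spatial_sep(nu0, nu0)))  # 1 0
```

The resulting heap is stored as a sorted tuple of pairs only so that results can be collected in a Python set.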
We overload for convenience the symbol ∗̂ to indicate the pointwise lifting
of this function to sets of memory systems:
S1 ∗̂ S2 =df ∪{ν1 ∗̂ ν2 | ν1 ∈ S1 ∧ ν2 ∈ S2 ∧ ν1^∗ ν2} .
We use these functions interchangeably when the intended meaning is clear from
context, e.g.:
ν1 ∗̂ (ν2 ∗̂ ν3) = ∪{ν1 ∗̂ ν23 | ν23 ∈ ν2 ∗̂ ν3} .
In the following, let νu =df (∅, E , ∅, ∅) be an empty generalized memory system.
The following proposition asserts some algebraic properties of spatial separation.
Proposition 5. For generalized multiprocessor memory systems ν0, ν1, ν2:
• νu ∗̂ ν0 = {ν0}
• ν0 ∗̂ ν1 = ν1 ∗̂ ν0
• ν0 ∗̂ (ν1 ∗̂ ν2) = (ν0 ∗̂ ν1) ∗̂ ν2
4.3.2 Temporal Separation
As in the sequential, uniprocessor case, we also define a temporal notion of memory
system separation, which we write µ0 C̃ µ1, and a temporal notion of generalized
memory system separation, which we write ν0 Ĉ ν1. Informally, a temporal sepa-
ration of memory systems partitions the writes of each write buffer. Unlike spa-
tial separation, temporal separation does not require disjointness of the domains of
memory addresses described by the states; so it can be used to describe sequences
of writes (on a single buffer) to a particular memory address. However, to ensure
locality and a sound (left-side) frame rule, the definedness condition for temporal
separation must rule out, in some cases, memory systems with pending writes to a
particular location on multiple write buffers.
Formally, the partial function µ0 C̃ µ1 is defined as follows, where µ0 =
(h0, B0,K0) and µ1 = (h1, B1,K1):
µ0 C̃ µ1 =df { (h0\\h1, B0 ++ B1, K0 ∪ K1)   if µ0^C µ1
            { ⊥                             otherwise
where the temporal compatibility relation ^C is defined below. In µ0 C̃ µ1, the
heap values defined in the later memory system override those in the earlier memory
system, and the writes in each buffer of the earlier memory system are prepended
to those in the later memory system. Above, we lift list concatenation to functions
into lists as follows:
B0 ++B1 =df λx .B0(x) ++B1(x).
The temporal compatibility relation ^C is defined as follows:
µ0^C µ1 ≡df K0 ∩K1 = ∅ ∧
∀i ∈ P \ (K0 ∪K1) : dom(B0(i)) ∩ dom(h1) = ∅ ∧
∀i, j ∈ P \ (K0 ∪K1) : i ≠ j ⇒ dom(B0(i)) ∩ dom(B1(j)) = ∅ .
The first conjunct above requires, as in the case of spatial separation, disjointness
of the blocking sets. The second and third conjuncts describe the weak disjointness
requirements needed for soundness of the temporal frame rule. Specifically, the
second requires that there be no preceding buffered writes to locations defined in the
given memory, unless the buffers on which those writes exist are blocked. Similarly,
the third conjunct requires that there be no preceding buffered writes to locations for
which there are currently pending writes, unless either those earlier pending writes are
on the same buffer, or those earlier pending writes are on blocked buffers. Examples
presented shortly will illustrate the necessity of these definedness conditions.
The temporal separation of generalized memory systems, ν0 Ĉ ν1, is defined
as follows, where ν0 = (µ0,Γ0), µ0 = (h0, B0,K0) and ν1 = (µ1,Γ1):
ν0 Ĉ ν1 =df { (µ0 C̃ µ1, Γ0 ∪ Γ1)   if µ0^C µ1 and ∀i ∈ Γ1 : B0(i) = ε
            { ⊥                    otherwise.
The additional definedness condition in the temporal separation of generalized mem-
ory systems requires that, for the buffer-complete processors i of ν1, any preceding
writes (in ν0) described on buffer i be flushed to memory.
The importance of the second and third conjuncts in the temporal compati-
bility relation above can be demonstrated with a few small examples. First, consider
ν1 = (1 ↦ 2, E , ∅, {1}). A load by processor 1 of address 1 in this memory system
clearly results in the value 2. Next consider ν0 = (∅, E[0 ↦ [(1, 3)]], ∅, {0}), which
includes a buffered write on processor 0 to the same memory address 1. The memory
system ν0 is not compatible with ν1 due to the second conjunct in the temporal compatibility
relation because processor 0 is not blocked in ν0, but contains a buffered
write to a location defined in the heap of ν1. If this condition were omitted, then
the value of ν0 Ĉ ν1 would be as follows:
ν0 Ĉ ν1 = (1 ↦ 2, E[0 ↦ [(1, 3)]], ∅, {0, 1}).
Although in the memory system ν0 Ĉ ν1 defined above it remains the case that a
load by processor 1 of address 1 again results in the value 2, this is not a stable load:
the write buffered on processor 0 may flush to memory before the load completes,
resulting in the state (1 ↦ 3, E , ∅, {0, 1}), from which a load by processor 1 certainly
would not result in the value 2. This is a situation we wish to avoid, and so we rule
out temporal conjunctions of memory systems like ν0.
On the other hand, the temporal conjunction ν′0 Ĉ ν1, with memory system
ν′0 = (∅, E[0 ↦ [(1, 3)]], {0}, {0}), is perfectly safe because processor 0, on which the
offending write is buffered, is blocked, and so there is no concern that it will flush to
memory, becoming visible to other processors, and causing instability. Hence, the
second conjunct above only rules out writes to conflicting locations on non-blocked
write buffers.
The role of the third conjunct in the temporal compatibility relation is sim-
ilar: to rule out temporal conjunctions that lead to instability. Consider now
ν′1 = (∅, E[1 ↦ [(1, 2)]], ∅, {1}), which differs from ν1 above in that the write that was
flushed there is here still pending on buffer 1. Again, ν0 is incompatible, by
the third conjunct, with ν′1 because the buffered write on processor 0 in ν0 conflicts
with the buffered write on processor 1 in ν′1. If, however, these memory systems
were temporally compatible (i.e., if the third conjunct above were omitted in the
temporal compatibility relation), then the value of ν0 Ĉ ν′1 would be as follows:
ν0 Ĉ ν′1 = (∅, E[0 ↦ [(1, 3)], 1 ↦ [(1, 2)]], ∅, {0, 1}).
The address 1 in this memory system is, again, unstable w.r.t. processor 1: if the
write on buffer 1 flushes to memory, followed by the write on buffer 0 flushing to
memory, a load on processor 1 might result in value 3 instead of 2. However, ν′0 Ĉ ν′1
is well defined because the buffered write on processor 0 in memory system ν′0 is
blocked, and so it is impossible for processor 1 to observe that write.
Note that writes committed to the memory are combined by the temporal
separator using heap overriding (first defined in Section 2.1.2). This reflects the
fact that previous committed writes are subsumed by more recent flushed writes.
For example, consider ν0 = (1 ↦ 2, E , ∅, {0}) and ν1 = (1 ↦ 3, E , ∅, {0}). In
the temporal separation ν0 Ĉ ν1, the later write overrides the earlier write because
(1 ↦ 2)\\(1 ↦ 3) = 1 ↦ 3:
ν0 Ĉ ν1 = (1 ↦ 3, E , ∅, {0}).
Also note that, because there can be no ordering among writes on distinct buffers,
none is created by temporal separation of such memory systems. For example,
because ν′0 = (∅, E[0 ↦ [(1, 3)]], ∅, ∅) and ν′1 = (∅, E[1 ↦ [(2, 4)]], ∅, ∅) have writes
only on distinct buffers, their temporal separation is symmetric:
ν′0 Ĉ ν′1 = (∅, E[0 ↦ [(1, 3)], 1 ↦ [(2, 4)]], ∅, ∅) = ν′1 Ĉ ν′0.
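The definedness conditions and the examples of this section can be checked with a short sketch. As before, the encoding is an assumption for illustration only: two processors, tuples (heap dict, tuple of per-processor buffers, blocked set, complete set), None standing for ⊥, and the hypothetical function names temporally_compatible and temporal_sep.

```python
PROCS = (0, 1)  # assumption: two processors


def buf_dom(buf):
    """Locations written by a single buffer."""
    return {loc for (loc, _) in buf}


def temporally_compatible(mu0, mu1):
    (h0, B0, K0), (h1, B1, K1) = mu0, mu1
    if K0 & K1:                                   # first conjunct
        return False
    free = [i for i in PROCS if i not in K0 | K1]
    if any(buf_dom(B0[i]) & set(h1) for i in free):          # second conjunct
        return False
    return not any(buf_dom(B0[i]) & buf_dom(B1[j])           # third conjunct
                   for i in free for j in free if i != j)


def temporal_sep(nu0, nu1):
    """Temporal separation of generalized systems; None models the undefined case."""
    (h0, B0, K0, G0), (h1, B1, K1, G1) = nu0, nu1
    if not temporally_compatible((h0, B0, K0), (h1, B1, K1)):
        return None
    if any(B0[i] for i in G1):                    # buffers complete in nu1 must be empty in nu0
        return None
    heap = {**h0, **h1}                           # heap overriding: later values win
    bufs = tuple(B0[i] + B1[i] for i in PROCS)    # earlier writes are prepended
    return (heap, bufs, K0 | K1, G0 | G1)


E = ((), ())
nu1 = ({1: 2}, E, frozenset(), frozenset({1}))
nu0 = ({}, (((1, 3),), ()), frozenset(), frozenset({0}))
nu0_blocked = ({}, (((1, 3),), ()), frozenset({0}), frozenset({0}))
print(temporal_sep(nu0, nu1))          # None: second conjunct fails
print(temporal_sep(nu0_blocked, nu1))  # defined: processor 0 is blocked
```

Only the cases discussed in the text are exercised here; the sketch makes no claim about the algebraic properties stated next.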
In the following, let νu =df (∅, E , ∅, ∅) be an empty generalized memory system
as before. The following proposition asserts some algebraic properties of temporal
separation.
Proposition 6. For generalized multiprocessor memory systems ν0, ν1, ν2:
• νu Ĉ ν0 = ν0 Ĉ νu = ν0
• ν0 Ĉ (ν1 Ĉ ν2) = (ν0 Ĉ ν1) Ĉ ν2
Remark. Note that, unlike traditional notions of separation (e.g., as outlined in work on
abstract separation logic [11]), temporal separation is not a cancellative
operation. That is, it is not the case that the following implication holds:
µ C̃ µ0 = µ C̃ µ1 ⇒ µ0 = µ1.
A simple counter-example is µ = µ0 = (1 ↦ 2, E , ∅) and µ1 = (∅, E , ∅). Then
µ C̃ µ0 = µ = µ C̃ µ1, but obviously µ0 ≠ µ1.
Strong Temporal Separation
As in the sequential case, we may define a strong variant of temporal separation—
that requires both spatial and temporal separation—as the intersection of spatial
and temporal separators. For ν0 = (µ0,Γ0) and ν1 = (µ1,Γ1), the strong temporal
separation ν0 Ĵ ν1 is defined as follows:
ν0 Ĵ ν1 =df { ν0 Ĉ ν1   if µ0^C µ1 and µ0^∗ µ1
            { ⊥          otherwise.
Observe that if ν0 Ĵ ν1 is defined then ν0 Ĉ ν1 is defined and is identical, and ν0 ∗̂ ν1
is defined with ν0 Ĵ ν1 ∈ ν0 ∗̂ ν1.
4.3.3 Spatiotemporal Separation
Finally, we again define a spatiotemporal separation function on memory systems
µ0 #̃ µ1, and a spatiotemporal separation function of generalized memory systems
ν0 #̂ ν1—so-called because it subsumes both spatial and temporal separation. Spa-
tiotemporal separation subsumes spatial and temporal separation in the sense that,
if ν0 Ĉ ν1 is defined, then ν0 Ĉ ν1 ∈ ν0 #̂ ν1 and, similarly, ν0 ∗̂ ν1 ⊆ ν0 #̂ ν1. This
separation function is intended to be as weak as possible, describing a wide variety of
memory systems while still maintaining locality w.r.t. the concurrent programming
language. This weakness reduces the expressive power of the function, which lim-
its its usefulness for describing particular sets of states like pre- and post-conditions
(i.e., for modeling assertions). The weakness of spatiotemporal separation shall later
(in Section 4.5.2) be extremely useful, though, for giving a semantics to specifica-
tions by outlining liberal conditions for sound frame rules. That is, specifications
will later be considered true when they soundly admit spatiotemporal frames; and
because spatiotemporal separation subsumes both spatial and temporal separation,
specifications will also be shown to soundly admit spatial frames and temporal
frames.
For memory systems µ0 = (h0, B0,K0) and µ1 = (h1, B1,K1), the spatiotem-
poral separation µ0 #̃ µ1 is defined as follows:
µ0 #̃ µ1 =df {(h0\\h1, B,K0 ∪K1) | B ∈ B0\\B1 ∧ µ0^C µ1} ,
in which the overriding operation on lists b0\\b1 is lifted pointwise to functions into
lists B0\\B1:
B ∈ B0\\B1 ≡df ∀i ∈ P : B(i) ∈ B0(i)\\B1(i).
Note that the temporal compatibility relation is reused in the definition above.
The spatiotemporal separation ν0 #̂ ν1 of generalized memory systems ν0 =
(µ0,Γ0) and ν1 = (µ1,Γ1) is defined by lifting the spatiotemporal separation of
memory systems and unioning the buffer-completeness sets as follows:
ν0 #̂ ν1 =df {(µ,Γ0 ∪ Γ1) | µ ∈ µ0 #̃ µ1} .
It is easy to see that, when µ0^C µ1 holds, µ0 C̃ µ1 ∈ µ0 #̃ µ1 because
B0 ++B1 is a member of the overriding B0\\B1 (by Lemma 2); and hence also
ν0 Ĉ ν1 ∈ ν0 #̂ ν1. Similarly, if µ0^∗ µ1 then also µ0^C µ1, because the strong
disjointness requirements of the former imply the definedness conditions of the latter.
And generally ν0 ∗̂ ν1 ⊆ ν0 #̂ ν1 because, when µ0^∗ µ1, h0 ⊎ h1 = h0\\h1 (by
Lemma 1) and B0\\B1 = B0 ⊎ B1 (by Lemma 2). Consequently, all the equations
and inclusions in the examples of the previous two sections, for temporal and spatial
separation, hold for spatiotemporal separation as well.
As an example of the weakness of spatiotemporal separation, consider ν0 =
(∅, E[0 ↦ [(1, 2)]], ∅, ∅) and ν1 = (3 ↦ 4, E[0 ↦ [(1, 3)]], ∅, {0}). Observe that ν0 Ĉ ν1
is undefined because the buffer 0 is complete in ν1 (i.e., 0 ∈ Γ1), but nonempty in
ν0. And ν0 ∗̂ ν1 is empty as well because of failed disjointness conditions (e.g.,
1 ∈ dom(B0(0)) ∩ dom(B1(0))). But, in contrast, ν0 #̂ ν1 is nonempty, with:
(3 ↦ 4, E[0 ↦ [(1, 2), (1, 3)]], ∅, {0}) ∈ ν0 #̂ ν1.
Note that the result of a load of either location 1 or 3 on processor 0 in memory
system ν1 above is preserved in the expanded memory systems of ν0 #̂ ν1. This is
important because, as mentioned above, we wish for commands to be local w.r.t.
spatiotemporal separation so that conjunctions derived from it may have sound
corresponding frame rules.
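The overriding operation on lists is defined in an earlier chapter and is not reproduced in this section. The sketch below therefore rests on an assumed reading, inferred only from the properties used here: b0\\b1 is taken to be the set of interleavings of b0 and b1 in which no write from b1 precedes a write from b0 to the same location. The function name override is hypothetical.

```python
from itertools import combinations


def override(b0, b1):
    """Assumed reading of list overriding: order-preserving merges of b0 and b1
    in which every b0-write to a location comes before every b1-write to it."""
    n, m = len(b0), len(b1)
    results = set()
    for chosen in combinations(range(n + m), n):   # positions given to b0's writes
        slots, i0, i1 = set(chosen), iter(b0), iter(b1)
        merged = [(0, next(i0)) if k in slots else (1, next(i1)) for k in range(n + m)]
        later_locs, ok = set(), True
        for src, (loc, _val) in merged:
            if src == 1:
                later_locs.add(loc)        # location already written by b1
            elif loc in later_locs:        # a b0-write would follow a b1-write
                ok = False
                break
        if ok:
            results.add(tuple(write for _src, write in merged))
    return results


# Buffer 0 of the example: the earlier write (1, 2) must precede (1, 3).
print(override(((1, 2),), ((1, 3),)))        # {((1, 2), (1, 3))}
# Writes to distinct locations may be ordered either way.
print(len(override(((1, 2),), ((2, 3),))))   # 2
```

Under this reading the concatenation b0 ++ b1 is always a member of the result, and the result coincides with the set of all interleavings when the two buffers write disjoint locations, which matches the two facts the text attributes to Lemma 2.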
As further explanation of the definition of spatiotemporal separation, consider
a load of location ℓ on processor i in an arbitrary memory system ν1 for which ℓ
is defined. Let us consider the manners in which ν1 can be extended while preserving
the resultant value of the load.
1. We may augment ν1 with additional buffered writes to locations distinct from
ℓ regardless of their ordering with respect to writes to ℓ already present. The
resultant value of the load command is not affected by buffered writes to
locations not being loaded.
2. We may augment ν1 with additional buffered writes to address ℓ on buffer
i if those writes occur before the most recent writes to ℓ on i. The load
command only returns the most recent buffered write, so additional earlier
buffered writes will not affect the result. But we may not augment ν1 with
additional buffered writes to address ℓ on i that are more recent than those
already present, for these additional writes can affect the outcome of the load.⁴
3. We may augment ν1 with additional committed writes to locations distinct
from ℓ regardless of their ordering with respect to writes to ℓ already present.
The resultant value of the load command is not affected by committed writes
to locations not being loaded.
4. We may augment ν1 with additional committed writes to address ℓ if those
writes again precede previously committed writes to ℓ in ν1. (Because committed
writes implicitly precede all buffered writes, this is consistent with
the previous scenario in which the ith buffer is safely augmented with earlier
writes.) But we may not add additional committed writes that succeed
previously committed writes, for those could be observed by the load.⁵
5. We may augment ν1 with additional writes to locations distinct from ℓ on
other buffers j, with j ≠ i, as well, regardless of their ordering with respect
to existing ℓ-writes on buffer j. Although those writes may commit before or
after the ℓ-writes being loaded by i, they do not affect the result.
6. Finally, consider writes to address ℓ on buffer j with j ≠ i. In general, this
may adversely affect the load on i because we are necessarily unable to predict
the manner in which these writes buffered by j will commit to memory. For
example, it is possible that they will commit after buffered writes on i have
committed but before the load has completed, thus affecting the result of the
load. So it would seem that such writes must be disallowed.
But there is an important situation in which it is safe to augment ν1 in this
way: namely, when buffer j is blocked. Then, writes buffered by j will not be
committed to memory, and so there is no risk that they will be made visible
to the load on i. Hence, we may augment other buffers with writes to ℓ when
j is blocked.
⁴ Of course, if additional later buffered writes have the same value as the most recent buffered
write to ℓ then they may not affect the value of the load, but neither is that obvious nor does it seem
worthwhile to consider seriously such a daring corner case in the definition of a logical connective.
⁵ In case there are buffered writes on i, it would also be safe to add committed writes to ℓ which
succeed existing committed writes. But factoring in this behavior would make a compositional,
associative definition of the logical connectives difficult.
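Scenarios 1, 2 and 4 above can be illustrated with a toy model of a load. The model is an assumption made for illustration: a processor's buffer is a list of (location, value) pairs ordered oldest first, a load returns the newest buffered write to the location on the loading processor's own buffer, and otherwise falls back to the heap. The function name load is hypothetical.

```python
def load(heap, buf, loc):
    """Value observed by a processor with write buffer buf when loading loc."""
    for (l, v) in reversed(buf):      # scan from the most recent write backwards
        if l == loc:
            return v
    return heap[loc]                  # no buffered write: read committed memory


heap = {1: 0}
present = [(1, 3)]                    # write to location 1 already on buffer i

print(load(heap, present, 1))                 # 3
print(load(heap, [(2, 9)] + present, 1))      # 3: other location (scenario 1)
print(load(heap, [(1, 2)] + present, 1))      # 3: earlier write (scenario 2)
print(load(heap, present + [(1, 2)], 1))      # 2: later write changes the result
print(load({**{1: 7}, **heap}, [], 1))        # 0: earlier committed write (scenario 4)
```

The last line uses dictionary merging as a stand-in for heap overriding, in which the later heap's value wins.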
We can now check the proposed definition of ν0 #̂ ν1 for consistency with
the scenarios above:
1. Buffered writes to locations other than ℓ on buffer i are ordered arbitrarily
w.r.t. existing writes to ℓ by definition of buffer overriding, B0(i)\\B1(i), consistent
with the corresponding scenario above.
2. Buffered ℓ-writes on buffer i necessarily precede any ℓ-writes already present
on buffer i by definition of buffer overriding, B0(i)\\B1(i), consistent with the
corresponding scenario above.
3. Committed writes to locations other than ℓ are ordered arbitrarily w.r.t.
ℓ-writes already present in the heap by definition of heap overriding, h0\\h1,
consistent with the corresponding scenario above.
4. Committed ℓ-writes necessarily precede any ℓ-writes already present in the
heap by definition of heap overriding, h0\\h1, consistent with the corresponding
scenario above.
5. Buffered writes to locations other than ℓ on another buffer j, with j ≠ i, are
ordered arbitrarily w.r.t. existing writes to ℓ by definition of buffer overriding,
B0(j)\\B1(j), consistent with the corresponding scenario above. There are
no ordering constraints among buffered writes on different write buffers by
definition of overriding write buffer arrays, B0\\B1.
6. Finally, in the case of buffered ℓ-writes on another buffer j, with j ≠ i, the
definition is consistent with the corresponding scenario above by way of the
definedness conditions, which ensure that buffer j is blocked when ℓ-writes are
already present in buffer i or in the heap.
As with spatial separation, we overload for convenience the symbol #̂ to
indicate the pointwise lifting of this function to sets of memory systems:
S1 #̂ S2 =df ∪{ν1 #̂ ν2 | ν1 ∈ S1 ∧ ν2 ∈ S2 ∧ ν1^# ν2} .
We use these functions interchangeably when the intended meaning is clear from
context, e.g.:
ν1 #̂ (ν2 #̂ ν3) = ∪{ν1 #̂ ν23 | ν23 ∈ ν2 #̂ ν3} .
In the following, let νu =df (∅, E , ∅, ∅) be an empty generalized memory system
as before. The following proposition asserts some algebraic properties of spatiotemporal
separation.
Proposition 7. For generalized multiprocessor memory systems ν0, ν1, ν2:
• νu #̂ ν0 = ν0 #̂ νu = {ν0}
• ν0 #̂ (ν1 #̂ ν2) = (ν0 #̂ ν1) #̂ ν2
4.4 Concurrent Assertions
Assertions are used to describe certain sets of multiprocessor models—and hence
multiprocessor states and memory systems—and are used to write the pre- and
post-conditions of commands in the specification logic. The core language is defined
by the following grammar:
Assert P ::= b | (P ∨ P ′) | (P ∧ P ′) | (∃x : P ) | (∀x : P ) |
emp | bar_e | lock_e | e ⇝_{e′} e′′ | (P ∗ P ′) | (P −∗ P ′) |
(P C P ′) | (P −C P ′) | (P C− P ′)
The informal meanings of these assertions are as follows. The lifting of a
boolean expression to an atomic formula, disjunction, conjunction and quantification
have the usual meaning. emp describes states with an empty heap and all write
buffers empty. bar_e describes states in which just the eth buffer is empty. lock_e
asserts that processor e holds the lock. e ⇝_{e′} e′′ describes a single write to location
e with value e′′, either buffered on processor e′ or flushed to memory. The temporal
separating conjunction (P C P ′) describes per-write buffer concatenation of writes.
The spatial separating conjunction (P ∗ P ′) requires disjointness of the locations
described by P and P ′, and interleaves the described writes on each write buffer
instead of concatenating them. The spatial separating implication (P −∗ P ′) and
the left and right temporal separating implications (P −C P ′) and (P C− P ′) are
analogous to the separating implications defined earlier in Section 3.5.3.
The set of free variables of an assertion, written fv(P ), is defined as usual.
For example, fv(∃z : x ⇝_z y C bar_z) = (fv(x ⇝_z y) ∪ fv(bar_z)) \ {z} = ({x, y, z} ∪
{z}) \ {z} = {x, y}.
4.4.1 Concurrent Satisfaction
The meaning of assertions is given by a satisfaction relation M |= P , relating
models M to assertions P . A model is a pair (s, ν) consisting of a stack s and a
generalized multiprocessor memory system ν. The satisfaction relation is defined by
recursion on the structure of P below in Figure 4.4. For concision, the text of the
definition makes use of the following auxiliary set definitions:
emp =df {(h,B,K,Γ) | h = ∅ ∧ B = E ∧ K = ∅}
bar(i) =df {(h,B,K,Γ) ∈ emp | i ∉ P ∨ i ∈ Γ}
lock(i) =df {(h,B,K,Γ) | h = ∅ ∧ B = E ∧ i ∈ P ∧ K = P \ {i}}
pending(i, ℓ, v) =df {(h,B,K,Γ) | h = ∅ ∧ B = E[i ↦ [(ℓ, v)]] ∧ K ⊆ {i}}
flushed(i, ℓ, v) =df {(h,B,K,Γ) | h = ℓ ↦ v ∧ B = E ∧ K = ∅ ∧ i ∈ Γ}
leads-to(i, ℓ, v) =df {(h,B,K,Γ) ∈ pending(i, ℓ, v) ∪ flushed(i, ℓ, v) | i ∈ P}
s, ν |= b ≡df ŝ(b) = 1
s, ν |= P ∨ Q ≡df s, ν |= P ∨ s, ν |= Q
s, ν |= P ∧ Q ≡df s, ν |= P ∧ s, ν |= Q
s, ν |= ∃x : P ≡df ∃v ∈ V : s[x ↦ v], ν |= P
s, ν |= ∀x : P ≡df ∀v ∈ V : s[x ↦ v], ν |= P
s, ν |= emp ≡df ν ∈ emp
s, ν |= bar_e ≡df ν ∈ bar(ŝ(e))
s, ν |= lock_e ≡df ν ∈ lock(ŝ(e))
s, ν |= e ⇝_{e′} e′′ ≡df ν ∈ leads-to(ŝ(e′), ŝ(e), ŝ(e′′))
s, ν |= P ∗ P ′ ≡df ∃ν0, ν1 : ν ∈ ν0 ∗̂ ν1 ∧ s, ν0 |= P ∧ s, ν1 |= P ′
s, ν |= P C P ′ ≡df ∃ν0, ν1 : ν = ν0 Ĉ ν1 ∧ s, ν0 |= P ∧ s, ν1 |= P ′
s, ν |= P −∗ P ′ ≡df ∀ν0, ν1 : s, ν0 |= P ∧ ν1 ∈ ν0 ∗̂ ν ⇒ s, ν1 |= P ′
s, ν |= P −C P ′ ≡df ∀ν0, ν1 : s, ν0 |= P ∧ ν1 = ν0 Ĉ ν ⇒ s, ν1 |= P ′
s, ν |= P C− P ′ ≡df ∀ν0, ν1 : s, ν1 |= P ′ ∧ ν0 = ν Ĉ ν1 ⇒ s, ν0 |= P
Figure 4.4: Concurrent satisfaction relation
The classical atomic formulas and logical operations are defined as usual. The
emp assertion describes generalized memory systems with empty heaps, empty write
buffers, and with no processors blocked. The bar_i assertion additionally requires
that i be buffer-complete. The lock_i assertion is like emp, but requires that all
processors except for i be blocked. The leads-to assertion e ⇝_i f is modeled by
two classes of generalized memory systems: those that are pending, with an empty
heap, a single buffered write, and in which at most i is blocked; and those that
are flushed, with a single-point heap, all buffers empty, and in which no processors
are blocked. The separating conjunctions and implications are defined with the
semantic functions described in Section 4.3.
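The auxiliary sets can be read as membership predicates, as in the following sketch. The encoding is assumed for illustration: two processors, a model written as a tuple (heap dict, buffer dict, blocked set, complete set), and hypothetical predicate names such as in_emp and in_leads_to.

```python
PROCS = frozenset({0, 1})                 # assumption: two processors
EMPTY = {i: () for i in PROCS}            # the array of empty write buffers


def in_emp(nu):
    h, B, K, _G = nu
    return h == {} and B == EMPTY and K == frozenset()


def in_bar(i, nu):
    # Members of emp in which i is not a processor or i is buffer-complete.
    return in_emp(nu) and (i not in PROCS or i in nu[3])


def in_lock(i, nu):
    h, B, K, _G = nu
    return h == {} and B == EMPTY and i in PROCS and K == PROCS - {i}


def in_pending(i, loc, v, nu):
    h, B, K, _G = nu
    return h == {} and B == {**EMPTY, i: ((loc, v),)} and K <= {i}


def in_flushed(i, loc, v, nu):
    h, B, K, G = nu
    return h == {loc: v} and B == EMPTY and K == frozenset() and i in G


def in_leads_to(i, loc, v, nu):
    return i in PROCS and (in_pending(i, loc, v, nu) or in_flushed(i, loc, v, nu))


pending = ({}, {0: ((1, 2),), 1: ()}, frozenset({0}), frozenset())
flushed = ({1: 2}, EMPTY, frozenset(), frozenset({0}))
print(in_leads_to(0, 1, 2, pending), in_leads_to(0, 1, 2, flushed))  # True True
```

Both a pending model (write still buffered, processor 0 possibly blocked) and a flushed model (write committed, processor 0 buffer-complete) satisfy the same leads-to predicate.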
Remark. The choice of blocked sets in the leads-to equation deserves additional
explanation. In the case of models with a pending write on processor i, the blocked
set may or may not contain i—the pending write may not have flushed to memory
because the buffer on which it is enqueued is blocked; or, if the buffer is not blocked,
it may simply have not yet flushed. In the case of models in which the write has
flushed, i must not be blocked because, indeed, the write has flushed, which can
only happen if the buffer is unblocked.
It is also instructive to consider the models of the leads-to assertion in
separating conjunction with the lock assertions: e ⇝_i e′ ∗ lock_j . The only blocked
set of the models of lock_j is P \ {j}, which is maximal w.r.t. set inclusion. (Operationally,
no lock state corresponds to every processor being blocked.) Consequently,
because the separating conjunction requires that the blocked sets of the constituent
models be disjoint, the only models of e ⇝_i e′ that are compatible with a model of
lock_j are those with an empty blocked set; i.e., those in which i is not blocked. An
informal interpretation of an empty blocked set is thus that of a model in which the
lock status is wholly unconstrained.
Incompatibility of models of the leads-to assertion in which i is blocked is
intuitive in case i = j: a blocked set that consists solely of i ought not to be
compatible with one that consists of all processors except i. But in case i ≠ j, the
incompatibility is less obvious. Why must a model of a leads-to assertion which
asserts that i is blocked be incompatible with a model of a lock assertion which
asserts that many processors including i are blocked? The reason is that the lock
status may be dynamically changed by the lock-manipulation primitives, which if
compatibility were relaxed could lead to inconsistency. That is, even if a model
of e ⇝_i e′ in which i is necessarily blocked were allowed to be compatible with
models of lock_i, it surely ought not be compatible with models of lock_j , with j ≠ i;
but the lock-manipulation primitives can effect just this change. The separating
conjunctions are designed to isolate parts of the state from such changes, and such
a relaxed notion of compatibility would lead to unsoundness of the frame rules in
the context of lock manipulation primitives.
Similarly, even if the semantics were relaxed so that lock_i ∗ lock_i ≡ lock_i,
we must still ensure that lock_i ∗ lock_j is inconsistent when i ≠ j. But then we
could derive an invalid specification as follows:
...
J ⊢ {lock_0} unlock_0 || lock_1 {lock_1}
...
J ⊢ {lock_0 ∗ lock_0} unlock_0 || lock_1 {lock_0 ∗ lock_1}   (frame-sp)
J ⊢ {lock_0} unlock_0 || lock_1 {false}   (cons)
We write [[P ]] for the set of models that satisfy P ,
[[P ]] =df {M | M |= P} ,
and also P |= P ′ and P ≡ P ′ for semantic entailment and equivalence, respectively:
P |= P ′ ≡df [[P ]] ⊆ [[P ′]]
P ≡ P ′ ≡df [[P ]] = [[P ′]].
4.4.2 Additional Concurrent Assertions
As before, we shall define the strong temporal separating conjunction as the additive
conjunction of spatial and temporal separating conjunctions:
P J P ′ =df (P ∗ P ′) ∧ (P C P ′).
Again, the spatial separating conjunction P ∗ Q is commutative, so there is
no need to define its converse operation, but the temporal separating conjunction is
not. Hence, we define P B Q as shorthand for Q C P , and similarly for P I Q:
P B Q =df Q C P
P I Q =df Q J P.
In the sequential assertion logic we defined the points-to assertion e ↦ e′ as
an abbreviation of a leads-to assertion followed temporally by a barrier assertion.
In the concurrent assertion logic, however, the situation is not as simple. The leads-to
assertions are here annotated with expressions indicating the identifier of the
processor on which they are buffered, and similarly for barrier assertions. We might
try to define a points-to assertion as a leads-to assertion on a certain processor
temporally conjoined with a barrier assertion on the same processor:
e ↦ e′ =df ∃x : e ⇝_x e′ C bar_x
The intuition behind this abbreviation is that the result of flushing on one processor
should be indistinguishable from the result of flushing the same write on a differ-
ent processor, and hence the choice of a particular process from which the write
originated is unimportant. But, in the model described in this document, this is
only true of models for which all buffers are complete; and in general models of
assertions need not have all buffers complete. As a consequence of the abbreviation
above, among the models of $1 \leadsto_0 1 \lhd 2 \mapsto 2$ are included some in which the write to
location 1 is buffered on processor 0, whereas there should arguably be none.
The correct definition instead temporally conjoins a leads-to formula with
another that describes the flushing of all buffers:
\[ e \mapsto e' =_{\mathrm{df}} (\exists x : e \leadsto_x e') \lhd (\forall x : \mathsf{bar}_x). \]
The left-hand conjunct describes a possibly buffered write on some processor, while
the right-hand conjunct describes the result of flushing every processor. Hence,
the models of $e \mapsto e'$ describe only a flushed write with all processors buffer-complete.
Note that if $x$ does not denote a processor identifier, then $e \leadsto_x e'$ is not satisfied by
any model, whereas in the same circumstance $\mathsf{bar}_x$ is instead satisfied by the same
models as $\mathsf{emp}$.
4.4.3 Concurrent Algebra
A few additional semantic equivalences and entailments are shown in Figures 4.5
and 4.6, respectively. If a formula contains instances of $\bullet$ or $\circ$, then that is
shorthand for the same formula in which each of these symbols has been consistently
replaced by any of the separating conjunctions.
As in the sequential assertion logic, each of the separating conjunctions is
associative and has $\mathsf{emp}$ as a unit. And the spatial separating conjunction is again
commutative. The separating conjunctions are additive w.r.t. the barrier assertion
$\mathsf{bar}_e$, and $\mathsf{bar}_e$ also distributes fully through the separating conjunctions. Again,
the strong and weak temporal separating conjunctions of barrier assertions are
equivalent. The separating conjunctions are multiplicative w.r.t. the lock assertion $\mathsf{lock}_e$
and also symmetric. $\mathsf{bar}_e$ also commutes with $\mathsf{lock}_e$, even for the sequential
conjunctions. Note that the result of flushing successive writes to the same location is
equivalent to flushing just the most recent write.
\[
\begin{aligned}
P \bullet \mathsf{emp} &\equiv P \\
\mathsf{emp} \bullet P &\equiv P \\
(P \bullet P') \bullet P'' &\equiv P \bullet (P' \bullet P'') \\
P * P' &\equiv P' * P \\
P \bullet \mathsf{lock}_e &\equiv \mathsf{lock}_e \bullet P \\
\mathsf{lock}_e \bullet \mathsf{lock}_f &\equiv \mathsf{false} \\
\mathsf{bar}_e \bullet \mathsf{bar}_e &\equiv \mathsf{bar}_e \\
(P \bullet P') \lhd \mathsf{bar}_e &\equiv (P \lhd \mathsf{bar}_e) \bullet (P' \lhd \mathsf{bar}_e) \\
P \blacktriangleleft \mathsf{bar}_e &\equiv P \lhd \mathsf{bar}_e \\
\mathsf{lock}_e \lhd \mathsf{bar}_e &\equiv \mathsf{lock}_e * \mathsf{bar}_e \\
e \leadsto_{e'} e'' * e \leadsto_{f'} f'' &\equiv \mathsf{false} \\
e \leadsto_{e'} e'' \lhd e \leadsto_{e'} f \lhd \mathsf{bar}_{e'} &\equiv e \leadsto_{e'} f \lhd \mathsf{bar}_{e'}
\end{aligned}
\]
Figure 4.5: Concurrent semantic equivalences
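\[
\begin{array}{rcll}
\mathsf{bar}_e &\models& \mathsf{emp} & \\
P \bullet P' &\models& P'' \bullet P' & \text{if } P \models P'' \\
P \bullet P' &\models& P \bullet P'' & \text{if } P' \models P'' \\
e \mapsto f &\models& e \leadsto_{e'} f & \\
P \blacktriangleleft P' &\models& P * P' & \\
P \blacktriangleleft P' &\models& P \lhd P' & \\
P \bullet (P' \circ P'') &\models& (P \bullet P') \circ P'' & \text{for } P \bullet P' \models P \circ P' \\
(P \circ P') \bullet P'' &\models& P \circ (P' \bullet P'') & \text{for } P \bullet P' \models P \circ P' \\
(P * P') \blacktriangleleft (P'' * P''') &\models& (P \blacktriangleleft P'') * (P' \blacktriangleleft P''') &
\end{array}
\]
Figure 4.6: Concurrent semantic entailments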
The first three entailments in Figure 4.6 (that $\mathsf{bar}_e$ strengthens $\mathsf{emp}$, and
monotonicity of the separating conjunctions) follow directly from the definition of
the satisfaction relation. Points-to strengthens leads-to by its abbreviation expansion
and monotonicity of the weak temporal separating conjunction, along with the
fact that $(\forall x : \mathsf{bar}_x) \models \mathsf{emp}$. The next two entailments follow from their
respective abbreviation expansions and by monotonicity of $\land$. The three separating
conjunctions naturally form a sort of lattice, and they satisfy the small exchange
laws. The full exchange law only holds for the spatial and strong temporal conjunctions.
The law does not hold for the other separating conjunctions because, e.g., it
implies commutativity of the principal connective in the consequent, and the other
conjunctions are not commutative.
4.4.4 Flushing Closure
Assertions can thus be thought of as syntactic constructs that denote sets of mul-
tiprocessor models, and hence sets of generalized multiprocessor machine states.
These sets of models have a special structure: namely, they are down-closed w.r.t. a
partial order, which encompasses the flushing partial order on multiprocessor mem-
ory systems. This section formalizes this property.
The flushing order on multiprocessor memory systems, described in Section 3.4.4,
is extended to a generalized order on generalized multiprocessor memory
systems as follows:
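Definition 6. For generalized multiprocessor memory systems $\nu_1 = (\mu_1, \Gamma_1)$ and
$\nu_2 = (\mu_2, \Gamma_2)$,
\[
\begin{aligned}
\nu_2 \xrightarrow{\hat\tau,\,i} \nu_1 &\equiv_{\mathrm{df}} \Gamma_2 \subseteq \Gamma_1 \land i \in \Gamma_1 \land \mu_2 \xrightarrow{\tau,\,i} \mu_1 \\
\nu_2 \xrightarrow{\hat\tau} \nu_1 &\equiv_{\mathrm{df}} \exists i \in \mathcal{P} : \nu_2 \xrightarrow{\hat\tau,\,i} \nu_1 \\
\nu_1 \le \nu_2 &\equiv_{\mathrm{df}} \nu_2 \xrightarrow{\hat\tau}{}^{*}\ \nu_1
\end{aligned}
\]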
Informally, $\nu_1 \le \nu_2$ if $\nu_2$ can flush its buffer-complete processors to arrive at
$\nu_1$. The fact that this order on generalized multiprocessor memory systems
encompasses the flushing order on multiprocessor memory systems is formalized by the
following proposition.
Proposition 8. If µ0  µ1 then for all Γ1 there exists Γ0 such that Γ0 ⊇ Γ1 and
(µ0,Γ0) ≤ (µ1,Γ1).
Proof. If µ0  µ1, then there exists some sequence of memory systems such that




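$\mu'_0 \xrightarrow{\tau} \cdots \xrightarrow{\tau} \mu'_n$, where $\mu'_0 = \mu_1$ and $\mu'_n = \mu_0$. For each step in the above sequence,
there exists a processor identifier $i$ such that $\mu'_m \xrightarrow{\tau,\,i} \mu'_{m+1}$. Let $\Gamma'$ be the set of these
identifiers, and $\Gamma_0 = \Gamma_1 \cup \Gamma'$.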
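The flushing order can be explored concretely with a small sketch. It is a simplification under stated assumptions: a memory system is modelled only as a heap (a dict) plus one FIFO write buffer per processor (a tuple of `(location, value)` pairs), and the buffer-completeness flags and lock state are omitted. The function names are hypothetical.

```python
def flush_step(heap, buffers, i):
    """Commit the oldest buffered write of processor i; None if buffer i is empty."""
    if not buffers[i]:
        return None
    (loc, val), rest = buffers[i][0], buffers[i][1:]
    new_heap = dict(heap)
    new_heap[loc] = val
    return new_heap, buffers[:i] + (rest,) + buffers[i + 1:]

def flush_reachable(heap, buffers):
    """Every memory system reachable from (heap, buffers) by zero or more flush steps."""
    seen, out, stack = set(), [], [(heap, buffers)]
    while stack:
        h, b = stack.pop()
        key = (tuple(sorted(h.items())), b)
        if key in seen:
            continue
        seen.add(key)
        out.append((h, b))
        for i in range(len(b)):
            nxt = flush_step(h, b, i)
            if nxt is not None:
                stack.append(nxt)
    return out

def below(lo, hi):
    """lo <= hi in the (simplified) flushing order: hi can flush its way to lo."""
    return lo in flush_reachable(*hi)
```

For instance, a system in which processor 0 has buffered two successive writes to `x` lies above the fully flushed system whose heap holds only the most recent value, which is the same final heap obtained by flushing just that most recent write.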
As in the uniprocessor semantics, a central claim is that the denotation [[P ]]
of each assertion P is closed w.r.t. the generalized flushing order.
Proposition 9 (Flushing Closure). If s, ν |= P and ν ′ ≤ ν then s, ν ′ |= P .
Sketch. By induction on the shape of P . For the separating conjunctions, see, e.g.,
Lemma 7 in Section A.1.
An effect of this is that assertions are oblivious to the nondeterministic flush-
ing of buffered writes to memory.
Corollary 2. If s, µ,Γ |= P and µ′  µ then there exists Γ′ such that Γ′ ⊇ Γ and
s, µ′,Γ′ |= P .
Sketch. By Propositions 9 and 8.
Intuitively, assertions may be thought to describe only the “initial” states
without concern for the nondeterministic flushing of writes that may have taken
place, though the semantics also encompasses all states reachable as a result of these
steps. This is an important feature of the assertion language because flushing
buffered writes, which cannot be controlled, never changes the satisfiability of an
assertion. This is also extremely important for the specification logic, as is described
in the next section.
Remark. Earlier versions of the semantic functions that model the separating
conjunctions incorporated directly the notion of flushing closure. In order to ensure
that an assertion $P \bullet Q$ denoted a closed set of states, the corresponding semantic
function $\tilde{\bullet}$ was defined so that $\mu \mathbin{\tilde{\bullet}} \mu'$ was itself a closed set. For example, the spatial
conjunction of two states with buffered writes included not just the result of each
possible interleaving of the buffers, but also each possible interleaving of the buffers
after arbitrary flushing. Furthermore, the definition of the semantic function for
temporal separation was relaxed to be total instead of partial. In particular, the
result of temporally conjoining an earlier, left-side state that contained buffered writes
to a later, right-side buffer-complete state (e.g., a model of a bar assertion), a
composition that would be undefined according to the current definition, was equivalent
to first completely flushing the left-side state and only then temporally combining
the states as in the current definition. That is, the definition of the semantic
function included an explicit application of a function $\mathrm{flush}(h, b)$, which calculates
the result of flushing a buffer $b$ into a heap $h$.
The relaxations described above made closure arguments straightforward,
but yielded function definitions that were otherwise difficult to reason about. The
semantic functions described in Section 4.3 are simpler and stronger than the earlier
definitions sketched here, but the limited scope and stronger definedness conditions
of the current definitions make them easier to reason about. And the lemmas and
propositions described earlier in this section assert that the simpler definitions are
indeed sufficient for flushing closure.
4.5 Concurrent Specifications
The language of specifications is given by the following schema:
\[ J \vdash \{P\}\ c\ \{Q\}, \]
where $c$ is a command and $J, P, Q$ are assertions, referred to as the invariant,
pre-condition and post-condition, respectively.
4.5.1 Concurrent Proof Theory
The axioms of the logic are given in Figure 4.7.
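\[
\begin{array}{ll}
J \vdash \{P\}\ \mathsf{skip}_i\ \{P\} & (\text{skip}) \\
J \vdash \{!b \lor P\}\ \mathsf{assume}(b)_i\ \{P\} & (\text{assume}) \\
J \vdash \{b \land P\}\ \mathsf{assert}(b)_i\ \{P\} & (\text{assert}) \\
J \vdash \{P[e/x]\}\ (x := e)_i\ \{P\} & (\text{assign}) \\
J \vdash \{e \leadsto_i e' \blacktriangleleft P\}\ (x := [e])_i\ \{(e \leadsto_i e' \blacktriangleleft P) \land x = e'\} & (\text{load}) \\
J \vdash \{e \leadsto_i e'' \blacktriangleleft P\}\ ([e] := e')_i\ \{(e \leadsto_i e'' \blacktriangleleft P) \lhd e \leadsto_i e'\} & (\text{store}) \\
J \vdash \{\mathsf{emp}\}\ \mathsf{fence}_i\ \{\mathsf{bar}_i\} & (\text{fence}) \\
J \vdash \{\mathsf{emp}\}\ \mathsf{lock}_i\ \{\mathsf{lock}_i\} & (\text{lock}) \\
J \vdash \{\mathsf{lock}_i\}\ \mathsf{unlock}_i\ \{\mathsf{emp}\} & (\text{unlock})
\end{array}
\]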
Figure 4.7: Concurrent axioms
The axioms for skip, assume(b), assert(b) and x := e are as in Hoare logic,
identical to the corresponding axioms of the sequential program logic except for the
presence of the (arbitrary) invariant J . The load and store axioms are also similar to
those of the sequential program logic, but the writes described by the pre- and post-
condition are annotated with the processor identifier that annotates the command,
which indicates the processor on which the writes are buffered. And similarly for the
fence command, which introduces a barrier assertion annotated with the appropriate
processor identifier. Finally, the lock and unlock commands produce and consume,
respectively, a lock assertion annotated by the appropriate processor identifier. The
logical and structural inference rules of the logic are given in Figure 4.8. The logical
rules require little explanation: besides the addition of the invariant assertion, they
are unchanged from the previous chapter. Note again that, although the strong
temporal conjunction is defined as the additive conjunction of spatial and temporal
conjunctions, its frame rule is not derivable from the other rules, which justifies
the rule’s inclusion. The spatial separating conjunction is commutative, but the
others—which have a temporal aspect—are not. As in the sequential program logic,
we have left-side frame rules only for these conjunctions.
Unlike the sequential logic, there is no conjunction rule in the concurrent
logic. This is because there is a known connection [20] between the soundness of
the conjunction rule and a semantic property called precision, which we have not
yet identified in this model.
The structural rules for sequential composition, nondeterministic choice,
loops and concurrency are similar in spirit to those in Concurrent Separation Logic.
In particular, the concurrency rule requires syntactic agreement between the two
processes on the shared invariant, and the strong interleaving separating conjunc-
tion is used to partition the local states. Note that both this conjunction and the
concurrent composition command are commutative. The sharing rule allows us to
infer that, if a portion of the state is invariant w.r.t. the assertion J , then J may
also be assumed to hold initially and upon termination.
Only the invariant rule differs significantly from Concurrent Separation Logic.
Here, we require that the lock be held before accessing the shared state; and that,
under the assumption that the shared state satisfied the invariant initially, upon
completion of the command the shared state must again be shown to satisfy the in-
variant. In the subsection below, we discuss a number of variations on this important
rule.
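\[
\begin{gathered}
\dfrac{J \vdash \{P\}\ c\ \{Q\} \qquad J \vdash \{P'\}\ c\ \{Q\}}{J \vdash \{P \lor P'\}\ c\ \{Q\}}\ (\text{disj}) \\[1ex]
\dfrac{J \vdash \{P\}\ c\ \{Q\} \qquad x \notin \mathrm{fv}(c, Q)}{J \vdash \{\exists x : P\}\ c\ \{Q\}}\ (\text{ex}) \\[1ex]
\dfrac{J \vdash \{P\}\ c\ \{Q\} \qquad \mathrm{mod}(c) \cap \mathrm{fv}(R) = \emptyset}{J \vdash \{R * P\}\ c\ \{R * Q\}}\ (\text{frame-sp}) \\[1ex]
\dfrac{J \vdash \{P\}\ c\ \{Q\} \qquad \mathrm{mod}(c) \cap \mathrm{fv}(R) = \emptyset}{J \vdash \{R \lhd P\}\ c\ \{R \lhd Q\}}\ (\text{frame-tm}) \\[1ex]
\dfrac{J \vdash \{P\}\ c\ \{Q\} \qquad \mathrm{mod}(c) \cap \mathrm{fv}(R) = \emptyset}{J \vdash \{R \blacktriangleleft P\}\ c\ \{R \blacktriangleleft Q\}}\ (\text{frame-stm}) \\[1ex]
\dfrac{P \models P' \qquad J \vdash \{P'\}\ c\ \{Q'\} \qquad Q' \models Q}{J \vdash \{P\}\ c\ \{Q\}}\ (\text{cons}) \\[1ex]
\dfrac{J \vdash \{P\}\ c\ \{R\} \qquad J \vdash \{R\}\ c'\ \{Q\}}{J \vdash \{P\}\ c \,;\, c'\ \{Q\}}\ (\text{seq}) \\[1ex]
\dfrac{J \vdash \{P\}\ c\ \{Q\} \qquad J \vdash \{P\}\ c'\ \{Q\}}{J \vdash \{P\}\ c + c'\ \{Q\}}\ (\text{choice}) \\[1ex]
\dfrac{J \vdash \{P\}\ c\ \{P\}}{J \vdash \{P\}\ c^{*}\ \{P\}}\ (\text{loop}) \\[1ex]
\dfrac{J \vdash \{P_i\}\ c_i\ \{Q_i\} \qquad \mathrm{fv}(P_i, c_i, Q_i) \cap \mathrm{mod}(c_{1-i}) = \emptyset \quad \text{for } i \in \{0, 1\}}{J \vdash \{P_0 * P_1\}\ c_0 \parallel c_1\ \{Q_0 * Q_1\}}\ (\text{conc}) \\[1ex]
\dfrac{J \vdash \{P\}\ c\ \{Q\}}{\mathsf{emp} \vdash \{J * P\}\ c\ \{J * Q\}}\ (\text{share}) \\[1ex]
\dfrac{\mathsf{emp} \vdash \{J * P * \mathsf{lock}_i\}\ c\ \{J * Q * \mathsf{lock}_i\}}{J \vdash \{P * \mathsf{lock}_i\}\ c\ \{Q * \mathsf{lock}_i\}}\ (\text{inv})
\end{gathered}
\]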
Figure 4.8: Concurrent inference rules
Derived and Alternative Axioms and Inference Rules
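Loading flushed writes Observe that the write in the pre-condition of both the
load and store axioms need not be buffered: the placeholder assertion $P$ can be
instantiated so that a barrier on processor $i$ immediately follows the write. For the
load axiom, taking $P$ to be $\mathsf{bar}_i \lhd P'$ yields:
\[ J \vdash \{e \leadsto_i e' \blacktriangleleft (\mathsf{bar}_i \lhd P')\}\ (x := [e])_i\ \{(e \leadsto_i e' \blacktriangleleft (\mathsf{bar}_i \lhd P')) \land x = e'\} \]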
It is interesting to note though that this only describes the result of loading a write in
memory that originated from the processor performing the load. We can, however,
derive variants of the load and store axioms which allow us to reason about writes
flushed by an arbitrary processor in case all buffers are complete, i.e., variants of
the load and store axioms that make use of points-to assertions instead of leads-to
assertions, as follows:
\[
\begin{array}{ll}
J \vdash \{e \mapsto e' \blacktriangleleft P\}\ (x := [e])_i\ \{(e \mapsto e' \blacktriangleleft P) \land x = e'\} & (\text{load-p}) \\
J \vdash \{e \mapsto e'' \blacktriangleleft P\}\ ([e] := e')_i\ \{(e \mapsto e'' \blacktriangleleft P) \blacktriangleleft e \leadsto_i e'\} & (\text{store-p})
\end{array}
\]
These axioms can be derived by instantiating $P$ by $(\forall x : \mathsf{bar}_x) \blacktriangleleft P$ and with the
rule of consequence, using the following equivalence:
\[ e \leadsto_i e' \blacktriangleleft ((\forall x : \mathsf{bar}_x) \blacktriangleleft P) \equiv ((\exists x : e \leadsto_x e') \lhd (\forall x : \mathsf{bar}_x)) \blacktriangleleft P \]
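Stronger lock-manipulation axioms The given axioms for the lock and unlock
commands are sound but rather weak insofar as they do not reflect the implicit
fencing that is coincident with these commands. The axioms, however, can be
strengthened to take this into account as follows:
\[
\begin{array}{ll}
J \vdash \{\mathsf{emp}\}\ \mathsf{lock}_i\ \{\mathsf{lock}_i * \mathsf{bar}_i\} & (\text{lock}) \\
J \vdash \{\mathsf{lock}_i\}\ \mathsf{unlock}_i\ \{\mathsf{bar}_i\} & (\text{unlock})
\end{array}
\]
Using the frame rules, we can also derive “global” axioms for lock and unlock, as
well as fence:
\[
\begin{array}{ll}
J \vdash \{P\}\ \mathsf{fence}_i\ \{P \lhd \mathsf{bar}_i\} & (\text{fence-g}) \\
J \vdash \{P\}\ \mathsf{lock}_i\ \{\mathsf{lock}_i * P \lhd \mathsf{bar}_i\} & (\text{lock-g}) \\
J \vdash \{P * \mathsf{lock}_i\}\ \mathsf{unlock}_i\ \{P \lhd \mathsf{bar}_i\} & (\text{unlock-g})
\end{array}
\]
We can also derive “backward” variations of the global rules, using the temporal
and spatial separating implications:
\[
\begin{array}{ll}
J \vdash \{P \mathbin{\lhd\!-} \mathsf{bar}_i\}\ \mathsf{fence}_i\ \{P\} & (\text{fence-b}) \\
J \vdash \{(P \mathbin{\lhd\!-} \mathsf{bar}_i) \mathbin{*\!-} \mathsf{lock}_i\}\ \mathsf{lock}_i\ \{P\} & (\text{lock-b}) \\
J \vdash \{(P \mathbin{\lhd\!-} \mathsf{bar}_i) * \mathsf{lock}_i\}\ \mathsf{unlock}_i\ \{P\} & (\text{unlock-b})
\end{array}
\]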
Locked and atomic commands Using the invariant rule and the axioms for lock
and unlock, we can derive the following rule for reasoning about “locked” commands,
such as those implemented on x86 like atomic increment or compare-and-swap:
\[
\dfrac{\mathsf{emp} \vdash \{(J * P * \mathsf{lock}_i) \lhd \mathsf{bar}_i\}\ c\ \{J * Q * \mathsf{lock}_i\}}{J \vdash \{P\}\ \mathsf{lock}_i \,;\, c \,;\, \mathsf{unlock}_i\ \{Q \lhd \mathsf{bar}_i\}}\ (\text{locked})
\]
A shortcoming of this derived rule specifically, and the existing locking axioms
and invariant inference rules generally, is that the shared invariant must be general
enough to describe buffered writes that may never be observed because of the fencing
implicit with the lock commands. For example, the following specification is
true but not provable with the current rules:
\[ x \mapsto 1 \vdash \{\mathsf{emp}\}\ \mathsf{lock}_0 \,;\, ([x] := 1)_0 \,;\, \mathsf{unlock}_0\ \{\mathsf{emp}\} \]
To prove this with the derived locked rule above, we would have to prove the
following specification:
\[ \mathsf{emp} \vdash \{x \mapsto 1 * \mathsf{lock}_0\}\ ([x] := 1)_0\ \{x \mapsto 1 * \mathsf{lock}_0\}, \]
but this is false because the post-condition of the store command yields a buffered
write that has not necessarily flushed, yet the post-condition requires that the buffer
be empty.
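The failure can be seen operationally in a small sketch. It assumes a simplified memory system (a heap dict plus one FIFO write buffer per processor) and reads the heap-only invariant $x \mapsto 1$ as "the heap holds 1 at x and every buffer is empty"; the helper names are hypothetical and the lock itself is not modelled.

```python
def store(buffers, i, loc, val):
    """A store on processor i only appends a write to the top of buffer i."""
    return buffers[:i] + (buffers[i] + ((loc, val),),) + buffers[i + 1:]

def flush_all(heap, buffers, i):
    """Commit every buffered write of processor i in FIFO order,
    as the fencing implicit in unlock would."""
    heap = dict(heap)
    for loc, val in buffers[i]:
        heap[loc] = val
    return heap, buffers[:i] + ((),) + buffers[i + 1:]

def points_to(heap, buffers, loc, val):
    """Simplified reading of loc |-> val: value is in the heap, all buffers empty."""
    return heap.get(loc) == val and all(len(b) == 0 for b in buffers)

heap0, bufs0 = {'x': 1}, ((), ())
bufs1 = store(bufs0, 0, 'x', 1)               # state right after the store
heap2, bufs2 = flush_all(heap0, bufs1, 0)     # state after the implicit fence
```

Immediately after the store the invariant fails on `(heap0, bufs1)` because the write is still buffered; it holds again on `(heap2, bufs2)` once processor 0 has been flushed.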
We remedy this by providing an alternative, stronger invariant axiom that
uses the expansion of the invariant, and relies upon commands being well-locked:
\[
\dfrac{\mathsf{emp} \vdash \{(J * P * \mathsf{lock}_i) \lhd \mathsf{bar}_i\}\ c\ \{(J * Q * \mathsf{lock}_i) \mathbin{\lhd\!-} \mathsf{bar}_i\}}{J \vdash \{P\}\ \mathsf{lock}_i \,;\, c \,;\, \mathsf{unlock}_i\ \{Q\}}\ (\text{atomic})
\]
(We call this the atomic rule to distinguish it from the locked derived rule, though
they apply to the same commands.) This inference rule allows for a stronger
invariant, such as in the previous example.
Daring rules for accessing shared state We may also consider additional
“daring” inference rules, the soundness of which may well be quite difficult to
demonstrate.
\[
\begin{gathered}
\dfrac{\mathsf{emp} \vdash \{J * P\}\ (x := [e])_i\ \{J * Q\}}{J \vdash \{P\}\ (x := [e])_i\ \{Q\}}\ (\text{daring-load}) \\[1ex]
\dfrac{\mathsf{emp} \vdash \{J * P\}\ ([e] := e')_i\ \{J * Q\}}{J \vdash \{P\}\ ([e] := e')_i\ \{Q\}}\ (\text{daring-store})
\end{gathered}
\]
These rules differ from the invariant rule because they allow reasoning about the
behavior of individual load and store instructions in which the value of the lock is
unspecified. Intuitively, the shared load rule might be shown to be true because a
load may only proceed on a live processor, and so will never access shared state while
it is being modified by another process, which holds the lock. On the other hand,
the shared store axiom might be shown to be true because although a store may
take place while another process is modifying the shared state—and hence while
the shared state does not satisfy the stated invariant J—the buffered write will not
commit until the other process has released the lock and repaired the shared state,
restoring the invariant.
A problem with the daring store rule is that it requires the invariant J to
encompass the newly added buffered write; but heap-only invariants are easier to
describe and more general. A possible relaxation of the daring store rule is thus as
follows:
\[
\dfrac{\mathsf{emp} \vdash \{J * P\}\ ([e] := e')_i\ \{(J \mathbin{\lhd\!-} \mathsf{bar}_i) * Q\}}{J \vdash \{P\}\ ([e] := e')_i\ \{Q\}}\ (\text{daring-store-2})
\]
This relaxation would allow a daring store if the new write when flushed would
maintain the invariant. Unfortunately, this rule is almost certainly not sound w.r.t.
the semantics of specification described in the next section, because it is simply not
the case that the state necessarily satisfies the invariant after the store; only that
it is somehow observationally equivalent, in that a load of the shared region shall
be compatible with the result of a load that exactly satisfies the stated invariant.
Exploration of the soundness of these rules is thus considered future work.
The daring rules may be needed to reason about, e.g., x86 spinlock implementations.
The spinlock is typically acquired using a compare-and-swap instruction,
which in this language is simply a locked if-then-else command. The invariant rule
and lock axioms thus should be sufficient for demonstrating correctness of spinlock
acquisition. But the spinlock is released by writing to a shared memory address
without first acquiring the global lock or fencing. This obviates the invariant rule,
but not the shared write rule, and so there is yet hope.
4.5.2 Semantics of Concurrent Specifications
A specification asserts the partial correctness of a command. Its informal meaning
is roughly analogous to that of Concurrent Separation Logic: if c is evaluated in a
state that satisfies J ∗ P , then: 1) it does not abort, 2) it maintains the invariant
J during execution, and 3) if it evaluates fully, it terminates in a state that satisfies
J ∗ Q.
Following Vafeiadis [54], the formal semantics of specifications is given by
a family of predicates, $\mathsf{safe}_n(c, s, \mu, J, Q)$, parametrized by $n \in \mathbb{N}$, that relate a
command $c$, state $(s, \mu)$, invariant assertion $J$ and post-condition $Q$ according to
the informal explanation above. Once these predicates are defined, we define truth
of specifications as follows:
\[ J \models \{P\}\ c\ \{Q\} \equiv_{\mathrm{df}} \forall s : \forall \mu : (s, \mu, \mathcal{P}) \models P \Rightarrow \forall n \in \mathbb{N} : \mathsf{safe}_n(c, s, \mu, J, Q). \]
Observe that only the states $(s, \mu)$ whose buffer-complete models $(s, \mu, \mathcal{P})$ satisfy
the pre-condition are relevant to the meaning of specifications.
In the sequel, let $\mathsf{locked}(\mu)$ indicate that some processor holds the lock in
memory system $\mu$:
\[ \mathsf{locked}(h, B, K) \equiv_{\mathrm{df}} \exists i \in \mathcal{P} : K = \mathcal{P} \setminus \{i\}. \]
We now give a formal definition of $\mathsf{safe}_n(c, s, \mu, J, Q)$ by natural number
induction on $n$. $\mathsf{safe}_0(c, s, \mu, J, Q)$ holds always. And for $n \in \mathbb{N}$, $\mathsf{safe}_{n+1}(c, s, \mu, J, Q)$
holds iff the following conditions are true:
1. If c = skip then (s, µ,P) |= Q.
2. For all µ0, µJ , µF such that
(i) µ0 ∈ (µJ ∗̃ (µF #̃ µ)),
(ii) lock-complete(µ0), and
(iii) either (s, µJ ,P) |= J or locked(µ0),
$c, (s, \mu_0) \not\rightarrow \mathsf{abort}$.
3. For all µ0, µ1, µJ , µF , c′, s′ such that
(i) µ0 ∈ (µJ ∗̃ (µF #̃ µ)),
(ii) lock-complete(µ0),
(iii) either (s, µJ ,P) |= J or locked(µ0), and
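(iv) $c, (s, \mu_0) \rightarrow c', (s', \mu_1)$,
there exist $\mu'_J, \mu'_F, \mu'$ such that
(a) $\mu_1 \in (\mu'_J \mathbin{\tilde{*}} (\mu'_F \mathbin{\tilde{\#}} \mu'))$,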
(b) µ′F  µF ,
(c) either $(s', \mu'_J, \mathcal{P}) \models J$ or $\mathsf{locked}(\mu_1)$, and
(d) $\mathsf{safe}_n(c', s', \mu', J, Q)$.
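The step-indexed shape of this definition can be sketched in a few lines. The sketch is deliberately much weaker than the definition above: it assumes there is no frame, no invariant, no lock condition and no memory system, and keeps only the three core obligations (the post-condition holds at skip, no abort, and every successor is safe for one fewer step). The toy command language and all names are hypothetical.

```python
SKIP = ()  # a command is a tuple of instructions; the empty tuple plays the role of skip

def step(cmd, state):
    """Toy transition relation: returns a list of successors, or "abort"."""
    if cmd == SKIP:
        return []
    head, rest = cmd[0], cmd[1:]
    if head == "inc":
        return [(rest, state + 1)]
    if head == "assert_pos":
        return "abort" if state <= 0 else [(rest, state)]
    raise ValueError("unknown instruction: %r" % (head,))

def safe(n, cmd, state, post):
    """Simplified safe_n: frames, invariant and locking are all omitted (assumption)."""
    if n == 0:
        return True                       # safe_0 holds always
    if cmd == SKIP and not post(state):
        return False                      # condition 1: post-condition at skip
    succs = step(cmd, state)
    if succs == "abort":
        return False                      # condition 2: the command does not abort
    # condition 3: every successor configuration is safe for n - 1 further steps
    return all(safe(n - 1, c2, s2, post) for c2, s2 in succs)
```

Note that `safe(2, ("inc", "inc"), 0, post)` holds for any `post`, because two steps are not enough to reach skip; this is why truth of a specification quantifies over all $n$.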
The definition of the predicate above differs from Vafeiadis’ in three ways.
First, the separating conjunctions are obviously different and, in particular, there
are two different separating conjunctions used: spatiotemporal separation for fram-
ing and spatial separation for partitioning the local from the shared state. The
spatiotemporal separator is used for framing because it subsumes spatial and tem-
poral frames, as well as any combination of spatial and temporal frames. It may be
possible to make use of $\tilde{\#}$ uniformly in the definition, which would yield a stronger
notion of specification, but recent theoretical results [23], which describe the
importance of a commutative notion of separation for the concurrency rule, make this
seem unlikely. (Recall that spatial separation is commutative, but spatiotemporal
separation is not.)
Second, the frame state is allowed to change from one step to another, but
only by making silent transitions. This reflects the fact that specifications are obliv-
ious to the nondeterministic flushing transitions required by the memory model and
described by the program semantics—the program logic allows for reasoning about
specifications at a level that is generally higher than the level of nondeterministic
buffer flushing.
Third, because there is no inherent notion of atomicity in this language and
memory model, we cannot require that the system state always be partitionable so
that one cell satisfies the invariant assertion. For even while one processor holds
the lock, others may well continue to execute by, e.g., storing to their write buffers,
assigning to identifiers, etc. Hence, the invariant condition from Vafeiadis’ safety
predicate is weakened to require only that the invariant holds while the lock is
available. This allows a process to temporarily violate the invariant after acquiring
the lock. In these states, other processes of course cannot rely on the shared state
satisfying the invariant, but this is generally not a concern because, while the other
processes can continue to execute while the invariant is violated, they cannot read
the shared state; instead, they are blocked until the lock is released.
4.5.3 Soundness
For the sake of a soundness theorem about the proof system described above, we
say that a proof is a tree of specifications, in which the leaves are instances of axiom
schemas, and the internal specification nodes are instances of the conclusion of some
inference rule, with the children of that node as instances of the hypotheses of the
inference rule. We write $J \vdash \{P\}\ c\ \{Q\}$ to indicate that there exists some proof for
which the root of the tree is labeled with $J \vdash \{P\}\ c\ \{Q\}$. The soundness theorem
asserts that provable specifications are true:
Theorem 1 (Soundness). $J \vdash \{P\}\ c\ \{Q\}$ only if $J \models \{P\}\ c\ \{Q\}$.
Proof. By induction on the structure of an arbitrary proof tree, using the soundness




This chapter describes open questions about the current program logic, and also
some notable features that were explored but have not been adopted, either because they
were considered narrowly unsuitable, or not sufficiently well understood, or simply
due to lack of time.
5.1 Top Assertions
Previous iterations of the logic included an additional right-side frame rule for the
strong temporal separating conjunction:
\[
\dfrac{J \vdash \{P\}\ c\ \{Q\} \qquad \mathrm{fv}(R) \cap \mathrm{mod}(c) = \emptyset}{J \vdash \{P \blacktriangleleft R\}\ c\ \{Q \blacktriangleleft R\}}\ (\text{frame-stm-r})
\]
The intuition behind this rule is that additional, more recent writes to distinct
locations are irrelevant to the load command. In fact, this frame rule allows for
the following very small load axiom, which does not require parametrization by any
additional formulas:
\[ J \vdash \{e \leadsto_{e'} e''\}\ (x := [e])_{e'}\ \{e \leadsto_{e'} e'' \land x = e''\} \quad (\text{load-sm}) \]
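The load axiom defined in Chapter 4 is then derivable from the smaller load axiom
and the right-side strong temporal frame rule:
\[
\begin{array}{ll}
J \vdash \{e \leadsto_{e'} e''\}\ (x := [e])_{e'}\ \{e \leadsto_{e'} e'' \land x = e''\} & (\text{load-sm}) \\
J \vdash \{e \leadsto_{e'} e'' \blacktriangleleft P\}\ (x := [e])_{e'}\ \{(e \leadsto_{e'} e'' \land x = e'') \blacktriangleleft P\} & (\text{frame-stm-r}) \\
J \vdash \{e \leadsto_{e'} e'' \blacktriangleleft P\}\ (x := [e])_{e'}\ \{(e \leadsto_{e'} e'' \blacktriangleleft P) \land x = e''\} & (\text{cons})
\end{array}
\]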
Unfortunately this frame rule is not sound for the store command, which
only adds new writes to the “top” of the write buffer and never in the middle. For
example, the following derived specification is false:
\[
\begin{array}{ll}
J \vdash \{x \mapsto -\}\ ([x] := 1)_0\ \{x \mapsto 0 \lhd x \leadsto_0 1\} & (\text{store}) \\
J \vdash \{x \mapsto - \blacktriangleleft y \leadsto_0 2\}\ ([x] := 1)_0\ \{(x \mapsto 0 \lhd x \leadsto_0 1) \blacktriangleleft y \leadsto_0 2\} & (\text{frame-stm-r})
\end{array}
\]
This specification is false because, from a state with a buffered write $y \leadsto_0 2$, a store
command will always add a succeeding write, not a preceding write as above.
We can work around this problem with a new assertion, $\mathsf{top}_e$, which describes
an empty write buffer that can only be extended with preceding and not succeeding
writes. For example, $x \leadsto_0 1$ describes a part of the write buffer 0, which may
be extended using either the left- or right-side separating conjunctions. But
$x \leadsto_0 1 \lhd \mathsf{top}_0$ describes specifically the top part of write buffer 0, and may be extended
only with preceding writes. This is accomplished by augmenting yet again the
notion of generalized multiprocessor memory system with an additional set of
“top-completeness” flags, analogous to the buffer-completeness flags $\Gamma$, and redefining
the separating conjunctions so that, e.g., $\mathsf{top}_0 \lhd y \leadsto_0 2$ is inconsistent. We then
update the axiom for the store command to specifically require that the top of the
write buffer be described in the pre-condition, so that a new top may be specified
in the post-condition:
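\[ J \vdash \{(e \leadsto_{e'} e'' \blacktriangleleft P) \lhd \mathsf{top}_{e'}\}\ ([e] := f)_{e'}\ \{(e \leadsto_{e'} e'' \blacktriangleleft P) \lhd e \leadsto_{e'} f \lhd \mathsf{top}_{e'}\} \quad (\text{store-top}) \]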
The fence and lock-manipulation axioms require similar updates.
Although this model simplifies the load axiom and strengthens the proof
theory, it also significantly complicates the assertion language: the addition of
top-completeness flags would complicate an already complex model of assertions.
Furthermore, the details of the redefinition of the separating conjunction are far from
clear. For example, what is the meaning, if any, of $\mathsf{top}_e \lhd \mathsf{bar}_e$? Consequently,
although the idea of symmetric temporal frame rules is interesting, it is not clear
that this would be an improvement overall compared to the assertions and proof
theory described in Chapter 4.
5.2 Additive Barrier Assertions
The current definition of the barrier assertions $\mathsf{bar}_i$ yields an expressive logic, but
at the expense of a somewhat more complicated model. In particular, the
buffer-completeness flags $\Gamma$ are present in the semantics of assertions solely to model the
$\mathsf{bar}_i$ assertion. This is the best solution of those that have been explored, but a
simpler, less expressive alternative semantics for $\mathsf{bar}_i$ is possible. In particular,
instead of distinguishing between states and more elaborate models, we may give
a semantics to assertions directly in terms of states. In this simplified semantics,
the meaning of most assertions is unchanged (though, of course, without the
buffer-completeness flags), and the meaning of $\mathsf{bar}_i$ is the set of states in which buffer $i$ is
fully flushed:
(s, h,B,K) |= bari ≡df B = E [ŝ(i)   ε] .
Note that, in this definition, the heap is arbitrary, as are all the write
buffers other than $i$. In this model, $\mathsf{bar}_i$ has more in common with the additive
unit $\mathsf{true}$ than the multiplicative unit $\mathsf{emp}$. To describe a flushed write, instead
of using a multiplicative conjunction $e \leadsto_i f \lhd \mathsf{bar}_i$, we use an additive conjunction
$e \leadsto_i f \land \mathsf{bar}_i$.
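The additive reading of the barrier can be sketched as ordinary predicates on states. This is an illustration under assumptions: a state is just a heap dict and a tuple of per-processor write buffers, and `leads_to` is a toy stand-in for the leads-to assertion (the most recent write to the location visible on processor `i` has the given value), not the semantics defined in this dissertation.

```python
def bar(i):
    """Additive bar_i: buffer i is empty; heap and other buffers are unconstrained."""
    return lambda heap, buffers: len(buffers[i]) == 0

def leads_to(i, loc, val):
    """Toy leads-to (assumption): the newest write to loc in buffer i,
    or else the heap, carries the value val."""
    def holds(heap, buffers):
        for l, v in reversed(buffers[i]):
            if l == loc:
                return v == val
        return heap.get(loc) == val
    return holds

def conj(p, q):
    """Additive conjunction: both predicates hold of the same state."""
    return lambda heap, buffers: p(heap, buffers) and q(heap, buffers)

flushed_write = conj(leads_to(0, 'x', 1), bar(0))
```

Here `flushed_write` accepts a state whose heap maps `x` to 1 even if another processor still has buffered writes, which illustrates why the additive `bar` constrains so little of the state.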
By unifying states and models, the semantics of assertions and specifications
is drastically simplified, but reduced expressiveness of the assertion language makes
the program logic more complicated and less effective. For one, it is not possible to
give a “small” axiom to the fence, lock and unlock commands. The given axiom for
fence:
\[ J \vdash \{\mathsf{emp}\}\ \mathsf{fence}_i\ \{\mathsf{bar}_i\} \quad (\text{fence}) \]
is sound but drastically weakened: the post-condition, after all, describes an arbi-
trary set of heap-allocated locations. To restore the original spirit of the axiom, we
must make it “large,” incorporating the entire (relevant) system state as follows:
\[ J \vdash \{P\}\ \mathsf{fence}_i\ \{P \land \mathsf{bar}_i\} \quad (\text{fence-lg}) \]
Such large axioms, formed using the additive conjunction, are less useful in the
context of a separation-style logic because there is no additive frame rule.
In the case of the atomic inference rule, the situation is even worse:
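\[
\dfrac{\mathsf{emp} \vdash \{(J * P * \mathsf{lock}_i) \lhd \mathsf{bar}_i\}\ c\ \{(J * Q * \mathsf{lock}_i) \mathbin{\lhd\!-} \mathsf{bar}_i\}}{J \vdash \{P\}\ \mathsf{lock}_i \,;\, c \,;\, \mathsf{unlock}_i\ \{Q\}}\ (\text{atomic})
\]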
Soundness of this rule relies crucially on the use of the temporal separating
implication in the post-condition of the antecedent, in which, in the current model,
$P \mathbin{\lhd\!-} \mathsf{bar}_i$ denotes the set of states which, when flushed on processor $i$, satisfy $P$.
But in the simplified model, the separating implication $P \mathbin{\lhd\!-} \mathsf{bar}_i$ does not have
the intended meaning.1 The problem lies in the incompatibility between the
multiplicative separating conjunctions and implication, and the additive semantics of the
barrier assertion. We might also consider using an additive implication $P \Leftarrow \mathsf{bar}_i$,
but neither is this intuitively correct (it describes either flushed states that satisfy
$P$, or arbitrary non-flushed states), nor is additive implication (along with
negation) technically feasible for the reasons previously discussed in Section 3.5.1.
1 Or, really, any coherent or intuitive meaning.
5.3 Permissions
In Concurrent Separation Logic, the concept of permissions (or shares) has been
fruitful. The idea originated from the ownership interpretation of separation logic
assertions, in which $e \mapsto f$ is read as an assertion of complete ownership of the address
$e$, thus effectively granting the command which has this assertion as a pre-condition
full permission to access and modify the location $e$. There can be no concurrent
modification of the value at $e$ because, in order to do so, a command would also
require $e \mapsto -$ in its pre-condition, but the parallel composition rule requires the
pre-conditions of parallel commands to be disjoint. For example, consider the following
CSL command specifications:
\[
\begin{aligned}
J &\vdash \{x \mapsto -\}\ [x] := 1\ \{x \mapsto 1\} \\
J &\vdash \{y \mapsto -\}\ [y] := 2\ \{y \mapsto 2\}
\end{aligned}
\]
Using the rule of parallel composition we can derive the following combined specification
for $[x] := 1 \parallel [y] := 2$:
\[ J \vdash \{x \mapsto - * y \mapsto -\}\ [x] := 1 \parallel [y] := 2\ \{x \mapsto 1 * y \mapsto 2\} \]
This rule is sound because the first command has sole ownership of $x$ and the
second of $y$. If the commands were to share ownership of a single address (i.e., if
$x = y$) then there would be a data race on that address. This is ruled out by the
pre-condition, which requires that $x \neq y$.
Consider, however, a pair of load commands that share an address:
\[
\begin{aligned}
J &\vdash \{x \mapsto 1\}\ t := [x]\ \{x \mapsto 1 \land t = 1\} \\
J &\vdash \{x \mapsto 1\}\ u := [x]\ \{x \mapsto 1 \land u = 1\}
\end{aligned}
\]
We might like to prove the following combined specification of $t := [x] \parallel u := [x]$:
\[ J \vdash \{x \mapsto 1\}\ t := [x] \parallel u := [x]\ \{x \mapsto 1 \land t = 1 \land u = 1\} \]
Unfortunately this is not provable. The parallel composition rule, in particular,
yields the following specification:
\[ J \vdash \{x \mapsto 1 * x \mapsto 1\}\ t := [x] \parallel u := [x]\ \{(x \mapsto 1 \land t = 1) * (x \mapsto 1 \land u = 1)\} \]
This is vacuously true because the pre-condition is inconsistent, and certainly not
equivalent to the desired specification above.
The problem with Concurrent Separation Logic is that both load commands
must claim sole ownership of the address $x$. This requirement is designed to ensure
that only race-free commands have derived specifications, but clearly in this case
there are no data races. In the fractional permission model of Concurrent Separation
Logic (first introduced by Boyland and Bornat [8, 7] and later refined by Parkinson
and Dockins et al. [40, 16]), each memory address is associated with a real-numbered
permission $r$ with $0 < r \le 1$. The operational semantics is adjusted so that a
store command requires full permission to execute safely, while the load command
only requires non-zero permission. The separating conjunction simply combines
permissions by adding them, with the operation being undefined if the sum is greater
than 1. For example, $x \overset{0.5}{\mapsto} 1 * x \overset{0.5}{\mapsto} 1 \equiv x \overset{1}{\mapsto} 1$ and $x \overset{0.5}{\mapsto} 1 * x \overset{1}{\mapsto} 1 \equiv \mathsf{false}$.
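The permission arithmetic just described can be sketched as follows. The function names are hypothetical, exact rationals stand in for the real-numbered permissions, and `None` is used here to model an undefined (inconsistent) combination.

```python
from fractions import Fraction

FULL = Fraction(1)

def combine_fractional(p, q):
    """Add two permissions in (0, 1]; None models an undefined combination."""
    if not (0 < p <= 1 and 0 < q <= 1):
        raise ValueError("permissions must lie in (0, 1]")
    total = p + q
    return total if total <= 1 else None

def may_load(p):
    """A load needs only some non-zero permission."""
    return p is not None and p > 0

def may_store(p):
    """A store needs the full permission."""
    return p == FULL

half = Fraction(1, 2)
```

Two halves recombine into the full permission, so a pair of readers can hand ownership back to a single writer, whereas a half and a full permission cannot coexist.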
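Using the fractional permission model, we can prove the desired specification
of $t := [x] \parallel u := [x]$. Instead of assigning each command full permission to the address
$x$, each load is given half permission:
\[ J \vdash \{x \overset{0.5}{\mapsto} 1\}\ t := [x]\ \{x \overset{0.5}{\mapsto} 1 \wedge t = 1\} \]
\[ J \vdash \{x \overset{0.5}{\mapsto} 1\}\ u := [x]\ \{x \overset{0.5}{\mapsto} 1 \wedge u = 1\} \]
The parallel composition rule then yields:
\[ J \vdash \{x \overset{0.5}{\mapsto} 1 \ast x \overset{0.5}{\mapsto} 1\}\ t := [x] \parallel u := [x]\ \{(x \overset{0.5}{\mapsto} 1 \wedge t = 1) \ast (x \overset{0.5}{\mapsto} 1 \wedge u = 1)\} \]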
And now the pre-condition of this specification is equivalent to $x \overset{1}{\mapsto} 1$ and the
post-condition to $x \overset{1}{\mapsto} 1 \wedge t = 1 \wedge u = 1$, as desired.
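The way fractional permissions combine under the separating conjunction can be illustrated with a minimal Python sketch. It is not part of the logic itself: the function name `star` and the representation of an assertion's model as a dictionary from addresses to (permission, value) pairs are assumptions made for illustration only.

```python
from fractions import Fraction

def star(a, b):
    """Separating conjunction of two permission heaps (illustrative sketch).

    Each heap maps an address to a (permission, value) pair. Returns None
    when the heaps are incompatible, i.e. the conjunction is equivalent
    to false.
    """
    out = dict(a)
    for addr, (q, w) in b.items():
        if addr in out:
            p, v = out[addr]
            # Values must agree and the summed permission may not exceed 1.
            if v != w or p + q > 1:
                return None
            out[addr] = (p + q, v)
        else:
            out[addr] = (q, w)
    return out

def can_load(heap, addr):
    """A load needs some non-zero permission."""
    return addr in heap and heap[addr][0] > 0

def can_store(heap, addr):
    """A store needs the full permission."""
    return addr in heap and heap[addr][0] == 1

half = Fraction(1, 2)
left = {"x": (half, 1)}    # x |->(0.5) 1, held by the first load
right = {"x": (half, 1)}   # x |->(0.5) 1, held by the second load
whole = star(left, right)  # equivalent to x |->(1) 1
```

Each half suffices for a load but not for a store, and the two halves recombine into the full permission, which mirrors the derivation for the two parallel loads.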
An alternative to the fractional model of permissions is the counting model.
In this model, permissions are integers, the constituent integers are added in the
model of a separating conjunction, and the full permission, which is required for
writing, is modeled by 0. Conceptually, a points-to assertion with counting permis-
sion n ≥ 0 is thought of as a “source” from which n read-only permissions have been
derived. A characteristic equivalence of the counting permission model is as follows:
\[ e \overset{n}{\mapsto} e' \wedge n \geq 0 \iff e \overset{n+1}{\mapsto} e' \ast e \overset{-1}{\mapsto} e' \]
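The characteristic equivalence can be read as an accounting rule: splitting a read-only permission off a source increments the source's count, and returning it decrements the count. The following Python sketch illustrates this under stated assumptions. The names `split_read` and `combine` are hypothetical, and the side conditions in `combine` (two sources never combine, and a source never absorbs more read permissions than it has handed out) are assumptions about the counting model that the text above does not spell out.

```python
def split_read(n):
    """Split one read-only permission (-1) off a source with count n >= 0."""
    if n < 0:
        raise ValueError("only a source permission (n >= 0) can be split")
    return n + 1, -1

def combine(a, b):
    """Add two counting permissions; None models an undefined combination.

    Assumed side conditions (not stated in the text): two sources never
    combine, and a source combined with read permissions must stay >= 0.
    """
    if a >= 0 and b >= 0:
        return None
    total = a + b
    if (a >= 0 or b >= 0) and total < 0:
        return None
    return total

source, read = split_read(0)   # full permission 0 becomes source 1 plus one reader
```

Under these assumptions, recombining `source` and `read` restores the full permission 0, which is the permission required for writing.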
Some (strong-memory) concurrent programs are more amenable to verifica-
tion using counting permissions versus fractional splitting permissions depending
on the interaction among threads. Part of the goal of this dissertation project is to
explore what aspects of the memory model—which implicitly and concurrently inter-
acts with all programs—can be hidden or incorporated directly into a weak-memory
program logic. From this standpoint, both splitting and counting permissions have
been a fruitful source of ideas, some of which are described informally in the following
sections.²
5.3.1 A Use for Splitting Permissions
A variation on splitting permission could significantly extend the programs about
which the concurrent weak-memory logic is capable of reasoning: namely, to more
racy programs. Consider a program that performs no locking whatsoever, in which
at most one thread at a time may write to a shared region of memory, while the
other threads may only read. If the shared invariant relates more than one memory
location in a non-trivial way, then the “daring” rules from Section 4.5.1, which allow
individual reads and writes to shared memory without requiring that the global lock
be acquired first, may be insufficient because it may not be possible to show that
individual writes necessarily preserve the invariant.
For example, consider a message-passing program with two shared memory
locations: address d, which contains a data value; and address r which contains a
“ready” flag. The invariant, informally, is that whenever the ready flag is set the
value of the data flag is also set. Formally, let J be the following invariant:
\[ J \stackrel{\mathrm{df}}{=} \exists r' : \exists d' : r \mapsto r' \ast d \mapsto d' \wedge (r' \neq 1 \vee d' = 1). \]
There may be multiple copies of a reading (or receiving) thread, which may load
r and d at any time without first acquiring a lock. The “daring” load axiom is
sufficient to describe the results of these loads. There is also a single writing (or
sending) thread $c_s$, which updates these locations in accordance with the invariant,
defined as follows:
\[ c_s \stackrel{\mathrm{df}}{=} [d] := 1 \,;\, [r] := 1. \]
² And also in a longer paper [57].
Intuitively, it is clear that this thread maintains the invariant, because it only sets
the ready flag after first setting the data flag; and no other threads may interfere,
resetting the data flag between these two writes.
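The intuition that the sending thread maintains the invariant can be checked by brute force on a small model. The Python sketch below is only an illustration, not the proof technique of the logic: it assumes a single FIFO write buffer for the sender, as in an x86-like model, and the names `reachable` and `invariant` are hypothetical. It enumerates every interleaving of issuing a store and flushing the oldest buffered write, and evaluates the pure part of the invariant on shared memory.

```python
def reachable(initial_d, initial_r):
    """All states reachable while the sender runs [d] := 1 ; [r] := 1.

    A state is (memory, buffer, pc): memory is a tuple of (address, value)
    pairs, buffer is the sender's FIFO queue of pending writes, and pc
    counts the stores already issued.
    """
    program = [("d", 1), ("r", 1)]
    start = ((("d", initial_d), ("r", initial_r)), (), 0)
    seen, todo = {start}, [start]
    while todo:
        mem, buf, pc = todo.pop()
        succs = []
        if pc < len(program):
            # Issue the next store: it goes to the back of the buffer.
            succs.append((mem, buf + (program[pc],), pc + 1))
        if buf:
            # Flush the oldest buffered write to shared memory.
            (addr, val), rest = buf[0], buf[1:]
            new_mem = tuple((a, val if a == addr else v) for a, v in mem)
            succs.append((new_mem, rest, pc))
        for s in succs:
            if s not in seen:
                seen.add(s)
                todo.append(s)
    return seen

def invariant(mem):
    """Pure part of the invariant: the ready flag is set only if the data is."""
    m = dict(mem)
    return m["r"] != 1 or m["d"] == 1

states = reachable(0, 0)
always_holds = all(invariant(mem) for mem, _, _ in states)
```

Because the buffer is flushed in FIFO order, the write to `d` always reaches memory before the write to `r`, so `always_holds` is true for this starting state.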
The “daring” store axiom, however, is not sufficient to show that this thread
maintains the invariant $J$. The first store is not problematic: regardless of whether
$r' \neq 1$ or $d' = 1$ to begin with, after setting $d$, it will be the case that $d' = 1$, which
satisfies the invariant. But an attempt to apply the daring store axiom to the second
write will fail, because it is only safe to set the ready flag when it is known that
$d = 1$. But this knowledge was lost after the first application of the daring store
axiom; again, we only know that one of $r' \neq 1$ and $d' = 1$ holds.
Suppose, however, we had a modified notion of splitting permissions such
that, as in Bornat’s model, only locations with permission 1 are writable, but also
in which only locations with permission not less than $\frac{1}{2}$ are allowed to have buffered
writes. Let us parametrize the invariant $J$ by some permission $p$ as follows:
\[ J_p \stackrel{\mathrm{df}}{=} \exists r' : \exists d' : r \overset{p}{\mapsto} r' \ast d \overset{p}{\mapsto} d' \wedge (r' \neq 1 \vee d' = 1). \]
This allows us to separate the invariant into a read-only part $J_{1/4}$ that can be shared
among the reading threads, and a “bufferable” part $J_{3/4}$ that can be incorporated
into the private state of the writing thread, yielding the following equivalence:
\[ J \iff J_{1/4} \ast J_{3/4} \]
The daring store axiom can then be used to acquire temporary, lock-free access to
the read-only part of the invariant, which, when combined with the bufferable
part, gives the writing thread sufficient permission to perform the write. A proof












$\lhd\ d \rightsquigarrow_0 1 \lhd r \rightsquigarrow_0 1\}$
The reason that this proof sketch might be expanded into a full proof—assuming,
hypothetically, the existence of a suitable notion of permission—is that the second
write can be shown to satisfy the invariant because it takes place in the context of
the first write, $d \rightsquigarrow_0 1$.
The existence of a suitable permission model has not yet been demonstrated,
however. A significant problem that has arisen with candidate models is the re-
quirement that assertions denote sets of states that are closed w.r.t. the flushing
order. With the naive model, this property is easily violated. For example, the






4 0 2) are not closed, because the models
of the left-hand assertion are compatible with only the pending-write models of the
right-hand assertion (because the heap values match), but not the flushed models
(because the heap values, 1 and 2 respectively, do not match). It is not yet clear
how to overcome this problem.
5.3.2 A Use for Counting Permissions
Counting permissions have a different potential use in the weak-memory logic: namely, for
axiomatization of memory management commands, and unification of the strong and
weak temporal conjunctions.³ With the assertions given thus far, it is possible to
describe successive writes to the same location with the weak temporal conjunction,
but not of course with the strong temporal conjunction; to do so would violate the
³ This section is adapted from an earlier paper [57].
disjointness requirements of the separating conjunctions. For example,
\[ e \mapsto e' \blacktriangleleft e \rightsquigarrow_0 e'' \equiv \mathit{false} \]
because no models of the first conjunct are compatible with any models of the
second—the allocated location e cannot be separated across the sequential separat-
ing conjunction.
To unify the strong and weak temporal conjunctions with permissions, let us
associate with each allocated location an element of an asymmetric counting per-
mission model, in which permissions are integers and permissions are combined by a
function ⊕. The model is inspired by the previously described counting permissions
of Bornat et al. [7], but here, instead of addition, the operation ⊕ is partial and
asymmetric, defined as follows:
\[ a \oplus b = \begin{cases} a + b & \text{if } a < 0 \wedge (b < 0 \vee -b \leq a) \\ \bot & \text{otherwise.} \end{cases} \]
The strong temporal conjunction is redefined with a relaxed notion of compatibility:
locations allocated in both states must have compatible (i.e., not ⊥) permissions.
(The spatial separating conjunction remains unchanged, requiring disjointness of
allocated locations.) Syntactically, points-to and leads-to assertions are annotated
with integers (e.g., $e \overset{n}{\rightsquigarrow} e'$) that denote their location’s permission value.
To understand the intuition behind asymmetric counting permissions, first
recall Bornat’s original counting permission model. There, negative integer annotations
denote read-only permission for the location, and nonnegative integer annotations
(“source” annotations) indicate the number of read-only permissions that
have been split off from the source. For example,
\[ x \overset{-1}{\mapsto} 1 \ast x \overset{-1}{\mapsto} 1 \ast x \overset{2}{\mapsto} 1 \]
is a consistent formula in Bornat’s logic with full permission ($-1 + -1 + 2 = 0$), in
which two read-only assertions have been split off of the original assertion.
The asymmetric model can be derived from Bornat’s counting model by replacing
the separating conjunction with the sequential conjunction, and requiring
that permissions are combined with $\blacktriangleleft$ in the order shown in the example above,
from negative to positive. Then we can interpret a nonnegative annotation as denoting
the most-recently written (top-most) value, where the particular value indicates
the number of prior values, if any. Such prior values are indicated by negative annotations,
and the full permission by zero. For example, the following is a consistent
formula that describes two successive writes on buffer $i$ to location $x$:
\[ x \overset{-1}{\rightsquigarrow}_i 1 \blacktriangleleft x \overset{1}{\rightsquigarrow}_i 2, \]
because $-1 \oplus 1 = 0$. The following, on the other hand, are inconsistent:
\[ x \overset{-1}{\rightsquigarrow}_i 1 \blacktriangleleft x \overset{-1}{\rightsquigarrow}_i 2 \blacktriangleleft x \overset{1}{\rightsquigarrow}_i 3 \quad\text{and}\quad x \overset{0}{\rightsquigarrow}_i 1 \blacktriangleleft x \overset{-1}{\rightsquigarrow}_i 2 \]
because the top-most write, annotated with $n \geq 0$, must succeed no more than $n$
earlier writes ($(-1 \oplus -1) \oplus 1 = -2 \oplus 1 = \bot$), and no writes may follow the top write,
annotated with a non-negative integer ($0 \oplus -1 = \bot$).
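The partial operation $\oplus$ and the three examples above can be replayed mechanically. The Python sketch below follows the case definition directly. The names `oplus`, `combine` and `BOTTOM` are hypothetical, and treating $\bot$ as absorbing (so that an undefined intermediate result stays undefined) is an assumption consistent with the second example.

```python
from functools import reduce

BOTTOM = None  # stands for the undefined result, written as bottom in the text

def oplus(a, b):
    """Asymmetric counting-permission combination, left operand first."""
    if a is BOTTOM or b is BOTTOM:
        return BOTTOM  # assumption: undefinedness propagates
    if a < 0 and (b < 0 or -b <= a):
        return a + b
    return BOTTOM

def combine(perms):
    """Fold oplus over annotations listed from earliest to latest write."""
    return reduce(oplus, perms)

two_writes = combine([-1, 1])        # one earlier write, then the top write
too_many = combine([-1, -1, 1])      # top write claims only one predecessor
after_top = combine([0, -1])         # a write following the top write
```

Here `two_writes` evaluates to the full permission 0, while `too_many` and `after_top` are undefined, matching the consistent and inconsistent formulas in the text. The asymmetry is visible in that `oplus(-1, 1)` is defined but `oplus(1, -1)` is not.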
Using the asymmetric counting permission described above, we may consider
axiomatization of primitives for dynamic memory management; namely, dynamic
allocation via the command x := new(e), and dynamic disposal via the command
free(e). The semantics of these commands are not defined at the level of the memory
model, so there is some choice about what operational meaning to give them. In
practice, these commands are typically implemented using fence commands to ensure
system-wide consistency. In contrast, we have chosen to extricate barriers from
their meaning, primarily to explore the circumstances in which barriers are actually
needed. (The typical semantics of allocation and disposal can be recovered by
explicitly adding fence instructions before and after these commands.)
Absent a succeeding barrier, allocation is perfectly natural; it simply adds a
new write to an unallocated location to the top of the write buffer:







Note that the write in the post-condition has full permission, which ensures that
earlier writes, framed from the left, have distinct locations. Otherwise, this could
result in a duplicate allocation.
The meaning of the free command absent a preceding barrier is less clear.
We axiomatize a conservative semantics: if a location has at most one value in the
system, it may be deallocated. In particular, we make no attempt to describe the
outcome of deallocation without a barrier when there are multiple pending writes.
Perhaps in this case the command should fault, or perhaps all pending writes should
be removed from the system, leaving writes to other locations unaffected. In any
case, the following axiom describes the conservative semantics and does not allow








Symmetric to the case for allocation, the write in the pre-condition has full permis-
sion, which in this case prevents earlier writes to the same location from being framed
on from the left, yielding a double disposal. By using the rule of consequence, the
pre-condition may be strengthened from a leads-to assertion to a points-to assertion,
thus axiomatizing disposal of shared memory.
5.4 Shared Variables
In Concurrent Separation Logic, threads may communicate not just through the
heap, but through shared variables as well. Elaborate rules govern these interac-
tions, which must be well synchronized, to ensure soundness of the logics. Even
for traditional, strong-memory logics, this is notoriously difficult; indeed, a counter-
example to the soundness of CSL was recently (and quite unintentionally) discovered
by the author—and described in detail by Brookes along with a minor revision to
the logic to avoid the problem [10], and also by Reddy accompanying a more radical
revision to the logic [46]—which exploited a subtle problem with the inference rules
that govern variable communication.
In the weak-memory logic described in this dissertation, communication via
variables is completely prohibited—the rule for concurrent composition explicitly
requires that the modified variables of each thread are distinct from the free vari-
ables of the others. This is a conservative condition that rules out many interesting
programs, with the goal of simplifying the program logic and focusing the project
on the study of communication via the weak memory system. Furthermore, there
is reason to believe that allowing variable communication in the weak-memory logic
would be even more subtle and problematic than in traditional logics. Of course,
communication among threads executed on different processors can only happen
by loading and storing through the memory system, and there is little reason to
think that variable assignment is, in general, a suitable abstraction of this process.
It seems likely though that under some conditions—namely, when shared-variable
assignments are consistently protected by fence or lock instructions—it is possi-
ble to reason soundly about programs with such interaction, but it remains to be
determined exactly what those conditions are.
5.5 Invariant Expansion
In the semantics of specifications given in Section 4.5.2, the shared invariant com-
ponent J of a specification J ` {P} c {Q} is interpreted rather literally: if the
lock is not held, then there must be a portion of the complete state which satisfies J
exactly. To reason about the specification of commands that access shared state,
it is therefore necessary to prove always that, once the shared access is finished,
there is some portion of the state that satisfies the invariant. For many programs,
however, this is difficult for the following reason. It is often most convenient to
specify the shared invariant by describing the legal values in the heap—i.e., with
spatial conjunctions of points-to formulas. But points-to formulas do not simply
constrain values of the heap; they also specify emptiness of write buffers. Conse-
quently, a program which buffers writes to the shared state without first acquiring
the lock cannot possibly satisfy a heap-only shared invariant. For example, even a
very simple invariant like $x \mapsto 1$ cannot be satisfied by the program $[x] := 1_i$ because
it is not the case that:
\[ x \mapsto 1 \lhd x \rightsquigarrow_i 1 \models x \mapsto 1. \]
It seems clear, however, that in at least some circumstances the above command
ought to satisfy its simple specification because, observationally, a process cannot
distinguish a state that satisfies $x \mapsto 1 \lhd x \rightsquigarrow_i 1$ from one that satisfies the
specification $x \mapsto 1$. In both cases the address $x$ is allocated, and the result of loading
$x$ on any processor is 1. Also note that the command does not satisfy the more
liberal specification $x \mapsto 1 \lhd x \rightsquigarrow_i 1$ for the same reason, which is that the following
entailment is not true:
\[ x \mapsto 1 \lhd x \rightsquigarrow_i 1 \lhd x \rightsquigarrow_i 1 \models x \mapsto 1 \lhd x \rightsquigarrow_i 1. \]
An alternative semantics of specifications might relax the interpretation of
the shared invariant so that, if the lock is not held, there is always a portion of the
state such that, as its buffered writes flush to memory, the heap part satisfies the
invariant $J$. More formally, let us write $\mathit{trim}$ for the function that replaces the write
buffers in a memory system with empty buffers:
\[ \mathit{trim}(h, B, K) \stackrel{\mathrm{df}}{=} (h, \mathcal{E}, K). \]
We might then redefine the semantics of specifications such that there must always
be a portion $\mu_J$ of the state such that $\mathit{trim}(\mu_J)$ satisfies the invariant $J$. Note that,
because of the implicit flushing steps incorporated into the semantics of commands,
this implies that, regardless of which flushing steps take place, the trimmed
substate $\mathit{trim}(\mu_J)$ satisfies $J$. Hence, the heap part of the shared substate always satisfies,
in this relaxed way, the invariant $J$ as it commits.
Under this semantics of specifications, a more liberal rule for invariant rea-
soning could be admitted, in which the command must ensure that a portion of
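state always satisfies the expansion of the invariant $J$, written $\mathit{exp}(J)$. Informally,
a state satisfies the expansion of $J$ if, as in the relaxed semantics, its
heap part always satisfies $J$ as it flushes. Formally:
\[ s, \mu, \Gamma \models \mathit{exp}(J) \equiv_{\mathrm{df}} \forall \mu', \Gamma' : (\mu', \Gamma') \leq (\mu, \Gamma) \Rightarrow s, \mathit{trim}(\mu'), \Gamma' \models J. \]
The relaxed invariant rule is as follows:
\[ \frac{\mathit{emp} \vdash \{\mathit{exp}(J) \ast P\}\ c\ \{\mathit{exp}(J) \ast Q\}}{J \vdash \{P\}\ c\ \{Q\}}\ (\text{inv-exp}) \]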
Note that this invariant rule does not require that lock acquisition precede shared
state manipulation.
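The definitions of trim and exp(J) can be illustrated on a deliberately simplified state model. The Python sketch below is an assumption-laden illustration, not the semantics of Chapter 4: a state is just a pair of a heap and a map from processors to FIFO buffers, the lock set K, the stack s and the component Γ are omitted, and the flushing order is assumed to relate a state to every state reachable from it by flushing. All function names are hypothetical.

```python
def trim(state):
    """Replace every write buffer with the empty buffer."""
    heap, buffers = state
    return (heap, {i: () for i in buffers})

def flush_successors(state):
    """States reached by committing the oldest write of one buffer."""
    heap, buffers = state
    for i, buf in buffers.items():
        if buf:
            (addr, val), rest = buf[0], buf[1:]
            yield ({**heap, addr: val}, {**buffers, i: rest})

def flushings(state):
    """The state itself and everything reachable from it by flushing."""
    stack = [state]
    while stack:
        s = stack.pop()
        yield s
        stack.extend(flush_successors(s))

def satisfies_exp(J, state):
    """exp(J): J holds of the trimmed state at every stage of flushing."""
    return all(J(trim(s)) for s in flushings(state))

def points_to_x_1(state):
    """x |-> 1: the heap is exactly {x: 1} and all buffers are empty."""
    heap, buffers = state
    return heap == {"x": 1} and all(not b for b in buffers.values())

pending = ({"x": 1}, {0: (("x", 1),)})   # x holds 1, with a buffered write of 1
conflicting = ({"x": 1}, {0: (("x", 2),)})  # buffered write of a different value
```

In this model `pending` does not satisfy the points-to assertion itself, because its buffer is non-empty, but it does satisfy the expansion. The state `conflicting` fails the expansion because flushing changes the heap value to 2.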
While this semantics and invariant rule seem, in principle, superior to those
presented in Chapter 4, it has disadvantages as well. Most significantly, it is not
clear how to reason about the $\mathit{exp}(J)$ formula. It seems as though it should be
possible to load and store in the context of $\mathit{exp}(J)$ just as in the context of $J$,
but it is not clear how to connect this intuition with the assertion logic. Consider
again the invariant $x \mapsto 1$. If the rule inv-exp is used to access this shared state
in the guise of $\mathit{exp}(x \mapsto 1)$ then to usefully apply the load axiom as presented in
Section 4.5.1 we must exhibit an assertion $P$ such that $\mathit{exp}(J) \models x \mapsto 1 \blacktriangleleft P$ and
$(x \mapsto 1 \blacktriangleleft P) \lhd x \rightsquigarrow_i 1$. Considering the relatively loose definition of $\mathit{exp}(J)$, it is
not clear what such an assertion would be.
Furthermore, in those situations for which access to shared state is mediated
by the global lock, such a relaxed semantics is undesirable. When the threads
of a program take care to serialize their access to shared resources, flushing their
buffers before and after such accesses, it is an entirely unnecessary complication
to assume otherwise. Ideally a semantics and proof theory would be found which




Technical inspiration for this project comes primarily from work on separation logic
[47, 36, 35] and abstract separation logic [11], as well as concurrent separation
logic [34, 9], which this program logic resembles insofar as it strives to enable local
(instead of global) reasoning about shared-state invariants (instead of two-place state
relations). Earlier iterations of the machine and programming language models were
influenced by work on graphical models [58, 25, 26] and the pomset model of true
concurrency from Pratt [43, 44]. The style of semantics of specifications, and the
associated soundness proof, is taken almost directly from Vafeiadis’ recent soundness
proof of concurrent separation logic [54]. Vafeiadis’ excellent dissertation has also
been an invaluable guide [53].
The following sections discuss other work related to the topic of weak-memory
program reasoning and verification.
6.1 Weak-Memory Program Transformations
The approach to weak-memory program reasoning that is perhaps technically sim-
plest is based on program transformations. The basic idea is to use existing tech-
niques to reason about programs that have been modified to directly incorporate
the behavior of the memory model on which they are to be executed. For example,
in the context of x86, programs would be modified to include as additional state a
series of in-memory FIFO queues which represent write buffers. Each load and store
operation is replaced by a call to a subroutine which, respectively, searches the process’s
queue for an appropriate write or appends a new write to the process’s queue.
Finally, an additional thread is composed in parallel which commits writes, with arbitrary
timing, from the encoded buffers to memory. This is essentially the tack
taken by relatively early work in this area by Ridge [48], in which he uses the
Isabelle/HOL framework and logic to reason directly about operational semantics of
modified Caml programs. In particular, a mechanically verified proof in this style is
presented for Peterson’s mutual exclusion algorithm [42]. The potential drawback to
the program transformation approach is the difficulty of applying a local reasoning
principle and, thus, a useful frame rule. Because Ridge uses higher-order logic to
reason about the semantics directly, those proofs are undertaken without a frame
rule.
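The shape of such a transformation can be sketched in a few lines of Python. This is an illustrative encoding only, not Ridge's formalization: the class name `BufferedMemory` and its methods are hypothetical, unwritten addresses are assumed to read as 0, and the committing thread is replaced by an explicit `commit` method that a scheduler would call at arbitrary times.

```python
from collections import deque

class BufferedMemory:
    """Write buffers encoded as ordinary program state (illustrative sketch)."""

    def __init__(self, nprocs):
        self.mem = {}                                  # shared memory
        self.queues = [deque() for _ in range(nprocs)]  # one FIFO queue per process

    def store(self, proc, addr, val):
        """Replacement for a store: append the write to the process's queue."""
        self.queues[proc].append((addr, val))

    def load(self, proc, addr):
        """Replacement for a load: newest matching buffered write, else memory."""
        for a, v in reversed(self.queues[proc]):
            if a == addr:
                return v
        return self.mem.get(addr, 0)  # assumption: unwritten addresses hold 0

    def commit(self, proc):
        """One step of the committing thread: flush the oldest buffered write."""
        if self.queues[proc]:
            addr, val = self.queues[proc].popleft()
            self.mem[addr] = val
            return True
        return False

m = BufferedMemory(2)
m.store(0, "x", 1)
seen_by_writer = m.load(0, "x")        # the writer sees its own buffered write
seen_by_other_before = m.load(1, "x")  # the other process still sees memory
m.commit(0)
seen_by_other_after = m.load(1, "x")   # visible once the write is committed
```

Note that the address of each pending write is stored as data inside the queue, which is the feature that makes separation-based reasoning about transformed programs awkward, as discussed in the next paragraph.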
It is also conceptually possible to use a separation logic with a strong-memory
locality principle and associated frame rule to reason about transformed programs.
In the author’s early experiments, this approach seemed cumbersome at best. With
write buffers encoded as program state, the locations of the writes are encoded
as data. This encoding obviates the notion of “separation” that makes separation
logic so useful, since it is always possible to frame onto a specification writes which,
when flushed, violate the intended specification. And because the nondeterministic
flushing of writes is not embedded into the logic, it must be encoded as a disjunction
of possible buffered states and reasoned about on a case-by-case basis.
6.2 Program Logics for Weak-Memory Reasoning
Also of note are two works that present solutions to the same weak-memory rea-
soning problem, both developed much more fully than the logic described in this
dissertation. First is Ridge’s rely-guarantee program logic for the x86-TSO memory
model [49]. Ridge’s logic is formalized in HOL and has been demonstrated
with proofs of a number of interesting algorithms, including Simpson’s 4-slot non-blocking
buffer. In contrast to this dissertation project, Ridge’s is a logic for the
x86 assembly language, whereas the logic described in Chapter 4 targets a higher
level, structured language. Additionally, Ridge’s logic is not inherently local, and
offers nothing like the frame rule of separation logic.
A second work of note is Cohen and Schirmer’s [15] reduction from x86-
TSO to sequential consistency for certain programs. This is notable because the
class of programs they consider is larger than just the well locked programs. They
show that many concurrent programming paradigms, although racy, in fact remain
sequentially consistent. They furthermore provide a method of syntactic restriction
for an Owicki-Gries-style program logic that allows sound reasoning about such
programs. Although they describe some useful programs that fall outside of this
boundary, this seems to be a work of great practical importance. Although their
logic also offers no frame rule, Cohen has suggested in private communication that
a similar restriction may be applied to concurrent separation logic for sound local
reasoning.
Related but less relevant to the current problem, compared to the previ-
ous two papers, is work by Ferreira, Feng and Shao which gives soundness proofs
for concurrent separation logic in a variety of relaxed-memory settings [18]. As with
the original soundness proof by Brookes [9], their theorem applies to well locked
programs only, which are necessarily sequentially consistent.
6.3 Algebraic Models of Concurrency
A final thread of related work by Hoare et al. considers algebraic models of concur-
rent programs and logics for reasoning about such programs [58]. The thread begins
with the introduction of graphical models, a true-concurrency model, like Pratt’s
pomset model [43, 44] or Winskel’s event structures [59]. Graphical models are a
simple but amazingly expressive formalism; recent work has even shown how they
can be used to give weak-memory semantics of various kinds to concurrent programs
[27].
A graphical model is a finite partial order over some set of events. The
labels on events are chosen based on the features of the programming language
one wishes to model; examples include variable assignments; memory loads, stores
and fences; dynamic memory allocation and disposal; lock acquisitions and releases;
reads and writes to communication channels; and thread forks and joins. Directed
edges between events represent control flow and data flow. Each graph represents a
single execution of a program; a program is represented by a set of graphs indicating
its possible executions; and program features are described in terms of constraints
on the graphs.
For example, a variable assignment x := e might be described as a single event
labeled by x := v, where v is the valuation of e, with incoming edges labeled with the
names, and respective values, of the variables found in the expression e, and outgoing
edges labeled by x and the value v. In this case, the incoming edges represent data
flows needed to evaluate the expression e, and the outgoing edges represent data
flows of the new value of x that can be used to evaluate later expressions. Another
example is the sequential composition of commands c0 ; c1, which is modeled by
graphs that can be partitioned such that there are no “backward” edges from an
event of c1 to an event of c0.
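The constraint that characterizes sequential composition can be stated as a simple check on a graph. The following Python sketch assumes a graph is given as a set of directed edges between event names. The function name `is_sequential_split` and this representation are hypothetical and serve only to illustrate the "no backward edges" condition.

```python
def is_sequential_split(edges, first, second):
    """Check that (first, second) partitions events with no backward edge.

    `edges` is a set of (source, target) pairs. The split models a
    sequential composition when no edge runs from an event of the second
    command to an event of the first.
    """
    first, second = set(first), set(second)
    if first & second:
        return False  # the two commands must not share events
    return not any(src in second and dst in first for src, dst in edges)

edges = {("a", "b"), ("b", "c")}
good = is_sequential_split(edges, {"a"}, {"b", "c"})
bad = is_sequential_split(edges, {"b"}, {"a", "c"})
```

The first partition is acceptable because every edge either stays within one part or runs forward. The second is rejected because the edge from `a` to `b` would run backward from the second command into the first.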
Graphical models have also been used to show the soundness of concurrent
program logics [58]. When modeling a program logic, assertions simply denote sets
of graphical models, just the same as programs, a simplification that yields soundness
much more easily than with traditional models. Later, models of program
logics were generalized to an abstract algebraic characterization of graphical mod-
els. These algebras are known as Concurrent Kleene Algebras [25, 24], so-called
because they extend traditional Kleene Algebras with operators for both sequential
and concurrent composition.
The laws of Concurrent Kleene Algebras are further relaxed in a broader
study of algebras for concurrent reasoning [23], which resulted in the discovery of
locality bimonoids, found to be sufficient for modeling logics that embody a local
reasoning principle. Some results from work on locality bimonoids influenced this
work. In particular, those results showed that, while frame rules do not require
commutative logical operations, concurrency rules do—this indicated that the se-
mantics of specifications, which describes the separation of the shared state from the
private state using commutative spatial separation can likely not be strengthened by





This dissertation project has presented a new program logic for reasoning about the
behavior of structured, C-like concurrent programs w.r.t. a weak, x86-like model of
memory. The programming language has a novel semantics that unifies the descrip-
tion of parallel and interleaved execution; the novel assertion language provides a
natural and concise language for describing x86-like system states, and is oblivious
to the non-deterministic flushing of buffered writes; and the program logic—similar
in shape and meaning to Concurrent Separation Logic—provides small axioms for
the programming language’s primitives, including memory fences, and embodies in
its frame rules a novel x86-specific principle of local reasoning.
Motivations Reconsidered Why bother building a program logic? The original
motivation was as follows. Although program logics are reasonable systems in which
to construct hand proofs of arbitrary program properties, they have more recently
been shown to be amenable to automation of relatively shallow properties, like
memory safety or shape properties. But existing logics cannot be soundly applied
to certain fine-grained concurrent programs like concurrent data structures. This
is because these programs are typically not well locked and contain races, and so
cannot rely on the underlying computer architecture to ensure that their interaction
with memory is sequentially consistent. Sequential consistency however is a deep
assumption in most existing program logics, hence their inapplicability.
As further motivation, concurrent data structures are inarguably important
to computer science given the decline of single-threaded processor performance im-
provements and the concomitant proliferation of hardware parallelism. At the same
time, correctness arguments for concurrent data structures are subtle enough to
make informal reasoning extremely difficult. Additionally, these programs are of
only modest size, which perhaps gives cause for optimism about their amenability
to automated or semi-automated verification. Altogether, this would appear to be
an excellent opportunity for the application of an automated formal method.
Now, however, I am less confident in the practical value of this project than
I was at the outset, having identified a number of errors of judgment in the original
motivation. First, it was wrong to consider the small size of these programs as in-
creasing the viability of their fully automatic reasoning. This is exactly backwards:
because these subtle and important programs are so small, it may be entirely prac-
tical to consider expert-constructed formal proofs of their correctness using proof
assistants like ACL2, Coq or Isabelle. And although constructing these proofs is
difficult, surely it is less so than developing a general technique for doing so.
Second, although such programs are clearly racy, it is not clear that their
interactions with memory fall outside the bounds of sequential consistency. And
for sequentially consistent programs, it seems unlikely that an approach of such high
fidelity w.r.t. the memory model (e.g., with explicit write buffers) will turn out to
be the most effective.
Nonetheless, the project still has real scientific merit. First, it is not clear
that it is possible to reason about small concurrent data structures in isolation from
the clients that access them. If, in fact, this is not possible—if such programs cannot
soundly be analyzed by their components—then it could be the case that a weak
memory program logic that smoothly generalizes simpler traditional logics could
be applied to whole programs, working like a traditional program logic for the less
complex aspects of the program, but general enough to handle the racy interactions
made possible by the concurrent data structures. The logic described here is not
yet up to that task, but may constitute a step in that direction.
Furthermore, this work faces the problem of local reasoning about the behav-
ior of programs executing on a more complicated machine quite directly and gives
some indication of how this can be done without relying on simplifying assumptions
about memory. Local reasoning techniques can, of course, be quite useful even for
hand-constructed formal proofs. In the best case, this work could provide a foun-
dation for practically useful reasoning about a class of difficult programs. In the
worst case, it sheds some light on the problem of local program reasoning in gen-
eral by providing an additional—fairly extreme—data point in the space of program
logics, illustrating the difficulty and complexity of reasoning about the behavior of
programs w.r.t. a widespread and weak memory model.
Local Reasoning and Sequential Consistency A final note has to do with the
relationship between local reasoning and sequential consistency. The logic described
in this dissertation was designed up from the memory model; that is, soundness
w.r.t. the x86 memory model was a primary requirement, and strongly directed the
shape of the assertion and specification languages and proof theory. The second
primary requirement was a logic that embodied a local reasoning principle. Because
the original goal was to be able to reason about racy concurrent data structures,
at every step care was taken to be as liberal as possible while maintaining both
goals. For example, the definition of the spatiotemporal conjunction was relaxed
over and over again, incorporating more and more possible state configurations,
until it seemed as though any additional relaxation would yield an unsound frame
rule, thus violating the goal of having a logic suitable for local reasoning. And
yet, in the end, it seems—though has not been proven—that the logic that resulted
from this design process is incapable of expressing proofs about non-sequentially
consistent programs. This was certainly not intended; indeed, it came about
despite direct intention.
Although one could imagine changing or improving many aspects of this
logic, it is far from clear how one might proceed to relax its definition further
still to incorporate non-sequentially consistent programs given the local reasoning
requirement. One is led to wonder whether these two requirements of the logic—
both that it directly describes weak-memory systems, and that it admits a useful
principle of local reasoning—are not just at odds, but even mutually inconsistent.
The apparent failure of this logic to accomplish both goals simultaneously is not, of
course, a cogent argument for the impossibility of this task, but it does hint toward
the difficulty of doing so.
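As a concrete illustration of what a non-sequentially consistent behavior looks like, the following sketch enumerates the outcomes of the classic store-buffering litmus test under a simple operational model with per-processor FIFO write buffers. The litmus program, the function name outcomes and the state encoding are assumptions made for illustration; they are not taken from the formal model of this dissertation.

```python
from collections import deque

# Store-buffering litmus test (hypothetical encoding):
# P0 runs x := 1; r0 := y and P1 runs y := 1; r1 := x.
PROGRAMS = {
    0: (("store", "x", 1), ("load", "y")),
    1: (("store", "y", 1), ("load", "x")),
}

def outcomes(buffered):
    """Enumerate final (r0, r1) pairs; buffered=False writes straight to memory."""
    start = (0, 0, (("x", 0), ("y", 0)), ((), ()), (None, None))
    seen, todo, results = {start}, deque([start]), set()
    while todo:
        pc0, pc1, heap_items, bufs, regs = todo.popleft()
        heap, pcs = dict(heap_items), (pc0, pc1)
        successors = []
        for i in (0, 1):
            # Flush step: commit processor i's oldest buffered write to memory.
            if bufs[i]:
                (loc, val), rest = bufs[i][0], bufs[i][1:]
                new_heap = dict(heap)
                new_heap[loc] = val
                new_bufs = tuple(rest if j == i else bufs[j] for j in (0, 1))
                successors.append((pcs, new_heap, new_bufs, regs))
            if pcs[i] == len(PROGRAMS[i]):
                continue  # processor i has finished its program
            op = PROGRAMS[i][pcs[i]]
            new_pcs = tuple(pcs[j] + (j == i) for j in (0, 1))
            if op[0] == "store":
                _, loc, val = op
                if buffered:
                    # The write is appended to the processor's own buffer.
                    new_bufs = tuple(
                        bufs[j] + ((loc, val),) if j == i else bufs[j] for j in (0, 1)
                    )
                    successors.append((new_pcs, heap, new_bufs, regs))
                else:
                    new_heap = dict(heap)
                    new_heap[loc] = val
                    successors.append((new_pcs, new_heap, bufs, regs))
            else:
                # Load: the newest matching write in the own buffer wins, else memory.
                loc = op[1]
                own = [v for (l, v) in bufs[i] if l == loc]
                val = own[-1] if own else heap[loc]
                new_regs = tuple(val if j == i else regs[j] for j in (0, 1))
                successors.append((new_pcs, heap, bufs, new_regs))
        if not successors:
            results.add(regs)  # both programs finished and both buffers drained
        for pcs2, heap2, bufs2, regs2 in successors:
            state = (pcs2[0], pcs2[1], tuple(sorted(heap2.items())), bufs2, regs2)
            if state not in seen:
                seen.add(state)
                todo.append(state)
    return results
```

Under these assumptions, outcomes(False) excludes the result in which both loads return 0, whereas outcomes(True) admits it; that extra outcome is the kind of behavior a sequentially consistent interleaving cannot produce.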
Appendix A
Additional Lemmas, Proofs and
Conjectures
This chapter contains proofs of various results about the assertion and program
logics. Section A.1 describes results about closure properties of the models of the
separating conjunctions, and Section A.2 describes results about the soundness of
the program logic w.r.t. the model given in Chapter 4.
A.1 Flushing Closure Proofs
The following lemma describes a crucial relationship between spatiotemporal
separation and the flushing relation.
Lemma 4. If µ ∈ µ1 #̃ µ2 and µ →τ µ′, then either there exists µ′1 such that
µ′ ∈ µ′1 #̃ µ2, or there exists µ′2 such that µ′ ∈ µ1 #̃ µ′2.
Proof. Without loss of generality we may assume that µ = (h, B[i ↦ (ℓ, v) :: b], K)
for some i ∉ K, and thus µ′ = (h[ℓ ↦ v], B[i ↦ b], K). Furthermore, we have
h = h1\\h2, B(i) = (ℓ, v) :: b ∈ B1(i)\\B2(i) and K = K1 ⊎ K2, assuming µ1 =
(h1, B1, K1) and µ2 = (h2, B2, K2). The least-recent write of B(i), (ℓ, v), is either
the least-recent write of B1(i) or B2(i).
In the first case, B1(i) = (ℓ, v) :: b′1, and b ∈ b′1\\B2(i). Let
µ′1 = (h1[ℓ ↦ v], B1[i ↦ b′1], K1).
Because i ∉ K = K1 ⊎ K2, it is also the case that i ∉ K1, and so µ1 →τ µ′1. By
definedness of µ1 #̃ µ2, we know that if ℓ ∈ dom(h2) ∩ dom(B1(i)) then i ∈ K.
Hence ℓ ∉ dom(h2), which means that (h1\\h2)[ℓ ↦ v] = h1[ℓ ↦ v]\\h2. It follows
that
µ′ = (h[ℓ ↦ v], B[i ↦ b], K)
= (h1[ℓ ↦ v]\\h2, B[i ↦ b], K1 ⊎ K2)
∈ (h1[ℓ ↦ v], B1[i ↦ b′1], K1) #̃ (h2, B2, K2)
= µ′1 #̃ µ2.
In the second case, B2(i) = (ℓ, v) :: b′2, and b ∈ B1(i)\\b′2. Let
µ′2 = (h2[ℓ ↦ v], B2[i ↦ b′2], K2).
Again, i ∉ K2, and so µ2 →τ µ′2. Because (h1\\h2)[ℓ ↦ v] = h1\\(h2[ℓ ↦ v]), it follows
that
µ′ = (h[ℓ ↦ v], B[i ↦ b], K)
= (h1\\(h2[ℓ ↦ v]), B[i ↦ b], K1 ⊎ K2)
∈ (h1, B1, K1) #̃ (h2[ℓ ↦ v], B2[i ↦ b′2], K2)
= µ1 #̃ µ′2.
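The flush step used throughout these proofs can be sketched operationally. The sketch below assumes a memory system is a triple of a heap (a dict), per-processor write buffers (lists whose head is the least-recent write) and a set of blocked processors; the function name flush_steps and this encoding are hypothetical and serve only as an illustration.

```python
def flush_steps(mu):
    """Return every memory system reachable from mu by one flush (tau) step."""
    heap, buffers, blocked = mu
    results = []
    for proc, buf in buffers.items():
        if proc in blocked or not buf:
            continue  # blocked processors and empty buffers cannot flush
        (loc, val), rest = buf[0], buf[1:]  # head of the list is the least-recent write
        new_heap = {**heap, loc: val}       # h[loc -> val]
        new_buffers = {**buffers, proc: rest}  # B[proc -> rest]
        results.append((new_heap, new_buffers, blocked))
    return results
```

For instance, a system whose processor 0 has buffered the writes ("x", 1) and then ("x", 2) has exactly one flush successor, in which the heap maps x to 1 and only ("x", 2) remains buffered.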
Lemma 5. If µ ∈ µ1 ∗̃ µ2 and µ →τ µ′, then either there exists µ′1 such that µ′ ∈
µ′1 ∗̃ µ2, or there exists µ′2 such that µ′ ∈ µ1 ∗̃ µ′2.
Proof. Without loss of generality we may assume that µ = (h, B[i ↦ (ℓ, v) :: b], K)
for some i ∉ K, and thus µ′ = (h[ℓ ↦ v], B[i ↦ b], K). Furthermore, we have
h = h1 ⊎ h2, B(i) = (ℓ, v) :: b ∈ B1(i) ⊎ B2(i) and K = K1 ⊎ K2, assuming µ1 =
(h1, B1, K1) and µ2 = (h2, B2, K2). The least-recent write of B(i), (ℓ, v), is either
the least-recent write of B1(i) or B2(i).
In the first case, B1(i) = (ℓ, v) :: b′1, and b ∈ b′1 ⊎ B2(i). Let
µ′1 = (h1[ℓ ↦ v], B1[i ↦ b′1], K1).
Because i ∉ K = K1 ⊎ K2, it is also the case that i ∉ K1, and so µ1 →τ µ′1. By
µ1^∗ µ2 and ℓ ∈ dom(µ1) we have ℓ ∉ dom(h2), which means that (h1 ⊎ h2)[ℓ ↦ v] =
h1[ℓ ↦ v] ⊎ h2. It follows that
µ′ = (h[ℓ ↦ v], B[i ↦ b], K)
= (h1[ℓ ↦ v] ⊎ h2, B[i ↦ b], K1 ⊎ K2)
∈ (h1[ℓ ↦ v], B1[i ↦ b′1], K1) ∗̃ (h2, B2, K2)
= µ′1 ∗̃ µ2.
The second case is symmetric.
Lemma 6. If µ = µ1 C̃ µ2 and µ →τ µ′, then either there exists µ′1 such that
µ′ = µ′1 C̃ µ2, or there exists µ′2 such that µ′ = µ1 C̃ µ′2.
Proof. Without loss of generality we may assume that µ = (h, B[i ↦ (ℓ, v) :: b], K)
for some i ∉ K, and thus µ′ = (h[ℓ ↦ v], B[i ↦ b], K). Furthermore, we have
h = h1\\h2, B(i) = (ℓ, v) :: b = B1(i) ++ B2(i) and K = K1 ⊎ K2, assuming µ1 =
(h1, B1, K1) and µ2 = (h2, B2, K2). The least-recent write of B(i), (ℓ, v), is either
the least-recent write of B1(i) or B2(i).
In the first case, B1(i) = (ℓ, v) :: b′1, and b = b′1 ++ B2(i). Let
µ′1 = (h1[ℓ ↦ v], B1[i ↦ b′1], K1).
Because i ∉ K = K1 ⊎ K2, it is also the case that i ∉ K1, and so µ1 →τ µ′1. By
µ1^C µ2, we know that if ℓ ∈ dom(h2) ∩ dom(B1(i)) then i ∈ K. Hence ℓ ∉ dom(h2),
which means that (h1\\h2)[ℓ ↦ v] = h1[ℓ ↦ v]\\h2. It follows that
µ′1 C̃ µ2 = (h1[ℓ ↦ v], B1[i ↦ b′1], K1) C̃ (h2, B2, K2)
= (h1[ℓ ↦ v]\\h2, B[i ↦ b], K)
= ((h1\\h2)[ℓ ↦ v], B[i ↦ b], K)
= µ′.
In the second case, B2(i) = (ℓ, v) :: b′2, B1(i) = ε and b = b′2. Let
µ′2 = (h2[ℓ ↦ v], B2[i ↦ b′2], K2).
Because i ∉ K = K1 ⊎ K2, it is also the case that i ∉ K2, and so µ2 →τ µ′2. By
definition of function overriding, (h1\\h2)[ℓ ↦ v] = h1\\h2[ℓ ↦ v]. It follows that
µ1 C̃ µ′2 = (h1, B1, K1) C̃ (h2[ℓ ↦ v], B2[i ↦ b′2], K2)
= (h1\\h2[ℓ ↦ v], B[i ↦ b], K)
= ((h1\\h2)[ℓ ↦ v], B[i ↦ b], K)
= µ′.
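The two heap identities that the case analyses above rely on can be checked exhaustively on small heaps. The sketch assumes that overriding gives precedence to its right operand, which is the reading under which both identities hold; the helper names override, update, all_heaps and check_identities are hypothetical.

```python
from itertools import product

LOCS, VALS = ("x", "y"), (0, 1)

def override(h1, h2):
    """Heap h1 overridden by h2: bindings of h2 take precedence (assumed reading)."""
    return {**h1, **h2}

def update(h, loc, val):
    """Functional update h[loc -> val]."""
    return {**h, loc: val}

def all_heaps():
    """Every partial map from LOCS to VALS."""
    for choice in product((None,) + VALS, repeat=len(LOCS)):
        yield {l: v for l, v in zip(LOCS, choice) if v is not None}

def check_identities():
    heaps = list(all_heaps())
    for h1, h2, loc, val in product(heaps, heaps, LOCS, VALS):
        lhs = update(override(h1, h2), loc, val)
        # Used in the second case of the proofs: holds unconditionally.
        if lhs != override(h1, update(h2, loc, val)):
            return False
        # Used in the first case: requires loc to lie outside dom(h2).
        if loc not in h2 and lhs != override(update(h1, loc, val), h2):
            return False
    return True
```

The side condition in the first case is necessary under this reading: if the updated location already belongs to the right-hand heap, pushing the update into the left-hand heap has no visible effect.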




Lemma 7. If µ ∈ µ1 •̃ µ2 and µ′  µ, then there exist µ′1, µ′2 such that
µ′1  µ1, µ′2  µ2 and µ′ ∈ µ′1 •̃µ′2.
Proof. By induction on the number of τ steps from µ to µ′. The base case is trivial.
Otherwise, assume that µ′′  µ and µ′′ →τ µ′, and by the induction hypothesis that
there exists µ′′1, µ′′2 such that µ′′1  µ1, µ′′2  µ2 and µ′′ ∈ µ′′1 •̃µ′′2. In which case the
result follows from one of Lemmas 4, 5 or 6, and by transitivity of .
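The closure property can be exercised on a small example. The sketch below restates the flush step so that the block runs on its own, computes everything reachable by zero or more flush steps, and checks that each flushed form of a sequential composition is again a composition of flushed components. The encoding of memory systems, the simplified composition seq_compose (right heap overrides left, buffers concatenate, blocked sets unite) and all function names are assumptions; the definedness side conditions of the real operators are not modeled, so the example uses disjoint locations.

```python
def flush_steps(mu):
    """All one-step flush successors of mu = (heap, buffers, blocked)."""
    heap, buffers, blocked = mu
    out = []
    for proc, buf in buffers.items():
        if proc not in blocked and buf:
            (loc, val), rest = buf[0], buf[1:]
            out.append(({**heap, loc: val}, {**buffers, proc: rest}, blocked))
    return out

def flush_closure(mu):
    """All memory systems reachable from mu by zero or more flush steps."""
    reached, todo = [mu], [mu]
    while todo:
        for nxt in flush_steps(todo.pop()):
            if nxt not in reached:
                reached.append(nxt)
                todo.append(nxt)
    return reached

def seq_compose(mu1, mu2):
    """Simplified sequential composition (assumed, side conditions omitted)."""
    (h1, b1, k1), (h2, b2, k2) = mu1, mu2
    return ({**h1, **h2}, {p: b1[p] + b2[p] for p in b1}, k1 | k2)

def decomposes(mu1, mu2):
    """Every flushed form of the composition splits into flushed components."""
    left, right = flush_closure(mu1), flush_closure(mu2)
    return all(
        any(seq_compose(m1, m2) == mu for m1 in left for m2 in right)
        for mu in flush_closure(seq_compose(mu1, mu2))
    )

# Example systems over two processors and disjoint locations.
MU1 = ({"x": 0}, {0: [("x", 1)], 1: []}, frozenset())
MU2 = ({"y": 0}, {0: [], 1: [("y", 2)]}, frozenset())
```

With these two systems the composition has four flushed forms, and decomposes(MU1, MU2) confirms that each one is matched by a pair of flushed components.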
A.2 Soundness Proofs
The following lemma describes the relationship between the “safety” lemmas and
“soundness” lemmas in subsequent sections.
Lemma 8. Let c be a command and J, P, and Q be assertions. Suppose, for all
n ∈ N, s ∈ Stack and µ ∈Mem, that whenever s, µ,P |= P it is also the case that
safen(c, s, µ, J,Q) holds. Then J |= {P} c {Q} holds as well.
Proof. Immediate from the definition of J |= {P} c {Q}.
A.2.1 Soundness of the Axioms
This section presents soundness lemmas for each of the axioms described in
Section 4.5.1. That is, for each axiom J ⊢ {P} c {Q}, there is a lemma below which
asserts J |= {P} c {Q}.
Lemma 9. For all n ∈ N, if (s, µ,P) |= P then safen(skip, s, µ, J, P ).
Proof. By induction on n. The base case is trivial. For the induction step, we show
safen+1(skip, s, µ, J, P ) under the assumption that safen(skip, s, µ, J, P ).
1. Because c = skip we must show that s, µ,P |= P . This is true by hypothesis.
2. Let µ0, µJ , µF such that µ0 ∈ µJ ∗̃ (µF #̃ µ), lock-complete(µ0) and either
s, µJ ,P |= J or locked(µ0). We must show that skip, s, µ0 9  . But the only
evaluation step possible from configuration skip, s, µ0 is by c-tau, which never
aborts.
3. Let µ0, µJ , µF , c1, µ1 such that µ0 ∈ µJ ∗̃ (µF #̃ µ), lock-complete(µ0), either
s, µJ ,P |= J or locked(µ0), and skip, s, µ0 → c1, s, µ1. We must show µ′J , µ′F , µ′
such that µ1 ∈ µ′J ∗̃ (µ′F #̃ µ′), µ′F  µF , either s, µ′J ,P |= J or locked(µ1),
and safen(c1, s, µ′, J, P ).
The only evaluation step possible from skip, s, µ0 is by c-tau, hence µ1  µ0.
By Lemma 7, there exists µ′J , µ′F , µ′ such that µ1 ∈ µ′J ∗̃ (µ′F #̃ µ′), µ′J  µJ ,
µ′F  µF , µ′  µ. By Proposition 2 and µ′J  µJ , if s, µJ ,P |= J then also
s, µ′J ,P |= J . Similarly, s, µ′,P |= P because s, µ,P |= P and µ′  µ. Hence,
by the inductive hypothesis we have that safen(skip, s, µ′, J, P ).
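The three-part shape of these safety arguments (postcondition at termination, absence of aborts, safety of every successor at the smaller index) can be sketched as a bounded check over a toy transition system. This is a deliberate simplification: the invariant J, the frame µF, the lock conditions and flush steps are all omitted, and the names safe, step, is_final and post, as well as the toy configurations, are hypothetical.

```python
ABORT = "abort"

def safe(n, config, step, is_final, post):
    """Bounded safety: no abort within n steps, and post holds whenever final."""
    if n == 0:
        return True  # the base case holds trivially
    if is_final(config) and not post(config):
        return False  # part 1: a finished command must satisfy the postcondition
    successors = step(config)
    if ABORT in successors:
        return False  # part 2: no aborting step is possible
    # part 3: every successor is safe for the remaining n - 1 steps
    return all(safe(n - 1, nxt, step, is_final, post) for nxt in successors)

# Toy instance: a configuration is (pending reads, allocated locations);
# reading an unallocated location aborts.
def step(config):
    reads, allocated = config
    if not reads:
        return []
    return [(reads[1:], allocated)] if reads[0] in allocated else [ABORT]

def is_final(config):
    return not config[0]

def post(config):
    return True
```

In this toy instance a program that reads x and then the unallocated y is safe for one step but not for two, which shows how the index bounds the depth to which executions are inspected.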
Lemma 10. J |= {P} skip {P}.
Proof. Immediate from Lemmas 8 and 9.
In the sequel, we write dom(µ), for µ = (h,B, k), as shorthand for dom(h) ∪
⋃i∈P dom(B(i)), and dom(µ|i) as shorthand for dom(h) ∪ dom(B(i)).
Lemma 11. For all n ∈ N and s, µ such that s, µ,P |= e  e′ e′′ J P , if x ∉
fv(e, e′, e′′, P ) then safen(x := [e]e′ , s, µ, J, ((e e′ e′′ J P ) ∧ x = e′′)).
Proof. By induction on n. The base case is trivial. For the induction step, we
assume the lemma holds for n and show that it holds for n+ 1.
1. The command is not equal to skip, so this part holds vacuously.
2. Let µ0, µJ , µF such that µ0 ∈ µJ ∗̃ (µF #̃ µ), lock-complete(µ0), and either
locked(µ0) or s, µJ ,P |= J . We must show that c, s, µ0 9  . The only
aborting step possible is via c-prim-a by way of p-load-a. This requires
that ŝ(e) /∈ dom(h\\B(ŝ(e′))). But s, µ,P |= (e e′ e′′) J P by hypothesis,
which means there exists µw such that s, µw,P |= e e′ e′′ and dom(µw|ŝ(e′)) ⊆
dom(µ|ŝ(e′)) ⊆ dom(µ0|ŝ(e′)). But ŝ(e) ∈ dom(µw|ŝ(e′)), and so
ŝ(e) ∈ dom(h\\B(ŝ(e′))).
Hence, the command cannot abort.
3. Let µ0, µJ , µF , c′, s′, µ1 such that µ0 ∈ µJ ∗̃ (µF #̃ µ), lock-complete(µ0), either
locked(µ0) or s, µJ ,P |= J , and c, s, µ0 → c′, s′, µ1. We must show µ′J , µ′F ,
µ′ such that µ1 ∈ µ′J ∗̃ (µ′F #̃ µ′), µ′F  µF , either locked(µ1) or
s′, µ′J ,P |= J , and safen(c′, s′, µ′, J, (e e′ e′′ J P ) ∧ x = e′′). The evalua-
tion step is either by c-tau or c-prim by way of p-load.
In the case of c-tau, c′ = (x := [e]e′), s′ = s and µ1  µ0. By Lemma 7 there
exists µ′J , µ′F , µ′ such that µ′J  µJ , µ′F  µF and µ′  µ. By Lemma 2,
s, µ′,P |= (e e′ e′′) J P and s, µ′J ,P |= J if not locked(µ1). It follows from
the inductive hypothesis that safen(c′, s′, µ′, J, (e e′ e′′ J P ) ∧ x = e′′).
In the case of c-prim and p-load, c′ = skip, s′ = s[x   v] and µ1 = µ0,
assuming µ0 = (h0, B0, k0) and (h0\\B0(ŝ(e′)))(ŝ(e)) = v. Let µ′J = µJ ,
µ′F = µF and µ′ = µ. Then µ1 ∈ µ′J ∗̃ (µ′F #̃ µ′) by assumption, µ′F  µF by
reflexivity, s′, µ′J ,P |= J if not locked(µ1) because x /∈ fv(J) and by Lemmas 2
and 3. Finally, safen(skip, s′, µ′, J, ((e  e′ e′′) J P ∧ x = e′′)) follows from
Lemma 9 because s′, µ′,P = s[x   v] , µ,P |= (e e′ e′′ J P ) ∧ x = e′′, which
itself is because s, µ,P |= e e′ e′′ J P with x /∈ fv(e, e′, e′′, P ).
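The load behavior that this proof reasons about can be sketched as follows: a processor reads its own most recent buffered write to the location if there is one, falls back to the heap otherwise, and aborts when the location is in neither. The encoding of memory systems and the names Abort and load are assumptions made for illustration.

```python
class Abort(Exception):
    """Raised when a load targets a location that is not allocated."""

def load(mu, proc, loc):
    """Value that processor proc reads at loc in mu = (heap, buffers, blocked)."""
    heap, buffers, _blocked = mu
    # Buffers list the least-recent write first, so scan from the newest end.
    for buffered_loc, val in reversed(buffers[proc]):
        if buffered_loc == loc:
            return val
    if loc not in heap:
        raise Abort(loc)  # loc lies outside the heap overridden by the buffer
    return heap[loc]
```

A processor therefore observes its own pending writes before they reach memory, while other processors continue to see the value stored in the heap.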
Lemma 12. J |= {e e′ e′′ J P} x := [e]e′ {(e e′ e′′ J P ) ∧ x = e′′} if x ∉
fv(e, e′, e′′, P ).
Proof. Immediate from Lemmas 8 and 11.
Lemma 13. For all n ∈ N and s, µ with s, µ,P |= e e′ e′′ J P , it is the case that
safen([e] := fe′ , s, µ, J, (e e′ e′′ J P ) C e e′ f).
Proof. By induction on n. The base case is trivial. For the induction step, we
assume the lemma holds for n and show that it holds for n+ 1.
1. The command is not equal to skip, so this part holds vacuously.
2. Let µ0, µJ , µF such that µ0 ∈ µJ ∗̃ (µF #̃ µ), lock-complete(µ0), and either
locked(µ0) or s, µJ ,P |= J . We must show that c, s, µ0 9  . The only abort-
ing step possible is via c-prim-a by way of p-store-a. This requires that
ŝ(e) /∈ dom(h\\B(ŝ(e′))). But s, µ,P |= (e e′ e′′) J P , which means there
exists µw such that s, µw,P |= e e′ e′′ and dom(µw|ŝ(e′)) ⊆ dom(µ|ŝ(e′)) ⊆
dom(µ0|ŝ(e′)). But ŝ(e) ∈ dom(µw|ŝ(e′)), and so ŝ(e) ∈ dom(h\\B(ŝ(e′))).
Hence, the command cannot abort.
3. Let µ0, µJ , µF , c′, s′, µ1 such that µ0 ∈ µJ ∗̃ (µF #̃ µ), lock-complete(µ0), either
locked(µ0) or s, µJ ,P |= J , and c, s, µ0 → c′, s′, µ1. We must show µ′J , µ′F ,
µ′ such that µ1 ∈ µ′J ∗̃ (µ′F #̃ µ′), µ′F  µF , either locked(µ1) or
s′, µ′J ,P |= J , and safen(c′, s′, µ′, J, (e e′ e′′ J P ) C e e′ f). The
evaluation step is either by c-tau or c-prim by way of p-store.
In the case of c-tau, c′ = c, s′ = s and µ1  µ0. By Lemma 7 there
exists µ′J , µ′F , µ′ such that µ′J  µJ , µ′F  µF and µ′  µ. By Lemma 2,
s, µ′,P |= e e′ e′′ J P and s, µ′J ,P |= J if not locked(µ1). It follows from
the inductive hypothesis that safen(c, s, µ′, J, (e e′ e′′ J P ) C e e′ f).
In the case of c-prim and p-store, c′ = skip, s′ = s and let µ1 = µ0 C̃ µw,
where µw models the new write, defined as follows
µw =df (∅, E
[








∈ {assumption µ0 ∈ µJ ∗̃ (µF #̃ µ)}
(µJ ∗̃ (µF #̃ µ)) C̃ µw
⊆ {algebra of separation functions}
µJ ∗̃ (µF #̃ (µ C̃ µw))
Finally, we have safen(skip, s, µ C̃ µw, J, (e e′ e′′ J P ) C e e′ f) by way of
Lemma 9 because s, µ C̃ µw,P |= (e e′ e′′ J P ) C e e′ f , which itself is
because s, µ,P |= e e′ e′′ J P and s, µw,P |= e e′ f.
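Operationally, the store step appends the new write at the most-recent end of the issuing processor's buffer and leaves the heap untouched, which corresponds to composing the current system sequentially with a system that holds only that write. A minimal sketch, assuming the same hypothetical encoding of memory systems as a (heap, buffers, blocked) triple; the names Abort and store are illustrative.

```python
class Abort(Exception):
    """Raised when a store targets a location that is not allocated."""

def store(mu, proc, loc, val):
    """Buffer the write (loc, val) for processor proc; the heap is unchanged."""
    heap, buffers, blocked = mu
    # Locations visible to proc: the heap overridden by proc's own buffer.
    visible = set(heap) | {l for l, _ in buffers[proc]}
    if loc not in visible:
        raise Abort(loc)
    new_buffers = {**buffers, proc: buffers[proc] + [(loc, val)]}
    return (heap, new_buffers, blocked)
```

The write becomes visible to other processors only after a later flush step commits it to the heap.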
Lemma 14. J |= {e e′ e′′ J P} [e] := fe′ {(e e′ e′′ J P ) C e e′ f}.
Proof. Immediate from Lemmas 8 and 13.
Lemma 15. For all n ∈ N and s, µ such that s, µ,P |= emp it is the case that
safen(locke, s, µ, J, locke).
Proof. By induction on n. The base case is trivial. For the induction step, we
assume the lemma holds for n and show that it holds for n+ 1.
1. The command is not equal to skip, so this part holds vacuously.
2. There are no aborting steps possible from command locke.
3. Let µ0, µJ , µF , c′, s′, µ1 such that µ0 ∈ µJ ∗̃ (µF #̃ µ), lock-complete(µ0), either
locked(µ0) or s, µJ ,P |= J , and c, s, µ0 → c′, s′, µ1. We must show µ′J , µ′F ,
µ′ such that µ1 ∈ µ′J ∗̃ (µ′F #̃ µ′), µ′F  µF , either locked(µ1) or
s′, µ′J ,P |= J , and safen(c′, s′, µ′, J, locke). The evaluation step is either by
c-tau or c-prim by way of p-lock.
In the case of c-tau, c′ = c, s′ = s and µ1  µ0. By Lemma 7 there
exists µ′J , µ′F , µ′ such that µ′J  µJ , µ′F  µF and µ′  µ. By Lemma 2,
s, µ′,P |= emp and s, µ′J ,P |= J if not locked(µ1). It follows from the
inductive hypothesis that safen(c′, s′, µ′, J, locke).
In the case of c-prim and p-lock, c′ = skip, s′ = s and µ1 = (h0, B0,P \
{ŝ(e)}), assuming µ0 = (h0, B0,K0). By assumption, K0 = ∅, and so from
µ0 ∈ µJ ∗̃ (µF #̃ µ), we have that µ = (h,B, ∅) for some h,B. Let µ′ =
(h,B,P \ {ŝ(e)}). Then from µ0 ∈ µJ ∗̃ (µF #̃ µ) and K0 = ∅, we also have
µ1 ∈ µJ ∗̃ (µF #̃ µ′). Finally, safen(skip, s, µ′, J, locke) follows from Lemma 9
because s, µ′,P |= locke, which itself is because s, µ,P |= emp.
Lemma 16. J |= {emp} locke {locke}.
Proof. Immediate from Lemmas 8 and 15.
Lemma 17. For all n ∈ N and s, µ such that s, µ,P |= locke it is the case that
safen(unlocke, s, µ, J, emp).
Proof. By induction on n. The base case is trivial. For the induction step, we
assume the lemma holds for n and show that it holds for n+ 1.
1. The command is not equal to skip, so this part holds vacuously.
2. There are no aborting steps possible from command unlocke.
3. Let µ0, µJ , µF , c′, s′, µ1 such that µ0 ∈ µJ ∗̃ (µF #̃ µ), lock-complete(µ0), either
locked(µ0) or s, µJ ,P |= J , and c, s, µ0 → c′, s′, µ1. We must show µ′J , µ′F ,
µ′ such that µ1 ∈ µ′J ∗̃ (µ′F #̃ µ′), µ′F  µF , either locked(µ1) or
s′, µ′J ,P |= J , and safen(c′, s′, µ′, J, emp). The evaluation step is either by
c-tau or c-prim by way of p-unlock.
In the case of c-tau, c′ = c, s′ = s and µ1  µ0. By Lemma 7 there
exists µ′J , µ′F , µ′ such that µ′J  µJ , µ′F  µF and µ′  µ. By Lemma 2,
s, µ′,P |= locke and s, µ′J ,P |= J if not locked(µ1). It follows from the
inductive hypothesis that safen(c′, s′, µ′, J, emp).
In the case of c-prim and p-unlock, c′ = skip, s′ = s and µ1 = (h0, B0, ∅),
assuming µ0 = (h0, B0,K0). By assumption, K0 = P \ {ŝ(e)}, and so from
µ0 ∈ µJ ∗̃ (µF #̃ µ), we have that µ = (h,B,P \ {ŝ(e)}) for some h,B. Let
µ′ = (h,B, ∅). Then from µ0 ∈ µJ ∗̃ (µF #̃ µ) and K0 = P \ {ŝ(e)}, we
also have µ1 ∈ µJ ∗̃ (µF #̃ µ′). Finally, safen(skip, s, µ′, J, emp) follows from
Lemma 9 because s, µ′,P |= emp, which itself is because s, µ,P |= locke.
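The effect of the lock and unlock steps on the third component of a memory system, as used in the two proofs above, can be sketched as follows. Locking by a processor blocks every other processor, and unlocking clears the blocked set; a step that is not enabled simply does not occur, since neither command has an aborting step. The processor set PROCS, the function names and the encoding are assumptions, and only the preconditions mentioned in the proofs are modeled.

```python
PROCS = frozenset({0, 1, 2})  # hypothetical set of processor identifiers

def lock(mu, proc):
    """Lock step by proc: enabled only when no processor is currently blocked."""
    heap, buffers, blocked = mu
    if blocked:
        return None  # not enabled; the command cannot abort, it just cannot step
    return (heap, buffers, PROCS - {proc})

def unlock(mu, proc):
    """Unlock step by proc: enabled only when proc is the one holding the lock."""
    heap, buffers, blocked = mu
    if blocked != PROCS - {proc}:
        return None
    return (heap, buffers, frozenset())
```

Together with the flush sketch's rule that blocked processors cannot flush, this captures why only the lock holder's buffered writes can reach memory while the lock is held.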
Lemma 18. J |= {locke} unlocke {emp}.
Proof. Immediate from Lemmas 8 and 17.
A.2.2 Soundness of the Inference Rules
This section presents soundness lemmas for some, but not all, of the inference rules
described in Section 4.5.1. That is, for some inference rules
J ′ ` {P ′} c′ {Q′}
J ` {P} c {Q}









only if J |= {P} c {Q} .
For a set of memory systems S, we write safen(c, s, S, J,Q) and s, S,Γ |= Q
as shorthand for the universal quantifications
∀µ ∈ S : safen(c, s, µ, J,Q) and ∀µ ∈ S : s, µ,Γ |= Q.
The following two lemmas sketch soundness proofs for a spatiotemporal frame
rule. Note, however, that the spatiotemporal conjunction is not actually defined in
the assertion language described in this dissertation, nor is there a spatiotemporal
frame rule. But the separating conjunctions that are described are refinements
of this hypothetical spatiotemporal conjunction, and so it may be interesting to
consider its frame rule.
Lemma 19. If safen(c, s, µ, J,Q), fv(R)∩mod(c) = ∅, def(µR #̃ µ), and s, µR,P |=
R, then safen(c, s, µR #̃ µ, J,R # Q).
Sketch. By induction on n. The base case is trivial. For the induction step, we
assume the lemma holds for n and show that it holds for n+ 1. That is, we assume,
for any c, s, µ, µR, J,Q,R, that whenever safen(c, s, µ, J,Q), fv(R) ∩ mod(c) = ∅,
def(µR #̃ µ), and s, µR,P |= R, it is also the case that safen(c, s, µR #̃ µ, J,R # Q)
holds. We also assume that safen+1(c, s, µ, J,Q), fv(R)∩mod(c) = ∅, def(µR #̃ µ),
and s, µR,P |= R, and show that safen+1(c, s, µR #̃ µ, J,R # Q). To show this, we
must establish three conditions.
1. Suppose c = skip. By assumption, s, µR,P |= R, def(µR #̃ µ), and if c = skip
then s, µ,P |= Q, and so s, µ,P |= Q. Hence s, (µR #̃ µ),P |= R # Q.
2. Let µ0, µJ , µF such that µ0 ∈ µJ ∗̃ (µF #̃ (µR #̃ µ)), and either locked(µ0) or
s, µJ ,P |= J . We must show that c, s, µ0 9  . But
µJ ∗̃ (µF #̃ (µR #̃ µ)) ⊆ µJ ∗̃ ((µF #̃ µR) #̃ µ),
and so by part 2 of assumption safen+1(c, s, µ, J,Q), instantiating with (an
arbitrary element of) µF #̃ µR, the result holds.
3. Let µ0, µJ , µF , µ1, c′, s′ such that µ0 ∈ µJ ∗̃ (µF #̃ (µR #̃ µ)), either locked(µ0)
or s, µJ ,P |= J , and c, s, µ0 → c′, s′, µ1. We must exhibit µ′J , µ′F , µ′ such that:
(a) µ1 ∈ µ′J ∗̃ (µ′F #̃ µ′),
(b) µ′F  µF ,
(c) either locked(µ1) or s′, µ′J ,P |= J , and
(d) safen(c′, s′, µ′, J, R # Q).
We instantiate part 3 of assumption safen+1(c, s, µ, J,Q) with µJ , (an arbitrary
element of) µF #̃ µR, µ1, c′ and s′ to obtain µ′J , µ′FR, µ′ such that:
(a) µ1 ∈ µ′J ∗̃ (µ′FR #̃ µ′)
(b) µ′FR  µF #̃ µR
(c) either locked(µ1) or s′, µ′J ,P |= J , and
(d) safen(c′, s′, µ′, J,Q).
By Lemma 7 and µ′FR  µF #̃ µR, there exists µ′F , µ′R such that µ′F  µF ,
µ′R  µR and µ′FR ∈ µ′F #̃ µ′R.
From µ′R  µR and s, µR,P |= R—as well as the assumption fv(R)∩mod(c) =
∅ and Lemmas 2 and 3—it follows that s′, µ′R,P |= R.
Next, we wish to apply the inductive hypothesis to safen(c′, s′, µ′, J,Q) to
show that safen(c′, s′, µ′R #̃ µ′, J, R # Q). This follows from the observations
that fv(R) ∩ mod(c′) = ∅ (because mod(c′) ⊆ mod(c) from Lemma 2) and
def(µ′R #̃ µ′).










tivity and monotonicity it follows that
µ1 ∈ µ′J ∗̃ (µ′F #̃ (µ′R #̃ µ′)).
We have already shown that µ′F  µF . Either locked(µ1) or s′, µ′J ,P |=
J by assumption. Finally safen(c′, s′, µ′R #̃ µ′, J, R # Q) by the inductive
hypothesis above.
Lemma 20. If J |= {P} c {Q} and it is the case that fv(R) ∩ mod(c) = ∅ then
J |= {R # P} c {R # Q}.
Sketch. Let s, µ,P |= R # P and n ∈ N. We must show safen(c, s, µ, J,R # Q).
From s, µ,P |= R # P , there exists µR, µP with µ ∈ µR #̃ µP such that s, µR,P |=
R and s, µP ,P |= P . By assumption, safen(c, s, µP , J,Q). The result follows from
Lemma 19.
A proof sketch of the safety of the spatial frame rule relies on the following
unproven conjecture.
Conjecture 1. Let µa ∈ µb ∗̃ µc, and suppose c, s, µa → c′, s′, µ′a. If µ′a ∈ µ′b #̃ µ′c
with µ′b  µb, then µ′b #̃ µ′c = µ′b ∗̃ µ′c.
Lemma 21. If safen(c, s, µ, J,Q), fv(R)∩mod(c) = ∅, def(µR ∗̃ µ), and s, µR,P |=
R, then safen(c, s, µR ∗̃ µ, J,R ∗ Q).
Sketch. By induction on n. The base case is trivial. For the induction step, we
assume the lemma holds for n and show that it holds for n+ 1. That is, we assume,
for any c, s, µ, µR, J,Q,R, that whenever safen(c, s, µ, J,Q), fv(R) ∩ mod(c) = ∅,
def(µR ∗̃ µ), and s, µR,P |= R, it is also the case that safen(c, s, µR ∗̃ µ, J,R ∗ Q)
holds as well. We additionally assume the antecedent, which is that safen+1(c, s, µ, J,Q),
fv(R)∩mod(c) = ∅, def(µR ∗̃ µ), and s, µR,P |= R, and show the consequent, which
is that safen+1(c, s, µR ∗̃ µ, J,R ∗ Q). To show this, we must establish three condi-
tions.
1. Suppose c = skip. By assumption, s, µR,P |= R, def(µR ∗̃ µ), and if c = skip
then s, µ,P |= Q, and so indeed s, µ,P |= Q. Hence s, µR ∗̃ µ,P |= R ∗ Q.
2. Let µ0, µJ , µF such that µ0 ∈ µJ ∗̃ (µF #̃ (µR ∗̃ µ)), and either locked(µ0) or
s, µJ ,P |= J . We must show that c, s, µ0 9  . But
µJ ∗̃ (µF #̃ (µR ∗̃ µ)) ⊆ µJ ∗̃ (µF #̃ (µR #̃ µ))
= µJ ∗̃ ((µF #̃ µR) #̃ µ)
and so by part 2 of assumption safen+1(c, s, µ, J,Q), instantiating with (an
arbitrary element of) µF #̃ µR, the result holds.
3. Let µ0, µJ , µF , µ1, c′, s′ such that µ0 ∈ µJ ∗̃ (µF #̃ (µR ∗̃ µ)), either locked(µ0)
or s, µJ ,P |= J , and c, s, µ0 → c′, s′, µ1. We must exhibit µ′J , µ′F , µ′ such that:
(a) µ1 ∈ µ′J ∗̃ (µ′F #̃ µ′),
(b) µ′F  µF ,
(c) either locked(µ1) or s′, µ′J ,P |= J , and
(d) safen(c′, s′, µ′, J, R ∗ Q).
We first instantiate part 3 of assumption safen+1(c, s, µ, J,Q) with µJ , (an
arbitrary element of) µF #̃ µR, µ1, c′ and s′ to obtain µ′J , µ′FR, µ′ such that:
(a) µ1 ∈ µ′J ∗̃ (µ′FR #̃ µ′)
(b) µ′FR  µF #̃ µR
(c) either locked(µ1) or s′, µ′J ,P |= J , and
(d) safen(c′, s′, µ′, J,Q).
By Lemma 7 and µ′FR  µF #̃ µR, there exists µ′F , µ′R such that µ′F  µF ,
µ′R  µR and µ′FR ∈ µ′F #̃ µ′R.
From µ′R  µR and s, µR,P |= R—as well as the assumption fv(R)∩mod(c) =
∅ and Lemmas 2 and 3—it follows that s′, µ′R,P |= R.
Next, we wish to apply the inductive hypothesis to safen(c′, s′, µ′, J,Q) to
show that safen(c′, s′, µ′R ∗̃ µ′, J, R ∗ Q). This follows from the observations
that fv(R) ∩ mod(c′) = ∅ (because mod(c′) ⊆ mod(c) from Lemma 2) and
def(µ′R ∗̃ µ′).




R ∗̃ µ′. By Conjecture 1
it follows that
µ1 ∈ µ′J ∗̃ (µ′F #̃ (µ′R ∗̃ µ′)).
We have already shown that µ′F  µF . Either locked(µ1) or s′, µ′J ,P |= J by
assumption. Finally safen(c′, s′, µ′R ∗̃ µ′, J, R ∗ Q) by the inductive hypothesis
above.
Proof of Theorem 1. By induction on the structure of an arbitrary derivation of
J ⊢ {P} c {Q}. The base cases follow from Lemmas 10, 12, 14, 16, 18, etc. The
inductive cases follow from Lemma 20, etc.
A.3 Summary of Notation
Mathematical function and predicate symbols are written in an italic face.
Programming language primitives are written in a sans-serif face. Atomic formulas are
written in a bold serif face.
The symbols that are typically used to denote particular objects are listed
in Figure A.1.
Object                              Symbols             Components
Identifiers (aka variables)         x, y, z, t, u, v
Processor identifiers               i, j, k
Memory addresses (aka locations)    ℓ
Lists (generally)                   l, m, n
Memory systems                      µ                   (h, b) or (h, B)
Generalized memory systems          ν                   (µ, γ) or (µ, Γ)
States                              σ                   (s, µ)
Assertions (as invariants)          I, J
Figure A.1: Symbols and the objects they denote
177
Bibliography
[1] S. V. Adve and K. Gharachorloo. Shared memory consistency models: A tuto-
rial. IEEE Computer, 29(12):66–76, 1996.
[2] J. Alglave, A. C. J. Fox, S. Ishtiaq, M. O. Myreen, S. Sarkar, P. Sewell, and
F. Z. Nardelli. The semantics of Power and ARM multiprocessor machine code.
In Petersen and Chakravarty [41], pages 13–24.
[3] J. Berdine, C. Calcagno, and P. W. O’Hearn. Smallfoot: Modular automatic
assertion checking with separation logic. In F. S. de Boer, M. M. Bonsangue,
S. Graf, and W. P. de Roever, editors, FMCO, volume 4111 of Lecture Notes
in Computer Science, pages 115–137. Springer, 2005.
[4] J. Berdine, B. Cook, and S. Ishtiaq. SLAyer: Memory Safety for Systems-level
Code. In G. Gopalakrishnan and S. Qadeer, editors, CAV, volume 6806 of
Lecture Notes in Computer Science. Springer, 2011.
[5] Y. Bertot and P. Castéran. Interactive Theorem Proving and Program Develop-
ment. Coq’Art: The Calculus of Inductive Constructions. Texts in Theoretical
Computer Science. Springer Verlag, 2004.
[6] R. Bornat. Proving pointer programs in hoare logic. In R. C. Backhouse
and J. N. Oliveira, editors, MPC, volume 1837 of Lecture Notes in Computer
Science, pages 102–126. Springer, 2000.
178
[7] R. Bornat, C. Calcagno, P. W. O’Hearn, and M. J. Parkinson. Permission
accounting in separation logic. In J. Palsberg and M. Abadi, editors, POPL,
pages 259–270. ACM, 2005.
[8] J. Boyland. Checking interference with fractional permissions. In R. Cousot,
editor, SAS, volume 2694 of Lecture Notes in Computer Science, pages 55–72.
Springer, 2003.
[9] S. Brookes. A semantics for concurrent separation logic. Theor. Comput. Sci.,
375(1-3):227–270, 2007.
[10] S. Brookes. A revisionist history of concurrent separation logic. Electronic Notes
in Theoretical Computer Science, 276(0):5 – 28, 2011. Twenty-seventh Con-
ference on the Mathematical Foundations of Programming Semantics (MFPS
XXVII).
[11] C. Calcagno, P. W. O’Hearn, and H. Yang. Local action and abstract separation
logic. In LICS, pages 366–378. IEEE Computer Society, 2007.
[12] K. M. Chandy and J. Misra. Parallel program design: a foundation. Addison-
Wesley Longman Publishing Co., Inc., Boston, MA, USA, 1988.
[13] N. Chong and S. Ishtiaq. Reasoning about the ARM weakly consistent memory
model. In E. D. Berger and B. Chen, editors, MSPC, pages 16–19. ACM, 2008.
[14] E. M. Clarke and E. A. Emerson. Design and synthesis of synchronization
skeletons using branching-time temporal logic. In Logic of Programs, Workshop,
pages 52–71, London, UK, 1982. Springer-Verlag.
[15] E. Cohen and B. Schirmer. From total store order to sequential consistency: A
practical reduction theorem. In M. Kaufmann and L. C. Paulson, editors, ITP,
volume 6172 of Lecture Notes in Computer Science, pages 403–418. Springer,
2010.
179
[16] R. Dockins, A. Hobor, and A. W. Appel. A fresh look at separation algebras
and share accounting. In Z. Hu, editor, APLAS, volume 5904 of Lecture Notes
in Computer Science, pages 161–177. Springer, 2009.
[17] M. Dodds, X. Feng, M. J. Parkinson, and V. Vafeiadis. Deny-guarantee reason-
ing. In G. Castagna, editor, ESOP, volume 5502 of Lecture Notes in Computer
Science, pages 363–377. Springer, 2009.
[18] R. Ferreira, X. Feng, and Z. Shao. Parameterized memory models and concur-
rent separation logic. In A. D. Gordon, editor, ESOP, volume 6012 of Lecture
Notes in Computer Science, pages 267–286. Springer, 2010.
[19] D. Galmiche and D. Larchey-Wendling. Expressivity properties of boolean BI
through relational models. In S. Arun-Kumar and N. Garg, editors, FSTTCS,
volume 4337 of Lecture Notes in Computer Science, pages 357–368. Springer,
2006.
[20] A. Gotsman, J. Berdine, and B. Cook. Precision and the conjunction rule in
concurrent separation logic. Electr. Notes Theor. Comput. Sci., 276:171–190,
2011.
[21] L. Higham, J. Kawash, and N. Verwaal. Weak memory consistency models
part I: Definitions and comparisons. Technical Report 98/612/03, Department
of Computer Science, The University of Calgary, 1998.
[22] C. A. R. Hoare. An axiomatic basis for computer programming. Commun.
ACM, 12(10):576–580, 1969.
[23] C. A. R. Hoare, A. Hussain, B. Möller, P. W. O’Hearn, R. L. Petersen, and
G. Struth. On locality and the exchange law for concurrent processes. In J.-
P. Katoen and B. König, editors, CONCUR, volume 6901 of Lecture Notes in
Computer Science, pages 250–264. Springer, 2011.
180
[24] C. A. R. Hoare, B. Möller, G. Struth, and I. Wehrman. Concurrent Kleene
algebra. In M. Bravetti and G. Zavattaro, editors, CONCUR, volume 5710 of
Lecture Notes in Computer Science, pages 399–414. Springer, 2009.
[25] C. A. R. Hoare, B. Möller, G. Struth, and I. Wehrman. Foundations of con-
current kleene algebra. In R. Berghammer, A. Jaoua, and B. Möller, editors,
RelMiCS, volume 5827 of Lecture Notes in Computer Science, pages 166–186.
Springer, 2009.
[26] C. A. R. Hoare, B. Möller, G. Struth, and I. Wehrman. Concurrent kleene
algebra and its foundations. J. Log. Algebr. Program., 80(6):266–296, 2011.
[27] T. Hoare and J. Wickerson. Unifying models of data flow. In M. Broy,
C. Leuxner, and T. Hoare, editors, Software and Systems Safety - Specification
and Verification, volume 30 of NATO Science for Peace and Security Series -
D: Information and Communication Security, pages 211–230. IOS Press, 2011.
[28] C. B. Jones. Development methods for computer programs including a notion
of interference. PhD thesis, Oxford University, 1981.
[29] M. Kaufmann and J. S. Moore. An industrial strength theorem prover for a
logic based on common lisp. IEEE Trans. Software Eng., 23(4):203–213, 1997.
[30] L. Lamport. How to make a multiprocessor computer that correctly executes
multiprocess programs. IEEE Trans. Computers, 28(9):690–691, 1979.
[31] X. Leroy. Formal verification of a realistic compiler. Communications of the
ACM, 52(7):107–115, 2009.
[32] J. S. Moore and G. Porter. An executable formal java virtual machine thread
model. In Java Virtual Machine Research and Technology Symposium, pages
91–104. USENIX, 2001.
181
[33] T. Nipkow, L. C. Paulson, and M. Wenzel. Isabelle/HOL - A Proof Assistant
for Higher-Order Logic, volume 2283 of Lecture Notes in Computer Science.
Springer, 2002.
[34] P. W. O’Hearn. Resources, concurrency, and local reasoning. Theor. Comput.
Sci., 375(1-3):271–307, 2007.
[35] P. W. O’Hearn and D. J. Pym. The logic of bunched implications. Bulletin of
Symbolic Logic, 5(2):215–244, 1999.
[36] P. W. O’Hearn, J. C. Reynolds, and H. Yang. Local reasoning about programs
that alter data structures. In L. Fribourg, editor, CSL, volume 2142 of Lecture
Notes in Computer Science, pages 1–19. Springer, 2001.
[37] S. Owens. Reasoning about the implementation of concurrency abstractions
on x86-tso. In T. D’Hondt, editor, ECOOP, volume 6183 of Lecture Notes in
Computer Science, pages 478–503. Springer, 2010.
[38] S. Owens, S. Sarkar, and P. Sewell. A better x86 memory model: x86-TSO. In
S. Berghofer, T. Nipkow, C. Urban, and M. Wenzel, editors, TPHOLs, volume
5674 of Lecture Notes in Computer Science, pages 391–407. Springer, 2009.
[39] S. S. Owicki and D. Gries. Verifying properties of parallel programs: An ax-
iomatic approach. Commun. ACM, 19(5):279–285, 1976.
[40] M. J. Parkinson. Local reasoning for Java. Technical report, University of
Cambridge, 2005. Technical Report UCAM-CL-TR-64.
[41] L. Petersen and M. M. T. Chakravarty, editors. Proceedings of the POPL
2009 Workshop on Declarative Aspects of Multicore Programming, DAMP 2009,
Savannah, GA, USA, January 20, 2009. ACM, 2009.
182
[42] G. L. Peterson. Myths about the mutual exclusion problem. Inf. Process. Lett.,
12(3):115–116, 1981.
[43] V. R. Pratt. On the composition of processes. In POPL, pages 213–223, 1982.
[44] V. R. Pratt. The pomset model of parallel processes: Unifying the temporal
and the spatial. In S. D. Brookes, A. W. Roscoe, and G. Winskel, editors,
Seminar on Concurrency, volume 197 of Lecture Notes in Computer Science,
pages 180–196. Springer, 1984.
[45] D. J. Pym, P. W. O’Hearn, and H. Yang. Possible worlds and resources: the
semantics of BI. Theor. Comput. Sci., 315(1):257–305, 2004.
[46] U. S. Reddy and J. C. Reynolds. Syntactic control of interference for separation
logic. In J. Field and M. Hicks, editors, POPL, pages 323–336. ACM, 2012.
[47] J. C. Reynolds. Separation logic: A logic for shared mutable data structures.
In LICS, pages 55–74. IEEE Computer Society, 2002.
[48] T. Ridge. Operational reasoning for concurrent caml programs and weak mem-
ory models. In K. Schneider and J. Brandt, editors, TPHOLs, volume 4732 of
Lecture Notes in Computer Science, pages 278–293. Springer, 2007.
[49] T. Ridge. A rely-guarantee proof system for x86-TSO. In G. T. Leavens, P. W.
O’Hearn, and S. K. Rajamani, editors, VSTTE, volume 6217 of Lecture Notes
in Computer Science, pages 55–70. Springer, 2010.
[50] S. Sarkar, P. Sewell, J. Alglave, L. Maranget, and D. Williams. Understanding
power multiprocessors. In M. W. Hall and D. A. Padua, editors, PLDI, pages
175–186. ACM, 2011.
183
[51] S. Sarkar, P. Sewell, F. Z. Nardelli, S. Owens, T. Ridge, T. Braibant, M. O.
Myreen, and J. Alglave. The semantics of x86-CC multiprocessor machine code.
In Z. Shao and B. C. Pierce, editors, POPL, pages 379–391. ACM, 2009.
[52] R. C. Steinke and G. J. Nutt. A unified theory of shared memory consistency.
J. ACM, 51(5):800–849, 2004.
[53] V. Vafeiadis. Modular fine-grained concurrency verification. Technical report,
University of Cambridge, 2008. Technical Report UCAM-CL-TR-726.
[54] V. Vafeiadis. Concurrent separation logic and operational semantics. In
J. Ouaknine, editor, 27th Conference on the Mathematical Foundations of Pro-
gramming Semantics, 2011.
[55] V. Vafeiadis and M. J. Parkinson. A marriage of rely/guarantee and separation
logic. In L. Caires and V. T. Vasconcelos, editors, CONCUR, volume 4703 of
Lecture Notes in Computer Science, pages 256–271. Springer, 2007.
[56] I. Wehrman. Semantics and syntax of a weak-memory concurrent separation
logic: Sequential fragment. Technical Report TR-12-08, University of Texas at
Austin, Department of Computer Science, 2010.
[57] I. Wehrman and J. Berdine. A proposal for weak-memory local reasoning.
In Syntax and Semantics of Low-Level Languages, 2011. Available at http:
//www.cs.utexas.edu/~iwehrman/pub/lola.pdf.
[58] I. Wehrman, C. A. R. Hoare, and P. W. O’Hearn. Graphical models of separa-
tion logic. Inf. Process. Lett., 109(17):1001–1004, 2009.
[59] G. Winskel. Event structures. In W. Brauer, W. Reisig, and G. Rozenberg,
editors, Advances in Petri Nets, volume 255 of Lecture Notes in Computer
Science, pages 325–392. Springer, 1986.
184
[60] H. Yang. Local Reasoning for Stateful Programs. PhD thesis, University of
Illinois at Urbana-Champaign, 2001. Technical Report UIUCDCS-R-2001-2227.
[61] H. Yang, O. Lee, J. Berdine, C. Calcagno, B. Cook, D. Distefano, and P. W.
O’Hearn. Scalable shape analysis for systems code. In A. Gupta and S. Malik,
editors, CAV, volume 5123 of Lecture Notes in Computer Science, pages 385–
398. Springer, 2008.
[62] H. Yang and P. W. O’Hearn. A semantic basis for local reasoning. In M. Nielsen
and U. Engberg, editors, FoSSaCS, volume 2303 of Lecture Notes in Computer
Science, pages 402–416. Springer, 2002.
185
Vita
Ian Wehrman was born in St. Louis, Missouri on June 7, 1980. He received his B.Sc.
in Computer Science from Webster University in 2004 and his M.Sc. in Computer
Science from Washington University in St. Louis in 2006. His master’s thesis de-
scribes a fully automated approach to solving the word problem from automated
deduction by using termination checkers of term-rewriting systems. Later in 2006
he entered the doctoral program in Computer Science at the University of Texas
at Austin where he studied formal methods, concurrency and the theory of
programming languages. He received a Ph.D. in 2012 for a dissertation that describes
a program logic for reasoning locally about the behavior of programs w.r.t. a weak,
x86-like memory model.
Permanent Address: ian@wehrman.org
This dissertation was typeset with LaTeX 2ε¹ by the author.
¹ LaTeX 2ε is an extension of LaTeX. LaTeX is a collection of macros for TeX. TeX is a trademark of
the American Mathematical Society. The macros used in formatting this dissertation were written
by Dinesh Das, Department of Computer Sciences, The University of Texas at Austin, and extended
by Bert Kay, James A. Bednar, and Ayman El-Khashab.
