Strong Logic for Weak Memory: Reasoning About Release-Acquire Consistency in Iris by Kaiser, Jan-Oliver et al.
Strong Logic for Weak Memory: Reasoning About
Release-Acquire Consistency in Iris∗†
Jan-Oliver Kaiser1, Hoang-Hai Dang2, Derek Dreyer3, Ori Lahav4,
and Viktor Vafeiadis5
1 MPI-SWS, Saarbrücken and Kaiserslautern, Germany‡
janno@mpi-sws.org
2 MPI-SWS, Saarbrücken and Kaiserslautern, Germany†
haidang@mpi-sws.org
3 MPI-SWS, Saarbrücken and Kaiserslautern, Germany†
dreyer@mpi-sws.org
4 MPI-SWS, Saarbrücken and Kaiserslautern, Germany†
orilahav@mpi-sws.org
5 MPI-SWS, Saarbrücken and Kaiserslautern, Germany†
viktor@mpi-sws.org
Abstract
The field of concurrent separation logics (CSLs) has recently undergone two exciting
developments: (1) the Iris framework for encoding and unifying advanced higher-order CSLs and
formalizing them in Coq, and (2) the adaptation of CSLs to account for weak memory models,
notably C11’s release-acquire (RA) consistency. Unfortunately, these developments are seemingly
incompatible, since Iris only applies to languages with an operational interleaving semantics,
while C11 is defined by a declarative (axiomatic) semantics. In this paper, we show that, on the
contrary, it is not only feasible but useful to marry these developments together. Our first step
is to provide a novel operational characterization of RA+NA, the fragment of C11 containing RA
accesses and “non-atomic” (normal data) accesses. Instantiating Iris with this semantics, we then
derive higher-order variants of two prominent RA+NA logics, GPS and RSL. Finally, we deploy
these derived logics in order to perform the first mechanical verifications (in Coq) of several
interesting case studies of RA+NA programming. In a nutshell, we provide the first foundationally
verified framework for proving programs correct under C11’s weak-memory semantics.
1998 ACM Subject Classification F.3.1 Specifying and Verifying and Reasoning about Programs;
F.3.2 Semantics of Programming Languages
Keywords and phrases Weak memory models, release-acquire, concurrency, separation logic
Digital Object Identifier 10.4230/LIPIcs.ECOOP.2017.17
Supplementary Material ECOOP Artifact Evaluation approved artifact available at
http://dx.doi.org/10.4230/DARTS.3.2.15
∗ An extended version of this paper with a technical appendix can be found at [1].
† This research was supported in part by a European Research Council (ERC) Consolidator Grant for the
project “RustBelt”, funded under the European Union’s Horizon 2020 Framework Programme (grant
agreement no. 683289).
‡ Saarland Informatics Campus.
© Jan-Oliver Kaiser, Hoang-Hai Dang, Derek Dreyer, Ori Lahav, and Viktor Vafeiadis;
licensed under Creative Commons License CC-BY
31st European Conference on Object-Oriented Programming (ECOOP 2017).
Editor: Peter Müller; Article No. 17; pp. 17:1–17:29
Leibniz International Proceedings in Informatics
Schloss Dagstuhl – Leibniz-Zentrum für Informatik, Dagstuhl Publishing, Germany
1 Introduction
Separation logic [25] is a refinement of Hoare logic with an intrinsic notion of ownership:
whereas an assertion in Hoare logic denotes a fact about the global machine state, an
assertion in separation logic denotes ownership of (and knowledge about) a piece of that
state, and the separating conjunction P ∗Q denotes that the assertions P and Q own disjoint
pieces of state. This ownership reading of assertions is useful for giving “local” (or “small-
footprint”) specifications for primitive commands, which are much easier to compose soundly
into specifications for larger programs. Moreover, as O’Hearn was the first to observe [24],
separation logic is also eminently suitable for concurrent programs. In particular, ownership
provides a direct and convenient way of explaining how synchronization mechanisms serve to
transfer control of shared state between threads. Although O’Hearn’s original concurrent
version of separation logic was geared toward reasoning about coarse-grained synchronization
via semaphores, the subsequent decade of research into concurrent separation logics (CSLs) has
shown that ownership and separation are just as useful for reasoning about more fine-grained
and low-level synchronization mechanisms, such as those employed in the implementations of
non-blocking data structures [35, 11, 7, 32, 30, 23, 6].
In this paper, we consider two of the most recent, boundary-pushing developments in
concurrent separation logics: (1) the Iris framework for encoding and unifying advanced
higher-order CSLs and formalizing them in Coq [14, 13, 16, 17], and (2) the adaptation of
CSLs to account for weak memory models, notably C11’s release-acquire (RA) consistency [34,
33, 8, 20]. Although these developments have thus far (for reasons explained below) appeared
to be incompatible, we show that in fact they are not! Quite the contrary: we demonstrate
that it is not only feasible but useful to marry them together, and in so doing, provide
the first foundationally verified framework for proving programs correct under C11’s weak
memory semantics.
1.1 Iris: A Unifying Framework for Concurrent Separation Logics
After O’Hearn’s original CSL, there came a steady stream of “new and improved” CSLs
appearing on at least a yearly basis. Unfortunately, as these new CSLs grew ever more
expressive, they also grew increasingly complex, baking in increasingly sophisticated proof
rules as primitive, with the relationships and compatibility between different proof rules (e.g.,
whether they could be soundly combined in one logic) remaining unclear.
The central source of complexity in most existing CSLs lies in their mechanisms for
controlling interference between threads accessing shared state, which have evolved from
Jones’s rely-guarantee [12] to the much more sophisticated and elaborate protocol mechanisms
appearing in logics like CaReSL [32], iCAP [30], and TaDA [6]. In an attempt to consolidate
the field, Jung et al. developed Iris [14, 13, 16], a logic with the express goal of showing that
even the fanciest of these interference-control mechanisms could be encoded via a combination
of two orthogonal “off-the-shelf” ingredients: (1) partial commutative monoids (PCMs) for
formalizing protocols on shared state, and (2) invariants for enforcing them. Invariants are an
old and ubiquitous concept in program verification, and PCMs have been used in a number of
prior logics to represent different kinds of ghost (or auxiliary) state, i.e., logical state that is
manipulated as part of the proof of a program but is not manipulated directly by the program
itself. Jung et al.’s observation was that in fact these two simple mechanisms are all you need:
using just PCMs and invariants, one can derive a variety of powerful forms of protocol-based
reasoning from prior CSLs within Iris, and by virtue of working in a unified framework,
these derived mechanisms are automatically compatible (different mechanisms can be used
soundly to verify different modules in a program). Iris also goes beyond most prior CSLs by
supporting higher-order quantification and impredicative invariants—invariants that can talk
recursively about the existence of (other) invariants—which are crucial for reasoning about
languages with higher-order state (e.g., Rust).
In the past, the complexity of CSLs was further exacerbated by the fact that (until very
recently [27]) they only supported manual and error-prone “pencil-and-paper” proofs. The
initial version of Iris [14] was no exception: the soundness of the core logic was verified in
Coq, but the Coq development provided no support for using the logic (either to encode other
logics or to verify programs interactively). However, in the past year, Krebbers et al. [17] have
developed IPM, an interactive proof mode geared toward using Iris as a proof development
environment for verifying concurrent programs within Coq. With IPM, Iris has begun the
transition to a more practically useful proof tool, and is already being deployed effectively
for larger verification efforts, e.g., in the RustBelt project [10].
1.2 Separation Logics for Release-Acquire Consistency
Iris is a “generic” logical framework in that it is parameterized over the programming language
in question—it merely requires, like the vast majority of prior work on concurrent program
verification, that the language have an operational, interleaving semantics, typically known
as a sequentially consistent (SC) semantics [21]. Under SC, threads take turns accessing the
shared memory, and updates to memory are immediately visible to all other threads.
SC semantics has the benefit that it is easy to define and manipulate formally, but it is
also woefully unrealistic: no serious language guarantees a fully SC semantics, because of
the significant performance costs associated with maintaining the fiction of a single, globally
consistent view of memory on modern multi-core architectures. One of the reasons for this
discrepancy between the theory and the reality of concurrent programming is that, until
relatively recently, formal accounts of more realistic—so-called weak (or relaxed)—memory
models for concurrent programming languages were not available. However, in the past
decade, great progress has been made on formalizing weak memory models, with a notable
high point being the formalization of the C/C++11 memory model (hereafter, C11) [4].
In response to this development, a number of verification researchers have followed suit
by building new verification tools—program logics, model checkers, testing frameworks,
etc.—that account for these more realistic memory models. In particular, Vafeiadis and
collaborators have thus far developed several different separation logics for C11, including
RSL [34] and GPS [33]. The main focus of these logics is on RA+NA, an important fragment
of C11 consisting of release-acquire (RA) accesses and non-atomic (NA) accesses. RA accesses
are useful because they support a common idiom of message-passing synchronization at low
cost compared to SC. NA accesses are intended for “normal” data accesses and are even more
efficiently implementable than RA accesses, with the proviso that they are not permitted to
race (i.e., races on non-atomics cause the entire program to have undefined behavior).
A major challenge that Vafeiadis et al. had to overcome was the fact that C11 is defined
using a radically different semantics than SC. Specifically, it is defined by a declarative
(or axiomatic) semantics, in which the allowed behaviors of a program are defined by
enumerating candidate executions (represented as “event graphs”) and then restricting
attention to the executions that obey various coherence axioms. In building separation logics
for C11, Vafeiadis et al. were thus not able to use the standard model of Hoare-style program
specifications from prior separation logics because notions like “the machine states before and
after executing a command C” do not have a clear meaning in C11’s declarative semantics.
To account for this radically different type of semantics, they were instead forced to
essentially throw away the “separation-logic textbook” and come up with an entirely new,
non-standard model of separation logic in terms of predicates on event graphs. While
groundbreaking, this approach has had several downsides. Firstly, certain essential mechanisms
of SC-based separation logic (such as ghost state), which are easy to justify in standard
models, became very difficult to justify in the new event-graph-based models of RA+NA
logics. Secondly, the complexity of these new models has made them challenging to adapt and
extend, and their non-standard nature has posed a major accessibility hurdle for researchers
accustomed to traditional models of separation logic. Last but not least, although the
soundness of these logics has been verified formally in Coq, there has thus far been no tool
support for using the logics to prove programs correct under RA+NA semantics.
1.3 Our Contributions
Given our above description, it may seem that the Iris framework’s reliance on interleaving
semantics renders it fundamentally inapplicable to reasoning about C11’s weak-memory
semantics. In this paper, we show that this is not the case at all—not only is it possible to
derive RA+NA logics like GPS and RSL within Iris, but there are several tangible benefits
to doing so. Deriving such logics within Iris:
• Lets us take advantage of the rich features of the Iris host logic (e.g., separation, invariants)
  when proving soundness of the derived logics, thereby significantly lifting the abstraction
  level at which those soundness proofs are carried out (compared to prior work).
• Allows us to support some very useful features in our derived logics by directly importing
  them from Iris. Such features include PCM-based ghost state, higher-order impredicative
  quantification, and Iris’s interactive proof mode in Coq. By virtue of being encoded in
  Iris, our derived logics inherit these features for free.
• Makes it easy to experiment with the derived logics and quickly develop new and useful
  extensions (e.g., single-writer protocols, see below).
• Makes it possible to soundly compose proofs from different derived logics, since they are
  all carried out in the uniform framework of Iris.
Our first step (Section 2) is to avoid the essential complicating factor—C11’s declarative
account of RA+NA—and instead work with an operational account. Building closely on
Lahav et al.’s recently proposed “strong release-acquire” (SRA) semantics [19], we define a
novel, operational, interleaving semantics for RA+NA. Our operational account of the RA
fragment of the language is very similar to Lahav et al.’s operational account of SRA in
that it models writing and reading of memory via the sending and receiving of timestamped
messages; the main difference is that the RA rule for assigning timestamps is slightly more
liberal. Our account of NA is new, though; it uses timestamps to model races on non-atomics
as stuck (unsafe) machine states. We have proven that, under the reasonable restriction that
programs do not mix RMWs (atomic updates) and non-atomic reads at the same location,
our semantics is equivalent to the standard declarative semantics of the RA+NA fragment of
C11.
Next, since our new semantics for RA+NA is an interleaving semantics, we can instantiate
Iris with it. In Section 3, we review the basic reasoning mechanisms of Iris, and show how to
use them to derive small-footprint proof rules for reasoning about RA+NA programs. We
apply these rules to verify a simple message-passing example of RA+NA programming in
Iris. However, as will become clear, reasoning directly with the Iris primitive mechanisms is
rather too low-level and a more abstract logic is needed.
In Section 4, we present iGPS, a higher-order variant of Turon et al.’s GPS logic [33],
which supports much higher-level reasoning about RA+NA programs. Unlike the original
GPS, iGPS is derived within Iris on top of the small-footprint proof rules from Section 3. It
also extends GPS with single-writer protocols, an extremely useful feature that simplifies
proofs of RA+NA programs in the common case where there are no write-write races on
atomic accesses.
In Section 5, we briefly describe some other contributions, including iRSL, a higher-order
variant of RSL [34] derived within Iris, and several case studies that we have verified using
iGPS and iRSL in Coq. These examples showcase one of the major advantages of working
in the Iris framework: our ability to verify weak-memory programs, foundationally and
mechanically, with the same degree of ease that was previously only possible for SC programs.
Finally, in Section 6, we conclude with related work.
2 Release-Acquire and Non-Atomics
In this section, we introduce our operational semantics for RA+NA, which we then use as
the machine for our working language λRN. Subsequent sections will show how to build a
logic for λRN using Iris.
C11 provides several memory access modes, each ensuring a different degree of consistency.
In this paper we focus on RA+NA, the fragment of C11 consisting only of release-acquire
(RA) and non-atomic (NA) accesses. Non-atomic accesses (which we denote with “[na]”) are
the default type of memory accesses, intended to be used for normal data rather than for
synchronization. Thus, C11 forbids any data races on non-atomic accesses, and programs
that may have such races are considered buggy (they have undefined semantics). In contrast,
RA accesses (which we denote with “[at]” for atomic) are permitted to race, but provide just
enough consistency guarantees to enable the well-known message passing (MP) idiom:
        x[na] := 0; y[na] := 0;
    x[na] := 37;    ||    repeat y[at];
    y[at] := 1      ||    x[na]
Initially, both variables x and y are set to 0. The first thread will initialize x to 37
(non-atomically) and then set the variable y to 1 (via a release write) as a way of sending a message
to the second thread that x has been properly initialized and is ready for consumption.
The second thread will repeatedly read y (via an acquire read) until it observes y ≠ 0, at
which point—thanks to release-acquire semantics—it will know that it can safely access
x. Summing up, the use of RA here ensures that the non-atomic write to x in the first
thread “happens before” the non-atomic read of x in the second thread—i.e., that they do
not race—and furthermore that the read of x will return 37.
The formal semantics of RA+NA is “declarative”, formulated as a set of constraints
on execution graphs. We will instead now present an alternative operational semantics of
RA+NA. Our operational semantics is not completely coherent with C11’s for programs that
mix atomic and non-atomic accesses to the same location (although the semantics of such
programs is already known to be problematic [3]—see Section 6 for further discussion of this
point). However, for the large class of programs that do not mix atomic updates (like CAS)
and non-atomic reads at the same location, our semantics is provably equivalent to C11’s
declarative semantics. This class of programs includes all C11 programs considered (and
verified) in this paper. (For formal details of the correspondence between our semantics and
C11’s, see our technical appendix [1].) We will first start with the pure RA fragment, and
then add a “race detector” for non-atomic accesses.
2.1 Release-Acquire
Our operational semantics for RA starts from the observation that in RA—in contrast
to a standard heap language—different threads have a different view of what the state is.
Accordingly, we need to keep track of past write events as they might still be relevant for
some subset of threads. Moreover, we need to keep writes to the same location in a total
order enforced in C11 under the name modification order (mo for short). Finally, we also
need to keep track of each thread’s “progress” in terms of which writes are visible to it, as
this determines what a thread may read and where its writes may end up.
For the mo order, the RA machine manages for each location a totally ordered set of
timestamps t ∈ Time ≜ ℕ. Each write of some value v to a location ℓ gets assigned a
timestamp (that is unique for ℓ), resulting in a write event ω ∈ Event ≜ Loc × Val × Time,
where v ∈ Val ≜ ℤ.¹ Using timestamps, the thread’s “progress” is represented by a view,
V ∈ View ≜ Loc ⇀fin Time (a finite partial map), which records the timestamp of the most
recent write event observed by the thread for every location. To enable communication between
threads, every write event is augmented with the writing thread’s view, yielding a message
m ∈ Msg ≜ Event × View. The machine state σ comprises a message pool (called memory) and
a view for every thread.
Definition 1 (Simplified Physical State). Let
σ ∈ Σ ≜ (𝒫(Msg) × (ThreadId ⇀fin View)) ⊎ {⊥uninit}
represent physical machine states, where ⊥uninit represents an error state. We write
M and T to denote the two components of a non-error state.
The λRN language’s reductions are factored into expression reductions, concerned with
the evaluation of the language’s expressions, and machine reductions, concerned with how the
execution of an expression affects the machine state. We will define the expression reductions
later when we formally define λRN. Here we focus on the machine reductions.
We define event labels ε ∈ E ≜ {⟨Read, ℓ, v⟩, ⟨Write, ℓ, v⟩, ⟨Update, ℓ, vo, vn⟩, ⟨Fork, ρ⟩},
representing reads, writes, atomic updates (RMWs), and forks (with ρ being the newly
created thread id), respectively. The reductions are defined by a set of local, per-thread
reductions $\xrightarrow{\varepsilon}_{\pi} \subseteq \Sigma \times \Sigma$ given in Figure 1, where π represents the current thread’s id.
A write (Thread-Write) picks an unused timestamp t for location ℓ that is greater than
the thread’s view of ℓ, updates the thread’s view to the new view V′ that includes t, and
adds a corresponding message to the memory. A read (Thread-Read) incorporates the view
V of the message that it reads into the thread’s own view.2 Note that the message being read
is required to have a timestamp that is not smaller than the thread’s view of the relevant
location. Updates (Thread-Update) combine reading and writing in one step. In addition,
updates must “pick” t+ 1 as a timestamp for the new message, where t is the timestamp of
the read message. This implies, in particular, that two different updates cannot read the
same message, and corresponds to C11’s atomicity condition, which requires every update
to read from its mo-immediate predecessor. Thread-Fork adds a new thread whose view
is copied from its parent. Finally, Thread-Uninitialized detects reads from uninitialized
locations, and moves to the error state ⊥uninit.
1 The full semantics also supports allocation, which induces an allocation value A. We do not mention it
here for the sake of simplicity.
2 Here, we use the join operator ⊔ on views: (V1 ⊔ V2)(ℓ) = max{V1(ℓ), V2(ℓ)} if ℓ ∈ dom(V1) ∩ dom(V2);
(V1 ⊔ V2)(ℓ) = Vi(ℓ) if ℓ ∈ dom(Vi) \ dom(Vj); and (V1 ⊔ V2)(ℓ) is undefined if ℓ ∉ dom(V1) ∪ dom(V2).
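As a small worked instance of the join on views defined in footnote 2 (the locations x, y, z and the timestamps below are made up purely for illustration):

```latex
\[
V_1 = [x \mapsto 2,\; y \mapsto 1], \qquad
V_2 = [y \mapsto 3,\; z \mapsto 0]
\quad\Longrightarrow\quad
V_1 \sqcup V_2 = [x \mapsto 2,\; y \mapsto \max\{1, 3\},\; z \mapsto 0]
               = [x \mapsto 2,\; y \mapsto 3,\; z \mapsto 0].
\]
```

Here x is defined only in V1 and z only in V2, so those entries are copied over, while y is defined in both and takes the larger timestamp.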
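\[
\textsc{Thread-Read}\quad
\frac{(\ell, v, t, V) \in M \qquad T(\pi)(\ell) \le t}
     {(M, T) \xrightarrow{\langle \mathrm{Read}, \ell, v \rangle}_{\pi} (M,\; T[\pi \mapsto T(\pi) \sqcup V])}
\]
\[
\textsc{Thread-Write}\quad
\frac{\neg\exists v', V.\ (\ell, v', t, V) \in M \qquad T(\pi)(\ell) < t \qquad V' = T(\pi)[\ell \mapsto t]}
     {(M, T) \xrightarrow{\langle \mathrm{Write}, \ell, v \rangle}_{\pi} (M \cup \{(\ell, v, t, V')\},\; T[\pi \mapsto V'])}
\]
\[
\textsc{Thread-Update}\quad
\frac{\begin{array}{c}
        (\ell, v_o, t, V) \in M \qquad T(\pi)(\ell) \le t \\
        \neg\exists v, V.\ (\ell, v, t+1, V) \in M \qquad V' = T(\pi)[\ell \mapsto t+1] \sqcup V
      \end{array}}
     {(M, T) \xrightarrow{\langle \mathrm{Update}, \ell, v_o, v_n \rangle}_{\pi} (M \cup \{(\ell, v_n, t+1, V')\},\; T[\pi \mapsto V'])}
\]
\[
\textsc{Thread-Fork}\quad
\frac{\rho \notin \mathrm{dom}(T)}
     {(M, T) \xrightarrow{\langle \mathrm{Fork}, \rho \rangle}_{\pi} (M,\; T[\rho \mapsto T(\pi)])}
\]
\[
\textsc{Thread-Uninitialized}\quad
\frac{\varepsilon \in \{\langle \mathrm{Read}, \ell, v \rangle,\ \langle \mathrm{Update}, \ell, v_o, v_n \rangle\} \qquad T(\pi)(\ell) = \bot}
     {(M, T) \xrightarrow{\varepsilon}_{\pi} \bot_{\mathrm{uninit}}}
\]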
Figure 1 Per-thread reductions for RA without NA.
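The rules of Figure 1 can be read as a small state machine. The following Python sketch is only an illustration of that reading, not the paper's Coq formalization: the class name `RAMachine`, the method names, and the string `"uninit"` standing in for ⊥uninit are all hypothetical, and the caller resolves the nondeterministic choice of timestamp by passing `t` explicitly. It also assumes that a location missing from a view counts as timestamp 0, so real timestamps start at 1.

```python
def join(v1, v2):
    """Pointwise maximum of two views (dicts: location -> timestamp)."""
    return {l: max(v1.get(l, 0), v2.get(l, 0)) for l in v1.keys() | v2.keys()}


class RAMachine:
    """M is a list of messages (loc, val, t, view); T maps thread ids to views."""

    def __init__(self):
        self.mem = []
        self.views = {0: {}}

    def _unused(self, loc, t):
        return all((l, ts) != (loc, t) for (l, _, ts, _) in self.mem)

    def write(self, tid, loc, val, t):
        # Thread-Write: t is unused for loc and above the thread's view of loc.
        # Assumption: a location missing from the view counts as timestamp 0.
        assert self._unused(loc, t) and self.views[tid].get(loc, 0) < t
        view = {**self.views[tid], loc: t}
        self.mem.append((loc, val, t, view))
        self.views[tid] = view

    def read(self, tid, loc, t):
        # Thread-Uninitialized: the thread has never observed a write to loc.
        if loc not in self.views[tid]:
            return "uninit"
        # Thread-Read: the chosen message must not be older than the thread's view.
        assert self.views[tid][loc] <= t
        (_, val, _, view), = [m for m in self.mem if m[0] == loc and m[2] == t]
        self.views[tid] = join(self.views[tid], view)
        return val

    def update(self, tid, loc, t, new_val):
        if loc not in self.views[tid]:
            return "uninit"
        # Thread-Update: read the message at t and write at exactly t + 1.
        assert self.views[tid][loc] <= t and self._unused(loc, t + 1)
        (_, _, _, view), = [m for m in self.mem if m[0] == loc and m[2] == t]
        new_view = join({**self.views[tid], loc: t + 1}, view)
        self.mem.append((loc, new_val, t + 1, new_view))
        self.views[tid] = new_view

    def fork(self, parent, child):
        # Thread-Fork: the new thread starts with a copy of its parent's view.
        assert child not in self.views
        self.views[child] = dict(self.views[parent])
```

In this sketch, a second update that tries to read the same message as an earlier update fails the `_unused(loc, t + 1)` check, which mirrors the atomicity condition discussed above.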
Functional correctness of MP
With the operational semantics of RA, we can now sketch why MP (assuming for now that
all its accesses are RA) is functionally correct, i.e., why the read of x by the second thread
will return 37 when the program terminates. The write of 37 to x is recorded at a view V37,
which is then included in the view V1 of the write of 1 to y by the first thread. When the
second thread reads 1 from y, its local view is updated to incorporate V1 (and thus also V37).
A read from x is now guaranteed to read from the message setting x to 37 or from a more
recent one, but no more recent one exists. Consequently, the return value will be 37.
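The argument above can be replayed on a toy model of the RA machine. The sketch below is an illustration under stated assumptions rather than the paper's formal proof: the function name `mp_readable_x` is hypothetical, all accesses are treated as RA (as in the paragraph above), each write simply picks the next free timestamp (one of the many choices the Thread-Write rule allows), and a location missing from a view counts as timestamp 0.

```python
def join(v1, v2):
    """Pointwise maximum of two views (dicts: location -> timestamp)."""
    return {l: max(v1.get(l, 0), v2.get(l, 0)) for l in v1.keys() | v2.keys()}


def mp_readable_x(reader_sees_flag):
    """Values of x the reader may read, depending on whether it has read y = 1."""
    mem, views = [], {0: {}}   # messages (loc, val, t, view); per-thread views

    def write(tid, loc, val):
        # One valid timestamp choice: just above everything used or seen for loc.
        used = [ts for (l, _, ts, _) in mem if l == loc]
        t = 1 + max(used + [views[tid].get(loc, 0)])
        views[tid] = {**views[tid], loc: t}
        mem.append((loc, val, t, views[tid]))

    def readable(tid, loc):
        # Thread-Read premise: the message is not older than the thread's view.
        return [m for m in mem if m[0] == loc and views[tid][loc] <= m[2]]

    write(0, "x", 0)
    write(0, "y", 0)               # initialization by thread 0
    views[1] = dict(views[0])      # fork the reader (thread 1)
    write(0, "x", 37)              # recorded at view V37 = {x: 2, y: 1}
    write(0, "y", 1)               # recorded at view V1 = {x: 2, y: 2}, which includes V37
    if reader_sees_flag:
        (_, _, _, v1), = [m for m in readable(1, "y") if m[1] == 1]
        views[1] = join(views[1], v1)   # reading y = 1 incorporates V1
    return sorted(m[1] for m in readable(1, "x"))
```

Under these assumptions, `mp_readable_x(True)` leaves only the message carrying 37 readable, whereas `mp_readable_x(False)` still allows the stale initial value 0.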
2.2 Non-Atomics
Formally, C11 defines a data race as two memory accesses to the same location—of which at
least one is a write and at least one is non-atomic—that are not ordered by “happens-before.”
A program that exhibits data races in some of its execution graphs is called racy, and its
behavior is considered undefined. We now show how to account for non-atomics and data
races in the context of our operational semantics.
Let us first extend the set of physical states by another error state ⊥race, whose intent is
captured by the following correspondence: a program is racy if and only if at least one of its
machine executions can reach ⊥race (stated and proved formally in our appendix [1]).
To detect data races during the execution of a program, we add an additional component
to the physical state: the non-atomic view N, which tracks the timestamp of the most recent
non-atomic write to every location. Then, we place the following restrictions on all atomic
and non-atomic operations (if violated, the program will enter the ⊥race state):
• To perform any access (atomic or non-atomic) to a location ℓ, a thread π must have
  observed the most recent non-atomic write to ℓ, i.e., N(ℓ) ≤ T(π)(ℓ).
• A thread π can only perform a non-atomic read from a location ℓ if it has observed the
  most recent (atomic or non-atomic) write to ℓ, i.e., ∄t, (ℓ, _, t, _) ∈ M. T(π)(ℓ) < t.
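\[
\textsc{Read}\quad
\frac{\begin{array}{c}
        (\ell, v, t, V) \in M \qquad T(\pi)(\ell) \le t \\
        \alpha = \mathrm{na} \Rightarrow \forall v', t', V'.\ (\ell, v', t', V') \in M \Rightarrow t' \le T(\pi)(\ell)
      \end{array}}
     {(M, T, N) \xrightarrow{\langle \mathrm{Read}_{\alpha}, \ell, v \rangle}_{\pi} (M,\; T[\pi \mapsto T(\pi) \sqcup V],\; N)}
\]
\[
\textsc{Write-at}\quad
\frac{\neg\exists v', V.\ (\ell, v', t, V) \in M \qquad N(\ell) \le T(\pi)(\ell) < t \qquad V' = T(\pi)[\ell \mapsto t]}
     {(M, T, N) \xrightarrow{\langle \mathrm{Write}_{\mathrm{at}}, \ell, v \rangle}_{\pi} (M \cup \{(\ell, v, t, V')\},\; T[\pi \mapsto V'],\; N)}
\]
\[
\textsc{Write-na}\quad
\frac{\begin{array}{c}
        \neg\exists v', V.\ (\ell, v', t, V) \in M \qquad N(\ell) \le T(\pi)(\ell) \qquad V' = T(\pi)[\ell \mapsto t] \\
        \forall v', t', V.\ (\ell, v', t', V) \in M \Rightarrow t' < t
      \end{array}}
     {(M, T, N) \xrightarrow{\langle \mathrm{Write}_{\mathrm{na}}, \ell, v \rangle}_{\pi} (M \cup \{(\ell, v, t, V')\},\; T[\pi \mapsto V'],\; N[\ell \mapsto t])}
\]
\[
\textsc{Update}\quad
\frac{\begin{array}{c}
        (\ell, v_o, t, V) \in M \qquad N(\ell) \le T(\pi)(\ell) \le t \\
        \neg\exists v, V.\ (\ell, v, t+1, V) \in M \qquad V' = T(\pi)[\ell \mapsto t+1] \sqcup V
      \end{array}}
     {(M, T, N) \xrightarrow{\langle \mathrm{Update}, \ell, v_o, v_n \rangle}_{\pi} (M \cup \{(\ell, v_n, t+1, V')\},\; T[\pi \mapsto V'],\; N)}
\]
\[
\textsc{Fork}\quad
\frac{\rho \notin \mathrm{dom}(T)}
     {(M, T, N) \xrightarrow{\langle \mathrm{Fork}, \rho \rangle}_{\pi} (M,\; T[\rho \mapsto T(\pi)],\; N)}
\]
\[
\textsc{Race-I}\quad
\frac{\varepsilon \in \{\langle \mathrm{Read}_{\alpha}, \ell, v \rangle,\ \langle \mathrm{Write}_{\alpha}, \ell, v \rangle,\ \langle \mathrm{Update}, \ell, v_o, v_n \rangle\} \qquad T(\pi)(\ell) < N(\ell)}
     {(M, T, N) \xrightarrow{\varepsilon}_{\pi} \bot_{\mathrm{race}}}
\]
\[
\textsc{Race-II}\quad
\frac{\exists v', t', V'.\ (\ell, v', t', V') \in M \land T(\pi)(\ell) < t'}
     {(M, T, N) \xrightarrow{\langle \mathrm{Read}_{\mathrm{na}}, \ell, v \rangle}_{\pi} \bot_{\mathrm{race}}}
\]
\[
\textsc{Uninitialized}\quad
\frac{\varepsilon \in \{\langle \mathrm{Read}_{\alpha}, \ell, v \rangle,\ \langle \mathrm{Update}, \ell, v_o, v_n \rangle\} \qquad T(\pi)(\ell) = \bot}
     {(M, T, N) \xrightarrow{\varepsilon}_{\pi} \bot_{\mathrm{uninit}}}
\]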
Figure 2 Per-thread reductions for the RA+NA machine.
In addition to these restrictions, we require non-atomic writes to pick timestamps greater
than all existing timestamps of messages of the same location. Intuitively, these restrictions
enforce that each non-atomic write to ℓ starts a new “era” in ℓ’s timestamps, after which any
attempt to access writes from a previous era (or to write with a timestamp from a previous
era) constitutes a race. Note that there is an asymmetry between non-atomic reads and
writes: non-atomic writes to ℓ are allowed even when the thread has not observed the most
recent write to ℓ, as it is only required to observe the most recent non-atomic write to ℓ.
One might fear that this fails to detect the case when a non-atomic write is racing with
a concurrent atomic write (and the atomic write happens first); but in this case the race
will be detected in a different execution where the non-atomic write happens first (and the
atomic write enters the ⊥race state), so the program will nevertheless be declared racy.
Revisiting MP, we note that it is safe to have non-atomic accesses to x: the write is
performed while the left thread is necessarily aware of the most recent non-atomic write to x
(the initialization); and the read is performed while the right thread is necessarily aware of
the most recent write to x, whose timestamp was incorporated into the right thread’s view
when it read y = 1.
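To make the race detector concrete, the following Python sketch replays MP on a toy version of the RA+NA machine. It is a simplified illustration, not the paper's formal semantics: the names `RANAMachine`, `mp`, `RACE`, and `UNINIT` are hypothetical, error states are returned as strings rather than modelled as machine states, every write picks the next timestamp above all existing ones for its location (required for non-atomic writes, and one permitted choice for atomic writes), and a location missing from a view or from N counts as timestamp 0.

```python
RACE, UNINIT = "race", "uninit"


def join(v1, v2):
    """Pointwise maximum of two views (dicts: location -> timestamp)."""
    return {l: max(v1.get(l, 0), v2.get(l, 0)) for l in v1.keys() | v2.keys()}


class RANAMachine:
    def __init__(self):
        self.mem = []          # M: messages (loc, val, t, view)
        self.views = {0: {}}   # T: thread id -> view
        self.na = {}           # N: loc -> timestamp of latest non-atomic write

    def _stamps(self, loc):
        return [t for (l, _, t, _) in self.mem if l == loc]

    def fork(self, parent, child):
        assert child not in self.views
        self.views[child] = dict(self.views[parent])

    def write(self, tid, loc, val, mode):
        seen = self.views[tid].get(loc, 0)
        if seen < self.na.get(loc, 0):
            return RACE                              # Race-I
        t = 1 + max(self._stamps(loc) + [seen])      # fresh, above all stamps of loc
        self.views[tid] = {**self.views[tid], loc: t}
        self.mem.append((loc, val, t, self.views[tid]))
        if mode == "na":
            self.na[loc] = t                         # Write-na starts a new era
        return None

    def read(self, tid, loc, mode, t=None):
        if loc not in self.views[tid]:
            return UNINIT                            # Uninitialized
        seen = self.views[tid][loc]
        if seen < self.na.get(loc, 0):
            return RACE                              # Race-I
        if mode == "na" and any(ts > seen for ts in self._stamps(loc)):
            return RACE                              # Race-II
        t = seen if t is None else t                 # default: the message already seen
        assert seen <= t
        (_, val, _, view), = [m for m in self.mem if m[0] == loc and m[2] == t]
        self.views[tid] = join(self.views[tid], view)
        return val


def mp(reader_waits):
    """Run MP; the reader either waits for y = 1 or reads x straight away."""
    m = RANAMachine()
    m.write(0, "x", 0, "na")
    m.write(0, "y", 0, "na")
    m.fork(0, 1)
    m.write(0, "x", 37, "na")
    m.write(0, "y", 1, "at")
    if reader_waits:
        assert m.read(1, "y", "at", t=2) == 1
    return m.read(1, "x", "na")
```

With these assumptions, `mp(True)` returns 37, while `mp(False)` returns `RACE`: a reader that skips the acquire read of y has not observed the latest non-atomic write to x.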
Figure 2 presents the full operational semantics. It is based on the following definition of
a physical state.
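\[
\begin{array}{rcl}
v \in \mathit{Val} &::=& () \mid z \in \mathbb{Z} \mid \ell \in \mathit{Loc} \mid \mathtt{fix}\ (f, x).\ e \\
\alpha \in \mathit{Access} &::=& \mathrm{at} \mid \mathrm{na} \\
e \in \mathit{Expr} &::=& v \mid e_1\ e_2 \mid \mathtt{if}\ z\ \mathtt{then}\ e_1\ \mathtt{else}\ e_2 \mid \mathtt{fork}\ e \mid \ell[\alpha] \mid \ell[\alpha] := v \mid \mathtt{cas}(\ell, v, v) \mid \ldots
\end{array}
\]
\[
\begin{array}{rcll}
\ell[\alpha] & \xrightarrow{\langle \mathrm{Read}_{\alpha}, \ell, v \rangle} & v,\ \mathsf{nil} & \\
\ell[\alpha] := v & \xrightarrow{\langle \mathrm{Write}_{\alpha}, \ell, v \rangle} & (),\ \mathsf{nil} & \\
\mathtt{cas}(\ell, v_o, v_n) & \xrightarrow{\langle \mathrm{Update}, \ell, v_o, v_n \rangle} & 1,\ \mathsf{nil} & \\
\mathtt{cas}(\ell, v_o, v_n) & \xrightarrow{\langle \mathrm{Read}_{\mathrm{at}}, \ell, v \rangle} & 0,\ \mathsf{nil} & \text{if } v \neq v_o \\
\mathtt{fork}\ e & \xrightarrow{\langle \mathrm{Fork}, \rho \rangle} & (),\ [e] & \\
(\mathtt{fix}\ (f, x).\ e)\ v & \longrightarrow & e[(\mathtt{fix}\ (f, x).\ e)/f][v/x],\ \mathsf{nil} & \\
\mathtt{if}\ z\ \mathtt{then}\ e_1\ \mathtt{else}\ e_2 & \longrightarrow & e_1,\ \mathsf{nil} & \text{if } z \neq 0 \\
\mathtt{if}\ z\ \mathtt{then}\ e_1\ \mathtt{else}\ e_2 & \longrightarrow & e_2,\ \mathsf{nil} & \text{if } z = 0 \\
& \ldots & &
\end{array}
\]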
Figure 3 Main λRN expressions and expression reductions.
Definition 2 (Physical State). Let
σ ∈ Σ ≜ (𝒫(Msg) × (ThreadId ⇀fin View) × View) ⊎ {⊥race, ⊥uninit}
represent physical machine states. We write M, T, and N to denote the components of a
non-error state. The initial physical state, denoted σinit, is given by (∅, [0 ↦ ∅], ∅).
2.3 The λRN language
λRN is a standard lambda calculus with recursive functions, forks, and references with atomic
and non-atomic accesses. The repeat construct that we have used in MP can be defined
in terms of recursive functions. The interesting part of the language and its expression
reductions is given in Figure 3. The expression reduction relation ($e \xrightarrow{\varepsilon} e', e_f$) has four
components: the original expression e, an (optional) machine memory event ε, the resulting
expression e′, and a list of newly created threads ef. Only the rule for fork e creates a new
thread (i.e., a singleton list [e]), while all other reductions produce an empty list (i.e., nil).
The per-thread language reductions ($\sigma; e \xrightarrow{\varepsilon}_{\pi} \sigma'; e', e_f$) are then the combination of the
expression reductions and the machine reductions, given by the Combined-* rules in Figure 4.
Non-stateful reductions (Combined-Pure) simply defer to the expression reductions, while
stateful reductions (Combined-Mem and Combined-Fork) use the event label ε and the thread
id π to tie the expression and machine reductions together correctly. These per-thread
reductions are then lifted in a straightforward manner to the full (threadpool) reductions.
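The text notes that repeat can be defined in terms of recursive functions but does not spell out the definition. One plausible encoding, offered here only as an assumption, is repeat e ≜ (fix (f, x). let v = e in if v then v else f ()) (): evaluate e, return its value if it is non-zero, and otherwise recurse. The Python sketch below mirrors that shape, with the hypothetical parameter `e` being a zero-argument function that stands in for re-evaluating the expression.

```python
def repeat(e):
    """Re-evaluate the thunk e until it yields a non-zero value, then return it."""
    v = e()
    # Mirrors "if z then e1 else e2": the first branch is taken when z is non-zero.
    return v if v != 0 else repeat(e)


# Example: a stand-in for "repeat y[at]" where the flag becomes 1 on the third poll.
polls = iter([0, 0, 1])
result = repeat(lambda: next(polls))
```

Unlike a lambda calculus with a fixpoint combinator, Python bounds recursion depth, so this sketch is only suitable for short polling sequences.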
3 Iris
Iris is a generic framework for constructing concurrent separation logics. One can instantiate
the framework with any language that has an operational interleaving semantics, and then
easily derive time-tested reasoning principles for one’s target logic, including various “protocol”
mechanisms for controlling interference. Figure 5 provides an excerpt of Iris syntax.
Iris supports the common connectives (False, True, ⇒, ∧, ∨, ∗, -∗, ∃, ∀, μ) and proof rules
standard in higher-order separation logics. Iris’s extended set of constructs includes physical
state ownership Phys(σ), ghost state ownership $\boxed{a}^{\gamma}$, the later ▷ and always □ modalities,
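\[
\textsc{Combined-Pure}\quad
\frac{e \longrightarrow e',\ \mathsf{nil}}
     {\sigma; e \longrightarrow_{\pi} \sigma; e',\ \mathsf{nil}}
\]
\[
\textsc{Combined-Mem}\quad
\frac{\varepsilon \neq \langle \mathrm{Fork}, \_ \rangle \qquad e \xrightarrow{\varepsilon} e',\ \mathsf{nil} \qquad \sigma \xrightarrow{\varepsilon}_{\pi} \sigma'}
     {\sigma; e \xrightarrow{\varepsilon}_{\pi} \sigma'; e',\ \mathsf{nil}}
\]
\[
\textsc{Combined-Fork}\quad
\frac{e \xrightarrow{\langle \mathrm{Fork}, \rho \rangle} e',\ [e_f] \qquad \sigma \xrightarrow{\langle \mathrm{Fork}, \rho \rangle}_{\pi} \sigma'}
     {\sigma; e \xrightarrow{\langle \mathrm{Fork}, \rho \rangle}_{\pi} \sigma'; e',\ [e_f]}
\]
\[
\textsc{Threadpool-Red-Pure}\quad
\frac{\sigma; \mathcal{TS}(\pi) \longrightarrow_{\pi} \sigma'; e',\ \mathsf{nil}}
     {\sigma; \mathcal{TS} \longrightarrow_{\pi} \sigma'; \mathcal{TS}[\pi \mapsto e']}
\]
\[
\textsc{Threadpool-Red-Mem}\quad
\frac{\sigma; \mathcal{TS}(\pi) \xrightarrow{\varepsilon}_{\pi} \sigma'; e',\ \mathsf{nil}}
     {\sigma; \mathcal{TS} \xrightarrow{\varepsilon}_{\pi} \sigma'; \mathcal{TS}[\pi \mapsto e']}
\]
\[
\textsc{Threadpool-Red-Fork}\quad
\frac{\sigma; \mathcal{TS}(\pi) \xrightarrow{\langle \mathrm{Fork}, \rho \rangle}_{\pi} \sigma'; e',\ [e_f]}
     {\sigma; \mathcal{TS} \xrightarrow{\langle \mathrm{Fork}, \rho \rangle}_{\pi} \sigma'; \mathcal{TS}[\pi \mapsto e'] \uplus [\rho \mapsto e_f]}
\]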
Figure 4 Threadpool reductions.
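\[
\begin{array}{rcl}
P &::=& \mathsf{False} \mid \mathsf{True} \mid P \Rightarrow Q \mid P \land Q \mid P \lor Q \mid P * Q \mid P \mathrel{-\!\!*} Q \mid \exists x.\, P \mid \forall x.\, P \mid \mu x.\, P \\
  &\mid& \mathrm{Phys}(\sigma) \mid \boxed{a}^{\gamma} \mid \triangleright P \mid \Box P \mid \boxed{P}^{\mathcal{N}} \mid P \mathrel{{}^{\mathcal{N}_1}\!\Rrightarrow^{\mathcal{N}_2}} Q \mid \{P\}\ e\ \{x.\ Q\}_{\mathcal{N}} \mid \ldots
\end{array}
\]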
Figure 5 An excerpt of Iris syntax.
invariants $\boxed{P}^{\mathcal{N}}$, view shifts $P \mathrel{{}^{\mathcal{N}_1}\!\Rrightarrow^{\mathcal{N}_2}} Q$, and Hoare triples $\{P\}\ e\ \{x.\ Q\}_{\mathcal{N}}$. We will first
explain these constructs via a running example, in which we use Iris to verify the MP example
in a simple, sequentially consistent language called λSC. This will not only illustrate how
one can derive within Iris a target logic for a language defined by an operational semantics,
but will also serve as a warm-up for our subsequent explanation of how we can instantiate
Iris to reason about weak memory.
Road map
The process of instantiating Iris to derive new logics follows a simple pattern, which is worth
articulating up front:
1. When we first instantiate Iris, the only primitive assertion we get about the state of the
program is the physical state ownership assertion Phys(σ), which asserts that σ is the
current global state of the machine. Together with this assertion we also get for free a
bunch of large-footprint specifications for the primitive commands of the language, based
directly on their operational semantics. For example, the primitive specification we get
for updating a location in λSC will be $\{\mathsf{Phys}(\sigma)\}\; \ell := v\; \{\mathsf{Phys}(\sigma[\ell \mapsto v])\}$.
2. Of course, one of the main points of separation logic is to be able to reason modularly
using local assertions about the machine state, such as the points-to assertion, $\ell \hookrightarrow v$,
and correspondingly give small-footprint specifications of the primitive commands, such
as $\{\ell \hookrightarrow w\}\; \ell := v\; \{\ell \hookrightarrow v\}$. In Iris, such local assertions are not baked into the logic, but
rather are encodable using ghost state ownership assertions, and the user of the logic has
a great deal of flexibility concerning how these assertions are defined. In the case of the
points-to assertion, $\ell \hookrightarrow v$, we will define this assertion so as to represent the knowledge
that $\ell$ currently points to $v$ and the rights to read and write $\ell$.
3. On its own, a local, user-defined ghost state assertion like the points-to assertion is merely
a representation of knowledge and rights. In order to give meaning to such a ghost state
assertion—i.e., to make sure it is in sync with the primitive physical state assertion—we
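\[
\begin{array}{ll}
(!\ell, \sigma) \to (v, \sigma) & \text{if } \sigma(\ell) = v \\
(\ell := v, \sigma) \to ((), \sigma[\ell \mapsto v]) & \text{if } \ell \in \mathrm{dom}(\sigma) \\
\dots &
\end{array}
\]
Figure 6 Main heap-related reductions of the λSC language.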
establish an invariant tying the assertions together. In the case of the points-to assertion,
this invariant will enforce that when a thread owns the ghost state assertion $\ell \hookrightarrow v$, its
“knowledge” that $\ell$ currently points to $v$ in the physical machine state is actually correct.
In short, ghost state assertions represent local knowledge and rights concerning the
machine state, and invariants enforce that ghost state assertions mean what they say they
mean.
3.1 Iris by Example
Our example programming language λSC is a standard lambda calculus with references. It
is basically the same as λRN, except that all accesses are sequentially consistent, and races
are permitted (they do not induce stuckness). The language’s physical state is a heap σ,
which is a finite map from allocated locations to values. The main heap-related reductions
of λSC are given in Figure 6. When we instantiate Iris with λSC’s operational semantics,
(as explained in the above road map) what we get automatically from Iris are the following
large-footprint Hoare triples concerning the physical state ownership assertion Phys(σ):
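\[
\textsc{Phys-Heap-Read}\quad \{\mathsf{Phys}(\sigma) * \sigma(\ell) = v\}\; !\ell\; \{z.\; z = v * \mathsf{Phys}(\sigma)\}
\]
\[
\textsc{Phys-Heap-Write}\quad \{\mathsf{Phys}(\sigma)\}\; \ell := v\; \{\mathsf{Phys}(\sigma[\ell \mapsto v])\}
\]
Note that $z$ in the first triple binds the return value of the expression $!\ell$. In the second triple,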
the expression returns the unit value, so we elide the binder.
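As an informal illustration (not part of the paper's Coq development), the two heap reductions of λSC and the large-footprint reading of the write triple can be sketched in Python. The dict-based heap and the names `step_read` and `step_write` are assumptions of this sketch.

```python
# Sketch of the lambda-SC heap reductions; a heap sigma is modelled as a dict.
# All names below are illustrative and do not come from the paper.

def step_read(loc, sigma):
    """(!l, sigma) -> (v, sigma) if sigma(l) = v; None models a stuck read."""
    if loc not in sigma:
        return None
    return sigma[loc], sigma

def step_write(loc, val, sigma):
    """(l := v, sigma) -> ((), sigma[l |-> v]) if l is in dom(sigma)."""
    if loc not in sigma:
        return None
    new_sigma = dict(sigma)      # functional update: the old heap is left untouched
    new_sigma[loc] = val
    return (), new_sigma

# Phys-Heap-Write read operationally: from Phys(sigma), the write ends in
# Phys(sigma[l |-> v]), i.e. the triple talks about the whole heap.
sigma = {"x": 0, "y": 0}
unit, sigma_after = step_write("x", 37, sigma)
value, _ = step_read("x", sigma_after)
```

The point of the sketch is the footprint: both functions take and return the entire heap, which is exactly what the small-footprint triples of the next subsection avoid.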
3.1.1 Encoding Separation Logic for λSC
We would now like to encode these small-footprint Hoare triples for λSC:
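\[
\textsc{Heap-Read}\quad \{\ell \hookrightarrow v\}\; !\ell\; \{z.\; z = v * \ell \hookrightarrow v\}
\]
\[
\textsc{Heap-Write}\quad \{\ell \hookrightarrow w\}\; \ell := v\; \{\ell \hookrightarrow v\}
\]
The first step is to define the points-to assertion, $\ell \hookrightarrow v$, using Iris’s ghost state.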
Ghost state and partial commutative monoids
Ghost state is non-physical state that is only used as part of a program verification but is not
itself part of the machine state. In Iris, ghost state is formalized using partial commutative
monoids (PCMs).³ The assertion $\boxed{a : M}^{\gamma}$ asserts the ownership of the ghost resource $a$ for
an instance $\gamma$ of the PCM $M$. Separating conjunction for ghost state assertions simply lifts
the PCM composition operation to the assertion level: $\boxed{a : M}^{\gamma} * \boxed{b : M}^{\gamma} \iff \boxed{a \cdot_M b : M}^{\gamma}$.
If two PCM fragments are not compatible (i.e. their composition is not defined), then it is
3 Actually, ghost state in Iris is based on the more general mechanism of “cameras” (aka step-indexed
resource algebras), which can support a more general form of higher-order ghost state [13].
not possible to own both of them at the same time, i.e., if $a \cdot b = \bot$ then $\boxed{a}^{\gamma} * \boxed{b}^{\gamma} \Rightarrow \mathsf{False}$.⁴
In order to maintain consistency of the logic, therefore, changes to ghost state are restricted
to frame-preserving updates, in which a PCM fragment a can only be updated to b if b
preserves compatibility with any other fragments in the environment (the frame):
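\[
\textsc{Ghost-Update}\quad
\frac{\forall a_f.\; a \cdot a_f \neq \bot \Rightarrow b \cdot a_f \neq \bot}
     {\boxed{a}^{\gamma} \Rrightarrow \boxed{b}^{\gamma}}
\]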
Ghost updates belong to the set of logical computations, or in Iris terminology, view shifts.
A view shift $P \Rrightarrow Q$ represents the capability of transforming a resource satisfying $P$ into a
resource satisfying Q without changing the physical state.
A PCM for heaps
As a step towards defining $\ell \hookrightarrow v$, let us now construct a PCM called Heap that has the
same basic structure as the physical heap, but allows splitting and recomposition. (We will
ultimately need a slightly more sophisticated PCM to define $\ell \hookrightarrow v$, but Heap is an important
part of the construction.) The elements of Heap are finite partial maps from locations to values, with the
empty heap as the unit element, and composition is defined as disjoint union (i.e.,
union if the heaps have disjoint domains, and undefined otherwise). This definition of composition implies
that the singleton heap $[\ell := v]$ does not combine with itself, so it can only be uniquely
owned, and it also represents the permission required to update $\ell$:
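\[
\textsc{Ghost-Heap-Exclusive}\quad \boxed{[\ell := v]}^{\gamma} * \boxed{[\ell := w]}^{\gamma} \vdash \mathsf{False}
\]
\[
\textsc{Ghost-Heap-Update}\quad \boxed{[\ell := w]}^{\gamma} \Rrightarrow \boxed{[\ell := v]}^{\gamma}
\]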
The singleton heap $[\ell := v]$ therefore has the desired properties for defining the local assertion
$\ell \hookrightarrow v$, but unfortunately it is still not quite enough: we also need some way to tie this ghost
state assertion to the underlying physical state of the program. Toward this end, we employ
Iris’s invariants.
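The Heap PCM and the side condition of Ghost-Update can be checked by brute force on a tiny universe. The following Python sketch is only an illustration under stated assumptions: ghost heaps are dicts, `None` stands for the undefined composition, and the function names are invented here.

```python
from itertools import product

def compose(h1, h2):
    """Disjoint union of two ghost heaps; None models the undefined result."""
    if h1 is None or h2 is None or set(h1) & set(h2):
        return None
    return {**h1, **h2}

def all_heaps(locs, vals):
    """Every partial map from locs to vals (a small, enumerable universe)."""
    heaps = []
    for choice in product([None] + list(vals), repeat=len(locs)):
        heaps.append({l: v for l, v in zip(locs, choice) if v is not None})
    return heaps

def frame_preserving(a, b, universe):
    """Ghost-Update premise: every frame compatible with a is compatible with b."""
    return all(compose(b, f) is not None
               for f in universe if compose(a, f) is not None)

universe = all_heaps(["l", "k"], [0, 1])

# Ghost-Heap-Exclusive: two singleton heaps for the same location never compose.
exclusive = compose({"l": 0}, {"l": 1}) is None
# Ghost-Heap-Update: changing the value stored at an owned location is allowed.
can_update = frame_preserving({"l": 0}, {"l": 1}, universe)
# Growing the domain is not frame-preserving: a frame may already own "k".
can_grow = frame_preserving({"l": 0}, {"l": 1, "k": 0}, universe)
```

Enumerating frames only works because the universe is finite; the rule itself quantifies over all frames.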
Invariants
Invariants in Iris can be thought of as assertions that hold of some shared resource at all
times, although the choice of which shared resource satisfies them is allowed to vary over time.
The Iris invariant assertion $\boxed{P}^{\mathcal{N}}$ stipulates that $P$ is an invariant. The resource that satisfies
it is shared with all threads, and thus any thread can access it freely in a single physical
step⁵: it can open the invariant and gain local ownership of the resource for the duration
of the operation, so long as it can close the invariant by relinquishing ownership of some
(potentially different) resource satisfying P at the end of the operation. For bookkeeping
purposes—specifically, to ensure that we do not unsoundly open the same invariant more
than once in a nested fashion—invariants in Iris are named, and the N in the above invariant
assertion is a namespace (set of names) from which the name of the invariant must come.
Invariants belong to the set of persistent assertions, denoted with the always modality
$\Box$. The assertion $\Box P$ establishes the knowledge that $P$ holds without any ownership, and
4 In the rest of the paper we also suppress the PCM $M$ in $\boxed{a : M}^{\gamma}$ when it can be inferred from context.
5 In Iris terminology, a resource in an invariant can be accessed within an atomic operation, which is an
operation that takes only a single physical step of execution. We do not use the term here to avoid
confusion with C11 atomics.
therefore holds forever after. Putting resources into an invariant is thus a common way to
share or transfer ownership through the use of freely distributable knowledge.
Meanwhile, the actions of opening and closing invariants belong to the set of logical
computations, or view shifts. To account for invariants, view shifts are extended with
namespaces as well: $P \mathrel{{}^{\mathcal{N}_1}{\Rrightarrow}^{\mathcal{N}_2}} Q$ asserts that, if the invariants named in $\mathcal{N}_1$ hold
before the view shift, then the invariants named in $\mathcal{N}_2$ hold after the view shift. Opening
and closing of invariants are then formalized as follows:
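\[
\textsc{Inv-Open}\quad \boxed{P}^{\mathcal{N}} \vdash \mathsf{True} \mathrel{{}^{\mathcal{N}}{\Rrightarrow}^{\emptyset}} P
\]
\[
\textsc{Inv-Close}\quad \boxed{P}^{\mathcal{N}} \vdash P \mathrel{{}^{\emptyset}{\Rrightarrow}^{\mathcal{N}}} \mathsf{True}
\]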
Inv-Open allows a thread to open the invariant $\boxed{P}^{\mathcal{N}}$ and gain ownership of $P$ but prevents
it from doing so more than once. Only after applying Inv-Close and re-establishing the
invariant will the thread be able to open it again.
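The bookkeeping role of namespaces in Inv-Open and Inv-Close can be pictured as a set of invariant names that are currently available for opening. The Python sketch below is a loose operational analogy, not Iris's actual definition; the class `InvariantMask` and its methods are hypothetical.

```python
class InvariantMask:
    """Tracks which invariant names may currently be opened."""

    def __init__(self, names):
        self.enabled = set(names)

    def open(self, name):
        # Inv-Open: the name leaves the mask, so it cannot be opened again.
        if name not in self.enabled:
            raise RuntimeError(f"invariant {name!r} is already open")
        self.enabled.remove(name)

    def close(self, name):
        # Inv-Close: re-establishing the invariant makes the name available again.
        if name in self.enabled:
            raise RuntimeError(f"invariant {name!r} is not open")
        self.enabled.add(name)

mask = InvariantMask({"N"})
mask.open("N")
try:
    mask.open("N")              # nested opening of the same invariant
    reentrant_ok = True
except RuntimeError:
    reentrant_ok = False
mask.close("N")
```

In the logic this check is static: a proof that tries to open the same invariant twice simply cannot be constructed, whereas the sketch detects it at run time.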
Note: The Inv-Open rule as stated here is only sound if P talks about ownership (of
physical or ghost state) and not about the existence of other invariants. In general, however,
Iris makes no such restriction; rather, it supports impredicative invariants, meaning that
P can be an arbitrary assertion. In order to avoid paradoxes caused by impredicative
circularities (like the one described in [16]), the fully general version of this rule in Iris
requires that $P$ be guarded by the step-indexed later modality ($\triangleright$). Fortunately, in most cases
the $\triangleright$’s can be stripped away automatically (the Iris proof mode in Coq provides support for
doing this) and do not play an interesting role in proofs. To focus the presentation of this
paper, we will therefore suppress further discussion of the $\triangleright$ modality.
Hoare triples in Iris are also annotated with invariant namespaces, since Hoare triples
combine both physical and logical computations. A Hoare triple $\{P\}\; e\; \{x.\, Q\}_{\mathcal{N}}$ with a
namespace N implies that if the invariants in N hold before the expression’s execution, then
they will be preserved between every step and also after its execution. Consequently, when
verifying any single physical step of computation, we are free to open the invariants in N so
long as we immediately close them. This reasoning is encapsulated in the following “atomic
rule of consequence”:
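\[
\textsc{AConsq}\quad
\frac{P \mathrel{{}^{\mathcal{N} \uplus \mathcal{N}'}{\Rrightarrow}^{\mathcal{N}}} P'
      \qquad \{P'\}\; e\; \{v.\, Q'\}_{\mathcal{N}}
      \qquad \forall v.\; Q' \mathrel{{}^{\mathcal{N}}{\Rrightarrow}^{\mathcal{N} \uplus \mathcal{N}'}} Q
      \qquad e \text{ takes 1 physical step}}
     {\{P\}\; e\; \{v.\, Q\}_{\mathcal{N} \uplus \mathcal{N}'}}
\]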
Since bookkeeping of namespaces is largely a tedious detail (and one which Coq will force us
to get right), we will for the remainder of the paper suppress namespaces from definitions and
proofs. We will always use disjoint namespaces to ensure correctness in opening invariants.
Linking physical and ghost state using invariants and the “authoritative” PCM
Now, returning to our example, the key idea is to use an invariant to tie the physical state
assertion together with local ghost state assertions, thereby giving them meaning. To achieve
this, we will employ an extremely useful construction called the authoritative PCM [14].
Given a base PCM M, the authoritative PCM Auth(M) has two types of elements:
authoritative $\bullet a$ and non-authoritative $\circ a$ (for $a \in M$). For any instance $\gamma$, $\boxed{\bullet a}^{\gamma}$ is exclusive
(i.e., $\boxed{\bullet a}^{\gamma} * \boxed{\bullet a}^{\gamma} \vdash \mathsf{False}$), and is the main point of reference for all the non-authoritative
fragments, in the sense that any ownable fragment $\boxed{\circ b}^{\gamma}$ must have $b$ included in $a$, that is
$\exists c.\; a = b \cdot c$. The PCM’s update therefore requires more: if one wants to update $b$ to $b'$, it
has not only to ensure b′ is compatible with c, but also has to update a to a′ = b′ · c. These
properties are summarized in the following two rules:
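\[
\textsc{Auth-Agree}\quad \boxed{\bullet a}^{\gamma} * \boxed{\circ b}^{\gamma} \vdash \exists c.\; a = b \cdot c
\]
\[
\textsc{Auth-Update}\quad
\frac{b' \cdot c \neq \bot}
     {\boxed{\bullet\, b \cdot c}^{\gamma} * \boxed{\circ b}^{\gamma} \Rrightarrow \boxed{\bullet\, b' \cdot c}^{\gamma} * \boxed{\circ b'}^{\gamma}}
\]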
With these two rules in hand, we can derive the following rules for operations on Auth(Heap):
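\[
\textsc{AGhost-Heap-Exclusive}\quad \boxed{\circ [\ell := v]}^{\gamma} * \boxed{\circ [\ell := w]}^{\gamma} \vdash \mathsf{False}
\]
\[
\textsc{AGhost-Heap-Agree}\quad \boxed{\bullet \sigma}^{\gamma} * \boxed{\circ [\ell := v]}^{\gamma} \vdash \sigma(\ell) = v
\]
\[
\textsc{AGhost-Heap-Update}\quad \boxed{\bullet \sigma}^{\gamma} * \boxed{\circ [\ell := w]}^{\gamma} \Rrightarrow \boxed{\bullet \sigma[\ell \mapsto v]}^{\gamma} * \boxed{\circ [\ell := v]}^{\gamma}
\]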
We are now ready to establish the invariant $\boxed{\exists \sigma.\; \mathsf{Phys}(\sigma) * \boxed{\bullet \sigma}^{\gamma}}$, which binds together
the physical state ownership and the authoritative ghost heap ownership. With the invariant
in place, AGhost-Heap-Agree implies that if a thread owns the singleton ghost heap $\circ [\ell := v]$
locally, then, in combination with the invariant, it is guaranteed that $\ell$ currently has value $v$
in the physical heap. AGhost-Heap-Exclusive and AGhost-Heap-Update ensure that only the
one thread that owns $\boxed{\circ [\ell := v]}^{\gamma}$ can make updates to the contents of $\ell$.
The points-to assertion is then defined as $\ell \hookrightarrow v \triangleq \boxed{\circ [\ell := v]}^{\gamma}$, and we can easily prove
the small-footprint triples from the beginning of this section by combining the above rules
for authoritative ghost heaps with those for opening and closing invariants.
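To make the interplay between the authoritative element, the fragments, and the invariant more concrete, here is a small Python model. It is a sketch under simplifying assumptions (fragments are always singletons, and the class `AuthHeap` with its methods is invented for illustration); it is not how Iris implements Auth(Heap).

```python
class AuthHeap:
    """Toy model: one authoritative heap plus singleton fragments handed to owners."""

    def __init__(self, sigma):
        self.auth = dict(sigma)     # plays the role of the authoritative element
        self.owned = set()          # locations whose fragment has been handed out

    def take_fragment(self, loc):
        # AGhost-Heap-Exclusive: a second fragment for the same location cannot exist.
        if loc in self.owned or loc not in self.auth:
            raise ValueError(f"no fragment available for {loc!r}")
        self.owned.add(loc)
        return (loc, self.auth[loc])

    def agree(self, fragment):
        # AGhost-Heap-Agree: the fragment's value matches the authoritative heap.
        loc, val = fragment
        return self.auth[loc] == val

    def update(self, fragment, new_val):
        # AGhost-Heap-Update: authoritative part and fragment change in lockstep.
        loc, _ = fragment
        self.auth[loc] = new_val
        return (loc, new_val)

phys = {"x": 0}
ghost = AuthHeap(phys)              # invariant to maintain: phys == ghost.auth
frag = ghost.take_fragment("x")     # stands for the assertion "x points to 0"
phys["x"] = 37                      # the physical write ...
frag = ghost.update(frag, 37)       # ... is matched by a ghost update before closing
```

After the two updates the invariant `phys == ghost.auth` holds again, which mirrors how the proof of Heap-Write opens the invariant, performs the write, and closes the invariant in a single physical step.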
3.1.2 Verifying MP in λSC
Using the small-footprint triples, we are ready to verify MP in λSC. We discuss the proof in
a bit of detail here, since later on we will show how to adapt this proof to verify MP under
weak-memory semantics.
The proof of MP is given in Figure 7. As a proof convention, we only mention persistent
assertions (like invariants) once and use them freely later, since they are always true after
being established. The proof works essentially as follows.
First of all, both threads want to operate on y simultaneously, so we need to put ownership
of $y$ into an invariant $\boxed{\mathsf{Inv}_y}$. This invariant says that $y$ is in one of two states: 0 or 1. We
can establish the invariant right after the initialization of y (the write of 0 to y), because y is
in state 0 at that moment. The first thread is responsible for setting y to state 1. When
the second thread observes that y is in state 1, it will expect to be able to gain ownership
of $x \hookrightarrow 37$. To achieve this, in state 1, $\mathsf{Inv}_y$ asserts the existence of another invariant $\mathsf{Inv}_x$
concerning x, and it is this latter invariant that we use to transfer ownership of location x
from the first thread to the second thread.
To understand Invx, it helps to have seen the film Raiders of the Lost Ark, or at least the
first few minutes of it, in which Indiana Jones (played by Harrison Ford) attempts to steal a
precious golden idol from an ancient Peruvian temple—without setting off booby traps—by
swapping it for a similarly weighted bag of sand. Unfortunately for him, the temple detects
his ruse and tries to kill him. But we can play a similar trick, and Iris will be perfectly happy!
In our case, the “golden idol” is $x \hookrightarrow 37$, which is transferred into the invariant $\mathsf{Inv}_x$ when
it is established by the first thread. The “bag of sand” is a “token” $\diamond$ (a uniquely ownable
piece of ghost state) that is given to the second thread at the beginning of its execution.
$\mathsf{Inv}_x$ simply asserts that it owns either the golden idol or the bag of sand. Thus, when the
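Invariants:
  $\mathsf{Inv}_y \triangleq y \hookrightarrow 0 \;\lor\; y \hookrightarrow 1 * \boxed{\mathsf{Inv}_x}$
  $\mathsf{Inv}_x \triangleq \diamond \;\lor\; x \hookrightarrow 37$
Thread 1 proof outline:
  $\{x \hookrightarrow 0 * \boxed{\mathsf{Inv}_y}\}$
  x := 37
  $\{x \hookrightarrow 37\}$
  $\{\boxed{\mathsf{Inv}_x}\}$
    (open $\mathsf{Inv}_y$)
    $\{y \hookrightarrow \_\}$
    y := 1
    $\{y \hookrightarrow 1 * \boxed{\mathsf{Inv}_x}\}$
  $\{\mathsf{True}\}$
Thread 2 proof outline:
  $\{\boxed{\mathsf{Inv}_y} * \diamond\}$
  repeat y;
  $\{\boxed{\mathsf{Inv}_x} * \diamond\}$
  $\{x \hookrightarrow 37\}$
  !x
  $\{z.\; z = 37 * x \hookrightarrow 37\}$
Figure 7 Verification of MP in λSC.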
second thread learns of the existence of $\mathsf{Inv}_x$, it can safely use the invariant opening and
closing rules to swap the bag of sand in its possession for the golden idol owned by $\mathsf{Inv}_x$, and
thereafter claim local ownership of $x \hookrightarrow 37$.
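The swap itself is simple enough to act out in a few lines of Python. This is a toy model only: resources are plain tuples, the class `SwapInvariant` is hypothetical, and uniqueness of the token is assumed rather than enforced.

```python
class SwapInvariant:
    """Holds exactly one resource: either the token or the points-to fact."""

    def __init__(self, contents):
        self.contents = contents

    def swap(self, offered):
        """Open the invariant, take what it holds, and close it with what was offered."""
        taken, self.contents = self.contents, offered
        return taken

inv_x = SwapInvariant(("points_to", "x", 37))   # established by the first thread
token = ("token",)                               # handed to the second thread up front
idol = inv_x.swap(token)                         # second thread trades sand for idol
```

Because the token is uniquely ownable in the logic, no other thread can perform a second swap to retrieve the points-to fact, so ownership of x is transferred exactly once.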
3.2 Instantiating Iris with λRN
We now consider an instantiation of Iris with our λRN language from Section 2.3. A key
difference between λRN and λSC is that the expression reductions of λSC do not depend on
which thread is executing the expression, since every thread has the same global view of the
memory, whereas the reductions of λRN depend on the current thread’s subjective view of the
memory. Thus, we need to be able to talk about thread ids in our logic as well. To this end,
we pair up expressions from λRN with thread ids, making them visible in our specifications.
Eventually, in Section 4, we will see how we can reason about λRN without talking explicitly
about thread ids.
3.2.1 Encoding Separation Logic for λRN
After instantiating Iris with λRN, as in the case of λSC, Iris provides us with large-footprint
specifications of the primitive commands for free, which concern the physical state assertion
and mirror the rules of λRN’s operational semantics. Recall that in λRN the physical state
is a tuple (M,T,N) of the message pool M, the current view map T, and the non-atomic
timestamp map N. As before, we aim to develop “local” assertions using ghost state, establish
an invariant that connects those local assertions to the physical state assertion, and then
derive small-footprint specifications of the primitive commands for use in modular verification.
But what kind of “local” assertions do we want?
For λSC, we had the points-to assertion $\ell \hookrightarrow v$, but in λRN we no longer have a simple
mapping from locations to values. Rather, associated with each location $\ell$ is a set of messages
corresponding to writes to $\ell$. We will represent that associated information as a history $h$,
consisting of a set of triples $(v, t, V)$, where $v$ is a value written to $\ell$, $t$ is the timestamp
at which that value was written, and $V$ is the view of the writing thread at the time the
write occurred. Note that $V$ incorporates the new timestamp, i.e., $V(\ell) = t$. To reflect
the per-location history, we define our first local assertion: the history ownership assertion
$\mathsf{Hist}(\ell, h)$ asserts ownership of $\ell$ and knowledge of its write history $h$.
Since the ability to read or write values in λRN depends on threads’ local views of memory,
we would also like to support an assertion of thread-view ownership, $\mathsf{Seen}(\pi, V)$, which asserts
ownership of the current view $V$ of the thread $\pi$ and is required to update $\pi$’s view. Since any
operation by a thread $\pi$ on a location $\ell$ relies on both $\pi$’s current view and $\ell$’s history, the
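\[
\begin{array}{rcl}
\mathsf{Hist}(\ell, h) & \triangleq & \boxed{\circ [\ell := h]}^{\gamma_1} \\
\mathsf{Seen}(\pi, V) & \triangleq & \boxed{\circ [\pi := V]}^{\gamma_2} \\
\mathsf{PSInv} & \triangleq & \exists \sigma.\; \exists H \in \mathit{Loc} \stackrel{\mathrm{fin}}{\rightharpoonup} \mathcal{P}(\mathit{Val} \times \mathit{Time} \times \mathit{View}).\; \mathsf{Phys}(\sigma) * \boxed{\bullet H}^{\gamma_1} * \boxed{\bullet\, \sigma.T}^{\gamma_2} * \mathsf{HInv}(\sigma, H) \\
\mathsf{HInv}(\sigma, H) & \triangleq & \mathrm{dom}(H) = \{m.\ell \mid m \in \sigma.M\} \;\land\; (\forall m \in \sigma.M.\; m.t = m.V(m.\ell)) \\
 & & \land\; \forall \ell \in \mathrm{dom}(H).\; H(\ell) = \{(m.v, m.t, m.V) \mid m \in \sigma.M \land m.\ell = \ell \land m.t \geq \sigma.N(\ell)\}
\end{array}
\]
Figure 8 Ghost state and invariant setup for λRN.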
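\[
\textsc{Base-NA-Read}\quad
\mathsf{PSInv} \vdash
\begin{array}{l}
\{\mathsf{Seen}(\pi, V) * \mathsf{Hist}(\ell, h) * \mathsf{init}(h, V) * \mathsf{na}(h, V)\} \\
\quad \ell[\mathsf{na}],\ \pi \\
\{v.\; \mathsf{Seen}(\pi, V) * \mathsf{Hist}(\ell, h) * \mathsf{na}(h, V) * \max(h).v = v\}
\end{array}
\]
\[
\textsc{Base-NA-Write}\quad
\mathsf{PSInv} \vdash
\begin{array}{l}
\{\mathsf{Seen}(\pi, V) * \mathsf{Hist}(\ell, h) * \mathsf{alloc}(h, V)\} \\
\quad \ell[\mathsf{na}] := v,\ \pi \\
\{\exists V' \sqsupseteq V,\, t,\, h' = \{(v, t, V')\}.\; \mathsf{Seen}(\pi, V') * \mathsf{Hist}(\ell, h') * \mathsf{na}(h', V') * \mathsf{init}(h', V')\}
\end{array}
\]
\[
\textsc{Base-AT-Read}\quad
\mathsf{PSInv} \vdash
\begin{array}{l}
\{\mathsf{Seen}(\pi, V) * \mathsf{Hist}(\ell, h) * \mathsf{init}(h, V)\} \\
\quad \ell[\mathsf{at}],\ \pi \\
\{v.\; \exists t_1,\, V_1,\, V' \sqsupseteq V \sqcup V_1.\; \mathsf{Seen}(\pi, V') * \mathsf{Hist}(\ell, h) * (v, t_1, V_1) \in h * t_1 \geq V(\ell)\}
\end{array}
\]
\[
\textsc{Base-AT-Write}\quad
\mathsf{PSInv} \vdash
\begin{array}{l}
\{\mathsf{Seen}(\pi, V) * \mathsf{Hist}(\ell, h) * \mathsf{alloc}(h, V)\} \\
\quad \ell[\mathsf{at}] := v,\ \pi \\
\{\exists V' \sqsupseteq V,\, t,\, h' = h \uplus \{(v, t, V')\}.\; \mathsf{Seen}(\pi, V') * \mathsf{Hist}(\ell, h') * \mathsf{init}(h', V')\}
\end{array}
\]
Figure 9 Selected Hoare triples of λRN base logic.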
pair of history and thread-view ownership assertions comprise the general ghost ownership
required for accessing the location.
To tie these local assertions to the global physical state, we use a construction very similar
to the one for λSC. The local assertions are defined as before by wrapping finite-map PCMs
with the authoritative PCM construction, and the invariant enforces that these ghost maps
are coherent with the physical state. The definitions of the invariant PSInv and the two local
assertions are given in Figure 8. The $\mathsf{Hist}(\ell, h)$ and $\mathsf{Seen}(\pi, V)$ assertions also have the exact
same rules for exclusiveness, agreement, and updating as the $\ell \hookrightarrow v$ assertion of λSC.
It is important to note that the history ownership assertion $\mathsf{Hist}(\ell, h)$ does not record the
full history of $\ell$, but only write events of the current era (see Section 2.2), i.e., only events
as recent as or more recent than the last non-atomic write to $\ell$. This is reflected in the last
condition of HInv: $m.t \geq \sigma.N(\ell)$. Defining $\mathsf{Hist}(\ell, h)$ this way helps to simplify the job of the
user of the logic in establishing the absence of races: by construction, it is impossible to even
attempt to read (racily) from write events before the current era. In order to preserve this
property of $\mathsf{Hist}(\ell, h)$, a non-atomic write $(v_{na}, t_{na}, V_{na})$ must completely remove all write
events currently in $h$ (which would be racy to access now), and replace them with a new history
$h'$ that contains only the newly created non-atomic write event: $h' = \{(v_{na}, t_{na}, V_{na})\}$.
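The last conjunct of HInv, and the way a non-atomic write resets the history, can be illustrated with a short Python sketch. The `Msg` record, the dict encodings of views and of the non-atomic timestamp map, and the default timestamp 0 for unknown locations are assumptions made for this example.

```python
from collections import namedtuple

# A message: location, value, timestamp, and the writer's view (dict loc -> timestamp).
Msg = namedtuple("Msg", ["loc", "val", "t", "view"])

def histories(messages, na_stamp):
    """Compute H as in HInv: per location, only the write events of the current era."""
    H = {}
    for m in messages:
        H.setdefault(m.loc, set())                  # dom(H) covers every written location
        if m.t >= na_stamp.get(m.loc, 0):           # keep events with m.t >= N(loc)
            H[m.loc].add((m.val, m.t, tuple(sorted(m.view.items()))))
    return H

def na_write(messages, na_stamp, loc, val, t, view):
    """A non-atomic write adds a message and moves the era of loc to timestamp t."""
    new_view = {**view, loc: t}                     # the message view satisfies V(loc) = t
    return messages + [Msg(loc, val, t, new_view)], {**na_stamp, loc: t}

pool = [Msg("x", 0, 1, {"x": 1}), Msg("x", 5, 2, {"x": 2})]
stamps = {"x": 1}
before = histories(pool, stamps)
pool2, stamps2 = na_write(pool, stamps, "x", 37, 3, {"x": 2})
after = histories(pool2, stamps2)
```

Before the non-atomic write the history of x contains both earlier events; afterwards it contains only the new event with timestamp 3, matching $h' = \{(v_{na}, t_{na}, V_{na})\}$. The sketch does not check that the chosen timestamp is fresh.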
With the physical state shared and its ghost counterpart splittable, we are ready to derive
the small-footprint Hoare triples, which constitute a base logic that is powerful enough to
verify programs in λRN. A selected set of these triples is given in Figure 9. The general
pattern of these rules is that a thread $\pi$ needs to own the history $h$ of a location $\ell$ and its
own thread view $V$ in order to operate on $\ell$. Additionally, $\pi$ needs to show certain relations
between $h$ and $V$ in order to guarantee the safety of its operations. These relations are
represented by the alloc, init, and na predicates. The $\mathsf{alloc}(h, V)$ predicate (resp. $\mathsf{init}(h, V)$)
asserts that $V$ has observed an event in $h$ which ensures that $\ell$ is allocated (resp. initialized).
The $\mathsf{na}(h, V)$ predicate asserts that $V$ has observed the mo-latest write event of $h$, which is
needed to do a non-atomic read. Note that all of these predicates require that $V$ has seen
some event from $h$ (i.e., an event from the current era of $\ell$), which, as discussed above, is a
prerequisite for non-raciness.
These base logic rules provide a very concise explanation of λRN’s operational semantics.
Base-AT-Read, for example, requires that the current view knows $\ell$ to be initialized and
ensures that the updated view $V'$ is at least the join of the local view $V$ and the view $V_1$
of the write event that the thread reads from. This event must be from $h$ and not mo-earlier
than the write event observed by the thread previously. Base-NA-Read is similar, except
that it requires that the current view must have observed the mo-latest write event to $\ell$, and
therefore reads from that write event, which we denote by $\max(h)$. Base-NA-Write preserves
the HInv invariant by proactively dropping from the history all the old write events, which
are mo-before this non-atomic write. Notice that, unlike Base-NA-Read, Base-NA-Write
does not require na(h, V ), as explained in Section 2.2.
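The reading rules can be paraphrased operationally. The Python sketch below enumerates the events an atomic read may return together with the resulting view, and checks the na precondition for a non-atomic read. It assumes views are dicts with a default timestamp of 0 and picks exactly the join as the new view, whereas the rule only guarantees "at least the join"; all names are illustrative.

```python
def join(v1, v2):
    """Pointwise maximum of two views (dicts from locations to timestamps)."""
    return {l: max(v1.get(l, 0), v2.get(l, 0)) for l in set(v1) | set(v2)}

def atomic_read_choices(history, view, loc):
    """Events a thread with this view may read, each paired with the updated view."""
    choices = []
    for (val, t1, v1) in history:
        if t1 >= view.get(loc, 0):          # t1 >= V(loc): not mo-earlier than observed
            choices.append((val, join(view, v1)))
    return choices

def na_read(history, view, loc):
    """Non-atomic read: allowed only if the view has seen the mo-latest event."""
    val, t_max, _ = max(history, key=lambda e: e[1])
    if view.get(loc, 0) < t_max:            # na(h, V) does not hold
        return None
    return val

# History of the flag y in MP: initialised to 0, then set to 1 by a thread that
# had already written x at timestamp 1.
hist_y = [(0, 1, {"y": 1}), (1, 2, {"x": 1, "y": 2})]
choices = atomic_read_choices(hist_y, {"y": 1}, "y")
```

A thread that has only seen the initialisation of y may read either 0 or 1; reading 1 also brings the write to x into its view, which is the release-acquire synchronisation that MP relies on.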
3.2.2 MP in λRN
We show that the base logic is enough to verify MP in λRN. The invariants and the proof,
given in Figure 10, follow closely those used for MP in λSC. The singleton heap ownership
in λSC is replaced with the history ownership, and extra conditions on view extension are
added to reflect the view updates inherent in λRN. More specifically, in $\mathsf{Inv}_y$, we enforce that
any write of a non-zero value⁶ to $y$ be at a view $V_1$ that extends $V_{37}$, which is the view at
which the write of 37 to $x$ is made by the first thread. Consequently, when the second thread
observes $V_1$ (by reading $y$ to be non-zero), it must have also observed $V_{37}$, and thus can read
$x = 37$, using the Indiana Jones invariant $\mathsf{Inv}_x$. The extra conditions on $V_0$ ensure that $y$ is
initialized with 0 at $V_0$, so that the second thread (having observed $V_0$) can safely read $y$.
We have shown that the base logic is powerful enough to verify MP in λRN, and in
principle it is powerful enough to verify many other realistic weak memory programs that
are expressible in λRN as well. However, it is also clear that the base logic is not abstract
enough: one has to burden oneself with keeping track of the low-level details of changes to
locations’ histories and threads’ views. What we really want is a way of abstracting away
from those low-level details and finding simple high-level reasoning principles for λRN, the
type of reasoning principles supported by RA+NA logics like GPS and RSL. We will now
see how such high-level principles can in fact be derived on top of our low-level base logic.
6 In the MP example this value is always 1.
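Invariants:
  $\mathsf{Inv}_y(V_0) \triangleq \exists h.\; \mathsf{Hist}(y, h) * (0, \_, V_0) \in h * \forall V_1, v_1 \neq 0.\; (v_1, \_, V_1) \in h \Rightarrow \exists V_{37} \sqsubseteq V_1.\; \boxed{\mathsf{Inv}_x(V_{37})}$
  $\mathsf{Inv}_x(V_{37}) \triangleq \diamond \;\lor\; \mathsf{Hist}(x, [(37, \_, V_{37})])$
Thread 1 proof outline:
  $\{\mathsf{Seen}(\pi, V_0) * \mathsf{Hist}(x, [(0, \_, V_x)]) * V_x \sqsubseteq V_0 * \boxed{\mathsf{Inv}_y(V_0)}\}$
  x[na] := 37
  $\{\exists V_{37} \sqsupseteq V_0.\; \mathsf{Seen}(\pi, V_{37}) * \mathsf{Hist}(x, [(37, \_, V_{37})])\}$
  $\{\mathsf{Seen}(\pi, V_{37}) * \boxed{\mathsf{Inv}_x(V_{37})}\}$
    (open $\mathsf{Inv}_y$)
    $\{\mathsf{Seen}(\pi, V_{37}) * \exists h.\; \mathsf{Hist}(y, h) * \dots\}$
    y[at] := 1
    $\{\exists V_1 \sqsupseteq V_{37}.\; \mathsf{Seen}(\pi, V_1) * \mathsf{Hist}(y, h \uplus [(1, \_, V_1)]) * \boxed{\mathsf{Inv}_x(V_{37})}\}$
  $\{\mathsf{Seen}(\pi, V_1) * \boxed{\mathsf{Inv}_y(V_0)}\}$
Thread 2 proof outline:
  $\{\mathsf{Seen}(\pi, V_0) * \boxed{\mathsf{Inv}_y(V_0)} * \diamond\}$
  repeat y[at];
  $\{\exists V_1, V_{37}, V_2.\; V_2 \sqsupseteq V_1 \sqsupseteq V_{37} * \mathsf{Seen}(\pi, V_2) * \boxed{\mathsf{Inv}_x(V_{37})} * \diamond\}$
  $\{\mathsf{Seen}(\pi, V_2) * V_{37} \sqsubseteq V_2 * \mathsf{Hist}(x, [(37, \_, V_{37})])\}$
  x[na]
  $\{z.\; \mathsf{Seen}(\pi, V_2) * z = 37 * \mathsf{Hist}(x, [(37, \_, V_{37})])\}$
Figure 10 Verification of MP in λRN base logic.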
4 iGPS
Vafeiadis et al. have introduced several logics for C11, and two in particular that were focused
on RA+NA: GPS [33] and RSL [34]. The key difference between these logics and the RA+NA
base logic presented in Section 3.2.1 is that, in GPS and RSL, the user does not reason
explicitly about views—instead, the assertions of these logics are implicitly predicated over
the view of the thread asserting them. This helps to hide much tedious reasoning about
views, and leads naturally to a model of assertions as predicates over views (Section 4.3).
We have encoded iGPS and iRSL, variants of both GPS and RSL, in Iris, and we will
focus here on iGPS, since it is the more sophisticated of the two logics. (We briefly describe
iRSL in Section 5.) GPS introduced several useful abstractions that were not present in RSL:
PCM-based ghost state, single-location protocols, and escrows. In encoding GPS in Iris, as
noted in the introduction, we get PCM-based ghost state completely for free, just by working
in Iris. Below, we will describe the other features, and how we account for them in Iris.
Note that our goal with iGPS is not to slavishly imitate all details of GPS, but to provide
proof rules very similar to GPS’s that are both strong enough to support all the examples
from the GPS papers [33, 31] and significantly easier to prove sound within Iris. To this
end, we slightly restrict select rules, but only in a way that does not impact their utility on
known case studies. We discuss the differences from the original rules in detail in Section 6.
Furthermore, in Section 4.2, we show how iGPS supports an additional feature—single-writer
protocols—that was not supported by the original GPS and that significantly simplifies
proofs.
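\[
\textsc{iGPS-NA-Read}\quad \{\ell \hookrightarrow v\}\; \ell[\mathsf{na}]\; \{w.\; w = v * \ell \hookrightarrow v\}
\qquad
\textsc{iGPS-NA-Write}\quad \{\ell \hookrightarrow \_\}\; \ell[\mathsf{na}] := v\; \{\ell \hookrightarrow v\}
\]
\[
\textsc{iGPS-NA-Exclusive}\quad \ell \hookrightarrow v * \ell \hookrightarrow w \Rightarrow \bot
\]
\[
\textsc{iGPS-Read}\quad
\frac{\forall s' \sqsupseteq s,\, v.\; P * \tau_{\mathsf{read}}(s', v) \Rrightarrow \Box Q}
     {\{\boxed{\ell : s \;\, \tau} * P\}\; \ell[\mathsf{at}]\; \{v.\; \exists s' \sqsupseteq s.\; \boxed{\ell : s' \;\, \tau} * Q\}}
\]
\[
\textsc{iGPS-Write}\quad
\frac{(\forall s''.\; s' \sqsupseteq s'') \qquad P \Rrightarrow \tau_{\mathsf{full}}(s', v) * Q}
     {\{\boxed{\ell : s \;\, \tau} * P\}\; \ell[\mathsf{at}] := v\; \{\boxed{\ell : s' \;\, \tau} * Q\}}
\]
\[
\textsc{iGPS-CAS}\quad
\frac{\forall s' \sqsupseteq s.\; P * \tau_{\mathsf{full}}(s', v_o) \Rrightarrow \exists s'' \sqsupseteq s'.\; \tau_{\mathsf{full}}(s'', v_n) * Q
      \qquad \forall s' \sqsupseteq s.\; P * \tau_{\mathsf{read}}(s', v_o) \Rrightarrow \Box R}
     {\{\boxed{\ell : s \;\, \tau} * P\}\; \mathsf{cas}(\ell, v_o, v_n)\; \{v.\; \exists s'' \sqsupseteq s.\; \boxed{\ell : s'' \;\, \tau} * ((v = 1 \land Q) \lor (v = 0 \land R))\}}
\]
\[
\textsc{iGPS-Persistent}\quad \boxed{\ell : s \;\, \tau} \Rightarrow \Box\, \boxed{\ell : s \;\, \tau}
\qquad
\textsc{iGPS-Agree}\quad \boxed{\ell : s_1 \;\, \tau} * \boxed{\ell : s_2 \;\, \tau} \Rrightarrow s_1 \sqsubseteq s_2 \lor s_2 \sqsubseteq s_1
\]
\[
\textsc{iGPS-Escrow-Intro}\quad \triangleright Q \Rrightarrow [P \rightsquigarrow Q]
\qquad
\textsc{iGPS-Escrow-Elim}\quad \triangleright P \land [P \rightsquigarrow Q] \Rrightarrow \triangleright Q
\]
\[
\textsc{iGPS-Escrow-Persistent}\quad [P \rightsquigarrow Q] \Rightarrow \Box\, [P \rightsquigarrow Q]
\]
Figure 11 iGPS proof rules for non-atomics, protocols, and escrows.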
4.1 Key Features of GPS
In this section, we explain the key features of GPS and how they are formalized in iGPS
(besides PCM-based ghost state, which is directly imported into iGPS from Iris). A selected
set of iGPS proof rules is given in Figure 11.
Non-atomics
Since non-atomic locations may not be raced on, GPS (and iGPS) reason about them much
in the same way that locations are reasoned about in standard separation logic: using the
points-to assertion, $\ell \hookrightarrow v$. Note that the proof rules for reading (iGPS-NA-Read) and writing
(iGPS-NA-Write) and the exclusivity property (iGPS-NA-Exclusive) are equivalent to those
from the logic for λSC (Section 3.1.1). Additionally, GPS (and iGPS) support fractional
ownership of non-atomics to allow such locations to be read (but not written) by multiple
threads at once. We omit the rules of fractional ownership for brevity.
Protocols
To reason about RA atomics, we need a mechanism for controlling interference on such accesses.
Toward this end, CSLs for SC have supported a variety of protocol mechanisms, which control
how shared state may evolve over time, and several of the more recent logics [32, 30, 23]
employ state transition systems (STSs) to formalize such protocols. Crucially, protocols
enforce irreversibility: the state of an STS protocol can only make forward progress over the
course of a proof. For example, in Section 3.1.2, a protocol could enforce that the variable
y could only progress from 0 to 1 but not back again. (We did not need to enforce that
property to verify the MP example, but it is useful to be able to in general.) In Iris, protocols
are encoded using a combination of invariants and ghost state.
Under weak memory, invariants and protocols are unsound in general because they require
a single coherent history of updates to all locations. GPS showed how to partially restore
protocol reasoning for weak memory with single-location protocols: protocols which restrict
the evolution of a single shared location. Intuitively, single-location protocols are sound due
to the per-location coherence property of C11 (often called “SC per location”): the writes to
any single location are totally ordered (by mo). In particular, they maintain the invariant
that the order of protocol states associated with writes is consistent with their timestamp
(mo) order. If write event $x$ to location $\ell$ with associated protocol state $s_x$ is mo-before write
event $y$ (to the same location) with protocol state $s_y$, then $s_x$ will be before $s_y$ in protocol
order. Thus, once a thread has observed that the protocol on $\ell$ has reached state $s_x$, it can
from that point on only observe the protocol on $\ell$ to be in states that are accessible from $s_x$.
This fulfills the expectation that protocol transitions are irreversible.
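The invariant described above, that protocol states are ordered consistently with timestamps, is easy to state as an executable check. In this Python sketch a location's writes are (timestamp, state) pairs and the transition relation is passed in as a function; the encoding and the names are assumptions made for illustration.

```python
def consistent_with_mo(writes, leq):
    """Check that protocol states grow along timestamp (mo) order.

    writes: list of (timestamp, state) pairs for a single location.
    leq(s1, s2): the protocol's reflexive, transitive transition relation.
    """
    ordered = sorted(writes, key=lambda w: w[0])
    # Comparing neighbours suffices because leq is transitive.
    return all(leq(s1, s2) for (_, s1), (_, s2) in zip(ordered, ordered[1:]))

def observable_states(writes, seen_timestamp):
    """States a thread may still observe after seeing the write at seen_timestamp."""
    return [s for (t, s) in writes if t >= seen_timestamp]

# A protocol for the flag y of MP: state 0 may advance to state 1, never back.
leq = lambda s1, s2: s1 <= s2
writes_y = [(1, 0), (2, 1)]
good = consistent_with_mo(writes_y, leq)
bad = consistent_with_mo([(1, 1), (2, 0)], leq)
later = observable_states(writes_y, 2)
```

Once a thread has seen the write with timestamp 2, the only state left to observe is 1, which is reachable from itself; a history that went from 1 back to 0 is rejected by the check.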
GPS protocols come equipped with an interpretation predicate which specifies the resources
held by the protocol depending on the protocol’s state and the location’s value. The primitive
operations on an atomic location serve to transfer resources in and out of its protocol:
• Writes may transfer resources into the protocol, but may not transfer anything out.
• Reads may not transfer any resources into the protocol, but they may transfer “knowledge”
  (i.e., duplicable resources) out of it. They are restricted to transferring out duplicable
  resources because there may be many reads of the same write event.
• Updates (RMWs), by virtue of the physical synchronization they provide, may transfer
  resources both in and out of the protocol.
In iGPS, we represent protocols in a slightly different way, using two interpretations:
a “full” interpretation, and a duplicable “read” interpretation that is implied by the full
interpretation. The intuition is that the read interpretation is used for reads (since they may
only transfer duplicable resources out of the protocol) and the full interpretation is used for
the other operations.
We will now present the formal definition of protocols and the corresponding proof rules.
Definition 3 (Protocols). A protocol $\tau$ comprises a non-empty state set $S$, a reflexive, transitive
transition relation $\sqsubseteq\; \subseteq S \times S$, and two interpretation predicates $\tau_m(\cdot, \cdot) \in S \times \mathit{Val} \to \mathit{Prop}$
with $m \in \{\mathsf{read}, \mathsf{full}\}$ representing read and full interpretations, respectively. The interpretation
predicates have to fulfill the following two laws:
\[
\tau_{\mathsf{full}}(s, v) \Rrightarrow \tau_{\mathsf{full}}(s, v) * \tau_{\mathsf{read}}(s, v)
\qquad
\tau_{\mathsf{read}}(s, v) \Rrightarrow \tau_{\mathsf{read}}(s, v) * \tau_{\mathsf{read}}(s, v)
\]
We write $\boxed{\ell : s \;\, \tau}$ (as in GPS) to denote the persistent assertion that $\ell$ is governed by protocol
$\tau$ and that the protocol has been observed in state $s$.
The first rule, iGPS-Agree, represents the guarantee that every protocol is always in a
state consistent with all observations, i.e., that all observed states can be linearly ordered
w.r.t. the transition relation $\sqsubseteq$.
iGPS-Read enables reading from a location governed by protocol τ—allowing the user to
observe a future state s′ of whatever state s it has previously observed, and providing the
associated read interpretation τread.
Writes to the location are subject to iGPS-Write, which allows the user to move the
protocol to a “final state” s′—i.e., a state accessible from every state in the protocol—so long
as they can provide τfull for s′. This rule may seem very weak, since it forces the protocol into
a final state, but this weakness derives from the need to handle the general case where there
can be write-write races. Write-write races allow only very limited reasoning: both writes
have to prove that their protocol state is later in protocol order than the other one. The write
rule presented here solves this problem in a very simple way (see Section 6 for a comparison
with the original GPS rule). In Section 4.2, we present a much stronger write rule that is
optimized for the common case where there are no write-write races on the location.
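The notion of a "final state" used by iGPS-Write can be illustrated on a finite protocol. The sketch below computes the states that are accessible from every state; the diamond-shaped protocol and the names `final_states`, `STATES`, and `EDGES` are hypothetical and chosen only for illustration.

```python
def final_states(states, leq):
    """States that are accessible (w.r.t. leq) from every state of the protocol."""
    return {t for t in states if all(leq(s, t) for s in states)}

# Hypothetical protocol: "init" may move to "a" or "b"; both may move to "done".
STATES = {"init", "a", "b", "done"}
EDGES = {("init", "a"), ("init", "b"), ("a", "done"), ("b", "done"), ("init", "done")}

def leq(s, t):
    # Reflexive closure of EDGES (EDGES is already transitively closed).
    return s == t or (s, t) in EDGES

print(final_states(STATES, leq))  # {'done'}
```

In this example a write governed by iGPS-Write could only move the protocol to "done", which is why the rule is described as weak.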
Finally, iGPS-CAS governs updates. Its two premises represent the success and failure
case, respectively. If the operation succeeds, the value read, vo, is guaranteed to belong to a
future state s′. The user picks the new state s′′ depending on s′ and establishes τfull(s′′, vn),
making use of τfull(s′, vo). In case of a failure, the rule degenerates to iGPS-Read.
Escrows
A limitation of GPS protocols is that they offer no way to transfer ownership of (non-
duplicable) resources from one thread to another unless the receiving thread performs
physical synchronization via an update operation. For example, in our MP example, there
is no update operation, and yet we want to transfer ownership of the non-atomic location
x from the first thread to the second. For such an example, an additional mechanism for
ownership transfer is required.
This motivates escrows, a mechanism for logical synchronization which, unlike protocols,
is not tied to physical locations.
▶ Definition 4 (Escrows). An escrow [P ⇝ Q] consists of a guard resource P and a payload
resource Q to be transferred. The guard resource P must be exclusive, i.e., P ∗ P ⇒ ⊥. The
escrow assertion itself is persistent knowledge (freely duplicable).
The idea of escrows is really just a slight generalization of the “Indiana Jones invariant”
Invx that we used in the proof of the MP example from Section 3.1.2. Following the
explanation there, the payload resource Q is the “golden idol”, the guard resource P is the
“bag of sand”, and the escrow allows one to swap P for Q. The restriction on exclusivity of
P ensures that this swap can only be performed once.
The proof rules for escrows follow the above intuition. iGPS-Escrow-Intro places the
payload resource Q in escrow. Any thread that learns of the existence of this escrow can
then use iGPS-Escrow-Elim to trade ownership of the guard resource P for Q.7
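As a loose runtime analogy for the two escrow rules, the sketch below models an escrow as a container that holds either the payload or the guard. The classes `Token` and `Escrow` are hypothetical, and the analogy is only approximate: Python cannot enforce exclusivity of the guard, so the check inside `eliminate` stands in for the logical fact P ∗ P ⇒ ⊥.

```python
class Token:
    """A unique guard; only its object identity matters."""

class Escrow:
    def __init__(self, guard, payload):
        self.guard = guard
        self.contents = payload  # the escrow holds "guard or payload"

    def eliminate(self, token):
        """Trade the guard for the payload; possible at most once."""
        if token is not self.guard:
            raise ValueError("wrong guard")
        if self.contents is self.guard:
            # The escrow already holds the guard, so a second copy cannot exist.
            raise RuntimeError("guard is exclusive: it was already traded in")
        payload, self.contents = self.contents, token
        return payload

token = Token()
escrow = Escrow(token, "x |-> 37")   # analogue of iGPS-Escrow-Intro
print(escrow.eliminate(token))       # analogue of iGPS-Escrow-Elim
```

A second call to `eliminate` with the same token raises, mirroring the fact that the swap can only be performed once.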
Message passing in iGPS
The verification of MP using iGPS is given in Figure 12. Although the verification is sound
for λRN, it is much simpler than the proof we carried out in the base logic of Iris, and is in
fact very close in structure to the SC verification of MP in λSC. In particular, the Indiana
Jones invariant Invx has now become an escrow XE, and the invariant Invy has now become
a (2-state) iGPS protocol YP, but otherwise the steps are almost the same. The abstraction
of iGPS has relieved us from the burden of reasoning with history and view updates explicitly.
4.2 Single-Writer Protocols
As we observed above, the iGPS protocol write rule suffers from a restriction: the user has to
transition to a final state. This restriction is not present in CSLs for SC, which let the user
pick the future state depending on the current one, much as iGPS-CAS does. Fortunately, in
the common case when there are no write-write races to the location, this restriction can be
lifted by single-writer protocols, a novel invention of iGPS.
A single-writer protocol splits the protocol assertion into two parts: an exclusive writer
assertion $\boxed{\ell : s \;\, \tau}_{W}$ and a persistent reader assertion $\boxed{\ell : s \;\, \tau}_{R}$. Owning the writer assertion
7 The rule given for elimination is only sound if Q is “timeless”, meaning that it only describes ownership
of (ghost) state, not knowledge about protocols or escrows, as is the case in our message passing example.
A more general rule, which returns Q under the later modality, is given in the appendix [1].
XE(x) ≜ [  ⇝ x ↪ 37 ]
YP(x)(0, v) ≜ v = 0
YP(x)(1, v) ≜ v = 1 ∗ XE(x)

First thread:
  {x ↪ 0 ∗ $\boxed{y : 0 \;\, YP(x)}$}
  x[na] := 37
  {x ↪ 37}
  {XE(x) ∗ $\boxed{y : 0 \;\, YP(x)}$}
  y[at] := 1
  {$\boxed{y : 1 \;\, YP(x)}$}

Second thread:
  {$\boxed{y : 0 \;\, YP(x)}$ ∗  }
  repeat y[at];
  {$\boxed{y : 1 \;\, YP(x)}$ ∗ XE(x) ∗  }
  {x ↪ 37}
  x[na]
  {z. z = 37 ∗ x ↪ 37}
Figure 12 Verification of MP with iGPS.
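iGPS-SW-Exclusive-Writer
\[
\boxed{\ell : s_1 \;\, \tau}_{W} \ast \boxed{\ell : s_2 \;\, \tau}_{W} \Rightarrow \bot
\]
iGPS-SW-Agree
\[
\boxed{\ell : s_1 \;\, \tau}_{R} \ast \boxed{\ell : s_2 \;\, \tau}_{R} \Rrightarrow s_1 \sqsubseteq s_2 \lor s_2 \sqsubseteq s_1
\]
iGPS-SW-Max
\[
\boxed{\ell : s_1 \;\, \tau}_{W} \ast \boxed{\ell : s_2 \;\, \tau}_{R} \Rrightarrow s_1 \sqsupseteq s_2
\]
iGPS-SW-Read-Exclusive
\[
\{\boxed{\ell : s \;\, \tau}_{W}\}\ \ell[\mathsf{at}]\ \{v.\ \boxed{\ell : s \;\, \tau}_{W} \ast \tau_{\mathsf{read}}(s, v)\}
\]
iGPS-SW-Read
\[
\frac{\forall s' \sqsupseteq s, v.\ P \ast \tau_{\mathsf{read}}(s', v) \Rrightarrow Q}
     {\{\boxed{\ell : s \;\, \tau}_{R} \ast P\}\ \ell[\mathsf{at}]\ \{v.\ \exists s' \sqsupseteq s.\ \boxed{\ell : s' \;\, \tau}_{R} \ast Q\}}
\]
iGPS-SW-Write
\[
\frac{P \ast \boxed{\ell : s'' \;\, \tau}_{W} \ast \tau_{\mathsf{full}}(s, \_) \Rrightarrow \tau_{\mathsf{full}}(s'', v) \ast Q \qquad s'' \sqsupseteq s}
     {\{\boxed{\ell : s \;\, \tau}_{W} \ast P\}\ \ell[\mathsf{at}] := v\ \{\boxed{\ell : s'' \;\, \tau}_{R} \ast Q\}}
\]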
Figure 13 A selection of single-writer proof rules.
provides both the permission to change the state as well as the guarantee that no one else
can change it. Owning the reader assertion only allows reads. Thus, the reader assertion
represents a lower bound on the protocol state whereas the state contained in the writer
assertion is exactly the most recent state of the protocol. The full proof rules for single-writer
protocols are given in Figure 13, and of these, the write rule iGPS-SW-Write is the most
important. The writer knows exactly what the current state is, and is free to pick the next
state accordingly.
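The division of labour between writer and reader assertions can be pictured with a small toy model. This is a sketch under strong simplifying assumptions: protocol states are integers ordered by ≤, memory is sequentially consistent (so a read always sees the latest write), and the classes `Cell`, `Writer`, and `Reader` are hypothetical names introduced only for this illustration.

```python
class Cell:
    """A location governed by a toy single-writer protocol over integer states."""
    def __init__(self, state, value):
        self.state = state
        self.value = value
        self.writer_issued = False

class Writer:
    """Exclusive: knows the exact current state and may advance it."""
    def __init__(self, cell):
        if cell.writer_issued:
            raise RuntimeError("writer assertion is exclusive")
        cell.writer_issued = True
        self.cell = cell
        self.state = cell.state  # exact current state

    def write(self, new_state, value):
        if new_state < self.state:
            raise ValueError("protocol states may only move forward")
        self.cell.state = self.state = new_state
        self.cell.value = value

class Reader:
    """Duplicable: only holds a lower bound on the protocol state."""
    def __init__(self, cell):
        self.cell = cell
        self.lower_bound = cell.state

    def read(self):
        # The state observed by a read is never below the known lower bound.
        assert self.cell.state >= self.lower_bound
        self.lower_bound = self.cell.state
        return self.cell.value

cell = Cell(0, "a")
writer, reader = Writer(cell), Reader(cell)
writer.write(3, "c")                       # the writer picks the next state freely
print(writer.state >= reader.lower_bound)  # True, in the spirit of iGPS-SW-Max
print(reader.read(), reader.lower_bound)   # c 3
```

Under release-acquire a reader may also observe an older write, which this toy does not model; the point is only that the writer's state is exact while readers hold lower bounds.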
Applications of single-writer protocols
Single-writer protocols yield more intuitive and concise proofs than normal
protocols when there are no write-write races, i.e., only one thread is writing to the location.
This may mean that there is exactly one writer in the whole program, or (perhaps more
typically) that the programmer is using sufficient synchronization to ensure that there is
exactly one active writer at a time. Such is the case for several headlining examples verified
in GPS, including circular buffer, bounded ticket lock, and read-copy update [33, 31].
In the original GPS proofs for these examples, the lack of single-writer protocols meant
that the proofs had to employ a significant amount of tedious ghost state (mostly in the form
of so-called “protocol tokens”) to formalize the fact that the writing thread knew exactly which
state the protocol had to be in at any given time. Using single-writer protocols, this reasoning
is immediate from the iGPS-SW-Write rule. By removing the need for such boilerplate ghost
state, single-writer protocols simplify and clarify the proofs of these examples.
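\begin{align*}
\llbracket \boxed{a}^{\gamma} \rrbracket &\triangleq \lambda \_.\ \boxed{a}^{\gamma} \\
\llbracket \Box P \rrbracket &\triangleq \lambda V.\ \Box \llbracket P \rrbracket(V) \\
\llbracket P \ast Q \rrbracket &\triangleq \lambda V.\ \llbracket P \rrbracket(V) \ast \llbracket Q \rrbracket(V) \\
\llbracket P \Rightarrow Q \rrbracket &\triangleq \lambda V.\ \forall V' \sqsupseteq V.\ \llbracket P \rrbracket(V') \Rightarrow \llbracket Q \rrbracket(V') \\
\llbracket \ell \hookrightarrow v \rrbracket &\triangleq \lambda V.\ \exists V_{na} \sqsubseteq V.\ \mathrm{Hist}(\ell, \{(v, \_, V_{na})\}) \\
\llbracket [P \leadsto Q] \rrbracket &\triangleq \lambda V.\ \exists V_0 \sqsubseteq V.\ \boxed{\llbracket P \rrbracket(V_0) \lor \llbracket Q \rrbracket(V_0)} \\
\llbracket \{P\}\ e\ \{v.\ Q\} \rrbracket &\triangleq \lambda V.\ \forall V' \sqsupseteq V, \pi.\
  \begin{array}[t]{l}
  \{\mathrm{PSInv} \ast \mathrm{Seen}(\pi, V') \ast \llbracket P \rrbracket(V')\} \\
  (e, \pi) \\
  \{v.\ \exists V'' \sqsupseteq V'.\ \mathrm{Seen}(\pi, V'') \ast \llbracket Q \rrbracket(V'')\}
  \end{array}
\end{align*}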
Figure 14 Definition of iGPS assertions.
An intriguing feature of the iGPS-SW-Write rule is that it is possible for the writer to
relinquish ownership of the exclusive writer permission while doing the write itself. (This is
why the writer permission appears in the precondition of the premise.) This extra flexibility is
particularly useful when reasoning about RA implementations of locks (such as the bounded
ticket lock). When the lock holder releases the lock (typically with a release write), this
feature allows them to also give up their permission to do further release writes to the lock,
so that it can be transferred to the next thread that acquires the lock.
4.3 The Model of iGPS
We now briefly describe the model of iGPS assertions. Figure 14 contains an excerpt of
the encoding of the standard assertions and connectives of CSL as well as non-atomics and
escrows. The somewhat more involved model of protocols is detailed in the appendix [1].
We model iGPS assertions as monotone predicates over views. The view parameter
represents the current view of the thread making the assertion. The monotonicity requirement
is motivated by the observation that the view of a thread only grows over the execution of a
program. To ensure properties like the frame rule, it is therefore crucial that simply adding
information to a view does not invalidate previously held (e.g., framed) assertions. As a
consequence of this requirement, we explicitly monotonize the encoding when necessary.
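The monotonicity requirement can be illustrated on finite data. In the sketch below, views are dictionaries from location names to timestamps, and a predicate is monotonized by the "exists an earlier view" pattern that also appears in Figure 14. The helper names (`view_leq`, `is_monotone`, `monotonize`) and the convention that a missing location has timestamp 0 are assumptions made for this example only.

```python
def view_leq(v1, v2):
    """v1 is included in v2: v2 is at least as recent as v1 on every location."""
    return all(t <= v2.get(loc, 0) for loc, t in v1.items())

def is_monotone(pred, views):
    """Check on a finite set of views that pred is preserved when views grow."""
    return all(
        pred(v2)
        for v1 in views
        for v2 in views
        if pred(v1) and view_leq(v1, v2)
    )

def monotonize(pred, candidates):
    """Upward closure: holds at V if pred holds at some candidate V0 included in V."""
    return lambda view: any(pred(v0) and view_leq(v0, view) for v0 in candidates)

VIEWS = [{"x": t} for t in range(4)]
exactly_one = lambda v: v.get("x", 0) == 1      # not stable under view growth
at_least_one = monotonize(exactly_one, VIEWS)   # its monotone closure

print(is_monotone(exactly_one, VIEWS))   # False
print(is_monotone(at_least_one, VIEWS))  # True
```

The first predicate would be invalidated as soon as the thread's view of x advances past timestamp 1, which is exactly the kind of assertion the model rules out.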
We benefit greatly from the support offered by the surrounding logic. As a result, the
model is extremely simple, with the lion’s share of connectives being translated in a purely
structural way and the remaining ones making direct use of ambient Iris connectives. The
most interesting encodings are those of Hoare triples, non-atomics, escrows, and protocols.
We now discuss these in more detail.
The model of Hoare triples embodies the intuition behind our encoding of iGPS assertions
as view predicates: the view at which we operate is that of the local thread. In the encoding,
we achieve this by quantifying over a view V′ and tying V′ to the physical view of the thread
π via the Seen(π, V′) assertion and to the original pre-condition P, which is required to hold
at V′. As the thread’s physical view may evolve during the execution of the expression e,
the triple returns an extended view V′′ ⊒ V′ and the corresponding Seen(π, V′′) assertion,
together with the post-condition Q, which is guaranteed to be valid at V′′.
The encoding of non-atomics is particularly simple due to the properties of the Hist
assertion. As all writes in the history have to be mo-after the most recent non-atomic write,
the history becomes a singleton when the location is used non-atomically. To tie the local
view V to the view of the non-atomic write Vna, we simply demand that V extend Vna.
We encode escrows with a single, simple invariant, which holds either the guard resource
P or the payload resource Q. The view V0 at which the invariant owns one of these resources
is the view used to initialize the escrow. Knowing an escrow at a local view V thus reduces
to knowing that V0 ⊑ V.
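The following derivation sketches why an invariant holding "P or Q" supports escrow elimination. It is a simplified reading that omits views and the later modality mentioned in footnote 7, so it should be taken as an informal outline rather than the mechanized proof.

```latex
% Informal outline: a thread owning the guard P opens the invariant P \lor Q.
\begin{align*}
  (P \lor Q) \ast P
    &\;\Rightarrow\; (P \ast P) \lor (Q \ast P)
      && \text{$\ast$ distributes over $\lor$} \\
    &\;\Rightarrow\; \bot \lor (Q \ast P)
      && \text{exclusivity of the guard: } P \ast P \Rightarrow \bot \\
    &\;\Rightarrow\; Q \ast P \\
    &\;\Rightarrow\; Q \ast (P \lor Q)
      && \text{close the invariant again with } P
\end{align*}
```

The thread thus leaves with Q while the invariant now stores P, so no later elimination can obtain Q a second time.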
Protocols
The model of iGPS protocols consists of two parts: a protocol invariant, and local protocol
assertions given out to clients. The protocol invariant owns the location’s history as well as
the logical history ∆, which tracks the transition history of protocol states and is always kept
in agreement with the location’s history. Additionally, the invariant owns τread for all writes in
the history and τfull for select ones, depending on the protocol’s type. For example, in normal
protocols, τfull is only stored for CAS-able write events, justifying that only an update can
access the full interpretation of the write event that it reads from (see iGPS-CAS). Meanwhile,
local protocol assertions hold knowledge about the logical history, which effectively gives
lower bounds on the current state of the protocol and, by the protocol invariant, indirectly
implies knowledge about the location’s history. In the case of single-writer protocols, the
writer assertion also owns the exclusive right to change the state. This construction also relies
on the authoritative PCM (see Section 3.1.1). More details are given in the appendix [1].
Soundness of iGPS
The soundness of iGPS is expressed by the following theorem.
▶ Theorem 1 (Adequacy). For any expression e, physical state σ′, and meta-level predicate
on values Φ, we have
\[
(\vdash \{\top\}\ e\ \{v.\ \Phi(v)\}) \Rightarrow \forall \mathit{TS}.\ \sigma_{\mathrm{init}}; [0 \mapsto e] \longrightarrow^{*} \sigma'; \mathit{TS} \Rightarrow
\begin{array}[t]{l}
(\mathit{TS}(0) \in \mathit{Val} \Rightarrow \Phi(\mathit{TS}(0))) \\
{} \land \forall \rho \in \mathrm{dom}(\mathit{TS}).\ \mathit{TS}(\rho) \in \mathit{Val} \lor \exists \sigma'', e'', e''_f.\ \mathit{TS}(\rho) \longrightarrow_{\rho} \sigma'', e'', e''_f
\end{array}
\]
The theorem connects iGPS program specifications (⊢ {⊤} e {v. Φ(v)}) and the program’s
possible executions, and provides two guarantees:
1. If the original thread with id 0 terminates with a value v, we know that Φ(v) holds.
2. For any pair of a state σ′ and a threadpool TS reachable from the initial state σinit (see
Definition 2) and the initial threadpool [0 ↦ e], we know that any thread ρ in TS either
has terminated with a value or can still be reduced in the state σ′.
The proof follows from the adequacy theorem of Iris.
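The shape of the two guarantees can be spelled out as a check on a single reachable thread pool. The sketch below is only a reading aid: the function `conclusion_holds`, the representation of thread pools as dictionaries, and the toy notions of value and reducibility are all assumptions and have no counterpart in the Coq development.

```python
def conclusion_holds(pool, is_value, can_step, phi):
    """Check both adequacy guarantees on one thread pool (thread id -> expression)."""
    main = pool[0]
    # Guarantee 1: if thread 0 has terminated, its value satisfies phi.
    result_ok = phi(main) if is_value(main) else True
    # Guarantee 2: every thread is either a value or can take a step (none is stuck).
    no_stuck = all(is_value(e) or can_step(e) for e in pool.values())
    return result_ok and no_stuck

# Toy expression language: integers are values, ("add", m, n) can reduce,
# and anything else is considered stuck.
is_value = lambda e: isinstance(e, int)
can_step = lambda e: isinstance(e, tuple) and e[0] == "add"
phi = lambda v: v == 42

print(conclusion_holds({0: 42, 1: ("add", 1, 2)}, is_value, can_step, phi))  # True
print(conclusion_holds({0: 42, 1: ("crash",)}, is_value, can_step, phi))     # False
```

The theorem states that for a verified program this check succeeds on every reachable configuration, not just on a chosen one.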
5 Other Contributions
In our Coq development accompanying this paper, we make several additional contributions
that we briefly summarize here.
An RSL encoding
Using the same base logic from Section 3.2.1, we have mechanized iRSL, a higher-order
variant of RSL [34]. RSL focuses on the message-passing-style transfer of resources
through release-write/acquire-read pairs. The two main assertions of RSL are Rel(ℓ, Q) and
Acq(ℓ, Q), representing the permission to write to and read from ℓ, respectively. The resource
Q(v) is released by writing v to ℓ, and then acquired by reading v from ℓ. Consequently, to
support this MP-like mechanism, the encoding’s model shares a great deal with the model of
iGPS, namely the full vs. read interpretation construction for protocols.
One particular feature of RSL, however, demands special attention. Although simpler
than GPS, RSL has the extra ability to split the receiver predicate Q into smaller predicates,
so that different acquire reads of the same value v can acquire different parts of the transferred
resource: Acq(ℓ, λv. Q1(v) ∗ Q2(v)) ⇛ Acq(ℓ, Q1) ∗ Acq(ℓ, Q2). It is not obvious how to prove
this sound when the splitting is completely arbitrary. Fortunately, a similar pattern, called
the barrier pattern, has been addressed by Jung et al. [13], who propose the mechanism
of “higher-order ghost state” to support such splitting. Our iRSL model basically extends
Jung et al.’s barrier proof with a more complex (Iris) protocol to carefully manage resource
splitting.
The encoding in Iris also provides us with useful extensions to the logic. Without extra
work, iRSL naturally supports PCM-based ghost state and higher-order assertions, which
were not available in the original RSL. The encoding shows that our approach has the right,
reusable foundations to construct different logics for RA+NA.
In our RSL encoding, assertions are encoded as view predicates and proof rules are proven
sound with respect to the base logic—in the same way as our GPS encoding. This allows us
to soundly combine RSL and GPS reasoning principles in the same proof at no additional
cost. It is even possible to design iGPS protocols whose state interpretation mentions iRSL
assertions and vice versa. Of course, at a single point in time a location can only be governed
by either iGPS or iRSL, as they represent incompatible modes of ownership transfer.
Allocation and deallocation
We have also incorporated support for memory allocation and deallocation into our RA+NA
operational semantics. Since C11 is not clear about the semantics of allocation and
deallocation, we take the liberty of defining them as reasonably as possible: in short, allocation
and deallocation behave as non-atomic writes with special values A and D, respectively.
Fractional protocols
So far, all protocols presented are permanent: once the protocols are established, they govern
their locations forever. This poses two interesting questions: 1) Can we change the protocol
which governs a location? and 2) How can we deallocate a location governed by a protocol?
To support these features, we derive, with little modification to the iGPS model, fractional
protocols, whose protocol assertions also assert the permission to even use the protocol.
Initially, a protocol τ for a location ℓ will be established with the full fraction, and then it
will be distributed to those who want to use τ. Later, when the full fraction is recollected,
one can disable the protocol (since no one else can use it), regain the raw ownership of ℓ, and
then deallocate ℓ, or more interestingly, establish a new protocol for ℓ! These recollectable
protocols open up possibilities for verifying programs that do custom memory reclamation,
e.g., RCU (see below). In the current Coq development, we have created fractional versions
of both normal and single-writer protocols.
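The bookkeeping behind fractional protocols can be sketched with exact rational arithmetic. The classes `Protocol` and `Share` and the function `disable` are hypothetical names; note also that Python does not prevent reuse of a share after it has been split, so the linear use of shares is a convention here rather than something the code enforces.

```python
from fractions import Fraction

class Protocol:
    def __init__(self, name):
        self.name = name
        self.enabled = True

class Share:
    """A fraction q in (0, 1] of the permission to use one protocol."""
    def __init__(self, protocol, q):
        if not (0 < q <= 1):
            raise ValueError("fraction must lie in (0, 1]")
        self.protocol = protocol
        self.q = Fraction(q)

    def split(self):
        """Distribute the permission: two halves of this share."""
        half = self.q / 2
        return Share(self.protocol, half), Share(self.protocol, half)

    def join(self, other):
        """Recollect two shares of the same protocol."""
        if other.protocol is not self.protocol:
            raise ValueError("shares belong to different protocols")
        return Share(self.protocol, self.q + other.q)

def disable(share):
    """Only the full fraction may retire the protocol."""
    if share.q != 1:
        raise PermissionError("other users may still rely on the protocol")
    share.protocol.enabled = False

tau = Protocol("tau")
left, right = Share(tau, 1).split()
disable(left.join(right))   # succeeds only because the full fraction is back
print(tau.enabled)          # False
```

After `disable`, the analogue of regaining raw ownership of the location would be to deallocate it or to set up a fresh `Protocol` for it.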
Mechanization and Case Studies
Our Coq mechanization employs a shallow embedding of iGPS (and iRSL), making critical
use of the Iris Proof Mode [17]. In its current form, the proof mode is specific to the algebra of
Iris and offers no additional support for embedded logics like our encoding of iGPS assertions.
There are two consequences to this: 1) Unlike in the paper presentation of iGPS, where
thread views are completely hidden in the (Iris) model of the logic, in our Coq proofs thread
views are visible. However, they are also unobtrusive: all assertions in a given proof context
always hold at the current thread’s local view, and the view only changes when the thread
takes a step. Thus, while the views are visible, they are always manipulated and kept in
sync in a very straightforward way, which we mostly automate with a set of simple tactics.
2) iGPS assertions cannot always be manipulated directly by the proof mode. We sometimes
have to unfold our embedding of iGPS assertions—but not in the statements of our lemmas
and theorems—to make explicit the underlying Iris assertions so that the proof mode can
operate on them. As our embedding is very simple, this has little additional overhead. In
particular, all lemmas and theorems stated at the iGPS level remain easily applicable even
to the unfolded definitions at the Iris level.
We have verified all of the standard examples that have been proven in previous work.
The simplest of these is the spin-lock example, proven in iRSL. More interestingly, using
iGPS, we have also mechanized the message passing, circular buffer, bounded ticket lock, and
Michael-Scott lock-free queue examples, which were only verified by hand in the original
GPS paper. We have also verified a variant of the read-copy update (RCU) technique
employed in the Linux kernel, following the proof of Tassarotti et al. [31]. The RCU proof,
the most substantial example in iGPS, simplifies the original GPS proof by
using fractional single-writer protocols that allow garbage collection. To our knowledge, our
development provides the very first mechanized proofs of the circular buffer, bounded ticket
lock, Michael-Scott queue, and RCU examples in a weak-memory setting.
6 Related Work
This paper demonstrates one of the first major applications of the Iris framework. Other
recent applications include Krebbers et al. [17], who developed the interactive Coq proof
mode for Iris that we rely on heavily in this paper, and Krogh-Jespersen et al. [18], who use
Iris to encode a logical-relations model of a type-and-effect system for
a higher-order, concurrent programming language. Neither of those papers considers weak
memory.
There are a number of program logics for weak memory models [28, 5, 2, 20], some of
which have mechanized soundness proofs [34, 8, 26], but none of which provide real support
for mechanized proofs of weak-memory programs in the way that we do.
FSL++ [9], an extension of FSL [8] (with ghost state) and RSL (with relaxed accesses),
was used to mechanize a proof of an implementation of atomic reference counters based on
the one in Rust’s Arc library. The proof is done by applying the basic laws of separation
logic, resulting in painful manual work. Our approach alleviates a great deal of such tedium
using the Iris proof mode. As of now, however, iGPS cannot reason about relaxed accesses.
More recently, RSL, FSL, and FSL++ have been encoded in Viper [22] to provide an
automated verification approach to weak memory programs [29]. The encodings, however,
axiomatize all proof rules without providing soundness guarantees, and are specific to the
style of RSL and its FSL descendants. Our base logic, in contrast, is not tied to any specific
surface logic and can be used to develop and prove sound different surface logics. It remains
to be seen if the more expressive GPS protocols can be encoded in Viper.
iGPS is based on the GPS logic [33] and supports all reasoning mechanisms of GPS.
However, the exact rules of iGPS differ in various small ways from the original ones in GPS.
These differences stem from pragmatic choices made to simplify the soundness proof of iGPS.
A particularly noteworthy difference from GPS appears in the premises of the iGPS-Read
and iGPS-Write rules: the split of the protocol interpretations τ into τread and τfull. We
show the original GPS rules below for comparison.
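GPS-Read
\[
\frac{\forall s' \sqsupseteq s.\ P \ast \tau(s', v) \Rightarrow \Box Q}
     {\{\boxed{\ell : s \;\, \tau} \ast P\}\ \ell[\mathsf{at}]\ \{v.\ \boxed{\ell : s' \;\, \tau} \ast Q\}}
\]
GPS-Write
\[
\frac{\forall s' \sqsupseteq s.\ P \ast \tau(s', v) \Rightarrow s' \sqsubseteq s'' \qquad P \Rrightarrow \tau(s'', w) \ast Q}
     {\{\boxed{\ell : s \;\, \tau} \ast P\}\ \ell[\mathsf{at}] := w\ \{\boxed{\ell : s'' \;\, \tau} \ast Q\}}
\]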
The interpretation τ(s, v) in these two GPS rules is equivalent to τfull(s, v) in iGPS. In
GPS, the user can gain access to the full interpretation of the “current” protocol state (s′),
but only to obtain some knowledge, not to consume (i.e., transfer out of the protocol) any
non-duplicable resources owned by that interpretation. This is enforced in GPS-Read by
guarding Q with an always modality □, and in GPS-Write by requiring the user to establish
the interpretation of the new write using only the local resource P . In contrast, iGPS does
not provide the user with the full interpretation, but only the read interpretation τread. This
weakens, for example, the iGPS-Write rule in comparison with GPS-Write, because the user
cannot use τfull to show s′ ⊑ s′′.
The reason for this discrepancy between GPS and iGPS is that the soundness proof of
GPS reasons about an entire program execution graph at once. With its bird’s-eye view of
the entire execution, GPS can, for the duration of a step, assemble resources that have been
transferred elsewhere in the graph to re-construct τfull for the user. The soundness of iGPS,
on the other hand, is established in a simpler and more local manner, without involving
reasoning about the full execution of the program. We avoid the global soundness argument
of GPS and instead provide a pragmatic solution which—judging from our success in porting
GPS examples—is effectively as strong as GPS and, at the same time, makes for a very
simple soundness proof: we maintain the duplicable τread for all (past) events and can thus
easily provide it to the client of iGPS-Read.
Essentially, the reason our rules are as effective as those of GPS (if not more so) is that
GPS provides one-size-fits-all rules, which are applicable to both programs with write-write
races and those without, whereas we provide special support for the common case where
there are no write-write races. For programs without those races, the rules provided by GPS
are quite cumbersome to use and often require additional ghost state for bookkeeping. iGPS
instead supports an optimized write rule for the common case in which there are no such
races, via single-writer protocols. For the remaining cases, the rather simple-minded rule
iGPS-Write appears to suffice in all the examples we have considered thus far.
Our operational semantics for RA draws heavily on Lahav et al.’s semantics for SRA [19]—
a stronger variant of RA, which is equivalent to it in the absence of write-write races. SRA
was developed to provide an intuitive operational characterization which is as efficiently
implementable as RA. However, as we observe here, moving to an operational characterization
does not in fact require any strengthening of the RA semantics (even the slight strengthening
of SRA). The main difference between the operational semantics of SRA and the one we give
for RA is that writes in SRA always take a globally maximal timestamp, whereas in RA they
need not do so. The canonical example demonstrating this difference is the 2+2W example
(see Lahav et al. [19] for more details).
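The difference between the two timestamp disciplines can be observed by brute force on the 2+2W program (x := 1; y := 2 in one thread, y := 1; x := 2 in the other). The sketch below is a deliberately simplified model, not the paper's semantics: it has writes only, it represents timestamps by positions in a per-location modification order, and the function `final_values` and its `globally_max` flag are names invented for this illustration.

```python
THREADS = (
    (("x", 1), ("y", 2)),  # thread 1: x := 1; y := 2
    (("y", 1), ("x", 2)),  # thread 2: y := 1; x := 2
)

def final_values(globally_max):
    """Final (x, y) values over all interleavings and timestamp choices."""
    results = set()

    def explore(pcs, mo, views):
        if all(pc == len(t) for pc, t in zip(pcs, THREADS)):
            results.add((mo["x"][-1], mo["y"][-1]))
            return
        for i, thread in enumerate(THREADS):
            if pcs[i] == len(thread):
                continue
            loc, val = thread[pcs[i]]
            order = mo[loc]
            # The new write must come after the write this thread has observed.
            lowest = order.index(views[i][loc]) + 1
            slots = [len(order)] if globally_max else range(lowest, len(order) + 1)
            for slot in slots:
                new_mo = dict(mo)
                new_mo[loc] = order[:slot] + (val,) + order[slot:]
                new_views = list(views)
                new_views[i] = {**views[i], loc: val}
                new_pcs = pcs[:i] + (pcs[i] + 1,) + pcs[i + 1:]
                explore(new_pcs, new_mo, tuple(new_views))

    init_view = {"x": 0, "y": 0}
    explore((0, 0), {"x": (0,), "y": (0,)}, (init_view, dict(init_view)))
    return results

ra = final_values(globally_max=False)   # writes may take a non-maximal timestamp
sra = final_values(globally_max=True)   # writes always take the maximal timestamp
print(sorted(ra - sra))                 # [(1, 1)]
```

In this model the outcome x = 1 and y = 1 is reachable only when a write may be placed before an existing write it has not observed, which matches the role 2+2W plays as the distinguishing example.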
Going beyond Lahav et al., we offer the first operational account of the interaction of
RA and non-atomic accesses. Our semantics corresponds to C11’s for programs that do not
mix atomic RMW operations and non-atomic reads at the same location. We feel this is a
reasonable restriction, given that C11’s treatment of programs mixing atomic and non-atomic
accesses is already known to be problematic [3]. Our semantics does not correspond to C11’s
for arbitrary programs, as evidenced by the following example:
cas(x, 0, 1)   ∥   x[at] := 2;
                   a := x[na]
C11 considers this program racy because, if the CAS succeeds, the first thread’s update of x
to 1 and the second thread’s non-atomic read of x are not in a happens-before relation. In
contrast, our semantics does not consider this a race because the non-atomic read is always
guaranteed to read from the previous write with value 2. We find the C11 semantics for this
program to be rather unintuitive, but leave a more thorough investigation of the issue to
future work.
Our RA semantics may also be considered a close derivative of Kang et al.’s “promising”
semantics [15], which is geared toward solving a broader problem with the full C11 model
(the so-called “out-of-thin-air” problem). We look forward to using Iris to construct program
logics for this promising semantics.
Acknowledgements. We would like to thank Mohit Vyas for spotting a mistake in our
original proof of correspondence to C11, and Mark Batty for helpful conversations.
References
1 Technical appendix and Coq development accompanying this paper, available at the
following URL: http://plv.mpi-sws.org/igps/.
2 Tatsuya Abe and Toshiyuki Maeda. Observation-based concurrent program logic for relaxed
memory consistency models. In APLAS, pages 63–84, 2016.
3 Mark Batty, Kayvan Memarian, Kyndylan Nienhuis, Jean Pichon-Pharabod, and Peter
Sewell. The problem of programming language concurrency semantics. In ESOP, 2015.
4 Mark Batty, Scott Owens, Susmit Sarkar, Peter Sewell, and Tjark Weber. Mathematizing
C++ concurrency. In POPL, pages 55–66, 2011.
5 Richard Bornat, Jade Alglave, and Matthew J. Parkinson. New lace and arsenic: adventures
in weak memory with a program logic. CoRR, abs/1512.01416, 2015.
6 Pedro da Rocha Pinto, Thomas Dinsdale-Young, and Philippa Gardner. TaDA: A logic for
time and data abstraction. In ECOOP, 2014.
7 T. Dinsdale-Young, M. Dodds, P. Gardner, M. Parkinson, and V. Vafeiadis. Concurrent
abstract predicates. In ECOOP 2010, volume 6183 of LNCS, pages 504–528. Springer,
2010.
8 Marko Doko and Viktor Vafeiadis. A program logic for C11 memory fences. In VMCAI,
volume 9583 of Lecture Notes in Computer Science, pages 413–430. Springer, 2016.
9 Marko Doko and Viktor Vafeiadis. Tackling real-life relaxed concurrency with FSL++. In
ESOP, 2017.
10 Derek Dreyer. The RustBelt project. http://plv.mpi-sws.org/rustbelt/.
11 Xinyu Feng. Local rely-guarantee reasoning. In POPL, pages 315–327, 2009.
12 C. B. Jones. Tentative steps toward a development method for interfering programs.
TOPLAS, 5(4):596–619, 1983.
13 Ralf Jung, Robbert Krebbers, Lars Birkedal, and Derek Dreyer. Higher-order ghost state.
In ICFP, pages 256–269, 2016.
14 Ralf Jung, David Swasey, Filip Sieczkowski, Kasper Svendsen, Aaron Turon, Lars Birkedal,
and Derek Dreyer. Iris: Monoids and invariants as an orthogonal basis for concurrent
reasoning. In POPL, pages 637–650, 2015.
15 Jeehoon Kang, Chung-Kil Hur, Ori Lahav, Viktor Vafeiadis, and Derek Dreyer. A promising
semantics for relaxed-memory concurrency. In POPL, pages 175–189, 2017.
16 Robbert Krebbers, Ralf Jung, Aleš Bizjak, Jacques-Henri Jourdan, Derek Dreyer, and Lars
Birkedal. The essence of higher-order concurrent separation logic. In ESOP, 2017.
17 Robbert Krebbers, Amin Timany, and Lars Birkedal. Interactive proofs in higher-order
concurrent separation logic. In POPL, pages 205–217, 2017.
18 Morten Krogh-Jespersen, Kasper Svendsen, and Lars Birkedal. A relational model of types-
and-effects in higher-order concurrent separation logic. In POPL, pages 218–231, 2017.
19 Ori Lahav, Nick Giannarakis, and Viktor Vafeiadis. Taming release-acquire consistency. In
POPL 2016, pages 649–662. ACM, 2016.
20 Ori Lahav and Viktor Vafeiadis. Owicki-Gries reasoning for weak memory models. In
Automata, Languages, and Programming, ICALP 2015, volume 9135 of LNCS, pages 311–323.
Springer, 2015.
21 Leslie Lamport. How to make a multiprocessor computer that correctly executes
multiprocess programs. IEEE Trans. Computers, 28(9):690–691, 1979.
22 Peter Müller, Malte Schwerhoff, and Alexander J. Summers. Viper: A verification
infrastructure for permission-based reasoning. In VMCAI, pages 41–62, 2016.
23 Aleksandar Nanevski, Ruy Ley-Wild, Ilya Sergey, and Germán Andrés Delbianco.
Communicating state transition systems for fine-grained concurrent resources. In ESOP, 2014.
24 Peter W. O’Hearn. Resources, concurrency, and local reasoning. Theor. Comput. Sci.,
375(1-3):271–307, 2007.
25 John C. Reynolds. Separation logic: A logic for shared mutable data structures. In LICS,
2002.
26 Tom Ridge. A rely-guarantee proof system for x86-TSO. In VSTTE 2010, volume 6217 of
LNCS, pages 55–70. Springer, 2010.
27 Ilya Sergey, Aleksandar Nanevski, and Anindya Banerjee. Mechanized verification of fine-
grained concurrent programs. In PLDI, pages 77–87, 2015.
28 Filip Sieczkowski, Kasper Svendsen, Lars Birkedal, and Jean Pichon-Pharabod. A separation
logic for fictional sequential consistency. In ESOP 2015, volume 9032 of LNCS, pages
736–761. Springer, 2015.
29 Alexander Summers. Personal communication, 2017.
30 Kasper Svendsen and Lars Birkedal. Impredicative concurrent abstract predicates. In
ESOP, 2014.
31 Joseph Tassarotti, Derek Dreyer, and Viktor Vafeiadis. Verifying read-copy-update in a
logic for weak memory. In 36th ACM SIGPLAN Conference on Programming Language
Design and Implementation, PLDI 2015, pages 110–120. ACM, 2015.
32 Aaron Turon, Derek Dreyer, and Lars Birkedal. Unifying refinement and Hoare-style
reasoning in a logic for higher-order concurrency. In ICFP. ACM, 2013.
33 Aaron Turon, Viktor Vafeiadis, and Derek Dreyer. GPS: Navigating weak memory with
ghosts, protocols, and separation. In OOPSLA 2014, pages 691–707. ACM, 2014.
34 Viktor Vafeiadis and Chinmay Narayan. Relaxed separation logic: A program logic for C11
concurrency. In OOPSLA 2013, pages 867–884. ACM, 2013.
35 Viktor Vafeiadis and Matthew Parkinson. A marriage of rely/guarantee and separation
logic. In CONCUR 2007, volume 4703 of LNCS, pages 256–271. Springer, 2007.
