Weak memory models using event structures by Castellan, Simon
Weak memory models using event structures
Simon Castellan
To cite this version:
Simon Castellan. Weak memory models using event structures. Julien Signoles. Vingt-septièmes Journées Francophones des Langages Applicatifs (JFLA 2016), Jan 2016, Saint-Malo, France. <hal-01333582>
HAL Id: hal-01333582
https://hal.inria.fr/hal-01333582
Submitted on 17 Jun 2016
Weak memory models using event structures
Simon Castellan †
†: ENS de Lyon, LIP, CNRS, Inria, UCBL, Université de Lyon
46 allée d'Italie, 69364 Lyon cedex 07
simon.castellan@ens-lyon.fr
Abstract
In this article, we investigate a denotational semantics based on event structures for a very
simple imperative and concurrent programming language. The model incorporates behaviours of
weak memory models such as reordering of instructions and non-locality. Our model can then be
used to deﬁne a function from programs to their possible outcomes that can be used to give a
formal semantics to a processor or a programming language.
Most of the semantic ideas come from game semantics and its recent development based on
event structures, but taking advantage of the ﬁrst-order setting, we present in this paper a self-
contained simpliﬁcation of these ideas.
1. Introduction
Compiler and hardware optimizations. The first guideline to model concurrency was Lamport's sequential consistency (SC) [11], which states that the semantics of a concurrent program is the set of its possible sequentializations, called interleavings.
This model, although very simple to reason with, is nowadays disconnected from reality. To improve the performance of sequential programs, compilers and processors aggressively reorder instructions concerning distinct parts of the memory. As a result, it is now common to observe executions of concurrent programs that break sequential consistency because of these optimizations. For instance, the parallel execution of the following two threads (one per column) on a modern processor can lead to the end result r1 = 0 ∧ r2 = 0, which no interleaving predicts:
a := 1     ‖   b := 1
r1 ← b     ‖   r2 ← a
In this snippet, a and b stand for shared variables (initialized to 0) accessible to both threads while
r1 and r2 denote registers that are thread-local. Assignments of shared variables (:=) are called stores
whereas assignments from shared variables to registers (←) are called loads.
The processor (or the compiler) can decide to permute the memory operations within a thread of
this program as they hit distinct variables. The sequential behaviour of each thread is not altered by
this optimization but once the two threads are put in parallel this turns out to break SC.
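The claim that no interleaving produces r1 = 0 ∧ r2 = 0 can be checked by brute force. The following Python sketch is not part of the model developed in this paper: the instruction encoding and the helper names (`interleavings`, `run`) are assumptions made for illustration. It enumerates every interleaving of the two threads over a single shared memory and collects the final register values.

```python
from itertools import combinations

# Hypothetical encoding: a store writes the constant `source` to the shared
# variable `target`; a load copies the shared variable `source` into the
# register `target`.
THREAD_1 = [("store", "a", 1), ("load", "r1", "b")]
THREAD_2 = [("store", "b", 1), ("load", "r2", "a")]

def interleavings(t1, t2):
    """Yield every merge of t1 and t2 that preserves each thread's order."""
    n = len(t1) + len(t2)
    for slots in combinations(range(n), len(t1)):
        it1, it2 = iter(t1), iter(t2)
        yield [next(it1) if i in slots else next(it2) for i in range(n)]

def run(schedule):
    """Execute a schedule on one shared memory initialised to 0."""
    memory = {"a": 0, "b": 0}
    registers = {}
    for kind, target, source in schedule:
        if kind == "store":
            memory[target] = source
        else:
            registers[target] = memory[source]
    return registers["r1"], registers["r2"]

sc_outcomes = {run(s) for s in interleavings(THREAD_1, THREAD_2)}
print(sorted(sc_outcomes))
```

The pair (0, 0) is absent from the printed set, which is why observing it on real hardware breaks SC.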
Moreover, modern architectures have caches, so that threads do not share the same view of the memory. For instance, in the following example appearing in the Intel documentation:
a := 1     ‖   b := 1
r1 ← a     ‖   s1 ← b
r2 ← b     ‖   s2 ← a
it can be observed that r1 = s1 = 1 but r2 = s2 = 0, even if the platform is not allowed to permute the memory reads on a and b of each thread.
Hardware vs software specifications. These optimizations create the need for a clear specification of which behaviours are allowed on a given platform. Indeed, on the one hand, processor manufacturers need to specify which behaviours assembly programs have, to provide the user with guarantees. Those specifications define a particular execution model that the processor exhibits and, as a consequence, we possess robust formalizations of them (see e.g. [2]).
On the other hand, programming language designers need to explain the possible run-time behaviour of valid programs so that users can reason (informally or formally) about it. Because a programming language can be implemented on several architectures, and compilers should be allowed to optimize the program, those specifications are much more relaxed. As a result, their formalization is often unsatisfactory: Java's specification is unsound [15, 9] and C11's specification allows undesired out-of-thin-air behaviours [3].
In this article we will focus on modelling behaviours appearing in hardware speciﬁcations.
Mathematical description of executions. Such specifications should be as formal as possible to avoid ambiguities. This requires a mathematical language in which to express the desired constraints. Two approaches are popular: the axiomatic approach followed by [12, 2], which gives axioms satisfied by valid executions, and the operational approach, which models the platform as an idealized machine [14, 4].
In this paper we follow a different approach: like [9], we would like to develop a denotational semantics based on causality between memory events, to observe concurrency as a first-class notion and reduce the size of the model. Unlike that paper, which works with an abstract structure called configuration theories, we work with event structures [17], which are a concrete representation for specific configuration theories. Event structures comprise a set of events equipped with a partial order representing causality and a binary relation denoting conflict between events that are mutually exclusive. They are much more concrete and compact.
An advantage of this approach is that the resulting semantics is compositional: it is derived by induction on the syntax and not extracted from an operational semantics. We believe that compositional semantics and modular reasoning are key to scaling to large programs. Moreover, using causality instead of traces (as in e.g. [5]) allows us to escape the combinatorial explosion inherent to trace-based models of concurrency.
Game semantics based on event structures. A recent line of work [13, 8, 6] has developed a game semantics for concurrent higher-order languages based on event structures. In this article, we want to take advantage of the model and the constructions to give a semantics to a simple first-order toy language, complex enough to illustrate the phenomena at stake.
This paper assumes no prior knowledge of these models: the construction of the model is self-contained. Ideas and intuitions come directly from seminal papers of game semantics [1, 10] and have been simplified in this first-order setting. Using this technology, it is very easy to extend our model to fully-fledged programming languages with functions, control, recursion, and higher-order features.
Contribution of the paper. This paper explores the use of game semantics techniques to give an event-structure-based denotational account of weak memory models. In this paper, a particular model is chosen, although we believe this framework can be adapted to a great variety of models. Where the axiomatic approach specifies which executions are valid, our approach directly builds the correct executions. Moreover, event structures allow us to represent several incompatible executions in a compact form by sharing their common parts.
A naive implementation of the model presented here as well as a few variants is available at
http://iso.mor.phis.me/memory.
Outline of the paper. In Section 2, we define event structures and the constructions on them that
we need in order to build our model. In Section 3, we introduce the language we study and outline the
construction of the model. In Section 4, we interpret each term as a labelled event structure where
the behaviour of variables is left abstract. These event structures represent the dependency between
the instructions enforced by the platform. In Section 5, we explain how to wire a chosen memory
model in the interpretation of Section 4 to recover a complete description of executions.
2. Event structures
As mentioned in the introduction, we use event structures to analyze both the concurrency and the
non-determinism (due to races) inside programs.
2.1. Deﬁnitions and notations
In this article, we use event structures with binary conﬂict:
Definition 1 (Event structures). An event structure is a tuple $(S, \leq_S, \#_S)$ where $S$ is a set of events equipped with a partial order $\leq_S$ (called causality of $S$) and a binary symmetric irreflexive relation $\#_S$ (called conflict relation of $S$) satisfying the following two axioms:
• (Conflict inheritance) For all $s, s', s'' \in S$, if $s \mathrel{\#_S} s'$ and $s' \leq_S s''$, then $s \mathrel{\#_S} s''$ (in this case, the conflict $s \mathrel{\#_S} s''$ is said to be inherited from $s \mathrel{\#_S} s'$).
• (Finite causes) For all events $s \in S$, the set $[s] = \{s' \in S \mid s' \leq_S s\}$ is finite.
A labelled event structure is a pair (S, lbl) where S is an event structure and lbl : S → L is a
function from the events of S to a set L of labels. We will often omit the labelling function when the
context is clear.
The causality of S has the following interpretation: if $s \leq_S s'$ then the occurrence of $s$ is necessary for the event $s'$ to occur. In particular, causality is conjunctive: if $s_1 \leq_S s$ and $s_2 \leq_S s$ then the occurrences of both $s_1$ and $s_2$ are necessary to that of $s$. We write $s \rightarrowtriangle_S s'$ whenever $s <_S s'$ and there are no events in between $s$ and $s'$. This relation is called immediate causality.
The relation $\#_S$ represents the non-determinism of the system: two events in conflict cannot occur together in the same execution. By conflict inheritance, this relation is generated by a relation of minimal conflict: $s \sim_S s'$ whenever $s \mathrel{\#_S} s'$ and, for all $s_0 \leq_S s$ and $s'_0 \leq_S s'$ such that $s_0 \mathrel{\#_S} s'_0$, we have $s_0 = s$ and $s'_0 = s'$. Two events $s, s' \in S$ are said to be concurrent when they are incomparable for $\leq_S$ and they are not conflicting.
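To make these notions concrete, here is a minimal Python sketch of finite event structures. The class `EventStructure` and its method names are hypothetical helpers, not part of the paper's implementation; the constructor closes the given causal pairs under reflexivity and transitivity, and the given conflicts under symmetry and inheritance. The example data describes a race between a load and a store on a, next to an independent store on b, in the spirit of Example 1 below.

```python
from itertools import product

class EventStructure:
    """Finite event structure with binary conflict (illustrative helper)."""

    def __init__(self, events, causality, conflict):
        self.events = set(events)
        # Reflexive-transitive closure of the given causal pairs.
        leq = {(e, e) for e in self.events} | set(causality)
        changed = True
        while changed:
            new = {(a, d) for (a, b), (c, d) in product(leq, leq) if b == c}
            changed = not new <= leq
            leq |= new
        self.leq = leq
        # Symmetric closure, then closure under conflict inheritance.
        conf = set(conflict) | {(b, a) for a, b in conflict}
        changed = True
        while changed:
            new = {(s, t2) for (s, t) in conf for (u, t2) in leq if u == t}
            new |= {(b, a) for a, b in new}
            changed = not new <= conf
            conf |= new
        self.conflict = conf

    def causes(self, e):
        """The set [e] of events below e (e included)."""
        return {s for s in self.events if (s, e) in self.leq}

    def immediate_causality(self):
        """Pairs s < t with no event strictly in between."""
        strict = {(s, t) for s, t in self.leq if s != t}
        return {(s, t) for s, t in strict
                if not any((s, m) in strict and (m, t) in strict
                           for m in self.events)}

    def minimal_conflict(self):
        """Conflicts that are not inherited from a conflict below."""
        return {(s, t) for s, t in self.conflict
                if all((s0, t0) == (s, t)
                       for s0 in self.causes(s) for t0 in self.causes(t)
                       if (s0, t0) in self.conflict)}

    def concurrent(self, s, t):
        return ((s, t) not in self.leq and (t, s) not in self.leq
                and (s, t) not in self.conflict)

# Either the store on a comes first (w, then r_after_w) or the load does
# (r, then w_after_r); wb is the independent store on b.
es = EventStructure(
    events=["w", "r_after_w", "r", "w_after_r", "wb"],
    causality=[("w", "r_after_w"), ("r", "w_after_r")],
    conflict=[("w", "r")],
)
print(sorted(es.immediate_causality()))
print(sorted(es.minimal_conflict()))
```

Only the conflict between the two initial events is minimal; the six other conflicting pairs are inherited, which is why drawing immediate causality and minimal conflict is enough.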
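To give a graphical representation to a labelled event structure $(S, \mathrm{lbl})$, we draw the labels together with immediate causality ($\rightarrowtriangle_S$) and minimal conflict ($\sim_S$) rather than $\leq_S$ and $\#_S$, as this is more compact.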
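Example 1. The following event structure is the result of the interpretation in our model of the term $\llbracket r \leftarrow a \parallel a := 2 \parallel b := 1 \rrbracket$ (variables are initialized to zero):
[Diagram: two causal chains, $\mathtt{Wr}^{(2)}_a \rightarrowtriangle \mathtt{Re}^{(r=2)}_a$ and $\mathtt{Re}^{(r=0)}_a \rightarrowtriangle \mathtt{Wr}^{(2)}_a$, whose initial events are in conflict, next to an isolated event $\mathtt{Wr}^{(1)}_b$.]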
There is a race to determine in which order the load and the store on a are scheduled while the store
on b is independent of the other two events.
The notion of conﬁguration (states of the system) is central in the theory.
Deﬁnition 2 (Conﬁguration).  A conﬁguration of an event structure (S,≤S ,#S) is a ﬁnite subset
x of S satisfying the two conditions:
• (Down-closure) If s ∈ x then [s] ⊆ x (i.e. x is down-closed for ≤)
• (Consistency) If s, s′ ∈ x then ¬(s #S s′).
The set of conﬁgurations of S is written C (S) and is naturally ordered by inclusion. A consequence
of the axioms of event structures is that if s ∈ S, [s] ⊆ S is a conﬁguration of S.
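For a finite event structure, the two conditions of Definition 2 can be checked by exhaustive search. The sketch below is only an illustration under stated assumptions: the function name `configurations` and the dictionary-based encoding are ours, and the conflict relation is assumed to be given already closed under symmetry and inheritance. The data again describes a race between a load and a store on a, together with an independent store on b.

```python
from itertools import combinations

def configurations(events, causes, conflict):
    """Enumerate the configurations of a finite event structure.

    `causes[e]` is the set [e] (e included); `conflict` is a symmetric set
    of pairs, assumed closed under conflict inheritance.
    """
    events = sorted(events)
    result = []
    for size in range(len(events) + 1):
        for subset in combinations(events, size):
            x = frozenset(subset)
            down_closed = all(causes[e] <= x for e in x)
            consistent = all((s, t) not in conflict for s in x for t in x)
            if down_closed and consistent:
                result.append(x)
    return result

causes = {
    "w": {"w"}, "r_after_w": {"w", "r_after_w"},
    "r": {"r"}, "w_after_r": {"r", "w_after_r"},
    "wb": {"wb"},
}
store_first, load_first = {"w", "r_after_w"}, {"r", "w_after_r"}
conflict = {(s, t) for s in store_first for t in load_first}
conflict |= {(t, s) for s, t in conflict}

configs = configurations(causes.keys(), causes, conflict)
print(len(configs))
```

Each set [e] appears among the results, as stated above, and no configuration contains both the load-first and the store-first branch.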
2.2. Simple parallel composition and sum of event structures
Definition 3 (Simple parallel composition). Given two event structures S and T, we form the event structure S ‖ T, called the simple parallel composition of S and T, defined by:
• Events: $\{0\} \times S \cup \{1\} \times T$ (tagged disjoint union)
• Causality: $\leq_{S \parallel T} = \{((0, s), (0, s')) \mid s \leq_S s'\} \cup \{((1, t), (1, t')) \mid t \leq_T t'\}$
• Conflict: $\#_{S \parallel T} = \{((0, s), (0, s')) \mid s \mathrel{\#_S} s'\} \cup \{((1, t), (1, t')) \mid t \mathrel{\#_T} t'\}$
The parallel composition S ‖ T is the system obtained by letting the systems S and T evolve
concurrently without interferences (conﬂict or causality). As a consequence, conﬁgurations of S ‖ T
are in one-to-one correspondence with pairs of conﬁgurations of S and T .
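Example 2. Parallel composition can be used to describe the semantics of threads that do not interfere. For instance, we have $\llbracket a := 1; a := 2 \parallel b := 2 \rrbracket = \llbracket a := 1; a := 2 \rrbracket \parallel \llbracket b := 2 \rrbracket$, depicted as follows:
[Diagram: a causal chain $\mathtt{Wr}^{(1)}_a \rightarrowtriangle \mathtt{Wr}^{(2)}_a$ next to an isolated event $\mathtt{Wr}^{(2)}_b$.]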
Non-deterministic sums of event structures are defined similarly to parallel composition, except that S and T are set in conflict: executions (or configurations) are either contained in S or in T.
Deﬁnition 4 (Sum of event structures).  Given two event structures S and T we form the event
structure S + T deﬁned by:
• Events: those of S ‖ T
• Causality: that of S ‖ T
• Conflict: that of S ‖ T plus every pair ((i, s), (j, t)) with i ≠ j.
Example 3. Sums of event structures will be used to represent abstract load operations. For instance, a load on a variable a whose context is not known can be represented as a sum of its possible outcomes (in any execution a given load reads only one value):
[Diagram: pairwise conflicting events $\mathtt{R}^{(0)}_a$, $\mathtt{R}^{(1)}_a$, $\mathtt{R}^{(2)}_a$, . . .]
These operations naturally extend to labelled event structures by letting $\mathrm{lbl}_{S_0 \parallel S_1}(i, e) = \mathrm{lbl}_{S_0 + S_1}(i, e) = \mathrm{lbl}_{S_i}(e)$.
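Both operations are straightforward to write down for finite structures. The following Python sketch uses an assumed encoding (a triple of sets `(events, leq, conflict)`) and hypothetical function names; it builds S ‖ T and S + T for the two threads of Example 2 and counts configurations by brute force.

```python
from itertools import combinations

def parallel(S, T):
    """Simple parallel composition: tagged disjoint union, no interference."""
    (es, ls, cs), (et, lt, ct) = S, T
    events = {(0, s) for s in es} | {(1, t) for t in et}
    leq = {((0, a), (0, b)) for a, b in ls} | {((1, a), (1, b)) for a, b in lt}
    conflict = ({((0, a), (0, b)) for a, b in cs}
                | {((1, a), (1, b)) for a, b in ct})
    return events, leq, conflict

def sum_(S, T):
    """Sum: as parallel composition, plus conflict across the two sides."""
    events, leq, conflict = parallel(S, T)
    cross = {(e, f) for e in events for f in events if e[0] != f[0]}
    return events, leq, conflict | cross

def configurations(structure):
    """Finite, down-closed, conflict-free subsets (brute force)."""
    events, leq, conflict = structure
    found = []
    for size in range(len(events) + 1):
        for subset in combinations(sorted(events), size):
            x = set(subset)
            if (all(a in x for a, b in leq if b in x)
                    and not any((s, t) in conflict for s in x for t in x)):
                found.append(frozenset(x))
    return found

# [[a := 1; a := 2]] is a two-event chain, [[b := 2]] a single event.
S = ({"Wr(1)a", "Wr(2)a"}, {("Wr(1)a", "Wr(2)a")}, set())
T = ({"Wr(2)b"}, set(), set())

print(len(configurations(S)), len(configurations(T)))
print(len(configurations(parallel(S, T))), len(configurations(sum_(S, T))))
```

The count for S ‖ T is the product of the counts for S and T, reflecting the one-to-one correspondence with pairs of configurations, while in S + T the two sides only share the empty configuration.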
2.3. Preﬁxing and concatenation
In the following, we will often want to concatenate event structures to represent sequential
composition. To allow for the instructions to be reordered we need this composition not to be
completely sequential in order to break some causalities. To that end, we introduce a relaxed
concatenation operator.
Definition 5 (Concatenation). Let $(S, \mathrm{lbl} : S \to L)$ and $(T, \mathrm{lbl} : T \to L)$ be labelled event structures and $R \subseteq L \times L$ be a relation satisfying:
• for any $t \in T$, there is a finite number of $s \in S$ such that $(\mathrm{lbl}(s), \mathrm{lbl}(t)) \in R$;
• for any $t \in T$, there do not exist $s, s' \in S$ such that $s \mathrel{\#_S} s'$ and both $s \leq_{S \otimes_R T} t$ and $s' \leq_{S \otimes_R T} t$,
where $\leq_{S \otimes_R T}$ is defined as the transitive closure of $\leq_{S \parallel T} \cup \{((0, s), (1, t)) \mid (\mathrm{lbl}(s), \mathrm{lbl}(t)) \in R\}$. We form the labelled event structure $S \otimes_R T$ as follows:
• Events and labels: those of $S \parallel T$
• Causality: $\leq_{S \otimes_R T}$
• Conflict: $s \mathrel{\#} s'$ iff there exist $s_0 \leq_{S \otimes_R T} s$ and $s'_0 \leq_{S \otimes_R T} s'$ such that $s_0 \mathrel{\#_{S \parallel T}} s'_0$.
In $S \otimes_R T$, S and T occur concurrently except for some causalities from S to T specified by R. An instance of this is prefixing: let $(S, \mathrm{lbl} : S \to L)$ be a labelled event structure and $l \in L$ a label. We write $l \cdot S$ for $\{l\} \otimes_{\{l\} \times S} S$. The event structure $l \cdot S$ starts by doing $l$ and then proceeds to do S.
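The relaxed concatenation can be sketched for finite structures as follows. The encoding (a triple `(labels, leq, conflict)` with `labels` a dictionary from events to labels), the label format and the function names are assumptions made for this illustration; R is passed as a predicate on pairs of labels. The second side condition of Definition 5 is detected after the fact: an event with two conflicting causes would end up in conflict with itself.

```python
def transitive_closure(pairs):
    closure = set(pairs)
    while True:
        new = {(a, d) for a, b in closure for c, d in closure if b == c}
        if new <= closure:
            return closure
        closure |= new

def concatenate(S, T, R):
    """Relaxed concatenation of two finite labelled event structures."""
    (ls, leq_s, conf_s), (lt, leq_t, conf_t) = S, T
    labels = {(0, s): l for s, l in ls.items()}
    labels.update({(1, t): l for t, l in lt.items()})
    leq = {((0, a), (0, b)) for a, b in leq_s}
    leq |= {((1, a), (1, b)) for a, b in leq_t}
    leq |= {(e, e) for e in labels}
    # Extra causal links from S to T, as dictated by R on labels.
    leq |= {((0, s), (1, t)) for s in ls for t in lt if R(ls[s], lt[t])}
    leq = transitive_closure(leq)
    base = {((0, a), (0, b)) for a, b in conf_s}
    base |= {((1, a), (1, b)) for a, b in conf_t}
    # Conflict is inherited along the new causality.
    conflict = {(e, f) for e in labels for f in labels
                if any((e0, e) in leq and (f0, f) in leq for e0, f0 in base)}
    if any(e == f for e, f in conflict):
        raise ValueError("some event would have two conflicting causes")
    return labels, leq, conflict

def prefix(label, S):
    """l . S: one event labelled `label`, placed before every event of S."""
    return concatenate(({"l": label}, set(), set()), S, lambda a, b: True)

# Labels are (kind, variable, value); R relates labels on the same variable.
same_variable = lambda l1, l2: l1[1] == l2[1]
S = ({"s": ("Wr", "a", 1)}, set(), set())
T = ({"t1": ("Wr", "a", 2), "t2": ("Wr", "b", 1)}, set(), set())

labels, leq, conflict = concatenate(S, T, same_variable)
print(((0, "s"), (1, "t1")) in leq, ((0, "s"), (1, "t2")) in leq)
```

With this choice of R, the store to a in S becomes a cause of the later store to a, while the store to b stays concurrent with it; `prefix` recovers the fully sequential case.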
3. Setting up the stage
In this section, we introduce the syntax for a simple concurrent imperative programming language.
We then discuss the general structure of our model.
3.1. Syntax of our language
For the purpose of this article we introduce a very simple language which features enough interesting
behaviours to illustrate the model. We suppose given two disjoint sets of variable names a, b, c, . . . ∈ S
for shared variables between threads and r, s, . . . ∈ R for register (thread-local) variables. The syntax
is as follows:
e, e′ ::= k ∈ N | r ∈ R | e + e′      (Arithmetic expressions)
t ::= a := e; t                       (Store to a shared variable)
    | r ← a; t                        (Load from a shared variable)
    | ()                              (Empty thread)
p ::= t1 ‖ . . . ‖ tn                  (Programs)
We use two different syntactic constructs for loads and stores to make the distinction more explicit. Moreover, the construct r ← a; t binds r to a inside t. We use the shorthands r ← a and a := e for r ← a; () and a := e; () respectively. Write fv(e) for the set of register variables appearing in e. To ensure well-formedness we use a very simple type system whose judgments are of the form Γ; ∆ ⊢ t for threads and Γ ⊢ p for programs. The context Γ is a set of global variables and ∆ is a set of thread-local (or register) variables. Rules are given in Figure 1.
\[
\frac{\mathrm{fv}(e) \subseteq \Delta}{\Delta \vdash e}
\qquad
\frac{a, \Gamma;\ r, \Delta \vdash t}{a, \Gamma;\ \Delta \vdash r \leftarrow a;\, t}
\qquad
\frac{\Delta \vdash e \quad a, \Gamma;\ \Delta \vdash t}{a, \Gamma;\ \Delta \vdash a := e;\, t}
\qquad
\frac{\Gamma; \vdash t_1 \ \cdots\ \Gamma; \vdash t_n}{\Gamma \vdash t_1 \parallel \cdots \parallel t_n}
\]
Figure 1: Typing rules for the language
Variables are initialized to zero at the beginning of an execution.
3.2. Open and closed semantics
In the rest of the paper, we show an example of a possible model based on event structures allowing for instruction reordering and non-locality.
The speciﬁcation of such a model can be split into two parts:
• The processor part which explains which instructions may or may not be permuted inside a
thread.
• The memory part which explains how memory operations in one thread are propagated to the
others.
We believe these two parts should be cleanly separated in the model and should be independent from one another, as is implicitly the case in existing models (e.g. in [2]).
The first step is what we call open semantics: it is about computing the exact control flow of the program, leaving the side effects aside. It interprets a program Γ ⊢ p as if it were a pure functional program λΓ.p where the read and write operations are kept abstract and unspecified, and where ; is not interpreted as sequential composition but as something that allows concurrency between instructions that can be reordered. This construction is presented in Section 4.
The second step is what we call closed semantics: it applies the λΓ.p we get from the first step to implementations of memory cells (one for each variable in Γ). Giving the semantic behaviour of these cells amounts exactly to giving the formal semantics of the memory: how reads and writes are propagated across threads. This construction is presented in Section 5.
For each step, different choices can be made, corresponding to different architectures. In Sections 4 and 5, we pick a possible choice for each component.
3.3. An example of open semantics: a sequentially consistent semantics
In this subsection, we quickly illustrate what we mean by open semantics on a simple example by
deﬁning an open semantics that does not allow reordering any instructions within a thread. Coupled
with a sequential memory model, this would only exhibit the sequentially consistent executions.
Our semantics will map programs over a context Γ to event structures with labels in the set $L_\Gamma = \{\mathtt{Wr}^{(k)}_a, \mathtt{Re}^{(r=k)}_a \mid a \in \Gamma, k \in \mathbb{N}, r \in R\}$. There is an obvious map $v : L_\Gamma \to \Gamma$ defined by $v(\mathtt{Re}^{(r=k)}_a) = a$ and $v(\mathtt{Wr}^{(k)}_a) = a$.
Environments. To handle thread-local variables, we will use an environment to pass down their values as we compute the interpretation. An environment will be a map $\rho : \Delta \to \mathbb{N}$ assigning a value to each thread-local variable. Such an environment extends naturally to arithmetic expressions with variables in ∆ by the following equations:
\[ \rho(k) = k \qquad \rho(e + e') = \rho(e) + \rho(e'). \]
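Interpreting threads. Given a thread Γ; ∆ ⊢ t and an environment ρ : ∆ → N, we define a labelled event structure $\llbracket t \rrbracket_\rho$ as follows:
• $\llbracket a := e; t \rrbracket_\rho = \mathtt{Wr}^{(\rho(e))}_a \cdot \llbracket t \rrbracket_\rho$: we perform the write on a and then continue to t.
• $\llbracket r \leftarrow a; t \rrbracket_\rho = \sum_{n \in \mathbb{N}} \mathtt{Re}^{(r=n)}_a \cdot \llbracket t \rrbracket_{\rho[r \leftarrow n]}$: as the context is open, the model needs to account for all possible values, formalized as a non-deterministic sum of event structures.
Interpreting programs. Given a program $p = t_1 \parallel \dots \parallel t_n$ with Γ ⊢ p, we define its open interpretation $\llbracket p \rrbracket$ as $\llbracket t_1 \rrbracket_\emptyset \parallel \dots \parallel \llbracket t_n \rrbracket_\emptyset$ where ∅ is the empty environment on the empty context.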
This interpretation is very simple: threads are interpreted by sequential event structures: two
events are either comparable or conﬂicting. There is no in-thread concurrency as no instructions can
be permuted.
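The two clauses above translate almost literally into a recursive function. The Python sketch below is an illustration under assumptions of ours: threads are lists of tuples, N is cut down to {0, 1}, labels are rendered as strings, and the function `interpret` returns each event with the index of its immediate cause, which is enough to describe a sequential event structure (two events of a thread are in conflict exactly when neither is an ancestor of the other).

```python
VALUES = (0, 1)  # finite stand-in for N, an assumption of this sketch

def evaluate(expr, env):
    """Expressions are ints, register names, or ('+', e1, e2)."""
    if isinstance(expr, int):
        return expr
    if isinstance(expr, str):
        return env[expr]
    _, left, right = expr
    return evaluate(left, env) + evaluate(right, env)

def interpret(thread, env=None, parent=None, events=None):
    """Events of [[thread]]_env as (label, index of immediate cause) pairs.

    A store is ('store', variable, expression); a load is
    ('load', register, variable).
    """
    env = {} if env is None else env
    events = [] if events is None else events
    if not thread:
        return events
    (kind, target, source), rest = thread[0], thread[1:]
    if kind == "store":
        # Prefix the continuation with the write event.
        events.append((f"Wr({evaluate(source, env)})_{target}", parent))
        interpret(rest, env, len(events) - 1, events)
    else:
        # Sum over every value possibly read: one branch per value.
        for n in VALUES:
            events.append((f"Re({target}={n})_{source}", parent))
            interpret(rest, {**env, target: n}, len(events) - 1, events)
    return events

thread_1 = [("load", "r1", "a"), ("store", "b", 1)]
thread_2 = [("load", "r2", "b"), ("store", "a", "r2")]
for label, cause in interpret(thread_1) + interpret(thread_2):
    print(label, cause)
```

The interpretation of a program is then the parallel composition of these per-thread structures, that is, their disjoint union.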
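Example 4. On the program p = (r1 ← a; b := 1) ‖ (r2 ← b; a := r2), this gives the following event structure (N is assumed to be {0, 1} to simplify the picture):
[Diagram: four causal chains, $\mathtt{Re}^{(r_1=0)}_a \rightarrowtriangle \mathtt{Wr}^{(1)}_b$ and $\mathtt{Re}^{(r_1=1)}_a \rightarrowtriangle \mathtt{Wr}^{(1)}_b$ for the first thread, $\mathtt{Re}^{(r_2=0)}_b \rightarrowtriangle \mathtt{Wr}^{(0)}_a$ and $\mathtt{Re}^{(r_2=1)}_b \rightarrowtriangle \mathtt{Wr}^{(1)}_a$ for the second thread; the two chains of each thread are in conflict.]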
4. Open semantics
In this section we develop an open interpretation that allows permuting operations that do not target
the same variable. This behaviour can be found in the Alpha architecture [2]. In particular, we forbid
permuting loads from the same variable even though this optimization is sequentially valid. This
particular choice is simple to present as it is uniform but the theory supports other choices.
In the semantics, we represent instructions that can be reordered by concurrent events. The goal
of this section is to build a semantics similar to the one of Section 3.3 which allows for concurrency
inside a thread.
4.1. Operation dependency
The challenge of this interpretation is to determine which memory operations are causally related. Indeed, in the interpretation we want to replace the earlier definition
\[ \llbracket a := e; t \rrbracket_\rho = \mathtt{Wr}^{(\rho(e))}_a \cdot \llbracket t \rrbracket_\rho, \]
which is too sequential, by something using the concatenation operator $\otimes_R$:
\[ \llbracket a := e; t \rrbracket_\rho = \mathtt{Wr}^{(\rho(e))}_a \otimes_{\rightarrowtriangle_d} \llbracket t \rrbracket_\rho \]
for a well-chosen relation $\rightarrowtriangle_d$ which indicates when a label is a necessary cause of another one. We should have $\mathtt{Wr}^{(k)}_a \rightarrowtriangle_d s$ whenever $v(s) = a$. But what about $\mathtt{Re}^{(r=k)}_a$? For instance, we do not know which stores depend on r: we can only observe the value written, not the arithmetic expression actually computed. To that end, we need to enrich our labels to contain information about registers. But the problem is more subtle than that:
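Example 5. We want our semantics to compute the following event structure for the program r1 ← a; b := r1 ‖ r2 ← b; a := 1:
[Diagram: two conflicting causal chains $\mathtt{Re}^{(r_1=0)}_a \rightarrowtriangle \mathtt{Wr}^{(0)}_b$ and $\mathtt{Re}^{(r_1=1)}_a \rightarrowtriangle \mathtt{Wr}^{(1)}_b$ for the first thread; for the second thread, the conflicting events $\mathtt{Re}^{(r_2=0)}_b$ and $\mathtt{Re}^{(r_2=1)}_b$ and a single event $\mathtt{Wr}^{(1)}_a$ with no causal link to them.]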
We observe that a := 1 only generates one event whereas b := r1 generates two. We have to track the dependency on r1 of the store to b in the first thread and duplicate the write event accordingly, while the store of the second thread is independent of r2 and is represented by a single event.
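To solve this, we need to determine how many reads an operation depends on. Moreover, we also need to track the values read, so as not to put causal links between results of incompatible reads. This means that our labels should now carry a partial map $R \rightharpoonup \mathbb{N}$ indicating which registers are used and what their values are. Defining $L'_\Gamma = [R \rightharpoonup \mathbb{N}] \times L_\Gamma$, we can now define $\rightarrowtriangle_d \subseteq L'_\Gamma \times L'_\Gamma$ by the following equations:
\[ (\rho, \mathtt{Wr}^{(k)}_a) \rightarrowtriangle_d (\rho', s) \;\equiv\; \rho \subseteq \rho' \wedge v(s) = a \]
\[ (\rho, \mathtt{Re}^{(r=k)}_a) \rightarrowtriangle_d (\rho', s) \;\equiv\; \rho \subseteq \rho' \wedge r \in \mathrm{dom}(\rho') \]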
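The two equations can be read as a predicate on enriched labels. In the Python sketch below, the encoding is an assumption made for illustration: a partial map is a dictionary from register names to values, a write label is `('Wr', variable, value)` and a read label is `('Re', variable, register, value)`.

```python
def depends(first, second):
    """The dependency relation on enriched labels (rho, label)."""
    (rho, label), (rho2, label2) = first, second
    if not rho.items() <= rho2.items():   # rho must be included in rho'
        return False
    if label[0] == "Wr":
        return label2[1] == label[1]      # v(s) = a: same shared variable
    return label[2] in rho2               # the register read is in dom(rho')

# Thread  r <- b; a := r : the store depends on the value read.
read_0 = ({"r": 0}, ("Re", "b", "r", 0))
read_1 = ({"r": 1}, ("Re", "b", "r", 1))
write_0 = ({"r": 0}, ("Wr", "a", 0))
write_1 = ({"r": 1}, ("Wr", "a", 1))
print(depends(read_0, write_0), depends(read_0, write_1))

# Thread  r <- a; b := 1 : the store uses no register at all.
read_a = ({"r": 0}, ("Re", "a", "r", 0))
write_b = ({}, ("Wr", "b", 1))
print(depends(read_a, write_b))
```

The inclusion test on partial maps is what prevents a causal link between a read of 0 and a write computed from a read of 1.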
We do not need the environments to carry values of registers anymore: any operation that needs the value of a register will give rise to one event per value possibly read. However, to compute which registers an operation depends on, we need to keep track of the register bindings. Indeed the thread $\llbracket a := e; t \rrbracket$ depends on any read on any register bound to a (i.e. registers whose values are loaded from a). Therefore, to define the semantics by induction, we will maintain an environment σ : ∆ → Γ containing exactly this information.
4.2. Semantics of threads
Assume we have a thread Γ; ∆ ⊢ t.
Stores. Suppose t = (a := e; t′), and an environment σ : ∆ → Γ. The partial environment ρ for this thread is defined on $D = \mathrm{fv}(e) \cup \sigma^{-1}(a)$ where $\sigma^{-1}(a)$ is the set of registers bound to a in σ. We let:
\[ \llbracket t \rrbracket_\sigma = \Big( \sum_{\rho : D \to \mathbb{N}} \big(\rho, \mathtt{Wr}^{(\rho(e))}_a\big) \Big) \otimes_{\rightarrowtriangle_d} \llbracket t' \rrbracket_\sigma. \]
(Recall that ρ(e) stands for the extension of ρ to arithmetic expressions with registers in D.) This picks a non-deterministic value for each read that has to occur before the write, performs the write and proceeds to t′, letting $\otimes_{\rightarrowtriangle_d}$ insert the right causalities between the writes and t′.
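Loads. Suppose t = (r ← a; t′), and an environment σ : ∆ → Γ. This time we let $D = \sigma^{-1}(a)$:
\[ \llbracket t \rrbracket_\sigma = \Big( \sum_{n \in \mathbb{N}} \sum_{\rho : D \to \mathbb{N}} \big(\rho[r \leftarrow n], \mathtt{Re}^{(r=n)}_a\big) \Big) \otimes_{\rightarrowtriangle_d} \llbracket t' \rrbracket_{\sigma[r \leftarrow a]}. \]
We proceed similarly to writes but we also pick non-deterministically the value that is read.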
4.3. Semantics of programs
As before, given a program Γ ⊢ t1 ‖ . . . ‖ tn we let $\llbracket t_1 \parallel \dots \parallel t_n \rrbracket = \llbracket t_1 \rrbracket_\emptyset \parallel \dots \parallel \llbracket t_n \rrbracket_\emptyset$. This gives the desired event structure with enough concurrency to express possible reorderings of instructions. Having done the construction, we project labels in $L'_\Gamma$ to $L_\Gamma$ by forgetting the register environment: it is not needed anymore.
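Example 6 (Read/Write reordering). Assume we have the thread a, b ⊢ r ← a; b := 1 in the empty environment. Computing $\llbracket b := 1 \rrbracket_{r \mapsto a}$ is easy as there is no dependency for this instruction in this environment: D will be empty. This yields the event structure with a single event labelled $(\emptyset, \mathtt{Wr}^{(1)}_b)$. It follows that $\llbracket r \leftarrow a; b := 1 \rrbracket = \llbracket r \leftarrow a \rrbracket \parallel \llbracket b := 1 \rrbracket$ as those events are not related by $\rightarrowtriangle_d$:
[Diagram: three events $(\{r \mapsto 0\}, \mathtt{Re}^{(r=0)}_a)$, $(\{r \mapsto 1\}, \mathtt{Re}^{(r=1)}_a)$ and $(\emptyset, \mathtt{Wr}^{(1)}_b)$ with no causal links; the two reads are in conflict.]
For the thread a, b ⊢ r ← b; a := r the situation is different. Computing $\llbracket a := r \rrbracket_{r \mapsto b}$ needs duplication as D will be {r}: the value written depends on r. This gives the event structure $\sum_{n \in \mathbb{N}} (\{r \mapsto n\}, \mathtt{Wr}^{(n)}_a)$. The whole thread then gives:
[Diagram: two conflicting causal chains $(\{r \mapsto 0\}, \mathtt{Re}^{(r=0)}_b) \rightarrowtriangle (\{r \mapsto 0\}, \mathtt{Wr}^{(0)}_a)$ and $(\{r \mapsto 1\}, \mathtt{Re}^{(r=1)}_b) \rightarrowtriangle (\{r \mapsto 1\}, \mathtt{Wr}^{(1)}_a)$.]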
5. Memory models
Given the open interpretation of a program p, we would like to close it by restricting the scope of the shared variables to p: the environment cannot modify them anymore.¹ To do so, we need to define formally the behaviour of the memory: which sequences of memory operations are valid according to our model? Depending on the architecture, different behaviours can be accepted. Throughout this section, for simplicity of presentation, we make one assumption: the memory behaves as a server that receives commands (reads and writes) in a total order. This assumption is, however, not necessary (see Section 5.4).
Definition 6 (Linear memory model). A memory model is a prefix-closed subset of $(L_a)^*$, the set of finite lists (or traces) of memory events concerning a single fixed variable a.
Given a linear memory model T, we will write T(b) for a given variable b to denote the set of traces allowed on variable b, simply obtained by substituting b for a in the traces of T. Note that in order to prove theorems quantified over all memory models, one would need more axioms in order to get only meaningful memory models, but the constructions presented here do not require them.
The goal of this section is to build the closed semantics $\llbracket \Gamma \vdash p \rrbracket_T$ of a program Γ ⊢ p with respect to a memory model T by eliminating behaviours that are not in T.
Example 7 (A sequential memory). The first memory model that comes to mind is that of a sequential memory. Consider the grammar with non-terminal symbols $(T_k)_{k \in \mathbb{N}}$ defined as follows:
\[ T_k ::= \mathtt{Re}^{(r=k)}_a \cdot T_k \;\mid\; \mathtt{Wr}^{(n)}_a \cdot T_n \;\mid\; \epsilon \]
Then the language generated by $T_0 \subseteq (L_a)^*$ gives a memory model T representing a sequential memory cell initialized to zero: a load must read the last value written to it, or zero if there has been no write to it yet.
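Membership in the language generated by $T_0$ can be decided by a single pass over the trace, keeping the index k of the current non-terminal as state. The sketch below assumes an encoding of memory events as pairs `('Wr', n)` or `('Re', k)`; the register name carried by a read label plays no role for the memory and is omitted. The function name `sequential_memory` is ours.

```python
def sequential_memory(trace, initial=0):
    """Decide whether `trace` is generated by the non-terminal T_initial."""
    current = initial
    for kind, value in trace:
        if kind == "Wr":
            current = value        # Wr(n) . T_n : move to state n
        elif value != current:     # Re(k) is only available in state T_k
            return False
    return True

print(sequential_memory([("Re", 0), ("Wr", 1), ("Re", 1)]))
print(sequential_memory([("Wr", 1), ("Re", 0)]))
```

Since a trace is rejected as soon as one read disagrees with the current value, every prefix of an accepted trace is accepted, as required by Definition 6.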
¹ This operation is similar to restriction in the pi-calculus.
5.1. Closed semantics with respect to a memory model
In this section, we explain how to compute the closed semantics of a program Γ ⊢ p with respect to a specific memory model $T \subseteq (L_a)^*$. We start off by examining a simple example.
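Example 8. Consider the program p = r ← a ‖ a := 1. The open semantics of p is
[Diagram: three events $\mathtt{Re}^{(r=0)}_a$, $\mathtt{Re}^{(r=1)}_a$ and $\mathtt{Wr}^{(1)}_a$ with no causal links; the two reads are in conflict.]
Against the sequential memory defined above, the desired result would be (the read and write must be sequentialized in some order):
[Diagram: two conflicting causal chains $\mathtt{Wr}^{(1)}_a \rightarrowtriangle \mathtt{Re}^{(r=1)}_a$ and $\mathtt{Re}^{(r=0)}_a \rightarrowtriangle \mathtt{Wr}^{(1)}_a$.]
There are more events in this event structure than in the open semantics as the $\mathtt{Wr}^{(1)}_a$ event has been duplicated. Consider now a, b ⊢ a := 1 ‖ b := 2; r ← a; b := r. We have the following open semantics:
[Diagram: an isolated event $\mathtt{Wr}^{(1)}_a$; causal links $\mathtt{Re}^{(r=0)}_a \rightarrowtriangle \mathtt{Wr}^{(0)}_b$, $\mathtt{Re}^{(r=1)}_a \rightarrowtriangle \mathtt{Wr}^{(1)}_b$, $\mathtt{Wr}^{(2)}_b \rightarrowtriangle \mathtt{Wr}^{(0)}_b$ and $\mathtt{Wr}^{(2)}_b \rightarrowtriangle \mathtt{Wr}^{(1)}_b$.]
Computing the closed semantics involves scheduling the operations on the two variables a and b in a consistent manner, which yields a more complicated picture:
[Diagram: seven events, with two copies of $\mathtt{Wr}^{(1)}_a$. Causal links: $\mathtt{Re}^{(r=0)}_a \rightarrowtriangle \mathtt{Wr}^{(1)}_a$ (second copy), $\mathtt{Re}^{(r=0)}_a \rightarrowtriangle \mathtt{Wr}^{(0)}_b$, $\mathtt{Wr}^{(2)}_b \rightarrowtriangle \mathtt{Wr}^{(0)}_b$, $\mathtt{Wr}^{(2)}_b \rightarrowtriangle \mathtt{Wr}^{(1)}_b$, $\mathtt{Wr}^{(1)}_a$ (first copy) $\rightarrowtriangle \mathtt{Re}^{(r=1)}_a$ and $\mathtt{Re}^{(r=1)}_a \rightarrowtriangle \mathtt{Wr}^{(1)}_b$; $\mathtt{Re}^{(r=0)}_a$ and the first copy of $\mathtt{Wr}^{(1)}_a$ are in conflict.]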
As a consequence of the linearity assumption, two events concerning the same memory cell are either
causally comparable or in conﬂict.
The configurations (or states) of the desired resulting event structures are easier to understand than the set of events: they always correspond to a configuration of the open semantics $\llbracket \Gamma \vdash p \rrbracket$ equipped with linearization information for each variable a ∈ Γ. Configurations of $\llbracket \Gamma \vdash p \rrbracket$ that have no possible linearizations disappear, while configurations that have several per-variable linearizations yield several configurations of the result. We now make this intuition formal.
Projection. It will be useful to project a configuration of an event structure to a particular variable. This is done through the notion of projection:
Definition 7 (Projection). Given an event structure S and a set of events V ⊆ S, we form an event structure S ↓ V as follows:
• Events: $V$
• Causality: $\leq_S \cap V^2$
• Conflict: $\#_S \cap V^2$
In S ↓ V only the events in V are visible while the others are thought to be occurring in background
or hidden. Note that any conﬁguration x of S induces a conﬁguration x ∩ V of S ↓ V .
Given a ∈ Γ and (S, lbl) an event structure labelled in $L_\Gamma$, we write S ↓ a as a shorthand for $S \downarrow \{s \in S \mid v(\mathrm{lbl}(s)) = a\}$: the projection of S to the memory events concerning a. Any configuration $x \in \mathcal{C}(S)$ induces a configuration $x \downarrow a \in \mathcal{C}(S \downarrow a)$.
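Example 9. The projection to b of the open semantics of the second program of Example 8 gives:
[Diagram: causal links $\mathtt{Wr}^{(2)}_b \rightarrowtriangle \mathtt{Wr}^{(0)}_b$ and $\mathtt{Wr}^{(2)}_b \rightarrowtriangle \mathtt{Wr}^{(1)}_b$.]
This is a partial order that can be linearized in several ways, each one representing a trace: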
Deﬁnition 8 (Trace of a conﬁguration).  Let (S, lbl) be a labelled event structure. A covering
chain of x ∈ C (S) is a sequence s0, . . . , sn ∈ S such that {s0, . . . , sn} = x and for all i ≤ n, {s0, . . . , si}
is a conﬁguration of S. A trace of x is a sequence of the form lbl(s0), . . . , lbl(sn) where s0, . . . , sn is
a covering chain of x.
We write tr(x) for the set of traces of x.
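Since a configuration is already consistent, a permutation of its events is a covering chain exactly when every event appears after all of its causes. The following Python sketch enumerates traces on this basis; the function name `traces` and the dictionary encoding (`causes[e]` for the strict causes of e, `label[e]` for its label) are assumptions of the illustration. The data is taken from the second program of Example 8: a configuration containing the store a := 1, the load reading 1 and the two stores to b, projected to a and to b.

```python
from itertools import permutations

def traces(x, causes, label):
    """Traces of the configuration x: labels read along its covering chains."""
    result = set()
    for chain in permutations(sorted(x)):
        seen = set()
        for event in chain:
            if not causes[event] <= seen:
                break              # an event occurs before one of its causes
            seen.add(event)
        else:
            result.add(tuple(label[e] for e in chain))
    return result

# Projection to a: the store and the load are concurrent in the open semantics.
label_a = {"w": "Wr(1)_a", "r": "Re(r=1)_a"}
causes_a = {"w": set(), "r": set()}
print(sorted(traces({"w", "r"}, causes_a, label_a)))

# Projection to b: Wr(2)_b is a cause of Wr(1)_b.
label_b = {"w2": "Wr(2)_b", "w1": "Wr(1)_b"}
causes_b = {"w2": set(), "w1": {"w2"}}
print(sorted(traces({"w2", "w1"}, causes_b, label_b)))
```

The projection to a has two traces, one per scheduling of the race, whereas the projection to b has only one.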
We can define the desired configurations of our event structure as follows. Let Γ ⊢ p be a program and write $\llbracket \Gamma \vdash p \rrbracket$ for its open interpretation. We define
\[ S_p = \{(x, (t_a)_{a \in \Gamma}) \mid x \in \mathcal{C}(\llbracket \Gamma \vdash p \rrbracket) \text{ and } \forall a \in \Gamma,\ t_a \in \mathrm{tr}(x \downarrow a) \cap T(a)\}. \]
The set $S_p$ is naturally ordered: $(x, (t_a)) \sqsubseteq (y, (t'_a))$ if $x \subseteq y$ and for all a ∈ Γ, $t_a$ is a prefix of $t'_a$. Such pairs are the configurations of our program. To get an event structure S such that $(\mathcal{C}(S), \subseteq)$ is order-isomorphic to $(S_p, \sqsubseteq)$, we use a construction similar to the synchronized product of event structures [16]. First, for each $(x, (t_a)) \in S_p$, we equip x with a partial order written $\leq_{(x,(t_a))}$:
\[ s \leq_{(x,(t_a))} s' \iff \forall (x', (t'_a)) \sqsubseteq (x, (t_a)),\ s' \in x' \Rightarrow s \in x' \]
This reads: s is below s′ within $(x, (t_a))$ when s occurs in every sub-configuration of $(x, (t_a))$ where s′ occurs.
Proposition 1 (Prime construction). For a given memory model T, the following forms an event structure written $\llbracket \Gamma \vdash p \rrbracket_T$:
• Events: those $(x, (t_a)) \in S_p$ such that $(x, \leq_{(x,(t_a))})$ has a greatest element (such a configuration is called prime).
• Causality: given by restricting $\sqsubseteq$ to the set of events.
• Conflict: $(x, (t_a)) \mathrel{\#} (x', (t'_a))$ when
  - either there exist s ∈ x and s′ ∈ x′ such that s # s′,
  - or there exists a ∈ Γ such that $t_a$ and $t'_a$ are not comparable for the prefix order.
Proof. Since configurations are finite sets, any prime configuration $(x, (t_a))$ has a finite number of elements below it for $\sqsubseteq$, so in particular a finite number of prime configurations.
Moreover, assume that $(x, (t_a)) \mathrel{\#} (x', (t'_a))$ and $(x', (t'_a)) \sqsubseteq (x'', (t''_a))$. If we have s ∈ x and s′ ∈ x′ such that s # s′, the conflict is indeed inherited as x′ ⊆ x′′. Otherwise, write b for a variable such that $t_b$ and $t'_b$ are incomparable. It is easy to see that $t_b$ and $t''_b$ also have to be incomparable since the prefix order is a tree.
We also have that any partial order of the form $(x, \leq_{(x,(t_a))})$ is order-isomorphic to a unique configuration of $\llbracket \Gamma \vdash p \rrbracket_T$, hence $(\mathcal{C}(\llbracket \Gamma \vdash p \rrbracket_T), \subseteq)$ is order-isomorphic to $(S_p, \sqsubseteq)$.
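For a finite open semantics the whole construction can be run by brute force. The Python sketch below is a naive illustration, not the implementation mentioned in the introduction; the encodings and function names are assumptions. A state of $S_p$ is represented as a tuple of event chains, one per variable, where each chain must be accepted by the memory model (here the sequential memory of Example 7). Chains are kept as tuples of events rather than labels, which makes no difference as long as no configuration contains two events with the same label. The data is the open semantics of the first program of Example 8, r ← a ‖ a := 1.

```python
from itertools import combinations, permutations, product

def sequential(trace):
    """Sequential memory cell initialised to 0 (the model of Example 7)."""
    current = 0
    for kind, _variable, value in trace:
        if kind == "Wr":
            current = value
        elif value != current:
            return False
    return True

def linearizes(chain, causes):
    """Does `chain` list its events in an order compatible with causality?"""
    return all(causes[e] & set(chain) <= set(chain[:i])
               for i, e in enumerate(chain))

def closed_states(label, causes, conflict, accepts):
    """Enumerate S_p: configurations with one accepted chain per variable."""
    variables = sorted({variable for _, variable, _ in label.values()})
    states = []
    for size in range(len(label) + 1):
        for subset in combinations(sorted(label), size):
            x = set(subset)
            if any(not causes[e] <= x for e in x):
                continue                      # not down-closed
            if any((s, t) in conflict for s in x for t in x):
                continue                      # not consistent
            chains = [[c for c in permutations(
                           [e for e in subset if label[e][1] == v])
                       if linearizes(c, causes)
                       and accepts([label[e] for e in c])]
                      for v in variables]
            states.extend(product(*chains))
    return states

def is_prime(state, causes):
    """Does the order on the events of `state` have a greatest element?"""
    x = {e for chain in state for e in chain}
    prefixes = [[chain[:i] for i in range(len(chain) + 1)] for chain in state]
    subs = []
    for choice in product(*prefixes):         # sub-states of `state`
        y = {e for chain in choice for e in chain}
        if all(causes[e] <= y for e in y):
            subs.append(y)
    def below(s, t):
        return all(s in y for y in subs if t in y)
    return any(all(below(s, top) for s in x) for top in x)

# Labels are (kind, variable, value); causes[e] are the strict causes of e.
LABEL = {"r0": ("Re", "a", 0), "r1": ("Re", "a", 1), "w": ("Wr", "a", 1)}
CAUSES = {"r0": set(), "r1": set(), "w": set()}
CONFLICT = {("r0", "r1"), ("r1", "r0")}

states = closed_states(LABEL, CAUSES, CONFLICT, sequential)
primes = [s for s in states if is_prime(s, CAUSES)]
print(len(states), len(primes))
```

On this program the search finds five states, four of which are prime: the load of 0, the store, the store after the load of 0, and the load of 1 after the store. These are the four events of the closed semantics shown in Example 8.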
[Figure 2: Closed semantics for a strict memory model of the store buffering example]
5.2. A relaxed memory model
In this section, we show an instance of a relaxed memory model that exhibits behaviours forbidden
by the sequential model of Section 5.
Example 10 (Store buﬀering).  Consider the following program:
a := 1 b := 1
r1 ← a s1 ← b
r2 ← b s2 ← a
Assume an open semantics where Store/Load reorderings are not allowed so that for this program it
would look like:
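[Diagram not reproduced: the write $\mathrm{Wr}^{(1)}_a$ lies below the four reads $\mathrm{Re}^{(r1=0)}_a$, $\mathrm{Re}^{(r1=1)}_a$, $\mathrm{Re}^{(r2=0)}_b$, $\mathrm{Re}^{(r2=1)}_b$, and the write $\mathrm{Wr}^{(1)}_b$ lies below the four reads $\mathrm{Re}^{(s1=0)}_b$, $\mathrm{Re}^{(s1=1)}_b$, $\mathrm{Re}^{(s2=0)}_a$, $\mathrm{Re}^{(s2=1)}_a$.]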
(We keep the events $\mathrm{Re}^{(r1=0)}_a$ and $\mathrm{Re}^{(s1=0)}_b$ because the environment can still modify the memory.)
Closing it with the sequential memory model defined above gives the large event structure of Figure
2. The key point is that there are unique events corresponding to reading r2 = 0 and s2 = 0, and
they are in conflict (inherited from the writes), so the outcome r2 = s2 = 0 is not possible.
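The impossibility of r2 = s2 = 0 under a sequential memory can also be checked by brute force. The following Python sketch is not the event-structure construction: it simply enumerates every interleaving of the two threads and runs it against a single shared memory. It assumes that each thread executes in full program order (stronger than the preserved program order of the open semantics), and the instruction encoding and function names are hypothetical.

```python
# Each instruction is (kind, variable, argument):
#   ("write", var, value)    stores value into var
#   ("read",  var, register) loads var into the named register
THREAD_1 = [("write", "a", 1), ("read", "a", "r1"), ("read", "b", "r2")]
THREAD_2 = [("write", "b", 1), ("read", "b", "s1"), ("read", "a", "s2")]


def interleavings(p, q):
    """Yield every merge of p and q that keeps each list in its own order."""
    if not p or not q:
        yield list(p) + list(q)
        return
    for rest in interleavings(p[1:], q):
        yield [p[0]] + rest
    for rest in interleavings(p, q[1:]):
        yield [q[0]] + rest


def run_sequentially(schedule):
    """Execute a schedule against one shared memory, initially all zero."""
    memory = {"a": 0, "b": 0}
    registers = {}
    for kind, var, arg in schedule:
        if kind == "write":
            memory[var] = arg
        else:
            registers[arg] = memory[var]
    return registers


def sc_outcomes():
    """Set of observable (r2, s2) pairs over all interleavings."""
    return {(regs["r2"], regs["s2"])
            for regs in map(run_sequentially,
                            interleavings(THREAD_1, THREAD_2))}
```

Under these assumptions the pair (0, 0) never appears: reading b = 0 into r2 and a = 0 into s2 would require each read to precede the other thread's write, which contradicts program order.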
However, in some situations, r2 = s2 = 0 can be observed. This is an example of store buffering,
exhibited by Intel processors for instance: there might be a delay between the moment a write becomes
available locally and the moment it becomes available globally.
To deal with that, we need to make our model thread-aware in order to distinguish events from
different threads. We now assume that our labels carry an extra integer, called the thread-id, indicating the
origin of the event: we let $L^t_\Gamma = \mathbb{N} \times L'_\Gamma$. It is straightforward to extend our open interpretation
of threads to depend on an integer representing the thread's id and then define
$$\llbracket \Gamma \vdash t_1 \parallel \dots \parallel t_n \rrbracket = \llbracket \Gamma \vdash t_1 \rrbracket(1) \parallel \dots \parallel \llbracket \Gamma \vdash t_n \rrbracket(n).$$
With this, we can define a linear memory model that takes store buffering into account, using
a grammar similar to the one at the beginning of Section 5. Now our symbol is indexed over pairs
$(\rho, k) \in (\mathbb{N} \to \mathbb{N}) \times \mathbb{N}$ where $k$ is the global value of the cell and $\rho(\iota)$ is the local value in thread
$\iota$. Writing $\rho[\iota \leftarrow x]$ to update environments, we define:
$$
\begin{array}{rcll}
T'_{\rho,k} & ::= & \epsilon & \\
 & \mid & \mathrm{Re}^{(r=k)}_{\iota,a} \cdot T'_{\rho[\iota \leftarrow k],k} & \text{(Read from the global memory)} \\
 & \mid & \mathrm{Re}^{(r=\rho(\iota))}_{\iota,a} \cdot T'_{\rho,k} & \text{(Read from the local cache)} \\
 & \mid & \mathrm{Wr}^{(n)}_{\iota,a} \cdot T'_{\rho[\iota \leftarrow n],n} & \text{(Write)}
\end{array}
$$
Then $T' \equiv T'_{(n \mapsto 0),0}$ gives a list of traces representing a memory model allowing for store buffering.
The first axiom says that when a read fetches the value of the global memory, then this value is
available to all threads. This is similar to the Value axiom of Sparc [2].
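The grammar can be read as a small state machine over the pair $(\rho, k)$. The sketch below, in Python, decides whether a trace on a single variable is generated from $T'_{(n \mapsto 0),0}$. The event encoding (`("Wr", thread, value)` and `("Re", thread, value)`) and the function name are assumptions made for illustration; register names are dropped since only the value read matters for membership.

```python
from collections import defaultdict


def in_store_buffering_model(trace):
    """Return True when the trace is generated by the grammar for T'.

    trace is a list of ("Wr", thread, value) or ("Re", thread, value)
    events, all on the same variable, in trace order.
    """
    local = defaultdict(int)  # rho: thread id -> local value, initially 0
    global_value = 0          # k: global value of the cell, initially 0
    for kind, thread, value in trace:
        if kind == "Wr":
            # Write: updates both the writer's local value and the global one.
            local[thread] = value
            global_value = value
        elif kind == "Re":
            if value == global_value:
                # Read from the global memory: the reader's local value
                # becomes k.
                local[thread] = value
            elif value == local[thread]:
                # Read from the local cache: the state is unchanged.
                pass
            else:
                return False
        else:
            raise ValueError(f"unknown event kind: {kind!r}")
    return True
```

A deterministic check suffices here because when both read productions apply (the value equals both $k$ and $\rho(\iota)$), they lead to the same state. For instance, a write of 1 by thread 1, followed by a read of 1 by thread 1 and a read of 0 by thread 2, is accepted: thread 2 still sees its stale local value.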
Example 11.  Considering again the program of Example 10, we can now compute the closed
interpretation with the memory model $T'$. Since we now have $\mathrm{Wr}^{(1)}_{1,a} \cdot \mathrm{Re}^{(r=1)}_{1,a} \cdot \mathrm{Re}^{(r=0)}_{2,a} \in T'(a)$,
more behaviours are allowed, in particular the one where r2 = s2 = 0. The event structure is too big
to be drawn here (but can be visualized via the implementation).
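Since the event structure itself is too large to draw, a rough operational reading can still list the final register values it allows. The Python sketch below is not the paper's implementation: it explores every schedule of the two threads, keeping one pair (local values, global value) per variable and letting each read choose between the global memory and the local cache. It assumes full program order inside each thread, and all names are hypothetical.

```python
THREADS = {
    1: [("write", "a", 1), ("read", "a", "r1"), ("read", "b", "r2")],
    2: [("write", "b", 1), ("read", "b", "s1"), ("read", "a", "s2")],
}
VARIABLES = ("a", "b")


def relaxed_outcomes(threads=THREADS):
    """Collect all final register assignments under the store-buffering model."""
    results = set()
    ids = sorted(threads)
    # Per variable: (rho, k) with every local value and the global value at 0.
    start_cells = {v: ({t: 0 for t in ids}, 0) for v in VARIABLES}

    def explore(pcs, cells, regs):
        if all(pcs[t] == len(threads[t]) for t in ids):
            results.add(tuple(sorted(regs.items())))
            return
        for t in ids:
            if pcs[t] == len(threads[t]):
                continue
            kind, var, arg = threads[t][pcs[t]]
            local, glob = cells[var]
            next_pcs = {**pcs, t: pcs[t] + 1}
            if kind == "write":
                # Write: update the writer's local value and the global value.
                cell = ({**local, t: arg}, arg)
                explore(next_pcs, {**cells, var: cell}, regs)
            else:
                # Read from the global memory: the local value becomes k.
                cell = ({**local, t: glob}, glob)
                explore(next_pcs, {**cells, var: cell}, {**regs, arg: glob})
                # Read from the local cache, when it differs from k.
                if local[t] != glob:
                    explore(next_pcs, cells, {**regs, arg: local[t]})

    explore({t: 0 for t in ids}, start_cells, {})
    return results


def project(results, names):
    """Restrict each outcome to the given register names."""
    return {tuple(dict(r)[n] for n in names) for r in results}
```

With these assumptions, `project(relaxed_outcomes(), ("r2", "s2"))` contains (0, 0): after both writes, each thread may still read the other variable from its stale local cache.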
5.3. Executions in our model
In our model, complete executions correspond to maximal configurations. In more detail, given a
program $\Gamma \vdash p$, a possible execution of $p$ in a memory model $T$ is given by a configuration $x$ of
$\llbracket \Gamma \vdash p \rrbracket_T$ maximal with respect to inclusion: no more events can be added to it. From such data, we
discuss here how one could recover an execution witness in the sense of [2]. In the open semantics,
the program order of $p$ is already lost because of the concurrency between events that correspond to
instructions that can be executed in any order. What the open semantics gives us is the preserved
program order: causalities from the program that the architecture cannot break.
We show here how to recover the read-from and write-serialization maps. We have seen in Section
5.1 that configurations of $\llbracket \Gamma \vdash p \rrbracket_T$ correspond to pairs $(x, (t_a)_{a \in \Gamma})$ in $S_p$, where $x \in \mathcal{C}(\llbracket \Gamma \vdash p \rrbracket)$ is
an event structure in the sense of [2] (up to the fact that the partial order represents the preserved
program order, not the actual program order, as pointed out above). Because operations on a variable
$a$ are scheduled in a total order, it follows that any non-zero read has to be preceded
by a write in $t_a$. To define the read-from map, we need $T$ to satisfy the following property: any read in a
trace either has value zero or is preceded by a write of this value. The memory models $T$ presented above do satisfy
it. The write-serialization map comes for free as traces are linearly ordered.
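As an illustration of this recovery, the Python sketch below extracts both maps from a single trace $t_a$. The text does not say which write a read should be attached to when several earlier writes carry the same value; the sketch picks the most recent one, which is an assumption, as are the event encoding and the function name.

```python
def execution_witness(trace):
    """Recover write-serialization and read-from maps from one trace.

    trace is a list of ("Wr", thread, value) or ("Re", thread, value)
    events on a single variable, in trace order. Returns (ws, rf) where
    ws lists the positions of the writes in trace order, and rf maps the
    position of each read to the position of the write it reads from,
    or to None when it reads the initial value zero.
    """
    ws = []
    rf = {}
    for position, (kind, _thread, value) in enumerate(trace):
        if kind == "Wr":
            ws.append(position)
            continue
        # Assumption: attach the read to the most recent earlier write
        # carrying the same value.
        source = next((w for w in reversed(ws) if trace[w][2] == value), None)
        if source is None and value != 0:
            # The trace violates the property required of T.
            raise ValueError(f"read at position {position} has no source write")
        rf[position] = source
    return ws, rf
```

The write-serialization order is simply the order of the write positions in the trace, which reflects the remark that it comes for free from linearity.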
5.4. Non-linear memory models
The reader might have noticed that the event structures drawn in the last examples are rather large
and sequential. The linearity assumption on the memory model forces the behaviour on each variable
to be sequential: two events are either in conflict or comparable. In this section we investigate the
possibility of non-linear models to keep more concurrency in the generated event structure.
Definition 9 (Non-linear memory model).  A non-linear memory model is a collection of partial
orders labelled in $L_a$ closed under rigid inclusion, defined as follows: $q \hookrightarrow q'$ when the support of $q$ is
included in that of $q'$ and the identity map is order-preserving and preserves down-closed sets (such an
inclusion is also called a rigid map of event structures).
Given a non-linear memory model $T$, computing the closed interpretation with respect to that
model is more technical. The definition of Section 5.1 relies heavily on the fact that the models are
linear. If they are not, one needs to use a synchronized product of event structures [16], which pairs
up a configuration $x$ of $\llbracket t \rrbracket$ and a list of partial orders $(x_a \in T(a))$ satisfying a condition saying that
the orders of $x$ and the $x_a$ are compatible: their union does not have a cycle. For lack of space, we
do not detail this construction and content ourselves with an example.
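The compatibility condition alone is easy to sketch. The Python function below checks that the union of several order relations, each given as a set of (before, after) pairs, has no cycle, using a standard topological-sort argument. It covers only this acyclicity test, not the synchronized product itself; the function name and the event names in the usage note are hypothetical.

```python
def union_is_acyclic(*relations):
    """Return True when the union of the given relations has no cycle.

    Each relation is a set of (before, after) pairs over hashable events.
    """
    successors = {}
    for relation in relations:
        for before, after in relation:
            successors.setdefault(before, set()).add(after)
            successors.setdefault(after, set())
    # Count incoming edges of every event in the union.
    indegree = {event: 0 for event in successors}
    for outs in successors.values():
        for event in outs:
            indegree[event] += 1
    # Repeatedly remove events with no remaining predecessor.
    ready = [event for event, degree in indegree.items() if degree == 0]
    removed = 0
    while ready:
        event = ready.pop()
        removed += 1
        for nxt in successors[event]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    # Events that could never be removed lie on or after a cycle.
    return removed == len(successors)
```

For the store buffering program, take the program-side order `{("wa", "r2"), ("wb", "s2")}` (each write before the later read of its thread). Pairing it with memory-side orders in which both reads return zero, `{("r2", "wb")}` on b and `{("s2", "wa")}` on a, creates the cycle wa, r2, wb, s2, wa, so this pairing is rejected.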
Example 12 (A simple example of non-linear memory model).  Let $A$ be the collection of partial
orders labelled in $L_a$ satisfying the following two axioms: (1) the projection to any thread-id gives
a linear order (operations in the same thread cannot be concurrent) and (2) every chain of the form
$s_1 \rightarrowtriangle \dots \rightarrowtriangle s_n$ where $s_1$ is minimal is in $T$: every read event reads the last value written, or zero if
there is none.
Looking back at Example 10, we see that apart from the two events $\mathrm{Re}^{(r1=0)}_a$ and $\mathrm{Re}^{(s1=0)}_b$
that cannot appear in a partial order of $A(a)$, all other configurations are actually partial orders in
$A(a) \parallel A(b)$: every configuration of the open semantics has a unique minimal synchronization with
$A(a) \parallel A(b)$. Hence the synchronized product does not change anything, yielding the result:
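[Diagram not reproduced: the write $\mathrm{Wr}^{(1)}_{1,a}$ lies below the three reads $\mathrm{Re}^{(r1=1)}_{1,a}$, $\mathrm{Re}^{(r2=0)}_{1,b}$, $\mathrm{Re}^{(r2=1)}_{1,b}$, and the write $\mathrm{Wr}^{(1)}_{2,b}$ lies below the three reads $\mathrm{Re}^{(s1=1)}_{2,b}$, $\mathrm{Re}^{(s2=0)}_{2,a}$, $\mathrm{Re}^{(s2=1)}_{2,a}$.]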
6. Conclusion
In this article, we have built a translation from a simple imperative concurrent programming language
to event structures representing the possible executions. This model takes advantage of concurrency
to encode the possibility of instruction reordering in order to cut down the size of the representation.
We believe this representation is a short and faithful representation of all the possible behaviours of
a program that could be used to give a semantics to programming languages or processors.
This translation is heavily inspired by game semantics techniques that can be simplified in this
first-order setting. This simplified approach can be easily extended to support conditionals and while
loops. The original game semantics model [7, 6] however already supports a higher-order concurrent
programming language at the price of increased mathematical complexity.
In the future, we would like to use this approach to handle all the existing architectures using
non-linear memory models to describe compactly behaviours of modern shared memories.
Acknowledgements I would like to thank Pierre Clairambault and Olivier Laurent for helpful
comments, and Jade Alglave and Jean Pichon for introducing me to the world of memory models.
References
[1] S. Abramsky and G. McCusker. Full abstraction for Idealized Algol with passive expressions.
volume 227, pages 3–42. 1999.
[2] J. Alglave. A formal hierarchy of weak memory models. Formal Methods in System Design,
41(2):178–210, 2012.
[3] M. Batty, K. Memarian, K. Nienhuis, J. Pichon-Pharabod, and P. Sewell. The problem of
programming language concurrency semantics. In J. Vitek, editor, Programming Languages and
Systems - 24th European Symposium on Programming, ESOP 2015, Held as Part of the European
Joint Conferences on Theory and Practice of Software, ETAPS 2015, London, UK, April 11-18,
2015. Proceedings, volume 9032 of Lecture Notes in Computer Science, pages 283–307. Springer,
2015.
[4] M. Batty, K. Memarian, S. Owens, S. Sarkar, and P. Sewell. Clarifying and compiling C/C++
concurrency: from C++11 to POWER. In J. Field and M. Hicks, editors, Proceedings of the 39th
ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2012,
Philadelphia, Pennsylvania, USA, January 22-28, 2012, pages 509–520. ACM, 2012.
[5] S. Burckhardt, M. Musuvathi, and V. Singh. Verifying local transformations on relaxed memory
models. In R. Gupta, editor, Compiler Construction, 19th International Conference, CC 2010,
Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2010,
Paphos, Cyprus, March 20-28, 2010. Proceedings, volume 6011 of Lecture Notes in Computer
Science, pages 104–123. Springer, 2010.
[6] S. Castellan. La stratégie de la fourchette. In D. Baelde and J. Alglave, editors, Vingt-sixièmes
Journées Francophones des Langages Applicatifs (JFLA 2015), Le Val d'Ajol, France, Jan. 2015.
[7] S. Castellan, P. Clairambault, and G. Winskel. Concurrent Hyland-Ong games, 2014.
[8] S. Castellan, P. Clairambault, and G. Winskel. The parallel intensionally fully abstract games
model of PCF. In LICS 2015. IEEE Computer Society, 2015.
[9] P. Cenciarelli, A. Knapp, and E. Sibilio. The Java memory model: Operationally, denotationally,
axiomatically. In R. D. Nicola, editor, Programming Languages and Systems, 16th European
Symposium on Programming, ESOP 2007, Held as Part of the Joint European Conferences on
Theory and Practice of Software, ETAPS 2007, Braga, Portugal, March 24 - April 1, 2007,
Proceedings, volume 4421 of Lecture Notes in Computer Science, pages 331–346. Springer, 2007.
[10] D. R. Ghica and A. S. Murawski. Angelic semantics of fine-grained concurrency, 2007.
[11] L. Lamport. IEEE Transactions on Computers, 46, 1997.
[12] S. Mador-Haim, L. Maranget, S. Sarkar, K. Memarian, J. Alglave, S. Owens, R. Alur, M. M. K.
Martin, P. Sewell, and D. Williams. An axiomatic memory model for POWER multiprocessors.
In P. Madhusudan and S. A. Seshia, editors, Computer Aided Verification - 24th International
Conference, CAV 2012, Berkeley, CA, USA, July 7-13, 2012 Proceedings, volume 7358 of Lecture
Notes in Computer Science, pages 495–512. Springer, 2012.
[13] S. Rideau and G. Winskel. Concurrent strategies. In Logic in Computer Science (LICS), 2011
26th Annual IEEE Symposium on, pages 409–418. IEEE, 2011.
[14] S. Sarkar, K. Memarian, S. Owens, M. Batty, P. Sewell, L. Maranget, J. Alglave, and D. Williams.
Synchronising C/C++ and POWER. In J. Vitek, H. Lin, and F. Tip, editors, ACM SIGPLAN
Conference on Programming Language Design and Implementation, PLDI '12, Beijing, China -
June 11 - 16, 2012, pages 311–322. ACM, 2012.
[15] J. Ševčík and D. Aspinall. On validity of program transformations in the Java memory model.
In J. Vitek, editor, ECOOP 2008 - Object-Oriented Programming, 22nd European Conference,
Paphos, Cyprus, July 7-11, 2008, Proceedings, volume 5142 of Lecture Notes in Computer Science,
pages 27–51. Springer, 2008.
[16] G. Winskel. Event structure semantics for CCS and related languages. In M. Nielsen and E. M.
Schmidt, editors, Automata, Languages and Programming, 9th Colloquium, Aarhus, Denmark,
July 12-16, 1982, Proceedings, volume 140 of Lecture Notes in Computer Science, pages 561–576.
Springer, 1982.
[17] G. Winskel. Event structures. In Petri Nets: Central Models and Their Properties, Advances
in Petri Nets 1986, Part II, Proceedings of an Advanced Course, Bad Honnef, 8.-19. September
1986, pages 325–392, 1986.