Fences in Weak Memory Models by Jade Alglave & Luc Maranget
HAL Id: inria-00408568
https://hal.inria.fr/inria-00408568
Submitted on 3 Aug 2009
To cite this version:
Jade Alglave, Luc Maranget. Fences in Weak Memory Models. [Research Report] RR-7010, INRIA. 2009. inria-00408568
Research report (Rapport de recherche). ISSN 0249-6399. ISRN INRIA/RR--7010--FR+ENG
INSTITUT NATIONAL DE RECHERCHE EN INFORMATIQUE ET EN AUTOMATIQUE
Research Report No. 7010, July 2009, 39 pages
Project-Team Moscova
Centre de recherche INRIA Paris – Rocquencourt
Domaine de Voluceau, Rocquencourt, BP 105, 78153 Le Chesnay Cedex
Telephone: +33 1 39 63 55 11. Fax: +33 1 39 63 53 30
Abstrat: We present here an axiomati framework, implemented in the Coqproof assistant, for dening weak memory models in terms of several parameters:loal reorderings of reads and writes, and visibility of inter and intra proessorommuniations through memory. In this ontext, we provide formal denitionof weak memory models indued by arhitetures, illustrated by denitions of
SC and Spar TSO. Moreover, we dene a omparison over arhitetures,an arhiteture A1 being weaker than another one A2 when A1 allows morebehaviours than A2. In addition, we provide a haraterisation of behavioursallowed by A1 whih are also valid on A2. By that means, we provide a simpleharaterisation of SC and TSO behaviours on any weaker arhiteture. Wealso provide an abstrat notion of what should be the ation and plaement offenes to restore a given model from a weaker one.Key-words: Weak Memory Models, Fenes
Fenes in Weak Memory Models 3Contents1 Introdution 41.1 An axiomati generi model . . . . . . . . . . . . . . . . . . . . . 41.2 Study of barriers power . . . . . . . . . . . . . . . . . . . . . . . 51.3 Case study: a Power model . . . . . . . . . . . . . . . . . . . . . 52 Desription of the model 52.1 Axiomatisation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52.1.1 Basi objets . . . . . . . . . . . . . . . . . . . . . . . . . 52.1.2 Exeution witnesses . . . . . . . . . . . . . . . . . . . . . 62.1.3 Arhitetures . . . . . . . . . . . . . . . . . . . . . . . . . 82.1.4 Validity of an exeution with respet to an arhiteture . 92.1.5 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . 102.2 Comparison of arhitetures . . . . . . . . . . . . . . . . . . . . . 122.2.1 Making validity monotonous . . . . . . . . . . . . . . . . 132.2.2 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . 132.3 Equivalene with native models . . . . . . . . . . . . . . . . . . . 142.3.1 S is SC . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142.3.2 Tso is TSO . . . . . . . . . . . . . . . . . . . . . . . . . . 142.4 Testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152.4.1 Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152.4.2 Comparison of models . . . . . . . . . . . . . . . . . . . . 152.4.3 Charateristi tests . . . . . . . . . . . . . . . . . . . . . . 163 Semantis of barriers 193.1 Barriers guarantee . . . . . . . . . . . . . . . . . . . . . . . . . . 193.2 Considering a weaker guarantee . . . . . . . . . . . . . . . . . . . 204 Case study: a Power model 214.1 Complete event strutures and exeution witnesses . . . . . . . . 224.2 Globality of rfmaps . . . . . . . . . . . . . . . . . . . . . . . . . . 244.3 Preserved program order ppo . . . . . . . . . . . . . . . . . . . . 244.4 Values do not ome out of thin air . . . . . . . . . . . . . . . . . 254.5 Cumulative memory barriers . . . 
. . . . . . . . . . . . . . . . . 275 Barrier experiments 285.1 Oial tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 285.2 Classial tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 315.3 Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 316 Towards a stronger model 346.1 Extension ppo-ext→ . . . . . . . . . . . . . . . . . . . . . . . . . . . 346.2 Semantis of lwsyn . . . . . . . . . . . . . . . . . . . . . . . . . 347 Conlusion 377.1 Contribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 377.2 Status of writes . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
RR n° 7010
4 Jade Alglave & Lu Maranget1 IntrodutionMemory models are what desribe and onstrain the behaviour of a programrunning on a multiproessor. That said, understanding what a program woulddo on suh a mahine requires a preise denition of the memory model induedby the mahine, that is, the underlying memory system and the behaviourof the proessors involved. Previous studies [14, 18℄ have disussed the needfor a rigorous denition of weak memory models, whih some of the publidoumentations [3, 4℄ lak. We provide here a generi and axiomati frameworkto preisely dene a memory model in terms of several parameters and test itagainst real hardware.Let us onsider a shared-memory multiproessor system, that onsists ofseveral proessors writing to or reading from a ommon shared memory. Wewill disuss here what representation of memory and proessor behaviour weonsider.Representation of memory One representation of a shared memory ouldbe a single memory on whih several proessors operate simultaneously, all theirwrites being ommited to memory as soon as they are issued. Thus, one anonsider the onnetion between proessors and memory as diret: as soonas one proessor writes to memory, the value written overwrites the previousvalue and is immediately available to all proessors. This property, alled storeatomiity, has been examined and advoated as valuable [16, 8, 11℄ as it providesthe guarantee that ations on suh a memory are serialisable, whih leads to arather understandable memory model. However, it is not guaranteed on severalreal arhitetures [1, 10, 3, 4℄. They indeed relax the store atomiity onstraint,whih means a write is not available to all proessors at one. For examplea write is at rst initiated by a given proessor, then ommitted to a ahe,and nally to memory. This last step is sometimes alled globally performed[15℄. 
Even without assuming writes to be ommitted immediately, we supposea total order on the globally performed writes to the same loation, a propertysometimes alled oherene [5℄ that is widely assumed by modern arhitetures[1, 10, 3, 4℄.Proessor behaviour One representation of a proessor behaviour ould sup-pose a sequential order, onsistent with the program order, of all the reads andwrites events issued by a given proessor, as a generalisation of the uniproessorase. However, modern arhitetures [1, 10, 3, 4℄ provide relaxed memory modelsthat do not onstrain the way reads and writes are ordered that muh. Theseonstraints, or their relaxation, are often gathered behind the term instrutionreordering [8, 11℄.1.1 An axiomati generi modelWe will preisely dene an arhiteture in terms of its ordering and store atomi-ity relaxations at setion 2.1.3. For example, Sequential Consisteny (heneforth
SC) [17℄, supposes writes to be ommitted to memory as soon as they are issued,and that the program order is maintained between all aesses, thus being thestrongest (in a sense that will be dened preisely at setion 2.2) memory model.We will illustrate how to instaniate our model to produe SC and Spar TSOINRIA
Fenes in Weak Memory Models 5
P0 P1
(a) x← 1 (c) y← 1
(b) r2← y (d) r4← xi3 r2 = 0 ∧ r4 = 0 ?Figure 1: i3 exhibits non-SC behaviour on modern arhitetures[1℄, and show equivalene with the native models, together with haraterisationof exeutions that would be valid on these models.1.2 Study of barriers power
SC provides indeed a rather comfortable programming model, which explains why most architectures provide mechanisms such as barriers and locks, to restore it from a weaker model. However, it is not clear how much power a barrier needs to provide the illusion of SC, and where to place these constructions in the code. We examine this question at section 3.1 from a general point of view: we provide a sufficient condition on barriers to restore a stronger model from a weaker one. Moreover, we refine this condition in some particular yet interesting cases at section 3.2, such as TSO [1].

1.3 Case study: a Power model

Our generic framework, implemented in the Coq proof assistant [12], has two companion tools: memevents, written in OCaml, which is an exact implementation of our axiomatic model, and litmus, which runs the same inputs that memevents takes on real hardware.

We provide a series of tests to properly instantiate our model with respect to a given machine or architecture, which allowed us to design a model for a significant fragment of the Power architecture with barriers.

2 Description of the model

2.1 Axiomatisation

The classical test depicted at fig. 1, which can be found in [6] with number 2.3a, illustrates the fact that we cannot use an interleaving semantics to reason on executions induced by weak memory models, as it exhibits a non-SC behaviour on some current architectures [6, 5]. Instead, we reason on relations over read and write events raised by an instruction.

2.1.1 Basic objects

An event is an abstraction of a memory access performed during the execution of a multiprocessor program. We note $\mathbb{E}$ the set of events generated by a particular execution. Events are of two kinds: reads and writes, whose sets will be depicted by R and W. Henceforth, we will note e for an event, r for a read, and w for a write. An event e will hold its direction (R or W), its location, given by loc e,
its value, given by val e and its processor, given by proc e. We will note (a) x ← v for a write to location x with value v labelled (a), and (b) r1 ← y, for a read from y labelled (b).

An execution is also characterised by the program order $\xrightarrow{po}$, a relation on events that reflects the sequential execution of instructions on a single processor: given two instructions i1 and i2 that generate events e1 and e2, having $e_1 \xrightarrow{po} e_2$ on events simply means that i1 precedes i2 in program order. $\xrightarrow{po}$ is a total order amongst the events from the same processor (see footnote 1) and never relates events from different processors.

We collect this information into an event structure, depicted by E:

\[ E \triangleq (\mathbb{E}, \xrightarrow{po}) \]

where $\mathbb{E}$ is the set of events. Figure 2 illustrates the event structures associated to the test i3 depicted at fig. 1.

Execution   P0 (po:0)              P1 (po:1)              Final registers
1           a: W[x]=1, b: R[y]=0   c: W[y]=1, d: R[x]=0   0:r1=0; 1:r2=0
2           a: W[x]=1, b: R[y]=0   c: W[y]=1, d: R[x]=1   0:r1=0; 1:r2=1
3           a: W[x]=1, b: R[y]=1   c: W[y]=1, d: R[x]=0   0:r1=1; 1:r2=0
4           a: W[x]=1, b: R[y]=1   c: W[y]=1, d: R[x]=1   0:r1=1; 1:r2=1

Figure 2: Event structures for test i3.

2.1.2 Execution witnesses

We postulate two relations over events: $\xrightarrow{rf}$ and $\xrightarrow{ws}$.

Rf. A read-from map links a read event with the write event that provides its value. We represent the notion by a relation from writes to reads, which is well-formed in the following sense:

\[ \xrightarrow{prf}\ \triangleq\ \{(w, r) \mid \exists l\, v,\ w \in W_{l,v} \wedge r \in R_{l,v}\} \]

\[ \mathit{wf}_{rf}(\xrightarrow{rf})\ \triangleq\ \xrightarrow{rf}\, \subseteq\, \xrightarrow{prf}\ \wedge\ \forall r,\ \exists!\, w,\ w \xrightarrow{rf} r \]

We first gathered all pairs of writes and reads with the same location l and value v, whose sets are depicted by $W_{l,v}$ and $R_{l,v}$, and then enforced the uniqueness of read sources.

Footnote 1: When some instructions may perform several memory accesses, $\xrightarrow{po}$ should include some of the intra-instruction dependencies [18], thus becoming a partial order on events from a same processor.
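As a rough illustration of the well-formedness predicate for rf, the following Python sketch checks both conjuncts on the leftmost execution of Figure 2. The `Event` encoding, the explicit initial writes `ix` and `iy` (placed on a fictitious processor -1) and the function names are assumptions made for this example; they are not taken from the Coq development.

```python
from collections import namedtuple

# Hypothetical encoding of an event: direction ("R" or "W"), location,
# value and processor, plus a label used only for readability.
Event = namedtuple("Event", "name dir loc val proc")


def potential_rf(events):
    """prf: all (write, read) pairs with the same location and value."""
    return {(w, r) for w in events for r in events
            if w.dir == "W" and r.dir == "R"
            and w.loc == r.loc and w.val == r.val}


def wf_rf(events, rf):
    """rf is included in prf and every read has exactly one source write."""
    if not rf <= potential_rf(events):
        return False
    reads = [e for e in events if e.dir == "R"]
    return all(sum(1 for (_, r) in rf if r == rd) == 1 for rd in reads)


# Leftmost execution of Figure 2, with assumed explicit initial writes.
ix = Event("ix", "W", "x", 0, -1)
iy = Event("iy", "W", "y", 0, -1)
a = Event("a", "W", "x", 1, 0)
b = Event("b", "R", "y", 0, 0)
c = Event("c", "W", "y", 1, 1)
d = Event("d", "R", "x", 0, 1)
events = [ix, iy, a, b, c, d]

good_rf = {(iy, b), (ix, d)}   # both reads take the initial value
bad_rf = {(c, b), (ix, d)}     # c writes 1 but b reads 0: not in prf
```

Here `wf_rf(events, good_rf)` holds, while `bad_rf` is rejected by the inclusion in prf, and a map that leaves a read without a source is rejected by the uniqueness conjunct.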
Fenes in Weak Memory Models 7Ws The write serialisation is a total order of the writes to a same loation.Thus, we rst gather all pairs of writes to the same loation, and we requirethe relation to be a total order on writes to a same loation l, whih set will bedepited by Wl: pws
→ , {(w1, w2) | ∃l, w1 ∈Wl ∧ w2 ∈Wl}
wfws
ws
→ ,
ws
→⊆
pws
→ ∧ ∀ℓ, total order (
ws
→ ↾ Wℓ) WℓThe notation ws→ ↾ Wℓ stands for the restrition of the relation ws→ to the set Wℓ,i.e. ws→∩ (Wℓ ×Wℓ).Fr From these two relations, we dedue a third one, fr :
r
fr
→ w , ∃ w′ , w′
rf
→ r ∧ w′
ws
→ w
w
r
w0(rf)(fr) (ws)As we said, ws→ orders globally performed writes to the same loation; thus,if a write w′ is before another write w in ws→, we know that w′ is globally  thatis, for every proessor  before w. Furthermore, if a read r reads from w′ , weonsider r to be globally ordered the following write w: otherwise, there wouldbe no guarantee r atually read its value from w′ , thus ontraditing the rf→relation between them.Exeution witnesses We gather these relations  exept fr→ as it an bededued from the others  into an exeution witness, depited by X :
X , (
rf
→,
ws
→)Figure 3 adds rf→ and fr→ edges to the event strutures of gure 2. Thereare no ws→ edges among the (non-initialisation) writes shown. However, we seesome fr→ arrows whih follow from the serialization of init stores (whih omerst in ws→) and of stores generated by instrutions. For instane, in the leftmostpiture, we have d fr→ a. Indeed, the load d reads the initial value of loation x,whih loation is overwritten (later!) by the store a.We have the assoiated well formedness prediate wf, being the onjuntionof the prediates for rf→ and ws→.Initial and nal states The write serialization provides a natural way todene the initial and nal states of an exeution:
init X , {w | ¬(∃w′, w′
ws
→ w)}
final X , {w | ¬(∃w′, w
ws
→ w′)}RR n° 7010
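The derivation of fr, and the initial and final writes, can be sketched in a few lines of Python. This is only an illustration under assumed conventions: events are plain strings, `ix` and `iy` are hypothetical names for the initial writes to x and y, and relations are sets of pairs.

```python
def derive_fr(rf, ws):
    """fr: r -> w whenever some w' satisfies w' rf-> r and w' ws-> w."""
    return {(r, w) for (w1, r) in rf for (w2, w) in ws if w1 == w2}


def init_writes(writes, ws):
    """Writes that have no ws-predecessor."""
    return {w for w in writes if not any(tgt == w for (_, tgt) in ws)}


def final_writes(writes, ws):
    """Writes that have no ws-successor."""
    return {w for w in writes if not any(src == w for (src, _) in ws)}


# Leftmost execution of test i3: a = W[x]=1, c = W[y]=1,
# b = R[y]=0 and d = R[x]=0 both read the initial value.
writes = {"ix", "iy", "a", "c"}
ws = {("ix", "a"), ("iy", "c")}   # init stores come first in ws
rf = {("iy", "b"), ("ix", "d")}
fr = derive_fr(rf, ws)
```

With these inputs, `fr` contains exactly the pairs `("b", "c")` and `("d", "a")`; the second is the edge from d to a discussed above for the leftmost execution.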
Execution (final registers)   rf edges              fr edges
0:r1=0; 1:r2=0                init → b, init → d    b → c, d → a
0:r1=0; 1:r2=1                init → b, a → d       b → c
0:r1=1; 1:r2=0                c → b, init → d       d → a
0:r1=1; 1:r2=1                c → b, a → d          (none)

Figure 3: Execution witnesses for i3.

2.1.3 Architectures

We define here what we consider to be an architecture.

Preserved program order. We assume a function ppo, which gathers all pairs of events that are not to be reordered with respect to the program order $\xrightarrow{po}$. Consider for example the test i3, depicted at fig. 1: the specified outcome would be valid only if writes and reads to different locations could be reordered. Thus, an architecture that would authorise the specified outcome would not include write-read pairs in its preserved program order.

We will note $\xrightarrow{ppo}$ for the relation output by this function on a given event structure E, which is to be included in $\xrightarrow{po}$. This relation is to be considered global, that is, all processors must behave with respect to the constraints induced by it.

Globality of relations. As stated in the introduction, we consider writes to be non-atomic, that is, not necessarily available to all parts of the memory system at once. Thus, the behaviour of all processors must not necessarily include the constraints induced by $\xrightarrow{rf}$ relations. However, we distinguish the constraints induced by internal $\xrightarrow{rf}$ (the $\xrightarrow{rf}$ relation on a same processor) and external $\xrightarrow{rf}$ (from one processor to another). Thus, we split the $\xrightarrow{rf}$ relation into $\xrightarrow{rfi}$, which represents the pairs in $\xrightarrow{rf}$ on the same processor, and $\xrightarrow{rfe}$, which represents the pairs in $\xrightarrow{rf}$ on different processors:

\[ w \xrightarrow{rfi} r\ \triangleq\ w \xrightarrow{rf} r \wedge \mathit{proc}\ w = \mathit{proc}\ r \]

\[ w \xrightarrow{rfe} r\ \triangleq\ w \xrightarrow{rf} r \wedge \mathit{proc}\ w \neq \mathit{proc}\ r \]
Fenes in Weak Memory Models 9Relations indued by the presene of barriers We assume given a fun-tion ab, whih, provided an event struture E and an exeution witness X ,denes the relation over events indued by the presene of a barrier in between in po→  two instrutions:
ab : E → X → rln Ewhere E (resp. X ) is the type of event strutures (resp. exeution witnesses).These informations are what denes for us an arhiteture, depited by A:Denition 1 (Arhiteture)
A , (ppo, int, ext, ab)2.1.4 Validity of an exeution with respet to an arhitetureWe dene here what it means for an exeution witness X to be valid on a givenarhiteture A.Uniproessor behaviour Some doumentations [3℄ laim that a sole proes-sor is supposed to respet the sequential exeution model, that is:the model of program exeution in whih the proessor appearsto exeute one instrution at a time, ompleting eah instrutionbefore beginning to exeute the next instrutionFollowing Alpha [10℄, we dene the proessor issue order, depited by the pio→relation, as follows:
e1
pio
→ e2 , e1
po
→ e2 ∧ loc e1 = loc e2We all hb→ the union of the three relations rf→, ws→ and fr→:hb
→ ,
rf
→ ∪
ws
→ ∪
fr
→Notie that hb→ is not the proper happens-before relation in the general ase,but rather the happens-before of a memory with multi-opy-atomi writes. Wedene the general happens-before relation in the next setion.To provide our exeutions the guarantee that they respet the sequentialexeution model, we require that all the relations rf→, ws→ and fr→ are onsistentwith the proessor issue order, that is:
uniproc , acyclic (
hb
→ ∪
pio
→)Figure 4 gives an example of an outome that is forbidden beause of uniproc.There are two exeutions for this outome, with dierent write serializations:
a
ws
→ b on the left, and b ws→ a on the right. In former ase, we have c fr→ b (by
a
ws
→ b and a rf→ c). Thus, invalidation follows from yle b pio→ c fr→ b. In thelatter ase, the yle is a rf→ c pio→ d fr→ a, the last step following from b ws→ a and
b
rf
→ d.RR n° 7010
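The uniproc check on the two executions of Figure 4 can be replayed with a small Python sketch. It assumes that relations are encoded as sets of pairs of event labels and that initial writes can be left out, since they play no role in either cycle; `acyclic`, `derive_fr` and `uniproc` are illustrative helpers, not the memevents implementation.

```python
def acyclic(edges):
    """True if the directed graph given as (src, dst) pairs has no cycle."""
    succ = {}
    for s, t in edges:
        succ.setdefault(s, set()).add(t)
    state = {}  # node -> "open" while on the DFS stack, "done" afterwards

    def visit(n):
        if state.get(n) == "done":
            return True
        if state.get(n) == "open":
            return False          # back edge: a cycle goes through n
        state[n] = "open"
        ok = all(visit(m) for m in succ.get(n, ()))
        state[n] = "done"
        return ok

    return all(visit(n) for n in list(succ))


def derive_fr(rf, ws):
    """fr: r -> w whenever some w' satisfies w' rf-> r and w' ws-> w."""
    return {(r, w) for (w1, r) in rf for (w2, w) in ws if w1 == w2}


def uniproc(rf, ws, pio):
    """acyclic(hb U pio) with hb = rf U ws U fr."""
    hb = rf | ws | derive_fr(rf, ws)
    return acyclic(hb | pio)


# Figure 4: a = W[x]=1 on P0; b = W[x]=2, c = R[x]=1, d = R[x]=2 on P1.
pio = {("b", "c"), ("c", "d"), ("b", "d")}   # po on P1, all accesses to x
rf = {("a", "c"), ("b", "d")}                # c reads 1 from a, d reads 2 from b
ws_left = {("a", "b")}
ws_right = {("b", "a")}
```

Both `uniproc(rf, ws_left, pio)` and `uniproc(rf, ws_right, pio)` evaluate to false, matching the two cycles described above, so the outcome is forbidden whichever write serialization is chosen.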
P0              P1
(a) x ← 1       (b) x ← 2
                (c) r2 ← x
                (d) r3 ← x
Forbidden: 1:r2=1; 1:r3=2;

Figure 4: Invalid executions by uniproc. Both executions have a: W[x]=1 on P0 and b: W[x]=2, c: R[x]=1, d: R[x]=2 in program order (po-loc) on P1, with $a \xrightarrow{rf} c$ and $b \xrightarrow{rf} d$. Left: $a \xrightarrow{ws} b$, hence $c \xrightarrow{fr} b$. Right: $b \xrightarrow{ws} a$, hence $d \xrightarrow{fr} a$.

All together. We call $\xrightarrow{ghb}$ the union of the relations that are global:

\[ \xrightarrow{ghb}\ \triangleq\ \xrightarrow{ppo} \cup \xrightarrow{ws} \cup \xrightarrow{fr} \cup \xrightarrow{rf?} \cup \xrightarrow{ab} \]

with $\xrightarrow{rf?}\ \triangleq\ \xrightarrow{rfi?} \cup \xrightarrow{rfe?}$, where $\xrightarrow{rfi?}$ (resp. $\xrightarrow{rfe?}$) is $\xrightarrow{rfi}$ (resp. $\xrightarrow{rfe}$) if int (resp. ext) is true, and the empty relation otherwise.

We can now define what a valid execution is, with respect to an architecture A:

Definition 2 (Valid execution)

\[ A.\mathit{valid}\ E\ X\ \triangleq\ \mathit{wf} \wedge \mathit{uniproc} \wedge \mathit{acyclic}\ (\xrightarrow{ghb}) \]

Weak Memory Models. Let $\mathcal{W}$ be the type of memory models, defined as follows:

\[ \mathcal{W}\ \triangleq\ \mathcal{E} \to \mathcal{X} \to \{\top, \bot\} \]

Thus, we define a function Wmm ($\mathcal{A}$ being the type of architectures), which produces a weak memory model induced by A:

Definition 3 (Weak Memory Model)

\[ \mathit{Wmm} : \mathcal{A} \to \mathcal{W} \]

\[ \mathit{Wmm}(A)\ \triangleq\ \forall E\ X,\ A.\mathit{valid}\ E\ X \]

Henceforth, we will note $A_{Wmm}$ for Wmm(A).

2.1.5 Examples

We will show how to produce a particular model from our generic framework on two classical memory models, Sequential Consistency [17], later on referred to as SC, and TSO [1], thus illustrating the concepts we used to define our framework. We will show at section 2.3 that these definitions are equivalent to the native ones.
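To make Definition 2 concrete, the following Python sketch evaluates the uniproc and acyclic(ghb) conjuncts on the leftmost execution of test i3. It is a sketch under stated assumptions: well-formedness is taken for granted, initial writes are left implicit, ppo is encoded as a set of allowed direction pairs ("WR", "RR", ...), and ab is a fixed (empty) set of pairs rather than a function of E and X. The two architectures `strong` and `relaxed` are illustrative parameter choices that mirror the SC and TSO instances discussed in this section.

```python
def acyclic(edges):
    """True if the directed graph given as (src, dst) pairs has no cycle."""
    succ = {}
    for s, t in edges:
        succ.setdefault(s, set()).add(t)
    state = {}

    def visit(n):
        if state.get(n) == "done":
            return True
        if state.get(n) == "open":
            return False
        state[n] = "open"
        ok = all(visit(m) for m in succ.get(n, ()))
        state[n] = "done"
        return ok

    return all(visit(n) for n in list(succ))


# Leftmost execution of i3: a = W[x]=1, b = R[y]=0 on P0; c = W[y]=1, d = R[x]=0 on P1.
DIR = {"a": "W", "b": "R", "c": "W", "d": "R"}
PROC = {"a": 0, "b": 0, "c": 1, "d": 1}
LOC = {"a": "x", "b": "y", "c": "y", "d": "x"}
po = {("a", "b"), ("c", "d")}
rf, ws = set(), set()            # both reads take the (implicit) initial value
fr = {("b", "c"), ("d", "a")}


def ghb(arch):
    """ppo U ws U fr U rf? U ab for the given architecture parameters."""
    ppo = {(e1, e2) for (e1, e2) in po if DIR[e1] + DIR[e2] in arch["ppo"]}
    rfi = {(w, r) for (w, r) in rf if PROC[w] == PROC[r]}
    rfe = rf - rfi
    g = ppo | ws | fr | arch["ab"]
    if arch["int"]:
        g |= rfi
    if arch["ext"]:
        g |= rfe
    return g


def valid(arch):
    """uniproc and acyclic(ghb); the wf conjunct is assumed to hold."""
    pio = {(e1, e2) for (e1, e2) in po if LOC[e1] == LOC[e2]}
    uniproc = acyclic(rf | ws | fr | pio)
    return uniproc and acyclic(ghb(arch))


strong = {"ppo": {"RR", "RW", "WR", "WW"}, "int": True, "ext": True, "ab": set()}
relaxed = {"ppo": {"RR", "RW", "WW"}, "int": False, "ext": True, "ab": set()}
```

Under these assumptions `valid(strong)` is false, because ppo and fr form the cycle a, b, c, d, a, whereas `valid(relaxed)` is true: once write-read pairs are dropped from ppo, only the two fr edges remain in ghb.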
Fenes in Weak Memory Models 11Sequential onsisteny SC has been dened by Lamport as follows:The result of any exeution is the same as if the operations ofall the proessors were exeuted in some sequential order, and theoperations of eah individual proessor appear in this sequene inthe order speied by its program. [17℄We give here a formal denition of an SC exeution. We need at rst asequential exeution ex→, that is a total order onsistent with the program order:
seq
ex
→ , total order
ex
→ E ∧
po
→⊆
ex
→We need to highlight the impliit exeution model, whih states that a read
r reads from the most reent write that is before it  in ex→. Let us note pwo(r)the set of previous writes for r in a partial order o to dene the rf→ relation foran SC exeution  that is, whih read reads from whih write:
SC.rf
ex
→ , {(w, r) | w
prf
→ r ∧w = max pwex
→
(r)}Thus a valid SC exeution will be given by a sequential exeution ex→ andthe alulation of its indued rf→ relation as above.From suh an exeution, we an produe an exeution witness:
SC.ws
ex
→ , {(w1, w2) | w1
ex
→ w2 ∧ w1
pws
→ w2)}
SC.wit
ex
→ , (SC.rf
ex
→ , SC.ws
ex
→ )We propose here an alternative notion of SC, whih we will show equivalentto the native one in or. 3:
Sc.Arch , (
po
→, true, true,
ab
→)
Sc.Wmm , Wmm(Sc.Arch)TSO To design a proper TSO exeution, we need to require what the Spardoumentation [1℄ speies:
R ∗ , {(r, e) | r
po
→ e}
WW , {(w1, w2) | w1
po
→ w2}
ptso
ex
→ , partial order
ex
→ E ∧
R ∗ ⊆
ex
→ ∧
WW ⊆
ex
→ ∧
∃
tso
→,
tso
→⊆
ex
→ ∧
total order
tso
→ WRR n° 7010
Moreover, we need to highlight the explicit execution model, provided by the Val axiom in the documentation:

\[ Val(L_a) = Val(\max_{\xrightarrow{ex}} \{S_a \mid S_a \xrightarrow{ex} L_a \vee S_a \xrightarrow{po} L_a\}) \]

which states that a read r reads from the most recent write that is before it in $\xrightarrow{ex} \cup \xrightarrow{po}$. Thus we define the $\xrightarrow{rf}$ relation for a TSO execution, that is, which read reads from which write:

\[ TSO.rf(\xrightarrow{ex})\ \triangleq\ \{(w, r) \mid w \xrightarrow{prf} r \wedge w = \max(pw_{(\xrightarrow{ex} \cup \xrightarrow{po})}(r))\} \]

As in the SC case, we produce an execution witness:

\[ TSO.ws(\xrightarrow{ex})\ \triangleq\ \{(w_1, w_2) \mid w_1 \xrightarrow{pws} w_2 \wedge w_1 \xrightarrow{ex} w_2\} \]

\[ TSO.wit(\xrightarrow{ex})\ \triangleq\ (TSO.rf(\xrightarrow{ex}),\ TSO.ws(\xrightarrow{ex})) \]

We propose here an alternative notion of TSO, which we will show equivalent to the native one in cor. 4:

\[ ppo\_tso\ \triangleq\ R{*} \cup WW \]

\[ Tso.Arch\ \triangleq\ (ppo\_tso,\ false,\ true,\ ab) \]

\[ Tso.Wmm\ \triangleq\ \mathit{Wmm}(Tso.Arch) \]

The $\xrightarrow{ppo}$ is quite clear from the documentation. The Val axiom indicates that the internal $\xrightarrow{rf}$ are not included in $\xrightarrow{ex}$, whereas the external ones are, as the write from which a read reads is the max of its previous writes in $\xrightarrow{ex}$. Thus we consider $\xrightarrow{rfe}$ to be global, whereas $\xrightarrow{rfi}$ is not.

2.2 Comparison of architectures

From our definition of architecture arises a very simple notion of comparison; we define the predicate weaker among architectures as follows:

Definition 4 (Weaker)

\[ A_1 \le A_2\ \triangleq\ ppo_1 \subseteq ppo_2\ \wedge\ (int_1 \to int_2)\ \wedge\ (ext_1 \to ext_2)\ \wedge\ ab_1 \subseteq ab_2 \]

Theorem 1 (Validity is decreasing)

\[ \forall A_1\, A_2,\ A_1 \le A_2 \Rightarrow \forall E\, X,\ A_2.\mathit{valid}\ E\ X \to A_1.\mathit{valid}\ E\ X \]

Proof [in Coq]. From $A_1 \le A_2$, we have $A_1.ghb \subseteq A_2.ghb$, thus if $A_2.ghb$ is acyclic, so is $A_1.ghb$.
Fenes in Weak Memory Models 132.2.1 Making validity monotonousWe dene here a riterion to hek if an exeution X running on an arhiteture
A1 would be valid on a stronger arhiteture A2:
A1.checkA2 , acyclic (A2.ghb)We show that this riterion haraterises an exeution running on A1 thatwould be valid on A2:Theorem 2 (Charaterisation)
∀A1A2, A1 ≤ A2 ⇒
∀EX, A1.valid E X ∧A1.checkA2 E X ↔ A2.valid E XProof[in Coq℄
⇒ X being valid on A1, we have all requirements  well formedness andunipro  to guarantee it is valid on A2, exept the last prediate, whihholds by the hypothesis checkA2 .
⇐ X being valid on A2 gives us all requirements  well formedness andunipro  to guarantee its validity on A1 exept the last one. As A1 ≤ A2,we know that A1.ghb ⊆ A2.ghb (lemma ghb_inl), thus the ayliityrequirement for A1.ghb holds if A2.ghb is ayli. 2.2.2 ExamplesS In the ontext of our generi framework, we designed a riterion to deideif a partiular exeution X , with respet to an event struture E and on anarhiteture A, is S :
A.checkSc , acyclic (
po
→ ∪
hb
→)This riterion haraterises valid weak exeutions that are S :Corollary 1 (S haraterisation)
∀AEX, A ≤ Sc, A.valid E X ∧ A.checkSc E X ↔ Sc.valid E XProof[in Coq℄
⇒ As po→ ∪ hb→= Sc.ghb, this is a diret onsequene of thm. 2.
⇐ as A ≤ Sc, this is a diret onsequene of thm. 1. This result allows us to see that the outome 0:r1=0; 1:r2=0 for i3 (leftmostpiture in gure 3) will never show up on a sequentially onsistent mahine. Allother exeutions depited in g. 3 are SC by the same argument.
RR n° 7010
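The checkSc criterion can be replayed on the four executions of test i3 with the following Python sketch. It assumes initial writes are left implicit, so a read of 0 contributes an fr edge to the write of the other processor and a read of 1 contributes an rf edge from it; the names `check_sc` and `results` are illustrative only.

```python
from itertools import product


def acyclic(edges):
    """True if the directed graph given as (src, dst) pairs has no cycle."""
    succ = {}
    for s, t in edges:
        succ.setdefault(s, set()).add(t)
    state = {}

    def visit(n):
        if state.get(n) == "done":
            return True
        if state.get(n) == "open":
            return False
        state[n] = "open"
        ok = all(visit(m) for m in succ.get(n, ()))
        state[n] = "done"
        return ok

    return all(visit(n) for n in list(succ))


# i3: a = W[x]=1, b = R[y] on P0; c = W[y]=1, d = R[x] on P1.
po = {("a", "b"), ("c", "d")}


def check_sc(r1, r2):
    """acyclic(po U hb) for the execution where b reads r1 and d reads r2."""
    hb = set()
    hb.add(("c", "b") if r1 == 1 else ("b", "c"))   # rf from c, or fr to c
    hb.add(("a", "d") if r2 == 1 else ("d", "a"))   # rf from a, or fr to a
    return acyclic(po | hb)


results = {(r1, r2): check_sc(r1, r2) for r1, r2 in product((0, 1), repeat=2)}
```

Only the entry for `(0, 0)` is false: its po and fr edges close the cycle a, b, c, d, a, while the three other executions pass the check.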
Tso. In the context of our generic framework, we designed a criterion to decide if a particular execution X, with respect to an event structure E and on an architecture A, is Tso; consider $\xrightarrow{hb\_tso}$ to be $\xrightarrow{ws} \cup \xrightarrow{fr} \cup \xrightarrow{rfe}$:

\[ A.\mathit{checkTso}\ \triangleq\ \mathit{acyclic}\ (\xrightarrow{ppo\_tso} \cup \xrightarrow{hb\_tso}) \]

This criterion characterises valid weak executions that are Tso:

Corollary 2 (Tso characterisation)

\[ \forall A\, E\, X,\ A \le Tso,\ A.\mathit{valid}\ E\ X \wedge A.\mathit{checkTso}\ E\ X \leftrightarrow Tso.\mathit{valid}\ E\ X \]

Proof [in Coq].
⇒ As $\xrightarrow{ppo\_tso} \cup \xrightarrow{hb\_tso}\ = Tso.ghb$, this is a direct consequence of thm. 2.
⇐ As $A \le Tso$, this is a direct consequence of thm. 1.

This result allows us to conclude that all the outcomes for i3 specified in fig. 3 may show up on a Tso machine.

2.3 Equivalence with native models

2.3.1 Sc is SC

We show that the SC definition from [17] is equivalent to our definition:

Theorem 3 (Sc is SC)

\[ \forall E\, X,\ Sc.\mathit{valid}\ E\ X \leftrightarrow \exists \xrightarrow{ex},\ \mathit{seq}(\xrightarrow{ex}) \wedge SC.wit(\xrightarrow{ex}) = X \]

Proof [in Coq].
⇒ From X being valid on Sc, we have $\mathit{acyclic}\ (\xrightarrow{ghb})$, which means $\mathit{acyclic}\ (\xrightarrow{hb} \cup \xrightarrow{po})$ on Sc. We know by cor. 1 that this condition is necessary and sufficient to obtain an equivalent SC execution.
⇐ From the sequential execution $\xrightarrow{ex}$, we produce a SC.wit which is valid on any weaker architecture by thm. 1.

2.3.2 Tso is TSO

We show that the TSO definition from [1] is equivalent to our definition:

Theorem 4 (Tso is TSO)

\[ \forall E\, X,\ Tso.\mathit{valid}\ E\ X \leftrightarrow \exists \xrightarrow{ex},\ \mathit{ptso}(\xrightarrow{ex}) \wedge TSO.wit(\xrightarrow{ex}) = X \]

Proof [in Coq].
⇒ From X being valid on Tso, we know X satisfies checkTso by cor. 2. checkTso gives us an acyclic relation, therefore a partial order on E, such that its restriction to W is the total order on stores required by TSO. As Tso.ghb includes R* and WW by construction, we have the final requirements to provide an execution valid on TSO.
⇐ From $\xrightarrow{ex}$, we produce a TSO.wit which is valid on any architecture weaker than Tso by thm. 1.
Fenes in Weak Memory Models 152.4 TestingIn this setion we preisely dene our testing methodology and desribe ourtools.2.4.1 Toolslitmus To understand the memory model provided by a given mahine M ,we use litmus tests, whih are assembly programs, with speied initial stateof memory and registers. To run them on a mahine, we use our litmus tool,whih runs a C skeleton into whih the litmus test is enapsulated. For a giventest t running on M , we ollet the nal ontent of memory and registers, thusdening a set of observed outomes OM (t).memevents To ompare the memory model as observed on a mahine and ourtheoriteal one, we implemented our generi framework in the memevents tool,written in OCaml. The main module axiom is an implementation of the theorypresented at setion 2: provided an arhiteture module A suh as Sc.Arch or
Tso.Arch, it outputs all possible exeution witnesses (in the absene of loops)that are valid in the memory model W indued by A  in the sense of the validprediate dened at setion 2.1.4, whih final dene the set of valid outomes
VW (t). When there are loops, it unfolds them several times, whih gives asubset of valid exeutions, whih has been enough for our purposes. Moreover,memevents is able to output a ounter example: when a partiular outome isspeied, it shows whih yles in the ghb→ relation invalidate this exeution. Thisgives an insight on why this exeution is not allowed on a partiular arhiteture,and if barriers are needed or not.2.4.2 Comparison of modelsAn additional tool, ompare, examines, for a given test t run on a mahine
M , the following ases: OM (t) ⊆ VW (t), from whih we know our model is notinvalidated, and OM (t) 6⊆ VW (t), from whih we know our model is invalidated.When OM (t) ⊆ VW (t), the most hallenging ase is when t is in VW (t) yet notin OM (t) , that is it has an outome whih is valid yet not observed. Severalreasons explain this situation: either t has not been run enough to observe it,or the tested mahine does not implement the feature highlighted by the test.In that ase the model is too permissive with respet to this mahine. However,we do not seek the adequation of OM (t) and VW (t): doing so would lead us topartiularise our model so that it renders the model of the tested mahine. As wewant to give a model of an arhiteture, we should on the ontrary dene a loosermodel whih inludes the observed outomes of any mahine that implementsthe arhiteture.To be more preise, given an arhiteture A, a model W = Wmm(A) andan implementation M of A, we dene two requirements that must satisfy W tobe valid and aurate with respet to M :Denition 5 (Validity and auray of a model)
valid W , ∀M, ∀t,OM (t) ⊆ VW (t)
accurateM W , ∀t,VW (t) ⊆ OM (t)RR n° 7010
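For a single machine and a single test, the two requirements of Definition 5 reduce to set inclusions, as the following Python sketch suggests. Outcomes are encoded as pairs of final register values for test i3; the `observed` set is made up for the sake of the example and is not a measurement, and the function names are assumptions.

```python
def valid_on(observed, allowed):
    """O_M(t) is included in V_W(t): no observed outcome is forbidden."""
    return observed <= allowed


def accurate_on(observed, allowed):
    """V_W(t) is included in O_M(t): every allowed outcome was observed."""
    return allowed <= observed


# Outcomes of i3 as (0:r1, 1:r2) pairs.
allowed_weak = {(0, 0), (0, 1), (1, 0), (1, 1)}   # a model allowing all four
allowed_sc = {(0, 1), (1, 0), (1, 1)}             # SC forbids (0, 0)
observed = {(0, 0), (0, 1), (1, 1)}               # illustrative data only
```

With this made-up data the weaker model is valid but not accurate, since `(1, 0)` is allowed yet absent from `observed`, while the SC model is invalidated by the observation of `(0, 0)`.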
Test   Outcome observed   Outcome never observed
i5     int = false        int = true
i6     ext = false        ext = true
i3     WR ⊄ ppo           WR ⊆ ppo
i4     WW ⊄ ppo           WW ⊆ ppo
i1     RW ⊄ ppo           RW ⊆ ppo
i2     RR ⊄ ppo           RR ⊆ ppo

Figure 5: Summary of characteristic tests
Figure 6: Characteristic tests exhibiting relaxations. The six panels show the following executions (edges marked "ppo?" or "global?" are the ones under test):
i1, RW relaxation: a: R[x]=1, b: W[y]=1 on P0 (ppo?); c: W[y]=2, d: W[x]=1 on P1 (fenced, ab).
i2, RR relaxation: a: W[x]=1, b: W[y]=1 on P0 (fenced, ab); c: R[y]=1, d: R[x]=0 on P1 (ppo?).
i3, WR relaxation: a: W[x]=1, b: R[y]=0 on P0 (ppo?); c: W[y]=1, d: R[x]=0 on P1 (ppo?).
i4, WW relaxation: a: W[x]=1, b: W[y]=2 on P0 (ppo?); c: W[y]=3, d: W[x]=4 on P1 (ppo?).
i5, rfi relaxation: a: W[x]=1, b: R[x]=1 (rfi, global?), c: R[y]=0 (ll); d: W[y]=1, e: R[y]=1 (rfi, global?), f: R[x]=0 (ll).
i6, rfe relaxation: a: W[x]=1; b: R[x]=1 (rfe, global?), c: W[y]=1 (ls); d: R[y]=1 (rfe, global?), e: R[x]=0 (ll).

Thus we require our model to be valid but not necessarily accurate with respect to all its implementations.

2.4.3 Characteristic tests

We present here the key tests to understand how to instantiate the parameters of our model. Let us assume given a machine M that implements a model W. The main idea is to observe one relaxation (of the store atomicity or ordering constraints) at a time, by considering an execution where all relations involved are global, except the one in question. Thus, if the specified outcome is observed, M exhibits this relaxation; otherwise there would have been a cycle in the validity check of this execution, which would have been forbidden.

We assume here it is always possible to maintain a R* pair in program order, using a dependency between the two accesses. This does not mean we consider all R* pairs to be preserved in program order, but only the ones that have a dependency between them. Maintained RR (resp. RW) pairs will be depicted by ll (resp. ls).

Globality of rfmaps

Internal rfmaps
P0              P1
(a) x ← 1       (d) y ← 1
(b) r1 ← x      (e) r3 ← y
    ll              ll
(c) r2 ← y      (f) r4 ← x
i5: r1 = r3 = 1 ∧ r2 = r4 = 0 ?

The test i5 can be found in [6], with number 2.4: it is claimed to highlight a feature called intra-processor forwarding, and illustrates the visibility of store buffering to the programmer. If the specified outcome of this test is observed, then we consider $\xrightarrow{rfi}$ not to be global on M. If the specified outcome never shows up, then $\xrightarrow{rfi}$ can be considered global. Indeed, as depicted at fig. 6, if internal $\xrightarrow{rf}$ were global, there would be a bold $\xrightarrow{rf}$ between events a and b (W[x] and R[x] from P0) and between events d and e (W[y] and R[y] from P1). This would lead to a cycle $a \xrightarrow{rfi} b \xrightarrow{ppo} c \xrightarrow{fr} d \xrightarrow{rfi} e \xrightarrow{ppo} f \xrightarrow{fr} a$ in this execution, which would therefore be forbidden.

External rfmaps
      P0                P1                P2
(a)   x ← 1       (b)   r1 ← x      (d)   r2 ← y
                        ls                ll
                  (c)   y ← 1       (e)   r3 ← x
i6: r1 = 1 ∧ r2 = 1 ∧ r3 = 0 ?

The test i6 can also be found in [6], with number 2.6, or in the literature under the name WRC [13], and finally in the Power documentation with name isa1 [3]. If the specified outcome of this test is observed, then we consider rfe→ not to be global on M. If the specified outcome never shows up, then rfe→ can be considered global. Indeed, as depicted at fig. 6, if external rf→ were global, there would be a bold rf→ between events a and b (W[x] from P0 and R[x] from P1) and between events c and d (W[y] from P1 and R[y] from P2). This would lead to a cycle a rfe→ b ppo→ c rfe→ d ppo→ e fr→ a in this execution, which would therefore be forbidden.

When any rf→ is observed not to be global on a machine, as this is the weakest condition on rf→, one should assume that any machine that implements W has the corresponding parameter set to false, otherwise W could be invalidated.

Preserved program order
WR pairs

The test i3, which is depicted at fig. 1, can be found in [6], with number 2.3a. If the specified outcome is observed, then WR pairs are not preserved in program order: the ppo parameter of this machine should not include WR pairs. If the specified outcome never shows up, then the ppo of this machine includes WR. Indeed, as depicted at fig. 6, if WR pairs were global, there would be a bold ppo→ between events a and b (W[x] and R[y] on P0) and between c and d (W[y] and R[x] on P1). This would lead to a cycle a ppo→ b fr→ c ppo→ d fr→ a in this execution, which would therefore be forbidden.
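The cycle argument used for i3 can be replayed mechanically. The following is a minimal sketch, not the authors' Coq or OCaml development: the event names a, b, c, d and the fr→ and ppo→ edges are those quoted above for i3, while the function name has_cycle and the edge-list encoding are illustrative assumptions.

```python
from collections import defaultdict

def has_cycle(edges):
    """Return True if the directed graph given as (src, dst) pairs has a cycle."""
    graph = defaultdict(list)
    for src, dst in edges:
        graph[src].append(dst)
    WHITE, GREY, BLACK = 0, 1, 2      # unvisited, on the DFS stack, finished
    colour = defaultdict(int)

    def visit(node):
        colour[node] = GREY
        for succ in graph[node]:
            if colour[succ] == GREY:  # back edge: a cycle exists
                return True
            if colour[succ] == WHITE and visit(succ):
                return True
        colour[node] = BLACK
        return False

    # Iterate over a copy: visiting sinks adds keys to the defaultdict.
    return any(colour[n] == WHITE and visit(n) for n in list(graph))

# Test i3 (assumed encoding): b reads y before c writes it, d reads x before a writes it.
FR = [("b", "c"), ("d", "a")]
# Edges added only if the machine preserves WR pairs in program order.
WR_PPO = [("a", "b"), ("c", "d")]

ghb_with_wr = FR + WR_PPO   # WR in ppo: the outcome must be forbidden
ghb_without_wr = FR         # WR relaxed: the outcome is allowed

print(has_cycle(ghb_with_wr))     # True
print(has_cycle(ghb_without_wr))  # False
```

Observing the outcome on hardware therefore tells us that the first graph cannot be the right one, i.e. that WR pairs are not in ppo.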
WW pairs

      P0                P1
(a)   x ← 1       (c)   y ← 3
(b)   y ← 2       (d)   x ← 4
i4: x = 1 ∧ y = 3 ?

If the specified outcome of the test i4 is observed, then WW pairs are not maintained in program order. If it never shows up, then the ppo of this machine includes WW. Indeed, as depicted at fig. 6, if WW pairs were global, there would be a bold ppo→ between events a and b (W[x] and W[y] on P0) and between c and d (W[y] and W[x] on P1). This would lead to a cycle a ppo→ b ws→ c ppo→ d ws→ a in this execution, which would therefore be forbidden.
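As a cross-check of the i4 argument, one can enumerate all sequentially consistent interleavings of the two threads and confirm that the final state x = 1 ∧ y = 3 is never produced. This is a brute-force operational sketch, a different technique from the axiomatic cycle check used in the text; the names interleavings, final_states, P0 and P1 are illustrative assumptions.

```python
def interleavings(t0, t1):
    """Yield every merge of t0 and t1 that keeps each thread's program order."""
    if not t0:
        yield list(t1)
        return
    if not t1:
        yield list(t0)
        return
    for rest in interleavings(t0[1:], t1):
        yield [t0[0]] + rest
    for rest in interleavings(t0, t1[1:]):
        yield [t1[0]] + rest

# Test i4: each write is a (location, value) pair, listed in program order.
P0 = [("x", 1), ("y", 2)]
P1 = [("y", 3), ("x", 4)]

def final_states(t0, t1):
    """Return the set of final (x, y) memory states over all SC interleavings."""
    outcomes = set()
    for schedule in interleavings(t0, t1):
        memory = {}
        for loc, val in schedule:
            memory[loc] = val          # the last write to a location wins
        outcomes.add((memory["x"], memory["y"]))
    return outcomes

sc_outcomes = final_states(P0, P1)
print((1, 3) in sc_outcomes)  # False: x = 1 and y = 3 needs a WW reordering
```

A machine that does exhibit (1, 3) must therefore have reordered at least one of the two WW pairs.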
RW pairs

      P0                P1
(a)   r1 ← x      (c)   y ← 2
(b)   y ← 1             fence
                  (d)   x ← 1
i1: r1 = 1 ∧ y = 2 ?

If the specified outcome of the test i1 is observed, then RW pairs are not maintained in program order. If it never shows up, then the ppo of this machine includes RW. Indeed, as depicted at fig. 6, if RW pairs were global, there would be a bold ppo→ between events a and b (R[x] and W[y] on P0). If the barrier on P1 is B-cumulative, it orders events c and d but also c and a. This would lead to a cycle a ppo→ b ws→ c ab→ a in this execution, which would therefore be forbidden.
RR pairs

      P0                P1
(a)   x ← 1       (c)   r3 ← y
      fence
(b)   y ← 1       (d)   r4 ← x
i2: r3 = 1 ∧ r4 = 0 ?

The test i2 can be found in [6], with number 2.1. If M has global external rf, and preserves WW pairs, and the specified outcome is observed, then RR pairs are not preserved in ppo. If the specified outcome is never observed under the same hypothesis, then ppo includes RR pairs.

If M does not have global external rf, we can consider the same test modified so that a B-cumulative barrier is between the instructions on P0: the B-cumulativity enforces a global ordering between (a) W[x] on P0 and (c) R[y] on P1. If the specified outcome is observed, we can conclude that RR pairs are not in ppo: otherwise, there would be a bold ppo→ between c and d (R[y] and R[x] on P1), which would lead to a cycle a ab→ c ppo→ d fr→ a in the execution, therefore forbidden.
3 Semantics of barriers

3.1 Barriers guarantee

Let us consider two architectures A1 ≤ A2. We examine here what the barriers provided by A1 should guarantee to restore A2.

We note rf2\1→ for rf2?→ \ rf1?→. We define the predicate A1.fb (for fully barriered) on A1 as follows:

\[ \xrightarrow{ab_1} \;=\; \left(\xrightarrow{rf_{2\setminus 1}}\right)^{?} ;\; \xrightarrow{ppo_2} \;;\; \left(\xrightarrow{rf_{2\setminus 1}}\right)^{?} \]

We show that this is a sufficient condition on the barriers provided by A1 to restore an execution valid on A2:

Theorem 5 (Barriers guarantee)

\[ \forall A_1\, A_2,\; A_1 \le A_2 \;\Rightarrow\; \forall E\, X,\; A_1.\mathit{valid}\ E\ X \,\wedge\, A_1.\mathit{fb}\ E\ X \;\Rightarrow\; A_2.\mathit{valid}\ E\ X \]

Proof [in Coq] Suppose that X is not valid on A2: thus we have a cycle in A2.ghb, that is in ws→ ∪ fr→ ∪ rf2?→ ∪ ppo2→. Such a cycle is a cycle in ws→ ∪ fr→ ∪ rf1?→ ∪ ppo1→ ∪ (rf2\1→)? ∪ ppo2→ ∪ (rf2\1→)?, which implies a cycle in ws→ ∪ fr→ ∪ rf1?→ ∪ ppo1→ ∪ ab1→, that is, A1.ghb. Thus we contradict the validity of X on A1.

This result provides an insight into the power that a barrier provided by an architecture A1 (e.g. PowerPC) should have to restore a stronger model (e.g. SC). First, the barrier should restore the pairs that are preserved in program order on A2 but not on A1. Second, the barrier should compensate the lack of relations between write and read events, which we model by rf→ not being a global relation in the general case. Thus, if the rf→ relation is not global on A1 but global on A2, we overcome the lack of globality of rf→ by ordering the beginning with the end of the chain. This is how we interpret the cumulativity of barriers as stated in the PowerPC documentation [3]. We interpret furthermore the A-cumulativity (resp. B-cumulativity) property as applying to barriers that enforce ordering of pairs in rf→; po→ (resp. po→; rf→). We consider a barrier that only preserves pairs in po→ to be non cumulative.

Provided a barrier that has such power, we also have an insight on where to place these barriers in the code: the statement of the theorem indicates indeed that barriers should be in between any pairs in ppo2→ such that one of the components of this pair (or both) may give rise to a rf→ relation that is to be global on the stronger architecture A2 but is not on A1.

From any architecture to Sc

We designed semantics for a barrier that, for any weak memory model induced by an architecture A, would suffice to reestablish Sc.
\[ e_1 \xrightarrow{fenced} e_2 \;\triangleq\; \exists b,\; e_1 \xrightarrow{po} b \,\wedge\, b \xrightarrow{po} e_2 \]

\begin{align*}
\xrightarrow{fenced} \;&=\; \xrightarrow{po} && \text{(placement)}\\
e_1 \xrightarrow{ab} e_2 \;&\triangleq\; e_1 \xrightarrow{fenced} e_2 && \text{(base)}\\
&\vee\; e_1 \xrightarrow{rf} r \,\wedge\, r \xrightarrow{ab} e_2 && \text{(A-cumulativity)}\\
&\vee\; e_1 \xrightarrow{ab} w \,\wedge\, w \xrightarrow{rf} e_2 && \text{(B-cumulativity)}
\end{align*}

This barrier orders all pairs in po→, as the base case indicates; it also compensates the possible lack of visibility of rf→ on A by ordering the two ends of a chain rf→; po→ (resp. po→; rf→), as the A-cumulativity (resp. B-cumulativity) case indicates.

3.2 Considering a weaker guarantee

We said the barrier should first restore the pairs that are preserved in program order on the stronger architecture. Thus, for the simple case where none of the components of a pair in ppo→ gives rise to a rf→ relation that is global on A2 but not on A1, there is no need for a barrier as powerful as above: a barrier that only orders the events that surround it statically would be enough. Consider the wfb predicate:

\[ \mathit{wfb} \;\triangleq\; \xrightarrow{ppo_2} \setminus \xrightarrow{ppo_1} \]

The following result arises as a natural corollary of thm. 5.

Corollary 3 (Non cumulative barriers guarantee)

\[ \forall A_1\, A_2,\; A_1 \le A_2 \,\wedge\, \xrightarrow{rfe_1\,?} = \xrightarrow{rfe_2\,?} \;\Rightarrow\; \forall E\, X,\; A_1.\mathit{valid}\ E\ X \,\wedge\, A_1.\mathit{wfb}\ E\ X \;\Rightarrow\; A_2.\mathit{valid}\ E\ X \]

From Tso to Sc

As rfe→ are considered global in both Tso and Sc, we only need a non cumulative barrier to restore Sc from Tso. We define the pairs of writes and reads in po→ as follows: WR ≜ {(w, r) | w po→ r}. To restore Sc from Tso, we need to preserve the WR pairs, as they are preserved on Sc but not on Tso. As internal rf→ are WR pairs, such a barrier would compensate the lack of visibility of internal rf→ on Tso as well. Therefore we define the following barrier semantics:

\begin{align*}
\xrightarrow{fenced} \;&=\; WR && \text{(placement)}\\
e_1 \xrightarrow{ab} e_2 \;&\triangleq\; e_1 \xrightarrow{fenced} e_2 && \text{(base)}
\end{align*}

We define the predicate Tso.wfb as follows: Tso.wfb ≜ Tso.ab = WR, and the following theorem arises as a natural consequence of cor. 3:

Theorem 6 (Barriers placement on Tso)

\[ \forall E\, X,\; \mathit{Tso}.\mathit{valid}\ E\ X \,\wedge\, \mathit{Tso}.\mathit{wfb}\ E\ X \;\Rightarrow\; \mathit{Sc}.\mathit{valid}\ E\ X \]
Proof [in Coq] From X being wfb on Tso, we have Tso.ghb = ws→ ∪ fr→ ∪ ppo_tso→ ∪ rfe→ ∪ WR, which is acyclic since X is valid on Tso. As WR covers both rfi→ and WR pairs that are not in rf→, we get the acyclicity of Sc.ghb directly.

From any architecture A ≤ Tso to Tso

We designed semantics for a barrier that, for any weak memory model induced by an architecture A weaker than
Tso, would reestablish Tso:

\begin{align*}
\xrightarrow{fenced} \;&=\; \xrightarrow{ppo\_tso} && \text{(placement)}\\
e_1 \xrightarrow{ab} e_2 \;&\triangleq\; e_1 \xrightarrow{fenced} e_2 && \text{(base)}\\
&\vee\; e_1 \xrightarrow{rfe} r \,\wedge\, r \xrightarrow{ab} e_2 && \text{(A-cumulativity)}\\
&\vee\; e_1 \xrightarrow{ab} w \,\wedge\, w \xrightarrow{rfe} e_2 && \text{(B-cumulativity)}
\end{align*}

Here, all pairs except WR are to be preserved in program order, which is depicted by the placement condition fenced→ = ppo_tso→. As internal rf→ are not considered global in Tso, there is no need to compensate them: the ordering power of the barrier concerns only the external rf→, as depicted by the A- and B-cumulativity cases.

We retrieve what is described in the Sparc V9 documentation [2]: TSO is indeed obtained from PSO, which is obtained from RMO by barrier placements. In our framework, we define Rmo and Pso as follows, where R∗l (resp. WWl) represents all pairs of R∗ (resp. WW) to the same location:

\begin{align*}
\mathit{Rmo}.\mathit{Arch} \;&\triangleq\; (R{*}_l \cup WW_l,\ \mathit{false},\ \mathit{true},\ ab_1)\\
\mathit{Pso}.\mathit{Arch} \;&\triangleq\; (R{*} \cup WW_l,\ \mathit{false},\ \mathit{true},\ ab_2)
\end{align*}

As for Tso, we deduce from the Val axiom that external rf→ are global, whereas internal ones are not.

The documentation specifies that PSO is obtained from RMO by adding
LoadLoad and LoadStore barriers after each read. This statement has two consequences in our framework: first, R∗ pairs are preserved in Pso, and second, to restore Pso from Rmo, since the external rf→ are already global on Rmo, one should use a non cumulative barrier that preserves R∗ pairs.

TSO is obtained from PSO by adding StoreStore barriers after each write. We conclude that WW pairs are preserved in Tso, and that to restore Tso from Pso, one should use a non cumulative barrier that preserves WW pairs. Thus, from Rmo to Tso, a non cumulative barrier that preserves R∗ and WW is needed, as stated by cor. 3.

4 Case study: a Power model

In this section we define an architecture for Power, that is, we define relations ppo→ and ab→ and booleans int and ext. We do not claim to give a definitive PowerPC
model; we rather confront a tentative PowerPC model against actual PowerPC machines.

4.1 Complete event structures and execution witnesses

Henceforth, we will reason on complete event structures, from which section 2 abstracts. We shall avoid exhaustive treatment of complete event structures; interested readers may refer to [18, 9]. Instead, we sketch the main ideas.

Additional events

In addition to memory events, the execution of an instruction may generate a variety of events: most instructions generate register events that render accesses to registers, memory barrier instructions generate barrier events, and conditional branch instructions generate commit events that express branching decisions. We note B the set of barrier events and C the set of commits, b and c being typical elements. We shall handle three memory barrier instructions: isync, sync and lwsync. The corresponding events are distinguished by predicates is-isync, etc. As in previous sections, we still denote the set of memory events by E (typical element e), the set of memory read events by R (typical element r), and the set of memory write events by W (typical element w).

Extended or additional relations

We extend the program order relation po→ to all events. In particular, po→ now orders both memory and barrier events. Moreover, complete event structures comprise additional relations, more specifically intra-instruction causality iio→, which represents the ordering constraints of events within a same instruction. Moreover, the following relation fenced(k)→ renders the presence of a barrier of style k between memory events e1 and e2:

\[ e_1 \xrightarrow{fenced(k)} e_2 \;\triangleq\; \exists b,\; \mathit{is\text{-}k}(b) \,\wedge\, e_1 \xrightarrow{po} b \xrightarrow{po} e_2 \]

Execution witnesses also become more complete, as relation rf→ now relates register events. We note rf-reg→ the subrelation of rf→ that relates register stores to register loads that read their values. As a side note, relation rf-reg→ derives from sequential execution in a much stronger sense than rf→ on memory. Namely, w rf-reg→ r when w is maximal amongst the predecessors of r in program order.

Illustration

Figure 7 shows a program fragment and a fragment of the corresponding complete event structure, together with an execution witness.

The first instruction is an indexed load from memory lwz: base address y is taken from register r5 (which has been written into elsewhere), index is 0, and the value read from the effective address y + 0 is stored into register r2. We here view three events, labelled on the figure as c, a and d. As an example of intra-instruction dependency, a iio→ d expresses that loading from memory precedes storing into register r2.

The compare and branch sequence is less trivial: the instruction cmpwi r2,1 compares the content of r2 to the constant 1 and stores the result of comparison (2 means equality) into the control register cr0. The next instruction is a conditional branch bne, with branching decision (commit event h) conditioned by the contents of control register cr0 (g iio→ h). Instruction bne is branch not equal. Thus, since the control register signals equality, the branch is not taken and the next instruction to be executed is the store stw, as shown by the arrows h po→ i, h po→ j and h po→ b.

    L1: lwz   r2,0(r5)
        cmpwi r2,1
        bne   L1
        stw   r3,0(r6)

[Execution graph. Events: a: R[y]=1; b: W[z]=3; c: R 1:r5=y; d: W 1:r2=1; e: R 1:r2=1; f: W 1:CR0=2; g: R 1:CR0=2; h: Commit; i: R 1:r3=3; j: R 1:r6=z. Edge labels: iio, rf-reg, rf, po, dd, ctrl.]
Figure 7: Example of preserved program order

4.2 Globality of rfmaps

Running tests i5 and i6 on a Power machine yields the specified outcomes. Thus, we consider the rfi→ and rfe→ relations not to be global for Power.

4.3 Preserved program order ppo

Some parts of the po→ program order relation are reflected in the global happens-before relation. In this section we pick those out, defining a preserved-program-order relation ppo→.

Data dependencies

Data dependencies within a processor arise from any combination of the reads-from relation on registers and the intra-instruction causality relation:

\[ \xrightarrow{dd} \;\triangleq\; \left( \xrightarrow{rf\text{-}reg} \,\cup\, \xrightarrow{iio} \right)^{+} \]

(here R+ denotes the transitive closure of R). Note that this relation includes no dependencies via memory.

The restriction of the above to memory events is written dd-mem→:

\[ \xrightarrow{dd\text{-}mem} \;\triangleq\; \xrightarrow{dd} \,\cap\, (M \times M) \]

Control dependencies

A memory write is control dependent on a memory read if there is an intervening commit (of a conditional branch) that is data-dependent on the read and precedes (in program order) the write:

\[ r \xrightarrow{ctrl} w \;\triangleq\; \exists c \in C.\; r \xrightarrow{dd} c \xrightarrow{po} w \]

This relation models the fact that memory writes are not speculated, whereas reads can be.

Isync dependencies

A memory event is isync-dependent on a memory read if there exists an intervening commit (of a conditional branch) that is data-dependent on the read and is separated (in program order) by an isync from the event:

\[ r \xrightarrow{isync} m \;\triangleq\; \exists c \in C.\; r \xrightarrow{dd} c \,\wedge\, c \xrightarrow{fenced(isync)} m \]

Note that this only adds anything beyond control dependencies in the case of a memory read/read pair.

Figure 7 gives an example of a control dependency, from load a to store b through commit h.
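The dependency relations above can be computed directly on the events of Figure 7. The sketch below is an illustration under stated assumptions, not the authors' tool: the edges a iio→ d, g iio→ h and the three po→ edges out of h are given in the text, whereas the remaining iio→ and rf-reg→ edges are our reading of the figure and should be treated as assumptions; the function and variable names are hypothetical.

```python
def transitive_closure(pairs):
    """Return the transitive closure of a relation given as a set of pairs."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

# Intra-instruction causality; (a, d) and (g, h) are stated in the text,
# the other edges are assumed from the instruction semantics of Figure 7.
IIO = {("c", "a"), ("a", "d"), ("e", "f"), ("g", "h"), ("i", "b"), ("j", "b")}
# Register read-from edges (assumed): r2 flows d -> e, CR0 flows f -> g.
RF_REG = {("d", "e"), ("f", "g")}
# Program order out of the commit event h, as stated in the text.
PO = {("h", "i"), ("h", "j"), ("h", "b")}

MEM_READS = {"a"}    # a: R[y]=1
MEM_WRITES = {"b"}   # b: W[z]=3
COMMITS = {"h"}
MEMORY = MEM_READS | MEM_WRITES

# dd = (rf-reg U iio)+, then restricted to memory events.
dd = transitive_closure(IIO | RF_REG)
dd_mem = {(x, y) for (x, y) in dd if x in MEMORY and y in MEMORY}

# r ctrl-> w iff some commit c satisfies r dd-> c po-> w.
ctrl = {(r, w)
        for r in MEM_READS for w in MEM_WRITES for c in COMMITS
        if (r, c) in dd and (c, w) in PO}

print(sorted(ctrl))    # [('a', 'b')]
print(sorted(dd_mem))  # []
```

Under these assumptions the only preserved pair between memory events is the control dependency from load a to store b, which matches the example given in the text.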
All together

The preserved program order relation is just the union of data dependencies for memory events, control dependencies, and isync dependencies:

\[ \xrightarrow{ppo} \;\triangleq\; \xrightarrow{dd\text{-}mem} \,\cup\, \xrightarrow{ctrl} \,\cup\, \xrightarrow{isync} \]

Note that memory stores are never the source of a ppo→ pair, as a consequence of the instruction semantics: a memory store cannot be the source of an iio→ pair, nor of a rf-reg→ pair. Thus, a natural partition of ppo→ pairs is into load/load pairs and load/store pairs. We include these pairs in the global happens-before relation:

• for load-load pairs: we refer to [5, pp. 653-668], which states that in such a situation, load r1 will be performed before load r2 with respect to any processor, which we interpret as r1 being globally performed before r2. It is not clear to us whether or not these notions are equivalent in the case of a load, or if our interpretation makes sense w.r.t. architectural insights;

• for load-store pairs: we deduce from r ppo→ w that r happens before the store is initiated. Since we consider loads to be atomic (thus considering they are globally performed as soon as they are initiated) and stores to be initiated before being globally performed, we deduce that r is globally performed before w is, and include such ppo→ pairs in the global happens-before relation.

4.4 Values do not come out of thin air

In the application of our framework to Power, load-store pairs receive a particular treatment: we extend load-store ppo→ edges by following rf→ ones, thus defining another relation ppo-ext→, which we will also include in the global happens-before relation.

Consider a triple r ppo→ w rf→ r′, where r′ does not need to originate from the same processor as r and w. Here r happens before the store is initiated, and r′ happens after the store is initiated. Thus, intuition suggests that r globally happens before r′. We define the extension of ppo→ by rf→ as follows:

\[ r \xrightarrow{ppo\text{-}ext} r' \;\triangleq\; \exists w,\; r \xrightarrow{ppo} w \xrightarrow{rf} r' \]

Our extension of load-store dependencies is a generalization of a check that weak models often (and arguably must) incorporate: the causality check of [10], or the "values do not come out of thin air" of [7]. The canonical example of such a check is as follows:

      P0                P1
(a)   r1 ← x      (c)   r1 ← y
(b)   y ← r1      (d)   x ← r1

We further assume that x and y initially contain 0. Without specific provision in the model, the absurd outcome x = 1; y = 1 might remain valid, as demonstrated by the following execution witness:

[Execution witness. Events: a: R[x]=1; b: W[y]=1; c: R[y]=1; d: W[x]=1. Edges: a ppo-proc→ b, b rf→ c (y=1), c ppo-proc→ d, d rf→ a (x=1).]

Our model invalidates the execution thanks to ppo extension. Namely, we have a ppo-ext→ c (by a ppo-proc→ b rf→ c) and c ppo-ext→ a (by c ppo-proc→ d rf→ a). Hence, ppo→ alone is cyclic. A fortiori ghb→ is cyclic, since ppo→ is included in ghb→. Note that we could prevent values from coming out of thin air by adding another sanity check on the rf→ relation in our generic framework, following Alpha's documentation.

There are two reasons for considering such an extension to be global, that is, included in the global happens-before relation:

• it rules out some examples in which values would appear out of thin air, as in the example above;

• from a global time perspective, r and r′ are ordered respectively before and after the point of time when w is initiated; for more details, see section 6 where an example is discussed.

This is perhaps intuitively plausible, and it suffices to rule out some examples in which values would appear out of thin air (we suppose here that the architecture should rule out thin-air reads, though that might be debated), and to correspond with our Power5 experiments on the test presented in section 6. However, such an extension does not seem to be forced. Therefore, it is not clear to us whether we should forbid that behaviour in the model.

Illustration

We consider here the litmus test adir1v3 (a variation on [7, Test 1]).

Figure 8 shows the program and a non-SC execution (hb→ ∪ po→ is cyclic). One may first notice the ppo→ relation between load b and store c. It follows from a data dependency, since the effective address of c is exactly the value read by b (i.e. the address of location y). The relation ghb→ is highlighted with black bold arrows. We have:

1. b ppo→ c (data dependency), and thus b ppo-ext→ d (by b ppo→ c rf→ d and ppo-extension).
2. d sync→ e (sync instruction), and thus c ab→ e (by c rf→ d sync→ e and A-cumulativity).
3. e fr→ a, by definition of fr→.

Clearly, ghb→ is acyclic and the execution is valid. In some sense, validity follows from a rf→ b not being global, since there is a cycle a rf→ b ghb→ a. Note that even if we strengthen the ppo→ relation by its extension, this outcome is still valid.

This rf→ relation is internal to a processor; by considering it not to be global, we model the presence of a store buffer on this processor, which we suppose to be at least a part of the reason why this behaviour is observed.

      P0                P1
(a)   x ← &y      (d)   r3 ← y
(b)   r6 ← x            sync
(c)   *r6 ← 1     (e)   r4 ← x
Initially: x=&z;
Allowed: 1:r3=1; 1:r4=&z;

[Execution graph. Events: a: W[x]=y; b: R[x]=y; c: W[y]=1; d: R[y]=1; e: R[x]=z. Edge labels: po, rf, ppo, ppo-ext, sync, A/B, fr.]
Figure 8: Litmus test adir1v3

4.5 Cumulative memory barriers

sync

The sync barrier is the incarnation of the SC-restoring cumulative barrier described in section 3.1, whose definition we expand for clarity:

\begin{align*}
\xrightarrow{sync} \;&\triangleq\; \xrightarrow{fenced(sync)}\\
e_1 \xrightarrow{ab\text{-}sync} e_2 \;&\triangleq\; e_1 \xrightarrow{sync} e_2 && \text{(base)}\\
&\vee\; e_1 \xrightarrow{rf} r \xrightarrow{sync} e_2 && \text{(A-cumulativity)}\\
&\vee\; e_1 \xrightarrow{sync} w \xrightarrow{rf} e_2 && \text{(B-cumulativity)}\\
&\vee\; e_1 \xrightarrow{rf} r \xrightarrow{sync} w \xrightarrow{rf} e_2 && \text{(A/B-cumulativity)}
\end{align*}

lwsync

PowerPC features a lightweight cumulative barrier, lwsync, whose semantics we define as follows:
\begin{align*}
\xrightarrow{lwsync} \;&\triangleq\; \xrightarrow{fenced(lwsync)} \,\cap\, \big((W \times W) \cup (R \times E)\big)\\
e_1 \xrightarrow{ab\text{-}lw} e_2 \;&\triangleq\; e_1 \xrightarrow{lwsync} e_2 && \text{(base)}\\
&\vee\; e_1 \xrightarrow{rf} r \,\wedge\, r \xrightarrow{ab\text{-}lw} e_2 \,\wedge\, e_2 \in W && \text{(A-cumulativity)}\\
&\vee\; e_1 \xrightarrow{ab\text{-}lw} w \,\wedge\, w \xrightarrow{rf} e_2 \,\wedge\, e_1 \in R && \text{(B-cumulativity)}
\end{align*}

In other words, lwsync acts as sync except on store-load pairs, the exception impacting both the base and cumulativity cases.

Finally we define relation ab→ as the union of ab-sync→ and ab-lw→.

Analogy between ppo-ext→ and B-cumulativity

The ppo-ext→ extension is arguably an analog of B-cumulativity. Here we conjecture that barriers implement A-cumulativity by waiting for some stores to be performed globally, in which case ppo→ ignores the issue. By contrast, B-cumulativity on a load-store pair demands no specific actions, as the natural consequence of a w rf→ r implying that r is performed only once w is issued.

5 Barrier experiments

5.1 Official tests

A programming note in [3, p. 415] describes two examples in precise prose. We formulate those as invalid executions of litmus tests.

isa1

Let us first examine the simpler test isa1, given as pseudo-code at the top of figure 9. PowerPC documentation states: "Cumulative ordering dictates that the value loaded from location x by processor 2 is 1." Our interest is in an officially invalid execution, which our Power model should also deem invalid (bottom of figure). Thus, we interpret the above prescription as forbidding the value loaded from location x by processor 2 to be 0, the initial contents of x.

To relate an execution graph to a litmus test, one may first relate events to instructions, using the event annotations ((a), (b), ...) in program text and the po→ arrows in graphs. For instance, P1 performs a load from location x, reading 1 (event b), and a store of 2 to location y (event c), and those are separated by a sync instruction (relation b sync→ c). Other arrows are as follows: dashed arrows give relation rf→, with pending arrows to read events being loads from initial state, and pending arrows from write events being stores to final state; while bold arrows give relation ghb→. For instance, e fr→ a results from event e reading the initial value of location x, which is overwritten by event a. Or, a ab→ d results from barrier cumulativity by a rf→ b sync→ c rf→ d.

We can now reach interesting conclusions quite easily: by cor. 1, the execution shown is not SC, since there is a cycle a rf→ b po→ c rf→ d po→ e fr→ a. More importantly, the execution is not valid in the Power model, since there are cycles in ghb→. It is worth noticing that the test is presented by [3] as illustrating "cumulative ordering of storage accesses preceding a memory barrier". As a consequence, the cycle a ab→ c ab→ e fr→ a is the most illustrative. Namely, a ab→ c follows from a rf→ b and b sync→ c; while c ab→ e follows from c rf→ d and d sync→ e.

isa1
      P0                P1                P2
(a)   x ← 1       (b)   r1 ← x      (d)   r1 ← y
                        sync              sync
                  (c)   y ← 2       (e)   r2 ← x
Forbidden: 1:r1=1; 2:r1=2; 2:r2=0;

[Execution graph. Events: a: W[x]=1; b: R[x]=1; c: W[y]=2; d: R[y]=2; e: R[x]=0. Edge labels: rf, po, sync, A/B, fr.]
Figure 9: Officially invalid execution of isa1

isa2

The more complex test isa2 (figure 10) is a refinement of isa1: a chain of store-reads from P0 to P2 that passes through P1.

But P0 now performs two stores to locations x and y, separated by a sync; while P1 loops loading y until it reads the value 2 written to y by P0, before storing value 3 to location z. P2 remains essentially unchanged. As for isa1, the architecture specification forbids that P2 loads value 0 from location x (third instruction) when it has loaded (first instruction) the value (here 3) stored by P1 in some memory location used for communicating (here z). A key observation is the absence of a barrier in P1 code. Instead, we have a control dependency. The example being official, we assume that such a control dependency suffices to prevent the last load of P2 from reading value 0.

In presence of a conditional branch, there is a clear distinction between program text and execution, or, more precisely, between program listing order and program order po→. We select a particular (invalid) execution witness generated by memevents, where P1 executes two loop iterations (figure 10). The control dependency is expressed as the two edges c ppo→ e and d ppo→ e. The execution is non-SC, by the existence of cycle a po→ b rf→ d po→ e rf→ f po→ g fr→ a. The execution is also invalid in our Power model, since ghb→ is cyclic. We clearly identify two cycles: a ab→ d ppo→ e ab→ g fr→ a and a ab→ d ppo-ext→ f sync→ g fr→ a. Note that the ppo-ext→ extension is not needed to conclude that this outcome is invalid in our Power model.

isa2
      P0                  P1                    P2
(a)   x ← 1         L1:                   (f)   r3 ← z
      sync          (c, d) r2 ← y               sync
(b)   y ← 2                cmp r2,2       (g)   r1 ← x
                           bne L1
                    (e)    z ← 3
Forbidden: 2:r3=3; 2:r1=0;

[Execution graph. Events: a: W[x]=1; b: W[y]=2; c: R[y]=0; d: R[y]=2; e: W[z]=3; f: R[z]=3; g: R[x]=0. Edge labels: po, sync, rf, fr, ppo, ppo-ext, A/B.]
Figure 10: Officially invalid execution of isa2

5.2 Classical tests

In the previous section, we have demonstrated that our Power model is correct w.r.t. the two official litmus tests that are publicly available. Clearly, two tests are insufficient to draw any conclusion and we need more.

Some litmus tests are conventional, such as iriw (Independent Reads of Independent Writes, figure 11) and rwc (Read To Write Causality, figure 12); see for instance [13] and [4, Example 7.7].

Figures 11 and 12 show non-SC execution witnesses, which are the ones of interest. To see that the executions we consider are non-SC, it suffices to follow rf→, fr→ and po→ arrows in any graph, so as to find a cycle. These graphs also show that the executions considered are invalid in our Power model, by the presence of bold ghb→ cycles.

We cannot conclude from the public documentation [5] whether these two tests are invalid on the Power architecture; therefore we resort to experimentation.

5.3 Experiments

In experiments, we observe a selection of the final values of registers and of memory locations, yielding outcomes. In the case of the four litmus tests isa1 to rwc, the final values of registers written to by the load instructions suffice to identify the non-SC execution depicted. For instance the outcome [1:r1=1; 1:r2=0; 2:r3=0;] suffices to identify the non-SC execution of rwc.

We performed experiments on two machines, doko and hpcx. doko is a 4-core Power5 machine, running Linux; while hpcx is a 16-core eServer 575, running AIX.
iriw
      P0                P1                P2                P3
(a)   r1 ← x      (c)   r2 ← y      (e)   x ← 1       (f)   y ← 2
      sync              sync
(b)   r2 ← y      (d)   r1 ← x
Observed? 0:r1=1; 0:r2=0; 1:r2=2; 1:r1=0;

[Execution graph. Events: a: R[x]=1; b: R[y]=0; c: R[y]=2; d: R[x]=0; e: W[x]=1; f: W[y]=2. Edge labels: sync (po), rf, fr, A/B.]
Figure 11: A non-SC execution of litmus test iriw

rwc
      P0                P1                P2
(a)   x ← 1       (b)   r1 ← x      (d)   y ← 1
                        sync              sync
                  (c)   r2 ← y      (e)   r3 ← x
Observed? 1:r1=1; 1:r2=0; 2:r3=0

[Execution graph. Events: a: W[x]=1; b: R[x]=1; c: R[y]=0; d: W[y]=1; e: R[x]=0. Edge labels: sync (po), rf, fr, A/B.]
Figure 12: A non-SC execution of litmus test rwc
isa2v1
      P0                  P1                    P2
(a)   x ← 1         L1:                   (f)   r3 ← z
      sync          (c, d) r2 ← y               r4 ← xor(r3,r3)
(b)   y ← 2                cmp r2,2       (g)   r1 ← *(&x+r4)
                           bne L1
                    (e)    z ← 3
Unobserved: 2:r3=3; 2:r1=0;
Figure 13: isa2v1

6 Towards a stronger model

We have provided a valid and accurate model for the Power architecture. However, our model may be, to some extent, too weak to program above. We suggest here some extensions to make our model stronger yet still valid.

6.1 Extension ppo-ext→

We have already presented the ppo-ext→ extension of the ppo→ relation. However, we did not use it in our preceding reasoning. Let us here consider the test isa2v1, depicted at fig. 13, which is a variation on isa2 presented at section 5.1. The execution witness we want to invalidate is depicted at fig. 14.

The code of P2 changes: a sync instruction is replaced by a data dependency from the value of load f to the effective address of load g. As a result, we now have f ppo→ g, where we previously had f sync→ g. Notice that the dependency is a so-called false dependency: the load instruction g always reads location x, regardless of the value read by the load instruction f. Nevertheless, we have f ppo→ g.

We have collected about 750 million machine outcomes for this test: outcome 2:r3=3; 2:r1=0 remains unobserved. One corresponding execution (with P1 loop being executed twice) is depicted at fig. 14.

We see that the pair d ppo-ext→ f is necessary to invalidate the execution. We do not know whether the Power architecture would deem the outcome valid or not; however, experiments suggest that this outcome never shows up. To include such a prescription in our model, we extend ppo→ into ppo-ext→.

6.2 Semantics of lwsync

We provide a rather weak (that is, permissive) semantics of lwsync. However, some programming patterns suggest a stronger semantics for this barrier. Let us indeed consider the following test lwsync, depicted at fig. 15, and the associated execution witness of interest, depicted at fig. 16.

W.r.t. the semantics of lwsync provided at section 4.5, this execution witness (and thus the associated outcome) is valid. However, this use of lwsync seems to be common practice amongst low-level programmers, and we have not observed this outcome yet. As for the ppo-ext→ extension, we do not know whether this execution should be considered valid w.r.t. the Power architecture. However, if so, we would need to extend our framework a bit to handle such a behaviour.
[Execution graph. Events: a: W[x]=1; b: W[y]=2; c: R[y]=0; d: R[y]=2; e: W[z]=3; f: R[z]=3; g: R[x]=0. Edge labels: po, sync, rf, fr, ppo, ppo-ext, A/B.]
Figure 14: Not observed execution of isa2v1

      P0                P1
(a)   x ← 1       (c)   r1 ← y
      lwsync            ll
(b)   y ← 1       (d)   r2 ← x
lwsync: x = 1 ∧ r1 = 1 ∧ r2 = 0 ?
Figure 15: lwsync

[Execution graph. Events: a: W[x]=1 (stw r1,0(r5)); h: Lwsync (lwsync); b: W[y]=1 (stw r2,0(r6)); c: R[y]=1 (lwz r3,0(r6)); d: R[x]=0 (lwzx r4,r9,r5). Edge labels: po, lwsync, rf, ppo, fr.]
Figure 16: Not observed execution of lwsync
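The difference between the lwsync and sync semantics of section 4.5 on the test of figure 15 can be checked with a small sketch. It is an illustration under assumptions rather than the memevents tool: events a, b, c, d and their rf→, fr→ and ppo→ edges follow figures 15 and 16, the external rf→ edge b → c is left out of ghb→ because rfe→ is not global on Power, and all function names are hypothetical.

```python
def is_acyclic(edges):
    """Return True if the relation has no cycle (repeatedly peel off source nodes)."""
    edges = set(edges)
    nodes = {n for edge in edges for n in edge}
    while nodes:
        sources = {n for n in nodes if all(dst != n for (_, dst) in edges)}
        if not sources:
            return False               # every remaining node has a predecessor
        nodes -= sources
        edges = {(s, d) for (s, d) in edges if s not in sources}
    return True

def ab_sync(fenced, rf):
    """Base, A-, B- and A/B-cumulativity cases of the sync barrier."""
    ab = set(fenced)
    ab |= {(e1, e2) for (e1, r) in rf for (r2, e2) in fenced if r == r2}
    ab |= {(e1, e2) for (e1, w) in fenced for (w2, e2) in rf if w == w2}
    ab |= {(e1, e2) for (e1, r) in rf for (r2, w) in fenced if r == r2
           for (w2, e2) in rf if w == w2}
    return ab

def ab_lwsync(fenced, rf, reads, writes):
    """Least fixpoint of the lwsync definition (recursive cumulativity cases)."""
    ab = {(e1, e2) for (e1, e2) in fenced
          if (e1 in writes and e2 in writes) or e1 in reads}
    while True:
        extra = {(e1, e2) for (e1, r) in rf for (r2, e2) in ab
                 if r == r2 and e2 in writes}            # A-cumulativity
        extra |= {(e1, e2) for (e1, w) in ab for (w2, e2) in rf
                  if w == w2 and e1 in reads}            # B-cumulativity
        if extra <= ab:
            return ab
        ab |= extra

WRITES, READS = {"a", "b"}, {"c", "d"}
FENCED = {("a", "b")}   # barrier between a and b on P0
RF = {("b", "c")}       # c reads the value written by b
PPO = {("c", "d")}      # maintained RR pair (ll) on P1
FR = {("d", "a")}       # d reads the initial x, overwritten by a

ghb_lwsync = PPO | FR | ab_lwsync(FENCED, RF, READS, WRITES)
ghb_sync = PPO | FR | ab_sync(FENCED, RF)

print(is_acyclic(ghb_lwsync))  # True: the witness is valid with lwsync
print(is_acyclic(ghb_sync))    # False: B-cumulativity of sync adds a -> c
```

With lwsync, the B-cumulativity case requires its source to be a read, so no edge from a to c is produced and the witness stays valid, which is the permissiveness discussed in section 6.2.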
Fenes in Weak Memory Models 377 Conlusion7.1 ContributionWe presented a generi framework for weak memory models at setion 2, whihinludes formal denitions of arhitetures and weak memory models. We high-lighted what we think to be the main onepts that should be preisely denedso as to provide formal weak memory models, namely globality of rfmaps andpreserved program order.In this framework, written in the Coq proof assistant [12℄, we have imple-mented lassial formal models suh as Sc and Tso whih we proved equivalentto the native ones, namely SC [17℄ and Spar TSO [1℄. We provided a formaldesription of barriers power at setion 3.1, together with a result on their plae-ment in the ode, whih allowed us to retrieve the inremental desription ofSpar TSO from two weaker models, whih implementations we have skethed.Thus, we illustrated the power of barriers as restoring a memory model from aweaker one, as dened at setion 2.As desribed at setion 2.4 we provide an exeutable version of our frameworkwritten in OCaml, memevents, whih exhaustively outputs all valid  in a sensewe dened formally at setion 2  exeutions of a test run on a partiulararhiteture, together with a testing tool litmus, whih allows us to run thesame tests on partiular hardware, so as to ompare our model and a givenmahine.These framework and tools, together we the testing methodology we provideat setion 2.4 allowed us to provide a model for the Power arhiteture, whihhas been observed to be valid and aurate  in a sense that we dened atsetion 2.4  with respet to the hardware.Thus, we have provided a formal theory  together with preise voabularyas required by reent work on memory models [14℄  whih we believe to haveproved useful on urrent arhitetures, via our simulation and testing tools.7.2 Status of writesWe onsider, in our framework, the writes to be non-atomi as opposed toprevious generi studies [11℄. 
To do so, we do not impose that communications through memory, which are depicted by our $\xrightarrow{rf}$ relations, be visible to all processors.

Moreover, we rely on a particular notion of a write being performed, which is inspired by the globally performed notion from [15]. This is not the most fine-grained notion of performed, as one may distinguish between performed in storage and performed in memory [19], or between performed with respect to a processor and performed with respect to all processors [5]. However, this has proved to be enough to highlight precise and fundamental notions to reason about several modern memory models, and to provide an accurate yet simple attempt at defining a model for the Power architecture including barrier semantics.

As underlined in section 6.1, we believe we need to extend our framework so as to provide a model that is not only valid but also easy to program against, as well as to handle an accurate description of the lock implementations proposed in the Power documentation [5].
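To make the role of non-global rf concrete, the following Python sketch checks the classical iriw test (two independent writers, and two readers that observe the two writes in opposite orders). It is only an illustration under stated assumptions, not the memevents implementation: we assume that validity reduces to acyclicity of a global happens-before relation built from ws, fr, preserved program order and the global part of rf, and that preserved program order keeps the two reads of each reader in order (for instance through dependencies or barriers). The event names and the functions is_acyclic and allowed are hypothetical.

```python
def is_acyclic(edges):
    """Return True if the directed graph given as (src, dst) pairs has no cycle."""
    succ = {}
    for a, b in edges:
        succ.setdefault(a, set()).add(b)
        succ.setdefault(b, set())
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, finished
    colour = {n: WHITE for n in succ}

    def visit(n):
        colour[n] = GREY
        for m in succ[n]:
            if colour[m] == GREY:  # back edge: a cycle goes through m
                return False
            if colour[m] == WHITE and not visit(m):
                return False
        colour[n] = BLACK
        return True

    return all(visit(n) for n in succ if colour[n] == WHITE)


# iriw: P0 writes x, P1 writes y, P2 reads x then y, P3 reads y then x.
# Candidate execution: P2 sees x=1 then y=0, P3 sees y=1 then x=0.
RF = {("Wx", "Rx@P2"), ("Wy", "Ry@P3")}         # reads that see the new value
PPO = {("Rx@P2", "Ry@P2"), ("Ry@P3", "Rx@P3")}  # assumed preserved read-read pairs
FR = {("Ry@P2", "Wy"), ("Rx@P3", "Wx")}         # reads of the initial value precede the write
WS = set()                                      # initial writes omitted for brevity


def allowed(global_rf):
    """Assumed validity check: the global happens-before relation is acyclic."""
    grf = RF if global_rf else set()  # only global rf edges enter the relation
    return is_acyclic(WS | FR | PPO | grf)


print(allowed(global_rf=True))   # False: writes behave atomically
print(allowed(global_rf=False))  # True: non-atomic writes allow the outcome
```

Under these assumptions, the cycle Wx, Rx@P2, Ry@P2, Wy, Ry@P3, Rx@P3, Wx exists only when both rf edges are global. Leaving them out of the global relation, as we do for non-atomic writes, makes the outcome allowed.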
Acknowledgments

We thank Assia Mahboubi and Vincent Siles for help and advice on the Coq development. We thank Damien Doligez and Xavier Leroy for valuable discussions.

This work made use of the facilities of HPCx, the UK's national high-performance computing service, which is provided by EPCC at the University of Edinburgh and by STFC Daresbury Laboratory, and funded by the Department for Innovation, Universities and Skills through EPSRC's High End Computing Programme.

References

[1] The Sparc Architecture Manual Version 8, 1992. http://www.sparc.org/standards/V8.pdf.
[2] The Sparc Architecture Manual Version 9, 1994. http://www.sparc.org/standards/V9.pdf.
[3] Power ISA Version 2.05. October 2007. http://www.power.org/resources/reading/PowerISA_V2.05.pdf.
[4] Intel 64 and IA-32 Architectures Software Developer's Manual, Vol. 3A. Intel Corporation, March 2009. rev. 30.
[5] Power ISA Version 2.06. January 2009. http://www.power.org/resources/reading/PowerISA_V2.06_PUBLIC.pdf.
[6] Intel 64 Architecture Memory Ordering White Paper. August 2007.
[7] A. Adir, H. Attiya, and G. Shurek. Information-Flow Models for Shared Memory with an Application to the PowerPC Architecture. IEEE Transactions on Parallel and Distributed Systems, May 2003.
[8] S. V. Adve and K. Gharachorloo. Shared memory consistency models: A tutorial. 1996.
[9] J. Alglave, A. Fox, S. Ishtiaq, M. O. Myreen, S. Sarkar, P. Sewell, and F. Zappa Nardelli. The semantics of Power and ARM multiprocessor machine code. In Proc. DAMP 2009, January 2009.
[10] Alpha Architecture Reference Manual, Fourth Edition, 2002. download.majix.org/dec/alpha_arch_ref.pdf.
[11] Arvind and J.-W. Maessen. Memory model = instruction reordering + store atomicity. In Proc. ISCA 2006, June 2006.
[12] Y. Bertot and P. Casteran. Interactive Theorem Proving and Program Development: Coq'Art: The Calculus of Inductive Constructions. Springer Verlag, EATCS Texts in Theoretical Computer Science.
[13] H.-J. Boehm and S.V. Adve. Foundations of the C++ concurrency memory model. In Proc. PLDI, 2008.
Fenes in Weak Memory Models 39[14℄ S. Burkhardt and M. Musuvathi. Memory model safety of programs. InECA-2, 2008.[15℄ M. Dubois and C. Sheurih. Memory aess dependenies in share-memorymultiproessors. IEEE Transations on Software Engineering, 16(6), June1990.[16℄ K. Gharahorloo. Memory onsisteny models for shared-memory multi-proessors. WRL Researh Report, 95(9), 1995.[17℄ L. Lamport. How to make a orret multiproess program exeute orretlyon a multiproessor. IEEE Trans. Comput., 46(7):779782, 1979.[18℄ S. Sarkar, P. Sewell, F. Zappa Nardelli, S. Owens, T. Ridge, T. Braibant,M. Myreen, and J. Alglave. The semantis of x86-CC multiproessor ma-hine ode. In Pro. POPL 2009, January 2009.[19℄ J. M. Stone and R. P. Fitzgerald. Storage in the PowerPC. IEEE Miro,1995.
Publisher: INRIA - Domaine de Voluceau - Rocquencourt, BP 105 - 78153 Le Chesnay Cedex (France), http://www.inria.fr
ISSN 0249-6399
