Context-Bounded Model Checking for POWER by Abdulla, Parosh Aziz et al.
CONTEXT-BOUNDED MODEL CHECKING FOR POWER ∗
TUAN PHONG NGO a, PAROSH AZIZ ABDULLA b, MOHAMED FAOUZI ATIG c,
AND AHMED BOUAJJANI d
a Hanoi University of Science and Technology, Vietnam
e-mail address: phong.ngotuan@hust.edu.vn
b,c Uppsala University, Sweden
e-mail address: parosh, mohamed faouzi.atig@it.uu.se
d IRIF Université Paris Diderot - Paris 7, France
e-mail address: abou@liafa.univ-paris-diderot.fr
Abstract. We propose an under-approximate reachability analysis algorithm for programs
running under the POWER memory model, in the spirit of the work on context-bounded
analysis initiated by Qadeer et al. in 2005 for detecting bugs in concurrent programs
(supposed to be running under the classical SC model). To that end, we first introduce a
new notion of context-bounding that is suitable for reasoning about computations under
POWER, which generalizes the one defined by Atig et al. in 2011 for the TSO memory model.
Then, we provide a polynomial size reduction of the context-bounded state reachability
problem under POWER to the same problem under SC: Given an input concurrent program
P, our method produces a concurrent program P ′ such that, for a fixed number of context
switches, running P ′ under SC yields the same set of reachable states as running P under
POWER. The generated program P ′ contains the same number of processes as P plus two
additional processes, and operates on the same data domain. By leveraging the standard
model checker CBMC, we have implemented a prototype tool and applied it on a set of
benchmarks, showing the feasibility of our approach.
1. Introduction
For performance reasons, modern multi-processors may reorder memory access operations.
This is due to complex buffering and caching mechanisms that make the response to memory
queries (load operations) faster, and make it possible to speed up computations by parallelizing
independent operations and computation flows. Therefore, operations may not be visible
to all processors at the same time, and they are not necessarily seen in the same order
by different processors (when they concern different variables). The only model where all
Key words and phrases: Concurrent programs, Safety property, Context-bounded model checking, Weak
memory model, POWER.
∗ A preliminary version of this paper appeared at TACAS’17 [4].
This work was supported in part by the Swedish Research Council and carried out within the Linnaeus
centre of excellence UPMARC, Uppsala Programming for Multicore Architectures Research Center.
Preprint submitted to
Logical Methods in Computer Science
© T.P. Ngo, P.A. Abdulla, M.F. Atig, and A. Bouajjani
Creative Commons
arXiv:1702.01655v5 [cs.PL] 11 Jan 2019
operations are visible immediately to all processors is the Sequential Consistency (SC) model
[36] which corresponds to the standard interleaving semantics where the program order
between operations of the same processor is preserved. Modern architectures adopt weaker
models (in the sense that they allow more behaviours) due to the relaxation in various
ways of the program order. Examples of such weak models are TSO adopted in Intel x86
machines for instance, POWER adopted in PowerPC machines, or the model adopted in
ARM machines.
Apprehending the effects of all the relaxations allowed in such models is extremely hard.
For instance, while TSO allows reordering stores past loads (of different variables) reflecting
the use of store buffers, a model such as POWER allows reordering of all kinds of store
and load operations under quite subtle conditions. A lot of work has been devoted to the
definition of formal models that accurately capture the program semantics corresponding to
models such as TSO [47, 42] and POWER [46, 45, 14, 38]. Still, programming against weak
memory models is a hard and error prone task. Therefore, developing formal verification
approaches under weak memory models is of paramount importance. In particular, it
is crucial in this context to have efficient algorithms for automatic bug detection. This
paper addresses precisely this issue and presents an algorithmic approach for checking
state reachability in concurrent programs running on the POWER semantics as defined in
[46, 45, 25].
The verification of concurrent programs under weak memory models is known to be
complex. Indeed, encoding the buffering and storage mechanisms used in these models leads
in general to complex, infinite-state formal operational models involving unbounded data
structures like FIFO queues (or more generally unbounded partial order constraints). For
the case of TSO, efficient, yet precise encodings of the effects of its storage mechanism have
been designed recently [7, 3, 5]. It is not clear how to define such precise and practical
encodings for POWER.
In this paper, we consider an alternative approach. We investigate the issue of defining ap-
proximate analysis. Our approach consists in introducing a parametric under-approximation
schema in the spirit of context-bounding [44, 39, 35, 33, 15]. Context-bounding has been
proposed in [44] as a suitable approach for efficient bug detection in multithreaded programs.
Indeed, for concurrent programs, a bounding concept that provides both good coverage
and scalability must be based on aspects related to the interactions between concurrent
components. It has been shown experimentally that concurrency bugs usually show up after
a small number of context switches [39].
In the context of weak memory models, context-bounded analysis has been extended in
[15] to the case of programs running on TSO. The work we present here aims at extending
this approach to the case of POWER. This extension is actually very challenging due to
the complexity of POWER and requires developing new techniques that are different from,
and much more involved than, the ones used for the case of TSO. First, we introduce
a new concept of bounding that is suitable for POWER. Intuitively, the architecture of
POWER is similar to a distributed system with a replicated memory, where each processor
has its own replica, and where operations are propagated between replicas according to
some specific protocol. Our bounding concept is based on this architecture. We consider
that a computation is divided in a sequence of “contexts”, where a context is a computation
segment for which there is precisely one active processor. All actions within a context
are either operations issued by the active processor, or propagation actions performed by
its storage subsystem. Then, in our analysis, we consider only computations that have a
number of contexts that is less than or equal to some given bound. Notice that while we bound
the number of contexts in a computation, we do not put any bound on the lengths of the
contexts, nor on the size of the storage system.
We prove that for every bound K, and for every concurrent program Prog , it is possible
to construct, using code-to-code translation, another concurrent program Prog• such that
for every K-bounded computation π in Prog under the POWER semantics there is a
corresponding K-bounded computation π• of Prog• under the SC semantics that reaches
the same set of states and vice-versa. Thus, the context-bounded state reachability problem
for Prog can be reduced to the context-bounded state reachability problem for Prog• under
SC. We show that the program Prog• has the same number of processes as Prog plus two
additional processes, and only O(|P| · |X| · K + |R|) additional shared variables and local
registers compared to Prog , where |P| is the number of processes, |X| is the number of
shared variables and |R| is the number of local registers in Prog . Furthermore, the obtained
program has the same type of data structures and variables as the original one. As a
consequence, we obtain for instance that for finite-data programs, the context-bounded
analysis of programs under the POWER semantics is decidable. Moreover, our code-to-code
translation makes it possible to leverage existing verification tools for concurrent programs to carry out
verification of safety properties under POWER.
To show the applicability of our approach, we have implemented our reduction in a
prototype tool, namely Power2SC. We have used CBMC version 5.1 [21] as the backend
tool for solving SC reachability queries. We have carried out several experiments showing the
efficiency of our approach. Our experimental results confirm the assumption that concurrency
bugs manifest themselves within small bounds of context switches. They also confirm that
our approach based on context-bounding is more efficient and scalable than approaches
based on bounding sizes of computations and of storage systems.
Related work. There has been a lot of work on automatic verification of programs running
under weak memory models, based on precise, under-approximate, and abstract analyses,
e.g., [37, 31, 32, 15, 49, 50, 22, 7, 11, 19, 20, 17, 18, 53, 2, 54, 24, 13, 52, 34, 23, 9, 30, 41, 27].
While most of these works concern TSO, only a few of them address the safety verification
problem under POWER (e.g., [8, 13, 49, 12, 14]). The paper [25] addresses the different
issue of checking robustness against POWER, i.e., whether a program has the same (trace)
semantics for both POWER and SC.
The work in [12] extends the CBMC framework by taking into account weak memory
models including TSO and POWER. While this approach uses reductions to SC analysis, it
is conceptually and technically different from ours. The work in [13] develops a verification
technique combining partial orders with bounded model checking, that is applicable to
various weak memory models including TSO and POWER. However, these techniques are
no longer supported by the latest version of CBMC. The work in [8] develops stateless
model-checking techniques under POWER. In Section 6, we compare the performances of
our approach with those of [12] and [8]. The tool PPCMEM [46] operates on small litmus
tests under the POWER semantics. Our tool can handle in an efficient and precise way such
litmus tests.
Recently, the Cseq tool [48, 29, 50, 51, 40] presented a new verification approach, based
on code-to-code translations, for programs running under SC, TSO, and PSO. They also
discuss the extension of their approach to programs running under POWER (however the
detailed formalization and the implementation of this extension are kept for future work).
Prog ::= var x∗ (proc p reg $r∗ i∗)∗
i ::= λ : s;
s ::= $r←x | x←exp | assume exp
| if exp then i∗ else i∗
| while exp do i∗ | term
Figure 1. Syntax of concurrent programs.
Our approach and the ones proposed in [50, 49] are orthogonal since we are using different
bounding parameters: In this paper, we are bounding the number of contexts while Tomasco
et al. [50, 49] are bounding the number of write operations.
2. Concurrent Programs and Semantics
In this section, we first introduce some notations and definitions. Then, we present the
syntax we use for concurrent programs and the POWER operational semantics including
the transition system it induces as in [25, 46, 45]. Finally, we give our definition of context-
bounding and an example of a context-bounded computation under the POWER semantics.
Preliminaries. Consider sets A and B. We use [A→ B] to denote the set of (partial)
functions from A to B, and write f : A → B to indicate that f ∈ [A→ B]. We write
f(a) = ⊥ to denote that f is undefined for a. We use f [a← b] to denote the function g such
that g(a) = b and g(x) = f(x) if x ≠ a. We will use a function gen which, for a given set A,
returns an arbitrary element gen (A) ∈ A. For integers i, j, we use [i..j] to denote the set
{i, i+ 1, . . . , j}. We use A∗ to denote the set of finite words over A. For words w1, w2 ∈ A∗,
we use w1 · w2 to denote the concatenation of w1 and w2.
Syntax. Fig. 1 gives the grammar for a small but general assembly-like language that we
use for defining concurrent programs. A program Prog first declares a set X of (shared)
variables followed by the code of a set P of processes. Each process p has a finite set R (p)
of (local) registers. We assume w.l.o.g. that the sets of registers of the different processes
are disjoint, and define R := ∪pR (p). The code of each process p ∈ P starts by declaring a
set of registers followed by a sequence of instructions.
For the sake of simplicity, we assume that the data domain of both the shared variables
and registers is a single set D. We assume a special element 0 ∈ D which is the initial value
of each shared variable or register. Each instruction i is of the form λ :s where λ is a unique
label (across all processes) and s is a statement. We define lbl (i) := λ and stmt (i) := s.
We define Ip to be the set of instructions occurring in p, and define I := ∪p∈P Ip. We assume
that Ip contains a designated initial instruction $i^{\mathit{init}}_p$ from which p starts its execution. A
read instruction in a process p ∈ P has a statement of the form $r ← x, where $r is a register
in p and x ∈ X is a variable. A write instruction has a statement of the form x ← exp
where x ∈ X is a variable and exp is an expression. We will assume a set of expressions
containing a set of operators applied to constants and registers, but not referring to the
content of memory (i.e., the set of variables). Assume, conditional, and iterative instructions
(collectively called aci instructions) can be explained in a similar manner. The statement
term will cause the process to terminate its execution. We assume that term occurs only
once in the code of a process p and that it has the label $\lambda^{\mathit{term}}_p$.
We give a number of definitions that we will use in the definition of the POWER
operational semantics. Firstly, for a write instruction i where stmt (i) is of the form x← exp
or a read instruction i where stmt (i) is of the form $r ← x, we define var (i) := x. For
an instruction i that is neither read nor write, we define var (i) := ⊥. In other words,
the variable function var (i) returns the variable in i. Secondly, for a write instruction i
where stmt (i) is of the form x← exp or an aci instruction i where stmt (i) is of the form
assume exp or if exp then i∗ else i∗ or while exp do i∗, we define exp (i) := exp. For
an instruction that is neither write nor aci, we define exp (i) := ⊥. In other words, the
expression function exp (i) returns the expression in i. Finally, for an expression exp, we use
R (exp) to denote the set of registers that occur in exp. Then, for an instruction i, we define
R (i) := R (exp (i)). Note that R (i) = ∅ if exp (i) = ⊥.
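The functions var, exp, and R can be made concrete with a small sketch. The `Instr` record, its field names, and the convention that registers are written as `$name` inside an expression string are assumptions made for this illustration; they are not part of the paper's formalization. `None` plays the role of ⊥.

```python
import re
from dataclasses import dataclass
from typing import Optional, Set


@dataclass(frozen=True)
class Instr:
    """One labelled instruction (hypothetical encoding, for illustration only)."""
    label: str
    kind: str                        # "read", "write", "assume", "if", "while" or "term"
    variable: Optional[str] = None   # shared variable of a read or write
    register: Optional[str] = None   # target register of a read
    expr: Optional[str] = None       # expression of a write or aci instruction


def var(i: Instr) -> Optional[str]:
    """var(i): the shared variable of a read or write, None otherwise."""
    return i.variable if i.kind in ("read", "write") else None


def exp(i: Instr) -> Optional[str]:
    """exp(i): the expression of a write or aci instruction, None otherwise."""
    return i.expr if i.kind in ("write", "assume", "if", "while") else None


def regs(i: Instr) -> Set[str]:
    """R(i): the registers occurring in exp(i); empty when exp(i) is undefined."""
    e = exp(i)
    return set(re.findall(r"\$\w+", e)) if e is not None else set()


# Two instructions of process p2 in Fig. 3.
read_x = Instr("4", "read", variable="x", register="$r2")
check = Instr("5", "assume", expr="$r1 = 1")
```

For the read instruction, `var` returns the variable while `exp` is undefined; for the assume instruction it is the other way round, and `regs` collects the single register used in the condition.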
For an instruction i ∈ Ip, we define next (i) to be the set of instructions that may follow
i in a run of a process. Notice that this set contains two elements if i is an aci instruction
(in the case of an assume instruction, we assume that if the condition evaluates to false,
then the process moves to term), no element if i is a terminating instruction, and a single
element otherwise. We define Tnext (i) (resp. Fnext (i)) to be the (unique) instruction to
which the process execution moves in case the condition in the statement of i evaluates to
true (resp. false).
In Section 4, we will describe how to deal with address operators in read and write
instructions and the synchronization primitives.
Configurations. We will assume an infinite set E of events, and will use an event to
represent a single execution of an instruction in a process. A given instruction may be
executed several times during a run of the program (for instance, when it is in the body
of a loop). In such a case, the different executions are represented by different events. An
event e is executed in several steps, namely it is fetched, initialized, and then committed.
Furthermore, a write event may be propagated to the other processes. A configuration c is a
tuple 〈E,≺, ins, status, rf, Prop,≺co〉, defined as follows.
Events. E ⊆ E is a finite set of events, namely the events that have been created up to the
current point in the execution of the program. ins : E ↦ I is a function that maps an event
e to the instruction ins (e) that e is executing. We partition the set E into disjoint sets Ep,
for p ∈ P , where Ep := {e ∈ E | ins (e) ∈ Ip}, i.e., for a process p ∈ P , the set Ep contains
the events whose instructions belong to p. For an event e ∈ Ep, we define proc (e) := p. We
say that e is a write event if ins (e) is a write instruction. We use EW to denote the set of
write events. Similarly, we define the set ER of read events, and the set EACI of aci events
whose instructions are either assume, conditional, or iterative. We define $E^{W}_p$, $E^{R}_p$, and $E^{ACI}_p$
to be the restrictions of the above sets to Ep. For each variable x ∈ X , we assume a special
write event $e^{\mathit{init}}_x$, called the initializer event for x. This event is not performed by any of
the processes in P, and writes the value 0 to x. Finally, we define $E^{\mathit{init}} := \{e^{\mathit{init}}_x \mid x \in X\}$
to be a set disjoint from the set of events E that contains all the initializer events.
Program Order. The program-order relation ≺⊆ E×E is an irreflexive partial order that
describes, for a process p ∈ P , the order in which events are fetched from the code of p. We
require that (i) e1 ⊀ e2 if proc (e1) ≠ proc (e2), i.e., ≺ only relates events belonging to the
same process, and that (ii) ≺ is a total order on Ep.
Predicate: RdCnd (c, e), for e ∈ ER.
Definition: $\forall e' \in E^{R} : (e' \prec_{poloc} e) \implies (\mathit{rf}(e') \preceq_{co} \mathit{rf}(e))$
Meaning: For every read event e′ preceding the read event e in ≺poloc, the write event from which e reads its value is not a coherence predecessor of the write event for e′.

Predicate: ComCnd (c, e), for e ∈ E.
Definition: $\forall e' \in E : \big((e' \prec_{data} e) \lor (e' \prec_{ctrl} e) \lor (e' \prec_{poloc} e)\big) \implies (\mathit{status}(e') = \mathit{com})$
Meaning: All events preceding the event e in ≺data, ≺ctrl, or ≺poloc have already been committed.

Predicate: WrInitCnd (c, e), for e ∈ EW.
Definition: $\forall e' \in E^{R} : (e' \prec_{data} e) \implies \big((\mathit{status}(e') = \mathit{init}) \lor (\mathit{status}(e') = \mathit{com})\big)$
Meaning: All events preceding the write e in ≺data have already been initialized.

Predicate: ValidCnd (c, e), for e ∈ EACI.
Definition: $\forall e' \in E : \big((e \prec e') \land (\nexists e'' \in E : e \prec e'' \prec e')\big) \implies \big((\mathit{Val}(c, e) = \mathit{true} \land \mathit{ins}(e') = \mathit{Tnext}(\mathit{ins}(e))) \lor (\mathit{Val}(c, e) = \mathit{false} \land \mathit{ins}(e') = \mathit{Fnext}(\mathit{ins}(e)))\big)$
Meaning: If there exists an event e′ that was fetched immediately after the aci event e, then e′ is consistent with the value Val (c, e).
Table 1. Definitions of predicates.
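As an illustration of how two of these predicates could be checked, the following sketch encodes each dependency relation as a set of pairs of event names and the status function as a dictionary. This encoding and the function names are assumptions made for the example, not the representation used by the authors.

```python
def predecessors(e, *relations):
    """All events a such that (a, e) belongs to one of the given relations."""
    return {a for rel in relations for (a, b) in rel if b == e}


def com_cnd(e, status, data, ctrl, poloc):
    """ComCnd: every data, ctrl or poloc predecessor of e is committed."""
    return all(status[a] == "com" for a in predecessors(e, data, ctrl, poloc))


def wr_init_cnd(e, status, data):
    """WrInitCnd: every data predecessor of the write e is initialized or committed."""
    return all(status[a] in ("init", "com") for a in predecessors(e, data))


# e5 ("assume $r1 = 1") depends on the read e3 through the data dependency order.
data = {("e3", "e5")}
status = {"e3": "fetch", "e5": "fetch"}
blocked = com_cnd("e5", status, data, set(), set())   # e3 is only fetched
status["e3"] = "com"
enabled = com_cnd("e5", status, data, set(), set())   # e3 is now committed
```

In this small scenario the assume event cannot be committed while the read it depends on is merely fetched, and becomes committable once that read is committed.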
Status. The function status : E ↦ {fetch, init, com} defines, for an event e, the current
status of e, i.e., whether it has been fetched, initialized, or committed.
Propagation. The function Prop : P × X ↦ EW ∪ Einit defines, for a process p ∈ P and
variable x ∈ X , the latest write event on x that has been propagated to p.
Read-From. The function rf : ER ↦ EW ∪ Einit defines, for a read event e ∈ ER, the write
event rf (e) from which e gets its value.
Coherence Order. All processes share a global view about the order in which write events
are propagated. This is described by the coherence order relation ≺co that is a partial order
on EW such that e1 ≺co e2 only if var (e1) = var (e2), i.e., it relates only events that write
on identical variables. If a write event e1 is propagated to a process before another write
event e2 and both events write on the same variable, then e1 ≺co e2 holds. Furthermore, the
events cannot be propagated to any other process in the reverse order. As a consequence, a
write event is never propagated to a given process if the process has already seen a coherence
successor of this event.
Dependencies. We introduce a number of dependency orders on events that we will use in
the definition of the semantics.
We define the per-location program-order ≺poloc⊆ E × E such that e1 ≺poloc e2 if
e1 ≺ e2 and var (e1) = var (e2) ∈ X , i.e., it is the restriction of ≺ to events with identical
variables.
We define the data dependency order ≺data such that e1 ≺data e2 if (i) e1 ∈ ER, i.e., e1
is a read event; (ii) e2 ∈ EW ∪EACI, i.e., e2 is either a write or an aci event; (iii) e1 ≺ e2;
(iv) stmt (ins (e1)) is of the form $r ← x; (v) $r ∈ R (ins (e2)); and (vi) there is no event
e3 ∈ ER such that e1 ≺ e3 ≺ e2 and stmt (ins (e3)) is of the form $r ← y. Intuitively, the
value loaded by e1 is used to compute the value of the expression exp (ins (e2)).
We define the control dependency order ≺ctrl such that e1 ≺ctrl e2 if e1 ∈ EACI and
e1 ≺ e2.
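The three dependency orders can be computed directly from the program-ordered events of one process. The sketch below is a minimal illustration under an assumed encoding: the `Event` tuple and its fields are hypothetical, and `uses` stands for the register set R (ins (e)).

```python
from typing import FrozenSet, NamedTuple, Optional


class Event(NamedTuple):
    name: str
    kind: str                           # "read", "write" or "aci"
    variable: Optional[str] = None      # var(ins(e)), if any
    target: Optional[str] = None        # register loaded by a read
    uses: FrozenSet[str] = frozenset()  # registers occurring in the expression


def dependencies(events):
    """events: the events of one process, listed in program order."""
    poloc, data, ctrl = set(), set(), set()
    for j, e2 in enumerate(events):
        for i in range(j):
            e1 = events[i]
            # Per-location program order: same shared variable.
            if e1.variable is not None and e1.variable == e2.variable:
                poloc.add((e1.name, e2.name))
            # Control dependency: any earlier aci event.
            if e1.kind == "aci":
                ctrl.add((e1.name, e2.name))
            # Data dependency: e2 uses the register loaded by e1, and no
            # read in between overwrites that register.
            if (e1.kind == "read" and e2.kind in ("write", "aci")
                    and e1.target in e2.uses
                    and not any(e3.kind == "read" and e3.target == e1.target
                                for e3 in events[i + 1:j])):
                data.add((e1.name, e2.name))
    return poloc, data, ctrl


# Events of process p2 in Fig. 3 (e5, e6, e7 are the three assume events).
p2 = [
    Event("e3", "read", variable="y", target="$r1"),
    Event("e4", "read", variable="x", target="$r2"),
    Event("e5", "aci", uses=frozenset({"$r1"})),
    Event("e6", "aci", uses=frozenset({"$r2"})),
    Event("e7", "aci"),
]
poloc, data, ctrl = dependencies(p2)
```

For p2 the per-location order is empty because the two reads access different variables, each assume depends on exactly one read through ≺data, and every assume is a control predecessor of the events fetched after it.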
Committed and Initial Configurations. We say that c is committed if status (e) = com for
all events e in the event set of c. The initial configuration cinit is defined by $\langle \emptyset, \emptyset, \lambda e.\bot, \lambda e.\bot, \lambda e.\bot, \lambda p.\lambda x.e^{\mathit{init}}_x, \emptyset \rangle$.
We use C to denote the set of all configurations.
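A configuration can be pictured as a record with seven components. The following sketch assumes a dictionary and set based encoding, with field names chosen for this example only and initializer events represented by strings of the form `init_x`.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple


@dataclass
class Configuration:
    """The seven components of a configuration (hypothetical encoding)."""
    events: Set[str] = field(default_factory=set)
    po: Set[Tuple[str, str]] = field(default_factory=set)         # program order
    ins: Dict[str, str] = field(default_factory=dict)             # event -> instruction label
    status: Dict[str, str] = field(default_factory=dict)          # event -> fetch/init/com
    rf: Dict[str, str] = field(default_factory=dict)              # read -> write
    prop: Dict[Tuple[str, str], str] = field(default_factory=dict)
    co: Set[Tuple[str, str]] = field(default_factory=set)         # coherence order


def initial_configuration(processes, variables):
    """No events yet; every process has seen only the initializer of each variable."""
    return Configuration(
        prop={(p, x): f"init_{x}" for p in processes for x in variables})


def is_committed(c: Configuration) -> bool:
    """A configuration is committed if all of its events have status com."""
    return all(c.status.get(e) == "com" for e in c.events)


c_init = initial_configuration(["p1", "p2"], ["x", "y"])
```

The initial configuration is trivially committed, since it contains no events.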
Evaluation Functions. Given a configuration c, an event e, and an expression exp, we first
define a function Val (c, e, exp) that returns the value of the expression exp when evaluated
at the event e in the configuration c. We define Val (c, e) := Val (c, e, exp (ins (e))). Note
that Val (c, e) = ⊥ if exp (ins (e)) = ⊥.
Let c = 〈E,≺, ins, status, rf, Prop,≺co〉 be a configuration. Formally, we define
Val (c, e, exp) recursively, depending on the type of the expression exp:
• If exp is a constant d ∈ D, then Val (c, e, exp) := d.
• If exp is f(exp1, · · · , expn) for some function f and expressions exp1, · · · , expn, then
Val (c, e, exp) := f(Val (c, e, exp1) , · · · , Val (c, e, expn)). Note that if Val (c, e, expi) = ⊥
for some i : 1 ≤ i ≤ n, then f(Val (c, e, exp1) , · · · , Val (c, e, expn)) := ⊥.
• If exp is $r for some register $r ∈ R, then let e′ ∈ E be the closest read event that
precedes e in the program order ≺ and loads a value to the register $r.
◦ If there is no such event e′, then Val (c, e, exp) := 0.
◦ If there is such an event e′, then let e′′ ∈ E∪Einit be the write event such that rf (e′) = e′′.
∗ If e′′ ∈ Einit, then Val (c, e, exp) := 0.
∗ If e′′ ∉ Einit, then let exp′′ = exp (ins (e′′)). We define
Val (c, e, exp) := Val (c, e′′, exp′′).
◦ If there is such an event e′ and there is no such write event e′′ ∈ E ∪ Einit such that
rf (e′) = e′′, i.e. rf (e′) = ⊥, then Val (c, e, exp) := ⊥.
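The recursive definition above can be sketched as follows. The encoding is an assumption made for this example: an expression is an integer constant, a register name, or a tuple `(f, arg1, ..., argn)`; each event records its process and its position in program order; `INIT` stands for any initializer event; and `None` plays the role of ⊥.

```python
import operator

INIT = "init"  # stands for any initializer event (hypothetical name)


def val(expr, e, events, rf):
    """Value of expr at event e; None if a needed read has no rf source yet."""
    if isinstance(expr, int):                      # constant
        return expr
    if isinstance(expr, tuple):                    # f(exp1, ..., expn)
        fn, *args = expr
        vals = [val(a, e, events, rf) for a in args]
        return None if any(v is None for v in vals) else fn(*vals)
    # Otherwise expr is a register: find the closest preceding read into it.
    me = events[e]
    reads = [n for n, ev in events.items()
             if ev["proc"] == me["proc"] and ev["pos"] < me["pos"]
             and ev.get("target") == expr]
    if not reads:
        return 0                                   # register never loaded
    closest = max(reads, key=lambda n: events[n]["pos"])
    src = rf.get(closest)
    if src is None:
        return None                                # rf(e') is undefined
    if src == INIT:
        return 0
    return val(events[src]["expr"], src, events, rf)


# Events of Fig. 3: e1, e2 are the writes of p1; e3, e4, e5 belong to p2.
events = {
    "e1": {"proc": "p1", "pos": 0, "expr": 1},
    "e2": {"proc": "p1", "pos": 1, "expr": 1},
    "e3": {"proc": "p2", "pos": 0, "target": "$r1"},
    "e4": {"proc": "p2", "pos": 1, "target": "$r2"},
    "e5": {"proc": "p2", "pos": 2, "expr": (operator.eq, "$r1", 1)},
}
rf = {"e3": "e2", "e4": INIT}   # e3 reads from e2, e4 from the initializer of x
cond = val(events["e5"]["expr"], "e5", events, rf)
```

With this read-from relation, the condition of the assume event e5 evaluates to true, while register `$r2` still holds the initial value 0.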
Transition Relation. We define the transition relation as a relation −→ ⊆ C×P × C. For
configurations c1, c2 ∈ C and a process p ∈ P , we write $c_1 \xrightarrow{p} c_2$ to denote that 〈c1, p, c2〉 ∈ −→.
Intuitively, this means that p moves from the current configuration c1 to c2. The relation
−→ is defined through the set of inference rules shown in Fig. 2. Below we will explain these
inference rules. Table 1 gives some predicates used in the transition system.
The rule Fetch chooses the next instruction to be executed in the code of a process
p ∈ P. This instruction should be a possible successor of the instruction that was last
executed by p. To satisfy this condition, we define MaxI (c, p) to be a set of instructions as
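$$\frac{e \notin E \qquad \prec' = \prec \cup \{\langle e', e\rangle \mid e' \in E_p\} \qquad i \in \mathit{MaxI}(c, p)}{c \xrightarrow{p} \langle E \cup \{e\}, \prec', \mathit{ins}[e \leftarrow i], \mathit{status}[e \leftarrow \mathit{fetch}], \mathit{rf}, \mathit{Prop}, \prec_{co}\rangle}\ \textsc{Fetch}$$

$$\frac{e \in E^{R}_p \qquad \mathit{status}(e) = \mathit{fetch} \qquad \mathit{CW}(c, e) = e' \qquad \mathit{status}(e') = \mathit{init}}{c \xrightarrow{p} \langle E, \prec, \mathit{ins}, \mathit{status}[e \leftarrow \mathit{init}], \mathit{rf}[e \leftarrow e'], \mathit{Prop}, \prec_{co}\rangle}\ \textsc{Local-Read}$$

$$\frac{e \in E^{R}_p \qquad \mathit{status}(e) = \mathit{fetch} \qquad (\mathit{CW}(c, e) = \bot) \lor (\mathit{CW}(c, e) = e' \land \mathit{status}(e') = \mathit{com})}{c \xrightarrow{p} \langle E, \prec, \mathit{ins}, \mathit{status}[e \leftarrow \mathit{init}], \mathit{rf}[e \leftarrow \mathit{Prop}(p, \mathit{var}(e))], \mathit{Prop}, \prec_{co}\rangle}\ \textsc{Prop-Read}$$

$$\frac{e \in E^{R}_p \qquad \mathit{status}(e) = \mathit{init} \qquad \mathit{ComCnd}(c, e) \qquad \mathit{RdCnd}(c, e)}{c \xrightarrow{p} \langle E, \prec, \mathit{ins}, \mathit{status}[e \leftarrow \mathit{com}], \mathit{rf}, \mathit{Prop}, \prec_{co}\rangle}\ \textsc{Com-Read}$$

$$\frac{e \in E^{W}_p \qquad \mathit{status}(e) = \mathit{fetch} \qquad \mathit{WrInitCnd}(c, e)}{c \xrightarrow{p} \langle E, \prec, \mathit{ins}, \mathit{status}[e \leftarrow \mathit{init}], \mathit{rf}, \mathit{Prop}, \prec_{co}\rangle}\ \textsc{Init-Write}$$

$$\frac{e \in E^{W}_p \qquad \mathit{status}(e) = \mathit{init} \qquad \mathit{ComCnd}(c, e) \qquad \prec'_{co} = \prec_{co} \cup \{\langle e', e\rangle \mid e' \preceq_{co} \mathit{Prop}(p, \mathit{var}(e))\}}{c \xrightarrow{p} \langle E, \prec, \mathit{ins}, \mathit{status}[e \leftarrow \mathit{com}], \mathit{rf}, \mathit{Prop}[\langle p, \mathit{var}(e)\rangle \leftarrow e], \prec'_{co}\rangle}\ \textsc{Com-Write}$$

$$\frac{q \in P \qquad e \in E^{W}_p \qquad \mathit{status}(e) = \mathit{com} \qquad \mathit{Prop}(q, \mathit{var}(e)) \prec_{co} e \qquad \prec'_{co} = \prec_{co} \cup \{\langle e', e\rangle \mid e' \preceq_{co} \mathit{Prop}(q, \mathit{var}(e))\}}{c \xrightarrow{p} \langle E, \prec, \mathit{ins}, \mathit{status}, \mathit{rf}, \mathit{Prop}[\langle q, \mathit{var}(e)\rangle \leftarrow e], \prec'_{co}\rangle}\ \textsc{Prop}$$

$$\frac{e \in E^{ACI}_p \qquad \mathit{status}(e) = \mathit{fetch} \qquad \mathit{ComCnd}(c, e) \qquad \mathit{ValidCnd}(c, e)}{c \xrightarrow{p} \langle E, \prec, \mathit{ins}, \mathit{status}[e \leftarrow \mathit{com}], \mathit{rf}, \mathit{Prop}, \prec_{co}\rangle}\ \textsc{Com-ACI}$$

Figure 2. Inference rules defining the relation $\xrightarrow{p}$ where p ∈ P . We assume
that c is of the form 〈E,≺, ins, status, rf, Prop,≺co〉.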
follows: (i) If Ep = ∅ then define $\mathit{MaxI}(c, p) := \{i^{\mathit{init}}_p\}$, i.e., the first instruction fetched by p
is $i^{\mathit{init}}_p$. (ii) If Ep ≠ ∅, let e′ ∈ Ep be the maximal event of p (w.r.t. ≺) in the configuration c
and then define MaxI (c, p) := next (ins (e′)). In other words, we consider the instruction
i′ = ins (e′) ∈ Ip, and take its possible successors. The possibility of choosing any of the
(syntactically) possible successors corresponds to speculatively fetching statements. As seen
below, whenever we commit an aci event, we check whether the speculations made are
correct or not. We create a new event e, label it by i ∈ MaxI (c, p), and make it larger than
all the other events of p w.r.t. ≺. In such a way, we maintain the property that the order on
the events of p reflects the order in which they are fetched in the current run of the program.
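The choice made by the rule Fetch can be illustrated by a small sketch of MaxI for process p2 of Fig. 3. The successor table below is written by hand from the program text, using the convention stated earlier that a failing assume moves to term; the function name and the list-based encoding of the fetched instructions are assumptions for this example.

```python
def max_instructions(fetched, initial, successors):
    """MaxI: instructions that may be fetched next by a process.

    fetched: labels of the instructions fetched so far, in program order.
    initial: label of the designated initial instruction.
    successors: label -> set of syntactically possible next labels.
    """
    if not fetched:
        return {initial}
    return set(successors[fetched[-1]])


# Process p2 of Fig. 3: an assume either continues or jumps to term (label 8).
SUCCESSORS_P2 = {3: {4}, 4: {5}, 5: {6, 8}, 6: {7, 8}, 7: {8}, 8: set()}

first = max_instructions([], 3, SUCCESSORS_P2)
after_assume = max_instructions([3, 4, 5], 3, SUCCESSORS_P2)
```

After fetching the assume at label 5, both label 6 and label 8 are candidates, which is exactly the speculation that is later validated when the assume event is committed.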
There are two ways in which read events get their values, namely either from local write
events that are performed by the process itself, or from write events that are propagated
to the process. The first case is covered by the rule Local-Read in which the process p
initializes a read event e ∈ ERp on a variable (say x), where e has already been fetched.
Here, the event e is made to read its value from a local write event e′ ∈ EWp on x such that
(i) e′ has been initialized but not yet committed, and such that (ii) e′ is the closest write
event that precedes e in the order ≺poloc. Notice that, by condition (ii) e′ is unique if it
exists. To formalize this, we define the Closest Write function CW (c, e) := e′ where e′ is the
unique event such that (i) e′ ∈ EWp, (ii) e′ ≺poloc e, and (iii) there is no event e′′ such that
e′′ ∈ $E^{W}_p$ and e′ ≺poloc e′′ ≺poloc e. Notice that such an event e′ may not exist, i.e., it may
be the case that CW (c, e) = ⊥. If e′ exists and it has been initialized but not committed, we
initialize e and update the read-from relation appropriately. On the other hand, if such an
event does not exist, i.e., if there is no write event on x before e by p, or if the closest write
event on x before e by p has already been committed, then we use the rule Prop-Read to let
e fetch its value from the latest write event on x that has been propagated to p. Notice that this
event is Prop (p, x).
To commit an initialized read event e ∈ ERp, we use the rule Com-Read. The rule can be
performed if e satisfies two predicates in c, namely RdCnd (c, e) and ComCnd (c, e).
To initialize a fetched write event e ∈ EWp, we use the rule Init-Write that requires that all
events that precede e in ≺data have been initialized. This condition is formulated by
the predicate WrInitCnd (c, e). When a write event in a process p ∈ P is committed, it is
also immediately propagated to p itself. To maintain the coherence order, the semantics
keeps the invariant that the latest write event on a variable x ∈ X that has been propagated
to a process p ∈ P is the largest in coherence order among all write events on x that have
been propagated to p up to now in the run. This invariant is maintained in Com-Write by
requiring that the event e (that is being committed) is strictly larger in coherence order
than the latest write event on the same variable as e that has been propagated to p.
Write events are propagated to other processes by the rule Prop. A write event e on a
variable x is propagated to a process q only if it is strictly larger in coherence order
than any event on x that has been propagated to q up to now. Notice that
this is determined by the coherence order of Prop (q, x), which is the latest write event on x that has
been propagated to q.
When committing an aci event by the rule Com-ACI, we must verify any
speculation that has been made when fetching the subsequent events. We
formulate this requirement by the predicate ValidCnd (c, e).
Bounded Reachability. A run π is a sequence of transitions $c_0 \xrightarrow{p_1} c_1 \xrightarrow{p_2} c_2 \cdots c_{n-1} \xrightarrow{p_n} c_n$.
In such a case, we write $c_0 \xrightarrow{\pi} c_n$. We define last (π) := cn. We define π↑ := p1p2 · · · pn,
i.e., it is the sequence of processes performing the transitions in π. For a sequence σ =
p1p2 · · · pn ∈ P∗, we say that σ is a context if there is a process p ∈ P such that $p_i = p$
for all i : 1 ≤ i ≤ n. We say that cn is complete if (i) cn is committed and (ii) for every
process p ∈ P , there is no configuration c′ such that $c_n \xrightarrow{p} c'$ by an initializing,
committing, or propagating inference rule. In other words, all fetched instructions
are committed, and all fetched write instructions have been propagated or cannot be
propagated to a process in the system. We say that π is complete if last (π) is complete. We
also say that π is k-bounded if π↑ = σ1 · σ2 · · · σk where σi is a context for all i : 1 ≤ i ≤ k.
For c ∈ C and p ∈ P , we define the set of reachable labels of the configuration c as follows.
Let ep ∈ Ep be the maximal event of p (w.r.t. ≺) in c. We define lbl (c, p) := lbl (ins (ep)),
i.e., process p reaches the label of its maximal event ep (w.r.t. ≺) in c. If
no such event ep exists, we define lbl (c, p) := ⊥. We define
lbl (c) := {lbl (c, p) | p ∈ P}. In the reachability problem, we are given a label λ and
asked whether there is a complete run π and a configuration c such that $c_{\mathit{init}} \xrightarrow{\pi} c$ where
λ ∈ lbl (c). For a natural number K, the K-bounded reachability problem is defined by
requiring that the run π in the above definition is K-bounded.
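To make the notion of context-bounding concrete, the sketch below counts the maximal blocks of consecutive transitions performed by the same process in π↑. It assumes that empty contexts are allowed, so that a run is k-bounded exactly when it has at most k such blocks; this reading matches the "less than or equal to" phrasing of the introduction but is an interpretation, and the function names are chosen for this example.

```python
from itertools import groupby


def min_contexts(procs):
    """Number of maximal blocks of equal consecutive process names."""
    return sum(1 for _ in groupby(procs))


def is_k_bounded(procs, k):
    """True if the sequence can be split into at most k contexts."""
    return min_contexts(procs) <= k


# The read/write part of the run in Fig. 4: four steps of p2, seven of p1,
# two of p2, and a final propagation by p1.
run = ["p2"] * 4 + ["p1"] * 7 + ["p2"] * 2 + ["p1"]
contexts = min_contexts(run)
```

Note that the bound concerns only the number of process switches: the blocks themselves may be arbitrarily long.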
var x, y
proc p1
0 : x← 1;
1 : y ← 1;
2 : term;
proc p2
reg $r1, $r2
3 : $r1←y;
4 : $r2←x;
5 : assume $r1 = 1;
6 : assume $r2 = 0;
7 : assume 1;
8 : term;
Figure 3. A variant of the message passing test [14].
Event Instruction
e1 0 : x←1
e2 1 : y←1
e3 3 : $r1←y
e4 4 : $r2←x
(a)
Context π1:
p2 : 〈e3, fetch〉
p2 : 〈e4, fetch〉
p2 : 〈e4, init〉
p2 : 〈e4, com〉
Context π2:
p1 : 〈e1, fetch〉
p1 : 〈e1, init〉
p1 : 〈e1, com〉
p1 : 〈e2, fetch〉
p1 : 〈e2, init〉
p1 : 〈e2, com〉
p1 : 〈e2, prop〉
Context π3:
p2 : 〈e3, init〉
p2 : 〈e3, com〉
Context π4:
p1 : 〈e1, prop〉
(b)
Figure 4. A complete run satisfying the reachability problem of the program
in Fig. 3: (a) read and write events and (b) the four contexts of the run, restricted to
read and write events. The notation 〈e, status (e)〉 gives the most recent status of
an event e.
Example 2.1. We give an example of a small concurrent program that has different
behaviours under the SC and POWER semantics. We first explain intuitively the program
and its behaviours under the SC semantics. Then, we give a specific reachability problem
which the program cannot satisfy under SC. Then, we explore a context-bounded run of the
program under POWER that gives a positive answer for the reachability problem.
Fig. 3 illustrates a program that is written following the syntax in Fig. 1. The program
has two processes P={p1, p2} communicating through two variables X ={x, y}. Moreover,
process p2 has two registers R={$r1, $r2}. At the beginning, all the variables and registers
are initialized to 0. Process p1 has two write instructions that set x and y to 1. Process p2
loads the values of y and x into $r1 and $r2 respectively. Then p2 checks whether the value
of $r1 is 1 (line 5) and the value of $r2 is 0 (line 6).
The reachability problem under SC or POWER checks whether p2 reaches the label
of line 7. Note that p2 can only reach line 7 if it has executed the instructions in lines 5
and 6 and it has evaluated these instructions to true. Therefore, to satisfy this reachability
problem, p2 must read 1 from y, and while it is reading y it should not see that x has been
set to 1. Since at the beginning, all variables are 0, the value 1 for y observed by p2 must be
written by process p1.
The reachability problem has a negative answer under SC semantics. The reason is that
the program order between two write instructions to x and y requires process p1 to set x
and y to 1 in order. As a consequence, when p2 reads 1 from y, it must see that x has been
set to 1.
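The SC argument above can be checked mechanically. The following Python sketch enumerates every SC interleaving of the two processes of Fig. 3 and collects the final register values. The tuple encoding of instructions and the names P1, P2 and sc_outcomes are our own illustrative assumptions; they are not part of the formal model.

```python
from itertools import combinations

# Hypothetical encoding of the program in Fig. 3 (assume/term lines omitted).
P1 = [("write", "x", 1), ("write", "y", 1)]
P2 = [("read", "y", "r1"), ("read", "x", "r2")]


def sc_outcomes(p1, p2):
    """Enumerate all SC interleavings and collect the final (r1, r2) pairs."""
    outcomes = set()
    total = len(p1) + len(p2)
    # Choose which positions of the interleaving are taken by p1;
    # program order inside each process is preserved by the iterators.
    for slots in combinations(range(total), len(p1)):
        mem = {"x": 0, "y": 0}
        regs = {"r1": 0, "r2": 0}
        it1, it2 = iter(p1), iter(p2)
        for pos in range(total):
            kind, var, arg = next(it1) if pos in slots else next(it2)
            if kind == "write":
                mem[var] = arg          # atomic write to the single shared memory
            else:
                regs[arg] = mem[var]    # read the current memory value
        outcomes.add((regs["r1"], regs["r2"]))
    return outcomes


OUTCOMES = sc_outcomes(P1, P2)
print(sorted(OUTCOMES))
```

Under these assumptions the pair ($r1, $r2) = (1, 0), which is required to pass lines 5 and 6, never appears among the collected outcomes.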
However, the complete run pi given in Fig. 4 shows that the reachability problem is
satisfiable under POWER. For the sake of simplicity, we only show the part of the run
relating to the read and write events. The run pi can be decomposed into 4 contexts: pi1,
pi2, pi3, and pi4. In the first context pi1, p2 fetches the two read instructions on y and x,
described by e3 and e4 respectively. It then initializes the fetched event e4, loading
0 from x into register $r2, and commits e4. In the second context pi2, p1 fetches
the write instruction on x, described by e1, and then initializes and commits e1 but delays
propagating it to p2. Next, p1 fetches the write instruction on y, described by e2, and
initializes, commits, and propagates e2 to p2. In the third context pi3, p2
resumes its execution by initializing the fetched event e3, which loads 1 from y (the value
just propagated from p1), and then commits e3. Then, p2 fetches the three assume events e5,
e6, and e7 (not shown in Fig. 4) corresponding to the instructions “5 : assume $r1 = 1”,
“6 : assume $r2 = 0”, and “7 : assume 1” respectively, commits them, and
terminates. Finally, in the fourth context pi4, p1 terminates by propagating e1 to p2. The
run pi is complete and 4-bounded, and it satisfies the reachability problem.
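The run of Fig. 4 can be replayed in a deliberately simplified model in which each process keeps its own view of the shared variables and a write reaches another process only when it is explicitly propagated. The Python sketch below is such a replay. It ignores fetching, dependency checks and the coherence order, and the helper names commit_write, propagate and init_read are hypothetical.

```python
# Per-process views of the shared variables; everything starts at 0.
view = {"p1": {"x": 0, "y": 0}, "p2": {"x": 0, "y": 0}}
regs = {"r1": 0, "r2": 0}


def commit_write(proc, var, val):
    # A committed write is immediately visible to the committing process.
    view[proc][var] = val


def propagate(var, val, to):
    # Propagation makes the write visible in the view of process `to`.
    view[to][var] = val


def init_read(proc, var, reg):
    # Initializing a read loads the value currently visible to `proc`.
    regs[reg] = view[proc][var]


# Context pi1 (p2): e4 is initialized and committed before e3.
init_read("p2", "x", "r2")
# Context pi2 (p1): e1 is committed but not propagated; e2 is propagated.
commit_write("p1", "x", 1)
commit_write("p1", "y", 1)
propagate("y", 1, "p2")
# Context pi3 (p2): e3 now sees the propagated write on y.
init_read("p2", "y", "r1")
# Context pi4 (p1): the delayed write e1 finally reaches p2.
propagate("x", 1, "p2")

REACHED = regs["r1"] == 1 and regs["r2"] == 0
print(regs, REACHED)
```

The replay ends with $r1 = 1 and $r2 = 0, the combination that the SC semantics rules out.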
3. Translation
In this section, we introduce an algorithm that reduces, for a given number K, the K-bounded
reachability problem under POWER to the corresponding problem under SC. Given an
input concurrent program Prog , the algorithm constructs an output concurrent program
Prog• whose size is polynomial in Prog and K, such that for each K-bounded run pi in Prog
under POWER there is a corresponding K-bounded run pi• of Prog• under SC that reaches
the same set of process labels. Below, we first present a scheme for the translation of Prog ,
and mention some of the challenges that arise due to the POWER semantics. Then, we give
a detailed description of the data structures we use in Prog•. Finally, we describe the code
of the processes in Prog•.
Scheme. Our construction is based on a code-to-code translation scheme that transforms
the program Prog into the program Prog• following the map function ⟦.⟧_K given in Fig. 5.
Let P and X be the sets of processes and (shared) variables in Prog . The map ⟦.⟧_K replaces
the variables of Prog by O(|P| ·K) copies of the set X , in addition to a finite set of finite-
data structures (which will be formally defined in the Data Structures paragraph). The
map function then declares two additional processes initProc and verProc that will be
⟦Prog⟧_K def= var x∗ 〈addvars〉_K 〈initProc〉_K 〈verProc〉_K (⟦proc p reg $r∗ i∗⟧_K)∗
〈addvars〉_K def= µ (|P|, |X |,K) µinit (|P|, |X |,K)
    α (|P|, |X |,K) αinit (|P|, |X |,K)
    ν (|P|, |X |) iR (|P|, |X |) cR (|P|, |X |)
    iW (|P|, |X |) cW (|P|, |X |) iReg (|R|)
    cReg (|R|) ctrl (|P|) active (K) cnt
〈initProc〉_K def= ⟦initProc⟧_K
〈verProc〉_K def= ⟦verProc⟧_K
⟦proc p reg $r∗ i∗⟧_K def= proc p reg $r∗ (⟦i⟧^p_K)∗
⟦i⟧^p_K def= λ : 〈activeCnt〉^p_K ⟦s⟧^p_K 〈closeCnt〉^p_K
〈activeCnt〉^p_K def= assume (active (cnt) = p)
〈closeCnt〉^p_K def= cnt ← cnt + gen ([0..K− 1]) ; assume (cnt ≤ K)
⟦$r ← x⟧^p_K def= ⟦$r ← x⟧^{p,Read}_K
⟦x ← exp⟧^p_K def= ⟦x ← exp⟧^{p,Write}_K
⟦assume exp⟧^p_K def= assume exp; 〈control〉^p_K
⟦if exp then i∗ else i∗⟧^p_K def= if exp then (⟦i⟧^p_K)∗ else (⟦i⟧^p_K)∗; 〈control〉^p_K
⟦while exp do i∗⟧^p_K def= while exp do (⟦i⟧^p_K)∗; 〈control〉^p_K
〈control〉^p_K def= ctrl (p) ← ctrl (p) + gen ([0..K− 1]) ; assume (ctrl (p) ≤ K)
⟦term⟧^p_K def= term
Figure 5. Translation map ⟦.⟧_K. We omit the label of an intermediary
instruction when it is irrelevant.
used to initialize the data structures and to check the reachability problem at the end
of the run of Prog•. The formal definition of initProc (resp. verProc) will be given in
the Initializing Process (resp. Verifying Process) paragraph. Furthermore, the map
function ⟦.⟧_K transforms the code of each process p ∈ P to a corresponding process p• that
will simulate the moves of p. The processes p and p• will have the same set of registers.
For each instruction i appearing in the code of the process p, the map ⟦i⟧^p_K transforms it
to a sequence of instructions as follows: First, it adds the code defined by activeCnt to
check if the process p is active during the current context, then it transforms the statement
s of the instruction i into a sequence of instructions following the map ⟦s⟧^p_K, and finally it
adds the sequence of instructions defined by closeCnt to guess the occurrence of a context
switch. The translation of an aci statement keeps the same statement and adds control
to guess the context in which the corresponding event will be committed. The terminating
statement is left unchanged by the map function ⟦term⟧^p_K. The translations of write and
read statements will be described in the Write Instructions and Read Instructions
paragraphs respectively.
Challenges. There are two aspects of the POWER semantics (cf. Section 2) that make it
difficult to simulate the run pi under the SC semantics, namely non-atomicity and asynchrony.
First, events are not executed atomically. In fact, an event is first fetched and initialized
before it is committed. In particular, an event may be fetched in one context and be
initialized and committed only in later contexts. Since there is no bound on the number
of events that may be fetched in a given context, our simulation should be able to handle
unbounded numbers of pending events. Second, write events of one process are propagated in
an asynchronous manner to the other processes. This implies that we may have unbounded
numbers of “traveling” events that are committed in one context and propagated to other
processes only in subsequent contexts. This creates two challenges in the simulation. On
the one hand, we need to keep track of the coherence order among the different write events.
On the other hand, since write events are not distributed to different processes at the same
time, the processes may have different views of the values of a given variable at a given point
of time.
Since it is not feasible to record the initializing, committing, and propagating contexts
of an unbounded number of events in the SC runs of a finite-state program, our algorithm
will instead predict the summary of effects of arbitrarily long sequences of events that may
occur in a given context. This is implemented using a scheme that first guesses and then
checks these summaries. Concretely, each event e in the run pi is simulated by a sequence
of instructions in pi•. This sequence of instructions will be executed atomically (without
interruption from other processes and events). More precisely, if e is fetched in a context
k : 1 ≤ k ≤ K, then the corresponding sequence of instructions will be executed in the same
context k in pi•. Furthermore, we let pi• guess (speculate) (i) the contexts where e will be
initialized, committed, and propagated to the other processes, and (ii) the values of variables
that are seen by read operations. Then, we check whether the guesses made by pi• are valid
w.r.t. the POWER semantics. As we will see below, these checks are done both on-the-fly
during pi•, as well as at the end of pi•. To implement the guess-and-check scheme, we use a
number of data structures, described below.
Data Structures. We now introduce the data structures used in our simulation to
deal with the asynchrony and non-atomicity challenges described above.
Asynchrony. In order to keep track of the coherence order, we associate a timestamp with
each write event. A timestamp τ is a mapping P ↦ K①② where K①② := K① ∪ K②,
K① := {1} × [1..K] and K② := {2} × [1..K]. For a process p ∈ P , if the value of τ (p) is of
the form 〈1, k〉 where k ∈ [1..K], i.e. τ (p) ∈ K①, then τ (p) represents that the associated
event is propagated to p in the context k. If the value of τ (p) is of the form 〈2, k〉 where
k ∈ [1..K], i.e. τ (p) ∈ K②, then τ (p) represents that (i) the associated event will not be
propagated to p and (ii) the maximal context of all coherence predecessors of the event is k.
For a value τ (p) of the form 〈1, k〉 or 〈2, k〉, we define τ (p)↓ := k. We use T to denote
the set of timestamps. We define an order ⊑ on T such that τ1 ⊑ τ2 if τ1(p)↓ ≤ τ2(p)↓ for
all processes p ∈ P. If τ1 ⊑ τ2 and there is a process p ∈ P such that τ1(p)↓ < τ2(p)↓, then
we write τ1 < τ2. Note that if τ1 ⊑ τ2 and τ1 ≮ τ2 then both τ1 ⊑ τ2 and τ2 ⊑ τ1.
The coherence order ≺co on write events will be reflected by the order ⊑ on their
timestamps. In particular, for two events e1 and e2 with timestamps τ1 and τ2 respectively,
if τ1 < τ2 then e1 precedes e2 in coherence order (following the definition of <). Moreover,
if both τ1 ⊑ τ2 and τ2 ⊑ τ1 then the two associated events are from the same process, and
the coherence order between them can be reflected by the program order.
Given two timestamps τ1 and τ2, we define the summary of τ1 and τ2, denoted by
τ1 ⊕ τ2, to be the timestamp τ defined, for each process p ∈ P, as follows. (i) If
τ1(p)↓ > τ2(p)↓ then τ(p) := τ1(p). (ii) If τ2(p)↓ > τ1(p)↓ then τ(p) := τ2(p).
(iii) If τ1(p)↓ = τ2(p)↓ = k and (τ1(p) ∈ K② ∨ τ2(p) ∈ K②) then τ(p) := 〈2, k〉.
(iv) If τ1(p)↓ = τ2(p)↓ = k and (τ1(p) ∈ K① ∧ τ2(p) ∈ K①) then
τ(p) := 〈1, k〉.
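To make the order on timestamps and the summary operation concrete, the following Python sketch encodes a timestamp as a dictionary from process names to pairs (tag, k), with tag 1 or 2 as in the definition above. The dictionary encoding and the function names down, leq and summary are illustrative assumptions of ours.

```python
def down(entry):
    """Return k for an entry (1, k) or (2, k)."""
    return entry[1]


def leq(t1, t2):
    """Order on timestamps: pointwise comparison of the context components."""
    return all(down(t1[p]) <= down(t2[p]) for p in t1)


def summary(t1, t2):
    """Summary of two timestamps, computed process by process (cases i-iv)."""
    result = {}
    for p in t1:
        a, b = t1[p], t2[p]
        if down(a) > down(b):            # case (i)
            result[p] = a
        elif down(b) > down(a):          # case (ii)
            result[p] = b
        elif a[0] == 2 or b[0] == 2:     # case (iii): same context, some tag 2
            result[p] = (2, down(a))
        else:                            # case (iv): same context, both tag 1
            result[p] = (1, down(a))
    return result


T1 = {"p1": (1, 2), "p2": (2, 1)}
T2 = {"p1": (1, 2), "p2": (1, 3)}
T3 = {"p1": (2, 2), "p2": (1, 3)}
S = summary(T1, T2)
print(S, leq(T1, S), leq(T2, S))
```

In this encoding the summary is an upper bound of both arguments with respect to the order, which is the property the simulation relies on when it accumulates timestamps within a context.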
Our simulation observes the sequence of write events received by a process in each
context. In fact, the simulation will initially guess and later verify the summaries of the
timestamps of such a sequence. This is done using the data structures αinit and α. The
mapping αinit : P × X × [1..K] ↦ [P → K①②] stores, for a process p ∈ P , a variable x ∈ X ,
and a context k : 1 ≤ k ≤ K, an initial guess αinit (p, x, k) of the summary of the timestamps
of the sequence of write events on x propagated to p up to the start of the context k. Starting
from a given initial guess for a given context k, the timestamp is updated successively
using the sequence of write events on x propagated to p in k. The result is stored using
the mapping α : P × X × [1..K] ↦ [P → K①②]. More precisely, we initially set the value
of α to αinit . Each time a new write event e on x is executed by p in a context k, we
guess the timestamp β of e, and then update α (p, x, k) by computing its summary with β.
Thus, given a point in a context k, α (p, x, k) contains the summary of the timestamps of
the whole sequence of write events on x that have been propagated to p up to that point.
At the end of the simulation, we verify, for each context k : 1 ≤ k < K, that the value of α
at the end of the context k is equal to the value of αinit for the next context k + 1.
Furthermore, we use three data structures for storing the values of variables. The
mapping µinit : P × X × [1..K] ↦ D stores, for a process p ∈ P, a variable x ∈ X , and a
context k : 1 ≤ k ≤ K, an initial guess µinit (p, x, k) of the value of the latest write event on
x propagated to p up to the start of the context k. The mapping µ : P × X × [1..K] ↦ D
stores, for a process p ∈ P, a variable x ∈ X , and a point in a context k : 1 ≤ k ≤ K, the
value µ (p, x, k) of the latest write event on x that has been propagated to p up to that point.
Moreover, the mapping ν : P × X ↦ D stores, for a process p ∈ P and a variable x ∈ X ,
the latest value ν (p, x) that has been written on x by p.
Non-atomicity. In order to satisfy the different dependencies between events, we need to
keep track of the contexts where they are initialized and committed. One aspect of our
translation is that it only needs to keep track of the context where the latest read or write
event on a given variable in a given process is initialized or committed. The mapping
iW : P × X ↦ [1..K] defines, for p ∈ P and x ∈ X , the context iW (p, x) where the latest
write event on x by p is initialized. The mapping cW : P × X ↦ [1..K] is defined in a similar
manner for committing (rather than initializing) write events. Furthermore, we define similar
mappings iR and cR for read events. The mapping iReg : R ↦ [1..K] gives, for a register
$r ∈ R, the initializing context iReg ($r) of the latest read event loading a value to $r. For
an expression exp, we define iReg (exp) := max {iReg ($r) | $r ∈ R (exp)}. The mapping
cReg : R ↦ [1..K] gives the context for committing (rather than initializing) of the read
events. We extend cReg from registers to expressions in a similar manner to iReg. Finally,
the mapping ctrl : P ↦ [1..K] gives, for a process p ∈ P, the committing context ctrl (p)
of the latest aci event in p.
Algorithm 1: ⟦initProc⟧_K.
1 for p ∈ P ∧ x ∈ X do
2   iR (p, x)← 1; cR (p, x)← 1; iW (p, x)← 1;
3   cW (p, x)← 1; ν (p, x)← 0; µ (p, x, 1)← 0;
4   for q ∈ P do α (p, x, 1) (q)← 〈2, 1〉 ;
5 for p ∈ P do
6   ctrl (p)← 1;
7 for $r ∈ R do
8   iReg ($r)← 1; cReg ($r)← 1;
9 for p ∈ P ∧ x ∈ X ∧ k ∈ [2..K] do
10   for q ∈ P do
11     αinit (p, x, k) (q)← gen (K①②);
12   α (p, x, k)← αinit (p, x, k);
13   µinit (p, x, k)← gen (D);
14   µ (p, x, k)← µinit (p, x, k);
15 for k ∈ [1..K] do
16   active (k)← gen (P);
17 cnt ← 1;
Initializing Process. Algorithm 1 shows the initializing process. The for-loops of lines 1, 5
and 7 define the values of the initializing and committing data structures for the variables
and registers together with ν (p, x), µ (p, x, 1), α (p, x, 1) and ctrl (p) for all p ∈ P and
x ∈ X . The for-loop of line 9 defines the initial values of α and µ at the start of each context
k ≥ 2 (as described above). The for-loop of line 15 chooses an active process to execute in
each context. The current context variable cnt is initialized to 1.
Write Instructions. Consider a write instruction i of a process p ∈ P whose stmt (i) is of
the form x ← exp. The translation of i is shown in Algorithm 2. The code simulates an
event e executing i, by encoding the effects of the inference rules Init-Write, Com-Write
and Prop that initialize, commit, and propagate a write event respectively. The translation
consists of three parts, namely guessing, checking and update.
Guessing. We guess the initializing and committing contexts for the event e, together with
its timestamp. In line 1, we guess the context where the event e will be initialized, and store
the guess in iW (p, x). Similarly, in line 3, we guess the context where the event e will be
committed, and store the guess in cW (p, x) (having stored its old value in the previous line).
In the for-loop of line 4, we guess a timestamp for e and store it in β. This means that, for
each process q ∈ P , we guess the context where the event e will be propagated to q and we
store this guess in β (q).
Checking. We perform sanity checks on the guessed values in order to verify that they are
consistent with the POWER semantics. Lines 6–8 perform the sanity checks for iW (p, x).
In line 6, we verify that the initializing context of the event e is not smaller than the
current context. This captures the fact that initialization happens after fetching of e. Line
Algorithm 2: ⟦x ← exp⟧^{p,Write}_K.
// Guess
1 iW (p, x)← gen ([1..K]);
2 old-cW← cW (p, x);
3 cW (p, x)← gen ([1..K]);
4 for q ∈ P do
5   β (q)← gen (K①②);
// Check
6 assume (iW (p, x) ≥ cnt);
7 assume (active (iW (p, x)) = p);
8 assume (iW (p, x) ≥ iReg (exp));
9 assume (cW (p, x) ≥ iW (p, x));
10 assume (cW (p, x) ≥ max{cReg (exp) , ctrl (p) , cR (p, x) , old-cW});
11 for q ∈ P do
12   if q = p then
13     assume (β (q) ∈ K① ∧ β (q)↓ = cW (p, x))
14   if q ≠ p then
15     assume (β (q) ∈ K① ⇒ β (q)↓ ≥ cW (p, x));
16   if β (q) ∈ K① then
17     assume (α (q, x, β (q)↓) ⊑ β);
18     assume (active (β (q)↓) = p);
19   else assume (∃k : 1 ≤ k ≤ K : β ⊑ α (q, x, k)) ;
// Update
20 for q ∈ P do
21   if β (q) ∈ K① then
22     α (q, x, β (q)↓)← α (q, x, β (q)↓) ⊕ β;
23     µ (q, x, β (q)↓)← exp;
24 ν (p, x)← exp;
7 verifies that initialization happens in a context where p is active. In line 8, we check
whether WrInitCnd in the rule Init-Write is satisfied. To do that, we verify that the data
dependency order ≺data holds. More precisely, we find, for each register $r that occurs in
exp, the initializing context of the latest read event loading to $r. We make sure that the
initializing context of e is not earlier than the initializing contexts of all these read events. By
definition, the largest of all these contexts is stored in iReg (exp).
Lines 9–10 perform the sanity checks for cW (p, x). In line 9, we check that the committing
context of the event e is at least as large as its initializing context. In line 10, we check that
ComCnd in the rule Com-Write is satisfied. To do that, we check that the committing context
is at least as large as (i) the committing contexts of all the read events from which the registers
in the expression exp fetch their values (to satisfy the data dependency order ≺data, in a
similar manner to that described for initialization above), (ii) the committing contexts of the
latest read and write events on x in p, i.e., cR (p, x) and old-cW, the value of cW (p, x)
saved in line 2 (to satisfy the per-location
program order ≺poloc), and (iii) the committing context of the latest aci event in p, i.e.,
ctrl (p) (to satisfy the control order ≺ctrl).
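The checks of lines 6–10 of Algorithm 2 can be read as a filter over all guesses of the initializing and committing contexts. The Python sketch below models the nondeterministic gen by exhaustive enumeration and each assume by a Boolean conjunct. The function write_contexts_ok, its parameter names, and the concrete values chosen for K, active and the bookkeeping variables are hypothetical and serve only as an illustration.

```python
from itertools import product


def write_contexts_ok(p, cnt, active, i_w, c_w,
                      i_reg_exp, c_reg_exp, ctrl_p, c_r, old_c_w):
    """Return True iff the guessed contexts pass lines 6-10 of Algorithm 2."""
    return (
        i_w >= cnt                                           # line 6
        and active[i_w] == p                                 # line 7
        and i_w >= i_reg_exp                                 # line 8
        and c_w >= i_w                                       # line 9
        and c_w >= max(c_reg_exp, ctrl_p, c_r, old_c_w)      # line 10
    )


K = 3
ACTIVE = {1: "p1", 2: "p2", 3: "p1"}   # assumed schedule: who runs in each context

# Enumerate every guess (iW, cW) in [1..K] x [1..K] for a write by p1 fetched in
# context 1, with iReg(exp) = 1, cReg(exp) = 2, ctrl(p1) = 1, cR = 1, old-cW = 1.
VALID = [
    (i_w, c_w)
    for i_w, c_w in product(range(1, K + 1), repeat=2)
    if write_contexts_ok("p1", 1, ACTIVE, i_w, c_w, 1, 2, 1, 1, 1)
]
print(VALID)
```

In this instance the write cannot be initialized in context 2, since p1 is not active there, and it cannot be committed before context 2, since the read feeding exp is committed only then.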
The for-loop of line 11 performs three sanity checks on β. In line 12, we verify that the
event e is propagated to p in the same context as the one where it is committed. This is
consistent with the rule Com-Write which requires that when a write event is committed
then it is immediately propagated to the committing process. In line 14, we verify that if the
event e is propagated to a process q (different from p), then the propagation takes place in
a context later than or equal to the one where e is committed. This is to be consistent with
the fact that a write event is propagated to other processes only after it has been committed.
In line 17, we check that the guessed timestamp of the event e does not cause a violation of
the coherence order ≺co. To do that, we consider each process q ∈ P to which e will be
propagated (i.e., β (q) ∈ K①). The timestamp of e should be larger than the timestamp of
any other write event e′ on x that has been propagated to q up to the current point (since e
should be larger in coherence order than e′). Notice that by construction the timestamp of
the largest such event e′ is currently stored in α (q, x, β (q)↓). Moreover, in line 18, we check
that the event is propagated to q in a context where p is active. Line 19 checks that, in
the case where the event is never propagated to q (i.e. β (q) ∈ K②), q will receive a coherence
successor of this event in some context.
Updating. The for-loop of line 20 uses the values guessed above for updating the global data
structure α. More precisely, if the event e is propagated to a process q, i.e., β (q) ∈ K①,
then we add β to the summary of the timestamps of the sequence of write operations on x
propagated to q up to the current point in the context β (q)↓. Lines 23–24 assign the value
exp to µ (q, x, β (q)↓) and ν (p, x) respectively. Recall that the former stores the value defined
by the latest write event on x propagated to q up to the current point in the context β (q)↓,
and the latter stores the value defined by the latest write on x by p.
Read Instructions. Consider a read instruction i in a process p ∈ P whose stmt (i) is of
the form $r ← x. The translation of i is shown in Algorithm 3. The code simulates an event
e executing i by encoding the three inference rules Local-Read, Prop-Read, and Com-Read.
In a similar manner to a write instruction, the translation scheme for a read instruction
consists of guessing, checking and update parts. Notice however that the initialization of
the read event is carried out through two different inference rules.
Guessing. In line 1, we store the old value of iR (p, x). In line 2, we guess the context where
the event e will be initialized, and store the guessed context both in iR (p, x) and iReg ($r).
Recall that the latter records the initializing context of the latest read event loading a value
to $r. In lines 3–4, we execute similar instructions for committing (rather than initializing).
Checking. Lines 5–9 perform the sanity checks for iR (p, x). Lines 5–6 check that the
initializing context for the event e is not smaller than the current context and that the
initialization happens in a context where p is active. Line 7 ensures that at least one of the
two inference rules Local-Read and Prop-Read is satisfied, by checking that the closest write
event CW (c, e) (if it exists) has been initialized or committed. In line 8, we satisfy RdCnd in
the rule Com-Read. Lines 9–11 perform the sanity checks for cR (p, x) in a similar manner to
the corresponding instructions for write events (see above).
Algorithm 3: ⟦$r ← x⟧^{p,Read}_K.
// Guess
1 old-iR← iR (p, x);
2 iR (p, x)← gen ([1..K]); iReg ($r)← iR (p, x);
3 old-cR← cR (p, x);
4 cR (p, x)← gen ([1..K]); cReg ($r)← cR (p, x);
// Check
5 assume (iR (p, x) ≥ cnt);
6 assume (active (iR (p, x)) = p);
7 assume (iR (p, x) ≥ iW (p, x));
8 assume (iR (p, x) ≥ cW (p, x) ⇒ α (p, x, old-iR) ⊑ α (p, x, iR (p, x)));
9 assume (cR (p, x) ≥ iR (p, x));
10 assume (active (cR (p, x)) = p);
11 assume(cR (p, x) ≥ max {ctrl (p) , old-cR, cW (p, x)});
// Update
12 if iR (p, x)<cW (p, x) then $r ← ν (p, x) ;
13 else $r ← µ (p, x, iR (p, x)) ;
Algorithm 4: ⟦verProc⟧_K.
1 for p ∈ P ∧ x ∈ X ∧ k ∈ [1..K− 1] do
2   assume (α (p, x, k) = αinit (p, x, k + 1));
3   assume (µ (p, x, k) = µinit (p, x, k + 1));
4 if λ is reachable then error ;
Updating. The purpose of the update part (the if-statement of line 12) is to ensure that
the correct read-from relation is defined as described by the inference rules Local-Read and
Prop-Read. If iR (p, x) < cW (p, x), then this means that the latest write event e′ on x by p
is not committed and hence, according to Local-Read, the event e reads its value from that
event. Recall that this value is stored in ν (p, x). On the other hand, if iR (p, x) ≥ cW (p, x)
then the event e′ has been committed and hence, according to Prop-Read, the event e reads
its value from the latest write event on x propagated to p in the context where e is initialized.
We notice that this value is stored in µ (p, x, iR (p, x)).
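The case distinction of lines 12–13 of Algorithm 3 can be sketched as a small Python function. The dictionaries standing for cW, ν and µ, the function name read_value and the sample values are illustrative assumptions.

```python
def read_value(p, x, i_r, c_w, nu, mu):
    """Value loaded by a read of x in process p that is initialized in context i_r."""
    if i_r < c_w[(p, x)]:
        # The latest local write on x is not yet committed: Local-Read case.
        return nu[(p, x)]
    # Otherwise read the latest write propagated to p in context i_r: Prop-Read case.
    return mu[(p, x, i_r)]


C_W = {("p1", "x"): 3}          # the latest write on x by p1 commits in context 3
NU = {("p1", "x"): 7}           # value of that write
MU = {("p1", "x", 2): 0, ("p1", "x", 3): 7, ("p1", "x", 4): 5}

EARLY = read_value("p1", "x", 2, C_W, NU, MU)   # initialized before the commit
LATE = read_value("p1", "x", 4, C_W, NU, MU)    # initialized after the commit
print(EARLY, LATE)
```

A read initialized in context 2 forwards the pending local value, whereas a read initialized in context 4 observes whatever has been propagated to p1 by then.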
Verifying Process. The verifying process makes sure that the updated value α of the
timestamp at the end of a given context k : 1 ≤ k ≤ K− 1 is equal to the corresponding
guessed value αinit at the start of the next context. It also performs the corresponding test
for the values written to variables (by comparing µ and µinit). Finally, it checks whether we
reach an error label λ (given in the reachability problem) or not.
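The consistency test performed by the verifying process amounts to checking that the per-context guesses chain together. A minimal Python sketch, assuming the mappings are stored as dictionaries keyed by (p, x, k) and using the hypothetical helper name contexts_chain, is the following.

```python
def contexts_chain(final, init, keys, K):
    """Check final(p, x, k) == init(p, x, k + 1) for all k in [1..K-1]."""
    return all(
        final[(p, x, k)] == init[(p, x, k + 1)]
        for (p, x) in keys
        for k in range(1, K)
    )


K = 3
KEYS = [("p1", "x")]
MU_INIT = {("p1", "x", 1): 0, ("p1", "x", 2): 1, ("p1", "x", 3): 1}
# Values of mu at the end of each context in two hypothetical simulated runs.
MU_GOOD = {("p1", "x", 1): 1, ("p1", "x", 2): 1, ("p1", "x", 3): 4}
MU_BAD = {("p1", "x", 1): 0, ("p1", "x", 2): 1, ("p1", "x", 3): 4}

print(contexts_chain(MU_GOOD, MU_INIT, KEYS, K))
print(contexts_chain(MU_BAD, MU_INIT, KEYS, K))
```

The same helper applies to α and αinit, since the comparison is plain equality on the stored values. A run whose guesses fail the test is discarded by the assume statements of Algorithm 4.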
Prog ::= var x∗ (proc p reg $r∗ i∗)∗
i ::= λ : s;
s ::= $r←x | $r← [exp]
| x←exp | [exp′]←exp
| assume exp
| if exp then i∗ else i∗
| while exp do i∗ | term
| sync | lwsync | isync
Figure 6. Syntax of concurrent programs including address operators
and synchronization instructions. The additional statements are written in
blue.
4. Extending the Semantics: Address Operators and Synchronisation
Instructions
In this section, we give the syntax of concurrent programs and the POWER operational
semantics taking into account address operators and synchronization instructions as for-
malized in [25, 46, 45]. We also give an example of a small program that illustrates how
synchronization instructions work under the POWER semantics.
Syntax. Fig. 6 gives the grammar containing address operators and synchronisation instruc-
tions. The additional statements are written in blue.
The address operators are used in read and write instructions. We assume that all
shared variables have unique addresses. Memory accessing instructions use the notation
[exp] to denote the memory location where the address is given by the value of the expression
exp. A read statement of the form $r← [exp] loads the value stored in the memory location
given by the value of the expression exp to the register $r. A write statement of the form
[exp′] ← exp stores the value of the expression exp to the memory location given by the
value of the expression exp′.
There are three kinds of synchronisation (or fence or memory barrier) statements, namely
sync, lwsync, and isync. Intuitively, the synchronization instructions are used to enforce
the committing order between read and/or write instructions or the propagation ordering
between write instructions. We will explain in detail the semantics of the synchronisation
instructions in the Configurations, Transition Relation, and Example paragraphs.
We recall and extend several definitions that we will use in the extended POWER
operational semantics.
We keep the definitions of the instruction set I, lbl (i), stmt (i), R (i), next (i), Tnext (i),
and Fnext (i) as in Section 2.
We extend the definitions of the functions var (i) and exp (i) to cover the address
operators as follows. (i) For a write instruction i where stmt (i) is of the form x← exp or
a read instruction i where stmt (i) is of the form $r ← x, we define var (i) := x. (ii) For a
write instruction i where stmt (i) is of the form [exp′]← exp or a read instruction i where
stmt (i) is of the form $r ← [exp], we define var (i) := ⊤. Intuitively, this means that the
variable in stmt (i) is undetermined. (iii) For an instruction i that is neither write nor read,
we define var (i) := ⊥. Then, we define exp (i). (i) For a write instruction i where stmt (i)
is of the form x← exp or [exp′]← exp or an aci instruction i where stmt (i) is of the form
assume exp or if exp then i∗ else i∗ or while exp do i∗, we define exp (i) := exp. (ii) For
an instruction i that is neither write nor aci, we define exp (i) := ⊥.
Given an instruction i, we define addr (i) to be the address function in i as follows. (i)
For a write instruction i where stmt (i) is of the form [exp′]←exp we define addr (i) := exp′.
(ii) For a read instruction i where stmt (i) is of the form $r← [exp], we define addr (i) := exp.
(iii) For a write instruction i where stmt (i) is of the form x← exp or a read instruction i
where stmt (i) is of the form $r←x, we define addr (i) to be a constant that is the address of
the variable x. (iv) For an instruction i that is neither write nor read, we define addr (i) := ⊥.
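The case analyses defining var (i) and addr (i) can be summarized by the following Python sketch. Statements are encoded as tagged tuples, and the tags, the constants TOP and BOT, and the address table ADDRESS_OF are hypothetical choices made only for this illustration.

```python
TOP, BOT = "TOP", "BOT"           # stand-ins for the undetermined and undefined values
ADDRESS_OF = {"x": 0, "y": 1}     # assumed unique addresses of the shared variables


def var(stmt):
    """Variable accessed by a statement, TOP if undetermined, BOT if none."""
    kind = stmt[0]
    if kind == "read_var":        # $r <- x, encoded as ("read_var", r, x)
        return stmt[2]
    if kind == "write_var":       # x <- exp, encoded as ("write_var", x, exp)
        return stmt[1]
    if kind in ("read_addr", "write_addr"):
        return TOP                # the variable depends on a computed address
    return BOT                    # neither a read nor a write


def addr(stmt):
    """Address expression of a statement, or BOT if it accesses no memory."""
    kind = stmt[0]
    if kind == "read_addr":       # $r <- [exp], encoded as ("read_addr", r, exp)
        return stmt[2]
    if kind == "write_addr":      # [exp'] <- exp, encoded as ("write_addr", exp', exp)
        return stmt[1]
    if kind in ("read_var", "write_var"):
        return ADDRESS_OF[var(stmt)]   # constant address of the named variable
    return BOT


print(var(("read_var", "$r1", "y")), addr(("read_var", "$r1", "y")))
print(var(("write_addr", "$r1 + 1", "5")), addr(("write_addr", "$r1 + 1", "5")))
```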
Configurations. We assume that the set E contains synchronization events. Similar to the
semantics in Section 2, we present the execution of an instruction by an event through several
steps, namely fetching, initializing, committing, and propagating. For the special case of a
synchronization instruction, it is first fetched and then committed without being initialized.
Furthermore, after a sync or lwsync instruction is committed, it will be propagated to the
other processes. A configuration c is a tuple
〈E,≺, ins, status, rf, Prop, SyncProp, SeenWr, SeenSyncs,≺co〉
defined as follows.
Events. We keep the definitions of E, Ep, ins(e), proc (e), EW, ER, EACI, EWp, ERp and EACIp
as in Section 2. Moreover, we use ESS, ELS and EIS to denote the sets of sync events, lwsync
events, and isync events respectively. We define ESSp, ELSp, and EISp to be the restrictions of
the above sets to Ep.
Program Order, Status, Propagation, Read-From, Coherence Order. We keep the definitions
of ≺, status, Prop, rf, ≺co as in Section 2.
Synchronisation Propagation. The function SyncProp : P ↦ 2^(ESS ∪ ELS) defines, for a process
p ∈ P , the set of sync and lwsync events propagated to p. In contrast to a write event, there
is no global view about the order in which sync and lwsync events are propagated (that is
presented by the coherence order for write events). Moreover, a sync or lwsync event will be
propagated to all processes in the system.
Seen Writes. The function SeenWr : (ESS ∪ ELS) × X ↦ EW defines, for a sync or lwsync
event and a variable x ∈ X , the last write event on x that has been propagated to the
process committing the synchronization event. In [46], for a sync or lwsync event e, the set
{SeenWr(e, x)|x ∈ X} is called the Group A writes of the event e.
Seen Synchronisations. The function SeenSyncs : EW ↦ 2^(ESS ∪ ELS) defines, for a write event,
the set of sync and lwsync events that have been propagated to the process committing the
write event.
Dependencies. To formalize the POWER operational semantics, we need to define the
dependency orders on the set of events. We keep the definition of ≺ctrl as in Section 2.
Below we extend the orders ≺poloc and ≺data. We also introduce the address dependency
order ≺addr.
We define the per-location program-order ≺poloc⊆ E×E such that e1 ≺poloc e2 if e1 ≺ e2
and either var (ins (e1)) = var (ins (e2)) ∈ X , or var (ins (e1)) = ⊤, or var (ins (e2)) = ⊤, i.e. it
is the restriction of ≺ to events with identical or undetermined variables.
We define the data dependency order ≺data such that e1 ≺data e2 if (i) e1 ∈ ER, i.e., e1
is a read event; (ii) e2 ∈ EW ∪EACI, i.e., e2 is either a write or an aci event; (iii) e1 ≺ e2;
(iv) stmt (ins (e1)) is of the form $r ← x or $r ← [exp]; (v) stmt (ins (e2)) is of the form
x ← exp, [exp′] ← exp, if exp then i∗ else i∗, or while exp do i∗ and $r ∈ R (exp); and
(vi) there is no e3 ∈ ER such that e1 ≺ e3 ≺ e2 and stmt (ins (e3)) is of the form $r ← y or
$r ← [exp].
We define the address dependency order ≺addr such that e1 ≺addr e2 if (i) e1 ∈ ER, i.e.,
e1 is a read event; (ii) e2 ∈ ER ∪ EW, i.e., e2 is either a read or write event; (iii) e1 ≺ e2;
(iv) stmt (ins (e1)) is of the form $r ← x or $r ← [exp]; (v) stmt (ins (e2)) is of the form
$r ← [exp] such that $r ∈ R (exp) or of the form [exp′] ← exp such that $r ∈ R (exp′);
and (vi) there is no e3 ∈ ER such that e1 ≺ e3 ≺ e2 and stmt (ins (e3)) is of the form
$r ← y or $r ← [exp]. Intuitively, the loaded value by e1 is used to compute the address
addr (ins (e2)).
Committed and Initial Configurations. We keep the definitions of a committed configuration
and C as in Section 2. The initial configuration cinit is defined by
〈∅, ∅, λe.⊥, λe.⊥, λe.⊥, λp.λx.einitx , λp.∅, λe.λx.⊥, λe.∅, ∅〉
Evaluation Functions. We keep the definitions of the functions Val (c, e, exp) and Val (c, e)
as in Section 2.
Let e be an event and c be a configuration. We define Var(c, e) to be the variable
whose address is given by Val (c, e, addr (ins (e))). Note that if addr (ins (e)) = ⊥, then
Var(c, e) = ⊥. Moreover, if addr (ins (e)) ≠ ⊥ and Val (c, e, addr (ins (e))) = ⊥, then
Var(c, e) = ⊤. Intuitively, it means that the event e is accessing (i.e. reading or writing)
an undetermined variable.
The relations between var (ins (e)) and Var(c, e) can be seen by considering different
forms of the statement stmt (ins (e)) as follows.
• If stmt (ins (e)) is of the form $r ← x or x← exp, then var (ins (e)) = Var(c, e) = x.
• If stmt (ins (e)) is of the form $r ← [exp] or [exp′] ← exp, then var (ins (e)) = ⊤ and
Var(c, e) ∈ X ∪ {⊤}.
• If stmt (ins (e)) is neither a read nor write statement, then var (ins (e)) = Var(c, e) = ⊥.
Transition Relation. The relation −→, taking into account the address operators and synchronization
instructions, is defined by the set of inference rules shown in Fig. 7. Analogously
to Section 2, we define different transition rules for fetching, initializing, committing, and
propagating in the next paragraphs. Let c be the configuration in which a
transition rule is executed. We keep the rule Fetch as in Section 2. Below we explain the other rules for
initializing, committing, and propagating. Table 2 gives all predicates that are extended or
introduced.
Predicate: RdInitCnd (c, e), for e ∈ ER
Definition: ∀e′ ∈ ER : (e′ ≺addr e) =⇒ (status (e′) = init)
Meaning: All read events preceding e in ≺addr have already been initialized.

Predicate: WrInitCnd (c, e), for e ∈ EW
Definition: ∀e′ ∈ ER : ((e′ ≺data e) ∨ (e′ ≺addr e)) =⇒ (status (e′) = init)
Meaning: All read events preceding e in ≺data or ≺addr have already been initialized.

Predicate: RdCnd (c, e), for e ∈ ER
Definition: ∀e′ ∈ ER : ((e′ ≺poloc e) ∧ (Var (c, e′) = Var (c, e))) =⇒ (rf (e′) ⪯co rf (e))
Meaning: For every read event e′ preceding the read e in ≺poloc (with the same defined variable), the write event from which e reads its value is not a coherence predecessor of the write event from which e′ reads.

Predicate: ComCnd (c, e), for e ∈ E
Definition: ∀e′ ∈ E : ((e′ ≺data e) ∨ (e′ ≺ctrl e) ∨ (e′ ≺addr e) ∨ ((e′ ≺poloc e) ∧ (Var (c, e′) ∈ {Var (c, e) ,⊤}))) =⇒ (status (e′) = com)
Meaning: All events preceding e in ≺data, ≺ctrl, ≺addr, or ≺poloc (with the same defined variable or an undetermined variable) have already been committed.

Predicate: PropSyncs (c, e), for e ∈ E
Definition: ∀e′ ∈ ESS : (e′ ≺ e) =⇒ (∀p ∈ P : e′ ∈ SyncProp (p))
Meaning: All sync events preceding e in ≺ have already been propagated to all processes in the system.

Predicate: ComLwsyncs (c, e), for e ∈ E
Definition: ∀e′ ∈ ELS : (e′ ≺ e) =⇒ (status (e′) = com)
Meaning: All lwsync events preceding e in ≺ have already been committed.

Predicate: ComIsyncs (c, e), for e ∈ E
Definition: ∀e′ ∈ EIS : (e′ ≺ e) =⇒ (status (e′) = com)
Meaning: All isync events preceding e in ≺ have already been committed.

Predicate: AllSyncCnd (c, e), for e ∈ E
Definition: PropSyncs (c, e) ∧ ComLwsyncs (c, e) ∧ ComIsyncs (c, e)
Meaning: A conjunction of PropSyncs (c, e), ComLwsyncs (c, e), and ComIsyncs (c, e).

Predicate: SeenWrCnd (c, e, p), for e ∈ ESS ∪ ELS
Definition: ∀x ∈ X : SeenWr (e, x) ⪯co Prop (p, x)
Meaning: For each seen write of e, that write (or some coherence successor) has already been propagated to p.

Predicate: SeenSyncCnd (c, e, p), for e ∈ EW
Definition: ∀e′ ∈ SeenSyncs (e) : e′ ∈ SyncProp (p)
Meaning: All seen synchronizations of e have already been propagated to p.

Predicate: ComRdWrCnd (c, e), for e ∈ ESS ∪ ELS
Definition: ∀e′ ∈ ER ∪ EW : (e′ ≺ e) =⇒ (status (e′) = com)
Meaning: All read and write events preceding e in ≺ have already been committed.

Predicate: AddrRdWrCnd (c, e), for e ∈ EIS
Definition: ∀e′ ∈ ER ∪ EW : (e′ ≺ e) =⇒ (∀e′′ ≺addr e′ : status (e′′) = com)
Meaning: All events that provide the values for address expressions in read and write events preceding e in ≺ have already been committed.

Table 2. Definitions of predicates taking into account the address operators
and synchronization instructions. We omit the predicates that are
identical to Section 2.
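To make the synchronization predicates concrete, here is a minimal Python sketch of PropSyncs, ComLwsyncs, ComIsyncs, and their conjunction AllSyncCnd. The Config class and its field names are assumptions chosen for illustration; they cover only the parts of a configuration that these four predicates inspect.

```python
from dataclasses import dataclass

@dataclass
class Config:
    kind: dict       # event -> "read" | "write" | "sync" | "lwsync" | "isync" | "aci"
    status: dict     # event -> "fetch" | "init" | "com"
    po: set          # pairs (e1, e2) meaning e1 ≺ e2
    sync_prop: dict  # process -> set of sync/lwsync events propagated to it

def preceding(c, e, kind):
    """Events of the given kind that precede e in ≺."""
    return [e1 for (e1, e2) in c.po if e2 == e and c.kind[e1] == kind]

def prop_syncs(c, e):
    # every preceding sync has been propagated to every process
    return all(s in c.sync_prop[p] for s in preceding(c, e, "sync") for p in c.sync_prop)

def com_lwsyncs(c, e):
    return all(c.status[s] == "com" for s in preceding(c, e, "lwsync"))

def com_isyncs(c, e):
    return all(c.status[s] == "com" for s in preceding(c, e, "isync"))

def all_sync_cnd(c, e):
    return prop_syncs(c, e) and com_lwsyncs(c, e) and com_isyncs(c, e)
```

Note the asymmetry that the sketch exposes: a preceding sync must reach every process, whereas a preceding lwsync or isync only has to be committed.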
Fetch:
    e ∉ E,  ≺′ = ≺ ∪ {〈e′, e〉 | e′ ∈ Ep},  i ∈ MaxI (c, p)
    ────────────────────────────────────────
    c −p→ 〈E ∪ {e}, ≺′, ins[e ← i], status[e ← fetch], rf, Prop, SyncProp, SeenWr, SeenSyncs, ≺co〉

Local-Read:
    e ∈ ERp,  status (e) = fetch,  RdInitCnd (c, e),
    e′ = CW (c, e),  status (e′) = init,  AllSyncCnd (c, e)
    ────────────────────────────────────────
    c −p→ 〈E, ≺, ins, status[e ← init], rf[e ← e′], Prop, SyncProp, SeenWr, SeenSyncs, ≺co〉

Prop-Read:
    e ∈ ERp,  status (e) = fetch,  RdInitCnd (c, e),  AllSyncCnd (c, e),
    (CW (c, e) = ⊥) ∨ (e′ = CW (c, e) ∧ status (e′) = com)
    ────────────────────────────────────────
    c −p→ 〈E, ≺, ins, status[e ← init], rf[e ← Prop (p, Var (c, e))], Prop, SyncProp, SeenWr, SeenSyncs, ≺co〉

Com-Read:
    e ∈ ERp,  status (e) = init,  ComCnd (c, e),  RdCnd (c, e),  AllSyncCnd (c, e)
    ────────────────────────────────────────
    c −p→ 〈E, ≺, ins, status[e ← com], rf, Prop, SyncProp, SeenWr, SeenSyncs, ≺co〉

Init-Write:
    e ∈ EWp,  status (e) = fetch,  WrInitCnd (c, e),  AllSyncCnd (c, e)
    ────────────────────────────────────────
    c −p→ 〈E, ≺, ins, status[e ← init], rf, Prop, SyncProp, SeenWr, SeenSyncs, ≺co〉

Com-Write:
    e ∈ EWp,  x = Var (c, e),  status (e) = init,  ComCnd (c, e),
    AllSyncCnd (c, e),  ≺′co = ≺co ∪ {〈e′, e〉 | e′ ⪯co Prop (p, x)}
    ────────────────────────────────────────
    c −p→ 〈E, ≺, ins, status[e ← com], rf, Prop[〈p, x〉 ← e], SyncProp, SeenWr, SeenSyncs[e ← SyncProp (p)], ≺′co〉

Prop:
    q ∈ P,  e ∈ EWp,  status (e) = com,  Prop (q, Var (c, e)) ≺co e,
    SeenSyncCnd (c, e, q),  ≺′co = ≺co ∪ {〈e′, e〉 | e′ ⪯co Prop (q, Var (c, e))}
    ────────────────────────────────────────
    c −p→ 〈E, ≺, ins, status, rf, Prop[〈q, Var (c, e)〉 ← e], SyncProp, SeenWr, SeenSyncs, ≺′co〉

Com-ACI:
    e ∈ EACIp,  status (e) = fetch,  ComCnd (c, e),  ValidCnd (c, e),  AllSyncCnd (c, e)
    ────────────────────────────────────────
    c −p→ 〈E, ≺, ins, status[e ← com], rf, Prop, SyncProp, SeenWr, SeenSyncs, ≺co〉

Com-ISync:
    e ∈ EISp,  ComCnd (c, e),  AllSyncCnd (c, e),  AddrRdWrCnd (c, e)
    ────────────────────────────────────────
    c −p→ 〈E, ≺, ins, status[e ← com], rf, Prop, SyncProp, SeenWr, SeenSyncs, ≺co〉

Com-Sync:
    e ∈ ESSp ∪ ELSp,  ComCnd (c, e),  AllSyncCnd (c, e),  ComRdWrCnd (c, e)
    ────────────────────────────────────────
    c −p→ 〈E, ≺, ins, status[e ← com], rf, Prop, SyncProp[p ← SyncProp (p) ∪ {e}], SeenWr[〈e, x〉 ← Prop (p, x)], SeenSyncs, ≺co〉

Prop-Sync:
    q ∈ P,  e ∈ ESSp ∪ ELSp,  status (e) = com,  SeenWrCnd (c, e, q)
    ────────────────────────────────────────
    c −p→ 〈E, ≺, ins, status, rf, Prop, SyncProp[q ← SyncProp (q) ∪ {e}], SeenWr, SeenSyncs, ≺co〉

Figure 7. Inference rules with synchronizations and address operators
defining the relation −p→ where p ∈ P. We assume that c is of the form
〈E, ≺, ins, status, rf, Prop, SyncProp, SeenWr, SeenSyncs, ≺co〉.
Similar to Section 2, there are two ways in which read events get their values, namely
either from local write events by the rule Local-Read or from write events that are propagated
to the process by the rule Prop-Read. In the rule Local-Read, the process p initializes a
read event e ∈ ERp on a variable Var (c, e) (say x), where e has already been fetched. We
note that, when the predicate RdInitCnd (c, e) is satisfied, if var (ins (e)) = ⊤, then Var(c, e) ∈ X ,
i.e., the variable from which e is reading has been defined. Here, the event e is made to
read its value from a local write event e′ ∈ EWp on x such that (i) e′ has been initialized
but not yet committed, and (ii) e′ is the closest write event that precedes e in ≺poloc (note
that we have extended the definition of ≺poloc to cover the address operators). By condition
(ii), e′ is unique if it exists. To formalize this, we extend the definition of the Closest Write
function CW (c, e) by taking into account the address operator. We define CW (c, e) := e′
where e′ is the unique event such that e′ ∈ EWp, e′ ≺poloc e, Var (c, e′) ∈ {x,⊤}, and there is
no event e′′ such that e′′ ∈ EWp, e′ ≺poloc e′′ ≺poloc e, and Var (c, e′′) ∈ {x,⊤}. If CW (c, e)
does not exist or it has been committed, then we use the rule Prop-Read to let e fetch its
value from the latest write event on x that has been propagated to p. Both rules Local-Read
and Prop-Read can only be performed for a read event e ∈ ERp if e satisfies the predicates
AllSyncCnd (c, e) (see footnote 1) and RdInitCnd (c, e).
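The choice between Local-Read and Prop-Read can be sketched as follows. This is a simplified illustration, assuming that the events of one process are stored in a list in program order and that each event carries its kind, its current Var(c, e), and its status; the dictionary keys and function names are hypothetical, and the predicates RdInitCnd and AllSyncCnd are deliberately left out.

```python
TOP = "⊤"  # undetermined variable

def closest_write(events, i):
    """CW(c, e) for the read at index i: the nearest earlier write of the same
    process whose variable is x or still undetermined (None if there is none)."""
    x = events[i]["var"]
    for j in range(i - 1, -1, -1):
        ev = events[j]
        if ev["kind"] == "write" and ev["var"] in (x, TOP):
            return j
    return None

def applicable_read_rule(events, i):
    """Which rule may initialize the fetched read at index i, if any."""
    j = closest_write(events, i)
    if j is None or events[j]["status"] == "com":
        return "Prop-Read"    # read from the write propagated to p
    if events[j]["status"] == "init":
        return "Local-Read"   # read from the local, uncommitted write
    return None               # closest write only fetched: the read must wait
```

A write whose variable is still ⊤ blocks the search, since it may turn out to target x once its address is known.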
To commit an initialized read event e ∈ ERp, we use the rule Com-Read. The
rule can be performed if e satisfies three predicates in c: RdCnd (c, e), ComCnd (c, e), and
AllSyncCnd (c, e).
To initialize a fetched write event e ∈ EWp, we use the rule Init-Write. The rule can be
performed if e satisfies the predicates WrInitCnd (c, e) and AllSyncCnd (c, e).
The rule Com-Write to commit a write event is similar to the corresponding rule in
Section 2, except that we also keep information about all the seen sync and lwsync events of
the write event by updating SeenSyncs.
Write events are propagated to other processes by the Prop rule. Taking into account
the synchronization instructions, the rule Prop requires that all the seen sync and lwsync
events of the write event e have been propagated to process q. This condition is formulated
by the predicate SeenSyncCnd (c, e, q).
An aci event is committed by the rule Com-ACI that is kept as in Section 2.
We explain the transition rules for synchronization events. To commit and propagate a
sync or lwsync event, we use the rules Com-Sync and Prop-Sync respectively. To commit an
isync event, we use the rule Com-ISync. The rule Com-Sync requires the predicates ComCnd (c, e),
AllSyncCnd (c, e), and ComRdWrCnd (c, e) to hold; the rule Com-ISync requires ComCnd (c, e),
AllSyncCnd (c, e), and AddrRdWrCnd (c, e); and the rule Prop-Sync, which propagates e to a
process q, requires SeenWrCnd (c, e, q).
When a sync or lwsync event in a process p ∈ P is committed, it is also immediately
propagated to p itself. Moreover, we keep information about all the seen write events of the
sync or lwsync event by updating SeenWr.
Bounded Reachability. We keep the definitions of the run π, last (π), the complete
configuration, the complete run, π ↑, the context, the K-bounded run, the reachability
problem, and the K-bounded reachability problem as in Section 2.
Example 4.1. We give an example of a small concurrent program to illustrate how sync
and lwsync instructions work under the POWER semantics.
Footnote 1: The semantics of lwsync is formalized as in [45] (page 5): a read event can only be initialized if all lwsync
events preceding it in ≺ have already been committed.
var x, y
proc p1
reg $r1
0 : x← 1;
1 : sync;
2 : $r1←y;
3 : assume $r1 = 0;
4 : assume 1;
5 : term;
proc p2
reg $r2
6 : y ← 1;
7 : sync;
8 : $r2←x;
9 : assume $r2 = 0;
10 : assume 1;
11 : term;
Figure 8. A variant of the SB (Store Buffer) program [14].
Event Instruction
e1 0 : x←1
e2 2 : $r1←y
e3 6 : y←1
e4 8 : $r2←x
e′ 1 : sync
e′′ 7 : sync
Figure 9. Read, write, and synchronization events in the program in Fig. 8.
Fig. 8 illustrates a program that is written following the syntax in Fig. 6. The program
has two processes P = {p1, p2} communicating through two variables X = {x, y}. Moreover,
process p1 (resp. p2) has a register $r1 (resp. $r2). At the beginning, all the variables and
registers are initialized to 0. Process p1 has two memory access instructions: writing 1 to x (event e1)
and reading y (event e2). Between these two instructions, p1 executes a sync instruction
(event e′). Similarly, process p2 has two memory access instructions, writing 1 to y (event e3) and reading
x (event e4), and a sync instruction (event e′′) between these two instructions. In the read
operation, process p1 loads the initial value 0 from y (line 2) to register $r1. If p1 can do
that, it reaches the label of line 4. In a similar way to p1, process p2 loads the initial value 0
from x to register $r2.
The reachability problem under POWER asks whether processes p1 and p2 can reach
the labels of lines 4 and 10 respectively at the same time. This reachability problem has a
negative answer according to the POWER semantics [46, 45].
We explain the negative result of the reachability problem using the transition rules in
Fig. 7. In order to initialize the read event e2, p1 must satisfy the predicate AllSyncCnd (c2, e2)
for some c2 ∈ C (see the rule Prop-Read) by propagating its sync event (e′) to itself and p2.
To propagate e′ to p2, all seen write events of e′ must also be propagated to p2 (see the rule
Prop-Sync). The seen write of e′ for x is the write event e1 since e1 must be committed and
propagated to p1 before e′ can be committed (see the rule Com-Sync). It means that e2 can
only be initialized after the write e1 has already been propagated to p2. Similarly, e4 can
only be initialized after the write event e3 has already been propagated to p1. As a consequence,
at least one of the two processes must see the written value 1 from the variable that it wants to
read. In other words, it is not possible for both processes to load the initial values.
If we replace the two sync instructions by two lwsync instructions, the reachability
problem has a positive answer. The reason is that in order to initialize e2, p1 only needs
to commit its lwsync event e′ without propagating it to p2 (see the rule Prop-Read). To
commit e′, p1 only needs to commit the write event e1 and can delay the propagation of
e1 to p2 (see the rule Com-Sync). It means that e2 can be initialized before the write e1 is
propagated to p2. Similarly, e4 can be initialized before the write e3 is propagated to p1.
As a consequence, it is possible for both processes to see the initial values of the variables that
they want to read.
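As a sanity check on the sync variant, the following Python sketch enumerates all interleavings of the four memory accesses of Fig. 8 under SC only; it does not model the POWER rules of Fig. 7. The tuple encoding of the operations and the function names are assumptions made for illustration. Under SC, the outcome in which both registers hold 0 is likewise unreachable, which agrees with the negative answer derived above for the program with sync instructions.

```python
def interleavings(a, b):
    """Yield every merge of sequences a and b that preserves each one's order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest

# Memory accesses of p1 and p2 from Fig. 8 (sync, assume, and term omitted).
P1 = [("write", "x", 1), ("read", "y", "r1")]
P2 = [("write", "y", 1), ("read", "x", "r2")]

def sc_outcomes():
    """Set of final (r1, r2) pairs over all SC interleavings."""
    results = set()
    for run in interleavings(P1, P2):
        mem = {"x": 0, "y": 0}
        regs = {}
        for op, var, arg in run:
            if op == "write":
                mem[var] = arg
            else:
                regs[arg] = mem[var]
        results.add((regs["r1"], regs["r2"]))
    return results
```

The lwsync variant admits the additional outcome (0, 0) under POWER, which no SC interleaving produces; this is exactly the gap that the translation of Section 5 has to account for.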
5. Translation with Address Operators and Synchronization Instructions
In this section, we give the extension of our algorithm in Section 3 that reduces the K-bounded
reachability problem under POWER to the corresponding problem under SC for
concurrent programs, taking into account the address operators and synchronization
instructions.
Below, we present an extended scheme for the translation, our extended data structures,
and the translated code for different types of the instructions.
Scheme. Fig. 10 gives our translation scheme that transforms a program Prog into a program
Prog• following the map function ⟦.⟧_K. Let P and X be the sets of processes and
(shared) variables in Prog . Similar to Section 3, the map ⟦.⟧_K replaces the variables of Prog
by O(|P|·K) copies of the set X , in addition to a finite set of finite-data structures (explained
and formally defined in the Data Structures paragraph). The definition of initProc (resp.
verProc) will be given in the Initializing Process (resp. Verifying Process) paragraph.
Analogously to Section 3, the map function ⟦.⟧_K adds for each instruction i appearing in
Prog the code activeCnt, the translation for stmt (i), and finally the code closeCnt. The
translations of write, read, sync, lwsync, and isync statements will be described in the Write
Instructions, Read Instructions, Sync Instructions, Lwsync Instructions, and Isync
Instructions paragraphs respectively.
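The wrapper code around each translated instruction can be illustrated with a small Python sketch. It models the nondeterministic guess gen ([0..K−1]) followed by assume (cnt ≤ K) as the list of all surviving values; the function names and the dictionary representation of active are assumptions for illustration, not part of the translation itself.

```python
def active_cnt(active, cnt, p):
    """activeCnt: the instruction of p may only run if p owns the current context."""
    return active[cnt] == p

def close_cnt(cnt, K):
    """closeCnt: all values cnt may take after adding a guessed amount in
    [0..K-1] and discarding the guesses that violate cnt ≤ K."""
    return [cnt + d for d in range(K) if cnt + d <= K]
```

With K = 3 and cnt = 2, for example, the simulation may stay in context 2 or switch to context 3, but it can never move back to an earlier context.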
Data Structures. We keep the data structures
µ (|P|, |X |,K) , µinit (|P|, |X |,K) , α (|P|, |X |,K) ,
αinit (|P|, |X |,K) , ν (|P|, |X |) , iR (|P|, |X |) ,
cR (|P|, |X |) , iW (|P|, |X |) , cW (|P|, |X |) , iReg (|R|) ,
cReg (|R|) , ctrl (|P|) , active (K) , cnt
as in Section 3. The translations of read and write instructions taking into account the
address operators can be extended from the corresponding translations in Section 3 by
using these data structures. Below, we explain our added data structures to handle the
synchronization instructions. The additional data structures are written in blue in Fig. 10.
Similar to the write events, we associate a timestamp with each sync or lwsync event.
A synchronization timestamp τsync is a mapping P ↦ [1..K]. For a process p ∈ P, the
value of τsync (p) of a given sync or lwsync event represents the context where the event is
propagated to p. In contrast to write events, a sync or lwsync event is always propagated
⟦Prog⟧_K  def=  var x∗ 〈addvars〉_K 〈iniProc〉_K 〈verProc〉_K (⟦proc p reg $r∗ i∗⟧_K)∗

〈addvars〉_K  def=  µ (|P|, |X |,K)  µinit (|P|, |X |,K)
                   α (|P|, |X |,K)  αinit (|P|, |X |,K)
                   ν (|P|, |X |)  iR (|P|, |X |)  cR (|P|, |X |)
                   iW (|P|, |X |)  cW (|P|, |X |)  iReg (|R|)
                   cReg (|R|)  ctrl (|P|)  active (K)  cnt
                   γ (|P|,K)  γinit (|P|,K)
                   sync (|P|)  lsync (|P|)  isync (|P|)
                   ack (|P|)  max-addr-cR (|P|)

〈iniProc〉_K  def=  ⟦iniProc⟧_K
〈verProc〉_K  def=  ⟦verProc⟧_K

⟦proc p reg $r∗ i∗⟧_K  def=  proc p reg $r∗ (⟦i⟧^p_K)∗
⟦i⟧^p_K  def=  λ : 〈activeCnt〉^p_K ⟦s⟧^p_K 〈closeCnt〉^p_K

〈activeCnt〉^p_K  def=  assume (active (cnt) = p)
〈closeCnt〉^p_K  def=  cnt ← cnt + gen ([0..K− 1]) ; assume (cnt ≤ K)

⟦$r ← [exp]⟧^p_K  def=  ⟦$r ← [exp]⟧^{p,Read}_K
⟦$r ← x⟧^p_K  def=  ⟦$r ← x⟧^{p,Read}_K
⟦x ← exp⟧^p_K  def=  ⟦x ← exp⟧^{p,Write}_K
⟦[exp′] ← exp⟧^p_K  def=  ⟦[exp′] ← exp⟧^{p,Write}_K

⟦assume exp⟧^p_K  def=  assume exp; 〈control〉^p_K
⟦if exp then i∗ else i∗⟧^p_K  def=  if exp then (⟦i⟧^p_K)∗ else (⟦i⟧^p_K)∗; 〈control〉^p_K
⟦while exp do i∗⟧^p_K  def=  while exp do (⟦i⟧^p_K)∗; 〈control〉^p_K
〈control〉^p_K  def=  ctrl (p) ← ctrl (p) + gen ([0..K−1]); assume (ctrl (p) ≤ K)

⟦term⟧^p_K  def=  term
⟦sync⟧^p_K  def=  ⟦sync⟧^{p,Sync}_K
⟦lwsync⟧^p_K  def=  ⟦lwsync⟧^{p,Lwsync}_K
⟦isync⟧^p_K  def=  ⟦isync⟧^{p,Isync}_K

Figure 10. Translation map ⟦.⟧_K with the address operators and synchronization
instructions. We omit the label of an intermediary instruction when
it is irrelevant. The additional variables are written in blue.
to all processes in the system, i.e., 1 ≤ τsync (p) ≤ K for all p ∈ P. We use T to denote the
set of timestamps for both write events and synchronization events and keep the order ⊑
and the summary operator ⊕ on T as in Section 3.
Our simulation observes the sequence of sync and lwsync events received by a process in
each context. Similar to the write events, the simulation will initially guess and later verify
the summaries of the timestamps of such a sequence. This is done using data structures
γinit and γ. The mapping γinit : P × [1..K] ↦ [P → [1..K]] stores, for a process p ∈ P and a
context k : 1 ≤ k ≤ K, an initial guess γinit (p, k) of the summary of the timestamps of the
sequence of synchronization events propagated to p up to the start of the context k. Starting
from a given initial guess for a given context k, the timestamp is updated successively using
the sequence of synchronization events propagated to p in k. The result is stored using the
mapping γ : P × [1..K] ↦ [P → [1..K]]. More precisely, we initially set the value of γ to
γinit . Each time a new sync or lwsync event e is created by p in a context k, we guess the
timestamp δ of e, and then update γ (p, k) by computing its summary with δ. Thus, given a
point in a context k, γ (p, k) contains the summary of the timestamps of the whole sequence
of synchronization events that have been propagated to p up to that point. At the end of
the simulation, we verify, for each context k : 1 ≤ k < K, that the value of γ at the end of
the context k is identical to the value of γinit for the next context k + 1.
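The final consistency check between guessed and computed summaries can be sketched as follows. This is a minimal illustration, assuming that γ and γinit are stored as Python dictionaries keyed by (process, context) whose values map each process to a context; the function name contexts_chain is hypothetical, and the check covers only the chaining condition described above, not the rest of the verifying process.

```python
def contexts_chain(gamma, gamma_init, procs, K):
    """True iff, for every process p and every context k < K, the summary
    reached at the end of context k equals the guess made for context k + 1."""
    return all(
        gamma[(p, k)] == gamma_init[(p, k + 1)]
        for p in procs
        for k in range(1, K)
    )
```

A run of the simulation in which this check fails corresponds to an inconsistent initial guess and is discarded.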
Furthermore, we use four data structures to keep track of the contexts where the
synchronization events are committed and propagated. The mappings sync : P ↦ [1..K],
lsync : P ↦ [1..K], and isync : P ↦ [1..K] give, for a process p ∈ P, the committing
contexts sync (p), lsync (p), and isync (p) of the latest sync, lwsync, and isync events in p
respectively. We use ack : P ↦ [1..K] to store, for a process p ∈ P , the maximal propagating
context ack (p) of all sync events in p.
We also use max-addr-cR : P ↦ [1..K] to store, for a process p ∈ P, the maximal
committing context max-addr-cR (p) of all read events that provide the values for some
address expressions in some ≺-successor events. This function will be used to simulate the
predicate AddrRdWrCnd in the rule Com-ISync.
Initializing Process. Algorithm 5 shows the initializing process. The process initializes
all data structures that will be used in the simulation program Prog• in a similar way to
Section 3.
Write Instructions. Consider a write instruction i of a process p ∈ P whose stmt (i) is of
the form x← exp or [exp′]← exp. Below we use x to denote the variable in the instruction i
(which may be addressed by the value of exp′). The translation of i is shown in Algorithm 6 and
Algorithm 7. Similar to Section 3, the code simulates an event e executing i in
three parts, namely guessing, checking, and update.
We mention the major changes in the translation of a write instruction in Algorithm 6
and Algorithm 7. In line 8, we check whether WrInitCnd in the rule Init-Write holds by
verifying that the dependencies ≺data and ≺addr are respected. More precisely, we find, for
each register $r that occurs in R (i), the initializing context of the latest read event loading
to $r. We make sure that the initializing context of e is later than the initializing contexts
of all these read events. By definition, the largest of all these contexts is stored in iReg (exp)
if stmt (i) is x ← exp or iReg (exp + exp′) if stmt (i) is [exp′] ← exp. In line 9, we check
whether AllSyncCnd in the rule Init-Write is satisfied. In line 11, we check that ComCnd in
the rule Com-Write is satisfied by verifying that the committing context is larger than (i) the
committing context of all the read events from which the registers in R (i) fetch their values
(to satisfy the dependencies ≺data and ≺addr in a similar manner to that described for the
initialization rule), (ii) the committing contexts of the latest read and write events on x in
p, i.e., cR (p, x) and cW (p, x) (to satisfy the per-location program order ≺poloc), and (iii) the
committing context of the latest aci event in p, i.e., ctrl (p) (to satisfy the control order
Algorithm 5: ⟦initProc⟧_K.
1 for p ∈ P ∧ x ∈ X do
2     iR (p, x)← 1; cR (p, x)← 1; iW (p, x)← 1; cW (p, x)← 1;
3     ν (p, x)← 0; µ (p, x, 1)← 0;
4     for q ∈ P do
5         α (p, x, 1) (q)← 〈2, 1〉;
6 for p ∈ P do
7     sync (p)← 1; lsync (p)← 1; isync (p)← 1;
8     ctrl (p)← 1; ack (p)← 1; max-addr-cR (p)← 1;
9     for q ∈ P do
10        γ (p, 1) (q)← 1;
11 for $r ∈ R do
12    iReg ($r)← 1; cReg ($r)← 1;
13 for p ∈ P ∧ x ∈ X ∧ k ∈ [2..K] do
14    for q ∈ P do
15        αinit (p, x, k) (q)← gen (K^{➀➁});
16    α (p, x, k)← αinit (p, x, k);
17    µinit (p, x, k)← gen (D);
18    µ (p, x, k)← µinit (p, x, k);
19 for p ∈ P ∧ k ∈ [2..K] do
20    for q ∈ P do
21        γinit (p, k) (q)← gen ([1..K]);
22    γ (p, k)← γinit (p, k);
23 for k ∈ [1..K] do
24    active (k)← gen (P);
25 cnt ← 1;
≺ctrl). We note that the checks in lines 9–10 guarantee the predicate AllSyncCnd
in the rule Com-Write. The for-loop of line 12 performs three sanity checks on β in a similar
way to Section 3, except that we add line 17 to guarantee SeenSyncCnd in the rule Prop.
If the write instruction contains the address operator, we update max-addr-cR (p) in lines
27–28 in Algorithm 7 to keep information about the maximal committing context of all read
events that provide the values for the registers in R (exp′).
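The checks on the guessed initializing and committing contexts of a write (lines 6–11 of Algorithm 6) can be sketched in Python as a filter over all guesses. The sketch covers only these lines and leaves out the checks on the timestamp β; the function names and the flat keyword parameters, which stand for the values of the corresponding data structures at p and x, are assumptions for illustration.

```python
def write_guess_ok(p, iW, cW, *, cnt, active, iReg_exp, cReg_exp,
                   ack, lsync, isync, ctrl, cR, old_cW):
    """Lines 6-11 of Algorithm 6 as one Boolean condition on the guess (iW, cW)."""
    return (iW >= cnt                            # line 6: not before the current context
            and active[iW] == p                  # line 7: p is active when initializing
            and iW >= iReg_exp                   # line 8: WrInitCnd
            and iW >= max(ack, lsync, isync)     # line 9: AllSyncCnd
            and cW >= iW                         # line 10: commit after initialization
            and cW >= max(cReg_exp, ctrl, cR, old_cW))  # line 11: ComCnd

def valid_write_guesses(p, K, **state):
    """All pairs (iW, cW) in [1..K] x [1..K] that survive the assume statements."""
    return [(iW, cW)
            for iW in range(1, K + 1)
            for cW in range(1, K + 1)
            if write_guess_ok(p, iW, cW, **state)]
```

Raising lsync from 1 to 2, for instance, removes every guess that initializes the write in context 1, which is how a preceding lwsync delays later writes in the simulation.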
Read Instructions. Consider a read instruction i of a process p ∈ P whose stmt (i) is of
the form $r ← x or $r ← [exp]. Below we use x to denote the variable in the instruction i
(which may be addressed by the value of exp). The translation of i is shown in Algorithm 8
and Algorithm 9. In a similar manner to a write instruction, the code simulates an event e
executing i by performing the three parts: guessing, checking, and update.
We mention the major changes in the translation of a read instruction in Algorithm 8 and
Algorithm 9. In line 8, we check whether the predicate RdInitCnd in the rules Local-Read
Algorithm 6: ⟦x ← exp⟧^{p,Write}_K.
// Guess
1 iW (p, x)← gen ([1..K]);
2 old-cW← cW (p, x);
3 cW (p, x)← gen ([1..K]);
4 for q ∈ P do
5     β (q)← gen (K^{➀➁});
// Check
6 assume (iW (p, x) ≥ cnt);
7 assume (active (iW (p, x)) = p);
8 assume (iW (p, x) ≥ iReg (exp));
9 assume (iW (p, x) ≥ max {ack (p) , lsync (p) , isync (p)});
10 assume (cW (p, x) ≥ iW (p, x));
11 assume (cW (p, x) ≥ max {cReg (exp) , ctrl (p) , cR (p, x) , old-cW});
12 for q ∈ P do
13    if q = p then
14        assume (β (q) ∈ K^{➀} ∧ β (q)↓ = cW (p, x));
15    if q ≠ p then
16        assume (β (q) ∈ K^{➀} =⇒ β (q)↓ ≥ cW (p, x));
17        assume (β (q) ∈ K^{➀} =⇒ β (q)↓ ≥ γ (p, β (p)↓) (q));
18    if β (q) ∈ K^{➀} then
19        assume (α (q, x, β (q)↓) ⊑ β);
20        assume (active (β (q)↓) = p);
21    else assume (∃k : 1 ≤ k ≤ K : β ⊑ α (q, x, k));
// Update
22 for q ∈ P do
23    if β (q) ∈ K^{➀} then
24        α (q, x, β (q)↓)← α (q, x, β (q)↓)⊕ β;
25        µ (q, x, β (q)↓)← exp;
26 ν (p, x)← exp;
and Prop-Read holds by verifying that the dependency ≺addr is respected. (This line is empty
in Algorithm 8.) More precisely, we find, for each register $r that occurs in R (i), the
initializing context of the latest read event loading to $r. We make sure that the initializing
context of e is later than the initializing contexts of all these read events. In line 9, we
check whether AllSyncCnd in the rules Local-Read and Prop-Read is satisfied. In line 13,
we check that ComCnd in the rule Com-Read is satisfied by verifying that the committing
context is larger than (i) the committing context of all the read events from which the
registers in R (i) fetch their values (to satisfy the dependency ≺addr), (ii) the committing
contexts of the latest read and write events on x in p, i.e., cR (p, x) and cW (p, x) (to satisfy
Algorithm 7: ⟦[exp′] ← exp⟧^{p,Write}_K.
// Guess
1 iW (p, [exp′])← gen ([1..K]);
2 old-cW← cW (p, [exp′]);
3 cW (p, [exp′])← gen ([1..K]);
4 for q ∈ P do
5     β (q)← gen (K^{➀➁});
// Check
6 assume (iW (p, [exp′]) ≥ cnt);
7 assume (active (iW (p, [exp′])) = p);
8 assume (iW (p, [exp′]) ≥ iReg (exp + exp′));
9 assume (iW (p, [exp′]) ≥ max {ack (p) , lsync (p) , isync (p)});
10 assume (cW (p, [exp′]) ≥ iW (p, [exp′]));
11 assume (cW (p, [exp′]) ≥ max {cReg (exp + exp′) , ctrl (p) , cR (p, [exp′]) , old-cW});
12 for q ∈ P do
13    if q = p then
14        assume (β (q) ∈ K^{➀} ∧ β (q)↓ = cW (p, [exp′]));
15    if q ≠ p then
16        assume (β (q) ∈ K^{➀} =⇒ β (q)↓ ≥ cW (p, [exp′]));
17        assume (β (q) ∈ K^{➀} =⇒ β (q)↓ ≥ γ (p, β (p)↓) (q));
18    if β (q) ∈ K^{➀} then
19        assume (α (q, [exp′], β (q)↓) ⊑ β);
20        assume (active (β (q)↓) = p);
21    else assume (∃k : 1 ≤ k ≤ K : β ⊑ α (q, [exp′], k));
// Update
22 for q ∈ P do
23    if β (q) ∈ K^{➀} then
24        α (q, [exp′], β (q)↓)← α (q, [exp′], β (q)↓)⊕ β;
25        µ (q, [exp′], β (q)↓)← exp;
26 ν (p, [exp′])← exp;
27 if max-addr-cR (p) < cReg (exp′) then
28    max-addr-cR (p)← cReg (exp′);
the per-location program order ≺poloc), and (iii) the committing context of the latest aci
event in p, i.e., ctrl (p) (to satisfy the control order ≺ctrl). We note that the checks
in lines 9 and 11 guarantee the predicate AllSyncCnd in the rule Com-Read.
If the read instruction contains the address operator, we update max-addr-cR (p) in lines
16–17 in Algorithm 9 to keep information about the maximal committing context of all read
events that provide the values for the registers in R (exp).
Algorithm 8: ⟦$r ← x⟧^{p,Read}_K.
// Guess
1 old-iR← iR (p, x);
2 iR (p, x)← gen ([1..K]); iReg ($r)← iR (p, x);
3 old-cR← cR (p, x);
4 cR (p, x)← gen ([1..K]); cReg ($r)← cR (p, x);
// Check
5 assume (iR (p, x) ≥ cnt);
6 assume (active (iR (p, x)) = p);
7 assume (iR (p, x) ≥ iW (p, x));
8 // Intentionally left blank
9 assume (iR (p, x) ≥ max {ack (p) , lsync (p) , isync (p)});
10 assume (iR (p, x) ≥ cW (p, x) =⇒ α (p, x, old-iR) ⊑ α (p, x, iR (p, x)));
11 assume (cR (p, x) ≥ iR (p, x));
12 assume (active (cR (p, x)) = p);
13 assume (cR (p, x) ≥ max {ctrl (p) , old-cR, cW (p, x)});
// Update
14 if iR (p, x) < cW (p, x) then $r ← ν (p, x);
15 else $r ← µ (p, x, iR (p, x));
Sync Instructions. Consider a sync instruction i of a process p ∈ P whose stmt (i) is of
the form sync. The translation of i is shown in Algorithm 10. The code simulates an event e
running i by encoding the two inference rules Com-Sync and Prop-Sync. In a similar manner
to write and read instructions, the translation scheme for a sync instruction consists of three
parts: guessing, checking, and update.
Guessing. We guess the committing context for the event e, together with its timestamp.
In line 1, we guess the context where the event e will be committed. In the for-loop of line 2,
we guess a timestamp for e and store it in δ. This means that, for each process q ∈ P, we
guess the context where the event e will be propagated to q and we store this guess in δ (q).
Checking. We perform sanity checks on the guessed values in order to verify that they are
consistent with the POWER semantics. Lines 4–8 perform the sanity checks for sync (p).
In lines 4-5, we verify that the committing context for e is not smaller than the current
context. This captures the fact that commitment happens after fetching of e. It also verifies
that commitment happens in a context where p is active. In line 6, we check whether
ComCnd in the rule Com-Sync is satisfied. To do that, we check that the committing context
is larger than the committing context of the latest aci event in p, i.e., ctrl (p) (to satisfy
the control dependency order ≺ctrl). Note that ≺data and ≺poloc (with identical variables)
are not defined for a sync event. In line 7, we check that AllSyncCnd in the rule Com-Sync
is satisfied. In line 8, we check that ComRdWrCnd in the rule Com-Sync is satisfied.
The for-loop of line 9 performs three sanity checks on δ. In line 10, we verify that e is
propagated to p in the same context as the one where it is committed. This is consistent with
the rule Com-Sync which requires that when a sync event is committed then it is immediately
propagated to the committing process. In lines 11–12, we verify that the context where e is
Algorithm 9: ⟦$r ← [exp]⟧^{p,Read}_K.
// Guess
1 old-iR← iR (p, [exp]);
2 iR (p, [exp])← gen ([1..K]); iReg ($r)← iR (p, [exp]);
3 old-cR← cR (p, [exp]);
4 cR (p, [exp])← gen ([1..K]); cReg ($r)← cR (p, [exp]);
// Check
5 assume (iR (p, [exp]) ≥ cnt);
6 assume (active (iR (p, [exp])) = p);
7 assume (iR (p, [exp]) ≥ iW (p, [exp]));
8 assume (iR (p, [exp]) ≥ iReg (exp));
9 assume (iR (p, [exp]) ≥ max {ack (p) , lsync (p) , isync (p)});
10 assume (iR (p, [exp]) ≥ cW (p, [exp]) =⇒ α (p, [exp], old-iR) ⊑ α (p, [exp], iR (p, [exp])));
11 assume (cR (p, [exp]) ≥ iR (p, [exp]));
12 assume (active (cR (p, [exp])) = p);
13 assume (cR (p, [exp]) ≥ max {cReg (exp) , ctrl (p) , old-cR, cW (p, [exp])});
// Update
14 if iR (p, [exp]) < cW (p, [exp]) then $r ← ν (p, [exp]);
15 else $r ← µ (p, [exp], iR (p, [exp]));
16 if max-addr-cR (p) < cReg (exp) then
17    max-addr-cR (p)← cReg (exp);
propagated to a process q (different from p) is later than or equal to the one where e is
committed. This is to be consistent with the fact that a sync event is propagated to other
processes only after it has been committed. In lines 13–14, we check whether SeenWrCnd in
the rule Prop-Sync is satisfied. Moreover, in line 15, we check that the event is propagated
in the contexts where p is active.
Updating. The for-loop of line 16 uses the timestamp guessed above for updating the global
data structure γ. More precisely, when the event e is propagated to a process q, we add δ
to the summary of the timestamps of the sequence of synchronization events propagated to
q up to the current point in the context δ (q). In the loop in line 18, we update ack (p) to
keep track of the maximal propagating context of all sync events of p.
Lwsync Instructions. Consider an lwsync instruction i of a process p ∈ P whose stmt (i)
is of the form lwsync. The translation of i is shown in Algorithm 11. The code simulates
an event e executing i by encoding the two inference rules Com-Sync and Prop-Sync. In a
similar manner to a sync instruction, the translation scheme for an lwsync instruction consists
of three parts: guessing, checking, and update.
Guessing. We guess the committing context for the event e together with its timestamp.
In line 2, we guess the context where the event e will be committed (having stored its old
value in the previous line). In the for-loop of line 3, we guess a timestamp for e and store it
in δ. This means that, for each process q ∈ P, we guess the context where the event e will
be propagated to q and we store this guess in δ (q).
Algorithm 10: ⟦sync⟧^{p,Sync}_K.
// Guess
1 sync (p)← gen ([1..K]);
2 for q ∈ P do
3     δ (q)← gen ([1..K]);
// Check
4 assume (sync (p) ≥ cnt);
5 assume (active (sync (p)) = p);
6 assume (sync (p) ≥ max {ctrl (p)});
7 assume (sync (p) ≥ max {ack (p) , lsync (p) , isync (p)});
8 assume (∀x ∈ X : sync (p) ≥ max {cR (p, x) , cW (p, x)});
9 for q ∈ P do
10    if q = p then assume (δ (q) = sync (p));
11    if q ≠ p then
12        assume (δ (q) ≥ sync (p));
13    for x ∈ X do
14        assume (α (p, x, δ (p)) ⊑ α (q, x, δ (q)));
15    assume (active (δ (q)) = p);
// Update
16 for q ∈ P do
17    γ (q, δ (q))← γ (q, δ (q))⊕ δ;
18 for q ∈ P do
19    if ack (p) < δ (q) then ack (p)← δ (q);
The checking and update parts in the translation for a lwsync instruction are similar to
the corresponding parts in the translation for a sync instruction, except that we do not need
to update ack (p) (that is only used for sync events).
Isync Instructions. Consider an isync instruction i of a process p ∈ P whose stmt (i) is
of the form isync. The translation of i is shown in Algorithm 12. The code simulates an
event e running i by encoding the inference rule Com-ISync. In contrast to the translations
for write, read, sync, and lwsync instructions, the translation scheme for an isync instruction
only consists of two parts: guessing and checking.
Guessing. In line 2, we guess the context where the event e will be committed (having stored
its old value in the previous line).
Checking. We perform sanity checks on the guessed values in order to verify that they are
consistent with the POWER semantics. Lines 3-7 perform the sanity checks for isync (p).
In lines 3-4, we verify that the committing context for e is not smaller than the current
context. This captures the fact that commitment happens after fetching of e. It also verifies
that commitment happens in a context where p is active. In line 5, we check whether ComCnd
in the rule Com-ISync is satisfied. To do that, we check that the committing context is larger
than the committing context of the latest aci event in p, i.e., ctrl (p) (to satisfy the control
order ≺ctrl). Note that ≺data and ≺poloc (with identical variables) are not defined for an
isync event. In line 6, we check that AllSyncCnd in the rule Com-ISync is satisfied. In line
7, we check that AddrRdWrCnd in the rule Com-ISync is satisfied.
Algorithm 11: ⟦lwsync⟧_K^{p,Lwsync}.
    // Guess
 1  old-lsync ← lsync (p);
 2  lsync (p) ← gen ([1..K]);
 3  for q ∈ P do
 4      δ (q) ← gen ([1..K]);
    // Check
 5  assume (lsync (p) ≥ cnt);
 6  assume (active (lsync (p)) = p);
 7  assume (lsync (p) ≥ max {ctrl (p)});
 8  assume (lsync (p) ≥ max {ack (p), old-lsync, isync (p)});
 9  assume (∀x ∈ X : lsync (p) ≥ max {cR (p, x), cW (p, x)});
10  for q ∈ P do
11      if q = p then assume (δ (q) = lsync (p));
12      if q ≠ p then
13          assume (δ (q) ≥ lsync (p));
14          for x ∈ X do
15              assume (α (p, x, δ (p)) ⊑ α (q, x, δ (q)));
16          assume (active (δ (q)) = p);
    // Update
17  for q ∈ P do
18      γ (q, δ (q)) ← γ (q, δ (q)) ⊕ δ;
Algorithm 12: ⟦isync⟧_K^{p,Isync}.
    // Guess
 1  old-isync ← isync (p);
 2  isync (p) ← gen ([1..K]);
    // Check
 3  assume (isync (p) ≥ cnt);
 4  assume (active (isync (p)) = p);
 5  assume (isync (p) ≥ max {ctrl (p)});
 6  assume (isync (p) ≥ max {ack (p), lsync (p), old-isync});
 7  assume (isync (p) ≥ max-addr-cR (p));
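To make the correspondence between the assume statements of Algorithm 12 and the conditions of Com-ISync explicit, the following Python sketch evaluates each check separately for a candidate committing context k. It is a minimal illustration: the function name isync_conditions, the representation of ctrl (p), ack (p), lsync (p), old-isync and max-addr-cR (p) as plain integers, and the example values are assumptions made here, not part of the translation itself.

```python
def isync_conditions(k, p, cnt, active, ctrl, ack, lsync, old_isync, max_addr_cR):
    """Evaluate each assume of Algorithm 12 for a guessed committing context k."""
    return {
        "commit after fetch (line 3)": k >= cnt,
        "p is active (line 4)": active[k] == p,
        "ComCnd (line 5)": k >= ctrl,
        "AllSyncCnd (line 6)": k >= max(ack, lsync, old_isync),
        "AddrRdWrCnd (line 7)": k >= max_addr_cR,
    }


# Hypothetical state: four contexts, p active in contexts 1 and 3.
state = dict(p="p", cnt=2, active={1: "p", 2: "q", 3: "p", 4: "q"},
             ctrl=1, ack=1, lsync=3, old_isync=1, max_addr_cR=2)

# A guess is admissible only if every condition holds.
admissible = [k for k in range(1, 5) if all(isync_conditions(k, **state).values())]
print(admissible)
```

With these values only context 3 is admissible: contexts 2 and 4 belong to q, and context 1 precedes both the current context and the last lwsync of p.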
Verifying Process. In Algorithm 13, the verifying process makes sure that the updated
value α of the timestamp of write events for each pair of process and variable at the end of
a given context is equal to the guessed value αinit at the start of the next context. It also
makes sure that the updated value γ of the timestamp of synchronization events for each
process at the end of a given context is equal to the guessed value γinit at the start of the
next context. Moreover, the verifying process performs the corresponding test for the values
written to the variables (by comparing µ and µinit). Finally, it checks whether we reach an
error label λ (given in the reachability problem) or not.
Algorithm 13: ⟦verProc⟧_K.
 1  for p ∈ P ∧ x ∈ X ∧ k ∈ [1..K − 1] do
 2      assume (α (p, x, k) = αinit (p, x, k + 1));
 3      assume (µ (p, x, k) = µinit (p, x, k + 1));
 4  for p ∈ P ∧ k ∈ [1..K − 1] do
 5      assume (γ (p, k) = γinit (p, k + 1));
 6  if λ is reachable then error;
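The consistency test of lines 1-5 of Algorithm 13 amounts to comparing, for each context k < K, the values left at the end of context k with the values guessed for the start of context k + 1. A minimal Python sketch follows, assuming (hypothetically) that α, µ and γ and their guessed counterparts are stored in dictionaries keyed by tuples and that their values can be compared with plain equality; the function name contexts_consistent is illustrative.

```python
def contexts_consistent(procs, variables, K,
                        alpha, alpha_init, mu, mu_init, gamma, gamma_init):
    """Return True iff every end-of-context value equals the value guessed
    for the start of the next context (lines 1-5 of Algorithm 13)."""
    for k in range(1, K):                                  # k in [1..K-1]
        for p in procs:
            for x in variables:
                if alpha[p, x, k] != alpha_init[p, x, k + 1]:   # line 2
                    return False
                if mu[p, x, k] != mu_init[p, x, k + 1]:         # line 3
                    return False
            if gamma[p, k] != gamma_init[p, k + 1]:             # line 5
                return False
    return True


# Hypothetical run with one process, one variable and two contexts.
alpha = {("p", "x", 1): (1, 2)}
mu = {("p", "x", 1): 7}
gamma = {("p", 1): (2,)}

good = contexts_consistent(["p"], ["x"], 2,
                           alpha, {("p", "x", 2): (1, 2)},
                           mu, {("p", "x", 2): 7},
                           gamma, {("p", 2): (2,)})
bad = contexts_consistent(["p"], ["x"], 2,
                          alpha, {("p", "x", 2): (1, 2)},
                          mu, {("p", "x", 2): 0},   # wrong guess for the written value
                          gamma, {("p", 2): (2,)})
print(good, bad)
```

In the translation these comparisons are assume statements, so a run with an inconsistent guess is discarded rather than reported.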
6. Experimental Results
In order to evaluate the efficiency of our approach, we have implemented a context-bounded
model checker for programs under POWER, called Power2SC. We use CBMC version 5.1 [21]
as the backend tool. However, observe that our code-to-code translation can be implemented
on top of any backend tool that provides safety verification of concurrent programs
running under the SC semantics.
C/Pthreads Benchmarks. In the following, we present the evaluation of Power2SC on
28 C/pthreads benchmarks collected from Goto-instrument [12], Nidhugg [8], Memorax [7],
and the SV-COMP17 benchmark suite [1]. These are widely used medium-sized benchmarks
that are used by many tools for analyzing concurrent programs running under weak memory
models (e.g. [32, 15, 19, 13, 54, 2, 50, 17, 3, 11, 28, 10, 6, 18]).
We divide our results into two sets. The first set concerns unsafe programs while the second
set concerns safe ones. For both sets, we compare results obtained from Power2SC to the
ones obtained from Goto-instrument and Nidhugg, which are, to the best of our knowledge,
the only two tools supporting C/pthreads programs under POWER (see footnote 2). All experiments
were run on a machine equipped with a 2.4 GHz Intel x86-32 Core2 processor and 4 GB RAM.
Table 3a shows that Power2SC performs well in detecting bugs compared to the other
tools for most of the unsafe examples. We observe that Power2SC manages to find all the
errors using at most 6 contexts, while Nidhugg and Goto-instrument time out before reporting
the errors for several examples. This also confirms that few context switches are sufficient to
find bugs. Table 3b demonstrates that our approach is also effective when we run safe programs.
Power2SC manages to run most of the examples (except Dijkstra and Lamport) using the
same context bounds as in the case of their respective unsafe examples. Nidhugg and
Goto-instrument time out for several examples; note, however, that they do not impose any
bound on the number of context switches, whereas Power2SC does.
Footnote 2: CBMC previously supported POWER [13], but withdrew support in later versions.
Table 3. Comparing Power2SC with Goto-instrument and Nidhugg on two sets of
benchmarks: (a) unsafe and (b) safe (with manually inserted synchronizations).
The LB column indicates whether the tools were instructed to unroll loops up to
a certain bound. The CB column gives the context bound for Power2SC. The program
size is the number of code lines. A t/o entry means that the tool failed to
complete within 1800 seconds. The best running time (in seconds) for each
benchmark is given in bold font.

(a) Unsafe benchmarks
Program/size          LB   Goto-instrument   Nidhugg   Power2SC   Power2SC
                           time              time      time       CB
Bakery/76 [7]         8    226               t/o       1          3
Burns/74 [7]          8    t/o               t/o       1          3
Dekker/82 [1]         8    t/o               t/o       1          2
Sim Dekker/69 [7]     8    12                t/o       1          2
Dijkstra/82 [7]       8    t/o               t/o       5          3
Szymanski/83 [1]      8    t/o               t/o       1          4
Fib bench 0/36 [1]    -    2                 1101      6          6
Lamport/109 [1]       8    t/o               1         1          3
Peterson/76 [1]       8    25                1056      1          3
Peterson 3/96 [7]     8    t/o               1         3          4
Pgsql/69 [12]         8    1079              1         1          2
Pgsql bnd/71 [8]      -    t/o               1         1          2
Tbar 2/75 [7]         8    16                1         1          3
Tbar 3/94 [7]         8    104               1         1          3

(b) Safe benchmarks
Program/size          LB   Goto-instrument   Nidhugg   Power2SC   Power2SC
                           time              time      time       CB
Bakery/85 [7]         8    t/o               t/o       70         3
Burns/79 [7]          8    t/o               t/o       1018       3
Dekker/88 [1]         8    t/o               t/o       1158       2
Sim Dekker/73 [7]     8    209               t/o       14         2
Dijkstra/88 [7]       8    t/o               t/o       t/o        3
Szymanski/93 [1]      8    t/o               t/o       89         4
Fib bench 1/36 [1]    -    9                 t/o       5          6
Lamport/119 [1]       8    t/o               t/o       t/o        3
Peterson/84 [1]       8    928               t/o       7          3
Peterson 3/111 [7]    8    t/o               t/o       348        4
Pgsql/73 [12]         8    1522              2         38         2
Pgsql bnd/75 [8]      -    t/o               t/o       10         2
Tbar 2/80 [7]         8    t/o               332       29         3
Tbar 3/103 [7]        8    t/o               t/o       138        3
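The comparison drawn from Table 3 can be made concrete by counting the t/o entries per tool. The Python sketch below simply transcribes the three time columns of Table 3 (None stands for t/o); the names TOOLS, UNSAFE, SAFE and timeouts are illustrative and are not part of Power2SC.

```python
TOOLS = ("Goto-instrument", "Nidhugg", "Power2SC")

# (program, Goto-instrument time, Nidhugg time, Power2SC time); None means t/o.
UNSAFE = [
    ("Bakery", 226, None, 1), ("Burns", None, None, 1),
    ("Dekker", None, None, 1), ("Sim Dekker", 12, None, 1),
    ("Dijkstra", None, None, 5), ("Szymanski", None, None, 1),
    ("Fib bench 0", 2, 1101, 6), ("Lamport", None, 1, 1),
    ("Peterson", 25, 1056, 1), ("Peterson 3", None, 1, 3),
    ("Pgsql", 1079, 1, 1), ("Pgsql bnd", None, 1, 1),
    ("Tbar 2", 16, 1, 1), ("Tbar 3", 104, 1, 1),
]
SAFE = [
    ("Bakery", None, None, 70), ("Burns", None, None, 1018),
    ("Dekker", None, None, 1158), ("Sim Dekker", 209, None, 14),
    ("Dijkstra", None, None, None), ("Szymanski", None, None, 89),
    ("Fib bench 1", 9, None, 5), ("Lamport", None, None, None),
    ("Peterson", 928, None, 7), ("Peterson 3", None, None, 348),
    ("Pgsql", 1522, 2, 38), ("Pgsql bnd", None, None, 10),
    ("Tbar 2", None, 332, 29), ("Tbar 3", None, None, 138),
]


def timeouts(rows):
    """Count, for each tool, the benchmarks that did not finish within 1800 s."""
    return {tool: sum(1 for row in rows if row[i + 1] is None)
            for i, tool in enumerate(TOOLS)}


print("unsafe:", timeouts(UNSAFE))
print("safe:  ", timeouts(SAFE))
```

On the 14 unsafe benchmarks this gives 7 timeouts for Goto-instrument, 6 for Nidhugg and none for Power2SC; on the 14 safe benchmarks it gives 10, 12 and 2, respectively.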
Litmus Tests. We have also tested the performance of Power2SC with respect to the
verification of small litmus tests. Power2SC manages to successfully run all 913 litmus tests
published in [46]. Furthermore, the results returned by Power2SC match those
returned by the tool PPCMEM [46] in all the litmus tests.
7. Conclusions and Future Work
We have presented a method for solving the K-bounded reachability problem for concurrent
programs running under the POWER semantics. To that end, we have presented a
code-to-code scheme that translates the input program into an output program whose size is
polynomial in the size of the input program, and that reaches the same set of process
states when run under the classical SC semantics. On the theoretical side, this shows the
decidability of the K-bounded reachability problem under POWER for finite-state programs.
On the practical side, our tool implementation demonstrates that the method is efficient
both in performance and in the ability to detect errors.
We aim at extending our framework to cover other models such as ARM [26, 43] and
C11 [16]. We also plan to consider other under-approximation techniques, and in particular
to consider notions of context that are different from the one we use in this paper.
References
[1] SV-COMP17 benchmark suite. https://sv-comp.sosy-lab.org/2017/benchmarks.php, 2017.
[2] Parosh Aziz Abdulla, Stavros Aronis, Mohamed Faouzi Atig, Bengt Jonsson, Carl Leonardsson, and
Konstantinos F. Sagonas. Stateless model checking for TSO and PSO. In TACAS, volume 9035 of LNCS,
pages 353–367. Springer, 2015.
[3] Parosh Aziz Abdulla, Mohamed Faouzi Atig, Ahmed Bouajjani, and Tuan Phong Ngo. The benefits of
duality in verifying concurrent programs under TSO. In CONCUR, volume 59 of LIPIcs, pages 5:1–5:15.
Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 2016.
[4] Parosh Aziz Abdulla, Mohamed Faouzi Atig, Ahmed Bouajjani, and Tuan Phong Ngo. Context-bounded
analysis for POWER. In TACAS 2017, pages 56–74, 2017.
[5] Parosh Aziz Abdulla, Mohamed Faouzi Atig, Ahmed Bouajjani, and Tuan Phong Ngo. A Load-Buffer
Semantics for Total Store Ordering. Logical Methods in Computer Science, Volume 14, Issue 1, January
2018.
[6] Parosh Aziz Abdulla, Mohamed Faouzi Atig, Yu-Fang Chen, Carl Leonardsson, and Ahmed Rezine.
Automatic fence insertion in integer programs via predicate abstraction. In SAS 2012, pages 164–180,
2012.
[7] Parosh Aziz Abdulla, Mohamed Faouzi Atig, Yu-Fang Chen, Carl Leonardsson, and Ahmed Rezine.
Counter-example guided fence insertion under TSO. In TACAS 2012, volume 7214 of LNCS, pages
204–219. Springer, 2012.
[8] Parosh Aziz Abdulla, Mohamed Faouzi Atig, Bengt Jonsson, and Carl Leonardsson. Stateless model
checking for POWER. In CAV, volume 9780 of LNCS, pages 134–156. Springer, 2016.
[9] Parosh Aziz Abdulla, Mohamed Faouzi Atig, Bengt Jonsson, and Tuan Phong Ngo. Optimal stateless
model checking under the release-acquire semantics. PACMPL, 2(OOPSLA):135:1–135:29, 2018.
[10] Parosh Aziz Abdulla, Mohamed Faouzi Atig, Magnus Lång, and Tuan Phong Ngo. Precise and sound
automatic fence insertion procedure under PSO. In NETYS 2015, pages 32–47, 2015.
[11] Parosh Aziz Abdulla, Mohamed Faouzi Atig, and Ngo Tuan Phong. The best of both worlds: Trading
efficiency and optimality in fence insertion for TSO. In ESOP, volume 9032 of LNCS, pages 308–332.
Springer, 2015.
[12] J. Alglave, D. Kroening, V. Nimal, and M. Tautschnig. Software verification for weak memory via
program transformation. In ESOP, volume 7792 of LNCS, pages 512–532. Springer, 2013.
[13] J. Alglave, D. Kroening, and M. Tautschnig. Partial orders for efficient bounded model checking of
concurrent software. In CAV, volume 8044 of LNCS, pages 141–157, 2013.
[14] Jade Alglave, Luc Maranget, and Michael Tautschnig. Herding cats: Modelling, simulation, testing, and
data mining for weak memory. ACM TOPLAS, 36(2):7:1–7:74, 2014.
[15] M. F. Atig, A. Bouajjani, and G. Parlato. Getting rid of store-buffers in TSO analysis. In CAV, volume
6806 of LNCS, pages 99–115. Springer, 2011.
[16] Mark Batty, Scott Owens, Susmit Sarkar, Peter Sewell, and Tjark Weber. Mathematizing C++
concurrency. In Proceedings of the 38th ACM SIGPLAN-SIGACT Symposium on Principles of Programming
Languages, POPL 2011, Austin, TX, USA, January 26-28, 2011, pages 55–66, 2011.
[17] Ahmed Bouajjani, Egor Derevenetc, and Roland Meyer. Checking and enforcing robustness against TSO.
In ESOP, volume 7792 of LNCS, pages 533–553. Springer, 2013.
[18] S. Burckhardt, R. Alur, and M. M. K. Martin. CheckFence: checking consistency of concurrent data
types on relaxed memory models. In PLDI, pages 12–21. ACM, 2007.
[19] Sebastian Burckhardt and Madanlal Musuvathi. Effective program verification for relaxed memory
models. In CAV, volume 5123 of LNCS, pages 107–120. Springer, 2008.
[20] Jacob Burnim, Koushik Sen, and Christos Stergiou. Testing concurrent programs on relaxed memory
models. In ISSTA, pages 122–132. ACM, 2011.
[21] Edmund M. Clarke, Daniel Kroening, and Flavio Lerda. A tool for checking ANSI-C programs. In
TACAS, volume 2988 of LNCS, pages 168–176. Springer, 2004.
[22] A. Marian Dan, Y. Meshman, M. T. Vechev, and E. Yahav. Predicate abstraction for relaxed memory
models. In SAS, volume 7935 of LNCS, pages 84–104. Springer, 2013.
[23] Andrei Dan, Yuri Meshman, Martin Vechev, and Eran Yahav. Effective abstractions for verification
under relaxed memory models. Computer Languages, Systems and Structures, 47, Part 1:62–76, 2017.
[24] Brian Demsky and Patrick Lam. Satcheck: Sat-directed stateless model checking for SC and TSO. In
OOPSLA 2015, pages 20–36. ACM, 2015.
[25] Egor Derevenetc and Roland Meyer. Robustness against Power is PSpace-complete. In ICALP (2),
volume 8573 of LNCS, pages 158–170. Springer, 2014.
[26] Shaked Flur, Kathryn E. Gray, Christopher Pulte, Susmit Sarkar, Ali Sezgin, Luc Maranget, Will
Deacon, and Peter Sewell. Modelling the ARMv8 architecture, operationally: concurrency and ISA. In
POPL 2016, pages 608–621, 2016.
[27] Shiyou Huang and Jeff Huang. Maximal causality reduction for TSO and PSO. In OOPSLA 2016, pages
447–461, 2016.
[28] Shiyou Huang and Jeff Huang. Maximal causality reduction for TSO and PSO. In OOPSLA 2016, pages
447–461, 2016.
[29] Omar Inverso, Ermenegildo Tomasco, Bernd Fischer, Salvatore La Torre, and Gennaro Parlato. Bounded
model checking of multi-threaded C programs via lazy sequentialization. In CAV 2014, pages 585–602,
2014.
[30] Michalis Kokologiannakis, Ori Lahav, Konstantinos Sagonas, and Viktor Vafeiadis. Effective stateless
model checking for C/C++ concurrency. PACMPL 2018, 2(POPL):17:1–17:32, 2018.
[31] Michael Kuperstein, Martin T. Vechev, and Eran Yahav. Automatic inference of memory fences. In
FMCAD, pages 111–119. IEEE, 2010.
[32] Michael Kuperstein, Martin T. Vechev, and Eran Yahav. Partial-coherence abstractions for relaxed
memory models. In PLDI, pages 187–198. ACM, 2011.
[33] Salvatore La Torre, P. Madhusudan, and Gennaro Parlato. Reducing context-bounded concurrent
reachability to sequential reachability. In CAV, volume 5643 of LNCS, pages 477–492. Springer, 2009.
[34] Ori Lahav and Viktor Vafeiadis. Explaining relaxed memory models with program transformations. In
FM 2016, pages 479–495, 2016.
[35] Akash Lal and Thomas W. Reps. Reducing concurrent analysis under a context bound to sequential
analysis. FMSD, 35(1):73–97, 2009.
[36] L. Lamport. How to make a multiprocessor computer that correctly executes multiprocess programs.
IEEE Trans. Comp., C-28(9), 1979.
[37] Feng Liu, Nayden Nedev, Nedyalko Prisadnikov, Martin T. Vechev, and Eran Yahav. Dynamic synthesis
for relaxed memory models. In PLDI 2012, pages 429–440. ACM, 2012.
[38] Sela Mador-Haim, Luc Maranget, Susmit Sarkar, Kayvan Memarian, Jade Alglave, Scott Owens, Rajeev
Alur, Milo M. K. Martin, Peter Sewell, and Derek Williams. An axiomatic memory model for POWER
multiprocessors. In CAV, volume 7358, pages 495–512. Springer, 2012.
[39] Madanlal Musuvathi and Shaz Qadeer. Iterative context bounding for systematic testing of multithreaded
programs. In PLDI, pages 446–455. ACM, 2007.
[40] Truc L. Nguyen, Bernd Fischer, Salvatore La Torre, and Gennaro Parlato. Lazy sequentialization for the
safety verification of unbounded concurrent programs. In ATVA 2016, pages 174–191, 2016.
[41] Brian Norris and Brian Demsky. A practical approach for model checking C/C++11 code. TOPLAS
2016, 38(3):10:1–10:51, 2016.
[42] Scott Owens, Susmit Sarkar, and Peter Sewell. A better x86 memory model: x86-TSO. In TPHOLs,
volume 5674 of LNCS, pages 391–407. Springer, 2009.
[43] Christopher Pulte, Shaked Flur, Will Deacon, Jon French, Susmit Sarkar, and Peter Sewell.
Simplifying ARM concurrency: multicopy-atomic axiomatic and operational models for ARMv8. PACMPL,
2(POPL):19:1–19:29, 2018.
[44] Shaz Qadeer and Jakob Rehof. Context-bounded model checking of concurrent software. In TACAS,
volume 3440 of LNCS, pages 93–107. Springer, 2005.
[45] Susmit Sarkar, Kayvan Memarian, Scott Owens, Mark Batty, Peter Sewell, Luc Maranget, Jade Alglave,
and Derek Williams. Synchronising C/C++ and POWER. In PLDI 2012, Beijing, China - June 11 - 16,
2012, pages 311–322, 2012.
[46] Susmit Sarkar, Peter Sewell, Jade Alglave, Luc Maranget, and Derek Williams. Understanding POWER
multiprocessors. In PLDI, pages 175–186. ACM, 2011.
[47] P. Sewell, S. Sarkar, S. Owens, F. Z. Nardelli, and M. O. Myreen. x86-TSO: A rigorous and usable
programmer’s model for x86 multiprocessors. CACM, 53, 2010.
[48] Ermenegildo Tomasco, Omar Inverso, Bernd Fischer, Salvatore La Torre, and Gennaro Parlato. Verifying
concurrent programs by memory unwinding. In TACAS 2015, pages 551–565, 2015.
[49] Ermenegildo Tomasco, Truc Nguyen Lam, Bernd Fischer, Salvatore La Torre, and Gennaro Parlato.
Embedding weak memory models within eager sequentialization. October 2016.
[50] Ermenegildo Tomasco, Truc Nguyen Lam, Omar Inverso, Bernd Fischer, Salvatore La Torre, and Gennaro
Parlato. Lazy sequentialization for TSO and PSO via shared memory abstractions. In FMCAD 2016, pages
193–200, 2016.
[51] Ermenegildo Tomasco, Truc Lam Nguyen, Bernd Fischer, Salvatore La Torre, and Gennaro Parlato.
Using shared memory abstractions to design eager sequentializations for weak memory models. In SEFM
2017, pages 185–202, 2017.
[52] Oleg Travkin and Heike Wehrheim. Verification of concurrent programs on weak memory models. In
ICTAC 2016, pages 3–24, 2016.
[53] Y. Yang, G. Gopalakrishnan, G. Lindstrom, and K. Slind. Nemos: A framework for axiomatic and
executable specifications of memory consistency models. In IPDPS. IEEE, 2004.
[54] N. Zhang, M. Kusano, and C. Wang. Dynamic partial order reduction for relaxed memory models. In
PLDI, pages 250–259. ACM, 2015.
This work is licensed under the Creative Commons Attribution License. To view a copy of this
license, visit https://creativecommons.org/licenses/by/4.0/ or send a letter to Creative
Commons, 171 Second St, Suite 300, San Francisco, CA 94105, USA, or Eisenacher Strasse
2, 10777 Berlin, Germany
