The Decidability of Verification under Promising 2.0 by Abdulla, Parosh Aziz et al.
The Decidability of Verification under Promising 2.0
PAROSH AZIZ ABDULLA, Uppsala University, Sweden
MOHAMED FAOUZI ATIG, Uppsala University, Sweden
ADWAIT GODBOLE, IIT Bombay, India
SHANKARA NARAYANAN KRISHNA, IIT Bombay, India
VIKTOR VAFEIADIS, MPI-SWS, Germany
In PLDI’20, Kang et al. introduced the promising semantics (PS 2.0) of C++ concurrency, which captures
most of the common program transformations while satisfying the DRF guarantee. The reachability problem
for finite-state programs under PS 2.0 with only release-acquire accesses (PS 2.0-ra) is already known to be
undecidable. Therefore, we address, in this paper, the reachability problem for programs running under PS 2.0
with relaxed accesses (PS 2.0-rlx) together with promises. We show that this problem is undecidable even in
the case where the input program has finite state. Given this undecidability result, we consider the fragment
of PS 2.0-rlx with a bounded number of promises. We show that under this restriction, reachability is
decidable, albeit very expensive: it is non-primitive recursive. Given this high complexity for PS 2.0-rlx with
a bounded number of promises and the undecidability result for PS 2.0-ra, we consider a bounded version of
the reachability problem. To this end, we bound both the number of promises and the “view-switches”, i.e., the
number of times the processes may switch their local views of the global memory. We provide a code-to-code
translation from an input program under PS 2.0, with relaxed and release-acquire memory accesses along
with promises, to a program under SC. This leads to a reduction of the bounded reachability problem under
PS 2.0 to the bounded context-switching problem under SC. We have implemented a prototype tool and tested
it on a set of benchmarks, demonstrating that many bugs in programs can be found using a small bound.
Additional Key Words and Phrases: Model-Checking, Weak Memory Models, Promising Semantics
1 INTRODUCTION
An important long-standing open problem in PL research has been to define a weak memory model
that captures the semantics of concurrent memory accesses in languages like Java and C/C++. A
model is considered good if it can be implemented efficiently (i.e., if it supports all usual compiler
optimizations and its accesses are compiled to plain x86/ARM/Power/RISCV accesses), and is easy
to reason about. After many attempts at solving this problem (e.g., [Batty et al. 2011; Crary and
Sullivan 2015; Jeffrey and Riely 2019; Lahav et al. 2017; Manson et al. 2005; Pichon-Pharabod and
Sewell 2016; Zhang and Feng 2013]), a breakthrough was achieved by Kang et al. [Kang et al. 2017],
who introduced the promising semantics. This was the first model that supported basic invariant
reasoning, the DRF guarantee, and even a non-trivial program logic [Svendsen et al. 2018].
In the promising semantics, the memory is modeled as a set of timestamped messages, each
corresponding to a write made by the program. Each process/thread records its own view of the
memory—i.e., the latest timestamp for each memory location that it is aware of. A message has
the form (x ,v, (f , t],V ) where x is a location, v a value to be stored for x , (f , t] is the timestamp
interval corresponding to the write and V is the local view of the process who made the write to x .
When reading from memory, a process can either return the value stored at the timestamp in its
view or advance its view to some larger timestamp and read from that message. When a process p
writes to memory location x , a new message with a timestamp larger than p’s view of x is created,
and p’s view is advanced to include the new message. In addition, in order to allow load-store
reorderings, a process is allowed to promise a certain write in the future. A promise is also added
Authors’ addresses: Parosh Aziz Abdulla, Uppsala University, Uppsala, Sweden; Mohamed Faouzi Atig, Uppsala University,
Uppsala, Sweden; Adwait Godbole, IIT Bombay, Mumbai, India; Shankara Narayanan Krishna, IIT Bombay, Mumbai, India;
Viktor Vafeiadis, MPI-SWS, Saarland Informatics Campus (SIC), Kaiserslautern and Saarbrücken, Germany.
arXiv:2007.09944v2 [cs.PL] 21 Jul 2020
as a message in the memory, except that the local view of the process is not updated using the
timestamp interval in the message. This is done only when the promise is eventually fulfilled.
A consistency check is used to ensure that every promised message can be certified (i.e., made
fulfillable) by executing that process on its own. Furthermore, this should hold from any future
memory (i.e., from any extension of the memory with additional messages). The quantification
prevents deadlocks (i.e., processes from making promises they are not able to fulfil). The promising
semantics generally allows program executions to contain unboundedly many concurrent promised
messages, provided that all of them can be certified. As one can immediately see, this is a fairly
complex model, and beyond its support for some basic reasoning patterns, it is not at all obvious
whether it is easy to reason about concurrent programs running under this model. Furthermore,
the unbounded number of future memories, that need to be checked, makes the verification of even
simple programs practically infeasible. Moreover, a number of transformations based on global
value range analysis as well as register promotion were not supported in [Kang et al. 2017].
To address the above concerns, a new version of the promising semantics PS 2.0 [Lee et al.
2020] has been proposed, by redesigning key components of the promising semantics [Kang et al.
2017]. Mainly, PS 2.0 supports register promotion and global value range analysis, while capturing
all features (thread local optimizations, DRF guarantees, hardware mappings) of the promising
semantics of [Kang et al. 2017]. PS 2.0 also simplifies the consistency check: instead of checking
promise fulfilment from all future memories, PS 2.0 checks for promise fulfilment only from a
specially crafted extension of the current memory called the capped memory. PS 2.0 also introduces
the notion of reservations, which allow a process to secure a timestamp interval in order to
perform a future atomic read-modify-write instruction. The reservation blocks any other message
from using that timestamp interval. Reservations allow register promotion.
The wide range of features of PS 2.0, namely two memory access modes, relaxed (rlx) and
release-acquire (ra), along with promises, reservations and subsequent certification, makes PS 2.0
a very complex model. While the PS 2.0 semantics is a breakthrough contribution, a natural and
fundamental question is to investigate the verification of concurrent programs under PS 2.0. For
that, investigating the decidability of verification problems as well as defining efficient analysis
techniques are two extremely important problems.
One of the problems addressed in this paper is to investigate the decidability of the reachability
problem for PS 2.0. Let PS 2.0-rlx and PS 2.0-ra represent respectively, the fragment of PS 2.0
allowing only relaxed (rlx) and release-acquire (ra) memory accesses. The reachability with only
ra accesses has been shown to be undecidable [Abdulla et al. 2019], even without the features of
promises and reservations. That leaves only the fragment PS 2.0-rlx of PS 2.0 for investigation. We
show that if an unbounded number of promises is allowed, the reachability problem is undecidable in
PS 2.0-rlx, while it becomes decidable if we bound the number of promises at any time (however,
the total number of promises made within a run can be unbounded). Our undecidability is obtained
with just 2 threads, with an execution where the number of context switches between the two
processes is three, where a context is a computation segment in which one process is active. The
proof of decidability is done by proposing a new memory model with higher order words, LoHoW,
and showing the equivalence of PS 2.0-rlx and LoHoW. Under the bounded promises assumption,
we use the decidability of the coverability problem of well structured transition systems (WSTS)
[Abdulla and Jonsson 1996; Finkel and Schnoebelen 2001] to show that the reachability problem
for LoHoW with bounded number of promises is decidable.
Given this high complexity for PS 2.0-rlx with a bounded number of promises and the undecidability
result for PS 2.0-ra [Abdulla et al. 2019], we consider a bounded version of the reachability problem.
To this end, we propose a parametric under-approximation in the spirit of context bounding
[Abdulla et al. 2019, 2017; Atig et al. 2011; La Torre et al. 2009; Lal and Reps 2009; Musuvathi and
Qadeer 2007; Norris and Demsky 2016; Qadeer and Rehof 2005]. The bounding concept chosen
for concurrent programs depends on aspects related to the interactions between the processes.
In the case of SC programs, context bounding has been shown experimentally to have extensive
behaviour coverage for bug detection [Musuvathi and Qadeer 2007; Qadeer and Rehof 2005]. A
context in the SC setting is a computation segment where only one process is active. The concept
of context bounding has been extended for weak memory models. For instance, in TSO, the notion
of context is extended to one where all updates to the main memory are done only from the buffer
of the active thread [Atig et al. 2011]. In the case of RA [Abdulla et al. 2019], context bounding was
extended to view bounding, using the notion of view-switching messages. Since PS 2.0 subsumes
RA, we propose a bounding notion that extends the view bounding proposed in [Abdulla et al. 2019].
Using this new bounding notion, we propose a source to source translation from programs under
PS 2.0 to context-bounded executions of the transformed program in SC. The main challenge in the
code-to-code translation of [Abdulla et al. 2019] was to keep track of the causality between different
variables. In our case, the challenge is fundamentally different and is to provide a procedure that
(i) handles the different memory accesses rlx and ra, (ii) guesses the promises and reservations
non-deterministically, and (iii) verifies that each promise so guessed is fulfilled using
the capped memory. This reduction is implemented in a tool, called PS2SC. Our experimental
results demonstrate the effectiveness of our approach. We exhibit cases where hard-to-find bugs
are detectable using a small view-bound K . Our tool displays resilience to trivial changes in the
position of bugs and the order of processes.
Related Work. The decidability of the verification problems for programs running under weak
memory models has been addressed for TSO [Atig et al. 2010], PS 2.0-ra [Abdulla et al. 2019],
Power [Abdulla et al. 2020], and for a subclass of PS 2.0-ra [Lahav and Boker 2020]. To the best
of our knowledge, this is the first time that this problem is investigated for PS 2.0-rlx, and PS2SC
is the first tool for automated verification of programs under PS 2.0, which also works for the
promising semantics [Kang et al. 2017]. Most of the existing work concerns the development of
stateless model checking (SMC), coupled with (dynamic) partial order reduction techniques (e.g.,
[Abdulla et al. 2018; Kokologiannakis et al. 2017, 2019; Norris and Demsky 2013, 2016]) and does not
handle promises. Context-bounding has been proposed in [Qadeer and Rehof 2005] for programs
running under SC. This work has been extended in different directions and has led to efficient and
scalable techniques for the analysis of concurrent programs (see e.g., [Emmi et al. 2011; La Torre
et al. 2008, 2009, 2010; Lal and Reps 2009; Musuvathi and Qadeer 2007]). In the context of weak
memory models, context-bounded analysis has only been proposed for programs running under
TSO/PSO in [Atig et al. 2011; Tomasco et al. 2017] and under POWER in [Abdulla et al. 2017].
2 PRELIMINARIES
In this section, we introduce the simple programming language and the notation that will be used
throughout. Then, we review the definition of PS 2.0, and present the model following [Lee et al. 2020].
2.1 Notations
Given two natural numbers i, j ∈ N s.t. i ≤ j, we use [i, j] to denote the set {k | i ≤ k ≤ j}. Let A and
B be two sets. We use f : A → B to denote that f is a function from A to B. We define f[a ↦ b] to
be the function f′ such that f′(a) = b and f′(a′) = f(a′) for all a′ ≠ a. For a binary relation R, we
use [R]∗ to denote its reflexive and transitive closure. Given an alphabet Σ, we use Σ∗ (resp. Σ+)
to denote the set of possibly empty (resp. non-empty) finite words over Σ. Let w = a_1 a_2 ··· a_n be
a word over Σ; we use |w| to denote the length of w. Given an index i in [1, |w|], we use w[i] to
denote the ith letter of w. Given two indices i and j s.t. 1 ≤ i ≤ j ≤ |w|, we use w[i, j] to denote the
word a_i a_{i+1} ··· a_j. Sometimes, we consider a word as a function from [1, |w|] to Σ.
2.2 Program Syntax
The simple programming language we use is described in Figure 1. A program Prog consists of a
set Loc of (global) variables or memory locations, and the definition of a set P of processes. Each
process p declares a set Reg (p) of (local) registers followed by a sequence of labeled instructions.
We assume that these sets of registers are disjoint and we use Reg := ∪pReg (p) to denote their
union. We assume also a (potentially unbounded) data domain Val from which the registers and
locations take values. All locations and registers are assumed to be initialized with the special value
0 ∈ Val (if not mentioned otherwise). An instruction i is of the form λ : s where λ is a unique label
and s is a statement. We use L_p to denote the set of all labels of the process p, and L = ⋃_{p∈P} L_p
the set of all labels of all processes. We assume that the execution of the process p always starts
with a unique initial instruction labeled by λ^p_init. A write instruction of the form x^o = $r assigns
the value of register $r to the location x, and o denotes the access mode. If o = rlx, the write is a
relaxed write, while if o = ra, it is a release write. A read instruction $r = x^o reads the value of the
location x into the local register $r. Again, if the access mode o = rlx, it is a relaxed read, and if
o = ra, it is an acquire read. Atomic updates or RMW instructions are either compare-and-swap
(CAS^{o_r,o_w}) or FADD^{o_r,o_w}. Both have a pair of accesses (o_r, o_w ∈ {rel, acq, rlx}) to the same
location: a read followed by a write. Following [Lee et al. 2020], FADD(x, v) stores the value of x into a
register $r, and adds v to x, while CAS(x, v1, v2) compares an expected value v1 to the value in x,
and if the values are the same, sets the value of x to v2. The old value of x is then stored in $r.
Prog ::= var x∗ (proc p ∥ ... ∥ proc p)
proc p ::= Reg(p) i∗
i ::= λ : s
s ∈ St ::= skip | s ; s | assume(x = e)
         | do s∗ while e | while e do s∗ done
         | if e then s else s
         | $r := e | $r := x^o | x^o := $r
         | $r := FADD^{o,o}(x, v)
         | $r := CAS^{o,o}(x, v, v) | fence^sc
o ∈ Mode ::= rlx | ra
Fig. 1. Syntax of concurrent programs.
A local assignment instruction $r = e assigns to the register $r the value of e, where e is an
expression over a set of operators, constants as well as the contents of the registers of the current
process, but not referring to the set of locations. The fence instruction SC-fence is used to enforce
sequential consistency if it is placed between two memory access operations. Finally, the conditional,
assume and iterative instructions have the standard semantics. For simplicity, we will write
assume(x = e) instead of $r = x; assume($r = e). This notation is extended in the straightforward
manner to conditional statements.
2.3 The Promising Semantics
In this section, we recall the promising semantics [Lee et al. 2020]. We present here PS 2.0 with
three kinds of memory accesses: relaxed (this is the default mode), release writes (rel) and acquire
reads (acq). Read-modify-write (RMW) instructions have two access modes, one for the read and one
for the write. We leave aside the release and acquire fences (and subsequent access modes) which are
part of PS 2.0, since they do not affect the results of this paper.
Timestamps. PS 2.0 uses timestamps to maintain a total order over all the writes to the same
variable. We assume an infinite set of timestamps Time, densely totally ordered by ≤, with 0 being
the minimum element. A view is a timestamp function V : Loc → Time that records the largest known
timestamp for each location. Let T be the set containing all the timestamp functions, along with
the special symbol ⊥. Let Vinit represent the initial view where all locations are mapped to 0. Given
two views V and V′, we use V ≤ V′ to denote that V(x) ≤ V′(x) for every x ∈ Loc. The merge operation
⊔ between the two views V and V′ returns the pointwise maximum of V and V′, i.e., (V ⊔ V′)(y) is
the maximum of V(y) and V′(y). Let ℐ denote the set of all intervals over Time. The timestamp
intervals in ℐ have the form (f, t] where either f = t = 0 or f < t, with f, t ∈ Time. Given an
interval I = (f, t] ∈ ℐ, I.frm and I.to denote f and t, respectively.
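The view operations above can be illustrated with a minimal Python sketch. The encoding is an assumption made only for illustration: a view is a dict from location names to timestamps, `Fraction` stands in for the dense domain Time, and the names `view_leq`, `merge` and `valid_interval` are hypothetical.

```python
from fractions import Fraction

# Hypothetical encoding (not from the paper): a view is a dict from location
# names to timestamps; Fraction stands in for the dense timestamp domain Time.

def view_leq(v1, v2):
    """V <= V' iff V(x) <= V'(x) for every location x."""
    return all(v1[x] <= v2[x] for x in v1)

def merge(v1, v2):
    """Merge of two views: the pointwise maximum."""
    return {x: max(v1[x], v2[x]) for x in v1}

def valid_interval(f, t):
    """(f, t] is a timestamp interval iff f = t = 0 or f < t."""
    return (f == 0 and t == 0) or f < t

v1 = {"x": Fraction(1), "y": Fraction(3)}
v2 = {"x": Fraction(2), "y": Fraction(1, 2)}
merged = merge(v1, v2)
print(merged)
```

Both `v1` and `v2` are below `merged` in the pointwise order, while neither is below the other.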
Memory. In PS 2.0, the memory is modelled as a set of concrete messages (which we just call
messages), and reservations. Each message represents the effect of a write or a RMW operation
and each reservation is a timestamp interval reserved for future use. In more detail, a message m
is a tuple (x, v, (f, t], V) where x ∈ Loc, v ∈ Val, (f, t] ∈ ℐ and V ∈ T. A reservation r is a tuple
(x, (f, t]). Note that a reservation, unlike a message, does not commit to any particular value, but
only specifies the interval which is reserved. We use m.loc (r.loc), m.val, m.to (r.to), m.frm (r.frm)
and m.View to denote respectively x, v, t, f and V. Two elements (either messages or reservations)
are said to be disjoint (m1 # m2) if they concern different variables (m1.loc ≠ m2.loc) or their intervals
do not overlap (m1.to < m2.frm ∨ m1.frm > m2.to). Two sets of elements M, M′ are disjoint,
denoted M # M′, if m # m′ for every m ∈ M, m′ ∈ M′. Two elements m1, m2 are adjacent, denoted
Adj(m1, m2), if m1.loc = m2.loc and m1.to = m2.frm. A memory M is a set of pairwise disjoint
messages and reservations. Let M̃ be the subset of M containing only messages (no reservations).
For a location x, let M(x) be {m ∈ M | m.loc = x}. Given a view V and a memory M, we say V ∈ M
if, for every x ∈ Loc, V(x) = m.to for some message m ∈ M̃. Let ℳ denote the set of all memories.
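A small Python sketch may help to fix the notions of disjointness, adjacency and memory. The class `Elem` and the function names are hypothetical. The sketch reads "do not overlap" on the half-open intervals (f, t], so two intervals that merely touch (m1.to = m2.frm) count as disjoint; this reading is an assumption of the sketch, made so that adjacent elements can sit in the same memory.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Elem:
    """Hypothetical encoding of a message or a reservation."""
    loc: str
    frm: int
    to: int
    val: Optional[int] = None     # None marks a reservation
    view: Optional[tuple] = None  # None plays the role of the symbol ⊥

def disjoint(m1, m2):
    # Assumption: intervals are half-open (f, t], so touching intervals
    # do not overlap.
    return m1.loc != m2.loc or m1.to <= m2.frm or m2.to <= m1.frm

def adjacent(m1, m2):
    """Adj(m1, m2): same location and m1 ends exactly where m2 starts."""
    return m1.loc == m2.loc and m1.to == m2.frm

def is_memory(elems):
    """A memory is a set of pairwise disjoint messages and reservations."""
    elems = list(elems)
    return all(disjoint(a, b)
               for i, a in enumerate(elems) for b in elems[i + 1:])

m_init = Elem("x", 0, 0, val=0)   # initial message (x, 0, (0, 0], -)
m1 = Elem("x", 0, 1, val=5)
m2 = Elem("x", 1, 2, val=7)       # adjacent to m1
r = Elem("x", 1, 3)               # reservation overlapping m2
print(adjacent(m1, m2), is_memory([m_init, m1, m2]), is_memory([m2, r]))
```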
Insertion into Memory. Following [Lee et al. 2020], a memory M can be extended with a message
(due to the execution of a write/RMW instruction) or a reservation m with m.loc = x, m.frm = f
and m.to = t in a number of ways:
[Additive insertion] \(M \overset{A}{\hookleftarrow} m\) is defined only if (1) M # {m}; (2) if m is a message, then no message
m′ ∈ M has m′.loc = x and m′.frm = t; and (3) if m is a reservation, then there exists a message
m′ ∈ M̃ with m′.loc = x and m′.to = f. The extended memory \(M \overset{A}{\hookleftarrow} m\) is then M ∪ {m}.
[Splitting insertion] \(M \overset{S}{\hookleftarrow} m\) is defined if m is a message and there exists a message
m′ = (x, v′, (f, t′], V) with t < t′ in M. Then M is updated to
\(M \overset{S}{\hookleftarrow} m = (M \setminus \{m'\}) \cup \{m, (x, v', (t, t'], V)\}\).
[Lowering insertion] \(M \overset{L}{\hookleftarrow} m\) is only defined if there exists m′ in M that is identical to
m = (x, v, (f, t], V) except for m.View ≤ m′.View. Then, M is updated to
\(M \overset{L}{\hookleftarrow} m = (M \setminus \{m'\}) \cup \{m\}\).
[Cancellation] \(M \overset{C}{\hookleftarrow} m\) is defined if m is a reservation in M. Then M is updated to M \ {m}.
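The insertion operations can be sketched in Python as partial functions that return `None` when the operation is undefined. This is only an illustration under stated assumptions: the tuple type `Elem` and the function names are hypothetical, intervals are read as half-open so that touching intervals are disjoint, and lowering insertion is left out because it needs an ordering on views that includes ⊥.

```python
from collections import namedtuple

# Hypothetical encoding: val is None for a reservation; view is None for ⊥.
Elem = namedtuple("Elem", "loc frm to val view")

def is_msg(e):
    return e.val is not None

def disjoint(a, b):
    # Assumption: half-open intervals (f, t]; touching intervals are disjoint.
    return a.loc != b.loc or a.to <= b.frm or b.to <= a.frm

def add(M, m):
    """Additive insertion; None if undefined."""
    if not all(disjoint(m, e) for e in M):
        return None
    if is_msg(m):
        # (2) no message of M may start exactly where m ends
        if any(is_msg(e) and e.loc == m.loc and e.frm == m.to for e in M):
            return None
    elif not any(is_msg(e) and e.loc == m.loc and e.to == m.frm for e in M):
        # (3) a reservation must directly follow some message of M
        return None
    return M | {m}

def split(M, m):
    """Splitting insertion: m takes the front part of an existing message."""
    if not is_msg(m):
        return None
    for e in M:
        if is_msg(e) and e.loc == m.loc and e.frm == m.frm and m.to < e.to:
            return (M - {e}) | {m, e._replace(frm=m.to)}
    return None

def cancel(M, r):
    """Cancellation: remove a reservation that is in M."""
    if is_msg(r) or r not in M:
        return None
    return M - {r}

M0 = frozenset({Elem("x", 0, 0, 0, None)})       # initial message
M1 = add(M0, Elem("x", 1, 3, 2, None))           # message on (1, 3]
M2 = split(M1, Elem("x", 1, 2, 1, None))         # split (1, 3] at 2
M3 = add(M1, Elem("x", 3, 4, None, None))        # reservation after (1, 3]
print(sorted(M2), cancel(M3, Elem("x", 3, 4, None, None)) == M1)
```

Returning a fresh frozenset from each operation keeps the sketch close to the set-based definitions, at the cost of copying the memory on every step.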
Transition System of a Process. Given a process p ∈ P, a state σ of p is defined as a
pair (λ, R) where λ ∈ L is the label of the next instruction to be executed by p and R :
Reg → Val maps each register of p to its current value. (Observe that we use the set of all
labels L (resp. registers Reg) instead of L_p (resp. Reg(p)) in the definition of σ just for the
sake of simplicity.) Transitions between the states of p are of the form
\((\lambda, R) \overset{t}{\underset{p}{\Longrightarrow}} (\lambda', R')\) with
t ∈ {ϵ, rd(o, x, v), wt(o, x, v), U(o_r, o_w, x, v_r, v_w), SC-fence | x ∈ Loc, v ∈ Val, o ∈ {rlx, ra}}. A
transition of the form \((\lambda, R) \overset{\mathrm{rd}(o,x,v)}{\underset{p}{\Longrightarrow}} (\lambda', R')\) denotes the execution of a read
instruction of the form $r = x^o labeled by λ where (1) λ′ is the label of the next instruction that
can be executed after the instruction labelled by λ, and (2) R′ is the mapping that results from
replacing the value of the register $r in R by v. The transition relation
\((\lambda, R) \overset{t}{\underset{p}{\Longrightarrow}} (\lambda', R')\) is defined in a similar manner for the other cases of t, where
wt(o, x, v) stands for a write instruction that writes the value v to x, U(o_r, o_w, x, v_r, v_w) stands
for an RMW that reads the value v_r from x and writes v_w to it, SC-fence stands for an SC-fence
instruction, and ϵ stands for the execution of the other local instructions. Observe that o, o_r, o_w
are the access modes, which can be rlx or ra. We use ra for both release and acquire. Finally, we use
\((\lambda, R) \xrightarrow[p]{t} (\lambda', R')\) with
t ∈ {rd(o, x, v), wt(o, x, v), U(o_r, o_w, x, v_r, v_w), SC-fence | x ∈ Loc, v ∈ Val, o ∈ {rlx, ra}} to denote that
\[
(\lambda, R) \overset{\epsilon}{\underset{p}{\Longrightarrow}} \sigma_1 \overset{\epsilon}{\underset{p}{\Longrightarrow}} \cdots \overset{\epsilon}{\underset{p}{\Longrightarrow}} \sigma_n \overset{t}{\underset{p}{\Longrightarrow}} \sigma_{n+1} \overset{\epsilon}{\underset{p}{\Longrightarrow}} \cdots \overset{\epsilon}{\underset{p}{\Longrightarrow}} (\lambda', R').
\]

Memory Helpers

(MEMORY: NEW)
\[
(P, M) \xrightarrow{m} \left(P',\; M \overset{A}{\hookleftarrow} m\right)
\]
(MEMORY: FULFIL)
\[
\dfrac{\hookleftarrow \,\in\, \{\overset{S}{\hookleftarrow},\, \overset{L}{\hookleftarrow}\} \qquad P' = P \hookleftarrow m \qquad M' = M \hookleftarrow m}
      {(P, M) \xrightarrow{m} (P' \setminus \{m\},\; M')}
\]

Process Helpers

\[
\dfrac{m = (x, -, (-, t], K) \in M \qquad V(x) \le t \qquad o = \mathrm{rlx} \Rightarrow V' = V[x \mapsto t] \qquad o = \mathrm{ra} \Rightarrow V' = V[x \mapsto t] \sqcup K}
      {V \xrightarrow[\mathrm{rd}]{o,\,m} V'}
\]
\[
\dfrac{\begin{array}{c}
         m = (x, -, (-, t], K) \in M,\; V(x) < t \qquad o = \mathrm{rlx} \Rightarrow K = \bot,\;\; o = \mathrm{ra} \Rightarrow P(x) = \emptyset \wedge K = V' \\
         (P, M) \xrightarrow{m} (P', M') \qquad V' = V[x \mapsto t]
       \end{array}}
      {(V, P, M) \xrightarrow[\mathrm{wt}]{o,\,m} (V', P', M')}
\]

Process Steps

(Read)
\[
\dfrac{\sigma \xrightarrow[p]{\mathrm{rd}(o,x,v)} \sigma' \qquad m = (x, v, (-,-], -) \qquad V \xrightarrow[\mathrm{rd}]{o,\,m} V'}
      {(\sigma, V, P, M, G) \xrightarrow[p]{} (\sigma', V', P, M, G)}
\]
(Write)
\[
\dfrac{\sigma \xrightarrow[p]{\mathrm{wt}(o,x,v)} \sigma' \qquad m = (x, v, (-,-], -) \qquad (V, P, M) \xrightarrow[\mathrm{wt}]{o,\,m} (V', P', M')}
      {(\sigma, V, P, M, G) \xrightarrow[p]{} (\sigma', V', P', M', G)}
\]
(Promise)
\[
\dfrac{m = (-, -, (-,-], K) \qquad M' = M \overset{A}{\hookleftarrow} m \qquad K \in M'}
      {(\sigma, V, P, M, G) \xrightarrow[p]{} \left(\sigma, V, P \overset{A}{\hookleftarrow} m, M', G\right)}
\]
(SC-fence)
\[
\dfrac{\sigma \xrightarrow[p]{\mathrm{SC\text{-}fence}} \sigma'}
      {(\sigma, V, P, M, G) \xrightarrow[p]{} (\sigma', V \sqcup G, P, M, G \sqcup V)}
\]
(Reserve)
\[
\dfrac{r = (-, (-,-]) \qquad M' = M \overset{A}{\hookleftarrow} r}
      {(\sigma, V, P, M, G) \xrightarrow[p]{} (\sigma, V, P \cup \{r\}, M', G)}
\]
(Cancel)
\[
\dfrac{r = (-, (-,-]) \in P}
      {(\sigma, V, P, M, G) \xrightarrow[p]{} (\sigma, V, P \setminus \{r\}, M \setminus \{r\}, G)}
\]
(Update)
\[
\dfrac{\begin{array}{c}
         \sigma \xrightarrow[p]{\mathrm{U}(o_r, o_w, x, v_r, v_w)} \sigma' \qquad m_r = (x, v_r, (-, t], -) \qquad m_w = (x, v_w, (t, -], -) \\
         V \xrightarrow[\mathrm{rd}]{o_r,\,m_r} V'' \qquad (V'', P, M) \xrightarrow[\mathrm{wt}]{o_w,\,m_w} (V', P', M')
       \end{array}}
      {(\sigma, V, P, M, G) \xrightarrow[p]{} (\sigma', V', P', M', G)}
\]
Fig. 2. PS 2.0 inference rules at the process level, defining the transition
\((\sigma, V, P, M, G) \xrightarrow[p]{} (\sigma', V', P', M', G')\).
Machine States. A machine state MS is a tuple ((J, R), VS, PS, M, G), where J : P → L maps each
process p to the label of the next instruction to be executed, R : Reg → Val maps each register to
its current value, VS : P → T is the process view map, which maps each process to a view, M is a
memory, PS : P → ℳ maps each process to a set of messages (called its promise set), and G ∈ T is
the global view (that will be used by SC fences). We use C to denote the set of all machine states.
Given a machine state MS = ((J, R), VS, PS, M, G) and a process p, let MS↓p denote the
projection (σ, VS(p), PS(p), M, G), with σ = (J(p), R(p)), of the machine state to the process p. We call
MS↓p the process configuration. We use C_p to denote the set of all process configurations.
The initial machine state MSinit = ((Jinit, Rinit), VSinit, PSinit, Minit, Ginit) is one where: (1) Jinit(p)
is the label of the initial instruction of p; (2) Rinit($r) = 0 for every $r ∈ Reg; (3) for each p, we have
VSinit(p) = Vinit as the initial view (that maps each location to the timestamp 0); (4) for each process
p, the set of promises PSinit(p) is empty; (5) the initial memory Minit contains exactly one initial
message (x, 0, (0, 0], Vinit) for each location x; and (6) the initial global view maps each location to 0.
Transition Relation. We first describe the transition
\((\sigma, V, P, M, G) \xrightarrow[p]{} (\sigma', V', P', M', G')\) between
process configurations in C_p from which we induce the transition relation between machine states.
Process Relation. The formal definition of \(\xrightarrow[p]{}\) is in Figure 2. Below, we explain these inference rules.
Read. A process p can read from M by observing a message m = (x, v, (f, t], K) if V(x) ≤ t (i.e.,
p must not be aware of a later message for x). In case of a relaxed read rd(rlx, x, v), the process
view of x is updated to t, while for an acquire read rd(ra, x, v), the process view is updated to
V[x ↦ t] ⊔ K. The global memory M, the set of promises P, and the global view G remain the same.
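The effect of a read on the process view can be sketched as follows. The dict-based encoding and the name `read_view` are hypothetical; the sketch also assumes that a message view ⊥ (encoded as `None`) contributes nothing to the merge performed by an acquire read.

```python
def read_view(view, msg, mode):
    """Return the view after reading msg, or None if the read is not allowed.

    Hypothetical encoding: view is a dict location -> timestamp, and msg is a
    dict with keys "loc", "to" and "view" (None stands for ⊥).
    """
    x, t, k = msg["loc"], msg["to"], msg["view"]
    if view[x] > t:            # the process already knows a later message for x
        return None
    new_view = dict(view)
    new_view[x] = t            # V[x -> t]
    if mode == "ra" and k is not None:
        # acquire read: additionally merge in the view carried by the message
        new_view = {y: max(new_view[y], k[y]) for y in new_view}
    return new_view

v = {"x": 0, "y": 0}
m = {"loc": "x", "to": 2, "view": {"x": 2, "y": 5}}
print(read_view(v, m, "rlx"), read_view(v, m, "ra"))
```

A relaxed read only advances the view of x, whereas the acquire read also learns the writer's view of y.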
Write. A process can add a fresh message to the memory (MEMORY: NEW) or fulfil an outstanding
promise (MEMORY: FULFIL). The execution of a write (wt(rlx, x, v)) results in a message m with
location x along with a timestamp interval (−, t]. Then, the process view of location x is updated to
t. In case of a release write (wt(ra, x, v)), the updated process view is also attached to m, and the rule
ensures that the process does not have an outstanding promise on location x. (MEMORY: FULFIL) allows
a process to split a promise interval or lower its view before fulfilment.
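The view-related side conditions of a write can be sketched in Python. The function `write_view` and the dict-based encoding are hypothetical, and the sketch deliberately ignores how the message is placed into the memory (fresh insertion or promise fulfilment); it only computes the new process view and the view attached to the message.

```python
def write_view(view, promises, loc, t, mode):
    """Return (new process view, view attached to the message), or None.

    Hypothetical encoding: view is a dict location -> timestamp, promises is a
    list of dicts with a "loc" key, and None stands for the message view ⊥.
    """
    if not view[loc] < t:
        return None                 # the new timestamp must exceed V(loc)
    new_view = dict(view)
    new_view[loc] = t               # V' = V[loc -> t]
    if mode == "rlx":
        return new_view, None       # relaxed write: message view is ⊥
    if any(p["loc"] == loc for p in promises):
        return None                 # release write: no promise on loc allowed
    return new_view, dict(new_view) # release write: attach the updated view

v = {"x": 1, "y": 4}
print(write_view(v, [], "x", 2, "rlx"))
print(write_view(v, [], "x", 2, "ra"))
```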
Update. When a process performs an RMW, it first reads a message m = (x, v, (f, t], K) and then
writes an update message with frm timestamp equal to t; that is, a message of the form m′ =
(x, v′, (t, t′], K′). This forbids any other write to be placed between m and m′. The access modes of
the reads and writes in the update follow what has been described for the read and write above.
Promise, Reservation and Cancellation. A process can non-deterministically promise future
writes which are not release writes. This is done by adding a message m to the memory M s.t. m # M
and to the set of promises P. Later, a relaxed write instruction can fulfil an existing promise. Recall
that the execution of a release write requires that the process have no outstanding promise on the
written location, and thus it cannot be used to fulfil a promise. In the reserve step, the process
reserves a timestamp interval to be used for a later RMW instruction reading from a certain message,
without fixing the value it will write. A reservation is added both to the memory and the promise set.
The process can non-deterministically drop the reservation from both sets using the cancel step.
SC fences. The process view V is merged with the global view G, resulting in V ⊔ G as the updated
process view and global view.
Machine Relation. We are now ready to define the induced transition relation between machine
states. For machine states MS = ((J, R), VS, PS, M, G) and MS′ = ((J′, R′), VS′, PS′, M′, G′), we
write \(MS \xrightarrow[p]{} MS'\) iff (1) \(MS{\downarrow}p \xrightarrow[p]{} MS'{\downarrow}p\) and
(2) (J(p′), VS(p′), PS(p′)) = (J′(p′), VS′(p′), PS′(p′)) for all p′ ≠ p.
Consistency. According to Lee et al. [Lee et al. 2020], there is one final requirement on machine
states called consistency, which roughly states that, from every encountered machine state,
all the messages promised by a process p can be certified (i.e., made fulfillable) by executing p
on its own from a certain future memory (called the capped memory), i.e., an extension of the memory
with additional reservations. Before defining consistency, we need to introduce the capped memory.
Cap View, Cap Message and Capped Memory. The last element of a memory M with respect to
a location x, denoted by m_{M,x}, is an element from M(x) with the highest timestamp among all
elements of M(x), i.e., m_{M,x}.to = max_{m∈M(x)} m.to. The cap view of a memory M, denoted
by V̂_M, is the view which assigns to each location x the to timestamp of the message m_{M̃,x}.
That is, V̂_M = λx. m_{M̃,x}.to. Recall that M̃ denotes the subset of M containing only messages (no
reservations). The cap message of a memory M with respect to a location x is given by the message
m̂_{M,x} = (x, m_{M̃,x}.val, (m_{M,x}.to, m_{M,x}.to + 1], V̂_M).
Then, the capped memory of a memory M, wrt. a set of promises P, denoted by M̂_P, is an
extension of M, defined as: (1) for every m1, m2 ∈ M with m1.loc = m2.loc, m1.to < m2.to, and
such that there is no message m′ ∈ M(m1.loc) with m1.to < m′.to < m2.to, we include a reservation
(m1.loc, (m1.to, m2.frm]) in M̂_P, and (2) we include a cap message m̂_{M,x} in M̂_P for every variable
x unless m_{M,x} is a reservation in P.
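The construction of the capped memory can be sketched in Python. The encoding is hypothetical: elements are named tuples, a reservation has `val` equal to `None`, the cap view is stored as a tuple of (location, timestamp) pairs so that elements stay hashable, and integer timestamps are used so that `to + 1` is directly available. The sketch assumes that every location has at least its initial message in M, and it adds no reservation between two elements that are already adjacent (the interval would be empty).

```python
from collections import namedtuple

# Hypothetical encoding: val is None for a reservation; view is None for ⊥.
Elem = namedtuple("Elem", "loc frm to val view")

def capped_memory(M, P):
    """Capped memory of M with respect to the promise set P (a sketch)."""
    locs = sorted({e.loc for e in M})
    # last message per location (reservations are ignored here)
    last_msg = {x: max((e for e in M if e.loc == x and e.val is not None),
                       key=lambda e: e.to) for x in locs}
    cap_view = tuple((x, last_msg[x].to) for x in locs)
    capped = set(M)
    for x in locs:
        elems = sorted((e for e in M if e.loc == x), key=lambda e: e.to)
        # (1) block every gap between consecutive elements with a reservation
        for a, b in zip(elems, elems[1:]):
            if a.to < b.frm:
                capped.add(Elem(x, a.to, b.frm, None, None))
        # (2) add the cap message unless the last element is a reservation in P
        last = elems[-1]
        if not (last.val is None and last in P):
            capped.add(Elem(x, last.to, last.to + 1, last_msg[x].val, cap_view))
    return frozenset(capped)

res_y = Elem("y", 0, 1, None, None)
M = frozenset({Elem("x", 0, 0, 0, None), Elem("x", 2, 3, 7, None),
               Elem("y", 0, 0, 0, None), res_y})
C = capped_memory(M, frozenset({res_y}))
print(sorted(C, key=lambda e: (e.loc, e.to)))
```

In this instance the gap (0, 2] on x is blocked by a reservation, x receives a cap message on (3, 4], and y receives none because its last element is a reservation of the process itself.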
Consistency of machine states. A machine state MS = ((J, R), VS, PS, M, G) is consistent if every
process p ∈ P can certify/fulfil all its promises from the capped memory M̂_{PS(p)}, i.e.,
\(MS \,[\xrightarrow[p]{}]^{*}\, ((J', R'), VS', \emptyset, M', G')\).
The Reachability Problem in PS 2.0. A run of Prog is a sequence of the form
\[
MS_0 \,[\xrightarrow[p_{i_1}]{}]^{*}\, MS_1 \,[\xrightarrow[p_{i_2}]{}]^{*}\, MS_2 \,[\xrightarrow[p_{i_3}]{}]^{*} \cdots [\xrightarrow[p_{i_n}]{}]^{*}\, MS_n
\]
where MS_0 = MSinit is the initial machine state and MS_1, ..., MS_n are consistent machine states.
In this case, the machine states MS_0, ..., MS_n are said to be reachable from MSinit.
Given an instruction label function J : P → L that maps each process p ∈ P to an instruction
label in L_p, the reachability problem asks whether there exists a machine state of the form
((J, R), VS, PS, M, G) that is reachable from MSinit. In the case of a positive answer to this problem, we
say that J is reachable in Prog in PS 2.0.
2.4 Examples
Process p1:
$r1=x
if($r1 != 2){
  z=1
  $r1=z
  assume($r1=3)
  z=2
}
else{
  z=2 //
}
Process p2:
z=3
$r2=z
assume($r2=2)
x=2
Fig. 3. The annotated behaviour is not reachable.
In the following, we describe some examples to demonstrate PS 2.0. For readability, instead of
referring to reachable instruction labels, we consider possible program outcomes represented using
the program comment annotation “//”. All writes and reads are relaxed in both examples below.
Example 2.1. The annotated program out-
come in Figure 3 is not allowed by PS 2.0.
We list the execution steps of PS 2.0 showing
that the annotated behaviour is not possible.
We give a proof by contradiction. Assume that the annotated behaviour is possible. The only
way for this is for the first process p1 (the one that starts by reading x) to execute the else branch.
For this, it needs to read 2 from x. This can be provided only by the second process p2 using the write
x=2. For this to happen, p2 first executes the write z=3 by adding a message (z, 3, (r, s], ⊥) to the
memory. Next, p2 has to read a message of the form (z, 2, (f, t], ⊥) which can only be generated by
p1 as a promise.
Note that p1 can promise the write z = 2 in its if . . . then branch. To certify this promise, p1
starts from the capped memory, and first executes the write z=1 in the if . . . then branch. To do
this, it can split the promise interval (f , t] and add a message (z, 1, (f , t ′],⊥) while modifying (z, 2,
(f , t],⊥) in the memory to (z, 2, (t ′, t],⊥). Note that since we work from the capped memory, there
are no available intervals in [0,max(t , s)], and the only way to add a message for the write z=1 of
p1, in such a way that p1 can read the 3 written by p2, and also to fulfil its promise, is to split the
promise interval. Next, p1 reads (z, 3, (r , s],⊥) to go past the assume(z=3) statement. This imposes
f < t ′ ≤ r < s . However, since p2 wrote 3 to z before reading the promise (z, 2, (t ′, t],⊥), we also
need r < s ≤ f < t ′ which contradicts f < r . Hence, the annotated behaviour is not reachable,
since p1 fails the certification.
Example 2.2. In Figure 4, we present an example having a run realising the program outcome
which has unboundedly many reservations and subsequent cancellations.
We list the execution steps of PS 2.0 leading to the annotated behaviour. Items prefixed with “C”
represent certification steps.
(1) Process 2 writes 1 to w.
(2) Process 3 writes arbitrarily many messages (y, 0, (f1, t1],⊥), (y, 0, (f2, t2],⊥) . . . (y, 0, (fk , tk ],⊥)
such that t1 < f2 < t2 < f3 · · · < fk < tk , until it reads the value 1 from w . The number of
messages written depends on the number of iterations of while.
(3) Process 1 promises (x , 2, (f , t],⊥) corresponding to the write x = 2 in the else branch.
Process 1:
    $r1=z
    if($r1 = 2)
    {
        x=2 //
    }
    else{
        do{
            $r4 = FADD(y,1)
        }while (w=0)
        x=2
    }
Process 2:
    w=1
    $r2=x
    assume($r2 == 2)
    z=2
Process 3:
    do
        y=$r3
    while(w=0)
Fig. 4. The annotated behaviour is reachable.
(4) Process 1 makes arbitrarily many reservations (y, (t1, t ′1]), (y, (t2, t ′2]), . . . , (y, (tk−1, t ′k−1]) such
that t ′1 < f2 < t ′2 < f3 . . . t ′k−1 < fk < tk and (y, (tk , tk+1]).
(C1) Starting from the capped memory, process 1 cancels the reservations one by one, while
executing the FADD instructions, thereby adding messages (y, 1, (ti , t ′i ],⊥) to the memory.
(C2) Process 1 fulfils its promise.
(5) Process 2 reads the message (x , 2, (f , t],⊥) and adds the message (z, 2, (f ′′, t ′′],⊥) for the
write z = 2.
(6) Process 1 reads (z, 2, (f ′′, t ′′],⊥) and fulfils (x , 2, (f , t],⊥) reaching the program outcome.
3 UNDECIDABILITY OF CONSISTENT REACHABILITY IN PS 2.0
In this section, we show that reachability is undecidable for PS 2.0 even for finite-state programs.
The proof is by a reduction from Post’s Correspondence Problem (PCP) [Post 1946]. Our proof
works with the fragment of PS 2.0 having only relaxed (rlx) memory accesses and crucially uses
unboundedly many promises to ensure that a process cannot skip any writes made by another
process. It also works even when we restrict our analysis to executions that can be split into a
bounded number of contexts, where within each context, only one process is active. We need just 3
context switches. Our undecidability result is also tight in the sense that the reachability problem
becomes decidable when we restrict ourselves to machine states where the number of promises is
bounded. Given our proof (Theorem 3.1) where undecidability is obtained with the rlx fragment of
PS 2.0, a natural question is the decidability status of the ra fragment of PS 2.0. This is known to
be undecidable from [Abdulla et al. 2019] even in the absence of promises. Let us call the fragment
of PS 2.0 with only rlx memory accesses PS 2.0-rlx.
Theorem 3.1. The reachability problem for concurrent programs over a finite data domain is
undecidable under PS 2.0. In fact, undecidability still holds for the PS 2.0-rlx fragment.
The rest of this section is devoted to the proof of Theorem 3.1. The undecidability is obtained by
a reduction from Post’s Correspondence Problem (PCP) [Post 1946]. A PCP instance consists of two
sequences u1, . . . ,un and v1, . . . ,vn of non-empty words over some alphabet Σ. Checking whether
there exists a sequence of indices j1, . . . , jk ∈ {1, . . . ,n} s.t. uj1 . . .ujk = vj1 . . .vjk is undecidable.
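To make the PCP condition concrete, the following is a minimal sketch of a bounded search for a solution. Since PCP is undecidable, the search necessarily stops at a length bound; the function name pcp_bounded, the bound max_len and the sample instance are our own illustrative choices and do not appear in the construction below.

```python
from itertools import product

def pcp_bounded(us, vs, max_len):
    """Search for indices j1..jk (1-based, k <= max_len) such that
    u_j1 ... u_jk == v_j1 ... v_jk. Returns the first sequence found
    (shortest first), or None if there is none within the bound."""
    n = len(us)
    for k in range(1, max_len + 1):
        for idx in product(range(n), repeat=k):
            top = "".join(us[i] for i in idx)
            bottom = "".join(vs[i] for i in idx)
            if top == bottom:
                return [i + 1 for i in idx]
    return None

# Hypothetical instance: u = (a, ab, bba), v = (baa, aa, bb).
print(pcp_bounded(["a", "ab", "bba"], ["baa", "aa", "bb"], 4))
```

A None result only says that no solution exists up to the chosen bound; it gives no information about longer index sequences.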
We construct a concurrent program with two processes p1 and p2 (see Figure 5), six memory
locations Loc = {x ,y, validate, index, index ′, term}, and two registers {$r , $r ′}. The finite data
domain of Prog is defined as Val = Σ∪ {0, 1, . . . ,n} ∪ {§, #}, where § and # are two special symbols
(not in Σ∪{0, 1, . . . ,n}). All the locations and registers are initialized to zero. We show that reaching
the instructions annotated by // in p1 and p2 is possible iff the PCP instance has a solution.

Process p1:
    /∗ generation mode ∗/
    if validate = 0 then
        while term = 0 do
            index = 1
            Module^{p1}_{u1}
            index = #
            . . .
            index = n
            Module^{p1}_{un}
            index = #
        done
        index = §
    /∗ validation mode ∗/
    else
        $r ′ = index′
        assume($r ′ ∈ [1, n])
        while $r ′ ≠ § do
            if $r ′ = 1 then
                Module^{p1}_{v1}
            else if $r ′ = 2 then
                Module^{p1}_{v2}
            . . .
            else if $r ′ = n then
                Module^{p1}_{vn}
            end if
            assume(index′ = #)
            $r ′ = index′
            assume(index′ ≠ #)
        done
        index = §
        assume(true) //
    end if

Process p2:
    term = 1;
    $r = index;
    assume($r ∈ [1, n])
    while $r ≠ § do
        if $r = 1 then
            Module^{p2}_{u1}
        else if $r = 2 then
            Module^{p2}_{u2}
        . . .
        else if $r = n then
            Module^{p2}_{un}
        end if
        assume(index = #)
        $r = index
        assume($r ≠ #)
    done
    validate = 1
    index′ = §
    assume(true) //

Module^{p1}_{vi}:
    assume(y = vi [1])
    assume(y = #)
    assume(y = vi [2])
    . . .
    assume(y = vi [ |vi |])
    assume(y = #)
    x = vi [1]
    x = #
    x = vi [2]
    . . .
    x = vi [ |vi |]
    index = i
    index = #

Module^{p1}_{ui}:
    x = ui [1]
    x = #
    x = ui [2]
    . . .
    x = ui [ |ui |]
    x = #

Module^{p2}_{ui}:
    assume(x = ui [1])
    assume(x = #)
    assume(x = ui [2])
    . . .
    assume(x = ui [ |ui |])
    assume(x = #)
    y = ui [1]
    y = #
    y = ui [2]
    . . .
    y = ui [ |ui |]
    index′ = i
    index′ = #

Fig. 5. Simulation of the PCP problem using two processes.
We give below an overview of the execution steps leading to the annotated instructions.
(1) To begin, process p2 writes 1 to the location term.
(2) Process p1 promises to write letters of ui (one by one) to location x , and the respective indices
i to the location index . The number of promises made is arbitrary, since it depends on the
length of the PCP solution. Observe that the sequence of promises made to the variable index
corresponds to the guessed solution of the PCP problem.
(C1) Using the if branch, p1 fulfils its promise.
(3) Process p2 reads from the sequences of promises written to x and index and copies them (one
by one) to variables y and index ′ respectively, and reaches //.
(4) The else branch in p1 is enabled at this point, where p1 reads the sequence of indices from
index ′, and each time it reads an index i from index ′, it checks that it can read the sequence
of letters of vi from y.
(C1) p1 copies (one by one) the sequence of observed values from y and index ′ back to x and
index respectively. To fulfil the promises, it is crucial that the sequence of read values from
index ′ (resp. y) is the same as the sequence of written values to index (resp. x ). Since y holds
a sequence vi1 . . .vik , the promises are fulfilled iff this sequence is the same as the promised
sequence ui1 . . .uik . This happens only when i1, . . . , ik is a PCP solution.
(5) At the end of promise fulfilment, p1 reaches //.
Let us now give more details about the code of the two processes given in Figure 5. Depending
on the value of the validate flag read, process p1 can run in generation mode (then branch) or
validation mode (else branch). In generation mode, p1 writes in sequential manner the sequence
of indices (alternated with the special symbol #) of a potential solution of the PCP problem to
the location index and writes, letter by letter, the sequence of letters of the word ui to location
x each time p1 sets the location index to i (using the Module^{p1}_{ui} procedure). In validation mode, p1
reads from locations index ′ and y and writes back what it has read to the locations index and x ,
respectively (using Module^{p1}_{vi}). The second process proceeds in a similar manner to the else
branch of the first process: it reads from locations index and x and writes the values read to index ′
and y, respectively (using Module^{p2}_{ui}). We will show that a solution of the PCP problem exists iff
we can reach the annotations // in processes p1 and p2.
Assume that a solution of the PCP problem exists. This means that there is a sequence
of indices i1, i2, . . . , ik such that vi1vi2 · · ·vik = ui1ui2 · · ·uik . Let w = ui1ui2 · · ·uik . Let
us show that the pair of annotations // is reachable in Prog. To this end, consider
the following run of the program Prog: p2 starts by setting the location term to
1. Then, p1 uses the then branch of its conditional statement and makes the following two
sequences of promises: (index, i1, (1, 2]), (index, i2, (2, 3]), . . . , (index, ik , (k,k + 1]) and
(x ,w[1], (1, 2]), (x ,w[2], (2, 3]), . . . , (x ,w[|w |], (|w |, |w | + 1]). Observe that p1 can certify these
sequences of promises by iterating the while loop in the then branch of its conditional
statement. Once these promises are made, p2 reads these two sequences and writes them
back to the locations index ′ and y, respectively. p2 then sets the location validate to 1. Now p1 can
resume its execution by reading the value of validate written by the second process and entering the
else branch of its conditional statement. Then, p1 iteratively reads the values written by p2 to
the locations index ′ and y and writes them back to the locations index and x , respectively. By doing
this, p1 also fulfils the sequence of promises that it has issued.
Now assume that we can reach the pair of annotations //, //. In order for p1 to reach //, it
must execute the else branch of its conditional statement. Let us assume it does so. Then, p1
will read the sequence of indices i1, i2, . . . , ik written by the process p2 on the location index ′. Let
us assume that the process p2 writes the sequence of indices j1, j2, . . . , jm on the location index ′
(by reading the sequence of promises made by p1). Each time that the process p1 reads an index
from the location index ′, it writes it back on the location index. The process p1 (resp. p2) alternates
between writing/reading an index in {1, . . . ,n} and the special symbol # in order to make sure that
each written index is read at most once. In a similar manner, the process p2 reads the sequence of
indices j1, j2, . . . , jm written by the process p1 on the location index and it writes it back on the
locations index ′. This implies that the sequence j1, j2, . . . , jm is a subsequence of i1, i2, . . . , ik (since
the process p2 can miss reading some written indices by the process p1) and also that the sequence
i1, i2, . . . , ik is a subsequence of j1, j2, . . . , jm (since p1 can miss reading some written index by the
process p2). Thus, we have that the sequences i1, i2, . . . , ik and j1, j2, . . . , jm are the same. Every
time the process p1 (resp. p2) reads an index i from the location index ′ (resp. index), it (1) tries to
read in sequential manner the sequence of letters appearing in vi (resp. ui ) (alternated with the
special symbol #) from the location y (resp. x), and (2) writes the same sequence of letters to the
location x (resp. y). Using a similar argument as in the case of indices, we can deduce that if p1 (resp.
p2) writes the words vi1vi2 · · ·vik (resp. uj1uj2 · · ·ujm ), letter by letter (with an alternation with
the symbol #), to the location x (resp. y), then vi1vi2 · · ·vik (resp. uj1uj2 · · ·ujm ) is a subsequence
of uj1uj2 · · ·ujm (resp. vi1vi2 · · ·vik ). Thus, if the pair of annotations //, // are reachable then
there exist two sequences i1, i2, . . . , ik and j1, j2, . . . , jm , written, respectively, by p1 and p2 such
that i1, i2, . . . , ik is equal to j1, j2, . . . , jm , and vi1vi2 · · ·vik is equal to uj1uj2 · · ·ujm . Hence
i1, i2, . . . , ik is a solution of the PCP instance. Observe that the
sequence of indices i1, i2, . . . , ik is non-empty due to the assume statement assume($r ′ ∈ [1,n]).
4 DECIDABLE FRAGMENTS OF PS 2.0
Since keeping ra memory accesses renders the reachability problem undecidable [Abdulla et al.
2019] and so does having unboundedly many promises when having rlx memory accesses (Theorem
3.1), we address in this section the decidability problem for PS 2.0-rlx with a bounded number of
promises in any reachable configuration. Observe that bounding the number of promises in any
reachable machine state does not imply that the total number of promises made during that run is
bounded. Let bdPS 2.0-rlx represent the restriction of PS 2.0-rlx to boundedly many promises where
the number of promises in each reachable machine state is smaller than or equal to a given constant. In
the following, we show the decidability of the reachability problem for bdPS 2.0-rlx. For establishing
this result, we introduce an alternate memory model for concurrent programs which we call LoHoW
(for “lossy higher order words”). We present the operational semantics of LoHoW, and show that
PS 2.0-rlx is operationally equivalent to LoHoW. Then, under the bounded promise assumption,
we show how LoHoW is used to decide the reachability problem for bdPS 2.0-rlx.
4.1 Introduction to LoHoW
Given an alphabet A, a simple word over A is an element of A∗, while a higher order word is an
element of (A∗)∗ (i.e., word of words). A state of LoHoW maintains a collection of higher order words,
one per location, along with the states of all processes. The higher order word HWx corresponding
to the location x is a word of simple words, representing the sub memory M(x) in PS 2.0-rlx. Each
simple word in HWx is an ordered sequence of “memory types”, that is, messages or promises in the
memory corresponding to x , maintained in the order of their to-timestamps in the memory. Unlike
PS 2.0-rlx, the LoHoW does not store timestamps in the messages and promises; instead, it takes
advantage of the word order which induces a natural ordering amongst these without explicit use of
timestamps. The key information to encode in each memory type occurring in HWx is: (1) whether
it is a message (msg) or a promise (prm), (2) which process (p) added it to the memory, and the value
(val) it holds, (3) the set S (called pointer set) of processes that are aware of this message/promise
(processes which point to this message/promise), and (4) whether the time interval to the right has
been reserved by some process.
Memory Types. A memory type is an element of Σ ∪ Γ, where Σ = {msg, prm} × Val × P × 2^P and
Γ = {msg, prm} × Val × P × 2^P × P. The first component represents a message (msg) or a promise (prm) in the memory
M of PS 2.0-rlx, the second component is the value in the message/promise, the third component
is the process which adds the message/promise to the memory, and the fourth component is a
pointer set, which contains all processes whose local view agrees with the to-timestamp of the
message/promise. In the case of Γ, we have a fifth component which holds the id of the process
that has reserved the time slot to the right of this message/promise.
For a memory type m = (r ,v,p, S) (or m = (r ,v,p, S,q)), we use m.value to denote v . For a
memory type m = (r ,v,p, S) (resp. m = (r ,v,p, S,q)) and a process h ∈ P, we use add(m,h) to
denote the memory type (r ,v,p, S ∪ {h}) (resp. (r ,v,p, S ∪ {h},q)). We also use delete(m,h)
to denote the memory type (r ,v,p, S \ {h}) (resp. (r ,v,p, S \ {h},q)). This corresponds to
the addition/deletion of the process h to/from the set of pointers of the memory type m.
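As an illustration, memory types and the operations add and delete could be encoded as follows. This is only a sketch under our own assumptions: the class name MemType and its field names are hypothetical, and the optional field reserved_by distinguishes elements of Γ (field set) from elements of Σ (field left as None).

```python
from dataclasses import dataclass, replace
from typing import FrozenSet, Optional

@dataclass(frozen=True)
class MemType:
    kind: str                            # "msg" or "prm"
    value: int                           # m.value in the text
    proc: str                            # process that added the message/promise
    ptrs: FrozenSet[str]                 # pointer set S
    reserved_by: Optional[str] = None    # None: element of Σ; otherwise element of Γ

def add(m: MemType, h: str) -> MemType:
    """add(m, h): same memory type with h added to the pointer set."""
    return replace(m, ptrs=m.ptrs | {h})

def delete(m: MemType, h: str) -> MemType:
    """delete(m, h): same memory type with h removed from the pointer set."""
    return replace(m, ptrs=m.ptrs - {h})

m = MemType("msg", 2, "p", frozenset({"p"}))
m2 = delete(add(m, "q"), "p")   # pointer set is now {q}
```

The dataclass is frozen, so add and delete return fresh memory types and leave their argument untouched, which matches the functional reading of the definitions above.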
Simple Words. A simple word is a word ∈ Σ∗#(Σ ∪ Γ), and each HWx is a word ∈ (Σ∗#(Σ ∪ Γ))+.
# is a special symbol not in Σ ∪ Γ, which separates the last symbol from the rest of the simple word.
Consecutive symbols of Σ in a simple word represent adjacent messages/promises in the memory
of PS 2.0-rlx, and are hence unavailable for a RMW. The special symbol # segregates these from the
last symbol of Σ ∪ Γ in a simple word. # does not correspond to any element from the memory;
its job is simply to demarcate the messages/promises which are not available for RMW from the
last symbol of the simple word. If the last symbol in a simple word is in Σ, then it is available
for a RMW; if the last symbol is in Γ, then it is not available for a RMW since the next message
adjacent to this symbol is a reservation. The last symbol from Σ ∪ Γ in a simple word Σ∗#(Σ ∪ Γ)
thus represents a message/promise (possibly combined with a reservation) in the memory which is
adjacent to the messages represented by the symbols immediately preceding # (if any).
Fig. 6. A higher order word HW.
Higher order words. A higher order word is a sequence of simple words. Figure 6 depicts a higher
order word with four simple words. We use a left to right order in both simple words and higher
order words. Furthermore, we extend in the straightforward manner the classical word indexation
strategy to higher order words. For example, the symbol at the third position of the higher order
word HW given in Figure 6 is HW[3] = (msg, 2,p, {p,q}). A higher order word HW is well-formed
iff for every p ∈ P, there is a unique position i in HW having p in its pointer set; that is, HW[i] is of
the form (−,−,−, S) ∈ Σ or (−,−,−, S,−) ∈ Γ s.t. p ∈ S . Observe that the higher order word given
in Figure 6 is well-formed. We will use ptr(p,HW) to denote the unique position i in HW having p
in its pointer set. Next, we assume that all the manipulated higher order words are well-formed.
As already mentioned, for each x ∈ Loc, we have a higher order word HWx . The higher
order word HWx represents the entire space [0,∞) of available timestamps. Each simple word in
HWx represents a timestamp interval (f , t], with consecutive simple words representing disjoint
timestamp intervals (while preserving order). The memory types in each simple word take up
adjacent timestamp intervals, spanning the timestamp interval of the simple word. This adjacency
of timestamp intervals within simple words is mainly used in RMW steps and reservations. The
memory type in Σ occurring at the end of a simple word denotes a message/promise which is
available for a RMW operation. The memory type in Γ occurring at the end of a simple word
denotes a message/promise followed by a reservation and therefore it is not available for a RMW
operation. The memory types at positions other than the rightmost in a simple word, represent
messages/promises which are not available for RMW. Figure 7 presents a mapping from a memory
of PS 2.0-rlx to a collection of higher order words (one per location) in LoHoW.
Given a higher order word HW, a position i ∈ {1, . . . , |HW|}, and p ∈ P, we use add(HW,p, i)
to denote the higher order word HW[1, i − 1] · add(HW[i],p) · HW[i + 1, |HW|], and delete(HW,p)
to denote HW[1, i′ − 1] · delete(HW[i′],p) · HW[i′ + 1, |HW|] with i′ = ptr(p,HW). This corresponds to the
addition/deletion of p to/from the set of pointers of HW[i]/HW[ptr(p,HW)]. We use move(HW,p, i)
to denote add(delete(HW,p),p, i).
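The pointer operations on higher order words can be sketched as follows. We assume, purely for illustration, that a higher order word is a flat Python list in which the separator is the string "#" and a memory type is a tuple (kind, value, process, pointer set), optionally extended with a fifth component for the reserving process. Positions are 1-based as in the text.

```python
SEP = "#"

def ptr(hw, p):
    """ptr(p, HW): the unique 1-based position whose pointer set contains p."""
    hits = [i for i, s in enumerate(hw, start=1) if s != SEP and p in s[3]]
    if len(hits) != 1:
        raise ValueError("higher order word is not well-formed for " + p)
    return hits[0]

def _with_ptrs(m, ptrs):
    # Rebuild the tuple with a new pointer set, keeping any reservation tag.
    return m[:3] + (frozenset(ptrs),) + m[4:]

def add(hw, p, i):
    """add(HW, p, i): add p to the pointer set of HW[i]."""
    if hw[i - 1] == SEP:
        raise ValueError("position holds the separator")
    out = list(hw)
    out[i - 1] = _with_ptrs(hw[i - 1], hw[i - 1][3] | {p})
    return out

def delete(hw, p):
    """delete(HW, p): remove p from the pointer set at ptr(p, HW)."""
    i = ptr(hw, p)
    out = list(hw)
    out[i - 1] = _with_ptrs(hw[i - 1], hw[i - 1][3] - {p})
    return out

def move(hw, p, i):
    """move(HW, p, i) = add(delete(HW, p), p, i)."""
    return add(delete(hw, p), p, i)

hw = [SEP, ("msg", 0, "p", frozenset({"p", "q"})), SEP, ("msg", 1, "q", frozenset())]
hw2 = move(hw, "p", 4)   # p now points to the message holding 1
```

Because move first deletes and then adds, the result is again well-formed for p whenever the input is.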
Initializing higher order words. For each location x ∈ Loc, the initial higher order word HW^init_x
is defined as #(msg, 0,p1, P), where P is the set of all processes and p1 is some process in P. The
set of all higher order words HW^init_x for all locations x represents the initial memory of PS 2.0-rlx
where all locations have value 0, and all processes are aware of the initial message.

[Figure 7: memories (promises/messages and reservations, ordered by timestamp) and the corresponding higher order words]
    M(x):  (_,v4, _)   (_,v6, _)(_, _)        where (_, _) is a reservation
    M(y):  (_,v4, _)(_,v3, _)(_,v1, _)   (_,v2, _)(_,v5, _)
    HWx :  #(_,v4, _, _)   #(_,v6, _, _, _)
    HWy :  (_,v4, _, _)(_,v3, _, _)#(_,v1, _, _)   (_,v2, _, _)#(_,v5, _, _)
Fig. 7. A mapping from memories M(x),M(y) to higher order words HWx ,HWy , respectively.
Simulating Reads, Writes, RMWs in LoHoW. In the following, we informally describe how to
handle PS 2.0-rlx instructions in LoHoW. Since we only have the rlx access mode, we denote Reads,
Writes and RMWs as rd(x ,v), wt(x ,v) and U(x ,vr ,vw ), dropping the access modes.
Reads. A rd(x ,v) step by a process p (reading v from x ) is handled as follows in LoHoW.
There exists an index j ≥ ptr(p,HWx ) in HWx such that HWx [j] is of the form (−,v,−, S ′) or
(−,v,−, S ′,−). This corresponds to the existence of a memory type holding the value v in HWx
and this symbol is on the right of the current view/pointer of the process p.
Add p to the set of pointers S ′ and remove it from its previous position.
Writes. A wt(x ,v) step by a process p (writing the value v to the location x ) in PS 2.0-rlx is done
by adding a new message with a timestamp higher than the local view of p for x : the timestamp
interval of this new message can be adjacent to the timestamp of the local view of p, or much ahead.
These two possibilities are captured in LoHoW as follows.
(1) Add the simple word #(msg,v,p, {p}) to HWx to the right of ptr(p,HWx ), or
(2) there is a symbol α ∈ Σ and two words w and w ′ such that HWx = w · # · α ·w ′. Then, update
the higher order word HWx to w · α · # · (msg,v,p, {p}) ·w ′.
Finally, remove p from its previous pointer set.
RMWs. Capturing RMWs is similar to the execution of a read followed by a write. In PS 2.0-rlx,
a process p performing an RMW reads from a message with a timestamp interval (−, t] and adds a
message to the memory with timestamp interval (t ,−]. This is handled as follows in LoHoW, and
shows the need for the higher order words. Consider a U(x ,vr ,vw ) step by p. Then,
• there is a simple word in HWx having (−,vr ,−, S) as the last memory type in it,
and the position of the memory type (−,vr ,−, S) is on the right of the current pointer of p in HWx ;
• p is removed from its pointer set;
• #(−,vr ,−, S) is replaced with (−,vr ,−, S\{p})# and (−,vw ,p, {p}) is appended, so that the simple word
w#(−,vr ,−, S) is extended to w(−,vr ,−, S\{p})#(−,vw ,p, {p}).
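The read and RMW steps described above can be sketched as functions that return all possible successor words, reflecting the non-deterministic choice of the message read. The encoding is our own assumption: a higher order word is a flat list with the string "#" as separator, and memory types are tuples (kind, value, process, pointer set), with a fifth component only when a reservation is present. Positions in this sketch are 0-based list indices, and the pointer of p itself is also accepted as a read position, as in the Reads paragraph.

```python
SEP = "#"

def ptr(hw, p):
    # 0-based index of the memory type whose pointer set contains p.
    return next(i for i, s in enumerate(hw) if s != SEP and p in s[3])

def drop(hw, p):
    # Copy of hw with p removed from every pointer set.
    return [s if s == SEP else s[:3] + (s[3] - {p},) + s[4:] for s in hw]

def read(hw, p, v):
    """Successors of rd(x, v) by p: move p's pointer to a memory type
    holding v at or to the right of p's current pointer."""
    results = []
    for j in range(ptr(hw, p), len(hw)):
        s = hw[j]
        if s != SEP and s[1] == v:
            out = drop(hw, p)
            t = out[j]
            out[j] = t[:3] + (t[3] | {p},) + t[4:]
            results.append(out)
    return results

def rmw(hw, p, vr, vw):
    """Successors of U(x, vr, vw) by p: the memory type read must be the last
    one of a simple word and must not carry a reservation (4 components)."""
    results = []
    for j in range(ptr(hw, p), len(hw)):
        s = hw[j]
        if s != SEP and j > 0 and hw[j - 1] == SEP and len(s) == 4 and s[1] == vr:
            out = drop(hw, p)
            # "# m" becomes "m # new": the new message is adjacent to m.
            out[j - 1:j + 1] = [out[j], SEP, ("msg", vw, p, frozenset({p}))]
            results.append(out)
    return results

init = [SEP, ("msg", 0, "p1", frozenset({"p", "q"}))]
after = rmw(init, "q", 0, 5)[0]
```

Returning a list of successors rather than a single word keeps the sketch close to the existential phrasing of the rules: an empty list means that the step is not enabled.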
Example 4.1. We illustrate the read, write and RMW in LoHoW on an example. Figure 8 depicts
a run in PS 2.0-rlx and the corresponding run in LoHoW. The run of PS 2.0-rlx shows how the
memory evolves, and the corresponding run in LoHoW faithfully simulates this using higher order
words HWx and HWy .
x:=1
y:=2
x:=3
x:=5
$r1:=x //3
$r2:= FADD(y,1) //2
Fig. 8. Below, a run in PS 2.0 showing the changes to memory, and above, the corresponding run in LoHoW.
Observe that init stands for the initial memory.
Promises in LoHoW. Next, we discuss how to handle promises.
Promises. Handling promises made by a process p in PS 2.0-rlx is similar to handling wt(x ,v):
we add the simple word #(prm,v,p, {}) in HWx to the right of the position ptr(p,HWx ), or append
(prm,v,p, {}) at the end of a simple word with a position larger than ptr(p,HWx ). The only differences
from a write are that the symbol is tagged as a promise (prm) and that its pointer set is empty.
Reservations and Cancellations in LoHoW. Next, we come to one of the new features of PS 2.0
over the first version, namely, reservations and cancellations. In PS 2.0-rlx, a process p makes a
reservation by adding the pair (x , (f , t]) to the memory, given that there is a message/promise in
the memory with timestamp interval (−, f ]. In LoHoW this is captured by “tagging” the rightmost
memory type (message/promise) in a simple word with the name of the process that makes the
reservation. This requires us to consider the memory types from Γ = {msg, prm}×Val×P×2P ×P
where the last component stores the process which made the reservation. Such a memory type
always appears at the end of a simple word, and represents that the next timestamp interval adjacent
to it has been reserved. Observe that we cannot add new memory types to the right of a memory
type of the form (msg,v,p, S,q). Thus, reservations are handled as follows.
(Res) Assume the rightmost symbol in a simple word is (msg,v,p, S). To capture the reservation
by q, (msg,v,p, S) is replaced with (msg,v,p, S,q).
(Can) A cancellation is done by removing the last component q from (msg,v,p, S,q) resulting in
(msg,v,p, S).
Empty Memory Types, Redundant simple words. When a process p reads from a message, the
pointer of p is updated, and moves forward. As a result, we may have memory types of the form
(msg,v,p, {}) as well as (msg,v,p, {},q) representing those messages in the memory whose pointer
set is empty. Call such symbols of Σ ∪ Γ empty memory types. It is then possible to lose an empty
memory type of Σ from a simple word if it is not at the rightmost position. This will not have any
consequence with respect to the reachability problem, since processes can non-deterministically
skip reading some messages in the memory. Likewise, a simple word of the form w#m ∈ Σ∗#(Σ∪ Γ)
where all symbols in w are empty memory types from Σ and m is an empty memory type from
Σ ∪ Γ can be lost entirely. Such simple words are called redundant simple words. Given this, what
cannot be lost from HWx ? The following:
• memory types (prm,−,−,−) or (prm,−,−,−,−) representing promises. This is due to the fact that
promises should be fulfilled and therefore cannot be lost.
• non-empty memory types: the pointer set of each of these contains at least one process, so losing any
of these memory types would result in losing the pointer/view of at least one of the processes.
• the rightmost memory type (right next to #) of a simple word on its own. Losing only this memory type
would result in an ill-formed higher order word.
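The loss conditions above can be phrased as simple predicates. The sketch below assumes, for illustration only, that a simple word w#m is encoded as a pair (w, m), where w is a list of memory types from Σ and m is the last memory type, and that memory types are tuples (kind, value, process, pointer set) with an optional fifth reservation component. The function names are hypothetical.

```python
def is_empty(m):
    """Empty memory type: a message (not a promise) whose pointer set is empty."""
    return m[0] == "msg" and not m[3]

def losable_symbols(word):
    """Indices in w (the part before #) that may be lost individually."""
    w, _last = word
    return [i for i, m in enumerate(w) if is_empty(m)]

def is_redundant(word):
    """A simple word may be lost entirely iff all its memory types are empty."""
    w, last = word
    return all(is_empty(m) for m in w) and is_empty(last)

e = ("msg", 0, "p", frozenset())            # empty memory type in Σ
pr = ("prm", 1, "p", frozenset())           # promise: never losable
live = ("msg", 2, "q", frozenset({"q"}))    # non-empty: q points here
tagged = ("msg", 3, "p", frozenset(), "q")  # empty, followed by a reservation of q

print(losable_symbols(([e, pr, e], live)))
print(is_redundant(([e], tagged)))
```

Note that the last memory type of a simple word is never offered by losable_symbols; it can only disappear together with the whole simple word when is_redundant holds.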
Certification and Fulfilment. In PS 2.0-rlx, certification, for a process p, happens from the capped
memory, where intermediate time slots (other than reserved ones) are blocked, and any new message
can be added only at the maximal timestamp. This is handled in LoHoW by one of the following:
• addition of new memory types is allowed only at the right end of any HWx ,
• if the rightmost memory type m in HWx is of the form (−,v,−,−,q) with q ≠ p (i.e., tagged by a
reservation for q), then a simple word #(msg,v,q, {}) is appended at the end of HWx .
Memory is altered in PS 2.0-rlx during certification phase to check for promise fulfilment, and
at the end of the certification phase, we resume from the memory which was there before. To
capture this in LoHoW, we work on a duplicate of (HWx )x ∈Loc in the certification phase. Notice
that the duplication allows losing some of the empty memory types and redundant simple words
non-deterministically (as described in the previous paragraph). This copy of HWx is then modified
during certification, and is discarded once we finish the certification phase.
The fulfilment of a promise by p using the rule ←↩^L (see rule (MEMORY : FULFILL) in Figure
2) will be handled in a similar manner as using the rule ←↩^A (since we are only dealing with the
fragment of PS 2.0 restricted to rlx). This will result in replacing a memory type of the form
(prm,v,p, S) (resp. (prm,v,p, S,q)) by (msg,v,p, S) (resp. (msg,v,p, S,q)) if this memory type is in
a position which is on the right of the current pointer of the process p. Then, the process p is added
to the pointer set S while removing it from the previous pointer set it belongs to.
The fulfilment of a promise by a process p in PS 2.0 using the rule ←↩^S (see rule
(MEMORY : FULFILL) in Figure 2) results in splitting the interval of the promise, when adding
a new message (x ,v ′, (f , t],⊥) to the memory. To capture this, we allow insertion of a memory
type right before the promise whose interval is split. This will result in replacing a memory
type of the form (prm,v,p, S) (resp. #(prm,v,p, S,q)) by (msg,v ′,p, {p})(prm,v,p, S) (resp.
(msg,v ′,p, {p})#(prm,v,p, S,q)) if this memory type is in a position which is on the right of the
current pointer of the process p. Then, the process p is removed from the previous pointer set it
belongs to. We may also need to update the position of the separator # so that it is just before the
last symbol of a simple word.
SC fences. SC-fences are handled by adding a dummy process g to P. Whenever a process p
performs an SC fence, g and p are added to the same pointer set, by moving g (resp. p) to the pointer set of p
(resp. g), depending on which lies further to the right.
Example 4.2. Figure 9 illustrates a run in LoHoW on a program where promises are necessary to
reach the annotated part //. To reach the annotated part in P1, the execution proceeds as follows.
C1, C2 represent two certification phases.
(1) P1 promises the write of 42 to x , by a message (x , 42, (f , t],⊥).
(C1) To certify, P1 begins from the capped memory, and enters the else branch. It begins a duplicate
of the higher order words, and works on them in this phase.
• Since all positions in (0, t] are blocked, P1 splits the interval (f , t] to write 41 to x , and
modifies the memory to (x , 42, (t ′, t],⊥), (x , 41, (f , t ′],⊥).
• P1 fulfils its promise
(2) P2 reads 42 from x and writes 42 to z
(3) P1 reads 42 from z
(4) P1 fulfils its promise, and reaches the annotated part.
Fig. 9. Run in LoHoW. The certification phase works on the duplicates of HWx ,HWz denoted in yellow.
4.2 Formal Model of LoHoW
In the following, we formally define LoHoW and state the equivalence of the reachability problem
in PS 2.0-rlx and LoHoW.
Insertion into higher order words. A higher order word HW can be extended in position
1 ≤ j ≤ |HW| with a memory type m of the form (r ,v,p, {p}) in a number of ways:
• Insertion as a new simple word. HW ←↩^N_j m is defined only if HW[j − 1] = # (i.e., the position
j is the end of a simple word). Let HW′ be the higher order word defined as delete(HW,p) (i.e.,
removing p from its previous set of pointers). Then, the extended higher order word HW ←↩^N_j m is defined
as HW′[1, j] · #m · HW′[j + 1, |HW|] (i.e., inserting the new simple word just after the position j).
• Insertion at the end of a simple word. HW ←↩^E_j m is defined only if HW[j − 1] = # (i.e., the position
j is the end of a simple word) and HW[j] ∈ Σ (i.e., the last memory type in the simple word should
be free from reservations). Let HW′ be the higher order word defined as delete(HW,p). Then, the
extended higher order word HW ←↩^E_j m is defined as w1 ·m′ · #m · w2 with HW′ = w1 · #m′ · w2,
m′ ∈ Σ, and |w1 · #m′ | = j (i.e., inserting the new memory type just after the position j).
• Splitting a promise. HW ←↩^SP_j m is defined only if HW[j] is of the form (prm,−,p,−) or
(prm,−,p,−,−) (i.e., the memory type at position j is a promise). Let HW′ be the higher order
word defined as delete(HW,p). Then, the extended higher order word HW ←↩^SP_j m is defined as
(1) HW′[1, j − 2] · m · #m′ · HW′[j + 1, |HW|] if HW′[j] = m′ and HW′[j − 1] = #, or (2)
HW′[1, j − 1] · m · m′ · HW′[j + 1, |HW|] if HW′[j] = m′ and HW′[j − 1] ≠ #. Observe that
in both cases we are inserting the new memory type m just before the position j.
• Fulfilment of a promise. HW ←↩^FP_j m is defined only if HW[j] is of the form (prm,v,p, S) or
(prm,v,p, S,q). Let HW′ be the higher order word defined as delete(HW,p). Then, the extended
higher order word HW ←↩^FP_j m is defined as HW′[1, j − 1] ·m′ ·HW′[j + 1, |HW′ |] with m′ = (msg,v,p, S ∪
{p}) if HW[j] = (prm,v,p, S) and m′ = (msg,v,p, S ∪ {p},q) if HW[j] = (prm,v,p, S,q).
• Splitting a reservation. HW ←↩^SR_j m is defined only if HW[j] is of the form (r ′,v ′,q, S,p). Let HW′
be the higher order word defined as delete(HW,p). Then, the extended higher order word HW ←↩^SR_j m
is defined as HW′[1, j − 2] · (r ′,v ′,q, S) · #(r ,v,p, {p},p) · HW′[j + 1, |HW|]. Observe that the new
message (r ,v,p, {p},p) is added to the right of the position j which corresponds to the slot that has
been reserved by p. This special splitting rule will be used during the certification phase. This will
allow the process p to use the reserved slots. Recall that it is not allowed to add memory types in
the middle of the higher order words (other than the reserved ones) during the certification phase.
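To illustrate the first two insertion operations, the following sketch implements insertion as a new simple word and insertion at the end of a simple word with the 1-based positions used in the definitions. As before, the list-with-"#" encoding of higher order words and the tuple encoding of memory types are assumptions made for this example, and the function names are hypothetical. The inserting process p is taken from the third component of m = (r, v, p, {p}).

```python
SEP = "#"

def drop(hw, p):
    # delete(HW, p): remove p from the pointer set that contains it.
    return [s if s == SEP else s[:3] + (s[3] - {p},) + s[4:] for s in hw]

def insert_new_word(hw, j, m):
    """New simple word #m placed right after position j (1-based), which
    must be the last position of a simple word, i.e. HW[j-1] = #."""
    if not 2 <= j <= len(hw) or hw[j - 2] != SEP:
        raise ValueError("position j does not end a simple word")
    h = drop(hw, m[2])
    return h[:j] + [SEP, m] + h[j:]

def insert_at_end(hw, j, m):
    """m becomes the new last symbol of the simple word ending at position j:
    w1 # m' w2 turns into w1 m' # m w2. Requires HW[j] in Σ (no reservation)."""
    if not 2 <= j <= len(hw) or hw[j - 2] != SEP or len(hw[j - 1]) != 4:
        raise ValueError("position j is not a reservation-free end of a simple word")
    h = drop(hw, m[2])
    return h[:j - 2] + [h[j - 1], SEP, m] + h[j:]

a = ("msg", 0, "p1", frozenset({"p", "q"}))
m = ("msg", 7, "p", frozenset({"p"}))
print(insert_new_word([SEP, a], 2, m))
print(insert_at_end([SEP, a], 2, m))
```

The two results differ only in where the separator ends up: in the first case the old message stays available for an RMW, in the second it becomes adjacent to the new message.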
Making/Canceling a reservation. A higher order word HW can also be modified through
making/cancelling a reservation at a position 1 ≤ j ≤ |HW| by a process p. Thus, we define
the operation Make(HW,p, j) (resp. Cancel(HW,p, j)) that reserves (resp. cancels) a time
slot at the position j. Make(HW,p, j) (resp. Cancel(HW,p, j)) is only defined if HW[j] is of
the form (r ,v,q, S) (resp. (r ,v,q, S,p)) and HW[j − 1] = #. Then, the extended higher order
Make(HW,p, j) (resp. Cancel(HW,p, j)) is defined as HW[1, j − 1] · (r ,v,q, S,p) · HW[j + 1, |HW|]
(resp. HW[1, j − 1] · (r ,v,q, S) · HW[j + 1, |HW|]).
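The operations Make and Cancel amount to adding or removing the fifth component of the memory type at position j. A minimal sketch, assuming the same illustrative encoding as elsewhere (flat list, "#" separator, memory types as tuples, 1-based positions):

```python
SEP = "#"

def make(hw, p, j):
    """Make(HW, p, j): tag the reservation-free memory type at position j,
    which must directly follow #, with the reserving process p."""
    if not 2 <= j <= len(hw) or hw[j - 2] != SEP or len(hw[j - 1]) != 4:
        raise ValueError("Make is undefined at position %d" % j)
    return hw[:j - 1] + [hw[j - 1] + (p,)] + hw[j:]

def cancel(hw, p, j):
    """Cancel(HW, p, j): remove the tag p from the memory type at position j."""
    m = hw[j - 1] if 1 <= j <= len(hw) else SEP
    if j < 2 or hw[j - 2] != SEP or len(m) != 5 or m[4] != p:
        raise ValueError("Cancel is undefined at position %d" % j)
    return hw[:j - 1] + [m[:4]] + hw[j:]

hw = [SEP, ("msg", 0, "p1", frozenset({"p"}))]
reserved = make(hw, "q", 2)          # last component is now "q"
restored = cancel(reserved, "q", 2)  # equal to hw again
```

Only the process that made a reservation can cancel it in this sketch, which mirrors the requirement that HW[j] has the form (r, v, q, S, p) for Cancel(HW, p, j).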
Process configuration in LoHoW. A configuration of p ∈ P in LoHoW consists of a pair (σ ,HW)
where (1) σ is the process state maintaining the instruction label and the register values (see
Subsection 2.3), and (2) HW is a mapping from the set of locations to higher order words. The transition
relations −std→_p and −cert→_p between process configurations are given in Figure 10. The transition relation
−cert→_p is used only in the certification phase while −std→_p is used to simulate the standard phase of
PS 2.0-rlx. A read operation in both phases (standard and certification) is handled by reading a
value from a memory type which is on the right of the current pointer of p. A write operation, in
the standard phase, can result in the insertion, on the right of the current pointer of p, of a new
memory type at the end of a simple word or as a new simple word. The memory type resulting
from a write in the certification phase is only allowed to be inserted at the end of the higher order
word or at the reserved slots (using the rule splitting a reservation). Write can also be used to fulfil
a promise or to split a promise (i.e., partial fulfilment) during the both phases. Making/canceling a
reservation will result in tagging/untagging a memory type at the end of a simple word on the right
of the current pointer of p. The case of RMW is similar to a read followed by a write operations
(whose resulting memory type should be inserted to the right of the read memory type). Finally,
a promise can only be made during the standard phase and the resulting memory type will be
inserted at the end of a simple word or as a new word on the right of the current pointer of p.
Losses in LoHoW. Let HW and HW′ be two higher order words in (Σ∗#(Σ∪Γ))+. Let us assume that
HW = u1#a1u2#a2 . . .uk#ak and HW′ = v1#b1v2#b2 . . .vm#bm, with ui, vj ∈ Σ∗ and ai, bj ∈ Σ ∪ Γ.
We extend the subword relation ⊑ to higher order words as follows: HW ⊑ HW′ iff there is a strictly
increasing function f : {1, . . . ,k} → {1, . . . ,m} s.t. (1) ui ⊑ vf(i) for all 1 ≤ i ≤ k, (2) ai = bf(i) for all 1 ≤ i ≤ k,
and (3) we have the same number of memory types of the form (prm,−,−,−) or (prm,−,−,−,−)
in HW and HW′. The relation ⊑ corresponds to the loss of some special empty memory types
and redundant simple words (as explained earlier). The relation ⊑ is extended to mappings from
locations to higher order words as follows: HW ⊑ HW′ iff HW(x) ⊑ HW′(x) for all x ∈ Loc.
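As an illustration, the ordering on higher order words can be checked greedily. The sketch below is not taken from the paper; it assumes a hypothetical encoding in which a higher order word u1#a1 . . . uk#ak is a list of blocks (ui, ai), each ui is a list of memory type tuples, and the first component of a tuple is "msg" or "prm". Matching each block to the leftmost admissible block of the larger word is enough to decide whether a strictly increasing f exists.

```python
from typing import List, Tuple

MemType = tuple
Block = Tuple[List[MemType], MemType]   # (u_i, a_i): simple word and its last symbol


def is_subword(u: List[MemType], v: List[MemType]) -> bool:
    """Scattered-subword test: u is obtained from v by deleting symbols."""
    it = iter(v)
    return all(any(x == y for y in it) for x in u)


def count_promises(hw: List[Block]) -> int:
    """Number of memory types tagged 'prm' anywhere in the word."""
    return sum(1 for u, a in hw for s in list(u) + [a] if s[0] == "prm")


def hw_leq(hw1: List[Block], hw2: List[Block]) -> bool:
    """Decide hw1 ⊑ hw2 following conditions (1)-(3) of the definition."""
    if count_promises(hw1) != count_promises(hw2):      # condition (3)
        return False
    j = 0
    for u, a in hw1:
        # leftmost block of hw2 (at or after j) that can serve as f(i)
        while j < len(hw2) and not (hw2[j][1] == a and is_subword(u, hw2[j][0])):
            j += 1
        if j == len(hw2):
            return False
        j += 1                                          # keeps f strictly increasing
    return True
```

The extension to mappings is then a conjunction of hw_leq over all locations.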
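LoHoW states. A LoHoW state st is a tuple ((J,R),HW) where J : P ↦ L maps each process p to
the label of the next instruction to be executed, R : Reg → Val maps each register to its current
value, and HW is a mapping from locations to higher order words. The initial LoHoW state stinit
is defined as ((Jinit,Rinit),HWinit) where: (1) Jinit(p) is the label of the initial instruction of p; (2)
Rinit($r ) = 0 for every register $r ∈ Reg; and (3) HWinit(x) = HW^init_x for all x ∈ Loc.

Read, a ∈ {cert, std}:
$$\frac{\sigma \xrightarrow{\mathsf{rd}(x,v)}_{p} \sigma' \qquad i \ge \mathsf{ptr}(p,\mathsf{HW}(x)) \qquad v = \mathsf{HW}(x)[i].\mathsf{value} \qquad \mathsf{HW}' = \mathsf{HW}[x \mapsto \mathsf{move}(\mathsf{HW}(x),p,i)]}{(\sigma,\mathsf{HW}) \xrightarrow{a}_{p} (\sigma',\mathsf{HW}')}$$
(Partial) fulfilment (write), a ∈ {cert, std}, K ∈ {SP, FP}:
$$\frac{\sigma \xrightarrow{\mathsf{wt}(x,v)}_{p} \sigma' \qquad i > \mathsf{ptr}(p,\mathsf{HW}(x)) \qquad \mathsf{HW}' = \mathsf{HW}[x \mapsto (\mathsf{HW}(x) \overset{K}{\hookleftarrow}_{i} (\mathsf{msg},v,p,\{p\}))]}{(\sigma,\mathsf{HW}) \xrightarrow{a}_{p} (\sigma',\mathsf{HW}')}$$
Standard write, K ∈ {N, E}:
$$\frac{\sigma \xrightarrow{\mathsf{wt}(x,v)}_{p} \sigma' \qquad i \ge \mathsf{ptr}(p,\mathsf{HW}(x)) \qquad \mathsf{HW}' = \mathsf{HW}[x \mapsto (\mathsf{HW}(x) \overset{K}{\hookleftarrow}_{i} (\mathsf{msg},v,p,\{p\}))]}{(\sigma,\mathsf{HW}) \xrightarrow{\mathsf{std}}_{p} (\sigma',\mathsf{HW}')}$$
Certification write, K ∈ {N, E}:
$$\frac{\sigma \xrightarrow{\mathsf{wt}(x,v)}_{p} \sigma' \qquad i = |\mathsf{HW}(x)| \qquad \mathsf{HW}' = \mathsf{HW}[x \mapsto (\mathsf{HW}(x) \overset{K}{\hookleftarrow}_{i} (\mathsf{msg},v,p,\{p\}))]}{(\sigma,\mathsf{HW}) \xrightarrow{\mathsf{cert}}_{p} (\sigma',\mathsf{HW}')}$$
Splitting a reservation (write):
$$\frac{\sigma \xrightarrow{\mathsf{wt}(x,v)}_{p} \sigma' \qquad i \ge \mathsf{ptr}(p,\mathsf{HW}(x)) \qquad \mathsf{HW}' = \mathsf{HW}[x \mapsto (\mathsf{HW}(x) \overset{SR}{\hookleftarrow}_{i} (\mathsf{msg},v,p,\{p\}))]}{(\sigma,\mathsf{HW}) \xrightarrow{\mathsf{cert}}_{p} (\sigma',\mathsf{HW}')}$$
Making a reservation:
$$\frac{i \ge \mathsf{ptr}(p,\mathsf{HW}(x)) \qquad \mathsf{HW}' = \mathsf{HW}[x \mapsto \mathsf{Make}(\mathsf{HW}(x),p,i)]}{(\sigma,\mathsf{HW}) \xrightarrow{\mathsf{std}}_{p} (\sigma,\mathsf{HW}')}$$
Cancelling a reservation, a ∈ {cert, std}:
$$\frac{i \ge \mathsf{ptr}(p,\mathsf{HW}(x)) \qquad \mathsf{HW}' = \mathsf{HW}[x \mapsto \mathsf{Cancel}(\mathsf{HW}(x),p,i)]}{(\sigma,\mathsf{HW}) \xrightarrow{a}_{p} (\sigma,\mathsf{HW}')}$$
Standard update:
$$\frac{\sigma \xrightarrow{\mathsf{U}(x,v_r,w_r)}_{p} \sigma' \qquad i \ge \mathsf{ptr}(p,\mathsf{HW}(x)) \qquad v_r = \mathsf{HW}(x)[i].\mathsf{value} \qquad \mathsf{HW}' = \mathsf{HW}[x \mapsto (\mathsf{HW}(x) \overset{E}{\hookleftarrow}_{i} (\mathsf{msg},w_r,p,\{p\}))]}{(\sigma,\mathsf{HW}) \xrightarrow{\mathsf{std}}_{p} (\sigma',\mathsf{HW}')}$$
Certification update:
$$\frac{\sigma \xrightarrow{\mathsf{U}(x,v_r,w_r)}_{p} \sigma' \qquad i = |\mathsf{HW}(x)| \qquad v_r = \mathsf{HW}(x)[i].\mathsf{value} \qquad \mathsf{HW}' = \mathsf{HW}[x \mapsto (\mathsf{HW}(x) \overset{E}{\hookleftarrow}_{i} (\mathsf{msg},w_r,p,\{p\}))]}{(\sigma,\mathsf{HW}) \xrightarrow{\mathsf{cert}}_{p} (\sigma',\mathsf{HW}')}$$
(Partial) fulfilment (update), a ∈ {cert, std}, K ∈ {SP, FP}:
$$\frac{\sigma \xrightarrow{\mathsf{U}(x,v_r,w_r)}_{p} \sigma' \qquad i \ge \mathsf{ptr}(p,\mathsf{HW}(x)) \qquad v_r = \mathsf{HW}(x)[i].\mathsf{value} \qquad \mathsf{HW}' = \mathsf{HW}[x \mapsto (\mathsf{HW}(x) \overset{K}{\hookleftarrow}_{i+1} (\mathsf{msg},w_r,p,\{p\}))]}{(\sigma,\mathsf{HW}) \xrightarrow{a}_{p} (\sigma',\mathsf{HW}')}$$
Splitting a reservation (update):
$$\frac{\sigma \xrightarrow{\mathsf{U}(x,v_r,w_r)}_{p} \sigma' \qquad i \ge \mathsf{ptr}(p,\mathsf{HW}(x)) \qquad v_r = \mathsf{HW}(x)[i].\mathsf{value} \qquad \mathsf{HW}' = \mathsf{HW}[x \mapsto (\mathsf{HW}(x) \overset{SR}{\hookleftarrow}_{i} (\mathsf{msg},w_r,p,\{p\}))]}{(\sigma,\mathsf{HW}) \xrightarrow{\mathsf{cert}}_{p} (\sigma',\mathsf{HW}')}$$
Promise:
$$\frac{i \ge \mathsf{ptr}(p,\mathsf{HW}(x)) \qquad \mathsf{HW}' = \mathsf{HW}[x \mapsto (\mathsf{HW}(x) \overset{E}{\hookleftarrow}_{i} (\mathsf{prm},v,p,\{\}))]}{(\sigma,\mathsf{HW}) \xrightarrow{\mathsf{std}}_{p} (\sigma,\mathsf{HW}')}$$
SC-fence, a ∈ {std, cert}:
$$\frac{\sigma \xrightarrow{(\mathsf{SC\text{-}fence})}_{p} \sigma' \qquad i_x = \max(\mathsf{ptr}(p,\mathsf{HW}(x)), \mathsf{ptr}(g,\mathsf{HW}(x))) \qquad \mathsf{HW}' = \mathsf{HW}[x \mapsto \mathsf{move}(\mathsf{HW}(x),p,i_x)]_{x \in \mathsf{Loc}}[x \mapsto \mathsf{move}(\mathsf{HW}(x),g,i_x)]_{x \in \mathsf{Loc}}}{(\sigma,\mathsf{HW}) \xrightarrow{a}_{p} (\sigma',\mathsf{HW}')}$$
Fig. 10. LoHoW inference rules at the process level, defining the transition (σ , HW) $\xrightarrow{a}_{p}$ (σ′, HW′) where p ∈ P and
a ∈ {std, cert} is the current mode. σ = (J , R) and σ′ = (J′, R′) represent local process states.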
Now we are ready to define the induced transition relation between LoHoW states. For two
LoHoW states st = ((J,R),HW) and st′ = ((J′,R′),HW′) and a ∈ {std, cert}, we write st $\xrightarrow{a}_{p}$ st′ iff
one of the following cases holds: (1) ((J(p),R),HW) $\xrightarrow{a}_{p}$ ((J′(p),R′),HW′) and J(p′) = J′(p′) for all
p′ ≠ p, or (2) (J,R) = (J′,R′) and HW ⊑ HW′.
Two phases LoHoW states. A two-phases state of LoHoW is S = (π ,p, ststd, stcert) where π ∈
{cert, std} is a flag describing whether the LoHoW is in “standard” phase or “certification” phase, p
is the process which evolves in one of these phases, while ststd, stcert are two LoHoW states (one for
each phase). When the LoHoW is in the standard phase, then ststd evolves, and when the LoHoW is
in certification phase, stcert evolves. A two-phases LoHoW state is said to be initial if it is of the form
(std,p, stinit, stinit), where p ∈ P is any process. The transition relation→ between two-phases
LoHoW states is defined as follows: Given S = (π ,p, ststd, stcert) and S′ = (π ′,p ′, st′std, st′cert), we
have S → S′ iff one of the following cases holds:
• During the standard phase. π = π′ = std, p = p′, stcert = st′cert and ststd $\xrightarrow{\mathsf{std}}_{p}$ st′std. This
corresponds to a simulation of a standard step of the process p.
• During the certification phase. π = π′ = cert, p = p′, ststd = st′std and stcert $\xrightarrow{\mathsf{cert}}_{p}$ st′cert.
This corresponds to a simulation of a certification step of the process p.
• From the standard phase to the certification phase. π = std, π′ = cert, p = p′,
ststd = st′std = ((J,R),HW), and st′cert is of the form ((J,R),HW′) where for every x ∈ Loc,
HW′(x) = HW(x)#(msg,v,q, {}) if HW(x) is of the form w · #(−,v,−,−,q) with q ≠ p, and
HW′(x) = HW(x) otherwise. This corresponds to the copying of the standard LoHoW state
to the certification LoHoW state in order to check if the set of promises made by the process
p can be fulfilled. The higher order word HW′(x) (at the beginning of the certification phase)
is almost the same as HW(x) (at the end of the standard phase) except when the rightmost
memory type (−,v,−,−,q) of HW(x) is tagged by a reservation of a process q ≠ p. In that case,
we append the memory type (msg,v,q, {}) at the end of HW(x) to obtain HW′(x). Note that
this is in accordance with the definition of capping memory before going into certification: by
item 2 in the definition of capped memory of [Lee et al. 2020], a cap message is added for each location unless it
is a reservation made by the process going in for certification. It is easy to see that this transition
rule can be implemented by a sequence of transitions which copies one symbol at a time, from
HW to HW′.
• From the certification phase to the standard phase. π = cert, π′ = std, ststd = st′std, stcert =
st′cert, and stcert is of the form ((J,R),HW) where HW(x) does not contain any memory type of
the form (prm,−,p,−)/(prm,−,p,−,−) for all x ∈ Loc (i.e., all promises made by p are fulfilled).
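The capping step performed when moving from the standard phase to the certification phase can be sketched as follows. This is only an illustration under an assumed encoding: a higher order word is a Python list of "#" separators and memory type tuples (r, v, q, S), with a fifth component naming the reserving process when the memory type is tagged by a reservation; the function name cap_for_certification is hypothetical.

```python
from typing import Dict, List, Union

HASH = "#"
Symbol = Union[str, tuple]


def cap_for_certification(hw_map: Dict[str, List[Symbol]], p: str) -> Dict[str, List[Symbol]]:
    """Build HW' from HW when process p enters the certification phase."""
    capped = {}
    for x, hw in hw_map.items():
        last = hw[-1]
        if isinstance(last, tuple) and len(last) == 5 and last[4] != p:
            # rightmost memory type is reserved by q != p: append #(msg, v, q, {})
            v, q = last[1], last[4]
            capped[x] = hw + [HASH, ("msg", v, q, frozenset())]
        else:
            capped[x] = list(hw)        # HW'(x) = HW(x) otherwise
    return capped
```

The standard-phase mapping is left unchanged, in line with the requirement ststd = st′std of the rule.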
The Reachability Problem in LoHoW. Given an instruction label function J : P → L that maps
each p ∈ P to a label in Lp, the reachability problem in LoHoW asks whether there exists a two-phases
LoHoW state S of the form (std,−, ((J ,R),HW), ((J′,R′),HW′)) s.t. (1) HW(x) and HW′(x)
do not contain any memory type of the form (prm,−,p,−)/(prm,−,p,−,−) for all x ∈ Loc, and (2)
S is reachable in LoHoW (i.e., S0 [→]∗ S where S0 is an initial two-phases LoHoW state). In the
case of a positive answer to this problem, we say that J is reachable in Prog in LoHoW.
Theorem 4.3. An instruction label function J is reachable in a program Prog in LoHoW iff J is
reachable in Prog in PS 2.0-rlx.
4.3 Decidability of LoHoW with Bounded Promises
The equivalence of reachability in LoHoW and PS 2.0-rlx, coupled with Theorem 3.1, shows that
reachability is undecidable in LoHoW. To recover decidability, we look at LoHoW with only a bounded
number of promise memory types in any higher order word. Let K-LoHoW denote LoHoW with
a number of promises bounded by K. (Observe that K-LoHoW corresponds to bdPS 2.0-rlx.)
Theorem 4.4. The reachability problem is decidable for K-LoHoW.
As a corollary of Theorem 4.4, the decidability of reachability follows for bdPS 2.0-rlx. The proof
makes use of the framework of Well-Structured Transition Systems (WSTS) [Abdulla and Jonsson
1996; Finkel and Schnoebelen 2001], and follows from Lemmas 4.5 to 4.8.
Well-Structured Transition Systems (WSTS). We recall the main ingredients of WSTS. For
more details, the reader is referred to Abdulla and Jonsson [1996]; Finkel and Schnoebelen [2001].
Well-quasi Orders. Given a (possibly infinite) set C, a quasi-order on C is a reflexive and transitive
relation ⪯ ⊆ C × C. An infinite sequence c1, c2, . . . in C is said to be saturating if there exist
indices i < j s.t. ci ⪯ cj. A quasi-order ⪯ is said to be a well-quasi order (wqo) on C if every infinite
sequence in C is saturating. Given a quasi-order ⪯ on C, the embedding order ⊑ on C∗ (i.e., the set
of finite words over C) is defined as a1a2 . . . am ⊑ b1b2 . . . bn if there exists a strictly increasing
function g : {1, 2, . . . ,m} → {1, 2, . . . ,n} s.t. for all 1 ≤ i ≤ m, ai ⪯ bg(i). It is well-known that if ⪯
is a wqo on C, then the embedding order ⊑ is also a wqo on C∗ [Higman 1952].
Upward Closure. Given a wqo ⪯ on a set C, a set U ⊆ C is upward closed if for every a ∈ U and b ∈ C,
with a ⪯ b, we have b ∈ U. The upward closure of a set U ⊆ C is U↑ = {b ∈ C | ∃a ∈ U, a ⪯ b}. It
is known that every upward closed set U can be characterized by a finite minor. A minor M ⊆ U is
s.t. (i) for each a ∈ U, there is a b ∈ M s.t. b ⪯ a, and (ii) for all a, b ∈ M s.t. a ⪯ b, we have a = b.
For an upward closed set U, let min be the function that returns the minor of U.
Well-Structured Transition Systems (WSTS). Let T be a transition system with (possibly infinite) set
of states C, initial states Cinit and transition relation ⇝ ⊆ C × C. Let ⪯ be a well-quasi ordering on C.
We define the set of predecessors of a subset U ⊆ C of states as Pre(U) = {c ∈ C | ∃c′ ∈ U . c ⇝ c′}.
For a state c, we denote the set min(Pre({c}↑) ∪ {c}↑) as minpre(c). T is called well-structured if
⇝ is monotonic w.r.t. ⪯: that is, given c1, c2 and c3 in C, if c1 ⇝ c2 and c1 ⪯ c3, then there exists a
state c4 s.t. c3 ⇝∗ c4 and c2 ⪯ c4.
Given a finite set of states Ctarget ⊆ C, the coverability problem asks if there is a state c′ ∈ Ctarget↑
reachable in T. The following conditions are sufficient for the decidability of this problem: (i)
for every two states c1, c2 ∈ C, it is decidable if c1 ⪯ c2, (ii) for every c ∈ C, we can check if
{c}↑ ∩ Cinit ≠ ∅, and (iii) for each c ∈ C, the set minpre(c) is finite and computable.
The algorithm for checking WSTS coverability is based on a backward analysis. The sequence
(Ui)i≥0 with U0 = min(Ctarget) and Ui+1 = min(Pre(Ui↑) ∪ Ui↑) reaches a fixpoint and is computable
[Abdulla and Jonsson 1996; Finkel and Schnoebelen 2001].
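To make the backward analysis concrete, here is a minimal sketch on a deliberately simple WSTS that is not the LoHoW system: a vector addition system whose states are tuples of natural numbers ordered componentwise, and whose transitions are (take, give) pairs. All names (leq, minimize, min_pre, coverable) are hypothetical. For this system, the minimal predecessor of the upward closure of m under (take, give) is take + max(m − give, 0), computed componentwise.

```python
from typing import Iterable, List, Set, Tuple

Vec = Tuple[int, ...]
Transition = Tuple[Vec, Vec]            # (tokens consumed, tokens produced)


def leq(a: Vec, b: Vec) -> bool:
    """Componentwise order on vectors of naturals (a wqo by Dickson's lemma)."""
    return all(x <= y for x, y in zip(a, b))


def minimize(vs: Iterable[Vec]) -> Set[Vec]:
    """Return the minor (set of minimal elements) of a finite set."""
    vs = set(vs)
    return {v for v in vs if not any(w != v and leq(w, v) for w in vs)}


def min_pre(m: Vec, transitions: List[Transition]) -> Set[Vec]:
    """Minimal states with a one-step successor in the upward closure of m."""
    return {
        tuple(c + max(mi - d, 0) for mi, c, d in zip(m, take, give))
        for take, give in transitions
    }


def coverable(init: Vec, target: Vec, transitions: List[Transition]) -> bool:
    """Backward coverability: iterate U_{i+1} = min(Pre(U_i up) ∪ U_i up)."""
    basis = {target}
    while True:
        step = {s for m in basis for s in min_pre(m, transitions)}
        new_basis = minimize(basis | step)
        if new_basis == basis:          # fixpoint reached
            break
        basis = new_basis
    return any(leq(m, init) for m in basis)
```

For instance, with the single transition ((1, 0), (0, 1)) that moves a token from the first to the second component, coverable((2, 0), (0, 2), ...) holds while coverable((2, 0), (0, 3), ...) does not. Termination of the loop relies on the ordering being a wqo, exactly as in the general argument.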
LoHoW with bounded promises is a WSTS. We will show that the K-LoHoW transition sys-
tem is a well-structured transition system. Let C denote the set of two-phases K-LoHoW states
of Prog. Given an instruction label function J : P → L, let Ctarget be a finite subset of
C of the form (std,−, ((J ,R),HW), ((J ′,R′),HW′)) such that for every x ∈ Loc, we have: (1)
HW(x) and HW′(x) do not contain any memory type of the form (prm,−,p,−)/(prm,−,p,−,−),
and (2) |HW(x)|, |HW′(x)| ≤ |P|. We define the well-quasi ordering ⊑ on C in a way
that the upward closure of Ctarget consists of all two-phases K-LoHoW states of the form
(std,−, ((J ,R),HW), ((J ′,R′),HW′)) such that for every x ∈ Loc, HW(x) and HW′(x) do not con-
tain any memory type of the form (prm,−,p,−)/(prm,−,p,−,−). Then, the coverability of Ctarget
is equivalent to the reachability of J in K-LoHoW.
In the following, we define the well-quasi ordering ⊑ on C (Lemma 4.5). Then, we show the
monotonicity of the K-LoHoW transition relation → w.r.t. ⊑ (Lemma 4.7). Finally, we show how to
compute the set of predecessors of a given two-phases K-LoHoW state (Lemma 4.8). Observe that
the first and second sufficient conditions for the decidability of the coverability problem, namely
comparing two states and checking whether an upward closed set contains the initial state, are
trivial (the second condition can be reduced to checking whether a minimal state is equal to the initial state).
The ordering ⊑ defined on mappings from locations to higher order words can be extended to
two-phases K-LoHoW states componentwise: (π ,p, ((J1,R1),HW1), ((J2,R2),HW2)) ⊑
(π′,p′, ((J′1,R′1),HW′1), ((J′2,R′2),HW′2)) holds iff π′ = π, p′ = p, (J1,R1) = (J′1,R′1), (J2,R2) = (J′2,R′2),
HW1 ⊑ HW′1, and HW2 ⊑ HW′2. Since the embedding order ⊑ is a wqo on higher order words
when the number of promises is bounded [Higman 1952], we obtain the following lemma.
Lemma 4.5. The relation ⊑ is a well-quasi ordering on the two phases K-LoHoW states.
Consider now a two-phases K-LoHoW state S of the form (std,−, ((J ,R),HW), ((J′,R′),HW′))
such that for every x ∈ Loc, HW(x) and HW′(x) do not contain any memory type of the form
(prm,−,p,−)/(prm,−,p,−,−). It is easy to see that S ∈ Ctarget↑. This implies that:
Lemma 4.6. The coverability of Ctarget is equivalent to the reachability of J in K-LoHoW.
Monotonicity. The following lemma shows the monotonicity of the K-LoHoW transition relation
→ w.r.t. ⊑. This allows the backward algorithm for coverability to work with only upward closed
sets, since the set of predecessors of an upward closed set is also upward closed [Abdulla and
Jonsson 1996; Finkel and Schnoebelen 2001].
Lemma 4.7. The transition relation→ is monotonic w.r.t. ⊑.
Computing the set of predecessors. The last sufficient condition for the decidability of the
coverability problem in K-LoHoW is stated by the following lemma.
Lemma 4.8. For each two-phases K-LoHoW state c , the set minpre(c) is effectively computable.
5 SOURCE TO SOURCE TRANSLATION
We consider a parametric under-approximation in the spirit of context bounding [Atig et al. 2011],
[La Torre et al. 2009], [Lal and Reps 2009], [Norris and Demsky 2016], [Musuvathi and Qadeer 2007],
[Qadeer and Rehof 2005], [Abdulla et al. 2019], [Abdulla et al. 2017]. The bounding concept chosen
for concurrent programs depends on aspects related to the interactions between the processes.
In the case of SC programs, context bounding has been shown experimentally to have extensive
behaviour coverage for bug detection [Musuvathi and Qadeer 2007], [Qadeer and Rehof 2005]. A
context in the SC setting is a computation segment where only one process is active. The concept
of context bounding has been extended for weak memory models. For instance, in TSO, the notion
of context is extended to one where all updates to the main memory are done only from the
buffer of the active thread [Atig et al. 2011]. In the case of POWER [Abdulla et al. 2017], context
was extended to consider propagation actions performed by the active process. In the case of RA
[Abdulla et al. 2019], context bounding was extended to view bounding, using the notion of view
switching messages. The notion of bounding appropriate for a model depends on its underlying
complexity. From a theoretical point of view, we have already seen that PS 2.0 is very complex, and
bounding contexts is not sufficient. Our bounding notion for PS 2.0 is based on its various features,
which include relaxed as well as RA memory accesses, promises and certification. Since PS 2.0
subsumes RA, we first recall the bounding notion used in RA, which is based on view altering messages.
View Altering Messages. A message in the memory is view altering if it changes the view of a process
reading it. The under-approximate analysis for RA [Abdulla et al. 2019] considered view bounded
runs, which are runs where the number of view altering messages is bounded.
Essential Events. An essential event in a run ρ of a concurrent program under PS 2.0 is either a
promise, a reservation or a view altering message, which is made by some process in the run.
Bounded Context. A context is an uninterrupted sequence of actions by a single process. In a run
having K contexts, the execution switches from one process to another K − 1 times. A K bounded
context run is one where the number of context switches is bounded by K ∈ N. The K bounded
context reachability problem in SC checks for the existence of a K bounded context run reaching
some chosen instruction. An SC program is called a K bounded context program if all its runs are K
bounded context. Now we define the notion of bounding for PS 2.0.
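As a small illustration of these definitions, the sketch below splits a run into contexts and counts context switches. It assumes, purely for the example, that a run is given as the list of the process names performing each step; the function names are hypothetical.

```python
from itertools import groupby
from typing import List, Tuple


def contexts(run: List[str]) -> List[Tuple[str, int]]:
    """Split a run into maximal uninterrupted blocks (process, number of steps)."""
    return [(proc, len(list(steps))) for proc, steps in groupby(run)]


def context_switches(run: List[str]) -> int:
    """A run with k contexts switches from one process to another k - 1 times."""
    return max(len(contexts(run)) - 1, 0)


def is_k_bounded_context(run: List[str], k: int) -> bool:
    """True if the number of context switches of the run is bounded by k."""
    return context_switches(run) <= k
```

For the run p1, p1, p2, p1 this gives three contexts and two context switches.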
The Bounded Consistent Reachability Problem. Consider a run ρ of a concurrent program
under PS 2.0, $\mathsf{MS}_0 \,[\xrightarrow[p_{i_1}]{}]^{*}\, \mathsf{MS}_1 \,[\xrightarrow[p_{i_2}]{}]^{*}\, \mathsf{MS}_2 \,[\xrightarrow[p_{i_3}]{}]^{*} \cdots [\xrightarrow[p_{i_n}]{}]^{*}\, \mathsf{MS}_n$. A run ρ of a concurrent
program Prog under PS 2.0 is called K bounded iff the number of essential events in ρ is ≤ K. The
K bounded reachability problem for PS 2.0 checks for the existence of a run ρ of Prog which is
K-bounded. Assuming Prog has n processes, we propose an algorithm that reduces the K bounded
reachability problem to a K + n bounded context reachability problem under SC.

⟦Prog⟧ := ⟨global vars⟩; ⟨Main⟩; (⟦proc p reg $r∗ i∗⟧)∗
⟦proc p reg $r∗ i∗⟧ := proc p reg $r∗ ⟨local vars⟩ ⟨InitProc⟩ ⟨CSO⟩p,λ0 (⟦i⟧p)∗
⟦λ : i⟧p := λ : ⟨CSI⟩; ⟦s⟧p; ⟨CSO⟩p,λ
⟦if exp then i∗ else i∗⟧p := if exp then (⟦i⟧p)∗ else (⟦i⟧p)∗
⟦while exp do i∗⟧p := while exp do (⟦i⟧p)∗
⟦assume(exp)⟧p := assume(exp)
⟦$r = exp⟧p := $r = exp
⟦x = $r⟧p_o, o ∈ {rlx, ra} := see Write pseudocode
⟦$r = x⟧p_o, o ∈ {rlx, ra} := see Read pseudocode
Fig. 11. Source-to-source translation map
Translation Overview. Let Prog be a concurrent program under PS 2.0 with set of processes P
and locations Loc. Our algorithm relies on a source-to-source translation of Prog to a bounded
context SC program ⟦Prog⟧, as shown in Figure 11, which operates on the same data domain. The
translation adds a new process (Main) that initializes the global variables of ⟦Prog⟧. The translation
of a process p ∈ P adds local variables, which are initialized by the function InitProc.
This is followed by the code block ⟨CSO⟩p,λ0 (Context Switch Out) that optionally enables the
process to switch out of context. For each instruction i appearing in the code of p, the map ⟦i⟧p
transforms it into a sequence of instructions as follows: the code block ⟨CSI⟩ (Context Switch
In) checks if the process is active in the current context; then it transforms each statement s of
instruction i into a sequence of instructions following the map ⟦s⟧p, and finally executes the code
block ⟨CSO⟩p,λ. When the process is at an instruction label λ, ⟨CSO⟩p,λ facilitates two things: (1)
it allows p to make promises/reservations after λ, s.t. the control is back at λ after certification; (2) it
ensures that the machine state is consistent when p switches out of context. The translations of assume,
if and while statements keep the same statement. The translations of read and write statements are
described later. The translation of RMW statements is omitted for ease of presentation.
The set of promises a process makes has to be constrained with respect to the set of promises
that it can certify, since processes can generate arbitrarily many promises/reservations, while, in
reality only a few of them will be certifiable. To address this, in the translation, processes generate
promises and reservations on-the-fly, and immediately certify freshly made promises. A process
runs then in two modes : a ‘normal’ mode and a ‘check’ (consistency check) mode. In the normal
mode, a process does not make any promises or reservations. In the check mode, the process may
make promises and reservations and subsequently certify them. In any context, a process first
enters the normal mode, and then, before exiting the context it enters the check mode. The check
mode is used by the process to (1) make new promises/reservations and (2) certify consistency of
the machine state. We also add an optional parameter, called certification depth (certDepth), which
constrains the number of steps a process may take in the check mode to certify its promises. Figure
12 shows the structure of a translated run under an SC program.
To reduce the PS 2.0 run into a bounded context SC run, we use the bound on the number of
essential events. From the run ρ in PS 2.0, we construct a K bounded run ρ ′ in PS 2.0 where the
processes run in the order of generation of essential events. So, the process which generates the
first essential event is run first, till that event happens, then the second process which generates
the second essential event is run, and so on. This continues till K + n contexts : the K bounds the
number of essential events, and the n is to ensure all processes are run to completion. The bound
on the number of essential events gives a bound on the number of timestamps that need to be
maintained. As observed in [Abdulla et al. 2019], one view altering read requires two timestamps;
additionally, each promise/reservation requires one timestamp. Since we have K such essential
events, 2K time stamps suffice. We choose Time = {0, 1, 2, . . . , 2K} as the set of timestamps.
Data Structures. We mention the significant ones. The message data structure represents a mes-
sage generated as a write or a promise and has 4 fields (i) var , the address of the memory location
written to; (ii) the timestamp t in the view associated with the message; (iii) v , the value written;
and (iv) flag, that keeps track of whether it is a message or a promise; and, in case of a promise,
which process it belongs to. The View data structure stores, for each memory location x , (i) a
timestamp t ∈ Time, (ii) a value v written to x , (iii) a Boolean l ∈ {true, false} representing
whether t is an exact timestamp (which can be used for essential events) or an abstract timestamp
(which corresponds to non-essential events).
Global Variables. The Memory is an array of size K holding elements of type message. This array is
populated with the view switching messages, promises and reservations generated by the program.
We maintain counters for (1) the number of elements in Memory; (2) the number of context switches
that have occurred; and (3) the number of essential events that have occurred.
Local Variables. In addition to its local registers, each process has local variables including
• a local variable view , which stores a local instance of the view function (this is of type View),
• active: a boolean variable which is set when the process is running in the current context,
and
• checkMode: a boolean denoting whether the process is in the certification phase. We imple-
ment the certification phase as a function call, and hence store the process state and return
address, while entering it.
Subroutines. We use certain helper subroutines as follows:
• genMessage is a subroutine which generates an instance of the message data structure;
• saveState(p) is a subroutine which saves the values of the global variables and the local states
(instruction labels and local variables) of process p. This is used when switching into check
mode.
• loadState(p) is a subroutine which loads the values of the global variables and local states of
p which were saved using saveState(p). This is used when switching out of check mode.
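The data structures and helper subroutines listed above might be modelled as in the following sketch. It is only an illustration of the description, not the code generated by the translation: the Python class and field names are hypothetical, values are assumed to be integers, the flag field uses None for a concrete message and the owner's name for a promise, and counting every inserted element as one essential event is a simplifying assumption.

```python
import copy
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Message:
    var: str                    # memory location written to
    t: int                      # timestamp in the view of the message
    v: int                      # value written
    flag: Optional[str] = None  # None: concrete message; else owner of the promise


@dataclass
class ViewEntry:
    t: int                      # timestamp drawn from Time = {0, ..., 2K}
    v: int                      # value written to the location
    exact: bool                 # exact (essential) or abstract timestamp


@dataclass
class Globals:
    K: int
    memory: List[Message] = field(default_factory=list)
    num_contexts: int = 0
    num_essential: int = 0

    def insert(self, msg: Message) -> bool:
        """Add msg to Memory unless the bound K is already reached."""
        if len(self.memory) >= self.K:
            return False
        self.memory.append(msg)
        self.num_essential += 1     # assumption: one essential event per element
        return True


@dataclass
class Process:
    name: str
    view: Dict[str, ViewEntry] = field(default_factory=dict)
    active: bool = False
    check_mode: bool = False
    saved: Optional[Tuple[Globals, Dict[str, ViewEntry]]] = None


def save_state(g: Globals, p: Process) -> None:
    """saveState(p): snapshot the global variables and the local view of p."""
    p.saved = (copy.deepcopy(g), copy.deepcopy(p.view))


def load_state(g: Globals, p: Process) -> None:
    """loadState(p): restore the snapshot taken by save_state."""
    saved_g, saved_view = p.saved
    g.memory, g.num_contexts, g.num_essential = (
        saved_g.memory, saved_g.num_contexts, saved_g.num_essential)
    p.view = saved_view
    p.saved = None
```

The number of elements in Memory is simply len(memory) here, so no separate counter is kept for it.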
[Figure 12 diagram: init → p1 n → p1 cc → · · · → pj−1 n → pj−1 cc → pj n → ASSERT(false); the transitions are labelled with the CSO blocks of the respective processes (CSOp1, . . . , CSOpj−1); a cc segment is annotated "≤ certDepth"; an n segment followed by a cc segment is annotated "one context".]
Fig. 12. Control flow: In each context, a process runs first in normal mode n and then in consistency check
mode cc. The transitions between these modes are facilitated by the CSO code block of the respective process.
We check for assertion failures for K + n context-bounded executions (j ≤ K + n).
5.1 Translation Maps
In what follows we illustrate how the translation simulates a run under PS 2.0. At the outset, recall
that each process alternates, in its execution, between two modes: a normal mode (n in Figure 12)
at the beginning of each context and the check mode at the end of the current context (cc in Figure
12), where it may make new promises and certify them before switching out of context.
Context Switch Out (CSOp,λ). We describe the CSO module (Algorithm 1 provides its pseudocode).
Algorithm 1: CSO
/* nondeterministically enter check mode and exit context */
if nondet() then
    if ¬checkMode then
        /* enter consistency check */
        if not in context then
            enter context
        end
        checkMode ← true
        save localstate
        returnAddr ← λ
    else
        /* consistency check successful! */
        ensure all Promises for process are certified
        /* for next context */
        mark all Promises as uncertified
        checkMode ← false
        load localstate
        goto returnAddr
        exit context
    end
end
Algorithm 2: Write
update localstate with write
if nondet() then /* (i) no fresh timestamp */
    if checkMode then
        /* since write is not a promise */
        certify message with reservation or splitting
    end
else if nondet() then /* (ii) fresh timestamp */
    generate a view; generate a message
    if checkMode then
        insert message into Memory as Promise and certify
    else
        insert message into Memory as concrete message
    end
else /* (iii) fulfill old promise */
    get Promise from Memory
    check variable, value and view match
    if checkMode then
        mark message as certified
    else
        mark message as fulfilled
    end
    replace message into Memory
end
CSOp,λ is placed after each instruction λ in the original program and serves as an entry and exit
point for the consistency check phase of the process. When in normal mode (n) after some instruction λ,
CSO non-deterministically guesses whether the process should exit the context at this point; if so, it sets
the checkMode flag to true and subsequently saves its local state and the return address (to mark where
to resume execution from, in the next context). The process then continues its execution in the
consistency check mode (cc) from the current instruction label (λ) itself. Now the process may generate new
promises (see the write rule) and certify these as well as earlier made promises. In order to conclude the
check mode phase, the process will enter the CSO block at some different instruction label λ′. Now the
checkMode flag is already true, and hence the process enters the else branch, checks that all promises
of p are certified (no promise is outstanding), and marks all its (temporarily) certified promises as
uncertified for the next context. Then it exits the check mode phase, setting checkMode to false. Finally it
loads the saved state, returns to the instruction label λ (where it entered check mode) and exits the
context. Another process may now resume execution.
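The control flow of the CSO block can be mimicked by a small executable sketch. It is an illustration under simplifying assumptions rather than the actual translated code: the local state is reduced to a register map, the non-deterministic choice is passed in as the Boolean guess, a failed "ensure" is modelled by the hypothetical exception Infeasible, and all class and field names are invented for the example.

```python
import copy
from dataclasses import dataclass, field
from typing import Dict, List, Optional


class Infeasible(Exception):
    """Models a failed `ensure`: the simulated execution is discarded."""


@dataclass
class Promise:
    var: str
    value: int
    certified: bool = False


@dataclass
class Proc:
    regs: Dict[str, int] = field(default_factory=dict)
    promises: List[Promise] = field(default_factory=list)
    in_context: bool = False
    check_mode: bool = False
    saved_regs: Optional[Dict[str, int]] = None
    return_addr: Optional[str] = None


def cso(proc: Proc, label: str, guess: bool) -> Optional[str]:
    """Run the CSO block placed after `label`; return the label to continue from."""
    if not guess:                       # nondet() chose to skip the block
        return label
    if not proc.check_mode:             # enter consistency check
        proc.in_context = True
        proc.check_mode = True
        proc.saved_regs = copy.deepcopy(proc.regs)   # save localstate
        proc.return_addr = label
        return label
    # conclude consistency check
    if not all(pr.certified for pr in proc.promises):
        raise Infeasible("some promise is not certified")
    for pr in proc.promises:
        pr.certified = False            # to be certified again in the next context
    proc.check_mode = False
    proc.regs = proc.saved_regs         # load localstate
    proc.saved_regs = None
    proc.in_context = False             # exit context
    return proc.return_addr
```

Register updates made while in check mode are discarded on exit, whereas the promise list survives with all certification marks reset.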
Write Statements. The translation of a write instruction ⟦x := $r⟧o, where o ∈ {rlx, ra}, of a
process p is given in Algorithm 2. This is the general pseudocode for both kinds of memory accesses,
with specific details pertaining to the particular access mode omitted. Let us first consider execution in
the normal mode (i.e., checkMode is false). First, the process updates its local state with the value that it
will write. Then, the process non-deterministically chooses one of three possibilities for the write: it
either (i) does not assign a fresh timestamp (non-essential event), (ii) assigns a fresh timestamp and
adds the message to memory, or (iii) fulfils some outstanding promise.
Let us now consider a write executing in the certification phase (i.e., when checkMode is true).
We only highlight the differences between the normal and certification phase writes. In
case (i), for non-essential events, notice that in the certification phase we work with a capped memory.
Hence, the process can make a write if either (1) the write interval can be generated through splitting insertion
or (2) the write can be certified with the help of a reservation. Basically, a write either
splits an existing interval (and is added to the left of a promise), or is adjacent to the right of a
message, or uses a reservation. Thus, the timestamp of a neighbour is used. Case (ii) is similar to the
normal mode, except that now the process must make the write as a promise, since it is in the check
mode. Moreover, the promise made is immediately certified. In case (iii), the only difference with
the normal mode is that the promise is only temporarily certified (for the current context) and will
need to be fulfilled in some future context.
Algorithm 3: Read
if nondet() then /* local read */
    check local state is valid
    update local state with read
else /* nonlocal (view-switching) read */
    check that local state allows read
    get message from Memory
    check variable, value, view are allowed
    update local state with message view
end
Read Statements. The translation of a read instruction ⟦$r := x⟧o, o ∈ {rlx, ra}, of process p is
given in Algorithm 3. The process first guesses whether it will read from a view switching message
in the memory or from its local view. If it is the latter, the process must first verify whether it can
read from the local view. To give an idea of when this is not possible, consider a process which
executed a fence instruction. In this case, the timestamp on a variable x may get incremented from
t to t′ > t, and the current value (corresponding to t) cannot be read again. In the case of a view
switching read, we first check that we have not reached the essential event bound.
6 IMPLEMENTATION AND EXPERIMENTAL RESULTS
In order to check the efficiency of the source-to-source translation, we implement a prototype tool,
PS2SC which is the first tool to handle PS 2.0. PS2SC takes as input a C program and a bound K
and translates it to a program Prog′ to be run under SC. We use CBMC version 5.10 as backend to
verify Prog′. CBMC takes as input L, the loop unrolling parameter for bounded model checking of
Prog′. We supply the bound on Essential Events, K as a parameter to PS2SC. PS2SC then considers
the subset of executions respecting the bounds K and L provided as input. If it returns unsafe, then
the program has an unsafe execution. Conversely, if it returns safe then none of the executions
within the subset violate any assertion. K may be iteratively incremented to increase the number of
executions explored. We provide a functionality with which the user optionally selects a subset of
processes for which promises and reservations will be enabled. While in the extreme cases we can
run PS2SC in the promise-full (all processes can promise) and promise-free modes, partial promises
(allowing subsets of processes to promise) turn out to be an effective technique.
We now report the results of experiments we have performed with PS2SC. We have two objectives:
(1) studying the performance of PS2SC on benchmarks which are unsafe only if promises are enabled
and (2) comparing PS2SC with other model checkers when operating in the promise-free mode
(since they cannot handle promises). In the first case, we show that PS2SC is able to uncover bugs
in examples with low interaction (reads and writes) with the shared memory. When this interaction
increases, however, PS2SC does not scale, owing to the huge non-determinism in PS 2.0. However,
with partial promises, PS2SC is once again able to uncover bugs in reasonable amounts of time. In
the second case, our observations highlight the ability to detect hard to find bugs with small K for
unsafe benchmarks, and scalability by altering K as discussed earlier in case of safe benchmarks.
We compare PS2SC with three state-of-the-art stateless model checking tools, CDSChecker [Norris
and Demsky 2013], GenMC [Kokologiannakis et al. 2019] and Rcmc [Kokologiannakis et al. 2017],
that support the promise-free subset of the PS 2.0 semantics. In the tables that follow we provide
the value of K (for PS2SC only) and the value of L (for all tools). We do not consider compilation
time for any tool while reporting the results. For PS2SC, the time reported is the time taken by the
CBMC backend for analysis. The timeout used is 1 hour for all benchmarks. All experiments are
conducted on a machine with a 3.00 GHz Intel Core i5-3330 CPU and 8GB RAM running a 64-bit
Ubuntu 16 operating system. We denote timeout by ‘TO’, and memory limit exceeded by ‘MLE’.
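For readers who want to compare the reported numbers programmatically, the following sketch converts the time cells used in the tables below (for instance `0.742s`, `2m4s`, `TO`) into seconds. The helper name `parse_time` is our own and is not part of any of the tools.

```python
import re
from typing import Optional

# Optional "<minutes>m" prefix followed by "<seconds>s".
_TIME = re.compile(r"^(?:(\d+)m)?(\d+(?:\.\d+)?)s$")


def parse_time(cell: str) -> Optional[float]:
    """Convert a table cell to seconds; 'TO' and 'MLE' map to None
    because they carry no running time."""
    if cell in ("TO", "MLE"):
        return None
    match = _TIME.match(cell)
    if match is None:
        raise ValueError(f"unrecognised time cell: {cell!r}")
    minutes = int(match.group(1)) if match.group(1) else 0
    return 60 * minutes + float(match.group(2))


print(parse_time("2m4s"), parse_time("0.742s"), parse_time("TO"))
```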
6.1 Experimenting with Promises
In this section we check the efficiency of the source-to-source translation in handling promises for
PS 2.0 (which is the most difficult part due to the non-determinism). We first test PS2SC on some
typical litmus-test scenarios adapted from Chakraborty and Vafeiadis [2019]; Kang et al. [2017]; Lee
et al. [2020] to illustrate promise/reservation generation. PS2SC is able to verify these tests within
1 minute, which shows the ability of PS2SC to handle typical programming idioms of PS 2.0.
testcase K PS2SC
fib_local_3 4 0.742s
fib_local_4 4 0.761s
fib_local_cas_3 4 1.132s
fib_local_cas_4 4 1.147s
Table 1. Performance of PS2SC on cases with local computation
In Table 1 we consider unsafe examples in which a process is required to generate a promise
(speculative write) whose value is the i-th Fibonacci number (Fibonacci-based benchmarks for
SV-COMP 2019 [Beyer 2019]). This promise is certified using computations local to the process.
Thus, although the parameter i increases, the interaction of the promising process with the memory
remains constant. The CAS variant requires the process to make use of reservations. We note that
PS2SC uncovers the bugs effectively in all these cases.
Now we consider the case where promises require some interaction between processes. We
consider an example adapted from the Fibonacci-based benchmarks for SV-COMP 2019 [Beyer
2019], where two processes compute the i-th Fibonacci number in a distributed fashion. Unlike the
previous case, the amount of interaction here increases with i. In this case, however, our tool times out.
testcase K PS2SC[1p]
fib_global_2 4 55.972s
fib_global_3 4 2m4s
fib_global_4 4 4m20s
exp_global_1 4 19m37s
exp_global_2 4 41m12s
Table 2. Performance of PS2SC on cases with global computation
How do we recover tractable analysis in this case? We tackle this problem by a modular approach of
allowing partial-promises, i.e., subsets of processes are allowed to generate promises/reservations.
In the experiments, we allowed only a single process to do so. The results obtained are in Table 2,
where PS2SC[1p] denotes that only one process is permitted to perform promises. We then repeat our
experiments on two other unsafe benchmarks based on ExponentialBug from Fig. 2 of
[Huang 2015] and make similar observations. With this modular approach PS2SC uncovers the bug.
To summarize, we note that the source-to-source approach performs well on programs requiring
limited global memory interaction. When this interaction increases, PS2SC times out, owing to the
huge non-determinism of PS 2.0. However, the modular approach of partial-promises enables us to
recover effectiveness.
6.2 Comparing Performance with Other Tools
In this section we compare the performance of PS2SC in promise-free mode with CDSChecker ([Norris
and Demsky 2013]), GenMC ([Kokologiannakis et al. 2019]) and Rcmc ([Kokologiannakis et al. 2017])
on safe and unsafe benchmarks. We provide a subset of the experimental results; the rest can
be found in the full version. The results of this section indicate that the source-to-source translation
with essential event bounding is effective at uncovering hard-to-find bugs in non-trivial programs.
We will observe that in most examples discussed below, we had K ≤ 10. Additionally, the bound K
allows incremental verification of safe programs in cases where the other tools time out.
Parameterized Benchmarks. In Table 3 we compare the performance of these tools on two
parameterized benchmarks: ExponentialBug (from Fig. 2 of [Huang 2015]) and Fibonacci (from
SV-COMP 2019). In ExponentialBug(N), N represents the number of times a process writes to a
variable. We note that in ExponentialBug(N) the number of executions grows as N!, while the
processes have to follow a specific interleaving to uncover the hard-to-find bug. In Fibonacci(N),
two processes compute the value of the N-th Fibonacci number in a distributed fashion. Our tool
performs better than the other tools on the ExponentialBug and competes well on Fibonacci for
larger values of the parameter. These results show the ability of our tool to uncover bugs with a
small value of K .
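To give a feel for the N! growth mentioned for ExponentialBug(N), the following sketch prints the number of decimal digits of N! for the three parameter values that appear in Table 3. It is plain arithmetic and does not model the benchmark itself; the helper name `factorial_digits` is ours.

```python
import math


def factorial_digits(n: int) -> int:
    """Number of decimal digits of n!."""
    return len(str(math.factorial(n)))


for n in (10, 25, 50):
    print(f"N = {n}: N! has {factorial_digits(n)} decimal digits")
```

Already for N = 10 the value 10! = 3628800 has seven digits, which illustrates why exhaustive enumeration of executions becomes expensive as N grows.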
benchmark L K PS2SC CDSChecker GenMC RCMC
exponential_10_unsafe 10 10 1.854s 1.921s 0.367s 3m41s
exponential_25_unsafe 25 10 3.532s 7.239s 3.736s TO
exponential_50_unsafe 50 10 6.128s 36.361s 39.920s TO
fibonacci_2_unsafe 2 20 2.746s 2.332s 0.084s 0.086s
fibonacci_3_unsafe 3 20 9.392s 46m8s 0.462s 0.544s
fibonacci_4_unsafe 4 20 34.019s TO 12.437s 18.953s
Table 3. Comparison on a set of parameterized benchmarks
Concurrent data structures based benchmarks. We compare the tools in Table 4 on benchmarks
based on concurrent data structures. The first of these is a concurrent locking algorithm originating
from Hehner and Shyamasundar [1981]. The second, LinuxLocks(N) is adapted from evaluations
of CDSChecker [Norris and Demsky 2013]. We note that if not completely fenced, it is unsafe. We
fence all but one lock access. Queue is a safe benchmark adapted from SV-COMP 2018, parameterized
by the number of processes. We note the ability of the tool to uncover bugs with a small value of K .
benchmark L K PS2SC CDSChecker GenMC RCMC
hehner2_unsafe 4 5 7.207s 0.033s 0.094s 0.087s
hehner3_unsafe 4 5 28.345s 0.036s 2m53s 1m13s
linuxlocks2_unsafe 2 4 0.547s 0.032s 0.073s 0.078s
linuxlocks3_unsafe 2 4 1.031s 0.031s 0.083s 0.081s
queue_2_safe 4 4 0.180s 0.031s 0.082s 0.085s
queue_3_safe 4 4 0.347s 0.037s 0.090s 0.092s
Table 4. Comparison on concurrent data structures
Variations of mutual exclusion protocols. We now consider safe and unsafe variants of mutual
exclusion protocols from SV-COMP 2018. The fully fenced versions of the protocols are safe. We
modify these protocols by introducing bugs and comparing the performance of PS2SC for bug
detection with the other tools. These benchmarks are parameterized by the number of processes.
benchmark L K PS2SC CDSChecker GenMC RCMC
peterson1U(4) 1 6 1.408s 0.039s TO 9.129s
peterson1U(8) 1 6 47.786s TO TO TO
szymanski1U(4) 1 2 1.015s 0.043s MLE TO
szymanski1U(8) 1 2 6.176s TO TO TO
Table 5. Comparison of performance on mutual exclusion benchmarks with a single unfenced process
In Table 5, we unfence a single process of the Peterson and Szymanski protocols, making them
unsafe. For PS2SC, the value of K taken is 6 and 2 respectively, showing that bugs can be found
(even for non-trivial examples) with small K. We note that the other tools eventually time out for
larger numbers of processes.
In Table 6 we keep all processes fenced but introduce a bug into the critical section of a process
(write a value to a shared variable and read a different value from it). We note that the other
tools time out on the larger instances, while PS2SC is able to detect the bug within one minute,
showing that essential event-bounding is an effective technique for bug-finding. Additionally, in
Peterson2C, we vary the example by changing the process in which we add the bug. We note that
CDSChecker can uncover the bug in Peterson2C(5) in around two minutes, while for Peterson1C(5)
it timed out. Thus, the CDSChecker algorithm is sensitive to changes in the position of the bug due
to its exploration strategy.
benchmark L K PS2SC CDSChecker GenMC RCMC
peterson1C(3) 1 2 0.487s 0.053s 0.083s 0.087s
peterson1C(5) 1 2 2.713s TO TO TO
peterson1C(7) 1 2 11.008s TO TO TO
peterson2C(3) 1 2 0.481s 0.032s 0.099s 0.091s
peterson2C(5) 1 2 2.801s 1m47s TO TO
peterson2C(7) 1 2 11.030s TO TO TO
Table 6. Comparison of performance on completely fenced Peterson mutual exclusion benchmarks with a
bug introduced in the critical section of a single process
We consider in Table 7 completely fenced versions of the mutual exclusion protocols. In this
experiment, we increase the loop unwinding bound and the value of K. These examples exhibit the
practicality of iterative increments in K. The other tools eventually time out, while PS2SC is able to
provide at least partial guarantees.
benchmark L K PS2SC CDSChecker GenMC RCMC
peterson(3) 1 2 0.878s TO 9.665s 26.208s
peterson(2) 1 2 0.321s 0.325s 0.087s 0.068s
peterson(3) 2 4 1.695s TO MLE TO
peterson(2) 2 4 0.539s 15m22s 0.039s 0.428s
peterson(3) 4 4 15.900s TO MLE TO
peterson(2) 4 4 3.412s TO TO TO
Table 7. Evaluation using safe mutual exclusion protocols
7 CONCLUSION
In this paper, we investigate decidability of the Promising Semantics, PS 2.0 from Lee et al. [2020].
The release-acquire (ra) fragment of PS 2.0 with RMW operations is known to be undecidable
[Abdulla et al. 2019]. However, the decidability of the fragment of PS 2.0 with only relaxed (rlx)
accesses (denoted PS 2.0-rlx) was open. We started with this fragment, and obtained undecidability
of the reachability problem, when there is no bound on the number of promises. In the quest
for decidability, we considered an underapproximation of PS 2.0-rlx where we bound the number
of promises in any execution. The fragment of PS 2.0-rlx with bounded promises is denoted as
bdPS 2.0-rlx. We showed that reachability is decidable for bdPS 2.0-rlx. Our decidability proof
includes the introduction of a new memory model, LoHoW, and a proof of the equivalence of PS 2.0-rlx
and LoHoW. The decidability of bdPS 2.0-rlx is shown using the theory of well-structured transition
systems. This also gives non-primitive recursive complexity of bdPS 2.0-rlx, with a proof similar to
that for the RMW-free fragment of release-acquire [Abdulla et al. 2019].
Having explored the decidability landscape of PS 2.0 thoroughly, we moved towards practical
verification techniques for PS 2.0. Motivated by the success of context bounded reachability in
SC [Qadeer and Rehof 2005], and subsequent notions in weak memory models, we introduced a
notion of essential events bounded reachability for PS 2.0, which bounds the number of promises
and view altering messages in any execution. We provide a source-to-source translation from a
concurrent program under PS 2.0 with this bounded notion to a bounded context SC program,
and implemented this in a tool PS2SC. PS2SC is the first tool capable of handling the promising
framework, PS 2.0 from Lee et al. [2020] and the PS model from Kang et al. [2017]. PS2SC allows
modularity with respect to allowing/disallowing promises on a thread-by-thread basis. We exhibit
the efficacy of this modular technique in the face of non-determinism induced by PS 2.0. We also
compare the performance of PS2SC with existing tools which do not support promises by operating
it in the promise-free mode (in which no threads are allowed to promise). In this case, we exhibit
the effectiveness of the bounding technique in uncovering hard-to-find bugs.
REFERENCES
Parosh Aziz Abdulla, Jatin Arora, Mohamed Faouzi Atig, and Shankara Narayanan Krishna. 2019. Verification of programs
under the release-acquire semantics. In PLDI 2019. ACM, 1117–1132.
Parosh Aziz Abdulla, Mohamed Faouzi Atig, Ahmed Bouajjani, Egor Derevenetc, Carl Leonardsson, and Roland Meyer. 2020.
Safety Verification under Power. In NETYS 2020 (Lecture Notes in Computer Science). Springer. to appear.
Parosh Aziz Abdulla, Mohamed Faouzi Atig, Ahmed Bouajjani, and Tuan Phong Ngo. 2017. Context-Bounded Analysis for
POWER. In Tools and Algorithms for the Construction and Analysis of Systems - 23rd International Conference, TACAS 2017,
Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2017, Uppsala, Sweden, April
22-29, 2017, Proceedings, Part II (Lecture Notes in Computer Science), Axel Legay and Tiziana Margaria (Eds.), Vol. 10206.
Springer, 56–74.
Parosh Aziz Abdulla, Mohamed Faouzi Atig, Bengt Jonsson, and Tuan Phong Ngo. 2018. Optimal stateless model checking
under the release-acquire semantics. Proc. ACM Program. Lang. 2, OOPSLA (2018), 135:1–135:29.
Parosh Aziz Abdulla and Bengt Jonsson. 1996. Verifying Programs with Unreliable Channels. Inf. Comput. 127, 2 (1996),
91–101. https://doi.org/10.1006/inco.1996.0053
Mohamed Faouzi Atig, Ahmed Bouajjani, Sebastian Burckhardt, and Madanlal Musuvathi. 2010. On the verification problem
for weak memory models. In Proceedings of the 37th ACM SIGPLAN-SIGACT Symposium on Principles of Programming
Languages, POPL 2010, Madrid, Spain, January 17-23, 2010. ACM, 7–18.
Mohamed Faouzi Atig, Ahmed Bouajjani, and Gennaro Parlato. 2011. Getting Rid of Store-Buffers in TSO Analysis. In
Computer Aided Verification - 23rd International Conference, CAV 2011, Snowbird, UT, USA, July 14-20, 2011. Proceedings
(Lecture Notes in Computer Science), Ganesh Gopalakrishnan and Shaz Qadeer (Eds.), Vol. 6806. Springer, 99–115.
Mark Batty, Scott Owens, Susmit Sarkar, Peter Sewell, and Tjark Weber. 2011. Mathematizing C++ concurrency. In POPL
2011, Thomas Ball and Mooly Sagiv (Eds.). ACM, 55–66. https://doi.org/10.1145/1926385.1926394
Dirk Beyer. 2019. Automatic verification of C and Java programs: SV-COMP 2019. In International Conference on Tools and
Algorithms for the Construction and Analysis of Systems. Springer, 133–155.
Soham Sundar Chakraborty and Viktor Vafeiadis. 2019. Grounding thin-air reads with event structures. PACMPL 3 (2019),
70:1–70:28.
Karl Crary and Michael J. Sullivan. 2015. A Calculus for Relaxed Memory. In POPL 2015, Sriram K. Rajamani and David
Walker (Eds.). ACM, 623–636. https://doi.org/10.1145/2676726.2676984
Michael Emmi, Shaz Qadeer, and Zvonimir Rakamaric. 2011. Delay-bounded scheduling. In Proceedings of the 38th ACM
SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2011, Austin, TX, USA, January 26-28, 2011,
Thomas Ball and Mooly Sagiv (Eds.). ACM, 411–422.
Alain Finkel and Philippe Schnoebelen. 2001. Well-structured transition systems everywhere! Theor. Comput. Sci. 256, 1-2
(2001), 63–92. https://doi.org/10.1016/S0304-3975(00)00102-X
Eric C.R. Hehner and R.K. Shyamasundar. 1981. An implementation of P and V. Inform. Process. Lett. 12, 4 (1981), 196 – 198.
https://doi.org/10.1016/0020-0190(81)90100-9
Graham Higman. 1952. Ordering by Divisibility in Abstract Algebras. Proceedings of the
London Mathematical Society s3-2, 1 (1952), 326–336. https://doi.org/10.1112/plms/s3-2.1.326
Jeff Huang. 2015. Stateless model checking concurrent programs with maximal causality reduction. In Proceedings of the
36th ACM SIGPLAN Conference on Programming Language Design and Implementation, Portland, OR, USA, June 15-17,
2015, David Grove and Steve Blackburn (Eds.). ACM, 165–174.
Alan Jeffrey and James Riely. 2019. On Thin Air Reads: Towards an Event Structures Model of Relaxed Memory. Logical
Methods in Computer Science 15, 1 (2019). https://doi.org/10.23638/LMCS-15(1:33)2019
Jeehoon Kang, Chung-Kil Hur, Ori Lahav, Viktor Vafeiadis, and Derek Dreyer. 2017. A promising semantics for relaxed-
memory concurrency. In POPL 2017, Giuseppe Castagna and Andrew D. Gordon (Eds.). ACM, 175–189.
Michalis Kokologiannakis, Ori Lahav, Konstantinos Sagonas, and Viktor Vafeiadis. 2017. Effective Stateless Model Checking
for C/C++ Concurrency. Proc. ACM Program. Lang. 2, POPL, Article 17 (Dec. 2017), 32 pages.
https://doi.org/10.1145/3158105
Michalis Kokologiannakis, Azalea Raad, and Viktor Vafeiadis. 2019. Model checking for weakly consistent libraries. In PLDI.
https://doi.org/10.1145/3314221.3314649
Salvatore La Torre, P. Madhusudan, and Gennaro Parlato. 2008. Context-Bounded Analysis of Concurrent Queue Systems.
In Tools and Algorithms for the Construction and Analysis of Systems, 14th International Conference, TACAS 2008, Held as
Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2008, Budapest, Hungary, March 29-April
6, 2008. Proceedings (Lecture Notes in Computer Science), C. R. Ramakrishnan and Jakob Rehof (Eds.), Vol. 4963. Springer,
299–314.
Salvatore La Torre, P. Madhusudan, and Gennaro Parlato. 2009. Reducing Context-Bounded Concurrent Reachability to
Sequential Reachability. In Computer Aided Verification, 21st International Conference, CAV 2009, Grenoble, France, June
26 - July 2, 2009. Proceedings (Lecture Notes in Computer Science), Ahmed Bouajjani and Oded Maler (Eds.), Vol. 5643.
Springer, 477–492.
Salvatore La Torre, P. Madhusudan, and Gennaro Parlato. 2010. Model-Checking Parameterized Concurrent Programs Using
Linear Interfaces. In Computer Aided Verification, 22nd International Conference, CAV 2010, Edinburgh, UK, July 15-19,
2010. Proceedings (Lecture Notes in Computer Science), Tayssir Touili, Byron Cook, and Paul B. Jackson (Eds.), Vol. 6174.
Springer, 629–644.
Ori Lahav and Udi Boker. 2020. Decidable verification under a causally consistent shared memory. In Proceedings of the 41st
ACM SIGPLAN International Conference on Programming Language Design and Implementation, PLDI 2020, London, UK,
June 15-20, 2020, Alastair F. Donaldson and Emina Torlak (Eds.). ACM, 211–226.
Ori Lahav, Viktor Vafeiadis, Jeehoon Kang, Chung-Kil Hur, and Derek Dreyer. 2017. Repairing sequential consistency in
C/C++11. In PLDI 2017, Albert Cohen and Martin T. Vechev (Eds.). ACM, 618–632. https://doi.org/10.1145/3062341.3062352
Akash Lal and Thomas W. Reps. 2009. Reducing concurrent analysis under a context bound to sequential analysis. Formal
Methods in System Design 35, 1 (2009), 73–97.
Sung-Hwan Lee, Minki Cho, Anton Podkopaev, Soham Chakraborty, Chung-Kil Hur, Ori Lahav, and Viktor Vafeiadis. 2020.
Promising 2.0: global optimizations in relaxed memory concurrency. In Proceedings of the 41st ACM SIGPLAN International
Conference on Programming Language Design and Implementation, PLDI 2020, London, UK, June 15-20, 2020, Alastair F.
Donaldson and Emina Torlak (Eds.). ACM, 362–376.
Jeremy Manson, William Pugh, and Sarita V. Adve. 2005. The Java memory model. In POPL 2005, Jens Palsberg and Martín
Abadi (Eds.). ACM, 378–391. https://doi.org/10.1145/1040305.1040336
Madanlal Musuvathi and Shaz Qadeer. 2007. Iterative context bounding for systematic testing of multithreaded programs.
In Proceedings of the ACM SIGPLAN 2007 Conference on Programming Language Design and Implementation, San Diego,
California, USA, June 10-13, 2007, Jeanne Ferrante and Kathryn S. McKinley (Eds.). ACM, 446–455.
Brian Norris and Brian Demsky. 2013. CDSchecker: Checking Concurrent Data Structures Written with C/C++ Atomics. In
OOPSLA 2013. ACM, New York, NY, USA, 131–150. https://doi.org/10.1145/2509136.2509514
Brian Norris and Brian Demsky. 2016. A Practical Approach for Model Checking C/C++11 Code. ACM Trans. Program.
Lang. Syst. 38, 3, Article 10 (May 2016), 51 pages. https://doi.org/10.1145/2806886
Jean Pichon-Pharabod and Peter Sewell. 2016. A concurrency semantics for relaxed atomics that permits optimisation
and avoids thin-air executions. In POPL 2016, Rastislav Bodík and Rupak Majumdar (Eds.). ACM, 622–633.
https://doi.org/10.1145/2837614.2837616
Emil L. Post. 1946. A variant of a recursively unsolvable problem. Bull. Amer. Math. Soc. 52 (1946), 264–268.
Shaz Qadeer and Jakob Rehof. 2005. Context-Bounded Model Checking of Concurrent Software. In TACAS 2005 (LNCS),
Vol. 3440. Springer, 93–107.
Kasper Svendsen, Jean Pichon-Pharabod, Marko Doko, Ori Lahav, and Viktor Vafeiadis. 2018. A Separation Logic for a
Promising Semantics. In 27th European Symposium on Programming, ESOP 2018 (LNCS), Amal Ahmed (Ed.), Vol. 10801.
Springer, 357–384. https://doi.org/10.1007/978-3-319-89884-1_13
Ermenegildo Tomasco, Truc Lam Nguyen, Bernd Fischer, Salvatore La Torre, and Gennaro Parlato. 2017. Using Shared
Memory Abstractions to Design Eager Sequentializations for Weak Memory Models. In Software Engineering and Formal
Methods - 15th International Conference, SEFM 2017, Trento, Italy, September 4-8, 2017, Proceedings (Lecture Notes in
Computer Science), Alessandro Cimatti and Marjan Sirjani (Eds.), Vol. 10469. Springer, 185–202.
Yang Zhang and Xinyu Feng. 2013. An Operational Approach to Happens-Before Memory Model. In Seventh International
Symposium on Theoretical Aspects of Software Engineering, TASE 2013, 1-3 July 2013, Birmingham, UK. IEEE Computer
Society, 121–128. https://doi.org/10.1109/TASE.2013.24
A DETAILS FOR SECTION 4
In this section, we give details of lemmas from Section 4.
A.1 Equivalence of PS 2.0-rlx and LoHoW
To prove Theorem 4.3, we show the following: Given a program Prog, starting from the initial
machine state MSinit = ((Jinit, Rinit), Vinit, PSinit, Minit, Ginit) in PS 2.0-rlx, we can reach in PS 2.0-rlx
the machine state MSn = ((Jn, Rn), Vn, PSn, Mn, Gn) with PSn(p) = ∅ for all p ∈ P iff, starting
from an initial LoHoW two phases state Sinit = (std, p, stinit, stinit), we reach the state
(std, −, ((Jn, Rn), HWn), −), such that HWn(x) does not contain any memory type of the form
(prm, −, p, −) or (prm, −, p, −, −) for all x ∈ Loc. The equivalence of the runs follows from the fact
that the sequences of instructions followed in each phase std and cert are the same in both PS 2.0-rlx
and LoHoW; LoHoW allows lossy transitions, which do not affect reachability. Moreover, the
LoHoW run satisfies the following invariants.
Invariants for HW. The following invariants hold good for HW(x) for all x ∈ Loc. We then say
that HW(x) is faithful to the sub-memory M(x) and the view mapping.
(Inv1) For all x ∈ Loc, HW(x) is well-formed: for each process p ∈ P, there is a unique position i in
HW(x) having p in its pointer set;
(Inv2) For all i > ptr(p, HW(x)), we have HW(x)[i] ∉ {(msg, −, p, −), (msg, −, p, −, −)}. This says
that memory types at positions greater than the pointer of p cannot correspond to messages
added by p to M(x).
Lemma A.1. The higher order words HW(x) for all x ∈ Loc appearing in the states of a LoHoW
run satisfy invariants Inv1 and Inv2.
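As an illustration of Inv1 and Inv2, the following sketch checks both invariants on a toy encoding of a higher order word. The encoding is an assumption made for this example only: a word is a Python list whose entries are either the separator string "#" or a `MemType` tuple (kind, value, process, pointer set); the reservation component is omitted, and the names `MemType`, `inv1` and `inv2` are ours.

```python
from typing import FrozenSet, List, NamedTuple, Sequence, Union


class MemType(NamedTuple):
    kind: str              # "msg" or "prm"
    value: int
    proc: str              # process that added the message or promise
    ptrs: FrozenSet[str]   # pointer set of this memory type


Symbol = Union[str, MemType]   # the string "#" is the separator symbol


def positions_of(p: str, hw: Sequence[Symbol]) -> List[int]:
    """All positions whose pointer set contains process p."""
    return [i for i, s in enumerate(hw)
            if isinstance(s, MemType) and p in s.ptrs]


def inv1(hw: Sequence[Symbol], procs: Sequence[str]) -> bool:
    """Inv1: every process occurs in exactly one pointer set."""
    return all(len(positions_of(p, hw)) == 1 for p in procs)


def inv2(hw: Sequence[Symbol], procs: Sequence[str]) -> bool:
    """Inv2: no msg memory type of p lies beyond ptr(p, hw).
    Assumes Inv1 holds, so positions_of(p, hw) is a singleton."""
    for p in procs:
        i = positions_of(p, hw)[0]
        for s in hw[i + 1:]:
            if isinstance(s, MemType) and s.kind == "msg" and s.proc == p:
                return False
    return True


good = [MemType("msg", 0, "init", frozenset({"q"})), "#",
        MemType("msg", 1, "p", frozenset({"p"})),
        MemType("prm", 2, "p", frozenset())]
bad = [MemType("msg", 0, "init", frozenset({"p", "q"})),
       MemType("msg", 1, "p", frozenset())]

print(inv1(good, ["p", "q"]), inv2(good, ["p", "q"]))
print(inv1(bad, ["p", "q"]), inv2(bad, ["p", "q"]))
```

In `good`, the promise of p ahead of its pointer is allowed because Inv2 only restricts msg memory types; in `bad`, a message of p sits beyond the pointer of p, so Inv2 fails.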
Lemma A.1 can be proved by inducting on the length of a LoHoW run, starting from the initial
states, using the following.
• For each memory type (msg, v, p, S, −) or (msg, v, p, S) in HW(x), there is a message in M(x)
which was added by process p, having value v. Similarly, for each memory type (prm, v, p, S, −)
or (prm, v, p, S) in HW(x), there is a promise in M(x) which was added by process p, having
value v.
• The order between memory types in HW(x) and the corresponding messages in M(x) is the
same. That is, for i < j, the messages or promises m, m′ ∈ M(x) corresponding to HW(x)[i]
and HW(x)[j] are such that m.to < m′.to.
• The elements in the pointer set of a memory type m in HW(x) are exactly the set of processes
whose local view is the to stamp of the element of M(x) corresponding to m.
The base case is easy : the initial two-phases LoHoW state has the same local process states as
the initial PS 2.0 machine state; moreover, the invariants trivially hold, since all process pointers
are at the same position.
For the inductive hypothesis, assume that both invariants hold in a LoHoW run after i steps. To
show that they continue to hold good after i+1 steps, we have to show that for all LoHoW transitions
that can be taken after i steps, they are preserved. Assume that the two phases LoHoW state at the
end of i steps is (std,p, st, st′). The proof for the case when we have a state (cert,p, st, st′) after i
steps of the LoHoW run is similar.
• Assume that we have the transition $\xrightarrow{rd(x,v)}_{p}$. Then ptr(p, HW(x)) is updated in the resultant
state, and so are (J, R). Clearly, the higher order word in the resultant state satisfies both
invariants since the starting state does.
• Assume that we have the transition $\xrightarrow{wt(x,v)}_{p}$. Then we remove p from the pointer set at
position i = ptr(p, HW(x)). A new simple word is added at a position > i, or a memory type
(msg, v, p, {p}) is added at a position j > i, right next to a #, by moving the memory type at
j to position j − 2. In either case, the resultant higher order word satisfies both invariants,
since the starting state does.
• The update rule $\xrightarrow{U(x,v_r,v_w)}_{p}$ combines the above two cases, by first performing a read and
then atomically the write. From the above two cases, the invariants can be seen to hold good
in the higher order words in the state obtained after the transition.
• Consider the Promise rule. In this case, we do not remove p from its pointer set, and only add
the memory type (prm, v, p, {}) ahead of ptr(p, HW(x)). Note that Inv2 only requires that
there are no memory types of the form (msg, v, p, S) or (msg, v, p, S, −) ahead of ptr(p, HW(x)).
Clearly, both invariants continue to hold.
• Consider a fulfil rule obtained as a write. In this case, p is deleted from the position
ptr(p, HW(x)), and the memory type (prm, v, p, S) (or (prm, v, p, S, −)) is replaced with
(msg, v, p, S ∪ {p}) (or (msg, v, p, S ∪ {p}, −)). It is easy to see both invariants holding good.
• Consider the reservation rule. This does not affect the invariants since we only tag the last
component of a memory type with the process making the reservation.
• Consider the SC fence rule. If ptr(p, HW(x)) > ptr(g, HW(x)), then, in the resultant word,
p is moved to ptr(g, HW(x)). The case when ptr(p, HW(x)) < ptr(g, HW(x)) is handled by
moving g to ptr(p, HW(x)). Since this is the only change in the resultant higher order words,
clearly, both invariants hold good.
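The Promise and fulfil cases above can be replayed on a toy model. The sketch below is a simplification under stated assumptions: a higher order word is a flat Python list of `MemType` tuples (the separator # and the reservation component are omitted), and the function names `promise` and `fulfil` are ours rather than the formal LoHoW rules.

```python
from typing import FrozenSet, List, NamedTuple


class MemType(NamedTuple):
    kind: str              # "msg" or "prm"
    value: int
    proc: str              # process that added the message or promise
    ptrs: FrozenSet[str]   # pointer set


def ptr(p: str, hw: List[MemType]) -> int:
    """Position of the memory type whose pointer set contains p."""
    return next(i for i, m in enumerate(hw) if p in m.ptrs)


def promise(hw: List[MemType], p: str, v: int, pos: int) -> List[MemType]:
    """Insert (prm, v, p, {}) at index pos, ahead of ptr(p, hw).
    The pointer of p itself is left untouched."""
    if pos <= ptr(p, hw):
        raise ValueError("a promise must be placed ahead of the pointer of p")
    return hw[:pos] + [MemType("prm", v, p, frozenset())] + hw[pos:]


def fulfil(hw: List[MemType], p: str, pos: int) -> List[MemType]:
    """Turn the promise of p at index pos into a message and move the
    pointer of p onto it."""
    target = hw[pos]
    old = ptr(p, hw)
    if target.kind != "prm" or target.proc != p or pos <= old:
        raise ValueError("no promise of p ahead of its pointer at this index")
    out = list(hw)
    out[old] = out[old]._replace(ptrs=out[old].ptrs - {p})
    out[pos] = target._replace(kind="msg", ptrs=target.ptrs | {p})
    return out


hw0 = [MemType("msg", 0, "init", frozenset({"p", "q"}))]
hw1 = promise(hw0, "p", 42, pos=1)
hw2 = fulfil(hw1, "p", pos=1)
print(hw1)
print(hw2)
```

After `promise`, the pointer of p still sits on the initial message and only a prm memory type of p lies ahead of it, so Inv2 is unaffected; after `fulfil`, the new msg memory type is exactly at the pointer of p, so again no message of p lies beyond it.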
Proof of Theorem 4.3
First, from the definition of HW ⊑ HW′, lossy transitions in LoHoW only lose redundant simple
words and empty memory types from HW. The phase of PS 2.0-rlx (standard, certification) is
handled in LoHoW by an appropriate state (std,−,−,−) or (cert,−,−,−).
To see the proof, we consider the four kinds of transitions between phases.
• Switching from the certification phase to the standard phase is possible only when the promise
set of the process in the certification phase becomes empty. Any process can non-deterministically
begin the standard phase when the certification of one process ends successfully.
These conditions are the same in LoHoW, which allows a transition from a two phases
state (cert, p, ((J, R), HW), ((J′, R′), HW′)) to a state (std, q, ((J, R), HW), ((J′, R′), HW′))
only when there are no memory types (prm, −, p, −) in HW′.
• The switch into the certification phase happens in PS 2.0-rlx from a capped memory. This is
simulated in LoHoW as follows. When entering the certification phase, LoHoW duplicates
the higher order words. When the last memory type in HW(x) is not tagged by the reservation
of a process q ≠ p, the duplicated higher order word accounts for the capped memory, since
we do not allow insertions in between during certification. When the last memory type
in HW(x) is tagged by a reservation of a process q ≠ p, we add a new simple word
#(msg, −, q, {}) at the end of the duplicated higher order word. This respects the semantics
of reservation by a process q ≠ p.
• Once we are in a phase and continue in that phase, the proof in both directions is done by
showing that each instruction simulated in PS 2.0-rlx can be simulated by the corresponding
rule in LoHoW preserving the invariants, and conversely.
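The duplication step performed when entering the certification phase can be sketched as follows. This is a toy model and not the formal rule: each HW(x) is a Python list ending in a `MemType` tuple, the field `reserved_by` is our hypothetical encoding of the reservation tag, and `None` stands for the unspecified value − in the added simple word.

```python
from typing import Dict, FrozenSet, List, NamedTuple, Optional, Union


class MemType(NamedTuple):
    kind: str                         # "msg" or "prm"
    value: Optional[int]              # None plays the role of "−"
    proc: str
    ptrs: FrozenSet[str]
    reserved_by: Optional[str] = None  # hypothetical reservation tag


Word = List[Union[str, MemType]]


def enter_certification(hw: Dict[str, Word], p: str) -> Dict[str, Word]:
    """Duplicate every HW(x) for the certification of process p.
    If the last memory type is reserved by some q != p, append the
    simple word # (msg, −, q, {}) to the duplicate."""
    dup = {x: list(word) for x, word in hw.items()}
    for word in dup.values():
        last = word[-1]
        if (isinstance(last, MemType)
                and last.reserved_by is not None
                and last.reserved_by != p):
            word.extend(["#", MemType("msg", None, last.reserved_by,
                                      frozenset())])
    return dup


hw = {
    "x": [MemType("msg", 0, "init", frozenset({"p", "q"}), reserved_by="q")],
    "y": [MemType("msg", 0, "init", frozenset({"p", "q"}))],
}
cert = enter_certification(hw, "p")
print(cert["x"])
print(cert["y"])
```

Only the duplicate is extended; the original words are left unchanged, mirroring the two components st and st′ of a two phases state.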
The first direction from PS 2.0-rlx to LoHoW is done as follows. For each transition by a process
p on an instruction in PS 2.0-rlx, we show that we can simulate the same instruction in LoHoW.
(1) Consider the read rd(x, v) rule. The read rule updates ptr(p, HW(x)) in such a way that
HW(x) is faithful to M(x) and the view V. In case the read operation in PS 2.0-rlx uses a
message whose to time stamp is not the local view of any process, the corresponding memory
type may or may not be present in HW(x) due to lossiness. Considering the case when this
memory type is not lost, it is used exactly in the same manner as the respective message in
PS 2.0-rlx.
(2) Consider the wt(x, v) rule. The write rule either appends memory types or adds simple
words to HW(x). Once again, it is easy to see that HW(x) is faithful to M(x) and V. Mapping
memory types in HW(x) to M(x), the relative ordering of the new memory type which gets
added with respect to existing memory types in HW(x) is exactly the same as the order the
newly added message has with respect to others in M(x).
(3) Consider the U(x, vr, vw) rule. The RMW rule appends memory types to a simple word. The
memory type corresponding to the message m in M(x) on which RMW is done, if available
in HW(x), will be the rightmost in a simple word (to the right of a #). The memory type which is
appended to # after moving m to the left of # corresponds to the new addition, right-adjacent
to m in M(x). The append operation captures the adjacency of the new message added to
M(x) with respect to the one on which RMW is performed. Note that this also results in
HW(x) being faithful to M(x) and view V.
(4) The promise rule by a process p is similar to the write rule. A new memory type (prm, v, p, {})
is added to HW(x) at a position > ptr(p, HW(x)) with an empty pointer set. This corresponds
to the fact that the process p which makes the promise has its local view smaller than the
to time stamp of the promise. Promise memory types are not lost from HW(x). When the
promise is fulfilled, p is added to the pointer set of (prm, v, p, S) and the prm memory type
is replaced with the msg memory type. This corresponds to removing a promise from the
promise set of p. Thus, HW(x) is faithful also to the promise set. If there is a promise which
cannot be fulfilled in PS 2.0-rlx, the corresponding promise memory type will stay in HW(x),
making it impossible to reach a state (std, −, −, −) in LoHoW.
(5) The reserve rule done by a process p reserves a timestamp interval adjacent to an existing
message m in M(x). If the memory type corresponding to m is available in HW(x), then it
will be the rightmost in a simple word of HW(x). The reservation is done by tagging this
memory type as a reservation by p, thereby blocking this memory type from participating in
any RMW. Likewise, a cancellation frees up the reserved timestamp interval in M(x); if the
corresponding tagged memory type is available in HW(x), then it is unblocked from doing
RMW by removing the reserve tag of p from it.
(6) The SC fence rule done by a process p is handled by updating the pointer sets of p (or g)
depending on which one is ahead.
Thus, for every run that reaches a consistent state in PS 2.0-rlx with local process states (J ,R),
there is a run in LoHoW that reaches a two phases state (std,−, ((J ,R), st),−) following the same
sequence of instructions.
The converse follows using the faithfulness of HW(x) again. The crucial argument is that the memory
types in each HW(x) form a subset of M(x), which has all the “necessary” messages (promises, non-empty
memory types in non-redundant simple words). Lossiness of empty memory types/redundant
simple words in HW(x) can be interpreted as messages which are skipped over, or which have
already been used in M(x). It is easy to see that any sequence of transitions of instructions in
LoHoW can be simulated by exactly the same instruction sequence in PS 2.0-rlx.
A.2 Proof of Lemma 4.8
Recall that minpre(c) is defined as min(Pre({c}↑) ∪ {c}↑). In the following, we show the set
minpre(c) is effectively computable for any two-phases K-LoHoW state c . To do that, we will use a
transducer based approach. Lemma 4.8 is an immediate consequence of Lemma A.2, Lemma A.3,
Lemma A.4, and Lemma A.5.
Lemma A.3 shows the regularity of {c}↑, Lemma A.5 and A.4 show the regularity of Pre({c}↑),
while Lemma A.2 shows the effective computability of min(Pre({c}↑) ∪ {c}↑).
Finite-state automata. A finite state automaton A is a tuple A = (Σ1, P , I ,E, F ), where Σ1 is the
finite input alphabet, P is a finite set of states, I , F ⊆ P are subsets of initial and final states, and
E ⊆ P × Σ1 × P is a finite set of transition rules. A word u = a1 . . . an is accepted by A if there is
a run p0 −a1→ p1 −a2→ . . .pn−1 −an−→ pn such that p0 ∈ I , pn ∈ F and (pi−1,ai ,pi ) ∈ E. We use L(A) to
denote the set of words accepted by A.
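The acceptance condition above can be illustrated with a minimal Python sketch. It tracks the set of states reachable after each input symbol. The function name `accepts` and the two-state example automaton are illustrative assumptions and do not come from the paper.

```python
from typing import Iterable, Set, Tuple

State = str
Symbol = str
Transition = Tuple[State, Symbol, State]


def accepts(initial: Set[State], final: Set[State],
            transitions: Set[Transition], word: Iterable[Symbol]) -> bool:
    """Return True iff some run on `word` starts in I and ends in F."""
    current = set(initial)
    for a in word:
        # Follow every transition (p, a, p') whose source is currently reachable.
        current = {q2 for (q1, b, q2) in transitions if q1 in current and b == a}
    return bool(current & final)


# Hypothetical automaton over {a, b} that accepts the words ending in 'b'.
I = {"p0"}
F = {"p1"}
E = {("p0", "a", "p0"), ("p0", "b", "p0"), ("p0", "b", "p1")}
```

For this example automaton, `accepts(I, F, E, "ab")` holds while `accepts(I, F, E, "ba")` does not.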
Regular set of two-phases K-LoHoW states. We use an encoding of two-phases K-LoHoW states
as words over a finite alphabet, and use this encoding to define a regular set of two-phases K-LoHoW
states. Let st denote ((J ,R),HW). Consider a two-phases K-LoHoW state c = (std,p, st, st′) or
(cert,p, st, st′). Recall that (J ,R) gives the local instruction labels of all processes and the local
register values. Assuming we have locations x1, . . . ,xm , HW = (HWxi )1≤i≤m . The state c is
encoded by the word w = std$p$J$R$0HWx1$1 . . .HWxm$m ‡ J ′$′0HW′x1$′HW′x2 . . .HW′xm$′m
or cert$p$J$R$0HWx1$1 . . .HWxm$m ‡ J ′$′0HW′x1$′HW′x2 . . .HW′xm$′m where J defines the local
state of each process, and the ‡, $i , $′i ’s act as delimiters between the contents of the higher order
words. w is denoted Enc(c). w is a correct encoding, if, on “decoding” w , we obtain a unique
decode(w) = (std,p, st, st′) or (cert,p, st, st′) where, each HWx ∈ (Σ∗#(Σ ∪ Γ))+ appearing in st
satisfies the invariants (Inv1) and (Inv2). Given a set R of two-phases K-LoHoW states, let Enc(R)
represent the set of its word encodings. We say that a set R of two-phases K-LoHoW states is regular
if and only if there is a finite state automaton that accepts Enc(R).
Lemma A.2. Given a regular set R of two-phases K-LoHoW states, we can effectively compute
min (R).
Proof. Let A = (Σ1, P , I ,E, F ) be the finite state automaton that accepts Enc(R). The main idea
to effectively compute min (R) is to bound the size of the words accepted by A that encode minimal
two-phases K-LoHoW states. Observe that the cycles in A can only be labeled by the empty memory
type. Otherwise there will be a violation of invariant (Inv1). Now consider a word w accepted by A.
We will first construct another word w ′ from w such that decode(w ′) ⊑ decode(w) and the number of
#e where e is an empty memory type from the subset (msg,−,P, {}) of Σ or (msg,−,P, {},−) of Γ
occurring in w ′ is polynomially bounded by the size of A. In the following, for convenience, we use
macro transitions on #a rather than two separate transitions on # followed by a transition for a.
Let us assume that w is accepted by A using the following run p0 −#a1→ p1 −#a2→ . . . pk−1 −#ak→ pk .
Let i_1 < i_2 < · · · < i_b be the maximal sequence of indices such that a_{i_j} is an empty memory type
∈ (msg,−,P, {}) or (msg,−,P, {},−). Now if b > |P | · |Σ1 |, then there are two indices i_j and i_ℓ
such that i_j < i_ℓ , a_{i_j} = a_{i_ℓ} and p_{i_j −1} = p_{i_ℓ −1}. Furthermore, all the symbols occurring between i_j
and i_ℓ are empty memory types (from (Inv1)). This means that
p0 −#a1→ p1 −#a2→ . . . p_{i_j −1} −#a_{i_j}→ p_{i_ℓ} · · · pk−1 −#ak→ pk
is an accepting run of A (accepting the word w1). Furthermore, decode(w1) ⊑
decode(w). We can now proceed iteratively on w1 in order to obtain the word w ′ that is accepted by
A, with decode(w ′) ⊑ decode(w), s.t. the number of #e , with e an empty memory type from Σ ∪ Γ occurring
in w ′ is bounded by |P | · |Σ1 |. Observe that the number of #b where b is a non-empty memory
type from Σ ∪ Γ occurring in w ′ is also bounded by |P | + K + 1: these are either K promise memory
types (prm,−,−,−) or (prm,−,−,−,−) or those of the form (msg,−,−, S) or (msg,−,−, S,−) where
S ≠ ∅. For the latter, we have a bound of |P | + 1. This comes from Inv1 since each process in
P ∪ {д} appears in a unique pointer set. Thus, the number of #e where e ∈ Σ ∪ Γ occurring in w ′ is
polynomially bounded by the size of A.
Now from w ′ we will construct another word w ′′ accepted by A and such that decode(w ′′) ⊑
decode(w ′) and |w ′′ | is polynomially bounded by the size of A. Let ρ := д0 −#b1→ д1 −#b2→ . . . дt−1 −#bt→
дt be the run of A accepting w ′. Let i_1 < i_2 < · · · < i_r be the maximal sequence of indices such that
b_{i_j} ∈ Σ ∪ Γ. Observe that r is polynomially bounded by the size of A as we have shown previously.
Assume i_0 = 1 and i_{r+1} = t . Now we can iteratively remove any cycle between two indices i_f and
i_{f+1} in ρ that is only labeled by empty memory types from Σ to obtain w ′′ satisfying the previous
conditions. □
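The cycle-cutting step used in this proof can be sketched in Python as follows. This is a simplified illustration and not the construction of the paper: a run is given as a list of states and a list of labels, and any segment that returns to the same state while reading only empty memory types is removed. The names `cut_empty_cycles` and `is_empty` are assumptions made for the example.

```python
def cut_empty_cycles(states, labels, is_empty):
    """Shorten a run by deleting cycles labeled only by 'empty' symbols.

    states has one more entry than labels; labels[t] is read on the
    transition from states[t] to states[t + 1].
    """
    states, labels = list(states), list(labels)
    changed = True
    while changed:
        changed = False
        for i in range(len(states)):
            for j in range(i + 1, len(states)):
                same_state = states[i] == states[j]
                if same_state and all(is_empty(a) for a in labels[i:j]):
                    # Drop the cycle states[i] -> ... -> states[j].
                    del states[i + 1:j + 1]
                    del labels[i:j]
                    changed = True
                    break
            if changed:
                break
    return states, labels
```

The result is still a run of the same automaton, because the transition that followed the cycle starts from the repeated state. Cycles that read a non-empty symbol are left untouched.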
Lemma A.3. Given a regular set R of K-LoHoW states, the set R ↑ is also regular.
Proof. Let A = (Σ1, P , I ,E, F ) be the finite state automaton that accepts Enc(R). To construct
a finite state automaton A′ that accepts Enc(R ↑), we proceed as follows: The automaton A′ is
constructed by replacing each macro transition (p,ba,p ′) ∈ E labeled by the letter a ∈ Σ, b ≠ #, by
the following macro-transition (p, e∗ · ba · e∗,p ′) in A′, where e is over the empty memory types of
Σ. Furthermore, any macro transition (p, #a,p ′) ∈ E labeled by the letter a ∈ Σ ∪ Γ is replaced in A′
by the macro-transition (p, #a · (w#b)∗,p ′), where w ∈ Σ∗ is over the empty memory types of Σ and
b ∈ Σ ∪ Γ is an empty memory type in Σ ∪ Γ. We can also have a loop on empty memory types of
Σ on the initial state. Observe that any macro-transition can be easily translated to a sequence of
simple transitions by using extra-intermediary states. □
Rational Transducers. A rational transducer T is a non-deterministic finite state automaton which
outputs words on each transition. Formally, a rational transducer is a tuple T = (Σ1, Σ2,Q, I ,E,η, F ),
where Σ1, Σ2 are finite input and output alphabets, Q is a finite set of states, I , F ⊆ Q are subsets of
initial and final states, E ⊆ Q × Σ1 × Q is a finite set of transition rules, and η : E → 2^(Σ2∗) is a function
specifying a regular language of partial outputs for each transition rule (i.e., η(e) is a regular language
for all e ∈ E). The relation defined by T contains pairs (u,v) of input and output words, where
u = a1 . . . an and v = v1 . . .vn , for which there is a run q0 −a1|v1→ q1 −a2|v2→ . . . qn−1 −an|vn→ qn
such that q0 ∈ I , qn ∈ F , (qi−1,ai ,qi ) ∈ E, vi ∈ η(qi−1,ai ,qi ). The set of pairs (u,v) defined by T is
denoted L(T ).
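A minimal Python sketch of the relation L(T) follows. As a simplifying assumption, η maps each transition to a finite set of output words rather than to an arbitrary regular language. The function name `relates` and the example transducer are hypothetical.

```python
def relates(initial, final, eta, u, v):
    """Return True iff (u, v) is in L(T).

    eta: dict mapping a transition (q, a, q') to a finite set of output
    words (a simplification of the regular languages used in the text).
    """
    # A configuration is (state, number of output symbols already matched).
    frontier = {(q, 0) for q in initial}
    for a in u:
        nxt = set()
        for (q, k) in frontier:
            for (q1, b, q2), outs in eta.items():
                if q1 == q and b == a:
                    for out in outs:
                        if v.startswith(out, k):
                            nxt.add((q2, k + len(out)))
        frontier = nxt
    return any(q in final and k == len(v) for (q, k) in frontier)


# Hypothetical one-state transducer: doubles every 'a' and deletes every 'b'.
ETA = {("q", "a", "q"): {"aa"}, ("q", "b", "q"): {""}}
```

With this example, the pair ("ab", "aa") belongs to the relation while ("ab", "a") does not.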
Lemma A.4. Given a rational transducer T and a regular language R (described by a finite-state
automaton), we can easily compute a finite state automaton A such that L(A) = {u | (u,v) ∈ L(T ) ∧ v ∈ R}.
Proof. Trivial. □
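One standard way to obtain such an automaton is a product construction. The Python sketch below illustrates it under the simplifying assumption that every transition of T has a finite set of output words; the names `preimage`, `run_word` and `accepts`, as well as the example transducer and language, are hypothetical.

```python
def run_word(delta, r, word):
    """States reachable from r by reading `word` in an NFA with transitions delta."""
    cur = {r}
    for c in word:
        cur = {r2 for (r1, b, r2) in delta if r1 in cur and b == c}
    return cur


def preimage(t_init, t_final, eta, r_states, r_init, r_final, r_delta):
    """Automaton for {u | (u, v) in L(T) and v in R}, over pairs (q, r)."""
    init = {(q, r) for q in t_init for r in r_init}
    final = {(q, r) for q in t_final for r in r_final}
    trans = set()
    for (q1, a, q2), outs in eta.items():
        for r in r_states:
            for out in outs:
                # Reading input a lets the R-automaton advance over the output.
                for r2 in run_word(r_delta, r, out):
                    trans.add(((q1, r), a, (q2, r2)))
    return init, final, trans


def accepts(init, final, trans, word):
    cur = set(init)
    for a in word:
        cur = {s2 for (s1, b, s2) in trans if s1 in cur and b == a}
    return bool(cur & final)


# Hypothetical example: T doubles 'a' and deletes 'b'; R = {"aa"}.
ETA = {("q", "a", "q"): {"aa"}, ("q", "b", "q"): {""}}
R_DELTA = {("r0", "a", "r1"), ("r1", "a", "r2")}
PRE = preimage({"q"}, {"q"}, ETA, {"r0", "r1", "r2"}, {"r0"}, {"r2"}, R_DELTA)
```

In this example the resulting automaton accepts exactly the words containing a single 'a', such as "bab".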
Lemma A.5. It is possible to construct a transducer T that accepts any pair (Enc(s),Enc(s ′)), where s
and s ′ are two-phases K-LoHoW states, such that s ′ is reachable from s in one step.
Proof. Observe that the class of rational transducers is closed under union and therefore it is
sufficient to construct the transducer T for each transition rule. Furthermore, we always assume
that the input and output tape of the transducer T satisfy the two invariants (Inv1) and (Inv2)
(these can be easily specified as a regular language). The proof is about simulating the rules in the
transition system in LoHoW as defined in Section 4.2. We reproduce the rules for easy reference.
The global transition rules in LoHoW
Given S = (π ,p, ststd, stcert) and S′ = (π ′,p ′, st′std, st′cert), we have S → S′ iff one of the
following cases holds:
(a) During the standard phase. π = π ′ = std, p = p ′, stcert = st′cert and ststd −std→_p st′std. This
corresponds to a simulation of a standard step of the process p.
(b) During the certification phase. π = π ′ = cert, p = p ′, ststd = st′std and stcert −cert→_p st′cert.
This corresponds to a simulation of a certification step of the process p.
(c) From the standard phase to the certification phase. π = std, π ′ = cert, p = p ′, ststd =
st′std = ((J,R),HW), and st′cert is of the form ((J,R),HW′) where for every x ∈ Loc, HW′(x) =
HW(x)#(msg,v,q, {}) if HW(x) is of the form w · #(−,v,−,−,q) with q ≠ p, and HW′(x) =
HW(x) otherwise. This corresponds to the copying of the standard LoHoW state to the
certification LoHoW state in order to check if the set of promises made by the process p can
be fulfilled. The higher order word HW′(x) (at the beginning of the certification phase) is
almost the same as HW(x) (at the end of the standard phase) except when the rightmost
memory type (−,v,−,−,q) of HW(x) is tagged by a reservation of a process q ≠ p. In that
case, we append the memory type (msg,v,q, {}) at the end of HW(x) to obtain HW′(x). Note
that this is in accordance with the definition of capping memory before going into certification:
to cite (item 2 in capped memory of [Lee et al. 2020]), a cap message is added for each location
unless it is a reservation made by the process going in for certification.
Copying HW to HW′ symbol by symbol
We can implement copying of HW to HW′ by copying symbol by symbol as follows. Consider
any HW(x). Let HW = (a_x W_x)_{x∈Loc} where HW(x) = a_x W_x ∈ (Σ∗#(Σ ∪ Γ))∗, |a_x| = 1. Define
the function copy on the two-phases LoHoW state (std,p, ((J ,R), (a_x W_x)_{x∈Loc}),−), and then
recursively on subsequent states until we end up in (cert,p, ((J ,R),HW), ((J ,R),HW)).
Below, \overline{·} marks the prefix of a higher order word that has already been copied.
The copy function is defined recursively as follows.
(Base) copy(std,p, ((J ,R), (a_x W_x)_{x∈Loc}),−) = (cc,p, ((J ,R), (\overline{a_x} W_x)_{x∈Loc}), ((J ,R), (a_x)_{x∈Loc})). This
is copying the first symbol of each HW(x). cc is an intermediate phase used only in copying.
Notice that the overlined symbol shows the progress of copying, one symbol each time.
(Inter) Next, we copy subsequent symbols. copy(cc,p, ((J ,R), (\overline{α} a_x U_x)_{x∈Loc}), ((J ,R), (W_x)_{x∈Loc})) is
defined as (cc,p, ((J ,R), (\overline{α a_x} U_x)_{x∈Loc}), ((J ,R), (W_x a_x)_{x∈Loc})).
(Last) Finally, when all higher order words have been copied, we move from cc to
cert. When a higher order word has been completely copied, it has the form \overline{α}, where
α ∈ (Σ∗#Γ)+. Then we define copy(cc,p, ((J ,R), (\overline{α_x})_{x∈Loc}), ((J ,R), (W_x)_{x∈Loc})) as
(cert,p, ((J ,R), (α_x)_{x∈Loc}), ((J ,R), (W_x)_{x∈Loc})), by removing the overline, and having the
phase cert.
If the last symbol a_x in HW(x) is of the form (−,v,−,−,q), for q ≠ p, then copy appends
a_x#(msg,v,q, {}) instead of just a_x in (Inter).
(d) From the certification phase to standard phase. π = cert, π ′ = std, ststd = st′std,
stcert = st′cert, and stcert is of the form ((J,R),HW) where HW(x) does not contain any
memory type of the form (prm,−,p,−)/(prm,−,p,−,−) for all x ∈ Loc (i.e., all promises made
by p are fulfilled).
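The symbol-by-symbol copy of rule (c), including the cap message appended after a reservation of another process, can be sketched in Python as follows. The encoding is an assumption made for illustration: a higher order word is a list whose entries are either the separator "#" or tuples, and a reservation-tagged memory type is a 5-tuple whose last component is the reserving process.

```python
def copy_word(hw, p):
    """Copy one higher order word HW(x) for the certification phase of p."""
    out = []
    for i, sym in enumerate(hw):
        out.append(sym)  # (Inter) step: copy one symbol
        is_last = i == len(hw) - 1
        reserved = isinstance(sym, tuple) and len(sym) == 5
        if is_last and reserved and sym[4] != p:
            # Cap the word: the reservation belongs to another process q.
            value, q = sym[1], sym[4]
            out.append("#")
            out.append(("msg", value, q, frozenset()))
    return out


# Hypothetical word whose last memory type is reserved by process "p2".
HW_X = ["#", ("msg", 1, "p1", frozenset({"p1"}), "p2")]
```

When `copy_word(HW_X, "p1")` is evaluated, the cap message (msg, 1, p2, {}) is appended; when the certifying process is "p2" itself, the word is copied unchanged.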
Description of the Transducer. We consider 4 cases based on the 4 cases we have in the transition
rules (a)-(d) as above.
(1) We first consider the case when s and s ′ have the same phase (std or cert). If the location
involved in the instruction is xi , then the transducer copies all HWx j , j ≠ i as is. For HWxi , if
the phase we have in Enc(s) is std, then the transducer copies ‡ as well as all symbols after
that in the output, while if the phase we have in Enc(s) is cert, then the transducer copies ‡
as well as all symbols before that in the output. This is common to all items below and we
will not mention it separately.
(a) Consider a Read instruction of the form λ : $r = xi of the process p. Then the transducer
will first guess the value v that will be read and update the local states of the processes
(as an output). The only change that the transducer will do concerns the i-th higher order
word HWxi . For each symbol that the transducer reads on the input tape of HWxi before
‡, it outputs the same symbol. Once the symbol pointed by the process p is read on the
input tape, the transducer will check the value of each symbol read on the input tape and
if it corresponds to v , the transducer will non-deterministically add p to its pointer set,
otherwise it will output the same read symbol (while removing p from its pointer set, which
has been read, if needed).
(b) Consider a Write instruction of the form λ : xi = $r of the process p. Then the transducer
will first update the local states of the processes (as an output). The only change that the
transducer will do concerns the i-th higher order word HWxi . For each symbol that the
transducer reads on the input tape of HWxi , it outputs the same symbol. Once the symbol
having p in its pointer set is read on the input tape, the transducer will output the same
read symbol (while removing p from the pointer set). When the transducer reads a symbol
after #, it can decide to output the new message corresponding to the write instruction and
after that, go on by outputting any read symbol.
(c) The case of RMW is very similar to the case of a Write instruction of the process p.
(d) The case of a promise rule is similar to the Write. The main difference is that when the
transducer reads the symbol pointed by the process p on the input tape, the transducer
will output the same read symbol (without removing p from the pointer set). When the
transducer reads a symbol right after #, it can decide to output the new promise message,
such that the pointer set is empty. After that, it goes on by outputting any read symbol.
(e) The case of a reservation rule is similar to RMW.
(f) The case of a cancel rule by a process p is as follows. The transducer reads symbols and
outputs them unchanged until it finds the symbol (−,−,−,−,p). On reading this, it outputs ϵ . After
that, it goes on by outputting any read symbol.
(g) The case of a fulfil rule is as follows. The transducer outputs what it reads till it finds a
symbol having p in its pointer set. It outputs the same symbol removing p from the pointer
set. Then it continues outputting the read symbol till it reads a symbol (prm,v,p, S). It
outputs (msg,v,p, S∪{p}) by addingp to the pointer set. After that, it goes on by outputting
any read symbol.
(h) Consider a SC-fence instruction. In this case the transducer will output any read symbol
except the ones that have д or p in its pointer set. If p and д are in the same pointer set,
then the transducer will continue outputting any read symbol. If the transducer reads the
first encountered symbol that contains only p or д in its pointer set, then the transducer
will output the same symbol without the pointer set containing either д or p. Once the
transducer reads the second encountered symbol whose pointer set contains only p or д
then the transducer will output the same symbol with the pointer set containing both д
and p. This is done for each HWxi .
(2) If the phase in Enc(s) is cert and that of Enc(s ′) is std, then the transducer simply replaces
cert by std, and the process p by any process q, and copies the rest as is in the output.
(3) If the phase in Enc(s) is std and that of Enc(s ′) is cert, then the transducer implements the
copy function described above. Each copy is implemented by a transducer, and the final result
is obtained by composing all these transducers. Note that rational transducers are closed
under composition, so it is possible to obtain one rational transducer that achieves the effect
of all the copy functions, starting with the std phase and ending in the cert phase. Note that
this is easily done, since in each step, the transducer progressively marks a symbol before ‡
with overline, and copies the same at the end.
□
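As an illustration of how one of the cases above acts on a single higher order word, the following Python sketch mimics the fulfil rule (case (g)). It is a deterministic simplification: memory types are assumed to be 4-tuples (kind, value, process, pointer set), and the first promise of p found after its pointer is the one that is fulfilled, whereas the transducer may choose non-deterministically.

```python
def fulfil(word, p):
    """Apply the fulfil rule of process p to a word; None if no promise is found."""
    out = []
    seen_pointer = False
    done = False
    for sym in word:
        if isinstance(sym, tuple) and not done:
            kind, value, proc, ptrs = sym
            if not seen_pointer and p in ptrs:
                # Remove p from the pointer set it currently belongs to.
                sym = (kind, value, proc, ptrs - {p})
                seen_pointer = True
            elif seen_pointer and kind == "prm" and proc == p:
                # Turn the promise into a message pointed to by p.
                sym = ("msg", value, p, ptrs | {p})
                done = True
        out.append(sym)
    return out if done else None


# Hypothetical word: p points to an old message and has one pending promise.
WORD = [("msg", 0, "q", frozenset({"p", "q"})), "#", ("prm", 5, "p", frozenset())]
```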
A.3 Proof of Lemma 4.7
Consider K-LoHoW states c1, c2 s.t. c1 → c2, and let c3 be a state s.t. c1 ⊑ c3. We make a case analysis
based on the transition chosen.
Let c1 = (std,p, ((J1,R1),HW1), ((J2,R2),HW2)), c2 = (π ,q, ((J3,R3),HW3), ((J4,R4),HW4)),
c3 = (std,p, ((J1,R1),HW5), ((J2,R2),HW6)), and c4 = (π ,q, ((J7,R7),HW7), ((J8,R8),HW8)). The
case when c1 = (cert,p,−,−) is similar to the case we discuss here.
(1) Consider the transition c1 −λ:$r=x→_p c2 by a read instruction $r = x in process p. Then ∃k ≤ j,
k = ptr(p,HW1(x)), and the memory type at HW1(x)[j] has the form (−,v,−, S), v = R($r ).
HW3(x) is obtained by updating ptr(p,HW1(x)) to j, so that p is in the pointer set S . Since
c1 ⊑ c3, there is an increasing function f from the positions of HW1(x) to those of HW5(x)
such that f (k) ≤ f (j), ptr(p,HW5(x)) = f (k) in HW5(x) and the memory type at f (j) has
the form (−,v,−, S ′′), v = R($r ). Indeed, one can update ptr(p,HW5(x)) to f (j), obtaining a
state c4 from c3. The local process states of c4 are the same as those of c2. All higher order words
HW5(y), y ≠ x of c3 remain unchanged in c4 (and all higher order words HW1(y), y ≠ x of c1
remain unchanged in c2), hence the ⊑ relation holds for these higher order words in c2, c4.
The same function f between positions of HW1(x) and HW5(x) can be used on positions of
HW3(x) of c2 and HW7(x) of c4 to see that c2 ⊑ c4 and c3 −λ:$r=x→_p c4.
(2) Consider the transition c1 −λ:x=$r→_p c2. Then, there is a position k in HW1(x) such that k =
ptr(p,HW1(x)). Let the memory type at HW1(x)[k] be (−,v1,−, S1 ∪ {p}). After the transition,
we obtain HW3(x) such that ptr(p,HW3(x)) = j − 1 > k . There are 2 possibilities.
(a) j − 1, j form the positions of the 2 symbols #, (msg,R($r ),p, {p}) in the newly added simple
word in HW1(x). Figure 13 depicts this case. Notice that in HW1(x), k = ptr(p,HW1(x)),
and positions j − 3, j − 2 represent the last two positions of a simple word. The new simple
word is added right after this in HW3(x), at positions j − 1, j.
Fig. 13. The higher order words in c1, c2, c3, c4 in case(a). The two pink positions correspond to the newly
added simple word. The positions j − 3, j − 2 have # and a ∈ Σ ∪ Γ denoting the end of a simple word in
HW1(x), so that a new simple word can be inserted right after. The position k in HW1(x) is ptr(p,HW1(x)).
HW1(x) ⊑ HW5(x) witnessed by the increasing function f . HW3(x),HW7(x) respectively are obtained from
HW1(x),HW5(x) by the wt(x ,v) transition.
Since c1 ⊑ c3, let f be an increasing function from the positions of HW1(x) to those of
HW5(x). HW7(x) is obtained from HW5(x) by inserting the new simple word right after
position f (j − 2), at positions f (j − 2) + 1, f (j − 2) + 2. The position f (j − 1) in HW5(x) is
shifted to the right by two positions in HW7(x). Thus, we can define an increasing function д
from the positions of HW3(x) to those of HW7(x) as follows.
– For i ∈ {1, . . . , j − 2}, д(i) = f (i),
– д(j − 1) = f (j − 2) + 1, д(j) = f (j − 2) + 2 (note that д(j − 1), д(j) are the two new positions
in HW7(x) corresponding to the new positions j − 1, j in HW3(x)),
– For i ∈ {j + 1, . . . ,n + 2}, д(i) = f (i − 2) + 2.
It is easy to see that д is an increasing function between the positions of HW3(x) and
HW7(x): we know that f (j − 2) < f (j − 1). Hence, д(j) = f (j − 2) + 2 < f (j − 1) + 2 = д(j + 1).
This also gives HW3(x) ⊑ HW7(x).
(b) j − 1 is the position obtained by appending to a simple word in HW1(x).
Fig. 14. The higher order words in c1, c2, c3, c4 in case(b). The pink position in HW3(x) corresponds to the
newly added memory type, right after # at position j − 3 in HW1(x). a ∈ Σ at position j − 2 in HW1(x) is
shifted to the left of # in HW3(x). The position k in HW1(x) is ptr(p,HW1(x)). HW1(x) ⊑ HW5(x)
witnessed by the increasing function f . HW3(x),HW7(x) respectively are obtained from HW1(x),HW5(x)
by the wt(x ,v) transition.
Figure 14 illustrates this case. HW1(x) ⊑ HW5(x) is witnessed by the increasing function f .
The new memory type is added at position f (j − 2) + 1 (right next to #), and all subsequent
symbols are shifted right by one position. It is easy to see that HW7(x) is obtained from
HW5(x) by the wt(x ,v) transition. The increasing function д from the positions of HW3(x)
to that of HW7(x) is defined as follows.
• For i ∈ {1, . . . , j − 2}, д(i) = f (i),
• д(j − 1) = f (j − 2) + 1,
• For i ∈ {j, . . . ,n + 1}, д(i) = f (i − 1) + 1.
Notice that д is an increasing function: д(j − 2) = f (j − 2) < f (j − 2) + 1 = д(j − 1),
д(j) = f (j − 1) + 1 > f (j − 2) + 1 = д(j − 1), and the same relationship holds for subsequent
indices.
(3) The case of c1 −λ:CAS(xi,$r1,$r2)→_p c2 is similar to the write.
(4) The case of a promise rule is exactly the same as the write rule, as far as monotonicity is concerned.
(5) The case of promise fulfilment is trivial for monotonicity, since we only shift the pointer of p,
and update prm to msg in the memory type.
(6) The case of reservation follows exactly like case (b) of the write rule.
(7) The case of cancellation is trivial for monotonicity since the operation does not change the
length of the word.
(8) The case of −SC-fence→_p is trivial by using the observation that the relative ordering of the
pointers p and д is the same in HW1(x) and HW5(x). HW3 and HW7 are obtained respectively
by moving the pointers of p,д to the rightmost one (whichever it is). So the same increasing
function that was used for HW1 ⊑ HW5 will work for HW3 ⊑ HW7.
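The index shifts that define the function д in cases (2)(a) and (2)(b) can be checked on small instances with the following Python sketch. Positions are 1-indexed as in the text, f is represented as a dictionary over {1, ..., n}, and the function names and the example values of f, n and j are assumptions chosen for illustration.

```python
def shift_case_a(f, n, j):
    """g for case (a): a new simple word occupies positions j - 1 and j."""
    g = {i: f[i] for i in range(1, j - 1)}
    g[j - 1] = f[j - 2] + 1
    g[j] = f[j - 2] + 2
    for i in range(j + 1, n + 3):
        g[i] = f[i - 2] + 2
    return g


def shift_case_b(f, n, j):
    """g for case (b): one new memory type is inserted at position j - 1."""
    g = {i: f[i] for i in range(1, j - 1)}
    g[j - 1] = f[j - 2] + 1
    for i in range(j, n + 2):
        g[i] = f[i - 1] + 1
    return g


def is_increasing(g):
    values = [g[i] for i in sorted(g)]
    return all(a < b for a, b in zip(values, values[1:]))


# Hypothetical witness f of HW1(x) ⊑ HW5(x) with n = 4 positions and j = 4.
F_MAP = {1: 1, 2: 3, 3: 4, 4: 7}
```

On this instance both shifted maps are strictly increasing, in line with the inequalities derived in the two cases.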
B DETAILS OF THE SOURCE TO SOURCE TRANSLATION
B.1 Intuition for the Translation
2K Timestamps. We bound the number of essential events by K . Why do 2K timestamps suffice?
Intuitively, timestamps are used to determine the relative order between the events. We track timestamps
of the view-switching messages (messages read by other processes), promises and reservations.
For each view-switch there are two timestamps of consequence: the timestamp of the reading
process before the read and the timestamp of the message to be read. Hence for each view switch,
the comparison operation requires us to maintain two timestamps. For a promise (reservation)
we maintain the timestamp of the promise (reservation). We do not explicitly store timestamps of
messages that will not view switch. These messages however may be read by the same process
that generated them. We keep track of whether the latest write can be read by the same process by
using some thread-local state.
K + n Contexts. It suffices to have K + n contexts since we can run the processes in the order
in which they generate view-switching messages. In each context, the process only depends on
the essential messages generated in previous contexts. If this were not the case we would get a
deadlock. We require n additional contexts to initialize each process.
B.2 Glossary of Global and Local Variables used in the SC Program
We first give a glossary of all the variables used in the code. The list contains variables global to all
processes or local to a process. A short description of the role of each variable is also given; these
descriptions serve as invariants.
(1) numEE : a global variable, initialized to 0, keeps track of the number of essential events
(promises, reservations and view switches) so far. Each time an essential event occurs, numEE
is incremented.
(2) numContexts : a global variable, initialized to 0, keeps track of the number of context switches
so far. This is used in the translation to SC.
(3) view[x].v : a local variable, stores the value of x ∈ Loc in the local view of the process
(4) view[x].t : local variable, stores the time stamp ∈ Time of x ∈ Loc in the local view of the
process.
(5) view[x].l : local variable, boolean, which is set to true when view[x].t is a valid timestamp,
and can be used in comparisons with timestamps of other messages.
(6) view[x].f : local variable, boolean. A true value indicates that view[x].v is recent, and can
be used for reading locally.
(7) view[x].u : local variable, boolean. A true value indicates that the sequence of events starting
from the one that resulted in the timestamp view[x].t till the most recent, form a chain of
CAS operations on x . Whenever a write is published, view[x].u is set to true. view[x].u
is set to false on an unpublished write. On a sequence of CAS operations, view[x].u is left
unchanged.
(8) checkMode : local variable, boolean. Set to true when the process is in certification phase,
which means the process is making and certifying promises.
(9) liveChain[x] : local variable, for each x ∈ Loc, boolean. Can be true only when checkMode
is true. A true value represents that the last write done while the process is in certification
phase is not a published promise message.
(10) extView[x] : local variable, for each x ∈ Loc, boolean. A true value represents that the local
value view[x].v of the process comes from a message generated external to the certification
phase.
(11) blockPromise[x] : a global boolean array, which for each x ∈ Loc stores whether promises
should be blocked on variable x . This is used in the case of ra writes when we cannot have
promises on the same variable later (refer to PS 2.0, ra accesses).
(12) avail[x][t] : for each x ∈ Loc, a global boolean array of length 2K + 1 corresponding to
the 2K + 1 time stamps, checks availability of a time stamp on a fresh write.
(13) usedReservations[x][t] : denotes whether the reservation on variable x with timestamp
t has been used by the process during the certification check. If this is not true, the reservation
will be cancelled.
(14) reserv[x][t] : denotes whether the reservation following timestamp t on variable x has
been claimed, and if so which process has claimed it.
(15) upd[x][t] : for each x ∈ Loc, a global boolean array of length 2K + 1 corresponding to the
2K + 1 time stamps, checks whether a certain timestamp has been used to read in a CAS.
(16) globalTimeMap[x] : global variable, for each x ∈ Loc, stores a time stamp ∈ Time. This is
used for simulating SC Fences where this functions as the G timemap from PS 2.0.
(17) messageStore : This is an array of messages, where each message is of type Message as
described in the main paper. The length of the array is K , the bound on the number of
promises + view switches.
(18) messagesUsed : a number from 0 to K which keeps track of the number of populated messages
in messageStore.
(19) messageNum : a number from 0 to K which chooses a number from the available free cells in
messageStore.
In addition, the message object stores the following data:
(1) mess.var is the shared variable on which the message has been generated
(2) mess.t[x] stores for each x ∈ Loc the timestamp of x in the view object stored in the
message
(3) mess.l[x] stores for each variable x ∈ Loc, a boolean signifying whether the corresponding
timestamp stored in mess.t[x] was one of the exact timestamps ∈ {0...K} or an abstract
timestamp.
(4) mess.val stores the value of the message
(5) mess.flag stores the promise state of the message, that is, (1) whether it has been fulfilled/is
not a promise, and (2) if it is a promise, the process that it belongs to. mess.flag takes values
from 0, -1, PIDs. If it is a simple message (not a promise), mess.flag = 0. If it is a promise,
mess.flag is set to the PID of the process which has made the promise. mess.flag is set to
-1 when the process has temporarily certified it in the current certification phase; it will be
reset to the PID after exiting the certification phase.
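The message object and the three-valued mess.flag can be summarised by the following Python sketch. It assumes that PIDs are positive integers, so that they cannot collide with the special values 0 and -1. The class layout and the helper names `certify` and `restore_after_certification` are illustrative and do not appear in the translation.

```python
from dataclasses import dataclass, field
from typing import Dict

NOT_A_PROMISE = 0        # simple message, or a promise that has been fulfilled
TEMP_CERTIFIED = -1      # promise certified in the current certification phase


@dataclass
class Message:
    var: str                                            # mess.var
    val: int                                            # mess.val
    t: Dict[str, int] = field(default_factory=dict)     # mess.t[x]
    l: Dict[str, bool] = field(default_factory=dict)    # mess.l[x]
    flag: int = NOT_A_PROMISE                           # mess.flag


def certify(mess: Message, pid: int) -> None:
    """Mark a promise of pid as temporarily certified."""
    if mess.flag == pid:
        mess.flag = TEMP_CERTIFIED


def restore_after_certification(mess: Message, pid: int) -> None:
    """On leaving the certification phase, the promise belongs to pid again."""
    if mess.flag == TEMP_CERTIFIED:
        mess.flag = pid
```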
Next we discuss the context switching modules.
B.3 Context Switching Modules
CSI Context-Switch-In. The CSI module switches the process into context by setting active to
true and incrementing numContexts. Finally we check numContexts does not exceed the context
switch bound.
Listing 1. CSI
if (!active) {
    atomic_begin();
    active = true;
    numContexts++;
    assume(numContexts <= K + n);
}
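The effect of the CSI module can be mimicked by a small executable Python sketch. Here a failed assume is modelled by an exception that discards the run, and the atomic section is omitted; the dictionaries `proc` and `glob` and the exception name `AssumeFailed` are assumptions made for the example.

```python
class AssumeFailed(Exception):
    """Raised when an assume(...) condition is false, which prunes the run."""


def assume(cond):
    if not cond:
        raise AssumeFailed()


def context_switch_in(proc, glob, K, n):
    """Bring an inactive process into context and count the context."""
    if not proc["active"]:
        proc["active"] = True
        glob["numContexts"] += 1
        assume(glob["numContexts"] <= K + n)
```

Calling the function on a process that is already active changes nothing, so only genuine context switches are counted against the bound K + n.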
CSO Context-Switch-Out. The CSO module has two functions: (1) moving the process from
normal to check mode and (2) switching the process out of context. When a process enters the
CSO block with checkMode set to false, it enters the ‘if’ branch on line 2, sets checkMode to true,
saves the return label (of the current instruction pointer) in retAddr and saves the process
state before entering check mode (lines 9-11). This ensures that the process returns to the current
instruction after the consistency check. After the consistency check phase the process switches
out of context. At this point, checkMode is true, and hence the process enters the ‘else’ branch on
line 13. There, we check that there are no outstanding uncertified promises for the
process (line 15). All the promises that have been certified are reset to belong to the process by
setting mess.flag to the PID (lines 16-18). Then it is checked that there are no uncertified splitting
insertions, by ensuring that liveChain[x] is not true (lines 20-22). Finally we check for unused
reservations during certification and cancel them (lines 23-30). Once these checks for a consistent
configuration are complete, we reload the saved state from before the consistency check phase and
reload the return address from retAddr. Then we move control to the instruction label in retAddr.
After returning control to the label, we set checkMode and active to false and exit context.
Listing 2. CSO
1 if (*){
2   if (!checkMode){
3     if (!active){
4       atomic_begin();
5       active = true;
6       numContexts++;
7       assume(numContexts <= K+n);
8     }
9     checkMode = true;
10    retAddr = label_i;
11    saveState(PID);
12  }
13  else {
14    for (mess in messageStore){
15      assume(mess.flag != PID);
16      if (mess.flag == -1){
17        mess.flag = PID;
18      }
19    }
20    for (x in Loc){
21      assume(!liveChain[x]);
22    }
23    for (x in Loc and t in Time){
24      if (reserv[x][t] == PID){
25        if (!usedReservations[x][t]){
26          reserv[x][t] = 0;
27          upd[x][t] = 1;
28        }
29      }
30    }
31    loadState(PID);
32    gotoLabel(retAddr);
33    label_i:
34    checkMode = false;
35    active = false;
36    atomic_end();
37  }
38 }
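The checks performed in the ‘else’ branch (lines 14-30) can be sketched in executable form as below. This Python version is an illustration under assumptions: a failed assume is reported by returning False, messageStore is reduced to a list of flag values, the arrays are dictionaries keyed by (location, timestamp), and 0 stands for an unclaimed reservation.

```python
def cso_exit_checks(pid, flags, live_chain, reserv, used_reservations, upd):
    """Return True iff every assume on lines 14-30 of Listing 2 would pass.

    flags: list of mess.flag values, one per entry of messageStore.
    live_chain: dict mapping a location to a boolean.
    reserv, used_reservations, upd: dicts keyed by (location, timestamp).
    """
    for i, flag in enumerate(flags):
        if flag == pid:        # an outstanding, uncertified promise of pid
            return False
        if flag == -1:         # certified in this phase: give it back to pid
            flags[i] = pid
    if any(live_chain.values()):
        return False           # an uncertified splitting insertion remains
    for key, owner in reserv.items():
        if owner == pid and not used_reservations.get(key, False):
            reserv[key] = 0    # cancel the unused reservation
            upd[key] = 1
    return True
```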
loadState and saveState subroutines. The saveState subroutine copies the local state of the
calling process and the global state into what we refer to as ‘copy’ variables. We note that it does
not, however, copy numEE, reserv[x][t] and the contents of messageStore. The reason is that
the promises the process makes in check mode are retained even after the process exits check mode.
Hence the increments made to numEE and the messages added to messageStore should be
maintained even after exiting check mode. This is also true for reservations, which are marked in
reserv[x][t] and are maintained even after the process exits check mode.
Analogously in loadState, we load the contents of the (saved) ‘copy variables’ into their original
counterparts. Another subtle point to be noted is that when the process publishes a message (as a
promise) when checkMode is true, we also update the ‘copy’ variables corresponding to
avail[x][t]. This is done so that when the process returns to normal mode, the changes are reflected in
their original counterparts (which is essential since promise messages are maintained beyond the
time checkMode is false and hence their timestamps must be unavailable).
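The interplay between saveState, loadState and promises published in check mode can be illustrated with the following Python sketch. The whole program state is modelled as one dictionary, which is an assumption of the example, and the helper `publish_promise` is a hypothetical stand-in for the code that publishes a promise message.

```python
import copy

# Variables that are deliberately not saved, so that changes survive check mode.
PERSISTENT = ("numEE", "reserv", "messageStore")


def save_state(state):
    """Copy everything except the persistent variables into 'copy' variables."""
    return {k: copy.deepcopy(v) for k, v in state.items() if k not in PERSISTENT}


def load_state(state, saved):
    """Restore the saved variables; persistent ones keep their current value."""
    for k, v in saved.items():
        state[k] = copy.deepcopy(v)


def publish_promise(state, saved, x, t, message):
    """Publish a promise in check mode and keep its timestamp unavailable."""
    state["messageStore"].append(message)
    state["numEE"] += 1
    state["avail"][(x, t)] = False
    saved["avail"][(x, t)] = False   # also update the 'copy' variable
```

After `load_state`, the local view is rolled back while the promise, the counter numEE and the unavailability of the promised timestamp are retained.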
B.4 Reads
We provide the translation code for reads of both access types, rlx and ra. We first explain
rlx reads.
rlx reads. A read is one of two types: view-switching, in which a message from
messageStore is acquired, or non-view-switching (local). We guess non-deterministically
which of the two applies.
In case of a local read (line 2), the process checks that the local value is usable (line 3) by checking
view[x].f which denotes whether view[x].v is a valid value which can be read. It then loads
its local value view[x].v into $r . The local value may become unusable if the process crosses an
SC-fence which increases its view[x].t (see SC-fence).
In the case of a view-switching read (line 6), we check that we have not reached the essential-event
bound K (line 9). We also ensure that liveChain[x] is false before the read (line 8) in order to forbid
additive insertions when checking consistency. Recall from the liveChain invariant that liveChain[x] is
true only when the process is in certification mode and the last write on x was neither published as a
promise message nor certified with a reservation. Reading a message from the memory when
liveChain[x] is true implies additive insertion during certification, as illustrated by the following
example.
Assume the process is in the promise certification mode, with view[x].t set to t1, and let
the first write use a timestamp t2 > t1, with the message not published as a promise and liveChain[x]
set to true. Now the instruction a:=x uses a message in the memory with a timestamp t3 ≥ t2.
x:=1; // t2
a:=x; // t3
x:=2; // t3 + 1
If the next write certifies a promise message, the interval in the message will be t3 + 1, since
liveChain[x] is true. This results in two writes during the certification, with non-adjacent
timestamps t2 and t3 + 1, of which only the latter is promised. This behaviour is forbidden in PS 2.0
due to capped memories. Notice that if the earlier write had also resulted in a promise message,
then we would not have additive insertion (since both writes are promised), and the read with
timestamp t2 is allowed since liveChain[x] is false.
Finally, a new message is fetched from messageStore with a timestamp larger than the one in
the current view (lines 10-13), and the process view is updated to include that new message
(lines 16-18). Whenever a process makes a global read during check mode, it must read from a
message which has been created outside its current certification phase. Hence, extView[x] is set
to true (see the extView invariant in the glossary).
Listing 3. readrlx
1 // local read
2 if(*){
3 ASSUME(view[x].f);
4 }
5 // (non-local) view-switching read
6 else {
7
8 ASSUME(!liveChain[x]);
9 ASSUME(numEE < K);
10 messNum = nondet(0, messageUsed - 1);
11 mess = messageStore[messNum];
12 ASSUME(mess.var == x);
13 ASSUME(mess.t[x] > view[x].t or (mess.t[x] == view[x].t and view[x].l == true));
14
15 // merge views on x
16 view[x].t = mess.t[x];
17 view[x].l = true;
18 view[x].v = mess.val;
19
20 extView[x] = true;
21 numEE++;
22 }
23 val($r) = view[x].v;
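The two branches of the rlx read can be mirrored in a small executable sketch. The Python below is illustrative only: the function name rlx_read, the dictionary layout, the default bound K=4, and the convention of returning None when an ASSUME fails are assumptions of this sketch. The nondeterministic choices (local versus view-switching, and which message to read) are passed in as arguments.

```python
import copy


def rlx_read(state, x, local, mess_num=None, K=4):
    """Simulate one rlx read of x.

    Returns (new_state, value), or None if an ASSUME fails
    (i.e. the guessed branch is not feasible)."""
    s = copy.deepcopy(state)
    view = s["view"][x]
    if local:
        if not view["f"]:              # ASSUME(view[x].f)
            return None
    else:
        if s["liveChain"][x]:          # ASSUME(!liveChain[x])
            return None
        if not s["numEE"] < K:         # ASSUME(numEE < K)
            return None
        mess = s["messageStore"][mess_num]
        if mess["var"] != x:           # ASSUME(mess.var == x)
            return None
        newer = mess["t"][x] > view["t"]
        same = mess["t"][x] == view["t"] and view["l"]
        if not (newer or same):
            return None
        # merge views on x
        view["t"] = mess["t"][x]
        view["l"] = True
        view["v"] = mess["val"]
        s["extView"][x] = True
        s["numEE"] += 1
    return s, view["v"]


state = {
    "view": {"x": {"t": 1, "l": True, "f": True, "v": 0}},
    "liveChain": {"x": False},
    "extView": {"x": False},
    "numEE": 0,
    "messageStore": [{"var": "x", "t": {"x": 3}, "val": 42}],
}
after, value = rlx_read(state, "x", local=False, mess_num=0)
_, local_value = rlx_read(state, "x", local=True)
```

Working on a copy keeps the caller's state intact when a branch is infeasible, which loosely corresponds to the model checker discarding paths on which an ASSUME fails.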
ra reads. This case is similar to the previous one, so we only state the points of difference.
The main difference is that, due to the ra access, we merge (take the join of) the timestamps of all
variables rather than just that of x, as we did for rlx.
Listing 4. readra
1 // local read
2 if(*){
3 ASSUME(view[x].f);
4 }
5 // (non-local) view-switching read
6 else{
7 ASSUME(numEE < K);
8 messNum = nondet(0, messageUsed - 1);
9 mess = messageStore[messNum];
10 ASSUME(mess.var == x);
11 ASSUME(mess.t[x] > view[x].t or (mess.t[x] == view[x].t and view[x].l == true));
12
13 // merge views
14 for (y in X){
15 if (mess.t[y] == view[y].t){
16 view[y].l = (mess.l[y]) and (view[y].l);
17 }
18 else if (mess.t[y] > view[y].t){
19 ASSUME(!liveChain[y]);
20 view[y].t = mess.t[y];
21 view[y].l = mess.l[y];
22 }
23 }
24 view[x].v = mess.val;
25
26 extView[x] = true;
27 numEE++;
28 }
29 val($r) = view[x].v;
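The view merge on lines 14-23 of Listing 4 is a pointwise join over all variables. A minimal sketch, assuming a dictionary-based view and a hypothetical helper name join_views (neither is part of the translation), with None standing for a failed ASSUME:

```python
import copy


def join_views(view, mess_t, mess_l, live_chain):
    """Join the message view (mess_t, mess_l) into the process view.

    Returns the merged view, or None if ASSUME(!liveChain[y]) fails
    for some variable whose timestamp would have to advance."""
    new_view = copy.deepcopy(view)
    for y in new_view:
        if mess_t[y] == new_view[y]["t"]:
            # same timestamp: the flag stays true only if both sides agree
            new_view[y]["l"] = mess_l[y] and new_view[y]["l"]
        elif mess_t[y] > new_view[y]["t"]:
            if live_chain[y]:
                return None
            new_view[y]["t"] = mess_t[y]
            new_view[y]["l"] = mess_l[y]
        # a smaller message timestamp leaves the process view unchanged
    return new_view


view = {
    "x": {"t": 1, "l": True},
    "y": {"t": 3, "l": True},
    "z": {"t": 5, "l": False},
}
mess_t = {"x": 2, "y": 3, "z": 4}
mess_l = {"x": True, "y": False, "z": True}
no_chain = {"x": False, "y": False, "z": False}
merged = join_views(view, mess_t, mess_l, no_chain)
```

Here x advances to the message timestamp, y keeps its timestamp but loses its flag, and z is unchanged because the process view is already ahead.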
B.5 Writes
We now provide the translation of a write instruction x = $r of a process. Once again we simulate
two access modes, rlx and ra. We first describe the relaxed mode and then discuss the changes for
the ra mode.
rlx writes. When in normal mode
Let us first consider execution in the normal phase (i.e., when checkMode is false). The value of
val($r) is recorded in the local view, view[x].v, and view[x].f is set to true, meaning that the
value in view[x].v is valid and can be read from. Then, we non-deterministically choose
one of three possibilities for the write: it either (i) is not assigned a fresh timestamp, (ii) is assigned
a fresh timestamp, or (iii) fulfils some outstanding promise. These nondeterministic branches are
given on lines 5, 24 and 60 of the code.
Listing 5. writerlx
1 view[x].v = val($r);
2 view[x].f = true;
3
4 // no fresh timestamp
5 if (*){
6 view[x].l = false;
7 if (checkMode and !liveChain[x]){
8 // new write does not rely on reservation
9 if (*){
10 // only true if process is in checkMode
11 liveChain[x] = true;
12 view[x].t = nondet(view[x].t, MAXTS);
13 }
14 // new write relies on reservation
15 else {
16 view[x].t = nondet(view[x].t, MAXTS);
17
18 ASSUME(upd[x][view[x].t] or reserve[x][view[x].t] == p);
19 reserve[x][view[x].t] = p;
20 upd[x][view[x].t] = false;
21 usedReservation[x][view[x].t] = true;
22 } } }
23 // a new timestamp is assigned to this write
24 else if(*){
25 view[x].l = true;
26
27 for (y in X){
28 newView[y] = 0;
29 newViewL[y] = true;
30 }
31
32 if (liveChain[x]){
33 newView[x] = view[x].t + 1;
34 }
35 else {
36 newView[x] = nondet(view[x].t + 1, MAXTS);
37 }
38
39 view[x].t = newView[x];
40 ASSUME(avail[x][newView[x]]);
41 avail[x][newView[x]] = false;
42
43 // essential message
44 if (*){
45 if (checkMode){
46 ASSUME(!blockPromise[x]);
47 mess = genMessage(x, newView, newViewL, val($r), -1);
48 liveChain[x] = false; numEE++;
49 }
50 else {
51 mess = genMessage(x, newView, newViewL, val($r), 0);
52 }
53 Publish(mess);
54 }
55 else{
56 ASSUME(!checkMode);
57 }
58 }
59 // a previous Promise is certified
60 else{
61 ASSUME(!blockPromise[x]);
62 view[x].l = true;
63 messageNum = nondet(0, messageUsed - 1);
64 mess = messageStore[messageNum];
65
66 // ensure that the message is a promise and matches variable and value
67 ASSUME(mess.var == x and mess.t[x] > view[x].t);
68 ASSUME(mess.val == val($r) and mess.flag == p);
69
70 if (checkMode){
71 mess.flag = -1;
72 ASSUME(!liveChain[x] or (view[x].t + 1 == mess.t[x]));
73 liveChain[x] = false;
74 }
75 else {
76 mess.flag = 0;
77 }
78
79 view[x].t = mess.t[x];
80 messageStore[messageNum] = mess;
81 }
82
83 if (!checkMode){
84 extView[x] = true;
85 } else {
86 extView[x] = false;
87 }
88
89 view[x].u = true;
In case (i), no message is created, and view[x].l is set to false, signifying that the timestamp
recorded in the view does not correspond to the most recent write to x and should therefore not be
used in comparisons. The ‘if’ branch on line 7 is not taken since checkMode is false.
In case (ii), the timestamp in the view is by definition valid, so we set view[x].l
to true (line 25). Since the write is relaxed, the message generated stores only the timestamp
of the variable written to (i.e., x) and 0 for all other variables (lines 27-30). Now we allocate a
new timestamp to the write. Since we are in normal mode, liveChain[x] is false (see the liveChain
invariant in the glossary). Thus we choose a timestamp nondeterministically (line 36) and store it into
view[x].t. We use the avail[x][.] array to ensure that allocated timestamps are unique: (1) we
check that the selected timestamp is available (i.e., not allocated) on line 40, and (2) we remove it
from the array of available timestamps (line 41). Now this message can either be published (for
consumption by another thread) or not. In the former case, the appropriate message is constructed
with newView and newViewL. Note that the last component of the message stores the flag mess.flag.
This flag is set to 0 since the message is not a promise (see the mess.flag invariant in the glossary).
In the latter case none of this is done (‘else’ branch on line 55); the ASSUME(!checkMode) there is
satisfied.
In case (iii), the process decides to fulfil a promise: a message is fetched from
messageStore and checked to be an unfulfilled promise by the current process (checking flag ==
p on line 68), mess.flag is set to 0, and the message is reinserted into messageStore. Additionally,
we set extView[x] to true, maintaining the extView invariant.
rlx writes. When in check mode
Let us now consider a write executing in the certification phase (i.e., when checkMode is true).
We only highlight the differences between normal-phase and certification-phase writes.
In case (i), that is, when a fresh timestamp is not assigned, the write is certified either by deferring
certification to a promise using splitting insertion (line 9) or by the presence of a reservation
(line 15). In the case where liveChain[x] is already true (line 7), certification for the current
sequence of writes is already deferred and hence we do neither of the two. When certifying by
either splitting or reservation, we nondeterministically choose a timestamp t after which the
current write occurs (lines 12 and 16). We note that this is not the timestamp of the write itself;
rather, it specifies between which two timestamps from Time the write occurs. If we rely on splitting
insertion (line 9), we set liveChain[x] to true. In the case of certification by reservation, we reserve
an interval adjacent to the timestamp t (line 19) after ensuring that it is available (line 18). Finally,
since this reservation has been used in some certification, we mark this fact (lines 20-21).
In case (ii), the write is assigned a timestamp from Time and is consequently published as a
promise. We allocate a fresh timestamp and store it into view[x].t. The most important point to
note is that we maintain and use the liveChain invariant whenever a fresh timestamp is assigned.
Indeed, if liveChain[x] is true, the process must assign the consecutive timestamp; otherwise it can
non-deterministically choose any timestamp greater than view[x].t (lines 32-37). Additionally,
when generating the message, mess.flag is set to -1, denoting that the message is a promise that
has been certified, and the message is published. We also increment numEE (line 48) as a promise is
an essential event.
In case (iii) we fulfil an older promise, and thus first retrieve an uncertified promise belonging
to the current process (mess.flag == PID) from messageStore (line 68). The main difference
from the normal mode is that we set mess.flag to -1, signifying that the promise is (temporarily)
certified but not fulfilled. We set extView[x] to false, signifying that the process's view has
come from checkMode and hence is not external.
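The timestamp choice in case (ii) (lines 32-41 of Listing 5) can be made concrete by enumerating the values that the nondeterministic choice may take. The sketch below is an illustration under stated assumptions: the helper name fresh_timestamp_choices is hypothetical, avail is modelled as a dictionary from timestamps to booleans, and nondet(a, MAXTS) is taken to include both endpoints.

```python
def fresh_timestamp_choices(view_t, live_chain, avail, max_ts):
    """List the timestamps a write may take when a fresh one is assigned.

    With a live chain only the consecutive timestamp view_t + 1 is allowed;
    otherwise any timestamp in (view_t, max_ts] may be chosen. In both cases
    the timestamp must still be available (ASSUME(avail[x][newView[x]]))."""
    if live_chain:
        candidates = [view_t + 1]
    else:
        candidates = range(view_t + 1, max_ts + 1)
    return [t for t in candidates if avail.get(t, False)]


# Hypothetical availability of timestamps 1..5 for one variable.
avail = {1: False, 2: True, 3: False, 4: True, 5: True}

free_choice = fresh_timestamp_choices(1, False, avail, 5)
chained_ok = fresh_timestamp_choices(1, True, avail, 5)
chained_blocked = fresh_timestamp_choices(2, True, avail, 5)
```

An empty result means every guess fails the ASSUME, so the write cannot take this branch. This is how a live chain forces adjacent timestamps during certification.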
Listing 6. writera
1 view[x].v = val($r);
2 view[x].f = true;
3
4 // no timestamp is assigned
5 if (*){
6 view[x].l = false;
7
8
9 if (checkMode){
10 blockPromise[x] = true;
11 ASSUME(!liveChain[x]);
12
13 view[x].t = nondet(view[x].t, MAXTS);
14
15 // new write must rely on reservation
16 ASSUME(upd[x][view[x].t] or reserve[x][view[x].t] == p);
17 reserve[x][view[x].t] = p;
18 usedReservation[x][view[x].t] = true;
19 }
20 }
21 // a new timestamp is assigned to this write
22 else {
23 ASSUME(!checkMode);
24
25 view[x].l = true;
26
27 for (y in X){
28 newView[y] = view[y].t;
29 newViewL[y] = view[y].l;
30 }
31
32 newView[x] = nondet(view[x].t + 1, MAXTS);
33
34 view[x].t = newView[x];
35 ASSUME(avail[x][newView[x]]);
36 avail[x][newView[x]] = false;
37
38 // essential message
39 if (*){
40 mess = genMessage(x, newView, newViewL, val($r), 0);
41 Publish(mess);
42 }
43 }
ra writes. The ra writes differ from rlx writes in a few ways. Firstly, the timestamps view[y].t
of all variables y are added to the published messages (lines 27-30). Next, we set blockPromise[x]
to true (line 10), signifying that henceforth there cannot be any promises on x (refer to PS 2.0, ra
accesses). This also implies that cases (ii) and (iii) (generating new promises and certifying earlier
promises) are not possible for ra writes, as enforced on line 23. Note that blockPromise[x] is also
assumed to be false in rlx writes when either generating new promises (ii) or certifying earlier ones (iii).
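The difference between the views attached to rlx and ra messages (lines 27-30 of Listings 5 and 6) can be sketched as follows. The helper name message_view and the dictionary layout are assumptions of this illustration; the flag of the written variable is set to true in both modes, mirroring the assignment view[x].l = true that precedes the loop in both listings.

```python
def message_view(view, x, new_ts, mode):
    """Build the (timestamps, flags) pair stored in a message written to x.

    rlx: only x carries a meaningful timestamp; all others are 0.
    ra:  the writer's whole current view is attached."""
    if mode == "rlx":
        t = {y: 0 for y in view}
        l = {y: True for y in view}
    elif mode == "ra":
        t = {y: view[y]["t"] for y in view}
        l = {y: view[y]["l"] for y in view}
    else:
        raise ValueError("mode must be 'rlx' or 'ra'")
    t[x] = new_ts
    l[x] = True
    return t, l


view = {"x": {"t": 2, "l": False}, "y": {"t": 7, "l": False}}
rlx_t, rlx_l = message_view(view, "x", 3, "rlx")
ra_t, ra_l = message_view(view, "x", 3, "ra")
```

A reader that acquires the ra message therefore learns the writer's timestamp on y as well, whereas the rlx message conveys nothing about y.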
B.6 CAS operations
We only provide code for the CAS(rlx, rlx) variant since the others are implemented similarly,
carrying over the access-dependent changes from the corresponding read and write codes. A CAS
combines a read and a write, additionally enforcing that their timestamps are consecutive.
Listing 7. CAS
1 view[x].v = val($r);
2 view[x].f = true;
3
4 // no fresh timestamp
5 if (*){
6 view[x].l = false;
7 if (checkMode and !liveChain[x]){
8 // new write does not rely on reservation
9 if (*){
10 // only true if process is in checkMode
11 liveChain[x] = true;
12 view[x].t = nondet(view[x].t, MAXTS);
13 }
14 // new write relies on reservation
15 else {
16 view[x].t = nondet(view[x].t, MAXTS);
17
18 ASSUME(upd[x][view[x].t] or reserve[x][view[x].t] == p);
19 reserve[x][view[x].t] = p;
20 upd[x][view[x].t] = false;
21 usedReservation[x][view[x].t] = true;
22 } } }
23 // a new timestamp is assigned to this write
24 else if(*){
25 view[x].l = true;
26
27 for (y in X){
28 newView[y] = 0;
29 newViewL[y] = true;
30 }
31
32 if (liveChain[x]){
33 newView[x] = view[x].t + 1;
34 }
35 else {
36 newView[x] = nondet(view[x].t + 1, MAXTS);
37 }
38
39 view[x].t = newView[x];
40 ASSUME(avail[x][newView[x]]);
41 avail[x][newView[x]] = false;
42
43 // essential message
44 if (*){
45 if (checkMode){
46 ASSUME(!blockPromise[x]);
47 mess = genMessage(x, newView, newViewL, val($r), -1);
48 liveChain[x] = false; numEE++;
49 }
50 else {
51 mess = genMessage(x, newView, newViewL, val($r), 0);
52 }
53 Publish(mess);
54 }
55 else{
56 ASSUME(!checkMode);
57 }
58 }
59 // a previous Promise is certified
60 else{
61 ASSUME(!blockPromise[x]);
62 view[x].l = true;
63 messageNum = nondet(0, messageUsed - 1);
64 mess = messageStore[messageNum];
65
66 // ensure that the message is a promise and matches variable and value
67 ASSUME(mess.var == x and mess.t[x] > view[x].t);
68 ASSUME(mess.val == val($r) and mess.flag == p);
69
70 if (checkMode){
71 mess.flag = -1;
72 ASSUME(!liveChain[x] or (view[x].t + 1 == mess.t[x]));
73 liveChain[x] = false;
74 }
75 else {
76 mess.flag = 0;
77 }
78
79 view[x].t = mess.t[x];
80 messageStore[messageNum] = mess;
81 }
82
83 if (!checkMode){
84 extView[x] = true;
85 } else {
86 extView[x] = false;
87 }
88
89 view[x].u = true;
B.7 Fences
SC-fence. The SC-fence command essentially merges the thread-local view with the globally
stored view in globalTimeMap. For each shared variable x we do the following. On line 3 we
check whether the globally stored view globalTimeMap[x] is greater than the process-local view.
If that is the case, we increase the process-local view view[x].t to the globally stored view.
Additionally, we set view[x].f to false since the value in view[x].v is no longer valid (it cannot
be read again, since the process timestamp has increased). In the other case (line 8), we raise
globalTimeMap[x] either to view[x].t (if it is valid, checked on line 9) or to the next higher
timestamp, view[x].t + 1.
Listing 8. SC-fence
1 assume (! checkMode);
2 for (x in Loc){
3 if (globalTimeMap[x] > view[x].t){
4 view[x].t = globalTimeMap[x];
5 view[x].f = 0;
6 view[x].l = true;
7 }
8 else {
9 if (view[x]){
10 globalTimeMap[x] = view[x].t;
11 }
12 else {
13 globalTimeMap[x] = view[x].t + 1;
14 }
15 }
16 }
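The two directions of the SC-fence merge can be traced on concrete values with the sketch below. It is illustrative only: the function name sc_fence and the dictionary layout are assumptions, and the validity test on line 9 of Listing 8 is read here as the flag view[x].l, which is an interpretation made for this sketch rather than something the listing states.

```python
def sc_fence(view, global_time_map):
    """Merge the process view with the global time map, in place."""
    for x in view:
        if global_time_map[x] > view[x]["t"]:
            # global view is ahead: catch up and invalidate the local value
            view[x]["t"] = global_time_map[x]
            view[x]["f"] = False
            view[x]["l"] = True
        elif view[x]["l"]:
            # ASSUMPTION: "valid" timestamp is modelled by the l flag
            global_time_map[x] = view[x]["t"]
        else:
            global_time_map[x] = view[x]["t"] + 1


view = {
    "x": {"t": 1, "l": True, "f": True},
    "y": {"t": 4, "l": True, "f": True},
    "z": {"t": 2, "l": False, "f": True},
}
global_time_map = {"x": 3, "y": 2, "z": 2}
sc_fence(view, global_time_map)
```

In this run x catches up with the global view and loses its readable local value, y raises the global entry to its own timestamp, and z raises it one step further because its recorded timestamp is not the latest write.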
C COMPLETE EXPERIMENTAL RESULTS
We report the results of experiments we have performed with PS2SC. We have two objectives: (1)
studying the performance of PS2SC on benchmarks which are unsafe only with promises and (2)
comparing PS2SC with other model checkers when operating in the promise free mode. In the first
case, we show that PS2SC is able to uncover bugs in examples with low interaction with the shared
memory. When this interaction increases, however, PS2SC performs poorly, owing to the huge
non-determinism required by PS 2.0. However, with partial promises, PS2SC is once again able to
uncover bugs in reasonable amounts of time. In the second case, our observations highlight the
ability to detect hard-to-find bugs with small K for unsafe benchmarks, and scalability by altering
K, as discussed earlier, in the case of safe benchmarks. We compare PS2SC with three state-of-the-art
stateless model checking tools, CDSChecker [Norris and Demsky 2013], GenMC [Kokologiannakis
et al. 2019] and Rcmc [Kokologiannakis et al. 2017], that support the promise-free subset of the PS
2.0 semantics.
We now report results of all the experiments we have performed with PS2SC. In the tables that
follow we provide the value of K used (for our tool only). We also specify the value of L used (for
all tools).
We do not consider compilation time for any tool while reporting the results. For our tool, the
time reported is the time taken by the CBMC backend for analysis. The timeout used is 1 hour for
all benchmarks. All experiments are conducted on a machine equipped with a 3.00 GHz Intel Core
i5-3330 CPU and 8GB RAM running an Ubuntu 16 64-bit operating system. We denote timeout by
‘TO’, and memory limit exceeded by ‘MLE’.
C.1 Experimenting with Promises
In this section we ask whether the source-to-source translation technique can effectively handle
promises for PS 2.0. We find that the source-to-source approach performs well on
programs requiring limited global memory interaction. When this interaction increases, PS2SC times
out, owing to the huge non-determinism of PS 2.0. However, the modular approach of partial
promises enables us to recover effective verification.
testcase K PS2SC
ARM_weak 4 0.765s
LB2cu 4 5.748s
LBcu 4 5.253s
Upd-Stuck 4 1.252s
split 4 25.737s
Table 8. Performance of PS2SC on PS 2.0 idioms
testcase K PS2SC
fib_local_3 4 0.742s
fib_local_4 4 0.761s
fib_local_cas_3 4 1.132s
fib_local_cas_4 4 1.147s
Table 9. Performance of PS2SC on cases with local update followed by promises
testcase K PS2SC[1p]
fib_global_2 4 55.972s
fib_global_3 4 2m4s
fib_global_4 4 4m20s
exp_global_1 4 19m37s
exp_global_2 4 41m12s
tri_global_2 4 52.973s
tri_global_3 4 1m57s
tri_global_4 4 3m58s
Table 10. Performance of PS2SC on cases with global update
C.2 Comparing Performance with Other Tools
benchmark L K PS2SC CDSChecker GenMC RCMC
exponential_5_unsafe 10 10 1.312s 0.900s 0.135s 6.692s
exponential_10_unsafe 10 10 1.854s 1.921s 0.367s 3m41s
exponential_25_unsafe 25 10 3.532s 7.239s 3.736s TO
exponential_50_unsafe 50 10 6.128s 36.361s 39.920s TO
exponential_70_unsafe 10 10 9.509s 1m33s 2m29s TO
fibonacci_2_unsafe 2 20 2.746s 2.332s 0.084s 0.086s
fibonacci_3_unsafe 3 20 9.392s 46m8s 0.462s 0.544s
fibonacci_4_unsafe 4 20 34.019s TO 12.437s 18.953s
fibonacci_2_safe 2 20 6.454s 8.900s 0.096s 0.162s
fibonacci_3_safe 3 20 30.936s TO 0.910s 3.884s
fibonacci_4_safe 4 20 2m16s TO 1.140s 2m36s
Table 11. Comparison of performance on a set of parameterized benchmarks
benchmark L K PS2SC CDSChecker GenMC RCMC
hehner2_unsafe 4 5 7.207s 0.033s 0.094s 0.087s
hehner3_unsafe 4 5 28.345s 0.036s 2m53s 1m13s
linuxlocks2_unsafe 2 4 0.547s 0.032s 0.073s 0.078s
linuxlocks3_unsafe 2 4 1.031s 0.031s 0.083s 0.081s
queue_2_safe 4 4 0.180s 0.031s 0.082s 0.085s
queue_3_safe 4 4 0.347s 0.037s 0.090s 0.092s
Table 12. Comparison of performance on concurrent data structures based benchmarks
benchmark L K PS2SC CDSChecker GenMC RCMC
readerwriter_7 0 5 0.719s 0.005s 0.057s 0.690s
readerwriter_8 0 5 0.839s 0.006s 0.056s 7.425s
readerwriter_9 0 5 1.068s 0.007s 0.053s 1m17s
readerwriter_10 0 5 1.393s 0.007s 0.056s 14m49s
redundant_co_10 10 5 0.470s 0.114s 0.087s 38m12s
redundant_co_20 20 5 1.031s 0.548s 0.218s TO
redundant_co_50 50 5 3.219s 8.965s 4.143s TO
redundant_co_70 70 5 6.093s 13.843s 18.185s TO
Table 13. Evaluation using two synthetic safe benchmarks. We note that the value of K is chosen to be large
enough to consider all executions.
benchmark L K PS2SC CDSChecker GenMC RCMC
peterson1U(4) 1 6 1.408s 0.039s TO 9.129s
peterson1U(6) 1 6 7.286s 0.010s TO TO
peterson1U(8) 1 6 47.786s TO TO TO
peterson1U(10) 1 6 4m19s TO TO TO
szymanski1U(4) 1 2 1.015s 0.043s MLE TO
szymanski1U(6) 1 2 2.771s TO MLE TO
szymanski1U(8) 1 2 6.176s TO TO TO
szymanski1U(10) 1 2 12.203s TO TO TO
Table 14. Comparison of performance on mutual exclusion benchmarks with a single unfenced process
benchmark L K PS2SC CDSChecker GenMC RCMC
peterson1C(3) 1 2 0.487s 0.053s 0.083s 0.087s
peterson1C(4) 1 2 1.193s 3.500s TO 3.360s
peterson1C(5) 1 2 2.713s TO TO TO
peterson1C(6) 1 2 6.045s TO TO TO
peterson1C(7) 1 2 11.008s TO TO TO
peterson2C(3) 1 2 0.481s 0.032s 0.099s 0.091s
peterson2C(4) 1 2 1.241s 0.037s TO 9.162s
peterson2C(5) 1 2 2.801s 1m47s TO TO
peterson2C(6) 1 2 6.528s TO TO TO
peterson2C(7) 1 2 11.030s TO TO TO
Table 15. Comparison of performance on completely fenced peterson mutual exclusion benchmarks with a
bug introduced in the critical section of a single process
benchmark L K PS2SC CDSChecker GenMC RCMC
peterson(3) 1 2 0.878s TO 9.665s 26.208s
peterson(2) 1 2 0.321s 0.325s 0.087s 0.068s
peterson(3) 2 4 1.695s TO MLE TO
peterson(2) 2 4 0.539s 15m22s 0.039s 0.428s
peterson(3) 4 4 15.900s TO MLE TO
peterson(2) 4 4 3.412s TO TO TO
Table 16. Evaluation using safe mutual exclusion protocols
