SMT-Based Bounded Model Checking of Multi-threaded Software in Embedded Systems by Cordeiro, Lucas
Lucas Cordeiro
lcc08r@ecs.soton.ac.uk
Embedded systems are ubiquitous, but their verification is becoming more difficult.
• the functionality demanded has increased significantly
– peer reviewing and testing
• multi-core processors with scalable shared memory
– but software model checkers focus on single-threaded programs or on multi-threaded programs with message passing
void *threadA(void *arg) {
  lock(&mutex);
  x++;
  /* ... */
  if (x == 0) unlock(&lock);
  unlock(&mutex);
}
void *threadB(void *arg) {
  lock(&mutex);
  y++;
  /* ... */
}
Deadlock
Bounded Model Checking (BMC)
Basic idea: check the negation of a given property up to a given depth
[Figure: transition system unrolled into M0, M1, M2, …, Mk-1, Mk; ¬ϕ0, ¬ϕ1, ¬ϕ2, …, ¬ϕk-1, ¬ϕk is checked at the corresponding step; a violation yields a counterexample trace]
• transition system M unrolled k times
– for programs: unroll loops, unfold arrays, …
• translated into verification condition ψ such that
ψ is satisfiable iff ϕ has a counterexample of max. depth k
• has been applied successfully to verify (sequential) software
BMC of Multi-threaded Software
• concurrency bugs are tricky to reproduce/debug because they usually occur under specific thread interleavings
– most common errors: 67% related to atomicity and order violations, 30% related to deadlock [Lu et al.’08]
• problem: the number of interleavings grows exponentially with the number of threads (n) and program statements (s)
– number of executions: O(n^s)
– context switches among threads increase the number of possible executions
• two important observations help us:
– concurrency bugs are shallow [Qadeer&Rehof’05]
– SAT/SMT solvers produce unsatisfiable cores that allow us to remove possible undesired models of the system
Objective of this work
Exploit SMT to extend BMC of embedded software
• exploit SMT solvers to:
– encode full ANSI-C into the different background theories
– prune the property and data dependent search space
– remove interleavings that are not relevant by analyzing the proof of unsatisfiability
• propose three approaches to SMT-based BMC:
– lazy exploration of the interleavings
– schedule guards to encode all interleavings
– underapproximation and widening (UW) [Grumberg et al.’05]
• evaluate our approaches implemented in ESBMC over embedded software applications
Agenda
• SMT-based BMC for Embedded ANSI-C Software
• Verifying Multi-threaded Software
• Implementation of ESBMC
• Integrating ESBMC into Software Engineering Practice
• Conclusions and Future Work
Satisfiability Modulo Theories (1)
SMT decides the satisfiability of first-order logic formulae using the combination of different background theories (⇒ building-in operators).
Theory              Example
Equality            x1=x2 ∧ ¬(x1=x3) ⇒ ¬(x1=x3)
Bit-vectors         (b >> i) & 1 = 1
Linear arithmetic   (4y1 + 3y2 ≥ 4) ∨ (y2 – 3y3 ≤ 3)
Arrays              (j = k ∧ a[k]=2) ⇒ a[j]=2
Combined theories   (j ≤ k ∧ a[j]=2) ⇒ a[i] < 3
Satisfiability Modulo Theories (2)
• Given
– a decidable ∑-theory T
– a quantifier-free formula ϕ
ϕ is T-satisfiable iff T ∪ {ϕ} is satisfiable, i.e., there exists a structure that satisfies both the formula and the sentences of T
• Given
– a set Γ ∪ {ϕ} of first-order formulae over T
ϕ is a T-consequence of Γ (Γ ⊧T ϕ) iff every model of T ∪ Γ is also a model of ϕ
• Checking Γ ⊧T ϕ can be reduced in the usual way to checking the T-satisfiability of Γ ∪ {¬ϕ}
Satisfiability Modulo Theories (3)
• let a be an array, b, c and d be signed bit-vectors of width 16, 32 and 32 respectively, and let g be a unary function.
g(select(store(a, c, 12), SignExt(b, 16) + 3)) ≠ g(SignExt(b, 16) − c + 4) ∧ SignExt(b, 16) = c − 3 ∧ c + 1 = d − 4
b' extends b to the signed equivalent bit-vector of size 32
step 1: g(select(store(a, c, 12), b' + 3)) ≠ g(b' − c + 4) ∧ b' = c − 3 ∧ c + 1 = d − 4
replace b' by c − 3 in the inequality
step 2: g(select(store(a, c, 12), c − 3 + 3)) ≠ g(c − 3 − c + 4) ∧ c − 3 = c − 3 ∧ c + 1 = d − 4
using facts about bit-vector arithmetic
step 3: g(select(store(a, c, 12), c)) ≠ g(1) ∧ c − 3 = c − 3 ∧ c + 1 = d − 4
Satisfiability Modulo Theories (4)
applying the theory of arrays
step 4: g(12) ≠ g(1) ∧ c − 3 = c − 3 ∧ c + 1 = d − 4
The function g implies that for all x and y, if x = y, then g(x) = g(y) (congruence rule).
step 5: SAT (c = 5, d = 10)
• SMT solvers also apply:
– standard algebraic reduction rules: r ∧ false ↦ false
– contextual simplification: a = 7 ∧ p(a) ↦ a = 7 ∧ p(7)
Software BMC using ESBMC
• program modelled as state transition system
– state: program counter and program variables
– derived from control-flow graph
– checked safety properties give extra nodes
• unfolded program optimized to reduce blow-up (crucial)
– constant propagation
– forward substitutions
• front-end converts unrolled and optimized program into SSA
g1 = x1 == 0
a1 = a0 WITH [i0:=0]
a2 = a0
a3 = a2 WITH [2+i0:=1]
a4 = g1 ? a1 : a3
t1 = a4 [1+i0] == 1
• extraction of constraints C and properties P
– specific to selected SMT solver, uses theories
C := g1 := (x1 = 0)
   ∧ a1 := store(a0, i0, 0)
   ∧ a2 := a0
   ∧ a3 := store(a2, 2 + i0, 1)
   ∧ a4 := ite(g1, a1, a3)
P := … ∧ 1 + i0 ≥ 0 ∧ 1 + i0 < 2 ∧ select(a4, 1 + i0) = 1
Encoding of Numeric Types
• SMT solvers typically provide different encodings for numbers:
– abstract domains (Z, R)
– fixed-width bit vectors (unsigned int, …)
> “internalized bit-blasting”
• verification results can depend on encodings
(a > 0) ∧ (b > 0) ⇒ (a + b > 0)
> valid in abstract domains such as Z or R
> doesn’t hold for bitvectors, due to possible overflows
– majority of VCs solved faster if numeric types are modelled by abstract domains but possible loss of precision
– ESBMC supports both types of encoding and also combines them to improve scalability and precision
Encoding Numeric Types as Bitvectors
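The implication (a > 0) ∧ (b > 0) ⇒ (a + b > 0) is a convenient test case for why bit-vector encodings need care. The following is a minimal sketch, not ESBMC code: it evaluates the implication once over `long` (standing in for Z on small inputs) and once with the sum wrapped to an 8-bit two's-complement value. The function names and the 8-bit width are assumptions chosen for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Interpret the low 8 bits of v as a signed two's-complement value.
   Computed arithmetically to avoid implementation-defined casts. */
static int wrap_to_int8(long v)
{
    long m = v % 256;          /* remainder may be negative in C      */
    if (m < 0) m += 256;       /* now m is in 0..255                  */
    return (int)(m >= 128 ? m - 256 : m);
}

/* The implication evaluated over a wide type (approximating Z). */
bool holds_in_Z(long a, long b)
{
    return !(a > 0 && b > 0) || (a + b > 0);
}

/* The same implication when a + b is an 8-bit signed bit-vector. */
bool holds_in_bv8(int8_t a, int8_t b)
{
    int sum = wrap_to_int8((long)a + (long)b);
    return !(a > 0 && b > 0) || (sum > 0);
}
```

With a = b = 100 the wide version holds, while the 8-bit sum wraps to a negative value and the implication fails, which is the kind of counterexample a bit-vector encoding can report and an abstract-domain encoding cannot.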
Bitvector encodings need to handle
• type casts and implicit conversions
– arithmetic conversions implemented using word-level functions (part of the bitvector theory: Extract, SignExt, …)
> different conversions for every pair of types
> uses type information provided by front-end
– conversion to / from bool via if-then-else operator
t = ite(v ≠ k, true, false)  //conversion to bool
v = ite(t, 1, 0)             //conversion from bool
• arithmetic over- / underflow
– standard requires modulo-arithmetic for unsigned integer
unsigned_overflow ⇔ (r – (r mod 2^w)) < 2^w
– define error literals to detect over- / underflow for other types
res_op ⇔ ¬ overflow(x, y) ∧ ¬ underflow(x, y)
> similar to conversions
Floating-Point Numbers
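A fixed-point number can be stored as one bit-vector b = i @ f, where i holds the integral bits and f the fractional bits. The sketch below illustrates that layout and the Extract operation on 32-bit words. It is an assumption-laden illustration rather than ESBMC's implementation: the helper names (`fx_concat`, `bv_extract`, `fx_integral`, `fx_fraction`) are hypothetical, and only the narrowing cases (target has no more bits than the source) are covered.

```c
#include <stdint.h>

/* Extract(b, hi, lo): bits hi..lo of b (inclusive), hi >= lo, hi < 32. */
uint32_t bv_extract(uint32_t b, unsigned hi, unsigned lo)
{
    unsigned width = hi - lo + 1;
    uint32_t mask = (width >= 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    return (b >> lo) & mask;
}

/* b = i @ f: integral part i followed by nb fractional bits (1 <= nb < 32). */
uint32_t fx_concat(uint32_t i, uint32_t f, unsigned nb)
{
    return (i << nb) | (f & ((1u << nb) - 1u));
}

/* Integral part with ma bits: Extract(b, nb + ma - 1, nb). */
uint32_t fx_integral(uint32_t b, unsigned nb, unsigned ma)
{
    return bv_extract(b, nb + ma - 1, nb);
}

/* Leading na fractional bits (na <= nb): Extract(b, nb - 1, nb - na). */
uint32_t fx_fraction(uint32_t b, unsigned nb, unsigned na)
{
    return bv_extract(b, nb - 1, nb - na);
}
```

For example, 2.5 with four integral and four fractional bits is i = 2 and f = 1000 in binary, so b is 0x28; extracting with ma = 4 gives back 2, and keeping a single fractional bit gives 1 (one half).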
• over-approximate floating-point by fixed-point numbers
– encode the integral (i) and fractional (f) parts
• binary encoding: get a new bit-vector b = i @ f with the same bitwidth before and after the radix point of a.
// m = number of bits of i
i = Extract(b, nb + ma – 1, nb)                : ma ≤ mb
    SignExt(Extract(b, tb – 1, nb), ma – mb)   : otherwise
// n = number of bits of f
f = Extract(b, nb – 1, nb – na)                : na ≤ nb
    ZeroExt(Extract(b, nb – 1, 0), na – nb)    : otherwise
// p = number of decimal places
Encoding of Pointers
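Modelling a pointer as a tuple of an underlying object and an index can be mimicked directly in C. The sketch below is an illustration under assumptions, not the encoding ESBMC emits: the struct and function names are hypothetical, and the `size` field is extra bookkeeping added here so that a bounds check can be expressed.

```c
#include <stdbool.h>
#include <stddef.h>

/* A pointer as a tuple: underlying object (p.o) and index into it (p.i). */
struct ptr_model {
    const int *obj;   /* p.o: base of the underlying object            */
    size_t     size;  /* element count of that object (sketch only)   */
    long       idx;   /* p.i: index relative to the base              */
};

/* p = a: point at element 0 of an array with n elements. */
struct ptr_model ptr_to_array(const int *a, size_t n)
{
    struct ptr_model p = { a, n, 0 };
    return p;
}

/* p + k: only the index component changes. */
struct ptr_model ptr_add(struct ptr_model p, long k)
{
    p.idx += k;
    return p;
}

/* True iff dereferencing p stays inside the underlying object. */
bool ptr_in_bounds(struct ptr_model p)
{
    return p.idx >= 0 && (size_t)p.idx < p.size;
}

/* Guarded *p: returns false instead of reading out of bounds. */
bool ptr_load(struct ptr_model p, int *out)
{
    if (!ptr_in_bounds(p)) return false;
    *out = p.obj[p.idx];
    return true;
}
```

For a two-element array, `ptr_add(p, 1)` can be loaded, whereas `ptr_add(p, 2)` fails the bounds check: the index has moved past the object even though the object component is unchanged.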
• arrays and records / tuples typically handled directly by SMT-solver
• pointers modelled as tuples
– p.o ≙ representation of underlying object
– p.i ≙ index (if pointer used as array base)
int main() { … }
C := p1 := store(p0, 0, &a[0])               (store object at position 0)
   ∧ p2 := store(p1, 1, 0)                   (store index at position 1)
   ∧ g1 := (x1 == 0)
   ∧ a1 := store(a0, i0, 0)
   ∧ a2 := a0
   ∧ a3 := store(a2, 1 + i0, 1)
   ∧ a4 := ite(g1, a1, a3)
   ∧ p3 := store(p2, 1, select(p2, 1) + 2)   (update index)
P := i0 ≥ 0 ∧ i0 < 2
   ∧ 1 + i0 ≥ 0 ∧ 1 + i0 < 2
   ∧ select(p3, 0) == &a[0]
   ∧ select(select(p3, 0), select(p3, 1)) == 1
(a[2] unconstrained) ⇒ assert fails
Encoding of Memory Allocation
• model memory just as an array of bytes (array theories)
– read and write operations to the memory array on the logic level
• each dynamic object do consists of
– m ≙ memory array
– s ≙ size in bytes of m
– ρ ≙ unique identifier
– υ ≙ indicates whether the object is still alive
– l ≙ the location in the execution where m is allocated
• to detect invalid reads/writes, we check whether
– do is a dynamic object
– i is within the bounds of the memory array
l_is_dynamic_object ⇔ (⋁_{j=1}^{k} do.ρ = j) ∧ (0 ≤ i < n)
• to check for invalid objects, we
– set υ to true when the function malloc is called (do is alive)
– set υ to false when the function free is called (do is no longer alive)
l_valid_object ⇔ (l_is_dynamic_object ⇒ do.υ)
• to detect forgotten memory, at the end of the (unrolled) program we check
– whether the do has been deallocated by the function free
l_deallocated_object ⇔ (l_is_dynamic_object ⇒ ¬do.υ)
Example of Memory Allocation
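The bookkeeping behind the validity and deallocation checks can be imitated with a small table of dynamic objects. This is a minimal sketch under assumptions, not ESBMC's memory model: `heap_model`, `model_malloc`, `model_free`, `valid_access` and `first_leaked` are hypothetical names, the table size is arbitrary, and the memory array m and the allocation location l are left out.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_OBJECTS 8

/* One dynamic object: identifier rho, size s, alive flag upsilon. */
struct dyn_obj {
    int    rho;
    size_t s;
    bool   alive;
};

struct heap_model {
    struct dyn_obj objs[MAX_OBJECTS];
    int count;
};

/* Models malloc: returns the table index of the new object, or -1 if full. */
int model_malloc(struct heap_model *h, size_t size)
{
    if (h->count >= MAX_OBJECTS) return -1;
    int idx = h->count++;
    h->objs[idx].rho = idx + 1;     /* identifiers start at 1 */
    h->objs[idx].s = size;
    h->objs[idx].alive = true;      /* upsilon := true        */
    return idx;
}

/* Models free: upsilon := false. */
void model_free(struct heap_model *h, int idx)
{
    if (idx >= 0 && idx < h->count) h->objs[idx].alive = false;
}

/* Access check: a known dynamic object, still alive, with 0 <= i < s. */
bool valid_access(const struct heap_model *h, int idx, size_t i)
{
    return idx >= 0 && idx < h->count && h->objs[idx].alive && i < h->objs[idx].s;
}

/* End-of-program check: rho of the first object never freed, or 0 if none. */
int first_leaked(const struct heap_model *h)
{
    for (int k = 0; k < h->count; k++)
        if (h->objs[k].alive) return h->objs[k].rho;
    return 0;
}
```

Allocating two objects, overwriting the handle of the first with the second, and freeing through that handle leaves the first object alive at the end, so `first_leaked` returns its identifier 1.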
#include <stdlib.h>
void main() {
  char *p = malloc(5);  // ρ = 1
  char *q = malloc(5);  // ρ = 2
  p = q;
  free(p);
  p = malloc(5);        // ρ = 3
  free(p);
}
memory leak: pointer reassignment makes do1 an orphan
C := do1.ρ=1 ∧ do1.s=5 ∧ do1.υ=true ∧ p=do1
   ∧ do2.ρ=2 ∧ do2.s=5 ∧ do2.υ=true ∧ q=do2
   ∧ p=do2 ∧ do2.υ=false
   ∧ do3.ρ=3 ∧ do3.s=5 ∧ do3.υ=true ∧ p=do3
   ∧ do3.υ=false
P := ¬do1.υ ∧ ¬do2.υ ∧ ¬do3.υ
(¬do1.υ is violated: do1.υ is still true at the end of the program)
Evaluation
Comparison of SMT solvers




• Set-up:
– identical ESBMC front-end, individual back-ends
– operations not supported by SMT-solvers are axiomatized
– standard desktop PC, time-out 3600 seconds

Module                  #L   #P   CVC3          Boolector     Z3
InsertionSort (n=35)    86   17   4 (5) 0       3 (3) 0       3 (3) 0
Prim                    79   30   5 (2) 0       <1 (<1) 0     <1 (<1) 0
StrCmp                  14    6   11 (454) 0    195 (257) 0   35 (46) 0
MinMax                  19    9   Tb (Mb) 1     42 (7) 0      6 (7) 0
lms                    258   23   225 (324) 0   303 (307) 0   306 (307) 0
Bitwise                 18    1   3 (6) 0       7 (8) 0       30 (26) 0
adpcm_encode           149   12   6 (26) 0      6 (6) 0       6 (6) 0
adpcm_decode           111   10   3 (27) 0      3 (3) 0       3 (3) 0

All SMT-solvers can handle the VCs from the embedded applications
CVC3 doesn’t scale that well and runs …
Boolector and Z3 roughly …
The native API is slightly faster than the SMT-LIB …

Comparison to SMT-CBMC [A. Armando et al.]
• SMT-based BMC for C, built on top of CVC3 (hard-coded)
– limited coverage of language
• Goal: compare efficiency of encodings

Module               ESBMC (Z3)   ESBMC (CVC3)   SMT-CBMC (CVC3)
BubbleSort (n=35)    <1 (<1)      2 (2)          100
BellmanFord          <1 (<1)      <1 (<1)        43
Prim                 <1 (<1)      <1 (<1)        96
StrCmp               27 (38)      7 (261)        T
SumArray             25 (<1)      <1 (108)       98
MinMax               6 (6)        Tb (Mb)        65

ESBMC substantially faster, even with identical solvers ⇒ probably better encoding
Z3 uniformly better than CVC3

Agenda
• SMT-based BMC for Embedded ANSI-C Software
• Verifying Multi-threaded Software
• Implementation of ESBMC
• Integrating ESBMC into Software Engineering Practice
• Conclusions and Future Work
Lazy exploration of interleavings
Idea: iteratively generate all possible interleavings and …
[Figure: tool flow from C/C++ source to verification result. Labels: parse and type-check; scheduler; BMC; SMT; properties (deadlock, atomicity and order violations, etc.); check satisfiability using an SMT solver; stop the generate-and-test loop if there is an error; reused/extended from the CProver framework]
• Main steps of the algorithm:
1. Initialize the stack with the initial node υ0 and the initial path π0 = 〈υ0〉.
2. If the stack is empty, terminate with “no error”.
3. Pop the current node υ and current path π off the stack and compute the set υ’ of successors of υ using rules R1-R8.
4. If υ’ is empty, derive the VC ϕ_k^π for π and call the SMT solver on it. If ϕ_k^π is satisfiable, terminate with “error”; otherwise, goto step 2.
5. If υ’ is not empty, then for each node ν ∈ υ’, add ν to π, and push node and extended path on the stack. Goto step 3.
ϕ_k^π = I(s0) ∧ R(s0, s1) ∧ … ∧ R(s_{k-1}, s_k) ∧ ¬φ_k
(constraints: I(s0) ∧ R(s0, s1) ∧ … ∧ R(s_{k-1}, s_k); property: ¬φ_k)
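The generate-and-test loop in the steps above can be sketched in C with an explicit stack. This is a toy model under stated assumptions, not the ESBMC scheduler: there are exactly two threads, every statement is a possible context-switch point, and the SMT call is replaced by a caller-supplied predicate over the finished path. The names `lazy_explore`, `path_check`, `no_violation` and `ends_with_thread0` are hypothetical.

```c
#include <stdbool.h>
#include <string.h>

#define MAX_STEPS 16   /* bound on len0 + len1                          */
#define MAX_STACK 64   /* depth-first search needs at most MAX_STEPS+1  */

/* A node: program counters of both threads plus the path so far,
   recorded as the sequence of thread ids that executed a statement. */
struct node {
    int pc[2];
    int path[MAX_STEPS];
    int depth;
};

/* Stand-in for "derive the VC and call the SMT solver":
   returns true if the complete path violates the property. */
typedef bool (*path_check)(const int *path, int depth);

/* Returns true ("error") at the first violating interleaving;
   *checked counts the complete interleavings that were examined. */
bool lazy_explore(int len0, int len1, path_check check, long *checked)
{
    static struct node stack[MAX_STACK];
    int len[2] = { len0, len1 };
    int top = 0;

    *checked = 0;
    if (len0 < 0 || len1 < 0 || len0 + len1 > MAX_STEPS) return false;

    memset(&stack[0], 0, sizeof stack[0]);
    top = 1;                                   /* step 1: initial node */

    while (top > 0) {                          /* step 2 */
        struct node cur = stack[--top];        /* step 3: pop          */
        bool has_succ = false;
        for (int t = 0; t < 2; t++) {
            if (cur.pc[t] < len[t]) {          /* thread t can move    */
                struct node next = cur;
                next.pc[t]++;
                next.path[next.depth++] = t;
                stack[top++] = next;           /* step 5: push         */
                has_succ = true;
            }
        }
        if (!has_succ) {                       /* step 4: leaf         */
            (*checked)++;
            if (check(cur.path, cur.depth)) return true;
        }
    }
    return false;                              /* "no error" */
}

/* Example predicates for trying the loop out. */
bool no_violation(const int *path, int depth)
{
    (void)path; (void)depth;
    return false;
}

bool ends_with_thread0(const int *path, int depth)
{
    return depth > 0 && path[depth - 1] == 0;
}
```

With `no_violation`, two threads of two statements each lead to six checked interleavings, one "solver call" per path. That is the cost the lazy approach pays on correct programs; with a failing predicate the loop stops at the first violating path.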




Running Example
• the program has sequences of operations that need to be protected together to avoid atomicity violation
– requirement: the region of code (val1 and val2) should execute atomically
A state s ∈ S consists of the value of the program counter pc and the values of all program variables
Thread twoStage
1:  lock(m1);
2:  val1 = 1;
3:  unlock(m1);
4:  lock(m2);
5:  val2 = val1 + 1;
6:  unlock(m2);
Thread reader
7:  lock(m1);
8:  if (val1 == 0) {
9:    unlock(m1);
10:   return NULL; }
11: t1 = val1;
12: unlock(m1);
13: lock(m2);
…
global variables: val1=0; val2=0;
local variables: t1= -1; t2= -1;

Lazy exploration: interleaving Is
statements: 1-2-3-7-8-11-12-4-5-6-13-14-15-16
val1-access: W(twoStage,2) - R(reader,8) - R(reader,11) - R(twoStage,5)
val2-access: W(twoStage,5) - R(reader,14)
context switches shown: CS1, CS2
• statement 2: write access to the shared variable val1 in thread twoStage; val1=1
• statement 8: read access to the shared variable val1 in thread reader
• after statement 11: val1=1; val2=0; t1= 1; t2= -1
• after statement 5: val1=1; val2=2; t1= 1; t2= -1
QF formula is unsatisfiable, i.e., assertion holds

Lazy exploration: interleaving If
statements: 1-2-3-7-8-11-12-13-14-15-16-4-5-6
val1-access: W(twoStage,2) - R(reader,8) - R(reader,11) - R(twoStage,5)
val2-access: R(reader,14) - W(twoStage,5)
context switches shown: CS1, CS2
• after statement 16: val1=1; val2=0; t1= 1; t2= 0
• after statement 6: val1=1; val2=2; t1= 1; t2= 0
QF formula is satisfiable, …
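The two interleavings can be replayed with a small C simulation that executes statements in a given order. It is a sketch under loud assumptions: statements 14 to 16 of the reader thread are not shown in the listing, so `t2 = val2` at statement 14 is inferred from the recorded val2 read access, and the assertion `t2 == t1 + 1` at statement 16 is a guess consistent with the atomicity requirement. Locks and the early return at statement 8 are not modelled, and `run_schedule` is a hypothetical name.

```c
#include <stdbool.h>
#include <stddef.h>

/* Shared (val1, val2) and local (t1, t2) variables of the example. */
struct state { int val1, val2, t1, t2; };

/* Executes the statements in the given order and reports whether the
   (assumed) assertion at statement 16 holds. The final variable values
   are copied to *out when out is not NULL. */
bool run_schedule(const int *stmts, int n, struct state *out)
{
    struct state s = { 0, 0, -1, -1 };   /* initial values from the slide */
    bool holds = true;

    for (int k = 0; k < n; k++) {
        switch (stmts[k]) {
        case 2:  s.val1 = 1;                 break;
        case 5:  s.val2 = s.val1 + 1;        break;
        case 11: s.t1 = s.val1;              break;
        case 14: s.t2 = s.val2;              break;  /* assumed statement */
        case 16: holds = (s.t2 == s.t1 + 1); break;  /* assumed assertion */
        default: break;   /* lock, unlock and the test at 8 change no variable */
        }
    }
    if (out != NULL) *out = s;
    return holds;
}
```

Replaying 1-2-3-7-8-11-12-4-5-6-13-14-15-16 ends with t1 = 1 and t2 = 2, so the assumed assertion holds; replaying 1-2-3-7-8-11-12-13-14-15-16-4-5-6 reads val2 before it is written, leaving t2 = 0, and the assertion fails.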










global and local variables
active thread, context bound
CS1
Lazy Approach: State Transitions
execution paths














CS2• use a reachability tree (RT) to describe reachable states of a 
multi-threaded program
• each node in the RT is a tuple for a 
given time step i, where:
– Ai represents the currently active thread
– C represents the context switch number












=1 , , , , υ
– Ci represents the context switch number
– si represents the current state
– represents the current location of thread j
– represents the control flow guards accumulated in thread j
along the path from     to 








i lR1 (assign): If I is an assignment, we execute I, which 
generates si+1. We add as child to υ a new node υ’ 
• we have fully expanded υ if
− I within an atomic block; or
Expansion Rules of the RT
( )
1 1 1 , , , , '





i i i i G l s C A υ





− I within an atomic block; or
− I contains no global variable; or
− the upper bound of context switches (Ci= C) is reached
• if υ is not fully expanded, for each thread j ≠ Aiwhere    





'   , , , 1 ,  





i i i j G l s C j υR2 (skip): If I is a skip-statement with target l, we increment 
the location of the current thread and continue with it. We 
explore no context switches:
R3 (unconditional goto): If I is an unconditional goto-
Expansion Rules of the RT
( )




















     :          
: 1
1
R3 (unconditional goto): If I is an unconditional goto-
statement with target l, we set the location of the current 
thread and continue with it. We explore no context 
switches:
( )






i i i i G l s C A υ 

 =






i      :   
:
1R4 (conditional goto): If I is a conditional goto-statement with 
test c and target l, we create two child nodes υ’ and υ’’. 
− for υ’ , we assume that c is true and proceed with the target 
instruction of the jump:
Expansion Rules of the RT
( )
1 1, , , , '





i i i i G c l s C A υ 

 =






i      :   
:
1
− for υ’’, we add ¬c to the guards and continue with the next 
instruction in the current thread
− prune one of the nodes if the condition is determined statically
( )
1 1 + + i i i i i i
( )
1 1, , , , ' '



















     :          
: 1
R5 (assume): If I is an assume-statement with argument c, we proceed similarly to R1.
− we continue with the unchanged state $s_i$ but add c to all guards, as described in R4
− if $c \wedge G_i^j$ evaluates to false, we prune the execution path
R6 (assert): If I is an assert-statement with argument c, we proceed similarly to R1.
− we continue with the unchanged state $s_i$ but add c to all guards, as described in R4
− we generate a verification condition to check the validity of c
R7 (start_thread): If I is a start_thread instruction, we add the indicated thread to the set of active threads:
$\upsilon' = \left(A_i, C_i, s_i, \langle l_{i+1}^j, G_{i+1}^j \rangle_{j=1}^{n+1}\right)_{i+1}$
− where $l_{i+1}^{n+1}$ is the initial location of the thread and $G_{i+1}^{n+1} = G_i^{A_i}$
− the thread starts with the guards of the currently active thread
R8 (join_thread): If I is a join_thread instruction with argument Id, we add a child node:
$\upsilon' = \left(A_i, C_i, s_i, \langle l_{i+1}^j, G_i^j \rangle\right)_{i+1}$
− where $l_{i+1}^{A_i} = l_i^{A_i} + 1$ only if the joining thread Id has exited

Observations about the lazy approach
• naïve but useful:
– bugs usually manifest with few context switches [Qadeer&Rehof’05]
– keep in memory the parent nodes of all unexplored paths only
– exploit which transitions are enabled in a given state
– bound the number of preemptions (C) allowed per thread
> number of executions: $O(n^C)$
– as each formula corresponds to one possible path only, its size is relatively small
• can suffer performance degradation:
− in particular for correct programs where we need to invoke the SMT solver once for each possible execution path

Schedule Recording
Idea: systematically encode all possible interleavings into one formula
• add a fresh variable (ts) for each context switch block (i) so that $0 < ts_i \leq$ number of threads
– record in which order the scheduler has executed the program (aka scheduler guards)
– SMT solver determines the order in which threads are simulated
• add scheduler guards only to effective statements (assignments and assertions)
– record effective context switches (ECS)
> context switches to an effective statement
– ECS block: sequence of program statements that are executed with no intervening ECS
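
The effect of the preemption bound C noted for the lazy approach can be illustrated with a small counting sketch. It is not taken from the slides: the function `count_schedules` is a hypothetical helper that counts the complete executions of n threads with s statements each, where a preemption is assumed to mean switching away from a thread that still has statements left (switches after a thread has finished are free).

```c
#define MAX_THREADS 8

/* Counts complete executions reachable from the current configuration.
   left[j] = statements thread j still has to run, cur = running thread
   (-1 before the first step), budget = preemptions still allowed. */
static unsigned long count_from(int *left, int n, int cur, int budget) {
    int remaining = 0;
    for (int j = 0; j < n; j++)
        remaining += left[j];
    if (remaining == 0)
        return 1;               /* one complete execution */

    unsigned long total = 0;
    for (int j = 0; j < n; j++) {
        if (left[j] == 0)
            continue;
        /* switching away from an unfinished thread costs one preemption */
        int preempt = (cur >= 0 && j != cur && left[cur] > 0);
        if (preempt && budget == 0)
            continue;
        left[j]--;
        total += count_from(left, n, j, budget - preempt);
        left[j]++;
    }
    return total;
}

/* n threads, s statements per thread, at most c preemptions. */
unsigned long count_schedules(int n, int s, int c) {
    int left[MAX_THREADS];
    if (n < 1 || n > MAX_THREADS)
        return 0;
    for (int j = 0; j < n; j++)
        left[j] = s;
    return count_from(left, n, -1, c);
}
```

For two threads with two statements each, the sketch counts 2 executions without preemptions, 4 with at most one, and all 6 interleavings with two or more, which shows how a small bound removes a large share of the schedules.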











[Figure: tree of guarded statements for the threads twoStage and reader around context switch CS1, with the schedule guards
ts1==1 ∧ ts2==1 → val1=1
ts1==1 ∧ ts2==2 → lock(m1)
ts1==2 ∧ ts2==1 → lock(m1)
ts1==2 ∧ ts2==2 → unlock(m1)]
If the guard of the parent node is false then the guard of the child node is false as well

Schedule Recording: Interleaving Is
Thread twoStage
1: lock(m1);
2: val1 = 1;
3: unlock(m1);
4: lock(m2);
5: val2 = val1 + 1;
6: unlock(m2);
Thread reader
7:  lock(m1);
8:  if (val1 == 0) {
9:    unlock(m1);
10:   return NULL; }
11: t1 = val1;
12: unlock(m1);
13: lock(m2);
14: t2 = val2;
15: unlock(m2);
16: assert(t2==(t1+1));
• ECS block: sequence of program statements that are executed with no intervening ECS
• each program statement is then prefixed by a schedule guard $ts_i = j$, where:
– i is the ECS block number
– j is the thread identifier
• a guarded statement can only be executed if statement 1 is scheduled in the ECS block 1
[Figure: successive builds annotate the listing with context-switch markers (CS) and the schedule guards ts1 == 1, ts2 == 1, ts4 == 2, ts5 == 2, ts6 == 2, ts7 == 2, ts10 == 1, ts14 == 2]

Schedule Recording: Interleaving If
statements: 1-2-3-7-8-11-12-13-14-15-16-4-5-6
twoStage-ECS: ts1,1-ts2,3-ts3,4-ts4,12-ts5,13-ts6,14
reader-ECS: ts7,4 -ts8,5 -ts11,6-ts12,7-ts13,8-ts14,9-ts15,10-ts16,11
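
The difference between a passing and a failing interleaving of twoStage and reader can be replayed with a small simulator. This is a sketch under assumptions, not part of the slides: `run_schedule` is a hypothetical helper, the shared variables val1 and val2 are assumed to start at 0, and lock and unlock statements are treated as no-ops, so the caller has to supply an order that respects the mutexes.

```c
#include <stddef.h>

/* Executes the numbered statements of twoStage (1-6) and reader (7-16) in the
   given order. Returns 1 if the assertion in statement 16 holds, 0 if it
   fails, and -1 if statement 16 is not reached. */
int run_schedule(const int *order, size_t len) {
    int val1 = 0, val2 = 0;     /* assumed initial values of the shared data */
    int t1 = 0, t2 = 0;
    for (size_t k = 0; k < len; k++) {
        switch (order[k]) {
        case 2:  val1 = 1;             break;
        case 5:  val2 = val1 + 1;      break;
        case 8:  if (val1 == 0) return -1;  /* reader leaves via 9-10 */
                 break;
        case 11: t1 = val1;            break;
        case 14: t2 = val2;            break;
        case 16: return t2 == (t1 + 1);
        default: break;                /* lock/unlock: no effect here */
        }
    }
    return -1;
}
```

Replaying the order 1-2-3-7-8-11-12-13-14-15-16-4-5-6 from the slide makes the assertion fail, because reader copies val2 before twoStage has written it. An order in which twoStage runs statements 4-5-6 before reader reaches statement 13, such as 1-2-3-7-8-11-12-4-5-6-13-14-15-16, satisfies it.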
















Observations about the schedule recording approach
• we systematically explore the thread interleavings as before, but now:
– add schedule guards to record in which order the scheduler has executed the program
– encode all execution paths into one formula
> bound the number of preemptions
> exploit which transitions are enabled in a given state
• the number of threads and context switches can grow very large quickly, and easily “blow-up” the solver:
− there is a clear trade-off between usage of time and memory resources

Under-approximation and Widening
Idea: check models with an increased set of allowed interleavings [Grumberg et al.’05]
• start from a single interleaving (under-approximation) and widen the model by adding more interleavings incrementally
• main steps of the algorithm:
1. encode control literals ($cl_{i,j}$) into the verification condition ψ
> $cl_{i,j}$ where i is the ECS block number and j is the thread identifier
2. check the satisfiability of ψ (stop if ψ is satisfiable)
3. extract proof objects generated by the SMT solver
4. check whether the proof depends on the control literals (stop if the proof does not depend on the control literals)
5. remove literals that participated in the proof and go to step 2

UW Approach: Running Example
• use the same guards as in the schedule recording approach as control literals
– but here the schedule is updated based on the information extracted from the proof
Thread twoStage
1:  lock(m1);            cl1,twoStage → ts1 == 1
2:  val1 = 1;            cl2,twoStage → ts2 == 1
3:  unlock(m1);          cl3,twoStage → ts3 == 1
4:  lock(m2);            cl8,twoStage → ts8 == 1
5:  val2 = val1 + 1;     cl9,twoStage → ts9 == 1
6:  unlock(m2);          cl10,twoStage → ts10 == 1
• reduce the number of control points from m x n to e x n
– m is the number of program statements; n is the number of threads, and e is the number of ECS blocks

Evaluation
Comparison of the Approaches
• Goal: compare efficiency of the proposed approaches
– lazy exploration
– schedule recording
– underapproximation and widening
• Set-up:
– ESBMC v1.15.1 together with the SMT solver Z3 v2.11
– support the logics QF_AUFBV and QF_AUFLIRA
– standard desktop PC, time-out 3600 seconds

About the benchmarks
    Module          #L    #T   #P    B    #C  Description
 1  fsbench_ok      81    26   47    26   2   Frangipani file system
 2  fsbench_bad     80    27   48    27   2   Frangipani file system with array out of bounds
 3  indexer_ok      77    13   21    129  4   Insert messages into a hash table concurrently
 4  aget-0.4_bad    1233  3    279   200  2   Multi-threaded download accelerator
 5  bzip2smp_ok     6366  3    8568  1    9   Data compressor
 6  reorder_bad     84    10   7     10   11  Contains a data race
 7  twostage_bad    128   100  13    100  4   Contains an atomicity violation
 8  wronglock_bad   110   8    8     8    8   Contains wrong lock acquisition ordering
 9  exStbHDMI_ok    1060  2    24    16   20  Configures the HDMI device
10  exStbLED_ok     425   2    45    10   10  Front panel LED display
11  exStbThumb_bad  1109  2    249   2    1   Demonstrate how thumbnail images can be manipulated
12  micro_10_ok     1171  10   10    1    17  synthetic micro-benchmark
• B: the number of BMC unrolling steps
• exStbHDMI, exStbLED, exStbThumb: set-top box applications from NXP semiconductors
• micro_10_ok: used to check the scalability of multi-threaded software verification tools [Ghafari 2010]

Comparison of the approaches
                 Lazy                    Schedule       UW
Module           Time  Result  #FI/#I    Time  Result   Time  Result  Iter
fsbench_ok       282   +       0/676     304   +        301   +       1
fsbench_bad      <1    +       729/729   360   +        786   +       2
indexer_ok       595   +       0/17160   220   +        218   +       1
aget-0.4_bad     137   +       1/1       127   +        125   +       1
bzip2smp_ok      1800  +       0/1294    MO    -        MO    -       1
reorder_bad      <1    +       1/154574  MO    -        MO    -       1
twostage_bad     88    +       1/139     93    +        195   +       5
wronglock_bad    90    +       6/104015  MO    -        MO    -       1
exStbHDMI_ok     229   +       0/1       226   +        213   +       1
exStbLED_ok      73    +       0/11      73    +        787   +       1
exStbThumb_bad   95    +       3/3       14    +        12    +       1
micro_10_ok      254   +       0/29260   MO    -        MO    -       1
• lazy encoding often more efficient than schedule recording and UW, but not always
• lazy encoding is extremely fast for satisfiable instances

Comparison to CHESS [Musuvathi and Qadeer]
• CHESS (v0.1.30626.0) is a concurrency testing tool for C# programs; it also works for C/C++ (Windows API).
– implements iterative context-bounding 
– requires unit tests that it repeatedly executes in a loop, 
exploring a different interleaving on each iteration
> it is similar to our lazy approach
– performs state hashing based on a happens-before graph
> avoids exploring the same state repeatedly
• Goal: compare efficiency of the approaches
– on identical verification problems taken from standard 
benchmark suites of multi-threaded software

                                   CHESS          Lazy
Module                 #T  B  C    Time  Tests    Time  #FI/#I
reorder_4_bad (3,1)    4   4  5    98    130000   <1    1/82
reorder_5_bad (4,1)    5   5  6    TO    429000   <1    1/277
reorder_6_bad (5,1)    6   6  7    TO    396000   <1    1/853
reorder_6_bad (5,1)    6   6  8    TO    371000   <1    1/2810
reorder_6_bad (5,1)    6   6  9    TO    367000   <1    1/8124
twostage_4_bad (3,1)   4   4  4    215   27000    2     1/42
twostage_5_bad (4,1)   5   5  4    TO    384000   2     1/44
twostage_6_bad (5,1)   6   6  4    TO    366000   2     1/45
wronglock_4_bad (1,3)  4   4  8    21    3000     5     2/489
wronglock_5_bad (1,4)  5   5  8    724   93000    10    3/2869
wronglock_6_bad (1,5)  6   6  8    TO    356000   18    4/12106
micro_2_ok (100)       2   1  2    316   35855    <1    0/4
micro_2_ok (100)       2   1  17   TO    40000    1095  0/131072
• CHESS is effective for programs where there are a small number of threads, but it does not scale that well and consistently runs out of time when we increase the number of threads

Comparison to SATABS [D. Kroening]
• SATABS (v2.5) implements predicate abstraction using SAT
– avoids exponential number of theorem prover calls (for each 
potential assignment) to construct the Boolean program
– uses BDD-based model checking (Cadence SMV) to verify the 
Boolean program
– supports most ANSI-C constructs (incl. arithmetic overflow) 
and the verification of multi-threaded software with locks and 
shared variables
• Goal: compare efficiency of both approaches
– on identical verification problems taken from standard 
benchmark suites of multi-threaded software

                 SATABS         Lazy
Module           Time  Result   Time  Result  #FI/#I
fsbench_ok       †     -        282   +       0/676
fsbench_bad      †     -        <1    +       729/729
indexer_ok       TO    -        595   +       0/17160
aget-0.4_bad     3346  +        137   +       1/1
bzip2smp_ok      TO    -        1800  +       0/1294
reorder_bad      1     -        <1    +       1/154574
twostage_bad     2     -        88    +       1/139
wronglock_bad    2     -        90    +       6/104015
exStbHDMI_ok     TO    -        229   +       0/1
exStbLED_ok      RF    -        73    +       0/11
exStbThumb_bad   317   +        95    +       3/3
micro_10_ok      TO    -        254   +       0/29260
Callouts on the table:
• failed to validate the counterexample
• failed to refine the predicate
• false positive answers
• SATABS uses predicate abstraction and refinement and tries to solve a harder problem than ESBMC, but this problem may still be too hard as SATABS is unable to prove the required properties

Agenda
• SMT-based BMC for Embedded ANSI-C Software
• Verifying Multi-threaded Software
• Implementation of ESBMC
• Integrating ESBMC into Software Engineering Practice
• Conclusions and Future Work

Continuous Verification
• based on Fowler’s continuous integration (CI): build and test full system after each change
• complement testing by verification (SMT-based bounded model checking)
– assertions
– language-specific properties
• exploit existing information
– development history (SCM)
– test cases
• limit change propagation
– equivalence checks

Functional Equivalence Checking
• determine whether modified functions need to be re-verified
– no need to re-verify properties if functions are equivalent
– less expensive than re-verifying the function
– undecidable due to unbounded memory usage
• goal: compare input-output relation
– remove variables and returns
– convert the function bodies into SSA
– show that the input and output variables coincide
[Example: two versions of the function unsigned Inv(int signal); the first declares a local variable unsigned inverter; the function bodies are not recoverable]
SSA of function 2: $\alpha_2 = \left[\, signal'_2 = (signal'_1 < 0 \;?\; -signal'_1 : signal'_1) \,\right]$
Equivalence check, with $\alpha_1 \wedge \alpha_2$ the SSA of function 1 and 2, the antecedent equating the inputs and the consequent equating the outputs:
$(signal_1 = signal'_1) \wedge \alpha_1 \wedge \alpha_2 \rightarrow (inverter_3 = signal'_2)$
With global variables g, their initial values are equated as well:
$(signal_1 = signal'_1) \wedge (g_1 = g'_1) \wedge \alpha_1 \wedge \alpha_2 \rightarrow (inverter_3 = signal'_2)$
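
The input-output comparison can be illustrated with two small C functions. This is only a sketch under assumptions: the bodies of Inv are not recoverable from the slide, so `inv_v1` and `inv_v2` are hypothetical versions, and `equivalent_on_range` replaces the SMT-based check by exhaustive comparison over a bounded input range.

```c
/* Hypothetical original version: local variable and a single return. */
unsigned inv_v1(int signal) {
    unsigned inverter;
    if (signal < 0)
        inverter = (unsigned)(signal * -1);
    else
        inverter = (unsigned)signal;
    return inverter;
}

/* Hypothetical modified version: the local variable has been removed. */
unsigned inv_v2(int signal) {
    return (unsigned)(signal < 0 ? -signal : signal);
}

/* Returns 1 if both versions produce the same output for every input in
   [lo, hi]. Assumes INT_MIN < lo <= hi < INT_MAX, so that neither the
   negation nor the loop counter overflows. */
int equivalent_on_range(int lo, int hi) {
    for (int s = lo; s <= hi; s++)
        if (inv_v1(s) != inv_v2(s))
            return 0;
    return 1;
}
```

A call such as `equivalent_on_range(-1000, 1000)` only covers the sampled range, whereas the SSA-based formula above is meant to cover all inputs within the bound handled by the solver.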

Generalizing Test Cases
• use existing test cases to reduce the state space
– run the unit tests, keep track of inputs
– guide model checker to visit states not yet visited
• test stubs break the global model into local models
– use test case as initial state
– generate reachable states on-demand 







assume(a>10 && a<200);

Generalizing Test Cases: Example
Simple circular FIFO buffer:
static char buffer[BUFFER_MAX];
void initLog(int max) {
  buffer_size = max;
  …
}
void insertLogElem(int b) {
  …
}
Test case: check whether messages are …
static void testCircularBuffer(void) {
  int senData[] = {1, -128, 98, 88, 59, …
  …
}
• The array buffer is of type char[]
• Assign an integer variable
• BUT: implementation is flawed!
• We can detect the error by assigning a non-deterministic value
• This can lead to false results
Rather than modifying the program we modify the test stubs
static void testCircularBuffer(void) {
  int senData[] = {nondet_int(), …, nondet_int()};
  …
}
• Block larger parts of the search space (combine respective …
⇒ detects two bugs related to arithmetic over- and underflow
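
The reason a fixed test vector can miss the flaw while generalized inputs expose it can be sketched as follows. This is an illustration under assumptions, not the code from the slides: the buffer bodies are hypothetical, `signed char` is used to make the signedness explicit, and a loop over a bounded range stands in for the non-deterministic values that a model checker would explore.

```c
#define BUFFER_MAX 10

static signed char buffer[BUFFER_MAX];  /* narrow element type, as in the example */
static int first, next, buffer_size;

/* Assumes 0 < max <= BUFFER_MAX. */
void initLog(int max) {
    buffer_size = max;
    first = next = 0;
}

/* Hypothetical body: the int argument is narrowed when it is stored. */
void insertLogElem(int b) {
    buffer[next] = (signed char)b;
    next = (next + 1) % buffer_size;
}

int removeLogElem(void) {
    int v = buffer[first];
    first = (first + 1) % buffer_size;
    return v;
}

/* Returns 1 if value b is read back unchanged from the buffer. */
int roundtrip_ok(int b) {
    initLog(BUFFER_MAX);
    insertLogElem(b);
    return removeLogElem() == b;
}

/* Emulates generalized test inputs by trying every value in [lo, hi]
   (assumes hi < INT_MAX) and counting those that are not preserved. */
int count_roundtrip_failures(int lo, int hi) {
    int failures = 0;
    for (int b = lo; b <= hi; b++)
        if (!roundtrip_ok(b))
            failures++;
    return failures;
}
```

The values 1, -128, 98, 88 and 59 from the original test data all fit into an 8-bit signed char, so the test passes; values below -128 or above 127 are not preserved, which corresponds to the under- and overflow cases that the generalized stub exposes.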




• we translate the LTL formulae into Buechi Automata (BA) and further into ANSI-C
– monitor the design’s progress and watch out for violations
• we extract two properties of the pulse oximeter device:
a) verify the data flow to compute the HR value that is provided by the sensor
b) verify whether the user is able to adjust the sample time of the device
• the properties (a) and (b) can be expressed as:
[Formulas not recoverable; callouts: pre-state, post-state]

Translation from BA to ANSI-C
[Figure residue: init; if (r || !p) state=s3; …]
• model the hardware interrupt and …
• indicate whether a hardware interrupt has occurred

Concurrent Execution of Main, …
[Figure residue: block 9]

Evaluation
Set-top Box Case Study
• Goal: evaluate the feasibility of the elements of the 
continuous verification approach
– use of the unit tests and function equivalence checking
• embedded software used in a commercial product from NXP
– high definition internet protocol and hybrid digital TV applications
– Linux operating system (LinuxDVB, DirectFB and ALSA)
• Set-up:
– ESBMC v1.15.1 together with the SMT solver Z3 v2.11
– standard desktop PC, time-out 3600 seconds

Verification of the Test Cases
Test Program            L    B     P   VC    Time
commandLoop.TC1         545  -     18  0     4
commandLoop.TC2         545  500*  18  3     29
commandLoop.TC3         545  500*  18  3     29
commandLoop.TC4         545  17    18  5     14
commandLoop.TC5         545  -     18  1     4
commandLoop.TC6         545  -     18  0     4
commandLoop.TC7         545  1     18  15    19
checkCommandParams.TC1  238  17    17  56    9
checkCommandParams.TC2  238  17    17  36    5
checkCommandParams.TC3  238  17    17  37    5
checkCommandParams.TC4  238  17    17  36    30
checkCommandParams.TC5  238  17    17  80    50
checkCommandParams.TC6  238  17    17  664   44
checkCommandParams.TC7  238  20*   17  1117  215
• ESBMC fails to verify these functions due to memory limitations and time-outs
• If we use the test cases to guide the symbolic execution, ESBMC can verify these functions with a larger bound
• ESBMC is not able to prove or falsify some of the properties due to unwinding violations

Equivalence Checking
Product Releases
Test Program            L    B   P    Time  PR10 PR11 PR12 PR13
threadRename            6    17  0    3     X
fileExists              19   17  0    3     X
readLine                27   17  11   3     X
getCommand              269  17  61   3     X N/3 N/3
powerDown               9    17  0    2     X
digitStart              12   17  0    2     X Y/2
difgitAdd               34   17  2    2     X Y/2
checkEndOfPvrStream     32   17  13   2     X Y/2
checkEndOfMediaStream   28   17  1    2     X
commandLoop             545  17  53   Mf    X Mf Mf
checkCommandParams      238  17  269  Tb    X Tb Tb Tb
singal_handler          13   17  0    2     X
setupFBResolution       29   17  0    2     X Y/3 Y/3 Y/3
setupFramebuffers       115  17  8    3     X N/3 N/2 N/2
main_Thread             68   17  4    4     X Y/3 Y/2
• each PR changes only a few functions; six functions remain 
unchanged over all PRs, but every individual PR contains changes
• we have 19 changes over all PRs: 8 changes are equivalent, 
5 changes are not equivalent, and we fail to check 5 changes
Medical Device Case Study
• Goal: check ESBMC’s performance in verifying temporal 
properties
• embedded software of a pulse oximeter device
– device drivers (display, keyboard, serial, sensor, and timer)
– system log to debug code
– applications that call the services provided by the platform
• Set-up:
– ESBMC v1.15.1 together with the SMT solver Z3 v2.11
– standard desktop PC, time-out 3600 seconds
Medical Device Case Study
• P1: whenever the bit 0 of the micro-controller port is set to 1, 
the start button will eventually be detected
– include two Boolean variables (BIT0 and startButton)
AG (BIT0 → F startButton)
• P2: whenever the start button is pressed, the application will 
eventually be initialized
– include two Boolean variables (startButton and startApp)
AG (startButton → F startApp)
• P3: it is possible to get to a state where the next position of 
the buffer is less than its total size
– no changes to the program
AG (next < buffer_size)
Faults Injected
• keyboard: we comment out the break statement (of the case
START: command=startButton)
– if START was pressed, the code would fall through to the next 
line, and have the wrong value assigned to command
• menu_app: we do not initialize the application after the start 
button is pressed
• log: we change the program statements so that in a situation 
where the next index is at the end of the array buffer, an 
overflowing index by one byte can occur




Verification of the LTL Properties
[Table: Test Program, L, T, B, C, Time, #FI/#I]
• reactive system: ESBMC can check the LTL properties up to a certain 
unwinding bound
• for small values of the unwinding bound, ESBMC verifies the 
properties without a specified upper bound on the context switches
• ESBMC is able to detect the violation in a few seconds and about 
15% of the generated interleavings fail
Agenda
• SMT-based BMC for Embedded ANSI-C Software
• Verifying Multi-threaded Software
• Implementation of ESBMC
• Integrating ESBMC into Software Engineering Practice
• Conclusions and Future Work
Results
• described and evaluated the first SMT-based BMC for full ANSI-C
– no SMT tool existed that can reliably handle full ANSI-C
– provided encodings for typical ANSI-C constructs not directly 
supported by SMT-solvers 
⇒ used three different SMT solvers to check the effectiveness of our 
encoding
– found undiscovered bugs related to arithmetic overflow, buffer 
overflow and invalid pointers in a standard benchmark suite
⇒ confirmed by the benchmark’s creators
• lazy, schedule recording, and UW algorithms
– lazy: checking constraints lazily is fast for satisfiable instances 
and, to a lesser extent, even for safe programs
⇒ it has not been described or evaluated in the literature
– schedule recording: the number of threads and context switches 
can grow quickly (and easily “blow-up” the model checker)
⇒ combines symbolic with explicit state space exploration
– UW: memory overhead and slowdowns to extract the unsat core
⇒ it has not been used for BMC of multi-threaded software
⇒ uses a different encoding based on the notion of ECS blocks
Future Work
• fault localization in multi-threaded C programs
• interpolants to prove no interference of context switches
• verify real-time software using SMT techniques