SMT-based bounded model checking of multi-threaded software in embedded systems by Cordeiro, Lucas
UNIVERSITY OF SOUTHAMPTON
SMT-Based Bounded Model Checking
of Multi-threaded Software in
Embedded Systems
by
Lucas Carvalho Cordeiro
A thesis submitted in partial fulfillment for the
degree of Doctor of Philosophy
in the
Faculty of Engineering and Applied Science
Department of Electronics and Computer Science
April 2011
UNIVERSITY OF SOUTHAMPTON
ABSTRACT
FACULTY OF ENGINEERING AND APPLIED SCIENCE
DEPARTMENT OF ELECTRONICS AND COMPUTER SCIENCE
Doctor of Philosophy
by Lucas Carvalho Cordeiro
Our reliance on the correct functioning of embedded systems is growing rapidly. Such
systems are used in a wide range of applications such as airbag control systems, mobile
phones, and high-end television sets. These systems are becoming more and more
complex and require multi-core processors with scalable shared memory to meet the
increasing computational power demands. The reliability of the embedded (distributed)
software is thus a key issue in the system development. In this thesis we describe and
evaluate an approach to reason accurately and effectively about large embedded software
using bounded model checking (BMC) based on Satisfiability Modulo Theories (SMT)
techniques. We present three major novel contributions. First, we extend the encodings
from previous SMT-based bounded model checkers to provide more accurate support for
variables of finite bit width, bit-vector operations, arrays, structures, unions and pointers,
thus making our approach suitable to reason about embedded software. We then
provide new encodings into existing SMT theories and we show that our translations
from ANSI-C programs to SMT formulas are as precise as bit-accurate procedures based
on Boolean Satisfiability. Second, we develop three related approaches for model checking
multi-threaded software in embedded systems. In the lazy approach, we generate
all possible interleavings and call the SMT solver on each of them individually, until
we either find a bug or have systematically explored all interleavings. In the schedule
recording approach, we encode all possible interleavings into one single formula and then
exploit the high speed of the SMT solvers. In the under-approximation and widening
approach, we reduce the state space by abstracting the number of interleavings from the
proofs of unsatisfiability generated by the SMT solvers. Finally, we describe and evaluate
an approach to integrate our SMT-based BMC into the software engineering process
by making the verification process incremental. In particular, our approach looks at
the modifications made to the software system since its last verification, and submits
them to a partly static and partly dynamic verification process, which is guided by a set of
test cases for coverage. Experiments show that our SMT-based BMC can analyze larger
problems and reduce the verification time compared to state-of-the-art techniques that
use BMC, iterative context-bounding or counterexample-guided abstraction refinement.
Contents
Abbreviations
Declaration Of Authorship
List of Publications
Acknowledgements
1 Introduction
1.1 Problem Description
1.2 Objectives
1.3 Outline of the Solution
1.4 Contributions
1.5 Organization of the Thesis
2 SAT-based and SMT-based Verification Techniques
2.1 Logical Foundations
2.1.1 Propositional Logic
2.1.2 Decision Procedures for Satisfiability
2.1.3 Satisfiability Modulo Theories
2.1.4 Linear-time Temporal logic
2.2 Bounded Model Checking of Software
2.2.1 Formulation
2.2.2 Verification Conditions
2.2.3 Completeness
2.2.3.1 Craig Interpolation
2.2.3.2 K-Induction
2.2.4 BMC Architecture
2.2.5 Comparison to Other Verification Approaches
2.3 Verification of Multi-threaded Systems
2.3.1 Concurrency and Interleaving
2.3.2 Partial Order Reduction Technique
2.4 Summary
3 SMT-based Bounded Model Checking for Embedded ANSI-C Software
3.1 Introduction
3.2 SMT-based BMC Formulation
3.3 Illustrative Example
3.4 Encodings and Properties
3.4.1 Scalar Data Types
3.4.2 Fixed-Point Arithmetic
3.4.3 Arithmetic Overflow and Underflow
3.4.4 Arrays
3.4.5 Structures and Unions
3.4.6 Pointers
3.4.7 Dynamic Memory Allocation
3.5 Experimental Evaluation
3.5.1 Experimental Setup
3.5.2 Comparison of SMT solvers
3.5.3 Error-Detection Capability
3.5.4 Comparison to SMT-CBMC
3.5.5 Comparison to CBMC
3.6 Industrial Case Study
3.7 Related Work
3.8 Conclusions
4 Verifying Multi-threaded Software using SMT-based Context-Bounded Model Checking
4.1 Introduction
4.2 Preliminaries
4.2.1 Multi-threaded Goto Programs
4.2.2 Formal Model of Multi-threaded Software
4.2.3 Context-Bounded Encoding
4.3 Context-Bounded Model Checking of Multi-threaded Software
4.3.1 Exploring the Reachability Tree
4.3.2 Lazy Approach
4.3.3 Schedule Recording Approach
4.3.4 UW Approach
4.3.5 Pruning the RT with Partial Order Reduction
4.4 Verifying Race Conditions and Atomicity Violations
4.4.1 Detecting Data Races
4.4.2 Checking Atomicity
4.5 Modelling Synchronization Primitives in Pthread
4.5.1 Modelling Mutex Locking Operations
4.5.2 Modelling Conditional Waiting
4.6 Experimental Evaluation
4.6.1 Comparison to MPOR and PPOR
4.6.2 Comparison to CHESS
4.6.3 Comparison to SATABS
4.7 Related Work
4.8 Conclusions
5 Implementation of ESBMC
5.1 Introduction
5.2 Tool Architecture
5.3 Code Simplification and Reduction
5.4 Exploiting Datatype Representations
5.5 Evaluation of Performance Improvements
5.6 Conclusions
6 Integrating ESBMC into Software Engineering Practice
6.1 Introduction
6.2 Continuous Verification
6.3 Generalizing Test Cases
6.4 Specifying Temporal Properties with Büchi Automata
6.5 Experimental Evaluation
6.5.1 Set-top Box Case Study
6.5.2 Medical Device Case Study
6.6 Related Work
6.7 Conclusions
7 Conclusions
7.1 Main Contributions
7.2 Future Work Directions
7.3 Concluding Remarks
A ESBMC plug-in
A.1 Front-end Options
A.2 BMC Options
A.3 SMT Solver Configuration
A.4 Property Check
A.5 Concurrency Check
A.6 Counterexample, Property Violation, and Claim Views
B Static Analysis Benchmarks
B.1 EUREKA Suite
B.2 POWERSTONE Suite
B.3 NECLA Suite
B.4 SNU-RT Suite
B.5 VERISEC Suite
B.6 WCET Suite
C Functions of the Pthread Library
D Counterexample
References
List of Figures
1.1 A Synthetic Micro-benchmark.
1.2 Proposed SMT-based BMC procedure for software.
2.1 Syntax of the Background Theories
2.2 LTL semantics for the operators X, G, F, U, and R (when ψ first becomes true and when ψ never becomes true) over π [94].
2.3 Example of a Kripke structure (with deadlock) for states s0, s1, and s2 (where s2 has a transition back to itself).
2.4 (a) A simple C program with a for loop. (b) The corresponding unwound C program of (a) converted into SSA form.
2.5 Computing image by interpolation [125].
2.6 The CBMC Architecture.
2.7 (a) A C program with violated property. (b) The C program of (a) in SSA form. (c) Counterexample of C program in (a)
2.8 The CFG representation of threads TA and TB and we assume that initially the global variables a and b are set to zero, i.e., a = 0 and b = 0.
2.9 The CFG that represents all possible interleaving sequences of threads TA and TB.
2.10 The transition system that represents the parallel execution of threads TA and TB.
2.11 Model context switches inside individual visible statements
3.1 ANSI-C program with two violated properties.
3.2 The program of Figure 3.1 in SSA form.
3.3 ANSI-C program with typecast from char to int.
3.4 Array out of bounds example.
3.5 ANSI-C program with union.
3.6 C program with pointer to an array.
3.7 C program with pointer to a struct.
3.8 A fragment of an ANSI-C program with dynamic memory allocation.
4.1 Multi-threaded Goto Program Language
4.2 (a) A multi-threaded C program with an assertion violation. (b) The C program of (a) converted into multi-threaded goto form.
4.3 CFG of two threads of the goto program shown in Figure 4.2 (b).
4.4 Concurrent execution of two threads.
4.5 Fragment of the reachability tree of the multi-threaded goto-program of Figure 4.2(b). Nodes with dashed line represent program locations that violate the assertion statement in line 18 of Figure 4.2(b).
4.6 Algorithm of the lazy approach.
4.7 Schedule recording applied to the left-hand side of the RT in Figure 4.5.
4.8 Algorithm of the UW approach.
4.9 (a) A simple multi-threaded C program. (b) The C program of (a) converted into goto form.
4.10 The reachability tree for threads t1, t2, and t3 of the multi-threaded goto-program of Figure 4.9(b). Edges with dashed line represent transitions that can be eliminated by RW-POR.
4.11 The reachability tree for threads t1, t2, and t3 after applying the RW-POR technique.
4.12 Modelling data race conditions for read operations (l = g).
4.13 Modelling data race conditions for write operations (g = l).
4.14 Modelling atomicity violation at visible statements.
4.15 Computation paths blocking on a mutex.
4.16 Modelling mutex lock operation.
4.17 An example of local deadlock with mutex on a database application.
4.18 Modelling conditional waiting operation.
4.19 An example of deadlock with condition variable on a producer and consumer application.
5.1 Overview of the ESBMC architecture.
5.2 Code fragment of cyclic redundancy check.
5.3 Goto-program for the code fragment in Figure 5.2.
5.4 Loop unwound for the goto-program in Figure 5.3.
5.5 Code fragment of blit.
5.6 Code fragment of SumArray.
5.7 Code fragment of Fast Fourier Transformation.
5.8 A C program that uses shift-and-add to multiply two numbers.
6.1 Continuous Verification
6.2 (a) Original function to invert the sign of signal. (b) Optimized version.
6.3 Implementation of a circular buffer.
6.4 A unit test for the functions shown in Figure 6.3.
6.5 The modified unit test for the test case shown in Figure 6.4.
6.6 Specifying Temporal Properties for Software.
6.7 The C-monitor thread to watch out for violations of the specified property.
6.8 Event thread to model the hardware interrupt.
6.9 Concurrent execution of main, monitor and event threads.
A.1 Front-end options.
A.2 BMC options.
A.3 SMT Solver Configuration.
A.4 Property check.
A.5 Concurrency check.
A.6 Counterexample view.
List of Tables
2.1 Truth table.
2.2 Examples of First-Order Theories.
3.1 Definitions of ANSI-C types and their corresponding SMT representations.
3.2 Results of the comparison between CVC3, Boolector and Z3. Time-outs are represented with T in the Time column; examples that exceed available memory are represented with M in the Time column. The subscript b indicates that the error occurred in the back-end.
3.3 Results of the error-detection capability of ESBMC.
3.4 Results of the comparison between ESBMC and SMT-CBMC [11].
3.5 Results of the comparison between CBMC and ESBMC. Internal errors in the respective tool are represented with † in the Time column. The subscripts f and b indicate whether the errors occurred in the front-end or back-end, respectively. The superscript ∗ on the unwinding bound indicates that it is not large enough to prove or falsify the properties.
3.6 Results of the comparison between CBMC and ESBMC on an industrial case study.
4.1 Read-write analysis of interleaving equivalence between visible instructions.
4.2 Results of the comparison between MPOR and PPOR, and lazy, schedule, and UW ESBMC
4.3 Results of the comparison between ESBMC (v1.15.1) and Microsoft CHESS (v0.1.30626.0).
4.4 Results of the comparison between SATABS (v2.5) and ESBMC (v1.15.1).
6.1 Concrete values to check the circular buffer.
6.2 Transition function δ for the Büchi automaton shown in Figure 6.6.
6.3 Results for running the test cases for the functions commandLoop and checkCommandParams.
6.4 Results for checking the equivalence between the functions of the exStbDemo application.
6.5 Results of the LTL properties verification of the pulse oximeter.
B.1 Results of applying ESBMC to the verification of the benchmarks from the EUREKA suite.
B.2 Results of applying ESBMC to the verification of the benchmarks from the PowerStone suite.
B.3 Results of applying ESBMC to the verification of the correct benchmarks from the NECLA suite.
B.4 Results of applying ESBMC to the verification of the bad benchmarks from the NECLA suite.
B.5 Results of applying ESBMC to the verification of the benchmarks from the SNU-RT suite.
B.6 Results of applying ESBMC to the verification of the correct benchmarks from the VERISEC suite - Part I.
B.7 Results of applying ESBMC to the verification of the correct benchmarks from the VERISEC suite - Part II.
B.8 Results of applying ESBMC to the verification of the correct benchmarks from the VERISEC suite - Part III.
B.9 Results of applying ESBMC to the verification of the bad benchmarks from the VERISEC suite - Part I.
B.10 Results of applying ESBMC to the verification of the bad benchmarks from the VERISEC suite - Part II.
B.11 Results of applying ESBMC to the verification of the bad benchmarks from the VERISEC suite - Part III.
B.12 Results of applying ESBMC to the verification of the benchmarks from the WCET suite.
Abbreviations
ADC Analog-to-Digital Converter
AST Abstract Syntax Tree
BDD Binary Decision Diagram
SAT Boolean Satisfiability [24]
BMC Bounded Model Checking [24]
BA Buechi Automata
CBMC C Bounded Model Checker [42]
CI Continuous Integration [70]
CTL Computational Tree Logic
CNF Conjunctive Normal Form
CFG Control Flow Graph [134]
CEGAR Counterexample-Guided Abstraction Refinement [46]
DSP Digital Signal Processor
ECTL Existential Computational Tree Logic
DPLL Davis-Putnam-Logemann-Loveland [24]
ECS Effective Context Switches
EFSM Extended Finite State Machine
ESBMC Efficient SMT-Based Bounded Model Checker [52]
ESW Embedded Software
FOL First-Order Logic
FSM Finite State Machine
HDL Hardware Description Language
IC Integrated Circuit
IF Intermediate Frequency
IPC Inter-Process Communication
LTL Linear-time Temporal Logic
MDE Model-Driven Engineering
MDG Multiway Decision Graph
MPOR Monotonic Partial Order Reduction [103]
OCL Object Constraint Language
OFDM Orthogonal Frequency-Division Multiplexing
PL Propositional Logic [94]
POR Partial Order Reduction [40]
PPOR Peephole Partial Order Reduction [103]
PSL Property Specification Language [7]
QF Quantifier-Free Formula
QF_AUFBV Quantifier-free formula over the theory of bit-vectors and bit-vector arrays with function and predicate symbols [164]
QF_AUFLIRA Quantifier-free formula over closed linear formulas with function and predicate symbols over a theory of arrays of integer index and real value [164]
RT Reachability Tree
RG Region Graph
RTCTL Real-Time Computational Tree Logic
SMT Satisfiability Modulo Theories [19]
TA Timed Automata
TECTL Timed Existential Computational Tree Logic
TPN Timed Petri Nets
TS Transport Stream
UW Under-approximation and Widening
VC Verification Condition
VCG Verification Condition Generator
WCET Worst-Case Execution Time
Declaration Of Authorship
I, Lucas Carvalho Cordeiro, declare that the thesis entitled SMT-Based Bounded
Model Checking of Multi-threaded Software in Embedded Systems and the
work presented in it are my own and have been generated by me as the result of my own
original research.
I confirm that:
1. This work was done wholly or mainly while in candidature for a research degree
at this University;
2. Where any part of this thesis has previously been submitted for a degree or any
other qualification at this University or any other institution, this has been clearly
stated;
3. Where I have consulted the published work of others, this is always clearly
attributed;
4. Where I have quoted from the work of others, the source is always given. With
the exception of such quotations, this thesis is entirely my own work;
5. I have acknowledged all main sources of help;
6. Where the thesis is based on work done by myself jointly with others, I have made
clear exactly what was done by others and what I have contributed myself;
7. Parts of this work have been published.
Signed:
Date:
List of Publications
1. Cordeiro, L., Fischer, B., Chen, H., and Marques-Silva, J. Semiformal Verification
of Embedded Software in Medical Devices Considering Stringent Hardware
Constraints. In 6th Intl. Conf. on Embedded Software and Systems
(ICESS), pp. 396-403, IEEE, 2009.
2. Cordeiro, L., Fischer, B., and Marques-Silva, J. SMT-Based Bounded Model
Checking for Embedded ANSI-C Software. In 24th Intl. Conf. on Automated
Software Engineering (ASE), pp. 137-148, IEEE/ACM, 2009.
3. Cordeiro, L., Fischer, B., and Marques-Silva, J. Continuous Verification of Large
Embedded Software using SMT-Based Bounded Model Checking. In 17th Intl.
Conf. and Workshop on the Engineering of Computer Based Systems
(ECBS), pp. 160-169, IEEE, 2010.
4. Cordeiro, L. SMT-Based Bounded Model Checking for Multi-threaded Software in
Embedded Systems. In 32nd Intl. Conf. on Software Engineering (ICSE),
Doctoral Symposium, pp. 373-376, ACM/IEEE, 2010.
5. Cordeiro, L. and Fischer, B. Bounded Model Checking of Multi-threaded Software
using SMT solvers. In 8th Intl. Workshop on Satisfiability Modulo
Theories (SMT), Presentation-only paper, FLoC, 2010.
6. Rocha, H., Cordeiro, L., Barreto, R., and Neto, J. Exploiting Safety Properties
in Bounded Model Checking for Test Cases Generation of C Programs. In 4th
Brazilian Workshop on Systematic and Automated Software Testing
(SAST), pp. 121-130, SBC, 2010.
7. Cordeiro, L. and Fischer, B. Verifying Multi-threaded Software using SMT-based
Context-Bounded Model Checking. To appear in 33rd Intl. Conf. on Software
Engineering (ICSE), ACM/IEEE, 2011.
8. Cordeiro, L., Fischer, B., and Marques-Silva, J. SMT-Based Bounded Model
Checking for Embedded ANSI-C Software. Under review in the IEEE Transactions
on Software Engineering (TSE), IEEE, 2011.
Acknowledgements
I would like to thank my friend and supervisor Dr. Bernd Fischer. I could not have
completed this thesis without his invaluable support, guidance, encouragement and
friendship over the years of my research. Beyond any duty, his effort helped me considerably to
turn this work into a PhD dissertation. I would also like to thank Prof. Joao Marques-Silva
and Prof. Michael J. Butler for their objective advice and assistance, and Prof.
Mark Zwolinski and Prof. Rupak Majumdar for agreeing to be the examiners of my
work. Many thanks to my friends and colleagues who helped me not only with fruitful
discussions about my work and “proof-reading” my scribbles, but also with my personal
life (beyond the work). I would also like to thank my sponsors (ORSAS and ECS) for
their financial support. Last but not least, I would like to thank my wife for her
unconditional love and support. Without her continuous encouragement, I certainly would
not be where I am today. I am also very grateful to my parents for their support in my
education and my desire to always learn more and more.
To my dearest wife and son ...
Chapter 1
Introduction
Embedded computer systems are used in a variety of sophisticated applications, which
range from safety-critical systems such as nuclear reactors and automotive controllers to
entertainment software such as games and graphics animation. Embedded systems are
ubiquitous in modern-day information systems and are becoming increasingly important
in our society. As a consequence, human life has also become more and more dependent
on the services provided by this type of system. In general, an embedded system
may be viewed as consisting of an electrical and mechanical subsystem, a controlling
embedded computer and a man-machine interface, which together perform a group of
dedicated functions within a larger system [104]. More specifically, it consists of a set
of hardware/software components that together implement a set of functionalities while
satisfying constraints such as timing, power dissipation, and monetary costs.
Embedded systems also replace many mechanical and hydraulic control systems within
safety-critical and high-dependability applications. Despite the criticality of the
applications, the main role of the software in embedded systems is the interaction with the
physical world, rather than the transformation of data. Embedded software thus has
a number of characteristics that differ substantially from conventional desktop
applications. For example, it is dedicated to performing a particular task, which requires it to meet
the timing constraints of the application, access the memory region, handle concurrency,
and control the hardware registers. The reliability of the software in embedded systems
thus plays an important role in avoiding catastrophic errors (especially in safety-critical
systems) and in reducing costs (because software errors are expensive [1]).
Due to the high pressure imposed by the market to launch new products, coupled with
evolving system specifications, semiconductor and system development companies are
forced to choose flexible implementations where new products can be built quickly [51].
The increasing computational power and the decreasing size and cost of processors are
enabling system designers to move more functionalities to software. Market analysis shows
that software-based implementations account for more than 80% of system development
in the embedded systems domain [174]. The increasing number of functionalities moved
to software-based implementations leads to difficulties in verifying design correctness. In
practice, however, this verification is important due to the dependability properties
(briefly, reliability and availability) required in several embedded system domains such
as automotive, industrial automation, and transportation. In order to verify the design
correctness of hardware blocks, model checking has been widely used as a verification
methodology [47]. However, the verification of embedded software has always been
difficult, mainly due to the stringent constraints imposed by the hardware (e.g., real-time,
memory allocation, interrupts, and concurrency) when verifying the design correctness.
Nowadays, peer reviewing and testing are the major software veriﬁcation techniques used
in practice [14]. A peer review is carried out by a team of experienced software engineers
that inspects the software and preferably has not been involved in the development of
the software under review. Empirical studies show that the peer review technique is
able to catch between 31% and 93% of the defects with a median around 60% [14].
Software testing, as opposed to peer review, is a dynamic technique that actually runs
the software instead of analyzing it statically without executing it. Correctness is thus
determined by forcing the software to traverse a set of execution paths and by comparing
the actual output observed during test execution with the expected output. These approaches,
however, take up to 70% of the total development time to find bugs and implement
the necessary corrections in the design [79].
We can thus see that there is clearly a tradeoﬀ between time-to-market (i.e., the time
between product conception and arrival on the market), costs and quality. On the one
hand, consumer electronics companies strive to shorten the time-to-market with the
purpose of being the ﬁrst one to launch the product and maximize the proﬁt. However,
some steps of the development process might be skipped to achieve this, thus compro-
mising quality. On the other hand, software bugs cause a loss of billions of US dollars
annually [1]. It is then of great importance to detect software bugs as early as possible
with minimum effort, cost and time, because repairing a software flaw during
maintenance is hundreds of times more expensive than fixing it in an early design phase.
Consequently, the development of techniques to ensure low-defect embedded systems
given their complexity (i.e., size and shorter development time) represents a signiﬁcant
research challenge, a challenge that has increased signiﬁcantly with the emergence of
multi-threaded applications. The motivation of this thesis is thus to deal with the in-
creasing diﬃculty in verifying embedded systems design correctness within the market
window and with the required level of conﬁdence in the signed-oﬀ design.
Bounded model checking (BMC) based on Boolean Satisﬁability (SAT) has already
been successfully applied to verify sequential software in embedded systems and discover
subtle errors in real designs [24]. However, the veriﬁcation of multi-threaded software is
a hard problem, because we need to explicitly account for interleavings1 of transitions
of different threads. A major strength of BMC to combat this problem is that BMC
analyzes only bounded program runs (thereby achieving decidability) and state space
reduction is exploited internally by state-of-the-art SAT or SMT (Satisfiability Modulo
Theories) solvers with the use of conflict clauses and non-chronological backtracking.
1An interleaving represents a possible execution of the program where all of the concurrent events are arranged in a linear order.
The basic idea of the BMC technique is thus to check (the negation of) a given property
at a given depth: given a transition system M, a property φ, and a bound k, BMC
unrolls the system k times and translates it into a veriﬁcation condition (VC) ψ such
that ψ is satisﬁable if and only if φ has a counterexample of depth less than or equal to
k. Standard Boolean Satisﬁability solvers can be used to check whether ψ is satisﬁable.
In BMC of software, the bound k limits the number of loop iterations and recursive
calls in the program. BMC of software thus generates VCs that reﬂect the exact path in
which a statement is executed, the context in which a given function is called, and the
bit-accurate representation of the expressions. Proving the validity of the VCs arising
from (sequential or) multi-threaded software remains a major performance bottleneck in
verifying embedded software, despite attempts to cope with increasing system complexity
by applying SMT solvers.
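To make the role of the bound k concrete, the following sketch shows a small loop next to a hand-unwound version for k = 3. It is an illustration only, not code taken from ESBMC: the function names (sum_loop, sum_unwound_3, unwinding_example) and the flag that plays the role of an unwinding assertion are assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Original program fragment: sums 0 + 1 + ... + (n - 1). */
unsigned sum_loop(unsigned n) {
    unsigned sum = 0;
    for (unsigned i = 0; i < n; i++) {
        sum += i;
    }
    return sum;
}

/* The same loop unwound by hand for the bound k = 3: the loop body is
 * copied three times and each copy is guarded by the loop condition.
 * *complete is set to false when the loop would need a fourth iteration,
 * which mimics an unwinding assertion (hypothetical helper, not ESBMC code). */
unsigned sum_unwound_3(unsigned n, bool *complete) {
    unsigned sum = 0;
    unsigned i = 0;
    if (i < n) {                /* iteration 1 */
        sum += i; i++;
        if (i < n) {            /* iteration 2 */
            sum += i; i++;
            if (i < n) {        /* iteration 3 */
                sum += i; i++;
            }
        }
    }
    *complete = !(i < n);       /* false: k = 3 was too small for this n */
    return sum;
}

/* Usage: the unwound code agrees with the loop whenever n <= 3. */
void unwinding_example(void) {
    bool complete = false;
    assert(sum_unwound_3(3, &complete) == sum_loop(3));
    assert(complete);
    (void)sum_unwound_3(4, &complete);
    assert(!complete);          /* the bound must be increased for n = 4 */
}
```

The unwound version contains no loop, so every statement lies on a fixed, finite set of paths; this is what allows a verification condition to describe the exact path on which a statement is executed.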
In this thesis, we develop and evaluate approaches that exploit the use of SMT solvers for
model checking multi-threaded ANSI-C software and our modelling of the synchroniza-
tion primitives of the Pthread library [135]. We also describe translations from ANSI-C
programs to SMT formulas with the same precision as bit-accurate SAT-based proce-
dures. In this sense, we extend the encodings from previous SMT-based bounded model
checkers [11, 71] to provide more accurate support for variables of ﬁnite bit width, bit-
vector operations, arrays, structures, unions and pointers. In contrast to previous fully
symbolic approaches to handle multi-threaded systems (e.g., [73, 102, 103, 152, 84]), we
combine symbolic model checking with explicit state space exploration. We analyze a
reachability tree, which is a description of all reachable states of a program, built by
unfolding the actions of each thread and we then propose novel exploration methods
to traverse this reachability tree. In particular, we explicitly explore the possible inter-
leavings of a program (up to the given context bound) while we treat each interleaving
itself symbolically. We also exploit SMT techniques to prune the property and data
dependent search space and to remove interleavings that are not relevant by analyzing
the proof of unsatisﬁability.
In summary, we propose a comprehensive SMT-based context-bounded model checking
procedure which we implemented in the ESBMC (Efficient SMT-based Context-Bounded
Model Checker)2 tool for verifying multi-threaded software in embedded systems written
in ANSI-C. In our work, we consider embedded software because it has characteristics
that make it attractive for BMC, e.g., dynamic memory allocations and recursion are
highly discouraged, and that make the limitations of bounded model checking less stringent.
We also chose ANSI-C because it is the most common implementation language
for embedded software (and in particular for developing optimized applications), but
all techniques that we describe in this thesis are also applicable to languages that are
similar to ANSI-C (e.g., MISRA-C). Our experimental results show that our approach
scales significantly better than both SAT-based and SMT-based versions of the CBMC
model checker [42, 105] and SMT-CBMC [11], a bounded model checker for sequential
C programs that is based on the SMT solvers CVC3 [20] and Yices [65]. We also
show that our approaches to verify multi-threaded software can analyze larger problems
and substantially reduce the verification time compared to state-of-the-art techniques
for multi-threaded verification that use BMC (e.g., [103]), iterative context-bounding
algorithms (e.g., [138]) and others that implement counterexample-guided abstraction
refinement (CEGAR) techniques (e.g., [44]).
2Available at http://users.ecs.soton.ac.uk/lcc08r/esbmc/
The rest of this chapter describes the problem statement and objectives and outlines the
solution. It then summarises our contributions and presents the structure of the thesis.
1.1 Problem Description
This PhD thesis tackles two major problems in computer-aided veriﬁcation: (1) provid-
ing suitable encodings into the SMT theories to reason accurately and eﬀectively about
realistic embedded programs and (2) exploiting SMT techniques to leverage bounded
model checking of multi-threaded software.
Part of the ﬁrst problem stems from the fact that most software veriﬁcation tools are
unable to reason accurately about embedded programs. Most programming languages
provide basic data types that have a bounded range deﬁned by the number of bits
allocated to each of them. They also contain constructs such as structures, unions,
and pointers that are not directly supported by the SMT solvers, and are often en-
coded imprecisely using axioms and uninterpreted functions by software veriﬁcation
tools that employ theorem provers (e.g., Simplify [62]) as back-end (e.g., ESC/Java [69],
BLAST [89], and Magic [35]). Nevertheless, in order to reason about embedded soft-
ware accurately, an SMT-based software veriﬁcation tool must consider a number of
issues that are not easily mapped into the theories supported by the SMT solvers, e.g.,
QF_AUFBV (the theory of bit-vectors and bit-vector arrays with function and predicate
symbols) and QF_AUFLIRA (closed linear formulas with function and predicate symbols
over a theory of arrays of integer index and real value) [164]. In previous work
on SMT-based BMC for software [11, 71] only the theories of uninterpreted functions,
arrays and linear arithmetic were considered, but no encoding was provided for ANSI-
C [95] constructs such as bit operations, unions, ﬁxed-point arithmetic, pointers (e.g.,
pointer arithmetic and comparisons) and dynamic memory allocation. This limits its
usefulness for analyzing and verifying realistic embedded software written in ANSI-C.
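A small example shows why the bounded range of the basic data types matters. The claim "x + 1 > x" holds for every mathematical integer, so an encoding based on unbounded linear arithmetic would accept it, yet it fails for a 32-bit unsigned variable at its maximum value. The sketch below is an illustration only; the function names are hypothetical and the 64-bit computation merely stands in for unbounded integers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Claim under test, evaluated with the real 32-bit semantics:
 * unsigned arithmetic wraps around modulo 2^32. */
bool successor_is_greater_u32(uint32_t x) {
    uint32_t y = x + 1u;
    return y > x;
}

/* The same claim evaluated in a wider type, standing in for an
 * encoding over unbounded integers (an assumption of this sketch). */
bool successor_is_greater_wide(uint32_t x) {
    uint64_t y = (uint64_t)x + 1u;
    return y > (uint64_t)x;
}

/* Usage: both views agree except at the wrap-around point. */
void wraparound_example(void) {
    assert(successor_is_greater_u32(41u));
    assert(successor_is_greater_wide(UINT32_MAX));   /* "unbounded" view: holds */
    assert(!successor_is_greater_u32(UINT32_MAX));   /* bit-accurate view: fails */
}
```

A tool that reasons only with the wide view would report the property as valid and miss the overflow, which is the kind of imprecision a bit-accurate encoding avoids.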
The other part of the first problem stems from the fact that most software verification
tools are unable to reason eﬀectively about embedded programs. There are tools that
employ SAT solvers as back-end and thus provide a bit-level accurate symbolic simulator
(e.g., CBMC [42], F-SOFT [96]), but they have limitations due to ineﬃcient transla-
tions and loss of high-level design information during the BMC problem formulation,
especially when reasoning on the propositional encoding of arithmetical operators (e.g.,
multiplication) [49]. SMT solvers, however, often integrate a simplifier, which applies
standard algebraic reduction rules (e.g., a ∧ false → false) and contextual simplification
(e.g., b = 7 ∧ p(b) → b = 7 ∧ p(7)) before bit-blasting or bit-flattening (i.e., replacing
the word-level operators by bit-level circuit equivalents) and passing the resulting
propositional expressions to a SAT solver. As structural word-level information remains in the problem formulation,
bit-blasting is used by the SMT solvers only as a last resort if the more abstract and less
expensive techniques are not powerful enough to solve the problem at hand (e.g., the
incremental and layered approach which permits strengthening incrementally the model
of the arithmetic operators [28]).
Consequently, new encodings are needed into existing SMT theories in order to make
veriﬁcation scalable and to model precisely ANSI-C scalar data types (with accurate
arithmetic over- and underﬂow), arrays, pointers, structures, and unions.
Second, the widespread use of multi-core processors with scalable shared memory in
embedded systems is already having a tangible impact on development and testing for
major software vendors [144]. However, the veriﬁcation of the software design and the
correctness of its multi-threaded implementations has become increasingly diﬃcult, for
at least three reasons. The first reason is that multi-threaded programs
exhibit more non-deterministic behaviour (i.e., the choice of interleaving among threads
in addition to the non-deterministically chosen values), which results in a large state
space that must be explored by a model checker. The second reason is that concurrency
errors are tricky to reproduce and debug because they usually occur under specific thread
interleavings. These errors most frequently manifest as deadlocks, data races, atomicity
violations, and order violations, and finding them in realistic multi-threaded programs is
challenging. In particular, an empirical study shows that the most common concurrency
errors are related to atomicity and order violations (approx. 67%) and deadlock (approx.
30%) [117]. This leads to the third reason, namely that errors related to multi-threaded
software typically involve changes in program state due to particular interleavings of
multiple threads of execution, thus making them diﬃcult to understand in the code.
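An atomicity violation of this kind can be reproduced deterministically by replaying a fixed schedule instead of running real threads. The sketch below simulates two threads that each increment a shared variable in two separate steps (load, then store). The data structure and function names are hypothetical, and the simulation is an illustration only, not the Pthread model used in this thesis.

```c
#include <assert.h>
#include <stddef.h>

/* State of one simulated thread executing the non-atomic increment
 *     tmp = x;        (step 0: load)
 *     x = tmp + 1;    (step 1: store)                                  */
struct sim_thread {
    int pc;    /* next step to execute */
    int tmp;   /* thread-local copy of the shared variable */
};

/* Replays one interleaving: schedule[i] names the thread (0 or 1) that
 * executes its next step. Returns the final value of the shared x. */
int replay_schedule(const int *schedule, size_t len) {
    int x = 0;
    struct sim_thread t[2] = { {0, 0}, {0, 0} };
    for (size_t i = 0; i < len; i++) {
        struct sim_thread *cur = &t[schedule[i]];
        if (cur->pc == 0) {
            cur->tmp = x;            /* load  */
        } else if (cur->pc == 1) {
            x = cur->tmp + 1;        /* store */
        }
        cur->pc++;
    }
    return x;
}

/* Usage: the same program gives different results under two schedules. */
void lost_update_example(void) {
    const int serial[4]      = {0, 0, 1, 1};  /* T0 finishes before T1 starts */
    const int interleaved[4] = {0, 1, 0, 1};  /* both load before either stores */
    assert(replay_schedule(serial, 4) == 2);       /* intended result */
    assert(replay_schedule(interleaved, 4) == 1);  /* one increment is lost */
}
```

The error is invisible in the source of either thread taken alone; it only appears for particular orderings of the four steps, which is what makes such bugs hard to reproduce by testing.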
As an example of the non-determinism related to the choice of interleaving among
threads, we consider a synthetic micro-benchmark extracted from Ghafari et al. [74],
which checks for a single valid property as shown in Figure 1.1. This micro-benchmark
is used in [74] to check the scalability of multi-threaded software veriﬁcation tools by
varying two key problem parameters: the number of threads (n) and program statements
(s).
int x=0    (shared global variable)

T1                 T2                 ......   Tn
S0: x++;           x++;                        x++;
    .              .                           .
    .              .                           .
    .              .                           .
Sk: x++;           x++;                        x++;
assert (x>0);      assert (x>0);               assert (x>0);

Figure 1.1: A Synthetic Micro-benchmark.
This micro-benchmark uses a shared global variable x, which in the initial state is initial-
ized to 0. Then, n threads are created such that each thread consists of s increments of
the variable x followed by an assertion that checks if x is greater than 0. Although this
micro-benchmark is a simplistic example of a multi-threaded program, it has essentially
three key elements that make it worthwhile to mention: local state (the program coun-
ters), shared global state x, and long data-dependency chains that grow with code size
and must then be inspected to prove the assertions. Therefore, this micro-benchmark
shows that as we increase n and s (and consequently the data-dependency), the number
of interleavings can grow very quickly (i.e., the number of possible execution sequences
is O(n^s)) since context switches among threads (due to the global variable x) increase
the number of possible execution paths considerably. Hence, in order to fully verify
multi-threaded programs against a given speciﬁcation, all possible interleavings must be
considered, and this thus represents a challenging problem in computer-aided veriﬁca-
tion.
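The growth can be checked with a short calculation. Under the simplifying assumptions that every x++ is a single atomic step, that the scheduler may switch threads after any step, and that the assertions are ignored, the number of distinct interleavings of n threads with s statements each is the multinomial coefficient (n·s)! / (s!)^n. The sketch below computes it as a product of binomial coefficients; the function names are hypothetical and the arithmetic is only exact for small n and s that fit into 64 bits.

```c
#include <assert.h>

/* Binomial coefficient C(m, r); exact as long as the intermediate
 * product fits into an unsigned long long. */
unsigned long long binomial(unsigned m, unsigned r) {
    unsigned long long c = 1;
    assert(r <= m);
    for (unsigned i = 1; i <= r; i++) {
        c = c * (m - r + i) / i;     /* the division is exact at every step */
    }
    return c;
}

/* Number of interleavings of n threads with s atomic statements each:
 * (n*s)! / (s!)^n, computed as C(s,s) * C(2s,s) * ... * C(ns,s). */
unsigned long long count_interleavings(unsigned n, unsigned s) {
    unsigned long long total = 1;
    for (unsigned j = 1; j <= n; j++) {
        total *= binomial(j * s, s); /* positions taken by the j-th thread */
    }
    return total;
}

/* Usage: a few small instances of the micro-benchmark. */
void interleaving_count_example(void) {
    assert(count_interleavings(2, 2) == 6);
    assert(count_interleavings(3, 2) == 90);
    assert(count_interleavings(2, 10) == 184756);
}
```

Even two threads with ten statements each already admit 184756 interleavings, which illustrates why exploring them one by one without any reduction quickly becomes infeasible.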
Recently, there have been attempts to extend BMC to the veriﬁcation of multi-threaded
software [73, 102, 103, 152]. The main challenge remains the classic state space explosion
problem in which the number of interleavings grows exponentially with the number of
threads and program statements as sketched above. Previous attempts are unable to
model check realistic multi-threaded programs (e.g., [152] evaluate their approach on a
concurrent bubblesort and [103] on a parameterized version of the dining philosophers
model, which are atypical multi-threaded C programs) and they are unable to find
bugs related to local and global deadlock (e.g., [73, 102, 103, 152]). Other attempts (e.g.,
[39]) encode the semantics of the SystemC scheduler, which does not allow preempting
a thread at any visible instruction in its execution and is thus unsuitable for model
checking multi-threaded software.
As far as we are aware, there is no other work that considers a comprehensive SMT-based
context-bounded model checking technique to verify real-world multi-threaded ANSI-C
software by combining symbolic model checking with explicit state space exploration.
Thus the problem considered in this thesis is expressed in the following question: can an
algorithmic method reason accurately about realistic multi-threaded software in embedded
systems and at the same time control the veriﬁcation complexity?
1.2 Objectives
The main objective of this thesis is thus to propose and evaluate an SMT-based bounded
model checking formulation to reason accurately about multi-threaded software such as
that used in embedded systems. In particular, we focus on embedded applications
written in ANSI-C that are platform-independent (single- and multi-threaded); and we
do not model check platform-dependent software (e.g., software that controls the hard-
ware registers) nor the timing constraints of the application. We further try to exploit
the SMT solvers to remove possible undesired models of the system in order to satisfy
a given property. In this respect, we develop new algorithmic methods and correspond-
ing tools based on SMT techniques to verify single- and multi-threaded software (with
shared variables) in embedded systems. More specifically, we will:
1. Provide details of an accurate translation from programs written in (full) ANSI-C
into quantiﬁer-free (QF) ﬁrst-order logic formulae (cf. Chapter 3).
2. Propose approaches to model check multi-threaded software with shared variables
by combining symbolic model checking with explicit state space exploration and by
bounding the number of context switches allowed among threads (cf. Chapter 4).
3. Develop heuristics to simplify the unwound formula arising from BMC instances
and exploit the diﬀerent theories and SMT solvers (cf. Chapter 5).
4. Detect design errors and integration problems as quickly as possible by exploit-
ing information from the software conﬁguration management (SCM) system (cf.
Chapter 6).
In Chapter 3, we propose a new encoding for (full) ANSI-C by exploiting the back-
ground theories supported by the SMT solvers (e.g., uninterpreted functions, arithmetic,
bit-vectors, and arrays). Hence, we extend and combine these background theories to
develop an approach to model precisely the ANSI-C program’s semantics. We will
demonstrate that this new encoding allows us to reason accurately about realistic em-
bedded software systems and improve the performance of software model checking for a
wide range of applications.
In Chapter 4, we describe and evaluate three approaches to SMT-based bounded model
checking: lazy, schedule recording, and underapproximation and widening. In all three
approaches, we combine symbolic model checking with explicit state space exploration
by constructing a reachability tree derived from the program and we also use a context-
bounded analysis [112, 171] that limits the number of context switches it explores. This
thus allows exploring explicitly the possible thread interleavings (up to the given con-
text bound) while treating each interleaving itself symbolically. We will evaluate our
approaches over several multi-threaded applications and show that they substantially
reduce the veriﬁcation time compared to other state-of-the-art techniques.
In Chapter 5, we exploit the diﬀerent background theories of SMT solvers and com-
bine diﬀerent theories and solvers, based on an analysis of the syntactic structure of a
given ANSI-C program. This allows exploiting the structure provided by the program,
and thus, improving scalability by making the analysis computationally more tractable.
Additionally, we describe a set of simpliﬁcations that we used in order to reduce the un-
wound formula. We will evaluate the performance improvement of these simpliﬁcations
and heuristics over a large set of benchmarks and show that they prevent overburdening
the model checker in realistic applications.
In Chapter 6, we describe an approach to integrate our SMT-based model checker into
software engineering practice by systematically focusing the verification effort on
new or modiﬁed functions. We investigate the use of equivalence checking to determine
whether modiﬁed functions need to be re-veriﬁed formally and use existing test cases
to reduce the search space for the model checker, thus combining dynamic and static
veriﬁcation. We will demonstrate through case studies that the proposed approach can
potentially improve the error-detection capability and reduce the overall veriﬁcation
time.
1.3 Outline of the Solution
Our approach deals with the theoretical and pragmatic aspects of using SMT techniques
to model check single- and multi-threaded software in embedded systems. We thus de-
velop algorithms and the corresponding tools and evaluate them using standard software
model checking benchmarks. The tools are built using a number of advanced (and com-
plex) techniques, including symbolic execution engines, satisﬁability modulo theories
solvers, context-bounded analysis, and partial order reduction. We use oﬀ-the-shelf
software wherever possible and focus our effort on a comprehensive and implemented
SMT-based model checking procedure to verify embedded software or, more precisely,
finite approximations of embedded software. In particular, we reuse the C/C++ front-end
from the CProver framework3 and use existing SMT solvers.
3The CProver framework consists of the components on which the veriﬁcation tools CBMC [42] and
SATABS [43] are based. It provides a mature and robust front-end for ANSI-C and C++ programs.
Figure 1.2 shows an overview of our SMT-based bounded model checking procedure for
single- and multi-threaded software in embedded systems. In Figure 1.2, the box labelled
CFG with solid lines represents the component that we reused without any modiﬁca-
tion from the CProver framework (i.e., the construction of the control-ﬂow graph). The
boxes labelled BMC (i.e., symbolic execution engine) and properties (i.e., property in-
strumentation) that are shown with thick dashed lines represent the components that we
substantially extended from the CProver framework in order to simplify the unwound
formula and handle multi-threaded programs; in particular, the CProver framework does
not perform bounded model checking of multi-threaded programs. The boxes labelled
scheduler and veriﬁcation conditions with thick solid lines represent components that we
developed from scratch. We describe here only a summary of each phase of our proposed
approach; more details are presented in the next chapters. The phases can be described
as follows:
[Figure 1.2: flow diagram. Boxes: C/C++ source; CFG; scheduler; properties; BMC; verification conditions; SMT solver. Edge and box annotations: scan, parse, and type-check; single- and multi-threaded goto programs; single-threaded goto programs; guide the symbolic execution engine for multi-threaded goto programs; symbolic execution engine; QF formula generation; check satisfiability using an SMT solver. Property annotations: deadlock, atomicity and order violations, data race; arithmetic under- and overflow, array bounds, pointer safety, memory leaks, user-specified assertions.]
Figure 1.2: Proposed SMT-based BMC procedure for software.
• Building the control-ﬂow graph (CFG): In BMC, the program to be analyzed
is modelled as a state transition system, which is extracted from the control-ﬂow
graph (CFG) [134]. The CFG is used as part of a translation process from program
text to single static assignment (SSA) form. This component is thus responsible
for scanning, parsing, and type-checking the C/C++ code and is reused from the
CProver framework without any modiﬁcation.
• Automatic generation of (concurrency) properties: In addition to the
language-speciﬁc safety properties that are generated automatically by the CProver
framework (e.g., absence of arithmetic under- and overﬂow, out-of-bounds array
indexing, or NULL-pointer dereferencing), we extend its class of properties to gen-
erate veriﬁcation conditions to check for memory leaks, data races and atomicity
and order violations in single- and multi-threaded programs. We also provide a new10 Chapter 1 Introduction
instrumented model of the Pthread functions to generate veriﬁcation conditions to
check for local and global deadlocks in the client code.
• Thread scheduler: This component guides the symbolic execution between
threads and systematically explores all the possible interleavings. To this end,
we construct a reachability tree (RT) of a multi-threaded program by unwinding
the control-ﬂow graph in a depth-ﬁrst search manner. We thus generate explicitly
the thread interleavings (with techniques similar to explicit-state model checking)
and we then guide the symbolic execution engine in order to encode each thread
symbolically. In particular, we explore the reachability tree by using three diﬀer-
ent approaches called lazy exploration, schedule recording, and underapproximation
and widening. In the lazy exploration approach, we traverse the RT depth-ﬁrst,
and simply call the single-threaded BMC procedure on the interleaving whenever
we reach an RT leaf node. We stop the RT traversal either when we ﬁnd a bug,
or have systematically explored all interleavings. In the schedule recording
approach, we use the RT to encode all the possible execution paths into one single
formula, which is then fed into the SMT solver. In the underapproximation and
widening approach, we model check models with an increasing set of allowed in-
terleavings. We start from an underapproximation describing a single interleaving
and widen the model by adding more interleavings incrementally based on the
proof objects generated from an SMT solver.
• Symbolic execution engine: For single-threaded programs, this component
takes as input the CFG representation of the program, a property $\phi$, and a bound
$k$. It derives as output a verification condition $\psi_k$ such that $\psi_k$ is satisfiable
if and only if $\phi$ has a counterexample of length $k$ or less. For multi-threaded
programs, this component takes as input a reachability tree $\Upsilon = \{\nu_1, \ldots, \nu_N\}$
(where $\nu_i$ is a given node in the reachability tree) that represents the program
unfolding for a context bound $C$ and a bound $k$, and a property $\phi$. It derives as
output a verification condition $\psi_k^{\pi}$ for a set of interleavings $\pi = \bigcup_{i=0}^{m} \pi_i$ (where
$m$ is the total number of interleavings) or for a given interleaving (or computation
path) $\pi_i = \{\nu_1, \ldots, \nu_k\}$ such that $\psi_k^{\pi}$ (or $\psi_k^{\pi_i}$) is satisfiable if and only if $\phi$ has a
counterexample of depth less than or equal to $k$ that is exhibited by $\pi$ (or $\pi_i$).
Here, we extend the SSA form of the symbolic execution engine to avoid naming
conﬂicts (when verifying multi-threaded programs) such as local (i.e., threads that
contain local variables with the same name) and path (i.e., nodes of the RT that
contain variables with the same name) conﬂicts. Additionally, we implement a set
of simpliﬁcation techniques (e.g., constant propagation and forward substitution)
to reduce the unwound formula and we also perform an up-front analysis in the
control-ﬂow graph of the program during the symbolic execution to determine the
most appropriate encoding and solver for a particular program.
• Quantifier-free formula generation: This component takes as input the verification
conditions generated by the symbolic execution engine and encodes them
into a quantiﬁer-free formula in a decidable subset of ﬁrst-order logic. Here, new
encodings are provided into existing SMT theories to model precisely ANSI-C
scalar data types (with accurate arithmetic overﬂow and underﬂow), arrays and
pointers (i.e., pointer arithmetic and comparisons), structures and unions, memory
allocation and ﬁxed-point arithmetic.
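The lazy exploration performed by the thread scheduler can be sketched on a toy program. In the code below, two threads each increment a shared variable in two steps (load, then store); the reachability tree is traversed depth-first, every complete interleaving is checked when a leaf is reached, and the traversal stops at the first violation. This is an illustration only and differs from the approach described above in one important respect: the check at the leaf is a concrete evaluation, whereas the procedure proposed here calls the single-threaded BMC procedure on a symbolic encoding of the interleaving. All names are hypothetical, and counting every change of the running thread as a context switch is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>

#define NTHREADS 2
#define NSTEPS   2   /* per thread: load, then store */

/* One node of the explicitly explored reachability tree. */
struct node {
    int x;                /* shared variable */
    int pc[NTHREADS];     /* next step of each thread */
    int tmp[NTHREADS];    /* thread-local copy of x */
    int last;             /* thread that ran last, -1 at the root */
    int switches;         /* context switches used so far */
};

struct result {
    bool bug_found;
    unsigned leaves;      /* complete interleavings that were checked */
};

/* Stand-in for the BMC call on one interleaving: the property
 * "both increments took effect" is evaluated concretely. */
static bool property_holds(const struct node *leaf) {
    return leaf->x == NTHREADS;
}

static void explore(struct node n, int bound, struct result *res) {
    bool is_leaf = true;
    for (int t = 0; t < NTHREADS && !res->bug_found; t++) {
        if (n.pc[t] >= NSTEPS) continue;            /* thread t has finished */
        is_leaf = false;
        struct node next = n;
        if (n.last != -1 && n.last != t) {
            if (n.switches == bound) continue;      /* context bound reached */
            next.switches++;
        }
        if (next.pc[t] == 0) next.tmp[t] = next.x;      /* load  */
        else                 next.x = next.tmp[t] + 1;  /* store */
        next.pc[t]++;
        next.last = t;
        explore(next, bound, res);                  /* depth-first descent */
    }
    if (is_leaf) {
        res->leaves++;
        if (!property_holds(&n)) res->bug_found = true;
    }
}

/* Explores all interleavings with at most `bound` context switches. */
struct result lazy_explore(int bound) {
    struct node root = { 0, {0, 0}, {0, 0}, -1, 0 };
    struct result res = { false, 0 };
    explore(root, bound, &res);
    return res;
}

/* Usage: the lost update needs two context switches to show up. */
void lazy_exploration_example(void) {
    assert(!lazy_explore(1).bug_found);
    assert(lazy_explore(2).bug_found);
}
```

With a context bound of 1 only the two serial executions are complete and both satisfy the property; with a bound of 2 the traversal reaches an interleaving in which both threads load before either stores, and it stops there.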
1.4 Contributions
The main contribution of this PhD thesis is the development, implementation, and
evaluation of a comprehensive SMT-based bounded model checking procedure to verify
realistic single- and multi-threaded software in embedded systems. In this respect, this
thesis makes three major novel contributions.
First, we describe the details of an accurate translation from ANSI-C programs into
quantifier-free formulae using the SMT logics QF_AUFBV and QF_AUFLIRA from the
SMT-LIB and we also apply a set of optimization techniques to prevent overburdening
the solver. We demonstrate that our encoding and optimizations improve the per-
formance of software model checking for a wide range of embedded software systems.
Additionally, we show that our encoding allows us to reason about arithmetic under-
and overflow, pointer safety, memory leaks, array bounds, atomicity and order violations,
deadlock, data races, and user-specified assertions; and to verify programs that
make use of bit-level operations, pointers, dynamic memory allocation, structs, unions and
fixed-point arithmetic. Note that we do not require the user to annotate the programs with
pre/post-conditions and the veriﬁcation is thus completely automatic. We also use three
different SMT solvers (CVC3 [20], Boolector [31], and Z3 [57]) in order to check the
effectiveness of our encoding techniques. We considered these solvers because they were the
most efficient ones for the categories of QF_AUFBV and QF_AUFLIRA in the last SMT
competitions.4 As far as we are aware, no other SMT-based bounded model checking tool
exists that can reliably handle full ANSI-C. We also exploit different background theories
and solvers, based on an analysis of the syntactic structure of a given ANSI-C program
in order to improve scalability and precision in a completely automatic way. To the best
of our knowledge, this is the ﬁrst work that reasons accurately about ANSI-C constructs
commonly found in embedded software and extensively applies SMT solvers to check the
veriﬁcation conditions emerging from the bounded model checking of embedded software
industrial applications.
4The results are available at http://www.smtcomp.org
Second, we exploit SMT to improve bounded model checking of multi-threaded software.
In particular, we exploit SMT solvers to prune the property- and data-dependent search
space (via non-chronological backtracking and conflict-clause learning) and to remove
possible undesired models (i.e., interleavings that are not relevant) of the system in order
to satisfy a given property (which is done by analyzing the proof of unsatisﬁability). We
describe and evaluate three approaches: lazy, schedule recording, and underapproxima-
tion and widening (UW) to model check multi-threaded software with shared variables
and locks using bounded model checking based on SMT techniques and our modelling
of the synchronization primitives of the Pthread library. Here, the main novelty is in
the combination of symbolic model checking with explicit state space exploration that
underlies all three approaches. To the best of our knowledge, the lazy approach has
not been described or evaluated in the literature. Similarly, underapproximation and
widening has not been used for bounded model checking of multi-threaded software. Ad-
ditionally, our approach is based on the new notion of eﬀective context switches (ECS)
blocks and it thus uses a diﬀerent encoding from Grumberg et al. [84]. The diﬀerence
between our schedule recording and the approaches proposed by [73, 102, 103, 152] is
that they all work in a fully symbolic context. We also describe a new modelling of the
Pthread synchronization primitives for mutex and condition variables that allows us to
detect local and global deadlock.
Finally, we explore a new concept called continuous veriﬁcation to detect design errors
and integration problems as quickly as possible by exploiting information from the soft-
ware conﬁguration management (SCM) system, systematically focusing the veriﬁcation
eﬀort on new or modiﬁed functions [54]. We thus add a state-space reduction technique
for our SMT-based bounded model checking procedure, which looks at the modifications
made to the system since its last verification, and submits them to a partly
static, partly dynamic “continuous” veriﬁcation process, guided by a set of test cases
for coverage. As a result, we integrate the continuous veriﬁcation approach with the
combination of diﬀerent encodings and solvers in order to allow us to verify larger parts
of the state space of the system (compared to software model checkers only) and explore
more exhaustively the state space (compared to testing only).
1.5 Organization of the Thesis
This introduction has outlined the context, motivation, and problem addressed by this
thesis, and the objectives, solution and contributions of the research. The remaining
chapters of this thesis are organized as follows:
Chapter 2, SAT-based and SMT-based Veriﬁcation Techniques, overviews the main con-
cepts needed to understand this thesis, such as propositional logic, SAT-based bounded
model checking, satisﬁability modulo theories and concurrent systems and reviews some
methods to achieve completeness in the BMC framework such as Craig interpolation and
k-induction. It also includes an explanation about veriﬁcation conditions and partial-
order reduction. Additionally, this chapter reviews the related work on model checking
sequential and multi-threaded software as well as techniques applied to the veriﬁcation
of large embedded software systems.
Chapter 3, SMT-based Bounded Model Checking for Embedded ANSI-C Software, de-
scribes the encoding and application of diﬀerent background theories and SMT solvers
to the veriﬁcation of embedded software written in ANSI-C in order to improve scal-
ability and precision in a completely automatic way. We evaluate these approaches
on both standard software model checking benchmarks and typical embedded software
applications from telecommunications, control systems, and medical devices. Our ex-
periments show that our approaches can analyze larger problems than existing tools and
substantially reduce the veriﬁcation time.
Chapter 4, Verifying Multi-threaded Software using SMT-based Context-Bounded Model
Checking, describes and evaluates three approaches to model check multi-threaded soft-
ware with shared variables and locks using bounded model checking based on Satisﬁa-
bility Modulo Theories (SMT) and our modelling of the synchronization primitives of
the Pthread library. In all three approaches, we bound the number of context switches
allowed among threads in order to reduce the number of interleavings explored. This
chapter shows that our approaches can analyze larger problems and substantially reduce
the veriﬁcation time compared to state-of-the-art techniques that use BMC, iterative
context-bounding algorithms or counter-example guided abstraction reﬁnement.
Chapter 5, Implementation of ESBMC, describes the main software components of the
ESBMC architecture and the simpliﬁcations that we used in order to reduce the un-
wound formula. It also evaluates the simpliﬁcation techniques, which give a substantial
performance improvement over a large set of benchmarks.
Chapter 6, Integrating ESBMC into Software Engineering Practice, describes a new
approach called continuous veriﬁcation to detect design errors as quickly as possible
by exploiting information from the software conﬁguration management system and by
combining dynamic and static veriﬁcation to reduce the state space to be explored. This
chapter shows that the proposed approach can potentially reduce the overall veriﬁcation
time in a case study from the telecommunications domain.
Finally, Chapter 7, Conclusions, summarizes the contributions of this thesis and describes
how our work differs from that of others. This chapter also outlines the limitations of our
approaches and presents some directions for future work.
Appendix A describes an Eclipse plug-in for the ESBMC model checker that can assist
the software engineer during the veriﬁcation process. This plug-in was developed with
the help of Qiang Li during his summer internship.
Appendix B shows the detailed results of the error-detection capability of ESBMC over
a large set of well-known static analysis benchmarks.
Appendix C describes the main functions of the POSIX Pthread library [135] that ES-
BMC supports.
Appendix D shows an example of the counterexample that is generated by ESBMC for
a multi-threaded program.
Chapter 2
SAT-based and SMT-based
Veriﬁcation Techniques
This chapter introduces the main concepts needed to understand this thesis. It is di-
vided into three main sections. The ﬁrst section, Logical Foundations, deﬁnes the syntax
and semantics of propositional logic and sketches decision procedures for checking sat-
isﬁability of propositional formulae. It also describes the background theories of the
Satisﬁability Modulo Theories (SMT) solvers that are used throughout this thesis, and
how to specify safety and liveness properties using linear-time temporal logic. The sec-
ond section, Bounded Model Checking of Software, presents the BMC technique and
shows how to achieve completeness in BMC via Craig interpolation and k-induction
techniques. This section also overviews the BMC architectures used in software veriﬁ-
cation and compares the BMC technique to other software veriﬁcation approaches that
also use logic to describe states and transformations between system states. Finally, the
third section, Veriﬁcation of Multi-threaded Systems, presents concepts and deﬁnitions
of multi-threaded (concurrent) systems and the partial order reduction technique used
to prune the state space of multi-threaded systems.
2.1 Logical Foundations
Logic can be deﬁned by means of symbols and a system of rules to manipulate the
symbols [29]. The use of logic allows us to model the programs and to reason about
them formally. This section thus introduces the logical foundations that will be the basis
for the explanation of our techniques described in Chapters 3, 4, and 6.
2.1.1 Propositional Logic
This section recalls the deﬁnition of propositional logic (PL) syntax and semantics along
with some examples. Further information can be found in textbooks [29, 94, 109, 130].
The syntax of PL consists of symbols and rules so that we can combine the symbols to
construct “sentences” (more speciﬁcally formulae). Generally speaking, propositional
logic or calculus is a two-valued logic, which is based on the assumption that every
sentence is either true or false. A truth value (or a logical value, which is represented
by tt or ﬀ ), is a value indicating the relation of a proposition (i.e., the meaning of
the sentence) to truth. The basic elements of PL are the constants true (sometimes
also represented as ⊤ or 1) and false (sometimes also represented as ⊥ or 0) and the
propositional variables: x1,x2,...,xn (whose set is usually denoted by the letter X,
except where noted otherwise and n is a ﬁnite number of propositional variables). Logical
operators (e.g., ¬,∧), also called Boolean operators, provide the expressive power of PL.
Deﬁnition 2.1. The syntax of formulae in PL is deﬁned by the following grammar:
Fml ::= Fml ∧ Fml | ¬Fml | (Fml) | Atom
Atom ::= Variable | true | false
Using the logical operators conjunction (∧) and negation (¬), the full power of propo-
sitional logic is obtained. Other logical operators such as disjunction (∨), implication
(⇒), equivalence (⇔), exclusive or (⊕), and conditional expression (ite) can be deﬁned
as follows.
Deﬁnition 2.2. We deﬁne the usual logical operators as follows:
• φ1 ∨ φ2 ≡ ¬(¬φ1 ∧ ¬φ2)
• φ1 ⇒ φ2 ≡ ¬φ1 ∨ φ2
• φ1 ⇔ φ2 ≡ (φ1 ⇒ φ2) ∧ (φ2 ⇒ φ1)
• φ1 ⊕ φ2 ≡ (φ1 ∧ ¬φ2) ∨ (φ2 ∧ ¬φ1)
• ite(θ,φ1,φ2) ≡ (θ ∧ φ1) ∨ (¬θ ∧ φ2)
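The definitions above can be checked mechanically. The following Python sketch is an illustration only: it builds each derived operator from negation and conjunction alone and compares the result with Python's built-in Boolean operations over all truth assignments. The function names (neg, conj, disj, implies, iff, xor, ite, agrees_with_builtins) are our own and are not taken from any library.

```python
from itertools import product

# Primitive operators of Definition 2.1.
def neg(a):
    return not a

def conj(a, b):
    return a and b

# Derived operators, written exactly as in Definition 2.2.
def disj(a, b):
    return neg(conj(neg(a), neg(b)))

def implies(a, b):
    return disj(neg(a), b)

def iff(a, b):
    return conj(implies(a, b), implies(b, a))

def xor(a, b):
    return disj(conj(a, neg(b)), conj(b, neg(a)))

def ite(c, a, b):
    return disj(conj(c, a), conj(neg(c), b))

def agrees_with_builtins():
    """Compare every derived operator with Python's own Boolean operations."""
    for a, b in product([False, True], repeat=2):
        if disj(a, b) != (a or b):
            return False
        if implies(a, b) != ((not a) or b):
            return False
        if iff(a, b) != (a == b):
            return False
        if xor(a, b) != (a != b):
            return False
    for c, a, b in product([False, True], repeat=3):
        if ite(c, a, b) != (a if c else b):
            return False
    return True
```

Because the check enumerates all assignments, it is a truth-table argument in executable form.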
A PL formula is then deﬁned in terms of the basic elements true, false, or a propositional
variable x; or the application of one of the following logical operators to a formula φ:
“not” (¬φ), “and” (φ1 ∧ φ2), “or” (φ1 ∨ φ2), “implies” (φ1 ⇒ φ2), “iff” (φ1 ⇔ φ2),
“parity” (φ1 ⊕ φ2) or “ite” (ite(θ,φ1,φ2)).
Each operator in PL has an arity (i.e., the number of arguments that it takes). The
operator “not” is unary while the other operators are binary, except for “ite”, which is
a ternary operator. The left and right arguments of ⇒ are called the antecedent and
consequent respectively. The propositional variables, and propositional constants, true
and false, stand for indecomposable propositions, known as atoms, or atomic proposi-
tions. A literal is an atom β or its negation ¬β. A formula is a literal or the application
of a logical operator to a formula or formulae.
Formulae in PL are strings over the alphabet {x1,x2,x3,...} ∪ {¬,∧,∨,⇒,⇔} ∪ {(,)}.
The string ∧(¬) ∨ x1x2 ⇔ is a word over that alphabet, but it does not have any meaning
as far as propositional logic is concerned.
Deﬁnition 2.3. We say that a PL formula is a well-formed formula if we use the
construction rules from Deﬁnition 2.1 to obtain it given that negation has priority over
conjunction.
Deﬁnition 2.4. We deﬁne the relative precedence of the logical operators from highest
to lowest as follows: ¬, ∧, ∨, ⇒ and ⇔.
In order to check whether a given PL formula is true or false, we ﬁrst deﬁne a mechanism
for evaluating the propositional variables by means of interpretations. An interpretation
I assigns to every propositional variable exactly one truth value. For instance, I =
{x1 ↦ tt, x2 ↦ ff} is an interpretation assigning true to x1 and false to x2. Given a PL
formula and an interpretation, the truth value of a formula can be computed by a truth
table or by induction. Considering the possible evaluations of a propositional variable x
(i.e., tt or ﬀ ), we can construct the truth table for the logical operators ¬,∧,∨,⇒, ⇔
and ⊕ as shown in Table 2.1. It is important to note that x1 ⇒ x2 is false iﬀ x1 is true
and x2 is false.
x1   x2   ¬x1   x1 ∧ x2   x1 ∨ x2   x1 ⇒ x2   x1 ⇔ x2   x1 ⊕ x2
ff   ff   tt    ff        ff        tt        tt        ff
ff   tt   tt    ff        tt        tt        ff        tt
tt   ff   ff    ff        tt        ff        ff        tt
tt   tt   ff    tt        tt        tt        tt        ff
Table 2.1: Truth table.
We also describe an inductive deﬁnition of PL’s semantics that deﬁnes the meaning of
basic operators and also the meaning of more complex formulae in terms of the basic
operators. We write I |= φ if φ evaluates to tt under I and I ⊭ φ if φ evaluates to ff
under I.
Deﬁnition 2.5. We deﬁne the evaluation of formula φ under an interpretation I as
follows.
• I |= x iff I [x] = tt
• I |= ¬φ iff I ⊭ φ
• I |= φ1 ∧ φ2 iﬀ I |= φ1 and I |= φ2
Lemma 2.6. The semantics of more complex formulae are evaluated as:
I |= φ1 ∨ φ2 iﬀ I |= φ1 or I |= φ2
I |= φ1 ⇒ φ2 iﬀ, whenever I |= φ1 then I |= φ2
I |= φ1 ⇔ φ2 iff I |= φ1 and I |= φ2, or I ⊭ φ1 and I ⊭ φ2
As an example, consider the formula φ : x1 ∨ x2 ⇒ x1 ∧ x2 under the interpretation
I : {x1 ↦ ff, x2 ↦ tt}. We can compute the truth value of φ as follows:
1. I ⊭ x1 since I [x1] = ff
2. I |= x2 since I [x2] = tt
3. I |= x1 ∨ x2 by 2 and semantics of the operator ∨
4. I ⊭ x1 ∧ x2 by 1 and semantics of the operator ∧
5. I ⊭ φ by 3 and 4 and semantics of the operator ⇒
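The inductive evaluation above can be sketched as a small recursive function. This is a minimal illustration under our own assumptions: formulae are represented as nested Python tuples such as ("and", f1, f2), variables as strings, and an interpretation as a dictionary. The names holds, phi and interp are hypothetical.

```python
def holds(formula, interp):
    """Return True iff interp |= formula (Definition 2.5 and Lemma 2.6)."""
    if isinstance(formula, str):          # propositional variable
        return interp[formula]
    op = formula[0]
    if op == "not":
        return not holds(formula[1], interp)
    left = holds(formula[1], interp)
    right = holds(formula[2], interp)
    if op == "and":
        return left and right
    if op == "or":
        return left or right
    if op == "implies":
        return (not left) or right
    if op == "iff":
        return left == right
    raise ValueError("unknown operator: " + str(op))

# The example formula x1 ∨ x2 ⇒ x1 ∧ x2 under {x1 ↦ ff, x2 ↦ tt}.
phi = ("implies", ("or", "x1", "x2"), ("and", "x1", "x2"))
interp = {"x1": False, "x2": True}
result = holds(phi, interp)   # False: the antecedent holds, the consequent does not
```

Each branch of the function corresponds to one clause of the semantics, so the recursion mirrors the five-step derivation.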
2.1.2 Decision Procedures for Satisﬁability
Section 2.1.1 introduced the truth table and semantic argument methods for determin-
ing the satisﬁability of PL formulae. However, an algorithmic method can easily be
implemented in order to decide satisﬁability of PL formulae.
Deﬁnition 2.7. A PL formula is satisﬁable with respect to a class of interpretations if
there exists an assignment to its variables under which the formula evaluates to true.
The input of the algorithm to check the satisﬁability is usually a PL formula in conjunc-
tive normal form (CNF).
Deﬁnition 2.8. Formally, a PL formula φ is in conjunctive normal form if it consists
of a conjunction of one or more clauses, where each clause is a disjunction of one or
more literals. It has the form ⋀_i (⋁_j l_ij), where each l_ij is a literal.
A PL formula can easily be transformed into an equisatisﬁable CNF formula in polyno-
mial time using Tseitin’s encoding [172].
Deﬁnition 2.9. Two PL formulae are said to be equisatisﬁable if they are both satisﬁable
or they are both unsatisfiable.
In Tseitin’s encoding, we introduce a fresh variable for each logical operator (e.g., ∧, ∨, and ¬)
in the original PL formula, and add several clauses that constrain the value of this variable to
be equal to the subexpression it represents. The original PL formula is satisfiable iff the
conjunction of these clauses together with the variable of the topmost operator is satisfiable.
As an example, consider the following PL formula:
x1 ⇒ (¬x2 ∨ x3) (2.1)
For this example, let us assign the variable b3 to the subexpression ¬x2, b2 to the
subexpression b3 ∨ x3, and b1 to the implication x1 ⇒ b2, which is also the topmost
operator of this formula. We need to satisfy b1, together with three equivalences, as
follows:
b1 ⇔ x1 ⇒ b2
b2 ⇔ b3 ∨ x3
b3 ⇔ ¬x2 (2.2)
The equivalences can be rewritten to CNF using Deﬁnition 2.2 as follows:
(¬b1 ∨ ¬x1 ∨ b2) ∧ (x1 ∨ b1) ∧ (b1 ∨ ¬b2) (2.3)
(¬b2 ∨ b3 ∨ x3) ∧ (¬b3 ∨ b2) ∧ (¬x3 ∨ b2) (2.4)
(¬b3 ∨ ¬x2) ∧ (x2 ∨ b3) (2.5)
The overall CNF formula is thus the conjunction of (2.3), (2.4), (2.5), and the unit
clause (i.e., a clause that is composed of a single literal) b1, which represents the topmost
operator. The propositional satisﬁability (SAT) problem is then to decide if there exists
a satisfying assignment to the literals of the PL formula φ (in CNF) to satisfy all clauses.
The algorithm to check the satisﬁability of φ is a decision procedure, because given any
formula, the algorithm always terminates with a “correct” yes/no answer after some
ﬁnite amount of computation.
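To make the equisatisfiability claim concrete, the following Python sketch enumerates all assignments by brute force. It is an illustration, not the procedure a SAT solver uses. The clause list is obtained by expanding each equivalence b ⇔ e into (b ⇒ e) ∧ (e ⇒ b); the literal representation (variable name paired with a polarity) and all function names are our own assumptions.

```python
from itertools import product

def original(x1, x2, x3):
    """The formula x1 ⇒ (¬x2 ∨ x3)."""
    return (not x1) or ((not x2) or x3)

# Each literal is (variable, polarity); polarity False means negated.
CLAUSES = [
    [("b1", False), ("x1", False), ("b2", True)],   # b1 ⇒ (x1 ⇒ b2)
    [("x1", True), ("b1", True)],                   # ¬x1 ⇒ b1
    [("b1", True), ("b2", False)],                  # b2 ⇒ b1
    [("b2", False), ("b3", True), ("x3", True)],    # b2 ⇒ (b3 ∨ x3)
    [("b3", False), ("b2", True)],                  # b3 ⇒ b2
    [("x3", False), ("b2", True)],                  # x3 ⇒ b2
    [("b3", False), ("x2", False)],                 # b3 ⇒ ¬x2
    [("x2", True), ("b3", True)],                   # ¬x2 ⇒ b3
    [("b1", True)],                                 # unit clause for the topmost operator
]

def cnf_holds(assign):
    """True iff every clause contains at least one satisfied literal."""
    return all(any(assign[v] == pol for v, pol in clause) for clause in CLAUSES)

def projections_agree():
    """For each assignment to x1..x3, the CNF can be extended iff the original holds."""
    for x1, x2, x3 in product([False, True], repeat=3):
        extendable = any(
            cnf_holds({"x1": x1, "x2": x2, "x3": x3,
                       "b1": b1, "b2": b2, "b3": b3})
            for b1, b2, b3 in product([False, True], repeat=3)
        )
        if extendable != original(x1, x2, x3):
            return False
    return True
```

The check is slightly stronger than equisatisfiability: every satisfying assignment of the original formula extends to one of the CNF, and vice versa.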
Modern decision procedures to check the satisﬁability of PL formulae in CNF are based
on a variant of the Davis-Putnam-Logemann-Loveland algorithm (DPLL), which consists
essentially of two steps: (i) choose a truth value for some literal and (ii) propagate the
implications of this decision that are easy to infer. The second step is known as unit
propagation, which can simplify a set of clauses and thus avoids a large part of the
naive search space. The algorithm backtracks when a conﬂict is reached and learns the
assignments to literals of the conﬂict to avoid reaching the same conﬂict again.
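The two steps can be sketched in a few lines of Python. This is a deliberately simplified illustration: it performs the decision and unit-propagation steps with chronological backtracking, but it omits the conflict-driven clause learning that modern solvers add. Clauses are lists of non-zero integers (a negative number denotes a negated variable); this representation and the function names are our own assumptions.

```python
def unit_propagate(clauses, assignment):
    """Assign literals forced by unit clauses; return None on a conflict."""
    changed = True
    while changed:
        changed = False
        remaining = []
        for clause in clauses:
            if any(lit in assignment for lit in clause):
                continue                      # clause is already satisfied
            rest = [lit for lit in clause if -lit not in assignment]
            if not rest:
                return None                   # every literal is false: conflict
            if len(rest) == 1:
                assignment = assignment | {rest[0]}   # forced assignment
                changed = True
            else:
                remaining.append(rest)
        clauses = remaining
    return clauses, assignment

def dpll(clauses, assignment=frozenset()):
    """Return a set of true literals satisfying all clauses, or None if unsatisfiable."""
    result = unit_propagate(clauses, assignment)
    if result is None:
        return None
    clauses, assignment = result
    if not clauses:
        return assignment                     # all clauses satisfied
    lit = clauses[0][0]                       # step (i): choose a literal
    for choice in (lit, -lit):                # try one value, then backtrack
        model = dpll(clauses, assignment | {choice})
        if model is not None:
            return model
    return None
```

For instance, dpll([[1, -2], [2], [-1]]) returns None, since unit propagation alone derives a conflict, whereas a satisfiable input yields a set of literals that makes every clause true.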
In this context, a SAT solver is thus an algorithm (based on a variant of the DPLL) that
takes as input a formula φ (which is in CNF) and decides whether it is satisﬁable or
unsatisﬁable. The formula φ is said to be satisﬁable (or sat) if the SAT solver is able to
ﬁnd an interpretation that makes the formula true (cf. Deﬁnition 2.7). The formula φ is
said to be unsatisﬁable (or unsat) if none of the interpretations make the formula true.
In the satisﬁable case, SAT solvers can provide a model, i.e., a satisfying assignment to
the propositional variables of the formula φ. In the unsatisﬁable case, when a SAT solver
concludes that there is no satisfying assignment to φ, its internal steps for concluding
this can be used to construct a resolution proof [154] (and most state-of-the-art SAT
solvers can output such steps that can be used as an independently checkable proof of
unsatisﬁability).
Deﬁnition 2.10. A resolution proof is a sequence of deduction steps based on the in-
ference rule:
    p1 ∨ ... ∨ pn ∨ (α)     q1 ∨ ... ∨ qm ∨ (¬α)
    ─────────────────────────────────────────────
            p1 ∨ ... ∨ pn ∨ q1 ∨ ... ∨ qm
where p1, ..., pn, q1, ..., qm are literals and α is a variable (also called the resolution
variable). The clauses p1 ∨ ... ∨ pn ∨ (α) and q1 ∨ ... ∨ qm ∨ (¬α) are called the resolving
clauses and p1 ∨ ... ∨ pn ∨ q1 ∨ ... ∨ qm is called the resolvent.
The intuitive interpretation of resolution is that to satisfy clauses p1 ∨ ... ∨ pn ∨ (α)
and q1 ∨ ... ∨ qm ∨ (¬α) that share the resolution variable α but disagree on its value,
either the rest of p1 ∨...∨pn or the rest of q1 ∨...∨qm must be satisﬁed. For example,
consider the following formula in CNF:
φ : (p ∨ ¬q) ∧ q ∧ ¬p (2.6)
from resolution
    p ∨ ¬q     q
    ────────────   (2.7)
         p
we can construct
φ1 : (p ∨ ¬q) ∧ q ∧ ¬p ∧ p (2.8)
from resolution
    ¬p     p
    ────────   (2.9)
       □
we can conclude that the original formula φ is unsatisfiable because the last deduction
step (2.9) ends with the empty clause □. Therefore, we can also say that a PL formula
in CNF is unsatisfiable iff there exists a finite series of deduction steps (based on the
inference rule given in Definition 2.10) ending with the empty clause.
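The two deduction steps of this example can be replayed in Python. The sketch below is illustrative only; it represents a clause as a frozenset of integer literals (p as 1, q as 2, negation as a sign change), which is our own encoding, and the function name resolve is hypothetical.

```python
def resolve(c1, c2, var):
    """Resolvent of c1 (containing var) and c2 (containing the negation of var)."""
    if var not in c1 or -var not in c2:
        raise ValueError("clauses do not clash on the given variable")
    return (c1 - {var}) | (c2 - {-var})

P, Q = 1, 2

# Step (2.7): resolve q with (p ∨ ¬q) on q, which yields the clause p.
step1 = resolve(frozenset({Q}), frozenset({P, -Q}), Q)

# Step (2.9): resolve p with ¬p on p, which yields the empty clause.
step2 = resolve(step1, frozenset({-P}), P)

is_refuted = (step2 == frozenset())
```

Reaching the empty frozenset corresponds to deriving the empty clause, and hence to the unsatisfiability of the formula.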
Although PL formulae can be converted into CNF in polynomial time (using Tseitin’s
encoding as described above), the problem of deciding the satisfiability of PL formulae
is NP-complete [55]. Nevertheless, research in the past decade has
advanced the state of the art in SAT solving considerably. For a recent survey on SAT we refer the
reader to [24].
From the veriﬁcation point of view, a propositional encoding and use of a SAT solver
to reason about programs have two main limitations as follows. First, the size of the
propositional encoding depends directly on the size of the basic data types and arrays
occurring in the program. Consequently, large data-paths in programs involving complex
expressions lead to large propositional formulae. Second, high-level information is lost
when veriﬁcation conditions are converted into propositional logic. SAT solvers operate
at the bit-level and are thus unable to exploit the structure provided by the higher
abstraction levels. These limitations can be substantially reduced by encoding word-
level information in theories richer than propositional logic and using SMT solvers for
the generated veriﬁcation conditions.
2.1.3 Satisﬁability Modulo Theories
SMT decides the satisﬁability of certain ﬁrst-order formulae using a combination of dif-
ferent background theories and thus generalizes propositional satisﬁability by support-
ing uninterpreted functions, linear and non-linear arithmetic, bit-vectors, tuples, arrays,
and other decidable ﬁrst-order theories (FOL is in general undecidable [37]). Table 2.2
shows some examples of the decidable ﬁrst-order theories (e.g., equality, bit-vectors,
linear arithmetic, arrays) supported by typical SMT solvers.
Theory              Example
Equality            z1 = z2 ∧ ¬(z1 = z3) ⇒ ¬(z2 = z3)
Bit-vectors         ((b >> i)||2)&1 = 1
Linear Arithmetic   (4y1 + 3y2 + 1 ≥ 4) ∨ (y2 − 3y3 + 5 ≤ 3)
Arrays              (j = k ∧ select(a,k) = 2) ⇒ select(a,j) = 2
Combined Theories   g (select(store(a,c,12),c)) ≠ g (1) ∧ c − 3 = c − 3
Table 2.2: Examples of First-Order Theories.
A first-order theory T is defined by a signature Σ that consists of a set of functions,
predicates, and constant symbols (also called nonlogical symbols) and a set of axioms
A that consists of ﬁrst-order logic formulae in which the only nonlogical symbols that
appear are in Σ [29]. A Σ-formula is a formula that uses nonlogical symbols of Σ, as
well as variables, logical connectives (∧, ∨, ¬), quantiﬁers (∃ and ∀) and parentheses.
Deﬁnition 2.11. Given a Σ-theory T and a quantiﬁer-free formula ψ, we say that ψ
is T -satisfiable if and only if there exists a structure that satisfies both the formula and
the sentences of T , or equivalently, if T ∪ {ψ} is satisfiable.
Deﬁnition 2.12. Given a set Γ∪{ψ} of ﬁrst-order formulae over a Σ-theory T , we say
that ψ is a T -consequence of Γ, and write Γ |=T ψ, if and only if every model of T ∪ Γ
is also a model of ψ. Checking Γ |=T ψ can be reduced in the usual way to checking the
T -satisﬁability of Γ ∪ {¬ψ}.
State-of-the-art SMT solvers are built on top of eﬃcient SAT solvers to speed up the
performance and support the combination of diﬀerent decidable theories [20, 31, 57]. For
example, SAT solvers do not scale well when reasoning on the propositional encoding
of arithmetical operators (e.g., multiplication), because the operands are treated as
arrays of Booleans and most of the computational eﬀort might be wasted during the
Boolean search (e.g., up to a factor of 2^w in the amount of Boolean search, where w represents
the width of the data type) [27]. SMT solvers, however, often integrate a simplifier,
which applies standard algebraic reduction rules (e.g., r ∧ false ↦ false) and contextual
simplification (e.g., a = 7 ∧ p(a) ↦ a = 7 ∧ p(7)) before replacing the word-level
operators by bit-level circuit equivalents (i.e., before bit-blasting). Furthermore, SMT
solvers (e.g., [32]) often implement an incremental and layered approach which permits
strengthening incrementally the model of the arithmetic operators and they thus achieve
performance improvements of several orders of magnitude when compared to plain bit-blasting,
as reported in [28]. Consequently, since structural word-level information (i.e.,
predicates from various decidable theories) remains in the problem formulation,
bit-blasting is used by the SMT solvers only as a last resort if higher-level and less
expensive techniques are not enough to solve the problem at hand.
The SMT-LIB initiative [164] aims at establishing a common standard for the specification
of background theories, but the background theories still vary and most current
SMT solvers provide functions in addition to those specified in the SMT-LIB. Therefore,
we describe here all the fragments that we found in the SMT solvers CVC3 [20],
Boolector [31] and Z3 [57] for the theory of linear, non-linear, and bit-vector arithmetic.
We summarize the syntax of these background theories as follows:
Note that here we use standard notation to describe the above grammar, and we thus
only focus on certain aspects of the notation. In this grammar Fml denotes Boolean-
valued expressions, Trm denotes terms built over integers, reals, and bit-vectors while
op denotes binary operators. The logical connectives con consist of conjunction (∧), disjunction
(∨), exclusive-or (⊕), implication (⇒), and equivalence (⇔). The interpretation
follows [20, 31, 57]:
a = b ⇐ ∀i . select(a,i) = select(b,i)
a ≠ b ⇒ ∃i . select(a,i) ≠ select(b,i)
The theory of arrays employs the notion of unbounded array size, but arrays in software
are typically of bounded size. This means that if an index variable i exceeds the size of
an array in a program, the value returned might be undeﬁned or a crash might occur.
Chapter 3 shows how to generate veriﬁcation conditions in order to check for array
bounds violation in programs.
Another theory of interest to software verification is the theory of tuples, which allows
us to model the ANSI-C struct and union datatypes. Tuples provide store and select
operations similar to those in arrays, but working on the tuple elements. Each field of
the tuple is represented by an integer number. Hence, the expression select(t, f) denotes
the ﬁeld f of tuple t while the expression store(t, f, v) denotes a tuple t that at ﬁeld
f has the value v and all other tuple elements remain the same. Chapter 3 shows how
structures and unions are encoded using the theory of tuples.
As a running example for background theories, we give a simple SMT formula that uses
three theories (bit-vector arithmetic, theory of arrays, and uninterpreted functions). Let
a be an array, b, c and d be signed bit-vectors of width 16, 32 and 32 respectively, and
let g be a unary uninterpreted function. Such a function satisfies the congruence rule: for
all x and y (where x and y are variables), if x = y, then g (x) = g (y). Formally, the unary
function g gives rise to the following axiom: ∀x,y. x = y ⇒ g (x) = g (y). In other
words, function g always produces the same result when applied to the same
arguments [29, 58, 133].
g (select(store(a,c,12),SignExt(b,16) + 3)) ≠ g (SignExt(b,16) − c + 4)
∧ SignExt(b,16) = c − 3 ∧ c + 1 = d − 4
In order to sum SignExt(b,16)+3, subtract SignExt(b,16)−c and compare SignExt(b,16) =
c − 3, we have ﬁrst to expand the term SignExt(b,16) so that the resulting bit-vector,
say b′, extends b to the signed equivalent bit-vector of size 32 (i.e., SignExt (b,16) thus
extends b to the size w + 16, where w is the original width of the bit-vector b). After
expanding the term SignExt(b,16), we then obtain the following formula:
g (select(store(a,c,12),b′ + 3)) ≠ g (b′ − c + 4) ∧ b′ = c − 3 ∧ c + 1 = d − 4
Now the bit-vectors b′ and c have the same width. One way of checking the satisfiability
of this formula is to replace b′ by c − 3 throughout so that we obtain an equivalent
formula such as:
g (select(store(a,c,12),c − 3 + 3)) ≠ g (c − 3 − c + 4) ∧ c − 3 = c − 3 ∧ c + 1 = d − 4
After using facts about bit-vector arithmetic, this formula can be rewritten as:
g (select(store(a,c,12),c)) ≠ g (1) ∧ c − 3 = c − 3 ∧ c + 1 = d − 4
Finally, the theory of arrays implies that the select/store functions reduce the arguments
of function g (select(store(a,c,12),c)) to g (12) and the formula becomes:
g (12) ≠ g (1) ∧ c − 3 = c − 3 ∧ c + 1 = d − 4
Consequently, the formula above is satisfiable since there is an assignment to the bit-vectors
c (e.g., c = 5) and d (e.g., d = 10), together with an interpretation of g that maps 12 and 1
to different values, such that the first (g (12) ≠ g (1)), second
(c − 3 = c − 3) and third (c + 1 = d − 4) conjuncts hold.
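The satisfying assignment can be confirmed by evaluating the original formula directly. The following Python sketch is a hand-written check, not a call to an SMT solver: arrays are modelled as dictionaries, 32-bit arithmetic as integer arithmetic modulo 2^32, and g is fixed to the identity function, which is just one of many interpretations with g (12) ≠ g (1). All names, and the default value 0 for unwritten array cells, are our own assumptions.

```python
MASK32 = (1 << 32) - 1

def sign_ext16_to_32(b):
    """Sign-extend a 16-bit pattern to a 32-bit pattern (SignExt(b,16))."""
    b &= 0xFFFF
    return (b | 0xFFFF0000) if (b & 0x8000) else b

def store(a, i, v):
    """Return a copy of array a in which index i holds v."""
    updated = dict(a)
    updated[i] = v
    return updated

def select(a, i):
    """Read index i of array a (unwritten cells default to 0 by assumption)."""
    return a.get(i, 0)

def g(x):
    """One possible interpretation of the uninterpreted function g."""
    return x

def formula(a, b, c, d):
    """Evaluate the three conjuncts of the running example."""
    bp = sign_ext16_to_32(b)
    lhs = g(select(store(a, c, 12), (bp + 3) & MASK32))
    rhs = g((bp - c + 4) & MASK32)
    return (lhs != rhs
            and bp == ((c - 3) & MASK32)
            and ((c + 1) & MASK32) == ((d - 4) & MASK32))

# With c = 5 and d = 10, the second conjunct forces b' = 2, hence b = 2.
satisfied = formula({}, 2, 5, 10)
```

Changing b to any other value violates the conjunct b′ = c − 3, so the formula then evaluates to false.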
2.1.4 Linear-time Temporal Logic
Linear-time temporal logic, or simply LTL, is a commonly used speciﬁcation logic in
bounded model checking [22, 94, 101], which extends propositional logic (discussed in
Subsection 2.1.1) by including temporal operators. It models time by means of a se-
quence of states (denoted by si ∈ S, where i indicates a state in a given time step and
S is the set of states), or computation path (henceforth called π), extending inﬁnitely
into the future (hence the term “linear”, which means that at each state in time there
is a single successor state). In LTL, we are thus able to specify properties of the type
“for some state on the path” or “for every two consecutive states”.
Deﬁnition 2.13. The syntax of LTL is deﬁned over a set of atomic propositions, logical
operators and temporal operators as follows:
φ ::= ⊤ | ⊥ | p | ¬φ | φ1 ∧ φ2 | φ1 ∨ φ2 | φ1 ⇒ φ2
| Xφ | Fφ | Gφ | Aφ | φ1Uφ2 | φ1R φ2
The symbols ⊤ and ⊥ are atoms and represent true and false respectively (as described
in Subsection 2.1.1). The logical operators include negation (¬), conjunction (∧), dis-
junction (∨) and implication (⇒). The temporal operators are “next state” (X), “some
future state (eventually)” (F), “all future states (globally)” (G), “along all computation
paths” (A), “until” (U) and “release” (R). An LTL formula can be evaluated over a com-
putation path π (i.e., π = s1 → s2 → ... → sn) or over a set of states. LTL formulae are
thus of two kinds: computation path and state formulae. The intuitive interpretation
of the operators X, G, F, U and R over computation path formulae is as follows:
• X φ means that φ has to hold at the neXt point in time.
• F φ means that φ has to hold at some point in the Future.
• G φ means that φ has to hold Globally (at all future points).
• ψ U φ means that ψ has to hold continuously Until φ holds.
• ψ R φ means that φ has to remain true up to and including the moment when ψ
ﬁrst becomes true; if ψ never becomes true, φ must remain true forever; ψ Releases
φ.
and the interpretation of the operator A over state formulae is as follows:
• A φ means that φ has to hold along All computation paths.
The operators X, F, G and A are unary, so that X φ, F φ, G φ and A φ are well-formed
formulae whenever φ is a well-formed formula. The operators U and R are binary, so
that ψ U φ and ψ R φ are well-formed formulae whenever both ψ and φ are well-formed
formulae. We omit the W operator because R and W are actually quite similar; the
diﬀerences are that they swap the roles of ψ and φ, and the clause for W has an i − 1
where R has i (see below the satisfaction relation of the LTL formulae). Figure 2.2
shows the informal semantics of the LTL operators so that each operator is shown in a
computation path π, where each dot represents a state in time (e.g., s1,s2,s3,...).
[Figure 2.2 panels: (a) X operator, (b) G operator, (c) F operator, (d) U operator, (e) R operator]
Figure 2.2: LTL semantics for the operators X, G, F, U, and R (when ψ first becomes
true and when ψ never becomes true) over π [94].
Software systems are typically modelled by means of a state transition system M (also
called model).
Deﬁnition 2.14. A state transition system, denoted by M, is deﬁned by a triple (S,R,S0)
where S represents the set of states, R ⊆ S × S represents the set of transitions (i.e.,
pairs of states specifying how the system can move from state to state) and S0 ⊆ S
represents the set of initial states.
[Figure 2.3: state s0 labelled {a, b}, state s1 labelled {b, c}, state s2 labelled {c}]
Figure 2.3: Example of a Kripke structure (with deadlock) for states s0, s1, and s2
(where s2 has a transition back to itself).
In software systems, a state represents the assignment of values (e.g., Booleans, integers,
characters) to variables. The semantics of an LTL formula is then deﬁned along a
computation path π = s1 → s2 → ... → sn, which is a sequence of states over M. A
program thus deﬁnes the form of its states, the set of transitions between the states,
and the set of computations that it can potentially produce. The set of computations
of a program deﬁnes the program itself with the same precision of its source code.
Formally, let πi be a computation path π with a designated formula evaluation position
i. We assume a labelling (or interpretation) function L : S → 2^P that maps each
state to a subset of the set P of propositional variables. For example, the power set
of {a,b} is {∅,{a},{b},{a,b}} and L is just an assignment of truth values to all the
propositional variables, exactly as it was for the case of interpretation of PL formulae. To
help us define the semantics of LTL formulae, we then extend Definition 2.14 to include
the labelling function L : S → 2^P so that M now becomes a quadruple K = (S,R,S0,L),
which is called a Kripke structure.
Deﬁnition 2.15. A Kripke structure is a quadruple K = (S,R,S0,L) consisting of a
set of states S, a set of transitions R, a set of initial states S0 (as defined in Definition 2.14) and
a labelling function L : S → 2^P, which defines for each state s ∈ S the set L(s) of all
propositional variables that hold in s.
Note that we can construct an inﬁnite path in a Kripke structure and thus a deadlock
state (i.e., a state with a transition back to itself) might occur in K. Figure 2.3 shows
an example of representation of K, which consists of three states s0, s1 and s2 with
transitions s0 → s1, s0 → s2, s1 → s0, s1 → s2 and s2 → s2; and L(s0) = {a,b},
L(s1) = {b,c} and L(s2) = {c}.
Deﬁnition 2.16. Let K = (S,R,S0,L) be a model of our system and π be a path in
K. Whether π satisfies an LTL formula φ is formally defined by the
satisfaction relation π |= φ, which extends the satisfaction relation of PL formulae to
temporal operators, as follows:
πi |= p iff p ∈ L(si)
πi |= ¬φ iff πi ⊭ φ
πi |= φ1 ∧ φ2 iff πi |= φ1 and πi |= φ2
πi |= φ1 ∨ φ2 iff πi |= φ1 or πi |= φ2
πi |= X φ iff πi+1 |= φ
πi |= F φ iff πj |= φ for some j ≥ i
πi |= G φ iff πj |= φ for all j ≥ i
πi |= φ1 U φ2 iff ∃j ≥ i such that πj |= φ2 and πn |= φ1 for all i ≤ n < j
πi |= φ1 R φ2 iff for all j ≥ i : πj |= φ2 or πn |= φ1 for some i ≤ n < j
πi |= A φ iff π |= φ for all paths π starting in si
According to deﬁnition 2.16, the following LTL formulae hold in the transition system
of Figure 2.3:
• s0 |= A(a ∧ b)
• s1 |= A (bUc)
• s2 |= A G c
and the following LTL formulae do not hold in the transition system of Figure 2.3:
• s0 ⊭ A X (b ∧ c)
• s1 ⊭ A G c
• s2 ⊭ A G F a
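These six claims can be reproduced on the Kripke structure of Figure 2.3 with a few fixpoint computations over sets of states. The Python sketch below is an illustration under a stated assumption: for these particular formulae (A followed by X, G, U, or G F over propositional arguments) the state-set computation coincides with the path-based semantics, where A G F a is computed as AG applied to AF a. This shortcut does not carry over to arbitrary LTL formulae. All function names are our own.

```python
STATES = {"s0", "s1", "s2"}
R = {"s0": {"s1", "s2"}, "s1": {"s0", "s2"}, "s2": {"s2"}}   # transitions
L = {"s0": {"a", "b"}, "s1": {"b", "c"}, "s2": {"c"}}         # labelling

def sat(pred):
    """States whose label set satisfies the propositional predicate."""
    return {s for s in STATES if pred(L[s])}

def AX(target):
    """States all of whose successors lie in target."""
    return {s for s in STATES if R[s] <= target}

def AG(target):
    """Greatest fixpoint of Z = target ∩ AX(Z)."""
    z = set(STATES)
    while True:
        new = target & AX(z)
        if new == z:
            return z
        z = new

def AU(left, right):
    """Least fixpoint of Z = right ∪ (left ∩ AX(Z))."""
    z = set()
    while True:
        new = right | (left & AX(z))
        if new == z:
            return z
        z = new

def AF(target):
    """On all paths, target is eventually reached."""
    return AU(set(STATES), target)

a = sat(lambda lab: "a" in lab)
b = sat(lambda lab: "b" in lab)
c = sat(lambda lab: "c" in lab)

holds_1 = "s0" in (a & b)           # s0 |= A(a ∧ b)
holds_2 = "s1" in AU(b, c)          # s1 |= A(b U c)
holds_3 = "s2" in AG(c)             # s2 |= A G c
fails_1 = "s0" not in AX(b & c)     # s0 ⊭ A X (b ∧ c)
fails_2 = "s1" not in AG(c)         # s1 ⊭ A G c
fails_3 = "s2" not in AG(AF(a))     # s2 ⊭ A G F a
```

The third failing case, for example, follows because the only path from s2 loops on s2 forever, and a does not hold there.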
As an example of how LTL is used to specify properties, consider the classic mutual
exclusion problem in which two threads, say T1 and T2, cannot have simultaneous access
to a common resource CR. Thread Ti is essentially modelled by three locations as
follows:
1. the noncritical section (i.e., a section that does not need exclusive access to CR);
2. the waiting phase, which is entered when the thread intends to enter the critical
section, i.e., access CR; and
3. the critical section (i.e., a section that accesses CR that must not be concurrently
accessed by more than one thread).
Let the propositions c1 and c2 denote that threads T1 and T2, respectively, are in their critical
sections. The safety property stating that T1 and T2 never simultaneously have access to
their critical sections (i.e., at most one thread is in critical section at any time) can be
described by the following LTL formula:
AG(¬c1 ∨ ¬c2) (2.10)
This formula expresses that, in every state along every path π, at least one of the two
threads is not in its critical section (expressed by ¬ci).
2.2 Bounded Model Checking of Software
This section presents the formulation of the BMC technique, an overview of complete-
ness methods to prove properties in the BMC framework, and describes typical BMC
architectures used in software veriﬁcation. It also compares the BMC technique to other
state-of-the-art software veriﬁcation techniques that are currently used in practice.
2.2.1 Formulation
Bounded model checking (BMC) has been successfully applied to verify software systems
and has discovered subtle errors in commercial products. The idea of BMC is to unwind the
program and the correctness properties k times, and generate a propositional formula
that is satisﬁable if and only if a counterexample of size k (or smaller) exists [25].
However, the technique is not complete because there might still be a counterexample
that is longer than k. Completeness can only be ensured if we know an upper bound
on the depth of the state space, i.e., if we can ensure that we have already explored
all the relevant behaviour of the system, and searching any deeper only exhibits states
that have already been checked. In BMC of software, the bound k limits the number
of loop iterations and recursive calls occurring in the program. BMC thus analyzes
only bounded program runs and thereby achieves decidability since software veriﬁcation
in general is undecidable due to inﬁnite program runs (e.g., in reactive or interactive
software systems).
Formally, given a temporal logic property φ to be veriﬁed on a ﬁnite transition system
M (cf. Deﬁnition 2.14), BMC unwinds the system k times and translates it into a
veriﬁcation condition ψ such that ψ is satisﬁable if and only if φ has a counterexample
(i.e., a behaviour which falsiﬁes the property φ) of depth less than or equal to k. The
propositional problem associated with SAT-based BMC is formulated by constructing
the following formula [25]:
$$\psi_k = I(s_0) \wedge \bigwedge_{i=0}^{k-1} R(s_i, s_{i+1}) \wedge \neg\phi_k \qquad (2.11)$$
Here, φk represents a safety property φ in step k, I is the set of initial states of M,
R(si,si+1) is the transition relation of M at time steps i and i + 1. Hence, the formula
$\bigwedge_{i=0}^{k-1} R(s_i, s_{i+1})$ represents the set of all executions of M of length k. ¬φk represents
the condition that φ is violated in state k, which is reached by a bounded execution of
M of length k. Finally, the resulting (bit-vector) formula is translated to conjunctive
normal form in linear time and passed to a SAT solver for checking satisﬁability. Formula
(2.11) can be used to check safety properties [149]. Liveness properties (e.g., starvation,
deadlock) that contain the LTL operator F are checked by encoding ¬φk in a loop within
a bounded execution of length at most k, such that φ is violated on each state in the
loop. In this case, Formula (2.11) can be rewritten as:
$$\psi_k = I(s_0) \wedge \bigwedge_{i=0}^{k-1} R(s_i, s_{i+1}) \wedge \left( \bigvee_{i=0}^{k} \neg\phi_i \right) \qquad (2.12)$$
where φi is the propositional variable φ at time step i. Thus, this formula can be satisﬁed
if and only if for some i (i ≤ k) there exists a reachable state at time step i in which φ
is violated.
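To make Formula (2.12) concrete, the following Python sketch searches for a bounded counterexample by enumerating all state sequences of length k + 1 over a small finite state set. This is only an illustration under stated assumptions: a real BMC tool hands the formula to a SAT or SMT solver instead of enumerating paths, and the function name bmc as well as the example system (a counter modulo 6 with the property "the counter is never 3") are hypothetical.

```python
from itertools import product


def bmc(states, init, trans, prop, k):
    """Look for s_0..s_k with I(s_0), R(s_i, s_{i+1}) for all i < k, and
    not prop(s_i) for some i <= k, as in the disjunctive form (2.12).
    Returns the path up to the first violation, or None if none exists."""
    for path in product(states, repeat=k + 1):
        if not init(path[0]):
            continue
        if not all(trans(path[i], path[i + 1]) for i in range(k)):
            continue
        for i, s in enumerate(path):
            if not prop(s):
                return path[: i + 1]
    return None


# Hypothetical example system: a counter that starts at 0 and counts modulo 6.
STATES = range(6)


def init(s):
    return s == 0


def trans(s, t):
    return t == (s + 1) % 6


def prop(s):
    return s != 3


print(bmc(STATES, init, trans, prop, 2))  # bound too small to reach the violation
print(bmc(STATES, init, trans, prop, 3))  # counterexample of depth 3
```

The first call illustrates the incompleteness discussed above: the violation exists, but it lies beyond the bound k = 2.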
Deﬁnition 2.17. Let M be a transition system. A state s ∈ S is called a reachable
state in M if there exists a ﬁnite sequence of state transitions starting from an initial
state s0 and ending in state s, i.e., $s_0 \xrightarrow{R_0} s_1 \xrightarrow{R_1} \cdots \xrightarrow{R_n} s_n = s$,
where $s_0 \xrightarrow{R_0} s_1$ denotes a state transition by applying R0.
However, in software veriﬁcation, the more common application of BMC relies on check-
ing safety properties that contain the LTL operator G. They are typically formalized
using assert statements that encode the properties that have to hold at the respective
location. The safety properties in single- and multi-threaded programs typically check
for out-of-bounds array indexing, NULL-pointer dereferencing, memory leaks, data race,
atomicity and order violations, and arithmetic overﬂow.
2.2.2 Veriﬁcation Conditions
BMC analyzes only bounded program runs, but generates veriﬁcation conditions (VCs)
that reﬂect the exact path in which a statement is executed, the context in which a given
function is called, and the bit-accurate representation of the expressions. A veriﬁcation
condition is a logical formula (constructed from the bounded program and desired cor-
rectness properties) whose validity implies that the program’s behaviour agrees with its
specification [12, 29, 74, 109]. Correctness properties in programs can be specified by the
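$$\psi := \left[ \begin{array}{l}
a_1 = N \\
\wedge\; a_2 = ite(x[0] > 1,\, a_1 - 1,\, a_1) \\
\wedge\; a_3 = ite(x[1] > 1,\, a_2 - 1,\, a_2) \\
\wedge\; a_4 = ite(x[2] > 1,\, a_3 - 1,\, a_3) \\
\wedge\; \ldots \\
\wedge\; a_{N+1} = ite(x[N-1] > 1,\, a_N - 1,\, a_N) \\
\wedge\; \neg(a_{N+1} \leq N)
\end{array} \right] \qquad (2.13)$$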
The ternary operator f ? t1 : t2 shown in Figure 2.4(b) is converted into the condi-
tional expression ite(f,t1,t2) that takes as its ﬁrst argument the Boolean formula f and
depending on its value selects either the second (i.e., t1) or the third argument (i.e.,
t2). In order to verify that the assertion (a <= N) holds, its negation is added to ψ
and we check whether the entire formula is satisfiable using an off-the-shelf SMT solver.
As described in Section 2.1, Formula (2.13) can simply be represented as a Boolean
logic circuit, which can further be transformed into an (equisatisfiable) CNF formula over
propositional variables by Tseitin's transform [172] in linear time and by introducing at
most a linear number of fresh variables. However, checking the validity of a first-order
logic formula in a given background theory is an NP-complete problem [145].
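The structure of Formula (2.13) can be reproduced with a small Python sketch that evaluates the chain of ite constraints for every input array over a tiny value domain and reports a witness if the negated assertion can be satisfied. Several simplifications are assumed here: the search is a brute-force enumeration rather than an SMT query, Python integers are unbounded (so bit-vector overflow is not modelled), and the names ite and psi_satisfiable as well as the optional bound parameter are hypothetical.

```python
from itertools import product


def ite(f, t1, t2):
    """Conditional expression: t1 if f holds, otherwise t2."""
    return t1 if f else t2


def psi_satisfiable(n, domain, bound=None):
    """Return an array x that satisfies Formula (2.13) for N = n, or None.
    `bound` replaces N in the assertion a <= N (defaults to n)."""
    if bound is None:
        bound = n
    for x in product(domain, repeat=n):
        a = n                              # a_1 = N
        for i in range(n):                 # a_{i+2} = ite(x[i] > 1, a_{i+1} - 1, a_{i+1})
            a = ite(x[i] > 1, a - 1, a)
        if not (a <= bound):               # negated assertion
            return x
    return None


print(psi_satisfiable(3, range(4)))           # assertion a <= N cannot be violated
print(psi_satisfiable(3, range(4), bound=2))  # a stronger assertion a <= 2 can be
```

The first call finds no witness, so the formula is unsatisfiable over this domain and the assertion holds; the second call shows what a counterexample looks like when the asserted bound is too tight.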
2.2.3 Completeness
Bounded model checking can be used to ﬁnd property violations up to the bound k but
not to prove properties, unless an upper bound is known on the depth of the state space,
which is not generally the case. For software veriﬁcation, we can adopt two diﬀerent
strategies in order to prove properties: (i) compute the completeness threshold, which
can be smaller than or equal to the maximum number of loop-iterations occurring in
the program or (ii) determine the high-level worst-case execution time (WCET), which
also gives a bound on the maximum number of loop-iterations [24, 45, 72]. However, in
practice, complex software systems involve large data-paths and complex expressions.
Therefore, the veriﬁcation conditions that arise from BMC of programs become harder
to solve and require substantial amounts of memory to build.
2.2.3.1 Craig Interpolation
One feasible alternative to prove properties in BMC is to compute the Craig interpolants
for inconsistent pairs (or more generally, sets) of formulae [125, 126, 127, 128]. This
alternative approach exploits the SAT/SMT solvers’ ability to produce refutations, i.e.,
proofs that there is no counter-example of depth less than or equal to k. This proof does
not by itself establish whether a given property holds in the model, but it contains
information about the reachable states of the model.
Deﬁnition 2.18. Given a pair of formulae (A,B), and a proof by resolution for (A,B),
an interpolant for (A,B) is a formula F with the following properties [125, 126]:
• A ⇒ F
• F ∧ B is unsatisﬁable
• F refers only to the common variables of A and B
As an example, consider A = (x1 ∧ x2) and B = (¬x2 ∧ x3). Given that (x1 ∧ x2) must
imply F (or simply that ¬x1 ∨ ¬x2 ∨ F holds) and F ∧ ¬x2 ∧ x3 must be unsatisfiable,
one possible interpolant for the given pair of formulae (A,B) is F = x2, since x2 is also
common to both A and B.
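The first two conditions of Definition 2.18 can be checked for this example by enumerating all truth assignments, as in the following Python sketch. The function names A, B and F simply mirror the formulae above; the third condition (F refers only to common variables) is visible by inspection, since F mentions only x2.

```python
from itertools import product


def A(x1, x2, x3):
    return x1 and x2


def B(x1, x2, x3):
    return (not x2) and x3


def F(x1, x2, x3):
    # Candidate interpolant: refers only to x2, which occurs in both A and B.
    return x2


assignments = list(product([False, True], repeat=3))

# A => F: every assignment that satisfies A also satisfies F.
a_implies_f = all(F(*v) for v in assignments if A(*v))

# F and B is unsatisfiable: no assignment satisfies both.
f_and_b_unsat = not any(F(*v) and B(*v) for v in assignments)

print(a_implies_f, f_and_b_unsat)
```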
The use of interpolants allows us to deﬁne a complete method for ﬁnite-state reachability
analysis based on SAT and SMT solvers. In order to show how BMC and interpolation
can be combined, we refer to Section 2.2.1 where we deﬁne the Formula (2.11) and the
terms I, R, and φ. Now suppose that Q = I and we partition Formula (2.11) so that the
set of initial states I and the ﬁrst instance of the transition relation R are in set A, while
the remaining instances of R and the property φ are in set B as shown in Figure 2.5
(note that k is unknown).
Figure 2.5: Computing image by interpolation [125].
Suppose that we use an SMT solver to prove that A ∧ B is unsatisfiable, i.e., we use
an SMT solver to conclude that there is no satisfying assignment to A ∧ B.¹ The internal
steps performed by the SMT solver for reaching this conclusion can be used to construct
a proof of unsatisfiability Π. From this proof, we can derive an interpolant F for the pair
of formulae (A,B), i.e., F = interpolant(Π,A,B). According to Definition 2.18, A must
imply F and since we defined A to be the set of initial states and the first instance of R
(i.e., from Figure 2.5, A = s0 ∧ s1), it follows that F is true in every state reachable from
the initial state in one step. In other words, we can say that F is an over-approximation
of the forward image of I [125, 126]. Also according to Definition 2.18, the formula F ∧ B
must be unsatisfiable (from Figure 2.5, B = s2 ∧ s3 ∧ ... ∧ sk), which means that there
is no state satisfying F that can reach a final state sk. After computing the interpolant
F, we then check whether F implies Q. If F implies Q, then no reachable state can
violate the property φ and we can thus conclude that the property holds. However, since
F is an over-approximation, we can falsely conclude that the final state is reachable. In
this case, we update Q = F ∨ Q and A = F ∧ R0, increase the value of k by one, and check
whether A ∧ B is unsatisfiable. If A ∧ B is satisfiable, we have found a valid counter-
example (i.e., a path from the initial state to the final state). Otherwise, we compute
the interpolant F = interpolant(Π,A,B) again and check whether F implies Q. We
stop this procedure when we have found a valid counter-example or have proved that
the final state is not reachable (i.e., the property holds). The details of the algorithm
and further information about the use of interpolants in model checking can be found
in [125, 126, 127, 128].
¹ Note that if at any stage we can violate the property φ within k steps from the initial state, then we
have found a counterexample.
2.2.3.2 K-Induction
Another feasible alternative to prove properties in BMC is to compute invariants by
means of induction [162, 66]. The k-induction method has been successfully applied to
verify hardware designs (represented as ﬁnite state machines) using a SAT solver, but
the ﬁrst attempts to apply this technique to software are only very recent [63]. In order
to present the k-induction method, we use the notation of [63, 66], which describes the
principle via temporal induction (i.e., the induction is carried out over the time steps of
the ﬁnite state machines). The simplest form of k-induction consists of two steps: the
base-case and the induction-step. Let I (s) and R(s,s′) encode the set of initial states
and transition relation of the ﬁnite transition system M, and let P (s) denote states
satisfying a safety property φ (recall Deﬁnition 2.14). The strengthened induction, as
proposed in [66], is then deﬁned by the following formulae:
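$$\begin{array}{rcl}
Base_k &=& I(s_0) \wedge R(s_0,s_1) \wedge \ldots \wedge R(s_{k-1},s_k) \wedge (\neg P(s_0) \vee \ldots \vee \neg P(s_k)) \\
Step_k &=& P(s_1) \wedge R(s_1,s_2) \wedge \ldots \wedge P(s_k) \wedge R(s_k,s_{k+1}) \wedge \neg P(s_{k+1})
\end{array} \qquad (2.14)$$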
The intuitive interpretation of these two formulae is as follows: in the base-case, we
aim to check that P holds in all states reachable from an initial state within k steps (we
assume that k ≥ 0) and in the induction-step, we aim to check that whenever P holds in
k consecutive states s1,...,sk, P also holds in the next state sk+1 of the system. In both
cases, we check whether formulae Basek and Stepk, as described above, are unsatisfiable.
An algorithm can then be devised from these two formulae, which unwinds the system
design incrementally and checks whether Basek is satisfiable or Stepk is unsatisfiable in
order to determine termination. In particular, if Basek turns out to be satisfiable in time
step k, then we have found a violation of the property. If Stepk is unsatisfiable in time
step k (and Basek is unsatisfiable), then the property holds.
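The incremental algorithm built from Basek and Stepk can be sketched in Python for a small explicit transition system. This is a minimal illustration under stated assumptions: satisfiability is decided by enumerating state sequences instead of calling a SAT solver, and the helper names (paths, base_sat, step_sat, k_induction) and the four-state example system are hypothetical. The example is chosen so that plain induction (k = 1) fails because of an unreachable state, whereas k = 2 succeeds.

```python
from itertools import product


def paths(states, trans, length):
    """All sequences of `length` states linked by the transition relation."""
    for p in product(states, repeat=length):
        if all(trans(p[i], p[i + 1]) for i in range(length - 1)):
            yield p


def base_sat(states, init, trans, prop, k):
    # Base_k: a path s_0..s_k from an initial state on which P fails somewhere.
    return any(init(p[0]) and not all(prop(s) for s in p)
               for p in paths(states, trans, k + 1))


def step_sat(states, trans, prop, k):
    # Step_k: P holds in s_1..s_k but fails in s_{k+1}.
    return any(all(prop(s) for s in p[:-1]) and not prop(p[-1])
               for p in paths(states, trans, k + 1))


def k_induction(states, init, trans, prop, max_k):
    for k in range(max_k + 1):
        if base_sat(states, init, trans, prop, k):
            return ("violated", k)
        if k >= 1 and not step_sat(states, trans, prop, k):
            return ("holds", k)
    return ("unknown", max_k)


# Hypothetical system: 0 <-> 1 is reachable; 2 -> 3 -> 3 is not reachable from 0.
STATES = range(4)
EDGES = {(0, 1), (1, 0), (2, 3), (3, 3)}


def init(s):
    return s == 0


def trans(s, t):
    return (s, t) in EDGES


print(k_induction(STATES, init, trans, lambda s: s != 3, 5))  # proved with k = 2
print(k_induction(STATES, init, trans, lambda s: s != 1, 5))  # violated within 1 step
```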
2.2.4 BMC Architecture
Here, we overview typical BMC architectures used in software veriﬁcation, focusing
on the most prominent example, the C Bounded Model Checker (CBMC) [42, 41, 106].
The CBMC tool implements the BMC technique for ANSI-C/C++ programs using SAT
solvers. CBMC can process C/C++ code using the goto-cc tool [179], which compiles
the C/C++ code into equivalent GOTO-programs (i.e., control-ﬂow graphs) using a
gcc-compliant style. The GOTO-programs can then be processed by the symbolic exe-
cution engine. Alternatively, CBMC uses its own, internal parser based on Flex/Bison,
to process the C/C++ ﬁles and to build an abstract syntax tree (AST). The type-
checker of the CBMC’s front-end annotates this AST with types and generates a symbol
table. CBMC's IRep class then converts the annotated AST into an internal, language-
independent format used by the remaining phases of the CBMC front-end.
CBMC derives the VCs using two recursive functions that compute the assumptions or
constraints (i.e., variable assignments) and properties (i.e., safety conditions and user-
defined assertions).² CBMC's VC generator (VCG) automatically generates safety
conditions that check for arithmetic overﬂow and underﬂow, array bounds violations,
and null-pointer dereferences. Both functions accumulate the control ﬂow predicates
to each program point and use that to guard both the constraints and the properties,
so that they properly reﬂect the program’s semantics. Figure 2.6 shows the CBMC
architecture.
Figure 2.6: The CBMC Architecture (components shown: C/C++ source, parse tree, IRep tree, properties, BMC verification condition, SAT solver).
Although CBMC implements several state-of-the-art techniques for propositional BMC,
it still has the following limitations [11, 71]: (i) large data-paths involving complex
expressions lead to large propositional formulae due to the number of variables and the
width of data types, (ii) high-level information is lost when the VCs are converted into
propositional logic, and (iii) the size of the encoding increases with the size of the arrays
used in the program.
As an example of the veriﬁcation process supported by CBMC, Figure 2.7 shows a
syntactically valid C program that writes accidentally to an address outside the allocated
² Section 2.2.2 shows in a nutshell how to construct logical formulae (or VCs) from a program and
desired correctness properties (for further references, we refer the reader to [8, 26, 42, 106]).
proach called “dynamic” symbolic execution, which extends “static” symbolic execution
by exploiting concrete execution paths to obtain symbolic constraints. The basic idea
of dynamic symbolic execution is to explore diﬀerent execution paths by selecting and
negating a given branch condition from the symbolic traces. After performing this mod-
iﬁcation, the resulting path condition is encoded using the background theories that
are typically supported by SMT solvers and checked for satisﬁability. If the modiﬁed
path condition is satisﬁable, then the SMT solver provides a satisfying assignment that
can be used to guide the execution through new paths. In another related work, Sen
proposes an approach to execute a program concretely and symbolically by combining
random testing and symbolic execution [160]. Both approaches, however, might fail to
compute concrete values that satisfy a given (large) path constraint (which might
involve complex expressions) due to the solver performance.
Recently, a number of static checkers have been developed that trade oﬀ scalability
and precision. PREﬁx is a static program analysis tool that integrates an SMT solver to
perform bit-precise static analysis [34]. PREﬁx has been developed and used at Microsoft
to analyze large C/C++ programs. Although PREﬁx could detect several software bugs
related to arithmetic overﬂow in the Microsoft products, it may also detect false positive
arithmetic overflow bugs as pointed out in [26]. Calysto [13] and Saturn [180] are also
representative examples of static checkers that employ SAT/SMT solvers as back-ends
to solve the verification conditions. These tools, however, do not support fixed-point
operations and are not able to detect buffer overflow bugs (which is the number one issue
as reported in [3]), because they unsoundly approximate loops by unwinding them only
once or twice. As a consequence of this decision, soundness is evidently relinquished for
performance gains.
In extended static checking, a veriﬁcation condition generator (VCG) is used to con-
vert code annotated with “contracts” into logical formulas. The contracts consist of a
pre-condition assumption inserted at some location in the program that speciﬁes how a
procedure may be called, a post-condition assertion that speciﬁes the resulting state of
a procedure call (i.e., speciﬁes a property that has to hold at the respective location),
and a loop invariant that speciﬁes properties of intermediary system state. The Spec#
programming system is a good example of a tool that integrates contracts for extended
type safety [17]. Spec# uses the low-level procedural language of Boogie [18] to generate
the VCs and the SMT solver Z3 [57] to check the validity of these VCs. The development
of Boogie and Spec# was essentially inspired by the experiences obtained with
the extended static checker ESC/Java [62]. However, in contrast to Spec#, ESC/Java
employs the Simplify theorem prover [62] to verify user-supplied invariants and thus
important constructs of the programming language (e.g., bitwise operations) are often
encoded imprecisely using axioms and uninterpreted functions.
Explicit-state model checking is an automated technique that, given a model and a
property, systematically checks whether this property holds for a given state in that
model [14]. It manipulates each state individually as opposed to symbolic model checking,
which implicitly manipulates large sets of states (by applying data structures such as
BDDs or SAT/SMT procedures). State space reduction techniques such as partial-order
reduction thus take advantage of the explicit-state model checking technique because
it is much easier to capture and exploit transitions that are independent with respect
to individual states than for a set of states [103]. In this scenario, explicit state model
checkers for concurrent programs have been widely used to verify large designs that arise
from the industry.
One of the most robust explicit state model checkers is Spin [90, 91], which is able to
verify software models using a high-level specification language called Promela (Spin
also supports the use of embedded C code as part of the Promela code to verify
low-level software directly). Spin implements a number of advanced optimization techniques
to tackle the state explosion by using a compact representation of the search space
and to reduce the number of interleavings by means of partial-order reduction. The
main state compression techniques implemented in Spin include collapse compression (to
avoid replicating a complete description of all local components of the system state) and
bitstate hashing (to store a single bit at the slot indexed by the hash number of the state
to memorize whether the corresponding state has been explored). Additionally, Spin
exploits the use of multi-core computers to leverage parallelism in very large veriﬁcation
models.
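The bitstate hashing idea described above can be sketched in a few lines of Python. This is not Spin's implementation: the class name BitstateTable, the use of zlib.crc32 as the hash function, the string encoding of states and the table size are all assumptions chosen for illustration.

```python
import zlib


class BitstateTable:
    """Stores one bit per hash slot instead of the full state description."""

    def __init__(self, bits):
        self.bits = bits
        self.table = bytearray((bits + 7) // 8)

    def _slot(self, state):
        # Assumed hash: CRC32 of the textual representation of the state.
        return zlib.crc32(repr(state).encode()) % self.bits

    def visit(self, state):
        """Mark the state as explored; return True if its slot was already set."""
        slot = self._slot(state)
        byte, mask = slot // 8, 1 << (slot % 8)
        seen = bool(self.table[byte] & mask)
        self.table[byte] |= mask
        return seen


table = BitstateTable(1024)               # 1024 bits, i.e., 128 bytes of memory
first = table.visit(("TA0", "TB0"))       # slot not yet set
second = table.visit(("TA0", "TB0"))      # same state hashes to the same slot
print(first, second, len(table.table))
```

Because distinct states may hash to the same slot, such a table can report an unexplored state as already explored, so the search trades coverage for memory.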
Java Pathﬁnder (JPF) is another widely used explicit state model checker, which targets
efficient Java bytecode verification [176]; the latest version of JPF also supports symbolic
model checking of Java bytecode [185]. JPF implements a set of techniques such as
backtracking (to ﬁnd diﬀerent possible execution paths that have not been explored),
state matching (to check whether every new state has already been explored), partial
order reduction (to reduce the number of thread interleavings) and conﬁgurable search
strategies (to use heuristics to order and ﬁlter the set of states according to the property
being checked). JPF is able to check properties related to data races, deadlocks,
heap bounds, unhandled exceptions (e.g., null-pointer exceptions) and user-specified
assertions arising from (concurrent) Java programs. For a recent survey on software model
checking we refer the reader to [100].
2.3 Veriﬁcation of Multi-threaded Systems
Multi-threaded software is typically difficult to validate with testing methods, mainly
for two reasons: the non-deterministic executions of the program and the potentially
large state space. On the one hand, as mentioned in Chapter 1, traditional validation
of multi-threaded software aims to test all possible interleaving sequences at the cost
of overloading the system without ensuring complete coverage. On the other hand,
model checking multi-threaded software can guarantee complete coverage at the cost
of generating an extremely large state space.
In Section 2.2, we introduced the notion of transition systems and we have shown the
model checking problem associated with BMC of single-threaded programs. In BMC of
multi-threaded programs, we still have the same notion of states (i.e., the assignment of
values to variables) as described in the sequential case, but here we must now consider the
interleavings of transitions of diﬀerent threads. In particular, a multi-threaded program
contains a number of threads that execute in parallel and the execution of a thread is
scheduled in a non-deterministic way by a global system scheduler, as will be explained
in the next section. For now, we informally assume that the operational behaviour of
the threads that run in parallel is given by transition systems M1,...,Mn (recall
Definition 2.14). We can then define a transition system $M_t = \parallel_{j=0}^{n} M_j$ that specifies
the behaviour of the parallel composition of transition systems M1 through Mn.
This section describes mechanisms to model multi-threaded systems by means of tran-
sition systems composed from diﬀerent individual threads; further information can be
found in textbooks [14, 40]. This then allows us to encode explicitly the interleaving
model into the BMC framework to model check multi-threaded programs.
2.3.1 Concurrency and Interleaving
There are two modes of concurrent execution: asynchronous and synchronous. In the
asynchronous mode, which we consider, only one thread can make progress at a time,
whereas in the synchronous mode all threads can run at the same time. Threads in
asynchronous mode can communicate via message passing or shared variables. In the
message passing model, threads can send/receive messages (comprising zero or more
bytes, data structures, or even segments of code) to/from other threads. In the shared
variable model, a region of memory may be simultaneously accessed by multiple threads
in order to provide communication among them.
Thread synchronization or serialization (e.g., via mutual exclusion or condition variables)
ensures that multiple threads do not access speciﬁc regions of memory at the same time.
This means that if one thread started to access a region of memory, any other thread
trying to access this region must wait until the ﬁrst thread ﬁnishes. This work considers
multi-threaded programs with asynchronous mode and assumes that the threads in the
program only communicate through shared (global) variables and synchronize to avoid
the simultaneous access to shared variables. Note that this assumption also applies
to the veriﬁcation of software in multi-core systems since asynchronous operation is a
standard solution to avoid contention for memory in multi-core processors [60].
A widely adopted paradigm for multi-threaded programs is that of interleaving.⁵ In
this paradigm, an interleaving sequence represents a possible execution of the program
where all of the concurrent events are arranged in a linear order. Thus, the notion of
concurrency is represented by that of interleaving, that is, the non-deterministic choice
between activities of the simultaneously acting threads. This perspective is based on the
fact that only one core is available on which the actions of the threads are interleaved.
From the modelling point of view, this concept also applies if the threads run on different
cores. In both cases (single-core or multi-core), there are many interleaving sequences
with different orderings between concurrent events.
⁵ The definition of interleaving is based on the notion of the asynchronous mode, i.e., only one thread
is executed at a given time.
The interleaving representation of concurrency depends on a scheduler, which interleaves
the steps of concurrently executing threads according to a given strategy. This type of
representation completely abstracts from the speed of the participating threads and thus
models any possible realization by a single-core machine or by several cores with arbitrary
speeds. From the veriﬁcation point of view, in order to fully verify a concurrent program
against a given speciﬁcation, all possible interleaving sequences must be considered. This
can result in an extremely large state space that must be explored by a model checker,
which in turn is the main source of the state explosion problem.
As a running example, consider the control-ﬂow graph (CFG) of two threads, say TA and
TB as shown in Figure 2.8, where variables a and b are declared as global. For each thread
Ti, its control-flow graph is a directed graph Ti = ⟨Ni, Ei, ni0⟩, where Ni is the set of
nodes that represent program statements, Ei is the set of edges that represent transitions
(i.e., saying how each thread Ti can move from node to node) and ni0 is the initial node.
In our example, thread TA = ⟨NA, EA, nA0⟩ where the nodes NA = {TA0,TA1,TA2,TA3},
the edges EA = (TA0 → TA1,TA1 → TA2,TA2 → TA3), and the initial node nA0 = TA0,
while thread TB = ⟨NB, EB, nB0⟩ where the nodes NB = {TB0,TB1,TB2,TB3}, the edges
EB = (TB0 → TB1,TB1 → TB2,TB2 → TB3), and the initial node nB0 = TB0.
Figure 2.8: The CFG representation of threads TA and TB (TA0 → TA1: a = 2 → TA2: a = a + (b/3) → TA3;
TB0 → TB1: b = 6 → TB2: b = b+3 → TB3). We assume that initially the global variables a and b are
set to zero, i.e., a = 0 and b = 0.
We say that a program statement is visible if it accesses a global variable, and it is
invisible otherwise. In our example, we consider that all program statements (i.e., a = 2,
a = a + (b/3), b = 6 and b = b + 3) are visible.
An interleaving represents a possible execution of the program where all of the concurrent
events are arranged in a linear order. Any change of the active thread in an interleaving
is a context switch. A program statement is considered to be atomic if no context switch
can happen during its execution. Statements that involve at most one global variable
are not aﬀected by context switches. In our example, program statements a = 2, b = 6
and b = b + 3 are atomic while the program statement a = a + (b/3) is not atomic,
because it is aﬀected by context switches.
The CFG that represents all possible interleaving sequences of threads TA and TB is
shown in Figure 2.9. The number of possible interleaving sequences I for a given number
of threads N, where thread i consists of s_i program statements, in a program without
loops can be computed as follows [176]:
$$I = \frac{\left( \sum_{i=1}^{N} s_i \right)!}{\prod_{i=1}^{N} (s_i!)} \qquad (2.15)$$
In our running example, we have N = 2, sA = 2 and sB = 2 and the number of possible
interleaving sequences is thus:
$$I = \frac{(2 + 2)!}{(2! \cdot 2!)} = \frac{24}{4} = 6 \qquad (2.16)$$
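Equation (2.15) is straightforward to evaluate mechanically. The following Python sketch computes it and cross-checks the result by counting the interleavings recursively (at each step, any thread with statements left may move next). The function names interleavings and count_by_enumeration are hypothetical.

```python
from math import factorial, prod


def interleavings(statement_counts):
    """Number of interleavings according to Equation (2.15)."""
    total = factorial(sum(statement_counts))
    return total // prod(factorial(s) for s in statement_counts)


def count_by_enumeration(statement_counts):
    """Count interleavings by choosing, at each step, which thread moves next."""
    if all(s == 0 for s in statement_counts):
        return 1
    total = 0
    for i, s in enumerate(statement_counts):
        if s > 0:
            rest = statement_counts[:i] + (s - 1,) + statement_counts[i + 1:]
            total += count_by_enumeration(rest)
    return total


# Running example: two threads with two statements each.
print(interleavings((2, 2)), count_by_enumeration((2, 2)))
```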
The transition system that represents the parallel execution of threads TA and TB is
shown in Figure 2.10. As we can see in Figure 2.10, two interleaving sequences lead to
the final state {a = 4,b = 9} and three lead to the final state {a = 5,b = 9}; within each
group, the order of execution of the threads in Figure 2.8 does not affect the final
state (i.e., the sequences result in the same state when executed in different orders) and so they
are equivalent interleaving sequences. Unfortunately, this observation no longer holds
for the example in Figure 2.8 once we consider context switches inside the
individual visible statements that involve more than one access to a global variable
(since threads TA and TB share the same global variable b). For example, the program
statement a = a+(b/3) in Figure 2.8 is thus broken into three different statements (see
nodes TA′2, TA′3, and TA′4 in Figure 2.11) so that a context switch may now occur between
these statements. In Chapter 4, we show how to break the visible program statements and
check for atomicity violations. In this new scenario, the number of possible interleaving
sequences increases from six (without considering context switches inside the individual
visible statements) to ﬁfteen as follows:
$$I = \frac{(4 + 2)!}{(4! \cdot 2!)} = \frac{720}{48} = 15 \qquad (2.17)$$
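The effect of splitting a = a + (b/3) can be examined by executing every interleaving of the two threads. The Python sketch below does this for the atomic version of Figure 2.8 and for the split version of Figure 2.11. It assumes C-style integer division (written // in Python, which agrees with C for the non-negative values used here) and that a and b are initially zero; the helper name run_all and the statement function names are hypothetical.

```python
from collections import Counter


def run_all(threads):
    """Execute every interleaving and count the final (a, b) states.
    Each thread is a list of functions that update a shared dict in place."""
    results = Counter()

    def explore(pcs, env):
        if all(pc == len(t) for pc, t in zip(pcs, threads)):
            results[(env["a"], env["b"])] += 1
            return
        for i, t in enumerate(threads):
            if pcs[i] < len(t):
                new_env = dict(env)          # copy so that branches stay independent
                t[pcs[i]](new_env)
                explore(pcs[:i] + (pcs[i] + 1,) + pcs[i + 1:], new_env)

    explore((0,) * len(threads), {"a": 0, "b": 0})
    return results


def a1(e): e["a"] = 2
def a2(e): e["a"] = e["a"] + e["b"] // 3          # atomic version
def a2_load(e): e["tmp1"] = e["a"]                # split version, step 1
def a2_div(e): e["tmp2"] = e["b"] // 3            # split version, step 2
def a2_store(e): e["a"] = e["tmp1"] + e["tmp2"]   # split version, step 3
def b1(e): e["b"] = 6
def b2(e): e["b"] = e["b"] + 3


atomic = run_all([[a1, a2], [b1, b2]])
split = run_all([[a1, a2_load, a2_div, a2_store], [b1, b2]])
print(sum(atomic.values()), dict(atomic))
print(sum(split.values()), dict(split))
```

Under these assumptions the atomic version yields 6 interleavings and the split version 15, in agreement with Equations (2.16) and (2.17); in both cases the final value of b is 9 and the final value of a is 2, 4 or 5, depending on the value of b that is read.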
Figure 2.9: The CFG that represents all possible interleaving sequences of threads
TA and TB.
However, in order to remove redundant interleaving sequences, partial order reductions
are usually applied to reduce signiﬁcantly the size of the traversed model (i.e., the
number of possible interleavings to be checked).
2.3.2 Partial Order Reduction Technique
The name Partial Order Reduction (POR) comes from the partial order model of program
execution [75]. According to this model, concurrently executed events are not ordered
and each partially ordered execution can correspond to multiple interleaving sequences.
In [146], the name model checking using representatives is proposed as a more accurate
description of partial order reduction, since the verification is carried out using representatives
from equivalence classes of the behaviours. POR techniques [14, 40, 49, 103, 131] aim
to prune the number of states that have to be searched by model checking algorithms.
This is done by removing interleaving sequences that lead to the same system state, i.e.,
it avoids exploring diﬀerent equivalent interleavings of the concurrent events.
As an example of the number of states to be searched, consider the parallel composition
of a number of threads T1 through Tn. The size of the state space to be explored,
which consists of the parallel composition of transition systems M1 through Mn (i.e.,
$M_t = \parallel_{j=0}^{n} M_j$), is exponential in the number n of threads and program statements. To
model check a simple LTL property of this system requires an inspection of all states
in the underlying transition system Mt. However, instead of constructing a full state
graph, which may be too large to fit in memory, POR techniques aim to build a reduced
state graph using only representatives from the equivalence classes of behaviours.
Figure 2.10: The transition system that represents the parallel execution of threads
TA and TB.
Figure 2.11: Model context switches inside individual visible statements (TA'0 → TA'1: a = 2 →
TA'2: tmp1 = a → TA'3: tmp2 = b/3 → TA'4: a = tmp1 + tmp2 → TA'5; TB'0 → TB'1: b = 6 →
TB'2: b = b+3 → TB'3).
Naturally, the POR techniques are best suited if applied to concurrent asynchronous
systems, because there we can exploit the commutativity of concurrently executed
independent events, i.e., events that result in the same state when executed in different
orders. In this scenario, POR is done in a way that if the property φ holds on the reduced
model, say M′ = (S′,R′,S0), it also holds on the original model M = (S,R,S0) (recall
Definition 2.14). This reduction is then based on the notion of independence relation
between transitions (I ⊆ R × R), which is defined as follows.
Definition 2.19. I ⊆ R × R is an independence relation if and only if for each α and
β, where (α,β) ∈ I, the following two conditions hold for all s ∈ S:
1. Transitions α and β may execute in either order from state s, i.e., if α is enabled
in s and $s \xrightarrow{\alpha} s'$, then β is enabled in s if and only if β is enabled in s′;
2. Executing the two transitions α and β in either order starting from state s leads to the
same state s′, i.e., if α and β are enabled in s, there is a unique state s′ such that
$s \xrightarrow{\alpha,\beta} s'$ and $s \xrightarrow{\beta,\alpha} s'$.
The intuitive interpretation of these two conditions is that (1) independent transitions
can neither disable nor enable each other (enabledness), and (2) executing them in either
order results in the same state (commutativity). The dependency relation D is simply
deﬁned as the complement of I, i.e., if two transitions α and β are not independent, then
they are dependent. The partial order reduction thus exploits the dependency relation
that exists between the transitions of the threads. From a pragmatic point of view,
two transitions α (related to thread T1) and β (related to thread T2) are said to be
independent of each other if and only if the execution of α and β in either order results
in the same global state. If interleaving sequences that differ only by such independently
executed events are indistinguishable by a specification, they are said to be equivalent.
It is thus sufficient to select only one interleaving sequence from each such equivalence class
as a representative to be checked against the specification by a model checking algorithm.
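As an illustration only, the two conditions of Definition 2.19 can be checked by brute force on a small, explicitly enumerated state space. The sketch below is in Python; the state layout (a pair of shared variables a and b), the helper make_transition and the three sample transitions are assumptions made for this example and are not taken from the thesis.

```python
from itertools import product

# Hypothetical model: a state is a pair (a, b) of shared variables and a
# transition is a guard ("enabled") plus an update ("apply").
def make_transition(guard, update):
    return {"enabled": guard, "apply": update}

alpha = make_transition(lambda s: True, lambda s: (2, s[1]))            # a = 2
beta = make_transition(lambda s: True, lambda s: (s[0], 6))             # b = 6
gamma = make_transition(lambda s: True, lambda s: (s[0] + s[1], s[1]))  # a = a + b

def independent(t1, t2, states):
    """Check both conditions of Definition 2.19 on a finite set of states."""
    for s in states:
        for x, y in ((t1, t2), (t2, t1)):
            if x["enabled"](s):
                # Enabledness: executing x neither enables nor disables y.
                if y["enabled"](s) != y["enabled"](x["apply"](s)):
                    return False
        if t1["enabled"](s) and t2["enabled"](s):
            # Commutativity: both execution orders reach the same state.
            if t2["apply"](t1["apply"](s)) != t1["apply"](t2["apply"](s)):
                return False
    return True

# A small sample of states used for the exhaustive check.
STATES = list(product(range(4), range(8)))
```

Here alpha and beta write different variables and are therefore independent, whereas gamma reads b and writes a, so it is dependent on both alpha and beta.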
Classic POR algorithms explore at each state s an adequate subset ample(s) of the
transitions enabled (the set of transitions enabled in s is denoted by enabled(s)). This
exploration has to respect a set of conditions based on Deﬁnition 2.19:
• Condition C0: ample(s) = ∅ iff enabled(s) = ∅.
• Condition C1: Along every path of the (full) state graph starting in s, a transition
that is dependent on a transition α in ample(s) must be preceded by α, i.e.,
transition α has to occur first.
• Condition C2: If ample(s) ≠ enabled(s), then each transition α in ample(s)
must be invisible w.r.t. property φ.
• Condition C3: If for each state s ∈ S of a cycle in the reduced model M′, a transition
α is enabled, then α must be in ample(s) for at least one state s of that cycle.
Conditions C0 to C3 are suﬃcient to guarantee that the resulting (reduced) model M′
preserves properties speciﬁed in LTL (described in Section 2.2) [14, 40].
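Conditions C0 and C2 are local to a single state and can be sketched directly. The following Python fragment is a hedged illustration: transitions are represented by plain labels in sets, the function names are invented for this example, and conditions C1 and C3 (which quantify over paths and cycles) are deliberately not covered.

```python
def satisfies_c0(ample, enabled):
    """C0: ample(s) is empty if and only if enabled(s) is empty."""
    return (not ample) == (not enabled)

def satisfies_c2(ample, enabled, visible):
    """C2: a proper reduction may contain only transitions invisible w.r.t. the property."""
    return ample == enabled or ample.isdisjoint(visible)

def locally_admissible(ample, enabled, visible):
    """ample(s) must be a subset of enabled(s) and satisfy C0 and C2."""
    return (ample <= enabled
            and satisfies_c0(ample, enabled)
            and satisfies_c2(ample, enabled, visible))
```

For instance, with enabled(s) = {a, b} and b visible, the candidate {a} passes these local checks while {b} does not.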
2.4 Summary
This chapter described the main concepts needed to understand this thesis. In Sec-
tion 2.1, Logical Foundations, we introduced PL syntax and semantics along with some
examples. We deﬁned a mechanism to check whether a given PL formula is true or false
by means of interpretations. In particular, we showed that given a PL formula and an
interpretation, the truth value of a formula can be computed by a truth table (most
commonly used for evaluating PL formulae) or by induction (which is most suitable for
evaluating ﬁrst-order logic formulae). We also described the problem of deciding the
satisﬁability of PL formulae and how SAT solvers deal with this problem. In Subsec-
tion 2.1.3, we described the SMT problem, which aims to decide the satisﬁability of
ﬁrst-order logic formulae using a combination of diﬀerent background theories. We also
presented the main background theories implemented in modern SMT solvers and their
advantages over SAT solvers when reasoning about veriﬁcation problems arising from
real-world applications. In Subsection 2.1.4, we also described together with some illus-
trative examples the speciﬁcation logic LTL that is commonly used to specify properties
in the BMC framework.
In Section 2.2, Bounded Model Checking of Software, we presented the BMC technique
that consists of unwinding the design and the correctness property k times, and gener-
ating a propositional formula that is satisﬁable if and only if a counterexample exists.
We also discussed that the BMC technique can be used to ﬁnd violations of the tem-
poral property up to the bound k, but not to prove properties. In Subsection 2.2.3, we
described two methods to prove properties in the BMC framework, which are Craig interpolants
and k-induction. Craig interpolation in model checking exploits the SAT/SMT
solvers' ability to produce proofs of unsatisfiability. Such a proof does not by itself establish whether
a given property holds in the model, but it contains information about the reachable
states of the model. Therefore, the use of interpolants allows us to define a complete
method for finite-state reachability analysis based entirely on SAT and SMT solvers.
The k-induction method is a stronger version of the standard invariant approach to
verify safety properties. We presented it as temporal induction (i.e., the induction is
carried out over the time steps of the finite state machines) and we also showed how
to devise an algorithm from the k-induction method to prove properties in the BMC
framework. We also overviewed typical architectures of the BMC technique such as
those implemented in the CBMC and F-SOFT model checkers, which are able to model
check ANSI-C programs. We concluded this section by comparing the BMC technique to
other modern software verification techniques that make use of logic to describe states
and transformations between system states.
Finally, in Section 2.3, we provided mechanisms to model multi-threaded systems by
means of transition systems, which allow us to encode multi-threaded systems into the
BMC framework. We also presented the concept of asynchronous and synchronous
modes where the former only allows one thread to make progress at a time, and the
latter allows all threads to run at the same time. In this sense, we further presented
the message passing and shared variable models. In the message passing model, threads
can send/receive messages to/from other threads; while in the shared variable model, a
region of memory may be simultaneously accessed by multiple threads in order to provide
communication among them. As in this work we focus on asynchronous systems, we then
described the interleaving paradigm to model multi-threaded programs, which represents
a possible execution of the program where all of the concurrent events are arranged in
a linear order. We thus concluded this section by showing the eﬀectiveness of partial
order reduction techniques to prune the number of states that have to be searched by
model checking algorithms.
Chapter 3
SMT-based Bounded Model Checking for Embedded ANSI-C Software
Propositional bounded model checking has been applied successfully to verify embed-
ded software but remains limited by increasing propositional formula sizes and the loss
of high-level information during the translation preventing potential optimizations to
reduce the state space to be explored. These limitations can be overcome by encod-
ing word-level information in theories richer than propositional logic and using SMT
solvers for the generated veriﬁcation conditions. Here, in order to achieve the ﬁrst ob-
jective stated in Section 1.2, we have modiﬁed and extended the encodings from previous
SMT-based bounded model checkers to provide more accurate support for variables of
ﬁnite bit width, bit-vector operations, arrays, structures, unions and pointers. Addi-
tionally, to achieve that objective, we have integrated the Boolector [31], CVC3 [20],
and Z3 [57] solvers with the CProver framework and evaluated them using both stan-
dard software model checking benchmarks and typical embedded software applications
from telecommunications, control systems, and medical devices. The experiments show
that our ESBMC model checker can analyze larger problems than existing tools and
substantially reduce the veriﬁcation time.
3.1 Introduction
Bounded Model Checking (BMC) based on Boolean Satisﬁability (SAT) has been intro-
duced as a complementary technique to Binary Decision Diagrams (BDDs) for alleviating
the state explosion problem [24]. The basic idea of BMC is to check the negation of
a given property at a given depth: given a transition system M, a property φ, and a
bound k, BMC unrolls the system k times and translates it into a veriﬁcation condition
(VC) ψ such that ψ is satisﬁable if and only if φ has a counterexample of depth k or
less. Standard SAT checkers can be used to check whether ψ is satisﬁable. Note that in
BMC of software, the bound k limits the number of loop iterations and recursive calls
in the program.
In order to cope with increasing software complexity, SMT (Satisﬁability Modulo The-
ories) solvers can be used as back-ends for solving the generated VCs [10, 11, 71, 105].
Here, predicates from various decidable theories are not encoded using propositional
variables as in SAT, but remain in the problem formulation. These theories are handled
by dedicated decision procedures. Thus, in SMT-based BMC, ψ is a quantiﬁer-free for-
mula in a decidable subset of ﬁrst-order logic which is then checked for satisﬁability by
an SMT solver.
In order to reason about embedded software accurately, an SMT-based BMC tool must
consider a number of issues that are not easily mapped into the theories supported
by SMT solvers. In previous work on SMT-based BMC for software [10, 11, 71] only the
theories of uninterpreted functions, arrays and linear arithmetic were considered, but
no encoding was provided for ANSI-C [95] constructs such as bit-level operations, fixed-point
arithmetic, pointers (i.e., pointer arithmetic and comparisons) and unions. This
limits the usefulness of these approaches for analyzing and verifying embedded software written in ANSI-C. In
addition, the SMT-based BMC approaches proposed by Armando et al. [10, 11] and by
Kroening [105] do not support the checking of arithmetic overﬂow and do not make use
of high-level information to simplify the unrolled formula. We address these limitations
by exploiting the diﬀerent background theories of SMT solvers to build an SMT-based
BMC tool that precisely translates program expressions into quantiﬁer-free formulae
and applies a set of optimization techniques to prevent overburdening the solver. This
way we achieve signiﬁcant performance improvements over SAT-based BMC and the
previous work on SMT-based BMC [10, 11, 71, 105].
We describe the details of an accurate translation from single-threaded ANSI-C programs
into quantifier-free formulae using the logics QF_AUFBV and QF_AUFLIRA from the
SMT-LIB [164].
Definition 3.1. The QF_AUFBV logic represents quantifier-free formulae that are built
over bit-vectors and arrays with free sort and function symbols, but with the restriction
that all array terms have the following structure (array (bit-vector i[w1]) (bit-vector
v[w2])), where i is the index with bit-width w1 and v is the value with bit-width w2.
Definition 3.2. The QF_AUFLIRA logic represents quantifier-free formulae that are
built over reals, integers and arrays with free sort and function symbols, but with the
restriction that all array terms are of the sort (array int real) or (array int (array int
real)), where all argument terms of sort int and real are linear, i.e., there are no occurrences
of the function symbols ∗, /, div, rem, and abs.
We further demonstrate that our encoding and optimizations improve the performance
of software model checking for a wide range of software systems, with a particular emphasis
on embedded software. Additionally, we show that our encoding allows us to
reason about arithmetic overflow and to verify programs that make use of bit-level operations,
pointers, unions and fixed-point arithmetic. We also use three different SMT solvers
(Boolector [31], CVC3 [20], and Z3 [57]) in order to check the effectiveness of our encoding
techniques. We considered these solvers because they were the most efficient
ones for the categories of QF_AUFBV and QF_AUFLIRA in the last SMT competitions
[168]. To the best of our knowledge, this is the first work that reasons accurately
about ANSI-C constructs commonly found in embedded software and extensively applies
SMT solvers to check the VCs emerging from the BMC of industrial embedded software
applications. We implemented our ideas in the ESBMC¹ (Efficient SMT-Based Bounded
Model Checker) tool that builds on the front-end of the C Bounded Model Checker
(CBMC) [42, 107]. ESBMC supports diﬀerent theories and SMT solvers in order to
exploit high-level information to simplify and to reduce the formula size. Experimental
results show that our approach scales signiﬁcantly better than both the SAT-based and
SMT-based CBMC model checker [42, 107, 105] and SMT-CBMC [11], a bounded model
checker for C programs that is based on the SMT solvers CVC3 and Yices.
The remainder of the chapter is organized as follows. In Section 3.2 we describe the
SMT-based BMC Formulation. In Section 3.3 we provide a running example to illus-
trate our encoding while in Section 3.4 we present the details of an accurate translation
from ANSI-C programs into quantiﬁer-free formulae using the SMT logics. In Section 3.5
we present the results of our experiments using several software model checking bench-
marks and embedded systems applications while in Section 3.6 we describe the results
of applying ESBMC to the veriﬁcation of a commercial embedded software used in the
telecommunications domain. In Section 3.7 we discuss the related work and we conclude
and describe future work in Section 3.8.
3.2 SMT-based BMC Formulation
In BMC, the program to be analyzed is modelled as a state transition system, which
is extracted from the control-flow graph (CFG) [134]. This graph is built as part of a
translation process from program text to static single assignment (SSA) form. A node in
the CFG represents either a (non-)deterministic assignment or a conditional statement,
while an edge in the CFG represents a possible change in the program's control location.
Let M be an abstract machine that represents a state transition system according to
Definition 2.14. A state s ∈ S consists of the value of the program counter pc and the
values of all program variables. An initial state s0 assigns the initial program location of
the CFG to pc. We identify each transition γ = (si,si+1) ∈ R between two states si and
si+1 with a logical formula γ(si,si+1) that captures the constraints on the corresponding
values of the program counter and the program variables.
¹ Available at http://users.ecs.soton.ac.uk/lcc08r/esbmc/
Given the transition system M, a property φ, and a bound k, BMC unrolls the system k
times and translates it into a VC ψ such that ψ is satisﬁable if and only if φ has a counter-
example of length k or less. The VC ψ is a quantiﬁer-free formula in a decidable subset
of ﬁrst-order logic, which is then checked for satisﬁability by an SMT solver. In this
chapter, we are interested in checking safety properties of single-threaded programs. The
associated model checking problem is formulated by constructing the following logical
formula:
$$\psi_k = I(s_0) \wedge \bigvee_{i=0}^{k} \bigwedge_{j=0}^{i-1} \gamma(s_j, s_{j+1}) \wedge \neg\phi(s_i) \qquad (3.1)$$
Here, φ is a safety property, I the set of initial states of M and γ(sj,sj+1) the transition
relation of M between time steps j and j+1. Hence, $I(s_0) \wedge \bigwedge_{j=0}^{i-1} \gamma(s_j, s_{j+1})$ represents
the executions of M of length i and ψk can be satisﬁed if and only if for some i ≤ k
there exists a reachable state at time step i in which φ is violated. If ψk is satisﬁable,
then φ is violated and the SMT solver provides a satisfying assignment, from which
we can extract the values of the program variables to construct a counter-example. A
counter-example for a property φ is a sequence of states s0,s1,...,sk with s0 ∈ S0,
sk ∈ S, and γ (si,si+1) for 0 ≤ i < k. If ψk is unsatisﬁable, we can conclude that no
error state is reachable in k steps or less. Note that formula (3.1) diﬀers slightly from
(2.11) (presented in Section 2.2) because it represents a violation of length k or less to
the considered safety property while (2.11) represents a violation of exactly length k.
This means that if the system deadlocks in l ≤ k steps and the error is at step j ≤ l,
then the formula (2.11) turns out to be unsatisﬁable and therefore it will not detect the
error.
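The meaning of formula (3.1) can be illustrated with a small explicit-state search. The Python sketch below enumerates executions instead of calling a SAT or SMT solver, so it only mirrors the structure of ψk and is not how a BMC tool works internally; the function name bmc and the toy counter system used with it are assumptions made for this example.

```python
def bmc(initial, step, phi, k):
    """Return an execution s0..si with i <= k whose last state violates phi, or None.

    initial: iterable of initial states (the set I)
    step:    function mapping a state to a list of successor states (gamma)
    phi:     safety property, given as a predicate over states
    """
    frontier = [[s] for s in initial]            # all executions of length 0
    for _ in range(k + 1):
        for path in frontier:
            if not phi(path[-1]):                # the disjunct for this depth is satisfied
                return path
        # Extend every execution by one transition; executions that
        # deadlock simply drop out of the frontier.
        frontier = [path + [t] for path in frontier for t in step(path[-1])]
    return None

# Toy system: a counter that may be incremented by 1 or by 2.
successors = lambda x: [x + 1, x + 2]
never_five = lambda x: x != 5
```

With this toy system the value 5 is not reachable within two steps, so bmc([0], successors, never_five, 2) finds nothing, while a bound of 3 yields a counter-example. Because violations are checked at every depth i ≤ k, an error that occurs before a deadlock is still reported.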
It is important to note that this approach can be used only to ﬁnd violations of the
property up to the bound k. In order to prove properties we need to compute the
completeness threshold (CT), which can be smaller than or equal to the maximum
number of loop-iterations occurring in the program [24, 45, 72]. However, computing
CT to stop the BMC procedure and to conclude that no counter-example can be found
is as hard as model checking. Moreover, complex programs involve large data-paths and
complex expressions. Consequently, even if we knew CT, the resulting formulae would
quickly become too hard to solve and require too much memory to build. In practice we
can thus only ensure that the property holds in M up to a given bound k. In our work,
we focus on embedded software because it has characteristics that make it attractive for
BMC and that make the limitations of bounded model checking less stringent, e.g.,
dynamic memory allocation and recursion are highly discouraged.
$$C := \left[\begin{array}{l}
in_1 = store(store(store(store(store(store(in_0, 0, 16), 1, 2), 2, nd\_uchar_1), 3, 16), 4, 3), 5, 0) \\
\wedge\ i_1 = 0 \wedge j_1 = 0 \wedge out_1 = store(out_0, 0, 16) \\
\wedge\ i_2 = 1 \wedge j_2 = 1 \wedge out_2 = store(out_1, 1, 2) \\
\wedge\ g_1 = (nd\_uchar_1 \neq 0) \\
\wedge\ g_2 = \neg(nd\_uchar_1 = 16) \\
\wedge\ out_3 = store(out_2, 2, nd\_uchar_1) \\
\wedge\ j_4 = 3 \\
\wedge\ \ldots \\
\wedge\ j_{10} = ite(\neg g_1, j_3, j_9) \\
\wedge\ out_{11} = store(out_{10}, j_{10}, 0)
\end{array}\right] \qquad (3.2)$$
$$P := \left[\begin{array}{l}
j_5 \geq 0 \wedge j_5 < 6 \wedge j_7 \geq 0 \wedge j_7 < 6 \\
\wedge\ j_8 \geq 0 \wedge j_8 < 6 \wedge j_{10} \geq 0 \wedge j_{10} < 6 \\
\wedge\ ((select(out_{11}, 4) = 3) \vee (select(out_{11}, 5) = 3))
\end{array}\right] \qquad (3.3)$$
After this transformation, we build the constraints and properties as shown in formulae
(3.2) and (3.3) using the background theories of the SMT solvers. Furthermore, we create
additional Boolean variables (called deﬁnition literals) for each clause of the formula P
in such a way that the deﬁnition literal is true if and only if a given clause of the formula
P is true. In the example we add a constraint for each clause of P as follows:
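$$\begin{array}{l}
l_0 \Leftrightarrow j_5 \geq 0 \\
l_1 \Leftrightarrow j_5 < 6 \\
\quad\vdots \\
l_9 \Leftrightarrow ((select(out_{11}, 4) = 3) \vee (select(out_{11}, 5) = 3))
\end{array}$$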
These deﬁnition literals are used to identify the VCs. Note that the language-speciﬁc
safety properties (e.g., out-of-bounds array indexing) and the user-speciﬁed properties
that hold trivially in the code are already simpliﬁed away (e.g., by keeping track of the
size of the array during the symbolic execution of the code). For instance, there is no
need to generate VCs that check for violations of the lower and upper bound of array
in, since i only takes the values from 0 to 4 when it is used in indexing the array, and
the validity of the bounds check can be evaluated statically. After mapping each VC to
a definition literal, we then rewrite (3.3) as:
$$\neg P := \neg l_0 \vee \neg l_1 \vee \ldots \vee \neg l_9 \qquad (3.4)$$
Finally, the formula C ∧ ¬P is passed to an SMT solver to check satisfiability. Our
approach is thus slightly different from that of Armando et al. [11], who transform the
ANSI-C code into conditional normal form as an intermediary step to encode C and P,
while we first apply a number of simplifications (as described in Section 5.3) during the
transformation and then encode the ANSI-C code directly from the simplified SSA form.
Consequently, Armando et al. [11] end up with two sets of quantifier-free formulae C and
P (but possibly with a higher overhead for the solver) and check the validity of $C \models_T \bigwedge P$
using an SMT solver.
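To make the role of the definition literals concrete, the following Python sketch evaluates the clauses of P from (3.3) for one candidate assignment and then forms ¬P as the disjunction of the negated literals, as in (3.4). It is only an illustration: an SMT solver searches for such an assignment symbolically, whereas the assignment here is supplied by hand, and the dictionary layout and function names are assumptions made for this example.

```python
def select(array, index):
    """Array read, mirroring the select function of the array theory."""
    return array[index]

def definition_literals(m):
    """One Boolean literal per clause of P in (3.3) for a candidate assignment m."""
    literals = []
    for name in ("j5", "j7", "j8", "j10"):
        literals.append(m[name] >= 0)       # lower bound check
        literals.append(m[name] < 6)        # upper bound check
    out11 = m["out11"]
    literals.append(select(out11, 4) == 3 or select(out11, 5) == 3)
    return literals

def violates_p(m):
    """not P: the disjunction of the negated definition literals."""
    return any(not literal for literal in definition_literals(m))

# Hypothetical assignments, chosen by hand for the illustration.
safe = {"j5": 3, "j7": 4, "j8": 5, "j10": 5, "out11": [16, 2, 1, 16, 3, 0]}
unsafe = dict(safe, j10=6)
```

Since every literal corresponds to exactly one VC, the literal that evaluates to false in a satisfying assignment of C ∧ ¬P identifies the violated property.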
3.4 Encodings and Properties
This section describes the encodings that we use to convert the constraints and properties
from the ANSI-C program into the background theories of the SMT solvers.
3.4.1 Scalar Data Types
We provide two approaches to model (unsigned and signed) integer data types, either
as the integers provided by the corresponding SMT-LIB theories or as bit-vectors, which
are encoded using a particular bit width such as 32 bits. Table 3.1 shows a list of the
ANSI-C types and their corresponding bit-vector representations, based on the storage
sizes (i.e., number of bits) required by ISO ANSI-C [95]. It also gives the representation
using the abstract numerical domains of the SMT-LIB.
In our SMT-based BMC framework, the encoding of the relational (e.g., <, ≤, >, ≥)
and arithmetic operators (e.g., +, −, /, ∗, rem) then depends on the encoding of their
operands as unsigned or signed bit-vectors, or integer or ﬁxed-point numbers. The SMT-
based BMC approach proposed by Armando et al. [11] does not support the encoding of
ﬁxed-point numbers and Kroening [105] does not exploit the SMT solvers to model the
program variables through the corresponding numerical domain (e.g., Z, R). Additionally,
the SAT-based BMC approach of Clarke et al. [42] (note that [42] is the original
paper that describes CBMC's implementation; a detailed technical report can be
found in [107]) transforms the relational and arithmetic operators into a propositional
equation using a carry chain adder, and the size of the encoding thus depends on the
size of the bit-vector representation of the scalar data types.
For the bit-vector encodings, the front-end provides six scalar datatypes: bool, signedbv,
unsignedbv, fixedbv, floatbv, and pointer. The ANSI-C datatypes int, long int, long long
int, and char are considered as signedbv with different bit widths (depending on the
machine architecture) and the unsigned versions of these datatypes are considered as
unsignedbv. For double and float we currently support only fixed-point arithmetic (i.e.,
fixedbv), but not full floating-point arithmetic (i.e., floatbv); see
the following section for more details.

| ANSI-C type | Bit-vector, 16-bit architecture | Bit-vector, 32-bit architecture | Bit-vector, 64-bit architecture | SMT abstract numerical domain |
|---|---|---|---|---|
| bool | bool(1) | bool(1) | bool(1) | bool |
| char | signedbv(8) | signedbv(8) | signedbv(8) | integer |
| unsigned char | unsignedbv(8) | unsignedbv(8) | unsignedbv(8) | unsigned integer |
| short int | signedbv(16) | signedbv(16) | signedbv(16) | integer |
| unsigned short int | unsignedbv(16) | unsignedbv(16) | unsignedbv(16) | unsigned integer |
| int | signedbv(16) | signedbv(32) | signedbv(32) | integer |
| unsigned int | unsignedbv(16) | unsignedbv(32) | unsignedbv(32) | unsigned integer |
| long int | signedbv(32) | signedbv(32) | signedbv(64) | integer |
| unsigned long int | unsignedbv(32) | unsignedbv(32) | unsignedbv(64) | unsigned integer |
| long long int | signedbv(64) | signedbv(64) | signedbv(128) | integer |
| unsigned long long int | unsignedbv(64) | unsignedbv(64) | unsignedbv(128) | unsigned integer |
| pointer | pointer(32) | pointer(32) | pointer(64) | integer |
| double | fixedbv(64) | fixedbv(64) | fixedbv(64) | real |

Table 3.1: Definitions of ANSI-C types and their corresponding SMT representations.
We support all type casts, including conversion between integer and ﬁxed-point types. In
the bit-vector representation, the conversions between signedbv, unsignedbv and ﬁxedbv
are performed using the word-level functions Extract (Trm,i,j), SignExt (Trm,k) and
ZeroExt (Trm,k) described in Section 2.1.3. Similarly, upon dereferencing, the object
that a pointer points to is converted using the same word-level functions. The conversions
between signedbv, unsignedbv and ﬁxedbv using the abstract numerical domains are
straightforward; we only consider the integral part. In addition, signedbv and unsignedbv
are converted to bool using the ≠-operator by comparing the variable to be converted
with zero. Formally, let v be a variable of signed or unsigned type, k be a constant whose
value represents zero in the type of v, and t be a Boolean variable such that t ∈ {0,1}.
We then convert v into t as follows:
$$t = ite(v \neq k, 1, 0) \qquad (3.5)$$
while bool is converted to signedbv and unsignedbv using the ite-operator as follows:
$$v = ite(t, 1, 0) \qquad (3.6)$$
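The conversions above can be mimicked on plain integers that stand for bit patterns. The Python sketch below is an illustration under stated assumptions: Extract(Trm, i, j) is taken to return bits i down to j, SignExt and ZeroExt are taken to add k leading bits, and an explicit width argument is added because Python integers carry no bit width. These helper signatures are therefore not those of Section 2.1.3 or of any solver API.

```python
def extract(value, i, j):
    """Bits i down to j (inclusive) of a non-negative bit pattern."""
    return (value >> j) & ((1 << (i - j + 1)) - 1)

def zero_ext(value, width, k):
    """Widen a width-bit pattern by k leading zero bits; the pattern itself is unchanged."""
    return value & ((1 << width) - 1)

def sign_ext(value, width, k):
    """Widen a width-bit pattern by k copies of its sign bit."""
    value &= (1 << width) - 1
    if value >> (width - 1):                     # sign bit is set
        value |= ((1 << k) - 1) << width
    return value

def to_bool(v, zero=0):
    """(3.5): t = ite(v != k, 1, 0), where k represents zero in the type of v."""
    return 1 if v != zero else 0

def from_bool(t):
    """(3.6): v = ite(t, 1, 0)."""
    return 1 if t else 0
```

For example, the 8-bit pattern 0xF0 (the signed char value -16) becomes 0xFFF0 under sign extension to 16 bits, which is again -16, while zero extension keeps the pattern 0x00F0.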
number of bits of the integral and fractional parts, respectively, and b be a bit-vector of
size tb, with mb and nb deﬁned similarly. We apply the encodings in (3.7) and (3.8) in
order to get the new bit-vector b = i@f that has the same bitwidth before and after the
radix point of a.
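$$i = \begin{cases}
Extract(b, n_b + m_a - 1, n_b) & : m_a \leq m_b \\
SignExt(Extract(b, t_b - 1, n_b), m_a - m_b) & : \text{otherwise}
\end{cases} \qquad (3.7)$$
$$f = \begin{cases}
Extract(b, n_b - 1, n_b - n_a) & : n_a \leq n_b \\
ZeroExt(Extract(b, n_b - 1, 0), n_a - n_b) & : \text{otherwise}
\end{cases} \qquad (3.8)$$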
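Encodings (3.7) and (3.8) can be traced on concrete bit patterns. The following Python sketch is a hedged illustration: the function align and its argument order are invented for this example, bit patterns are plain non-negative integers, and in the second case of (3.8) the additional zero bits are assumed to be appended at the least significant end so that the fractional value is preserved.

```python
def extract(value, i, j):
    """Bits i down to j (inclusive) of a non-negative bit pattern."""
    return (value >> j) & ((1 << (i - j + 1)) - 1)

def sign_ext(value, width, k):
    """Widen a width-bit pattern by k copies of its sign bit."""
    value &= (1 << width) - 1
    if value >> (width - 1):
        value |= ((1 << k) - 1) << width
    return value

def align(b, mb, nb, ma, na):
    """Resize fixed-point pattern b (mb integral, nb fractional bits) to ma and na bits."""
    tb = mb + nb
    if ma <= mb:
        i = extract(b, nb + ma - 1, nb)                      # (3.7), first case
    else:
        i = sign_ext(extract(b, tb - 1, nb), mb, ma - mb)    # (3.7), second case
    if na <= nb:
        f = extract(b, nb - 1, nb - na)                      # (3.8), first case
    else:
        # (3.8), second case; assumption: zeros are appended on the right.
        f = extract(b, nb - 1, 0) << (na - nb)
    return (i << na) | f                                     # concatenation i @ f
```

With four integral and four fractional bits, 3.5 is the pattern 0x38; widening it to eight bits on each side gives 0x0380, and narrowing 0x0380 back gives 0x38 again.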
Rational encoding. We encode fixed-point arithmetic using rational arithmetic by
rounding the fixed-point numbers to rationals in base 10. We extract the integral and
fractional parts and convert them to integers I and F, respectively; we then divide F
by $2^n$, round the result to a given number of decimal places, and convert everything to
a rational number in base 10. Formally, let p be the number of decimal places and let
i and f be the integral and fractional parts resp. of a given fixed-point number a. We
apply the encoding in (3.9) in order to convert a to a rational number.
$$a = \begin{cases}
\left(i \ast p + \left(\frac{f \ast p}{2^n} + 1\right)\right) / p & : f \neq 0 \\
i & : \text{otherwise}
\end{cases} \qquad (3.9)$$
For example, with m = 2, n = 16, and six decimal places of precision, the number 3.9
(with a binary representation of 11.1110011001100110) is converted to I = 3 and F =
58982/$2^{16}$, and finally to 3899994/1000000. As a result, the arithmetic operations are
performed in the domain of Q instead of R and there is no need to add missing bits to
the integer and fractional parts.
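The worked example can be reproduced with exact rational arithmetic. The Python sketch below follows (3.9) under two assumptions that the text does not spell out: the scaling factor p is taken to be 10 raised to the number of decimal places, and the division by 2^n is taken to round down before 1 is added. The function name is likewise hypothetical.

```python
from fractions import Fraction

def fixed_to_rational(i, f, n, places):
    """Convert integral part i and n-bit fractional part f to a rational, following (3.9)."""
    p = 10 ** places                 # assumption: p is the decimal scaling factor
    if f == 0:
        return Fraction(i)
    # Assumption: f * p / 2**n is rounded down before adding 1.
    return Fraction(i * p + (f * p // 2 ** n + 1), p)

# 3.9 with m = 2 and n = 16: integral part 3, fractional bits 1110011001100110.
example = fixed_to_rational(3, 0b1110011001100110, 16, 6)
```

Under these assumptions the fractional bits evaluate to 58982 and the result equals 3899994/1000000, which matches the numerator given in the example above.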
In general, the drawback is that some numbers are not precisely represented with fixed-point
arithmetic. As an example, if m = 4 and n = 4, then the closest representable
numbers to 0.7 are 0.6875 (0000.1011) and 0.75 (0000.1100). As a result, the number
needs to be rounded and the deviation might eventually change the control ﬂow of
the program. However, we have not detected any false results caused by this in our
benchmarks.
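The rounding effect described above is easy to check numerically. The short Python sketch below computes the two neighbouring values that are representable with n fractional bits; the function name is an assumption made for this illustration.

```python
from fractions import Fraction
import math

def nearest_fixed_point(x, n):
    """Return the representable values just below and just above x for n fractional bits."""
    scaled = x * 2 ** n
    return Fraction(math.floor(scaled), 2 ** n), Fraction(math.ceil(scaled), 2 ** n)

below, above = nearest_fixed_point(Fraction(7, 10), 4)
```

For 0.7 and n = 4 this yields 11/16 = 0.6875 and 12/16 = 0.75, i.e., the bit patterns 0000.1011 and 0000.1100.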
3.4.3 Arithmetic Overﬂow and Underﬂow
Arithmetic overflow and underflow are frequent sources of bugs in embedded software.
ANSI-C, like most programming languages, provides basic data types that have
a bounded range defined by the number of bits allocated to them. Some model checkers
(e.g., SMT-CBMC [11], F-Soft [71] and Blast [88]) either treat program variables
as unbounded integers or do not generate VCs related to arithmetic overflow, and can
consequently produce false results. In our work, we generate VCs related to arithmetic
overflow and underflow of bit-vectors following the ANSI-C standard. This requires that,
on arithmetic overflow of unsigned integer types (e.g., unsigned int, unsigned long int),
the result must be interpreted using modular arithmetic as $r \bmod 2^w$, where r is the
expression rooted with the operation that caused overflow and w is the width of the
resulting type in terms of bits [95]. Hence, the result is reduced modulo the number that is one greater than
the largest value that can be represented by the resulting type. This semantics can be
encoded trivially using the background theories of the SMT solvers. For each unsigned
integer (sub-)expression, we generate a literal $l_{unsigned\_overflow}$ to represent the validity
of the unsigned operation and add the following definition:
$$l_{unsigned\_overflow} \Leftrightarrow (r - (r \bmod 2^w)) < 2^w$$
On the other hand, the ANSI-C standard does not define any behaviour on arithmetic
overflow of signed types (e.g., int, long int), and only requires that integer division-by-zero
must be detected. In addition to division-by-zero detection, we consider arithmetic
overflow of signed types on addition, subtraction, multiplication, division and
negation operations by defining boundary conditions. For example, we define a literal
$l_{overflow^{*}_{x,y}}$ that is true iff the multiplication of x and y exceeds LONG_MAX (i.e.,
x ∗ y > LONG_MAX) and another literal $l_{underflow^{*}_{x,y}}$ that is true iff the multiplication
of x and y is below LONG_MIN. We use a literal $l_{res\_op^{*}}$ to denote the validity of the
signed multiplication with the following definition:
$$l_{res\_op^{*}} \Leftrightarrow (\neg l_{overflow^{*}_{x,y}} \wedge \neg l_{underflow^{*}_{x,y}})$$
The constraints on addition, subtraction, and division are encoded in a similar way. The
literal $l_{overflow^{\sim}_{x}}$ is true if and only if the negation of x is outside the interval given by
LONG_MIN and LONG_MAX.
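The overflow literals can be illustrated with Python's unbounded integers, which make it possible to compute the exact mathematical result r and compare it against the range of a w-bit type. This is a sketch under stated assumptions: r is taken to be non-negative in the unsigned case, the bounds of a w-bit two's complement type stand in for LONG_MIN and LONG_MAX, and all function names are invented for this example.

```python
def unsigned_ok(r, w):
    """l_unsigned_overflow: true iff the exact non-negative result r fits into w bits."""
    return (r - (r % 2 ** w)) < 2 ** w

def wrap_unsigned(r, w):
    """Modular interpretation r mod 2^w of an unsigned result."""
    return r % 2 ** w

def signed_mul_ok(x, y, w):
    """l_res_op*: true iff x * y neither overflows nor underflows a w-bit signed type."""
    upper = 2 ** (w - 1) - 1          # plays the role of LONG_MAX
    lower = -2 ** (w - 1)             # plays the role of LONG_MIN
    overflow = x * y > upper
    underflow = x * y < lower
    return not overflow and not underflow

def signed_neg_ok(x, w):
    """True iff -x lies inside the interval of a w-bit signed type."""
    return -2 ** (w - 1) <= -x <= 2 ** (w - 1) - 1
```

For instance, 255 + 1 does not fit into 8 unsigned bits and wraps to 0, and negating the smallest 32-bit signed value falls outside the representable interval.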
3.4.4 Arrays
Arrays are encoded in a straightforward manner using the SMT domain theories, and we
consider the WITH operator and index operator [] to be part of the encoding [42, 80].
These operators are mapped directly to the functions store and select of the array
theory presented in Section 2.1.3, respectively. The assignment a′ = a WITH ([i] := v)
is encoded as a store operation a′ = store(a,i,v) while a[i] is simply encoded as a select
operation select(a,i). The theory of arrays employs the notion of unbounded array size,
but arrays in software are typically of bounded size. This means that if an index variable
$$P := \left[\begin{array}{l}
select(select(y_3, a), 1) = 1 \\
\wedge\ select(y_3, b) = 99
\end{array}\right] \qquad (3.22)$$
3.4.7 Dynamic Memory Allocation
Although dynamic memory allocation is discouraged in embedded software, ESBMC is
capable of model checking programs that use it through the ANSI-C functions malloc
and free. We model memory just as an array of bytes and exploit the array theories of
SMT solvers to model read and write operations to the memory array on the logic level.
ESBMC checks three properties related to dynamic memory allocation; in particular,
it checks whether (i) the argument to any malloc, free, or dereferencing operation is a
dynamic object (IS_DYNAMIC_OBJECT), (ii) the argument to any free or dereferencing
operation is still a valid object (VALID_OBJECT), and (iii) the memory
allocated by the malloc function is deallocated at the end of an execution (DEALLOCATED_OBJECT)
[48]. The last check extends the VCG of the CProver framework.
Formally, let $p_o$ be a pointer expression that points to the object o of type t and let m
be a memory array of type t and size n, where n represents the number of elements to
be allocated. In our encoding, the representation of each dynamic object $d_o$ contains
a unique identifier ρ that indicates the object's "serial number" in the sequential order
of all dynamically allocated objects (i.e., 0 < ρ ≤ k, where k represents the current
number of dynamic objects). Each dynamic object consists of the memory array m, the
size in bytes of m, the unique identifier ρ and the location in the execution where m is
allocated, which is used for error reporting.
To detect invalid reads/writes, we check whether $d_o$ is a dynamic object and also whether
$p_o$ is within the bounds of the memory array. Let i be an integer variable that indicates
the position in which the object pointed to by $p_o$ must be stored in the memory array
m of size n. We encode IS_DYNAMIC_OBJECT as a literal $l_{is\_dynamic\_object}$ with the
following definition:
$$l_{is\_dynamic\_object} \Leftrightarrow \left(\bigvee_{j=1}^{k} d_o.\rho = j\right) \wedge (0 \leq i < n) \qquad (3.23)$$
To check for invalid objects, we add one additional bit ﬁeld ν to each dynamic object
which indicates whether the object is still alive or not. We set ν to true when the
function malloc is called to denote that the object is alive. When the function free is
called, we update ν to false to denote that the object is no longer alive. We then encode
VALID OBJECT as a literal l_valid_object with the following definition:
\[ l_{valid\_object} \Leftrightarrow (l_{is\_dynamic\_object} \Rightarrow d_o.\nu) \tag{3.24} \]
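The two literals can be mirrored at run time by a small C sketch. The record dyn_obj and the functions is_dynamic_object and valid_object are hypothetical names introduced only for illustration. They evaluate (3.23) and (3.24) on concrete values and are not ESBMC's implementation, which builds these constraints symbolically for the SMT solver.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical run-time analogue of a dynamic object d_o. */
typedef struct {
    unsigned rho; /* serial number; 0 < rho <= k for allocated objects */
    bool nu;      /* alive flag: set by malloc, cleared by free */
    size_t n;     /* number of elements in the memory array m */
} dyn_obj;

/* (3.23): rho equals some j in 1..k, and the index i lies in [0, n). */
static bool is_dynamic_object(const dyn_obj *d, unsigned k, long i)
{
    bool known = false;
    for (unsigned j = 1; j <= k; j++)
        known = known || (d->rho == j);
    return known && 0 <= i && (size_t)i < d->n;
}

/* (3.24): is_dynamic_object implies nu. */
static bool valid_object(const dyn_obj *d, unsigned k, long i)
{
    return !is_dynamic_object(d, k, i) || d->nu;
}
```

For an object with rho = 1 and n = 4 among k = 2 dynamic objects, index 3 satisfies (3.23) while index 4 does not. Once nu is cleared, valid_object fails for index 3, which corresponds to an access after free.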
3.5 Experimental Evaluation
The experimental evaluation of the approach presented in this chapter consists of five
parts. After describing the setup in Section 3.5.1, we compare the SMT solvers Boolector,
CVC3, and Z3 in Section 3.5.2 to identify the most suitable solver for further
experiments. In Section 3.5.3 we check the error-detection capability of ESBMC
over a large set of both correct and buggy ANSI-C programs. In the last two subsections,
we evaluate ESBMC’s performance relative to that of two other ANSI-C BMC
tools. In Section 3.5.4, we compare ESBMC and SMT-CBMC using SMT-CBMC’s
own benchmark suite, and in Section 3.5.5 we compare ESBMC and CBMC
using a variety of programs, including embedded software used in telecommunications,
control systems, and medical devices. Section 3.6 then reports the results of
applying ESBMC and CBMC to commercial embedded software, in order to evaluate
both tools on large industrial applications.
3.5.1 Experimental Setup
We used benchmarks from a variety of sources to evaluate ESBMC’s precision and per-
formance, which include embedded systems benchmark suites and applications as well as
other test suites and applications, such as the SAT solver PicoSAT [23], the open-source
applications ﬂex [153] and git-remote [132], and a ﬂasher manager application [175]. We
also extracted one particular application from the CBMC manual [42] that implements
the multiplication of two numbers using bit-level operations.
The PowerStone [159] suite contains graphics applications, image decompression, paging
communication protocols, engine control applications and group three fax decode. The
SNU-RT [116] suite consists of matrix and signal processing functions such as matrix
multiplication and decomposition, quadratic equation solving, insertion sort,
cyclic redundancy check, fast Fourier transform, LMS adaptive signal enhancement, and
JPEG encoding. We use the non-deterministic version of these benchmarks where all
inputs are replaced by non-deterministic values. We also use a cubic equation solver from
the MiBench [2] suite. The HLS suite [86] contains programs that implement the encoder
and decoder of the adaptive differential pulse code modulation (ADPCM).
The NECLA [157] and VERISEC [110] benchmarks are not specifically related to embedded
software, but they allow us to check ESBMC’s error-detection capability easily since
they provide ANSI-C programs with and without known bugs. Here, we use the suffix
“-bad” to denote the subset with seeded errors, and “-ok” to denote the supposedly
correct (“golden”) versions.³ The programs make use of dynamic memory allocation,
interprocedural dataflow, aliasing, pointer typecasts and string manipulation. In addition,
we used some programs from the well-known Siemens [142] test suite, including
pattern matching and string processing, statistics, and aerospace applications. Finally,
the EUREKA [11] benchmarks contain programs that allow us to assess the scalability
of the model checking tools on problems of increasing complexity.
³ The detailed results shown in Appendix B also show which programs are “bad” and which are “ok”.
Unless stated otherwise, all experiments were conducted on an otherwise idle Intel Xeon
5160 server (3 GHz) with 4 GB of RAM running Linux. For all benchmarks, the time
limit has been set to 3600 seconds for each individual property. All times given are wall
clock time in seconds as measured by the Unix time command.
3.5.2 Comparison of SMT solvers
As a first step, we compared the extent to which the SMT solvers support the domain
theories that are required for SMT-based BMC of ANSI-C programs. For this purpose,
we analyzed the SMT solvers Boolector (V1.4), CVC3 (V2.2), and Z3 (V2.11). In the
theory of linear and non-linear arithmetic, CVC3 and Z3 do not support the remainder
operator, but they allow us to use axioms to define it. Currently, Boolector does not
support the theory of linear and non-linear arithmetic at all. In the theory of bit-vectors,
CVC3 does not support the division and remainder operators for bit-vectors
representing signed and unsigned integers. However, in all cases, axioms can be used in
order to define the missing operators. Boolector and Z3 support all word-level, bit-level,
relational, and arithmetic functions over unsigned and signed bit-vectors. In the theories of
arrays and tuples, the verification problems only involve selecting and storing elements
from/into arrays and tuples, respectively, and both domains thus comprise only two
operations. These operations are fully supported by CVC3 and Z3; Boolector supports
only the theory of arrays but not that of tuples.
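To illustrate how a missing remainder operator can be defined axiomatically, the following C sketch checks the constraints that characterize the quotient q and remainder r of a / b under truncating division. This is the C99 semantics; C89 leaves the sign of the result implementation-defined for negative operands. The function name rem_axioms_hold is hypothetical, and the exact axioms used in ESBMC may differ.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Returns true if (q, r) satisfy the axioms that pin down a / b and a % b
 * for truncating division. Assumes b != 0 and that b * q does not overflow. */
static bool rem_axioms_hold(long a, long b, long q, long r)
{
    if (b == 0)
        return false;                          /* division by zero is undefined */
    return a == b * q + r                      /* division identity */
        && labs(r) < labs(b)                   /* remainder smaller than divisor */
        && (r == 0 || (r < 0) == (a < 0));     /* remainder takes sign of dividend */
}
```

Because the three constraints admit exactly one pair (q, r) for each a and non-zero b, a solver without a built-in remainder can introduce a fresh variable r and constrain it in this way.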
We then used 15 ANSI-C programs to compare the performance of Boolector, CVC3,
and Z3 as ESBMC back-ends. The programs 1-8 allow us to assess the scalability of the
model checking tools on problems of increasing complexity [11] and the programs 9-15
contain typical ANSI-C constructs found in embedded software, i.e., they contain linear
and non-linear arithmetic and make heavy use of bit operations.
                                           |  CVC3 (v2.2)             |  Boolector (v1.4)        |  Z3 (v2.11)
  #  Program                  L     B   P  |  Solver     Total        |  Solver     Total        |  Solver     Total
  1  EUREKA.BubbleSort       43    35  17  |  14 (3)     17 (5)       |  <1 (<1)    2 (2)        |  <1 (<1)    2 (3)
                             43    70  17  |  M_b (16)   M_b (33)     |  3 (1)      16 (17)      |  3 (1)      16 (17)
                             43   140  17  |  M_b (M_b)  M_b (M_b)    |  85 (53)    282 (311)    |  65 (11)    265 (269)
  2  EUREKA.SelectionSort    34    35  17  |  17 (2)     18 (3)       |  <1 (<1)    1 (1)        |  <1 (<1)    1 (1)
                             34    70  17  |  M_b (8)    M_b (17)     |  1 (<1)     9 (10)       |  1 (1)      9 (11)
                             34   140  17  |  M_b (42)   M_b (209)    |  10 (3)     161 (171)    |  12 (6)     165 (173)
  3  EUREKA.BellmanFord      49    20  33  |  <1 (<1)    <1 (<1)      |  <1 (<1)    <1 (<1)      |  <1 (<1)    <1 (<1)
  4  EUREKA.Prim             79     8  30  |  <1 (1)     5 (2)        |  <1 (<1)    <1 (<1)      |  <1 (<1)    <1 (<1)
  5  EUREKA.StrCmp           14  1000   6  |  4 (444)    11 (454)     |  192 (248)  195 (257)    |  32 (37)    35 (46)
  6  EUREKA.SumArray         12  1000   7  |  <1 (106)   1 (107)      |  <1 (<1)    1 (1)        |  9 (<1)     10 (1)
  7  EUREKA.MinMax           19  1000   9  |  T_b (M_b)  T_b (M_b)    |  38 (2)     42 (7)       |  2 (1)      6 (7)
  8  SNU-RT.InsertionSort    34    35  17  |  2 (3)      4 (5)        |  <1 (<1)    3 (3)        |  <1 (<1)    3 (3)
                             34    70  17  |  3 (11)     14 (24)      |  4 (<1)     15 (13)      |  2 (1)      12 (14)
                             34   140  17  |  21 (67)    194 (283)    |  193 (3)    350 (219)    |  42 (7)     212 (222)
  9  SNU-RT.Fibonacci        40    30   4  |  <1 (<1)    39 (38)      |  <1 (<1)    39 (38)      |  <1 (<1)    39 (38)
 10  SNU-RT.bs               95    15   7  |  <1 (<1)    <1 (<1)      |  <1 (<1)    <1 (<1)      |  <1 (<1)    <1 (<1)
 11  SNU-RT.lms             258   202  23  |  97 (17)    225 (324)    |  <1 (<1)    303 (307)    |  3 (<1)     306 (307)
 12  MiBench.Cubic           66     5   5  |  <1 (<1)    <1 (<1)      |  <1 (<1)    <1 (<1)      |  <1 (<1)    <1 (<1)
 13  CBMC.BitWise            18     8   1  |  3 (6)      3 (6)        |  7 (8)      7 (8)        |  30 (26)    30 (26)
 14  HLS.adpcm encode       149   200  12  |  <1 (21)    6 (26)       |  <1 (<1)    6 (6)        |  <1 (<1)    6 (6)
 15  HLS.adpcm decode       111   200  10  |  <1 (24)    3 (27)       |  <1 (<1)    3 (3)        |  <1 (<1)    3 (3)
Table 3.2: Results of the comparison between CVC3, Boolector and Z3. Time-outs are represented with T in the Time column; examples that
exceed available memory are represented with M in the Time column. The subscript b indicates that the error occurred in the back-end.
Table 3.2 shows the results of the comparison. Here, L is the number of lines of code,
B the unwinding bound, and P the number of properties veriﬁed, for each ANSI-C
program. We checked for all language-speciﬁc safety properties (as described in the
previous sections) as well as user-speciﬁed properties. For each solver, we provide the
total time (in seconds) to simultaneously check all properties of each program, using
the speciﬁed unwinding bound, as well as the solver time itself. The diﬀerence between
both times is spent in the ESBMC front-end. In addition, we provide (in brackets)
the timings using the SMT-LIB interface instead of the native API of the solver.⁴ The
fastest time for each program is shown in bold. We also indicate whether ESBMC fails
during the veriﬁcation process, either due to a time out (T) or due to memory overﬂow
(M). In this set of experiments, all failures occurred in the back-end (i.e., solver), which
is indicated by the subscript b.
As we can see in Table 3.2, if we use the native API of the solvers, Z3 usually runs
slightly faster than Boolector and CVC3; however, both CVC3 and Boolector are faster
for some programs. Generally the differences between the solvers (in particular between
Boolector and Z3) are small, although CVC3 fails for some examples. If we use the
SMT-LIB interface, the situation changes, and Boolector runs slightly faster than Z3
and CVC3. However, similar to the case of the native API, it is not always the fastest
solver; again, the differences are generally small, and even smaller than when using the
native API.
Generally, the native API is slightly faster than the SMT-LIB interface, although the
difference is small as well. The overhead arises because, with the SMT-LIB interface, we
have to write the resulting SMT formula to a file on disk, which the solver then has to
read back; this is considerably slower than accessing the SMT solver directly
through the native API. However, there are a few notable exceptions where the SMT-LIB
interface is faster than the native API. Using the SMT-LIB interface, CVC3
scales better for BubbleSort and SelectionSort, but slows down substantially for StrCmp
and SumArray. We manually inspected the respective VCs and found that their structure
is essentially the same. We conclude that the SMT-LIB interface of CVC3 lacks some
optimization during the preprocessing. Similarly, Boolector speeds up for InsertionSort
using the SMT-LIB interface, although the structure of the VCs is the same for both
interfaces; here we conclude that the SMT-LIB interface enables some optimization
during the preprocessing.
We decided to continue the evaluation with Z3 and Boolector using both the native and
SMT-LIB APIs, since CVC3 does not scale as well and fails to check three benchmarks
(BubbleSort, SelectionSort and MinMax).
⁴ See Chapter 5 for a detailed description of the different solver integrations.
                                       |  Time           |  Properties        |  Errors
  #  Testsuite       #N    ΣL     ΣP   |  Solver  Total  |  Passed  Violated  |  #Ne  true  false
  1  EUREKA           7   787    420   |     182    543  |     420         0  |    -     -      -
  2  NECLA-ok        30   891    254   |      98    172  |     212        42  |    2     3      0
     NECLA-bad       10   342    112   |      37     47  |      87        25  |   10    25      0
  3  POWERSTONE       9  2857   2031   |     728    816  |    2019        12  |    1    12      0
  4  SNU-RT          20  3320    828   |      15    570  |     799        29  |    4    29      0
  5  VERISEC-ok      80  4521   2114   |     128    211  |    2094        15  |    9    15      0
     VERISEC-bad     83  4569   2024   |     127    226  |    1808       216  |   83   216      0
  6  WCET            10  3430    726   |       7     73  |     722         4  |    2     3      1
Table 3.3: Results of the error-detection capability of ESBMC.
3.5.3 Error-Detection Capability
We now analyze the extent to which ESBMC is able to handle and detect errors in standard
ANSI-C benchmarks. Table 3.3 summarizes the results. Here, N is the number of
programs in the benchmark suite, while ΣL and ΣP give its total size (in lines of code)
and the total number of properties checked, respectively. The table again shows both
the solver and total verification time. In the last three columns, Ne is the number
of programs in which ESBMC has detected violations of safety properties and user-specified
assertions, “true” reports the number of property violations that correspond
to true, confirmed faults, and “false” reports the number of false negatives produced by
ESBMC, i.e., reported violations that do not correspond to real faults. Appendix B
gives the complete results.
The EUREKA suite only contains correct programs and ESBMC is able to verify all
properties without producing any false negative. In the NECLA and VERISEC suites,
ESBMC is able to detect errors related to buﬀer overﬂow, aliasing, dynamic memory allo-
cation, and string manipulation; in particular, it detects all seeded errors in the versions
NECLA-bad and VERISEC-bad. Moreover, ESBMC could verify two programs that
were originally in NECLA-bad, but did not contain any seeded errors; the benchmark
creators conﬁrmed that these programs were misclassiﬁed and subsequently changed the
error seeding [97].
Surprisingly, ESBMC also detects errors in the supposedly correct golden versions. In
NECLA-ok, ESBMC ﬁnds three property violations in two programs, which have been
conﬁrmed as true faults by the benchmark creators [97]. The ﬁrst is an array bounds
violation, caused by an indexing expression x%32 that can become negative for negative
inputs x. The other two are also related to array bounds violations, but are caused by
repeated in-place updates of a buﬀer using the strcat-function, which also appends a
new NULL-character at the end of the new string formed by the concatenation of both
arguments; this NULL-character then causes the violation in the last iteration of the
eight) properties of the module qurt and ﬁfteen additional benchmarks in comparison to
SAT-based CBMC. Both CBMC and ESBMC ﬁnd errors in the SNU-RT (as conﬁrmed
in Section 3.5.3). However, ESBMC ﬁnds additional conﬁrmed errors (see Section 3.5.3
again) in the WCET, SNU-RT, and PowerStone benchmarks, while CBMC produces
false negatives or fails.
In the case of print tokens2, ESBMC runs out of memory if we try to increase the un-
winding bound to 82, but if we restrict the veriﬁcation to the function get token, it
ﬁnds an array-bounds violation in the golden version. We extracted the counterexam-
ple provided by ESBMC and used it to conﬁrm that this is a true fault. ESBMC also
ﬁnds additional errors in ﬂasher manager (violation of a user-speciﬁed assertion) and
adpcm encode (array-bounds violation) applications. Moreover, SAT-based CBMC also
produces false negatives for the golden version of the programs ex30 and ex33 by re-
porting non-existing bugs related to dynamic object upper bounds and invalid pointers.
We can also see that ESBMC not only has better precision than SAT-based CBMC,
but also runs slightly faster than SAT-based CBMC on those benchmarks where it
does not fail. The results in Table 3.5 thus allow us to conclude that ESBMC substantially
improves precision and scales significantly better than CBMC for problems that involve
tight interplay between non-linear arithmetic, bit operations, pointers and array
manipulations, which are typical for embedded systems software.
3.6 Industrial Case Study
In order to further evaluate ESBMC’s performance relative to CBMC, we analyzed the
embedded software used in a commercial product from NXP Semiconductors [141], a
set-top box that is used in high-definition internet protocol (IP) and hybrid digital TV
applications. The embedded software of this platform relies on the Linux operating
system and makes use of different applications such as:
1. LinuxDVB that is responsible for controlling the front-end, tuners and multiplex-
ers [6].
2. DirectFB that provides graphics applications and input device handling [5].
3. ALSA that is used to control the audio applications [4].
This platform contains two embedded processors that exchange data via an inter-process
communication (IPC) mechanism based on sockets.
                                                 |  SAT-based CBMC (v3.8) [42]          |  ESBMC (v1.15)
                                                 |  Time          Properties            |  Time                  Properties
   #  Module                       L    B     P  |  Solver Total  Passed Violated Fail  |  Solver     Total      Passed Violated Fail
   1  Siemens.print tokens2      510  81∗   135  |  <1     <1        135        0    0  |  <1 (<1)    <1 (<1)       135        0    0
      (get token)                 51   82    76  |  T_b    T_b         0        0  135  |  29 (35)    60 (65)       134        1    0
   2  Siemens.replace            564   1∗   199  |  †_f    †_f         -        -    -  |  <1 (<1)    <1 (<1)       199        0    0
   3  Siemens.tot info           406  30∗    73  |  †_f    †_f         -        -    -  |  32 (3)     98 (79)        73        0    0
   4  Siemens.tcas               173    4    38  |  <1     <1         38        0    0  |  1 (<1)     2 (1)          38        0    0
   5  Siemens.space             9125 126∗  2016  |  <1     4        2016        0    0  |  <1 (<1)    3 (3)        2016        0    0
   6  WCET.statistics            157    ∞    29  |  †_f    †_f         -        -    -  |  1 (<1)     53 (53)        27        2    0
   7  WCET.statemate            1273    3     6  |  <1     <1          6        0    0  |  <1 (<1)    <1 (<1)         6        0    0
   8  SNU-RT.crc new             125    ∞    13  |  <1     6          12        1    0  |  <1 (<1)    8 (8)          12        1    0
   9  SNU-RT.fft1k new           158    ∞    39  |  †_b    †_b        35        0    4  |  <1 (1)     56 (57)        39        0    0
  10  SNU-RT.fibcall new          83  50∗     2  |  <1     <1          1        1    0  |  <1 (<1)    <1 (<1)         1        1    0
  11  SNU-RT.fir new             316    ∞    25  |  5      6          25        0    0  |  <1 (<1)    2 (2)          25        0    0
  12  SNU-RT.insertsort new       94   13    20  |  †_b    †_b         0        0   20  |  8 (<1)     8 (2)          14        6    0
  13  SNU-RT.lms new             256    ∞    35  |  †_b    †_b        29        0    6  |  3 (<1)     24 (24)        35        0    0
  14  SNU-RT.ludcmp new          142    ∞    79  |  T_b    T_b        84        0    4  |  T_b (T_b)  T_b (T_b)      84        0    4
  15  SNU-RT.qurt new            159    ∞     8  |  T_b    T_b         2        0    6  |  T_b (T_b)  T_b (T_b)       7        0    1
  16  PowerStone.bcnt             83   17   153  |  2      3         153        0    0  |  2 (2)      2 (2)         153        0    0
  17  PowerStone.blit             95    1   133  |  <1     <1        133        0    0  |  <1 (<1)    <1 (<1)       129        4    0
  18  PowerStone.pocsag          521   42   187  |  M_f    M_f         -        -    -  |  4 (<30)    22 (48)       186        1    0
  19  NECLA.ex30                  45  101    16  |  <1     2          12        4    0  |  <1 (<1)    3 (3)          16        0    0
  20  NECLA.ex33                  35  100    13  |  <1     <1          6        7    0  |  <1 (<1)    <1 (<1)        13        0    0
  21  picosat                   8160  23∗  3142  |  T_f    T_f         -        -    -  |  27 (†_b)   79 (†_b)     3142        0    0
  22  flex                     14192   2∗ 10002  |  †_f    †_f         -        -    -  |  3492 (†_b) 3526 (†_b)  10002        0    0
  23  git-remote-gitkrb5        6288   5∗   174  |  †_b    †_b         0        0  174  |  196 (†_b)  225 (†_b)     174        0    0
  24  flasher manager            521   21    26  |  2      4          26        0    0  |  25 (22)    29 (27)        25        1    0
  25  HLS.adpcm encode           150  100    25  |  T_b    T_b         0        0   25  |  <1 (<1)    6 (6)          24        1    0
Table 3.5: Results of the comparison between CBMC and ESBMC. Internal errors in the respective tool are represented with † in the Time column.
The subscripts f and b indicate whether the errors occurred in the front-end or back-end, respectively. The superscript ∗ on the unwinding bound
indicates that it is not large enough to prove or falsify the properties.
We analyzed the following embedded applications:
1. exStbKey: This application checks the DirectFB key codes that are returned when
the remote control or front panel keys are pressed.
2. exStbHDMI: This application is used to set various capabilities of the HDMI device
(e.g., audio rate, video mode) and to read various statuses of the HDMI device
(e.g., sink type, hotplug status).
3. exStbLED: This application is responsible for setting the front panel LED display;
it uses raw keyboard input from the UART to control what is displayed on the
front panel LED display.
4. exStbHwAcc: This application demonstrates the advantages that can be gained by
using the graphics hardware acceleration that is available on the set-top box.
5. exStbResolution: This application is responsible for modifying the framebuffer
dimensions and upscaling by selecting the framebuffer to be accessed and updating
its width and height.
6. exStbFb: This application is used to decode image ﬁles and display them in a
framebuﬀer or on a video layer.
7. exStbCc: This application outputs a test closed caption stream.
8. exStbDemo: This application is used to demonstrate a multitude of system features
in an integrated system. It includes support for DVB reception, channel change,
installation, programme information, recording and playback, IP reception and
playback (both unicast and multicast formats), media ﬁle playback (elementary
streams and transport streams), image decoding and display manipulation.
As we did in Section 3.5.5, we compare our approach only against the SAT-based CBMC
version, which is able to support most of the benchmarks from Table 3.6; in particular,
we again compared CBMC v3.8 and ESBMC v1.15. We also invoked both tools by
manually setting the ﬁle name, the unwinding bound, the checks for array bounds,
pointer safety, division by zero, and arithmetic over- and underﬂow, as before. Table 3.6
reports the results in the usual format.
Both SAT-based CBMC and ESBMC were able to find a bug in the application exStbHwAcc,
which is related to an arithmetic overflow on typecast. In one part of the
program exStbHwAcc, there is a typecast operation of the form (int32_t)(finfo.smem_len),
which converts the field smem_len of type unsigned integer into a signed integer; values
above the largest signed value cannot be represented, and the cast is thus considered to
be an overflow. ESBMC also found two bugs in the application exStbCc, which are
related to arithmetic overflow on addition. In this program, the statement
offset[0] += ret is guarded by an if condition, but sits inside an
infinite loop. Therefore, successive additions of the variable ret to the array element
offset[0] can lead to an arithmetic overflow.
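Both kinds of violation can be reproduced with explicit checks. The sketch below is only an illustration: cast_overflows, add_overflows and safe_additions are hypothetical helpers, and the plain int accumulator is an assumption, since the element type of offset is not given here.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* True if converting u to int32_t would not preserve its value,
 * as in a cast of the form (int32_t)(finfo.smem_len). */
static bool cast_overflows(uint32_t u)
{
    return u > (uint32_t)INT32_MAX;
}

/* True if a + b would overflow a signed int; the sum is never computed,
 * so the check itself has no undefined behaviour. */
static bool add_overflows(int a, int b)
{
    return (b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b);
}

/* Mimics repeated "offset += ret" in a loop: returns how many additions
 * succeed before the next one would overflow, up to the given bound. */
static unsigned safe_additions(int offset, int ret, unsigned bound)
{
    unsigned done = 0;
    while (done < bound && !add_overflows(offset, ret)) {
        offset += ret;
        done++;
    }
    return done;
}
```

With ret = INT_MAX / 2 + 1, only the first addition is safe. This is why a small unwinding bound can suffice to expose the violation when ret is treated as an unconstrained input.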
We are able to model check the application exStbDemo up to the bound 16, and we do
not find any property violation. If we increase the unwinding bound further to search
deeper in the state space, we are unable to model check exStbDemo due to memory
limitations and time-outs, respectively, as shown in the rows exStbDemo [4 GB] (running
on a machine with 4 GB of RAM) and exStbDemo [28 GB] (running on a machine with
28 GB of RAM) of Table 3.6. If we verify the application exStbDemo function-by-function,
we can go deeper into the system and explore the state space more exhaustively.
However, ESBMC then reports false negatives related to pointer safety since we assume
that the function parameters are unconstrained (see the functions readLine (8.3),
getCommand (8.4) and main Thread (8.15)). From this set of experiments, we can
conclude that the size of the programs that state-of-the-art bounded model checkers can
cope with is still restricted (even if we select only small parts of the program to be
verified).
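The following sketch illustrates why function-by-function verification can raise such reports. The function line_length is a hypothetical stand-in in the style of readLine, not code from exStbDemo, and read_command_length is an assumed caller.

```c
#include <stddef.h>

/* Hypothetical helper: counts characters up to a newline or terminator.
 * It dereferences buf without checking it, relying on its callers. */
static size_t line_length(const char *buf)
{
    size_t n = 0;
    while (buf[n] != '\0' && buf[n] != '\n')
        n++;
    return n;
}

/* Assumed call site: buf always points to a valid, terminated array. */
static size_t read_command_length(void)
{
    char line[] = "play 3\n";
    return line_length(line);
}
```

Checked in isolation, buf in line_length is unconstrained, so a model checker can choose a NULL or unterminated buffer and report a pointer-safety violation. Checked from read_command_length, no such input exists. Constraining the parameters in a verification harness is one way to avoid the spurious report, at the cost of writing the harness by hand.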
3.7 Related Work
There has been work on the verification of low-level (assembly language) programs for
embedded systems. Thiry and Claesen [170] apply a model checking algorithm based
on binary decision diagrams (BDDs) using the SMV model checker [33] to verify a
mouse controller. In this work, however, the authors use the computational tree logic
(CTL) to model and verify the embedded software. Thiry and Claesen are able to find
inconsistencies between the assembly code and flow chart specifications of the mouse
controller. The drawback of this approach is that it suffers from complexity problems
in the symbolic state space representation and manipulation using BDDs [25].
In another work, Balakrishnan and Tahar extend the BDD-based model checking algorithm
to support the more general multiway decision graph (MDG) to avoid some
BDD-size blow-up [15]. The main idea behind MDG is to represent the model at higher
abstract levels using a subset of first order logic (FOL) and then make use of the
automation offered by BDD-based tools. Balakrishnan and Tahar also verify the mouse
controller case study of [33]. The authors report that with their approach they can also
find inconsistencies between the specification and the code in a few seconds. This
approach, however, is applied to verify one small embedded application and consequently
does not demonstrate the verification of real-world embedded software.
                                                 |  SAT-based CBMC (v3.8) [42]          |  ESBMC (v1.15)
                                                 |  Time          Properties            |  Time                  Properties
   #  Module                       L    B     P  |  Solver Total  Passed Violated Fail  |  Solver     Total      Passed Violated Fail
   1  exStbKey                   558    4    33  |  <1     4          33        0    0  |  <1 (<1)    1 (1)          33        0    0
   2  exStbHDMI                 1508  15∗   138  |  500    706       138        0    0  |  316 (†_b)  429 (†_b)     138        0    0
   3  exStbLED                   430  50∗   102  |  72     122       102        0    0  |  48 (68)    80 (79)       102        0    0
   4  exStbHwAcc                1432    3   239  |  2      6         238        1    0  |  <1 (†_b)   1 (†_b)       238        1    0
   5  exStbResolution            353   50    79  |  †_b    †_b         0        0   70  |  26 (59)    59 (61)        70        0    0
   6  exStbFb                    689   10   218  |  484    825       167        0    0  |  52 (†_b)   101 (†_b)     167        0    0
   7  exStbCc                    331    3    21  |  <1     3          19        2    0  |  <1 (<1)    <1 (<1)        19        2    0
   8  exStbDemo                14841  16∗   471  |  †_f    †_f         -        -    -  |  <1 (<1)    6 (7)         471        0    0
      exStbDemo [4 GB]         14841   17   471  |  †_f    †_f         -        -    -  |  M_f (M_f)  M_f (M_f)       -        -    -
      exStbDemo [28 GB]        14841   17   471  |  †_f    †_f         -        -    -  |  T_f (T_f)  T_f (T_f)       -        -    -
 8.1  threadRename                 6   17     0  |  <1     3           0        0    0  |  <1 (<1)    3 (3)           0        0    0
 8.2  fileExists                  19   17     0  |  <1     3           0        0    0  |  <1 (<1)    3 (3)           0        0    0
 8.3  readLine                    27   17    11  |  <1     3          10        1    0  |  <1 (<1)    3 (3)          10        1    0
 8.4  getCommand                 269   17    61  |  <1     6          60        1    0  |  <1 (<1)    3 (3)          60        1    0
 8.5  powerDown                    9   17     0  |  <1     2           0        0    0  |  <1 (<1)    2 (2)           0        0    0
 8.6  digitStart                  12   17     0  |  <1     2           0        0    0  |  <1 (<1)    2 (2)           0        0    0
 8.7  digitAdd                    34   17     2  |  <1     2           2        0    0  |  <1 (<1)    2 (2)           2        0    0
 8.8  checkEndOfPvrStream         32   13    13  |  <1     2          13        0    0  |  <1 (<1)    2 (2)          13        0    0
 8.9  checkEndOfMediaStream       28    1     1  |  <1     2           1        0    0  |  <1 (<1)    2 (2)           1        0    0
8.10  commandLoop                545   17    53  |  M_f    M_f         -        -    -  |  M_f (M_f)  M_f (M_f)       -        -    -
8.11  checkCommandParams         238   17   269  |  T_b    T_b         0        0  269  |  T_b (T_b)  T_b (T_b)       0        0  269
8.12  signal handler              13   17     0  |  <1     2           0        0    0  |  <1 (<1)    2 (2)           0        0    0
8.13  setupFBResolution           29   17     0  |  <1     2           0        0    0  |  <1 (<1)    2 (2)           0        0    0
8.14  setupFramebuffers          115   17     8  |  <1     3           8        0    0  |  <1 (<1)    3 (3)           8        0    0
8.15  main Thread                 68   17     4  |  T_f    T_f         -        -    -  |  <1 (<1)    4 (4)           3        1    0
8.16  set to raw                   8   17     0  |  <1     3           0        0    0  |  <1 (<1)    3 (3)           0        0    0
8.17  set to buffered              8   17     0  |  <1     2           0        0    0  |  <1 (<1)    2 (2)           0        0    0
Table 3.6: Results of the comparison between CBMC and ESBMC on an industrial case study.
Lettnin et al. [114] describe a semiformal verification methodology that combines
simulation and formal verification. This solution uses the front-end of the BLAST tool [88]
to convert a C program to a CFG and uses the SymC model checker [83] to verify the
design properties. Lettnin et al. apply this methodology in a case study to verify the
locking and unlocking rules of a driver. This approach, however, faces memory overflow
problems due to the BDD-based model checking algorithm and consequently the authors
have to set a threshold on the number of states during the static verification. Lettnin
et al. [115] extend [114] to combine assertion-based verification and symbolic simulation
for the verification of embedded software with hardware dependencies. However, their
approach does not produce counter-examples, and it therefore becomes hard to debug the
code in case of a failing property.
SMT-based BMC is gaining popularity in the formal veriﬁcation community due to
the advent of sophisticated SMT solvers built over eﬃcient SAT solvers [20, 31, 57].
Previous work related to SMT-based BMC [71, 181, 11] combined decision procedures
for the theories of uninterpreted functions, arrays and linear arithmetic only, but did
not encode key constructs of the ANSI-C programming language such as bit operations,
ﬁxed-point arithmetic and pointers. Ganai and Gupta describe a veriﬁcation framework
for BMC which extracts high-level design information from an extended ﬁnite state
machine (EFSM) and applies several techniques to simplify the BMC problem [71, 72].
However, the authors ﬂatten structures and arrays into scalar variables in such a way that
they use only the theory of integer and real arithmetic in order to solve the veriﬁcation
problems that come out in BMC.
Armando et al. also propose a BMC approach using SMT solvers for C programs [11].
However, they only make use of linear arithmetic (i.e., addition and multiplication by
constants), arrays, records and bit-vectors in order to solve the VCs. As a consequence,
their SMT-CBMC prototype does not address important constructs of the ANSI-C pro-
gramming language such as non-linear arithmetic and bit-shift operations. Kroening
also encodes the VCs generated by the front-end of CBMC by using the bit-vector arith-
metic and does not exploit other background theories of the SMT solvers to improve
scalability [105]. Donaldson et al. present an approach to compute invariants in BMC of
software by means of k-induction [63]. Their method, however, is highly customized for
checking assertions representing DMA operations in the Cell processor, which requires
only a small number of loop iterations and thus allows k-induction to work well with
a small value of k. Xu proposes the use of SMT-based BMC to verify real-time sys-
tems by using TCTL to specify the properties [181]. The author considers an informal
speciﬁcation (written in English) of the real-time system and then models the variables
using integers and reals and represents the clock constraints using linear arithmetic
expressions.
De Moura et al. present a bounded model checker that combines propositional SAT
solvers with domain-specific theorem provers over infinite domains [59]. Unlike
other related work, the authors abstract the Boolean formula and then apply a
lazy approach to refine it in an incremental way. This approach is applied to verify
timed automata and RTL level descriptions. Jackson et al. [98] discharge several VCs
from programs written in the SPARK language to the SMT solvers CVC3 and Yices as
well as to the theorem prover Simplify. The idea of this work is to replace the Praxis
prover by CVC3, Yices and Simplify in order to generate counter-example witnesses to
VCs that are not valid. In [99], Jackson and Passmore extend [98] by implementing a
tool to automatically discharge VCs using SMT solvers. The authors observed a significant
performance improvement of the SMT solvers compared to the Praxis prover. Jackson
and Passmore, however, focus on translating VCs into SMT from programs written in the
SPARK language (which is a subset of the Ada language) instead of ANSI-C programs.
Recently, a number of static checkers have been developed in order to trade oﬀ scalability
and precision. Calysto is an automatic static checker that is able to verify VCs related
to arithmetic overﬂow, null-pointer dereferences and assertions speciﬁed by the user [13].
The VCs are passed to the SMT solver SPEAR which supports boolean logic and bit-
vector arithmetic and is highly customized for the VCs generated by Calysto. However,
Calysto does not support ﬂoating-point operations and unsoundly approximates loops by
unrolling them only once. As a consequence, soundness is relinquished for performance.
Saturn is another automatic static checker that scales to larger systems, but with the
drawback of losing precision by supporting only the most common integer operators and
performing at most two unwindings of each loop [180]. In contrast to [13, 180], the
extended static checker for Java (ESC/Java) is a semi-automatic verification tool, which
requires the programmer to supply loop, function, and class invariants and thus limits
its acceptance in practice [69]. In addition, ESC/Java employs the Simplify theorem
prover [62] to verify user-supplied invariants and thus important constructs of the
programming language (e.g., bitwise operations) are often encoded imprecisely using axioms
and uninterpreted functions.
3.8 Conclusions
In this chapter, we have investigated SMT-based verification of ANSI-C programs, with
a focus on embedded software. We have described a new set of encodings
that allow us to reason accurately about bit operations, unions, fixed-point arithmetic,
pointers and pointer arithmetic, and implemented them in the ESBMC tool. As far as
we are aware, no previous encoding into the SMT theories could reliably handle full
ANSI-C. With these encodings we have successfully achieved the first objective stated
in Section 1.2.
Moreover, our experiments constitute, to the best of our knowledge, the first substantial
evaluation of SMT-based BMC on industrial applications. The results show that
ESBMC outperforms CBMC [42] and SMT-CBMC [11] if we consider the veriﬁcation
of embedded software. ESBMC is able to model check ANSI-C programs that involve
tight interplay between non-linear arithmetic, bit operations, pointers and array manip-
ulations. In addition, it was able to ﬁnd undiscovered bugs in the NECLA, PowerStone,
Siemens, SNU-RT, VERISEC and WCET benchmarks related to arithmetic overﬂow,
buﬀer overﬂow, invalid pointers and pointer arithmetic.
SMT-CBMC still has limitations not only in the veriﬁcation time (due to the lack of
simpliﬁcation based on high-level information), but also in the encodings of important
ANSI-C constructs used in embedded software. CBMC is a SAT-based BMC tool for full
ANSI-C, but it has limitations due to the fact that the size of the propositional formulae
increases signiﬁcantly in the presence of large data-paths and high-level information is
lost when the VCs are converted into propositional logic (preventing potential optimiza-
tions to reduce the state space to be explored). Its prototype SMT-based back-end is
still unstable and fails on a large fraction of our benchmarks.

Chapter 4
Verifying Multi-threaded Software using SMT-based Context-Bounded Model Checking
We describe and evaluate three approaches to model check multi-threaded software with
shared variables and locks using bounded model checking based on Satisﬁability Mod-
ulo Theories (SMT) and our modelling of the synchronization primitives of the Pthread
library in order to achieve the second objective stated in Section 1.2. In the lazy ap-
proach, we generate all possible interleavings and call the SMT solver on each of them
individually, until we either ﬁnd a bug, or have systematically explored all interleavings.
In the schedule recording approach, we encode all possible interleavings into one single
formula and then exploit the high speed of the SMT solvers. In the underapproximation
and widening approach, we reduce the state space by abstracting the number of inter-
leavings from the proofs of unsatisﬁability generated by the SMT solvers. In all three
approaches, we bound the number of context switches allowed among threads in order to
reduce the number of interleavings explored. We implemented these approaches in ES-
BMC, our SMT-based bounded model checker for ANSI-C programs. Our experiments
show that ESBMC can analyze larger problems and substantially reduce the veriﬁca-
tion time compared to state-of-the-art techniques that use iterative context-bounding
algorithms or counter-example guided abstraction reﬁnement.
4.1 Introduction
Bounded model checking (BMC) has already been successfully applied to verify software
and to discover subtle errors in real systems [24]. In an attempt to cope with growing
system complexity, Boolean Satisﬁability (SAT) solvers are increasingly replaced by
Satisﬁability Modulo Theories (SMT) solvers to prove the validity of the generated
veriﬁcation conditions (VCs) [11, 53, 71]. Recently, there have also been attempts to
extend BMC to the veriﬁcation of multi-threaded software [73, 102, 103, 152]. The
main challenge here is the state space explosion problem, as the number of possible
interleavings grows exponentially with the number of threads and program statements.
However, two important observations can help us. First, most concurrency bugs in real
applications have been found to be shallow so that only a few context switches are
required to expose them [150]. We can thus use a context-bounded analysis [112, 171]
that limits the number of context switches it explores. Second, SAT and SMT solvers
produce unsatisﬁable cores that allow us to remove logic that is not relevant to a given
property [129]. Grumberg et al. [84] showed that the unsatisﬁable cores can also be
used to control the number of allowed interleavings of the given set of processes. They
proposed a SAT-based BMC method to model check a multi-process system using a
series of under-approximated models. However, their method does not combine context-
bounded analysis with symbolic algorithms, which limits its usefulness for verifying
multi-threaded software. It has also not been applied in conjunction with SMT
solvers.
In the previous chapter, we extended the encodings from previous SMT-based BMC [11,
71] to provide more accurate support for variables of ﬁnite bit width, bit-vector opera-
tions, arrays, structures, unions and pointers. Here, we continue this work and develop
and evaluate three related approaches for model checking multi-threaded ANSI-C soft-
ware. In contrast to previous fully symbolic approaches (e.g., [73, 102, 103, 152, 84]), we
combine symbolic model checking with explicit state space exploration. In particular,
we explicitly explore the possible interleavings (up to the given context bound) while we
treat each interleaving itself symbolically. This approach is similar to the recent ESST
approach by Cimatti et al. [39], but we handle ANSI-C instead of SystemC, we use
BMC instead of predicate abstraction, and place no restrictions on the scheduler. Our
approaches all implicitly use the reachability tree (RT) derived from the system, but
diﬀer in the way they exploit it. In the lazy approach, we traverse the RT depth-ﬁrst,
and simply call the single-threaded BMC procedure on the interleaving whenever we
reach an RT leaf node. We stop the RT traversal either when we ﬁnd a bug, or have sys-
tematically explored all interleavings. In the schedule recording approach, we use the RT
to encode all the possible execution paths into one single formula, which is then fed into
the SMT solver. In a third approach, we extend the under-approximation and widening
(UW) algorithm [84] with the purpose of addressing the veriﬁcation of real-world C code
using diﬀerent background theories and SMT solvers.
This chapter makes two major novel contributions. First, we exploit SMT to improve
BMC of multi-threaded software. We describe a comprehensive SMT-based BMC procedure
to support the checking of multi-threaded C programs that use the synchronization
primitives of the POSIX Pthread Library [135]. Second, we describe and evaluate three
related approaches to SMT-based BMC. This work also marks the ﬁrst application of the
UW algorithm in combination with context-bounded model checking to verify non-trivial
multi-threaded C software. Experiments obtained with the extended ESBMC show that
our approaches can analyze larger problems and substantially reduce the veriﬁcation time
compared to state-of-the-art techniques that use iterative context-bounding algorithms
and others that implement counter-example guided abstraction reﬁnement (CEGAR)
techniques.
4.2 Preliminaries
In the widely adopted interleaving paradigm for multi-threaded programs, the notion
of concurrency is represented by that of interleaving, i.e., the non-deterministic choice
between activities of the simultaneously acting threads [40]. If only a single core is
available, the actions of the diﬀerent threads must obviously be interleaved on this core;
however this concept also applies to multiple cores, as there are many diﬀerent possible
orderings between truly concurrent events [14]. An interleaving represents a possible
execution of the program where all of the concurrent events are arranged in a linear
order. Any change of the active thread in an interleaving is called a context switch.
The interleaving paradigm relies on a scheduler, which selects the concurrently executing
threads according to a given strategy. This abstracts from the speed of the participating
threads and thus models any possible realization by a single-core machine or by several
cores with arbitrary speeds. However, in order to fully verify a multi-threaded program
against a given speciﬁcation, all possible interleavings must be considered. This results
in a large state space that must be explored by a model checker.
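To give a feeling for the size of this state space, the following Python sketch counts the interleavings of threads that never synchronize. The function name `count_interleavings` and the example thread sizes are illustrative assumptions and not part of ESBMC; the count itself is the standard multinomial coefficient.

```python
from math import factorial


def count_interleavings(lengths):
    """Number of linear orders of the statements of independent threads.

    lengths[j] is the number of statements of thread j (a hypothetical
    input); the statements of each thread keep their program order.
    """
    total = factorial(sum(lengths))
    for n in lengths:
        total //= factorial(n)  # exact: every partial quotient is an integer
    return total


if __name__ == "__main__":
    for lengths in ([2, 3], [3, 3, 3], [10, 10]):
        print(lengths, count_interleavings(lengths))
```

Even two threads with ten statements each already admit more than one hundred thousand interleavings, which is why the following sections bound the number of context switches.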
4.2.1 Multi-threaded Goto Programs
We consider multi-threaded ANSI-C programs in asynchronous mode and assume that
all threads in the program only communicate through shared global variables. ESBMC
handles full ANSI-C, but for presentation, we use a minimal language similar to the
internal goto-language of the CBMC model checker [42]. It is expressive enough to
model multi-threaded programs. We summarize the language in Figure 4.1.
A multi-threaded goto-program is a (numbered) list of commands. Commands include
assignments, non-deterministic assignments (Var = ∗), blocking statements (assume)
to cut off subsequent execution paths, and assertion statements (assert) to indicate
user-specified properties. All control structures are represented by explicit (conditional)
jumps to a statement l ∈ {1,...,n}. A thread t is a sublist of commands between
begin thread and end thread. Threads are created via asynchronous procedure calls
Prop ::= Var | true | false | Prop ∧ Prop |...| Exp = Exp | ...
Exp ::= Var | Const | Var[Exp] | Exp + Exp | ...
Cmd ::= skip | Var = Exp | Var = ∗ | assume Prop | assert Prop
| goto l | if Prop goto l | begin atomic | end atomic
| begin thread Id | end thread
| Var = start thread Id | join thread Var
Prog ::=Cmd;...;Cmd
Figure 4.1: Multi-threaded Goto Program Language
(start thread), which return an integer that can be used as a thread identifier for synchronization
(join thread); hence, dynamic thread creation is allowed. Atomic statements
(begin atomic and end atomic) indicate that a code segment cannot be preempted by
another thread. Figure 4.2 shows an example of a multi-threaded C program and its
representation in the multi-threaded goto-language. In this running example, we have
three threads t1, t2 and main. Each thread contains one or more eﬀective statements,
i.e., statements that can inﬂuence the program state. In our minimal language, the
only eﬀective statements are assignments and assertions, since control-ﬂow tests cannot
inﬂuence the state. In the example in Figure 4.2(b), thread t1 contains two eﬀective
statements (in lines 3 and 5), thread t2 contains three (in lines 9, 10 and 12), while
thread main contains one (in line 18).
4.2.2 Formal Model of Multi-threaded Software
The multi-threaded software to be analyzed is modelled as a tuple $M = (S, S_0, T, V)$
(cf. Definition 2.14), where:
• $S$ is a finite set of states, with $S_0 \subset S$ the set of initial states;
• $T = \{t_0, t_1, \ldots, t_n\}$ is the set of threads, where $n$ represents the total number of
threads;
• $V = V_{global} \cup \bigcup_j V_j$, where $V_{global}$ is the set of global variables and $V_j$ is the set of
local variables of $t_j$.
We assume that each variable ranges over a finite domain. A state $s \in S$ consists of
the values of the global and local variables, including a local program counter for each
thread. Each thread $j$ is a tuple $t_j = (R_j, l_j)$, where:
• $R_j \subseteq S \times S$ is the transition relation of thread $t_j$;
• $l_j = \langle l^j_i \rangle$ is the sequence of thread locations $l^j_i$ at time step $i$.
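[Figure 4.3, graph layout not reproduced. The nodes of the two control-flow graphs are listed in program order; the branch nodes L4 and L11 each have two outgoing edges labelled TRUE and FALSE.]
Thread 1: L2: START_THREAD 1; L3: x = x + 1; L4: if !(x > 1); L5: x = x - 1; L6: END_THREAD
Thread 2: L7: START_THREAD 2; L8: _Bool y; L9: x = x + 1; L10: y = x > 1; L11: if !y; L12: x = x - 1; L13: END_THREAD
Figure 4.3: CFG of two threads of the goto program shown in Figure 4.2 (b).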
The execution of the instructions of each thread $t_j$ is modelled by means of transition
relations and we use the notation $R^j_i(s, s')$ to denote that $s'$ is a successor of $s$ obtained
by executing at time step $i$ an instruction of thread $t_j$. We define $R_i(s, s') = \bigvee_j R^j_i(s, s')$
and $R(s, s') = \bigvee_i R_i(s, s')$. Finally, a particular program location, denoted by $l^j_0$, is
designated as the entry point of thread $t_j$.
4.2.3 Context-Bounded Encoding
As described in Section 2.3, our work considers multi-threaded programs in asynchronous
mode and assumes that the threads in the program only communicate through shared
(global) variables and synchronize to avoid the simultaneous access to shared variables.
This means that at all times only one thread is running until a context switch occurs
and another thread resumes its execution. Figure 4.4 shows an example of one possible
concurrent execution of the two threads from Figure 4.3.
In our approach, we only consider effective context switches (ECS), i.e., context switches to
effective statements. An ECS block is then defined as a sequence of program statements
that are executed with no intervening ECS. This definition is key to our context-bounded
translation, because we only allow context switches before visible statements (i.e., before
statements that access global variables and before synchronization points). If the program
statements are invisible, we group them into one ECS block, thus reducing the number of
possible concurrent executions.
[Figure 4.4, diagram not reproduced. Starting from start_thread 1 and start_thread 2, the execution is divided into ECS Blocks 0 to 5, annotated with the intermediate values of x and y, and ends with the end_thread statements of both threads.]
Figure 4.4: Concurrent execution of two threads.
In order to obtain a bounded multi-threaded C program, we bound the number of context
switches between the ECS blocks up to C, as described in detail in the next sections. The
technique is incomplete because there might still be a counterexample that requires more
context switches than the speciﬁed context-bound C, but it is both sound and precise
for context-bounded executions of multi-threaded programs. The technique of bounding
the number of context switches was originally proposed by Qadeer and Rehof [150], but
the authors apply this idea on Boolean programs using pushdown automata. Recently,
a number of context-bounded translations for model checking Boolean [112, 171] and C
programs [111, 152] have been proposed in the literature, but they neither use bounded
model checking to generate the VCs nor SMT solvers to check the validity of the VCs.
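The effect of the context bound on the number of executions can be illustrated with a small Python sketch. It enumerates the schedules of threads that consist of a given number of ECS blocks and keeps only those with at most C context switches. The function name `bounded_interleavings`, the block counts, and the convention that the choice of the first thread does not count as a context switch are assumptions made for this illustration only.

```python
def bounded_interleavings(lengths, bound):
    """Yield schedules (tuples of thread indices) that use at most `bound`
    context switches; lengths[j] is the number of ECS blocks of thread j."""

    def extend(prefix, remaining, switches):
        if not any(remaining):
            yield tuple(prefix)  # every block has been scheduled
            return
        for j, left in enumerate(remaining):
            if left == 0:
                continue
            # Assumption: scheduling the very first block is not a switch.
            switch = 1 if prefix and prefix[-1] != j else 0
            if switches + switch > bound:
                continue  # would exceed the context bound
            remaining[j] -= 1
            prefix.append(j)
            yield from extend(prefix, remaining, switches + switch)
            prefix.pop()
            remaining[j] += 1

    yield from extend([], list(lengths), 0)


if __name__ == "__main__":
    for c in range(5):
        print(c, sum(1 for _ in bounded_interleavings([2, 3], c)))
```

For two threads with two and three blocks, the bound has to reach four before all ten schedules are covered; any smaller bound explores a strict subset, which is the source of the incompleteness mentioned above.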
4.3 Context-Bounded Model Checking of Multi-threaded
Software
This section describes how to exploit SMT techniques to improve BMC of multi-threaded
software. In particular, we exploit SMT solvers to prune the property and data depen-
dent search space and to remove thread interleavings that are not relevant by analyzing
proofs of unsatisfiability. We then propose three approaches to SMT-based BMC and
show how the lazy, schedule recording, and UW approaches are encoded into the BMC
framework for multi-threaded software.
4.3.1 Exploring the Reachability Tree
In order to describe reachable states of a multi-threaded goto program, we use a reach-
ability tree (RT) that is obtained by unfolding the set of running threads.
Definition 4.1. For a multi-threaded program with n active threads, each node in the
RT is a tuple $\nu = (A_i, C_i, s_i, \langle l^j_i, G^j_i \rangle_{j=1}^{n})_i$ for a given time step $i$, where:
• $A_i$ represents the currently active thread;
• $C_i$ represents the context switch number;
• $s_i$ represents the current state;
• $l^j_i$ represents the current location of thread $j$;
• $G^j_i$ represents the control flow guards accumulated in thread $j$ along the path from
$l^j_0$ to $l^j_i$.
Since threads only communicate via global variables, we only need to consider context
switches at visible instructions, i.e., synchronization points and statements containing
global variables. As in Gupta et al. [73], we do not model context switches inside individ-
ual visible statements. This is safe as long as the statements only read or write a single
global variable, but in general it is an under-approximation. However, we have not en-
countered any problems in the benchmarks we have used. Additionally, we do not model
context switches between a visible control-ﬂow test and the next visible statement, since
the test cannot inﬂuence the state. However, note that we can simulate the eﬀect of a
context switch right after a visible test by hoisting the test out of the conditional, and
assigning its result to a new auxiliary variable, as shown in thread t2 in Figure 4.2(a).
ESBMC can be conﬁgured to automatically insert such auxiliary variables. Finally,
we also assume sequential consistency, as is common in model checking multi-threaded
software [44, 102, 138, 152].
In order to expand the RT and explore all possible interleavings, we symbolically execute
each instruction of the multi-threaded goto-program. This takes as input the program
and the current RT node, and generates its children according to the set of rules described
below. We assume that we expand an RT node $\nu$ at time step $i$ and that the guard $G^{A_i}_i$
of the thread $t_{A_i}$ is enabled in state $s_i$ (i.e., that the corresponding formula is satisfiable),
so that the thread can potentially execute the instruction $I$ at location $l^{A_i}_i$.
R1 (ASSIGN): If $I$ is an assignment x = e, then we symbolically execute $I$, which
generates a new state $s_{i+1}$. We then add as child to $\nu$ a new node $\nu'$
$$\nu' = (A_i, C_i, s_{i+1}, \langle l^j_{i+1}, G^j_i \rangle)_{i+1} \tag{4.1}$$
where the active thread remains unchanged. We increment the location of the active
thread only (i.e., $l^{A_i}_{i+1} = l^{A_i}_i + 1$) and leave all other locations and all guards unchanged;
however, note that the evaluation of the guards can change under the new state $s_{i+1}$,
and hence threads may become enabled.
We have fully expanded $\nu$ if
• $l^{A_i}_i$ is within an atomic block; or
• $I$ contains no global variable (since we allow context switches only at visible
instructions); or
• we have reached the upper bound of context switches to be explored (i.e., $C_i = C$).
If $\nu$ is not yet fully expanded, we then also explore all context switches, up to the given
context bound $C$. For each thread $j \neq A_i$ where $G^j_i$ is enabled in $s_{i+1}$, we thus create a
new child node
$$\nu'_j = (j, C_i + 1, s_{i+1}, \langle l^j_i, G^j_i \rangle)_{i+1} \tag{4.2}$$
In $\nu'_j$ we then continue the RT exploration with thread $j$ executing in the state produced
by the current thread $A_i$.
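Rule R1 can be sketched in Python as follows. This is a minimal illustration under several assumptions and not the ESBMC implementation: the class `Node`, the function `expand_assign` and its parameters are hypothetical names, the state is concrete instead of symbolic, the guards are already evaluated to Booleans, and the advanced location of the active thread is kept in the context-switch children as well.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Node:
    active: int                          # A_i: index of the active thread
    switches: int                        # C_i: context switches used so far
    state: Tuple[Tuple[str, int], ...]   # s_i as sorted (variable, value) pairs
    locs: Tuple[int, ...]                # l^j_i for every thread j
    guards: Tuple[bool, ...]             # G^j_i, here already evaluated


def expand_assign(node, var, value, bound, visible=True, in_atomic=False):
    """Children of `node` for the assignment var = value (rule R1)."""
    state = dict(node.state)
    state[var] = value                   # new state s_{i+1}
    locs = list(node.locs)
    locs[node.active] += 1               # only the active thread advances
    child = replace(node, state=tuple(sorted(state.items())), locs=tuple(locs))
    children = [child]                   # node of equation (4.1)
    fully_expanded = in_atomic or not visible or node.switches == bound
    if not fully_expanded:
        for j, enabled in enumerate(node.guards):
            if j != node.active and enabled:
                # node of equation (4.2): switch to thread j
                children.append(replace(child, active=j,
                                        switches=node.switches + 1))
    return children
```

Calling `expand_assign` on a node whose context switch number already equals the bound returns a single child, which mirrors the third condition for a fully expanded node.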
R2 (SKIP): If $I$ is a skip-statement, then we simply increment the location
of the current thread and continue with it. However, we explore no context switches,
i.e., we only add a single child node
$$\nu' = (A_i, C_i, s_i, \langle l^j_{i+1}, G^j_i \rangle)_{i+1} \tag{4.3}$$
where $l^j_{i+1} = l^j_i + 1$ only if $j = A_i$ and $l^j_{i+1} = l^j_i$ otherwise.
R3 (unconditional GOTO): If $I$ is an unconditional goto-statement with target $l$, then we
simply set the location of the current thread and continue with it. However, we explore
no context switches, i.e., we only add a single child node
$$\nu' = (A_i, C_i, s_i, \langle l^j_{i+1}, G^j_i \rangle)_{i+1} \tag{4.4}$$
where $l^j_{i+1} = l$ only if $j = A_i$ and $l^j_{i+1} = l^j_i$ otherwise.
R4 (conditional GOTO): If $I$ is a conditional goto-statement with test $c$ and target $l$,
then we create two child nodes $\nu'$ and $\nu''$ for both possible outcomes of the test. For $\nu'$,
we assume that $c$ is true and proceed with the target instruction of the jump, similarly to
unconditional jumps. However, we also add $c$ to the guards of all other threads, since it
may contain global variables, and may thus enable or disable other transitions (any
thread-local variables in $c$ are of course inaccessible to the other threads). Hence,
we construct
$$\nu' = (A_i, C_i, s_i, \langle l^j_{i+1}, c \wedge G^j_i \rangle)_{i+1} \tag{4.5}$$
where $l^j_{i+1} = l$ if $j = A_i$ and $l^j_{i+1} = l^j_i$ otherwise. For $\nu''$, we add $\neg c$ to the guards and
continue with the next instruction in the current thread, i.e.,
$$\nu'' = (A_i, C_i, s_i, \langle l^j_{i+1}, \neg c \wedge G^j_i \rangle)_{i+1} \tag{4.6}$$
where $l^j_{i+1} = l^j_i + 1$ if $j = A_i$ and $l^j_{i+1} = l^j_i$ otherwise. We prune one of the nodes if the
condition is determined in the current state (i.e., either evaluates to true or to false).
Note that we do not explore any possible context switches (even if $I$ is visible), since
the condition cannot change the global state.
R5 (ASSUME): If $I$ is an assume-statement with argument $c$, then we proceed similarly
to the way described in R1. We continue with the unchanged state $s_i$ but add $c$ to all
guards, as described in R4. If $c \wedge G^j_i$ evaluates to false, we prune the execution paths.
R6 (ASSERT): If $I$ is an assert-statement with argument $c$, then we proceed similarly
to the way described in R1. We continue with the unchanged state $s_i$ but add $c$ to
all guards, as described in R4. We also generate a verification condition to check the
validity of $c$.
R7 (START THREAD): If $I$ is a start thread instruction, we just add the indicated thread
to the set of active threads, i.e., we add a node
$$\nu' = (A_i, C_i, s_i, \langle l^j_{i+1}, G^j_{i+1} \rangle_{j=1}^{n+1})_{i+1} \tag{4.7}$$
where $l^{n+1}_{i+1}$ is the initial location of the indicated thread, and $G^{n+1}_{i+1} = G^{A_i}_i$, i.e., the
thread starts with the guards of the currently active thread.
R8 (JOIN THREAD): If $I$ is a join thread instruction with argument Id, then we add a
child node
$$\nu' = (A_i, C_i, s_i, \langle l^j_{i+1}, G^j_i \rangle)_{i+1} \tag{4.8}$$
where the location of the active thread is incremented, i.e., $l^{A_i}_{i+1} = l^{A_i}_i + 1$, only if the
joining thread Id has exited. We model this by an additional variable $exit_j$ that is set to
false when begin thread Id is called. When end thread
is reached, we set $exit_j$ to true to indicate that thread Id has exited.
The remaining instructions (begin atomic, end atomic, begin thread, and end thread) are
just scoping constructs and do not contribute to the expansion of the RT. As an example,
we consider the C program with two threads and the corresponding goto-program, as
shown in Figure 4.2(a) and (b). This example is modified slightly from Ghafari et al. [74],
where it is used to check (by increasing the number of increments) the scalability of
different context-bounded analysis algorithms. Both threads increment a global variable
x, and then, depending on the value of x, decrement it again. t2 uses a local variable
y to store the value of x and uses this in the test (cf. lines 12–13). This simulates a
possible context switch between the evaluation of the guard and the execution of the
next statement. Figure 4.3 shows the CFG representation of the two threads t1 and t2.
Note that this example contains an assertion violation in line 23, where the invariant
x = 1 does not hold under speciﬁc thread interleavings.
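The behaviour of the running example can be reproduced with a short Python sketch that executes every interleaving of the two threads concretely. It is only an illustration under stated assumptions: the statements follow the CFG of Figure 4.3, each control-flow test is grouped with the statement that follows it, the initial state is x = 0, and the invariant x = 1 is assumed to be checked once both threads have terminated. All function names are hypothetical.

```python
from itertools import combinations


def t1_inc(s):            # L3: x = x + 1
    s["x"] += 1


def t1_dec(s):            # L4-L5: decrement only if x > 1
    if s["x"] > 1:
        s["x"] -= 1


def t2_inc(s):            # L9: x = x + 1
    s["x"] += 1


def t2_test(s):           # L10: y = x > 1
    s["y"] = s["x"] > 1


def t2_dec(s):            # L11-L12: decrement only if y holds
    if s["y"]:
        s["x"] -= 1


T1 = [t1_inc, t1_dec]
T2 = [t2_inc, t2_test, t2_dec]


def schedules(n1, n2):
    """All orders of n1 blocks of t1 and n2 blocks of t2 (program order kept)."""
    for slots in combinations(range(n1 + n2), n1):
        yield [1 if i in slots else 2 for i in range(n1 + n2)]


def run(schedule):
    """Execute one interleaving from the initial state x = 0."""
    state = {"x": 0, "y": False}
    blocks = {1: iter(T1), 2: iter(T2)}
    for tid in schedule:
        next(blocks[tid])(state)
    return state


def violations():
    """Schedules whose final state violates the invariant x == 1."""
    return [s for s in schedules(len(T1), len(T2)) if run(s)["x"] != 1]
```

Under these assumptions, two of the ten interleavings end with x = 0. In both of them t2 stores the test result in y before t1 decrements x, so that t2 decrements x a second time.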
Figure 4.5 shows a fragment of the reachability tree for threads t1 and t2 (where t0 represents
the main thread). We build this by first executing the goto-program of Figure 4.2(b)
sequentially, i.e., in the same order that the threads are created. In this case, we first execute
the statements of t1 (i.e., lines 3-5), followed by the statements of t2 (i.e., lines 8-12).
The initial node of the RT fragment is $\nu_0 = (t_0, 0, s_0, \langle (L16, true), (L2, true), (L7, true) \rangle)$,
i.e., the main thread t0 is active at line 16, the program is before the first context switch,
the state $s_0$ has x = 0 and y undefined, and both threads t1 and t2 have just been started,
i.e., are at their initial location with guards true. To expand the RT, we check which
threads are enabled from $\nu_0$ (footnote 2). Since t1 and t2 are both enabled and since our approach
always expands the enabled thread with the smallest index, we expand the transitions
of t1. The transition relation $R^1_1(s_0, s_1)$ of t1 that represents the assignment x = x + 1
is defined as follows:
$$R^1_1(s_0, s_1) \Leftrightarrow l^1_1 = L3 \wedge x_1 = x_0 + 1 \wedge \forall v \in V \setminus \{x\} : v_1 = v_0$$
The first term corresponds to the unconditional edge from line 2 to 3 (see Figure 4.3). The
second term defines the new value of the shared variable x. The third term ensures that
the values of V, but not x, do not change in the transition from $s_0$ to $s_1$. To create node
$\nu_1$, we apply rule R1, which gives us $\nu_1 = (t_1, 1, s_1, \langle (L16, true), (L3, true), (L7, true) \rangle)$.
We then check again which threads are enabled and expand t1 as the enabled thread
with the smallest index. The transition relation that represents the branch at program
location L4 is defined by a case-split on the value of x in state $s_1$.
$$R^1_2(s_1, s_2) \Leftrightarrow l^1_2 = \begin{cases} L6 & : \neg(x_1 > 1), \\ L5 & : \text{otherwise} \end{cases} \wedge \forall v \in V : v_2 = v_1$$
The transition does not aﬀect the global state (as the condition ¬(x1 > 1) holds), so we
only increment the program location but do not create a new node in the RT (described in
rule R4). Therefore, to expand the next node from ν1, we check again which threads are
enabled and since t1 has executed all its statements, we then expand the ﬁrst instruction
of thread t2. The transition relation $R^2_3(s_2, s_3)$ of t2 is similar to $R^1_1(s_0, s_1)$. We thus apply
rules R1 and R2 to derive $\nu_2 = (t_2, 2, s_2, \langle (L16, true), (L6, true), (L9, true) \rangle)$. $\nu_3$ and $\nu_4$
are derived in the same way. After creating $\nu_4$, neither t1 nor t2 has enabled
transitions and we backtrack to explore pending transitions from previous nodes; in this
case, we have already explored $\nu_3$ and $\nu_2$ and continue the RT exploration at $\nu_1$.
Footnote 2: We ignore interleavings with t0 to simplify the presentation.
[Figure 4.5, tree layout not reproduced. Each node lists the active thread, the context switch number, the values of x and y, and the (location, guard) pairs of t0, t1 and t2.]
n0: t0, 0, x=0, y; (L16,true), (L2,true), (L7,true)
n1: t1, 1, x=1, y; (L16,true), (L3,true), (L7,true)
n2: t2, 2, x=2, y=false; (L16,true), (L6,true), (L9,true)
n3: t2, 3, x=2, y=true; (L16,true), (L6,true), (L10,true)
n4: t2, 4, x=1, y=true; (L16,true), (L6,true), (L12,false)
n5: t2, 2, x=2, y=false; (L16,true), (L3,true), (L9,true)
n6: t1, 3, x=1, y=false; (L16,true), (L5,false), (L9,true)
n7: t2, 4, x=1, y=false; (L16,true), (L5,false), (L10,true)
n8: t2, 3, x=2, y=true; (L16,true), (L3,true), (L10,true)
n9: t1, 4, x=1, y=true; (L16,true), (L5,false), (L10,true)
n10: t2, 5, x=0, y=true; (L16,true), (L5,false), (L12,false)
n11: t2, 4, x=1, y=true; (L16,true), (L5,true), (L12,false)
n12: t2, 1, x=1, y=false; (L16,true), (L2,true), (L9,true)
n13: t1, 2, x=2, y=false; (L16,true), (L3,true), (L9,true)
n14: t1, 3, x=1, y=false; (L16,true), (L5,false), (L9,true)
n15: t2, 4, x=1, y=false; (L16,true), (L5,false), (L10,true)
n16: t2, 3, x=2, y=true; (L16,true), (L3,true), (L10,true)
n17: t1, 4, x=1, y=true; (L16,true), (L5,false), (L10,true)
n18: t2, 5, x=0, y=true; (L16,true), (L5,false), (L12,false)
n19: t2, 4, x=1, y=true; (L16,true), (L3,true), (L12,false)
n20: t2, 2, x=1, y=false; (L16,true), (L2,true), (L10,true)
n21: t1, 3, x=2, y=false; (L16,true), (L3,true), (L13,true)
n22: t1, 4, x=1, y=false; (L16,true), (L5,false), (L13,true)
Figure 4.5: Fragment of the reachability tree of the multi-threaded goto-program of Figure 4.2(b). Nodes with dashed line represent program locations that violate the assertion statement in line 18 of Figure 4.2(b).
4.3.2 Lazy Approach
The idea of the lazy approach to verify multi-threaded software is to traverse the RT
depth-ﬁrst, and to call the single-threaded BMC procedure on each interleaving whenever
we reach an RT leaf node. We stop the RT traversal either when we ﬁnd a bug, or have
systematically explored all interleavings. This approach seems obvious, but to the best
of our knowledge, it has neither been formalized nor evaluated in the literature. Figure 4.6
details how the lazy approach works. Formally, given an RT $\Upsilon = \{\nu_1, \ldots, \nu_N\}$ that
represents the program unfolding for a context bound $C$ and a bound $k$, and a property
$\phi$, we derive a VC $\psi^\pi_k$ for a given interleaving (or computation path) $\pi = \{\nu_1, \ldots, \nu_k\}$
such that $\psi^\pi_k$ is satisfiable if and only if $\phi$ has a counterexample of depth $k$ that is
exhibited by $\pi$. As always in our work, the VC $\psi^\pi_k$ is a quantifier-free formula in a
decidable subset of first-order logic, which is checked for satisfiability by an SMT solver.
The model checking problem associated with SMT-based BMC of a given $\pi$ is then
formulated by constructing the logical formula [11, 71]:
$$\psi^\pi_k = \underbrace{I(s_0) \wedge R(s_0, s_1) \wedge \ldots \wedge R(s_{k-1}, s_k)}_{\text{constraints}} \wedge \underbrace{\neg\phi_k}_{\text{property}} \tag{4.9}$$
Here, $\phi_k$ represents a safety property $\phi$ in step $k$, $I$ is the function for the set of initial
states of $M$ and $R_i(s_i, s_{i+1})$ is the function representing the transition relation of $M$ at
time steps $i$ and $i + 1$, as described by the states in the nodes of $\pi$. If $\psi^\pi_k$ is satisfiable,
then $\phi$ is violated and the SMT solver provides a satisfying assignment, from which
we can extract the values of the program variables to construct a counterexample. A
counterexample for a property $\phi$ is a sequence of states $s_0, s_1, \ldots, s_k$ with $s_0 \in S_0$, and
$R(s_i, s_{i+1})$ for $0 \leq i < k$. If $\psi^\pi_k$ is unsatisfiable, we can conclude that no error state is
reachable in length $k$ along $\pi$.
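As a worked instance of formula (4.9), consider the interleaving of the running example that executes L3, L9 and L10, then the test and decrement of t1 (L4 and L5), and finally the test and decrement of t2 (L11 and L12). The indexed variables $x_0, \ldots, x_4$ and $y_1$ below are a notational assumption of this illustration, as is the choice to fold each control-flow test into an if-then-else term; we also assume that the invariant x = 1 is checked in the final state.

```latex
\begin{align*}
\psi^\pi_k ={}& \underbrace{x_0 = 0}_{I(s_0)}
  \wedge x_1 = x_0 + 1
  \wedge x_2 = x_1 + 1
  \wedge (y_1 \Leftrightarrow x_2 > 1) \\
 & \wedge x_3 = \mathit{ite}(x_2 > 1,\; x_2 - 1,\; x_2)
  \wedge x_4 = \mathit{ite}(y_1,\; x_3 - 1,\; x_3)
  \wedge \underbrace{\neg (x_4 = 1)}_{\neg\phi_k}
\end{align*}
```

The constraints force $x_1 = 1$, $x_2 = 2$, $y_1 = \mathit{true}$, $x_3 = 1$ and $x_4 = 0$, so the formula is satisfiable and this assignment is a counterexample in which x ends with the value 0.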
Step 1: Initialize the stack with the initial node $\nu_0$ and the initial path $\pi_0 = \langle \nu_0 \rangle$.
Step 2: If the stack is empty, terminate with “no error”.
Step 3: Pop the current node $\nu$ and current path $\pi$ off the stack and compute the set
$\nu'$ of successors of $\nu$ using rules R1-R8.
Step 4: If $\nu'$ is empty, derive the VC $\psi^\pi_k$ for $\pi$ using formula (4.9), and call the SMT
solver on it. If $\psi^\pi_k$ is satisfiable, terminate with “error”; otherwise, goto step 2.
Step 5: If $\nu'$ is not empty, then for each node $\nu \in \nu'$, add $\nu$ to $\pi$, and push node and
extended path on the stack. Goto step 3.
Figure 4.6: Algorithm of the lazy approach.
On the face of it, the lazy approach seems to be naive: despite the context-bounding, the
RT and thus the number of interleavings can grow very quickly, and we need to invoke
the SMT solver several times to check the satisfiability of formula (4.9), which might slow
down the verification process. However, there are several observations that make this
approach worthwhile. First, if the program contains any errors at all, they will often be
exhibited in a substantial fraction of the interleavings (cf. Qadeer and Rehof [150] and
our evaluation in Section 4.6 for experience on benchmarks and applications), so that in
practice we only need to explore a small part of the search space until we find the first
error. In our running example, the invariant x = 1 does not hold for the two nodes ν10 and
ν18 and if we traverse the RT depth-ﬁrst and left-to-right, the error already shows up in
the third interleaving. Second, we do not need to actually build the entire RT; instead, we
only keep in memory nodes on computation paths that are still unexplored and expand
them one path at a time. We then construct the VC for the chosen computation path
and feed it into the SMT solver to check for satisﬁability. Third, and most important,
we can leverage the optimizations from the ESBMC front-end (e.g., constant folding
and constant propagation as described in Chapter 5) to exploit which transitions are
enabled in a given state to drive the exploration of the interleavings and to reduce both
the number of interleavings to be explored and the size of the formulas sent to the SMT
solver. For example, if we continue to explore thread t1 from node ν1, the front-end
exploits the fact that x = 1 to infer that the guard in line 4 holds. t1 thus continues
in line 6, and terminates, so that the exploration continues with a context switch to
thread t2, as shown in node ν2. Note that our current implementation does not check
the satisﬁability of the accumulated guards, and simply assumes that all running threads
are enabled, unless they have explicitly been blocked or their guards evaluate to false.
Implementing this could further reduce the size of the RT to be explored.
In summary, the lazy approach guides the symbolic execution between the threads and
systematically explores all the possible interleavings in a lazy way. This approach can ﬁnd
bugs fast and the VCs are relatively small, since they correspond to a single interleaving
only. However, as the front-end invokes the SMT solver once for each possible computation
path, it can suffer performance degradation, in particular for correct programs where
we have to explore all possible interleavings.
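The stack-based traversal of Figure 4.6 can be sketched in Python as follows. This is an illustration under assumptions rather than the ESBMC implementation: `lazy_check` and its callbacks are hypothetical names, the callback `violates` stands in for the SMT solver call on one interleaving, no context bound is applied, and the successor function encodes the running example concretely, with a node given as (blocks executed by t1, blocks executed by t2, x, y).

```python
def lazy_check(initial, successors, violates):
    """Depth-first RT traversal in the style of Figure 4.6.

    Returns (path, calls): the first violating path (or None) and the number
    of times the solver stand-in `violates` was called.
    """
    stack = [(initial, [initial])]            # Step 1
    calls = 0
    while stack:                              # Step 2
        node, path = stack.pop()              # Step 3
        children = successors(node)
        if not children:                      # Step 4: leaf, check this path
            calls += 1
            if violates(path):
                return path, calls
            continue
        for child in reversed(children):      # Step 5: leftmost child on top
            stack.append((child, path + [child]))
    return None, calls


def successors(node):
    """Successors for the running example; t1 is expanded before t2."""
    i, j, x, y = node
    out = []
    if i == 0:
        out.append((1, j, x + 1, y))                   # L3
    elif i == 1:
        out.append((2, j, x - 1 if x > 1 else x, y))   # L4-L5
    if j == 0:
        out.append((i, 1, x + 1, y))                   # L9
    elif j == 1:
        out.append((i, 2, x, x > 1))                   # L10
    elif j == 2:
        out.append((i, 3, x - 1 if y else x, y))       # L11-L12
    return out


def violates(path):
    """Stand-in for the SMT check: the invariant x == 1 fails at the leaf."""
    return path[-1][2] != 1
```

With these definitions, `lazy_check((0, 0, 0, False), successors, violates)` stops after three solver calls, in line with the observation above that the error already shows up in the third interleaving.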
4.3.3 Schedule Recording Approach
State-of-the-art SMT solvers are built on top of eﬃcient SAT solvers to speed up the
performance on large problems by exploiting the support for conflict clauses and non-chronological
backtracking [163]. In the schedule recording approach we leverage this
and avoid invoking the SMT solver repeatedly. We thus build the RT as before to
systematically explore the interleavings, but we now add schedule guards [103] to record
in which order the scheduler has executed the program. Figure 4.7 shows how schedule
guards are added to the program during the exploration of the left-hand side of the
RT in Figure 4.5. We then encode all interleavings into a single large formula, which is
ﬁnally passed to the SMT solver.
[Figure 4.7, tree layout not reproduced. Each node gives the location, the pair (thread, #number) as printed in the figure and, for effective statements, the statement prefixed by its schedule guard.]
L2: (t1, #0)
L7: (t2, #0)
L3: (t1, #1)   ts1 == 1 -> x = x+1
L9: (t2, #2)   ts2 == 2 -> x = x+1
L10: (t2, #3)  ts3 == 2 -> y = x>1
L12: (t2, #4)  ts4 == 2 -> x = x-1
L9: (t2, #2)   ts2 == 2 -> x = x+1
L5: (t1, #3)   ts3 == 1 -> x = x-1
L10: (t2, #4)  ts4 == 2 -> y = x>1
L10: (t2, #3)  ts3 == 2 -> y = x>1
L5: (t1, #4)   ts4 == 1 -> x = x-1
L12: (t2, #5)  ts5 == 2 -> x = x-1
L12: (t2, #4)  ts4 == 2 -> x = x-1
Figure 4.7: Schedule recording applied to the left-hand side of the RT in Figure 4.5.
Since control-ﬂow tests cannot inﬂuence the state, we only need to add guards to ef-
fective statements, i.e., assignments and assertions (as described in Section 4.2.3). Each
eﬀective program statement is then preﬁxed by a schedule guard tsi = j where tsi is
the thread selection variable for the i-th ECS and j is the thread identiﬁer. Its intuitive
interpretation is that the statement can only be executed if thread j is scheduled to run
after the i-th ECS. For example, the schedule guard ts1 = 1 at L3 encodes that the
assignment x = x + 1 can only be executed if t1 runs after the ﬁrst ECS.
The schedule guards are added when program statements are executed symbolically
and become part of the produced verification conditions. They can be derived from the
RT nodes, i.e., for node ν_i we construct the guard ts_{C_i} = A_i. The thread selection
variables are free variables that the SMT solver will instantiate with concrete values.
The instantiation of all thread selection variables corresponds to the choice of a specific
interleaving. In our running example, if the SMT solver chooses ts1 = 1, ts2 = 2,
ts3 = 2, and ts4 = 2, then the model checker simulates the effect of executing the
program statements at L3, L9, L10, and L12 (in that order). Note that the ordering of
statements within a thread is of course still ensured by the program order semantics, so
that the program statement at L10 will not be executed before the program statement
at L9 (i.e., we ensure sequential consistency [44, 102, 138, 152]). We further deﬁne a
schedule SCH to determine which interleavings should be considered and encode the
guards in (4.10) as:
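\[
\psi_k = \underbrace{I(s_0) \wedge R(s_0, s_1) \wedge \ldots \wedge R(s_{k-1}, s_k)}_{\text{constraints}}
\wedge \underbrace{\neg\phi_k}_{\text{property}}
\wedge \underbrace{SCH(s_0) \wedge \ldots \wedge SCH(s_k)}_{\text{scheduler}}
\tag{4.10}
\]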
Here SCH(s_i) represents a constraint on the schedule guard of state s_i. If we do not
impose any schedule constraints, then we formulate $\bigwedge_{i=0}^{k} SCH(s_i) = true$
and all possible interleavings are considered. However, if we want to apply aggressive
reductions (for
example by exploiting the proofs of unsatisﬁability as described in the next subsection),
we can add constraints to SCH to force the removal of interleavings that do not con-
tribute to checking a given property. Although we can bound the number of preemptions
and exploit which transitions are enabled in a given state when we build formula (4.10),
the number of threads and context switches can still grow very large quickly, and easily
lead to formulae that overwhelm the solver.
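The role of the thread selection variables and of the schedule SCH can be illustrated with a small Python sketch. This is only an illustration under stated assumptions: the statements are taken from Figure 4.7 without their control-flow guards, the names simulate and find_models are hypothetical, and a brute-force enumeration of the values of ts_1, ..., ts_k stands in for the single query that would be sent to the SMT solver.

```python
from itertools import product

# Hypothetical program: per-thread lists of effective statements.
THREADS = {
    1: [lambda s: s.update(x=s["x"] + 1),      # L3:  x = x+1
        lambda s: s.update(x=s["x"] - 1)],     # L5:  x = x-1
    2: [lambda s: s.update(x=s["x"] + 1),      # L9:  x = x+1
        lambda s: s.update(y=s["x"] > 1),      # L10: y = x>1
        lambda s: s.update(x=s["x"] - 1)],     # L12: x = x-1
}
INIT = {"x": 0, "y": False}
PROP = lambda s: not s["y"]                    # assumed property: y stays false


def simulate(ts, threads, init):
    """Run the statements enabled by the guards ts_i == j; ts[i-1] is the value of ts_i.

    Program order inside each thread is preserved. Returns None when a thread
    is selected although it has no statement left (an infeasible choice).
    """
    state = dict(init)
    pcs = {tid: 0 for tid in threads}
    for tid in ts:
        if pcs[tid] >= len(threads[tid]):
            return None
        threads[tid][pcs[tid]](state)
        pcs[tid] += 1
    return state


def find_models(k, threads, init, prop, sch=lambda ts: True):
    """Enumerate the assignments to ts_1..ts_k that satisfy SCH and violate prop."""
    models = []
    for ts in product(sorted(threads), repeat=k):
        if not sch(ts):
            continue                           # excluded by the schedule constraint
        state = simulate(ts, threads, init)
        if state is not None and not prop(state):
            models.append(ts)
    return models


print(simulate((1, 2, 2, 2), THREADS, INIT))   # choice ts1=1, ts2=2, ts3=2, ts4=2
print(find_models(5, THREADS, INIT, PROP))
```

Passing a restrictive sch argument, for instance one that admits a single interleaving, shows how constraints on the schedule remove interleavings from consideration without changing the rest of the encoding.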
4.3.4 UW Approach
The core idea of the under-approximation and widening (UW) approach is to check
models with an increasing set of allowed interleavings [84]. We start from an underap-
proximation describing a single interleaving and widen the model by adding more inter-
leavings incrementally based on the proof objects generated from an SMT solver [57].
We thus exploit the SMT solvers to remove possible undesired models of the program in
order to satisfy a given property. This is possible because the SMT solvers can conclude
that a given model is unsatisﬁable without even using all of its constraints (since some
of them might be redundant).
We deﬁne ψ′ as an underapproximated model of ψ, i.e., ψ′ = ψ ∧ SCH(s0) ∧ ... ∧
SCH(sk), where we introduce constraints on the schedule guards. We can see that if ψ
is unsatisﬁable, then ψ′ is also unsatisﬁable; however, it is possible that ψ is satisﬁable
while ψ′ is not, due to the constraints on the schedule. Thus, ψ′ can be thought of as
an underapproximation of ψ and each satisfying assignment of ψ′ is also a satisfying
assignment to ψ. The main steps of the UW algorithm are shown in Figure 4.8.
Step 1: Add control literals clij (where i is the ECS number and j is the thread
identifier) to the VC ψk.
Step 2: Add negated control literals ¬clij to the schedule SCH, except those enabling
the first interleaving.
Step 3: Check satisfiability of ψk; if ψk is satisfiable, then terminate with “error”.
Step 4: Check whether the proof objects generated by the SMT solver contain any
control literals; if not, terminate with “no error”.
Step 5: Remove literals that are contained in the proof objects from the schedule SCH
and go to step 3.
Figure 4.8: Algorithm of the UW approach.
The additional literals clij introduce constraints on the schedule guards (e.g., clij →
tsi = j), which allow us to guide the widening process according to the variables that
participate in the proof of unsatisfiability produced by the SMT solver. This means that
the schedule is now updated based on the information extracted from the proof, which
aims to remove interleavings that are not relevant for checking a given property [84,
129]. Note that the way that we encode the underapproximation diﬀers from Grumberg
et al. [84]. Grumberg et al. encode an underapproximation using m × n control literals,
where m is the number of control points that guard each program statement and n is
the number of threads. In our encoding, we use e × n control literals, where e is the
number of ECS (with e ≤ m) and n is the number of threads. If we were to include a
control literal for each statement as in [84], then our solution might not scale in practice
to large multi-threaded software systems.
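The widening loop of Figure 4.8 can be sketched in Python as follows. The sketch rests on several assumptions that are not taken from the thesis: a negated control literal ¬cl_ij is read as forbidding the choice ts_i = j, satisfiability is decided by brute-force enumeration instead of an SMT solver, and the proof object is approximated by a deletion-based minimal set of negated literals that still suffices for unsatisfiability. All function names are hypothetical and the example program reuses the statements of Figure 4.7 without guards.

```python
from itertools import product

THREADS = {
    1: [lambda s: s.update(x=s["x"] + 1), lambda s: s.update(x=s["x"] - 1)],
    2: [lambda s: s.update(x=s["x"] + 1), lambda s: s.update(y=s["x"] > 1),
        lambda s: s.update(x=s["x"] - 1)],
}
INIT = {"x": 0, "y": False}


def simulate(ts, threads, init):
    """Execute the schedule ts in program order; None if it is infeasible."""
    state, pcs = dict(init), {tid: 0 for tid in threads}
    for tid in ts:
        if pcs[tid] >= len(threads[tid]):
            return None
        threads[tid][pcs[tid]](state)
        pcs[tid] += 1
    return state


def violating_schedule(disabled, k, threads, init, prop):
    """Stand-in for the solver: find a schedule that avoids every disabled
    pair (i, j) and violates prop, or return None (unsatisfiable)."""
    for ts in product(sorted(threads), repeat=k):
        if any((i + 1, tid) in disabled for i, tid in enumerate(ts)):
            continue
        state = simulate(ts, threads, init)
        if state is not None and not prop(state):
            return ts
    return None


def uw_check(first, threads, init, prop):
    k = len(first)
    # Step 2: disable everything except the first interleaving.
    disabled = {(i + 1, tid) for i in range(k) for tid in threads if tid != first[i]}
    iterations = 0
    while True:
        iterations += 1
        witness = violating_schedule(disabled, k, threads, init, prop)   # Step 3
        if witness is not None:
            return "error", iterations, witness
        # Step 4: approximate the proof object by a minimal unsatisfiable subset.
        core = set(disabled)
        for lit in sorted(disabled):
            if violating_schedule(core - {lit}, k, threads, init, prop) is None:
                core.discard(lit)
        if not core:
            return "no error", iterations, None
        disabled -= core                                                 # Step 5


print(uw_check((1, 1, 2, 2, 2), THREADS, INIT, lambda s: not s["y"]))
print(uw_check((1, 1, 2, 2, 2), THREADS, INIT, lambda s: s["x"] <= 2))
```

Termination follows because the set of disabled pairs shrinks strictly in every iteration that does not return. For the property that holds, the first unsatisfiability result does not depend on any schedule constraint, so the loop stops after one iteration.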
4.3.5 Pruning the RT with Partial Order Reduction
In the modelling of multi-threaded software, we consider that any of the threads j ∈ T
is able to make a transition and then we have to compute all states for which a thread j
exists. The problem is that the number of states to be explored can grow dramatically
with the number of program statements and threads. The purpose of the Partial-Order
Reduction (POR) technique [40, 75, 146] is to reduce the number of states that have to
be explored. This is done in a way that if the property holds on the reduced model, it
also holds on the original model.
In our SMT-based BMC framework, as threads communicate only through global vari-
ables, we apply partial order reduction techniques at two levels in our algorithm. At the
ﬁrst level, we apply the visible instruction analysis POR (VI-POR) [146], which removes
the interleavings of instructions that do not aﬀect the global variables (i.e., we remove
transitions which are independent from transitions made by any other thread). As we
mentioned in Section 4.3.1, an instruction is visible only if it accesses a global variable,
and it is invisible otherwise. VI-POR is “hard-wired” into our approach, due to the way
we build the ECS blocks.
n0: t0, 0, x=0, y=0 (L3, L6, L9)
n1: t1, 1, x=1, y=0 (L5, L6, L9)
n2: t2, 2, x=2, y=0 (L5, L8, L9)
n5: t3, 2, x=1, y=1 (L5, L6, L11)
n7: t2, 1, x=1, y=0 (L3, L8, L9)
n8: t1, 2, x=2, y=0 (L5, L8, L9)
n10: t3, 2, x=1, y=1 (L3, L8, L11)
n12: t3, 1, x=0, y=1 (L3, L6, L11)
n13: t1, 2, x=1, y=1 (L5, L6, L11)
n15: t2, 2, x=1, y=1 (L3, L8, L11)
n3: t3, 3, x=2, y=1 (L5, L8, L11)
n6: t2, 3, x=2, y=1 (L5, L8, L11)
n9: t3, 3, x=2, y=1 (L5, L8, L11)
n11: t2, 3, x=2, y=1 (L5, L8, L11)
n14: t3, 3, x=2, y=1 (L5, L8, L11)
n16: t1, 3, x=2, y=1 (L5, L8, L11)
Figure 4.10: The reachability tree for threads t1, t2, and t3 of the multi-threaded
goto-program of Figure 4.9(b). Edges with dashed line represent transitions that can
be eliminated by RW-POR.
In order to implement the RW-POR technique, we compute the sets of variables written
(WRj) and read (RDj) by each of the threads. In particular, if
\[
WR_j \cap \Bigl( \bigcup_{k \neq j} (RD_k \cup WR_k) \Bigr) = \emptyset \tag{4.11}
\]
and
\[
RD_j \cap \bigcup_{k \neq j} WR_k = \emptyset \tag{4.12}
\]
i.e., if the intersection between the set of visible variables that are written and read by
thread j and all other threads is empty, then we only explore the successors generated
by executing j while all other transitions can be safely ignored.
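Conditions (4.11) and (4.12) translate directly into set operations. The following Python sketch shows the check; the function name can_prune_others is hypothetical, and the read and write sets are assumed for illustration (threads t1 and t2 read and write x, thread t3 only writes y), since the goto-program of Figure 4.9 is not reproduced here.

```python
def can_prune_others(j, WR, RD):
    """Return True if thread j satisfies both (4.11) and (4.12).

    WR and RD map each thread identifier to the set of global variables
    that the thread writes and reads, respectively.
    """
    others = [k for k in WR if k != j]
    accessed_by_others = set().union(*(RD[k] | WR[k] for k in others))
    written_by_others = set().union(*(WR[k] for k in others))
    cond_4_11 = not (WR[j] & accessed_by_others)   # WR_j disjoint from RD_k and WR_k
    cond_4_12 = not (RD[j] & written_by_others)    # RD_j disjoint from WR_k
    return cond_4_11 and cond_4_12


# Assumed access sets for three threads.
WR = {1: {"x"}, 2: {"x"}, 3: {"y"}}
RD = {1: {"x"}, 2: {"x"}, 3: set()}

for tid in (1, 2, 3):
    print(tid, can_prune_others(tid, WR, RD))
```

Under these assumed sets only t3 meets both conditions, because t1 and t2 conflict with each other on x.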
For instance, in Figure 4.10 we get the node ν1 from the initial node ν0 after executing
the program statement x = x + 1 of thread t1. We can see that from node ν1, we still
have statements from threads t2 and t3 to execute. However, since thread t3 does not
share any global variable with thread t1, then we can safely ignore the transition from
node ν1 to ν5 (and consequently from node ν5 to node ν6). This reduction is safe because
the different order of execution between the statements of threads t2 and t3 (or vice-versa)
from node ν1 always results in the same state. Hence, the RW-POR technique exploits
the commutativity of concurrent transitions that result in the same state when they are
executed in diﬀerent orders. In our example, the transitions that can be safely eliminated
by applying the RW-POR technique are indicated by edges with dashed line, as shown
in Figure 4.10. Figure 4.11 shows the RT of Figure 4.10 after applying the RW-POR
technique.
n0: t0, 0, x=0, y=0 (L3, L6, L9)
n1: t1, 1, x=1, y=0 (L5, L6, L9)
n2: t2, 2, x=2, y=0 (L5, L8, L9)
n7: t2, 1, x=1, y=0 (L3, L8, L9)
n8: t1, 2, x=2, y=0 (L5, L8, L9)
n3: t3, 3, x=2, y=1 (L5, L8, L11)
n9: t3, 3, x=2, y=1 (L5, L8, L11)
Figure 4.11: The reachability tree for threads t1, t2, and t3 after applying the RW-
POR technique.
In summary, there are six possible combinations of visible instructions of diﬀerent
threads, as shown in Table 4.1. There are three particular situations to consider when
we build the reachability tree, as follows:
1. Two read operations from the same global variable, but from diﬀerent threads will
not modify the state, so they will always generate equivalent interleavings.
2. Two program statements accessing diﬀerent variables are independent w.r.t. their
execution states, thus these two program statements always generate equivalent
interleavings with both execution orders.
3. Two program statements accessing the same global variable in such a way that at
least one of them is a write access (i.e., with read-write and write-write relations)
will generate non-equivalent interleavings.
Of these three situations, only the third is problematic: the read-write relation causes
read-write races and the write-write relation causes write-write races. Consequently,
only two types of relations will generate non-equivalent interleavings, while all other
four types of relations generate equivalent interleavings. Those redundant interleavings
are simply removed in our approach.
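The classification of the six access combinations can be written as a short Python predicate. The function name equivalent and the encoding of an access as a (variable, kind) pair are assumptions made for this illustration.

```python
def equivalent(a, b):
    """Decide whether swapping two visible accesses of different threads
    yields an equivalent interleaving.

    Each access is a pair (variable, kind) with kind "R" (read) or "W" (write).
    """
    (var_a, kind_a), (var_b, kind_b) = a, b
    if var_a != var_b:
        return True                          # different variables never conflict
    return kind_a == "R" and kind_b == "R"   # same variable: only read-read commutes


# Enumerate the six combinations: {same, different} x {R-R, R-W, W-W}.
for label, other in (("same", "x"), ("different", "y")):
    for kinds in (("R", "R"), ("R", "W"), ("W", "W")):
        verdict = equivalent(("x", kinds[0]), (other, kinds[1]))
        print(label, "-".join(kinds), "equivalent" if verdict else "non-equivalent")
```

Only the read-write and write-write combinations on the same variable are reported as non-equivalent, so these are the only orderings that have to be kept in the reachability tree.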
PORs work best in conjunction with an alias analysis. Our algorithms are able to
remove redundant interleavings originating from pointer aliasing by dereferencing the
actual thread parameters before building the reachability tree. This means that when
a given thread is created with an argument (e.g., a pointer to a void type) and this
argument is used by the thread, we first get the object that the pointer points to before
we apply the POR algorithms to build the reachability tree.
Access relations to    Read-Read     Read-Write        Write-Write
Same variable          Equivalent    Non-equivalent    Non-equivalent
Different variables    Equivalent    Equivalent        Equivalent
Table 4.1: Read-write analysis of interleaving equivalence between visible instructions.
4.4 Verifying Race Conditions and Atomicity Violations
Concurrency bugs are tricky to reproduce and debug because they usually occur under
speciﬁc thread interleavings. When verifying multi-threaded programs, it is important
to detect data race conditions and atomicity violations, which are consistently ranked
as the most common and diﬃcult source of concurrency faults [61, 117]. This section
presents our instrumentation to check for data race and atomicity violations in the
multi-threaded goto programs.
4.4.1 Detecting Data Races
In a multi-threaded program with shared variable communication between the threads,
data race conditions occur when multiple threads perform unsynchronized accesses to
the shared variable [67, 139, 155, 158, 161]. In particular, a data race occurs when two
(or more) threads access a shared variable at the same time and at least one of them
is a write access. In a multi-threaded program, data races are often manifestations of
bugs, because they may cause the program to behave in ways that are not expected by
the developers.
There are two situations where data races may occur when two threads have access to
the same shared variable simultaneously. In read-write races, one of the operations is
a read and the other one is a write. Here, a data race occurs because the value is changed
by the write operation at the same time as it is read by the read operation.
In write-write races, both operations write (different) values to the same variable.
Here, a data race occurs because the first value written is overwritten by the other write
operation.
We can identify both types of data races by breaking visible statements into two stages.
In the ﬁrst stage, we copy the value of the global variable into a local temporary variable
and allow context switch. In the second stage, we check if the current value of the variable
is the same as the copied value and if so we perform the assignment; otherwise we have
detected an error (i.e., the assertion in the atomic section is violated). Figure 4.12
4.5.1 Modelling Mutex Locking Operations
The Pthread library supports two functions to implement mutual exclusion between
threads called pthread_mutex_lock and pthread_mutex_unlock [143]. Both functions
take as argument a data structure called mutex that has two states, “locked” and
“unlocked”. The function pthread_mutex_lock locks the mutex if it is unlocked; otherwise
it blocks the current thread until the mutex is unlocked and can successfully be locked
again. The function pthread_mutex_unlock simply unlocks a locked mutex. Computation
paths are blocked on a mutex when a thread tries to lock a mutex that has
already been locked by another thread. As an example, consider the threads tA and
tB, which both lock and unlock the same mutex m, as shown in Figure 4.15. The paths
A0;A1;B0;B1 and B0;B1;A0;A1 are non-blocking or wait-free while the other two paths
are blocked.
START_THREAD
A0: lock(*m)
A1: unlock(*m)
END_THREAD
START_THREAD
B0: lock(*m)
B1: unlock(*m)
END_THREAD
Figure 4.15: Computation paths blocking on a mutex.
A strategy to model mutex operations based on the notion of wait-free paths was
proposed by [151, 152]. Instead of blocking the computation paths starting with A0;B0 and
B0;A0, they are simply ignored by modelling the function pthread_mutex_lock(m) as
atomic { assume(*m == 0); *m = 1 }
where the statement assume(*m == 0) cuts off subsequent paths if the mutex is already
locked. pthread_mutex_unlock(m) is then modelled as
atomic { assert(*m == 1); *m = 0 }
which simply checks if the mutex is already locked. If so, the lock is released; otherwise, a
thread tries to unlock a mutex that has not been locked previously, and we have detected
an error.
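The effect of this wait-free model on the threads of Figure 4.15 can be reproduced with a small Python sketch. The exception classes Pruned and Violation, the dictionary representation of the mutex, and the helper merges are assumptions introduced for this illustration; atomicity is trivially given here because each modelled operation runs as a single function call.

```python
class Pruned(Exception):
    """Raised when assume(*m == 0) fails: the path is silently discarded."""


class Violation(Exception):
    """Raised when assert(*m == 1) fails: a property violation is reported."""


def lock(m):
    if m["locked"] != 0:       # assume(*m == 0)
        raise Pruned()
    m["locked"] = 1            # *m = 1


def unlock(m):
    if m["locked"] != 1:       # assert(*m == 1)
        raise Violation()
    m["locked"] = 0            # *m = 0


OPS = {"A0": lock, "A1": unlock, "B0": lock, "B1": unlock}


def merges(a, b):
    """Yield all interleavings of lists a and b that respect program order."""
    if not a or not b:
        yield tuple(a + b)
        return
    for rest in merges(a[1:], b):
        yield (a[0],) + rest
    for rest in merges(a, b[1:]):
        yield (b[0],) + rest


wait_free, pruned = [], []
for path in merges(["A0", "A1"], ["B0", "B1"]):
    m = {"locked": 0}
    try:
        for label in path:
            OPS[label](m)
        wait_free.append(path)
    except Pruned:
        pruned.append(path)

print(wait_free)
print(len(pruned))
```

Every discarded interleaving begins with A0;B0 or B0;A0, which corresponds to the two blocked paths of the figure, and none of the interleavings triggers the assertion in unlock.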
This is sufficient to find bugs related to data races and lock acquisition ordering, but
not to detect local and global deadlocks [151, 152]. We thus model pthread_mutex_lock
and we only lock it after the first call to pthread_mutex_lock. In subsequent calls, we
increase the value of the variable c_lock, allow context switches, check if the mutex m
was unlocked, and then assert c_lock < trds_in_run. If the assertion fails, a deadlock was
detected: a thread is blocked by a lock operation on a mutex and the required mutex
never gets unlocked by the thread that owns it, either because the locking thread has
exited or because it has been blocked by another operation. If the assertion holds, we
then eliminate this execution as described above.
As an example, Figure 4.17 shows a code fragment extracted (and slightly modified) from
the INSPECT suite [182], which aims to capture a concurrent scenario typically used in
database systems, as described in [183]. For the sake of simplicity, we show the code for
threads t1 and t2 only, which support two distinct classes of operations A and B (see
lines 9 and 20) on a shared database. The intuitive interpretation of these operations is
that the threads can run concurrently only if they belong to the same operation class.
Here, the global variables A and B count the number of threads that are performing
operations A and B respectively. The mutex l is used for the mutual exclusion between
threads of distinct classes, while the mutex m is used for the mutual exclusion between
threads of the same class.
The code shown in Figure 4.17 tries to implement the concurrent scenario described
above, but it contains a subtle error (i.e., a local deadlock) that is exposed only under
specific thread interleavings. Since we have four threads and two of them are locked (i.e.,
trds_in_run = 4 and c_lock = 2), our approach does not detect the local deadlock
immediately. A local deadlock is detected only when the exploration of the running threads
(that are not in deadlock) terminates and the invariant c_lock < trds_in_run becomes
false.
One possible thread interleaving to expose this error is to execute the program statements
of threads t1 and t2 in the following order: t_{1,5}, t_{1,6}, t_{1,7}, t_{2,16}, t_{2,17} and
t_{2,18} (the term t_{j,i} denotes that j is the thread identifier and i is the program statement).
The full counterexample produced by our model checker is shown in Appendix D.
4.5.2 Modelling Conditional Waiting
We model the functions pthread_cond_wait, pthread_cond_signal, and pthread_cond_broadcast
from the Pthread library that implement conditional waiting [143]. All functions take
as argument a condition variable c that has also two states, “locked” and “unlocked”;
pthread_cond_wait also takes a mutex argument. Our modelling of the conditional waiting
operation again employs the notion of wait-free execution paths. pthread_cond_wait
is used to block the thread on a condition variable; the blocked thread is woken up only
if another thread calls signal or broadcast. If several threads are blocked on a condition
variable, then pthread_cond_signal non-deterministically unblocks at least one of them
4.6 Experimental Evaluation
We have implemented the lazy, schedule recording, and UW approaches described in
Section 4.3 in ESBMC. In our experiments, we have used ESBMC v1.15.1 together with
Z3 v2.11 [57], which was the most eﬃcient SMT solver in our previous experiments [53]
(see also Chapter 3).
The experimental evaluation of our work consists of three parts. In Section 4.6.1, we com-
pare our approaches against the Monotonic Partial Order Reduction (MPOR) [103] and
Peephole Partial Order Reduction (PPOR) [178] that are implemented in an SMT-based
bounded model checker using the Yices SMT solver [65]. In Section 4.6.2, we compare
our lazy approach against CHESS v0.1.30626.0 [136, 137], which is a concurrency test-
ing tool for C# programs. CHESS supports iterative context-bounding by exploring the
various thread schedules deterministically. In Section 4.6.3, we compare our approaches
against SATABS version 2.5 [44] connected to Cadence SMV [124], which is a state-of-
the-art C model checker and supports the veriﬁcation of multi-threaded software with
shared variables using the CEGAR technique.
All experiments were conducted on otherwise idle Intel Pentium Dual CPU machines,
clocked at 2 GHz and 3 GHz, with 4 GB of RAM, running Windows and Linux respectively.
For all benchmarks, the time limit has been set to 3600 seconds to check all properties at
once. All times given are wall clock times in seconds as measured by the Unix time
command through a single execution. In our experiments, we chose CHESS [136, 138]
and SATABS [44] as two of the most widely used verification tools.
4.6.1 Comparison to MPOR and PPOR
We use the dining philosophers model to evaluate our approaches against MPOR and
PPOR. MPOR and PPOR combine dynamic partial order reduction [68] with symbolic
state space exploration for model checking multi-threaded software. Both MPOR and
PPOR thus explore all necessary interleavings by dynamically tracking interactions
between the threads and adding constraints to allow automatic pruning of
redundant interleavings in the SMT solver. However, MPOR is based on the notion of
quasi-monotonic sequences of thread-ids, i.e., if all transitions enabled at a global state
are independent then MPOR needs to explore just one interleaving, which is chosen to
be the one in which transitions are executed in increasing (monotonic) order of their
thread-ids, while PPOR is based on the notion of guarded independent transitions, i.e.,
transitions that can be considered as independent in certain execution paths. MPOR
is optimal (i.e., it removes all redundant interleavings) for multi-threaded programs with
more than two threads while PPOR is optimal for programs with two threads.
Since the benchmarks used by Kahlon et al. [103] are not available, we re-implemented
eaten at least once. We also show in column #FI/#I that all interleavings generated
by our lazy ESBMC are satisﬁable, i.e., that each interleaving exhibits the error. In
summary, our lazy approach outperforms both MPOR and PPOR for those benchmarks
that generate satisﬁable formulae and is still comparable to MPOR and PPOR when
the generated formulae are unsatisﬁable.
4.6.2 Comparison to CHESS
CHESS is a concurrency testing tool for C# programs. It implements iterative context-
bounding and explores the various thread schedules deterministically [136, 138]. CHESS
requires idempotent unit tests that it repeatedly executes in a loop, exploring a diﬀerent
interleaving on each iteration. In this respect, it is similar to our lazy approach; however,
CHESS is a purely dynamic, test-based tool and originally employed a stateless search
technique, although its latest version (v0.1.30626.0) performs state hashing based on
a happens-before graph to avoid exploring the same state redundantly.
Table 4.3 shows the detailed results of the comparison between ESBMC and CHESS on a
2GHz machine. reorder, twostage, and wronglock are diﬀerent versions of a reader/writer
program [156]. The numbers (x,y) indicate that we have x instance(s) of thread tset and
y instance(s) of thread treader. According to [156], increasing the number of instances
of a given thread while keeping constant the number of instances of the other thread,
substantially increases the “semantic hardness” of the error discovery. Note that all
these benchmarks only check for a single, violated property. micro is a synthetic micro-
benchmark [74] (shown in Figure 1.1 of Chapter 1) which checks a single valid property.
It is used to check the scalability of multi-threaded software veriﬁcation tools. The
number in brackets indicates the total number of visible statements on each thread. In
the table, L is the size of the code (in lines), and T the total number of threads. B is
the number of BMC unrolling steps for each loop, while C is the context switch bound.
Except for reorder 6 bad, C is set to the minimum number of context switches required
to expose the error. We increase further the number of context switches for reorder 6 bad
because we want to check the scalability of both tools. Time is the time in seconds until
the error is found; timeouts are denoted by TO. For ESBMC, I is the total number of
generated interleavings, while FI is the total number of failed interleavings. The column
itr gives the number of iterations required to prove or disprove the property in the UW
approach. For CHESS, Tests reports the approximate number of tests executed, which
is not related to the number of interleavings. Both tools identify the property violation
(resp. conﬁrm that it holds) in all cases where they do not run out of time or memory.
As we can see in Table 4.3, CHESS is effective for programs with a small number of
threads, but it does not scale well and consistently runs out of time when
we increase the number of threads. In general, CHESS times out when the number of
threads increases beyond six. The relatively poor scalability of CHESS has already been
observed by [156]. In contrast, our lazy algorithm is able to find bugs quickly even when
we increase the number of threads and the context bound, and consistently outperforms
CHESS as well as the schedule recording and UW approaches. However, note that our
lazy algorithm runs out of memory for test cases 16 and 18 when we increase the number
of context switches to 18 and 13 respectively.
4.6.3 Comparison to SATABS
SATABS is an ANSI-C model checker which supports the veriﬁcation of multi-threaded
software with shared variables using the CEGAR technique. We compare our approaches
against SATABS v2.5 [44] based on Cadence SMV using a number of multi-threaded
programs taken from standard benchmark suites. Table 4.4 shows the results achieved
on a 3 GHz machine. Programs whose names end in “bad” contain an error (i.e., at least
one of the properties is satisfiable) while those whose names end in “ok” are correct. Here, #P gives
the number of properties to be veriﬁed for each program, which includes array bounds,
pointer safety, division by zero, deadlock and order violations checks. A context bound
of ∞ means that we did not specify a bound. A “-” result indicates that the tool failed
with an error such as internal (†) and reﬁnement (RF) failure, memory overﬂow (MO),
time-out (TO), or failed to detect errors in the program. A “+” indicates that the tool
detected the error or proved all VCs.
Programs 1-6 are concurrent implementations of stack, queue, and circular buﬀer data
structures; programs 1 and 2 are extracted from an embedded application [51]. Pro-
grams 7-14 are from the INSPECT benchmark [182] and use mutex and condition syn-
chronization primitives from the Pthread library. Programs 15-17 are from the VV-lab
benchmarks [156] and contain common concurrency bugs such as data races, atomicity
and order violations. Programs 18-20 are embedded applications that run on a dual
core processor; they are implemented in a commercial set-top box product from NXP
semiconductors [141]. Program 21 is the same synthetic micro-benchmark described in
Section 4.6.2, but here we increase further the number of context switches to check the
scalability of our approaches.
As we can see in Table 4.4, SATABS produces refinement failures (RF) and fails with
internal errors (†) for most programs. These programs contain linear arithmetic
operations with arrays and the predicate abstraction technique implemented in SATABS
seems to suffer from a lack of precision when dealing with arrays. However, the ability of
a verification tool to check such programs is particularly important as many real-world
multi-threaded programs belong to this class. SATABS also times out for large programs
or for programs with many threads (cf. programs 9, 14, 18, and 21). Additionally,
SATABS gives false positives on programs 15-17, which contain known bugs related to data
races, atomicity and order violations.
Test  Program                #L   #T  B  C   CHESS Time  CHESS Tests  Lazy Time  Lazy #FI/#I  Schedule Time  UW Time  UW Itr
1     reorder 3 bad (2,1)    84   3   3  4   1           200          <1         1/29         <1             <1       4
2     reorder 4 bad (3,1)    84   4   4  5   98          13000        <1         1/82         1              4        5
3     reorder 5 bad (4,1)    84   5   5  6   TO          429000       <1         1/277        4              18       6
4     reorder 6 bad (5,1)    84   6   6  7   TO          396000       <1         1/853        36             72       7
5     reorder 6 bad (5,1)    84   6   6  8   TO          371000       <1         1/2810       225            592      7
6     reorder 6 bad (5,1)    84   6   6  9   TO          367000       <1         1/8124       MO             MO       1
7     twostage 3 bad (2,1)   128  3   3  4   4           500          1          1/35         1              3        5
8     twostage 4 bad (3,1)   128  4   4  4   215         27000        2          1/42         1              4        5
9     twostage 5 bad (4,1)   128  5   5  4   TO          384000       2          1/44         1              5        5
10    twostage 6 bad (5,1)   128  6   6  4   TO          366000       2          1/45         2              5        5
11    wronglock 4 bad (1,3)  110  4   4  8   21          3000         5          2/489        10             89       9
12    wronglock 5 bad (1,4)  110  5   5  8   724         93000        10         3/2869       50             408      9
13    wronglock 6 bad (1,5)  110  6   6  8   TO          356000       18         4/12106      225            2060     9
14    wronglock 7 bad (1,6)  110  7   7  8   TO          330000       34         5/39100      MO             MO       1
15    micro 2 ok (100)       247  2   1  2   316         35855        <1         0/4          <1             <1       1
16    micro 2 ok (100)       247  2   1  17  TO          400000       1095       0/131072     MO             MO       1
17    micro 3 ok (100)       365  3   1  2   TO          272000       <1         0/9          <1             <1       1
18    micro 3 ok (100)       365  3   1  12  TO          290000       1021       0/121393     MO             MO       1
Table 4.3: Results of the comparison between ESBMC (v1.15.1) and Microsoft CHESS (v0.1.30626.0).
                                                               SATABS     Lazy                  Schedule   UW
Test  Program                        L     T    P     B    C   Time  Res  Time  Res  #FI/#I     Time  Res  Time  Res  Itr
1     circular buffer ok [51]        111   2    9     8    ∞   †     -    477   +    0/12870    MO    -    MO    -    1
2     circular buffer bad [51]       109   2    8     8    5   †     -    <1    +    3/32       2     +    11    +    6
3     queue ok [55]                  147   2    12    41   ∞   RF    -    3     +    0/6        3     +    3     +    1
4     queue bad [55]                 153   2    15    41   8   †     -    3     +    91/256     50    +    373   +    7
5     stack ok [55]                  105   2    5     11   12  †     -    225   +    0/4094     1026  +    1097  +    1
6     stack bad [55]                 106   2    6     11   4   RF    -    <1    +    4/16       2     +    6     +    4
7     fsbench ok [182]               81    26   47    26   2   †     -    252   +    0/676      304   +    301   +    1
8     fsbench bad [182]              80    27   48    27   2   †     -    <1    +    729/729    360   +    786   +    2
9     indexer ok [182]               77    13   21    129  4   TO    -    595   +    0/17160    220   +    218   +    1
10    stateful20 ok [182]            60    2    3     20   10  †     -    95    +    0/1024     487   +    518   +    1
11    sync02 ok [182]                74    2    6     21   21  RF    -    44    +    0/121      60    +    60    +    1
12    sync02 bad [182]               74    2    6     21   21  RF    -    8     +    5/186      132   +    383   +    3
13    aget-0.4 bad [182]             1233  3    279   200  2   3346  +    137   +    1/1        127   +    125   +    1
14    bzip2smp ok [182]              6366  3    8568  1    9   TO    -    1800  +    0/1294     MO    -    MO    -    1
15    reorder 10 bad (9,1) [156]     84    10   7     10   11  1     -    <1    +    1/154574   MO    -    MO    -    1
16    twostage 100 bad (99,1) [156]  128   100  13    100  4   2     -    88    +    1/139      93    +    195   +    5
17    wronglock 8 bad (1,7) [156]    110   8    8     8    8   2     -    90    +    6/104015   MO    -    MO    -    1
18    exStbHDMI ok [141]             1060  2    24    16   20  TO    -    229   +    0/1        226   +    213   +    1
19    exStbLED ok [141]              425   2    45    10   10  RF    -    73    +    0/11       73    +    787   +    1
20    exStbThumbs bad [141]          1109  2    249   2    1   317   +    95    +    3/3        14    +    12    +    1
21    micro 10 ok (100) [74]         1171  10   10    1    17  TO    -    254   +    0/29260    MO    -    MO    -    1
Table 4.4: Results of the comparison between SATABS (v2.5) and ESBMC (v1.15.1).
Note that SATABS uses predicate abstraction and reﬁnement, and in some sense tries to
solve a harder problem than bounded model checking. However, the results in Table 4.4
indicate that this problem may still be too hard for multi-threaded applications, as
SATABS is unable to prove the required properties.
We can also see in Table 4.4 that if the program contains errors at all, these errors indeed
generally occur in most interleavings explored; consequently, the lazy approach is very
fast for these cases. The notable exception is wronglock bad, where fewer than 0.1% of the
interleavings expose the error and SATABS is substantially faster than ESBMC (but fails
to find the error); however, even here the lazy approach outperforms both the schedule
recording and UW approaches. Similarly, the lazy approach is capable of handling safe
programs in which the number of threads and context switches grows quickly, which
makes the formula harder and often “blows up” the SMT solver. The UW approach
is typically slower than schedule recording. We suspect that the proof generation of
the SMT solver (which is required to produce the unsatisfiable cores) causes memory
overhead and corresponding slowdowns; this was also reported previously [56].
4.7 Related Work
SMT-based BMC is gaining popularity in the formal veriﬁcation community [57]. Ganai
and Gupta describe a veriﬁcation framework for BMC and apply several techniques to
simplify the BMC problem [71]. However, the authors focus on sequential software and
use only the theory of integer and real arithmetic, which does not reﬂect precisely the
ANSI-C semantics. Armando et al. also propose a BMC approach using SMT solvers
for sequential ANSI-C programs [11] that uses linear arithmetic, arrays, records and
restricted bit-vector arithmetic, but they do not address important constructs of the
ANSI-C language.
Cimatti et al. [39] describe an approach to verify SystemC that similarly combines
explicit state space exploration (i.e., the explicit exploration of the diﬀerent possible
interleavings) with symbolic model checking (i.e., the symbolic representation and up-
dates of the state). However, we use BMC instead of predicate abstraction, and we
implement a realistic scheduler, i.e., our scheduler may preempt a thread at any visible
instruction in its execution, whereas [39] encodes the semantics of the non-preempting
SystemC scheduler. We also exploit the SMT techniques on large problems by encoding
all possible interleavings into a single formula.
Qadeer and Rehof present a pragmatic method to discover bugs in concurrent software in
which the program analysis is restricted to executions with a bounded number of context
switches [150]. However, the authors do not apply it to realistic and large concurrent software
benchmarks and the integration of this context-bounded model checking algorithm
into the explicit state model checker ZING is left for future work. Rabinovitz and Grum-
berg describe an extension of the CBMC model checker to concurrent C programs [152],
which translates C threads into SSA form and adds constraints for a bounded number
of context-switches, as described in [150]. This approach, however, is limited to two
threads, and requires the user to run the model checker twice in order to detect diﬀerent
types of bugs (“regular” and concurrency bugs). It is also only evaluated on a concurrent
bubblesort, but not on a set of realistic applications.
Ganai and Gupta describe a lazy method for modelling multi-threaded concurrent sys-
tems using shared variables [73], but this method is also restricted to two threads. Gupta
et al. [103] extend [73, 102] by supporting more than two threads and by combining dynamic
partial order reduction with symbolic state space exploration. The benchmarks
that have been reported are parameterized versions of the dining philosophers model,
which is not a typical multi-threaded C program. Grumberg et al. propose an algorithmic
method based on SAT and BMC to model check a multi-process system based on a series
of under-approximated models [84]. This approach, however, does not integrate context-
bounded analysis and it does not address the problem of model checking multi-threaded
C software.
4.8 Conclusions
We have presented three diﬀerent approaches to model check multi-threaded ANSI-C
software with shared variable communication between the threads. The lazy approach
iteratively generates all possible interleavings and calls the BMC procedure on each
interleaving. The schedule recording approach systematically encodes all possible in-
terleavings into one formula. The underapproximation and widening approach checks
models with an increased set of allowed interleavings. The main contribution of all
three approaches is in the combination of symbolic model checking with explicit state
space exploration. As far as we are aware, the lazy approach has not been described or
evaluated in the literature. Similarly, the underapproximation-widening approach has
not been used for bounded model checking of multi-threaded software. The diﬀerence
between our schedule recording and Gupta et al. [103] is that they work in a fully sym-
bolic context. With these novel approaches we have successfully achieved the second
objective stated in Section 1.2.
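The idea behind the lazy approach (generate one interleaving at a time and check it) can be illustrated with a minimal sketch. It is not the ESBMC implementation: concrete execution stands in for the BMC call, and the two-thread non-atomic increment, the State and Stats types and all function names are assumptions made for illustration only.

```c
#include <assert.h>

enum { N_THREADS = 2, N_STEPS = 2 };   /* each thread: load, then store */

typedef struct {
    int shared;              /* shared counter */
    int local[N_THREADS];    /* per-thread register */
    int pc[N_THREADS];       /* next step of each thread */
} State;

typedef struct { int failed, total; } Stats;

/* Execute the next step of thread t (hypothetical two-step increment). */
static State run_step(State s, int t)
{
    assert(s.pc[t] < N_STEPS);
    if (s.pc[t] == 0) s.local[t] = s.shared;       /* load  */
    else              s.shared = s.local[t] + 1;   /* store */
    s.pc[t]++;
    return s;
}

/* Enumerate every interleaving depth-first; check the property at the end
   of each one. Concrete execution replaces the per-interleaving BMC call. */
static void explore(State s, Stats *st)
{
    int done = 1;
    for (int t = 0; t < N_THREADS; t++) {
        if (s.pc[t] < N_STEPS) {
            done = 0;
            explore(run_step(s, t), st);
        }
    }
    if (done) {
        st->total++;
        if (s.shared != N_THREADS) st->failed++;   /* lost update */
    }
}

static Stats check_all_interleavings(void)
{
    State init = { 0, {0, 0}, {0, 0} };
    Stats st = { 0, 0 };
    explore(init, &st);
    return st;
}
```

In this toy program four of the six interleavings lose an update, which mirrors the observation that erroneous programs tend to fail in many interleavings; a search that only looks for a bug could stop at the first failing interleaving instead of counting them all.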
Additionally, we have presented our modelling of the synchronization primitives of the
Pthread library, which allows us to detect not only atomicity and order violations but also
local and global deadlocks, which previous approaches are unable to find [73, 102, 103, 152].
Surprisingly, our approach to check constraints lazily is extremely fast for programs
that contain errors and to a lesser extent even for safe programs in which the number
of threads and context switches grows quickly. The experimental results also show
that the lazy approach generally outperforms not only the schedule recording and UW
approaches, but also CHESS [136] and SATABS [44] tools on several non-trivial bench-
marks. As far as we are aware, there is no other work that considers a comprehensive
SMT-based BMC procedure to verify multi-threaded ANSI-C software by combining
symbolic model checking with explicit state space exploration. In future work, we plan
to explore the use of Craig interpolants to prove non-interference of context switches
among the threads up to a given depth and develop an efficient method on top of ESBMC
to localize faults in multi-threaded programs.
Chapter 5
Implementation of ESBMC
Chapter 5 describes the main software components of the ESBMC architecture. Addi-
tionally, in order to achieve the third objective stated in Section 1.2, we describe the
simpliﬁcations and heuristics that we used in order to reduce the unwound formula
and to determine the best representation for the program variables. It also evaluates
the simpliﬁcations and heuristics, which show a substantial performance improvement
over a large set of benchmarks. The results described in the previous chapters have
been achieved using the implementation described here, so this chapter should not be
interpreted as a continuation of the previous chapters, but a “separation of concerns”.
5.1 Introduction
ESBMC is a context-bounded model checker for embedded ANSI-C software based on
SMT solvers. It allows the veriﬁcation engineer to:
• verify single- and multi-threaded software (with shared variables and locks);
• reason about arithmetic under- and overﬂow, pointer safety, memory leaks, array
bounds, atomicity and order violations, deadlock, data race, and user-speciﬁed
assertions;
• verify programs that make use of bit-level operations, arrays, pointers, structs, unions,
memory allocation and fixed-point arithmetic.
ESBMC does not require the user to annotate the programs with pre/post-conditions,
but allows the user to state additional properties using assert-statements, which are then
checked as well. It also provides three approaches (lazy, schedule recording, and underapproximation
and widening) to model check multi-threaded software. ESBMC can be
invoked through the command-line interface or conﬁgured through the Eclipse plug-in
(see Appendix A). ESBMC converts the veriﬁcation conditions using diﬀerent back-
ground theories and passes them directly to an SMT solver. In addition, ESBMC can
output verification conditions using the SMT logics QF_AUFBV and QF_AUFLIRA.
ESBMC is built on top of the CProver framework; the next section explains ESBMC’s
overall architecture, and the framework modiﬁcations.
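As a small illustration of the two kinds of properties mentioned above, the following sketch combines a language-level property (no signed overflow) with a user-specified assert-statement. The function sat_add is hypothetical and is not taken from the benchmarks; it only shows the kind of code and assertion a bounded model checker would be asked to check.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: addition that saturates instead of overflowing. */
static int sat_add(int a, int b)
{
    if (b > 0 && a > INT_MAX - b) return INT_MAX;   /* would overflow  */
    if (b < 0 && a < INT_MIN - b) return INT_MIN;   /* would underflow */

    int sum = a + b;                    /* safe: guarded by the checks above */

    /* User-specified property: adding a non-negative value never decreases
       the result, and adding a negative value always decreases it. */
    assert((b >= 0) == (sum >= a));
    return sum;
}
```

Removing either guard would make the addition overflow for some inputs, which is the kind of violation that the automatically generated arithmetic checks are meant to expose.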
5.2 Tool Architecture
Figure 5.1 shows the main software components of ESBMC. Every step of the model checking
process in ESBMC is implemented within a separate software component. ESBMC is
written in C++ and can be executed on all major operating systems and machines (i.e.,
32-bit Windows/x86, 32-bit Linux/x86, 64-bit Windows/x64 and 64-bit Linux/x64). In
Figure 5.1, the white boxes (except for the SMT solver) represent the components that
we reused from the CProver framework without any modiﬁcation while the gray boxes
with dashed lines represent the components that we modiﬁed in order to:
1. automatically generate assertions to check for memory leaks, data races, atomicity
and order violations and deadlocks (implemented in the component GOTO
program, see Subsections 3.4.7, 4.4 and 4.5);
2. extend the SSA form of the symbolic execution engine to avoid naming conﬂicts
when verifying multi-threaded programs (implemented in the component GOTO
symex, see Subsection 4.3.1);
3. simplify the unwound formula based on high-level information to prevent over-
burdening the solver (implemented in the component GOTO symex, see Subsec-
tions 3.3 and 5.3);
4. perform an up-front analysis in the CFG of the program to determine the best en-
coding and solver for a particular program (implemented in the component GOTO
symex, see Subsection 5.4).
The GOTO program component converts the ANSI-C program into a goto-program,
which simpliﬁes the representation (e.g., replacement of switch and while by if and
goto statements). The GOTO symex component performs a symbolic simulation of the
program, which handles the unrolling of the loops and the elimination of recursive
functions, and generates the verification conditions to be encoded in the back-end.
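The replacement of structured loops by if and goto statements can be pictured with a hand-written sketch. The two functions below are illustrative assumptions, not output of the GOTO program component; they only show that the lowered form computes the same result as the original loop.

```c
#include <assert.h>

/* Original structured form. */
static int sum_while(int n)
{
    int i = 0, sum = 0;
    while (i < n) {
        sum += i;
        i++;
    }
    return sum;
}

/* Hand-lowered form using only if and goto. */
static int sum_goto(int n)
{
    int i = 0, sum = 0;
loop_head:
    if (!(i < n)) goto loop_exit;   /* negated loop guard */
    sum += i;
    i++;
    goto loop_head;                 /* back edge */
loop_exit:
    return sum;
}

/* Usage: both forms agree for a given bound. */
static void check_lowering(int n)
{
    assert(sum_while(n) == sum_goto(n));
}
```

With the control flow reduced to guarded jumps, unrolling a loop k times amounts to copying the guarded body k times, which is the form the symbolic execution works on.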
In Figure 5.1, the gray boxes with solid lines represent new components that we imple-
mented from scratch in order to guide the symbolic execution via a thread scheduler
(see the component Scheduler) and to encode the given constraints and properties of
an ANSI-C program into a global logical context (see the components constraints and
program by means of integer arithmetic produces wrong results if the bit-level operators
are treated as uninterpreted functions (UFs) because, even though UFs simplify the
proofs, they ignore the semantics of the operators and consequently make the formula
weaker. This problem occurs in several software model checkers (e.g., SMT-CBMC [11]
handles restricted bit-vectors arithmetic and BLAST [88] treats bit-level operations as
UFs, models integers as elements of Z and does not account for arithmetic overﬂows [21]),
which fail to check the assertion in line 9. In contrast, bit-vector arithmetic allows us
to encode bit-level operators in a more accurate way. However, in our benchmarks, we
noted that the majority of VCs are solved faster if we model the basic datatypes as Z
and R. Consequently, we have to trade off speed against accuracy, which are two
competing goals in formal verification using SMT.
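The accuracy side of this trade-off can be made concrete with a short sketch. The two functions are hypothetical stand-ins: the first follows 8-bit bit-vector semantics (wrap-around modulo 256), the second treats the same operands as unbounded integers, approximated here by a wider C type restricted to the 8-bit input range.

```c
#include <assert.h>
#include <stdint.h>

/* Bit-vector view: the sum is truncated to 8 bits, as the C program does. */
static uint8_t add_bitvector8(uint8_t a, uint8_t b)
{
    return (uint8_t)(a + b);
}

/* Integer view: the sum is the mathematical one (no wrap-around). */
static long add_integer(long a, long b)
{
    assert(a >= 0 && a <= UINT8_MAX && b >= 0 && b <= UINT8_MAX);
    return a + b;
}
```

For the operands 255 and 1 the bit-vector view yields 0 while the integer view yields 256, so an assertion such as "the sum is greater than either operand" holds under the integer encoding but fails in the actual program.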
Based on the extent to which the SMT solvers support the domain theories and on
experimental results obtained with a large set of benchmarks, we developed a simple
but eﬀective heuristic to determine the best representation for the program variables
as well as the best SMT solver to be used in order to check the properties of a given
ANSI-C program:
1. By default, we encode the constraints and properties of a given ANSI-C program
using integers and reals, and our default SMT solver
is Z3.
2. We then explore the CFG representation of the program.
3. If we find expressions that involve bit-level operations (e.g., <<, >>, &, |, ⊕)
or typecasts from signed to unsigned datatypes and vice versa, we encode the
corresponding variables as bit-vectors.
4. We switch the SMT solver to Boolector if no pointers are used but we keep Z3 if
pointers are used.
We adopted this strategy because we are able to implement the theory of tuples on top
of Z3 to model pointers and thus exploit the structure provided by the word-level instead
of bit-level models (i.e., instead of concatenating and extracting bit-vectors) [108].
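The four steps of the heuristic can be sketched as a small decision function. This is a simplification under stated assumptions: the type and function names are hypothetical, the choice is made per program rather than per variable, and step 4 is assumed to apply only once bit-vectors have been selected, which the text does not state explicitly.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { ENC_INT_REAL, ENC_BITVECTOR } Encoding;
typedef enum { SOLVER_Z3, SOLVER_BOOLECTOR } Solver;

/* Hypothetical summary of what the up-front CFG exploration found. */
typedef struct {
    bool has_bit_ops;      /* <<, >>, &, |, xor            */
    bool has_sign_casts;   /* signed <-> unsigned typecasts */
    bool has_pointers;
} ProgramFeatures;

typedef struct { Encoding encoding; Solver solver; } Choice;

static Choice choose_backend(ProgramFeatures f)
{
    Choice c = { ENC_INT_REAL, SOLVER_Z3 };          /* step 1: defaults */

    if (f.has_bit_ops || f.has_sign_casts) {         /* step 3 */
        c.encoding = ENC_BITVECTOR;
        if (!f.has_pointers)                         /* step 4 (assumed scope) */
            c.solver = SOLVER_BOOLECTOR;
    }

    /* Programs with pointers always stay with Z3. */
    assert(!(f.has_pointers && c.solver == SOLVER_BOOLECTOR));
    return c;
}
```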
5.5 Evaluation of Performance Improvements
We evaluate the eﬀectiveness of the simpliﬁcation techniques and the exploration of the
datatype representations described in Sections 5.3 and 5.4, respectively, using 174 programs
with a total size of 70K lines of code, taken as a representative sample from the benchmark
suites Siemens, SNU-RT, PowerStone, NECLA and NXP. With all optimizations
enabled, ESBMC can check all 174 programs in 439 seconds, which serves as our baseline.
5.6 Conclusions
We presented the main software components of the ESBMC architecture, the simpliﬁca-
tions that we applied to reduce the unwound formula and the heuristics that we used to
determine the best representation for the program variables. With the implementation
of these simpliﬁcations and heuristics we successfully achieved the third objective stated
in Section 1.2. Moreover, we have seen that every step of the model checking process in
ESBMC is implemented within a separate component. The communication between the
software components is conducted by means of well-deﬁned interfaces. Therefore, single
components of the model checking process in ESBMC could, in principle, be exchanged
independently. We have also shown that the simpliﬁcations CPstore, CPString and FS
reduce substantially the unwound formula that is passed to the SMT solvers. Addition-
ally, we also observed in our benchmarks that the majority of the veriﬁcation conditions
are solved faster if we model the basic datatypes as integer and/or real as speciﬁed in
the SMT-LIB.
Chapter 6
Integrating ESBMC into Software
Engineering Practice
In this chapter, we describe an approach to integrate SMT-based bounded model check-
ing into the software engineering process by exploiting practices such as incremental
development and regression tests; this chapter is directed towards the fourth objective
stated in Section 1.2. In particular, our approach looks at the modifications made to
the software system since its last verification, and submits them to a partly static, partly
dynamic “continuous” verification process, guided by a set of test cases for coverage. A
case study from the telecommunications domain shows that the proposed approach can
potentially improve the error-detection capability and reduce the overall veriﬁcation
time.
6.1 Introduction
The complexity of software in embedded systems has increased signiﬁcantly over the
last years so that software veriﬁcation now plays an important role in ensuring the
overall product quality. In this context, bounded model checking has been successfully
applied to discover subtle errors, but for larger applications, it often suﬀers from the
state space explosion problem, as we pointed out in Chapter 3. We try to address this
bottleneck with a new concept called continuous verification, which combines existing
ideas from the software engineering (e.g., continuous integration [70]) and formal verification
(e.g., equivalence checking [30]) communities.
The continuous veriﬁcation approach thus aims to automatically detect design errors
and integration problems as quickly as possible by exploiting information from the soft-
ware conﬁguration management (SCM) system, systematically focusing the veriﬁcation
eﬀort on new or modiﬁed functions. We use equivalence checking to determine whether
modiﬁed functions need to be re-veriﬁed formally and we use existing test cases to reduce
the search space for the model checker, thus combining dynamic and static veriﬁcation.1
The formal veriﬁcation community has extensively used equivalence checking for hard-
ware designs [30, 109], but there is little evidence that equivalence checking for large
embedded software will improve the scalability of software model checking. In partic-
ular, Godlin and Strichman describe an approach to prove the equivalence of similar
programs [78, 167] and apply it to random and industrial programs (e.g., ranging from
300 to 3000 lines of code). The authors report that their approach takes from a few seconds
to 30 minutes to prove equivalence of equivalent programs, but can take several
hours (or run out of memory) on non-equivalent programs. The results are inconclusive
since Godlin and Strichman do not specify how many functions they are able to prove in
their benchmarks, the time needed to check each one, how many functions are actually
equivalent and how often these functions are modiﬁed from one version to another. Mat-
sumoto et al. also describe an approach to check the equivalence of C programs using
the SMT solver CVC, but the authors restrict the C programs to be checked (e.g., no
pointer uses) due to limitations of their symbolic execution engine [121]. The paper also
does not provide suﬃcient details to compare their results to the results of our approach.
The main purpose of this chapter is thus to investigate whether the continuous veriﬁca-
tion approach can indeed substantially reduce the veriﬁcation time of large embedded
software using our SMT-based context-bounded model checker.
6.2 Continuous Veriﬁcation
The continuous veriﬁcation approach has its roots in the continuous integration (CI)
practice described by Fowler [70]. CI relies on every developer to create and execute unit,
functional and integration tests before committing their source code to a single source
repository. It also assumes the existence of an automated unit test framework. The SCM
is then used to perform the system build and test processes in a completely automatic
way. In continuous veriﬁcation, we use the same information (i.e., development history
and test cases), but in a diﬀerent way to improve the coverage and substantially reduce
the veriﬁcation time throughout the development of a product or product line. We
use SMT-based bounded model checking to verify for each system build that the entire
system still satisﬁes all properties given as assertions by the designers, as well as a
range of language-speciﬁc safety properties such as the absence of arithmetic under- and
overﬂow, out-of-bounds array indexing, NULL-pointer dereferencing, or memory leaks.
We also consider properties expressed in LTL which we can easily convert to C-monitors
via Büchi automata [173] and model-check with ESBMC. Figure 6.1 shows the main
elements and steps of the continuous verification approach; the gray boxes indicate core
steps.
1 We use the term dynamic to denote that the program is executed and its actual and expected
outputs observed and static to denote that a mathematical model of the program is analyzed.
[Figure 6.1: Continuous Verification. Boxes: SCM; Check for modifications; Test Suite; Modified Functions; Static Verification; Dynamic Verification; Check property and path coverage; Property (Assertions); Property (LTL); Büchi Automata.]
For large embedded software systems, the computational eﬀort to re-verify the entire
software from scratch is high, and is even largely wasted if, as is often the case, the
changes are small [70]. For each system build, we thus consult the SCM to identify the
functions and methods that have actually been modiﬁed and focus on these. We then
use equivalence checking to determine whether they need to be re-veriﬁed formally: if
we can prove that the old and new versions of a function are functionally equivalent,
then we do not need to show for the new version any of the properties already shown for
the old version. This can potentially reduce the immediate veriﬁcation eﬀort because
proving the equivalence of two function versions can be less expensive than re-verifying
the function [78, 109, 167]. However, and more importantly, it also reduces overall system
veriﬁcation eﬀorts because it limits the propagation of changes through the system: if
we can prove the two versions of the function computationally equivalent, then we do
not need to re-verify any other function that depends on it (unless that function has been
changed as well). Of course, proving the equivalence of two functions is in general
undecidable, due to unbounded memory usage [109], and the eﬀort we spend in trying
to do so might be wasted.
As an example, consider the two versions of the signalInverter function shown in Fig-
ure 6.2. They were extracted from the embedded software of two releases of a medical
device product. In order to prove the equivalence of these two ANSI-C functions, we
compare their input-output relations. We thus:
formula. As in [78, 167], we also abstract calls to other functions with uninterpreted
function symbols with the purpose of keeping the size of the SMT formula relatively
small. This is sound as long as the called functions have no side-eﬀects and have been
proved to be equivalent.
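Comparing the input-output relations of two function versions can be sketched as follows. The two versions are hypothetical (they are not the signalInverter code of Figure 6.2), and the check enumerates a small input domain exhaustively, whereas the approach described here encodes both versions symbolically and asks the SMT solver whether any input distinguishes them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical old version: invert an 8-bit sample around full scale. */
static uint8_t invert_old(uint8_t sample)
{
    return (uint8_t)(255 - sample);
}

/* Hypothetical new version: the same relation via bitwise complement. */
static uint8_t invert_new(uint8_t sample)
{
    return (uint8_t)~sample;
}

/* The versions are equivalent if no input yields different outputs. */
static bool versions_equivalent(void)
{
    for (int in = 0; in <= UINT8_MAX; in++)
        if (invert_old((uint8_t)in) != invert_new((uint8_t)in))
            return false;
    return true;
}

/* Usage: if this holds, properties shown for the old version carry over. */
static void check_equivalence(void)
{
    assert(versions_equivalent());
}
```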
6.3 Generalizing Test Cases
After detecting new and/or modiﬁed functions, we use the existing unit test cases to
reduce the state space to be explored by the model checker. In this phase, we ﬁrst run
the unit tests, keeping track of which inputs have already been used. We then guide
the model checker to visit states that have not been visited previously (e.g., by placing
assumptions on the input). In addition, the test cases also help to reduce the state space
to be explored in another way: by using the test stubs, we can break the global model
(containing the entire program) into local models (containing only the functions under
test) and generate on-demand the reachable states to be visited by the model checker,
starting with the state described by the test case. We can thus reduce the number of paths
and variables to be considered during model checking.
This approach is similar to concolic testing, which simultaneously executes a program
concretely and symbolically [119, 160]. However, here we do not generate new concrete
values for the test cases with the purpose of maximizing the code coverage. Instead, we
use existing test cases and assume-statements to block larger parts of the search space
(e.g., by combining respective concrete values of the test cases into a single interval).
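Blocking part of the search space with the inputs of existing tests can be sketched as below. The input domain, the Interval type and the function names are assumptions for illustration; the predicate assume_unvisited plays the role of an assume-statement on the input, and the loop over the domain stands in for the model checker's symbolic exploration.

```c
#include <assert.h>
#include <stdbool.h>

enum { INPUT_MIN = 0, INPUT_MAX = 15 };   /* hypothetical input domain */

typedef struct { int lo, hi; } Interval;

/* Combine the concrete inputs of the existing tests into one interval. */
static Interval covered_interval(const int *inputs, int n)
{
    assert(n > 0);
    Interval iv = { inputs[0], inputs[0] };
    for (int i = 1; i < n; i++) {
        if (inputs[i] < iv.lo) iv.lo = inputs[i];
        if (inputs[i] > iv.hi) iv.hi = inputs[i];
    }
    return iv;
}

/* Stand-in for an assume-statement: only inputs outside the interval
   covered by the tests remain to be explored. */
static bool assume_unvisited(Interval iv, int x)
{
    return x < iv.lo || x > iv.hi;
}

/* Count the inputs the model checker would still have to consider. */
static int count_remaining(Interval iv)
{
    int remaining = 0;
    for (int x = INPUT_MIN; x <= INPUT_MAX; x++)
        if (assume_unvisited(iv, x))
            remaining++;
    return remaining;
}
```

Treating the whole interval as covered is itself an assumption: it is only justified when the inputs between the tested values exercise the same behaviour as the tested ones.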
As an example consider the three simple C functions shown in Figure 6.3 that were
extracted from a medical device, and one of the test cases shown in Figure 6.4. The
device, called a pulse oximeter [51], is responsible for measuring the oxygen saturation
(SpO2) and heart rate (HR) in the blood system using a non-invasive method. The
functions that we consider here implement a simple circular
buffer using a FIFO (First In, First Out) policy. The test case checks whether messages
are correctly added to and removed from the circular buﬀer using the FIFO policy.
Other test cases check for buﬀer underﬂow and/or overﬂow and whether the elements
are lost before reading them from the buﬀer.
The pulse oximeter sources contain seven test cases, which are intended to cover all possible
execution paths related to the circular buffer, and during dynamic verification we are
not able to find any bug in the circular buffer implementation with these. However, the
implementation is flawed: the array buffer is declared to be of type char[] (see line 1 in
Figure 6.3) but we assign an element b of type int (see line 14). The test cases do not
uncover this error because they happen to use only integer values that can safely be cast
to a char.
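The effect of this flaw can be reproduced with a minimal sketch. It is not the code of Figure 6.3: the buffer size and the names insert_item, remove_item and round_trip are assumptions, and only the char element type and the int argument are taken from the description above.

```c
#include <assert.h>

enum { BUFFER_MAX = 4 };              /* hypothetical buffer size */

static char buffer[BUFFER_MAX];       /* element type char, as in the flawed code */
static int first = 0, next = 0;

/* Store an int into the char buffer: the value is narrowed on assignment. */
static void insert_item(int b)
{
    assert(next >= 0 && next < BUFFER_MAX);
    buffer[next] = (char)b;           /* narrowing conversion: int -> char */
    next = (next + 1) % BUFFER_MAX;
}

/* Read the oldest element back as an int. */
static int remove_item(void)
{
    int value = buffer[first];
    first = (first + 1) % BUFFER_MAX;
    return value;
}

/* Usage: write one value and read it back. */
static int round_trip(int b)
{
    insert_item(b);
    return remove_item();
}
```

A value such as 65 survives the round trip, which is why tests that use only small values pass, whereas a value such as 300 does not fit into an 8-bit char and comes back changed.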
gram then monitors the design’s progress and watches out for violations of the speciﬁed
properties.
As an example, we extract two properties from the speciﬁcation of the pulse oximeter
device, and show how they can be modelled and used in the context of the continuous
veriﬁcation. In particular, we verify:
(a) the data ﬂow to compute the HR value that is provided by the pulse oximeter sensor
hardware.
(b) whether the user of the pulse oximeter is capable of adjusting the sample time of
the embedded device.
The properties (a) and (b) can be expressed using the following LTL pattern (as de-
scribed in Chapter 2):
AG(p → Fr) (6.2)
Here, A (“for all paths”), G (“always”), and F (“eventually”) are the LTL quantiﬁers,
and p and r represent the required pre- and post-states. In the example, for the property
(a), p denotes the state in which the buﬀer contains HR and SpO2 raw data, while r
denotes the state that deﬁnes the respective HR value. Consequently, (6.2) speciﬁes that
any state containing the HR and SpO2 raw data in the buﬀer is eventually followed by
a state representing the respective HR value.
A Büchi automaton is a finite automaton over infinite words. It differs from a standard
finite automaton over finite words in the definition of accepting a word, which is based
on passing through an accepting state infinitely often (rather than terminating in a
final state) [40]. The Büchi automata we consider here work over computation traces,
i.e., sequences of states of the program to be analyzed. These are abstracted by the
predicates of interest (here p and r). Hence the “words” can be represented by sequences
of propositional expressions over the variables p and r. Figure 6.6 shows the non-deterministic
Büchi automaton that represents the LTL formula (6.2) and Figure 6.7
shows its corresponding ANSI-C monitor. The transition function δ is given in Table 6.2.
State    1         r ∨ ¬p   r      ¬p
init     {S1,S2}   S3       init   init
S1       S1        S1       S3     S1
S2       S2        S2       S2     S3
S3       S3        S3       S3     S3
Table 6.2: Transition function δ for the Büchi automaton shown in Figure 6.6.
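One way to read Table 6.2 operationally is as a table-driven step over sets of states, sketched below. This is an assumption-laden illustration and not the monitor of Figure 6.7: states are encoded as bit flags, a set of states is a bitmask, and for a given valuation of p and r the successors of every enabled guard column are collected. Acceptance is not modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* States of Table 6.2 as bit flags, so that a set of states is a bitmask. */
enum { INIT = 1u << 0, S1 = 1u << 1, S2 = 1u << 2, S3 = 1u << 3 };
enum { N_STATES = 4, N_GUARDS = 4 };

/* delta[state][guard]; guard columns in table order: 1, r || !p, r, !p. */
static const unsigned delta[N_STATES][N_GUARDS] = {
    /* init */ { S1 | S2, S3, INIT, INIT },
    /* S1   */ { S1,      S1, S3,   S1   },
    /* S2   */ { S2,      S2, S2,   S3   },
    /* S3   */ { S3,      S3, S3,   S3   },
};

/* One non-deterministic step: union of the successors of all states in
   'current' over all guards enabled by the valuation (p, r). */
static unsigned step(unsigned current, bool p, bool r)
{
    const bool enabled[N_GUARDS] = { true, r || !p, r, !p };
    unsigned next = 0;

    assert((current & ~(unsigned)(INIT | S1 | S2 | S3)) == 0);
    for (int s = 0; s < N_STATES; s++) {
        if (!(current & (1u << s)))
            continue;
        for (int g = 0; g < N_GUARDS; g++)
            if (enabled[g])
                next |= delta[s][g];
    }
    return next;
}
```

Tracking a set of states avoids having to resolve the non-deterministic choices one at a time.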
From the initial state, we can transition to S3 if r ∨ ¬p holds, stay in the initial state if
either r, or ¬p holds, or non-deterministically transition to either S1 or S2 if none of the
Test Case L B P VC Time(Solver) Time(Total) Passed Violated Fail
1 commandLoop.TC1 545 ∞ 18 0 <1 4 18 0 0
2 commandLoop.TC2 545 500∗ 18 3 11 29 18 0 0
3 commandLoop.TC3 545 500∗ 18 3 11 29 18 0 0
4 commandLoop.TC4 545 17 18 5 8 14 18 0 0
5 commandLoop.TC5 545 ∞ 18 1 <1 4 18 0 0
6 commandLoop.TC6 545 ∞ 18 0 <1 4 18 0 0
7 commandLoop.TC7 545 1 18 15 15 19 18 0 0
8 commandLoop.TC8 545 1 18 11 28 31 18 0 0
9 checkCommandParams.TC1 238 17 17 56 <1 9 17 0 0
10 checkCommandParams.TC2 238 17 17 36 <1 5 17 0 0
11 checkCommandParams.TC3 238 17 17 37 <1 5 17 0 0
12 checkCommandParams.TC4 238 17 17 36 7 30 17 0 0
13 checkCommandParams.TC5 238 17 17 80 <1 50 17 0 0
14 checkCommandParams.TC6 238 17 17 664 15 44 17 0 0
15 checkCommandParams.TC7 238 20∗ 17 957 37 78 17 0 0
16 checkCommandParams.TC8 238 20∗ 17 1117 170 215 17 0 0
Table 6.3: Results for running the test cases for the functions commandLoop and
checkCommandParams.
properties in the functions commandLoop (see lines 2 and 3) and checkCommandParams
(see lines 15 and 16) due to unwinding violations. In any case, the test cases are still
useful since the verification of the functions is not completely deterministic, i.e., we
still have verification conditions to be checked by the SMT back-end, as we can see in
the column VC of Table 6.3, which shows the total number of generated verification
conditions. However, the generalization of the test cases does not produce significant
results here since we only have a small number of test cases available, and ESBMC
thus still runs out of memory or times out when checking the properties of the functions
commandLoop and checkCommandParams.
Function L B P Time(Solver) Time(Total) Passed Violated Fail PR10 PR11 PR12 PR13
1 threadRename 6 17 0 <1 3 0 0 0 X
2 ﬁleExists 19 17 0 <1 3 0 0 0 X
3 readLine 27 17 11 <1 3 1 0 0 X
4 getCommand 269 17 61 <1 3 61 0 0 X N/3 N/3
5 powerDown 9 17 0 <1 2 0 0 0 X
6 digitStart 12 17 0 <1 2 0 0 0 X Y/2
7 digitAdd 34 17 2 <1 2 2 0 0 X Y/2
8 checkEndOfPvrStream 32 17 13 <1 2 13 0 0 X Y/2
9 checkEndOfMediaStream 28 17 1 <1 2 1 0 0 X
10 commandLoop 545 17 53 Mf Mf - - - X Mf Mf
11 checkCommandParams 238 17 269 Tb Tb 0 0 269 X Tb Tb Tb
12 signal handler 13 17 0 <1 2 0 0 0 X
13 setupFBResolution 29 17 0 <1 2 0 0 0 X Y/3 Y/3 Y/2
14 setupFramebuﬀers 115 17 8 <1 3 8 0 0 X N/3 N/2 N/2
15 main Thread 68 17 4 <1 4 4 0 0 X Y/3 Y/3
16 set to raw 8 17 0 <1 3 0 0 0 X
17 set to buﬀered 8 17 0 <1 2 0 0 0 X N/2
Table 6.4: Results for checking the equivalence between the functions of the exStbDemo application.
6.5.2 Medical Device Case Study
In order to check ESBMC’s performance in verifying temporal properties, we analyzed
the embedded software of a pulse oximeter device, which is composed of device drivers
(i.e., display, keyboard, serial, sensor, and timer) that are hardware-dependent code, a
system log component that allows the developer to debug the code through data stored
on RAM memory, and an API that enables the application layer to call the services
provided by the platform. The ﬁnal version of the pulse oximeter embedded software
has approximately 3500 lines of ANSI-C code and 80 functions.
In order to meet the application’s deadline, there are 100 lines of Assembly code that
are responsible for writing text messages to the LCD hardware. ESBMC does not verify
Assembly code and as a result we execute this part of the code dynamically only by
writing diagnostic messages to a buﬀer so that we are able to examine the call stack (each
message written to the buﬀer reports the source ﬁle, line number, severity, and diagnostic
text). These diagnostic messages have proved to be quite useful for evaluating flight
software systems and help test engineers to understand the system behaviour [82].
Table 6.5 summarizes the results in the usual format. The column Property gives the
identiﬁer of the LTL property that has been checked. The column L gives the number
of lines of code of the test program while the column T reports the total number of
threads. Note that T is always three because here we only have the main, monitor and
event threads that are running, as described in Section 6.4. The column B provides the
unwinding bound for each loop while C is the context switch bound. We use the symbol -
to denote that C has not been speciﬁed, i.e., we do not restrict the context switch bound.
The Time column provides the time in seconds while the column #FI/#I provides the
total number of failed and generated interleavings respectively. The superscript † means
that we injected a fault in the module.
We checked two types of LTL properties (i.e., AG(p → F r) and AG(p)) over diﬀerent
modules of the pulse oximeter. Here, we describe in detail each property presented in
Table 6.5:
P1: Whenever the start button is pressed, the application will eventually be initialized
(i.e., AG(startButton → F startApp)). To check this property, we included
two additional Boolean variables into the program menu app to indicate whether
the start button has been pressed (represented by startButton) and whether the
application has been initialized (represented by startApp).
P2: The next position of the buffer is always less than its total size
(i.e., AG(next < buffer_size)). To check this property, we did not
change the program log since next and buffer_size are already declared as global
variables.
Test program Property L T B C Time #FI / #I
1 menu app P1 847 3 2 - 16 0/3003
847 3 3 20 271 0/50456
847 3 4 20 625 0/87386
2 menu app† P1 847 3 2 - 9 663/3003
847 3 3 20 121 7584/50456
847 3 4 20 218 12548/87386
3 log P2 135 3 2 - 12 0/12
135 3 3 - 820 0/22
135 3 4 10 1149 0/8
4 log† P2 135 3 2 - 1 12/16
135 3 3 - 3 27/31
135 3 4 - 5 48/52
5 keyboard P3 49 3 2 - 7 0/120
49 3 3 - 80 0/1001
49 3 4 - 1007 0/8568
6 keyboard† P3 49 3 2 - 1 2/6
49 3 3 - 1 3/8
49 3 4 - 1 4/10
7 serial P4 165 3 2 - 16 0/1287
165 3 3 - 980 0/50388
165 3 4 10 21 0/1023
8 serial† P4 165 3 2 - 3 347/1287
165 3 3 - 147 17286/50388
165 3 4 10 3 189/1023
9 sensor P5 584 3 2 20 333 0/27768
584 3 3 20 1452 0/54900
584 3 4 10 12 0/330
10 sensor† P5 584 3 2 20 56 4420/18096
584 3 3 20 211 4655/26326
584 3 4 20 365 4655/26708
Table 6.5: Results of the LTL properties veriﬁcation of the pulse oximeter.
P3: Whenever the bit 0 of the micro-controller port is set to high, the start button
of the pulse oximeter keyboard will eventually be detected (i.e., AG (BIT0 →
F startButton)). To check this property, we included two additional Boolean
variables into the program keyboard to indicate whether the ﬁrst bit of the micro-
controller port has been set to high (represented by BIT0) and whether the start
button has been detected (represented by startButton).
P4: Whenever we set the baud rate of the micro-controller serial port to 1200 bits/sec-
ond, then its serial register will eventually be conﬁgured (i.e., AG(br1200 → F
reg1200)). To check this property, we included two additional Boolean variables
into the program serial to indicate whether the baud rate has been set to 1200
bits/second (represented by br1200) and whether the serial register has been
configured (represented by reg1200).
P5: Whenever we receive the synchronism bit from the pulse oximeter sensor, its content
will eventually be stored into the checksum² array (i.e., AG(sync byte → F
checksum stored)). To check this property, we included two additional Boolean
variables into the program sensor to indicate whether the synchronization byte
has been received (represented by sync byte) and whether it has been stored into
the checksum array (represented by checksum stored).
Note that we have to manually introduce additional Boolean variables into all test pro-
grams (except for the test program log) in order to indicate whether a given event has
occurred or not. Note further that we have to manually merge the resulting C-monitor
into the code as we described in Section 6.4.
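The instrumentation with two Boolean variables can be illustrated by a small hand-written monitor for a response property of the form AG(p → F r). This is only a sketch under stated assumptions: it is not the C-monitor of Section 6.4, and the names monitor_t, monitor_step, monitor_ok_at_bound and check_trace are hypothetical. On a bounded trace it reports a violation if p has been observed and r has not followed before the bound is reached.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical monitor state for a response property AG(p -> F r). */
typedef struct {
    bool pending;   /* p has been observed and r has not yet followed */
} monitor_t;

/* Advance the monitor by one program step, given the current values of
 * the two instrumentation flags (e.g., startButton and startApp). */
static void monitor_step(monitor_t *m, bool p, bool r)
{
    if (r)
        m->pending = false;   /* obligation discharged (also if p holds now) */
    else if (p)
        m->pending = true;    /* new obligation: r must eventually hold */
}

/* On a finite (bounded) trace the property is reported as violated if an
 * obligation is still open when the bound is reached. */
static bool monitor_ok_at_bound(const monitor_t *m)
{
    return !m->pending;
}

/* Replay a finite trace of (p, r) observations through the monitor. */
static bool check_trace(const bool *p, const bool *r, int len)
{
    monitor_t m = { false };
    for (int i = 0; i < len; i++)
        monitor_step(&m, p[i], r[i]);
    return monitor_ok_at_bound(&m);
}
```

In an instrumented program, monitor_step would be called after each visible step with the current flag values, and the result of monitor_ok_at_bound would be asserted once the unwinding bound is reached. An open obligation at the bound only indicates that r did not occur within the explored prefix.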
As shown in Table 6.5, we also injected faults in all test programs as follows:
menu app: We do not initialize the application after the start button is pressed.
log: We change the program statements so that in a situation where the next index is
at the end of the array buﬀer, an overﬂowing index by one byte can occur, i.e., we
replace the program statement
next = (next + 1) % buffer_size;
by
next %= buffer_size;
next += 1;
keyboard: We comment out the break statement (of the following program statement
that is included into a switch-case: case START: command=startButton; break;)
so that if START was pressed, the code would fall through to the next line, and
have the wrong value assigned to startButton.
serial: Similar to the faulty keyboard program, we comment out the break statement
that selects the baud rate so that the case statement selecting the baud rate would,
in the case of 1200 baud, fall through a case and set the timer to a wrong value
(i.e., 2400).
sensor: We replaced assignments to an internal ﬂag (that detects the synchronization
bit) by non-deterministic values (i.e., flag = nondet_bool() ? true : false).
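Two of the injected faults can be reproduced in isolation. The following sketch contrasts the original and the faulty index update of the log program and the original and the faulty switch-case of the keyboard program. BUFFER_SIZE, the key code STOP and the command values NONE and stopButton are assumptions made for illustration; only the statements quoted in the text are taken from the case study.

```c
#include <assert.h>

#define BUFFER_SIZE 4   /* hypothetical size, chosen for illustration */

/* Original update: the index always stays in [0, BUFFER_SIZE - 1]. */
static int advance_ok(int next)
{
    return (next + 1) % BUFFER_SIZE;
}

/* Faulty update: wrap first, then increment, so an index of
 * BUFFER_SIZE - 1 becomes BUFFER_SIZE and violates next < BUFFER_SIZE. */
static int advance_faulty(int next)
{
    next %= BUFFER_SIZE;
    next += 1;
    return next;
}

enum key { START, STOP };                        /* hypothetical key codes */
enum command { NONE, startButton, stopButton };  /* hypothetical commands  */

/* Original decoding: each case ends with break. */
static enum command decode_ok(enum key k)
{
    enum command command = NONE;
    switch (k) {
    case START: command = startButton; break;
    case STOP:  command = stopButton;  break;
    }
    return command;
}

/* Faulty decoding: the break after START is commented out, so control
 * falls through and the value assigned for START is overwritten. */
static enum command decode_faulty(enum key k)
{
    enum command command = NONE;
    switch (k) {
    case START: command = startButton; /* break; */
    case STOP:  command = stopButton;  break;
    }
    return command;
}
```

With these definitions, advance_faulty(BUFFER_SIZE - 1) yields BUFFER_SIZE, which is exactly the state that property P2 excludes, and decode_faulty(START) no longer returns startButton.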
² The checksum of the pulse oximeter detects errors that might be introduced during the
data collection.
The pulse oximeter software is a reactive system and does not terminate. In general,
ESBMC can thus only check the LTL properties up to certain unwinding and context-
switch bounds, as shown in Table 6.5. However, for smaller values of the unwinding
bound B, the number of context switches is limited, and ESBMC is able to model check
the properties without a specified upper bound on the context switches (denoted by -
in Table 6.5). Additionally, if a given LTL property does not hold in the test program,
ESBMC is able to detect the violation in a few seconds, and about 15% of the generated
interleavings actually fail.
6.6 Related Work
One way of tackling large veriﬁcation problems is to leverage both parallelism and search
diversity [93]. Holzmann et al. describe the Swarm tool that allows using diﬀerent search
strategies on multi-core machines [93]. It is the main interface to the SPIN model checker
to verify larger systems. This approach, however, involves large communication overhead
and does not take into account information from the software conﬁguration management
(SCM) system in order to focus the veriﬁcation eﬀort on new and/or modiﬁed functions.
In another related work, Holzmann et al. exploit the availability of large chunks of
memory in order to explore the state space more exhaustively. However, the authors
do not consider that the search modes implemented in the SPIN model checker (mainly
based on depth-first search) still remain the main performance bottleneck to verify larger
systems [92].
Peled proposes a set of combinations between model checking and testing, which includes
black box checking, adaptive model checking, and unit checking [147]. However, he does
not consider the development history from the SCM system and also uses explicit model
checking based on automata theory, which does not scale well due to the number of
program variables and data type widths [64]. In addition, Peled only describes the
techniques, but does not apply them to any commercial product. Gunter and Peled [85]
extend this approach by proposing a symbolic veriﬁcation approach for a unit of code,
also called unit checking. The authors, however, apply this approach only to check
whether a complex number diverges to inﬁnity, while we focus on the veriﬁcation of
large embedded software.
Sen et al. propose an approach called concolic testing that aims to simultaneously execute
a program concretely and symbolically by combining random testing with symbolic
execution [77, 119, 160]. It thus removes partially the limitations of random testing (i.e.,
coverage) and symbolic execution (i.e., scalability). This approach, however, can fail to
compute concrete values that satisfy a given (large) path constraint (which can involve
complex expressions) due to the solver performance. In [119], however, Majumdar and
Sen propose an approach called hybrid concolic testing that combines random and
concolic testing and scales to large software implementations (e.g., programs with
up to 150K lines of code).
Godlin and Strichman describe an approach called regression veriﬁcation that aims to
prove the equivalence of two C programs [78, 167]. Their approach is built on top of
CBMC [42], which thus eliminates loops and recursive functions and can handle almost
all of the features of ANSI-C. In order to make their approach scale, the authors
isolate functions from their callees and abstract them with uninterpreted functions.
Godlin and Strichman apply their approach to random and industrial programs (e.g.,
from 300 to 3000 lines of code); and they are thus able to check equivalence in minutes.
Matsumoto et al. also describe an approach to check the equivalence of two C programs
using the SMT solver CVC [121]. Before checking the equivalence of the two programs,
the authors ﬁrst identify the textual diﬀerences between them in order to get hints where
the equivalence must be checked. However, in their approach the authors restrict the
C programs to be checked (e.g., no pointer uses) due to limitations of their symbolic
execution engine.
6.7 Conclusions
For large embedded software, SMT-Based bounded model checking suﬀers from the
state space explosion problem. In this chapter we proposed an approach called continu-
ous veriﬁcation to detect design errors as quickly as possible by looking at the software
conﬁguration management system and by combining dynamic and static veriﬁcation to
reduce the state space to be explored. As a result, the continuous veriﬁcation approach
and the combination of diﬀerent encodings and solvers allowed us to explore more ex-
haustively the state space of the program. Controlled experiments using a case study
from the telecommunications domain with more than 10K lines of C code show that
this approach can potentially improve the error-detection capability and reduce the ver-
iﬁcation time. However, the advantage of the continuous veriﬁcation approach is not
as substantial as expected, and in this sense we achieved the fourth objective stated in
Section 1.2 only partially. If there were more functional correctness assertions in
our case study, re-verification would be more expensive and the continuous verification
approach would then be more advantageous.
Chapter 7
Conclusions
In this thesis, we investigated SMT-based veriﬁcation for single- and multi-threaded
ANSI-C programs, focusing in particular on embedded software. As a ﬁrst step, we de-
scribed a new set of encodings that allow us to reason accurately about bit operations,
unions, ﬁxed-point arithmetic, arrays, pointers (and pointer arithmetic) and dynamic
memory allocation and implemented them in the ESBMC tool. We integrated the SMT
solvers CVC3, Boolector, and Z3 into ESBMC and evaluated them using both standard
software model checking benchmarks and typical embedded software applications from
telecommunications, control systems, and medical devices. Our experiments constitute,
to the best of our knowledge, the ﬁrst substantial evaluation of SMT-based bounded
model checking on industrial applications. The results show that our approach out-
performs CBMC [42] and SMT-CBMC [11] if we consider the veriﬁcation of embedded
software and thus conﬁrm that we successfully met the ﬁrst objective stated in Sec-
tion 1.2. ESBMC is able to model check ANSI-C programs that involve tight interplay
between non-linear arithmetic, bit operations, pointers and array manipulations. In
addition, it was able to ﬁnd undiscovered bugs in the NECLA, PowerStone, Siemens,
SNU-RT, VERISEC and WCET benchmarks related to arithmetic overﬂow, buﬀer over-
ﬂow, invalid pointers and pointer arithmetic.
Compared to ESBMC, SMT-CBMC still has limitations not only in the veriﬁcation
time (due to the lack of simpliﬁcation based on high-level information), but also in the
encodings of important ANSI-C constructs used in embedded software. CBMC is a
SAT-based BMC tool for full ANSI-C, but it has limitations due to the fact that the
size of the propositional formulae increases signiﬁcantly in the presence of large data-
paths and high-level information is lost when the veriﬁcation conditions are converted
into propositional logic (preventing potential optimizations to reduce the state space
to be explored). Its prototype SMT-based back-end is still unstable and fails on a
large fraction of our benchmarks. We also improved considerably the performance of
SMT-based bounded model checking for embedded software by making use of high-level
information to simplify the unwound formula and by determining the best representation
(i.e., SMT logics) to model the program variables and thus successfully met the third
objective stated in Section 1.2. As a result, our approach represents a promising direction
to improve the state space coverage and to verify quickly properties in larger state spaces
using bounded model checking.
Despite the large body of (theoretical) research in the veriﬁcation of multi-threaded sys-
tems, there are only few formal veriﬁcation tools that analyze multi-threaded programs
with shared variables and locks. As a second step, we presented the lazy, schedule record-
ing, and underapproximation and widening algorithms to model check multi-threaded
ANSI-C software with shared variables, mutexes and conditions. In the lazy approach, we
generate all possible interleavings and call the BMC procedure on each of them individ-
ually, until we either ﬁnd a bug, or have systematically explored all interleavings. In the
schedule recording approach, we encode all possible interleavings into one single formula
and then exploit the high speed of the SMT solvers. In the underapproximation-widening
approach, we reduce the state space by abstracting the number of state variables and
interleavings from the proofs of unsatisﬁability generated by the SMT solvers. In all
three approaches, we bound the number of context-switches and use partial-order re-
duction techniques to reduce the number of interleavings explored. We also presented
our modelling of the synchronization primitives of the Pthread library that allowed us to
detect not only atomicity and order violations, but also local and global deadlock, that
previous attempts are unable to ﬁnd [73, 102, 103, 152]. Surprisingly, our approach to
check constraints lazily is extremely fast for programs that contain errors and to a lesser
extent even for safe programs in which the number of threads and context switches grows
quickly. Our experimental results also show that the lazy approach generally outperforms
not only the schedule recording and underapproximation and widening approaches, but
also the CHESS [136] and SATABS [44] tools on several non-trivial benchmarks as well
as state-of-the-art techniques that combine classic partial order reduction methods with
symbolic algorithms. With these approaches to verify multi-threaded software (with
shared variables) we successfully met the second objective stated in Section 1.2.
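The idea behind the lazy approach can be sketched on a toy example: two threads that each increment a shared counter non-atomically (a load followed by a store). The sketch enumerates the interleavings depth-first under a context-switch bound and checks each complete interleaving individually, stopping at the first violation. It is an illustration under strong assumptions: each interleaving is executed concretely instead of being passed to the BMC procedure and an SMT solver, no partial-order reduction is applied, and all names (state_t, result_t, explore, check_lazily) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define STEPS 2          /* each thread performs a load, then a store */

typedef struct {
    int shared;          /* shared counter */
    int tmp[2];          /* thread-local copies of the counter */
    int pc[2];           /* program counter of each thread */
} state_t;

typedef struct {
    int explored;        /* complete interleavings checked so far */
    bool bug_found;      /* set once a violating interleaving is found */
} result_t;

/* Execute one visible step of thread t on state s. */
static void step(state_t *s, int t)
{
    if (s->pc[t] == 0)
        s->tmp[t] = s->shared;        /* load */
    else
        s->shared = s->tmp[t] + 1;    /* store */
    s->pc[t]++;
}

/* Depth-first enumeration of interleavings with at most `bound` context
 * switches; every complete interleaving is checked on its own and the
 * search stops as soon as the property shared == 2 is violated. */
static void explore(state_t s, int last, int switches, int bound,
                    result_t *res)
{
    if (res->bug_found)
        return;
    if (s.pc[0] == STEPS && s.pc[1] == STEPS) {
        res->explored++;
        if (s.shared != 2)
            res->bug_found = true;
        return;
    }
    for (int t = 0; t < 2; t++) {
        if (s.pc[t] == STEPS)
            continue;                 /* thread t has terminated */
        int sw = switches + ((last >= 0 && last != t) ? 1 : 0);
        if (sw > bound)
            continue;                 /* context-switch bound exceeded */
        state_t next = s;
        step(&next, t);
        explore(next, t, sw, bound, res);
    }
}

static result_t check_lazily(int bound)
{
    state_t init = { 0, {0, 0}, {0, 0} };
    result_t res = { 0, false };
    explore(init, -1, 0, bound, &res);
    return res;
}
```

With a bound of one context switch, only the two sequential schedules are complete and no violation is found. With a bound of two, the lost update is reached on the second complete interleaving and the search stops there, which mirrors why checking interleavings individually can be fast for programs that contain errors.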
For large embedded software, SMT-based bounded model checking still suﬀers from the
state space explosion problem. Finally, as a third step, we deﬁned and evaluated the
continuous veriﬁcation approach, which combines existing ideas of software engineering
(e.g., continuous integration [70]) and formal veriﬁcation (e.g., equivalence checking [30])
communities. We applied the elements of the continuous veriﬁcation approach to the
verification of small and large embedded software used in the medical and
telecommunications domains. In the medical device case study, ESBMC can only check the
LTL properties up to certain unwinding and context-switch bounds since the software is
a reactive system and does not terminate. However, if a given LTL property does not
hold, ESBMC is able to detect the violation in a few seconds, and about 15% of the
generated interleavings actually fail. In the telecommunication case study, we concluded
that the continuous verification approach can potentially reduce the verification time of
large embedded software systems. However, as the complete verification time of the functions
under observation is small, the advantage of the continuous verification approach is not
so pronounced and in this sense we achieved the fourth objective stated in Section 1.2
only partially; if there were more (functional correctness) assertions in the code,
re-verification would be more expensive and the continuous verification approach would
then be more advantageous.
7.1 Main Contributions
Our work makes two major contributions. First, we describe the details of an accurate
translation from single-threaded ANSI-C programs into quantiﬁer-free formulae using
the logics QF_AUFBV and QF_AUFLIRA from the SMT-LIB [164]. We demonstrate
that our encoding and optimizations improve the performance of software model checking
for a wide range of software systems, with a particular emphasis on embedded software,
if compared to other approaches proposed by Kroening [105] and Armando et al. [11]. To
the best of our knowledge, no SMT-based BMC tool existed that can reliably handle full
ANSI-C. Additionally, we show that our encoding allows us to reason about arithmetic
overflow and to verify programs that make use of bit-level operations, pointers, unions
and fixed-point arithmetic, where previous attempts fail [105, 11, 71, 88]. We also use three
diﬀerent SMT solvers (Boolector, CVC3, and Z3) in order to check the eﬀectiveness
of our encoding techniques. This evaluation thus allows us to quantitatively assess
the beneﬁt of using SMT solvers in software veriﬁcation; in addition, it also provides
direction for the new development of SMT solvers.
The second main contribution is in the combination of symbolic model checking with ex-
plicit state space exploration that underlies our lazy, schedule recording and underapproxi-
mation-widening approaches to handling multi-threaded software. In particular, the dif-
ference between our approach and that of Cimatti et al. [39] is that we use BMC instead
of predicate abstraction and we implement a realistic scheduler, i.e., our scheduler may
preempt a thread at any visible instruction in its execution, whereas Cimatti et al. [39]
encode the semantics of the non-preempting SystemC scheduler. To the best of our
knowledge, the lazy approach has not been described or evaluated in the literature.
Similarly, the underapproximation-widening approach has not been used for bounded
model checking of multi-threaded software; also our approach uses a diﬀerent encod-
ing based on the notion of eﬀective context-switch blocks. The diﬀerence between our
schedule recording and Gupta et al. [103] is that they work in a fully symbolic context.
7.2 Future Work Directions
Conceptually, software debugging can be divided into three main steps: fault detec-
tion, fault localization and fault correction. In order to detect faults in multi-threaded
software, all possible thread interleavings must be systematically explored, which is par-
ticularly diﬃcult for traditional testing. Fault localization (and thus correction) is in
general a very time-consuming process in software development, which becomes even
worse for multi-threaded software mainly due to the non-determinism of the thread
interleavings. A number of diﬀerent approaches have been proposed in the literature
to localize faults in software systems, including, for example, slicing, mutation testing,
trace-based analysis, delta-debugging, model-based debugging and model checking (for a
recent survey we refer the reader to [122]). Apart from these approaches, the debugging
time can be substantially reduced if an automatic method is used to localize faults in
multi-threaded software. As future work, we thus intend to develop a new method for
fault localization in multi-threaded C programs using model checking. In particular,
we intend to extend the sequential fault localization method proposed by Griesmayer
et al. [81] to localise faults in multi-threaded programs and thus evaluate this approach
with industrial benchmarks using our ESBMC model checker.
We also intend to investigate the application of Craig interpolation [125] and the lazy
abstraction paradigm [127] to the veriﬁcation of multi-threaded software. However, dif-
ferently from [127], we intend to use Craig interpolation to derive thread invariants and
not just for unfolding sequential programs. The interpolation-based model checking
algorithm [125] described in Subsection 2.2.3.1 requires an unfolding of the entire pro-
gram up to some bound k. In contrast to [125], we would like to use a lazy abstraction
method (similar to [127]) so that we can apply the SMT solvers to individual program
paths in order to reduce the burden on the solver. In order to achieve this goal, we would
have to reﬁne the model using interpolants derived from refuting program paths. This
would avoid the high cost of computing the predicate image operator (as described in
Subsection 2.2.3.1), allowing us to improve substantially the performance of the model
checker.
Another direction of future work we intend to pursue is to investigate the problem of
verifying real-time software using SMT techniques. Most model checkers (e.g., UPPAAL [113],
TSMV [120] and NuSMV [38]) that reason about timing properties in real-time systems
assume that the model is expressed as a timed automaton (TA) and they use explicit
state-space exploration or BDD-based model checking techniques. To the best of our
knowledge, there is only one paper that considers the verification of real-time systems
using SMT techniques for checking the satisfiability of the generated formula, which is
described by Xu [181]. Xu applies his method to verify liveness and timing properties of
the form F m..nφ (where m and n represent upper and lower time bounds respectively)
in two models, Fischer’s Protocol and the Bridge-crossing problem [181]. However,
Xu does not directly support real-time software and assumes that in the model each
transition takes unit time for execution. This assumption is not realistic because for
embedded real-time systems, we need mechanisms to assign values to each transition so
that these values are the estimated worst-case execution times (WCET) of the respective
transition on the selected processor.
7.3 Concluding Remarks
Embedded computer systems are used in a wide range of sophisticated applications,
such as mobile phones or set-top boxes providing internet connectivity. The functional-
ity demanded in such applications has increased signiﬁcantly and an increasing number
of functions are implemented in software rather than hardware. Multi-core processors
with scalable shared memory have thus become popular in embedded systems. In turn,
the veriﬁcation of the software design and the correctness of its multi-threaded im-
plementations has become increasingly diﬃcult. This thesis, in particular, proposed a
comprehensive and implemented SMT-based bounded model checking procedure to rea-
son accurately and eﬀectively about single- and multi-threaded software in embedded
systems by exploiting SMT solvers in order to prune the property and data dependent
search space and to remove interleavings that are not relevant by analyzing the proof of
unsatisﬁability. However, the development of reliable embedded software is a complex
problem [100] and software veriﬁcation for embedded systems is still in its infancy since
it has been little explored by the research community. Tools for model checking software
are still under heavy development, as observed recently by [81]. Therefore, the develop-
ment of software model checkers based on SMT techniques is still a fertile research area
that should be further explored.
Appendix A
ESBMC plug-in
This appendix describes the Eclipse plug-in for ESBMC in order to assist the veriﬁcation
engineer when using the ESBMC model checker. The main ESBMC plug-in window
consists of ﬁve tabs that allow you to set the diﬀerent run-time options of the ESBMC
model checker. This plug-in was developed with the help of Qiang Li during his summer
internship.
The ESBMC plug-in is developed in Eclipse Helios [87], release 3.6 with the Java Run-
Time Environment (JRE) 1.6 running on a Linux operating system. This appendix
describes only the main features of the ESBMC plug-in. For further information (e.g.,
how to install, how to use, and how to uninstall the ESBMC plug-in), we refer the reader
to the user manual of the ESBMC plug-in available on-line at [52].
A.1 Front-end Options
Figure A.1 shows the options available in the ESBMC front-end, which are described as
follows:
• Use current ﬁle: You can analyze the ﬁle that is open in your current editor.
If you want to analyze other ﬁles located in your ﬁle system, then uncheck the
box Use current ﬁle and click on Browse to choose the program that you want to
analyze. If your application consists of more than a single C program, then you
can specify them as a sequence (e.g., /home/esbmc/ﬁle1.c ﬁle2.c).
• Set include path: You can set the include path, which contains the .h ﬁles, by
clicking on the Browse button.
• Define preprocessor macro: You can define C preprocessor macros in this text
area by providing just the name, without the leading # directive.
Figure A.1: Front-end options.
• Program, loop, claim and VCs: You can choose to show the preprocessed
program, all the claims (i.e., the properties given as assertions by the designers
as well as a range of language-specific safety properties, such as the absence of
arithmetic under- and overflow, out-of-bounds array indexing, or nil-pointer
dereferencing), the verification conditions that are generated during BMC, the
identification of the loops in the program, the expressions of the program in single
static assignment (SSA) form, and the documentation (in LaTeX) of the generated
claims. However, note that all these options are mutually exclusive, because they
produce output to the same (Console) view, i.e., you can visualize only one of them
at a time.
• Machine word length: You can set your machine word length; the default is 32.
• Disable built-in abstract C library: C programs usually use functions
of the ANSI-C library (e.g., strcmp, printf), which contain information that is
irrelevant from the verification point of view. We thus provide an abstract ANSI-C
library implemented internally in the model checker, which comprises a small set
of the functions. If you do not want to use the built-in ANSI-C library, then you
should select this option.
• Read goto program instead of source code: This option allows you to
model check the goto programs (i.e., control-flow graphs) generated by the goto-cc
tool [179].
A.2 BMC Options
The BMC options of the ESBMC plug-in are shown in Figure A.2. It consists of the
following options:
Figure A.2: BMC options.
• Set function name: You can set the main function name here.
• Only check specific claim: You can check a specific claim by entering its
identification number.
• Limit search depth: You can limit the search depth by entering an integer
number.
• Unwind times: Set the unwind bound here; the default is 2. You have to
provide an integer number.
• Unwind given loop times: This option allows you to unwind a speciﬁc loop in
your program. Here, you should provide the identification of the loop.
• Do not generate unwinding assertions: If you do not want to check that
you have unrolled the loops in your program far enough, then you should select this
option.
• Do not remove unused equations: If it is unchecked, unused equations are
removed automatically during the symbolic execution.
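The role of the unwind bound and of the unwinding assertion can be illustrated with a small loop that is unwound by hand. The sketch below assumes an unwind bound of 2 and uses hypothetical function names. It shows the general technique, not the code that ESBMC generates internally.

```c
#include <assert.h>
#include <stdbool.h>

/* Original loop: sums the first n elements of a. */
static int sum_loop(const int *a, int n)
{
    int s = 0;
    for (int i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* The same loop unwound with bound 2. *unwound_enough receives the
 * condition that an unwinding assertion would check: after two copies of
 * the loop body the loop guard must be false, otherwise the bound was
 * too small and part of the behaviour has been cut off. */
static int sum_unwound2(const int *a, int n, bool *unwound_enough)
{
    int s = 0;
    int i = 0;
    if (i < n) {            /* first copy of the loop body */
        s += a[i];
        i++;
        if (i < n) {        /* second copy of the loop body */
            s += a[i];
            i++;
        }
    }
    *unwound_enough = !(i < n);   /* unwinding assertion condition */
    return s;
}
```

For n up to 2 the unwound version agrees with the loop and the unwinding condition holds. For n equal to 3 the condition is false, which is the situation that the unwinding assertion reports when it is not disabled.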
A.3 SMT Solver Conﬁguration
The solver conﬁguration options tab is shown in Figure A.3. It consists of the following
options:
Figure A.3: SMT Solver Conﬁguration.
• SMT solvers: If you choose the first one, the model checker will use BOOLECTOR
with bit-vector arithmetic as a decision procedure to model check your program.
The second option is to use Z3 with bit-vector arithmetic, and the third
option is Z3 with integer/real arithmetic. The last option, which is the default
option, will determine the best solver and encoding to be used according to the
veriﬁcation conditions that are generated from your C program.
• Instantiation: You can choose either eager or lazy instantiation to solve the SMT
instances with Z3 (lazy is the default option).
A.4 Property Check
You can select which safety properties you want to check in your single-threaded program
as shown in Figure A.4. This tab consists of the following options:
Figure A.4: Property check.
• Ignore assertions: This option ignores all assertions in your C program.
• Do not do array bounds check: This option does not allow ESBMC to generate
veriﬁcation conditions related to checking out-of-bounds array indexing.
• Do not do division by zero check: This option does not allow ESBMC to
generate veriﬁcation conditions related to checking division by zero in arithmetic
expressions.
• Do not do pointer check: This option does not allow ESBMC to generate
veriﬁcation conditions related to checking nil-pointer dereferencing.
• Enable arithmetic over- and underflow check: This option allows
ESBMC to generate verification conditions related to checking arithmetic over-
and underflow.
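The safety properties controlled by this tab can be read as side conditions on individual C operations. The following sketch writes these conditions as ordinary C predicates, one per property class. The function names and the array size N are hypothetical, and the predicates only approximate the verification conditions that a model checker derives automatically from the program text.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define N 8   /* hypothetical array size */

/* Array bounds check: condition that must hold before a[i] is accessed
 * for an array a of N elements. */
static bool index_in_bounds(int i)
{
    return i >= 0 && i < N;
}

/* Division by zero check: condition for evaluating x / d. */
static bool divisor_nonzero(int d)
{
    return d != 0;
}

/* Pointer check: condition for evaluating *p (only the nil-pointer case
 * is shown here). */
static bool pointer_not_nil(const int *p)
{
    return p != NULL;
}

/* Over- and underflow check: condition for the signed addition x + y,
 * written so that the test itself cannot overflow. */
static bool add_no_overflow(int x, int y)
{
    if (y > 0)
        return x <= INT_MAX - y;
    return x >= INT_MIN - y;
}
```

Disabling one of the checks in the tab corresponds to not generating the respective condition, so a violation of that kind would go unreported.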
A.5 Concurrency Check
You can select which approaches and properties you want to use to verify your
multi-threaded program, as shown in Figure A.5.
Figure A.5: Concurrency check.
• Limit the number of context switches: Limit the number of context switches
allowed for each thread. You have to provide an integer number here.
• Use schedule recording approach: This option allows ESBMC to encode all possible
interleavings into one single formula and then exploit the high speed of the SMT
solvers.
• Use under-approximation and widening approach: This option allows ESBMC to
check models with an increasing set of allowed interleavings.
• Limit the number of assumptions: If you choose Use under-approximation and
widening approach, then you can limit the number of assumptions in the UW
approach. You have to provide an integer number in the text area.
• Enable global and local deadlock check with mutex: This option checks whether
all threads wait for a mutex (global deadlock) or whether some of the threads form
a waiting cycle (local deadlock).
• Enable data races check: This option checks whether multiple threads perform
unsynchronized accesses to shared data.
• Do not do lock acquisition ordering check: This option checks for unintended
sequence of lock and unlock operations among the threads.
• Enable atomicity violation check at visible assignments: This option allows ES-
BMC to break visible statements to check if a region of code executes atomically.
• Enable context switch before control ﬂow tests: This option allows ESBMC to
simulate the eﬀect of a context switch right after a visible test by hoisting the test
out of the conditional, and assigning its result to a new auxiliary variable.
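The hoisting described for the last option can be sketched as follows. The shared variable x and the hook that stands in for a context switch are assumptions made for illustration; in the model checker the switch point is explored symbolically rather than through a function pointer.

```c
#include <assert.h>
#include <stdbool.h>

static int x;                       /* shared variable (hypothetical) */

/* Hypothetical hook standing in for a context switch at this point. */
typedef void (*switch_hook)(void);

static void no_switch(void) { }
static void other_thread_sets_x(void) { x = 1; }

/* Original statement:  if (x == 0) return x; else return -1;
 * Hoisted form: the test is evaluated into an auxiliary variable, a
 * context switch may happen, and only then is the branch taken. */
static int hoisted(switch_hook context_switch)
{
    bool tmp = (x == 0);    /* visible test, hoisted out of the conditional */
    context_switch();       /* point where another thread may run */
    if (tmp)
        return x;           /* the programmer expects x == 0 here */
    return -1;
}
```

Without an intervening switch the hoisted form behaves like the original conditional. If another thread writes x between the test and the branch, the branch is taken on a stale test result, which is the kind of behaviour this option makes visible.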
A.6 Counterexample, Property Violation, and Claim Views
In order to model check your C program, you should click on the Save and Check button
as shown in Figure A.1, or click on Verify Current File menu item or simply type the
shortcut CTRL+ALT+C. When the veriﬁcation fails (i.e., the property does not hold
in the program), you can see details of the property violation and counterexample (or
trace to reproduce the violation) in the corresponding views as shown in the bottom of
Figure A.6.
Figure A.6: Counterexample view.
If you double click on the variable name in the counterexample view, then you go directly
to the corresponding line in the program where the error is located. The property
violation and claim views work in the same way as the counterexample view, i.e., you
should click on one line of the table in order to go directly to the corresponding line in
the program. If you need to obtain more information about the results of other options
of the ESBMC model checker (e.g., show program only, show loops), then you can easily
visualize them in the Eclipse console.
Appendix B
Static Analysis Benchmarks
                                          Time            Properties
    Module                L    B     P    Solver  Total   Passed  Violated  Fail
1   EUREKA bf20           49   21    41   0.06    1       41      0         0
2   EUREKA BubbleSort     305  141   160  125.98  335     160     0         0
3   EUREKA Prim           79   9     41   0.15    1       41      0         0
4   EUREKA SelectionSort  309  141   156  12.63   155     156     0         0
5   EUREKA StrCmp         14   1000  6    32      35      6       0         0
6   EUREKA SumArray       12   1000  7    9       10      7       0         0
7   EUREKA MinMax         19   1000  9    2       6       9       0         0
-   Total                 787  -     420  181.82  543     420     0         0
Table B.1: Results of applying ESBMC to the veriﬁcation of the benchmarks from
the EUREKA suite.
                                         Time            Properties
    Module               L     B     P     Solver  Total  Passed  Violated  Fail
1   POWERSTONE adpcm     473   55    545   199.35  263    545     0         0
2   POWERSTONE bcnt      83    17    157   1.14    1      157     0         0
3   POWERSTONE blit      95    1025  133   11.34   17     126     12        0
4   POWERSTONE compress  565   120   367   312.29  318    367     0         0
5   POWERSTONE cr        99    257   22    0.25    8      22      0         0
6   POWERSTONE engine    291   2     295   0.05    1      295     0         0
7   POWERSTONE fir       116   34    124   0.36    3      124     0         0
8   POWERSTONE g3fax     606   2     143   47.69   48     143     0         0
9   POWERSTONE jpeg      529   5     245   155.3   157    245     0         0
-   Total                2857  -     2031  728     816    2019    12        0
Table B.2: Results of applying ESBMC to the veriﬁcation of the benchmarks from
the PowerStone suite.
B.3 NECLA Suite
Table B.3 shows the results of applying ESBMC to the veriﬁcation of the correct pro-
grams from the NECLA benchmarks. Note that ESBMC ﬁnds three property violations
in two programs (ex13 and ex28) from Table B.3, which have been conﬁrmed as true
faults by the benchmark creators [97].
Table B.4 shows the results of applying ESBMC to the veriﬁcation of the bad programs
from the NECLA benchmarks. Note that ESBMC was able to verify two programs
(ex25 and ex40) from Table B.4 that did not contain any seeded errors; the benchmark
creators conﬁrmed that these two programs were misclassiﬁed and subsequently changed
the error seeding [97].
No. Module L B P Time(Solver) Time(Total) Properties(Passed) Properties(Violated) Properties(Fail)
1 NEC.ex10 72 17 10 0.02 1 10 0 0
2 NEC.ex11 24 1000 3 17.14 30 3 0 0
3 NEC.ex13 9 33 2 0 1 1 1 0
4 NEC.ex14 15 11 5 0 1 5 0 0
5 NEC.ex15 34 2 5 0 1 5 0 0
6 NEC.ex16 34 10000 4 1.47 6 4 0 0
7 NEC.ex17 44 101 14 0.01 1 14 0 0
8 NEC.ex19 28 10 2 0.02 1 2 0 0
9 NEC.ex1 22 513 10 0.27 3 10 0 0
10 NEC.ex21 25 1024 6 0.02 1 6 0 0
11 NEC.ex22 38 51 9 0.01 1 9 0 0
12 NEC.ex23 20 37 1 0.01 1 1 0 0
13 NEC.ex24 78 1000 37 0.11 0.64 37 0 0
14 NEC.ex28 12 101 11 0.01 1 9 2 0
15 NEC.ex29 47 101 32 0.01 1 32 0 0
16 NEC.ex2 39 1025 4 12.21 19 4 0 0
17 NEC.ex30 45 101 16 1.16 3 16 0 0
18 NEC.ex31 13 8 6 0.02 1 6 0 0
19 NEC.ex32 26 1001 4 0.16 1 4 0 0
20 NEC.ex33 35 100 13 0 1 13 0 0
21 NEC.ex34 24 10 7 0.01 1 7 0 0
22 NEC.ex37 26 10 5 0 1 1 0 0
23 NEC.ex38 25 201 16 0 1 16 0 0
24 NEC.ex39 26 100 4 1.1 1 4 0 0
25 NEC.ex42 32 40 12 63.05 87 12 0 0
26 NEC.ex49 15 100 2 0.12 1 2 0 0
27 NEC.ex5 17 100 6 0 1 6 0 0
28 NEC.ex6 20 100 0 1 0 0 0 0
29 NEC.ex7 27 100 3 0.16 1 3 0 0
30 NEC.ex8 19 100 5 0.37 1 5 0 0
- Total 891 - 254 98 172 212 3 0
Table B.3: Results of applying ESBMC to the veriﬁcation of the correct benchmarks
from the NECLA suite.
B.4 SNU-RT Suite
Table B.5 shows the results of applying ESBMC to the verification of the programs from
the SNU-RT suite. Note that ESBMC finds array bounds violations and overflows in
arithmetic expressions in four of the SNU-RT benchmarks (crc_nondet, fibcall_nondet,
insertsort_nondet and jfdctint_det); we confirmed by inspection that these are indeed
faults.
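To make the two violation classes concrete, the following C sketch spells out the kind of safety conditions involved: an index must lie within the array bounds, and a signed addition must not overflow. The fragment is hypothetical and is not taken from the SNU-RT sources; the names index_in_bounds, add_is_safe and checked_sum are assumptions introduced only for illustration, and the sketch does not claim to reproduce how ESBMC encodes these checks internally.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: 1 if idx is a valid index into an array of len elements. */
int index_in_bounds(long idx, size_t len) {
    return idx >= 0 && (size_t)idx < len;
}

/* Hypothetical helper: 1 if a + b is representable as int (no signed overflow). */
int add_is_safe(int a, int b) {
    if (b > 0)
        return a <= INT_MAX - b;   /* a + b must not exceed INT_MAX */
    return a >= INT_MIN - b;       /* a + b must not fall below INT_MIN */
}

/* Sums the first n elements of buf (which holds len elements).
 * Returns 1 and stores the sum in *out if every step is safe;
 * returns 0 as soon as a bounds or overflow condition would be violated. */
int checked_sum(const int *buf, size_t len, long n, int *out) {
    int acc = 0;
    for (long i = 0; i < n; i++) {
        if (!index_in_bounds(i, len))
            return 0;              /* would read outside the array */
        if (!add_is_safe(acc, buf[i]))
            return 0;              /* would overflow the accumulator */
        acc += buf[i];
    }
    *out = acc;
    return 1;
}
```

For example, summing the two-element array {INT_MAX, 1} is rejected at the second step because the addition would overflow, and asking for four elements of a three-element array is rejected by the bounds condition.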
No. Module L B P Time(Solver) Time(Total) Properties(Passed) Properties(Violated) Properties(Fail)
1 NEC.ex12 23 21 4 0 1 3 1 0
2 NEC.ex20 32 10 12 0.01 1 12 0 0
3 NEC.ex25 26 101 7 0.07 2 7 0 0
4 NEC.ex26 29 101 11 0.09 1 9 2 0
5 NEC.ex27 39 101 9 0.03 1 7 2 0
6 NEC.ex3 25 11 4 0 1 3 1 0
7 NEC.ex40 19 101 9 0.54 1 9 0 0
8 NEC.ex41 22 10 10 32.05 32 6 4 0
9 NEC.ex4 15 1000 6 0.01 1 4 2 0
10 NEC.ex43 112 21 40 4.709 6 27 13 0
- Total 342 - 112 37 47 87 25 0
Table B.4: Results of applying ESBMC to the veriﬁcation of the bad benchmarks
from the NECLA suite.
No. Module L B P Time(Solver) Time(Total) Properties(Passed) Properties(Violated) Properties(Fail)
1 SNU.bs_det 114 16 11 0.001 1 11 0 0
2 SNU.bs_nondet 120 16 12 0.071 9 12 0 0
3 SNU.crc_det 125 257 18 0.082 8 18 0 0
4 SNU.crc_nondet 126 257 13 0.29 7 12 1 0
5 SNU.fft1_det 218 9 72 0.004 1 72 0 0
6 SNU.fft1k_nondet 158 0 39 0.763 50 39 0 0
7 SNU.fibcall_det 83 10000 2 0.005 1 2 0 0
8 SNU.fibcall_nondet 84 10000 2 0 157 0 2 0
9 SNU.fir_det 314 34 25 0.361 3 25 0 0
10 SNU.fir_nondet 316 34 25 0.326 2 25 0 0
11 SNU.insertsort_det 86 12 17 0.557 1 17 0 0
12 SNU.insertsort_nondet 94 12 20 4.981 5 14 6 0
13 SNU.jfdctint_det 374 65 331 0.471 2 311 20 0
14 SNU.lms_det 258 202 35 4.358 297 35 0 0
15 SNU.lms_nondet 256 202 35 2.488 21 35 0 0
16 SNU.ludcmp_det 144 144 88 0.042 1 88 0 0
17 SNU.matmul_det 81 6 31 0.055 1 31 0 0
18 SNU.qurt_det 164 20 8 0.139 1 8 0 0
19 SNU.select_nondet 117 1 42 0.001 1 42 0 0
20 SNU.sqrt_det 88 20 2 0.002 1 2 0 0
- Total 3320 - 828 15 570 799 29 0
Table B.5: Results of applying ESBMC to the veriﬁcation of the benchmarks from
the SNU-RT suite.
B.5 VERISEC Suite
Tables B.6, B.7 and B.8 show the results of applying ESBMC to the verification of
the correct programs from the VERISEC suite. ESBMC ﬁnds 15 property violations in
nine programs (see programs 67-73, 75 and 76), which have also been conﬁrmed by the
benchmark creators [36].
Tables B.9, B.10 and B.11 show the results of applying ESBMC to the verification of
the bad programs from the VERISEC suite.
No. Module L B P Time(Solver) Time(Total) Properties(Passed) Properties(Violated) Properties(Fail)
1 VERISEC.ok apache full-ok 58 5 42 0.05 1 42 0 0
2 VERISEC.ok apache full-ptr-ok 57 5 37 0.05 1 37 0 0
3 VERISEC.ok apache simp2-ok 43 5 24 0.03 1 24 0 0
4 VERISEC.ok apache simp3-ok 55 5 40 0.04 1 40 0 0
5 VERISEC.ok apache strncmp-ok 41 5 25 0.03 1 25 0 0
6 VERISEC.ok bind expands-vars-ok 89 1 29 0.01 1 29 0 0
7 VERISEC.ok gxine simp-ok 31 5 15 0 1 15 0 0
8 VERISEC.ok libgd gd-no-entities-ok 117 3 30 0.08 1 30 0 0
9 VERISEC.ok libgd gd-simp-ok 95 3 28 0.05 1 28 0 0
10 VERISEC.ok MADWiFi no-sprintf-ok 53 3 19 0 1 14 0 0
11 VERISEC.ok NetBSD-libc anyMeta-int-ok 50 10 12 0.83 2 12 0 0
12 VERISEC.ok NetBSD-libc anyMeta-ptr-ok 52 10 11 0.74 2 11 0 0
13 VERISEC.ok NetBSD-libc bounds-ok 17 0 2 0 1 2 0 0
14 VERISEC.ok NetBSD-libc glob2-int-ok 91 12 28 11.69 19 28 0 0
15 VERISEC.ok NetBSD-libc glob2-ptr-ok 92 12 27 21.19 29 27 0 0
16 VERISEC.ok NetBSD-libc loop-int-ok 39 4 6 0.01 1 6 0 0
17 VERISEC.ok NetBSD-libc loop-ok 24 4 3 0 1 3 0 0
18 VERISEC.ok NetBSD-libc loop-ptr-ok 39 4 5 0 1 5 0 0
19 VERISEC.ok NetBSD-libc noAnyMeta-int-ok 43 10 10 2.06 3 10 0 0
20 VERISEC.ok NetBSD-libc noAnyMeta-ptr-ok 45 10 9 0.63 2 9 0 0
21 VERISEC.ok OpenSER cases1-stripFullBoth-arr-inlined-ok 60 10 43 4.04 5 43 0 0
22 VERISEC.ok OpenSER cases1-stripFullBoth-arr-ok 59 10 38 3.35 4 38 0 0
23 VERISEC.ok OpenSER cases1-stripFullEnd-arr-inlined-ok 54 9 35 0.24 1 35 0 0
24 VERISEC.ok OpenSER cases1-stripFullEnd-arr-ok 53 10 30 0.16 1 30 0 0
25 VERISEC.ok OpenSER cases1-stripFullStart-arr-inlined-ok 56 10 35 1.77 3 35 0 0
26 VERISEC.ok OpenSER cases1-stripFullStart-arr-ok 55 10 30 1.7 2 30 0 0
27 VERISEC.ok OpenSER cases1-stripNone-arr-inlined-ok 50 10 27 0.08 1 27 0 0
28 VERISEC.ok OpenSER cases1-stripNone-arr-ok 49 10 22 0.05 1 22 0 0
29 VERISEC.ok OpenSER cases1-stripSpacesBoth-arr-inlined-ok 56 10 0 1 0 0 0
30 VERISEC.ok OpenSER cases1-stripSpacesBoth-arr-ok 55 10 28 1.38 2 28 0 0
Table B.6: Results of applying ESBMC to the verification of the correct benchmarks from the VERISEC suite - Part I.
No. Module L B P Time(Solver) Time(Total) Properties(Passed) Properties(Violated) Properties(Fail)
31 VERISEC.ok OpenSER cases1-stripSpacesEnd-arr-inlined-ok 53 10 30 0.13 1 30 0 0
32 VERISEC.ok OpenSER cases1-stripSpacesEnd-arr-ok 52 10 25 0.1 1 25 0 0
33 VERISEC.ok OpenSER cases1-stripSpacesStart-arr-inlined-ok 53 10 30 1.16 2 30 0 0
34 VERISEC.ok OpenSER cases1-stripSpacesStart-arr-ok 52 10 25 0.93 2 25 0 0
35 VERISEC.ok OpenSER cases2-stripFullBoth-arr-inlined-ok 63 10 45 7.54 9 45 0 0
36 VERISEC.ok OpenSER cases2-stripFullBoth-arr-ok 62 10 40 6.05 7 40 0 0
37 VERISEC.ok OpenSER cases2-stripFullEnd-arr-inlined-ok 57 10 37 0.66 1 37 0 0
38 VERISEC.ok OpenSER cases2-stripFullEnd-arr-ok 56 10 32 0.36 1 32 0 0
39 VERISEC.ok OpenSER cases2-stripFullStart-arr-inlined-ok 59 10 37 3.25 5 37 0 0
40 VERISEC.ok OpenSER cases2-stripFullStart-arr-ok 58 10 32 2.39 3 32 0 0
41 VERISEC.ok OpenSER cases2-stripNone-arr-inlined-ok 53 10 29 0.13 1 29 0 0
42 VERISEC.ok OpenSER cases2-stripNone-arr-ok 52 10 24 0.07 1 24 0 0
43 VERISEC.ok OpenSER cases2-stripSpacesBoth-arr-inlined-ok 59 10 35 4.01 5 35 0 0
44 VERISEC.ok OpenSER cases2-stripSpacesBoth-arr-ok 58 10 30 2.7 4 30 0 0
45 VERISEC.ok OpenSER cases2-stripSpacesEnd-arr-inlined-ok 56 10 32 0.49 1 32 0 0
46 VERISEC.ok OpenSER cases2-stripSpacesEnd-arr-ok 55 10 27 0.18 1 27 0 0
47 VERISEC.ok OpenSER cases2-stripSpacesStart-arr-inlined-ok 56 10 32 2.16 3 32 0 0
48 VERISEC.ok OpenSER cases2-stripSpacesStart-arr-ok 55 10 27 1.55 2 27 0 0
49 VERISEC.ok OpenSER cases3-stripFullBoth-arr-inlined-ok 66 10 47 7.11 8 47 0 0
50 VERISEC.ok OpenSER cases3-stripFullBoth-arr-ok 65 10 42 6.52 8 42 0 0
51 VERISEC.ok OpenSER cases3-stripFullEnd-arr-inlined-ok 60 10 39 0.75 1 39 0 0
52 VERISEC.ok OpenSER cases3-stripFullEnd-arr-ok 59 10 34 0.55 1 34 0 0
53 VERISEC.ok OpenSER cases3-stripFullStart-arr-inlined-ok 62 10 39 4.08 5 39 0 0
54 VERISEC.ok OpenSER cases3-stripFullStart-arr-ok 61 10 34 3.4 4 34 0 0
55 VERISEC.ok OpenSER cases3-stripNone-arr-inlined-ok 56 10 31 0.26 1 31 0 0
56 VERISEC.ok OpenSER cases3-stripNone-arr-ok 55 10 26 0.16 1 26 0 0
57 VERISEC.ok OpenSER cases3-stripSpacesBoth-arr-inlined-ok 62 10 37 4.32 5 37 0 0
58 VERISEC.ok OpenSER cases3-stripSpacesBoth-arr-ok 61 10 32 3.15 4 32 0 0
59 VERISEC.ok OpenSER cases3-stripSpacesEnd-arr-inlined-ok 59 10 34 0.53 1 34 0 0
60 VERISEC.ok OpenSER cases3-stripSpacesEnd-arr-ok 58 10 29 0.33 1 29 0 0
Table B.7: Results of applying ESBMC to the verification of the correct benchmarks from the VERISEC suite - Part II.
No. Module L B P Time(Solver) Time(Total) Properties(Passed) Properties(Violated) Properties(Fail)
61 VERISEC.ok OpenSER cases3-stripSpacesStart-arr-inlined-ok 59 10 34 2.15 3 34 0 0
62 VERISEC.ok OpenSER cases3-stripSpacesStart-arr-ok 58 10 29 1.79 3 29 0 0
63 VERISEC.ok samba simp-ok 22 1 2 0 1 2 0 0
64 VERISEC.ok sendmail both-ok 78 5 38 0.42 1 38 0 0
65 VERISEC.ok sendmail close-angle-ptr-no-test-ok 44 3 8 0.01 1 8 0 0
66 VERISEC.ok sendmail inner-ok 39 4 13 0 1 13 0 0
67 VERISEC.ok sendmail mime7to8-arr-one-char-heavy-test-ok 48 10 15 0.29 1 14 1 0
68 VERISEC.ok sendmail mime7to8-arr-one-char-med-test-ok 46 10 15 0.17 1 14 1 0
69 VERISEC.ok sendmail mime7to8-arr-one-char-no-test-ok 31 10 7 0.01 1 6 1 0
70 VERISEC.ok sendmail mime7to8-arr-three-chars-med-test-ok 94 10 41 1.1 1 38 3 0
71 VERISEC.ok sendmail mime7to8-ptr-one-char-heavy-test-ok 46 10 17 0.43 1 16 1 0
72 VERISEC.ok sendmail mime7to8-ptr-three-chars-med-test-ok 86 10 45 4.36 4 42 3 0
73 VERISEC.ok sendmail mime7to8-ptr-three-chars-no-test-ok 48 10 18 0.14 1 15 3 0
74 VERISEC.ok sendmail outer-ok 47 4 17 0.02 1 17 0 0
75 VERISEC.ok sendmail prescan-arr-med-test-ok 86 5 21 0.06 1 20 1 0
76 VERISEC.ok sendmail prescan-arr-min-test-ok 92 5 21 0.05 1 20 1 0
77 VERISEC.ok sendmail tTﬂag-arr-one-loop-ok 23 11 7 0.01 1 7 0 0
78 VERISEC.ok SpamAssassin loop-ok 45 7 33 1.94 3 33 0 0
79 VERISEC.wu-ftpd simple-ok 54 4 18 0 1 18 0 0
80 VERISEC.wu-ftpd strcpy-strcat-ok 64 5 32 0.01 1 32 0 0
- Total 4521 - 2114 128 211 2094 15 0
Table B.8: Results of applying ESBMC to the verification of the correct benchmarks from the VERISEC suite - Part III.
No. Module L B P Time(Solver) Time(Total) Properties(Passed) Properties(Violated) Properties(Fail)
1 VERISEC.apache full-bad 58 5 39 0.07 1 38 1 0
2 VERISEC.apache full-ptr-bad 57 5 34 0.08 1 33 1 0
3 VERISEC.apache simp2-bad 43 5 21 0.03 1 20 1 0
4 VERISEC.apache simp3-bad 55 5 37 0.07 1 36 1 0
5 VERISEC.apache strncmp-bad 41 5 22 0.03 1 21 1 0
6 VERISEC.bind expands-vars-bad 82 1 28 0.01 1 27 1 0
7 VERISEC.cases2 stripSpacesEnd-arr-inlined-bad 53 10 29 0.16 1 26 3 0
8 VERISEC.gxine simp-bad 31 5 10 0 1 9 1 0
9 VERISEC.libgd gd-no-entities-bad 120 3 28 0.12 1 25 3 0
10 VERISEC.libgd gd-simp-bad 98 3 26 0.07 1 23 3 0
11 VERISEC.MADWiFi no-sprintf-bad 52 3 18 0 1 13 5 0
12 VERISEC.NetBSD-libc anyMeta-int-bad 50 10 12 0.84 2 7 5 0
13 VERISEC.NetBSD-libc anyMeta-ptr-bad 52 10 11 0.74 1 6 5 0
14 VERISEC.NetBSD-libc bounds-bad 17 0 2 0 1 1 1 0
15 VERISEC.NetBSD-libc glob2-int-bad 91 12 28 19.98 27 16 12 0
16 VERISEC.NetBSD-libc glob2-ptr-bad 92 12 27 16.48 24 15 12 0
17 VERISEC.NetBSD-libc loop-bad 24 4 3 0.01 1 2 1 0
18 VERISEC.NetBSD-libc loop-int-bad 39 4 6 0.01 1 4 2 0
19 VERISEC.NetBSD-libc loop-ptr-bad 39 4 5 0.01 1 3 2 0
20 VERISEC.NetBSD-libc noAnyMeta-int-bad 43 10 10 2.53 4 8 2 0
21 VERISEC.NetBSD-libc noAnyMeta-ptr-bad 45 10 9 0.63 1 5 4 0
22 VERISEC.OpenSER cases1-stripFullBoth-arr-bad 56 10 35 1.29 2 34 1 0
23 VERISEC.OpenSER cases1-stripFullBoth-arr-inlined-bad 57 10 40 1.51 3 37 3 0
24 VERISEC.OpenSER cases1-stripFullEnd-arr-bad 50 10 27 0.12 1 26 1 0
25 VERISEC.OpenSER cases1-stripFullEnd-arr-inlined-bad 51 9 32 0.16 1 29 3 0
26 VERISEC.OpenSER cases1-stripFullStart-arr-bad 52 10 27 1.02 2 26 1 0
27 VERISEC.OpenSER cases1-stripFullStart-arr-inlined-bad 53 10 32 1.25 2 29 3 0
28 VERISEC.OpenSER cases1-stripNone-arr-bad 46 10 19 0.05 1 18 1 0
29 VERISEC.OpenSER cases1-stripNone-arr-inlined-bad 47 10 24 0.08 1 21 3 0
30 VERISEC.OpenSER cases1-stripSpacesBoth-arr-bad 52 10 25 0.87 2 24 1 0
Table B.9: Results of applying ESBMC to the verification of the bad benchmarks from the VERISEC suite - Part I.
No. Module L B P Time(Solver) Time(Total) Properties(Passed) Properties(Violated) Properties(Fail)
31 VERISEC.OpenSER cases1-stripSpacesBoth-arr-inlined-bad 53 10 30 1.13 2 27 3 0
32 VERISEC.OpenSER cases1-stripSpacesEnd-arr-bad 49 10 22 0.08 1 21 1 0
33 VERISEC.OpenSER cases1-stripSpacesEnd-arr-inlined-bad 50 10 27 0.12 1 24 3 0
34 VERISEC.OpenSER cases1-stripSpacesStart-arr-bad 49 10 22 0.76 2 21 1 0
35 VERISEC.OpenSER cases1-stripSpacesStart-arr-inlined-bad 50 10 27 0.91 2 24 3 0
36 VERISEC.OpenSER cases2-stripFullBoth-arr-bad 59 10 37 1.72 3 36 1 0
37 VERISEC.OpenSER cases2-stripFullBoth-arr-inlined-bad 60 10 42 1.55 2 39 3 0
38 VERISEC.OpenSER cases2-stripFullEnd-arr-bad 53 10 29 0.15 1 28 1 0
39 VERISEC.OpenSER cases2-stripFullEnd-arr-inlined-bad 54 10 34 0.22 1 31 3 0
40 VERISEC.OpenSER cases2-stripFullStart-arr-bad 55 10 29 1.42 2 28 1 0
41 VERISEC.OpenSER cases2-stripFullStart-arr-inlined-bad 56 10 34 1.43 2 31 3 0
42 VERISEC.OpenSER cases2-stripNone-arr-bad 49 10 21 0.05 1 20 1 0
43 VERISEC.OpenSER cases2-stripNone-arr-inlined-bad 50 10 26 0.09 1 23 3 0
44 VERISEC.OpenSER cases2-stripSpacesBoth-arr-bad 55 10 27 0.99 2 26 1 0
45 VERISEC.OpenSER cases2-stripSpacesBoth-arr-inlined-bad 56 10 32 1.11 2 29 3 0
46 VERISEC.OpenSER cases2-stripSpacesEnd-arr-bad 52 10 24 0.11 1 23 1 0
47 VERISEC.OpenSER cases2-stripSpacesEnd-arr-inlined-bad 54 10 29 0.16 1 26 3 0
48 VERISEC.OpenSER cases2-stripSpacesStart-arr-bad 52 10 24 0.79 1 23 1 0
49 VERISEC.OpenSER cases2-stripSpacesStart-arr-inlined-bad 53 10 29 1.02 2 26 3 0
50 VERISEC.OpenSER cases3-stripFullBoth-arr-bad 62 10 39 1.9 3 38 1 0
51 VERISEC.OpenSER cases3-stripFullBoth-arr-inlined-bad 63 10 44 2.51 4 41 3 0
52 VERISEC.OpenSER cases3-stripFullEnd-arr-bad 56 10 31 0.28 1 30 1 0
53 VERISEC.OpenSER cases3-stripFullEnd-arr-inlined-bad 57 10 36 0.41 1 33 3 0
54 VERISEC.OpenSER cases3-stripFullStart-arr-bad 58 10 31 1.84 3 30 1 0
55 VERISEC.OpenSER cases3-stripFullStart-arr-inlined-bad 59 10 36 1.74 2 33 3 0
56 VERISEC.OpenSER cases3-stripNone-arr-bad 52 10 23 0.14 1 22 1 0
57 VERISEC.OpenSER cases3-stripNone-arr-inlined-bad 53 10 28 0.21 1 25 3 0
58 VERISEC.OpenSER cases3-stripSpacesBoth-arr-bad 58 10 29 1.3 3 28 1 0
59 VERISEC.OpenSER cases3-stripSpacesBoth-arr-inlined-bad 59 10 34 1.5 3 31 3 0
60 VERISEC.OpenSER cases3-stripSpacesEnd-arr-bad 55 10 26 0.24 1 25 1 0
Table B.10: Results of applying ESBMC to the verification of the bad benchmarks from the VERISEC suite - Part II.
No. Module L B P Time(Solver) Time(Total) Properties(Passed) Properties(Violated) Properties(Fail)
61 VERISEC.OpenSER cases3-stripSpacesEnd-arr-inlined-bad 56 10 31 0.3 1 28 3 0
62 VERISEC.OpenSER cases3-stripSpacesStart-arr-bad 55 10 26 1.11 1 25 1 0
63 VERISEC.OpenSER cases3-stripSpacesStart-arr-inlined-bad 56 10 31 1.29 3 28 3 0
64 VERISEC.samba simp-bad 22 1 4 0 1 4 1 0
65 VERISEC.sendmail both-bad 44 5 15 0.03 1 13 2 0
66 VERISEC.sendmail close-angle-ptr-no-test-bad 45 3 8 0.01 1 7 1 0
67 VERISEC.sendmail inner-bad 37 4 9 0 1 8 1 0
68 VERISEC.sendmail mime7to8-arr-one-char-heavy-test-bad 48 10 15 0.22 1 12 3 0
69 VERISEC.sendmail mime7to8-arr-one-char-med-test-bad 46 10 15 0.18 1 10 5 0
70 VERISEC.sendmail mime7to8-arr-one-char-no-test-bad 29 10 7 0.02 1 4 3 0
71 VERISEC.sendmail mime7to8-arr-three-chars-med-test-bad 94 10 41 2.97 3 32 9 0
72 VERISEC.sendmail mime7to8-ptr-one-char-heavy-test-bad 46 10 14 0.45 1 11 3 0
73 VERISEC.sendmail mime7to8-ptr-three-chars-med-test-bad 86 10 36 4.14 5 25 11 0
74 VERISEC.sendmail mime7to8-ptr-three-chars-no-test-bad 42 10 15 0.05 1 8 7 0
75 VERISEC.sendmail outer-bad 43 4 15 0.02 1 15 1 0
76 VERISEC.sendmail prescan-arr-med-test-bad 86 5 21 0.06 1 20 1 0
77 VERISEC.sendmail prescan-arr-min-test-bad 83 5 21 0.05 1 20 1 0
78 VERISEC.sendmail tTﬂag-arr-one-loop-bad 23 11 9 0.22 1 6 3 0
79 VERISEC.sendmail util-bad 136 30 33 39.67 52 26 7 0
80 VERISEC.SpamAssassin loop-bad 45 7 33 2.58 3 32 1 0
81 VERISEC.wu-ftpd simple-bad 53 4 14 0 1 13 1 0
82 VERISEC.wu-ftpd small-invalid 44 4 14 0 1 10 4 0
83 VERISEC.wu-ftpd strcpy-strcat-bad 63 5 29 0.02 1 27 2 0
- Total 4569 - 2024 127 226 1808 216 0
Table B.11: Results of applying ESBMC to the verification of the bad benchmarks from the VERISEC suite - Part III.
B.6 WCET Suite
Table B.12 shows the results of applying ESBMC to the verification of the programs
from the WCET suite. Note that ESBMC finds four property violations in two programs
(duff_nondet and st), which we inspected manually.
No. Module L B P Time(Solver) Time(Total) Properties(Passed) Properties(Violated) Properties(Fail)
1 WCET.cnt 133 11 27 0.144 1 27 0 0
2 WCET.cover 238 121 196 0.067 1 196 0 0
3 WCET.duff_det 86 101 39 0.026 1 39 0 0
4 WCET.duff_nondet 86 101 38 0.052 1 37 1 0
5 WCET.expint 157 101 33 0.016 1 33 0 0
6 WCET.fdct 238 9 314 0.091 1 314 0 0
7 WCET.ns_det 531 6 22 0.012 2 22 0 0
8 WCET.ns_nondet 531 6 22 5.296 14 22 0 0
9 WCET.statemate 1273 3 6 0.25 1 6 0 0
10 WCET.st 157 1001 29 1.332 50 26 3 0
- Total 3430 - 726 7 73 722 4 0
Table B.12: Results of applying ESBMC to the veriﬁcation of the benchmarks from
the WCET suite.
Appendix C
Functions of the Pthread Library
ESBMC is able to model check multi-threaded programs that use some functions of the
POSIX Pthread Library [135]. This appendix thus describes the main functions of the
Pthread library that we support.
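As a hedged illustration, the sketch below exercises several of the functions covered in this appendix (thread creation, mutex locking and a condition variable) in one small C program. The names producer, consumer and run_demo are hypothetical; pthread_join() and pthread_mutex_destroy() are standard POSIX calls used here only to finish the program cleanly and are not among the functions described in this appendix, so whether ESBMC models them is not stated here. Return codes are ignored for brevity.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t lock;
static pthread_cond_t ready_cv;
static int ready = 0;    /* predicate protected by lock */
static int counter = 0;  /* shared data protected by lock */

/* Hypothetical producer: updates the shared data, then signals the waiter. */
static void *producer(void *arg) {
    (void)arg;
    pthread_mutex_lock(&lock);
    counter += 1;
    ready = 1;
    pthread_cond_signal(&ready_cv);
    pthread_mutex_unlock(&lock);
    pthread_exit(NULL);              /* terminates the calling thread */
}

/* Hypothetical consumer: blocks on the condition variable until ready is set. */
static void *consumer(void *arg) {
    (void)arg;
    pthread_mutex_lock(&lock);
    while (!ready)                   /* re-check the predicate after each wakeup */
        pthread_cond_wait(&ready_cv, &lock);
    assert(counter == 1);            /* the producer's update is visible here */
    counter += 1;
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Runs one producer and one consumer and returns the final counter value. */
int run_demo(void) {
    pthread_t p, c;
    ready = 0;
    counter = 0;
    pthread_mutex_init(&lock, NULL);
    pthread_cond_init(&ready_cv, NULL);
    pthread_create(&c, NULL, consumer, NULL);
    pthread_create(&p, NULL, producer, NULL);
    pthread_join(c, NULL);           /* not listed in this appendix; see lead-in */
    pthread_join(p, NULL);
    pthread_cond_destroy(&ready_cv);
    pthread_mutex_destroy(&lock);
    return counter;
}
```

Because both updates of counter happen while the mutex is held and the consumer only proceeds once ready is set, run_demo() returns 2 regardless of which thread is scheduled first; the assertion in consumer is the kind of user-specified property a model checker could be asked to verify over all interleavings.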
• pthread_create(): This function creates a new thread.
• pthread_exit(): This function terminates the calling thread.
• pthread_mutex_init(): This function initializes the mutex that is used for
synchronization among the threads.
• pthread_mutex_lock(): This function locks the mutex if it is unlocked; otherwise
it blocks the calling thread until the mutex is released and can then be locked
successfully.
• pthread_mutex_unlock(): This function unlocks the mutex that was locked
previously by the same thread.
• pthread_rwlock_init(): This function initializes the read-write lock object, which
allows concurrent read access to an object but requires exclusive access for write
operations.
• pthread_rwlock_trywrlock(), pthread_rwlock_wrlock(): These functions lock
a read-write lock object for writing.
• pthread_rwlock_unlock(): This function unlocks a read-write lock object.
• pthread_cond_init(): This function initializes the condition variable.
• pthread_cond_wait(): This function blocks the calling thread on a condition
variable; the blocked thread is awakened only if another thread calls signal or
broadcast.
• pthread_cond_signal(): If several threads are blocked on a condition
variable, then this function unblocks at least one of them (but there is no
guarantee of which one will be woken up, since this depends on the scheduling policy).
• pthread_cond_broadcast(): This function unblocks all threads currently blocked
on the specified condition variable.
• pthread_cond_destroy(): This function destroys the given condition variable,
i.e., the object becomes uninitialized.
References
[1] The economic impact of inadequate infrastructure for software testing. Technical
Planning Report 02-3, National Institute of Standards and Technology, 2002.
[2] MiBench Version 1.0. http://www.eecs.umich.edu/mibench/, 2009.
[3] Common vulnerabilities and exposures. In http://cve.mitre.org/, 2010.
[4] Advanced Linux Sound Architecture. http://www.alsa-project.org/, 2011.
[5] DirectFB. http://directfb.org/, 2011.
[6] Television with Linux. http://www.linuxtv.org/, 2011.
[7] Accellera. Property Speciﬁcation Language (Reference Manual). Available at
http://www.eda.org/vfv/docs/PSL-v1.1.pdf, 2004.
[8] Torben Amtoft and Anindya Banerjee. Veriﬁcation condition generation for con-
ditional information ﬂow. In FMSE, pages 2–11, 2007.
[9] Andrew W. Appel. Modern Compiler Implementation in C: Basic Techniques.
Cambridge University Press, New York, NY, USA, 1997.
[10] Alessandro Armando, Jacopo Mantovani, and Lorenzo Platania. Bounded model
checking of software using SMT solvers instead of SAT solvers. In SPIN, LNCS
3925, pages 146–162, 2006.
[11] Alessandro Armando, Jacopo Mantovani, and Lorenzo Platania. Bounded model
checking of software using SMT solvers instead of SAT solvers. Int. J. Softw. Tools
Technol. Transf., 11(1):69–83, 2009.
[12] Domagoj Babić. Exploiting Structure for Scalable Software Verification. PhD
thesis, University of British Columbia, Vancouver, Canada, 2008.
[13] Domagoj Babić and Alan J. Hu. Calysto: Scalable and Precise Extended Static
Checking. In ICSE, pages 211–220, 2008.
[14] Christel Baier and Joost-Pieter Katoen. Principles of Model Checking. The MIT
Press, 2008.
[15] Subhashini Balakrishnan and Soﬁene Tahar. On the formal veriﬁcation of embed-
ded software using multiway decision graphs. Technical Report TR-402, Concordia
University, Montreal, Canada, 1997.
[16] Thomas Ball and Sriram K. Rajamani. SLIC: A Speciﬁcation Language for In-
terface Checking (of C). Technical Report MSR-TR-2001-21, Microsoft Research,
2001.
[17] Michael Barnett, Robert DeLine, Manuel Fähndrich, Bart Jacobs, K. Rustan M.
Leino, Wolfram Schulte, and Herman Venter. The Spec# Programming
System: Challenges and Directions. In VSTTE, LNCS 4171, pages 144–152, 2005.
[18] Michael Barnett and K. Rustan M. Leino. To goto where no statement has gone
before. In VSTTE, LNCS 6217, pages 157–168, 2010.
[19] Clark Barrett, Leonardo de Moura, and Aaron Stump. SMT-COMP: Satisfiability
Modulo Theories Competition. In CAV, LNCS 3576, pages 20–23, 2005.
[20] Clark Barrett and Cesare Tinelli. CVC3. In CAV, LNCS 4590, pages 298–302,
2007.
[21] Dirk Beyer, Thomas A. Henzinger, Ranjit Jhala, and Rupak Majumdar. The
software model checker BLAST. STTT, 9(5-6):505–525, 2007.
[22] Armin Biere, Keijo Heljanko, Tommi Junttila, Timo Latvala, and Viktor Schup-
pan. Linear encodings of bounded LTL model checking. CoRR, abs/cs/0611029,
2006.
[23] Armin Biere. PicoSAT essentials. JSAT, 4(2-4):75–97, 2008.
[24] Armin Biere. Bounded model checking. In Handbook of Satisﬁability, pages 457–
481. 2009.
[25] Armin Biere, Alessandro Cimatti, Edmund M. Clarke, and Yunshan Zhu. Symbolic
model checking without BDDs. In TACAS, LNCS 1579, pages 193–207, 1999.
[26] Nikolaj Bjørner and Leonardo de Moura. Z3^10: Applications, enablers, challenges
and directions. In Sixth International Workshop on Constraints in Formal
Verification, 2009.
[27] Marco Bozzano, Roberto Bruttomesso, Alessandro Cimatti, Anders Franzén,
Ziyad Hanna, Zurab Khasidashvili, Amit Palti, and Roberto Sebastiani. Encoding
RTL constructs for MathSAT: a preliminary report. Electr. Notes Theor. Comput.
Sci., 144(2):3–14, 2006.
[28] Marco Bozzano, Roberto Bruttomesso, Alessandro Cimatti, Tommi A. Junttila,
Peter van Rossum, Stephan Schulz, and Roberto Sebastiani. An incremental and
layered procedure for the satisﬁability of linear arithmetic logic. In TACAS, LNCS
3440, pages 317–333, 2005.
[29] Aaron R. Bradley and Zohar Manna. The Calculus of Computation: Decision
Procedures with Applications to Veriﬁcation. Springer-Verlag New York, Inc., Se-
caucus, NJ, USA, 2007.
[30] Daniel Brand. Veriﬁcation of large synthesized designs. In ICCAD, pages 534–537,
1993.
[31] Robert Brummayer and Armin Biere. Boolector: An eﬃcient SMT solver for
bit-vectors and arrays. In TACAS, LNCS 5505, pages 174–177, 2009.
[32] Roberto Bruttomesso, Alessandro Cimatti, Anders Franzén, Alberto Griggio, and
Roberto Sebastiani. The MathSAT 4 SMT solver. In CAV, LNCS 5123, pages
299–303, 2008.
[33] Jerry R. Burch, Edmund M. Clarke, Kenneth L. McMillan, David L. Dill, and
L. J. Hwang. Symbolic model checking: 10^20 states and beyond. In LICS, pages
428–439, 1990.
[34] William R. Bush, Jonathan D. Pincus, and David J. Sielaﬀ. A static analyzer
for ﬁnding dynamic programming errors. Softw. Pract. Exper., 30:775–802, June
2000.
[35] Sagar Chaki, Edmund M. Clarke, Alex Groce, and Ofer Strichman. Predicate
abstraction with minimum predicates. In CHARME, LNCS 2860, pages 19–34,
2003.
[36] Marsha Chechik. Personal communication. 2011.
[37] Alonzo Church. A note on the Entscheidungsproblem. Journal of Symbolic Logic,
1:40–41, 1936.
[38] Alessandro Cimatti, Edmund M. Clarke, Fausto Giunchiglia, and Marco Roveri.
NuSMV: A new symbolic model veriﬁer. In CAV, LNCS 1633, pages 495–499,
1999.
[39] Alessandro Cimatti, Andrea Micheli, Iman Narasamdya, and Marco Roveri. Veri-
fying SystemC: a software model checking approach. In FMCAD, 2010.
[40] Edmund M. Clarke, Orna Grumberg, and Doron Peled. Model checking. MIT
Publishers, 2000.
[41] Edmund M. Clarke and Daniel Kroening. Hardware veriﬁcation using ANSI-C
programs as a reference. In ASP-DAC, pages 308–311, 2003.
[42] Edmund M. Clarke, Daniel Kroening, and Flavio Lerda. A tool for checking
ANSI-C programs. In TACAS, LNCS 2988, pages 168–176, 2004.
[43] Edmund M. Clarke, Daniel Kroening, Natasha Sharygina, and Karen Yorav. Pred-
icate abstraction of ANSI–C programs using SAT. Formal Methods in System
Design, 25:105–127, 2004.
[44] Edmund M. Clarke, Daniel Kroening, Natasha Sharygina, and Karen Yorav. SA-
TABS: SAT-based predicate abstraction for ANSI-C. In TACAS 2005, LNCS 3440,
pages 570–574, 2005.
[45] Edmund M. Clarke, Daniel Kroening, Ofer Strichman, and Joel Ouaknine. Com-
pleteness and complexity of bounded model checking. In VMCAI, LNCS 2937,
pages 85–96, 2004.
[46] Edmund M. Clarke. SAT-based counterexample guided abstraction reﬁnement in
model checking. In CADE, LNCS 2741, page 1, 2003.
[47] Edmund M. Clarke, Anubhav Gupta, Himanshu Jain, and Helmut Veith. Model
checking: Back and forth between hardware and software. In VSTTE, pages 251–
255, 2005.
[48] James A. Clause and Alessandro Orso. Leakpoint: pinpointing the causes of
memory leaks. In ICSE (1), pages 515–524, 2010.
[49] Byron Cook, Daniel Kroening, and Natasha Sharygina. Cogent: Accurate theorem
proving for program veriﬁcation. In CAV, LNCS 3576, pages 296–300, 2005.
[50] Lucas Cordeiro and Bernd Fischer. Bounded model checking of multi-threaded
software using SMT solvers. In Presentation-only paper at 8th International
Workshop on Satisfiability Modulo Theories (SMT) at FLoC, Edinburgh, Scotland, 2010.
[51] Lucas Cordeiro, Raimundo Barreto, Rafael Barcelos, Meuse Oliveira, Vicente Lu-
cena Jr., and Paulo Maciel. Txm: An agile hw/sw development methodology
for building medical devices. In ACM SIGSOFT Software Engineering Notes.,
32(6):32, 2007.
[52] Lucas Cordeiro, Bernd Fischer, and Joao Marques-Silva. Eﬃcient SMT-based
Bounded Model Checker (ESBMC). users.ecs.soton.ac.uk/lcc08r/esbmc, 2009.
[53] Lucas Cordeiro, Bernd Fischer, and Joao Marques-Silva. SMT-based bounded
model checking for embedded ANSI-C software. In ASE, pages 137–148, 2009.
[54] Lucas Cordeiro, Bernd Fischer, and João Marques-Silva. Continuous verification
of large embedded software using SMT-based bounded model checking. In ECBS,
pages 160–169, 2010.
[55] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein.
Introduction to Algorithms, Second Edition. The MIT Press, 2001.
[56] Leonardo de Moura and Nikolaj Bjørner. Proofs and refutations, and Z3. In LPAR
Workshops, 2008.
[57] Leonardo de Moura and Nikolaj Bjørner. Z3: An eﬃcient SMT solver. In TACAS,
LNCS 4963, pages 337–340, 2008.
[58] Leonardo de Moura and Nikolaj Bjørner. Satisﬁability modulo theories: An appe-
tizer. In SBMF, LNCS 5902, pages 23–36, 2009.
[59] Leonardo de Moura, Harald Rueß, and Maria Sorea. Lazy theorem proving for
bounded model checking over inﬁnite domains. In CADE, LNCS 2392, pages 438–
455, 2002.
[60] Eva Dejnozkova and Petr Dokladal. Asynchronous multi-core architecture for
level set methods. In International Conference on Acoustics, Speech, and Signal
Processing (ICASSP), pages 1–4, 2004.
[61] Jayant DeSouza, Bob Kuhn, Bronis R. de Supinski, Victor Samofalov, Sergey
Zheltov, and Stanislav Bratanov. Automated, scalable debugging of MPI programs
with Intel® Message Checker. In SE-HPCS, pages 78–82, 2005.
[62] David Detlefs, Greg Nelson, and James B. Saxe. Simplify: a theorem prover for
program checking. J. ACM, 52(3):365–473, 2005.
[63] Alastair Donaldson, Daniel Kroening, and Philipp Rümmer. Automatic analysis
of scratch-pad memory code for heterogeneous multicore processors. In TACAS,
LNCS 6015, pages 280–295, 2010.
[64] Vijay D’Silva, Daniel Kroening, and Georg Weissenbacher. A survey of automated
techniques for formal software veriﬁcation. IEEE Trans. on CAD of Integrated
Circuits and Systems, 27(7):1165–1178, 2008.
[65] Bruno Dutertre and Leonardo de Moura. The Yices SMT solver. Tool paper,
http://yices.csl.sri.com/documentation.shtml, 2009.
[66] Niklas Eén and Niklas Sörensson. Temporal induction by incremental SAT solving.
Electr. Notes Theor. Comput. Sci., 89(4), 2003.
[67] Cormac Flanagan and Stephen N. Freund. Type-based race detection for Java. In
PLDI, pages 219–232, 2000.
[68] Cormac Flanagan and Patrice Godefroid. Dynamic partial-order reduction for
model checking software. In POPL, pages 110–121, 2005.
[69] Cormac Flanagan, K. Rustan M. Leino, Mark Lillibridge, Greg Nelson, James B.
Saxe, and Raymie Stata. Extended static checking for Java. In PLDI, pages 234–
245, 2002.
[70] Martin Fowler. Continuous Integration. ThoughtWorks. http://martinfowler.com,
2006.
[71] Malay K. Ganai and Aarti Gupta. Accelerating high-level bounded model checking.
In ICCAD, pages 794–801, 2006.
[72] Malay K. Ganai and Aarti Gupta. Completeness in SMT-based BMC for software
programs. In DATE, pages 831–836, 2008.
[73] Malay K. Ganai and Aarti Gupta. Eﬃcient modeling of concurrent systems in
BMC. In SPIN, LNCS 5156, pages 114–133, 2008.
[74] Naghmeh Ghafari, Alan Hu, and Zvonimir Rakamaric. Context-bounded trans-
lations for concurrent software: An empirical evaluation. In SPIN, LNCS 6349,
pages 227–244, 2010.
[75] Patrice Godefroid. Partial-order Methods for the Veriﬁcation of Concurrent Sys-
tems: An Approach to the State-explosion Problem. University of Liege, PhD
thesis, 1995.
[76] Patrice Godefroid, Jonathan de Halleux, Aditya V. Nori, Sriram K. Rajamani,
Wolfram Schulte, Nikolai Tillmann, and Michael Y. Levin. Automating software
testing using program analysis. IEEE Software, 25(5):30–37, 2008.
[77] Patrice Godefroid, Nils Klarlund, and Koushik Sen. DART: directed automated
random testing. In PLDI, pages 213–223, 2005.
[78] Benny Godlin and Ofer Strichman. Regression veriﬁcation. In DAC, pages 466–
471, 2009.
[79] H. Goldstein. Checking the play in plug-and-play. Spectrum, IEEE, 39(6):50–55,
2002.
[80] David Gries and Gary Levin. Assignment and procedure call proof rules. ACM
Trans. Program. Lang. Syst., 2(4):564–579, 1980.
[81] Andreas Griesmayer, Stefan Staber, and Roderick Bloem. Fault localization using
a model checker. Softw. Test., Verif. Reliab., 20(2):149–173, 2010.
[82] Alex Groce, Klaus Havelund, and Margaret H. Smith. From scripts to speciﬁca-
tions: the evolution of a ﬂight software testing eﬀort. In ICSE (2), pages 129–138,
2010.
[83] Formal Methods Group. SymC. http://www-ti.informatik.uni-
tuebingen.de/ fmg/symc/, 2008.
[84] Orna Grumberg, Flavio Lerda, Ofer Strichman, and Michael Theobald. Proof-
guided underapproximation-widening for multi-process systems. In POPL, pages
122–131, 2005.
[85] Elsa L. Gunter and Doron Peled. Model checking, testing and veriﬁcation working
together. Formal Asp. Comput., 17(2):201–221, 2005.
[86] Sumit Gupta. High Level Synthesis Benchmarks Suite.
http://mesl.ucsd.edu/spark/benchmarks.shtml, 2009.
[87] Eclipse Helios. Eclipse IDE for C/C++ developers, 2010.
[88] Thomas A. Henzinger, Ranjit Jhala, Rupak Majumdar, and Dirk Beyer. BLAST:
Berkeley Lazy Abstraction Software Verification Tool.
http://mtc.epfl.ch/software-tools/blast/, 2009.
[89] Thomas A. Henzinger, Ranjit Jhala, Rupak Majumdar, and Shaz Qadeer. Thread-
modular abstraction reﬁnement. In CAV, LNCS 2725, pages 262–274, 2003.
[90] Gerard J. Holzmann. The model checker Spin. IEEE Trans. Software Eng.,
23(5):279–295, 1997.
[91] Gerard J. Holzmann. The Spin Model Checker - Primer and Reference Manual.
Addison-Wesley, 2003.
[92] Gerard J. Holzmann and Dragan Bosnacki. The design of a multicore extension
of the Spin model checker. IEEE Trans. Software Eng., 33(10):659–674, 2007.
[93] Gerard J. Holzmann, Rajeev Joshi, and Alex Groce. Tackling large veriﬁcation
problems with the Swarm tool. In SPIN, LNCS 5156, pages 134–143, 2008.
[94] Michael Huth and Mark Ryan. Logic in Computer Science: modelling and reasoning
about systems. Cambridge University Press, 2004.
[95] ISO. ISO/IEC 9899:1999: Programming languages C. International Organization
for Standardization, 1999.
[96] Franjo Ivancic, Ilya Shlyakhter, Aarti Gupta, and Malay K. Ganai. Model checking
C programs using F-SOFT. In ICCD, pages 297–308, 2005.
[97] Franjo Ivancic. Personal communication. 2011.
[98] Paul B. Jackson, Bill J. Ellis, and Kathleen Sharp. Using SMT solvers to verify
high-integrity programs. In AFM, pages 60–68, 2007.
[99] Paul B. Jackson and Grant Olney Passmore. Proving SPARK Veriﬁcation Condi-
tions with SMT solvers. Technical Report, University of Edinburgh, 2009.
[100] Ranjit Jhala and Rupak Majumdar. Software model checking. ACM Comput.
Surv., 41(4), 2009.
[101] Bengt Jonsson and Yih-Kuen Tsay. Assumption/guarantee speciﬁcations in
linear-time temporal logic. Theor. Comput. Sci., 167(1&2):47–72, 1996.
[102] Vineet Kahlon, Sriram Sankaranarayanan, and Aarti Gupta. Semantic reduction
of thread interleavings in concurrent programs. In TACAS, LNCS 5505, pages
124–138, 2009.
[103] Vineet Kahlon, Chao Wang, and Aarti Gupta. Monotonic partial order reduction:
An optimal symbolic partial order reduction technique. In CAV, LNCS 5643, pages
398–413, 2009.
[104] Hermann Kopetz. Real-Time Systems: Design Principles for Distributed Embedded
Applications. Kluwer Academic Publishers, 2002.
[105] Daniel Kroening. Personal communication. 2009.
[106] Daniel Kroening, Edmund Clarke, and Karen Yorav. Behavioral consistency of
C and Verilog programs using bounded model checking. In DAC, pages
368–371, 2003.
[107] Daniel Kroening, Edmund Clarke, and Karen Yorav. Behavioral consistency of C
and Verilog programs using bounded model checking. Technical Report
CMU-CS-03-126, 2003.
[108] Daniel Kroening and Sanjit A. Seshia. Formal veriﬁcation at higher levels of
abstraction. In ICCAD, pages 572–578, 2007.
[109] Daniel Kroening and Ofer Strichman. Decision Procedures: An Algorithmic Point
of View. Springer Publishing Company, Incorporated, 2008.
[110] Kelvin Ku, Thomas E. Hart, Marsha Chechik, and David Lie. A buﬀer overﬂow
benchmark for software model checkers. In ASE, pages 389–392, 2007.
[111] Shuvendu K. Lahiri, Shaz Qadeer, and Zvonimir Rakamaric. Static and precise
detection of concurrency errors in systems code using SMT solvers. In CAV, LNCS
5643, pages 509–524, 2009.
[112] Akash Lal and Thomas W. Reps. Reducing concurrent analysis under a context
bound to sequential analysis. Formal Methods in System Design, 35(1):73–97,
2009.
[113] Kim Guldstrand Larsen, Paul Pettersson, and Wang Yi. Uppaal in a nutshell.
STTT, 1(1-2):134–152, 1997.
[114] D. Lettnin, P. K. Nalla, J. Ruf, R. Weiss, A. Braun, J. Gerlach, T. Kropf, and
W. Rosenstiel. Semiformal veriﬁcation of temporal properties in embedded soft-
ware. GI/ITG/GMM Workshop, Methoden und Beschreibungssprachen zur Mod-
ellierung und Veriﬁkation von Schaltungen und Systemen, Erlangen, Germany,
2007.
[115] Djones Lettnin, Pradeep Kumar Nalla, Jörg Behrend, Jürgen Ruf, Joachim Gerlach,
Thomas Kropf, Wolfgang Rosenstiel, Volker Schönknecht, and Stephan Reitemeyer.
Semiformal veriﬁcation of temporal properties in automotive hardware
dependent software. In DATE, pages 1214–1217, 2009.
[116] Sung-Soo Lim. SNU Real-Time Benchmarks Suite.
http://archi.snu.ac.kr/realtime/benchmark/, 2009.
[117] Shan Lu, Soyeon Park, Eunsoo Seo, and Yuanyuan Zhou. Learning from mistakes:
a comprehensive study on real world concurrency bug characteristics. SIGARCH
Comput. Archit. News, 36(1):329–339, 2008.
[118] James R. Lyle and David W. Binkley. Program slicing in the presence of pointers.
In Third Annual Software Engineering Research Forum, pages 11–12, 1993.
[119] Rupak Majumdar and Koushik Sen. Hybrid concolic testing. In ICSE, pages
416–426, 2007.
[120] Nicolas Markey and Ph. Schnoebelen. Symbolic model checking for simply-timed
systems. In FORMATS/FTRTFT, pages 102–117, 2004.
[121] Takeshi Matsumoto, Hiroshi Saito, and Masahiro Fujita. Equivalence checking of
C programs by locally performing symbolic simulation on dependence graphs. In
ISQED, pages 370–375, 2006.
[122] Wolfgang Mayer and Markus Stumptner. Evaluating models for model-based de-
bugging. In ASE, pages 128–137, 2008.
[123] John McCarthy. Towards a mathematical science of computation. In IFIP,
pages 21–28, 1962.
[124] Kenneth L. McMillan. The Cadence SMV Model Checker.
http://www.kenmcmil.com/smv.html, 2010.
[125] Kenneth L. McMillan. Interpolation and SAT-based model checking. In CAV, LNCS
2725, pages 1–13, 2003.
[126] Kenneth L. McMillan. Applications of Craig interpolants in model checking. In
TACAS, LNCS 3440, pages 1–12, 2005.
[127] Kenneth L. McMillan. Lazy abstraction with interpolants. In CAV, LNCS 4144,
pages 123–136, 2006.
[128] Kenneth L. McMillan. Interpolants and symbolic model checking. In VMCAI,
LNCS 4349, pages 89–90, 2007.
[129] Kenneth L. McMillan and Nina Amla. Automatic abstraction without
counterexamples. In TACAS, LNCS 2619, pages 2–17, 2003.
[130] Elliott Mendelson. Introduction to Mathematical Logic. Chapman & Hall/CRC,
2009.
[131] José Vander Meulen and Charles Pecheur. Combining partial order reduction with
bounded model checking. In Communicating Process Architectures (CPA), pages
29–48, 2009.
[132] Jeremy Morse. Kerberos Git. https://www.studentrobotics.org/trac/wiki/Kerberos/Git,
2011.
[133] Mohammad Reza Mousavi and Michel Reniers. A congruence rule format with
universal quantiﬁcation. Electron. Notes Theor. Comput. Sci., 192(1):109–124,
2007.
[134] Steven S. Muchnick. Advanced compiler design and implementation. Morgan
Kaufmann Publishers Inc., 1997.
[135] Frank Mueller. A library implementation of POSIX threads under UNIX. In USENIX,
pages 29–41, 1993.
[136] Madanlal Musuvathi and Shaz Qadeer. Iterative context bounding for systematic
testing of multithreaded programs. In PLDI, pages 446–455, 2007.
[137] Madanlal Musuvathi and Shaz Qadeer. Fair stateless model checking. In PLDI,
pages 362–371, 2008.
[138] Madanlal Musuvathi, Shaz Qadeer, Thomas Ball, Gérard Basler,
Piramanayagam Arumuga Nainar, and Iulian Neamtiu. Finding and reproducing
heisenbugs in concurrent programs. In OSDI, pages 267–280, 2008.
[139] Mayur Naik and Alex Aiken. Conditional must not aliasing for static race detec-
tion. In POPL, pages 327–338, 2007.
[140] George C. Necula, Scott McPeak, Shree Prakash Rahul, and Westley Weimer. CIL:
Intermediate language and tools for analysis and transformation of C programs.
In CC, LNCS 2304, pages 213–228, 2002.
[141] NXP. High deﬁnition IP and hybrid DTV set-top box STB225.
http://www.nxp.com/, 2009.
[142] Tom Ostrand. Siemens Corporate Research. http://sir.unl.edu/portal/, 2010.
[143] Peter Pacheco. An Introduction to Parallel Programming. Morgan Kaufmann
Publishers, 2011.
[144] Sangmin Park, Richard W. Vuduc, and Mary Jean Harrold. Falcon: fault
localization in concurrent programs. In ICSE (1), pages 245–254, 2010.
[145] Jacques Patarin and Louis Goubin. Trapdoor one-way permutations and
multivariate polynomials. In ICICS, LNCS 1334, pages 356–368. Springer, 1997.
[146] Doron Peled. All from one, one for all: on model checking using representatives.
In CAV, LNCS 697, pages 409–423, 1993.
[147] Doron Peled. Model checking and testing combined. In ICALP, LNCS 2719, pages
47–63, 2003.
[148] Lorenzo Platania. Eureka Benchmark Suite. http://www.ai-
lab.it/eureka/bmc.html, 2009.
[149] Mukul R. Prasad, Armin Biere, and Aarti Gupta. A survey of recent advances in
SAT-based formal veriﬁcation. STTT, 7(2):156–173, 2005.
[150] Shaz Qadeer and Jakob Rehof. Context-bounded model checking of concurrent
software. In TACAS, LNCS 3440, pages 93–107, 2005.
[151] Shaz Qadeer and Dinghao Wu. KISS: keep it simple and sequential. In PLDI,
pages 14–24, 2004.
[152] Ishai Rabinovitz and Orna Grumberg. Bounded model checking of concurrent
programs. In CAV, LNCS 3576, pages 82–97, 2005.
[153] Muralikrishna Ramanathan. ﬂex. http://sir.unl.edu/portal/, 2010.
[154] John A. Robinson. A machine-oriented logic based on the resolution principle. J.
ACM, 12:23–41, January 1965.
[155] Michiel Ronsse and Koen De Bosschere. Recplay: a fully integrated practical
record/replay system. ACM Trans. Comput. Syst., 17(2):133–152, 1999.
[156] Neha Rungta and Eric G. Mercer. Clash of the titans: tools and techniques for
hunting bugs in concurrent programs. In PADTAD, pages 1–10, 2009.
[157] Sriram Sankaranarayanan. NECLA Static Analysis Benchmarks. http://www.nec-
labs.com/research/system/, 2009.
[158] Stefan Savage, Michael Burrows, Greg Nelson, Patrick Sobalvarro, and Thomas
Anderson. Eraser: a dynamic data race detector for multithreaded programs.
ACM Trans. Comput. Syst., 15(4):391–411, 1997.
[159] Jeﬀ Scott, Lea Hwang Lee, Ann Chin, John Arends, and Bill Moyer. Designing
the low-power m*core architecture. In ICCD, pages 94–101, 1999.
[160] Koushik Sen. Concolic testing. In ASE, pages 571–572. ACM, 2007.
[161] Koushik Sen. Race directed random testing of concurrent programs. SIGPLAN
Not., 43(6):11–21, 2008.
[162] Mary Sheeran, Satnam Singh, and Gunnar Stålmarck. Checking safety properties
using induction and a SAT-solver. In FMCAD, LNCS 1954, pages 108–125, 2000.
[163] Joao P. Marques Silva and Karem A. Sakallah. GRASP - a new search algorithm
for satisﬁability. In ICCAD, pages 220–227, 1996.
[164] SMT-LIB. The Satisﬁability Modulo Theories Library.
http://combination.cs.uiowa.edu/smtlib, 2009.
[165] Fabio Somenzi and Roderick Bloem. Eﬃcient Büchi automata from LTL formulae.
In CAV, LNCS 1855, pages 247–263, 2000.
[166] Ian Sommerville. Software Engineering. Pearson Education Limited, 2007.
[167] Ofer Strichman. Regression veriﬁcation: Proving the equivalence of similar pro-
grams. In CAV, LNCS 5643, page 63, 2009.
[168] Aaron Stump and Morgan Deters. Satisﬁability Modulo Theories Competition.
http://www.smtcomp.org/, 2010.
[169] Andrew S. Tanenbaum. Computer networks: 4th edition. Prentice-Hall, Inc.,
Upper Saddle River, NJ, USA, 2002.
[170] Olivier Thiry and Luc J. Claesen. A formal veriﬁcation technique for embedded
software. In ICCD, pages 352–357, 1996.
[171] Salvatore La Torre, P. Madhusudan, and Gennaro Parlato. Reducing context-
bounded concurrent reachability to sequential reachability. In CAV, LNCS 5643,
pages 477–492, 2009.
[172] G. S. Tseitin. On the complexity of derivation in propositional calculus. In
J. Siekmann and G. Wrightson, editors, Automation of Reasoning 2: Classical
Papers on Computational Logic 1967-1970.
[173] Moshe Y. Vardi. An automata-theoretic approach to linear temporal logic. In
Logics for Concurrency: Structure versus Automata, LNCS 1043, pages 238–266,
1996.
[174] Alberto L. Sangiovanni-Vincentelli, Luca P. Carloni, Fernando De Bernardinis,
and Marco Sgroi. Beneﬁts and challenges for platform-based design. In DAC, pages
409–414, 2004.
[175] Nguyen Le Vinh. The Flasher Manager Application.
http://users.polytech.unice.fr/ rueher/Benchs/FM/, 2010.
[176] Willem Visser, Klaus Havelund, Guillaume P. Brat, Seungjoon Park, and Flavio
Lerda. Model checking programs. Autom. Softw. Eng., 10(2):203–232, 2003.
[177] Chao Wang, Rhishikesh Limaye, Malay K. Ganai, and Aarti Gupta. Trace-based
symbolic analysis for atomicity violations. In TACAS, LNCS 6015, pages 328–342,
2010.
[178] Chao Wang, Zijiang Yang, Vineet Kahlon, and Aarti Gupta. Peephole partial
order reduction. In TACAS, LNCS 4963, pages 382–396, 2008.
[179] Christoph M. Wintersteiger. Compiling GOTO-Programs.
http://www.cprover.org/goto-cc/, 2009.
[180] Yichen Xie and Alex Aiken. Scalable error detection using Boolean satisﬁability.
SIGPLAN Not., pages 351–363, 2005.
[181] Liang Xu. SMT-based bounded model checking for real-time systems. In QSIC,
pages 120–125, 2008.
[182] Yu Yang. Inspect: A Framework for Dynamic Veriﬁcation of Multithreaded C
Programs. http://www.cs.utah.edu/ yuyang/inspect/, 2010.
[183] Yu Yang, Xiaofang Chen, Ganesh Gopalakrishnan, and Robert Kirby. Runtime
model checking of multithreaded C/C++ programs. Technical Report UUCS-07-008,
2007.
[184] Aleks Zaks, Ilya Shlyakhter, Franjo Ivancic, Srihari Cadambi, Zijiang Yang, Malay
Ganai, Aarti Gupta, and Pranav Ashar. Using range analysis for software veri-
ﬁcation. In 4th International Workshop on Software Veriﬁcation and Validation,
2006.
[185] C. Păsăreanu, P. Mehlitz, D. Bushnell, and G. Burlet. Combining unit-level symbolic
execution and system-level concrete execution for testing NASA software. In ISSTA,
pages 15–26, 2008.