Quality and Quantity in Robustness-Checking Using Formal Techniques by Frehse, Stefan
Quality and Quantity in
Robustness Checking
Using Formal Techniques
Stefan Frehse
Computer Architecture Group (Arbeitsgruppe Rechnerarchitektur)
Department 3 - Mathematics and Computer Science
Universität Bremen
DISSERTATION
submitted for the degree of Doctor of Engineering
- Dr.-Ing. -
Colloquium on 21 August 2013
First reviewer: Prof. Dr. Rolf Drechsler
Second reviewer: Prof. Dr. Alberto Garcia-Ortiz

For Imke
and our truly awesome herd

Acknowledgements
With this dissertation I conclude my doctorate in computer science. I would
like to thank a number of people for their support while it was being
written.
For the intensive supervision of my doctorate and for reviewing my
dissertation, I thank the head of the Computer Architecture Group, Rolf
Drechsler. As early as the third semester of my computer science studies he
got me excited about current research questions and involved me closely in
the group. I would like to express my thanks to Görschwin Fey for his
intensive support during the writing of scientific publications and in
working out my dissertation topic.
The entire Computer Architecture Group offers an outstandingly positive
working atmosphere, for which I am very grateful. The exciting and varied
discussions with Finn Haedicke, Mathias Soeken, and André Sülflow inspired
me greatly.
A heartfelt thank-you goes to Jean Christoph Jung and Alexander Finder for
proofreading the thesis.
A big thank-you goes to my parents, who support me intensively in many
respects. Last but not least: I thank Imke Niemann for her outstanding
motivation: May the Force be with you!
Bremen, May 2013, Stefan Frehse

Contents
1 Introduction
2 Preliminaries
2.1 Directed Acyclic Graph
2.2 Boolean Logic
2.2.1 Boolean Functions
2.2.2 Binary Decision Diagrams
2.2.3 Conjunctive Normal Form
2.2.4 Boolean Satisfiability – The SAT Problem
2.3 Circuits and Automata
2.3.1 Digital Circuits
2.3.2 And-Inverter-Graphs – AIGs
2.3.3 Finite State Machine
2.4 Functional Verification
2.4.1 Bounded Model Checking (BMC)
2.4.2 Interpolation-Based Model Checking
2.4.3 Boolean Reasoning of Digital Circuits
2.5 Automatic Test Pattern Generation
2.6 Fault Tolerance Circuits
2.6.1 Checker Circuitry
2.6.2 Triple Modular Redundancy - TMR
3 Fault Model
3.1 Transient Faults
3.1.1 Transient Faults
3.1.2 Component Model and Multiple Transient Faults
3.2 Problem Formulation of the Thesis
3.3 Classes
3.3.1 Best Case Complexity
3.4 Completeness
3.5 Observation Window
3.6 Summary
4 Robustness Measures
4.1 Worst Case Robustness Measure
4.2 Probabilistic Analysis
4.2.1 Excitation and Propagation Probabilities
4.2.2 Relation of WC-RM and P-RM
4.3 Class Models
4.4 Summary
5 Computational Model
5.1 Modelling CTFs in Circuits
5.2 Models for Classifications
5.2.1 Model for Classifying k-non-robust Components
5.2.2 Model for Classifying k-dangerous Components
5.3 Basic Algorithm
5.4 Handling Reachability Information
5.4.1 Influences of Approximations
5.5 Classification by Means of Model Checking
5.5.1 Problem Formulation
5.5.2 Discussion
5.6 Low-Level Optimization Techniques
5.6.1 Shortest Path Analysis
6 RobuCheck - An Integrated Robustness Checker
6.1 BMC-classifier
6.1.1 Problem Formulation
6.1.2 Algorithm
6.1.3 Completeness
6.1.4 Embedding Reachability Information
6.1.5 Incremental Satisfiability
6.2 ATPG-classifier
6.2.1 Problem Formulation
6.2.2 Using ATPG to compute EPP
6.2.3 Comparison to Blackbox Model Checker
6.3 ITP-classifier
6.3.1 Adaption of Interpolation-based Model Checking
6.3.2 Adequate Over-Approximation
6.3.3 Model Checking with Adequate Approximations
6.3.4 Classification with Adequate Approximations
6.3.5 Proving Unbounded Dangerous Components
6.3.6 Complete Algorithm of the ITP-classifier
6.4 COMP-classifier
6.4.1 General Approach
6.4.2 Local Classification
6.4.3 Composite Classification
6.4.4 Flow of Validation
6.4.5 Realization of the Validation
6.4.6 Influence of Choosing Subcircuits
6.4.7 Comparison of Accuracy
6.5 SIM-classifier
6.5.1 Algorithm
6.5.2 Integration in the Classifiers
6.6 Comparison of the Classifiers
6.6.1 BMC, ATPG, and ITP-classifier
6.7 RobuCheck
6.7.1 System Overview
6.7.2 Technical Details
6.7.3 Simulation Heuristics
6.8 Summary
7 Experiments
7.1 Interpolation: Model-based vs. Proof-based
7.1.1 Future Work
7.2 Simple Model Checker - SimpMC
7.3 Robustness Checking
7.3.1 Benchmarks
7.3.2 Formal classifiers
7.3.3 SIM-classifier
7.3.4 Concurrent Classification
7.3.5 Probabilistic Analysis
7.3.6 COMP-classifier
7.3.7 Robustness Checking by Means of Model Checking
7.3.8 IBM Benchmarks
8 Conclusion and Future Work

Chapter 1
Introduction
In our daily life we directly and indirectly use numerous computer
systems, and this number is likely to grow. Safety-critical computer
systems assembled from several digital systems are integrated, for example,
in cars and airplanes or are used in server systems. In recent years,
for instance, drive-by-wire has emerged in cars to process the driver's
commands electronically rather than through a mechanical control system.
One advantage of such systems is that the driver's input can be checked as
to whether it keeps the car on track and can be corrected if necessary.
However, the dedicated computers behind those control systems are
special-purpose processors and are very complex, while the application
demands very fast and correct processing.
The complexity of those systems has increased significantly over the last
years since more and more features are implemented. A modern computer system
contains several communicating digital circuits, or chips. These chips
usually consist of millions of transistors. For example, the IBM Power7
Central Processing Unit (CPU) introduced in 2011 consists of 1.2 billion
transistors [SKS+11], while one of the first microprocessors, the Intel
4004 released in 1971, consists of only 2,300 transistors. This number, also
referred to as the transistor count, doubles roughly every 18 months
according to a common formulation of Moore's Law, which amounts to
exponential growth.
The continuously increasing integration density inherently comes with
shrinking feature sizes since the size of the chips is kept approximately
constant. Thus, a single transistor becomes smaller and smaller. Besides
the integration of more transistors, the frequency is scaled up while the
voltage is scaled down, leading to a lower power consumption. That means
the chips run faster while consuming less energy.
However, one drawback of these strong improvements is that the circuits
are less reliable [Bau05, Bor07, BBL+12, SKK+02]. More precisely, the
circuits are more sensitive to radiation. A transient fault occurs when
enough energy affects the transistor's internal state. Thus, a transient
fault may manipulate the logical value of an internal signal, and the
circuit is then affected by a soft error. The logical value is flipped,
i.e., inverted from 0 to 1 or vice versa, which is also called a bit flip.
As one consequence the circuit may not operate as specified. Unlike a
physical manufacturing defect, a soft error is caused by external events
that do not damage the chip permanently [Bau05].
The Soft Error Rate (SER) is measured in Failures In Time (FIT), where one
FIT is defined as one failure in 10^9 hours of operation caused by
transient faults. A large body of literature shows that the SER increases
with continuously increasing integration density, e.g., [KK07]. That means
the higher the SER, the higher the probability that a circuit causes a
failure within a given period of time.
In the past, transient faults caused significant breakdowns [Bau02].
Misbehavior can be life-threatening, especially in safety-critical
systems. For example, the Therac-25 machine used for radiation therapy
administered critical overdoses to six patients [LT93].
Handling transient faults appropriately has become one of the major
challenges for future technology scaling [Bor07, BBL+12]. An important
factor in improving a circuit's reliability is to tolerate transient faults
by detecting and correcting the misbehavior. The reliability of a digital
circuit depends on several factors, and fault tolerance is one of the most
crucial ones. Reliability is usually measured in Mean Time To Failure
(MTTF) and similar measures. MTTF specifies the average time a system
operates until it fails [KK07].
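The relation between a failure rate given in FIT and the MTTF can be made concrete with a small calculation. The following Python sketch is only an illustration and not part of the thesis: it assumes a constant failure rate (so that the MTTF is the reciprocal of the rate), and the function names and the FIT value of 1000 are hypothetical.

```python
HOURS_PER_FIT_UNIT = 10**9  # one FIT = one failure per 10^9 hours


def fit_to_mttf_hours(fit: float) -> float:
    """Mean time to failure in hours, assuming a constant failure rate."""
    if fit <= 0:
        raise ValueError("FIT must be positive")
    return HOURS_PER_FIT_UNIT / fit


def expected_failures(fit: float, devices: int, hours: float) -> float:
    """Expected number of failures in a population of identical devices."""
    return fit * devices * hours / HOURS_PER_FIT_UNIT


if __name__ == "__main__":
    fit = 1000.0  # hypothetical soft error rate of a single chip
    print("MTTF in hours:", fit_to_mttf_hours(fit))
    print("Expected failures per year in 10,000 chips:",
          expected_failures(fit, 10_000, 24 * 365))
```

Under these assumptions a single chip rarely fails, while a large population of chips is expected to show failures regularly, which is why the SER matters for widely deployed systems.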
A digital circuit can be divided into combinational logic and sequential
logic. In earlier technology generations, sequential logic was more
sensitive to transient faults and was therefore predominantly analyzed with
respect to transient faults, rather than combinational logic. However, the
probability that transient faults also affect combinational logic and
finally cause a failure increases with future technology generations. One
reason is that scaling up the frequency shortens the clock period, so the
state elements store values more often in the same period of time. This
increases the probability that a faulty value is finally stored in the
state elements. Consequently, soft errors need to be considered more
broadly, in combinational logic as well as in sequential logic, which is
addressed in this thesis.
In safety-critical systems the correct function must be ensured in every
case. For example, an airbag must fire in case of an accident and must not
fire during normal operation, even under the influence of external effects.
Various standards have been published to formalize the requirements of
modern systems. The automotive standard ISO 26262, for example, requires
that a certain level of reliability is assured before a system can be
certified.
To overcome these effects, various hardening techniques are available to
catch and handle these faults at the hardware level. During the design
process, the engineer usually adds redundancy to the hardware. Two
important categories of such design-level techniques are:
• Error Correction Codes (ECC): The classical Hamming code [Ham50]
corrects a single-bit error; extended by an additional parity bit, it
also detects double-bit errors. A further ECC is the widely used
Reed-Solomon (RS) code [Ber68], which is able to detect and correct
errors as well. ECC is often implemented on top of storage elements.
For example, NVIDIA's Fermi GPU implements ECC on storage elements
[NVI12]. Parts of the Power7 CPU are also ECC protected [SKS+11].
• Redundancy: Additional hardware is necessary in every case to
tolerate faults [AK84]. A prominent example is Triple Modular
Redundancy (TMR) [vN56]. The basic idea is to triplicate the
functional unit and to add a majority voter that masks transient
faults during operation.
TMR is widely used in space applications since the SER increases
with increasing altitude [GOSM08, Nic11]. The work of [Yeh96]
describes the TMR-based implementation of the Boeing 777 primary
flight computer.
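The TMR idea of triplicating a unit and voting on the results can be sketched in a few lines of Python. This is an illustrative model under stated assumptions, not an implementation from the thesis: the functional unit is a hypothetical 1-bit XOR, and a transient fault is modeled as a bit flip at the output of one copy or at the output of the voter.

```python
from itertools import product


def unit(a: int, b: int) -> int:
    """Hypothetical functional unit: a 1-bit XOR."""
    return a ^ b


def majority(x: int, y: int, z: int) -> int:
    """Majority voter over three 1-bit values."""
    return (x & y) | (x & z) | (y & z)


def tmr_output(a, b, flipped_copy=None, flip_voter=False):
    """Output of the TMR design with an optional injected bit flip."""
    copies = [unit(a, b) for _ in range(3)]
    if flipped_copy is not None:
        copies[flipped_copy] ^= 1  # transient fault in one of the copies
    out = majority(*copies)
    return out ^ 1 if flip_voter else out


def tolerates_single_copy_flip() -> bool:
    """Check all inputs and all single flips in one of the three copies."""
    return all(tmr_output(a, b, flipped_copy=k) == unit(a, b)
               for a, b in product((0, 1), repeat=2)
               for k in range(3))


def tolerates_voter_flip() -> bool:
    """Check all inputs with a flip at the voter output."""
    return all(tmr_output(a, b, flip_voter=True) == unit(a, b)
               for a, b in product((0, 1), repeat=2))
```

In this model a single flip in any copy is masked by the voter, whereas a flip at the voter output is not. This illustrates why the hardened implementation itself still has to be checked.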
For various generally applicable techniques, tool support is already
available, i.e., tools that implement these techniques automatically. For
example, TMRTool implements fully automatic TMR on an FPGA [Xil13].
These widely applied techniques catch transient faults to ensure that the
circuit works as specified even in the presence of transient faults.
However, the implementation itself might be faulty. Hence, the
implementation needs to be verified. This process is called robustness
checking and is the focus of this thesis. Robustness checking analyzes
whether the techniques implemented in a circuit tolerate transient faults
appropriately.
In contrast, Model Checking (MC) usually veriﬁes a circuit with respect
to a speciﬁcation without external eﬀects [CGP01]. While MC veriﬁes the
unaﬀected circuit, robustness checking veriﬁes the circuit even in the case
of external eﬀects, i.e., in the presence of transient faults.
Figure 1.1 shows the enhanced design flow embedding robustness checking.
The design is passed to a model checker verifying whether the circuit
implements the specification. Once the circuit is successfully verified,
robustness checking is performed. The outcome of robustness checking is a
set of components that are vulnerable to transient faults. A component is
vulnerable if a transient fault at this component modifies the function of
the circuit. Moreover, a measure of the quality of the circuit in the
presence of transient faults is provided in terms of a robustness measure.
Figure 1.1: Robustness checking embedded into the design flow
[Diagram: Designer, Design, Model Checking, Robustness Checking, vulnerable components]
If a desired level of robustness is assured the design is passed to further
steps in the design ﬂow. However, if a higher level of robustness is required
the data about vulnerable components is provided to the designer who may
implement additional techniques to improve the robustness. Afterwards the
design is again passed to the model checker, i.e., the loop restarts.
This thesis focuses on analyzing a digital circuit with respect to transient
faults, i.e., techniques to perform robustness checking are proposed. The
analysis is performed at the logic level of a circuit, where the circuit is
given in a Hardware Description Language (HDL). Basically, robustness
checking delivers two sets of components. The first set contains the
components that are not vulnerable to transient faults, and the other set
contains the components that are vulnerable to transient faults.
In order to compute these sets, the theoretical groundwork needs to be
properly defined. Two models are precisely introduced: 1) a fault model
that describes the behavior and impact of transient faults, and 2) a
computational model that calculates the impact of transient faults for each
component. Faults are logically injected, i.e., for each component that
needs to be analyzed a dedicated fault injection logic is introduced into
the circuit. Various algorithms are introduced that inject faults and
analyze which behavior the respective faults cause.
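The idea of logical fault injection and the resulting two sets of components can be illustrated on a toy example. The following Python sketch is a simplification under stated assumptions and not the computational model of the thesis: the netlist is hypothetical, the circuit is purely combinational, and a transient fault is modeled as an inversion of one component's output. The thesis addresses sequential circuits with formal techniques, whereas this sketch simply enumerates all inputs.

```python
from itertools import product

# Hypothetical netlist in topological order: name -> (operation, operands).
# By absorption, g4 = g1 OR (g1 AND g2) always equals g1.
CIRCUIT = {
    "g1": ("and", ("a", "b")),
    "g2": ("or", ("a", "b")),
    "g3": ("and", ("g1", "g2")),
    "g4": ("or", ("g1", "g3")),
}
INPUTS = ("a", "b")
OUTPUT = "g4"
OPS = {"and": lambda x, y: x & y, "or": lambda x, y: x | y}


def simulate(values, faulty=None):
    """Evaluate the circuit; invert the output of component `faulty`."""
    signals = dict(values)
    for name, (op, operands) in CIRCUIT.items():
        value = OPS[op](*(signals[o] for o in operands))
        if name == faulty:
            value ^= 1  # injected bit flip
        signals[name] = value
    return signals[OUTPUT]


def classify():
    """Split components into vulnerable and not vulnerable ones."""
    vulnerable, not_vulnerable = set(), set()
    for component in CIRCUIT:
        differs = False
        for bits in product((0, 1), repeat=len(INPUTS)):
            values = dict(zip(INPUTS, bits))
            if simulate(values) != simulate(values, faulty=component):
                differs = True
                break
        (vulnerable if differs else not_vulnerable).add(component)
    return vulnerable, not_vulnerable
```

In this toy circuit a flip at g2 never reaches the output, so g2 ends up in the set of components that are not vulnerable, while the remaining components are vulnerable.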
The algorithms proposed in the thesis are divided into formal-based
methods and simulation-based approaches. Formal methods completely
analyze the entire search space and are therefore able to formally prove the
absence of faulty behavior. In contrast, simulation covers only a limited
portion of the search space and is usually not able to prove those properties
since corner cases are missed. That means, simulation roughly approximates
the solution but can handle larger circuits.
Bounded Model Checking (BMC) is a functional verification technique that
is able to disprove or prove a temporal property with respect to a circuit
[BCCZ99]. A bounded time interval is analyzed by formulating a decision
problem that is translated into a series of satisfiability problems solved
by a Boolean satisfiability solver. BMC is proven to be PSPACE-complete
[SC85] for the general class of temporal properties. In practice, BMC is
very effective in finding bugs in a circuit by returning a trace that shows
the particular misbehavior. Proving that a property holds on the circuit,
however, typically requires too much computational power when using BMC.
Further improvements have been developed, e.g., Interpolation-based Model
Checking (IMC) [McM03], which makes BMC complete in practice. IMC abstracts
from irrelevant facts while proving a property, which was shown to be very
effective even on industrial benchmarks.
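The principle of analyzing a bounded time interval and returning a trace for a violated property can be sketched as follows. This Python example is only an illustration: the circuit (a 2-bit counter with an enable input) and the property are hypothetical, and the sketch enumerates input sequences explicitly, whereas BMC as described above encodes the same question as satisfiability problems for a SAT solver.

```python
from itertools import product


def step(state: int, enable: int) -> int:
    """Hypothetical circuit: 2-bit counter that increments when enabled."""
    return (state + enable) % 4


def prop(state: int) -> bool:
    """Safety property to check: the counter never reaches the value 3."""
    return state != 3


def bounded_check(init: int, bound: int):
    """Return an input trace of length <= bound violating prop, or None."""
    for length in range(bound + 1):
        for inputs in product((0, 1), repeat=length):
            state = init
            for enable in inputs:
                state = step(state, enable)
            if not prop(state):
                return list(inputs)  # counterexample trace
    return None  # no violation within the bound (not a proof in general)
```

With the initial state 0, no violation exists within a bound of 2, while a bound of 3 yields a trace that enables the counter three times. A result of `None` only states that the property holds up to the given bound.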
One might argue that robustness checking can simply be translated into a
model checking problem and solved by state-of-the-art model checkers. The
modeling effort would indeed be very low, but this approach performs very
poorly. Experiments showed that this translation into a model checking
problem is outperformed by all formal approaches proposed in this thesis
by a large factor, even for the smallest circuit considered. That means
exploiting problem domain knowledge leads to much better performance and
ultimately to better quality of the result.
In contrast to BMC, Automatic Test Pattern Generation (ATPG) computes a
set of test patterns that are applied after production to filter out
defective chips [DEFT09]. Test patterns are usually generated according to
the Stuck-At Fault Model (SAFM). Basically, a test pattern is generated
separately for each fault. In [Lar92] the ATPG problem was translated into
a Boolean satisfiability problem; this approach, called SAT-based ATPG,
was further improved in [DEFT09]. In industrial test flows SAT-based ATPG
is very effective. ATPG is proven to be NP-complete [IS75] and is therefore
theoretically hard to solve, but typical problem instances are solved very
efficiently. ATPG is usually restricted to combinational circuits since
scan chains are integrated into the circuits to justify arbitrary values
on the state elements. As a side effect the complexity is reduced.
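Generating a test pattern for a single stuck-at fault means finding an input for which the fault-free and the faulty circuit produce different outputs. The following Python sketch illustrates this on a hypothetical two-gate netlist by enumerating all inputs. It is a simplified stand-in for the SAT-based formulation mentioned above, and all names are assumptions made for the example.

```python
from itertools import product

# Hypothetical combinational netlist in topological order:
# (signal name, operation, first operand, second operand)
NETLIST = [("n1", "and", "a", "b"), ("n2", "or", "n1", "c")]
INPUTS = ("a", "b", "c")
OUTPUT = "n2"


def evaluate(assignment, fault=None):
    """Evaluate the netlist; fault is (signal, stuck_value) or None."""
    signals = dict(assignment)
    if fault is not None and fault[0] in signals:
        signals[fault[0]] = fault[1]  # stuck-at fault on a primary input
    for name, op, x, y in NETLIST:
        value = signals[x] & signals[y] if op == "and" else signals[x] | signals[y]
        if fault is not None and fault[0] == name:
            value = fault[1]  # stuck-at fault on an internal signal
        signals[name] = value
    return signals[OUTPUT]


def generate_test(fault):
    """Return an input assignment detecting the fault, or None."""
    for bits in product((0, 1), repeat=len(INPUTS)):
        assignment = dict(zip(INPUTS, bits))
        if evaluate(assignment) != evaluate(assignment, fault):
            return assignment
    return None  # fault is untestable in this circuit
```

For the fault "n1 stuck at 0", for instance, the pattern has to set both a and b to 1 to excite the fault and c to 0 to propagate it to the output.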
The algorithms of this thesis adapt the methods of BMC, ATPG, and IMC to
robustness checking. Moreover, in order to handle larger circuits, a
divide-and-conquer approach similar to Compositional Model Checking
[CLM89] is introduced. All these approaches are formal approaches
providing exact results. However, a powerful robustness checking flow also
needs random simulation as a fast pre-processing step, which is introduced
in this thesis as well.
Overall, RobuCheck, an integrated robustness checker that implements a
highly optimized flow, is introduced. All approaches of this thesis are
integrated and freely configurable. These approaches can be called
concurrently, exploiting multi-core processor architectures, or
consecutively on single-core processors. The huge search space of the
underlying problem makes robustness checking hard. Low-level optimization,
pre-processing, and post-processing techniques are introduced to improve
the overall performance.
In order to assess the circuit's robustness effectively, three complexity
issues are addressed in this thesis: 1) all input scenarios and 2) all
transient faults need to be analyzed completely as to whether 3) all output
sequences adhere to the specification. These issues inherently require
computing a suitable set of reachable states to provide high-quality
results. This set of states directly influences the accuracy of the
analysis, which is investigated in depth and handled in RobuCheck.
All approaches introduced in this thesis are empirically evaluated on
well-known academic circuits. Moreover, benchmarks coming from IBM
show the eﬀectiveness of RobuCheck on industrial circuits.
Related Work
Various works have been published on analyzing the behavior of a circuit
in the presence of transient faults. Simulation- and emulation-based
methods [CMR+02, KPMH07, PCZ+08] can handle large circuits but do not
generally provide exact results.
A major difference between the approaches in the literature and the fault
model of this thesis is the level at which faults are modeled. The works
[MZM06, MZM10, ZBD07] model radiation-induced transient faults and all
relevant masking effects at the electrical level. Those models are very
large. Consequently, the approaches are only useful for small circuits.
Moreover, multiple transient faults are considered in [MZM10], making the
model even more complex.
[BBC+09] characterizes the state space after injecting faults based on
certain properties. The works [BCG+10, BCT07] require more human
interaction since properties are provided manually. These approaches
provide only "yes"/"no" information about the respective system, although
[BCG+10] additionally provides a gradation of the considered properties.
The works most similar to this thesis are [HH08, HHC+09, KPJ+06], which
analyze all components of a circuit, whereas the works [HPB07, SLM07]
focus on state elements. All approaches analyze the impact of transient
faults on the sequential behavior. Either the approaches perform a
symbolic fixed-point characterization, which is restricted to relatively
small circuits, or the approaches themselves are restricted to a certain
class of circuits [HH08, HHC+09].
The approaches in this thesis analyze any kind of digital circuit at the
logic level by considering the complete space of transient faults for each
component. Transient faults are modeled for a single but arbitrary time
frame, and their impact is analyzed sequentially. The approaches provide a
detailed analysis of each component of any sequential circuit. The thesis
focuses on formal techniques to analyze transient faults.
Research Work
The thesis is based on the prior work of [FD08, FSD09] that formulate
the basis for formal robustness checking. However, several extensions and
reﬁnements in terms of performance, accuracy, and completeness have
been published by the author of this thesis. The entire work would not
be possible without the strong support of the coauthors of the respective
papers. Thanks to all my coauthors.
In the following an overview of my research is brieﬂy described.
This thesis focuses on the author's research on formal robustness
checking. Several scientific works were published at peer-reviewed national
and international conferences [SFFD09, FFD10, FFSD10, FF10, FHD+11, FRF12,
FFA+12, RFF12] and in peer-reviewed journals [FSFD11, FSSFa10].
The work [FFD10], entitled A better-than-worst-case robustness measure,
received a Best Paper Award in the category testing. Moreover, the
authors were invited to give a talk about their work at the International
Test Conference (ITC).
Parts of the thesis were also presented at the PhD Forum of the Asia
and South Pacific Design Automation Conference (ASP-DAC).
The work [FFA+12] was joint work with the IBM Haifa Research Lab,
where the author of this thesis completed an internship.
A preliminary version of the verification tool RobuCheck was published
and demonstrated at the University Booth of the Design, Automation,
and Test in Europe (DATE) conference.
To give an overview of which papers contribute to which chapter, the
following list shows these relations and the further structure of the thesis:
• Chapter 2: The fundamentals are presented to keep the thesis
self-contained.
• Chapter 3: This chapter introduces the basic fault model. Transient
faults are modeled and the impact of these faults on the circuit's
behavior is introduced. This chapter contains parts of the works
[FFD10, FSFD11, FFA+12].
• Chapter 4: After the different kinds of behavior have been introduced,
two robustness measures are presented that were originally published
in [FFD10, FSFD11].
• Chapter 5: Building on this groundwork, a basic algorithm to compute
the circuit's robustness is proposed. Moreover, the influence of
approximate reachability information is analyzed. Parts of this
chapter were published in [FSFD11, FHD+11].
• Chapter 6: Concrete engines are introduced that follow the idea of
the basic algorithm from the previous chapter. Mainly approaches
based on formal methods are introduced, but also a simulation-based
approach. Moreover, a simple model checker based on a new idea for
approximating reachable states is proposed. At the end, RobuCheck is
presented, which integrates all engines to determine the circuit's
robustness. Parts of this chapter were published in
[FFSD10, FSFD11, FHD+11, FFA+12].
• Chapter 7: All introduced algorithms are evaluated on several
benchmarks. Besides the main evaluation of robustness checking,
results on approaches to compute interpolants efficiently and a
comparison of the introduced model checker against a state-of-the-art
model checker are presented.
• Chapter 8: Finally, the thesis ends with a conclusion and directions
for future work.
Moreover, numerous works have been published in related areas, which are
listed below:
• The works [SFWD12, WGF+11, ZFWD11, JFWD10, FWD10, WGF+09] are about
reversible logic, including automated debugging and testing. Moreover,
an integrated development environment called RevKit for developing
algorithms around reversible logic has been published as an
open-source framework.
• The work [HFF+11] integrates various satisfiability solvers into a
domain-specific language embedded in C++, called metaSMT. Numerous
front-ends, middle-ends, and back-ends are available to configure an
optimal solver for a dedicated problem. metaSMT is published as an
open-source framework that is already integrated in various tools,
e.g., [HLGD12, RF12].

Chapter 2
Preliminaries
In this chapter the theoretical background is introduced to keep the thesis
self-contained. For further details on the respective topic a reference is
provided.
2.1 Directed Acyclic Graph
A common data structure in computer science is a graph consisting of nodes
and edges, formally introduced as follows:
Definition 2.1. A Directed Acyclic Graph (DAG) is a tuple $G = (V, E)$
with a finite set of nodes $V$ and a finite set of directed edges
$E \subseteq V \times V$ that contains no directed cycle.
If $e = (v, v') \in E$, $v$ is called source node and $v'$ is called target
node. The function $in(v) = \{e_{i_1}, \dots, e_{i_o}\}$ denotes the set of
incoming edges, i.e., the edges where $v$ is the target node. The function
$out(v) = \{e_{i_1}, \dots, e_{i_p}\}$ denotes the set of outgoing edges,
i.e., the edges where $v$ is the source node.
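Definition 2.1 maps directly onto a small data structure. The following Python sketch is one possible encoding and not taken from the thesis: the class name `DAG` and the method names `incoming` and `outgoing` (standing for in(v) and out(v)) are chosen for illustration, and acyclicity is checked with Kahn's algorithm.

```python
from collections import deque


class DAG:
    """Directed acyclic graph G = (V, E) with E a subset of V x V."""

    def __init__(self, nodes, edges):
        self.nodes = set(nodes)
        self.edges = set(edges)
        for source, target in self.edges:
            if source not in self.nodes or target not in self.nodes:
                raise ValueError(f"edge ({source}, {target}) uses an unknown node")
        if not self._is_acyclic():
            raise ValueError("graph contains a directed cycle")

    def incoming(self, v):
        """in(v): all edges whose target node is v."""
        return {e for e in self.edges if e[1] == v}

    def outgoing(self, v):
        """out(v): all edges whose source node is v."""
        return {e for e in self.edges if e[0] == v}

    def _is_acyclic(self):
        # Kahn's algorithm: repeatedly remove nodes without incoming edges.
        indegree = {v: len(self.incoming(v)) for v in self.nodes}
        ready = deque(v for v, degree in indegree.items() if degree == 0)
        visited = 0
        while ready:
            v = ready.popleft()
            visited += 1
            for _, target in self.outgoing(v):
                indegree[target] -= 1
                if indegree[target] == 0:
                    ready.append(target)
        return visited == len(self.nodes)
```

Scanning all edges in `incoming` and `outgoing` keeps the sketch close to the definition; an implementation for large graphs would rather store adjacency lists.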
2.2 Boolean Logic
2.2.1 Boolean Functions
The set of Boolean values is given by $\mathbb{B} = \{0, 1\}$, where 0 is
also denoted by FALSE or $\bot$, and 1 is also denoted by TRUE or $\top$
[Weg87].
Definition 2.2. An assignment $\phi$ is a mapping from a set of variables
$X$ to Boolean values, i.e., $\phi(x) \in \mathbb{B}$ for all $x \in X$.
Definition 2.3. A Boolean function $f$ is a mapping of the form
$f : \mathbb{B}^n \to \mathbb{B}$. Further, $f$ is often defined over a set
of Boolean variables $X = \{x_1, \dots, x_n\}$ and is also written as
$f(x_1, \dots, x_n)$.
Boolean functions can be represented in different ways, e.g., by truth
tables, Boolean expressions, Binary Decision Diagrams, and Conjunctive
Normal Form. A simple representation is based on Boolean operations
(connectives) that form a Boolean expression. Such operations are, for
example, negation ($\lnot x$, $\overline{x}$), conjunction ($\land$,
$\cdot$), disjunction ($\lor$, $+$), exclusive-or ($\oplus$, ^), and
others. A more detailed explanation is given in [Weg87].
Example 2.1. Given a set of Boolean variables $X = \{x_1, x_2, x_3\}$ with
$x_i \in \mathbb{B}$, a Boolean function $f$ is given by
$f(x_1, x_2, x_3) = x_1 \cdot x_2 \lor \lnot x_1 \cdot x_3$.
The function $f$ evaluates to TRUE under the assignment $x_1 = 1$,
$x_2 = 1$, $x_3 = 0$.
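The function of Example 2.1 can be evaluated mechanically. The following Python sketch encodes Boolean values as the integers 0 and 1, which is an encoding chosen for this illustration, and builds the truth table as one of the representations mentioned above.

```python
from itertools import product


def f(x1: int, x2: int, x3: int) -> int:
    """f(x1, x2, x3) = x1 * x2 OR (NOT x1) * x3 over the values 0 and 1."""
    return (x1 & x2) | ((1 - x1) & x3)


def truth_table(function, num_vars: int):
    """Map every assignment (as a tuple of bits) to the function value."""
    return {bits: function(*bits) for bits in product((0, 1), repeat=num_vars)}


if __name__ == "__main__":
    for bits, value in truth_table(f, 3).items():
        print(bits, "->", value)
```

The function behaves like a multiplexer: x1 selects whether the value of x2 or the value of x3 is passed on.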
2.2.2 Binary Decision Diagrams
Binary Decision Diagrams (BDDs) are graph-based data structures to
represent Boolean functions [Bry86]. For several years they were the
predominant back-end for many tasks during the design process, like
verification or test generation, since manipulation and comparison are
very efficient once the BDD is built. They are still used effectively in
many applications, including this thesis. Recently, [XWMB12] showed that
BDDs are still used in industrial-strength verification flows and
successfully verify systems on which more modern approaches fail.
A BDD consists of edges and nodes forming a Directed Acyclic Graph
(DAG). Each internal node of the BDD commonly corresponds to a Shannon
decomposition of the represented function with respect to one variable. A
Reduced Ordered Binary Decision Diagram (ROBDD) is a canonical
representation of a Boolean function for a fixed variable order.
Construction and manipulation of BDDs can be done very efficiently for a
wide range of Boolean functions. The book [DB98] provides an overview of
the techniques and applications of BDDs.
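The connection between the Shannon decomposition and the canonicity of ROBDDs can be illustrated with a deliberately naive construction. The following Python sketch is an assumption-laden illustration and not how BDD packages work: it enumerates all assignments, which takes exponential time, whereas real packages build BDDs through operations on existing BDDs. The names `build_robdd` and `evaluate` are hypothetical.

```python
def build_robdd(f, num_vars):
    """Build an ROBDD for f with the variable order x1 < x2 < ... < xn.

    Returns (root, nodes). The terminals are the integers 0 and 1; every
    other node id maps to a triple (variable index, low child, high child).
    """
    unique = {}  # (variable index, low, high) -> node id

    def make_node(var, low, high):
        if low == high:  # reduction rule: the test on var is redundant
            return low
        key = (var, low, high)
        if key not in unique:  # reduction rule: share isomorphic nodes
            unique[key] = len(unique) + 2  # ids 0 and 1 are the terminals
        return unique[key]

    def build(prefix):
        if len(prefix) == num_vars:
            return f(*prefix)
        # Shannon decomposition with respect to the next variable
        low = build(prefix + (0,))
        high = build(prefix + (1,))
        return make_node(len(prefix), low, high)

    root = build(())
    nodes = {node_id: key for key, node_id in unique.items()}
    return root, nodes


def evaluate(root, nodes, bits):
    """Follow the path selected by the assignment `bits` to a terminal."""
    node = root
    while node not in (0, 1):
        var, low, high = nodes[node]
        node = high if bits[var] else low
    return node
```

For the function of Example 2.1 the sketch yields three internal nodes. Building the diagram from a syntactically different but equivalent expression yields the identical structure, which is what canonicity means for a fixed variable order.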
2.2.3 Conjunctive Normal Form
Definition 2.4. Let $X = \{x_1, \dots, x_n\}$ be a set of Boolean
variables; then $Lit(X) = \{x, \overline{x} \mid x \in X\}$ is called the
set of literals of $X$.
Definition 2.5. A clause is a disjunction of literals, i.e.,
$F = l_1 \lor l_2 \lor \dots \lor l_n$ is a clause with $l_i \in Lit(X)$
where $X$ is a set of Boolean variables.
A clause is also modeled as a set of literals since the structure of the
formula is implicitly defined, i.e., the clause
$F = l_1 \lor l_2 \lor \dots \lor l_n$ is also written as
$F = \{l_1, l_2, \dots, l_n\}$. The size of a clause is given by the number
of literals it contains and is written as $|F|$. A clause that contains
only one literal is called a unit clause. The empty clause contains no
literals, is logically equivalent to FALSE, and is denoted by $\square$.
Deﬁnition 2.6. A Conjunctive Normal Form (CNF) is a conjunction of
clauses. That means the CNF M = F1 ∧F2 ∧ · · · ∧Fm is a Boolean formula
consisting of clauses Fi with 1 ≤ i ≤ m.
A CNF is also modeled as a set of clauses, i.e., a CNF is also written
as M = {F1, F2, . . . , Fm}. The size of a CNF is the number of
clauses it contains, written as |M|. The function Var(M) returns all variables
occurring in the formula M.
Definition 2.7. Let M = {F1, . . . , Fm} be a CNF defined over the variables
Var(M) = {x1, . . . , xn} and let φ be an assignment to the variables of M with
φ(xi) ∈ B. The assignment φ satisfies a clause Fi, written φ |= Fi, if at least
one literal of Fi evaluates to TRUE under φ. The CNF M is satisfied by φ,
written φ |= M, if every clause Fi is satisfied by φ, i.e., ∀Fi ∈ M. φ |= Fi.
If there exists an assignment satisfying M, then M is called satisfiable,
otherwise M is called unsatisfiable.
Definition 2.8. Let F = {l1, . . . , ln} be a clause and M be a Boolean
function. Then F|M is the clause F stripped to the variables of M, i.e.,
F|M = F ∩ Lit(Var(M)).
Example 2.2. Given a clause F = {x1, ¬x2, x3} and a Boolean function
M(x1, x2) = x1 · ¬x2, the clause stripped by M is F|M = {x1, ¬x2}.
That means the stripped clause does not contain the variable x3, because
M is not defined over x3.
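The set-based view of clauses and CNFs maps directly onto code. The sketch below is an illustration rather than part of the thesis: it assumes the common signed-integer encoding of literals (i for xi, -i for ¬xi), and the helper names satisfies_clause, satisfies, and strip are hypothetical.

```python
def satisfies_clause(phi, clause):
    """phi |= F: at least one literal of the clause is TRUE under phi."""
    return any(phi[abs(lit)] == (lit > 0) for lit in clause)


def satisfies(phi, cnf):
    """phi |= M: every clause of the CNF is satisfied by phi."""
    return all(satisfies_clause(phi, clause) for clause in cnf)


def strip(clause, variables):
    """F|M: keep only the literals whose variable occurs in `variables`."""
    return frozenset(lit for lit in clause if abs(lit) in variables)


# Clause F = {x1, NOT x2, x3} stripped to the variables {x1, x2}.
F = frozenset({1, -2, 3})
stripped = strip(F, {1, 2})

# A small CNF M = (x1 OR x2) AND (NOT x2) and an assignment phi.
M = [frozenset({1, 2}), frozenset({-2})]
phi = {1: True, 2: False}
```

Here stripped equals {x1, ¬x2}, and phi satisfies M because x1 is TRUE and x2 is FALSE.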
Definition 2.9. Given an unsatisfiable CNF M = {F1, . . . , Fn}, a subset
M′ ⊆ M is called an unsatisfiable subformula if M′ is unsatisfiable. Further,
M′ is called a minimal unsatisfiable subformula if, for all F ∈ M′, the formula
M′ \ {F} is satisfiable. A (minimal) unsatisfiable subformula is also
known as a (minimal) unsat-core of a CNF formula.
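One simple way to obtain a minimal unsatisfiable subformula is deletion-based minimization: try to drop each clause and keep the removal whenever the rest stays unsatisfiable. The sketch below is an assumption-laden illustration, not the method used in the thesis; in particular, is_sat is a brute-force stand-in for a real SAT solver, and literals are encoded as signed integers.

```python
from itertools import product


def is_sat(cnf):
    """Brute-force satisfiability check (stand-in for a SAT solver)."""
    variables = sorted({abs(lit) for clause in cnf for lit in clause})
    for bits in product([False, True], repeat=len(variables)):
        phi = dict(zip(variables, bits))
        if all(any(phi[abs(l)] == (l > 0) for l in c) for c in cnf):
            return True
    return False


def minimal_unsat_core(cnf):
    """Deletion-based extraction of a minimal unsatisfiable subformula."""
    core = list(cnf)
    i = 0
    while i < len(core):
        trial = core[:i] + core[i + 1:]
        if not is_sat(trial):
            core = trial      # clause i is not needed for unsatisfiability
        else:
            i += 1            # clause i is necessary, keep it
    return core


# The clause {x3} is irrelevant for the conflict and is removed.
M = [frozenset({1, 2}), frozenset({1, -2}),
     frozenset({-1, 2}), frozenset({-1, -2}), frozenset({3})]
core = minimal_unsat_core(M)
```

Every clause that remains in core has been tested individually, so removing any one of them makes the remainder satisfiable, which is exactly the minimality condition of Definition 2.9.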
Deﬁnition 2.10. A minterm is a conjunction of literals, i.e., H = l1 ∧ l2 ∧
. . .∧ ln is a minterm with li ∈ Lit(X) where X is a set of Boolean variables.
Deﬁnition 2.11. A Disjunctive Normal Form (DNF) is a disjunction of
minterms. That means G = H1 ∨ H2 ∨ . . . ∨ Hn is a DNF.
2.2.4 Boolean Satisfiability – The SAT Problem
Definition 2.12. The Boolean Satisfiability Problem (SAT Problem) is
a decision problem that takes as input a Boolean expression f : B^n → B
and asks whether f is satisfiable. The answer is computed by the function
SAT?(f) ∈ {TRUE, FALSE}: SAT?(f) returns TRUE if there is an assignment φ
such that φ |= f, i.e., the function f evaluates to TRUE under the
assignment φ; otherwise FALSE is returned.
Boolean Satisfiability (also known as the Satisfiability Problem or SAT
problem) asks whether a Boolean function is satisfiable. This
problem is of great interest for a variety of theoretical and practical reasons.
SAT was the first problem shown to be NP-complete, proved by
Stephen A. Cook in [Coo71]. Therefore, from a theoretical point of view, we
do not expect a general algorithm that decides satisfiability efficiently under
the assumption that P ≠ NP [AB09]. However, not every problem instance
requires exponential run time, and due to the great advances in the area of
SAT solving, many hard problem instances can now be solved very efficiently.
Various real-world problems from planning in Artiﬁcial Intelligence (AI)
[KS92, Rin12] over Automatic Test Pattern Generation (ATPG) [DEFT09]
and Model Checking (MC) [CGP01] of Boolean circuits to automated
debugging of software [SVAV05] are translated into a SAT-problem and are
eﬀectively solved. Due to the great success, these SAT solvers are applied
in various ﬁelds in computer science to solve diﬃcult problems.
Before solving a suitable problem using a SAT solver, a decision problem
needs to be formulated and then translated into a CNF. That means, the
problem instance from the problem domain needs to be translated into
a Boolean function and is then checked for satisﬁability by a SAT solver.
After deciding satisﬁability, the result is translated back into the problem
domain and in case that the formula is satisﬁable the satisfying assignments
are translated back if necessary.
Despite the high complexity of the SAT problem, various algorithms have
been developed that solve it efficiently in practice. These algorithms form the
basis of today's modern SAT solvers, whose main ideas are briefly introduced
in the following.
SAT solver
A SAT solver usually gets a CNF formula as input and returns whether
the CNF is satisfiable or unsatisfiable, i.e., a SAT solver implements the
function SAT?. Additionally, almost all SAT solvers compute a satisfying
assignment if one exists. Due to the great improvements in solver
performance in recent years, CNFs with several hundred
thousand clauses and variables are routinely solved. Empirical studies about
the craft behind SAT solvers are presented, e.g., in [KSMS11]. The first
algorithm was the Davis-Putnam (DP) algorithm [DP60], which performs
resolution, a simple inference rule, to deduce the empty clause in case of an
unsatisfiable CNF. The Davis-Putnam-Logemann-Loveland (DPLL) algorithm,
proposed two years later, improves the DP algorithm in terms of memory
consumption [DLL62] by replacing resolution with search and backtracking.
The DPLL algorithm forms the basis of most modern SAT solvers.
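To make the search-and-backtracking idea concrete, the following is a minimal recursive DPLL sketch with unit propagation. It is a textbook-style illustration under simplifying assumptions (signed-integer literals, no decision heuristics, no clause learning) and does not reflect the internals of any particular solver.

```python
def dpll(cnf, assignment=None):
    """Return a (possibly partial) satisfying assignment or None."""
    assignment = dict(assignment or {})
    cnf = [set(clause) for clause in cnf]
    while True:
        simplified = []
        for clause in cnf:
            if any(assignment.get(abs(l)) == (l > 0) for l in clause):
                continue                      # clause already satisfied
            rest = {l for l in clause if abs(l) not in assignment}
            if not rest:
                return None                   # all literals false: conflict
            simplified.append(rest)
        cnf = simplified
        units = [next(iter(c)) for c in cnf if len(c) == 1]
        if not units:
            break
        lit = units[0]                        # unit propagation
        assignment[abs(lit)] = lit > 0
    if not cnf:
        return assignment                     # every clause is satisfied
    lit = next(iter(cnf[0]))                  # decision on some open literal
    for value in (lit > 0, not (lit > 0)):    # try one phase, then backtrack
        result = dpll(cnf, {**assignment, abs(lit): value})
        if result is not None:
            return result
    return None


sat_cnf = [{1, 2}, {-1, 3}, {-2, -3}]
unsat_cnf = [{1, 2}, {1, -2}, {-1, 2}, {-1, -2}]
model = dpll(sat_cnf)
no_model = dpll(unsat_cnf)
```

Variables that never had to be decided are simply absent from the returned assignment; they can take either value.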
At the beginning of the 21st century, SAT solving became very attractive
because more elaborate techniques such as conflict analysis, the
two-watched-literal scheme, and non-chronological backtracking were proposed
to solve practical problems, e.g., in the SAT solvers GRASP [MSS99],
MiniSAT [ES03], and Chaff [MMZ+01b].
The annual SAT competitions¹, held since 2002, show the increasing
efficiency of new techniques. Further, it has been shown that fine-tuning the
internal heuristics and engineering the internal data structures of those
solvers has a strong impact on performance.
Due to the increasing number of multi-core systems, SAT solving
on multiple CPU cores has recently become an active research area. A portfolio
solver encapsulates different solvers and starts all
integrated solvers in separate threads or processes. The result of the fastest
solver is returned and all other solvers are terminated. SAT solvers behave
differently on different CNFs due to their distinct internal heuristics and
parameters. Adjusting one parameter can help to solve one instance much
faster while it slows down solving other instances. Predicting in advance
whether a specific solver will solve a CNF faster than another
solver is very difficult and often amounts to solving the SAT problem
itself. However, a balanced combination of various solvers may overcome
this issue, which has also been addressed in the SAT competitions by
introducing new tracks for parallel solving.
Resolution Proofs
Today's modern SAT solvers are Conflict-Driven Clause-Learning (CDCL)
solvers. These solvers build on DPLL-based algorithms,
including backtracking and the generation of conflict clauses. Moreover,
they can be understood as proof systems performing resolution [PD11],
which is introduced in this section.
Given a CNF formula M, the resolution procedure tries to produce a
proof that M is a contradiction, i.e., unsatisfiable. The proof is produced
by applying the following RESolution inference rule (RES), where
Fi ∪ {l} and Fj ∪ {¬l} are clauses with Fi ∪ {l}, Fj ∪ {¬l} ∈ M:
$$\frac{F_i \cup \{l\} \qquad F_j \cup \{\neg l\}}{F_i \cup F_j}\ \text{RES} \tag{2.1}$$
¹ Organized at http://www.satcompetition.org
where l is called the pivot variable, Fi ∪ {l} and Fj ∪ {¬l} are called the
antecedents, and Fi ∪ Fj is called the resolvent. Let Res(Fi, Fj, l) be the
function that computes the resolvent. Basically, the RES rule states that
(Fi ∪ {l}) ∧ (Fj ∪ {¬l}) implies Fi ∪ Fj. If a sequence of resolution steps
finally derives the empty clause, the CNF is unsatisfiable since a conflict
occurs. Recall that the empty clause is equivalent to FALSE.
Remark. In [Rob65], it has been shown that the resolution proof system
with the single rule RES is sound and complete.
Example 2.3. Given an unsatisﬁable CNF M with
M = {{x1, x2}, {x1,¬x2}, {¬x1, x2}, {¬x1,¬x2}}.
A sequence of RES applications that derives the empty clause □ is:
  {x1, x2}, {x1, ¬x2}   ⊢  {x1}    (pivot x2)
  {x1, x2}, {¬x1, x2}   ⊢  {x2}    (pivot x1)
  {x2}, {¬x1, ¬x2}      ⊢  {¬x1}   (pivot x2)
  {x1}, {¬x1}           ⊢  □       (pivot x1)
The sequence of RES applications that derives the empty clause is called a
resolution proof. To handle resolution proofs, the following data structure is
introduced.
Definition 2.13 (Resolution Proof). A resolution proof is a Directed
Acyclic Graph (DAG) R = (V, E, piv, cl, c), where V is a finite set of nodes,
E is a finite set of edges, piv specifies the pivot variable of a resolution
step, which labels the incoming edges of a node, cl is the clause function
that maps each node to its clause, and c ∈ V is the
sink, which represents the empty clause and has no outgoing edges. Nodes with
no incoming edges are original nodes from the input formula; all remaining
nodes are derived nodes. For each derived node γ ∈ V there are nodes α, β ∈ V
with (α, γ) ∈ E and (β, γ) ∈ E such that cl(γ) = Res(cl(α), cl(β), piv(γ)).
The sink node represents the empty clause, i.e., cl(c) = □.
Moreover, given a resolution proof, an unsatisfiable core can be easily
extracted. The original clauses reached backwards from the sink node form
an unsatisfiable core, since they alone suffice to derive the conflict.
Example 2.4. Reconsider Example 2.3, the following ﬁgure represents the
resolution proof.
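[Figure: resolution proof DAG of Example 2.3; edges are labeled with pivot variables]
  {x1, x2}    --x2-->  {x1}
  {x1, ¬x2}   --x2-->  {x1}
  {x1, x2}    --x1-->  {x2}
  {¬x1, x2}   --x1-->  {x2}
  {x2}        --x2-->  {¬x1}
  {¬x1, ¬x2}  --x2-->  {¬x1}
  {x1}        --x1-->  □
  {¬x1}       --x1-->  □
Arrows mark edges between nodes (clauses) and variables on edges denote
the pivot variables.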
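Extracting an unsatisfiable core from a resolution proof amounts to a backward traversal from the sink. The sketch below encodes the proof of Example 2.4 as a dictionary; the node names, the dictionary layout, and the extra unused clause {x3} are assumptions added for illustration only.

```python
def core_from_proof(proof, sink):
    """Collect the original clauses reachable backwards from the sink.

    proof maps a node name to (clause, parents); original nodes have
    an empty tuple of parents.
    """
    seen, stack, core = set(), [sink], set()
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        clause, parents = proof[node]
        if parents:
            stack.extend(parents)   # derived node: continue backwards
        else:
            core.add(clause)        # original node: part of the core
    return core


proof = {
    "a": (frozenset({1, 2}), ()),
    "b": (frozenset({1, -2}), ()),
    "c": (frozenset({-1, 2}), ()),
    "d": (frozenset({-1, -2}), ()),
    "e": (frozenset({3}), ()),            # hypothetical clause, unused in the proof
    "n1": (frozenset({1}), ("a", "b")),
    "n2": (frozenset({2}), ("a", "c")),
    "n3": (frozenset({-1}), ("n2", "d")),
    "sink": (frozenset(), ("n1", "n3")),
}
core = core_from_proof(proof, "sink")
```

The clause {x3} never contributes to the derivation of the empty clause and is therefore not part of the extracted core. The core obtained this way is unsatisfiable but not necessarily minimal.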
In the following, the number of nodes in the resolution proof is considered
as the size of the proof necessary to derive the empty clause. Unfortunately,
the size of a resolution proof can be very large with respect to the size
of the input formula. For example, for Pigeonhole Problems the large proof
size is inherent, i.e., every resolution proof is exponentially
large [AB09].
For other instances, the size significantly depends on the choice of the
antecedents and pivot variables. Thus, the heuristics of the SAT solver
influence the proof at least structurally and often also semantically. However,
there are various techniques to shrink resolution proofs, e.g., [BIFH+11].
A modern SAT solver can be instrumented to produce a resolution proof.
For instance, the SAT solvers PicoSAT [Bie08] and MiniSAT [ES03] produce
resolution proofs in their proof-tracing versions. However, tracing the proof
slows down the solving process by a considerable factor. Thus, proof tracing
is only activated if necessary.
Moreover, there are several tools besides SAT solvers that process
resolution proofs. For example, tracecheck² verifies whether a resolution
proof correctly deduces the empty clause based on the resolution inference
rule. Such tools are useful when it has to be verified that a SAT solver
correctly concludes that a CNF is unsatisfiable [ZM03].
Craig Interpolants
Craig interpolation was introduced in [Cra57] and was later named after
its author, William Craig. Intuitively, an interpolant describes the relation
² Available at http://fmv.jku.at/booleforce/
between logical formulas and is formally deﬁned for Boolean formulas as
follows:
Definition 2.14. Let (A, B) be a pair of Boolean formulas such that A ∧ B
is unsatisfiable. A Craig interpolant Î is a Boolean formula with the
following properties: 1) A ⇒ Î, 2) Î ∧ B is unsatisfiable, and 3)
Var(Î) ⊆ Var(A) ∩ Var(B).
Intuitively, Î abstracts formula A, but contains sufficient constraints to
contradict formula B.
Theorem 2.1. Given a propositional formula pair (A, B) where A ∧ B is
unsatisfiable, there always exists an interpolant [Cra57].
An interpolation system is a procedure to obtain interpolants. This
system is called proof-based when it is based on a resolution proof and it is
called model-based when it is based on model enumeration.
Proof-based System Various proof-based interpolation systems have
been proposed. Basically, they can be divided
into two groups: a symmetric system independently developed in [Hua95,
Kra97, Pud97], and McMillan's interpolation system (MIS) [McM03]. Further,
[DKPW10, Wei12] compare the strengths of both systems and provide
a parametric interpolation system that generates different interpolants from
the same proof. To the best of the author's knowledge, a study of how the
choice of interpolation system influences performance is not yet publicly
available.
Interpolants are exploited in practice in SAT-based Model Checking
[McM03], which significantly improved formal hardware verification
in terms of handling industrial-sized circuits. In his work
on interpolation-based model checking, McMillan introduced MIS, which is
briefly revisited in the following. MIS can be understood as a method that
annotates each node of a resolution proof based on its incoming edges and
finally produces an interpolant.
Definition 2.15 (McMillan's Interpolation System – MIS). Given a resolution
proof R = (V, E, piv, cl, c) for the formula pair (A, B) where A ∧ B is
unsatisfiable, each node v ∈ V is annotated by the function annotate as
follows:
• if in(v) = 0, i.e., v has no incoming edges:
– if cl(v) ∈ A, then annotate(v) = cl(v)|B
– if cl(v) ∈ B, then annotate(v) = TRUE
• else: let (v1, v) ∈ E and (v2, v) ∈ E
– if piv(v) ∈ Var(B), then annotate(v) = annotate(v1) ∧ annotate(v2)
– if piv(v) ∉ Var(B), then annotate(v) = annotate(v1) ∨ annotate(v2)
Finally, annotate(c) represents the interpolant, where c is the sink node.
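The annotation rules can be sketched in a few lines of Python. The following is an illustrative sketch under stated assumptions, not the implementation used in the thesis: the proof is a dictionary mapping node names to (clause, parents, pivot variable), annotations are nested tuples, literals are signed integers, and the brute-force check is_interpolant only serves to validate the result on a toy example.

```python
from itertools import product


def evaluate(formula, phi):
    """Evaluate an annotation: bool, ("lit", l), ("and", f, g) or ("or", f, g)."""
    if isinstance(formula, bool):
        return formula
    if formula[0] == "lit":
        return phi[abs(formula[1])] == (formula[1] > 0)
    left, right = evaluate(formula[1], phi), evaluate(formula[2], phi)
    return (left and right) if formula[0] == "and" else (left or right)


def formula_vars(formula):
    if isinstance(formula, bool):
        return set()
    if formula[0] == "lit":
        return {abs(formula[1])}
    return formula_vars(formula[1]) | formula_vars(formula[2])


def disjunction(literals):
    formula = False                        # the empty disjunction is FALSE
    for lit in sorted(literals, key=abs):
        formula = ("or", formula, ("lit", lit))
    return formula


def mcmillan(proof, sink, A, B):
    vars_b = {abs(l) for c in B for l in c}
    memo = {}

    def annotate(v):
        if v not in memo:
            clause, parents, pivot = proof[v]
            if not parents:                # in(v) = 0
                if clause in A:            # cl(v) stripped to Var(B)
                    memo[v] = disjunction(l for l in clause if abs(l) in vars_b)
                else:
                    memo[v] = True
            else:
                f, g = annotate(parents[0]), annotate(parents[1])
                memo[v] = ("and", f, g) if pivot in vars_b else ("or", f, g)
        return memo[v]

    return annotate(sink)


def holds(cnf, phi):
    return all(any(phi[abs(l)] == (l > 0) for l in c) for c in cnf)


def is_interpolant(itp, A, B):
    """Brute-force check of properties 1) and 2) of a Craig interpolant."""
    variables = sorted({abs(l) for c in A + B for l in c})
    for bits in product([False, True], repeat=len(variables)):
        phi = dict(zip(variables, bits))
        if holds(A, phi) and not evaluate(itp, phi):
            return False                   # A => itp is violated
        if holds(B, phi) and evaluate(itp, phi):
            return False                   # itp AND B is satisfiable
    return True


# x1 occurs only in A, x2 is shared between A and B.
A = [frozenset({1}), frozenset({-1, 2})]
B = [frozenset({-2})]
proof = {
    "a1": (A[0], (), None),
    "a2": (A[1], (), None),
    "b1": (B[0], (), None),
    "n1": (frozenset({2}), ("a1", "a2"), 1),
    "sink": (frozenset(), ("n1", "b1"), 2),
}
itp = mcmillan(proof, "sink", A, B)
```

On this toy instance the annotation of the sink is logically equivalent to x2: it is implied by A, contradicts B, and mentions only the shared variable.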
The computation of an interpolant can be done in linear time and space
with respect to the size of the resolution proof. However, as discussed above,
the proof itself can be exponentially larger than the input formula A ∧ B.
Model-based System The work of [CIM12] proposes an approach to
compute interpolants based on model enumeration. Given the formula pair
(A, B), the approach basically 1) enumerates assignments satisfying A,
2) minimizes each assignment such that it still contradicts B, and 3) creates
a DNF from these minimized assignments. Finally, the DNF represents the
interpolant.
Algorithm 1:
Model-based interpolation systems
• Input: The input of the approach is the formula pair (A, B) given
in CNF.
• Output: The output is either the interpolant Î of (A, B) in DNF
or a satisfying assignment in case that A ∧ B is satisfiable.
• Description: The code is listed in Pseudocode 1. Basically, the
approach enumerates models of A, i.e., assignments φ |= A, until no more
assignments exist. At the beginning, the interpolant Î is FALSE (Line 2)
and the approach starts with all possible models, i.e., A′ = A (Line 3).
The while-loop iterates until no more assignments exist, i.e., A′
is unsatisfiable (Line 5), in which case the determined interpolant Î is
returned, or until an assignment is found that is also consistent with B,
i.e., A ∧ B is satisfiable, in which case SAT is returned together with the
assignment in Line 11.
Suppose there is an assignment φ |= A′ (Line 7) represented as a
minterm. The first step is to remove all variables of φ that occur
only in A, i.e., φ′ is defined over the common variables of A and B (Line 8). In
 1  begin
 2    Î = FALSE;
 3    A′ = A;
 4    while TRUE do
 5      if !SAT?(A′) then
 6        return Î;
 7      φ = model(A′);
 8      φ′ = projection(φ, Var(A) ∩ Var(B));
 9      φ′′ = cubeEnlargement(φ′, A′);
10      if SAT?(B ∧ φ′′) then
11        return SAT + φ′′;
12      φ′′′ = cubeEnlargement(φ′′, B);
13      Î = Î ∨ φ′′′;
14      A′ = A′ ∧ ¬φ′′′;
15    end
16  end
Pseudocode 1: Model-based interpolation systems.
Line 9 the minterm φ′ is minimized by the function
cubeEnlargement. The implementation of that function is
transparent to the algorithm. Basically, the function tries to
minimize an assignment such that it still holds for an interpolant. Concrete
realizations of this function are provided in Section 6.7.2.
Finally, a new minimized minterm φ′′ is returned. If B ∧ φ′′ is
satisfiable, then A ∧ B is satisfiable (Lines 10 and 11).
Otherwise, the minterm is enlarged further with respect to B (Line 12). The
resulting minterm φ′′′ is added to the interpolant Î (Line 13), and the
negation of the minterm is added to A′ (Line 14) to avoid recomputing the same
assignment.
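The enumeration loop of Pseudocode 1 can be sketched in Python as follows. This is a simplified illustration, not the implementation of the thesis: find_model is a brute-force stand-in for SAT? and model, the first cubeEnlargement (Line 9) is treated as the identity, and the second one (Line 12) is realized by a greedy literal-dropping heuristic named enlarge, which is an assumption of this sketch. Literals are signed integers and cubes are sets of literals.

```python
from itertools import product


def find_model(cnf, variables, cube=()):
    """Brute-force stand-in for SAT?/model: a model of cnf AND cube, or None."""
    variables = sorted(variables)
    for bits in product([False, True], repeat=len(variables)):
        phi = dict(zip(variables, bits))
        if all(phi[abs(l)] == (l > 0) for l in cube) and \
           all(any(phi[abs(l)] == (l > 0) for l in c) for c in cnf):
            return phi
    return None


def enlarge(cube, B, vars_b):
    """Greedily drop literals while cube AND B stays unsatisfiable."""
    cube = list(cube)
    for lit in list(cube):
        smaller = [l for l in cube if l != lit]
        if find_model(B, vars_b, smaller) is None:
            cube = smaller
    return frozenset(cube)


def model_based_itp(A, B):
    vars_a = {abs(l) for c in A for l in c}
    vars_b = {abs(l) for c in B for l in c}
    shared = vars_a & vars_b
    interpolant = []                       # DNF as a list of cubes (FALSE)
    a_prime = list(A)
    while True:
        phi = find_model(a_prime, vars_a)
        if phi is None:                    # no more models of A'
            return "UNSAT", interpolant
        # Projection onto the common variables of A and B.
        cube = frozenset(v if phi[v] else -v for v in shared)
        if find_model(B, vars_b, cube) is not None:
            return "SAT", cube             # A AND B is satisfiable
        cube = enlarge(cube, B, vars_b)
        interpolant.append(cube)
        a_prime.append(frozenset(-l for l in cube))   # block this cube


A = [frozenset({1}), frozenset({-1, 2})]          # x1 AND (x1 -> x2)
B = [frozenset({-2, 3}), frozenset({-3})]         # (x2 -> x3) AND NOT x3
result = model_based_itp(A, B)
sat_result = model_based_itp([frozenset({1})], [frozenset({1})])
```

For the first pair the loop finds a single cube over the shared variable x2, so the interpolant is the DNF consisting of the minterm x2; for the second pair A ∧ B is satisfiable and the shared cube is reported instead.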
Brief Comparison The proof-based systems are based on resolution
proofs whereas the model-based systems are based on model enumeration.
Since the resolution proof might be very large, a proof-based system
may require gigabytes of memory. In contrast, the model-based
system does not require the resolution proof but may enumerate a huge number
of assignments. In Figure 2.1 the main difference is illustrated. While the
[Figure 2.1: Types of Interpolation Systems. (a) Proof-based: a single interpolant Î between A and B. (b) Model-based: partial interpolants Î1, . . . , Î5 between A and B.]
proof-based approach computes the interpolant at once along the resolution
proof, the model-based approach iteratively constructs the interpolant by
computing minterms.
Both systems are implemented and run concurrently to get the best of
both. Details about the implementation are presented later in this thesis in
Section 6.7.2.
In the following, the function itp(A, B) denotes the interpolant of the
formula pair (A, B) independent of the respective realization.
2.3 Circuits and Automata
Digital circuits can be represented in different ways. In this thesis, circuits
are assumed to be given in a graph-based representation, introduced
more formally in the next section. In formal verification, an automaton-based
representation is commonly used, which is also introduced in the following.
Each graph-based representation of a circuit can easily be translated into
an automaton-based representation. Thus, an automaton-based
representation is used where necessary.
2.3.1 Digital Circuits
Deﬁnition 2.16 (Circuit). A sequential circuit C = (V,E) is a Directed
Acyclic Graph (DAG) consisting of a ﬁnite set of nodes V = {v1, . . . , vn},
[Figure 2.2: Typical gates (AND, OR, NOT, XOR)]
and a finite set of edges E = {e1, . . . , em}. Nodes correspond to components
and edges correspond to signal connections between the components. The
terms node and component are used interchangeably. The size of a circuit
is given by the number of components, i.e., |V|, also written as |C|.
A component can be:
• a primary input PI, a primary output PO, or a state element FF,
• a commonly used primitive gate, examples of which are shown in Figure 2.2,
• a more complex module, for example a multiplier MULT or an Arithmetic
Logic Unit (ALU). A complex module might also relate to a
statement of a Hardware Description Language such as VHDL [Ash01]
or Verilog [TM96].
All available types of components are summarized in the set LIB.
Additional information is associated with nodes and edges, expressed by
mappings of the following form:
• type : V → LIB maps each node to a type,
• ord : E → N maps each edge to a natural number,
• bv : V ∪ E → N maps an edge or node to its bit-width in terms of a
natural number, also written as |g| for g ∈ V. Usually, the output
of a primitive gate has a bit-width of 1. More complex modules
may have a higher bit-width.
The set X ⊆ V denotes the set of primary inputs PI, i.e., components
with no incoming edges. The set Y ⊆ V denotes the set of primary
outputs PO, i.e., components with no outgoing edges. The set S ⊆ V contains
the state elements FF. A sequential circuit contains state elements, which
may have different values at different times, controlled by a
clock; each clock step is called a time frame.
To address the values at a certain time frame, the following
notation is used: an additional argument of the sets denotes the time
frame, e.g., X(t) addresses the primary inputs at time frame t. Analogously,
this notation is used for primary outputs and state elements.
A bit-level circuit exclusively consists of primitive gates with single-bit
outputs.
Definition 2.17. A combinational circuit is a digital circuit without state
elements, i.e., S = ∅.
2.3.2 And-Inverter-Graphs – AIGs
And-Inverter-Graphs (AIGs) [Hel63, DJBT81] are a subset of sequential
circuits restricted to a certain set of gates. An AIG consists only of
AND-gates, state elements (FF), and special annotations on signals that
indicate whether the value is inverted or not. AIGs are very simple to handle
and can be compactly represented using dedicated data structures. Each circuit
can be efficiently translated into an AIG, i.e., converting a circuit can be done
in linear time and space with respect to the number of components of the
original circuit [Tse68].
AIGs are commonly used in formal hardware verification tools. Various
optimization techniques are available to reduce the size of AIGs, e.g.,
[ZKKSV06, EMS07]. The translation of an AIG into a CNF, which is typically
required to reason about a circuit, can also be done very efficiently.
Before translation, the AIG is unrolled for a desired number of time frames
and afterwards converted into CNF.
Various software packages are available to represent and manipulate
AIGs, e.g., Aiger [Bie12] and the ABC verification tool [Gro12].
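The translation of a single AND node into CNF can be illustrated with the standard three-clause encoding of g ↔ (a ∧ b). The sketch below is an illustration, not code from Aiger or ABC: the function name and_gate_clauses and the signed-integer literal encoding are assumptions, where a negative literal models an inverted signal.

```python
from itertools import product


def and_gate_clauses(g, a, b):
    """Clauses asserting g <-> (a AND b); a and b may be inverted (negative)."""
    return [
        frozenset({-g, a}),        # g implies a
        frozenset({-g, b}),        # g implies b
        frozenset({g, -a, -b}),    # a and b together imply g
    ]


def lit_value(lit, phi):
    return phi[abs(lit)] == (lit > 0)


# AND node 3 with inputs x1 and the inverted signal NOT x2.
clauses = and_gate_clauses(3, 1, -2)

# The clauses hold exactly for assignments with x3 == (x1 AND NOT x2).
consistent = all(
    all(any(lit_value(l, phi) for l in c) for c in clauses)
    == (phi[3] == (phi[1] and not phi[2]))
    for phi in ({1: a, 2: b, 3: g}
                for a, b, g in product([False, True], repeat=3))
)
```

Since every AND node contributes a constant number of clauses, the size of the resulting CNF is linear in the number of nodes of the unrolled AIG.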
2.3.3 Finite State Machine
Besides the DAG-based representation of sequential circuits, the functional
behavior can be represented by a Finite State Machine (FSM) as well. This
kind of representation is often used in formal verification, since
reachability analysis is commonly performed on automata. FSMs can be derived
from sequential circuits by considering the fan-in and fan-out cones of the
state elements, which correspond to the transition functions. Intuitively, an
FSM reflects the behavior over the time
frames in terms of state-to-state relations based on the stimuli of the inputs
and produces output values for each transition.
Essentially, reachability information can be computed on this data
structure in order to conclude whether a circuit may reach unwanted behavior
in terms of a bad state. In the following, an FSM is formally
introduced.
Deﬁnition 2.18. A Finite State Machine (FSM) of a sequential circuit
is a tuple M = (I, S, T ). The set I ⊆ S describes the set of initial states.
[Figure 2.3: Relation of sets of states S, Ŝ, S∗, and Š]
The state space S is given by S = B^n where n is the number of state bits of
the circuit C. The transition relation T(s, s′) is TRUE if there is a transition
from state s to state s′ [CGP01].
Operators to compute reachability information are defined as well.
Definition 2.19. The set img(Q) = {s′ ∈ S | ∃s ∈ Q : T(s, s′)} contains
all successor states reachable in one step from states of the set Q ⊆ S based
on the transition relation T. A path is a sequence of states s0 → . . . → sn
such that for 0 ≤ i < n the transition relation T(si, si+1) is TRUE. The
length of a path is the number of transitions from the start state to the
end state. Let img^0(Q) = Q and img^(i+1)(Q) = img(img^i(Q)). All states
reachable from the initial states I in any number of steps are given by
S∗ = ⋃_{i≥0} img^i(I). States in the set S∗ are called reachable states and
img is called the exact image operator. Let îmg be an over-approximate
image operator with the property img(Q) ⊆ îmg(Q). Therefore, it
holds that S∗ ⊆ Ŝ with Ŝ = ⋃_{i≥0} îmg^i(I).
The set Ŝ contains at least all reachable states, i.e., ∀s ∈ S∗ : s ∈ Ŝ,
and might also contain states that are not reachable from the initial
states, i.e., possibly ∃s ∈ Ŝ : s ∉ S∗. Sets of states are also often described
by a Boolean formula whose models correspond to states. Those formulas
can often be represented very compactly. In the following, sets of states and
Boolean formulas over states are used interchangeably. The relation of
the sets of states is illustrated in Figure 2.3.
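On an explicitly enumerated FSM, the exact image operator and the fixed-point computation of S∗ can be sketched as follows. The explicit set representation and the small example machine are assumptions for illustration; practical tools represent state sets symbolically, e.g., by BDDs or Boolean formulas.

```python
def img(states, transitions):
    """Exact image: all successors of `states` under the transition relation."""
    return {t for (s, t) in transitions if s in states}


def reachable(initial, transitions):
    """Compute S* as the union of img^i(I) until no new state appears."""
    reached = set(initial)
    frontier = set(initial)
    while frontier:
        frontier = img(frontier, transitions) - reached
        reached |= frontier
    return reached


# Hypothetical 4-state machine: 0 -> 1 -> 2 -> 0, and 3 -> 0.
T = {(0, 1), (1, 2), (2, 0), (3, 0)}
I = {0}
S_star = reachable(I, T)
S_full = {0, 1, 2, 3}          # trivial over-approximation of S*
```

State 3 has an outgoing but no incoming transition from a reachable state, so it belongs to the trivial over-approximation S but not to S∗.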
An approximation can be described manually by a designer or can
be generated by various available tools, e.g., based on BDDs³. A trivial
over-approximation is the complete state space, i.e., S, and a trivial
under-approximation is given by Š(l) = img^l(I), i.e., all states that are
reachable in l steps from the initial states.
³ Available at http://vlsi.colorado.edu/~fabio/CUDD/
Deﬁnition 2.20. The diameter of an FSM M denoted by dia(M) is the
length of the longest path in the set of shortest paths between pairs of states
in S∗. The reachability diameter of M denoted by rd(M) is the length of
the longest path that starts from the initial state.
Definition 2.21. A scenario is a tuple τ = ((X0, . . . , Xn), S0) where Xi
(0 ≤ i ≤ n) is an assignment to the primary inputs at time frame i and S0
is an initial assignment to the state elements for the first time frame.
2.4 Functional Veriﬁcation
Functional verification is an important step during the design process that
checks whether the Circuit Under Verification (CUV) fulfills the desired
specification.
Model Checking (MC) [CGP01] is a major part of functional verification
that formally verifies whether a circuit completely fulfills the specification.
The specification is given as a set of temporal properties, expressed for
example in Computation Tree Logic (CTL) or Linear Temporal Logic (LTL)
[Eme95, Pnu77]. The circuits are usually provided as an FSM. Various
algorithms have been developed to perform model checking on FSMs, e.g.,
Symbolic Model Checking [CMCHG96], which performs a fixed-point computation
to check whether the circuit's states intersect states that are not covered
by the specification.
In contrast to MC, Equivalence Checking (EC) is a further technique to ensure
correctness during the design process. In EC, a golden model (specification)
is checked against an implementation, which is performed for each level of
the design process.
In order to verify that a CUV completely fulfills a desired specification, all
possible computation scenarios need to be checked against the specification.
Formally verifying a design thus inherently requires considering all possible
scenarios to provide a complete analysis.
The techniques behind formal verification mainly rely on powerful
reasoning engines, for example BDDs or SAT solvers. Since those engines
fully analyze the entire search space, a complete answer is provided that
strengthens the design process. If a CUV violates the specification, i.e., the
CUV does not behave as desired, techniques based on formal verification
will uncover this misbehavior.
The ﬁeld of formal hardware veriﬁcation is an intensive research area.
Various veriﬁcation techniques are successfully applied in complex hardware
veriﬁcation in industry [Kai11]. The work of [KGN+09] describes how a
complex processor design consisting of several million components has been
formally veriﬁed.
In contrast to formal verification, simulation-based functional verification
only partially covers the possible scenarios and may therefore miss corner
cases. Simulation or constrained-random simulation is still commonly applied
in functional verification to narrow the gap between manufacturing and
verification. For several years, the ability to manufacture a design has been
significantly higher than the ability to formally verify a design.
2.4.1 Bounded Model Checking (BMC)
Bounded Model Checking (BMC) [BCCZ99] is a model checking technique
based on Boolean satisfiability to show or to refute that a design fulfills
certain properties. With BMC, large processor designs have
been formally verified [VLP+05]. As a result, formal verification has become
more and more accepted within design groups despite its high computational
complexity [CKOS04].
BMC is able to prove or disprove a temporal property provided in LTL
with respect to a design. In particular, if a bug has been found, a trace
or stimuli, called a counterexample, is provided in order to understand the
misbehavior of the design by simulating the trace. Basically, the technique
behind BMC is to iteratively check whether a property holds during a
certain number of time frames. By increasing this number of time frames,
eventually 1) a counterexample is found which violates the property, or
2) sufficiently many time frames have been considered and therefore the design
fulfills the property. Consequently, if there is behavior of the circuit that
does not adhere to the property, BMC will find this bug. Otherwise, i.e., if the
circuit completely fulfills the property, BMC provides a proof. Since
BMC relies on formal reasoning engines such as SAT solvers, a complete
analysis is guaranteed.
BMC is explained in more detail in the following since
parts of the thesis are related to BMC. More insights are presented in the
original work [BCCZ99].
Problem Formulation
BMC inherently involves translating a certain number of time frames and
a negated property into a CNF. The CNF is then checked for satisfiability.
This step is described below:
Given an FSM M = (I, S, T) of a sequential circuit and a predicate P
describing the property (specification). The following formula forms the
basis for all BMC instances that are checked by a SAT solver:
$$\mathrm{BMC}(l) = I(s_0) \wedge \bigwedge_{0 \le i < l} T(s_i, s_{i+1}) \wedge P(s_l) \tag{2.2}$$
If BMC(l) is satisfiable for some l, then there exists a path s0 → s1 → . . . → sl
such that I(s0) holds, T(si, si+1) holds for 0 ≤ i < l, and P(sl) holds. That
means there is a path in the system that matches undesired behavior.
Consequently, the design does not fulfill the property. In contrast, if the
formula is unsatisfiable, the design fulfills the property for all paths of
length l.
The BMC instances are iteratively checked starting from l = 0 up to
the completeness threshold (CT) denoted by lcmpl. Once a BMC instance
becomes satisfiable for some l ∈ [0, lcmpl], the property is violated and a
counterexample is extracted from the satisfying assignment. Otherwise, if
the BMC instance is unsatisfiable for all those values of l, the property holds
on the circuit.
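The iterative scheme can be illustrated on an explicitly enumerated FSM. The sketch below is not a SAT-based implementation: instead of encoding BMC(l) as a CNF, it tracks the states reachable in exactly l steps together with one witness path, which answers the same question as the satisfiability of Formula 2.2. The function name bmc, the set bad (the states satisfying P), and the example machine are assumptions for illustration.

```python
def bmc(initial, transitions, bad, max_bound):
    """Check BMC(l) for l = 0..max_bound on an explicit-state FSM.

    Returns (l, path) for the smallest l with a path of length l from an
    initial state to a bad state, or None if no such path exists up to
    max_bound. None is only a proof if max_bound reaches the completeness
    threshold.
    """
    # paths[s] is one path of the current length from an initial state to s.
    paths = {s: [s] for s in initial}
    for l in range(max_bound + 1):
        for state, path in paths.items():
            if state in bad:
                return l, path               # counterexample of length l
        successors = {}
        for (s, t) in transitions:
            if s in paths and t not in successors:
                successors[t] = paths[s] + [t]
        paths = successors
    return None


# Hypothetical 4-state machine: 0 -> 1 -> 2 -> 0, and 3 -> 0.
T = [(0, 1), (1, 2), (2, 0), (3, 0)]
counterexample = bmc({0}, T, bad={2}, max_bound=5)
no_counterexample = bmc({0}, T, bad={3}, max_bound=5)
```

State 2 is reached after two transitions, so the first call reports a counterexample of length 2; state 3 is unreachable, so the second call finds nothing within the bound.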
Upper bounds on the completeness threshold can be determined, but they
are often large, and determining them is itself a nontrivial task [CKOS04].
Computing exact values in
reasonable run times for general LTL properties is still an open
problem. Linear bounds have been proven for a subset of LTL properties
[BCCZ99]. Various methods have been proposed in order to provide a
complete verification without computing the exact completeness threshold.
Due to these strong improvements [McM03, GS05, CLM89], LTL model
checking is very effective in practice despite its high complexity: the problem
has been shown to be PSPACE-complete [SC85].
2.4.2 Interpolation-Based Model Checking
BMC is sound and complete [BCCZ99] once the completeness threshold is
known. In industrial verification flows, it turned out that BMC is mostly
used as a bug finder by computing a counterexample. To prove that a circuit
fulfills a property, BMC needs to reach the completeness threshold, which is in
general unknown, making BMC impractical for this purpose as discussed above.
Moreover, the completeness threshold often cannot be reached in practice even
for moderate-sized circuits, since the complexity increases exponentially,
which results in huge search spaces.
Craig interpolants [Cra57] are exploited in the work of McMillan [McM03]
to provide a completeness argument without knowing the completeness
threshold. Interpolants are used in order to over-approximate the image of
the transition relation. Those interpolants abstract facts that are irrelevant
to prove the property making BMC with interpolation very eﬀective. Once
a ﬁxed-point of the over-approximate image has been found the circuit is
proven to be correct with respect to the property since all necessary states
have been discovered. This approach is called Interpolation-based Model
Checking (IMC) in the following.
Problem Formulation
Reconsider the BMC Formula 2.2 partitioned into two parts (A, B):
$$A := I(s_0) \wedge T(s_0, s_1) \quad\text{and}\quad B := \bigwedge_{1 \le i < l} T(s_i, s_{i+1}) \wedge P(s_l)$$
Formula A contains the initial predicate and the ﬁrst transition relation
while the formula B contains the remaining transitions and the predicate
of the property. The common variables of A and B are the state variables
corresponding to s1.
Suppose the formula A ∧ B is unsatisfiable for a certain l. That means
there is no path of length l that violates the property, and an interpolant
Î = itp(A, B) can be computed. The interpolant Î over-approximates the image
of the first transition relation, since A ⟹ Î and Var(Î) ⊆ Var(A) ∩ Var(B).
Since B ∧ Î is unsatisfiable by construction of Î, no path of length l − 1
that starts in a state of the over-approximation Î can reach a state of the
property. This construction is used to compute an over-approximation of the
exact set of reachable states by iteratively computing interpolants until a
fixed-point is reached.
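The two semantic conditions on an interpolant stated above (A implies Î, and
Î ∧ B is unsatisfiable) can be checked by brute force on a small example.
The following is only a sketch: the formulas A, B, and the candidate
interpolant are hypothetical and not taken from the thesis, and the
syntactic condition Var(Î) ⊆ Var(A) ∩ Var(B) is ensured by hand, not by the
code.

```python
from itertools import product

def models(formula, variables):
    """Yield every assignment (as a dict) over `variables` that satisfies `formula`."""
    for values in product([False, True], repeat=len(variables)):
        env = dict(zip(variables, values))
        if formula(env):
            yield env

def is_interpolant(a, b, itp, all_vars):
    """Brute-force check: A implies itp, and itp AND B is unsatisfiable."""
    a_implies_itp = all(itp(env) for env in models(a, all_vars))
    itp_and_b_unsat = not any(b(env) for env in models(itp, all_vars))
    return a_implies_itp and itp_and_b_unsat

# Hypothetical formulas: A = p AND q, B = (NOT q) AND r; shared variable: q.
A = lambda e: e["p"] and e["q"]
B = lambda e: (not e["q"]) and e["r"]
ITP = lambda e: e["q"]            # mentions only the shared variable q
ALL_VARS = ["p", "q", "r"]
```

The constant TRUE is implied by A as well, but it is not an interpolant
because it is consistent with B; an interpolant must be strong enough to
exclude B.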
The interpolant Î is added to formula A after renaming its variables from
s1 to s0:
A := (I(s0) ∨ Î(s0)) ∧ T(s0, s1)
The formula A ∧ B is checked for satisfiability again, and two cases may
occur:
• If the formula is unsatisfiable, a new interpolant is computed and
added to A. Eventually, the new interpolant implies the disjunction of
all previously computed interpolants; a fixed-point is reached and
it is proven that the circuit fulfills the property since no state of the
over-approximate state space reaches a state of the property.
• In contrast, if the formula is satisfiable, a possibly spurious path
has been found, since its first state may be a non-reachable state that
was introduced by the over-approximating interpolant. In that case l
is increased, all interpolants are discarded, and the entire procedure
restarts. Since formula B is stronger for a larger l, the new interpolants
may abstract fewer facts, which leads either to a non-spurious
counterexample or to a safe path.

 1  begin
 2    if I(s0) ∧ P(s0) then return TRUE;
 3    l = 1;
 4    while TRUE do
 5      φ = I(s0);
 6      while TRUE do
 7        A := φ ∧ T(s0, s1);
 8        B := ⋀_{1≤i<l} T(si, si+1) ∧ P(sl);
 9        if SAT?(A ∧ B) then
10          if φ = I(s0) then
11            return TRUE
12          l = l + 1;
13          break;
14        end
15        Î = itp(A, B);
16        if Î ⟹ φ then return FALSE;
17        φ = φ ∨ Î(s0);
18      end
19    end
20  end
Pseudocode 2: Interpolation-based model checking.

The entire procedure is shown in the following algorithm.
Algorithm 2:
Interpolation-based model checking
• Input: An FSM M = (I, T, S) and a property to verify P are given
as input.
• Output: The algorithm returns TRUE if a counterexample has
been found and FALSE if the circuit fulﬁlls the property.
• Description: The algorithm, shown in Pseudocode 2, runs
iteratively until either a fixed-point is reached or a counterexample is
found. Line 2 checks whether the initial state violates the property;
if so, TRUE is returned. Otherwise, the interpolation loop starts by
checking paths of length 1 (Line 3). The outer while-loop from Line 4
to Line 19 runs as long as neither a counterexample nor a fixed-point
is found. In each iteration, φ is reset to the initial states (Line 5),
so the first check of the inner loop asks whether a path that starts
from an initial state can reach a property state in l steps. This check
is the base case that yields a real, non-spurious counterexample. The
inner while-loop from Line 6 to Line 18 is the interpolation loop. At
first, the partitions A and B are created. If A ∧ B is satisfiable, it
is further checked whether the base case is considered. In this case a
real counterexample has been found and TRUE is returned. Otherwise, a
possibly spurious counterexample has been found. Thus, l is increased
and the outer loop restarts (Lines 12 and 13). In contrast, if A ∧ B is
unsatisfiable, an interpolant Î is computed (Line 15). Line 16 checks
whether a fixed-point is found, i.e., whether the new interpolant
implies φ, the disjunction of the initial states and all previously
computed interpolants. In that case, the circuit fulfills the property
and FALSE is returned. Otherwise, the new interpolant is added to φ and
the inner loop continues from Line 6. Assuming that the circuit
fulfills the property, a fixed-point is eventually found, and the
disjunction φ of all computed interpolants is an over-approximation of
the reachable states.
IMC is proven to be sound and complete in [McM03] and is further
improved in, e.g., [DPK08, VG09]. Furthermore, the algorithm always
terminates since either a counterexample is found or the completeness
threshold is reached. In practice, it has been shown to be very effective
on industrial benchmarks since it abstracts facts that are irrelevant for
the proof of a certain property, which keeps the problem instances manageable.
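The control flow of Pseudocode 2 can be tried out on a small explicit-state
model. The sketch below is an illustration under strong assumptions and not
the implementation used in this thesis: states are plain integers, the
transition relation is a dictionary that maps every state to a set of
successors, and the exact image of φ is used as a stand-in for the interpolant
that a SAT solver would derive from a refutation (the exact image is the
strongest valid interpolant). All function names are hypothetical.

```python
def image(states, trans):
    """Exact successors of a set of states."""
    return {t for s in states for t in trans[s]}

def pre_exact(bad, trans, steps):
    """States from which a bad state is reached in exactly `steps` transitions."""
    current = set(bad)
    for _ in range(steps):
        current = {s for s, succs in trans.items() if succs & current}
    return current

def imc(init, trans, bad):
    """Return True if a counterexample exists, False if the property holds."""
    if init & bad:                                   # Line 2
        return True
    l = 1                                            # Line 3
    while True:                                      # Line 4
        phi = set(init)                              # Line 5
        b_states = pre_exact(bad, trans, l - 1)      # s1 values satisfying B
        while True:                                  # Line 6
            img = image(phi, trans)                  # s1 values satisfying A
            if img & b_states:                       # Line 9: A AND B is SAT
                if phi == init:                      # Line 10: base case
                    return True
                l += 1                               # Lines 12 and 13
                break
            itp = img                                # Line 15: stand-in interpolant
            if itp <= phi:                           # Line 16: fixed-point
                return False
            phi |= itp                               # Line 17

# Hypothetical models: state 3 is the property (bad) state.
COUNTER = {0: {1}, 1: {2}, 2: {0}, 3: {3}}   # 3 is unreachable from 0
CHAIN = {0: {1}, 1: {2}, 2: {3}, 3: {3}}     # 3 is reachable in three steps
```

With the exact image no spurious path can occur, so the restart with a
larger l only serves to reach the base case; with a weaker interpolant the
same loop would also have to filter out spurious paths.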
2.4.3 Boolean Reasoning of Digital Circuits
In order to formally reason about digital circuits using a SAT solver, a
CNF has to be generated; this is required for, e.g., BMC and IMC. A digital
circuit can be converted into a CNF by, e.g., Tseitin encoding [Tse68] or
Plaisted-Greenbaum encoding [PG86]. Both techniques translate a digital
circuit into a CNF in linear time and space with respect to the size of
the circuit. The algorithms proposed in this thesis exclusively use the
common Tseitin encoding. A more detailed discussion and comparison of the
mentioned techniques is presented in [JBH12].
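As an illustration of the Tseitin encoding, the following sketch emits the
standard clauses for AND, OR, and NOT gates and combines them for a small
circuit. The circuit, the variable numbering, and the helper names are
assumptions made for this example; literals follow the common convention of
non-zero integers with negation expressed by the sign.

```python
from itertools import product

def tseitin_and(a, b, out):
    """Clauses for out <-> (a AND b)."""
    return [[-a, -b, out], [a, -out], [b, -out]]

def tseitin_or(a, b, out):
    """Clauses for out <-> (a OR b)."""
    return [[a, b, -out], [-a, out], [-b, out]]

def tseitin_not(a, out):
    """Clauses for out <-> (NOT a)."""
    return [[-a, -out], [a, out]]

def satisfies(clauses, assignment):
    """`assignment` maps a variable index to a Boolean value."""
    return all(any(assignment[abs(lit)] == (lit > 0) for lit in clause)
               for clause in clauses)

# Hypothetical circuit y = (x1 AND x2) OR (NOT x3) with variables
# x1=1, x2=2, x3=3, internal signals g1=4, g2=5, and output y=6.
CNF = tseitin_and(1, 2, 4) + tseitin_not(3, 5) + tseitin_or(4, 5, 6)

def consistent_outputs(x1, x2, x3):
    """All values of (g1, g2, y) that satisfy CNF for fixed input values."""
    return [(g1, g2, y) for g1, g2, y in product([False, True], repeat=3)
            if satisfies(CNF, {1: x1, 2: x2, 3: x3, 4: g1, 5: g2, 6: y})]
```

Each gate contributes a constant number of clauses, which is the reason for
the linear size of the resulting CNF; for fixed inputs the clauses leave
exactly one consistent value for every internal signal.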
2.5 Automatic Test Pattern Generation
Automatic Test Pattern Generation (ATPG) generates a set of input stimuli
for a circuit according to a fault model in order to test the manufactured
circuit; this is also called post-production test. The test ensures that
the chip is correctly fabricated with respect to the underlying fault
model, so that a defective chip is not delivered to a customer. That means
each fabricated chip is tested with the generated test patterns. Therefore,
ATPG comes with special requirements. On the one hand, only as many test
patterns as necessary should be generated to keep the test time low; on the
other hand, the coverage of the test patterns should be as high as possible
to detect most of the defective chips.
ATPG has been proven to be NP-complete [IS75] and finds application
in verification as well, e.g., [BRTF99, AV02].
In practice, ATPG is often performed on combinational circuits instead
of sequential circuits to keep the complexity low. For this purpose, a
sequential circuit is first converted into a combinational circuit by
replacing the state elements by pseudo primary inputs and pseudo primary
outputs. To test a sequential circuit based on combinational test patterns,
scan chains are inserted into the circuit to justify arbitrary values at
the state elements [WA73].
The test patterns are generated based on a commonly used fault model, the
Stuck-At Fault Model (SAFM) [BF76]. In SAFM, a signal is constantly set
to a Boolean value. SAFM covers various physical effects; the fault model
is relatively simple but very effective in practice. There are two kinds
of stuck-at faults, denoted by g1@sa0 and g1@sa1, i.e., signal g1 is
constantly set to value 0 and 1, respectively. For each fault, ATPG
generates a test pattern that is applied to the fabricated circuit to
detect faults possibly introduced during the manufacturing process.
To test all signals of a circuit against the SAFM, a fault list is created.
This list contains two stuck-at faults for each signal of the circuit, i.e.,
F = {g1@sa0, g1@sa1, . . . , gn@sa0, gn@sa1} where n is the number of
signals of the circuit.
For each item of the fault list, a test pattern is generated using ATPG.
The complexity of ATPG is high since an NP-complete problem needs to be
solved for each entry.
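The construction of the fault list can be sketched in a few lines. The
signal names and the helper functions below are hypothetical and only
mirror the notation g@sa0 and g@sa1 used above.

```python
def stuck_at_fault_list(signals):
    """Return [g1@sa0, g1@sa1, ..., gn@sa0, gn@sa1] as (signal, value) pairs."""
    return [(g, v) for g in signals for v in (0, 1)]

def fault_name(fault):
    """Format a (signal, value) pair in the g@sa0 / g@sa1 notation."""
    signal, value = fault
    return f"{signal}@sa{value}"

faults = stuck_at_fault_list(["g1", "g2", "g3"])   # hypothetical signal names
```

The list has 2n entries for n signals, so the number of ATPG calls grows
linearly with the circuit size before any fault list reduction is applied.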
[Figure 2.4: A schematic view of a circuit with a checker circuitry (logic cloud, checker circuitry, fault signal F).]
Optimization techniques have been proposed to reduce the size of the fault
list and therefore the number of ATPG calls. For this purpose, logical
implications and equivalences are applied. Moreover, after a test pattern
has been obtained, fault simulation checks whether further faults are
detected by the same pattern [JG03], which reduces the overall run time of
ATPG.
Several ATPG algorithms have been proposed. A powerful ATPG engine
based on Boolean satisfiability (SAT-ATPG) was published in [Lar92]
and substantially improved in [DEFT09, CPL+09, ED11]. Basically,
a SAT problem is formulated for each item of the fault list. That means a
CNF of the decision problem whether a test pattern exists is created and
solved by a SAT solver, and the result is interpreted in the ATPG domain.
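The decision problem behind test pattern generation can be illustrated on a
tiny circuit. In the sketch below an exhaustive search over all input
patterns stands in for the SAT solver call; the circuit, its signal names,
and the function names are assumptions made for this example only.

```python
from itertools import product

def circuit(x1, x2, x3, fault=None):
    """Hypothetical circuit y = (x1 AND x2) OR x3 with an optional stuck-at fault.

    `fault` is a (signal, value) pair; signals are "x1", "x2", "x3", "g1", "y".
    """
    def sig(name, value):
        # A stuck-at fault overrides the fault-free value of the signal.
        return fault[1] if fault and fault[0] == name else value

    x1, x2, x3 = sig("x1", x1), sig("x2", x2), sig("x3", x3)
    g1 = sig("g1", x1 & x2)
    return sig("y", g1 | x3)

def find_test_pattern(fault):
    """Return an input on which faulty and fault-free outputs differ, or None."""
    for pattern in product((0, 1), repeat=3):
        if circuit(*pattern) != circuit(*pattern, fault=fault):
            return pattern
    return None

def detects(pattern, fault):
    """Fault simulation: does `pattern` also detect `fault`?"""
    return circuit(*pattern) != circuit(*pattern, fault=fault)
```

The function detects corresponds to the fault simulation step mentioned
above: a pattern found for one fault is reused to drop further faults from
the fault list.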
2.6 Fault Tolerance Circuits
This thesis deals with analyzing fault-tolerant circuits. The approaches
are independent of the concrete hardening technique that is implemented.
Nevertheless, two kinds of hardening techniques are revisited in this
section as examples. These techniques are also used in the experiments of
this thesis.
2.6.1 Checker Circuitry
A checker circuitry is a part of a circuit that checks whether the computation
is correct. This circuitry might be a parity checker, a Hamming code, or a
similar technique. Figure 2.4 shows a schematic view of a circuit containing
a checker circuitry.
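As a minimal sketch of such a checker, the following example models an
even-parity bit and a checker that raises the fault signal when the stored
data no longer matches its parity. The data word and the function names are
hypothetical.

```python
def parity(bits):
    """Even-parity bit: 1 if the number of ones in `bits` is odd."""
    p = 0
    for b in bits:
        p ^= b
    return p

def checker(data_bits, stored_parity):
    """Fault signal F: 1 if the data no longer matches its parity bit."""
    return parity(data_bits) ^ stored_parity

word = [1, 0, 1, 1]
p = parity(word)                  # computed when the word is written
single_flip = [1, 0, 0, 1]        # one bit of `word` flipped
double_flip = [0, 0, 0, 1]        # two bits of `word` flipped
```

A single parity bit reports any odd number of bit flips but misses an even
number of flips, which is one reason why the implemented hardening technique
has to be analyzed.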
The checker circuitry reports a detected fault via a fault signal denoted
by F. The system that integrates the circuit can take some action once a
fault is reported. For example, a reset sequence can be applied to get the
circuit into a consistent state. Those mechanisms are not further
investigated in this thesis.
[Figure 2.5: A schematic view of a system-level TMR implementation (Module 1, Module 2, Module 3, majority voter, fault signal F).]
2.6.2 Triple Modular Redundancy - TMR
Triple Modular Redundancy (TMR) is a commonly applied technique. The idea
is to triplicate the circuit and to add a majority voter. A schematic
overview of a system-level implementation is shown in Figure 2.5.
The voter logic ensures that only correct values are propagated to the
outputs, assuming single faults.
Provided that TMR is implemented correctly, single transient faults are
corrected. That means almost 100% of single transient faults are caught.
Optionally, a TMR implementation also reports a corrected fault via the
fault signal F.
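The voter can be sketched as a bitwise majority function. The optional fault
signal is modeled here as "at least one module disagrees"; this is an
assumption for illustration, and the function names are hypothetical.

```python
def majority(a, b, c):
    """Bitwise majority of three module outputs."""
    return (a & b) | (a & c) | (b & c)

def tmr(a, b, c):
    """Return (voted output, fault signal F); F is 1 if any module disagrees."""
    voted = majority(a, b, c)
    fault = int(not (a == b == c))
    return voted, fault
```

If exactly one module delivers a wrong value, the two remaining modules
outvote it and the output stays correct while F reports the correction.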
In contrast to a system-level implementation, an FF-based implementation
applies the triplication to each flip-flop separately by triplicating the
combinational logic at the input of the flip-flop. This step is applied for
each flip-flop.

Chapter 3
Fault Model
A fabricated chip is delivered to the customer once the post-production test
did not detect any misbehavior caused during the manufacturing process.
However, after producing the chip various eﬀects may inﬂuence the correct
computation during operation. Recently, coping with soft errors became
a major challenge for future technology scaling [Bor07, BBL+12]. Due
to the increasing integration density the vulnerability of today’s circuits
against transient faults is signiﬁcantly increased compared to older circuit
generations. A transient fault temporarily modiﬁes the functional behavior
of a circuit [ALRL04]. Established techniques are available to catch and
handle those faults in order to keep the circuit working properly. However,
the correct implementation of those techniques needs to be veriﬁed. In
order to verify certain behavior of a circuit in the presence of transient
faults these eﬀects need to be properly deﬁned.
This chapter discusses the effects of transient faults modeled at the logic
level of a circuit and proposes a categorization of the impact of those
faults on the circuit's components. A fault model at logic level is
introduced to abstract the electrical effects caused by transient faults.
Furthermore, the effects of transient faults at Boolean level are lifted to
word level. That means, besides single transient faults, certain multiple
transient faults are covered as well. Such multiple transient faults are
expected to occur more often in advanced technology generations [MZM10,
MR08, Nic11].
3.1 Transient Faults
A transient fault may temporarily modify a transistor's state in a digital
circuit but does not damage the hardware physically; a soft error occurs.
These faults are typically caused by the environment. High-energy neutrons
and α-particles may cause a pulse of very short duration. This may
cause a bit flip at internal signals of a circuit, i.e., a logical value is
flipped from 0 to 1 or from 1 to 0. Due to the affected logical value,
the circuit's function is temporarily changed. Over time, the fault effect
might become observable at the outputs of the circuit. Consequently, the
circuit does not operate as specified, which may have dramatic consequences
in safety-critical systems.
Transient faults are divided into two groups: Single Event Upsets (SEU)
and Single Event Transients (SET) [Nic11]:
• An SEU causes bit flips directly in the state elements, i.e., the stored
value might be flipped.
• An SET causes bit flips in combinational logic, which might be propagated
to the state elements if the fault arrives at a state element while
it captures its value. Three masking effects may prevent a transient
fault from being propagated to the state elements:
1. Timing masking: the fault arrives at the inputs of the state
elements too early or too late to be captured,
2. Electrical masking: the amplitude is too small to be captured by
the state elements,
3. Logical masking: the path is not appropriately sensitized such
that the fault does not arrive at the state elements.
For a long time, hardening techniques were only applied to state elements
since SEUs were more likely to occur than SETs due to various factors such
as feature sizes, frequency, and voltage. However, the probability that an
SET occurs increases, e.g., due to increasing operating frequencies, which
increase the probability that a faulty value arrives at the state elements.
Both categories, SEU and SET, are considered in this work under the
term transient faults. If a separate analysis of SETs and SEUs is desired,
this can be easily configured by restricting the analysis to the
respective components. Whether faults are analyzed in state elements or in
combinational logic is transparent to the techniques and models introduced
in this work. Further, this thesis focuses on logical masking. Timing
masking and electrical masking are considered in, e.g., [MZM10], at the
cost of a significantly increased model complexity.
3.1.1 Transient Faults
The eﬀect of a transient fault in a component’s logic is modeled as non-
deterministic value at a component’s output.
[Figure 3.1: Component model: transient faults at different levels of abstraction. (a) Transient fault 1 → 0 at an OR gate. (b) Transient fault 6 → 1 at a multiplier MULT with two n-bit inputs (2 and 3) and a 2n-bit output.]
Definition 3.1. A Complex Transient Fault (CTF) is written as follows:
b → b′ with b ≠ b′ and b, b′ ∈ N. That means, the value b is modified to
value b′.
Definition 3.2. A Simple Transient Fault (STF) is written as follows:
b → b′ with b ≠ b′ and b, b′ ∈ B. That means, a bit flip is described by
0 → 1 or 1 → 0. An STF is a special case of a CTF.
Usually, the effects of an STF are single bit flips, which are also commonly
used in related work. In this work, a more general fault model is used that
covers a broad range of transient faults by allowing more bits to be flipped
according to Definition 3.1.
Example 3.1. Figure 3.1(a) shows an OR-gate g with single-bit inputs and
a single-bit output. The STF 1 → 0 at component g affects the
output, i.e., the output is inverted from 1 to 0.
In contrast, Figure 3.1(b) shows a multiplier MULT with two n-bit inputs
and one 2n-bit output. The CTF 6 → 1 at the multiplier affects the
output, i.e., the correct result of the multiplication 2 · 3 = 6 is
modified to the value 1.
The space of all CTFs at a certain component is deﬁned as follows:
Definition 3.3. Given a component g ∈ V of a circuit. All possible CTFs
of component g are given by:
\[ F(g) = \bigcup_{\substack{b, b' \in \{0, \ldots, 2^{bv(g)} - 1\} \\ b \neq b'}} (b \to b', g) \]
where bv(g) specifies the bit-width of component g.
Example 3.2. All CTFs of the OR-gate from Example 3.1 are given by
F(OR) = {(0 → 1, OR), (1 → 0, OR)}, and all CTFs of the multiplier MULT
with its 2n-bit output are given by
\[ F(\mathrm{MULT}) = \bigcup_{\substack{b, b' \in \{0, \ldots, 2^{2n} - 1\} \\ b \neq b'}} (b \to b', \mathrm{MULT}) \]
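The space of CTFs from Definition 3.3 can be enumerated directly for small
bit-widths. The sketch below represents a CTF (b → b′, g) as a Python tuple
(b, b′, g); the function name and the choice n = 2 for the multiplier are
assumptions for illustration.

```python
def ctf_space(component, bit_width):
    """All CTFs (b -> b', g) of a component whose output has `bit_width` bits."""
    values = range(2 ** bit_width)
    return [(b, b_prime, component)
            for b in values for b_prime in values if b != b_prime]

or_faults = ctf_space("OR", 1)        # single-bit output
mult_faults = ctf_space("MULT", 4)    # assumes n = 2, i.e., a 4-bit product
```

For a bit-width w the space contains 2^w · (2^w − 1) CTFs, so explicit
enumeration is only feasible for very small components.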
[Figure 3.2: CTFs and multiple STFs. (a) Transient fault 6 → 1 at the high-level component MULT. (b) Multiple STFs at gate level with the same effect 6 → 1.]
In related work, different probabilities for the occurrence of a transient
fault are taken into account, e.g., [HPB07, CRP+96, KMH05]. In this thesis,
however, all transient faults are equally distributed over all components,
yielding a conservative fault model.
In the following, it is supposed that all CTFs of a component may occur
non-deterministically. That means a non-deterministic value is assumed
according to the space of CTFs of the respective component; the assumed
value is independent of the component's function.
Definition 3.4. Given is a circuit C = (V,E). The set of all CTFs of the
circuit is given by: F(C) = ⋃_{g∈V} (g, F(g)).
The number of CTFs is denoted by |F(C)|.
3.1.2 Component Model and Multiple Transient Faults
The space of transient faults depends on the abstraction level of the CUV.
As introduced in Section 2.3.1, a circuit is composed of components,
where a component is either a more complex module or a primitive gate. For
each primitive gate, there are two possible STFs, 0 → 1 and 1 → 0, as
introduced in Definition 3.2.
Consider Figure 3.2, which illustrates the MULT module as a high-level
component (Figure 3.2(a)) and the corresponding gate-level components
(Figure 3.2(b)). Multiple STFs at gate level are modeled by a single CTF
at the higher level. This modeling allows local multiple transient faults
to be covered by a single CTF. Hence, by assembling a certain set of gates
into a component, an analysis of arbitrary multiple STFs within that set
can be performed. As a side effect, the complexity is reduced.
[Figure 3.3: Modulo-3 counter with impact of an STF. States 0 to 7 are shown together with their bit encodings 000 to 111.]
There are related works that focus on analyzing multiple transient
faults, e.g., [MZM10, FFSD09]. While [MZM10] analyzes (multiple) transient
faults at the electrical level, which is typically more complex than at the
logical level, [FFSD09] analyzes transient faults at the logical level. The
work of [FFSD09] is based on the author's Diploma thesis, which analyzes
multiple STFs of a digital circuit. That work considers multiple STFs over
all components, which results in a huge search space. For example, for a
circuit that consists of only 275 primitive gates, the number of multiple
STFs is approximately 6.09 × 10^10 = 60 900 000 000 when considering up to
4 simultaneously occurring transient faults. Despite this huge number, up
to 46% of the faults were completely classified within almost one hour.
However, with an increasing number of gates the problem becomes
unmanageable, and therefore the search space needs to be simplified. Due to
the component model of this thesis, local multiple STFs are covered by a
single CTF.
Example 3.3. Figure 3.3 shows an FSM of a modulo-3 counter. A state is
marked by a rounded rectangle; the upper part shows the state value as
represented by a high-level implementation, and the lower part shows the
bit encoding of the state value as represented by a gate-level
implementation. Three single-bit state elements are required to store the
state information. Arrows between the rectangles mark valid transitions.
The states 0, 1, 2, and 3 are reachable states. The remaining states are
non-reachable states.
Suppose the current state of the counter is state 3 and an STF occurs. The
dashed arrows mark possible impacts of the fault. Only a single bit is
flipped by an STF, i.e., the possible STFs are 011 → 010, 011 → 111, and
011 → 001 since exactly one state bit is affected. Thus, the STF may cause
a transition to state 1, 2, or 7, which are all invalid computations.
In contrast, suppose a CTF occurs. Since all bits can be affected by
a CTF, all states can be reached by that fault. In the context of STFs,
this can only be achieved by more than one STF.
[Figure 3.4: Transient fault at an arbitrary time frame. The circuit is shown over the time frames . . . , C@t − 1, C@t, C@t + 1, . . . ; the fault occurs in C@t.]
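The difference between the two fault types in Example 3.3 can be reproduced
with a few lines of code. The sketch assumes the three-bit state encoding of
the example; the function names are hypothetical.

```python
def stf_targets(state, bits=3):
    """States reachable from `state` by flipping exactly one state bit."""
    return {state ^ (1 << i) for i in range(bits)}

def ctf_targets(state, bits=3):
    """States reachable from `state` by a CTF: every other value."""
    return {s for s in range(2 ** bits) if s != state}
```

For state 3 (011) the single bit flips lead to the states 1, 2, and 7, while
a CTF can lead to any of the seven other states.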
In the fault model of this thesis, a CTF is modeled to occur only in a
single but arbitrary time frame. Once a CTF has occurred in time frame t,
it disappears immediately for all subsequent time frames > t, i.e., the
affected component behaves as specified again even though the effect of the
CTF may still be present in the internal state of the circuit.
3.2 Problem Formulation of the Thesis
If a transient fault occurs, the flipped bits may have different effects on
the circuit's behavior. These different kinds of behavior are distinguished
by different terms, called classes, and the computation of the impact is
called classification.
A schematic view of a circuit over different time frames is shown in
Figure 3.4 in order to illustrate the situation to analyze. The notation
C@t denotes the circuit in time frame t in an arbitrary state. Suppose the
circuit is affected by a CTF at an arbitrary component g (marked by the
fault symbol in the figure). An essential question about the impact of the
CTF comes up:
Does any CTF at component g affect the primary outputs in
any subsequent time frame?
This intuitively stated question is presented formally and in more detail
later in this chapter. Answering that question is one of the main
tasks of this thesis and means:
1. If the question is answered with "yes" for a certain component, the
designer knows that this component is vulnerable to transient faults
and can implement protection techniques or correct a possibly faulty
implementation,
2. In contrast, if the answer is "no", the designer can safely conclude
that the circuit correctly computes the desired function even in the
presence of transient faults at the considered component.
Overall, providing an answer for all components forms a measure for
the quality of the circuit in the presence of transient faults in terms of
the ratio of vulnerable components to all components. Assessing the quality
of a circuit in the presence of transient faults is essential in the
design process, particularly for safety-critical applications, where it is
a requirement in certification processes (e.g., ISO 26262).
Consequently, the aim of the thesis is summarized as follows:
Quantify the fault tolerance as high as possible with the highest
possible quality.
3.3 Classes
The basic idea of robustness checking is to classify each component of a
circuit into different classes based on the impact of transient faults. This
corresponds to answering the question stated in the previous section.
Each class represents a certain behavior. The result is a partitioned
circuit that highlights the different behavior. Thus, the designer can
clearly identify parts that need additional protection or determine
weaknesses of the implemented technique.
The diﬀerent classes are formally introduced in the following: Given a
circuit C = (V,E) and a component g ∈ V that has to be classiﬁed. Further,
the circuit might be equipped with a checker circuitry. The checker reports
detected faults by a fault signal F. If there is no such fault signal, F is
implicitly assumed to be always equal to zero, i.e., F reports no fault.
Class 1 (Robust). The component g is classified as robust if for all
scenarios and all CTFs one of the following conditions holds:
• Condition 1: the fault signal F reports the fault before it may become
observable at the primary outputs,
• Condition 2: the fault is corrected before it becomes observable at
the outputs and the state matches the fault-free computation.
All robust components are contained in the set T.
[Figure 3.5: Robust classification of a component (Condition 1). Time frames C@t − 1, C@t, C@t + 1, . . . , C@t + k with F_{t−1} = 0, F_t = 0, F_{t+1} = 1, F_{t+k} = 0.]
[Figure 3.6: Robust classification of a component (Condition 2). Time frames C@t − 1, C@t, C@t + 1, . . . , C@t + k with F_{t−1} = 0, F_t = 0, F_{t+1} = 0, F_{t+k} = 0.]
Example 3.4. Figure 3.5 illustrates a possible situation for Condition 1.
A CTF occurs at time frame t and affects a component's output. In
this figure, affected time frames are marked by dashed borders. The
fault is detected by the internal checker circuitry and is reported by
the fault signal, i.e., F_{t+1} = 1. Various recovery techniques triggered
by the fault signal are available to get the circuit into a consistent
state, e.g., a reset sequence. That means this behavior is covered by the
implemented circuitry, and the respective scenario and CTF are not critical.
Condition 2 is illustrated in Figure 3.6: the fault is corrected by the
internal logic and the state of the circuit finally matches the fault-free
computation, i.e., the circuit gets back into a consistent state in time
frame t + k, marked by solid borders. If one of the two conditions holds
for all scenarios and all CTFs, the component is classified as robust
because no faulty computation is propagated to the primary outputs.
[Figure 3.7: Non-robust classification of a component. Time frames C@t − 1, C@t, C@t + 1, . . . , C@t + k with F_{t−1} = 0, F_t = 0, F_{t+1} = 0, F_{t+k} = 0.]
Class 2 (Non-robust). The component g is classified as k-non-robust if
there is at least one scenario and at least one CTF f ∈ FN(g) at any time
frame t that becomes observable on at least one primary output within k time
frames before the fault signal reports a fault, i.e., F_{t+0} = 0, . . . ,
F_{t+k} = 0. Thus, the fault becomes observable before it can be detected or
corrected. All components that are k-non-robust are contained in the set Sk.
Remark. A component can be contained in the sets Sk and Sk′ with k ≠ k′.
That means the sets are not necessarily disjoint. There is no requirement
that k needs to be minimal. However, once a component is classified as
k-non-robust, a further analysis with a higher value of k is typically not
required.
Example 3.5. A possible situation of Class 2 is illustrated in Figure 3.7.
The circuit C operates normally until time frame t, when a CTF occurs and
affects the output of component g (marked by a fault symbol in the figure).
Corrupted time frames are marked by dashed borders. A faulty value is
propagated over k time frames until the fault becomes observable at time
frame t + k (also marked in the figure). In all time frames between t and
t + k the fault signal does not report any fault. Consequently, the circuit
violates the specification and the component g is classified as
k-non-robust.
The term k-non-robust is shortened to non-robust if k is not relevant in
the context; the term is also applied to the circuit in general.
If a component is not k-non-robust, i.e., no scenario and no CTF lead
to faulty output behavior within k time frames, but the states differ,
Silent Data Corruption (SDC) occurs. That means the state of the circuit is
corrupted and differs from the fault-free computation. This is defined more
formally in the following class:
Class 3 (Dangerous). The component g is classified as k-dangerous if for
all scenarios and all CTFs at time frame t the fault is either corrected or
detected, or the states in time frame t + k are affected, i.e., the states
differ from the fault-free computation while the fault signal does not
report any fault. That means the circuit's output behavior is not modified
up to time frame t + k. All components that are k-dangerous are contained
in the set Dk.
[Figure 3.8: Dangerous classification of a component. Time frames C@t − 1, C@t, C@t + 1, . . . , C@t + k with F_{t−1} = 0, F_t = 0, F_{t+1} = 0, F_{t+k} = 0.]
Example 3.6. The situation of Class 3 is illustrated in Figure 3.8. A CTF
is injected in time frame t. The state is affected in the k subsequent
time frames, while the fault signal does not report any fault.
The classification of a component as k-dangerous is temporary, since a
k-dangerous component may turn out to be (k + m)-non-robust or robust for
some m > 0 in further analysis. The following scenarios may occur: 1) a
fault becomes observable at the primary outputs after m additional time
frames, i.e., in time frame k + m, and the component is classified as
(k + m)-non-robust, or 2) all scenarios and CTFs are corrected or detected
such that the component is robust. If a component is classified as
k-dangerous, further analysis is required. However, once a component is
classified as k-non-robust or robust, the classification is complete for
this component. Therefore, the number of components classified as
k-dangerous decreases with increasing k, i.e., more and more components are
classified as non-robust or robust.
Lemma 3.1. It holds Dk ⊇ Dk+1 ⊇ . . . ⊇ Dk+m for any m and k.
Proof. By contradiction: Suppose Dk ⊉ Dk+1 for some k. That means there is
a component g that is not k-dangerous but (k + 1)-dangerous, i.e., g ∉ Dk
and g ∈ Dk+1. A component that is not k-dangerous is either k-non-robust or
robust. That means either a CTF becomes observable within k time frames or
all faults are corrected or detected; in both cases the classification is
complete. Consequently, a component g that is not k-dangerous cannot be
(k + 1)-dangerous.
Knowledge about k-dangerous components is valuable for the designer
since the longer SDC persists in a system, the more likely a second
transient fault occurs, which may result in accumulated fault effects
leading to non-robust behavior. Moreover, a faulty implementation of the
checker circuitry can be detected using this kind of classification.

Table 3.1: Best case complexity of the classification.
Classification   Time frame   Scenarios   CTFs
k-non-robust     one          one         one
k-dangerous      one          one         one
robust           all          all         all
In order to distinguish the components of a circuit C = (V,E) based on
their classifications, the sets Sk for k-non-robust components, T for robust
components, and Dk for k-dangerous components have been introduced.
Components that are not classified, either because the classification has
not yet been performed or because computational resources are limited, are
called non-classified. Those components are contained in the set U. The
classification sets are pairwise disjoint for all k, i.e.,
(⋃_{i∈[0,k]} Si) ∩ Dk = ∅, Dk ∩ T = ∅, and (⋃_{i∈[0,k]} Si) ∩ T = ∅.
Overall, it holds V = ⋃_{i∈[0,k]} Si ∪ T ∪ Dk ∪ U.
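The three classes can be illustrated by exhaustive simulation of very small
circuits. The following sketch is a strongly simplified stand-in for the
formal techniques of this thesis: the circuit is a hypothetical step
function, the fault corrupts the state in the first simulated time frame
only, all start states and input sequences of length k + 1 are enumerated,
and the result "robust" is only established up to the bound k. All names
are assumptions made for this example.

```python
from itertools import product

def classify(step, states, inputs, faults, k):
    """Bounded classification of one component by exhaustive simulation.

    step(state, x)  -> (next_state, output, fault_signal)
    faults(state)   -> corrupted states produced by the CTFs of the component
    The fault is injected once, in the first simulated time frame t.
    """
    dangerous = False
    for state in states:                               # every scenario ...
        for corrupted in faults(state):                # ... and every CTF
            for xs in product(inputs, repeat=k + 1):   # frames t .. t+k
                good, bad, reported = state, corrupted, False
                for x in xs:
                    good, out_good, _ = step(good, x)
                    bad, out_bad, flag = step(bad, x)
                    if flag:                           # reported in time
                        reported = True
                        break
                    if out_good != out_bad:            # one witness suffices
                        return "non-robust"
                if not reported and good != bad:       # silent data corruption
                    dangerous = True
    return "dangerous" if dangerous else "robust"

# Hypothetical 3-stage shift register; the output is the last stage.
def shift(state, x):
    a, b, c = state
    return (x, a, b), c, 0

# Hypothetical TMR register that stores x; the vote masks one bad copy.
def tmr(state, x):
    a, b, c = state
    voted = (a & b) | (a & c) | (b & c)
    return (x, x, x), voted, 0

flip_first = lambda s: [(1 - s[0],) + s[1:]]   # STF on the first state bit
BITS3 = list(product((0, 1), repeat=3))
```

For the shift register, a flipped first stage is 0-dangerous and
1-dangerous but 2-non-robust, because the corrupted bit needs three time
frames to reach the output; this mirrors the temporary nature of the
k-dangerous classification. A single witness ends the search for non-robust
components, whereas "robust" requires the complete enumeration.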
3.3.1 Best Case Complexity
The classification of k-non-robust and k-dangerous components reduces
to an existential proposition: if there is at least one scenario and at
least one CTF such that the outputs or the states are corrupted,
respectively, the component is classified as non-robust or dangerous and no
further scenarios need to be checked. In contrast, the classification of
robust components demands a universal proposition, which states that under
all scenarios and all CTFs the specification is kept for all time frames or
the fault is reported. That means it has to be proven that no scenario and
no CTF violates the output behavior.
In the following, the best case complexity for classifying the components
into the respective classes is discussed. The best case complexity of
classifying k-non-robust and k-dangerous components differs significantly
from the best case complexity of classifying robust components. The
requirements to classify the components into the respective classes are
summarized in Table 3.1.
The first column shows the classification. In the remaining columns,
1) one means that one time frame, one scenario, or one CTF needs to be
considered, and 2) all means that all time frames, all scenarios, or all
CTFs need to be considered.
Consider the best case for k-non-robust components: exactly one scenario
and one CTF need to be found for one time frame to classify a component
as non-robust. Analogously, this also holds for a k-dangerous component.
In contrast, classifying a robust component requires that for all scenarios
and all CTFs the fault is not observable at the primary outputs for all time
frames. However, the number of scenarios grows exponentially with the
number of primary inputs and the number of state elements. Moreover,
as will be shown later, k is typically very large to provide a complete
classification. Consequently, the effort of classifying robust components is
significantly higher than for k-non-robust and k-dangerous components.
However, robust components can be derived from the classification
of k-non-robust and k-dangerous components, which significantly reduces the
complexity in the best case. For a robust component, it only needs to be
excluded that the component is k-non-robust or k-dangerous, which is in the
best case easier than proving that all CTFs do not cause misbehavior over
all time frames.
Lemma 3.2. Given a component g ∈ V such that g is neither k-non-robust
nor k-dangerous, i.e., $g \notin S_k$ and $g \notin D_k$, then g is robust, i.e.,
g ∈ T, for any k.
Proof. If there exists no scenario and no CTF at component g that leads to
faulty behavior in k time frames or to corrupted states in time frame k, then
the circuit behaves as expected by detecting or correcting all CTFs, i.e., g
is robust.
Corollary 3.1. Given the set of k-non-robust components $S_k$ and the set of
k-dangerous components $D_k$, it holds: $T \subseteq V \setminus (S_k \cup D_k)$.
Proof. Follows directly from the proof of Lemma 3.2 by applying the proof
to each robust component.
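The set relation of Corollary 3.1 can be illustrated with plain set operations. The following is a minimal sketch, not part of the original text: the component names and the helper robust_candidates are hypothetical, and the result is only a superset of the robust components for the given k.

```python
def robust_candidates(all_components, non_robust_k, dangerous_k):
    """Return V \\ (S_k ∪ D_k), a superset of the robust components T."""
    return set(all_components) - (set(non_robust_k) | set(dangerous_k))


# Hypothetical classification result for some fixed k.
V = {"g1", "g2", "g3", "g4", "g5"}
S_k = {"g1"}          # k-non-robust components
D_k = {"g2", "g3"}    # k-dangerous components

candidates = robust_candidates(V, S_k, D_k)
print(sorted(candidates))
```

Every robust component must appear in the returned set; whether a candidate is actually robust still depends on the remaining values of k.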
3.4 Completeness
So far, the classification of k-non-robust and k-dangerous components has
been based on an arbitrary value k. However, in order to compute all
non-robust components, all necessary sets $S_0, \ldots, S_{k_{cmpl}}$ have to be
determined, where $k_{cmpl}$ is a completeness threshold that is explained in
the following.
It is assumed that the fault occurs at an arbitrary time frame t, i.e., all
time frames in which a fault can occur are covered. The completeness
threshold (CT) $k_{cmpl}$ is the maximal value of k needed to cover all possible
propagation paths, which allows all non-robust components to be fully
determined. That is, for all values of $k \in [0, k_{cmpl}]$ the set $S_k$ has to be
determined, and the entire set of non-robust components $S(k_{cmpl})$ is
given by:
\[ S(k_{cmpl}) = S_0 \cup S_1 \cup \ldots \cup S_{k_{cmpl}} \]
Figure 3.9: Unbounded dangerous components (circuit instances C@t−1, C@t, C@t+1, ..., C@t+$k_{cmpl}$, each shown with fault signal F = 0)
Checking all values of k from 0 up to the completeness threshold ensures
that all possible scenarios to propagate a CTF are covered. An upper
bound of the completeness threshold for an arbitrary circuit is given by the
following lemma.
Lemma 3.3. Given a circuit C = (V,E) with n single-bit state elements¹,
then the completeness threshold is bounded by: $0 \le k_{cmpl} \le 2^{2n}$.
Certainly, in practice the completeness threshold might be much smaller
than the upper bound. However, computing the value itself is a nontrivial
task [CKOS04]. Alternatively, a completeness threshold can be specified
manually by the designer, who has special knowledge about the design. A
technique to automatically obtain completeness is presented later in this
thesis.
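The two ingredients above, the upper bound of Lemma 3.3 and the union of the per-k sets, can be sketched in a few lines. This is an illustration under assumptions: the function names are hypothetical and the per-k sets are made-up inputs rather than results of an actual analysis.

```python
def completeness_bound(n_state_bits):
    """Upper bound 2^(2n) on the completeness threshold (Lemma 3.3)."""
    return 2 ** (2 * n_state_bits)


def all_non_robust(sets_by_k):
    """Union S_0 ∪ S_1 ∪ ... of the per-k non-robust sets."""
    return set().union(*sets_by_k)


# Hypothetical per-k results for k = 0, 1, 2.
per_k = [{"a"}, {"a", "b"}, set()]
print(completeness_bound(3))          # bound for 3 single-bit state elements
print(sorted(all_non_robust(per_k)))
```

The bound grows so quickly that it is mainly of theoretical interest, which is why a smaller, design-specific value is attractive.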
Once all necessary values of k have been considered such that all
non-robust components are determined, the components of the circuit that
are vulnerable to CTFs are known. Even when all necessary values of k have
been checked, SDC may occur, which may constitute a vulnerability of the
circuit as well. That is, under all possible scenarios and all CTFs at a
component, for all $k \in [0, k_{cmpl}]$ a fault is neither observable at the
primary outputs nor reported by the fault signal, but it corrupts the states.
This particular behavior is formally defined as the following class:
Class 4. A component g is classified as unbounded dangerous if and
only if $g \in D_{k_{cmpl}}$.
Example 3.7. Figure 3.9 illustrates a situation of Class 4. A CTF occurs in
time frame t and affects all time frames up to $k_{cmpl}$ time frames later. The
fault signal does not report any fault within the interval.
¹ Note that a circuit with multi-bit state elements can easily be translated
into a circuit consisting exclusively of single-bit state elements.
Figure 3.10: Transmission system (blocks: encoder, channel, decoder; signals: data, reset, F)
The existence of unbounded dangerous components in a design might be
critical since the fault remains manifested in the circuit state until a reset
sequence is performed. The probability that a second transient fault occurs
increases over time, i.e., the longer the circuit stays in the corrupted state,
the higher the probability that a second transient fault causes accumulated
fault effects.
3.5 Observation Window
A circuit that contains checker circuitry occasionally comes with special
constraints. For example, the fault signal should report a fault within a
bounded number of clock cycles after a transient fault occurs. Thus,
a bounded interval $[0, \bar{k}]$ with $\bar{k} \le k_{cmpl}$ may suffice to obtain
accurate classifications. This interval is called observation window and is
manually specified by the designer. Often in practice the length of an
adequate observation window is much smaller than the general completeness
threshold, i.e., $\bar{k} \ll k_{cmpl}$. Consequently, the manually specified
observation window takes the role of the completeness threshold. If not
mentioned otherwise, the observation window is set to $k_{cmpl}$ by default.
The influence of the choice of the observation window is illustrated by the
following example.
Example 3.8 (from [FSFD11]). A (7,4) Hamming code detects and
corrects single faults [Ham50]. Figure 3.10 shows a transmission system
using an encoder for 4-bit data, a bit-wise serial channel, and a decoder. A
failure in the transmitted code word is flagged by setting the fault signal F.
The circuit implementing this transmission consists of 368 components. The
timing is summarized as follows:
• Encoding and transmission to the channel: 1 time step
• Transmission: 4 time steps (registers in the channel)
• Decoding, writing to the output, setting F: 1 time step
The classiﬁcation of the components depends on the value k¯:
• $\bar{k} < 6$: The data from k = 0 has not yet arrived at the primary outputs.
Faults in the decoding logic are detected within 1 time step by setting
the fault signal F. The corresponding components are classified.
Faults in the channel change the state, but not all data has been
decoded yet. Faults in unprocessed registers are still undetected, and
the corresponding components cannot be classified. While incrementing
$\bar{k}$, more and more components are classified.
• $\bar{k} = 6$: The input data reaches the primary outputs. Faults that can
be detected are flagged, and undetected faults in the encoder propagate
to the primary outputs. All components are classified.
• $\bar{k} > 6$: Faults injected at t = 0 do not influence the state of the model
after more than 6 time steps.
3.6 Summary
The effects of transient faults are abstracted to the logic level. A fault model
is introduced that models transient faults as non-deterministic output
behavior of a component. Depending on the impact of transient faults at the
components, different behavior is observable at the circuit's outputs.
Therefore, different classes that capture the respective behavior are
introduced.
In order to overcome complexity issues, the classification of robust
components is translated into an easier problem with respect to the best case
complexity. Furthermore, a completeness threshold has been introduced that
needs to be reached to provide a generally complete classification. However,
this value might be very large. For practical use, a significantly smaller
observation window has been introduced that is manually specified by the
designer to overcome complexity issues.

Chapter 4
Robustness Measures
The classification presented in the previous chapter partitions
the circuit into the respective classes. However, in order to rate the quality
of the circuit in the presence of transient faults, two robustness measures,
each expressed as a single value, are introduced. These measures objectively
and uniquely document the quality of the circuit and can be used, e.g., in
certification. Moreover, different hardening techniques for the same circuit
can be evaluated and compared based on these measures, and the
implementation with the best properties can be selected.
The two measures are briefly described:
• A worst case robustness measure is introduced that rates each
component of a circuit based on the worst case of scenario and CTF
that may occur.
• A probabilistic robustness measure is introduced that provides a
differentiation between the non-robust components. This measure considers
a better case of scenario and CTF than the worst case.
4.1 Worst Case Robustness Measure
At ﬁrst the worst case robustness measure is introduced and deﬁned in the
following.
Definition 4.1. Given a set of robust $T$, non-robust $S(\bar{k})$, and
$\bar{k}$-dangerous $D_{\bar{k}}$ classified components. Furthermore, given a set of
non-classified components $U$ with $V = T \cup S(\bar{k}) \cup D_{\bar{k}} \cup U$ for any
$\bar{k} \in [0, k_{cmpl}]$. The quality of the circuit in the presence of CTFs is
given by the Worst Case Robustness Measure (WC−RM) $R^{\bar{k}}_{WC-RM}$ with
$R^{\bar{k}}_{lb} \le R^{\bar{k}}_{WC-RM} \le R^{\bar{k}}_{ub}$, where the bounds are defined as follows:
\[ R^{\bar{k}}_{lb} = \frac{|T|}{|V|} = 1 - \frac{|S(\bar{k}) \cup D_{\bar{k}} \cup U|}{|V|} \quad \text{(lower bound)} \]
\[ R^{\bar{k}}_{ub} = \frac{|T \cup D_{\bar{k}} \cup U|}{|V|} = 1 - \frac{|S(\bar{k})|}{|V|} \quad \text{(upper bound)} \]
That is, after classification (computing the respective sets) up to
the observation window $\bar{k}$, $R^{\bar{k}}_{WC-RM}$ is calculated by computing the
bounds. The WC−RM covers the worst case of any transient fault, since
a single scenario suffices to classify a component as non-robust,
independent of how likely the scenario is. A more differentiated measure is
introduced later in this chapter. First, some properties of the measure are
presented.
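Lemma 4.1. It holds $R^{\bar{k}}_{ub} - R^{\bar{k}}_{lb} \ge R^{\bar{k}+q}_{ub} - R^{\bar{k}+q}_{lb}$ for any
$\bar{k} \in [0, k_{cmpl}]$ and $q \ge 0$.
Proof. Suppose all components are classified, i.e., $U = \emptyset$. It holds:
\[ \frac{|T \cup D_{\bar{k}}|}{|V|} - \frac{|T|}{|V|} \ge \frac{|T' \cup D_{\bar{k}+q}|}{|V|} - \frac{|T'|}{|V|} \]
\[ |D_{\bar{k}}| \ge |D_{\bar{k}+q}| \]
and the last inequality holds due to Lemma 3.1 for any $q \ge 0$.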
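The step from the first to the second inequality can be spelled out as a short derivation. It is a sketch that assumes the four sets of Definition 4.1 are pairwise disjoint, which the text implies by calling the classification a partition but does not restate here.

```latex
\begin{align*}
R^{\bar{k}}_{ub} - R^{\bar{k}}_{lb}
  &= \frac{|T \cup D_{\bar{k}} \cup U|}{|V|} - \frac{|T|}{|V|} \\
  &= \frac{|T| + |D_{\bar{k}}| + |U| - |T|}{|V|}
     && \text{(disjoint sets, assumption)} \\
  &= \frac{|D_{\bar{k}}| + |U|}{|V|}
\end{align*}
```

With $U = \emptyset$ the gap is therefore $|D_{\bar{k}}| / |V|$, so comparing the gaps for $\bar{k}$ and $\bar{k}+q$ reduces to comparing $|D_{\bar{k}}|$ and $|D_{\bar{k}+q}|$.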
That is, as the size of the observation window increases, the gap
between the robustness bounds does not grow, i.e., the analysis becomes more
accurate. However, the accuracy of the classification (the gap between the
bounds) depends on the choice of the observation window and varies from
circuit to circuit. The higher the value of the measure, the more fault
tolerant the CUV is. That is, high values are expected for fault tolerant
designs and low values for relatively unprotected circuits or buggy
implementations.
Example 4.1. Reconsider Example 3.8 from the previous chapter.
The determined bounds of the robustness are shown in Table 4.1.
Before the classiﬁcation starts, all components are non-classiﬁed, i.e., |U| =
368. By considering more and more time frames from 0 up to 6, the
components get classiﬁed. The robustness bounds meet each other after
analyzing 6 time frames and the classiﬁcation is therefore complete. That
means, k¯ = 6 is suﬃcient.
Table 4.1: Hamming model
k ∈ [0, 6]      |T|   |S|   |D_k|   |U|   R^k_lb %   R^k_ub %
before class.     0     0       0   368        0.0      100.0
0                11     2     355     0        3.0       99.5
1                54    36     278     0       14.7       90.2
2                93    49     226     0       25.3       86.7
3               132    61     175     0       35.9       83.4
4               171    73     124     0       46.5       80.2
5               210    87      71     0       57.1       76.4
6               267   101       0     0       72.6       72.6
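The bounds in Table 4.1 can be recomputed from the set sizes with the formulas of Definition 4.1. The sketch below only re-derives the tabulated percentages; the function name wc_rm_bounds and the row layout are illustrative choices, not taken from the thesis.

```python
# Rows of Table 4.1: (k, |T|, |S|, |D_k|, |U|)
ROWS = [
    ("before class.", 0, 0, 0, 368),
    (0, 11, 2, 355, 0),
    (1, 54, 36, 278, 0),
    (2, 93, 49, 226, 0),
    (3, 132, 61, 175, 0),
    (4, 171, 73, 124, 0),
    (5, 210, 87, 71, 0),
    (6, 267, 101, 0, 0),
]


def wc_rm_bounds(t, s, d, u):
    """Lower and upper WC-RM bound (in percent) from the four set sizes."""
    v = t + s + d + u                  # |V|, assuming the sets are disjoint
    lower = 100.0 * t / v              # |T| / |V|
    upper = 100.0 * (t + d + u) / v    # |T ∪ D ∪ U| / |V|
    return round(lower, 1), round(upper, 1)


for k, t, s, d, u in ROWS:
    print(k, wc_rm_bounds(t, s, d, u))
```

The two bounds coincide for k = 6 because no dangerous or non-classified components remain.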
When analyzing a combinational circuit, there are no dangerous
components since a combinational circuit does not contain any state element.
Consequently, some aspects of the robustness measure are simplified. The
size of the observation window is always 1 since one time frame suffices to
classify all components. The robustness measure for combinational circuits
$R_{WC-RM}$ with $R_{lb} \le R_{WC-RM} \le R_{ub}$ is simplified to:
\[ R_{lb} = \frac{|T|}{|V|} = 1 - \frac{|S(0) \cup U|}{|V|} \quad \text{(lower bound)} \]
\[ R_{ub} = \frac{|T \cup U|}{|V|} = 1 - \frac{|S(0)|}{|V|} \quad \text{(upper bound)} \]
Once all components are classiﬁed, i.e., U = ∅, the bounds are equal.
4.2 Probabilistic Analysis
The robustness measure WC−RM introduced in Section 4.1
results in a worst case analysis, since a component is classified as non-robust
if there is a single scenario and a single fault that violate the specification.
That is, a single suitable combination of both suffices to classify the
component, regardless of how likely the scenario and the fault are during
normal operation. Therefore, non-robust components cannot be differentiated
with respect to how vulnerable they are in practice: a component that has
more scenarios than another component is more vulnerable to transient
faults. In order to differentiate non-robust components, a grading based on
their number of scenarios is used. It leads to an Excitation and Propagation
Probability (EPP) of faulty behavior, which is introduced in the following.
With this grading, hot-spots can easily be highlighted and the designer is
pointed to those components with high probabilities, e.g., by visualizing the
hot-spots. As a consequence, in order to meet cost constraints during the
design process, only those components that constitute a hot-spot are
protected, because faults at those components are more likely to manipulate
the output behavior. A trade-off between the degree of fault tolerance and
costs can be found using these gradings.
However, in order to obtain the grading, more than a single scenario
needs to be computed. Several works have been published that tackle this
problem [MZM10, HPB07]. For example, [HPB07] considers all scenarios
by building a BDD that represents all of them. Therefore, the most precise
possible grading is reached, but at very high computational costs due to the
high memory consumption of BDDs. Hence, the BDD-based approach is
technically limited to very small circuits since the number of scenarios grows
exponentially with the number of inputs of the circuit and the size of the
observation window. Techniques that consider all scenarios are referred to
as probabilistic analysis.
In order to efficiently differentiate non-robust components, a
new notion that considers a bounded number of scenarios is introduced in
the next section.
4.2.1 Excitation and Propagation Probabilities
A second robustness measure that constitutes a trade-off between the worst
case analysis and the probabilistic analysis is presented in the following. A
technique that considers more than a single scenario but potentially far
fewer than all scenarios is introduced. This leads to a measure that is more
accurate than the worst case measure and less accurate than the
probabilistic measure, while the complexity is significantly reduced. If the
desired differentiation of non-robust components is reached and the
corresponding hot-spots provide enough information, no more scenarios need
to be computed. Moreover, the new measure is the most general measure in
this thesis since it embeds both extremes, which are easily adjusted, i.e., the
new measure can capture both the worst case and the probabilistic case.
This measure has been published in [FFD10].
At first, EPP is formally introduced:
Deﬁnition 4.2. Let g ∈ S(k¯) be a non-robust component and k¯ the obser-
vation window. The function ψ(g, k¯) denotes the number of scenarios that
lead to a non-robust classiﬁcation over the observation window k¯.
Definition 4.3. Let $g \in S(\bar{k})$ be a non-robust component and $\bar{k}$ the
observation window. The Excitation and Propagation Probability (EPP) is
given by:
\[ epp(g, \bar{k}) = 1 - \frac{\psi(g, \bar{k})}{\Psi(\bar{k})} \]
with $\Psi(\bar{k}) = 2^{|in(C)| \cdot \bar{k}}$ for an arbitrary observation window $\bar{k}$.
The epp function relates the number of scenarios that definitely yield
faulty output to the number of all scenarios possible within a certain
observation window $\bar{k}$. The following example demonstrates the meaning
of epp:
Example 4.2. Consider a combinational circuit C = (V,E) with four
primary inputs, i.e., |in(C)| = 4. Furthermore, let a, b ∈ S be two
non-robust and c, d, e ∈ T be robust components. The worst case analysis
yields $R_{WC-RM}$ = 3/5 = 60%. Further, assume that there are only two
scenarios that excite and propagate a fault in component a, i.e., ψ(a, 0) = 2.
Given the total number of $2^{|in(C)|} = 2^4 = 16$ scenarios, the probability to
excite and propagate the fault in a is only epp(a, 0) = 2/16 = 12.5%.
Moreover, let any input trace be a scenario for a fault at component b, i.e.,
ψ(b, 0) = 16, and the excitation probability at b is epp(b, 0) = 100%.
The worst case analysis does not differentiate the two components because
only a single scenario is computed. Both are simply classified as non-robust,
even though b can be considered a hot-spot while a is relatively safe.
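The numbers of Example 4.2 can be reproduced with exact fractions. Note that the example quotes the ratio ψ/16 directly, whereas Definition 4.3 is written with a leading "1 −"; the sketch below computes both quantities so that their relation is explicit. The function names are illustrative and not taken from the thesis.

```python
from fractions import Fraction


def failing_ratio(psi, total):
    """Share of scenarios that excite and propagate the fault (psi / total)."""
    return Fraction(psi, total)


def epp_definition_4_3(psi, total):
    """Value of the formula in Definition 4.3 as written: 1 - psi / total."""
    return 1 - Fraction(psi, total)


TOTAL = 2 ** 4  # 16 scenarios for four primary inputs

for name, psi in (("a", 2), ("b", 16)):
    print(name, failing_ratio(psi, TOTAL), epp_definition_4_3(psi, TOTAL))
```

Under either reading, component b is the hot-spot: all 16 scenarios lead to faulty behavior, compared to 2 of 16 for component a.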
However, the number of scenarios grows exponentially with the number
of inputs and the size of the observation window. Therefore, a maximum
pre-deﬁned portion of scenarios is introduced. The number λ is called
scenario ratio and limits the number of considered scenarios with respect
to all possible scenarios to reduce the complexity.
Definition 4.4. Given an arbitrary scenario ratio λ with 0 < λ ≤ 1. The
EPP limited by λ is given by:
\[ epp(g, \bar{k}, \lambda) = 1 - \frac{\min\{\psi(g, \bar{k}),\ \lambda \cdot \Psi(\bar{k})\}}{\lambda \cdot \Psi(\bar{k})} \]
Technically, once ψ exceeds the maximum number of scenarios, the
computation can be terminated rather than enumerating all scenarios, which
reduces computational costs. Based on this value, a second robustness
measure covering the notion of multiple scenarios is defined in terms of an
upper bound. Since there is no differentiation of robust components, the
lower bound from Section 4.1 is used. The upper bound using epp is defined
as follows:
Definition 4.5. The parameterized robustness measure, denoted as P−RM,
with respect to a scenario ratio λ with 0 < λ ≤ 1 is given by:
\[ R^{\bar{k},\lambda}_{ub} = \frac{1}{|V|} \sum_{g \in V} epp(g, \bar{k}, \lambda) \]
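Definitions 4.4 and 4.5 can be evaluated directly on the data of Example 4.2. The following is a minimal sketch under assumptions: ψ is taken to be 0 for robust components (the sum in Definition 4.5 ranges over all of V), the total number of scenarios is passed in explicitly as 16, and the function names are hypothetical.

```python
from fractions import Fraction


def epp_limited(psi, total, lam):
    """EPP limited by the scenario ratio lam (Definition 4.4)."""
    limit = lam * total                       # lambda * Psi
    return 1 - Fraction(min(psi, limit)) / limit


def p_rm_upper(psi_by_component, total, lam):
    """Upper bound of P-RM (Definition 4.5): mean of the limited EPP values."""
    values = [epp_limited(psi, total, lam) for psi in psi_by_component.values()]
    return sum(values) / len(values)


# Scenario counts from Example 4.2; psi = 0 for robust components (assumption).
PSI = {"a": 2, "b": 16, "c": 0, "d": 0, "e": 0}

for lam in (Fraction(1, 16), Fraction(1, 4), Fraction(1)):
    print(lam, p_rm_upper(PSI, 16, lam))
```

For λ · Ψ = 1 the result is 3/5, the worst case value of Example 4.2; larger values of λ separate component a from component b and raise the bound.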
The higher the value of λ, the more scenarios are considered and the more
components might be differentiated. However, increasing λ increases
the computational effort as well. The more differentiation is computed,
the more accurately the designer can find critical hot-spots. The influence
of λ is as follows:
The parameter λ intuitively adjusts the analysis between the worst case
(λ close to zero) and the best case (λ = 1). Consequently, the higher λ, the
better the case. Thus, with increasing λ the upper bound of the P−RM
increases, since each component's scenarios are set in relation to a larger
share of all scenarios.
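Lemma 4.2. Given two scenario ratios λ and λ′ with λ ≤ λ′, the relation
$R^{\bar{k},\lambda}_{ub} \le R^{\bar{k},\lambda'}_{ub}$ holds.
Proof. Let λ ≤ λ′. It has to be shown that $R^{\bar{k},\lambda}_{ub} \le R^{\bar{k},\lambda'}_{ub}$. It suffices
to show:
\[ \sum_{g \in V} epp(g, \bar{k}, \lambda) \le \sum_{g \in V} epp(g, \bar{k}, \lambda') \]
\[ |V| - \sum_{g \in V} \frac{\min\{\psi(g, \bar{k}),\ \lambda \cdot \Psi(\bar{k})\}}{\lambda \cdot \Psi(\bar{k})} \le |V| - \sum_{g \in V} \frac{\min\{\psi(g, \bar{k}),\ \lambda' \cdot \Psi(\bar{k})\}}{\lambda' \cdot \Psi(\bar{k})} \]
\[ \sum_{g \in V} \frac{\overbrace{\min\{\psi(g, \bar{k}),\ \lambda \cdot \Psi(\bar{k})\}}^{(1)}}{\lambda \cdot \Psi(\bar{k})} \ge \sum_{g \in V} \frac{\overbrace{\min\{\psi(g, \bar{k}),\ \lambda' \cdot \Psi(\bar{k})\}}^{(2)}}{\lambda' \cdot \Psi(\bar{k})} \]
\[ \frac{1}{\lambda \cdot \Psi(\bar{k})} \cdot \sum_{g \in V} (1) \ge \frac{1}{\lambda' \cdot \Psi(\bar{k})} \cdot \sum_{g \in V} (2) \]
\[ \frac{1}{\lambda \Psi(\bar{k})} \ge \frac{1}{\lambda' \Psi(\bar{k})} \]
The last inequality holds since λ ≤ λ′.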
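The monotonicity can also be checked term by term. The following derivation is a sketch that is not taken from the thesis; it abbreviates $a = \lambda \cdot \Psi(\bar{k})$ and $a' = \lambda' \cdot \Psi(\bar{k})$ with $0 < a \le a'$ and considers a fixed $\psi = \psi(g, \bar{k}) \ge 0$.

```latex
\begin{align*}
\psi \ge a' &: \quad \frac{\min\{\psi, a\}}{a} = 1 = \frac{\min\{\psi, a'\}}{a'} \\
a \le \psi < a' &: \quad \frac{\min\{\psi, a\}}{a} = 1 \ge \frac{\psi}{a'} = \frac{\min\{\psi, a'\}}{a'} \\
\psi < a &: \quad \frac{\min\{\psi, a\}}{a} = \frac{\psi}{a} \ge \frac{\psi}{a'} = \frac{\min\{\psi, a'\}}{a'}
\end{align*}
```

In each case the term for λ is at least the term for λ′, so $epp(g, \bar{k}, \lambda) \le epp(g, \bar{k}, \lambda')$ for every component, and summing over $g \in V$ gives the relation of Lemma 4.2.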
By configuring λ appropriately, the same differentiation as in the
probabilistic analysis can be reached, but with potentially less
computational effort. Hence, by adjusting the parameter λ a trade-off
between accuracy and complexity can be found. Practically, λ can be
increased until sufficient differentiation is provided, which is finally decided
by the designer.
Figure 4.1: Influence of the scenario ratio λ (differentiation increases from worst case analysis at λ close to zero to probabilistic analysis at λ = 1)
4.2.2 Relation of WC−RM and P−RM
The relation between the two introduced measures needs to be investigated.
P−RM is the most general robustness measure defined and used in this
thesis. The robustness measure WC−RM is a special case of P−RM, as
shown in the following.
Lemma 4.3. Given an observation window $\bar{k} \in [0, k_{cmpl}]$ and 0 < λ ≤ 1
with $\lambda \cdot \Psi(\bar{k}) = 1$, then it holds $R^{\bar{k}}_{ub} = R^{\bar{k},\lambda}_{ub}$.
Proof. Follows directly from the proof of Lemma 4.2.
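The statement of Lemma 4.3 can also be traced directly through Definitions 4.1, 4.4, and 4.5. The derivation below is a sketch under the assumption that $\psi(g, \bar{k}) \ge 1$ for every non-robust component and $\psi(g, \bar{k}) = 0$ for every other component.

```latex
\begin{align*}
\lambda \cdot \Psi(\bar{k}) = 1 \;\Rightarrow\;
epp(g, \bar{k}, \lambda) &= 1 - \min\{\psi(g, \bar{k}),\ 1\}
  = \begin{cases} 0 & g \in S(\bar{k}) \\ 1 & g \notin S(\bar{k}) \end{cases} \\
R^{\bar{k},\lambda}_{ub} &= \frac{1}{|V|} \sum_{g \in V} epp(g, \bar{k}, \lambda)
  = \frac{|V| - |S(\bar{k})|}{|V|}
  = 1 - \frac{|S(\bar{k})|}{|V|} = R^{\bar{k}}_{ub}
\end{align*}
```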
Embedding the worst case analysis WC−RM in the P−RM measure is
done by setting λ close to zero such that $\lambda \cdot \Psi(\bar{k}) = 1$, i.e., exactly a
single scenario is considered as presented previously in Section 3.3. In
contrast, embedding the probabilistic analysis is done by setting λ = 1, i.e.,
all possible scenarios are considered, as is done in, e.g., [HPB07]. That is,
the new measure embeds both kinds of analysis and constitutes a trade-off
between accuracy and costs adjustable by λ.
Figure 4.1 shows a graphical interpretation of P−RM. The worst
case analysis is performed when setting λ close to zero as introduced
with Lemma 4.3. In contrast, probabilistic analysis is performed when
setting λ = 1. The differentiation between non-robust components that are
more vulnerable than other non-robust components increases with higher
λ. However, the computational costs increase as well when increasing λ,
since computing a scenario is not trivial.
Therefore, finding hot-spots, i.e., regions of the circuit that
contain particularly non-robust components, is possible using the new
measure. The designer increases λ step by step until a
sufficiently accurate differentiation is obtained.
4.3 Class Models
In the previous chapter and in this chapter, the fault model and the
robustness measures have been presented, respectively. The fault model
groups the effects of transient faults into classes: k-non-robust,
k-dangerous, robust, and unbounded dangerous components. In this chapter,
two robustness measures have been presented.
In this section, meaningful combinations of fault model, measure, and
circuits are introduced.
In the following chapters of this thesis, various approaches are presented to
classify the components of sequential and combinational circuits. But there
are theoretical and practical differences. For example, all approaches except
one are able to handle sequential circuits. Furthermore, theoretically all
approaches are able to obtain the EPP for each component, but only one
approach is able to compute the EPP efficiently.
However, the first basic algorithm, presented in the next chapter, is able
to handle all aspects, from the classes of the fault model to the robustness
measures. This algorithm constitutes only a theoretical model, and the
concrete approaches are derived from the model with certain properties.
To provide a clear differentiation between the approaches, unique terms are
introduced.
The most general class model of this thesis is provided by the EPP-based
classiﬁcation deﬁned as follows:
Deﬁnition 4.6. The class model EPPModel is deﬁned over three models:
1. the fault model including k-non-robust, k-dangerous, robust, and un-
bounded dangerous components deﬁned in Section 3.3,
2. the robustness measure P−RM including diﬀerentiation between non-
robust components, and
3. sequential circuits and combinational circuits as well.
The EPPModel provides the most general analysis in this thesis, which
also requires the most computational effort. A more specialized class model
is presented next:
Definition 4.7. The class model WCModel is defined over three models:
1. the fault model including k-non-robust, k-dangerous, robust, and un-
bounded dangerous components deﬁned in Section 3.3,
2. the robustness measure WC−RM, a special case of P−RM, and
3. sequential circuits and combinational circuits as well.
The thesis focuses on providing approaches that handle the WCModel.
The approaches are theoretically able to handle the EPPModel, but due
to efficiency issues most approaches are reduced to the WCModel. A
dedicated ATPG-based approach is able to compute the EPPModel.
So far, both models consider sequential circuits, which requires analyzing
the circuit over a certain number of time frames. Additionally,
combinational circuits are handled, forming the following class model:
Deﬁnition 4.8. The class model CombModel is deﬁned over three models:
• non-robust and robust components defined in Section 3.3,
• the combinational case of the robustness measure WC−RM,
• combinational circuits.
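The three class models of Definitions 4.6 to 4.8 can be written down as plain configuration records. This is an illustrative sketch only: the ClassModel type and its field names are hypothetical and merely restate the three ingredients of each definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClassModel:
    name: str
    classes: tuple      # classes of the fault model that are distinguished
    measure: str        # robustness measure that is computed
    sequential: bool    # whether sequential circuits are covered


ALL_CLASSES = ("k-non-robust", "k-dangerous", "robust", "unbounded dangerous")

EPP_MODEL = ClassModel("EPPModel", ALL_CLASSES, "P-RM", True)
WC_MODEL = ClassModel("WCModel", ALL_CLASSES, "WC-RM", True)
COMB_MODEL = ClassModel("CombModel", ("non-robust", "robust"),
                        "WC-RM (combinational case)", False)

for model in (EPP_MODEL, WC_MODEL, COMB_MODEL):
    print(model.name, model.measure, model.sequential)
```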
4.4 Summary
Knowing the quality of the circuit under the effects of transient faults is
crucial for reliable circuits. In this chapter two measures have been proposed:
1) WC−RM, which considers the worst case of all possible scenarios and
CTFs, and 2) P−RM, a more differentiated measure that provides a freely
configurable accuracy that converges to an exact probabilistic analysis. In
contrast to available probabilistic analyses, a trade-off between accuracy and
computational effort can be found.
Moreover, three fixed configurations combining the classes from the
previous chapter and the measures introduced above are defined. Based
on these configurations, approaches are proposed that classify the components
of a circuit into the respective classes.

Chapter 5
Computational Model
While the previous chapters deﬁne what kind of behavior may occur in
the presence of transient faults, this chapter introduces the algorithms to
analyze the circuit and to classify the components into the introduced classes.
A computational model that creates the fundamentals for the classiﬁcation
is introduced. Later in this thesis, various engines (classiﬁers) are presented
covering a wide range of reasoning techniques. However, all engines assess
the robustness along a basic classiﬁcation technique. Before introducing the
engines a theoretical algorithm forming the basic classiﬁcation technique is
introduced in this chapter. This algorithm provides a simpliﬁed view of how
robustness checking works. Properties and requirements of the algorithm
are discussed. In particular, approximation of reachability information
is presented since an exact computation is often practically infeasible.
Importantly, the influence of the approximation on the quality of the
analysis is theoretically investigated. As shown later, embedding
approximation in robustness checking is very useful since a good trade-off
between cost and quality can be reached.
5.1 Modelling CTFs in Circuits
In order to analyze the circuit’s behavior under CTFs arbitrary values have
to be injected at certain signals in the circuit since transient faults are
modeled as non-deterministic values. The thesis focuses on the logic level
and considers logical masking of transient faults. Therefore, faults in a
circuit are modeled at logic level with common fault injection techniques as
presented in the following.
The possible CTFs of a component are not modeled individually, as is done,
e.g., in ATPG, since too many faults would need to be modeled. Instead of
modeling all faults directly, a symbolic representation is chosen. That is,
the presented model represents all possible faults at a component at once.
This slightly increases the search space of a single problem instance but
reduces the overall number of problem instances that need to be considered.
Figure 5.1: Component $g_{new}$ encapsulates component g and logic to inject CTFs (signals $g_o$, $g'_o$, and fault predicate $p_g$)
Given a circuit C = (V,E) and a set of components U ⊆ V to be
classified. A fault modeling circuit $C_U = (V', E')$ that models CTFs at the
components in U is constructed as follows: Each component g ∈ U is
replaced by the construction illustrated in Figure 5.1. A new component
$g_{new}$ encapsulates the component g from the original circuit together with
a new free input $p_g$ called fault predicate¹. The component $g_{new}$ behaves
exactly like the component g when the fault predicate is set to zero (not
activated). Otherwise, when the fault predicate is set to one (activated), an
arbitrary value independent of g's function can be injected. That is, all
possible CTFs according to F(g) are symbolically modeled. Overall, a
component is virtually disconnected from its fan-out cone when the fault
predicate is activated. The set of all fault predicates is denoted by $P_U$ and
contains one fault predicate per component, i.e., $|P_U| = |U|$.
However, since transient faults are considered, only a single but arbitrary
time frame is manipulated temporarily through fault injection. In contrast,
a stuck-at fault from the SAFM manipulates a component permanently.
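The construction of $g_{new}$ behaves like a multiplexer controlled by the fault predicate. The following sketch illustrates this on a single AND gate; the wrapper with_fault_predicate and the explicit injected argument are hypothetical stand-ins for the free value that a formal engine would choose symbolically.

```python
def with_fault_predicate(g):
    """Wrap gate function g into g_new with a fault predicate p_g."""
    def g_new(inputs, p_g=0, injected=0):
        # p_g = 0: behave exactly like g; p_g = 1: output the injected value.
        return injected if p_g else g(*inputs)
    return g_new


def and_gate(a, b):
    return a & b


and_new = with_fault_predicate(and_gate)

print(and_new((1, 1)))                       # fault-free behavior
print(and_new((1, 1), p_g=1, injected=0))    # injected value overrides g
```

Because the injected value is independent of the gate's inputs, one wrapped component covers every possible faulty value at that component in the selected time frame.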
5.2 Models for Classiﬁcations
The basic classification technique to classify components with respect
to the EPPModel is presented in the following. Since the EPPModel is the
most general model considered in this thesis, the classification technique is
sufficiently general.
Two models that allow classifying the components according
to the proposed classes, k-non-robust and k-dangerous from Section 3.3, are
presented. Given is a circuit C = (V,E), a non-empty set of components to
be classified U ⊆ V, and an observation window $\bar{k} \in [0, k_{cmpl}]$.
¹ Note that the fault predicate is not contained in the set of primary inputs
of a circuit but in a dedicated set.
Figure 5.2: Model for classifying 0-non-robust components (instances C@0 and $C_U$@0 share inputs $X_0$ and states $S_0$; outputs $Y_0$ and $Y'_0$ are compared; fault signals $F_0$ and $F'_0$)
5.2.1 Model for Classifying k-non-robust Components
Figure 5.2 illustrates the model for classifying 0-non-robust components.
The circuit $C_U$ is constructed to inject CTFs at the components
of U. The circuits C and $C_U$ are modeled as a sequential equivalence
check for one time frame. More precisely, the instances C@0 and $C_U$@0
are stimulated by the same input stimuli, denoted by $X_0$. Further, both
circuits start at the same set of (fault-free) states $S_0$, which are called
injection states. Here, all reachable states are allowed, i.e., $S_0 = S^*$. Since
non-robust components are classified, the primary outputs are checked for
differences, i.e., $Y_0 \neq Y'_0$. Furthermore, the fault signals $F_0$ and $F'_0$ are
highlighted. The fault signal $F_0$ of the fault-free computation is assumed
to be always zero, because $F_0$ must not report any fault if there is no
internal fault. The set $P_U = \{p_{g_1}, \ldots, p_{g_{|U|}}\}$ contains all fault predicates
with respect to U. Exactly one fault predicate $p_{g_i}$ is set to one, i.e.,
$p_{g_1} + \ldots + p_{g_{|U|}} = 1$, because single CTFs are considered.
Suppose the fault predicate $p_{g_i} \in P_U$ is activated such that an arbitrary
value according to $F_N(g_i)$ is injected at the output of component $g_i$. All
remaining components behave as specified, since their fault predicates are
set to zero. If there exists a scenario $\tau = ((X_0), S_0)$, i.e., input stimuli $X_0$
for the primary inputs and an assignment to the injection state $S_0$, such that
$Y_0 \neq Y'_0$ holds and the fault signal does not report a fault, i.e., $F'_0 = 0$, the
component $g_i$ is classified as 0-non-robust and is stored in the set of
0-non-robust components.
This model is denoted as $N(C, S^*, U, 0)$, because it classifies non-robust
components of U in circuit C while considering all reachable states $S^*$ and
unrolling no additional time frame for propagation.
In general, not all non-robust components can be classified by considering
an observation window of length one. A model covering an observation
window of length two is shown in Figure 5.3. Two new instances of C are
appended to the existing instances, respectively. Note that the circuit $C_U$ is
used only in the first time frame since transient faults are modeled. The next
states $S_1$ and $S'_1$ are connected to the new instances, where both copies
are stimulated by the same input stimuli, denoted by $X_1$. If there exists a
scenario $\tau = ((X_0, X_1), S_0)$ with an activated fault predicate $p_{g_i}$ that leads
to differing outputs $Y_1$ and $Y'_1$ before the fault signal reports a fault, then
the component $g_i$ is classified as 1-non-robust. This model is denoted as
$N(C, S^*, U, 1)$ because one additional time frame is unrolled for propagation.
A model for an observation window of length k is constructed analogously
and denoted by $N(C, S^*, U, k)$; it classifies k-non-robust components.
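The idea behind the model can be imitated on a toy example by explicit enumeration instead of a formal engine. The sketch below rests on several assumptions that are not from the thesis: the two-component circuit is invented, it has no fault signal (so F′ is constantly 0), all states are treated as reachable, and the fault is injected in time frame 0 only.

```python
from itertools import product


def simulate(xs, s0, fault=None):
    """Simulate the toy circuit; fault=(component, value) acts in frame 0 only.

    Component "a" = x XOR s drives the primary output,
    component "b" = x AND s drives the single state bit.
    """
    s, outputs = s0, []
    for t, x in enumerate(xs):
        a, b = x ^ s, x & s
        if fault is not None and t == 0:
            name, value = fault
            a = value if name == "a" else a
            b = value if name == "b" else b
        outputs.append(a)
        s = b
    return outputs


def is_k_non_robust(component, k):
    """Search for a scenario whose output in frame k differs from the fault-free run."""
    for xs in product((0, 1), repeat=k + 1):      # input stimuli X_0 .. X_k
        for s0 in (0, 1):                         # injection state
            good = simulate(xs, s0)
            for value in (0, 1):                  # arbitrary injected value
                if simulate(xs, s0, (component, value))[k] != good[k]:
                    return True
    return False


for comp in ("a", "b"):
    print(comp, [is_k_non_robust(comp, k) for k in (0, 1)])
```

In this toy circuit, a fault at "a" is visible immediately, while a fault at "b" first corrupts the state and reaches the output one time frame later.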
This model is used to classify non-robust components and to compute
the sets $S_0, \ldots, S_{k_{cmpl}}$. Additionally, this model is used to compute
multiple scenarios as required for the grading-based robustness
measure P−RM. To compute the grading of non-robust components, the
function ψ(g, k) needs to be realized such that it returns the number of those
scenarios up to the scenario ratio λ described in Section 4.2. The function is
simply realized by counting the scenarios that lead to the faulty behavior.
Once the scenario ratio is reached, i.e., the number of scenarios exceeds the
pre-defined ratio, the computation stops.
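Counting scenarios with early termination can be sketched as a bounded loop. The following is an illustration under assumptions: scenarios are enumerated explicitly as input vectors, the predicate fails is a hypothetical stand-in for the check performed on the model, and counting stops as soon as the limit λ · Ψ is reached.

```python
from itertools import product


def count_failing(scenarios, fails, limit):
    """Count scenarios for which fails(...) holds, stopping at the given limit."""
    count = 0
    for scenario in scenarios:
        if fails(scenario):
            count += 1
            if count >= limit:
                break            # scenario ratio reached, stop enumerating
    return count


def all_scenarios(n_inputs):
    """All input vectors of a combinational circuit with n_inputs inputs."""
    return product((0, 1), repeat=n_inputs)


# Hypothetical failure condition: the fault propagates only if the first
# three inputs are 1, which gives 2 of the 16 scenarios.
def rare(bits):
    return bits[0] == 1 and bits[1] == 1 and bits[2] == 1


print(count_failing(all_scenarios(4), rare, limit=16))
print(count_failing(all_scenarios(4), lambda bits: True, limit=4))
```

The returned value corresponds to min{ψ, λ · Ψ}, which is exactly the quantity needed by the limited EPP.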
Using this model non-robust components are classiﬁed by incrementally
extracting all k-non-robust components and increasing k by one until
the maximal observation window k¯ has been explored. However, after
determining k-non-robust components stored in the set Sk, k-dangerous
components are determined using the following model. Components that
are already classiﬁed as k-non-robust components are not considered here
anymore. That means, the remaining components need to be analyzed:
U′ = U \ Sk.
Figure 5.3: Model for classifying 1-non-robust components
5.2.2 Model for Classifying k-dangerous Components
Figure 5.4 illustrates the model for classifying 0-dangerous components.
As in the model for k-non-robust components, the fault modeling
circuit CU′ is constructed according to U′.
The model is almost identical to the model of non-robust components
with the exception that the states S1 and S′1 are checked for diﬀerence
rather than the primary outputs.
Consider the set of fault predicates PU′ = {pg1 , . . . , pg|U′|}, where only
a single fault injection is allowed, and suppose the fault predicate
pgi ∈ PU′ is activated. If there exists a scenario τ = ((X0), S0) such that
S1 ≠ S′1 holds, i.e., the state is affected by the injected CTF, and the
fault signal F′0 does not report any fault, then the component gi is
classified as 0-dangerous.
The model is denoted by D(C, S∗,U′, 0) analogously to N (C, S∗,U, 0), and
D(C, S∗,U′, k) analogously to N (C, S∗,U, k) where an observation window
of size k is considered.
Both introduced models classify components against the class model
EPPModel, which enables computing WC−RM and P−RM. Since EPPModel
is the most general model, the models WCModel and CombModel are easily
derived by adapting it. The WCModel is derived by considering
only a single scenario. Further, the CombModel is derived by simply
considering only one time frame, since there are no state elements in
combinational circuits.
Figure 5.4: Model for classifying 0-dangerous components
5.3 Basic Algorithm
Having both models defined, a basic algorithm for classifying components is
presented in the following. Let sol(N (C, S∗,U, k)) and sol(D(C, S∗,U, k)) be
two functions that exactly classify k-non-robust and k-dangerous components,
respectively, according to the observation window of size k. The
concrete realization of both functions is deliberately left unspecified at this
point since both functions are later implemented by the engines. For now,
both functions serve as theoretical models in order to provide a basic
algorithm that computes the circuit's robustness regardless of how
they are practically implemented.
Algorithm 3:
The basic algorithm to assess the circuit’s robustness.
• Input: The circuit C = (V,E), a set of components to be
classified U ⊆ V, and a maximum observation window k¯ ∈ [0, kcmpl]
are given as input.
• Output: The classification of the components U into robust,
non-robust and k-dangerous components, i.e., a tuple (T,S,Dk¯) is
returned.
• Description: The algorithm, shown in Pseudocode 3, works
iteratively by incrementing the size of the observation window until k¯
and classifies k-non-robust and k-dangerous components for each
window. The algorithm starts with k = 0, and the respective sets of
non-robust and robust components are initialized to empty sets.
In Line 5, k-non-robust components are classified, and they are added
to the entire set of non-robust components in Line 6. The k-dangerous
components are classified in Line 7, where the robust components are
easily derived based on Lemma 3.1 and added to the entire set of
robust components in Line 8. Already classified components are
excluded from further analysis in Line 9.
The iteration stops if either the maximum observation window k¯ is
reached or there are no more components to be classified (U = ∅).
Finally, the components partitioned into the classes are returned.

 1  begin
 2    k = 0;
 3    S = T = ∅;
 4    while true do
 5      Sk = sol(N (C, S∗,U, k));
 6      S = S ∪ Sk;
 7      Dk = sol(D(C, S∗,U \ Sk, k));
 8      T = T ∪ U \ (Sk ∪ Dk);
 9      U = U \ (Sk ∪ Dk);
10      if k = k¯ or U = ∅ then
11        return (T,S,Dk)
12      end
13      k++;
14    end
15  end
Pseudocode 3: Basic classification algorithm.
5.4 Handling Reachability Information
So far, complete reachability information is assumed on S0 for fault injection.
But computing all reachable states is a hard problem itself. A high accuracy
of the reachable states comes inherently with high computation costs.
Approximations are frequently used when an exact result is
computationally too expensive. The classification of the components requires
covering all possible scenarios. In particular, covering the exact set of
reachable states is a nontrivial requirement. This requirement is completely
abstracted in Algorithm 5.3. However, the explicit computation of this set
becomes very hard. Several methods have been proposed during the last
years, e.g., [SVD08, CCK03, CMB06]. For example, BDD-based fixed-point
computation is often used [CGP01]. But when considering larger systems
this technique reaches its limit due to the state explosion problem [CGP01].
In practice, however, an approximation often suffices and still provides exact
results. There are two ways to approximate the exact set of reachable states:
1) over-approximation and 2) under-approximation. In the experiments it
turned out that exploiting both approximations of reachable states is very
effective in order to provide a high quality analysis within a reasonable run
time.
5.4.1 Inﬂuences of Approximations
Approximations of the state space are frequently used within formal
verification. But the result of each verification step needs to be properly
interpreted, and the same holds for robustness checking. Exploiting
approximations in robustness checking has a direct influence on the
classifications and therefore on the quality of robustness checking, which
needs to be investigated. This theoretical analysis is presented in the
following.
Over-approximation of Reachable States
Classifying the components based on an over-approximation of the reachable
states means that CTFs might be injected into states that are non-reachable.
Consequently, scenarios are covered that are not possible during operation.
This may lead to spurious classiﬁcations and may yield a lower value for
the robustness. This is more formally emphasized as follows:
Theorem 5.1. Let Sˆ be an over-approximation of the reachable states,
i.e., S∗ ⊆ Sˆ, and k ∈ [0, kcmpl]. Then it holds: Sk ⊆ Sˆk with Sˆk =
sol(N (C, Sˆ,U, k)) and Sk = sol(N (C, S∗,U, k)).
Proof. Sketch: Since S∗ ⊆ Sˆ, i.e., more states are considered for injecting
and propagating of a CTF there might be a scenario that leads to a
classiﬁcation of a component to be k-non-robust that would not be found
by considering S∗.
That means, the classiﬁcation based on an over-approximation of reach-
able states over-approximates k-non-robust components stored in Sk ⊆ Sˆk.
The components of Sˆk \ Sk are called spurious k-non-robust components
since the components might be misleadingly classiﬁed.
Example 5.1. Consider reachable states in a Triple Modular Redundancy
(TMR) circuit consisting of three equal modules and a majority voter.
All three modules are in the same reachable state assuming that
the circuit operates fault-free, i.e., S0 = S∗. Therefore, all CTFs within
the modules are masked by the majority voter and these components are
classified as robust.
A simple over-approximation and its consequences for the classifications
are as follows: Suppose all three modules can be in different states
and may therefore compute different outputs. A CTF within a module cannot
be properly masked since the correct majority cannot be computed.
Consequently, all components of the modules might be classified as spurious
non-robust.
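The effect described in Example 5.1 can be reproduced with a tiny explicit enumeration. The sketch below is an illustration under simplifying assumptions that are not taken from the thesis: each module is reduced to a single state bit that is forwarded to the voter, a CTF replaces that bit by an arbitrary value, and only an observation window of length one is considered. The function name `non_robust_modules` is hypothetical.

```python
from itertools import product


def majority(a, b, c):
    """Majority voter over three module outputs."""
    return int(a + b + c >= 2)


def non_robust_modules(states):
    """Return the modules whose single fault changes the voter output
    for at least one of the given states (toy 0-non-robust check)."""
    found = set()
    for state in states:
        golden = majority(*state)
        for module, value in product(range(3), (0, 1)):
            faulty = list(state)
            faulty[module] = value  # inject an arbitrary value at one module
            if majority(*faulty) != golden:
                found.add(module)
    return found


# Exact reachable states: all three modules agree.
exact_states = [(0, 0, 0), (1, 1, 1)]
# Coarse over-approximation: every combination of module states.
over_states = list(product((0, 1), repeat=3))

exact_result = non_robust_modules(exact_states)
over_result = non_robust_modules(over_states)
```

With the exact states no module is reported, while the over-approximation reports all three modules, so the exact result is a subset of the over-approximated one, in line with Theorem 5.1.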
The analogous observation is presented for k-dangerous components.
Theorem 5.2. Let Sˆ be an over-approximation of the reachable states,
i.e., S∗ ⊆ Sˆ and, k ∈ [0, kcmpl]. Then it holds: Dk ⊆ Dˆk with Dk =
sol(D(C, S∗,U, k)) and Dˆk = sol(D(C, Sˆ,U, k)).
Proof. Analogous to the proof of Theorem 5.1.
The components of Dˆk \ Dk are called spurious k-dangerous components;
they are contained in the computed set Dˆk.
Having determined both an over-approximated set of k-non-robust
components and an over-approximated set of k-dangerous components
implies an under-approximation of the robust components. That means,
using an over-approximation of the reachable state space, the classification
of k-non-robust and k-dangerous components yields an under-approximation
of the robust components, i.e., Tˆ = U \ (Sˆk ∪ Dˆk) ⊆ T. This observation
has a consequence for the robustness measure. Suppose the spurious
k-non-robust and spurious k-dangerous components are determined. Since
only a subset of the robust components is computed, an under-approximation
of the lower bound for the robustness is given by:
\[ \check{R}^{\bar{k}}_{lb} = \frac{|\hat{T}|}{|U|} \le R^{\bar{k}}_{lb} \tag{5.1} \]
For practical purposes of robustness checking using an over-approximation
of reachable states yields a safe lower bound for the robustness. That means,
some components may be misleadingly classiﬁed as non-robust, which would
be robust under exact assumptions (S∗). However, the classiﬁed robust
components are definitely robust components. Refinement methods are
available (e.g., [CGJ+00]) to narrow down this problem, and they can also
be applied to robustness checking.
Under-approximation of Reachable States
An under-approximation of the reachable states contains fewer states than
the exact set of reachable states. Consequently, states might be missed for
injection and propagation. For example, a function of an ALU cannot be acti-
vated since the respective state is not contained in the under-approximation.
Hence, non-robust components might be classiﬁed as robust yielding an
over-approximation of the upper bound for the robustness. More formally:
Theorem 5.3. Let Sˇ be an under-approximation of the reachable states
with Sˇ ⊆ S∗ and k ∈ [0, kcmpl], then it holds: Sˇk ⊆ Sk with Sˇk =
sol(N (C, Sˇ,U, k)) and Sk = sol(N (C, S∗,U, k)).
Proof. Sketch: Since Sˇ ⊆ S∗, i.e., states are missed for injection and
propagating of CTFs and therefore scenarios might be missed to classify a
component to be non-robust.
The under-approximation of k-non-robust components yields a safe
upper bound for the robustness. Given the set Sˇk of classified k-non-robust
components, an over-approximation of the upper bound of the
robustness is given by:
\[ \hat{R}^{\bar{k}}_{ub} = 1 - \frac{|\check{S}_{\bar{k}}|}{|V|} \ge R^{\bar{k}}_{ub} \tag{5.2} \]
However, the quality of the computed bounds directly depends on the
strength of the approximation of the reachable states. The experiments will
show that relevant classes of circuits can be exactly classiﬁed even when
the approximation is very coarse.
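As a small worked example, the two safe bounds from Equations (5.1) and (5.2) can be evaluated as plain ratios. The sketch below is only illustrative: the function names and the component sets are made up, and the denominators follow the two equations as printed (|U| for the lower bound, |V| for the upper bound).

```python
from fractions import Fraction


def safe_lower_bound(robust_under_over_approx, classified_components):
    """Equation (5.1): components proven robust under an over-approximation
    of the reachable states, divided by the number of components in U."""
    return Fraction(len(robust_under_over_approx), len(classified_components))


def safe_upper_bound(non_robust_under_under_approx, all_components):
    """Equation (5.2): one minus the share of components proven non-robust
    under an under-approximation of the reachable states."""
    return 1 - Fraction(len(non_robust_under_under_approx), len(all_components))


# Hypothetical circuit with ten components, all of them to be classified.
components = {f"g{i}" for i in range(10)}
proven_robust = {"g0", "g1", "g2", "g3", "g4", "g5"}
proven_non_robust = {"g8", "g9"}

lower = safe_lower_bound(proven_robust, components)
upper = safe_upper_bound(proven_non_robust, components)
```

In this made-up instance the robustness is bracketed between 3/5 and 4/5; the two components that are neither proven robust nor proven non-robust account for the gap.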
5.5 Classiﬁcation by Means of Model Checking
The classification of the components can be naïvely translated into a model
checking problem. Industrial-strength model checkers are highly optimized
and can handle complex circuits [Par10] such that robustness checking can
benefit from this strength. A naïve approach to classify the components
of a circuit is to translate the problem into a model checking problem by
generating a circuit model with fault injection and a temporal property for
each component. Consequently, |U| properties need to be generated and
verified. For each property the model checker determines whether the
property holds or fails, which corresponds to classifying the component
into the respective class.
Figure 5.5: Classification based on model checking
The implementation effort is kept very low since only the properties
need to be generated instead of developing dedicated algorithms. In the
following section the classification of non-robust components is translated
into a model checking problem. For that purpose, a model that needs to be
verified and a CTL property are introduced to determine non-robust components.
5.5.1 Problem Formulation
The problem formulation is very similar to that model presented in Sec-
tion 5.2. Figure 5.5 depicts the problem formulation that is generated for
each component. Given is a circuit C = (V,E) and a component g ∈ V
that has to be classiﬁed. The term Spec represents the original circuit
C and the term Impl represents the circuit C with fault injection logic at
component g as introduced in Section 5.1. Since a CTF occurs at an
arbitrary single time frame, additional logic is inserted into
Impl that ensures that only a single CTF is injected. The input injection
controls when a fault is injected, i.e., the time frame is chosen by the
model checker. If this input is set to one, a CTF at component g is
injected.
Both circuits are stimulated with the same input illustrated by PI. The
output property reports when at least one output pair of Spec and Impl
becomes unequal by setting the signal to 1.
Having this construction the following CTL property is used to classify
non-robust components:
AG(injection =⇒ AG¬property)
The property states that globally for all paths the injection of a CTF implies
that globally for all paths the outputs are equal. If this property holds
for component g the component is classiﬁed as robust. Otherwise, if the
property fails the component is classiﬁed as non-robust.
5.5.2 Discussion
This approach comes with several drawbacks that show up as long run
times, which is also demonstrated by the experiments later in this thesis.
However, the run time depends on the property. There may exist similar
constructions that yield better run times. But in general, domain-specific
knowledge about the underlying problem cannot be easily exploited.
Industrial model checkers are usually closed-source applications, and
therefore adding certain features is difficult if not impossible. For example,
coarse approximations of the reachable states often suffice to get accurate
results. Embedding those approximations is impossible if dedicated
parameters of the verification tool are hidden or not available.
5.6 Low-Level Optimization Techniques
Classifying the components of a circuit corresponds to solving several
complex problems since the number of scenarios that need to be searched
grows exponentially with the number of inputs and considered time frames.
To reduce the overall computational effort, optimizations are presented.
Even on this level of the algorithm the classification can be improved.
Structural knowledge is frequently exploited in formal verification
to reduce the overall verification run time, e.g., [XWMB12, BIMM12,
BKA02, FSF11].
In the following a low-level technique is described that has been published
in [FHD+11].
Figure 5.6: Example circuit in DAG-based representation with weighted edges
5.6.1 Shortest Path Analysis
In order to check whether a fault is observable at the primary outputs, the
affected values need to be propagated over the state elements for several time
frames. A lightweight pre-processing step has been published in [FHD+11] and
is presented below.
In Figure 5.6 a sequential circuit C = (V,E) is shown in a DAG-based
representation with weighted edges. Each edge e = (v, v′) ∈ E is
weighted as follows: if v′ is a state element then the edge is weighted with
one, otherwise the edge is weighted with zero. Assume a CTF occurs at
component e. To propagate the fault to the primary output o at least two
time frames need to be considered since the shortest path to a primary
output includes two state elements, FF1 and FF2. This observation is more
formally emphasized in the following.
Deﬁnition 5.1. Given a circuit C = (V,E) and a component g ∈ V . The
Minimal Propagation Path MPP of component g is the shortest path from
g to a primary output with respect to all primary outputs.
MPP can be computed for each component based on, e.g., Dijkstra's
shortest path algorithm [Dij59], which can be done very fast since the
complexity is O(n log n + m), where m is the number of edges and n the
number of nodes of the circuit graph.
Lemma 5.1. Given a circuit C = (V,E) and a component g ∈ V .
If the component g is k-non-robust then it holds: k ≥ MPP(g).
Proof. Suppose the component g is k-non-robust. Every path from g to a
primary output passes at least MPP(g) state elements, so at least MPP(g)
time frames need to be considered to propagate a CTF at g to at least one
primary output; otherwise the fault cannot be propagated to the outputs.
Hence k ≥ MPP(g).
This idea is exploited during the classification to reduce the number
of problems to be solved. Most engines rely on SAT solvers, thus the
number of SAT calls is significantly reduced.
Recall the function sol(N (C, S∗,U, k)) from Section 5.3 that computes
k-non-robust components. To reduce the search space of this function,
components that do not match the necessary condition of Lemma 5.1 can
be excluded from the classification for the current value of k. That means,
sol(N (C, S∗, filterMPP(U, k), k)) with filterMPP(U, k) = {g ∈ U | MPP(g) ≤
k} is called.
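The shortest path analysis and the resulting filter can be sketched as follows. This is a minimal illustration, not the RobuCheck implementation: the circuit is given as a hypothetical edge list, the function names are made up, and the example netlist is a simple chain chosen for illustration rather than the circuit of Figure 5.6. Dijkstra's algorithm is run once on the reversed graph, starting from all primary outputs, so that the distance of a node equals its MPP value.

```python
import heapq


def minimal_propagation_paths(edges, state_elements, primary_outputs):
    """Return MPP(g) for every node that can reach a primary output.

    An edge (v, w) costs 1 if w is a state element and 0 otherwise.
    """
    reverse = {}
    for src, dst in edges:
        weight = 1 if dst in state_elements else 0
        reverse.setdefault(dst, []).append((src, weight))

    dist = {out: 0 for out in primary_outputs}
    heap = [(0, out) for out in primary_outputs]
    heapq.heapify(heap)
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        for pred, weight in reverse.get(node, []):
            candidate = d + weight
            if candidate < dist.get(pred, float("inf")):
                dist[pred] = candidate
                heapq.heappush(heap, (candidate, pred))
    return dist


def filter_mpp(components, mpp, k):
    """Keep only components that can be k-non-robust according to Lemma 5.1."""
    return {g for g in components if mpp.get(g, float("inf")) <= k}


# Hypothetical chain: e -> FF1 -> a -> FF2 -> o
edges = [("e", "FF1"), ("FF1", "a"), ("a", "FF2"), ("FF2", "o")]
mpp = minimal_propagation_paths(edges, {"FF1", "FF2"}, {"o"})
candidates_k1 = filter_mpp({"e", "a"}, mpp, 1)
candidates_k2 = filter_mpp({"e", "a"}, mpp, 2)
```

In this chain the component e has an MPP of two, so it is skipped for k = 1 and only enters the classification once k = 2 is reached.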
Chapter 6
RobuCheck - An Integrated
Robustness Checker
The fundamentals for assessing the robustness of a circuit have been
introduced in the previous chapters. However, the implemented engines
classifying the components of a circuit have not been presented so far. In
this chapter, various engines or classifiers showing different strengths are
introduced. Integrating different engines establishes a powerful verification
tool. One engine may classify one circuit very effectively while another
engine may take long run times and vice versa. All these engines are tightly
integrated in RobuCheck [FFSD10]: a powerful push-button tool that
implements a highly optimized flow [FHD+11] that automatically assesses
the robustness of a circuit. Comparable to modern model checking tools,
various engines are orchestrated within a unified powerful verification tool.
Before introducing the internals of RobuCheck precisely, the integrated
engines are presented in separate sections.
Mainly three corners of robustness checking need to be addressed to
effectively classify the circuit. Improving these corners will improve the
overall verification run time. The more effective the approaches, the better
robustness checking scales for larger circuits while providing high
quality results.
• Since computing the exact reachable states is hard, approximations are
used. However, bad approximations may not yield high quality results.
Consequently, a focus is on determining suitable approximations that
allow providing accurate results.
• Robustness checking is performed during the design process once a
designer has implemented hardening techniques and the robustness
needs to be assessed. The performance of the classification directly
influences the run time of the robustness checker and therefore
influences the outcome of the designer.
• A sparse result in terms of broad robustness bounds that was
computed very fast is not useful for the designer. Those useless results are
often caused by incomplete computations coming from bad
approximations. That means, completeness and good approximations need
to be strongly considered. Proving the absence of faulty behavior
typically requires reaching the completeness threshold. But reaching
this value is often infeasible since computing this value itself is usually
difficult. However, completeness can alternatively be guaranteed by
other means, which moreover provides exact results.
This thesis focuses mainly on engines that are based on formal methods
and fall back to formal reasoning engines such as a SAT solver. While these
formal engines provide a complete analysis, since they cover all possible
scenarios, a simulation-based engine provides a fast but incomplete analysis
and can handle very large circuits. Nevertheless, a simulation-based engine is
essential for an effective flow of robustness checking since pre-processing may
significantly reduce the search space, which increases the performance
of the classification. Therefore, besides the mainly formal engines,
a simulation engine is presented in this chapter as well.
All engines are briefly introduced in the following and afterwards
introduced in detail in the respective sections:
BMC-classiﬁer: The ﬁrst approach is based on Bounded Model Checking
and is named as BMC-classiﬁer. This classiﬁer basically checks a
series of safety properties that are dedicated to classify k-non-robust
and k-dangerous components. This classiﬁer was initially proposed
in [FSFD11]. For that, properties are introduced that allow to ef-
fectively classify all components within one problem instance. Theo-
retically, the approach is complete once the respective completeness
threshold is known.
ATPG-classifier: The second approach is built on SAT-based
sequential ATPG and is named ATPG-classifier. Sequential SAT-based
ATPG is adapted for robustness checking, in particular to handle CTFs.
The ATPG-classifier was briefly proposed in [FSFD11]. The main
difference between the ATPG-classifier and the BMC-classifier is that
the components are classified separately, one at a time.
That means, for each component a dedicated problem instance is
created and solved. This approach is very close to the naïve model
checking approach presented in Section 5.5. But ATPG shrinks the
problem instance a priori since certain parts of the circuit can be
safely ignored, which is possible in model checking only with significant
additional costs. Consequently, this may significantly reduce the size
of the problem instance and therefore the size of the search space as
well.
Both engines consider the circuit theoretically over an arbitrary number
of time frames and provide therefore a complete classiﬁcation. However,
the engines reach their limits when large circuits are considered that consist
of several thousand components. The problem instances get too large and
intractable to solve them eﬃciently by a formal reasoning engine like a SAT
solver.
Abstraction and decomposition techniques have been proposed for model
checking in general, for BDD-based model checking as well as for SAT-based
checking [GS05, CGP01, CGJ+00, CLM89, EMA10, XWMB12], to
enhance the ability to verify larger circuits effectively, i.e., to overcome
complexity issues. Both techniques have been proposed in [FFA+12, FF10]
and are further improved in this thesis. Both techniques are implemented in
two additional classifiers which are briefly described below.
ITP-classiﬁer: The BMC-classiﬁer and ATPG-classiﬁer rely basically on
unrolling the transition relation up to the completeness threshold, i.e.,
up to the reachability diameter of the underlying automaton (intro-
duced in Deﬁnition 2.20 on page 24). The completeness thresholds can
be very large even for smaller circuits as shown in Section 3.4. Thus,
the transition relation must be unrolled several thousand times. This
is practically not manageable due to limited computational resources
such as run time and space. In order to tackle this problem, Craig
interpolation [Cra57, McM03] is exploited in order to still provide a
complete and sound classification while computing an approximate
state space by interpolants. Craig interpolants were first exploited
in McMillan's work on SAT-based model checking with
interpolation [McM03]. This approach is able to effectively prove properties on
industrial-sized circuits, in particular without unrolling the transition
relation up to the completeness threshold. Interpolants are used to
automatically derive an approximation of the state space while
abstracting facts that are irrelevant for proving the respective property.
In this thesis, this kind of model checking is adapted for robustness
checking. Furthermore, as a step beyond McMillan's approach, a fixed-point
computation using interpolation on the property is introduced, allowing
for a complete classification for unbounded observation windows as well as a
new kind of computing approximations using interpolation. This
allows classifying unbounded dangerous components.
COMP-classiﬁer: Applying robustness checking on, e.g., complex proces-
sor designs, composed of several high level modules, suﬀers from very
long run times. Since the complexity signiﬁcantly increases for complex
designs other methods are required. A compositional approach [FF10]
named as COMP-classiﬁer is introduced in this thesis that reduces the
problem into smaller sub-problems. This classifier considers suitable
subcircuits of a circuit in separate problem instances. All components
of a subcircuit are locally classified and eventually composed with the
entire circuit if necessary. Classifying the components locally
on subcircuits results in smaller problem instances, which
speeds up the classification significantly because the surrounding logic
is ignored. The approach is very powerful for certain kinds of circuits
but strongly depends on the choice of the subcircuit. Guidelines for
choosing suitable subcircuits are provided.
SIM-classiﬁer: Furthermore, a simulation-based approach named as SIM-
classiﬁer is introduced as well. The SIM-classiﬁer performs random
simulation and randomly injects CTFs. This classiﬁer is not only
used as a pre-processing step but is also tightly integrated in the
classiﬁcation process of the formal engines. A common simulation-
based engine is adapted to robustness checking in terms of the fault
model and component model.
After a precise introduction of all classiﬁers the diﬀerences are discussed.
Due to the diversity in terms of diﬀerent manner of classiﬁcation a wide
range of circuits can be eﬀectively classiﬁed. Further, the classiﬁers are
diﬀerent in terms of time and space complexity presented in a separate
section. In the last section of this chapter, the highly-optimized ﬂow of
RobuCheck is presented.
6.1 BMC-classiﬁer
A theoretical algorithm for classifying the components of a circuit has
been presented in the previous chapter in Section 5.3 but serves only as a
theoretical model. In this section, a classifier based on formal methods,
built on Bounded Model Checking (BMC) and named BMC-classifier, is introduced.
The basic idea of this classiﬁer is to check two kinds of safety properties
covering the computational model to classify k-non-robust and k-dangerous
components. BMC is then used to determine solutions of the BMC formula
using a SAT solver. More precisely, it is checked whether the properties hold
on the circuit that corresponds to whether components are k-non-robust
or k-dangerous. Finally, as already introduced in the basic algorithm the
robust components are easily derived from the classiﬁcation of k-non-robust
and k-dangerous components.
At ﬁrst, the problem formulation, i.e., the two properties are introduced
and translated in a series of SAT problems from an instantiated BMC
formula. After introducing the properties, a universal algorithm is presented
to classify k-non-robust and k-dangerous components, respectively.
In general, the BMC-classiﬁer is complete, i.e., all components are
completely classiﬁed. In the experiments it turned out that the BMC-
classiﬁer is very eﬀective for relevant classes of circuits. However, in practice
the BMC-classifier cannot cover all reachable states since the computation
of those states is computationally very expensive. Due to the limited
computational resources certain scenarios might be missed and the circuit
may only be partially classified. However, in order to overcome those
complexity issues approximation techniques for the reachable state space are
introduced.
Moreover, at the end of this section, characteristics of the approach are
discussed.
6.1.1 Problem Formulation
BMC is adapted for robustness checking in the following. Two kinds
of properties (formulas) are introduced: 1) a property for k-non-robust
components and 2) a property for k-dangerous components. In the context
of BMC the formulas can be seen as safety properties, which is
explained in more detail later. Reconsider the general BMC formula introduced
in Section 2.4.1:
\[ BMC(l) = I(s_0) \wedge \bigwedge_{0 \le i < l} T(s_i, s_{i+1}) \wedge P(s_l) \]
where I is a predicate describing the initial state, T is the transition relation,
and P is the negated property. If the formula is satisfiable for some l then
there is a path of length l from an initial state s0 to state sl that satisfies
the negated property P (violates the desired specification) and therefore
the circuit is buggy.
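To make the meaning of the BMC formula concrete, the following sketch checks the same condition by explicit state enumeration instead of a SAT call. It is only an illustration under assumptions: the function `bmc` and its parameters are hypothetical, and enumerating states explicitly is feasible for toy systems only.

```python
def bmc(initial_states, successors, violates, l):
    """Return True if some path of exactly l transitions leads from an
    initial state to a state satisfying the negated property.

    initial_states plays the role of I, successors of T, violates of P.
    """
    frontier = set(initial_states)
    for _ in range(l):
        frontier = {t for s in frontier for t in successors(s)}
    return any(violates(s) for s in frontier)


def counter_successors(state):
    """Toy system: a 2-bit counter that wraps around after 3."""
    return {(state + 1) % 4}


def counter_at_three(state):
    """Negated property: the counter value 3 is considered a violation."""
    return state == 3


first_failing_bound = next(
    l for l in range(8) if bmc({0}, counter_successors, counter_at_three, l)
)
```

For the toy counter the formula is unsatisfiable for l = 0, 1, 2 and becomes satisfiable at l = 3, which is the length of the shortest path to the violating state.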
To perform robustness checking using BMC a formula for P is required:
a property P has to be defined which injects CTFs, propagates the affected
states, and forces the difference of the primary outputs and states,
respectively. Remember, faults on k-non-robust components are observable at
the primary outputs and faults on k-dangerous components corrupt the
states after k time frames. The two properties classifying k-non-robust
and k-dangerous components are introduced in the following that cover the
notion of the computational model presented in Section 5.2. Finally, an
algorithm is introduced that classiﬁes the components with respect to the
introduced properties. This algorithm can be embedded into the theoretical
algorithm presented in Section 5.3 resulting in a practically useful classiﬁer.
Property for k-non-robust Components
The property for classifying k-non-robust components is divided into three
parts: 1) injection, 2) propagation, and 3) diﬀerence of primary outputs.
The property is denoted by PN (U, l, k) where U ⊆ V is the set of components
that needs to be classiﬁed, l describes an index of states for fault injection
(number of steps from the initial state), and k is the number of steps from
injection into state sl to the time frame l+k where the primary outputs are
checked. The entire number of steps computed by the property corresponds
to the size of the observation window, i.e., k exactly describes the size of
the observation window and is also used in this way.
Part 1 In the ﬁrst part fault injection is performed: As described before
in Section 5.1 the circuit is copied and logic to inject CTFs is introduced at
the components of U, i.e., the fault modeling circuit CU with new inputs
PU is obtained. The fault modeling basically follows the construction from
SAT-based debugging [SVAV05] where values are injected to determine
possibly faulty gates in a buggy circuit. However, in this modeling arbitrary
values are particularly injected as faults. The set PU contains all fault
predicates enabling the fault injections. Since single faults are considered,
exactly one arbitrary predicate is activated. This constraint is given by
φone where
\[ \varphi_{one} \Leftrightarrow \sum_{p \in P_U} p = 1. \]
After modeling both circuits C and CU the FSMs with transition relations
T and TU are derived, respectively. The formula to inject CTFs is then
given by:
inj(U, l) ⇔ T (sl, sl+1) ∧ TU(sl, s′l+1) ∧ φone
Both transition relations are stimulated by the same input, which is omitted
here to keep the formulas simple. The fault is injected into the same state sl,
which represents the fault injection state. The successor states sl+1 and s′l+1
might be unequal since a fault is injected. Consequently, the formula inj
corresponds to the model from Section 5.2.1 unrolled for one time frame.
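Since the formula is eventually handed to a SAT solver, the single-fault constraint φone has to be expressed in CNF. One common way to do this is the pairwise encoding sketched below. This is an assumption for illustration only: the thesis does not state which encoding is used, and the function names are hypothetical. Literals are written as signed integers in the usual DIMACS style.

```python
from itertools import combinations, product


def exactly_one(variables):
    """Pairwise CNF encoding of 'exactly one of the variables is true'."""
    clauses = [list(variables)]  # at least one predicate is active
    clauses += [[-a, -b] for a, b in combinations(variables, 2)]  # at most one
    return clauses


def satisfies(clauses, assignment):
    """Evaluate a CNF under a complete assignment {variable: bool}."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )


# Three fault predicates, numbered 1..3 as hypothetical SAT variables.
fault_predicates = [1, 2, 3]
phi_one = exactly_one(fault_predicates)
models = [
    bits
    for bits in product([False, True], repeat=len(fault_predicates))
    if satisfies(phi_one, dict(zip(fault_predicates, bits)))
]
```

The pairwise encoding needs a quadratic number of clauses in |PU|; for large component sets, more compact encodings are usually preferred.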
Part 2 The second part propagates the correct and affected states over
k − 1 time frames after fault injection:
\[ prop(l, k) \Leftrightarrow \bigwedge_{i=l+1}^{l+k-1} T(s_i, s_{i+1}) \wedge T(s'_i, s'_{i+1}) \]
Both copies of the transition relation are stimulated by the same inputs as
well and perform a transition from a current state to a successor state. Note
that the original transition relation T is taken for propagation instead of TU
since a fault is only injected into one time frame.
Part 3 The third part deﬁnes the constraint particularly for determining
k-non-robust components. The following formula forces the primary outputs
to be diﬀerent in the last time frame l + k:
\[
N(l, k) \Leftrightarrow Y_{l+k} \neq Y'_{l+k}
\]
The conjunction of these three parts leads to:
\[
P_N(U, l, k) = \mathit{inj}(U, l) \wedge \mathit{prop}(l, k) \wedge N(l, k) \tag{6.1}
\]
The state sl for injecting a fault is not constrained in the formula and
can be chosen arbitrarily. However, k-non-robust components are only
classified based on reachable states. Thus, sl must be restricted to
reachable states. This is realized by creating a BMC problem with the
introduced property PN (U, l, k). The instantiated BMC problem, denoted by
NBMC, is given by:
\[
N_{\mathrm{BMC}}(U, l, k) = \overbrace{I(s_0) \wedge \bigwedge_{0 \le i < l} T(s_i, s_{i+1})}^{\text{prefix}} \wedge \overbrace{P_N(U, l, k)}^{\text{suffix}} \tag{6.2}
\]
In each circuit instance for fault-free and faulty computation the fault signal
F is forced to be zero since scenarios are computed without ﬂagging the
fault signal, i.e., when faults are not reported. In order to keep the formulas
simple this constraint is omitted.
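As a worked instance, the formula can be written out for the illustrative values l = 1 and k = 2 (chosen here only as an example). The prefix contributes one fault-free transition, inj covers the time frame from s1 to s2, prop covers the single remaining time frame, and N compares the outputs in time frame 3:

```latex
\begin{align*}
N_{\mathrm{BMC}}(U, 1, 2) ={}& I(s_0) \wedge T(s_0, s_1)
    && \text{(prefix)} \\
  & \wedge\; T(s_1, s_2) \wedge T_U(s_1, s'_2) \wedge \varphi_{\mathrm{one}}
    && (\mathit{inj}(U, 1)) \\
  & \wedge\; T(s_2, s_3) \wedge T(s'_2, s'_3)
    && (\mathit{prop}(1, 2)) \\
  & \wedge\; Y_3 \neq Y'_3
    && (N(1, 2))
\end{align*}
```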
Figure 6.1: Illustration of formula NBMC(U, l, k)
The preﬁx of the formula computes states reachable from the initial
state I(s0) to the state sl in l steps which corresponds to l time frames.
Therefore, the parameter l adjusts how deep in the state space the fault is
injected and plays an important role when considering completeness aspects
later in this chapter.
The intention of the formula NBMC is illustrated in Figure 6.1. The big
circle denotes the entire state space S. The dashed objects represent the
prefix of the NBMC formula and the parts introduced before:
1. BMC marks the path from the initial state to a reachable state in l
steps,
2. inj(U, l) performs the fault injection into the state sl. After fault
injection, the states may differ, which is illustrated by divergent arrows,
3. prop(l, k) propagates the possibly different states over k − 1 time
frames, and finally,
4. N(l, k) forces the primary outputs computed after k steps to be
different.
A SAT solver is used to determine the solutions of the instantiated NBMC
formula, which directly correspond to k-non-robust components, for any
l ∈ [0, lcmpl] and k ∈ [0, kcmpl]. As introduced in Section 2.4.1, the
parameter lcmpl is the completeness threshold that ensures that all
reachable states are covered. The NBMC(U, l, k) formula is translated into
a CNF ΦN (U, l, k), which is then checked for satisfiability.
Lemma 6.1. The CNF ΦN (U, l, k) is satisﬁable if and only if there is a
component g ∈ U that is k-non-robust for some l and k.
If ΦN is satisfiable, the satisfying assignment is extracted, in particular
the assignments of the fault predicates PU. Suppose the fault predicate
pg ∈ PU has been activated by the SAT solver, i.e., a CTF at component
g has been injected. Then the respective component g is classified as
k-non-robust. By determining all solutions of the CNF with respect to the
fault predicates PU, all k-non-robust components are determined with
respect to l and k and are stored in the set Slk.
Enumerating all k-non-robust components is easily realized by iteratively
calling the SAT solver to determine a solution. If a solution is
found, i.e., the CNF is satisfiable, the activated fault predicate pg ∈ PU is
extracted, a unit clause {¬pg} is added to the CNF, i.e., ΦN = ΦN ∪ {¬pg},
and ΦN is checked again by the SAT solver. This prevents the SAT solver
from delivering the same solution again. Eventually, the CNF formula
becomes unsatisfiable. Then all k-non-robust components have been
determined with respect to the BMC instance adjusted by l and k, i.e., the
set Slk is entirely computed. If not all components of U are classified, the
value of l is increased by one and further components are determined until
lcmpl is reached. These steps are performed for each k ∈ [0, kcmpl].
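The enumeration with blocking unit clauses can be sketched as follows. This is a minimal illustration, not the implementation used in the thesis: a brute-force search stands in for the SAT solver, and the toy CNF (three fault predicates 1 to 3, an auxiliary variable 4 meaning "outputs differ") is an assumption made up for the example.

```python
from itertools import product

def solve(clauses, num_vars):
    """Brute-force stand-in for a SAT solver: returns a model dict or None."""
    for bits in product([False, True], repeat=num_vars):
        model = {v + 1: bits[v] for v in range(num_vars)}
        if all(any(model[abs(lit)] == (lit > 0) for lit in c) for c in clauses):
            return model
    return None

def enumerate_activated(clauses, num_vars, predicates):
    """Collect every predicate activated in some model, blocking each one afterwards."""
    clauses = list(clauses)
    found = []
    while (model := solve(clauses, num_vars)) is not None:
        active = [p for p in predicates if model[p]]
        if not active:          # cannot happen under the exactly-one constraint
            break
        for p in active:
            found.append(p)
            clauses.append([-p])   # unit clause {¬p_g} blocks this solution
    return found

# Exactly one of p1..p3 is active; variable 4 ("outputs differ") must hold and
# is only possible for faults at components 1 and 3 (hypothetical circuit).
cnf = [[1, 2, 3], [-1, -2], [-1, -3], [-2, -3], [4], [-4, 1, 3]]
found = enumerate_activated(cnf, num_vars=4, predicates=[1, 2, 3])
```

Each blocking clause removes all models of one component at once, so the number of solver calls is bounded by the number of components plus one.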
Property for Dangerous Components
This section introduces the property for determining k-dangerous
components, denoted by PD analogously to PN. Like the property for
k-non-robust components, the formula consists of three parts. The
difference between classifying k-dangerous components and k-non-robust
components is that in the last time frame the states are forced to be
different instead of the primary outputs. That means, it is checked whether
a CTF corrupts the states after k time frames. Thus, only the third part is
different and is given by:
\[
D(l, k) \Leftrightarrow s_{l+k} \neq s'_{l+k}
\]
which forces the states to be different rather than the primary outputs.
The conjunction of all formulas inj(U, l), prop(l, k) and D(l, k) yields the
property for k-dangerous components:
\[
P_D(U, l, k) = \mathit{inj}(U, l) \wedge \mathit{prop}(l, k) \wedge D(l, k)
\]
and the BMC problem denoted by DBMC is given by:
\[
D_{\mathrm{BMC}}(U, l, k) = \overbrace{I(s_0) \wedge \bigwedge_{0 \le i < l} T(x_i, s_i, s_{i+1}, y_i)}^{\text{prefix}} \wedge \overbrace{P_D(U, l, k)}^{\text{suffix}} \tag{6.3}
\]
This formula is translated into a CNF ΦD(U, l, k) and is checked for
satisﬁability by a SAT solver where solutions accordingly correspond to
k-dangerous components.
Lemma 6.2. The CNF ΦD(U, l, k) is satisfiable if and only if there is a
component g ∈ U that is k-dangerous for some l and k.
Determining all k-dangerous components is performed in the same way as
for the k-non-robust components. The result is stored in the set Dlk.
In both properties, the parameter k exactly specifies the size of the
observation window as introduced in Section 5.2 about classification. The
parameter l comes from the BMC formula and specifies the number of
unrolled time frames after the initial state. Intuitively, l describes how
deep in the state space a CTF is injected. To provide a complete
classification with respect to a fixed value of k, all reachable states have
to be checked for the injection state sl. This can be achieved by checking
all l ∈ [0, lcmpl], where lcmpl is the completeness threshold introduced in
Section 2.4.1, as implemented by the algorithm in the next section.
6.1.2 Algorithm
In Section 5.3 a basic algorithm has been presented that uses two abstract
functions to classify the components: 1) sol(N (C, S∗,U, k)) to classify
k-non-robust components, and 2) sol(D(C, S∗,U, k)) to classify k-dangerous
components, both with respect to the exact state space S∗, the components
U ⊆ V to be classified, and the circuit under verification C = (V,E). The
algorithm introduced in this section provides an implementation for both
functions to completely classify the components. That means, both abstract
functions can be replaced by the new classifier in order to provide a real
implementation.
The classification is performed with respect to a fixed size of the
observation window, i.e., k ∈ [0, kcmpl]. Basically, the algorithm determines
k-non-robust and k-dangerous components, respectively, as long as not all
reachable states have been considered, i.e., l ≤ lcmpl, and at least one
component remains to be classified.
 1  begin
 2    l = 0;
 3    while U ≠ ∅ ∧ l ≤ lcmpl do
 4      Φ = toCNF(class(U, l, k));
 5      Alk = ∅;
 6      while SAT?(Φ) do
 7        m = model(Φ);
 8        forall pg ∈ PU do
 9          if m[pg] then
10            Alk = Alk ∪ {g};
11            Φ = Φ ∪ {¬pg};
12        end
13      end
14      U = U \ Alk;
15      l = l + 1;
16    end
17    return A0k ∪ A1k ∪ . . . ∪ Alk
18  end
Pseudocode 4: BMC-classifier.

Algorithm 4: BMC-classifier.
• Input: A circuit C = (V,E) under analysis, a set of components to
be classified U ⊆ V , and the size of the observation window
k ∈ [0, kcmpl] are given as input. Further, the parameter class
specifies the formula used for classification, i.e.,
class ∈ {NBMC, DBMC} to classify k-non-robust or k-dangerous
components, respectively.
• Output: The set of k-non-robust components Sk or the set of
k-dangerous components Dk is returned, respectively.
• Description: The code is shown in Pseudocode 4 and is explained in
the following. The algorithm starts with setting l = 0. The outer
while-loop from Line 3 to Line 16 iterates as long as there is at least
one component to be classified and the completeness threshold is not
yet reached, i.e., not all reachable states have been discovered. In
each iteration the CNF of the respective formula class is constructed
based on the current value of l and the remaining components to
classify U. Note that k is an input parameter and is fixed during the
entire run of this algorithm. The set Alk, which stores the components
classified with respect to l and k, is initialized with the empty set in
Line 5. In the while-loop from Line 6 to Line 13 all solutions of the
CNF and therefore all classified components are determined. That
means, if the CNF Φ is satisfiable, the model is extracted in Line 7.
Next, the for-loop searches for the activated fault predicate
pg (Line 9). The respective component g is added to the set of
classified components Alk and a unit clause {¬pg} is added to the
CNF Φ to block this solution. Eventually, Φ becomes
unsatisfiable, i.e., all components with respect to l and k are
classified and the inner loop ends. All determined components are
excluded from further classifications (Line 14). Finally, the value of l
is incremented by one and the outer iteration restarts. The
classification terminates if either all components are classified or
the completeness threshold has been reached. All classified
components are returned (Line 17).
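The control flow of Pseudocode 4 can be tried out on a toy example. The sketch below is not the SAT-based implementation: it replaces the CNF and the solver by explicit simulation of a hypothetical three-bit circuit without inputs, and it models a CTF as a flip of one next-state bit for a single time frame. All names and the circuit itself are assumptions for illustration, and k ≥ 1 is assumed.

```python
def step(state):
    """Fault-free next-state function of a hypothetical 3-bit circuit."""
    a, b, c = state
    return (a ^ b, 1 - b, a & b)

def inject(state, g):
    """Next state when a CTF flips the output of component g in this time frame."""
    nxt = list(step(state))
    nxt[g] ^= 1
    return tuple(nxt)

def output(state):
    a, b, _ = state
    return a & b

def observable(g, l, k):
    """Explicit-state counterpart of NBMC({g}, l, k), using simulation instead of SAT."""
    s = (0, 0, 0)                          # I(s0)
    for _ in range(l):                     # prefix: l fault-free steps
        s = step(s)
    good, bad = step(s), inject(s, g)      # inj(U, l)
    for _ in range(k - 1):                 # prop(l, k)
        good, bad = step(good), step(bad)
    return output(good) != output(bad)     # N(l, k)

def bmc_classifier(components, k, l_cmpl):
    """Outer loop of Pseudocode 4; maps each classified component to its depth l."""
    remaining, classified = set(components), {}
    l = 0
    while remaining and l <= l_cmpl:
        found = {g for g in remaining if observable(g, l, k)}
        for g in found:
            classified[g] = l
        remaining -= found
        l += 1
    return classified

result = bmc_classifier({0, 1, 2}, k=1, l_cmpl=4)
```

In this toy circuit, component 0 is classified at l = 0, component 1 only at l = 1, and component 2 is never classified because its bit does not influence the output.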
Given a fixed k, the algorithm (BMC-classifier) is first called using the
formula for determining k-non-robust components (NBMC) and afterwards,
as presented in the basic classification algorithm from Section 5.3, with
the remaining components to classify k-dangerous components (DBMC).
Details regarding the implementation and efficiency issues are presented at
the end of this section.
6.1.3 Completeness
Once the completeness threshold lcmpl is reached for a given property, it is
safely concluded that all reachable states and therefore all possible scenarios
have been explored as introduced in Section 2.4.1.
However, the completeness threshold may in general be very large, so that
the BMC-classifier iterates several thousand times and calls the SAT solver
equally often, which is not efficient for large circuits. Nevertheless,
completeness is theoretically guaranteed.
Approximation techniques overcome this problem for a wide range
of relevant circuit instances while still providing exact results. The
embedding of those approximations into the BMC-classifier is presented in
the next section.
An alternative to a common SAT-based model checker might be a
BDD-based approach, which could be used to compute the exact set of
reachable states as is commonly done in Symbolic Model Checking (SMC)
[CMCHG96]. SMC iteratively and symbolically computes the image of the
transition relation until a fixed-point is reached, which ensures that all
reachable states are covered. However, even when the final BDD is
compact, intermediate BDDs during the iteration might be very large,
consisting of hundreds of thousands of nodes. This requires a huge amount
of memory and makes BDDs unmanageable for large circuits. The work of
[FSFD11] makes use of BDD-based fixed-point computation in robustness
checking and shows that exploiting approximations results in much better
performance.
Several approximation techniques can be embedded into the
BMC-classifier. The basis for embedding those approximations
is presented in the next section. The concrete realization of the
approximation is transparent to the approach. Data from a testbench or
constrained random simulation can be used here as well.
6.1.4 Embedding Reachability Information
Once the completeness threshold lcmpl is known for a circuit, a complete
classification with respect to an observation window of size k is provided.
However, computing this value is as hard as model checking itself [CKOS04].
Various techniques on top of BMC have been proposed in order to
overcome this problem, see, e.g., [SVD08, CCK03, CMB06]. However, all
exact techniques have high run times, while several approximation
techniques provide a trade-off between accuracy and run time. Even when an
approximation is used, exact results can be obtained, as shown later in the
experimental evaluation. Handling approximate reachability information
within the BMC-classifier is presented in the following.
While in Section 5.4 the theoretical inﬂuence of the approximations
used in robustness checking has been investigated this section embeds
approximations into the BMC-based approach.
Over-approximation
Assume ω̂(s0) is a predicate over the state variable s0 describing a set of
states with ∀s ∈ S. s ∈ S∗ ⟹ ω̂(s), i.e., ω̂ describes an
over-approximation of the set of reachable states S∗. This predicate is
embedded into the formulas for determining k-non-robust and k-dangerous
components by simply replacing the prefix of the BMC instance by the
predicate:
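\[
\hat{N}_{\mathrm{BMC}}(U, k, \hat{\omega}) = \hat{\omega}(s_0) \wedge P_N(U, 0, k) \tag{6.4}
\]
\[
\hat{D}_{\mathrm{BMC}}(U, k, \hat{\omega}) = \hat{\omega}(s_0) \wedge P_D(U, 0, k) \tag{6.5}
\]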
Both formulas depend on the state variable s0. Therefore, l is constantly set
to 0. Based on the inﬂuence of approximations described in Section 5.4 both
formulas determine an over-approximation of k-non-robust components
stored in the set Sˆk and k-dangerous components stored in the set Dˆk,
respectively since ωˆ is an over-approximation of reachable states and may
contain states that are not reachable from the initial state in any number
of steps.
As an example, a very simple over-approximation is the predicate
ω̂(s) = 1 for every state s, i.e., all states are allowed as reachable states.
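The effect of such a coarse over-approximation can be seen on a small explicit-state example. The two-bit circuit below is hypothetical and only serves as illustration: bit a never changes and is 0 initially, so a flip of the next value of b is never observable from a reachable state, but it is observable from unreachable states with a = 1.

```python
from itertools import product

def step(state):
    a, b = state
    return (a, a & b)      # a never changes, so a stays 0 on every reachable path

def output(state):
    a, b = state
    return a & b

def non_robust_from(injection_states, k=1):
    """True if flipping b' in some allowed injection state changes the output after k frames."""
    for s in injection_states:
        a, b = step(s)
        good, bad = (a, b), (a, b ^ 1)
        for _ in range(k - 1):
            good, bad = step(good), step(bad)
        if output(good) != output(bad):
            return True
    return False

reachable = {(0, 0)}                            # exact S* for the initial state (0, 0)
all_states = set(product((0, 1), repeat=2))     # trivial over-approximation, predicate is 1
exact = non_robust_from(reachable)
over = non_robust_from(all_states)
```

Here the exact classification reports the component as not 1-non-robust, whereas the trivial over-approximation classifies it spuriously as 1-non-robust.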
Under-approximation
In contrast to over-approximation, assume ω̌(s0) is a predicate over the
state variable s0 for which ∀s ∈ S. ω̌(s) ⟹ s ∈ S∗ holds, i.e., ω̌
is an under-approximation of the reachable states S∗. Analogously to the
over-approximation, the formula for determining k-non-robust components
is given by:
\[
\check{N}_{\mathrm{BMC}}(U, k, \check{\omega}) = \check{\omega}(s_0) \wedge P_N(U, 0, k)
\]
States for fault injection might be missed, so that a fault cannot be
properly propagated to the primary outputs because certain parts of the
circuit are not properly sensitized. Consequently, the solutions of the
formula under-approximate the k-non-robust components.
Such under-approximations can be computed in very different ways. Simply
considering a certain number of transitions from the initial state yields
an under-approximation, as introduced in [FSD09].
Definition 6.1. Given a Reachability Window Parameter (RWP) l̄ ∈
[0, lcmpl], the predicate SPB(l̄) describes any state along any path reachable
from the initial state up to a length of l̄ with:
\[
\mathit{SPB}(\bar{l}) = I(s_{-\bar{l}}) \wedge \bigwedge_{-\bar{l} \le i < 0} I(s_i) \vee T(s_i, s_{i+1})
\]
The approximation is called States along any Bounded Path (SBP).
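In explicit-state terms, the set described by this approximation can be computed by a bounded breadth-first traversal. The following sketch follows the prose of the definition (all states on paths of length at most l̄ from an initial state) rather than the symbolic formula; the function names and the modulo-8 counter with an enable input are hypothetical.

```python
def states_along_bounded_paths(init_states, successors, bound):
    """All states on paths of length <= bound that start in an initial state."""
    seen = set(init_states)
    frontier = set(init_states)
    for _ in range(bound):
        frontier = {t for s in frontier for t in successors(s)} - seen
        if not frontier:        # nothing new: the exact reachable set is found
            break
        seen |= frontier
    return seen

def successors(s):
    """Hypothetical counter modulo 8 that either holds its value or increments."""
    return {s, (s + 1) % 8}

window3 = states_along_bounded_paths({0}, successors, 3)
window7 = states_along_bounded_paths({0}, successors, 7)
```

A larger window never loses states, so the approximations form an increasing chain that ends in the exact set of reachable states.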
The parameter l̄ influences the set of reachable states considered as
states for fault injection and therefore the accuracy of the classification.
In SPB, the higher the value of l̄, the more states are potentially
discovered and the more components are properly classified.
Once the classification is completed for a certain l̄ smaller than lcmpl,
i.e., all components are classified, an under-approximation is sufficient to
fully classify the circuit without considering the entire state space. As
will be shown later in the experiments, checking a small number of
different reachability windows is sufficient to fully classify relevant
circuit classes.
6.1.5 Incremental Satisﬁability
The BMC-classiﬁer relies on numerous satisﬁability checks and therefore
depends inherently on the SAT solver’s performance to solve a CNF. In a
naïve way for each single classiﬁcation the problem formulation needs to be
translated into CNF and is then checked by a SAT solver. However, this
can be signiﬁcantly improved by the technique proposed by [Sht01] where
incremental satisﬁability for BMC is introduced. This technique is adapted
for the BMC-classiﬁer as well.
During the solving process, today's SAT solvers generate conflict clauses
that prune the search space, which increases the performance considerably.
Learnt information in terms of conflict clauses can be partially
transferred to similar SAT instances by keeping the current instance and
adding only the new necessary facts. In particular, when the BMC-classifier
increases the parameter l by one, only the clauses of the newly unrolled
time frame need to be added to the solver instead of rebuilding the entire
formula. Learnt information from the previous run is kept within the
SAT solver's internal databases. Besides saving the run time of rebuilding
the entire formula, it is mainly the exploitation of the learnt information
that improves the performance, as also shown in [Sht01]. This kind of
computation is known as incremental satisfiability. Overall, instead of
solving the entire CNF from scratch for each classification, learnt
information is exploited by keeping the SAT solver's database during the
entire classification.
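The calling pattern of incremental solving can be sketched as follows. The class below is a hypothetical brute-force stand-in: it does not learn conflict clauses and therefore shows only the interface (one persistent solver object, clauses of the new time frame added per step, the depth-specific property passed as an assumption), not the performance gain. The one-bit toggling circuit is an assumption for illustration.

```python
from itertools import product

class TinyIncrementalSolver:
    """Brute-force stand-in with an incremental interface (no clause learning)."""

    def __init__(self):
        self.clauses = []
        self.num_vars = 0

    def add_clause(self, clause):
        self.clauses.append(list(clause))
        self.num_vars = max(self.num_vars, max(abs(lit) for lit in clause))

    def solve(self, assumptions=()):
        """Check satisfiability of the stored clauses under temporary unit assumptions."""
        n = max([self.num_vars] + [abs(a) for a in assumptions])
        constraints = self.clauses + [[a] for a in assumptions]
        for bits in product([False, True], repeat=n):
            if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in c)
                   for c in constraints):
                return True
        return False

def x(i):
    """Variable number of the single state bit in time frame i."""
    return i + 1

solver = TinyIncrementalSolver()
solver.add_clause([-x(0)])                    # I(s0): the bit starts at 0
results = []
for l in range(4):
    if l > 0:                                 # add only the newly unrolled frame
        solver.add_clause([x(l - 1), x(l)])   # x_l = not x_(l-1)
        solver.add_clause([-x(l - 1), -x(l)])
    results.append(solver.solve(assumptions=[x(l)]))   # "bit is 1 at depth l"
```

Passing the depth-specific check as an assumption keeps the clause database valid for the next depth, which is what allows a real solver to retain its learnt clauses.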
6.2 ATPG-classiﬁer
The BMC-classifier presented in the previous section handles all
components in one monolithic problem instance. In contrast, the
ATPG-classifier considers one component per problem instance, similarly to
sequential SAT-based ATPG.
SAT-based sequential ATPG computes test patterns for stuck-at faults
at the circuit's signals. This approach is adapted for robustness checking.
While classical ATPG engines consider stuck-at faults, the ATPG-classifier
for robustness checking needs to handle CTFs. Further, ATPG is often
reduced to the combinational case, which results in over-testing but
reduces the problem instances significantly. ATPG for robustness checking
is therefore inherently harder to solve, since reachability information
needs to be properly taken into account. Moreover, problem instances are
usually harder to solve since the underlying circuits contain much
redundant logic to tolerate transient faults. The resulting unsatisfiable
instances are mostly hard for SAT solvers. In contrast, the problem
instances of classical ATPG are usually satisfiable, since most faults are
testable.
6.2.1 Problem Formulation
Technically, the ATPG-classiﬁer is very close to the BMC-classiﬁer with the
exception that only a single component is considered in a single problem
instance. To classify all components of a circuit, for each component a
separate problem instance is created and solved.
Since only one component is considered in a single problem instance
parts of the logic can be safely ignored as it is done in classical ATPG as
well. A Cone-Of-Inﬂuence (COI) reduction is applied that determines logic
that inﬂuences the analysis.
Figure 6.2 illustrates which logic is relevant when classifying a
component. The fanout cone and the transitive fanin cone of
the affected component need to be modeled for a single time frame. The
remaining logic can be safely ignored. This reduces the size of the problem
instances significantly. Consequently, classifying a single component may
be much faster than solving the monolithic model of the BMC-classifier,
since the search space is accordingly smaller.
Figure 6.2: Fanout and transitive fanin cone of an affected component
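For a single time frame, the relevant logic can be collected by two graph traversals: a forward traversal from the affected component yields its fanout cone, and a backward traversal from that cone yields the transitive fanin. The sketch below assumes the circuit is given as a list of directed edges; the node names and the small example netlist are hypothetical.

```python
def reachable(start, neighbours):
    """All nodes reachable from the start set via the given adjacency map."""
    seen, stack = set(start), list(start)
    while stack:
        node = stack.pop()
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def cone_of_influence(edges, component):
    """Fanout cone of the component plus the transitive fanin of that cone."""
    succ, pred = {}, {}
    for u, v in edges:
        succ.setdefault(u, []).append(v)
        pred.setdefault(v, []).append(u)
    fanout = reachable({component}, succ)
    return reachable(fanout, pred)

edges = [("i1", "g1"), ("i2", "g1"), ("g1", "g2"), ("i3", "g2"),
         ("g2", "o1"), ("i4", "g3"), ("g3", "o2")]
cone = cone_of_influence(edges, "g1")
```

In this example the logic around g3 is dropped when g1 is classified, since it neither lies in the fanout cone of g1 nor feeds it.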
6.2.2 Using ATPG to compute EPP
Computing a differentiation of non-robust components requires computing
more than a single scenario, according to the techniques introduced in
Section 4.2. Recall that the parameter λ limits the number of scenarios to
be computed in order to overcome complexity issues.
ATPG is suitable for computing more than a single scenario. When COI
is applied, certain inputs are not relevant and are treated as don't-care
inputs, which significantly reduces the run time.
Further improvements are achieved by performing Minimal Assignment
Analysis (MAA) as, for example, introduced in [RS04, ED07]. Those
techniques can be used to reduce the overall number of SAT calls, which
consequently reduces the overall run time.
6.2.3 Comparison to Blackbox Model Checker
This classifier is very close to the naïve model checking approach
introduced in Section 5.5, which converts the model and the CTL property
into a problem instance. That means, similarly to the ATPG approach,
each component is analyzed separately.
However, the ATPG-classifier can exploit more problem domain knowledge,
for example by applying the COI reduction. The model checker has no
knowledge about the component to be classified and performs reductions
of the problem instance without this specific knowledge. Consequently,
within the same time its reductions are usually less effective than COI
in ATPG. Of course, COI can also be performed before translating the
classification into a model checking problem. But in the worst case the
entire circuit needs to be modeled, such that COI has no effect.
6.3 ITP-classiﬁer
Completeness of robustness checking can be guaranteed by both classifiers
proposed in the previous sections, the BMC-classifier and the
ATPG-classifier, by checking all necessary combinations of l ∈ [0, lcmpl]
and k ∈ [0, kcmpl]. Based on Lemma 3.3 about the completeness threshold,
the number of those combinations may increase exponentially with the
number of state elements of a circuit. Consequently, checking all
combinations, where each check implies a model checking problem, is
infeasible even for smaller circuits. For small values of lcmpl and kcmpl
both engines are very effective, in particular for those circuits where a
bounded observation window is sufficient and approximate reachability
information leads to high accuracy. For the remaining hard-to-classify
circuits, however, classifiers are necessary that effectively conclude
completeness without checking all combinations, even when the
completeness thresholds lcmpl and kcmpl are large.
In practice, it often suffices to check only a subset of the combinations,
since some states or scenarios might not be relevant for classifying a
component into the respective class. The ITP-classifier introduced in
this section and published in [FFA+12] automatically determines which
combinations are necessary such that an exact classification is finally
performed. Consequently, an alternative termination criterion is provided
by exploiting interpolation.
McMillan's approach of interpolation-based model checking has been
presented in Section 2.4.2. Interpolation adapted for robustness checking
was first presented in [FFA+12] and is presented in detail in this section.
Moreover, a new model checker based on a new approximation technique is
presented in this thesis.
Three new techniques based on interpolation are introduced in this thesis
to conclude completeness before checking all combinations. Combining
all these techniques within the ITP-classifier provides a fully automatic
approach to derive suitable over-approximations by interpolation. The basic
ideas of the three techniques are briefly listed below and explained in
detail in separate sections.
1. The classical SAT-based interpolation procedure as proposed by
McMillan [McM03] is adapted for robustness checking. That means,
for each property determining k-non-robust and k-dangerous
components with respect to a certain observation window k̄, the
interpolation procedure is started, which guarantees that all necessary
states for injection have been discovered. Recall that the parameter l,
first introduced in Section 6.1, specifies the number of time frames
after the initial state. The integrated approach of the ITP-classifier
avoids checking all values for l ∈ [0, lcmpl] by over-approximating the
reachable state space. This provides an effective and complete
classification with respect to a bounded observation window as motivated
in Section 3.5 for certain circuit classes.
2. The approach of McMillan computes an over-approximation of the
state space by joining a set of interpolants. The computation of those
interpolants follows a certain partitioning of the underlying BMC
formula into the parts A and B. A newly introduced partitioning allows
computing a single interpolant that over-approximates the reachable
states instead of several interpolants.
Once McMillan's interpolants lead to an over-approximation that is too
coarse, as shown by spurious counterexamples, the interpolants need to be
completely discarded and the computation restarts. The new interpolants
proposed in this thesis can be reused even when the approximation
is too coarse. Joining all interpolants using the new technique leads
to a more and more accurate approximation. Eventually, the new
interpolants converge to the exact reachable states.
Besides the application in robustness checking in this thesis, the new
approximation contributes to general model checking as well.
Additionally, a simple model checker that integrates these new techniques
is introduced in this section.
3. Finally, providing completeness with respect to an unbounded
observation window requires considering all necessary values of k ∈
[0, kcmpl], i.e., all propagation paths need to be discovered.
Due to the high computational cost of naïvely checking all values, an
over-approximation of the propagation path is introduced. A derived
fixed-point computation on the property may guarantee
completeness without checking all values up to kcmpl. Technically,
interpolation is performed directly on the property, i.e., fault-free
and faulty computations are over-approximated. This kind of
interpolation requires that at least all reachable states are considered
for fault injection, which is provided by the two previously mentioned
techniques (1. and 2.).
In this section the ITP-classifier is introduced, which consists of three
separate parts. Each part is introduced in a separate section. Finally, the
ITP-classifier including all three parts is explained in terms of an
algorithm.
6.3.1 Adaption of Interpolation-based Model Checking
The first classifier exploiting interpolation simply adapts
interpolation-based model checking from [McM03] for robustness checking.
For a bounded observation window, the new interpolation classifier
introduced in this section effectively ensures that all reachable states are
discovered for fault injection, potentially before reaching the completeness
threshold lcmpl, i.e., it leads to a complete classification with respect to
a bounded observation window.
Interpolants compute an over-approximation by abstracting from facts, in
terms of parts of the state space, that are irrelevant for the current
reasoning. This may lead to a faster convergence to a fixed-point than
explicitly unrolling the transition relation up to the completeness
threshold.
A safety property holds on a circuit if it is proven that the property
holds for all reachable states of the circuit's automaton. BMC has been
introduced to verify the property on states by iteratively unrolling the
transition relation. All reachable states are checked when the transition
relation is unrolled lcmpl times. This may become very expensive for larger
circuits. Interpolation-based model checking [McM03] often terminates
before reaching this completeness threshold, which significantly reduces the
verification time. Interpolants abstract from some facts which are
irrelevant for proving the property and therefore speed up the convergence
to a fixed-point. Hence, in order to prove a property only relevant states
are considered.
In robustness checking, a series of safety properties is checked, as
introduced for the BMC-classifier in Section 6.1. More precisely, for each
k ∈ [0, k̄] a new property needs to be checked to classify k-non-robust and
k-dangerous components. Mapping robustness checking to interpolation-based
model checking avoids considering all values of l ∈ [0, lcmpl] for each
k ∈ [0, k̄], where k̄ is the maximum size of the observation window, set by
default to kcmpl (Section 3.5). That means, for each k the procedure might
terminate earlier instead of unrolling the transition relation up to lcmpl
times, which significantly improves the run times.
In the following, interpolation-based model checking [McM03] is adapted
for robustness checking based on the BMC-based engine.
Problem Formulation
Recall the BMC-classiﬁer and the underlying formula NBMC(U, l, k) as
introduced in Section 6.1:
\[
N_{\mathrm{BMC}}(U, l, k) = I(s_0) \wedge \bigwedge_{0 \le i < l} T(s_i, s_{i+1}) \wedge P_N(U, l, k)
\]
The property PN (U, l, k) injects faults into state sl at the
components of U ⊆ V , propagates the faults over k time frames, and finally
forces the primary outputs to be different. Consider the following partition
(A,B) of the formula NBMC(U, l, k):
\[
A := I(s_0) \wedge T(s_0, s_1)
\]
\[
B := \bigwedge_{i=2}^{l} T(s_{i-1}, s_i) \wedge P_N(U, l, k)
\]
Suppose all k-non-robust components have been determined with respect
to the values of l and k, and only the remaining components g ∈ U not
shown to be non-robust (at least one, i.e., |U| ≥ 1) are considered for
further classification. In this case, A ∧ B is unsatisfiable, since no fault
injection is observable within k time frames. One of the following two
reasons leads to the unsatisfiable formula:
1. The formula B is unsatisfiable itself. That means, the effect of each
fault injection does not depend on the set of states for injection at all,
i.e., in any case the fault is either unobservable at the primary outputs
or is reported by the fault signal. Consequently, B is unsatisfiable, since
B exactly constrains the part of fault injection and propagation and forces
unequal primary outputs. In this case, determining the interpolant is
skipped and all components are classified with respect to k, since no
further states are necessary to classify the remaining components.
2. Otherwise, A and B are each satisfiable on their own, and an
interpolant σ with σ = itp(A,B) is computed. The interpolant σ is defined
over the state variables s1, i.e., the common variables of
A and B with Var(σ) ⊆ Var(A) ∩ Var(B). The interpolant is an
over-approximation of the image of the transition relation, i.e., the
states after the first transition are over-approximated (A ⟹ σ), but it
still contains sufficient facts to contradict the property (B ∧ σ is
unsatisfiable). That means, faults injected into the states from the
over-approximated image represented by the interpolant are either
unobservable or reported by the fault signal.
Figure 6.3: Over-approximation of the interpolants leading to spurious
counterexamples
The interpolant is then added to A such that A = (I(s0) ∨ σ(s0)) ∧
T (s0, s1), where the variable s1 of σ is replaced by s0, and the procedure
restarts. This virtually unrolls the transition relation.
For each interpolant it is checked whether a fixed-point is reached. A
fixed-point is reached when the new interpolant implies the disjunction of
the initial states and all previously computed interpolants, i.e.,
σ ⟹ I ∨ σ1 ∨ . . . ∨ σn, where σ1, . . . , σn are the previously computed
interpolants and σ is the new interpolant. Once this implication holds, at
least all reachable states have been covered, since the disjunction of the
interpolants and the initial states yields an over-approximation of the
reachable state space, i.e., ∀s ∈ S∗ . (I ∨ σ1 ∨ . . . ∨ σn)(s) is true. The
classification is then complete with respect to k.
Once the instance A∧B becomes satisﬁable where A is extended by the
interpolant a probably spuriously classiﬁed non-robust component has
been determined due to the over-approximation by the interpolant.
States for injection are beside reachable states also non-reachable
states and therefore the classiﬁcation might be spurious. This case
is shown in Figure 6.3 where the interpolant σ intersects states with
the property PN .
In this case the procedure restarts by increasing the value of l, which
either allows for classifying non-robust components based on reach-
able states only or makes the interpolants weaker by strengthening
B, i.e., abstracting fewer facts.
Eventually, a fixed-point is found and the classification is complete
with respect to the current value of k [McM03]. After reaching a
fixed-point, k is increased by one up to k¯ and the procedure restarts
while discarding all interpolants.
ITP-classiﬁer 6.3
 1  begin
 2    Sk = ∅;
 3    l = 0;
 4    while U \ Sk ≠ ∅ ∧ l ≤ lcmpl do
 5      Sk = Sk ∪ sol(NBMC(U, l, k));
 6      if l == 0 then l = l + 1, continue;
 7      R = I;
 8      B := ⋀_{i=2..l} T(s_{i−1}, s_i) ∧ PN(U \ Sk, l, k);
 9      while !SAT?(A ∧ B) do
10        A := R(s0) ∧ T(s0, s1);
11        σ = itp(A, B);
12        if σ =⇒ R then
13          return Sk
14        end
15        R = R ∨ σ;
16      end
17      l = l + 1;
18    end
19  end
Pseudocode 5: Interpolation-model checking adapted for robustness checking.
Algorithm
In this section an algorithm adapting the previously described procedure is
introduced. This algorithm constitutes the first part of the ITP-classifier
and is denoted by ITP-classifier-1.
Algorithm 5:
ITP-classiﬁer-1.
• Input: A circuit C = (V,E), a set of components to be classified
U ⊆ V , and a fixed k ∈ [0, k¯] are given as input.
• Output: The set of k-non-robust components is returned.
• Description: The code is shown in Pseudocode 5 and described in
the following. The algorithm starts by initializing the set of
k-non-robust classified components to the empty set and setting
l = 0. The while-loop from Line 4 to Line 18 iterates as long as at
least one component needs to be classified and the completeness
threshold lcmpl is not exceeded. At first, k-non-robust components with
respect to the current value of l are classified and added to the set Sk.
The interpolation procedure starts once l is at least 1. In case
of l = 0 the outer loop restarts with an incremented l (Line 6).
When l ≥ 1 the interpolation procedure starts by constructing the set
R = I as the first set of states and B as the fixed part of the partition.
The while-loop from Line 9 to Line 16 computes interpolants and
checks whether a fixed-point is reached. The loop terminates once
A ∧ B becomes satisfiable, i.e., once a possibly spurious
classification is performed.
In Line 10 the partition A is created and in Line 11 an interpolant of
(A,B) is computed. If a fixed-point is found, i.e., if the condition in
Line 12 evaluates to true, then the set of all k-non-robust components
is determined and returned.
Otherwise, the interpolant is added to the previously computed
interpolants in Line 15.
If the loop terminates and Line 17 is reached, a spurious classification
has been performed; in this case l is incremented and the outer loop
restarts.
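The inner loop of Pseudocode 5 can be illustrated on an explicit-state toy system. The following is a minimal sketch, not the thesis implementation: states are plain integers, the SAT call is replaced by a set intersection, and the interpolant is replaced by a hypothetical `over_approx` callback that is assumed to return a superset of the exact image that stays disjoint from the states encoded by B. The names `image`, `interpolation_loop`, `b_states` and `over_approx` are illustrative assumptions.

```python
from typing import Callable, Set, Tuple

State = int
Trans = Set[Tuple[State, State]]


def image(states: Set[State], trans: Trans) -> Set[State]:
    """Exact successors of `states`; plays the role of R(s0) ∧ T(s0, s1)."""
    return {t for (s, t) in trans if s in states}


def interpolation_loop(
    init: Set[State],
    trans: Trans,
    b_states: Set[State],
    over_approx: Callable[[Set[State]], Set[State]],
) -> Tuple[str, Set[State]]:
    """Explicit-state analogue of Lines 7-16 of Pseudocode 5.

    b_states    : states s1 that satisfy B (assumed to be given explicitly)
    over_approx : hypothetical stand-in for itp(A, B); assumed to return a
                  superset of its argument that is disjoint from b_states
    """
    R = set(init)                      # Line 7: R = I
    while True:
        img = image(R, trans)          # Line 10: A = R(s0) ∧ T(s0, s1)
        if img & b_states:             # Line 9: A ∧ B is satisfiable
            return "satisfiable", R
        sigma = over_approx(img)       # Line 11: σ = itp(A, B)
        if sigma <= R:                 # Line 12: σ ⟹ R, fixed-point
            return "fixed-point", R
        R |= sigma                     # Line 15: R = R ∨ σ


if __name__ == "__main__":
    trans = {(0, 1), (1, 2), (2, 0), (3, 4), (4, 5)}
    # Exact image as "interpolant": the loop converges on {0, 1, 2}.
    print(interpolation_loop({0}, trans, {4, 5}, lambda img: set(img)))
    # Coarser approximation that adds the unreachable state 3: spurious hit.
    print(interpolation_loop({0}, trans, {4, 5}, lambda img: set(img) | {3}))
```

In the second call the unreachable state 3 enters R and its successor 4 hits the states of B, which corresponds to the spurious case in which l has to be incremented.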
6.3.2 Adequate Over-Approximation
ITP-classiﬁer-1 computes an over-approximation of the reachable states
based on interpolation by partitioning the NBMC into the formulas A and B
accordingly. The new approach computes an over-approximation based on
interpolation by an opposite partitioning. This over-approximation is then
used in robustness checking and a separate model checker - both introduced
later in this section.
The effectiveness of robustness checking strongly depends on the states
considered during fault injection. An automatically determined and suitable
over-approximation leads to an effective classification.
Several properties of the new approach are presented later in this section,
covering its strengths and weaknesses in comparison to McMillan’s
interpolation approach.
Computing an Over-approximation
Initially, a new term to distinguish diﬀerent kinds of approximations of
reachable states is introduced.
Definition 6.2. Given a transition system M = (I, S, T ) and a predicate
δ defined over the state variables of M . Then δ is called an adequate
approximation if δ contains only non-reachable states, i.e.,
∀s ∈ S∗ . δ(s) = 0.
Once an adequate approximation δ is computed, an over-approximation of
the reachable states has been obtained: the complement δ¯ is an
over-approximation since it holds: ∀s ∈ S∗ . δ¯(s) = 1.
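Definition 6.2 can be checked directly on a small explicit-state system. The sketch below is only an illustration under the assumption that the state space is small enough to enumerate; the helper names `reachable`, `is_adequate` and `over_approximation` are hypothetical and do not appear in the thesis.

```python
from typing import Iterable, Set, Tuple

State = int
Trans = Set[Tuple[State, State]]


def reachable(init: Iterable[State], trans: Trans) -> Set[State]:
    """All states reachable from `init` (the set S* of the definition)."""
    seen = set(init)
    frontier = set(init)
    while frontier:
        frontier = {t for (s, t) in trans if s in frontier} - seen
        seen |= frontier
    return seen


def is_adequate(delta: Iterable[State], init: Iterable[State], trans: Trans) -> bool:
    """delta is adequate iff it contains no reachable state."""
    return not (set(delta) & reachable(init, trans))


def over_approximation(delta: Iterable[State], all_states: Iterable[State]) -> Set[State]:
    """Complement of delta; over-approximates S* whenever delta is adequate."""
    return set(all_states) - set(delta)


if __name__ == "__main__":
    trans = {(0, 1), (1, 0), (2, 3), (3, 4)}
    print(is_adequate({3, 4}, {0}, trans))        # only unreachable states
    print(is_adequate({1, 4}, {0}, trans))        # state 1 is reachable
    print(over_approximation({3, 4}, range(5)))   # superset of the reachable states
```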
In the following an interpolation-based approach is presented to compute
adequate approximations matching the previous deﬁnition. Reconsider the
BMC formula from page 26 of Section 2.4.1
\[ BMC(l) = I(s_0) \wedge \bigwedge_{0 \le i < l} T(s_i, s_{i+1}) \wedge P(s_l) \]
with a safety property P that has to be verified with respect to a circuit.
Further, reconsider the state approximation SPB(l¯) from Section 6.1.4 that
describes any state along any path reachable from the initial state in up
to l¯ steps. This formula is used to check the property as follows:
BMCreach(l¯) = SPB(l¯) ∧ P (s0)
The diﬀerence to the classical BMC formula is that intuitively all states
up to l¯ steps are constrained for fault injection rather than only states
reachable in l¯ steps.
Suppose the formula is unsatisfiable for a given l¯ ∈ [0, lcmpl], i.e., no
state reachable from the initial state along any path of length up to l¯
intersects the property states. Then an interpolant is computed from the
following partition (A,B):
A = P (s0) and B = SBP(l¯) (6.6)
Let δ = itp(A,B) be an interpolant and reconsider Theorem 2.14 of Craig
interpolation. The interpolant δ describes states that are not reachable
from the initial states in ≤ l¯ steps, since B ∧ δ is unsatisfiable. In
addition, the interpolant contains the states that fulfill the property,
since A =⇒ δ. Those states might be non-reachable states or states that
are reachable in more than l¯ steps. Since δ does not necessarily contain
non-reachable states exclusively, δ might not be an adequate
approximation, since the property may fail in general. In theory at least
one adequate approximation exists, but in practice often more than one
adequate approximation can be obtained.
Theorem 6.1. Given a transition system M = (I, S, T ) and a property
that holds on the circuit. Then there exists an l¯ ∈ [0, lcmpl] such that
BMCreach(l¯) is unsatisfiable and δ = itp(A,B) is an adequate
approximation.
Proof. Setting l¯ = lcmpl, SBP(l¯) models all reachable states and
BMCreach(l¯) is unsatisfiable since the property holds. An interpolant
δ = itp(A,B) is computed. Formula B models all reachable states and
δ ∧ B is unsatisfiable; therefore δ, which satisfies A =⇒ δ, contains
only non-reachable states.
Thereby, it is proven that an adequate approximation is computed when
l¯ is set to lcmpl. However, in practice an adequate approximation is often
computed when l¯ is much lower than lcmpl. In order to verify that an
interpolant is an adequate approximation, model checking is performed by
simply checking whether the negation of δ is an invariant. As the
experiments show, this check can be done within very low run times.
Once an adequate approximation has been obtained, it can be used in
robustness checking as well as in model checking.
Algorithm
A separate algorithm, presented as Algorithm 6 (Section 6.3.2) and denoted
as ITP-classifier-2, determines adequate approximations.
Algorithm 6:
Compute adequate approximation ADQ.
• Input: The algorithm gets a transition system M = (I, S, T ), a
property P that has to be verified, and a fixed value l ∈ [0, lcmpl] as
inputs.
 1  begin
 2    A = P(s0);
 3    B = BMCreach(l¯);
 4    if SAT?(A ∧ B) then
 5      return ∅;
 6    δ = itp(A, B);
 7    if δ¯ is invariant on M then
 8      return δ
 9    return ∅;
10  end
Pseudocode 6: Computing adequate approximations: ADQ.
• Output: As a result either an empty set or an adequate
approximation is returned.
• Description: The code is shown in Pseudocode 6. At first, the
algorithm creates the partition (A,B) and checks whether A ∧ B is
satisfiable (Line 4). In case of satisfiability the empty set is
returned. Otherwise, an interpolant is computed and it is checked
whether the negation of the interpolant is invariant using a
black-box model checker. If an adequate approximation has been
obtained, the interpolant is returned; otherwise the empty set is
returned.
The algorithm determines adequate approximations that can be used
in model checking to verify a property. As already introduced, BMC is
used to verify a circuit with respect to a property, but completeness
is only guaranteed when the completeness threshold is reached, which
ensures that all reachable states have been covered. However, if a
property holds on all reachable states and additionally on some
non-reachable states, the property also holds on the circuit. The general
idea is to exploit this in model checking: once the property holds on all
states of the over-approximation given by the adequate approximations, the
property holds on the circuit, since at least all reachable states have
been covered.
Consider the following adapted BMC formula where Δ = {δ1, . . . , δn}
is a set of adequate approximations.
\[ \widehat{BMC}(\Delta) = \bigwedge_{\delta \in \Delta} \bar{\delta}(s_0) \wedge P(s_0) \qquad (6.7) \]
If the formula is unsatisfiable, the property holds on the circuit since
the property holds on at least all reachable states. In contrast, if the
formula is satisfiable, the property might be falsified on reachable
states or on non-reachable states. In that case further refinement checks
are required, i.e., computing more adequate approximations. This is not
further investigated in this work. Several well-established techniques
can be applied in the context of Counterexample-Guided Abstraction
Refinement (CEGAR) [CGJ+00] to overcome this issue.
6.3.3 Model Checking with Adequate Approximations
BMC proves a property by unrolling the transition relation up to the
completeness threshold. Reaching this value ensures that all reachable
states have been discovered. This is usually computationally expensive.
In contrast, to find a bug, i.e., to disprove a property, it often
suffices to reach a small value that is practically easier to solve.
The following model checker works similarly to a bounded model checker.
In addition to checking whether the current states intersect the property
states, a new step is added that checks whether the property holds on an
over-approximation.
A simple model checker named SimpMC that exploits adequate
approximations is presented in this section.
Algorithm 7:
SimpMC - A Model Checker that exploits adequate approximations.
• Input: The algorithm gets a transition system M = (I, S, T ), a
property P that has to be veriﬁed.
• Output: The algorithm either returns TRUE, meaning that there is
a counterexample, or returns FALSE, meaning that the property holds
on the circuit.
• Description: The code of the algorithm is shown in Pseudocode 7.
The algorithm starts with initializing the set of adequate
approximations Δ to the empty set.
 1  begin
 2    Δ = ∅;
 3    foreach l ∈ [0, lcmpl] do
 4      if SAT?(BMC(l)) then
 5        return TRUE;
 6      end
 7      δ = ADQ(M, P, l);
 8      if δ ≠ ∅ then
 9        Δ = Δ ∪ {δ};
10      if !SAT?(ˆBMC(Δ)) then
11        return FALSE;
12    end
13    return FALSE;
14  end
Pseudocode 7: SimpMC exploiting adequate approximations.
The outer foreach-loop from Line 3 to Line 12 iterates over all
values l ∈ [0, lcmpl]. In Line 4 it is checked whether there is a
counterexample when unrolling l time frames. If the formula is
satisﬁable, then there is a counterexample and consequently TRUE is
returned.
Otherwise, if the formula is unsatisfiable, there is no path of length l
from the initial state to the property state. Thus, the algorithm for
computing adequate approximations ADQ (Algorithm 6, Section 6.3.2) is
called. If the algorithm obtained an adequate
approximation, i.e., δ ≠ ∅, the adequate approximation is added to
the set Δ.
In Line 10 it is checked whether the property holds on the
over-approximation of the reachable states represented by the
adequate approximations. If the property holds, FALSE is returned.
Otherwise, the loop restarts.
Eventually, the loop terminates, which means that there is no
counterexample of length l ∈ [0, lcmpl]. Consequently, the property
holds on the circuit and FALSE is returned.
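The control flow of SimpMC together with ADQ can be sketched on an explicit-state toy system. This is a hedged illustration, not the evaluated implementation: SAT calls become set operations, the interpolant is replaced by a hypothetical `widen` callback (by default the property states themselves, which trivially satisfy A ⟹ δ), and the black-box model checker is replaced by a sound but incomplete inductive-invariant check. All function and parameter names are assumptions made for this example.

```python
from typing import Callable, FrozenSet, Optional, Set, Tuple

State = int
Trans = Set[Tuple[State, State]]


def step(states: Set[State], trans: Trans) -> Set[State]:
    """Successors of `states` under the transition relation."""
    return {t for (s, t) in trans if s in states}


def adq(states, init, trans, prop, l, widen) -> Optional[FrozenSet[State]]:
    """Explicit-state analogue of ADQ (Pseudocode 6)."""
    reach = set(init)                   # states reachable in at most l steps
    for _ in range(l):
        reach |= step(reach, trans)
    if prop & reach:                    # A ∧ B satisfiable
        return None
    delta = set(widen(prop))            # stand-in for itp(A, B)
    if not prop <= delta or delta & reach:
        return None                     # not a valid interpolant
    safe = set(states) - delta          # negation of delta
    inductive = set(init) <= safe and step(safe, trans) <= safe
    return frozenset(delta) if inductive else None


def simp_mc(states, init, trans, prop, l_cmpl,
            widen: Callable[[Set[State]], Set[State]] = lambda p: p) -> bool:
    """Returns True if a counterexample exists, False if the property holds."""
    deltas: Set[FrozenSet[State]] = set()
    frontier = set(init)                # states reachable in exactly l steps
    for l in range(l_cmpl + 1):
        if frontier & prop:             # BMC(l) satisfiable
            return True
        delta = adq(states, init, trans, prop, l, widen)
        if delta is not None:
            deltas.add(delta)
        over = set(states)
        for d in deltas:                # conjunction of all negated deltas
            over -= d
        if deltas and not (over & prop):
            return False                # property holds on over-approximation
        frontier = step(frontier, trans)
    return False


if __name__ == "__main__":
    S = {0, 1, 2, 3, 4}
    T = {(0, 1), (1, 2), (2, 0), (3, 4)}
    print(simp_mc(S, {0}, T, {4}, l_cmpl=3))              # property state unreachable
    print(simp_mc(S, {0}, T | {(2, 3)}, {4}, l_cmpl=4))   # counterexample of length 4
```

With the default `widen` the negated property is not inductive in the first example (state 3 leads to state 4), so the loop runs up to the bound; passing `widen=lambda p: p | {3}` yields an adequate approximation already for l = 0.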
In the worst case the algorithm iterates until lcmpl is reached. However,
in many cases SimpMC terminates earlier by proving the property based
on adequate approximations. Consequently, SimpMC might be an additional
piece of a formal verification framework.
SimpMC is completely new in this thesis and has not yet been published.
The model checker is evaluated later in this thesis against a state-of-the-art
model checker.
6.3.4 Classiﬁcation with Adequate Approximations
In the context of robustness checking, adequate approximations can be
exploited as well. In Section 6.1 the BMC-classifier has been introduced.
Moreover, embedding approximations into robustness checking has been
introduced in Section 6.1.4. Reconsider the Formulas 6.4 from
Section 6.1.4:
\[ \widehat{NBMC}(U, k, \hat{\omega}) = \hat{\omega}(s_0) \wedge P_N(U, 0, k) \]
\[ \widehat{DBMC}(U, k, \hat{\omega}) = \hat{\omega}(s_0) \wedge P_D(U, 0, k) \]
where ωˆ represents an arbitrary over-approximation. In order to embed
a set of adequate approximations Δ = {δ1, . . . , δn}, the formula ωˆ is
constrained as follows:
\[ \hat{\omega}(s_0) = \bigwedge_{\delta \in \Delta} \bar{\delta}(s_0) \]
The adequate approximations can be used to over-approximate k-non-robust
and k-dangerous components and finally to derive a subset of robust
components as introduced in Section 5.4.
The better the approximations, the more exact is the computation of
k-non-robust and k-dangerous components and finally of the subset of
robust components. Eventually, the computation of adequate approximations
yields the exact set of reachable states and consequently the
classification is always complete.
6.3.5 Proving Unbounded Dangerous Components
Under certain conditions a fault does not become observable at the
primary outputs in any subsequent time frame but corrupts the states,
i.e., the fault persists in the circuit’s states. This behavior was
introduced under the term unbounded dangerous in Section 3.3. Components
that lead to this behavior are hard to classify since all possible
propagation paths need to be explored to exclude that a fault becomes
observable. Technically, this means that an observation window up to the
completeness threshold kcmpl needs to be analyzed. This is practically
impossible due to the exponentially increasing search space for each
additional time frame. Therefore, alternative solutions are required to
overcome this problem.
A proof procedure classifying components that are either robust or
unbounded dangerous based on interpolation is introduced in the following.
This proof procedure potentially avoids considering an observation window
up to kcmpl, which significantly reduces the run times. As in the previous
approach, over-approximation is used in this approach as well. To avoid
unrolling the transition relation up to kcmpl, the behavior of the
fault-free and faulty computation is over-approximated. Once this
over-approximated behavior leads to a fixed-point, it is proven that the
considered components are either robust or unbounded dangerous.
Problem Formulation
The basic idea is to perform a fixed-point computation on the property to
classify non-robust components based on interpolation. The interpolants
compute an over-approximation of the fault-free and faulty computation.
Once a fixed-point is reached, the computation may terminate before the
completeness threshold kcmpl is reached.
Reconsider the formula to classify k-non-robust components ˆNBMC(U, k, ωˆ)
for not yet classified components U ⊆ V , U ≠ ∅, any k ∈ [1, kcmpl] and any
over-approximation ωˆ.
A partition (A,B) is deﬁned as follows:
\[ A := \bigwedge_{\delta \in \Delta} \bar{\delta}(s_0) \wedge T(s_0, s_1) \wedge T_U(s_0, s'_1) \wedge \varphi_{one} \wedge prop(0, 1) \]
\[ B := prop(1, k) \wedge N(0, k) \]
The formula A performs fault injection in components of U into states
constrained by the over-approximations of Δ and propagates the fault-free
and faulty computation for one additional time frame. The formula B
propagates the remaining k − 1 time frames and forces the primary outputs
to be diﬀerent in the last time frame.
Suppose all solutions of the formula have been determined and at least
one component still needs to be classified, i.e., U ≠ ∅. Consequently,
A ∧ B is unsatisfiable and an interpolant is computed with
Iˆ = itp(A,B). The interpolant Iˆ is defined over the common variables s1
and s′1, i.e., over states of the fault-free and faulty computation, and
it holds that A =⇒ Iˆ and B ∧ Iˆ is unsatisfiable based on
Definition 2.14 of Craig interpolants. Thus, the interpolant computes an
over-approximation of the fault-free and faulty computation. Adding this
interpolant Iˆ to the formula A
 1  begin
 2    A = ωˆ ∧ T(s0, s1) ∧ TU(s0, s′1) ∧ φone ∧ prop(0, 1);
 3    A′ = prop(0, 1);
 4    B = prop(2, k) ∧ N(0, k);
 5    while !SAT?(A ∧ A′ ∧ B) do
 6      δ = itp(A ∧ A′, B);
 7      if δ =⇒ A then
 8        return U
 9      end
10      A = A ∨ δ;
11    end
12    return ∅;
13  end
Pseudocode 8: ITP-classiﬁer 3.
yields A′ = A ∨ Iˆ(s1, s′1). Once an interpolant is computed, a
fixed-point test is performed by checking whether the new interpolant
implies the disjunction of the previously computed interpolants. In this
case a fixed-point is reached and the components of U are considered as
unbounded dangerous, Dkcmpl ⊆ U. If the formula becomes satisfiable, the
procedure restarts while incrementing the value of k and discarding all
interpolants. The reason why the instance becomes satisfiable varies:
1) a real fault on at least one component becomes observable, 2) the
approximation ωˆ might be too coarse, or 3) the computed interpolants Iˆ
are too coarse. In every case the procedure is restarted while k is
incremented. A detailed algorithm follows in the next section.
Algorithm
An algorithm that performs ﬁxed-point computation on the property is
introduced in the following:
Algorithm 8:
Fixed-point computation on the property: FixedProp.
• Input: The algorithm gets a transition system M = (I, S, T ), and a
set of components that have to be proven to be unbounded dangerous or
robust. Moreover, an arbitrary over-approximation ωˆ is given as
input.
• Output: The algorithm returns either the set of components that
are unbounded dangerous or robust, or the algorithm returns the
empty set which means that the approximations are too weak or at
least one component is non-robust.
• Description: The code of the algorithm is shown in Pseudocode 8
and described in the following.
The algorithm starts by initializing the dedicated partitions as
introduced in the previous section. The while-loop iterates as long
as the approximation on the property does not get too
coarse, i.e., until the formula becomes satisfiable. In the loop, an
interpolant is computed. In Line 7, it is checked whether a
fixed-point is reached. If a fixed-point is found, the components of
the set U are classified as unbounded dangerous or robust
components, and U is returned in Line 8. Otherwise, if no
fixed-point is found, the approximation is added to the formula A,
which virtually unrolls the transition relation. Once the approximation
gets too coarse, the loop terminates and an empty set is returned.
6.3.6 Complete Algorithm of the ITP-classiﬁer
All three parts that form the ITP-classifier have been presented in the
previous sections. In this section an algorithm is presented that
integrates all three parts into a complete and effective classifier.
Algorithm 9:
ITP-classiﬁer
• Input: The algorithm gets a transition system M = (I, S, T )
derived from circuit C = (V,E), and a set of components to be
classiﬁed U ⊆ V .
• Output: The algorithm returns non-robust and robust components,
respectively.
 1  begin
 2    k = l = 0;
 3    S = D = T = Δ = ∅;
 4    while U ≠ ∅ do
 5      S = sol(NBMC(U, l, k));
 6      U = U \ S;
 7      if U = ∅ ∨ (l = lcmpl ∧ k = kcmpl) then
 8        return (T, S)
 9      δ = ADQ(M, PN(U, l));
10      if δ ≠ ∅ then
11        Δ = Δ ∪ {δ};
12      end
13      if necessary states are covered with respect to k then
14        l = 0;
15      else
16        l = l + 1;
17        continue;
18      end
19      Sˆlk = sol(ˆNBMC(U, l, k));
20      Dˆlk = sol(ˆDBMC(U \ Sˆlk, l, k));
21      Tˇlk = U \ (Sˆlk ∪ Dˆlk);
22      U = U \ Tˇlk;
23      T = T ∪ Tˇlk;
24      if U = ∅ then return (T, S);
25      if k = 0 then
26        k = 1;
27        continue;
28      end
29      W = FixedProp(M, Dˆlk, ⋀δ∈Δ δ¯(s0));
30      T = T ∪ W;
31      k = k + 1;
32    end
33  end
Pseudocode 9: ITP-classiﬁer.
• Description: The code of the algorithm is shown in Pseudocode 9
and described in the following.
The algorithm starts by initializing the required indices l and k to
zero, and the sets of classified components and adequate approximations
to be empty.
The while-loop iterates as long as not all components are classified.
In the first step of the loop, non-robust components are classified with
respect to l and k. These classified components are excluded from
further classification in Line 6. Furthermore, in Line 7 it is checked
whether all components are classified or the respective
completeness thresholds are reached. In this case, a complete
analysis has been performed and the respective sets of classified
components are returned.
Otherwise, an adequate approximation is determined with respect to
k. If an adequate approximation has been determined, this
approximation is added to the entire set of adequate approximations
Δ.
Line 13 checks whether all necessary states are covered with respect to
the current value of k. This is realized by the ITP-classifier-1 but the
detailed pseudocode is omitted here. If all necessary states are
covered, the algorithm proceeds with Line 19. Otherwise, l needs to be
incremented by one to cover more states.
Line 19 to Line 23 determine an over-approximation of k-non-robust
and k-dangerous components that finally implies a subset of robust
components. The subset of robust components is added to the
entire set of robust components. If no more components need to be
classified, the robust and non-robust components are returned in Line 24.
In the remaining code, the fixed-point computation on the property
for the potentially unbounded dangerous components Dˆlk is
performed. This is technically possible if k is greater than 0, which is
checked in Line 25.
If k is greater than 0, the algorithm FixedProp is called with the
transition system M , the potentially dangerous components Dˆlk, and
an over-approximation determined by the adequate approximations.
This algorithm either returns a set of unbounded dangerous or robust
components or an empty set. In either case, this set is added to the
entire set of robust components.
At the end, the computation for the current value of k is complete
since all necessary states have been covered. Consequently, k is
incremented by one and the while-loop iterates.
6.4 COMP-classiﬁer
When a problem instance gets too large, partitioning is applied in various
fields of computer science. In formal hardware verification the search
space often increases exponentially with respect to the size of the input.
For example, the state explosion problem in symbolic model checking
prevents the verification of larger systems from being effective.
Divide-and-conquer-based methods are available to verify larger systems.
For example, [BCM+90, CGJ+00, CLM89] successfully applied these techniques.
The engines introduced so far consider the entire circuit within one
monolithic problem instance. In this section a compositional approach is
introduced. Basically, a set of subcircuits of a circuit is given and the
classification is performed locally on the subcircuits and, if necessary,
the classification is composed with the entire circuit. A new notion of
non-robust components is introduced since the local classification on the
subcircuit may not hold on the entire circuit. Consequently, it is
required to validate the classification with respect to the entire
circuit, which is performed by two additional checks: a justification
check and a propagation check. The validation can be performed in
different ways, e.g., simulation yields approximate results, whereas
formal techniques yield exact results but are computationally
expensive. This approach results in the COMP-classifier, which has
been published in the author’s work [FF10].
The compositional approach considers combinational circuits. The
classiﬁcation locally on the subcircuit can be performed by any engine that
at least classiﬁes the circuit with respect to the CombModel. In fact, the
BMC-classiﬁer, ATPG-classiﬁer, and ITP-classiﬁer can be eﬀectively used
to perform the local classiﬁcation.
However, the performance of the COMP-classifier depends strongly on
the choice of the subcircuits. An idea of how to choose good subcircuits
is presented later in this section.
6.4.1 General Approach
Given a combinational circuit C divided into a set of not necessarily disjoint
subcircuits P = (C1, . . . , Cn) with ⋃Ci∈P Ci = C. The components of the
COMP-classiﬁer 6.4
Figure 6.4: General idea of locally classifying subcircuits (diagram: the
subcircuit Ci inside the circuit C, with its fanin cone towards the inputs
and its fanout cone towards the outputs)
circuit C are classified in two steps for each subcircuit Ci ∈ P : 1) the
classification of the components of Ci is performed locally by any
classifier that classifies components with respect to CombModel, and
2) the result of the classification is composed while considering the
entire circuit. In Figure 6.4 the basic idea is illustrated. Consider the
subcircuit Ci, which is locally classified by any classifier while
omitting the fanin cone and fanout cone. Therefore, the problem instance
of classifying Ci is reduced according to the size of the subcircuit.
Consequently, the local classification may perform much faster but may
require additional checks.
6.4.2 Local Classiﬁcation
The compositional approach considers combinational circuits. Therefore,
the class model CombModel is applied. Hence, there are only non-robust,
or robust components to be classiﬁed. Due to the local classiﬁcation on the
subcircuit, a new notion of non-robust components is necessarily introduced
since surrounding logic of the subcircuit is ignored.
Deﬁnition 6.3. Let Ci = (Vi, Ei) with Ci ∈ P be the current subcircuit
to be analyzed. A component g ∈ Vi is called locally non-robust if g is
classiﬁed as non-robust on the subcircuit Ci.
This new kind of locally classiﬁed components is analogously introduced
for robust components as follows:
Definition 6.4. Let Ci = (Vi, Ei) with Ci ∈ P be the current subcircuit to
be analyzed. A component g ∈ Vi is called locally robust if g is
classified as robust on the subcircuit Ci.
Locally non-robust components g ∈ Vi are stored in the set SCi, whereas
locally robust components g ∈ Vi are stored in the set TCi, and it holds
SCi ∪ TCi = Vi.
The robustness of a subcircuit Ci is given by:
\[ R_{C_i} = \frac{|T_{C_i}|}{|V_i|} \]
which corresponds to the robustness measure for combinational circuits
from Section 4.1.
However, if the circuit is equipped with a fault signal F , a certain
condition on this signal must hold for the compositional approach in order
to exploit the following lemma.
Faults reported locally by the fault signal Fi ∈ Vi in subcircuit Ci must
be reported by the fault signal F ∈ V of the entire circuit C as well.
This assumption is called fault signal implication and can be verified by
a model checker. Based on this, the following lemma provides a powerful
property that can be easily exploited and reduces the run times
significantly.
Lemma 6.3. Given a component g ∈ Vi of subcircuit Ci = (Vi, Ei). If g is
locally robust then g is also robust under the fault signal implication. Thus,
g ∈ Ti =⇒ g ∈ T.
Proof. If there is no scenario and no CTF that leads to faulty behavior at
the primary outputs and the fault signal does not report any fault on the
subcircuit, then there is also no scenario and no CTF at the entire
circuit.
Components that are locally robust need not be analyzed further, which
lowers the run time since additional checks are not necessary. Once
all components are locally classified, a lower bound for the robustness is
provided by:
\[ R^{comp}_{lb} = \sum_{C_i \in P} \frac{|T_i|}{|V|} \le \frac{|T|}{|V|} \]
where T is the set of robust components determined by analyzing the entire
circuit.
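The lower bound can be evaluated directly from the local classification results. The following is a small sketch under the assumption that the subcircuits are pairwise disjoint, so that no locally robust component is counted twice; the function name `robustness_lower_bound` and the component names are hypothetical.

```python
from fractions import Fraction
from typing import Iterable, Set


def robustness_lower_bound(local_robust: Iterable[Set[str]], n_components: int) -> Fraction:
    """Sum of |T_i| / |V| over all subcircuits.

    local_robust : one set of locally robust components T_i per subcircuit
                   (assumed pairwise disjoint in this sketch)
    n_components : |V|, the number of components of the entire circuit
    """
    return sum((Fraction(len(t), n_components) for t in local_robust), Fraction(0))


if __name__ == "__main__":
    # Two subcircuits of a circuit with six components.
    print(robustness_lower_bound([{"g1", "g2"}, {"g5"}], 6))
```

If the subcircuits overlap, counting the union of the sets T_i instead of summing their sizes would avoid double counting; this variation is an assumption of this note and not part of the formula above.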
In contrast, the classification of locally non-robust components needs
to be validated further on the entire circuit. This is done by composing
the relevant logic parts of the circuit as introduced in the following
section.
6.4.3 Composite Classiﬁcation
Since the local classification on the subcircuits ignores the surrounding
logic, each classification of non-robust components needs to be composed
with the entire circuit. Scenarios that show faulty behavior at the
subcircuit level may not be possible at the entire circuit level.
Therefore, the respective scenarios need to be validated against the
entire circuit in terms of whether the scenarios are justifiable and the
faulty output is observable at the entire circuit’s outputs. This
validation is divided into two checks: 1) a check for justification of
the scenarios, and 2) a check for propagation of the faulty output. Once
the checks are performed, it is decided accordingly whether the component
is non-robust or robust based on the entire circuit. Both checks are
introduced in the following.
Justiﬁcation
Each classification of a locally non-robust component g ∈ Ci comes with a
set of pairs consisting of a scenario and a faulty output:
Tg = {(X1, Y1), . . . , (Xm, Ym)},
where Xi is an assignment to the subcircuit’s inputs and Yi is an
assignment to the subcircuit’s outputs. The checker for justification,
denoted as j−checker, returns the subset of pairs whose inputs of the
subcircuit are justifiable within the entire circuit:
j−checker(Tg, C) := {(Xi, Yi) ∈ Tg | Xi is justifiable in C} ⊆ Tg
If the checker determines that none of the scenarios can be justified,
the respective component g is classified as robust. Otherwise, i.e., if
there is at least one scenario that is justifiable, the propagation of the
faulty output needs to be checked.
Propagation
Given a set of faulty outputs according to the justifiable inputs with
T ′g = j−checker(Tg, C). The propagation checker denoted by
p−checker(T ′g, C) ∈ {FALSE, TRUE} returns whether a faulty output is
observable at the circuit’s primary outputs as follows:
\[ p\text{-}checker(T'_g, C) = \begin{cases} \text{FALSE} & \text{if } Y_i \text{ is not observable } \forall (X_i, Y_i) \in T'_g \\ \text{TRUE} & \text{otherwise} \end{cases} \]
Overall, if the p−checker returns FALSE for the component g,
then g is classified as robust. Otherwise, if the p−checker returns TRUE,
i.e., faulty behavior is observable, the component is classified as
non-robust.
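The interplay of j−checker and p−checker can be sketched on a tiny combinational circuit by exhaustive enumeration of the primary inputs. Everything concrete in this example is an assumption made for illustration: the fanin cone `fanin`, the fault-free subcircuit `sub`, the fanout cone `fanout` with side input `a`, and the function names `j_checker`, `p_checker` and `classify`.

```python
from itertools import product
from typing import List, Tuple

Scenario = Tuple[int, int]          # assignment X to the subcircuit inputs
Pair = Tuple[Scenario, int]         # (X, Y): scenario and faulty output


def fanin(a: int, b: int) -> Scenario:
    """Hypothetical fanin cone: primary inputs -> subcircuit inputs (x, y)."""
    return (a & b, a | b)


def sub(x: int, y: int) -> int:
    """Hypothetical fault-free subcircuit output z."""
    return x ^ y


def fanout(a: int, z: int) -> int:
    """Hypothetical fanout cone with side input a: primary output."""
    return z & a


def j_checker(pairs: List[Pair]) -> List[Pair]:
    """Keep the pairs whose scenario X is producible by some primary input."""
    justifiable = {fanin(a, b) for a, b in product((0, 1), repeat=2)}
    return [(X, Y) for (X, Y) in pairs if X in justifiable]


def p_checker(pairs: List[Pair]) -> bool:
    """True iff some faulty output Y changes the primary output."""
    for X, Y in pairs:
        for a, b in product((0, 1), repeat=2):
            if fanin(a, b) == X and fanout(a, Y) != fanout(a, sub(*X)):
                return True
    return False


def classify(pairs: List[Pair]) -> str:
    """Composite classification of one locally non-robust component."""
    justified = j_checker(pairs)
    if not justified:
        return "robust"
    return "non-robust" if p_checker(justified) else "robust"


if __name__ == "__main__":
    print(classify([((1, 0), 0)]))   # scenario cannot be justified
    print(classify([((0, 1), 0)]))   # justified and observable for a = 1
    print(classify([((0, 0), 1)]))   # justified, but masked by a = 0
```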
6.4.4 Flow of Validation
There are several ways in which the j−checker and the p−checker can
be realized and how they interact. Two types of interaction are presented.
One possible interaction is that the j−checker and the p−checker interact
incrementally. That means, once a component has been classified as locally
non-robust, a single scenario and one faulty output have been obtained
that need to be checked for justification. Instead of enumerating all
those scenarios and faulty outputs, the single scenario is first checked
for justification using the j−checker. If the scenario is justifiable, the
corresponding single faulty output is checked for propagation using the
p−checker. If the faulty output is observable at the primary outputs of
the circuit, the respective component is classified as non-robust.
Otherwise, if the faulty output is not observable, the check for
justification repeats with the next scenario and faulty output.
Thus, in the best case a single component can be classified as
non-robust without enumerating and checking all scenarios and faulty
outputs.
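The incremental interaction can be sketched as a loop over a lazily produced stream of scenarios that stops at the first justifiable and observable pair. The function `validate_incrementally` and its callback parameters `is_justifiable` and `is_observable` are hypothetical names standing in for single-scenario versions of the j−checker and p−checker.

```python
from typing import Callable, Iterable, Tuple

Scenario = Tuple[int, ...]


def validate_incrementally(
    scenarios: Iterable[Tuple[Scenario, int]],
    is_justifiable: Callable[[Scenario], bool],
    is_observable: Callable[[Scenario, int], bool],
) -> Tuple[str, int]:
    """Check one (scenario, faulty output) pair at a time.

    Returns the classification and the number of pairs that were consumed,
    so that early termination is visible to the caller.
    """
    checked = 0
    for X, Y in scenarios:
        checked += 1
        # Propagation is only checked for justifiable scenarios.
        if is_justifiable(X) and is_observable(X, Y):
            return "non-robust", checked
    return "robust", checked


if __name__ == "__main__":
    stream = iter([((1, 0), 0), ((0, 1), 0), ((0, 0), 1)])
    result = validate_incrementally(
        stream,
        is_justifiable=lambda X: X != (1, 0),
        is_observable=lambda X, Y: X == (0, 1),
    )
    print(result)            # stops after the second pair
    print(next(stream))      # the third pair was never requested
```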
The more general interaction is to ﬁrst compute a set of a certain size
of scenarios during the classiﬁcation and perform the validation on this set.
The eﬀectiveness of the approach depends on the circuit and the respective
partitioning.
6.4.5 Realization of the Validation
The validation in terms of j−checker and p−checker can be realized
differently. The realizations can be divided into formal and non-formal
methods. Consequently, the results might be exact or approximate, which
has a direct influence on the validation and finally the classification.
Therefore, various realizations and the consequences for the
classification are presented.
Basically, the j−checker considers the fanin cone of the inputs of the
respective subcircuit and checks whether the scenarios can be justified.
In contrast, the p−checker considers the fanout cone of the subcircuit’s
output and checks whether the faulty output is observable at the circuit’s
primary outputs.
Justiﬁcation by Simulation
The check for justification by simulation can be performed very fast.
However, the result might not be exact since simulation checks the
scenarios non-exhaustively. The simulation applies a certain number of
stimuli to check whether the scenario is justifiable. If the scenario
cannot be justified under this number of stimuli, the j-checker reports
that the scenario is not justifiable, which is only an approximate result.
Consequently, the j-checker based on simulation returns a subset of the
scenarios that are justifiable.
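A minimal sketch of such a simulation-based j-checker is given below. The fanin cone is modeled as a plain Python function from primary input values to subcircuit input values; this representation, the function names, and the toy cone are assumptions for illustration only.

```python
import random
from typing import Callable, Tuple

Bits = Tuple[int, ...]


def justify_by_simulation(
    fanin_cone: Callable[[Bits], Bits],
    num_primary_inputs: int,
    scenario: Bits,
    num_stimuli: int,
    seed: int = 0,
) -> bool:
    """Return True if some random stimulus produces the scenario."""
    rng = random.Random(seed)
    for _ in range(num_stimuli):
        stimulus = tuple(rng.randint(0, 1) for _ in range(num_primary_inputs))
        if fanin_cone(stimulus) == scenario:
            return True   # witness found: the scenario is justifiable
    return False          # "not justifiable" is only an approximate answer


def toy_cone(x: Bits) -> Bits:
    """Hypothetical fanin cone: subcircuit inputs are (a AND b, a OR b)."""
    return (x[0] & x[1], x[0] | x[1])
```

For the toy cone, the scenario (1, 0) can never be produced, so the checker always answers False for it. The scenario (1, 1) is justifiable, but with a budget of zero stimuli the checker still answers False, which shows why the returned set is only a subset of the justifiable scenarios.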
Justiﬁcation by SAT techniques
The check for justification based on SAT techniques translates the
problem into a SAT problem, which yields an exact method for determining
justifiable scenarios. The necessary logic of the circuit is translated
into a CNF and the respective values of the scenario are constrained
appropriately. The resulting CNF is satisfiable iff the scenario is
justifiable. Otherwise, the CNF is unsatisfiable and the scenario is not
justifiable. Consequently, the j-checker based on SAT techniques
determines the exact set of justifiable scenarios.
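The translation can be illustrated on a toy fanin cone. The sketch below uses a tiny hand-written DPLL procedure as a stand-in for a real SAT solver such as MiniSAT; the variable numbering, the cone, and all function names are assumptions for this example.

```python
from typing import List, Optional, Tuple

Clause = List[int]  # DIMACS-style literals: v means true, -v means false


def dpll(clauses: List[Clause],
         assignment: Tuple[int, ...] = ()) -> Optional[Tuple[int, ...]]:
    """Tiny DPLL search; returns a satisfying (partial) assignment or None."""
    simplified = []
    for clause in clauses:
        if any(lit in assignment for lit in clause):
            continue                      # clause already satisfied
        reduced = [lit for lit in clause if -lit not in assignment]
        if not reduced:
            return None                   # conflict: clause falsified
        simplified.append(reduced)
    if not simplified:
        return assignment
    # Prefer a unit clause, otherwise branch on the first open literal.
    lit = next((c[0] for c in simplified if len(c) == 1), simplified[0][0])
    for choice in (lit, -lit):
        result = dpll(simplified, assignment + (choice,))
        if result is not None:
            return result
    return None


# Tseitin encoding of a toy cone: 1 = a, 2 = b, 3 = a AND b, 4 = a OR b.
CONE = [[-3, 1], [-3, 2], [3, -1, -2],
        [4, -1], [4, -2], [-4, 1, 2]]


def justify_by_sat(scenario: Tuple[int, int]) -> bool:
    """Constrain the subcircuit inputs (variables 3 and 4) to the scenario."""
    units = [[3 if scenario[0] else -3], [4 if scenario[1] else -4]]
    return dpll(CONE + units) is not None


justifiable = [s for s in [(0, 0), (0, 1), (1, 0), (1, 1)] if justify_by_sat(s)]
print(justifiable)  # [(0, 0), (0, 1), (1, 1)]
```

In contrast to simulation, the answer for the scenario (1, 0) is a proof: the CNF is unsatisfiable, so no stimulus can produce it.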
Approximate Propagation by SAT techniques and Simulation
Propagation can also be realized by simulation and SAT techniques. The
check for propagation based on simulation is realized similarly to the
j-checker and yields an over-approximation of robust components due to the
non-exhaustive search. A SAT-based check leads to an equivalence check
that is significantly harder to solve but delivers exact results. If this
formal check includes all relevant circuit logic, the resulting problem
becomes very complex and may require long run times.
A more lightweight check is used to reduce the complexity. An approximate
check for propagation by SAT techniques considers the fanout cone of the
subcircuit's outputs while omitting the off-path inputs of the cone. An
equivalence check is then performed. The result is an over-approximation
of whether faulty outputs are observable.
6.4.6 Inﬂuence of Choosing Subcircuits
The choice of the subcircuits significantly influences the effectiveness of
the approach. There are various ways to compute a set of subcircuits.
Available partitioning algorithms, e.g., [KL70], determine a disjoint set of
subcircuits. However, such a disjoint partition might not be appropriate for
the compositional approach since including checker circuitry within each
subcircuit may significantly decrease the run time. Both cases, one
including the checker and one excluding the checker, are described:
• The local classification of a subcircuit that includes the checker
circuitry may report components as locally robust that are robust
on the entire circuit as well, because the checker circuitry detects
all internal faults. Therefore, the classification is already completed
on the subcircuit and does not require any validation of scenarios
and faulty outputs on the entire circuit. Thus, the compositional
approach is very effective since no additional validation is required
and the search space for each classification based on the subcircuit is
considerably reduced.
• In contrast, the classification of a subcircuit that excludes the checker
circuitry might be computationally very expensive. Suppose the
components of a subcircuit are classified as locally non-robust.
Consequently, the j-checker and the p-checker validate whether each
component is non-robust or robust on the entire circuit.
Suppose the j-checker determines that all scenarios are justifiable.
The number of scenarios might be exponential in the number of inputs
of the subcircuit. Consequently, the j-checker has to check a very
large number of scenarios, where each check is computationally
complex. Afterwards, the p-checker determines that none of the faulty
outputs is observable at the primary outputs of the entire circuit,
since the fault signal reports all faults, and it is concluded that
the component is robust on the entire circuit.
Overall, both cases demonstrate how the choice of the subcircuits
influences the effectiveness of the compositional approach. In the first
case the classification is completed on the subcircuit itself, and in the
second case the classification is completed only after a long series of
additional checks.
6.4.7 Comparison of Accuracy
Table 6.1 lists the accuracy obtained when combining different techniques
for justification and propagation.
Table 6.1: Accuracy of combining approximate and exact validation
Justification   Propagation           Robustness
simulation      simulation            upper bound
exact (SAT)     simulation            upper bound
simulation      exact (SAT)           upper bound
exact (SAT)     exact (SAT)           exact
exact (SAT)     approximation (SAT)   lower bound
Exact computation of the robustness is guaranteed when the checks for
justification and propagation are both realized by exact SAT techniques. A
lower bound of the robustness is provided when a SAT-based abstracted
propagation check is used that ignores certain parts of the circuit.
In the remaining cases, i.e., whenever simulation is used, an upper bound
of the robustness is provided, since the non-exhaustive exploration of the
search space may miss some corner cases.
In this thesis, an exact check for justification using SAT techniques and
an approximate propagation check using SAT techniques are used, which
together provide a lower bound of robustness.
6.5 SIM-classiﬁer
Simulation is extensively used in the field of functional verification of
very complex circuits. Random simulation does not exhaustively analyze the
entire search space; it provides only a rough approximation, but within
short run times. In a tightly integrated verification flow, random
simulation is an integral part of verifying complex industrial circuits.
In the context of robustness checking, random simulation is used in two
ways. First, random simulation is used as a pre-processing step in order
to approximate k-non-robust and k-dangerous components before starting
the formal engines. Second, random simulation is tightly integrated
into the introduced SAT-based approaches, which guide the simulation
into a particular part of the search space.
In this section random simulation for robustness checking is introduced.
Similar to the ATPG-classifier, the classification of the components is
performed step by step. Once a component is classified, the next component
is analyzed. Usually, several thousand input stimuli are generated and
simulated per component. Randomly chosen input values are used to stimulate
the circuit over a certain number of time frames. However, besides
generating input stimuli, fault injection has to be performed for each
component in order to model CTFs. Due to the component model introduced
in Section 2.3.1, fault injection becomes more complex since not only
Boolean values are allowed. Depending on the bit size of the respective
component, numerous faults are possible and need to be explored
appropriately during simulation.
Due to the nature of simulation, generating input stimuli and fault
injections cannot be performed exhaustively because of the limited
computational resources. Consequently, random simulation only approximates
the classification and is a non-formal approach. However, the entire flow
may benefit from random simulation because some components are easily
classified, for example due to their structural and functional properties.
In contrast, for circuits that contain almost only robust components,
random simulation constitutes an overhead, since the robustness of
components cannot be proven because of the incomplete analysis.
Hence, random simulation is used to classify components as k-non-robust
and k-dangerous, respectively. If a single scenario and a single CTF lead
to misbehavior, the component is classified. The SIM-classifier is based
on this idea and is presented in more detail in the following section.
6.5.1 Algorithm
Random simulation is used to compute an approximation of k-non-robust
and k-dangerous components, which may improve the overall performance of
robustness checking. The more components are classified by simulation,
the fewer components need to be checked by the formal engines, which
potentially yields better run times.
In Algorithm 6.5.1 the pseudo-code of the SIM-classifier is presented.
Algorithm 10: SIM-classifier.
• Input: A circuit C = (V, E) to be analyzed, a set of components to
be classified U ⊆ V, the number l̄ of time frames from the initial
state, and the size k̄ of the observation window.
• Output: The approximate set of classified components according
to l̄ and k̄.
• Description: In Figure 6.5 a schematic view of random simulation
for robustness checking is shown. At first, the injection state is
computed by applying l̄ randomly chosen stimuli starting from the
initial state. This computed state is used for fault injection. Before
performing fault injection for each component, the fault-free states and
primary outputs need to be computed, which constitute the reference
values. After determining the reference values, the components of U
are analyzed. For each component g ∈ U, a randomly chosen value
according to F(g) is set to g's output and the circuit is simulated with
the same stimuli as used for computing the reference values. Once the
primary outputs differ from the reference values, the component is
classified as k-non-robust. If, after k̄ simulation steps, the primary
outputs are still equal to the reference values, the states are checked
for inconsistency. If the states differ, the component is classified as
k-dangerous and is checked further with new randomly chosen stimuli.
In each step, while comparing against the reference values, it is also
checked whether the fault signal is set to one, which means that the
fault is detected under the current stimuli; in that case the component
is not classified as k-non-robust or k-dangerous.
Finally, if unclassified components remain, the entire simulation is
repeated until μ stimuli have been applied.
[Figure 6.5: General idea of the SIM-classifier. Random simulation over at most l̄ time frames leads to the injection state; from there, one correct computation and several CTF simulations run over k ∈ [0, k̄] time frames while outputs and states are checked. The procedure is repeated for μ scenarios.]
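The description of Algorithm 10 can be turned into a compact sketch. The code below is an illustration under several assumptions that are not taken from the thesis: the circuit is given as a hypothetical `step` function, the fault is treated as transient and injected in the first time frame of the observation window only, the window covers time frames 0 to k̄, and the toy circuit has no checker (its fault signal is always 0).

```python
import random


def sim_classify(step, initial_state, components, fault_values,
                 num_inputs, l_bar, k_bar, mu, seed=0):
    """Approximate k-non-robust and k-dangerous components by random simulation.

    step(state, stimulus, fault) -> (next_state, outputs, fault_signal), where
    fault is None (fault-free) or a pair (component, forced_output_value).
    """
    rng = random.Random(seed)

    def random_stimulus():
        return tuple(rng.randint(0, 1) for _ in range(num_inputs))

    classified = {}
    for _scenario in range(mu):
        # Injection state: at most l_bar fault-free steps from the initial state.
        state = initial_state
        for _frame in range(rng.randint(0, l_bar)):
            state = step(state, random_stimulus(), None)[0]

        # Reference run over the observation window (time frames 0..k_bar).
        stimuli = [random_stimulus() for _ in range(k_bar + 1)]
        reference, ref_state = [], state
        for stimulus in stimuli:
            ref_state, outputs, _signal = step(ref_state, stimulus, None)
            reference.append(outputs)

        for g in components:
            if classified.get(g) == "k-non-robust":
                continue  # final verdict; k-dangerous components are re-checked
            fault = (g, rng.choice(fault_values[g]))  # random value from F(g)
            faulty_state = state
            for frame, stimulus in enumerate(stimuli):
                # Assumption: the fault is transient and affects frame 0 only.
                faulty_state, outputs, fault_signal = step(
                    faulty_state, stimulus, fault if frame == 0 else None)
                if fault_signal:
                    break  # fault is reported: no classification
                if outputs != reference[frame]:
                    classified[g] = "k-non-robust"
                    break
            else:
                if faulty_state != ref_state:
                    classified[g] = "k-dangerous"
    return classified


def toy_step(state, stimulus, fault):
    """Hypothetical 1-bit circuit: output = r XOR x, next r = x, no checker."""
    values = {"g_out": state[0] ^ stimulus[0], "g_next": stimulus[0]}
    if fault is not None:
        values[fault[0]] = fault[1]
    return (values["g_next"],), (values["g_out"],), 0
```

In the toy circuit a fault at `g_next` corrupts only the register in the injection frame, so with `k_bar=0` it can at most be classified as k-dangerous, while with `k_bar=1` the corrupted register reaches the output in the next frame. Components that never show misbehavior simply stay unclassified, which reflects that simulation cannot prove robustness.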
Under-approximation
Due to its incomplete analysis, random simulation computes an
under-approximation of the reachable states while simulating stimuli.
Consequently, an under-approximation of k-non-robust and k-dangerous
components is returned according to Theorem 5.3. As in the formal
approaches, the parameters l̄ and k̄ influence the number of
classifications. However, the simulation engine cannot classify robust
components due to the non-exhaustive search. Therefore, the SIM-classifier
provides an upper bound of robustness as presented in Section 5.4.1.
Over-approximation
In contrast to usual random simulation, which considers a subset of the
reachable states, considering an over-approximation of the reachable states
is very useful in robustness checking, as illustrated in the following.
Formal classifiers exploit an over-approximation of reachable states in
order to provide a proof procedure for robust components. Here, spurious
k-non-robust and spurious k-dangerous components are classified and a
subset of robust components is finally derived. In order to improve the
overall performance of classifying spurious k-non-robust and spurious
k-dangerous components in the formal classifiers, random simulation is used
with an over-approximation of reachable states for the injection state.
Algorithm 6.5.1 is easily adapted in order to realize this classification:
Computing the injection state is skipped and a randomly chosen state is
used instead. Therefore, the parameter l̄ is not required anymore.
As a result, the simulation engine based on an over-approximation provides
a meaningful classification only when it is used as a pre-processing step
before a formal classifier, since the classification is spurious but
excludes spurious components from the formal classifier. That means,
neither a lower bound nor an upper bound of WC−RM is provided because the
classifications are spurious and robust components cannot be derived.
However, the entire flow benefits from this kind of classification, as
evaluated in detail in the experimental section.
6.5.2 Integration in the Classiﬁers
A large portion of the run time of the classifiers is caused by classifying
k-dangerous components. This step is repeated several times for different
values of k, while often a large fraction of all components is k-dangerous
even when k is small.
Random simulation is used to classify k-dangerous components before the
formal techniques are called. All components that are classified by
simulation can be skipped during the formal analysis, which may
significantly reduce the search space. That means, random simulation
roughly approximates the k-dangerous components. Two heuristics that
combine the SIM-classifier with the formal classifiers are newly introduced
in this thesis in Section 6.7.3.
6.6 Comparison of the Classiﬁers
All classifiers have been presented in the previous sections. In this
section the differences between these classifiers are discussed in more
detail.
Table 6.2: Complexity of the formal-method based classifiers.
            ATPG-classifier          BMC-classifier           ITP-classifier
Size        O(2|V| · k̄ + |V| · l̄)    Ω(2|V| · k̄ + |V| · l̄)    Ω(2|V| · k̄ + |V| · l̄)
Instances   ≥ |U|                    ≥ 1                      ≥ 1
Space       2^{|X|·(l̄+k̄)}            2^{|X|·(l̄+k̄)+|U|}        2^{|X|·(l̄+k̄)+|U|}
The five presented classifiers are divided into four formal classifiers
and one simulation-based approach. The SIM-classifier is the non-formal
approach while the remaining classifiers are based on formal methods.
All classifiers except the COMP-classifier analyze arbitrary sequential
circuits. The COMP-classifier analyzes combinational circuits and is able
to handle large circuits effectively.
6.6.1 BMC, ATPG, and ITP-classiﬁer
In the following, the BMC-classifier, the ATPG-classifier, and the
ITP-classifier are compared. As already mentioned, the BMC-classifier and
the ATPG-classifier are theoretically able to completely analyze a circuit
up to the respective thresholds. However, in practice only bounded
observation windows can be handled effectively. For many practically
relevant cases a bounded analysis suffices, since general conditions
provided by the designer make both classifiers effective verification
tools.
However, the problem instances of the BMC-classifier and the
ATPG-classifier differ: The BMC-classifier includes all components at once
in a single problem instance, whereas the ATPG-classifier considers only
one component per problem instance. Consequently, the ATPG-classifier needs
to create and solve a problem instance for each component separately, but
these instances might be significantly smaller than the BMC-classifier's
instances since only the relevant logic needs to be considered.
Moreover, the ATPG-classifier is also able to compute the EPP for each
component effectively.
The most powerful classifier is the ITP-classifier, which generally
requires no additional parameters such as the size of the observation
window required by the BMC-classifier and the ATPG-classifier. The
ITP-classifier automatically determines completeness, potentially before
reaching the completeness thresholds.
Table 6.2 lists the complexity of the ATPG-classifier, the BMC-classifier,
and the ITP-classifier. The first row specifies the size of the SAT instance
when considering the full observation window k̄ and the full reachability
window l̄. The difference between the ATPG-classifier and the
BMC-classifier and ITP-classifier is that the ATPG-classifier models the
entire circuit only in the worst case. Usually, the cone-of-influence
reduction yields smaller instances. In contrast, the SAT instances of the
BMC-classifier and the ITP-classifier always have at least this size since
they always model the entire circuit.
The next row specifies the number of SAT instances to be solved in the best
case. The ATPG-classifier classifies the components in separate problem
instances. Therefore, at least |U| instances need to be solved. In contrast,
the BMC-classifier and the ITP-classifier model all components at once in
a single instance. Consequently, in the best case only a single instance
needs to be created, which classifies all components as non-robust.
Moreover, the last row roughly specifies the size of the search space.
To classify a single component with the ATPG-classifier, values for the
primary inputs over k̄ + l̄ time frames need to be searched. A circuit with
only 5 inputs, an observation window of 10, and a reachability window of
10 already yields a huge search space: 2^{5·20} ≈ 1.27 × 10^{30}. In
contrast, the BMC-classifier and the ITP-classifier additionally need to
decide in which component a fault is injected. The size of the search space
increases significantly when considering only 50 components to be
classified: 2^{5·20+50} ≈ 1.43 × 10^{45}.
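The two numbers can be reproduced with a few lines of Python. The helper name and its parameters are chosen for this illustration; the formula is the one from the last row of Table 6.2.

```python
def search_space_size(num_inputs, l_bar, k_bar, num_components=0):
    """2^(|X| * (l_bar + k_bar) + |U|); use num_components=0 for the ATPG case."""
    return 2 ** (num_inputs * (l_bar + k_bar) + num_components)


atpg = search_space_size(5, 10, 10)
bmc_itp = search_space_size(5, 10, 10, num_components=50)
print(f"{atpg:.2e}")     # 1.27e+30
print(f"{bmc_itp:.2e}")  # 1.43e+45
```

The additional 50 components enlarge the search space by a factor of 2^50, i.e., by about 15 orders of magnitude.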
Although the search space of the BMC-classifier and the ITP-classifier is
significantly larger, learnt information is transferred across the
classifications since the instances are solved incrementally. In the
ATPG-classifier, learnt information can be exploited for a single component
over various time frames.
6.7 RobuCheck
RobuCheck is a unified push-button tool that integrates all classifiers
presented in this thesis into a highly optimized flow for robustness
checking. RobuCheck was first published in [FFSD10] as a static
verification tool and was considerably improved in [FHD+11]. Moreover,
concurrent classification to exploit multi-core processor architectures is
introduced for the first time in this thesis. The back-end of RobuCheck
relies on the verification framework WoLFram [SKF+09], which provides basic
functionality for analyzing Boolean circuits.
[Figure 6.6: System overview of RobuCheck. Blocks: scheduler, classifiers, backends, data storage with synchronization, and RTLvisionPro; inputs: circuit and configuration; output: classifications.]
6.7.1 System Overview
RobuCheck automatically computes the robustness by analyzing any part
of a digital circuit. The input of RobuCheck is a digital circuit in VHDL
or Verilog format and a configuration file that specifies certain parameters
of the analysis.
In Figure 6.6 RobuCheck's system overview is shown. The input, i.e., a
circuit and a configuration file, is given to the scheduler. The scheduler
reads the configuration file, which specifies a Classification Process
(CP) ρ that is described later.
The scheduler calls the classifiers in the configured order. The
classifiers communicate with the backends, i.e., SAT solvers and simulation
engines. Each classifier has access to a shared memory that stores the
classified components. This shared memory can be accessed concurrently and
is protected by a synchronization mechanism. Once a classifier determines a
classification, this information is written to the data storage such that
the remaining classifiers may benefit from it, i.e., the classifiers can
skip the already classified component, which reduces the overall run time.
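The role of the synchronized data storage can be illustrated with a small sketch. RobuCheck itself is implemented in C++ and its storage interface is not described here, so the class below, its method names, and the use of Python threads are purely hypothetical.

```python
import threading


class ClassificationStore:
    """Thread-safe map from component name to classification (hypothetical)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._results = {}

    def publish(self, component, classification):
        """Store a classification unless the component is already classified."""
        with self._lock:
            if component in self._results:
                return False
            self._results[component] = classification
            return True

    def unclassified(self, components):
        """Components a classifier still has to analyze."""
        with self._lock:
            return [c for c in components if c not in self._results]

    def snapshot(self):
        with self._lock:
            return dict(self._results)


def run_classifier(store, components, verdict):
    for component in store.unclassified(components):
        # A real classifier would analyze the component here.
        store.publish(component, verdict)


store = ClassificationStore()
workers = [
    threading.Thread(target=run_classifier,
                     args=(store, ["g1", "g2", "g3"], verdict))
    for verdict in ("k-non-robust", "k-dangerous")
]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()
print(sorted(store.snapshot()))  # ['g1', 'g2', 'g3']
```

Each component ends up with exactly one classification, and a classifier that starts later only sees the components that are still open.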
The result of the classification can be visualized with RTLVisionPro™
[Con09]. RTLVisionPro is a powerful visualization engine that offers
hierarchical schematic views, source code browsing, and cross-probing
between schematic view and source code. This engine is integrated in
RobuCheck via the Tool Command Language (TCL). The Graphical User Interface
(GUI) is shown in Figure 6.7. In particular, the visualization engine is
used to differentiate between the classifications. For example,
k-non-robust components are highlighted in red, robust components in green,
and k-dangerous components in yellow. The more fine-grained analysis that
considers Excitation and Propagation Probabilities (EPP), as introduced in
Section 4.2, yields a differentiation between non-robust components. A
fine-grained gradation of the red color provides a visual differentiation
as well, i.e., the more vulnerable the component, the darker the red.
[Figure 6.7: Graphical User Interface of RTLVisionPro™]
6.7.2 Technical Details
RobuCheck is implemented in C++ and uses several third-party libraries,
for example external SAT solvers. Technical internals are described in
the following.
Classiﬁcation Process
A Classification Process (CP) consists of several parameters:
• classifier (cls): The name of the classifier with
cls ∈ {BMC, ATPG, ITP, COMP, SIM, SUB-CP}
SUB-CP specifies that a list of CPs is started in parallel. The
remaining names correspond to the engines proposed before.
• observation window k̄: The parameter k̄ ∈ [0, k_cmpl] specifies the
maximum observation window to be considered.
• reachability window l̄: The parameter l̄ ∈ [0, l_cmpl] specifies the
maximum number of time frames from the initial state after which a
fault is injected.
• optimization lookup table olt: A lookup table that specifies which
optimizations are enabled.
• list of CPs ρ_1, ..., ρ_n: A list of sub-CPs that are executed in parallel.
• pre-process CP ρ_pre: The pre-process CP specifies a separate CP
that is executed beforehand.
• reserved flags REV: A list of flags dedicated to the classifiers, which
are stored in Φ_flags.
The CPs are stored in the configuration file using the Extensible Markup
Language (XML). A CP configures which classifiers are used, with which
parameters, and in which order they are executed. Since a CP internally
supports a list of CPs that are executed in parallel (SUB-CP), very
powerful configurations can be set up.
CP for BMC-classifier and ATPG-classifier: Besides the circuit, the
BMC-classifier and the ATPG-classifier get two additional parameters: the
size of the observation window k̄ and the size of the reachability window l̄.
CP for ITP-classifier: The ITP-classifier does not depend on these
parameters since completeness is determined automatically. Therefore, the
parameters l̄ and k̄ are set to the maximum possible value. However, in
certain cases a limit can be configured.
CP for COMP-classifier: The COMP-classifier analyzes combinational
circuits. Therefore, unrolling the transition relation is not required.
Hence, the parameters l̄ and k̄ are always set to zero. The general idea of
the COMP-classifier is to classify the circuit based on a set of
subcircuits, as described in detail in Section 6.4. The subcircuits are
stored in a separate file that is specified in the flags Φ_flags.
CP for SIM-classifier: The SIM-classifier additionally depends on the
number of traces that have to be simulated. This number of traces is
specified in Φ_flags[traces] as a positive number.
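The structure of a CP and the classifier-specific constraints can be summarized in a small data model. The sketch below is not RobuCheck's XML schema, which is not reproduced here; the class, the field names, and the validation rules are assumptions derived from the parameter list and the four cases above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

CLASSIFIERS = {"BMC", "ATPG", "ITP", "COMP", "SIM", "SUB-CP"}


@dataclass
class ClassificationProcess:
    cls: str                                              # classifier name
    k_bar: int = 0                                        # observation window
    l_bar: int = 0                                        # reachability window
    olt: Dict[str, bool] = field(default_factory=dict)    # enabled optimizations
    sub_cps: List["ClassificationProcess"] = field(default_factory=list)
    pre: Optional["ClassificationProcess"] = None         # pre-process CP
    flags: Dict[str, str] = field(default_factory=dict)   # classifier-specific flags

    def __post_init__(self) -> None:
        if self.cls not in CLASSIFIERS:
            raise ValueError(f"unknown classifier: {self.cls}")
        if self.k_bar < 0 or self.l_bar < 0:
            raise ValueError("windows must be non-negative")
        if self.cls == "COMP" and (self.k_bar or self.l_bar):
            raise ValueError("COMP is combinational: both windows must be zero")
        if self.cls == "SIM" and int(self.flags.get("traces", "0")) <= 0:
            raise ValueError("SIM needs a positive number of traces")
        if self.sub_cps and self.cls != "SUB-CP":
            raise ValueError("only SUB-CP may carry a list of CPs")


# Hypothetical flow: simulate first, then run ATPG and BMC in parallel.
flow = ClassificationProcess(
    cls="SUB-CP",
    pre=ClassificationProcess(cls="SIM", k_bar=5, l_bar=5,
                              flags={"traces": "1000"}),
    sub_cps=[
        ClassificationProcess(cls="ATPG", k_bar=5, l_bar=5),
        ClassificationProcess(cls="BMC", k_bar=5, l_bar=5),
    ],
)
```

Nesting `SUB-CP` entries inside `sub_cps` gives the recursive structure that makes more elaborate configurations possible.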
SAT solver
WoLFram encapsulates several solvers for Boolean satisfiability that can
be selected arbitrarily. RobuCheck focuses on three different solvers:
MiniSAT [ES03], PicoSAT [Bie08], and Lingeling [Bie10].
The most frequently used SAT solver is MiniSAT in version 2.2, which
is integrated in RobuCheck via MiniSAT's C++ API. That means,
performance-related features like incremental satisfiability can be
exploited adequately via its API. PicoSAT and Lingeling are used in the
context of interpolation as described in the following.
Interpolation
The ITP-classifier requires the computation of Craig interpolants.
Typically, the performance of interpolation-based algorithms depends on the
strength of the back-end interpolation procedures. In order to further
improve the performance of the ITP-classifier as originally published in
[FFA+12], a powerful flow to obtain interpolants is newly introduced in
this thesis.
As presented in Section 2.2.4, interpolants can be computed in two
ways: proof-based, along the resolution proof of a SAT solver, or
model-based, using any SAT solver that provides satisfying assignments.
Both techniques have been implemented in RobuCheck and are integrated in a
first-come first-served manner, as explained later. Recall that an
interpolant Î is a logical formula describing the relation of an
unsatisfiable formula pair (A, B). Each technique used in this work starts
with the formulas A and B given as CNF.
To the best of the author's knowledge, the integration of both
interpolation approaches has not been proposed in the literature so far.
The integration is done for the first time in this thesis. Due to the
significant difference between both approaches, the integration in terms of
concurrent computation is very useful.
Proof-based: McMillan's Interpolation System (MIS) computes interpolants
along a resolution proof of an unsatisfiable formula. Figure 6.8 depicts
the implemented flow to compute interpolants based on resolution proofs.
RobuCheck starts the computation by generating a trace of the proof of the
unsatisfiable formula A ∧ B. The SAT solver PicoSAT has been chosen to
generate such traces. However, the trace generated by PicoSAT does not
contain all information required to generate interpolants, such as pivot
variables. The tool tracecheck checks the trace for correctness, i.e.,
whether the resolution inference rule is correctly applied to derive the
empty clause. While checking the trace, this tool generates a resolution
proof containing sufficient information to generate an interpolant. The
library libitp gets the formula pair (A, B) and the resolution proof as
input and generates an interpolant based on MIS (see Section 2.2.4).
Finally, libitp returns the determined interpolant represented as an AIG
to RobuCheck.
[Figure 6.8: Proof-based interpolants computed with PicoSAT. Flow: RobuCheck passes the CNF to PicoSAT, PicoSAT passes the trace to tracecheck, and tracecheck passes the proof to libitp, which also receives (A, B) and returns the interpolant Î as an AIG to RobuCheck.]
Interpolants are often very large, consisting of several hundred thousand
nodes, but often contain a huge amount of redundant logic. Since
interpolants are represented as AIGs, tools that support AIG files and
minimize the graph can be applied easily, e.g., ABC [Gro12].
Model-based: In contrast to the proof-based approach, the model-based
approach does not require a resolution proof but instead enumerates
satisfying assignments. The basic idea of the approach is to compute
satisfying assignments of formula A and to create a DNF from these
assignments that finally forms the interpolant. Minimizing the assignments
is crucial for the performance of this approach. However, while minimizing
the assignments, the definition of a Craig interpolant needs to be taken
into account, i.e., each assignment abstracts A but contradicts B.
The computed interpolant is a DNF, i.e., Î = p_1 ∨ p_2 ∨ ... ∨ p_n, where
each p_i is a satisfying assignment of A with Var(p_i) = Var(A) ∩ Var(B).
Due to the exponential growth with the number of variables of A, the naïve
approach of enumerating all assignments might be very slow. Generalizing
the assignments speeds up the computation by a considerable factor. This is
performed in an extra step and is also known as cube enlargement. Three
techniques have been newly implemented to minimize the assignments. The
original paper [CIM12] implements other minimization algorithms.
Generalize p with respect to A: MIN-1. Pseudocode 10 belongs to a greedy
algorithm that enumerates assignments and generates an interpolant in terms
of a DNF. The general algorithm is revisited in Section 2.2.4. Let p be an
assignment of A′, where A′ is formula A excluding all already computed
assignments. Pseudocode 10 checks whether a variable l ∈ Var(p) is
necessary for the minterm such that p |= A′ by checking whether the
assignment p is still satisfying when the Boolean value of l is inverted.
That means, if (p \ {l}) ∪ {l̄} |= A′ holds, then the variable l can be
safely removed from the minterm, i.e., p′ = p \ {l}. The algorithm MIN-1 is
called recursively with the newly obtained minterm p′. Consequently, at
most |p| SAT calls are performed. Finally, the algorithm returns the
generalized minterm.
    begin
      for l ∈ Var(p) do
        if SAT?(A ∧ ((p \ {l}) ∪ {l̄})) then
          return MIN-1(p \ {l});
      end
      return p;
    end
Pseudocode 10: Generalize assignment p: MIN-1.
Generalize p′ with respect to B (UNSAT-core): MIN-2. According to Craig's
interpolation theorem, an assignment p |= A contradicts B, i.e., p ⊭ B.
That means p can be minimized as long as it remains unsatisfiable with B.
Pseudocode 11 exploits the variables of the unsat core of p ∧ B. For
each variable occurring in p it is checked whether the variable is
contained in the unsat core. If the variable l ∈ Var(p) is not contained in
the unsat core (l ∉ U), l is not used to derive the final conflict of the
SAT solver. Therefore, the variable l is not necessary in p and can be
safely removed from p. After checking all variables, the minimized p is
returned.
    begin
      U = Var(UCORE(p ∧ B));
      for l ∈ Var(p) do
        if l ∉ U then
          p = p \ {l};
      end
      return p;
    end
Pseudocode 11: Generalize assignment p: MIN-2.
Generalize p′ with respect to B: MIN-3. Similar to the previous approach
MIN-2, the last approach MIN-3, shown in Pseudocode 12, checks by
additional SAT calls whether each variable is necessary to derive a
conflict.
    begin
      for l ∈ p do
        if !SAT?((p \ {l}) ∧ B) then
          return MIN-3(p \ {l});
      end
      return p;
    end
Pseudocode 12: Generalize assignment p: MIN-3.
[Figure 6.9: Model-based interpolant computation. Flowchart: start with A′ = A, B, and Î = FALSE; while SAT?(A′) answers yes, compute p|_B |= A′, apply MIN-1(p|_B, A′) to obtain p′|_B and MIN-2(p′|_B, B) to obtain p″|_B; if p′|_B = p″|_B, apply MIN-3(p′|_B, B); then set Î = Î ∨ p″|_B and A′ = A′ ∧ ¬p″|_B; when SAT?(A′) answers no, return Î.]
Overall flow: The overall flow of the model-based technique is shown
in Figure 6.9. The flow starts at the incoming edge with initially A′ = A,
B, and an empty interpolant Î = FALSE. The first step is to compute an
assignment p|_B with p|_B |= A′ as a minterm that is defined over the
common variables of A and B. Subsequently, the assignment is minimized by
the presented techniques MIN-1, MIN-2, and MIN-3. After the minimization by
MIN-1, a new minterm p′|_B is obtained that may contain fewer variables.
The next step tries to minimize the minterm further with respect to the
formula B using MIN-2, which exploits the unsatisfiable core. This step is
very lightweight since only a fast lookup of whether a variable is
contained in the unsatisfiable core is performed. However, depending on the
SAT solver's internal heuristics, this step is more or less effective.
Therefore, in the next step it is checked whether the minimization by MIN-2
deleted at least one variable. If not, the minimization MIN-3 is
additionally called to try to reduce the minterm more aggressively. The
reduced minterm p″|_B is added to the interpolant Î and, to avoid the
recomputation of the same assignment, p″|_B is blocked in A′. Finally, if
at least one further assignment exists, a new iteration starts. Eventually,
no further assignment exists and the interpolant has been computed.
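The enumerate, generalize, and block loop can be sketched on small CNF formulas. The code below is a simplified illustration, not the implemented flow: it replaces the SAT solver by brute-force enumeration, applies only a deletion-based generalization in the spirit of MIN-3 (MIN-1 and MIN-2 are omitted), and represents the interpolant as a list of cubes instead of a BDD or AIG. All function names are assumptions.

```python
from itertools import product


def solve(cnf, assumptions=()):
    """Brute-force stand-in for a SAT solver: a model (set of literals) or None."""
    variables = sorted({abs(l) for c in cnf for l in c}
                       | {abs(l) for l in assumptions})
    for bits in product((False, True), repeat=len(variables)):
        model = {v if bit else -v for v, bit in zip(variables, bits)}
        if all(l in model for l in assumptions) and \
                all(any(l in model for l in clause) for clause in cnf):
            return model
    return None


def min3(cube, b):
    """Drop literals as long as the cube stays unsatisfiable together with B."""
    cube = set(cube)
    for lit in sorted(cube, key=abs):
        if solve(b, cube - {lit}) is None:
            cube.discard(lit)
    return cube


def model_based_interpolant(a, b):
    """Return a DNF (list of cubes) I with A -> I and I AND B unsatisfiable.

    Precondition: A AND B is unsatisfiable.
    """
    shared = {abs(l) for c in a for l in c} & {abs(l) for c in b for l in c}
    a_prime = [list(c) for c in a]
    interpolant = []
    while True:
        model = solve(a_prime)
        if model is None:
            return interpolant                  # no further assignment exists
        cube = {l for l in model if abs(l) in shared}   # restrict to shared variables
        cube = min3(cube, b)                    # cube enlargement
        interpolant.append(sorted(cube, key=abs))
        a_prime.append([-l for l in cube])      # block the cube in A'


A = [[1], [2]]            # x1 AND x2
B = [[-1], [2, 3]]        # (NOT x1) AND (x2 OR x3)
print(model_based_interpolant(A, B))  # [[1]]
```

In the example the projected assignment {x1, x2} is enlarged to the single literal x1, because x1 alone already contradicts B; one blocking clause then suffices to make A′ unsatisfiable.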
Technically, the DNF of the interpolant is compactly represented by
a BDD. After the complete interpolant has been computed, all prime
implicants are added to an AIG, which is returned as the interpolant. The
SAT solver Lingeling [Bie10] is used, which fully supports incremental
satisfiability and returns failed literals that are used in MIN-2.
Concurrent Computation Both approaches to compute interpolants
are integrated in RobuCheck. The approaches are started concurrently,
each in a separate process, in order to exploit multi-core systems. Once
one process returns successfully, the other process is terminated and the
interpolant of the faster approach is returned. This kind of computation
is very useful when selecting the fastest approach in advance is
difficult. Overall, each Craig interpolant is delivered by the approach
that is fastest for that particular computation.
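The first-result-wins scheme can be sketched as below. As an assumption made only to keep the example portable, threads with a cooperative stop flag stand in for the separate processes that RobuCheck terminates. The two worker functions are hypothetical placeholders for the proof-based and the model-based engine.

```python
import threading
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def race(approaches):
    """Run all callables concurrently and return (name, result) of the first one to finish."""
    stop = threading.Event()  # cooperative stand-in for terminating the losing process
    executor = ThreadPoolExecutor(max_workers=len(approaches))
    futures = {executor.submit(fn, stop): name for name, fn in approaches.items()}
    done, _ = wait(futures, return_when=FIRST_COMPLETED)
    stop.set()                     # ask the slower approach to give up
    executor.shutdown(wait=False)  # do not block on the loser
    winner = next(iter(done))
    return futures[winner], winner.result()


def proof_based(stop):
    """Placeholder: pretends to work until it is told to stop (at most 5 seconds)."""
    stop.wait(timeout=5.0)
    return "interpolant from proof"


def model_based(stop):
    """Placeholder: finishes immediately."""
    return "interpolant from models"


print(race({"proof-based": proof_based, "model-based": model_based}))
```

With real processes the loser would be killed rather than asked to stop, which is what makes the scheme attractive when neither approach dominates the other.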
6.7.3 Simulation Heuristics
The SIM-classifier is used as a pre-process to roughly approximate
non-robust components. However, the SIM-classifier is also used within
the formal engines. The classifiers prove robust components by determining
over-approximations of k-non-robust and k-dangerous components. Typically,
a large number of k-dangerous components are classified multiple times. In
order to reduce this costly computation, the SIM-classifier is tightly
integrated into the formal classifiers to approximate k-dangerous
components.
Figure 6.10: Integrated flow of the SIM-classifier into formal-methods based
classifiers (the SIM-classifier starts with time = 60s and passes k-dangerous
components to the formal-classifier; after 100 classifications, time = time × 0.86)
Heuristic #1: Repeated Simulation
Before k-dangerous components are classified formally, the SIM-classifier
is called first and analyzes exactly the observation window of size k. As a
result, the SIM-classifier determines a subset of the k-dangerous components.
To provide an effective and complete classification, the run time of the
SIM-classifier is limited by a dynamic parameter. The determined k-dangerous
components are not analyzed further by the formal classifiers, which prunes
the search space of the underlying SAT instance and saves costly SAT calls.
Moreover, after a certain number of formal classifications the SIM-classifier
is called again with more restrictive computational resources.
Figure 6.10 illustrates the integration of the SIM-classiﬁer and the formal
methods based classiﬁer denoted by formal-classiﬁer.
At the beginning of each k-dangerous classification, the maximal run
time of the SIM-classifier is limited to 60s. After termination, the results
are passed to the formal-classifier. After 100 classifications by the
formal-classifier, the SIM-classifier is called again, but its maximal run
time is reduced by a factor of 0.86. Eventually, the formal-classifier
classifies all remaining k-dangerous components since the SIM-classifier
naturally does not explore the search space exhaustively. Thereby, the
maximal run time of the SIM-classifier converges to 0.
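The shrinking time budget can be illustrated with a short sketch. The initial limit of 60s and the factor 0.86 are the values stated above; the cut-off `min_budget`, below which the simulation is no longer called, is a hypothetical parameter added only so that the loop terminates.

```python
def sim_time_budgets(initial=60.0, factor=0.86, min_budget=1.0):
    """Yield the SIM-classifier time limit of each successive call (in seconds)."""
    budget = initial
    while budget >= min_budget:  # min_budget is an assumed cut-off, not from the thesis
        yield budget
        budget *= factor         # reduction applied after every 100 formal classifications


budgets = list(sim_time_budgets())
print(len(budgets), round(sum(budgets), 1))
```

Since the limits form a geometric sequence, the total simulation time per classification run is bounded by 60 / (1 − 0.86) ≈ 428.6 seconds, no matter how often the SIM-classifier is called again.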
The concrete values for the heuristic have been obtained in preliminary
experiments, where these particular values yielded the best results.
However, the values are configurable and other values may result in better
run times for another training set.
Figure 6.11: Collecting k-dangerous components (the SIM-classifier stores 0-dangerous, 1-dangerous, ..., k̄-dangerous components via put; the formal classifiers read them via get).
This flow is integrated within the formal-methods based classifiers
analyzing sequential circuits, i.e., the BMC-classifier, ATPG-classifier,
and ITP-classifier.
Heuristic #2: Collecting k-dangerous Components
This heuristic is dedicated to multi-core architectures since it runs in
parallel to the formal classifiers. An extra classifier is created on top
of the SIM-classifier to collect only k-dangerous components for a certain
interval k ∈ [0, k̄].
In Figure 6.11 the idea of this heuristic is illustrated. The SIM-classifier
approximates k-dangerous components and stores this information in a
storage via put. Components that are classified as k-dangerous are, in
particular, also considered further for different values of k in order to
avoid multiple classifications by the formal classifiers. The formal
classifiers access this storage via get. The storage handles concurrent
access. Overall, while the SIM-classifier computes k-dangerous components,
the formal classifiers can concurrently access these classifications to
prune the search space.
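A storage with put and get that tolerates concurrent access could look like the following sketch. The class name, the keying by k, and the use of a single lock are assumptions for illustration; the thesis only states that put, get, and concurrent access are supported.

```python
import threading
from collections import defaultdict


class DangerousStore:
    """Hypothetical shared storage of components classified as k-dangerous."""

    def __init__(self):
        self._lock = threading.Lock()
        self._by_k = defaultdict(set)

    def put(self, k, component):
        """Called by the SIM-classifier: record `component` as k-dangerous."""
        with self._lock:
            self._by_k[k].add(component)

    def get(self, k):
        """Called by a formal classifier: snapshot of the known k-dangerous components."""
        with self._lock:
            return frozenset(self._by_k.get(k, ()))


store = DangerousStore()
store.put(0, "g17")
store.put(0, "g42")
store.put(3, "g17")
todo = [c for c in ["g17", "g42", "g99"] if c not in store.get(0)]
print(todo)  # components a formal classifier would still have to analyze for k = 0
```

Returning an immutable snapshot from get keeps readers independent of later put calls, at the cost of copying the set.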
6.8 Summary
This chapter introduced the classifiers that implement the theoretical
fundamentals of robustness checking. The BMC-classifier, ATPG-classifier,
and ITP-classifier formally analyze sequential circuits. The COMP-classifier
handles larger combinational circuits by decomposing the problem formulation
into smaller and potentially easier to solve instances. Moreover, the
SIM-classifier, a simulation-based approach, has been introduced; it
handles sequential circuits and provides a rough approximation of
k-non-robust and k-dangerous components.
Furthermore, RobuCheck, which integrates all these classifiers, has been
presented. In particular, a new back-end for determining interpolants has
been proposed for the ITP-classifier. This back-end concurrently computes
interpolants based on two different approaches, and the fastest approach
delivers the interpolant.
Moreover, a new model checker SimpMC exploiting the inverse interpolants
has been proposed in this chapter.
The SIM-classifier is tightly integrated into the formal classifiers to
determine k-dangerous components consecutively and concurrently by
Heuristic #1 and Heuristic #2, respectively.
Chapter 7
Experiments
In the previous chapters robustness checking has been introduced
theoretically and in terms of a tool called RobuCheck. In RobuCheck formal
and non-formal classifiers are integrated into a highly optimized flow. This
chapter presents the results of the evaluation of RobuCheck on a set of
academic and industrial benchmarks.
• First, the internal interpolation engine of RobuCheck is evaluated
on a set of benchmarks. The proof-based approach and the model-based
approach are evaluated on unsatisfiable SAT instances from the SAT
competition.
• Next, the new model checker SimpMC is evaluated on a set of benchmarks
from the Hardware Model Checking Competition and compared against a
state-of-the-art model checker.
• Then, the proposed algorithms to assess a circuit's robustness are
evaluated. The evaluation starts with the BMC-classifier, the
ATPG-classifier, and the ITP-classifier. Each of these three classifiers
is evaluated separately in terms of accuracy and run time.
• Furthermore, the SIM-classifier is evaluated and compared against a
formal classifier.
• After the separate evaluation of the classifiers, the concurrent
classification joining all three classifiers is evaluated in two different
setups. In these setups the SIM-classifier is additionally turned on.
• The COMP-classifier is evaluated on combinational ISCAS'85 circuits
and on complex arithmetic circuits.
• The more differentiated robustness measure considering the worst
case analysis and the probabilistic analysis is evaluated.
• Finally, benchmarks from IBM are used to evaluate the ITP-classifier
on industrial circuits.
Figure 7.1: Run time of model-based vs. proof-based approach (scatter plot; x-axis: run time of the model-based approach, y-axis: run time of the proof-based approach, both logarithmic from 1 to 100)
7.1 Interpolation: Model-based vs. Proof-based
The back-end of RobuCheck implements two approaches to compute
interpolants: a proof-based approach and a model-based approach. Both
have been revisited in Section 2.2.4, whereas details of the implementation,
in particular the new reduction techniques, have been introduced in
Section 6.7.2.
Randomly selected unsatisfiable SAT problems were taken from the
SAT competition benchmark set 2011 (footnote 1). Overall, 45 instances were
used to compare both approaches in terms of run time and size of the
obtained interpolants.
1 Available under http://www.satcompetition.org
Figure 7.2: Size of the interpolants computed by model-based and proof-based approach (scatter plot; x-axis: size for the model-based approach, y-axis: size for the proof-based approach, both logarithmic from 100 to 1e+07).
For each instance a partition into A and B is randomly created. The
number of common variables of A and B ranges from 63 to 126 and the size
of the entire CNF (A ∪ B) ranges from 20,000 clauses to 97,000 clauses.
The time out was set to 900 CPU seconds. The experiments were conducted
on an AMD Opteron(TM) CPU with six cores running at 2.8 GHz with 32 GB
main memory.
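How the random partition was drawn is not described here. One simple possibility, assumed purely for illustration, is to assign each clause to A or B with equal probability and then count the variables shared by both parts. The function names and the seed parameter are hypothetical.

```python
import random


def random_partition(clauses, seed=0):
    """Split a CNF (list of clauses of signed ints) randomly into two parts A and B."""
    rng = random.Random(seed)  # fixed seed so that the partition is reproducible
    a, b = [], []
    for clause in clauses:
        (a if rng.random() < 0.5 else b).append(clause)
    return a, b


def common_variables(a, b):
    """Variables that occur in both parts; the interpolant is defined over these."""
    vars_a = {abs(lit) for clause in a for lit in clause}
    vars_b = {abs(lit) for clause in b for lit in clause}
    return vars_a & vars_b


cnf = [[1, -2], [2, 3], [-1, 3], [-3, 4], [4, 5]]
part_a, part_b = random_partition(cnf, seed=1)
print(len(part_a), len(part_b), sorted(common_variables(part_a, part_b)))
```

The number of common variables matters for the comparison because it bounds the search space of the model-based approach.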
Figure 7.1 shows a scatter plot of the run time in CPU seconds for both
approaches. Each point represents a single instance. If a point is below
the diagonal, the proof-based approach was faster than the model-based
approach and vice versa. Overall, the points are spread almost equally
over both areas, i.e., both approaches perform similarly.
Figure 7.2 shows a scatter plot of the size of the obtained interpolants
for both approaches. The size is given as the number of And-Inverter
Graph (AIG) nodes, which corresponds to the number of AND gates.
All interpolants obtained by the model-based approach are smaller
than the interpolants obtained by the proof-based approach. The
proof-based approach generates very large interpolants, the largest of
which contains more than 2 million nodes, whereas the model-based approach
generates an interpolant with 223 nodes for the same benchmark. The
discrepancy in the size of the obtained interpolants is significant.
However, the model-based approach runs out of time in some cases where the
proof-based approach provides an interpolant within the time limit.
The run time of the model-based approach depends significantly on the
number of common variables of the partition (A, B) since the number of
assignments that may have to be computed grows exponentially with that
number.
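As a rough worked illustration of this growth, assume that n denotes the number of common variables. The bound below is the worst case for enumerating full assignments; the figure for n = 63 uses the smallest number of common variables reported for the benchmark set.

```latex
\[
  \#\,\text{assignments} \;\le\; 2^{n},
  \qquad n = |\mathrm{Var}(A) \cap \mathrm{Var}(B)|,
  \qquad 2^{63} \approx 9.2 \times 10^{18}.
\]
```

Each minimized cube with m remaining literals covers 2^{n-m} full assignments at once, which is why the minimization has such a strong effect on the run time.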
Additionally, the model-based approach was evaluated with the minimization
techniques turned off. For all instances the model-based approach then ran
out of time. Consequently, minimizing the assignments is crucial for the
model-based approach.
Overall, the interpolants are computed concurrently by both approaches
and the fastest approach delivers the interpolant.
7.1.1 Future Work
The proof-based approach computes the interpolant along the resolution
proof. The proof may be very large, and thus computing the proof and
afterwards the interpolant may take a very long time. The interpolant is
complete once the entire proof has been traversed.
The model-based approach enumerates assignments, the number of which grows
exponentially with the number of common variables. The interpolant is
complete once all assignments have been combined into a DNF.
Both approaches are time consuming relative to the entire run time of the
verification task. While the proof-based approach delivers the interpolant
only after the entire proof has been traversed, the model-based approach
can deliver parts of the interpolant before completion. Such a part can be
used to check whether a spurious counterexample or a spurious
classification occurs, which can be decided safely even with an incomplete
interpolant. In this case the further enumeration of assignments can be
aborted, which may reduce the run time.
7.2 Simple Model Checker - SimpMC
In Section 6.3.3 a simple model checker SimpMC exploiting interpolants to
over-approximate the reachable states has been introduced. Besides
RobuCheck, this model checker has been implemented in C++ on top of
WoLFram [SKF+09]. In this section SimpMC is evaluated on a randomly
selected subset of the Hardware Model Checking Competition (HWMCC)
benchmarks from 2012 (footnote 2).
Overall, 90 problem instances were used and a time out of 900 CPU
seconds was set. To compare SimpMC against a different model checker,
the powerful model checker IIMC (footnote 3) was used. IIMC implements
several verification algorithms, in particular the recently introduced IC3
[Bra11] approach, in a very sophisticated flow consisting of various
optimization techniques. IC3 won the third place in the HWMCC 2010 with
the relatively outdated SAT solver Zchaff [MMZ+01a]. In contrast, SimpMC
implements only a single verification algorithm based on interpolation.
Figure 7.3: Run time of SimpMC vs. IIMC (scatter plot; x-axis: run time of SimpMC, y-axis: run time of IIMC, both logarithmic from 1 to 1000).
Figure 7.3 shows a scatter plot of the run time for both tools. In 49
instances SimpMC is faster than IIMC, and IIMC runs into a time out for
29 of these 49 instances, i.e., SimpMC outperforms IIMC by a considerable
factor. For the remaining instances SimpMC did not terminate within the
time out, whereas IIMC solved 38 of them. Overall, SimpMC solved 49
instances faster than IIMC although SimpMC does not implement any
sophisticated flow. Consequently, SimpMC constitutes a powerful additional
verification engine within an orchestrated verification tool.
2 Available under http://fmv.jku.at/hwmcc12/
3 Available under http://ecee.colorado.edu/wpmu/iimc/
7.3 Robustness Checking
This section provides the obtained results of performing robustness checking
using RobuCheck.
7.3.1 Benchmarks
A subset of the circuits of the ITC'99 benchmark suite (footnote 4) was
used as benchmarks. More precisely, the circuits b08 to b15 and
fault-tolerant circuits derived from them were used.
Based on the original circuits, techniques to protect the circuits against
transient faults have been implemented:
• A system-level and an FF-based Triple Modular Redundancy (TMR)
implementation are revisited in Section 2.6.2. Both techniques were
applied to the benchmark circuits, where the system-level circuits
have no fault signal and the FF-based circuits have a fault signal.
The system-level implementations are marked by -tmr-sys and the
FF-based implementations are marked by -tmr-ff.
• As a further hardening technique, a parity checker has been implemented
for each considered ITC'99 circuit. A checker circuitry has been
generated that computes the parity over the flip flops and the primary
outputs. The primary outputs are buffered with extra flip flops. A wrong
parity is reported by a fault signal. These circuits are relatively robust
since each transient fault that flips an odd number of flip flops and
primary outputs is detected. The parity circuits are marked by -par. The
circuits have been optimized by SIS [SSL+92].
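The detection property of the parity checker can be checked on a small bit vector. The sketch below is an abstraction, not the generated checker circuitry: the state is a list of bits standing for flip flops and buffered outputs, and a fault is a set of distinct positions that are flipped. The function names are hypothetical.

```python
def parity(bits):
    """XOR over all bits: 1 if the number of ones is odd, else 0."""
    result = 0
    for bit in bits:
        result ^= bit
    return result


def fault_detected(state, flipped_positions):
    """True if flipping the given distinct positions changes the stored parity."""
    stored = parity(state)  # parity computed before the fault occurs
    faulty = list(state)
    for position in flipped_positions:
        faulty[position] ^= 1
    return parity(faulty) != stored  # a mismatch would raise the fault signal


state = [1, 0, 1, 1]
print(fault_detected(state, [0]), fault_detected(state, [0, 2]), fault_detected(state, [0, 1, 3]))
```

An even number of flips leaves the parity unchanged and therefore goes unnoticed, which is why these circuits are only relatively robust.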
The characteristics of all circuits are shown in Table 7.1. The first
column Circuit denotes the name of the circuit. The remaining columns
list the characteristics: |X| is the number of primary inputs, |Y| the
number of primary outputs, |V| the number of components, and |FF| the
number of flip flops. Overall, 32 ITC'99 and derived circuits are used.
4 Available under http://www.cerc.utexas.edu/itc99-benchmarks/bench.html
Table 7.1: Characteristics of the benchmark circuits.
Circuit |X| |Y| |V| |FF|
b08 9 4 240 21
b09 1 1 203 28
b10 11 6 279 17
b11 7 6 894 31
b12 5 6 1363 121
b13 10 10 415 53
b14 32 54 11395 245
b15 36 70 11035 449
b08-tmr-sys 9 4 765 63
b09-tmr-sys 1 1 619 84
b10-tmr-sys 11 6 902 51
b11-tmr-sys 7 6 2743 93
b12-tmr-sys 5 6 4148 363
b13-tmr-sys 10 10 1345 159
b14-tmr-sys 32 54 34703 735
b15-tmr-sys 36 70 33771 1347
b08-tmr-ff 9 4 1458 63
b09-tmr-ff 1 1 1543 84
b10-tmr-ff 11 6 1463 51
b11-tmr-ff 7 6 3766 93
b12-tmr-ff 5 6 8141 363
b13-tmr-ff 10 10 3094 159
b14-tmr-ff 32 54 42788 735
b15-tmr-ff 36 70 48588 1347
b08-par 9 5 618 26
b09-par 1 2 597 30
b10-par 11 7 730 24
b11-par 7 7 1876 38
b12-par 5 7 3990 128
b13-par 10 11 1273 64
b14-par 32 55 20203 300
b15-par 36 71 25656 520
Unless otherwise stated, the benchmarks were carried out on an AMD
Opteron(TM) CPU running at 3.0 GHz with 64 GB main memory.
7.3.2 Formal classiﬁers
The BMC-classiﬁer, ATPG-classiﬁer, and ITP-classiﬁer are based on for-
mal methods analyzing sequential circuits. These classiﬁers are evaluated
on the benchmarks described above. All components have to be classi-
ﬁed, i.e., U = V .
Quality
The BMC-classifier and the ATPG-classifier use approximate reachability
information. An over-approximation of the reachable states leads to a lower
bound of the robustness (R^k̄_lb) and an under-approximation of the
reachable states leads to an upper bound of the robustness (R^k̄_ub). In
order to obtain both bounds, each of the two classifiers is called twice
with different approximations, leading to four runs for each circuit.
As over-approximation simply the entire set of states is considered. This
is realized by leaving the initial values of the state elements
unconstrained. The under-approximation is realized by configuring the
reachability window to 10 time frames, i.e., l̄ = 10. An observation window
of size 10 was considered, i.e., k̄ = 10. The ITP-classifier generally
considers an unlimited observation window and computes the set of states by
interpolants.
The BMC-classifier and the ATPG-classifier use a single CPU core. Thus,
their run time is measured in CPU seconds. The ITP-classifier computes
interpolants concurrently on two CPU cores according to the proposed
concurrent computation of interpolants. Thus, the run time of the
ITP-classifier is measured in wall clock seconds. For each benchmark the
run time was limited to 8 hours.
Under unlimited computational resources the BMC-classifier and the
ATPG-classifier deliver identical results since the computational model and
the considered set of states are equal. Under limited resources, however,
the results differ, as shown in the following.
The ITP-classifier fully automatically computes suitable approximations
based on interpolation. Thus, the reachability window and the observation
window are configured to be unlimited.
Table 7.2: Determined robustness bounds of ITC'99 circuits.
BMC-classifier ATPG-classifier ITP-classifier Virtual Best
Circuit R̂^k̄_lb Ř^k̄_ub R̂^k̄_lb Ř^k̄_ub R^kcmpl_lb R^kcmpl_ub R_lb R_ub
b08 0.00% 48.33% 0.00% 48.33% 0.00% 0.00% 0.00% 0.00%
b09 0.00% 79.31% 0.00% 79.31% 0.00% 0.49% 0.00% 0.49%
b10 0.00% 1.79% 0.00% 1.79% 1.79% 1.79% 1.79% 1.79%
b11 0.11% 8.84% 0.11% 8.84% 6.26% 7.83% 6.26% 7.83%
b12 0.00% 51.58% 0.00% 51.58% 0.00% 39.47% 0.00% 39.47%
b13 0.72% 54.70% 0.72% 100.00% 6.02% 48.19% 6.02% 48.19%
b14 0.00% 100.00% 0.00% 96.42% 0.09% 13.19% 0.09% 13.19%
b15 0.00% 100.00% 0.00% 99.95% 0.43% 28.45% 0.43% 28.45%
b08-tmr-ff 98.83% 99.45% 98.83% 100% 98.83% 98.83% 98.83% 98.83%
b09-tmr-ff 99.81% 99.87% 99.81% 100% 99.81% 99.81% 99.81% 99.81%
b10-tmr-ff 98.43% 98.43% 98.43% 100% 98.43% 98.43% 98.43% 98.43%
b11-tmr-ff 99.50% 100.0% 0.0% 100% 99.50% 99.50% 99.50% 99.50%
b12-tmr-ff 99.79% 100.0% 0.0% 100% 99.84% 99.84% 99.84% 99.84%
b13-tmr-ff 99.03% 99.32% 99.03% 100% 99.29% 99.29% 99.29% 99.29%
b14-tmr-ff 0.0% 100.0% 0.0% 100% 0.00% 99.75% 0.00% 99.75%
b15-tmr-ff 0.0% 100.0% 0.0% 100% 0.00% 99.71% 0.00% 99.71%
b08-par 74.11% 97.57% 74.11% 97.57% 74.11% 74.11% 74.11% 74.11%
b09-par 84.25% 98.16% 84.25% 98.16% 86.93% 87.10% 86.93% 87.10%
b10-par 82.60% 84.79% 82.60% 84.79% 83.84% 83.84% 83.84% 83.84%
b11-par 80.44% 86.14% 80.44% 86.14% 83.32% 83.64% 83.32% 83.64%
b12-par 6.57% 94.61% 83.98% 84.79% 6.57% 99.85% 83.89% 99.85%
b13-par 89.32% 95.05% 89.32% 95.05% 89.32% 94.66% 89.32% 94.66%
b14-par 0.0% 94.74% 1.91% 99.99% 0.00% 100.00% 1.91% 94.74%
b15-par 0.0% 99.73% 2.50% 99.97% 0.00% 100.00% 2.50% 99.73%
Table 7.3: Robustness of hard TMR circuits.
BMC-classifier ITP-classifier
Circuit R_lb R_ub l k R^kcmpl_lb R^kcmpl_ub Run time
b08-tmr-sys 1.2% 99.4% 14 5 99.4% 99.4% 68
b09-tmr-sys 0.3% 99.6% 19 4 99.6% 99.6% 44
b10-tmr-sys 1.5% 97.8% 16 4 97.8% 97.8% 778
b11-tmr-sys 0.6% 99.4% 13 2 99.4% 99.4% 373
b12-tmr-sys 0.3% 99.8% 19 2 99.8% 99.8% 395
b13-tmr-sys 2.3% 99.0% 7 3 99.0% 99.0% 213
In Table 7.2 the determined robustness bounds are listed for the
BMC-classifier, ATPG-classifier, and ITP-classifier. The first column lists
the circuit name. For each classifier a lower bound and an upper bound of
the robustness are provided. The last two columns list the virtually best
robustness bounds, i.e., the best lower bound and the best upper bound
over all classifiers.
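The virtually best bounds as described can be computed as the largest lower bound and the smallest upper bound over the classifiers. The sketch below applies this to the b13 row of Table 7.2; the function name and the dictionary layout are hypothetical.

```python
def virtual_best(bounds):
    """bounds maps classifier name -> (lower, upper) in percent; return the tightest pair."""
    lowers = [lower for lower, _ in bounds.values()]
    uppers = [upper for _, upper in bounds.values()]
    return max(lowers), min(uppers)


b13 = {"BMC": (0.72, 54.70), "ATPG": (0.72, 100.00), "ITP": (6.02, 48.19)}
print(virtual_best(b13))
```

For b13 this selects both bounds of the ITP-classifier, matching the last two columns of that row.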
For a large portion of the circuits tight bounds are determined; the
original ITC'99 circuits have relatively low robustness values and the
fault-tolerant implementations have high robustness values, as expected.
Furthermore, the BMC-classifier and the ATPG-classifier often deliver the
same results, except for the -tmr-ff circuits where the ATPG-classifier did
not classify any component, leading to an upper bound of 100%. In contrast,
the ITP-classifier delivers equal or even tighter bounds in all cases
except for the circuits b14-par and b15-par. Even for the large circuits
b14 and b15 the ITP-classifier delivers good results since it performs a
systematic computation of the reachable states that includes only the facts
relevant for classifying the components. Of the other two classifiers, only
the ATPG-classifier classified a few components of these circuits.
The BMC-classifier and the ATPG-classifier could classify marginally more
components than the ITP-classifier for the circuits b14-par and b15-par.
A more detailed analysis of both circuits shows that the interpolation
engine did not complete the computation of the interpolants. The resolution
proof of the proof-based approach became too large to complete, and the
number of common variables of the formula pair was too high to complete
the generation process of the model-based approach.
Furthermore, a significant result is that the ATPG-classifier did not
work very well on the -tmr-ff circuits.
Outstanding results are reached by the ITP-classifier for the -tmr-sys
circuits listed in Table 7.3. Most of the components of these circuits
are unbounded dangerous and are therefore hard to classify. Once a fault
is injected, the modified state persists for all subsequent time frames but
is masked by the majority voter, i.e., the circuit state is corrupted but
the specification of the circuit is still met. Consequently, classifying
these components requires that the corresponding proof procedure of the
ITP-classifier unrolls the circuit up to the completeness threshold kcmpl.
However, the classifications are completed before reaching this value by
effectively computing suitable approximations. The ITP-classifier provides
exact results while the BMC-classifier provides only marginal results with
very low accuracy.
Table 7.4: Run times of the formal classifiers.
BMC-classifier ATPG-classifier ITP-classifier
Circuit Run time Run time Run time l k
b08 5.4 22.8 68.4 18 17
b09 3.1 16.7 time out 21 47
b10 27.9 20.2 745.4 6 12
b11 175.9 2349.5 time out 12 21
b12 335.1 4427.9 time out 42 12
b13 19.6 14.2 time out 30 20
b14 time out time out time out 4 3
b15 time out time out time out 19 2
b08-tmr-ff 751.9 37060.0 1146.3 11 18
b09-tmr-ff 322.0 31904.6 99.9 10 2
b10-tmr-ff 5190.1 30511.9 163.7 5 5
b11-tmr-ff time out time out 1171.8 2 3
b12-tmr-ff time out time out 21863.5 13 12
b13-tmr-ff 30213.8 49447.9 334.1 4 2
b14-tmr-ff time out time out time out 2 0
b15-tmr-ff time out time out time out 2 0
b08-par 250.3 201.7 1303.3 18 18
b09-par 262.0 182.6 time out 61 83
b10-par 1856.2 385.6 133.6 9 6
b11-par 35968.8 18116.2 time out 15 13
b12-par 34151.2 32186.0 time out 2 1
b13-par 29056.1 1065.0 time out 28 18
b14-par time out time out time out 2 0
b15-par time out time out time out 2 0
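The masking effect in the -tmr-sys circuits, where a corrupted state persists while the outputs stay correct, can be illustrated with a toy model. The model is an assumption made for illustration only: three replicas of a 1-bit toggling register feed one majority voter, and the replicas are never resynchronized.

```python
def majority(a, b, c):
    """Bitwise majority vote of three replica outputs."""
    return (a & b) | (a & c) | (b & c)


def simulate_tmr(frames, fault_frame=None, faulty_replica=0):
    """Toy system-level TMR: three 1-bit toggling registers behind one voter."""
    replicas = [0, 0, 0]
    outputs, corrupted = [], []
    for t in range(frames):
        if t == fault_frame:
            replicas[faulty_replica] ^= 1         # transient fault: flip one state bit
        outputs.append(majority(*replicas))       # voted primary output
        corrupted.append(len(set(replicas)) > 1)  # do the replica states disagree?
        replicas = [r ^ 1 for r in replicas]      # every replica toggles on its own
    return outputs, corrupted


golden, _ = simulate_tmr(6)
faulty, corrupted = simulate_tmr(6, fault_frame=2)
print(golden == faulty, corrupted)
```

The outputs never deviate, yet the state stays corrupted in every frame after the fault, so no bounded observation window suffices to show that the fault disappears. This is the situation that makes such components hard for the bounded classifiers.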
Run time
Table 7.4 shows the run times of the formal classifiers. The run times of
the BMC-classifier and the ATPG-classifier are accumulated over two runs
since the lower bound and the upper bound are determined separately. A
time out is reported if at least one of these runs ran out of time and,
likewise, if the ITP-classifier ran out of time.
The search space of the BMC-classifier and the ATPG-classifier is bounded
by the maximal reachability window and observation window of size 10.
The ITP-classifier determines these values automatically; the reached
values are additionally listed in the columns l and k.
The run times of the BMC-classifier and the ATPG-classifier have to be
considered separately from those of the ITP-classifier since the approaches
are different. The search space of the ITP-classifier is larger than that
of the BMC-classifier and the ATPG-classifier since the reachability window
and the observation window are left open for the ITP-classifier. In the
following the BMC-classifier and the ATPG-classifier are referred to as
static classifiers.
In some cases the run time of the ITP-classifier is higher than that of the
static classifiers, but the obtained bounds are more accurate. Examples of
this case are b10 and b13 (compare Table 7.4 and Table 7.2).
Moreover, the difference between the run times of the BMC-classifier and
the ATPG-classifier is significant. For example, the ATPG-classifier
requires 1065 seconds and the BMC-classifier requires 29056 seconds to
classify the circuit b13-par, i.e., in this case the ATPG-classifier
outperforms the BMC-classifier considerably. In contrast, the
ATPG-classifier is significantly outperformed by the BMC-classifier, for
example, in case of circuit b12. Hence, there are significant cases where
both perform differently, but it is a priori unknown which classifier
performs better.
The reachability windows and observation windows reached by the
ITP-classifier vary strongly. For example, the classification of the circuit
b09-par could not be completed although a reachability window of 61 and
an observation window of 83 were considered. A detailed analysis of this
case shows that exactly a single component has not been classified
completely, which is also reflected by the very tight bounds in Table 7.2.
A further detailed analysis shows that in most cases a considerable portion
of the components is classified while considering small reachability
windows and observation windows.
Overall, the ITP-classifier generally provides a higher accuracy, i.e., the
gap between the bounds is significantly smaller and in some cases exact
results are provided in terms of equal bounds. For example, all components
of the circuit b12-tmr-ff are classified by the ITP-classifier in lower run
time, whereas the static classifiers completed only a few classifications.
A serious drawback is that the ITP-classifier typically has a significantly
higher memory consumption than the BMC-classifier and the ATPG-classifier.
This is caused by the interpolation engine computing interpolants from
the resolution proof as illustrated in Section 7.1. However, the memory
consumption can be drastically reduced by minimizing the resolution proof,
whose size directly corresponds to the size of the interpolants. Available
techniques for this purpose can easily be applied within RobuCheck.
Conspicuously, the run times for the parity circuits are very high for all
classifiers. The parity is computed over many XOR gates. The corresponding
SAT instance in terms of a CNF is defined over clauses, i.e., conjunctions
of disjunctions. The SAT solver has no direct knowledge about those
XOR gates. Usually, XOR-dominated SAT instances are hard to solve, which
is addressed by several works, e.g., [Soo12], to improve the performance on
those instances. There are SAT solvers that particularly consider these
kinds of instances, e.g., CryptoMiniSAT (footnote 5).
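To make concrete why a CNF hides the XOR structure, the following sketch encodes a chain of XOR gates in the standard way: each two-input gate with a fresh output variable becomes four clauses. The encoding used inside RobuCheck is not stated here, so the helper names and the variable numbering are assumptions for illustration.

```python
from itertools import product


def xor_gate_cnf(x, y, z):
    """Clauses enforcing z <-> (x XOR y); variables are positive integers."""
    return [[-x, -y, -z], [x, y, -z], [x, -y, z], [-x, y, z]]


def parity_chain_cnf(inputs, first_aux):
    """Chain XOR gates over `inputs`; return (clauses, variable holding the parity)."""
    clauses, acc, aux = [], inputs[0], first_aux
    for x in inputs[1:]:
        clauses += xor_gate_cnf(acc, x, aux)  # aux = acc XOR x
        acc, aux = aux, aux + 1
    return clauses, acc


def satisfies(clauses, assignment):
    """True if the assignment (dict variable -> bool) satisfies every clause."""
    return all(any(assignment[abs(l)] == (l > 0) for l in c) for c in clauses)


clauses, parity_var = parity_chain_cnf([1, 2, 3], first_aux=4)
consistent = [
    bits for bits in product([False, True], repeat=5)
    if satisfies(clauses, dict(zip(range(1, 6), bits)))
]
print(len(clauses), parity_var, len(consistent))
```

A parity over n inputs thus costs 4(n − 1) clauses with auxiliary variables, and the solver sees only these clauses, not the XOR gates they came from. Without auxiliary variables, a single n-input XOR constraint needs 2^(n−1) clauses.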
7.3.3 SIM-classiﬁer
Results of the SIM-classifier are presented in this section. Recall that the
SIM-classifier performs random simulation to classify the components using
randomly computed input stimuli. The number of simulation traces has
been configured to be unlimited, but the run time of the SIM-classifier
was limited to 8 hours. That means the SIM-classifier terminates once
all components are completely classified or it runs out of time. Recall
that the SIM-classifier explores the search space only partially, i.e., it
may miss corner cases.
Upper Bound
Table 7.5 lists the obtained results of the SIM-classifier. The column |S|
specifies the number of components classified as non-robust. The column
Traces specifies the number of simulated traces. The column Rub denotes
the obtained robustness in terms of an upper bound. Additionally, the
best robustness upper bounds obtained by the formal classifiers are listed
from Table 7.2 (column Best Rub). The last column denotes the difference
between the bound obtained by the SIM-classifier and the best bound obtained
by the formal classifiers.
5 Available under http://www.msoos.org/cryptominisat2/
Table 7.5: Robustness obtained by the SIM-classifier.
Circuit |S| Traces Rub Best Rub Diff.
b08 238 26,408,000 0.83% 0.00% 0.83%
b09 120 146,825,000 40.89% 0.49% 40.40%
b10 271 66,569,000 2.87% 1.79% 1.08%
b11 774 270,678,000 13.42% 7.83% 5.59%
b12 633 230,749,000 53.56% 39.47% 14.09%
b13 289 1,238,000 30.36% 48.19% -17.83%
b14 3680 73,878,000 67.71% 13.19% 54.52%
b15 2466 31,492,000 77.65% 28.45% 49.20%
b08-tmr-ff 13 115,674,000 99.11% 98.83% 0.28%
b09-tmr-ff 3 121,234,000 99.81% 99.81% 0.00%
b10-tmr-ff 21 120,123,000 98.56% 98.43% 0.13%
b11-tmr-ff 19 37,554,000 99.50% 99.50% 0.00%
b12-tmr-ff 12 28,545,000 99.85% 99.84% 0.01%
b13-tmr-ff 22 54,085,000 99.29% 99.29% 0.00%
b14-tmr-ff 2 11,018,000 < 100.00% 99.75% 0.25%
b15-tmr-ff 0 6,513,000 100.00% 99.71% 0.29%
b08-par 135 114,533,000 78.16% 74.11% 4.05%
b09-par 34 198,235,000 94.30% 87.10% 7.20%
b10-par 104 108,784,000 85.75% 83.84% 1.91%
b11-par 221 181,717,000 88.22% 83.64% 4.58%
b12-par 123 82,258,000 96.92% 99.85% -2.93%
b13-par 94 86,921,000 92.62% 94.66% -2.04%
b14-par 245 33,399,000 98.79% 99.74% -0.95%
b15-par 137 21,295,000 99.47% 99.73% -0.26%
The differences between the bounds are significant in certain cases although
the number of simulated traces is huge. For circuit b14 the difference is
considerable: the best formally obtained robustness bound is 13.19% whereas
the SIM-classifier determined 67.71%. In particular, the SIM-classifier
cannot prove lower bounds since this requires proving the absence of faulty
behavior, which is naturally not provided by simulation. However, this is
necessary when showing the correctness of fault-tolerant circuits.
Nevertheless, the SIM-classifier is useful as a pre-processor: in case of the
circuit b13 the SIM-classifier completed significantly more components than
the best formal classifier, i.e., the difference of the bounds is 17.83%.
For circuit b15-tmr-ff, the SIM-classifier did not complete any
classification. For the larger parity circuits, the SIM-classifier
classifies a few more components than the best formal classifier.
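The upper bounds in Table 7.5 are consistent with counting every component that was not classified as non-robust as potentially robust, i.e., Rub = (|V| − |S|) / |V|. This relation is inferred from the numbers in Tables 7.1 and 7.5 rather than quoted from a definition in this section, and the function name is hypothetical.

```python
def robustness_upper_bound(num_components, num_non_robust):
    """Upper bound in percent: share of components not shown to be non-robust."""
    return 100.0 * (num_components - num_non_robust) / num_components


# (|V| from Table 7.1, |S| from Table 7.5)
for circuit, v, s in [("b08", 240, 238), ("b09", 203, 120), ("b12", 1363, 633)]:
    print(circuit, round(robustness_upper_bound(v, s), 2))
```

Every additional non-robust component found by simulation lowers the bound, which is why the bound only tightens as more traces are simulated.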
Figure 7.4: Formal vs. simulation (x-axis: run time [s], 0 to 18000; y-axis: classified components, 0 to 10000; curves: Formal, Simulation)
Overall, the SIM-classiﬁer provides a roughly approximate classiﬁcation.
In the following the progress of the classiﬁcation is illustrated.
Classiﬁcation Progress
Figure 7.4 illustrates the classification progress. The x-axis denotes the
run time and the y-axis the number of completely classified components. The
circuit b14 has been taken as an example.
The SIM-classifier reaches saturation after approximately 5000 CPU
seconds. Only a small portion of the components is classified afterwards.
The formal-classifier systematically explores the search space and
classifies more components. However, an integration of both approaches
seems to be powerful since the area between both curves can be exploited to
skip classifications: once one classifier has completely classified a
component, the classification can be transferred to the other classifier.
One can further see that the formal classifier is even more powerful than
the SIM-classifier on this complex benchmark.
Heuristics
In Section 6.7.3 two heuristics have been introduced. In Heuristic #2
the SIM-classifier collects k-dangerous components concurrently while the
formal classifiers classify the components formally. During classification
over different time frames, components are classified as k-dangerous multiple
times for various values of k. The heuristic is evaluated in Section 7.3.4.
In this heuristic, the SIM-classifier collects k-dangerous components for
different values of k.
Figure 7.5: Collect k-dangerous components for circuit b12
In Figure 7.5 a histogram of the collected k-dangerous components is shown
for the ITC’99 circuit b12. This circuit is composed of 1363 components. The
x-axis denotes the range of k and the y-axis denotes the number of components
classified as k-dangerous for the respective k. The SIM-classifier ran
for only one minute. The observation window has been configured to the
interval [0, 100].
For a small value of k there are many k-dangerous components (850 out of
1363), i.e., a fault is observable at the state elements. With increasing k the
fault is either propagated to the outputs or masked out, i.e., fewer components
are k-dangerous. Note that the k-dangerous components might also be
k-non-robust components; due to the incomplete simulation this is not
uniquely classified. However, the formal classifiers can access this storage to
skip costly k-dangerous classifications, which increases the overall performance.
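The role of this storage can be sketched in a few lines of Python. This is a minimal illustration and not RobuCheck's implementation: the class name DangerousStore, its methods, and the component names are hypothetical. The simulation side records that a component is k-dangerous for a given k, a formal classifier looks the entry up before starting a costly check, and a histogram like the one in Figure 7.5 is simply a count per k.

```python
from collections import defaultdict


class DangerousStore:
    """Hypothetical shared storage of (component, k) pairs found k-dangerous."""

    def __init__(self):
        # Maps k to the set of components observed as k-dangerous.
        self._by_k = defaultdict(set)

    def record(self, component, k):
        """Called by the simulation side for every observed k-dangerous component."""
        self._by_k[k].add(component)

    def is_known_dangerous(self, component, k):
        """Called by a formal classifier to decide whether a check can be skipped."""
        return component in self._by_k.get(k, set())

    def histogram(self):
        """Number of components recorded as k-dangerous, per value of k."""
        return {k: len(components) for k, components in sorted(self._by_k.items())}


store = DangerousStore()
# Duplicate observations of the same (component, k) pair are counted once.
for component, k in [("g1", 0), ("g2", 0), ("g3", 0), ("g1", 1), ("g1", 1)]:
    store.record(component, k)

print(store.histogram())                    # {0: 3, 1: 1}
print(store.is_known_dangerous("g2", 0))    # True
print(store.is_known_dangerous("g2", 1))    # False
```

Keying the storage by k keeps the lookup cheap and makes the per-k counts available without a second pass over the data.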
7.3.4 Concurrent Classiﬁcation
The following evaluation presents the results of running the classifiers
concurrently. Due to the global classification storage of RobuCheck, each
classifier may benefit from the classifications of the other classifiers.
The underlying hardware of the benchmark system has six CPU cores.
The following conﬁguration has been chosen:
• BMC-classifier and ATPG-classifier: Both classifiers consider an
over-approximation and an under-approximation of the reachable states, which
yields non-robust and robust components. The observation window
has been configured to at most k¯ = 10 time frames, and in case of an
under-approximation the reachability window has been configured to
l¯ = 0 time frames. This yields four classifiers overall.
• ITP-classifier: The ITP-classifier internally uses two CPU cores to
compute interpolants as presented in Section 6.7.2.
Overall, all six cores are used within this evaluation. The overall run time
was drastically reduced from 8 hours to only 1 hour.
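How concurrently running classifiers can profit from a global classification storage is sketched below. The sketch makes several assumptions: the class ClassificationStorage, the function run_classifier, and the two toy classifiers are hypothetical, and a Python thread pool merely stands in for the separate engines that occupy the six cores in the evaluation. Each classifier skips components that another classifier has already decided and publishes its own results for the others.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class ClassificationStorage:
    """Hypothetical global storage shared by all classifiers."""

    def __init__(self):
        self._lock = threading.Lock()
        self._results = {}

    def lookup(self, component):
        with self._lock:
            return self._results.get(component)

    def publish(self, component, result):
        """Store a result unless one exists already; return True if it was stored."""
        with self._lock:
            if component in self._results:
                return False
            self._results[component] = result
            return True

    def snapshot(self):
        with self._lock:
            return dict(self._results)


def run_classifier(name, can_decide, components, storage):
    """Classify what this classifier can decide and nobody has decided yet."""
    decided = 0
    for component in components:
        if storage.lookup(component) is not None:
            continue  # already classified elsewhere, skip the costly check
        if can_decide(component) and storage.publish(component, name):
            decided += 1
    return decided


components = list(range(10))
storage = ClassificationStorage()
# Two toy classifiers with complementary strengths (even vs. odd component ids).
jobs = [
    ("classifier-A", lambda c: c % 2 == 0),
    ("classifier-B", lambda c: c % 2 == 1),
]
with ThreadPoolExecutor(max_workers=len(jobs)) as pool:
    futures = [pool.submit(run_classifier, n, f, components, storage) for n, f in jobs]
    counts = [future.result() for future in futures]

print(counts)                    # [5, 5]
print(len(storage.snapshot()))   # 10
```

The lock around every access keeps the storage consistent when two classifiers finish the same component at nearly the same time; only the first result is kept.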
Two different setups are evaluated, denoted by Setup #1 and Setup #2.
In the first setup Heuristic #1 from Section 6.7.3 has been activated and in
the second setup Heuristic #2 from the same section. Additionally, in both
setups the heuristic of the Minimal Propagation Path from Section 5.6 has been
turned on to skip costly classifications by a simple structural analysis.
In Table 7.6 the determined robustness bounds of both setups are listed.
The ﬁrst column denotes the name of the circuits. The lower and upper
bounds are shown in the remaining columns, respectively.
Even with the computational resources limited to only one hour of run time,
the accuracy of this evaluation is very high for both setups. Often the results
are equal to the results obtained by running the classifiers separately for 8
hours.
However, in some cases the results of Setup #1 and Setup #2 differ.
Significant differences are observed for, e.g., circuit b15 and circuit b12-tmr-ff.
The progress of the classification is illustrated in Figure 7.6 over the
complete run time for both circuits. The x-axis denotes the run time and the
y-axis the number of non-classified components.
As shown in Figure 7.6, Setup #2 classifies more components per
time for both circuits. The only difference between the configurations is
Heuristic #1 for Setup #1 and Heuristic #2 for Setup #2. That means that,
in this configuration, Heuristic #2, which collects k-dangerous components
concurrently, is more effective for these circuits. There are also cases where
Setup #1 is better, but these cases are less significant.
Table 7.6: Determined bounds by Setup #1 and Setup #2.
Setup #1 Setup #2
Circuit Rk¯lb Rk¯ub Rk¯lb Rk¯ub
b08 0.00% 0.00% 0.00% 0.00%
b09 0.00% 0.49% 0.00% 0.49%
b10 1.79% 1.79% 1.79% 1.79%
b11 0.11% 14.99% 6.49% 8.05%
b12 0.00% 46.37% 0.00% 46.37%
b13 0.96% 53.49% 1.45% 56.39%
b14 0.00% 63.47% 0.01% 79.93%
b15 0.43% 87.94% 0.43% 64.05%
b08-tmr2 98.83% 98.83% 98.83% 98.83%
b09-tmr2 99.81% 99.81% 99.81% 99.81%
b10-tmr2 98.43% 98.43% 98.43% 98.43%
b11-tmr2 99.50% 99.50% 99.50% 99.50%
b12-tmr2 23.62% 99.84% 41.56% 99.84%
b13-tmr2 99.29% 99.29% 99.03% 99.29%
b14-tmr2 0.00% 99.74% 0.00% 99.74%
b15-tmr2 0.00% 99.71% 0.00% 99.71%
b08-par 74.43% 74.43% 74.11% 74.11%
b09-par 86.93% 87.10% 86.93% 87.10%
b10-par 83.97% 83.97% 83.70% 83.70%
b11-par 80.54% 85.13% 73.40% 86.35%
b12-par 36.07% 94.76% 31.68% 99.82%
b13-par 89.32% 94.66% 89.32% 94.66%
b14-par 0.20% 99.97% 0.60% 99.74%
b15-par 2.50% 99.97% 0.01% 99.73%
7.3.5 Probabilistic Analysis
In Section 4.2 a technique was proposed that computes a differentiation of
non-robust components. The results of the evaluation of this technique are
presented in the following.
Combinational circuits were taken from the LGsynth93 benchmark suite
and sequential circuits from the ITC’99 benchmark suite, respectively. For
every circuit a parity checker was implemented as introduced in Section 2.6.1.
A time out was set to 5000 CPU seconds. Exceeding this limit is denoted
by time out.
(a) b15
(b) b12-tmr-ﬀ
Figure 7.6: Progress of b15 and b12-tmr-ff by Setup #1 and Setup #2
Table 7.7: Results for combinational circuits.
                          Worst case              EPP-based (500 scenarios)                   EPP-based (10,000 scenarios)
Circuit       |X|   |V|     Rlb  |S|      Rλub        λ  Run time       MAA      Rλub        λ  >λΨ(1)  Run time       MAA
par_5xp1        7   391  88.44%   49    95.35%  100.00%      5.96      8.55
par_9sym        9   655  98.69%    9    99.71%   97.66%      3.73       9.6    99.71%     100%       0      3.89      9.64
par_apex7      49   720  77.48%  275    77.48%   <0.01%    465.14     32.41    77.48%   <0.01%     204  time out    170.75
par_cm42a       4    81  85.71%   15    92.02%  100.00%      0.17      0.21
par_cm82a       5    62  73.17%   22    86.97%  100.00%       0.2      0.26
par_cmb        16   136  63.16%   70    90.27%    0.76%      5.55      5.52    95.91%   15.25%       6     76.64     44.89
par_comp       32   385  41.36%  285    41.36%   <0.01%    300.79    864.64         –        –          time out  time out
par_con1        7    65  81.11%   17    95.01%  100.00%      0.26      0.36
par_cordic     23  2866  97.65%   69    97.65%    0.01%    524.98  time out         –        –          time out  time out
par_cu         14   166  76.92%   51    79.63%    3.05%     11.77      2.86    92.88%   61.03%       3    108.04     29.95
par_duke2      22   976  76.23%  255    76.42%    0.01%    384.39    620.65         –        –          time out  time out
par_e64        65  1409  88.45%  193    89.59%   <0.01%    553.68    740.04         –        –          time out  time out
par_f51m        8   318  91.76%   29    95.09%  100.00%      4.62      6.82
par_frg1       28   393  92.74%   35    92.74%   <0.01%     25.15     13.85    92.74%   <0.01%      35     822.3    324.25
par_rd84        8  1157  82.81%  204    96.14%  100.00%     48.73    111.83
par_rot       135  1932  76.17%  583    76.17%   <0.01%   4550.92    170.28    76.17%   <0.01%     583  time out    556.64
par_sao2       10   502  82.16%   96    94.41%   48.83%     19.39     46.65    96.96%     100%       0     21.24     50.74
par_sqrt8ml     8   447  60.80%  187    91.92%  100.00%     19.46     54.26
par_squar5      5   273  80.87%   57    91.04%  100.00%       1.2      1.71
par_t481       16  1752  99.11%   16    99.11%    0.76%     62.64    450.69    99.11%   15.26%      16   1304.48  time out
par_table5     17  1455  70.58%  448    75.07%    0.38%    858.44   1963.05         –        –          time out  time out
Table 7.7 shows the results for the combinational circuits. The ﬁrst
three columns describe properties of the circuit: the name, the number of
primary inputs and the number of components in the circuit. Note that for
combinational circuits SDC cannot occur, i.e. all components are classiﬁed
as robust or as non-robust. Consequently, there is only a single value for
the robustness of such circuits as introduced in Section 4.2. The results of
the worst case analysis, the new measure using 500 scenarios and the new
measure using 10,000 scenarios are given in the following columns. The
parameter λ has been adjusted accordingly. For the worst case analysis,
the robustness value and the number of non-robust components are shown
in columns Rub and |S|, respectively. For the new measure, the robustness
value Rλub, the parameter λ, the overall run time t in CPU seconds without
Minimal Assignment Analysis (MAA) (Section 6.2.2), and the run time
(MAA) with MAA are shown in the respective columns. Additionally,
column > λΨ gives the number of components having more than 10,000
scenarios. Blank cells denote that no computation with 10,000 scenarios
was required as all components had fewer than 500 scenarios. The run times
are longer for the new measure, as usually more than a single scenario has
to be considered before the classification of a non-robust component is
completed. For small circuits the use of MAA increases the run time (e.g.,
for par_5xp1). In these cases the SAT solver efficiently enumerates multiple
solutions. In contrast, when the number of inputs increases, MAA often
yields shorter run times as a single solution of the SAT solver is generalized
to many scenarios (e.g., for par_apex7 and for par_rot). The robustness
value for the new measure is typically larger than the one for the worst-case
analysis. As soon as a single scenario exists a component is classiﬁed as
"completely non-robust" in the worst-case analysis. For the new measure
non-robust components are graded by the number of scenarios. As long
as the number of scenarios is below the predeﬁned limit, a component
contributes to the circuit’s robustness. For example, this case occurs for
the circuits par_cmb and par_cu. In such cases a ﬁne grain diﬀerentiation
between non-robust components is available. The designer decides whether
further protection is required for some of these components. If the number
of scenarios always exceeds the predeﬁned limit, the robustness values are
identical for the worst case analysis and the new measure. For example,
this occurs in case of par_rot and par_t481. All non-robust components
must be considered as hot spots and further hardening techniques have to
be taken to handle transient faults.
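The λ values listed in Table 7.7 are consistent with reading λ as the share of all 2^|X| input scenarios that the scenario limit covers, capped at 100%. This reading is inferred from the table entries and is not quoted from Section 4.2; the helper name scenario_ratio is made up for this sketch.

```python
def scenario_ratio(limit, num_inputs):
    """Share of the 2**num_inputs input scenarios covered by `limit`, capped at 1.0."""
    total = 2 ** num_inputs
    return min(1.0, limit / total)


# par_cu has 14 primary inputs, i.e. 2**14 = 16384 input scenarios.
for limit in (500, 10_000):
    print(f"par_cu, limit {limit}: {scenario_ratio(limit, 14):.2%}")

# par_9sym has 9 primary inputs, so a limit of 10,000 already covers every scenario.
print(f"par_9sym, limit 10000: {scenario_ratio(10_000, 9):.2%}")
```

For par_cu this gives about 3.05% and 61.04%, which agrees with the table entries 3.05% and 61.03% up to rounding in the last digit. For circuits with many inputs the ratio drops below 0.01%, which matches the rows where the robustness value of the new measure equals the worst-case value.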
A more detailed evaluation for the combinational circuit par_cu is shown
by the histogram in Figure 7.7. The x-axis depicts the number of scenarios.
The y-axis gives the number of components that had a certain number of
scenarios. In total there are 2^14 = 16,384 input scenarios. The number of
scenarios considered was limited to 10,000 as in the previous experiment.
The worst-case analysis yields 51 non-robust components. For most of
these components there exist fewer than 10,000 scenarios. A very fine-grained
differentiation between non-robust components is determined.
Figure 7.7: Histogram of circuit par_cu
7.3.6 COMP-classiﬁer
The COMP-classifier implements compositional reasoning. That means
a set of subcircuits is classified locally and the results are composed with
the entire circuit if necessary. In this evaluation combinational circuits are
considered.
The COMP-classiﬁer has been compared against a monolithic approach.
The BMC-classiﬁer has been chosen as monolithic approach considering
only one time frame since combinational circuits are analyzed.
Two diﬀerent kinds of circuits have been evaluated that are described
in the following.
• Two circuits from the ISCAS’85 benchmark suite have been taken.
All Fanout-Free Regions (FFR) of the circuits have been hardened
with TMR. Each FFR has a fault signal. All local fault signals are
propagated to a global fault signal.
• An Arithmetic Logic Unit (ALU) that consists of a multiplier and an
adder circuit. Operands can be selectively shifted or negated. The bit-width
of the operands can be chosen arbitrarily. For this evaluation
ALUs with bit-widths 8 and 16 have been generated. Figure 7.8 shows the
ALU.
Figure 7.8: Arithmetic logic unit
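As a toy illustration of what hardening a region with TMR and a fault signal means, the following sketch enumerates all inputs of a tiny triplicated function, inverts the output of one replica, and checks that the voted output is unchanged while the fault signal is raised. The function region, the majority voter, and the definition of the fault signal (the replicas disagree) are assumptions made for this example and are not taken from the benchmark circuits.

```python
from itertools import product


def region(a, b, c):
    """Stand-in for a small fanout-free region: (a AND b) XOR c."""
    return (a & b) ^ c


def tmr(a, b, c, faulty_replica=None):
    """Evaluate three replicas, optionally inverting one replica's output.

    Returns (voted_output, fault_signal).
    """
    outs = [region(a, b, c) for _ in range(3)]
    if faulty_replica is not None:
        outs[faulty_replica] ^= 1  # transient fault: invert the replica output
    voted = (outs[0] & outs[1]) | (outs[0] & outs[2]) | (outs[1] & outs[2])
    fault_signal = int(len(set(outs)) > 1)  # raised when the replicas disagree
    return voted, fault_signal


def replica_is_robust(faulty_replica):
    """True if, for every input, the fault is masked at the output and reported."""
    for a, b, c in product((0, 1), repeat=3):
        golden, _ = tmr(a, b, c)
        voted, fault_signal = tmr(a, b, c, faulty_replica)
        if voted != golden or fault_signal != 1:
            return False
    return True


print([replica_is_robust(i) for i in range(3)])  # [True, True, True]
```

Exhaustive enumeration only works for such tiny functions; it is used here because it makes the masking argument easy to check by hand.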
Results of the circuits are shown in Table 7.8. The ﬁrst three columns
show the circuit name, size of the circuit |V |, and number of subcircuits
n. The columns R(C) and Run time (C) show the robustness and the run
time of the COMP-classiﬁer. The columns R (M) and Run time(M) show
the robustness and the run time of the BMC-classiﬁer.
The run time of both approaches was limited to 5000 CPU seconds. A
time out is denoted by time out.
ISCAS’85 Circuits
For robustness checking each FFR has been chosen as subcircuit. As shown
in Table 7.8 this leads to 929 subcircuits for the c5315-red circuit, and
1408 subcircuits for the c7552-red circuit. The first circuit is fully classified
by both the COMP-classifier and the BMC-classifier, while the COMP-classifier
outperforms the BMC-classifier by a factor of two. The larger circuit was
only fully classified by the COMP-classifier. The BMC-classifier was able to
classify approximately 1000 components within 5000 seconds such that an
upper bound of the robustness is provided. The COMP-classifier provides
a lower bound of the robustness. Most components are correctly classified
as robust by the COMP-classifier. In some cases, the approximation
of the propagation is too optimistic.
Table 7.8: Monolithic vs. Compositional
Circuit |V | n R (M) Run time (M) R (C) Run time (C)
c5315-red 20295 929 89.68% 2147.83s 72.50% 1071.52s
c7552-red 30795 1408 < 95.86% time out 70.91% 3069.89s
ALU-8 1204 4 74.75% 818.81s 73.69% 18.57s
ALU-16 2884 4 - time out 70.97% 165.99s
Arithmetic Logic Unit
The subcircuits have been chosen according to the dashed rectangles shown in
Figure 7.8. Knowledge about the functionality of the respective components
is exploited to derive good subcircuits. Transient faults are tolerated
by the adder module ADD and the multiplier module MULT. A modulo-3
check has been implemented in both arithmetic circuits. Detected transient
faults are reported by the local fault signals and propagated to the top-level
fault signal.
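Why a modulo-3 check catches a single inverted result bit can be verified exhaustively for small bit-widths. The sketch below is an illustration under assumptions and not the checker of the evaluated ALU: it models the check as comparing the residue of the result with the residue predicted from the operands, and it models a transient fault as one inverted bit of the result. Since no power of two is divisible by 3, such an inversion always changes the residue.

```python
def mod3_check_ok(a, b, result, op):
    """Compare the residue of `result` with the residue predicted from the operands."""
    if op == "add":
        predicted = (a % 3 + b % 3) % 3
    else:  # "mul"
        predicted = ((a % 3) * (b % 3)) % 3
    return result % 3 == predicted


def all_single_bit_flips_detected(width, op):
    """Exhaustively invert each result bit for all operand pairs of the given width."""
    result_bits = width + 1 if op == "add" else 2 * width
    for a in range(2 ** width):
        for b in range(2 ** width):
            good = a + b if op == "add" else a * b
            assert mod3_check_ok(a, b, good, op)  # the fault-free result passes
            for bit in range(result_bits):
                if mod3_check_ok(a, b, good ^ (1 << bit), op):
                    return False  # an undetected inversion was found
    return True


print(all_single_bit_flips_detected(4, "add"))  # True
print(all_single_bit_flips_detected(4, "mul"))  # True
```

The sketch only covers faults that invert exactly one result bit; faults inside the arithmetic logic can change several result bits at once, and a residue check is not guaranteed to detect every such pattern.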
The 8-bit ALU has been fully classified by both approaches, while
the COMP-classifier outperforms the BMC-classifier by a factor of 43.
The accuracy of the COMP-classifier is very high in this case despite the
optimistic approximation. The 16-bit ALU was not fully classified. The
BMC-classifier does not complete any classification. The COMP-classifier
was not able to fully classify the multiplier subcircuit. The listed robustness
is computed after the COMP-classifier ran out of time. Based on the good
choice of the subcircuits the COMP-classifier provides a useful lower bound
of the robustness.
Since the ALU is typically part of a complex processor design, this
illustrates how the COMP-classifier can be applied to determine
the robustness of complex circuits. Functional tests coming from constrained
random simulation are often available and can be used to sensitize paths
within the COMP-classifier.
Table 7.9: Robustness checking by Cadence SMV.
Circuit Rkcmpllb Rkcmplub Run time
b08 0% 0% 2652s
b09 0% 47.52% time out
b10 1.79% 1.79% 5309s
b11 0% 100% time out
b12 0% 100% time out
b13 0% 100% time out
b14 0% 100% time out
b15 0% 100% time out
7.3.7 Robustness Checking by Means of Model Checking
As presented in Section 5.5 robustness checking can be translated into a
model checking problem which can be solved by available state-of-the-art
model checkers. This section presents the results when using a model
checker to perform robustness checking.
For this evaluation Cadence SMV (footnote 6) has been used. Preliminary
experiments showed that the model checker NuSMV (footnote 7) did not return
any proper result in reasonable run times, i.e., no classification was completed.
Cadence SMV supports the input format SMV. In an SMV file the
model and the CTL property are specified. Technically, for each component
RobuCheck generates an SMV file according to the model from Section 5.5.
The original ITC’99 circuits and the derived system-level TMR circuits
are checked using Cadence SMV. The experiments were conducted on
an AMD Opteron™ CPU with six cores running at 2.8 GHz with 32 GB
main memory. The run time for each classification was limited to 600
seconds. The overall run time of the classification was limited to 8 hours.
A time out is denoted by time out. The model checker was run using the
default configuration. The reachability window and observation window
are automatically chosen by the model checker, which internally guarantees
completeness.
For all TMR circuits, the model checker ran out of time without any
classification. Even for the small circuit b08-tmr-sys with 795 components
the classification did not succeed.
The results for the original ITC’99 circuits are listed in Table 7.9. The
first column denotes the name of the circuit. The second and the third column
denote the determined robustness bounds. The last column lists the wall
clock time in seconds.
6 Available under http://w2.cadence.com/webforms/cbl_software/index.aspx
7 Available under http://nusmv.fbk.eu/
Only two circuits are fully classified, leading to equal robustness bounds.
However, the best classifier integrated in RobuCheck outperforms the
model checker by a factor of 492 for b08 and 254 for b10. Despite the
small size of both circuits, b08 with 240 components and b10 with
279 components, the run times are relatively long.
Overall, dedicated verification approaches exploit problem domain
knowledge, which leads to significantly increased performance.
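How partial classification results translate into bounds can be sketched as follows. The formulas are an assumed simplification of the measures from Chapter 4: the lower bound counts only components proven robust, the upper bound excludes only components proven non-robust, and every unclassified component keeps the two apart. The function and parameter names are hypothetical.

```python
def robustness_bounds(total, proven_robust, proven_non_robust):
    """Return (lower, upper) robustness bounds as fractions of `total` components."""
    if proven_robust + proven_non_robust > total:
        raise ValueError("more classified components than components")
    lower = proven_robust / total
    upper = (total - proven_non_robust) / total
    return lower, upper


# Nothing classified: the bounds are trivial, as in the 0% / 100% rows of Table 7.9.
print(robustness_bounds(240, 0, 0))      # (0.0, 1.0)

# Every component classified: lower and upper bound coincide.
print(robustness_bounds(240, 0, 240))    # (0.0, 0.0)

# Partially classified: the unclassified components separate the bounds.
lower, upper = robustness_bounds(200, 50, 30)
print(f"{lower:.2%} .. {upper:.2%}")     # 25.00% .. 85.00%
```

Under this reading a time out before any classification leaves the interval at its widest, while a complete classification collapses it to a single value.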
7.3.8 IBM Benchmarks
In the author’s work [FFA+12] parts of the ITP-classifier have been published
as a joint work with IBM. In this paper the ITP-classifier was
evaluated on IBM benchmarks, which are presented in the following.
Table 7.10 lists the results of the IBM benchmarks. The benchmarks are
grouped: data-path circuits denoted by D1 to D13 and circuits taken from
a multi-processor control unit denoted by D14 to D30. The benchmarks
were conducted on an Intel i5 processor running at 3.1 GHz with 4 GB main
memory. The first four columns specify the characteristics of the circuits.
The column Classified denotes the percentage of components that have been
fully classified. The columns l and k denote the reached values for l and k.
The run times are shown in the last column, provided in CPU seconds. The
analysis of the components is restricted to flip-flops.
A significant portion of the flip-flops was classified for most circuits.
If RobuCheck ran out of time or out of memory, less than 100% were
classified. In those cases the underlying problem instance may get too large
and consequently the search space becomes too complex. The experiments
were conducted with an older version of RobuCheck using only the
proof-based approach to obtain interpolants. A newer version of RobuCheck that
integrates the model-based approach may classify more components since
this approach usually consumes much less memory. Overall, RobuCheck
was able to produce a significant outcome for industrial circuits.
Table 7.10: IBM Benchmarks
Circuit |X| |Y | |FF | Classiﬁed [%] l k Runtime
D1 204 259 1430 9.30% 2 0 2728
D2 228 65 1424 17.49% 5 1 783
D3 727 293 1395 7.74% 3 0 220
D4 700 497 1038 70.52% 7 1 678
D5 364 142 940 100.00% 2 1 60
D6 105 60 699 99.86% 2 57 1699
D7 284 262 513 84.99% 18 3 3797
D8 112 56 456 100.00% 6 2 144
D9 268 99 447 89.26% 8 1 8281
D10 734 194 435 87.36% 2 1 611
D11 155 120 394 100.00% 2 1 11
D12 53 37 322 100.00% 2 1 19
D13 124 67 222 48.20% 5 3 37
D14 119 112 878 81.55% 5 3 1492
D15 140 55 804 88.56% 31 0 710
D16 29 24 555 15.86% 61 1 1201
D17 377 25 506 70.16% 6 6 1504
D18 176 154 464 70.26% 55 1 2044
D19 252 131 451 56.54% 7 7 1714
D20 327 102 428 67.06% 3 44 9050
D21 173 256 412 88.35% 8 4 486
D22 135 206 247 90.28% 2 33 23170
D23 218 96 231 95.24% 2 86 3631
D24 119 57 231 96.54% 2 2 580
D25 227 114 216 95.37% 4 95 7589
D26 70 51 210 17.62% 131 0 3697
D27 103 63 207 91.30% 7 6 598
D28 130 79 195 95.90% 2 80 35888
D29 100 37 126 100.00% 5 5 353
D30 139 94 123 100.00% 4 5 59
Chapter 8
Conclusion and Future Work
The vulnerability to transient faults increases significantly with continuously
shrinking feature sizes. Tolerating transient faults is possible based on
strong hardening techniques such as Triple Modular Redundancy or
Error-Correcting Codes. The implementation of those techniques might be buggy
itself. Thus, fault tolerance in terms of robustness needs to be verified. This
task has been introduced in this thesis under the term robustness checking.
The thesis starts by introducing an adequate fault model to handle
transient faults at logic level. A transient fault is considered as a
non-deterministic value change of a component’s outputs. Not only Boolean
values are considered; a more complex component model also covers
local multiple transient faults. Usually, models that consider multiple
faults are huge, but in this thesis local multiple transient faults become
manageable.
A transient fault may aﬀect the circuit’s function diﬀerently. According
to the diﬀerent behavior a categorization of each component in terms of
k-non-robust, k-dangerous, unbounded dangerous, and robust is introduced.
This step is called classiﬁcation.
Moreover, objective and unique measures are introduced to document the
circuit’s robustness. The first measure considers the worst case that a scenario
and a transient fault occur. This leads to a conservative measurement.
The second measure considers a configurable probability that a pre-defined
number of scenarios and transient faults occur. This measure better reflects
the behavior of a circuit during operation since not every scenario may
occur in practice. However, the computational effort of computing the two
measures is very different, so a trade-off between accuracy and run time can
be found.
A basic algorithm has been introduced that follows the introduced
computational model to classify the components into the respective classes.
The accuracy of the classification strongly depends on the considered set
of states for the fault injection. But exact reachability information is hard
to compute. Therefore, approximation techniques are embedded into the
classification, and their influence on the classification is formally worked out
and considered in the respective measures. Here, new techniques to compute
approximations have been introduced based on Craig interpolation.
Moreover, a new model checker results that significantly outperforms a
state-of-the-art model checker for a subset of relevant benchmarks.
Overall, various approaches adapted from the well-known Bounded
Model Checking (BMC), Automatic Test Pattern Generation (ATPG),
interpolation-based BMC, compositional verification, and random simulation
have been introduced to classify the circuit’s components. All these
approaches are integrated into a highly optimized flow of robustness checking
within the verification tool called RobuCheck. RobuCheck is able
to formally classify the components of any kind of combinational and
sequential circuit and highlights the obtained results in a strong visualization
engine that points the designer directly to vulnerable components.
The demand for a dedicated verification flow has been demonstrated
by translating the problem of robustness checking into a model checking
problem. Those model checking instances have been solved by an
industrial-strength model checker. However, this approach provides poor accuracy.
The proposed algorithms of this thesis outperform this model checking
approach considerably while providing significantly higher accuracy.
The integrated classifiers in RobuCheck can be called concurrently
or consecutively, which allows the verification to be configured specifically
for different circuit classes. Overall, the evaluation of academic and industrial
benchmarks shows the effectiveness of RobuCheck. Even when the
computational resources in terms of run time are drastically limited, high
accuracy is reached by RobuCheck. Moreover, by using sophisticated
approximation techniques the accuracy is often very high, and even in case
of hard circuits the ITP-classifier provides exact results.
Possible future work directions are to lift the models of robustness checking
to a higher level of abstraction in order to handle more complex circuits.
Here the first steps are made with the compositional approach. Moreover,
the problem formulation of the ITP-classifier can be lifted to Satisfiability
Modulo Theories (SMT) to provide more compact problem instances and
finally more compact interpolants. This may increase the performance
and decrease the memory consumption considerably.
List of Symbols
T (s, s′) transition from state s to state s′
I predicate of the initial state
P predicate describing the property
k¯ maximal size of the observation window k¯ ∈ [0, kcmpl]
k size of the observation window k ∈ [0, k¯]
kcmpl completeness threshold to cover all propagation paths
l number of time frames considered after the initial state
lcmpl completeness threshold to cover all reachable states
l¯ maximal size of the reachability window l¯ ∈ [0, lcmpl]
Sk set of k-non-robust components
Dk set of k-dangerous components
T set of robust components
Sˆk set of spurious k-non-robust components
Dˆk set of spurious k-dangerous components
Sˇk subset of k-non-robust components
Dˇk subset of k-dangerous components
σ, Iˆ Craig interpolants
A,B Boolean formulas
Rk¯lb lower bound of the robustness with respect to k¯
Rk¯ub upper bound of the robustness with respect to k¯
S0 states for fault injection
S∗ set of all reachable states
Sˆ over-approximation of reachable states
Sˇ under-approximation of reachable states
List of Figures
1.1 Robustness checking embedded into the design flow
2.1 Types of Interpolation Systems
2.2 Typical gates
2.3 Relation of sets of states
2.4 A schematic view of a circuit with a checker circuitry
2.5 A schematic view of a system-level TMR implementation
3.1 Component Model: Transient faults at different levels of abstraction
3.2 CTFs and multiple STFs
3.3 Modulo-3 counter with impact of a STF
3.4 Transient fault at an arbitrary time frame
3.5 Robust classification of a component (Condition 1)
3.6 Robust classification for a component (Condition 2)
3.7 Non-robust classification of a component
3.8 Dangerous classification of a component
3.9 Unbounded dangerous components
3.10 Transmission system
4.1 Influence of the scenario ratio λ
5.1 Component gnew encapsulate component g and logic to inject CTF
5.2 Model for classifying 0-non-robust components
5.3 Model for classifying 1-non-robust components
5.4 Model for classifying 0-dangerous components
5.5 Classification based on model checking
5.6 Example circuit in DAG-based representation with weighted edges
6.1 Illustration of formula NBMCl(U, k)
6.2 Fanout and transitive fanin cone of an affected component
6.3 Over-approximation of the interpolants leading to spurious counterexamples
6.4 General idea of locally classifying subcircuits
6.5 General idea of the SIM-classifier
6.6 System overview of RobuCheck
6.7 Graphical User Interface of RTLVisonPro™
6.8 Proof-based interpolants computed with PicoSAT
6.9 Model-based interpolant computation
6.10 Integrated flow of the SIM-classifier into formal-methods based classifiers
6.11 Collecting k-dangerous components.
7.1 Run time of model-based vs. proof-based approach
7.2 Size of the interpolants computed by model-based and proof-based approach.
7.3 Run time of SimpMC vs. IIMC.
7.4 Formal vs. simulation
7.5 Collect k-dangerous components for circuit b12
7.6 Progress of b15 and b12-tmr-ff by Setup #1 and Setup #2
7.7 Histogram of circuit par_cu
7.8 Arithmetic logic unit
Bibliography
[AB09] S. Arora and B. Barak. Computational Complexity: A Modern
Approach. Cambridge University Press, New York, NY, USA,
1st edition, 2009.
[AK84] A. Avizienis and J. P. J. Kelly. Fault tolerance by design diver-
sity: Concepts and experiments. IEEE Computer, 17(8):67–80,
1984.
[ALRL04] A. Avizienis, J.-C. Laprie, B. Randell, and C. Landwehr. Basic
concepts and taxonomy of dependable and secure computing.
Dependable and Secure Computing, 1(1):11–33, 2004.
[Ash01] P. J. Ashenden. The Designer’s Guide to VHDL. Morgan
Kaufmann Publishers Inc., San Francisco, CA, USA, 2nd
edition, 2001.
[AV02] J.A. Abraham and V.M. Vedula. Verifying properties using
sequential ATPG. In International Test Conference, pages
194–202, 2002.
[Bau02] R. Baumann. Soft Errors in Commercial Semiconductor Tech-
nology: Overview and Scaling Trends. IEEE 2002 Reliability
Physics Tutorial Notes, Reliability Fundamentals, page 121,
2002.
[Bau05] R.C. Baumann. Radiation-induced soft errors in advanced
semiconductor technologies. Device and Materials Reliability,
IEEE Transactions on, 5(3):305 – 316, sept. 2005.
[BBC+09] S. Baarir, C. Braunstein, R. Clavel, E. Encrenaz, J.-M. Ilie,
R. Leveugle, I. Mounier, L. Pierre, and D. Poitrenaud. Com-
plementary formal approaches for dependability analysis. In
DFT ’09, pages 331 –339, oct. 2009.
[BBL+12] S. Baeg, J. Bae, S. Lee, C.S. Lim, S.H. Jeon, and H. Nam. Soft
error issues with scaling technologies. In Asian Test Symp.,
pages 68–68, 2012.
[BCCZ99] A. Biere, A. Cimatti, E.M. Clarke, and Y. Zhu. Symbolic
model checking without BDDs. In Tools and Algorithms for
the Construction and Analysis of Systems, pages 193–207,
1999.
[BCG+10] R. Bloem, K. Chatterjee, K. Greimel, T. Henzinger, and
B. Jobstmann. Robustness in the presence of liveness. In
Computer Aided Veriﬁcation, volume 6174 of Lecture Notes
in Computer Science, pages 410–424. 2010.
[BCM+90] J.R. Burch, E.M. Clarke, K.L. McMillan, D.L. Dill, and L.J.
Hwang. Symbolic model checking: 10^20 states and beyond.
In Logic in Computer Science, pages 428–439, 1990.
[BCT07] M. Bozzano, A. Cimatti, and F. Tapparo. Symbolic fault tree
analysis for reactive systems. In Automated technology for
veriﬁcation and analysis, pages 162–176, 2007.
[Ber68] E. Berlekamp. Algebraic Coding Theory. McGraw-Hill, 1968.
[BF76] M.A. Breuer and A.D. Friedman. Diagnosis and reliable design
of digital systems. Digital system design series. Computer
Science Press, 1976.
[Bie08] A. Biere. Picosat essentials. JSAT, 4(2-4):75–97, 2008.
[Bie10] A. Biere. Lingeling, Plingeling, PicoSAT and PrecoSAT at
SAT race 2010. Technical report, Institute for Formal Models
and Veriﬁcation, Johannes Kepler University, Linz, Austria,
2010.
[Bie12] A. Biere. Aiger package. Technical report, Johannes Kepler
University, Linz, Austria, 2012.
[BIFH+11] O. Bar-Ilan, O. Fuhrmann, S. Hoory, O. Shacham, and
O. Strichman. Reducing the size of resolution proofs in linear
time. International Journal of Software Tools Technology
Transfer, 13(3):263–272, June 2011.
[BIMM12] J. Baumgartner, A. Ivrii, A. Matsliah, and H. Mony. IC3-
guided abstraction. In Int’l Conf. on Formal Methods in CAD,
pages 182–185, 2012.
[BKA02] J. Baumgartner, A. Kuehlmann, and J. Abraham. Property
checking via structural analysis. In Computer Aided Veriﬁca-
tion, pages 151–165, 2002.
[Bor07] S. Borkar. Thousand core chips: A technology perspective.
In Design Automation Conf., pages 746–749, 2007.
[Bra11] A. R. Bradley. SAT-based model checking without unrolling,
2011.
[BRTF99] Vamsi Boppana, Sreeranga P. Rajan, Koichiro Takayama, and
Masahiro Fujita. Model checking based on sequential ATPG.
In Nicolas Halbwachs and Doron Peled, editors, Computer
Aided Veriﬁcation, volume 1633 of Lecture Notes in Computer
Science, pages 418–430. Springer Berlin Heidelberg, 1999.
[Bry86] R.E. Bryant. Graph-based algorithms for Boolean function
manipulation. Computers, IEEE Transactions on, C-35(8):677
–691, aug. 1986.
[CCK03] P. Chauhan, E.M. Clarke, and D. Kroening. Using SAT based
image computation for reachability analysis. Technical Report
2197, Carnegie Mellon University, 2003.
[CGJ+00] E. Clarke, O. Grumberg, S. Jha, Y. Lu, and H. Veith.
Counterexample-guided abstraction reﬁnement. In Computer
Aided Veriﬁcation, volume 1855 of Lecture Notes in Computer
Science, pages 154–169. 2000.
[CGP01] E. M. Clarke, O. Grumberg, and D. Peled. Model checking.
MIT Press, 2001.
[CIM12] H. Chockler, A. Ivrii, and A. Matsliah. Interpolants without
proofs. In Haifa Veriﬁcation Conference, 2012.
[CKOS04] E. Clarke, D. Kroening, J. Ouaknine, and O. Strichman.
Completeness and complexity of bounded model checking.
In Veriﬁcation, Model Checking, and Abstract Interpretation,
volume 2937 of Lecture Notes in Computer Science, pages
85–96. 2004.
[CLM89] E. Clarke, D. Long, and K. McMillan. Compositional model
checking. In Proceedings of the Fourth Annual Symposium
on Logic in computer science, pages 353–362, Piscataway, NJ,
USA, 1989. IEEE Press.
[CMB06] M.L. Case, A. Mishchenko, and R. K. Brayton. Inductively
ﬁnding a reachable state space over-approximation. In Int’l
Workshop on Logic Synth., pages 172–179, 2006.
[CMCHG96] E. Clarke, K. McMillan, S. Campos, and V. Hartonas-
Garmhausen. Symbolic model checking. In Computer Aided
Veriﬁcation, volume 1102 of Lecture Notes in Computer Sci-
ence, pages 419–422. 1996.
[CMR+02] P. Civera, L. Macchiarulo, M. Rebaudengo, M. Sonza Reorda,
and M. Violante. An FPGA-based approach for speeding-up
fault injection campaigns on safety-critical circuits. Journal
of Electronic Testing, 18:261–271, 2002.
[Con09] Concept Engineering GmbH. RTLvision PRO.
http://www.concept.de, 2009.
[Coo71] S.A. Cook. The complexity of theorem proving procedures. In
3. ACM Symposium on Theory of Computing, pages 151–158,
1971.
[CPL+09] A. Czutro, I. Polian, M. Lewis, P. Engelke, S. M. Reddy, and
B. Becker. Tiguan: Thread-parallel integrated test pattern
generator utilizing satisfiability analysis. In International
Conference on VLSI Design, pages 227–232, 2009.
[Cra57] W. Craig. Linear reasoning. A new form of the Herbrand-Gentzen
theorem. The Journal of Symbolic Logic, 22(3):250–268, 1957.
[CRP+96] H. Cha, E.M. Rudnick, J.H. Patel, R.K. Iyer, and G.S. Choi.
A gate-level simulation environment for alpha-particle-induced
transient faults. IEEE Trans. on CAD, 45(11):1248 –1256,
nov 1996.
[DB98] R. Drechsler and B. Becker. Graphenbasierte Funktionsdarstel-
lung. B.G. Teubner, Stuttgart, 1998.
[DEFT09] R. Drechsler, S. Eggersglüß, G. Fey, and D. Tille. Test
Pattern Generation using Boolean Proof Engines. Springer,
2009.
[Dij59] E. W. Dijkstra. A note on two problems in connexion with
graphs. Numerische Mathematik, 1(1):269–271, 1959.
[DJBT81] J. A. Darringer, W.H. Joyner, C. L. Berman, and L. Trevillyan.
Logic synthesis through local transformations. IBM Journal
of Research and Development, 25(4):272 –280, july 1981.
[DKPW10] V. D’Silva, D. Kroening, M. Purandare, and G. Weissenbacher.
Interpolant strength. In Veriﬁcation, Model Checking, and
Abstract Interpretation, volume 5944 of Lecture Notes in Com-
puter Science, pages 129–145. 2010.
[DLL62] M. Davis, G. Logeman, and D. Loveland. A machine program
for theorem proving. Comm. of the ACM, 5:394–397, 1962.
[DP60] M. Davis and H. Putnam. A computing procedure for quan-
tiﬁcation theory. Journal of the ACM, 7:506–521, 1960.
[DPK08] V. D’Silva, M. Purandare, and D. Kroening. Approximation
reﬁnement for interpolation-based model checking. In Veri-
ﬁcation, Model Checking, and Abstract Interpretation, pages
68–82, 2008.
[ED07] S. Eggersglüß and R. Drechsler. Improving test pattern com-
pactness in SAT-based ATPG. In Proceedings of the 16th
Asian Test Symposium, ATS ’07, pages 445–452. IEEE Com-
puter Society, 2007.
[ED11] S. Eggersglüß and R. Drechsler. Eﬃcient data structures
and methodologies for SAT-based ATPG providing high fault
coverage in industrial application. Computer-Aided Design
of Integrated Circuits and Systems, IEEE Transactions on,
30(9):1411 –1415, sept. 2011.
[EMA10] N. Een, A. Mishchenko, and N. Amla. A single-instance
incremental SAT formulation of proof- and counterexample-
based abstraction. In Int’l Conf. on Formal Methods in CAD,
pages 181–188, 2010.
[Eme95] E. Allen Emerson. Temporal and modal logic. In Handbook of
theoretical computer science, pages 995–1072. Elsevier, 1995.
[EMS07] N. Een, A. Mishchenko, and N. Sörensson. Applying logic
synthesis for speeding up SAT. In SAT, pages 272–286, 2007.
[ES03] N. Een and N. Sörensson. An extensible SAT-solver. In SAT,
volume 2919 of Lecture Notes in Computer Science, pages
502–518. Springer, 2003.
[FD08] G. Fey and R. Drechsler. A basis for formal robustness check-
ing. In Int’l Symp. on Quality Electronic Design, pages 784–
789, 2008.
[FF10] S. Frehse and G. Fey. Kompositionelle Formale Robustheitsprüfung.
In Zuverlässigkeit und Entwurf, pages 73–74, 2010.
[FFA+12] S. Frehse, G. Fey, E. Arbel, K. Yorav, and R. Drechsler.
Complete and eﬀective robustness checking by means of inter-
polation. In Int’l Conf. on Formal Methods in CAD, pages
82–90, 2012.
[FFD10] S. Frehse, G. Fey, and R. Drechsler. A better-than-worst-
case robustness measure. In IEEE Workshop on Design and
Diagnostics of Electronic Circuits and Systems, pages 78–83,
2010.
[FFSD09] S. Frehse, G. Fey, A. Sülflow, and R. Drechsler. Robustness
check for multiple faults using formal techniques. In Digi-
tal System Design, Architectures, Methods and Tools, 2009.
Digital System Design, Conference on, pages 85 –90, aug.
2009.
[FFSD10] S. Frehse, G. Fey, A. Sülﬂow, and R. Drechsler. Robucheck:
A robustness checker for digital circuits. In EUROMICRO
Symp. on Digital System Design, pages 226–231, 2010.
[FHD+11] S. Frehse, F. Haedicke, M. Diepenbeck, G. Fey, and R. Drech-
sler. Hochoptimierter Ablauf zur Robustheitsprüfung. In
Zuverlässigkeit und Entwurf, pages 35 – 42, 2011.
[FRF12] S. Frehse, H. Riener, and G. Fey. Hardware-software-co-
synthese zur verbesserung der fehlertoleranz. In Zuverläs-
sigkeit und Entwurf, pages 90–96, 2012.
[FSD09] G. Fey, A. Sülﬂow, and R. Drechsler. Computing bounds for
fault tolerance using formal techniques. In Design Automation
Conf., pages 190–195, 2009.
[FSF11] A. Finder, A. Sülﬂow, and G. Fey. Latency analysis for
sequential circuits. In European Test Conf., pages 129–134,
2011.
[FSFD11] G. Fey, A. Sülflow, S. Frehse, and R. Drechsler. Effective
robustness analysis using bounded model checking techniques.
Computer-Aided Design of Integrated Circuits and Systems,
IEEE Transactions on, 30(8):1239 –1252, aug. 2011.
[FSSFa10] G. Fey, A. Sülflow, S. Frehse, and R. Drechsler. Automatische
formale Verifikation der Fehlertoleranz von Schaltkreisen.
it-Information Technology, 42(4):216–223, 2010.
[FWD10] S. Frehse, R. Wille, and R. Drechsler. Eﬃcient simulation-
based debugging of reversible logic. In Int’l Symp. on Multi-
Valued Logic, pages 156 –161, 2010.
[GOSM08] M. Gössel, V. Ocheretny, E. Sogomonyan, and D. Marienfeld.
New Methods of Concurrent Checking. Frontiers in Electronic
Testing Series. Springer Science+Business Media B.V., 2008.
[Gro12] ABC Group. ABC: A System for Sequential Synthesis and
Verification. http://www.eecs.berkeley.edu/~alanmi/abc/, 2012.
[GS05] A. Gupta and O. Strichman. Abstraction reﬁnement for
bounded model checking. In Computer Aided Veriﬁcation,
pages 112–124, 2005.
[Ham50] R. W. Hamming. Error detecting and error correcting codes.
Bell System Technical Jour., 29(2):147–160, 1950.
[Hel63] L. Hellerman. A catalog of three-variable or-invert and and-
invert logical circuits. Electronic Computers, IEEE Transac-
tions on, EC-12(3):198 –223, june 1963.
[HFF+11] F. Haedicke, S. Frehse, G. Fey, D. Grosse, and R. Drechsler.
metasmt: Focus on your application not on solver integration.
In International Workshop on Design and Implementation of
Formal Tools and Systems, pages 22 – 29, 2011.
[HH08] M. Hunger and S. Hellebrand. Veriﬁcation and analysis of self-
checking properties through ATPG. 11th IEEE International
On-Line Testing Symposium, pages 25–30, 2008.
[HHC+09] M. Hunger, S. Hellebrand, A. Czutro, I. Polian, and B. Becker.
ATPG-based grading of strong fault-secureness. In On-Line
Testing Symposium, 2009. IOLTS 2009. 15th IEEE Interna-
tional, pages 269 –274, june 2009.
[HLGD12] F. Haedicke, H.M. Le, D. Grosse, and R. Drechsler. CRAVE:
An advanced constrained random veriﬁcation environment for
SystemC. In System on Chip (SoC), pages 1 –7, oct. 2012.
[HPB07] J.P. Hayes, I. Polian, and B. Becker. An analysis framework
for transient-error tolerance. In VLSI Test Symposium, 2007.
25th IEEE, pages 249 –255, may 2007.
[Hua95] G. Huang. Constructing Craig interpolation formulas. In
Annual International Conference on Computing and Combi-
natorics, pages 181–190, 1995.
[IS75] O.H. Ibarra and S.K. Sahni. Polynomially complete fault
detection problems. Computers, IEEE Transactions on, C-
24(3):242 – 249, march 1975.
[JBH12] M. Järvisalo, A. Biere, and M. Heule. Simulating circuit-level
simplifications on CNF. Journal of Automated Reasoning, pages
1–37, 2012.
[JFWD10] J.C. Jung, S. Frehse, R. Wille, and R. Drechsler. Enhancing
debugging of multiple missing control errors in reversible logic.
In Great lakes symposium on VLSI, pages 465–470, 2010.
[JG03] N. Jha and S. Gupta. Testing of Digital Systems. Cambridge
University Press, 2003.
[Kai11] R. Kaivola. Intel Core™ i7 processor execution engine validation
in a functional language based formal framework. In
PADL, page 1, 2011.
[KGN+09] R. Kaivola, R. Ghughal, N. Narasimhan, A. Telfer, J. Whit-
temore, S. Pandav, A. Slobodová, C. Taylor, V. Frolov,
E. Reeber, and A. Naik. Replacing testing with formal verification
in Intel Core™ i7 processor execution engine validation.
In CAV, pages 414–429, 2009.
[KK07] I. Koren and C.M. Krishna. Fault-Tolerant Systems. Morgan
Kaufmann, 2007.
[KL70] B. W. Kernighan and S. Lin. An eﬃcient heuristic procedure
for partitioning graphs. The Bell system technical journal,
49(1):291–307, 1970.
[KMH05] S. Krishnaswamy, I.L. Markov, and J.P. Hayes. Logic cir-
cuit testing for transient faults. In Test Symposium, 2005.
European, pages 102 – 107, may 2005.
[KPJ+06] U. Krautz, M. Pﬂanz, C. Jacobi, H.W. Tast, K. Weber, and
H.T. Vierhaus. Evaluating coverage of error detection logic
for soft errors using formal methods. In Design, Automation
and Test in Europe, 2006. DATE ’06. Proceedings, volume 1,
pages 1 –6, march 2006.
[KPMH07] S. Krishnaswamy, S.M. Plaza, I.L. Markov, and J.P. Hayes.
Enhancing design robustness with reliability-aware resynthe-
sis and logic simulation. In Computer-Aided Design, 2007.
ICCAD 2007. IEEE/ACM International Conference on, pages
149 –154, nov. 2007.
[Kra97] J. Krajicek. Interpolation theorems, lower bounds for proof
systems, and independence results for bounded arithmetic.
The Journal of Symbolic Logic, 62(2):457–486, 1997.
[KS92] H. Kautz and B. Selman. Planning as satisfiability. In
ECAI-92, pages 359–363. Wiley, 1992.
[KSMS11] H. Katebi, K. A. Sakallah, and J.P. Marques-Silva. Empirical
study of the anatomy of modern SAT solvers. In SAT,
pages 343–356, 2011.
[Lar92] T. Larrabee. Test pattern generation using boolean satisﬁa-
bility. IEEE Trans. on CAD, 11:4–15, 1992.
[LT93] N.G. Leveson and C.S. Turner. An investigation of the Therac-25
accidents. Computer, 26(7):18–41, July 1993.
[McM03] K.L. McMillan. Interpolation and SAT-based model checking.
In Computer Aided Veriﬁcation, pages 1–13, 2003.
[MMZ+01a] M. W. Moskewicz, C. F. Madigan, Y. Zhao, L. Zhang, and
S. Malik. Chaff: Engineering an efficient SAT solver. In Proceedings
of the 38th Annual Design Automation Conference, pages 530–535,
New York, NY, USA, 2001. ACM.
[MMZ+01b] M.W. Moskewicz, C.F. Madigan, Y. Zhao, L. Zhang, and
S. Malik. Chaﬀ: Engineering an eﬃcient sat solver. In Design
Automation Conf., pages 530–535, 2001.
[MR08] J. A. Maestro and P. Reviriego. Study of the effects of MBUs on
the reliability of a 150 nm SRAM device. In Design Automation
Conf., pages 930–935, 2008.
[MSS99] J.P. Marques-Silva and K. A. Sakallah. Grasp: A search
algorithm for propositional satisﬁability. IEEE Trans. on
CAD, 48(5):506–521, 1999.
[MZM06] N. Miskov-Zivanov and D. Marculescu. Circuit reliability
analysis using symbolic techniques. Computer-Aided Design
of Integrated Circuits and Systems, IEEE Transactions on,
25(12):2638 –2649, dec. 2006.
[MZM10] N. Miskov-Zivanov and D. Marculescu. Multiple transient
faults in combinational and sequential circuits: A systematic
approach. IEEE Trans. on CAD of Integrated Circuits and
Systems, 29(10):1614–1627, 2010.
[Nic11] M. Nicolaidis. Soft Errors in Modern Electronic Systems.
Frontiers in Electronic Testing. Springer, 2011.
[NVI12] NVIDIA. NVIDIA’s next generation CUDA™ compute architecture:
Fermi. Technical report, NVIDIA GmbH, 2012.
[Par10] V. Paruthi. Large-scale application of formal veriﬁcation:
from ﬁction to fact. In Int’l Conf. on Formal Methods in
CAD, pages 175–180, 2010.
[PCZ+08] A. Pellegrini, K. Constantinides, Dan Zhang, S. Sudhakar,
V. Bertacco, and T. Austin. Crashtest: A fast high-ﬁdelity
FPGA-based resiliency analysis framework. In Computer
Design, ICCD 2008. IEEE International Conference on, pages
363 –370, 2008.
[PD11] K. Pipatsrisawat and A. Darwiche. On the power of clause-
learning SAT solvers as resolution engines. Artificial Intelligence,
175(2):512–525, February 2011.
[PG86] D. A. Plaisted and S. Greenbaum. A structure-preserving
clause form translation. J. Symb. Comput., 2(3):293–304,
September 1986.
[Pnu77] A. Pnueli. The temporal logic of programs. Foundations
of Computer Science, IEEE Annual Symposium on, pages 46–57,
1977.
[Pud97] P. Pudlák. Lower bounds for resolution and cutting plane
proofs and monotone computations. The Journal of Symbolic
Logic, 62(3):981–998, 1997.
[RF12] H. Riener and G. Fey. Model-based diagnosis versus error
explanation. In Formal Methods and Models for Codesign
(MEMOCODE), 2012 10th IEEE/ACM International Con-
ference on, pages 43 –52, july 2012.
[RFF12] H. Riener, S. Frehse, and G. Fey. Improving fault tolerance uti-
lizing hardware-software-co-synthesis. In Design, Automation
and Test in Europe, pages 939–943, 2012.
[Rin12] J. Rintanen. Planning as satisﬁability: Heuristics. Artiﬁcial
Intelligence, 193:45–86, 2012.
[Rob65] J. A. Robinson. A machine-oriented logic based on the reso-
lution principle. J. ACM, 12(1):23–41, January 1965.
[RS04] K. Ravi and F. Somenzi. Minimal assignments for bounded
model checking. In Kurt Jensen and Andreas Podelski, editors,
Tools and Algorithms for the Construction and Analysis of
Systems, volume 2988 of Lecture Notes in Computer Science,
pages 31–45. Springer Berlin Heidelberg, 2004.
[SC85] A. P. Sistla and E. M. Clarke. The complexity of propositional
linear temporal logics. J. ACM, 32(3):733–749, July 1985.
[SFFD09] A. Sülﬂow, S. Frehse, G. Fey, and R. Drechsler. Anwendungs-
bezogene Analyse der Robustheit von Digitalen Schaltungen.
In Zuverlässigkeit und Entwurf, pages 45–52, 2009.
[SFWD12] M. Soeken, S. Frehse, R. Wille, and R. Drechsler. Revkit: An
open source toolkit for the design of reversible circuits. In
Workshop on Reversible Computation, volume 7165 of Lecture
Notes in Computer Science,
pages 64–76, 2012.
[Sht01] O. Shtrichman. Pruning techniques for the SAT-based
bounded model checking problem. In CHARME, volume
2144 of LNCS, pages 58–70, 2001.
[SKF+09] A. Sülﬂow, U. Kühne, G. Fey, D. Grosse, and R. Drechsler.
WoLFram - a word level framework for formal verification. In
Proceedings of the 2009 IEEE/IFIP International Symposium
on Rapid System Prototyping, RSP ’09, pages 11–17, Wash-
ington, DC, USA, 2009. IEEE Computer Society.
[SKK+02] P. Shivakumar, M. Kistler, S.W. Keckler, D. Burger, and
L. Alvisi. Modeling the eﬀect of technology trends on the
soft error rate of combinational logic. In Dependable Systems
and Networks, 2002. DSN 2002. Proceedings. International
Conference on, pages 389 – 398, 2002.
[SKS+11] B. Sinharoy, R. Kalla, W. J. Starke, H. Q. Le, R. Cargnoni,
J. A. Van Norstrand, B. J. Ronchetti, J. Stuecheli, J. Leenstra,
G. L. Guthrie, D. Q. Nguyen, B. Blaner, C. F. Marino, E. Retter,
and P. Williams. IBM POWER7 multicore server processor.
IBM Journal of Research and Development, 55(3):1:1 –1:29,
may-june 2011.
[SLM07] S. A. Seshia, W. Li, and S. Mitra. Veriﬁcation-guided soft
error resilience. In Proceedings of the conference on Design,
automation and test in Europe, DATE ’07, pages 1442–1447,
San Jose, CA, USA, 2007. EDA Consortium.
[Soo12] M. Soos. Enhanced gaussian elimination in DPLL-based SAT
solvers. In Daniel Le Berre, editor, POS-10, volume 8 of EPiC
Series, pages 2–14. EasyChair, 2012.
[SSL+92] E.M. Sentovich, K.J. Singh, L. Lavagno, C. Moon, R. Murgai,
A. Saldanha, H. Savoj, P.R. Stephan, Robert K. Brayton,
and Alberto L. Sangiovanni-Vincentelli. Sis: A system for
sequential circuit synthesis. Technical Report UCB/ERL
M92/41, EECS Department, University of California, Berkeley,
1992.
[SVAV05] A. Smith, A. Veneris, M.F. Ali, and A. Viglas. Fault diagnosis
and logic debugging using Boolean satisﬁability. Computer-
Aided Design of Integrated Circuits and Systems, IEEE Trans-
actions on, 24(10):1606 – 1621, oct. 2005.
[SVD08] S. Safarpour, A. G. Veneris, and R. Drechsler. Improved
SAT-based reachability analysis with observability don’t cares.
JSAT, 5(1-4):1–25, 2008.
[TM96] D. E. Thomas and P. R. Moorby. The VERILOG Hardware
Description Language. Kluwer Academic Publishers, Norwell,
MA, USA, 3rd edition, 1996.
[Tse68] G. S. Tseitin. On the complexity of derivations in the propo-
sitional calculus. Studies in Mathematics and Mathematical
Logic, Part II:115–125, 1968.
[VG09] Y. Vizel and O. Grumberg. Interpolation-sequence based
model checking. In Int’l Conf. on Formal Methods in CAD,
pages 1–8, 2009.
[VLP+05] D. W. Victor, J. M. Ludden, R. D. Peterson, B. S. Nelson,
W. K. Sharp, J. K. Hsu, B.-L. Chu, M. L. Behm, R. M. Gott,
A. D. Romonosky, and S. R. Farago. Functional verification of
the POWER5 microprocessor and POWER5 multiprocessor
systems. IBM J. Res. Dev., 49(4/5):541–553, July 2005.
[vN56] J. von Neumann. Probabilistic logics and the synthesis of
reliable organisms from unreliable components. Automata
Studies, 34:43–99, 1956.
[WA73] M. J. Y. Williams and J. B. Angell. Enhancing testability of
large-scale integrated circuits via test points and additional
logic. IEEE Trans. on Comp., C-22(1):46–60, 1973.
[Weg87] I. Wegener. The complexity of Boolean functions. Wiley-
Teubner, 1987.
[Wei12] G. Weissenbacher. Interpolant strength revisited. In SAT,
Lecture Notes in Computer Science. 2012.
[WGF+09] R. Wille, D. Große, S. Frehse, G. W. Dueck, and R. Drechsler.
Debugging of toﬀoli networks. In Design, Automation and
Test in Europe, pages 1284–1289, 2009.
[WGF+11] R. Wille, D. Große, S. Frehse, G.W. Dueck, and R. Drechsler.
Debugging reversible circuits. Integration, the VLSI Journal,
44(1):51 – 61, 2011.
[Xil13] Xilinx. Xilinx TMRTool. http://www.xilinx.com, 2013.
[XWMB12] J. Xu, M. Williams, Hari Mony, and Jason Baumgartner.
Enhanced reachability analysis via automated dynamic netlist-
based hint generation. In Int’l Conf. on Formal Methods in
CAD, pages 157–164, 2012.
[Yeh96] Y.C. Yeh. Triple-triple redundant 777 primary ﬂight computer.
In Aerospace Applications Conference, 1996. Proceedings.,
1996 IEEE, volume 1, pages 293–307, Feb. 1996.
[ZBD07] C. Zhao, X. Bai, and S. Dey. Evaluating transient error eﬀects
in digital nanometer circuits. Reliability, IEEE Transactions
on, 56(3):381 –391, sept. 2007.
[ZFWD11] H. Zhang, S. Frehse, R. Wille, and R. Drechsler. Determining
minimal testsets for reversible circuits using Boolean satisﬁa-
bility. In AFRICON, 2011, pages 1 –6, sept. 2011.
[ZKKSV06] Q. Zhu, N. Kitchen, A. Kuehlmann, and A. Sangiovanni-
Vincentelli. SAT sweeping with local observability don’t-cares.
In Design Automation Conference, 2006 43rd ACM/IEEE,
pages 229–234, 2006.
[ZM03] L. Zhang and S. Malik. Validating SAT solvers using an inde-
pendent resolution-based checker: Practical implementations
and other applications. In Design, Automation and Test in
Europe, page 10880, 2003.
