Goldmine: An integration of data mining and static analysis for automatic generation of hardware assertions by Sheridan, David
© 2011 David Sheridan
GOLDMINE: AN INTEGRATION OF DATA MINING AND STATIC
ANALYSIS FOR AUTOMATIC GENERATION OF HARDWARE
ASSERTIONS
BY
DAVID SHERIDAN
THESIS
Submitted in partial fulfillment of the requirements
for the degree of Master of Science in Electrical and Computer Engineering
in the Graduate College of the
University of Illinois at Urbana-Champaign, 2011
Urbana, Illinois
Adviser:
Assistant Professor Shobha Vasudevan
ABSTRACT
We present GOLDMINE, a methodology for generating assertions automatically.
Our method involves a combination of data mining and static analysis of the Reg-
ister Transfer Level (RTL) design. The RTL design is first simulated to generate
data about the design’s dynamic behavior. The generated data is then mined for
“candidate assertions” that are likely to be invariants. We present both a decision
tree supervised learning algorithm as well as a coverage guided mining algorithm
for generating high-quality assertions. These candidate assertions are then passed
through a formal verification engine to filter out the spurious candidates. The as-
sertions that are attested as true by the formal engine are system invariants. These
are then evaluated by a process of designer ranking that is provided as feedback
to the data mining engine. We present results of using GoldMine for assertion
generation of the RTL of Sun’s OpenSparc T2 many-threaded processor. Our re-
sults show that GoldMine can generate complex, high-coverage assertions in RTL,
thereby minimizing human effort in this process.
Dedicated to my family and my fiancée, Kelsey
ACKNOWLEDGMENTS
First, I would like to thank Lingyi Liu and Viraj Athavale for contributing to
GoldMine and providing ideas when needed. I would also like to thank Sam
Hertz for crafting an extremely efficient and powerful GoldMine in C++, vastly
improving the original Java implementation. Thanks also to Sanjay Patel and
Daniel Johnson, who allowed us to test GoldMine on the developing Rigel 1000-
core CPU design. Thanks to Bill Touhy who provided excellent insight due to his
experience in the verification field, which kept us focused on practical applications
for GoldMine.
Big thanks to David Tcheng, who helped us craft an appropriate data mining
algorithm for GoldMine. It was David’s help which really got the project off the
ground. Thanks also to Hyungsul Kim, who took the initiative to provide us with
ideas on how to improve GoldMine’s data mining algorithm. Hyungsul helped us
design the coverage guided mining algorithm, which allowed us to generate much
higher quality assertions. Both David and Hyungsul provided their data mining
expertise so that we could utilize the power of data mining to the fullest extent.
Lastly, I would like to thank Assistant Professor Shobha Vasudevan, my adviser.
I would not have attended graduate school had Shobha not opened my eyes to the
world of research. Her insight is invaluable and her ideas are always creative and
interesting. She is an inspiring individual, and I expect to one day see her mold
GoldMine into a powerful tool that changes the industry.
TABLE OF CONTENTS
LIST OF ABBREVIATIONS
CHAPTER 1 INTRODUCTION
CHAPTER 2 RELATED WORK
2.1 Software Invariant Generation
2.2 Assertion Generation for Hardware
CHAPTER 3 BACKGROUND
3.1 The Hardware Design Cycle
3.2 Assertions
3.3 Data Mining
CHAPTER 4 GOLDMINE: ASSERTION GENERATION METHODOLOGY
4.1 Data Generator
4.2 Lightweight Static Analyzer
4.3 A-Miner
4.4 Decision Tree Based Supervised Learning Algorithms
4.5 Formal Verifier
4.6 A-Val: Evaluation and Ranking
4.7 Temporal Assertions
4.8 An Example Run through GoldMine
4.9 Applications of GoldMine
CHAPTER 5 A CASE STUDY OF A MULTI-CORE RTL
5.1 Subjective Ranking of Assertions by a Designer
5.2 Complex Assertions in GoldMine
5.3 Outputs Covered by GoldMine
5.4 The Acid Test: Regression Test Experiments
CHAPTER 6 SCALING GOLDMINE TO INDUSTRIAL DESIGNS: THE OPENSPARC T2
6.1 Evaluation of True Assertion Success Rate
6.2 Evaluation of Assertion Input Space Coverage
6.3 Evaluation of the Percentage of Complex Assertions
6.4 Comparing the Generated Assertions with the OpenSparc Specification
6.5 Evaluation of the Runtime and Memory Usage of GoldMine
CHAPTER 7 THE EVOLUTION OF GOLDMINE
7.1 Shaping GoldMine: Early Changes to the GoldMine Methodology
7.2 Performance Enhancements
7.3 Improving the Core GoldMine Algorithms
CHAPTER 8 MOTIVATION FOR A COVERAGE GUIDED APPROACH
8.1 Why Decision Tree Assertions Need Improvement
8.2 Coverage Guided Mining
CHAPTER 9 THE COVERAGE GUIDED MINING ALGORITHM
9.1 Background Concepts
9.2 Algorithm Explanation
9.3 Integration of Formal Verification
CHAPTER 10 A COMPARISON BETWEEN THE COVERAGE GUIDED AND DECISION TREE APPROACHES IN GOLDMINE
10.1 Input Space Coverage as a Function of Iterations
10.2 Runtime and Memory Requirements of Our Algorithm
10.3 Comparison of Input Space Coverage
10.4 Comparison of Succinctness of Assertions
10.5 Comparison of Conciseness of Generated Assertions
10.6 Comparison of Information per Unit: Average Input Space Coverage per Assertion
10.7 Comparison of Number of Assertions Triggered in Directed Tests
10.8 The Final Test: Subjective Designer Rankings
CHAPTER 11 RESOURCES
11.1 Obtaining GoldMine
11.2 Using the Decision Tree Based GoldMine Implementation
11.3 Using the Coverage Guided Mining GoldMine Implementation
CHAPTER 12 CONCLUSIONS
REFERENCES
LIST OF ABBREVIATIONS
RTL Register Transfer Level
FV Formal Verification
ALU Arithmetic Logic Unit
CPU Central Processing Unit
IP Intellectual Property
HDL Hardware Description Language
HVL Hardware Verification Language
ABV Assertion Based Verification
TLB Translation Lookaside Buffer
IQ Input Queue
GB Gigabyte
RAM Random Access Memory
LTL Linear Temporal Logic
ITC International Test Conference
MMU Memory Management Unit
SoC System on a Chip
GHz Gigahertz
FPGA Field-Programmable Gate Array
VHSIC Very High-Speed Integrated Circuit
VHDL VHSIC Hardware Description Language
CHAPTER 1
INTRODUCTION
In the hardware design industry, having a design error can be disastrous. In the
Intel Pentium P5 chip, a floating point division bug caused Intel to lose up to $475
million in 1995. More recently, in 2007, AMD encountered a virtualization bug in
its Phenom line of CPUs that required a revision of the silicon, a costly procedure.
Unlike software bugs, hardware bugs cannot always be fixed with a simple patch.
These bugs cost hardware manufacturers millions of dollars and precious time in
a quickly moving industry.
Assertions or invariants provide a mechanism to express desirable properties
that should be true in the system. Assertions are used for validating hardware
designs at different stages through their life-cycle, such as pre-Silicon formal ver-
ification, dynamic validation, runtime monitoring, and emulation [2, 3, 4]. Asser-
tions are also synthesized into hardware for post-Silicon debug and validation and
in-field diagnosis [2, 5].
Among all the solutions for ensuring robustness of hardware systems, assertion
based verification has emerged as the most popular candidate solution [6] for
“pre-Silicon” design functionality checking. Assertions are used for static (formal)
verification as well as dynamic verification of the Register Transfer Level (RTL)
design in the pre-Silicon phase.
The key question then is: How are these assertions generated? Assertion gen-
eration is an entirely manual effort in the hardware system design cycle. Placing
too many assertions can result in an unreasonable performance overhead. Placing
too few assertions, on the other hand, results in insufficient coverage of behavior.
The trade-off point for crafting minimal, but effective (high coverage) assertions
takes multiple iterations and man-months to achieve [3, 7, 8]. Another challenge
with assertion generation is due to the modular nature of system development.
A module developer would write local assertions that pertain to his/her module.
Maintaining consistency of inter-modular global assertions as the system evolves
in this fragmented framework is very tedious. In sequential hardware, temporal
properties that cut across time cycles are usually the source of subtle, but serious
bugs. It is difficult for the human mind to express and reason with temporal
relations, making temporal assertion generation very challenging.
This chapter includes work previously published in [1].
We integrate two solution spaces, statistical, dynamic techniques (data mining)
and deterministic, static techniques (lightweight static analysis and formal verifi-
cation), to provide a solution to the assertion generation problem. Static analysis
can make excellent generalizations and abstractions, but its algorithms are limited
by computational capacity. Data mining, on the other hand, is computationally
efficient with dynamic behavioral data, but lacks perspective and domain context.
We present GoldMine, a tool for automatically generating RTL assertions. An
RTL design is simulated using random vectors to produce dynamic behavioral
data for the system. This data is mined by advanced data mining algorithms to
produce rules that are candidate assertions, since they are inferred from the sim-
ulation data, but not for all possible inputs. These candidate assertions are then
passed through a formal verification engine along with the RTL design to filter
out spurious assertions and retain the system invariants. Static behavioral analysis
techniques are employed to guide the data mining process. A designer evaluation
and ranking process is facilitated in GoldMine to provide useful feedback to the
iterative data mining process.
GoldMine proposes a radical, but powerful validation paradigm. It uses two
high impact technologies, data mining and static analysis, symbiotically to as-
similate the design space. It then reports its findings in a human digestible form
(assertions) early on and with minimal manual effort. This technique is intended
to replace the traditional method of the engineer deducing all possible correct be-
haviors, capturing them in assertions, testing assertions, creating directed tests to
observe behavior and finally applying random stimulus.
Random stimulus is applied late in the validation phase, when the design and
assertion-based verification environment are mature enough to withstand and in-
terpret random behavior. GoldMine explores the random stimulus space and dis-
tills it into assertions that a human can review. GoldMine’s data mining, then,
gains knowledge about design spaces that are as yet unexplored by a human-
directed validation phase. Eventually, the manual, iterative process of validation
will arrive at a point of high coverage. Using GoldMine, however, this step can
be done very early in the design, making a quantum leap in the validation cycle.
If an unintended invariant behavior is observed, a bug is detected. Otherwise, an
assertion that can be used for all future versions of the design has been generated.
GoldMine is best utilized in the regression test suite of an RTL design.
GoldMine is completely automatic. It is able to generate many assertions per
output for a large percentage of module outputs in very reasonable runtimes (see
case study). It has the ability to minimize human effort, time and resources in
the long-drawn assertion generation process and increase validation productivity.
Along with input/output or propositional assertions, GoldMine can also generate
temporal assertions in Linear Temporal Logic [9] (see footnote 1). GoldMine can
generate assertions that are complex or span multiple logic levels in the RTL.
The contributions in this work are as follows.
• We present a tool which can be used to generate assertions automatically.
• Our tool can produce complex assertions for both combinational and se-
quential designs, a feat which has not been accomplished by any other
known tools.
• We enable the discovery of design knowledge otherwise unattainable early
in the design logic.
• With GoldMine, we propose a validation paradigm that can significantly
reduce the time and effort of assertion creation.
• We explore both a decision tree based supervised learning algorithm and
an alternate coverage guided algorithm which can produce high coverage
assertions.
We demonstrate that GoldMine produces excellent results on real RTL designs in
the form of complex, high coverage assertions.
Footnote 1: At this time, we can generate assertions with the X operator.
CHAPTER 2
RELATED WORK
GoldMine is not the first tool to attempt assertion generation. Many techniques
have been used since the 1970s for both hardware and software. In this section,
we review work related to GoldMine.
2.1 Software Invariant Generation
The roots of automatic assertion generation can be traced back to the automatic
generation of invariants for proving loop behavior. Both Wegbreit [10] and Katz
[11] attempted to do this using two methods. The first method was a top-down
approach, which attempts to generalize behavior by observing the entry and exit
conditions. A second bottom-up approach involved directly observing the behavior
of the loop body. Caplain [12] observed that for complex programs, it was
not necessarily possible to derive the entry and exit conditions of non-trivial loops
such as nested loops. He suggested a technique which involved decomposing the
general loop body into individual expressions at each iteration of the loop and
tried to match these cases with common loop behavior. Misra [13] also extended
this work by solving the loop invariant generation problem for while loops.
When model checking became common in the 1990s, many researchers used
invariant generation as a technique for verifying properties. This technique in-
volved generating an invariant that was stronger than the given property, proving
that the property was true. Bensalem [14] presented various techniques to im-
prove invariant generation including generalized reaffirmed invariance, propaga-
tion of invariants, refined strengthening, and combining invariants. Bjorner [15]
noted that, while invariant generation was complete, it was not practical for real
world application. He suggested a technique involving “assertion graphs”, which
break down the property into individual assertions. Stark [16] worked on coupling
the heuristic and theorem proving component together, which allowed the use of
failed proofs to help direct the invariant search using successive approximations.
Tiwari [17] used a technique that involved generating an upper and lower bound
for the reachable state set and iterating until the upper bound provided the best in-
variant. Pasareanu [18] also extended this technique by using symbolic execution
for checking that the generated invariant is inductive. This research included an
implementation for property checking of Java programs.
It became apparent that using static analysis for generation of invariants was
not scalable for complex programs. This invited the use of dynamic analysis as an
alternative. Cheng [19] uses data mining to produce a set of invariants based on
the execution of the software. These invariants are then used to limit the search
space, making model checking much more tractable. Ammons [20] has explored
data mining as a method for generating program specifications. This tool ana-
lyzes program behavior and summarizes frequent patterns as state machines. This
specification can also be used to identify bugs if a designer encounters incorrect
behavior in the generated specification.
DAIKON [21, 22, 23] is a tool that has been developed for generation of asser-
tions in software. This tool performs dynamic analysis of a simulation trace to try
and match behavior to a set of pre-defined property templates. These templates
specify that a variable is constant, non-zero, or within some range. In addition,
it can determine some more complex properties such as linear relationships or
library functions.
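As a rough illustration of template-based dynamic analysis, the sketch below checks the three simple templates named above (constant, non-zero, within a range) against a recorded trace. It is not DAIKON's implementation; the function name `infer_templates` and the trace format (a mapping from variable name to observed values) are assumptions made for this example.

```python
def infer_templates(trace):
    """Return the template properties that hold for every sample of each variable.

    `trace` maps a variable name to the list of values observed for it
    (a hypothetical trace format chosen for this sketch).
    """
    properties = {}
    for name, values in trace.items():
        found = []
        if len(set(values)) == 1:
            found.append(("constant", values[0]))
        if all(v != 0 for v in values):
            found.append(("non-zero",))
        # A range always exists; it is only a *likely* invariant of the program.
        found.append(("range", min(values), max(values)))
        properties[name] = found
    return properties


trace = {"x": [3, 3, 3], "y": [1, 5, 2], "z": [0, 4, 4]}
likely_invariants = infer_templates(trace)
```

Every property reported this way holds only on the observed trace, which is why such tools report likely invariants rather than proven ones.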
2.2 Assertion Generation for Hardware
One of the first attempts at assertion generation for hardware designs was developed
by Wang [24], who observed that automatically generating assertions would not
only cut down on the manual effort, but increase the efficiency and quality of
verification. This technique relies on abstracting out the sequential elements of
the design and considering only the combinational elements. The tool then uses
symbolic values to define the logic of the design. This methodology was used for
verification of the memory arrays of the PowerPC.
Hekmatpour [25] proposed a schema-driven assertion generation strategy based
on the block-level structure of the design. First, a schema is defined or selected
from a pre-existing library which defines the actions that need to be taken. This
schema is then used for constructing assertions based on the information collected
from the design. This system supports both interface assertions and interconnect
assertions.
Hangal [26] presents the IODINE tool, which has a striking similarity to
DAIKON. IODINE uses a set of pre-defined property templates and attempts to
match these templates with actual behavior that is observed during simulation.
This tool supports one-hot, mutual exclusion, state traversal, req-ack protocol,
and scoreboard-related assertions.
Rogin [27] presents the Dianosis tool which is not restricted by pre-defined
property templates like IODINE and DAIKON. The main idea behind this tech-
nique is to combine existing basic assertions to generate new, complex assertions.
If no basic properties exist, an approach similar to IODINE is used to generate an
initial set of basic properties. The combined assertions are checked using the sim-
ulation trace. Any combined assertions which do not comply with the simulation
trace are discarded.
CHAPTER 3
BACKGROUND
3.1 The Hardware Design Cycle
To understand why verification is important and what methods are used for testing
circuits, it is important to understand the hardware development cycle. The first
step in the hardware development cycle is the specification stage, where archi-
tects will specify the behavior of a circuit. This may include creating system-level
models to simulate this behavior using tools like SystemC [28]. The next step
is to specify the Register Transfer Level (RTL) implementation using a hardware
description language (HDL) such as Verilog [29] or VHDL [30] which describes the
flow of data in a circuit and how that data is manipulated to achieve the desired be-
havior. The RTL implementation is then synthesized into a gate-level implemen-
tation, which specifies how the circuit must be constructed out of individual logic
gates. This gate-level implementation is then mapped out to determine where the
transistors and wiring will be physically located on a chip. This physical layout is
then manufactured at a fabrication plant where the circuits are printed onto silicon.
This silicon is placed into a package which can interface with other systems.
Since there is so much work and cost that goes into each step of this cycle,
hardware designers put an extremely large effort into making sure that each step
is done correctly. Making a mistake in one of the steps means that all of the
following steps will be wrong, costing even more time and money. In this thesis,
we focus on the testing of the RTL design. There are many strategies used in the
testing of the RTL design. The first testing strategy is known as a directed test,
which involves biasing the inputs in a certain way to create expected behavior.
The directed tests are often paired with mechanisms which check the outputs and
internal state to ensure that the expected behavior and the actual behavior match.
Another strategy is to randomize the input stimuli to create completely random
behavior. This random simulation is paired with many checkers that ensure that
circuit behavior is legal for the system. The last strategy is called assertion based
verification.
3.2 Assertions
The idea of an assertion was first proposed by Alan Turing, who suggested
breaking down a large software routine into a set of properties which could each
be checked [3]. Later, Robert Floyd developed a formal system for reasoning
about flowcharts [31] which was then refined by C. A. R. Hoare [32]. The sys-
tem was adapted for use in software verification which allowed a programmer
to check that certain conditions did not occur [3]. Hardware design and verification
was a largely manual process until VHDL became a standard in 1987
[30]. VHDL supports the ‘assert’ keyword, which allows a designer to specify a
condition that must always evaluate to true. Around this same time, formal verifi-
cation of assertions was also introduced which allowed assertions to be formally
proved [33]. However, the power of assertions was limited until hardware verifica-
tion languages (HVLs) were developed which introduced the concept of assertion
based verification (ABV) [34]. Today, there are many different HVLs which en-
able ABV such as SystemVerilog [35], OpenVera [36], and Property Specification
Language [37].
Assertion based verification [3, 2] involves defining desired properties of the
hardware design and asserting that those properties are never violated. These
assertions can be paired with a dynamic method, such as directed tests or random
simulation, and will give an error if the property is violated. In addition, formal
verification is a static method that creates a model of the design and
checks if the assertion can ever be violated. Formal verification either guarantees
that the property can never be violated or gives a counterexample that shows how
the assertion is violated. In addition to RTL testing, assertions can be physically
synthesized into silicon and used for checking after the chip has been fabricated
[2, 5]. Because of their power and versatility, assertions have become the most
popular method of verifying an RTL design [6].
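The two possible outcomes of formal verification (a guarantee or a counterexample) can be illustrated with a brute-force sketch. Exhaustive enumeration stands in here for a real formal verification engine and is only feasible for tiny combinational designs; the function `check_assertion` and the example design `and_or` are hypothetical names introduced for illustration.

```python
from itertools import product


def check_assertion(design, input_names, antecedent, output_name, expected):
    """Exhaustively check the property `antecedent => output_name == expected`.

    Returns (True, None) if the property holds for every input vector,
    otherwise (False, counterexample) with the first violating input vector.
    """
    for bits in product([0, 1], repeat=len(input_names)):
        inputs = dict(zip(input_names, bits))
        if all(inputs[name] == value for name, value in antecedent.items()):
            if design(inputs)[output_name] != expected:
                return False, inputs
    return True, None


def and_or(inputs):
    # Hypothetical combinational design: c = (a AND b) OR d
    return {"c": (inputs["a"] & inputs["b"]) | inputs["d"]}


names = ["a", "b", "d"]
holds, cex = check_assertion(and_or, names, {"a": 1, "b": 1}, "c", 1)
fails, cex2 = check_assertion(and_or, names, {"a": 0}, "c", 0)
```

The first property is proved for all inputs, while the second is refuted by an input vector with a = 0 and d = 1, which is the kind of counterexample a formal tool would report.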
However, assertion based verification has a significant drawback. Assertion
generation up until this point has been a manual effort. Assertions must be spec-
ified by the designer or the verification engineer. This can be easy enough for
simple combinational properties, but for complex temporal properties, it can be
very time consuming. In addition, it is difficult to reason across module boundaries.
Even if the assertion is correctly specified, certain constraints must also be
specified for the assertion to be true. It can also be difficult coming up with the
right number of assertions. If the set of assertions is too small, it will not provide
very good coverage of the design, leading to a large number of bugs. It can be easy
to provide high coverage if there are a very large number of assertions, but this can
take a very long time to produce. Additionally, a large set of assertions can also
make simulation very slow and synthesis for post-silicon verification impossible
if the area is too large. This means that it is up to the designers to produce a mini-
mal set of assertions that also provides high coverage of the design. This process
can take up a large percentage of the design cycle [3, 7, 8], resulting in many lost
months of productivity. The solution to this problem is taking the manual effort
out of assertion generation.
3.3 Data Mining
Data mining is a relatively young field that developed as a means for organizing
and analyzing the information stored in databases [38]. There are many forms of
data mining such as frequent pattern mining, sequential mining, and clustering.
However, we will focus on frequent pattern mining since this is the type of min-
ing we currently use in GoldMine. In general, frequent pattern mining involves
finding correlations, or patterns, between items.
3.3.1 Decision Tree Based Learning
The decision tree [39, 40] algorithm works by making successive recursive splits
on a database in relation to a target item. Each split implies that a new item from
that database has been added to the set of items, referred to as the itemset. These
splits are based on statistics referred to as mean and error. Mean refers to the
average value of the target item in the database. The error refers to how well the
items in the pattern correlate with the target item. The goal is to find a correlation
between the target item and the items in the pattern by reducing the error.
For example, consider a database that contains the items which were purchased
by customers at a supermarket. Each transaction has a Boolean value associated
with each item indicating if that item was purchased (1) or not (0). We want to
see what items are frequently purchased along with the target item, “milk”. The
decision tree observes that splitting on the item “bread” reduces the error more
than splitting on any other item. This means that the decision tree will partition
the database into entries where bread = 1 and entries where bread = 0. Bread is
added to the itemset and the recursive process continues for each set of database
entries. The result is a tree structure that predicts whether milk is likely to be
purchased depending on the other items that are purchased.
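The split selection in this example can be sketched as follows. The error measure used here (sum of squared deviations of the target from its mean within each partition) is an assumption made for illustration, as are the function names and the small transaction table; the text above does not fix a particular formula.

```python
def mean(rows, target):
    """Average value of the target item over the given rows."""
    return sum(r[target] for r in rows) / len(rows)


def error(rows, target):
    """Sum of squared deviations of the target from its mean (assumed measure)."""
    if not rows:
        return 0.0
    m = mean(rows, target)
    return sum((r[target] - m) ** 2 for r in rows)


def best_split(rows, target, items):
    """Pick the item whose 0/1 partition leaves the smallest total error."""
    def split_error(item):
        zeros = [r for r in rows if r[item] == 0]
        ones = [r for r in rows if r[item] == 1]
        return error(zeros, target) + error(ones, target)
    return min(items, key=split_error)


transactions = [
    {"bread": 1, "eggs": 0, "milk": 1},
    {"bread": 1, "eggs": 1, "milk": 1},
    {"bread": 0, "eggs": 1, "milk": 0},
    {"bread": 0, "eggs": 0, "milk": 0},
    {"bread": 1, "eggs": 1, "milk": 1},
]
chosen = best_split(transactions, "milk", ["bread", "eggs"])
```

In this table, splitting on bread leaves zero error in both partitions, so bread is chosen and the recursion would stop on each side.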
3.3.2 Association Rule Learning
Association rule mining [41] is a data mining method that attempts to generate all
possible correlations between items. This is done by recursively adding items to
an itemset until that itemset is frequently correlated with some target item. Though
this algorithm has an exponential complexity in the worst case, high efficiency is
achieved by applying constraints and using pruning techniques.
Considering the example above, we want to check what items are purchased
along with milk. The algorithm attempts to match each single item with milk to
determine if there are a significant number of transactions to consider this a valid
pattern. After this step, all possible sets of two items are checked for correlation
with the target item. This process continues until all possible combinations of
items are tested for correlation with milk. This algorithm gives all likely correla-
tions with milk, though the runtime may make it intractable. Significant effort is
put into pruning the search space to make this algorithm reasonable to use.
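A minimal sketch of this level-wise search is shown below. It enumerates itemsets of size one, then two, and so on, and keeps those that meet support and confidence thresholds with respect to the target. The function name, the thresholds, and the basket data are assumptions for illustration, and the sketch deliberately omits the pruning that a practical implementation would need.

```python
from itertools import combinations


def rules_for_target(transactions, target, min_support, min_confidence):
    """Enumerate itemsets level by level and keep those that predict `target`.

    Support is the fraction of transactions containing the itemset; confidence
    is the fraction of those transactions that also contain the target.
    """
    items = sorted({i for t in transactions for i in t} - {target})
    rules = []
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            matching = [t for t in transactions if set(itemset) <= t]
            support = len(matching) / len(transactions)
            if not matching or support < min_support:
                continue
            confidence = sum(target in t for t in matching) / len(matching)
            if confidence >= min_confidence:
                rules.append((itemset, support, confidence))
    return rules


baskets = [
    {"bread", "milk"},
    {"bread", "eggs", "milk"},
    {"eggs"},
    {"bread", "eggs", "milk"},
]
found = rules_for_target(baskets, "milk", min_support=0.5, min_confidence=1.0)
```

The nested loops make the exponential worst case visible: with n items there are 2^n - 1 candidate itemsets, which is why pruning matters.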
CHAPTER 4
GOLDMINE: ASSERTION GENERATION
METHODOLOGY
We propose GoldMine, a methodology to automatically generate assertions using
data mining and static analysis. There are five main parts in GoldMine, as shown
in Figure 4.1.
[Figure 4.1: Goldmine Tool Suite. The diagram shows the target RTL design feeding the Data Generator (simulation traces) and Static Analysis (feasible static execution traces); A-Miner produces likely invariants; Formal Verification returns system invariants or counter-examples; A-Val performs evaluation and ranking with a designer feedback loop. Assertion consumers: 1. Pre-Si dynamic analysis, 2. Pre-Si formal verification, 3. Fault Tolerance and Reliability, 4. Post-Si Validation, 5. Code Optimization.]
4.1 Data Generator
The Data Generator simulates a given design (or a “module” of the design). If
regression tests or workloads for the design are available, they can be used to
obtain the simulation traces. GoldMine also generates its own set of simulation
traces using random input vectors.
Typically, simulating with randomized inputs produces the largest number of
true assertions. We used a script to generate a testbench for each Verilog design
that we wanted to test. In the testbench, each input bit is assigned a completely
random value for each cycle by using the Verilog $random function. We
have the ability to expand this method in the future by constraining the random
input values using background information where certain input combinations may
not be allowed. For most of our tests, we simulate for 10,000 cycles, though we
can increase this number for extremely large or complex designs.
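A testbench generator of this kind might look like the following sketch. It is not the script used in this work; the function `make_testbench`, the module name `arbiter`, and its ports are hypothetical. The sketch assumes a purely combinational module, so it does not generate a clock or dump a trace, both of which a real testbench would need.

```python
def make_testbench(module, inputs, outputs, cycles=10000):
    """Return Verilog testbench text that drives every input with $random.

    `inputs` and `outputs` map signal names to bit widths (assumed format).
    """
    def decl(kind, name, width):
        rng = f"[{width - 1}:0] " if width > 1 else ""
        return f"  {kind} {rng}{name};"

    lines = ["module tb;"]
    lines += [decl("reg", n, w) for n, w in inputs.items()]
    lines += [decl("wire", n, w) for n, w in outputs.items()]
    ports = ", ".join(f".{n}({n})" for n in list(inputs) + list(outputs))
    lines.append(f"  {module} dut({ports});")
    lines.append("  initial begin")
    lines.append(f"    repeat ({cycles}) begin")
    # $random returns a 32-bit value; assignment truncates it to the reg width.
    lines += [f"      {n} = $random;" for n in inputs]
    lines.append("      #10;")
    lines.append("    end")
    lines.append("    $finish;")
    lines.append("  end")
    lines.append("endmodule")
    return "\n".join(lines)


tb_text = make_testbench(
    "arbiter", {"req0": 1, "req1": 1}, {"gnt0": 1, "gnt1": 1}, cycles=100
)
```

Constraining the random inputs, as mentioned above, would amount to replacing the plain `$random` assignments with expressions that exclude disallowed input combinations.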
This chapter includes work previously published in [1].
4.2 Lightweight Static Analyzer
The static analyzer extracts domain-specific information about the design that can
be passed to A-Miner. It can include cone-of-influence, localization reductions
[42], topographical variable ordering, and other behavioral analysis techniques.
The current version of the tool only uses static analysis for logic cone informa-
tion. The logic cone of a signal consists of all of the inputs which can influence
the value of a given output. Since data mining methods can only use statistical
methods to infer relationships between signals, it is possible that an unrelated
input may be correlated to an output. The logic cone prevents this problem by
restricting the searched inputs to only those which are related to the output. This
static analysis is also advantageous in that it decreases the runtime in many data
mining algorithms since there are fewer inputs to consider.
We have developed a script for generating the logic cone of an output. This
script first synthesizes the target RTL into gate-level RTL and flattens the hier-
archy, making it easier to parse. Then the script analyzes each gate and records
which input signals influence the output of the gate to generate a one-level-deep
logic cone for each internal signal and primary output. Based on these one-level
logic cones, the script recursively adds the logic cones of the signals in each pri-
mary output’s logic cone until a full logic cone has been produced.
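The expansion from one-level logic cones to a full logic cone can be sketched as a graph traversal. The version below uses an explicit worklist instead of recursion, which computes the same transitive closure and also terminates on feedback loops. The netlist format (a mapping from each signal to the signals feeding its gate) and all signal names are assumptions for illustration.

```python
def logic_cone(signal, fanin, primary_inputs):
    """Return the set of primary inputs that can influence `signal`.

    `fanin` maps each internal signal or output to its one-level logic cone,
    i.e. the signals that directly feed the gate driving it.
    """
    cone, seen, stack = set(), set(), [signal]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already expanded; this also breaks feedback loops
        seen.add(current)
        if current in primary_inputs:
            cone.add(current)
        else:
            stack.extend(fanin.get(current, ()))
    return cone


# Hypothetical flattened gate-level netlist.
fanin = {
    "n1": ["a", "b"],
    "n2": ["n1", "c"],
    "out": ["n2", "d"],
    "other": ["e"],
}
primary_inputs = {"a", "b", "c", "d", "e"}
cone_out = logic_cone("out", fanin, primary_inputs)
cone_other = logic_cone("other", fanin, primary_inputs)
```

Here input e is excluded from the cone of `out`, so the miner would never consider it when searching for assertions about that output.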
4.3 A-Miner
The A-Miner phase derives knowledge and information from the simulation trace
data. This is done by searching for correlations between the inputs and a target
output. For example, in a simulation trace, whenever inputs A and B are both
1, the output C is also 1. A data mining algorithm can quickly and efficiently
recognize this pattern. Data mining algorithms use statistics such as support and
confidence to determine whether there is actually a relationship between the inputs
and target output. Given a rule A ⇒ B (henceforth of the form if a then b),
support(A) is the proportion of instances in the data that contain A. Confidence
can be interpreted as an estimate of the conditional probability P(B|A). If a rule
has 100 percent confidence, it means that within the data set, B holds in every
instance in which A holds. A high support for this rule means that A occurs
frequently in the data set. In GoldMine, we must guarantee that the confidence
is 100% if we want to generate an assertion that is likely to be true. The reason
for this is that if a given antecedent is observed together with multiple
different values of an output, then it cannot form an assertion since the antecedent does not
imply a single value.
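The two statistics can be computed directly from a trace. The sketch below is illustrative only: the helper names and the four-sample trace are hypothetical, and samples are assumed to be dictionaries mapping signal names to 0/1 values.

```python
from typing import Dict, List

Sample = Dict[str, int]


def matches(sample: Sample, props: Dict[str, int]) -> bool:
    """True if every proposition (signal = value) holds in the sample."""
    return all(sample[sig] == val for sig, val in props.items())


def support(trace: List[Sample], antecedent: Dict[str, int]) -> float:
    """Proportion of samples in which the antecedent holds."""
    return sum(matches(s, antecedent) for s in trace) / len(trace)


def confidence(trace: List[Sample], antecedent: Dict[str, int],
               consequent: Dict[str, int]) -> float:
    """Estimate of P(consequent | antecedent) over the trace."""
    hits = [s for s in trace if matches(s, antecedent)]
    if not hits:
        return 0.0
    return sum(matches(s, consequent) for s in hits) / len(hits)


# Hypothetical trace in which C = 1 whenever A = 1 and B = 1.
TRACE = [
    {"A": 1, "B": 1, "C": 1},
    {"A": 1, "B": 1, "C": 1},
    {"A": 1, "B": 0, "C": 0},
    {"A": 0, "B": 1, "C": 1},
]

print(support(TRACE, {"A": 1, "B": 1}))               # antecedent holds in 2 of 4 samples
print(confidence(TRACE, {"A": 1, "B": 1}, {"C": 1}))  # candidate assertion
print(confidence(TRACE, {"A": 1}, {"C": 1}))          # below 100%, so rejected
```

Only the rule with confidence 1.0 would be kept as a candidate assertion under the 100% requirement.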
A-Miner also provides hooks for incorporating domain specific information
from the lightweight static analyzer into the mining algorithms. The data min-
ing algorithm allows specification of which inputs have a relationship with the
target output as determined by the logic cone. In addition, this phase of GoldMine
can have multiple feedback loops from different parts of the tool. Using the infor-
mation provided to it, the A-Miner produces a set of candidate assertions which
are likely to be true. Objective measures of interestingness [43] can be used to
rank this set of candidate assertions, such as the support as specified above.
4.4 Decision Tree Based Supervised Learning
Algorithms
Association rule based data mining algorithms find all possible associations be-
tween sets of predicates and rank them according to support/confidence. For se-
quential blocks that might have temporal properties, exhaustive search is an inef-
ficient option in our experience (see case study).
We primarily use decision tree based supervised learning algorithms [39] in
A-Miner. In a decision tree, the data space is locally divided into a sequence of
recursive splits in a small number of steps. A decision tree is composed of internal
nodes and terminal leaves. Each decision node implements a “splitting function”
with discrete outcomes labeling the branches. This hierarchical decision process
that divides the input data space into local regions continues recursively until it
reaches a leaf.
We require only Boolean splits (for Boolean variables) at every decision node.
The error function implemented to select the best splitting variable at each node
is the variance between the target output values and the values predicted by a can-
didate antecedent. The winner is the one whose error is minimum, which then
forms the next level of the decision tree. Each leaf in the decision tree becomes a
candidate assertion where the variable and value at each split represents a propo-
sition in the antecedent and the mean of the output represents its predicted value
in the consequent.
The decision tree algorithm used in GoldMine is shown in Algorithm 1. The
decision tree function has three inputs: F represents the set of inputs that are
available to split on, P represents the set of propositions in the antecedent of an
assertion, and E represents the set of simulation trace samples. In addition, Ac
represents the set of candidate assertions and z represents the output for which
assertions are being mined.
The mean function calculates the mean of the values for z in each sample and
represents the expected value of z. The error is a function that calculates the
absolute deviation of the output value in each sample from the expected value.
Other functions, such as variance, can be used as an error function. The error
function will be high when there is a lot of deviation in the output’s value in each
sample and it will be zero when the output’s value is the same in each sample.
The algorithm first checks if the error of the simulation trace is zero. If so, a
candidate assertion is added to Ac where P represents the set of propositions in
the antecedent and the output is equal to the mean in the consequent. If the error is
zero, it indicates that all values of the output are the same, meaning that the mean
is equivalent to the value of the output in all samples.
If the error is not zero and an assertion cannot be created, the algorithm looks
for a suitable input in F to split on. The potential error is calculated based on
partitioning the simulation data into only the samples where f_i = 0 and only the
samples where f_i = 1. The potential error of each set of samples is summed
and subtracted from the error of the unpartitioned data set. This is the potential
error reduction for splitting on f_i. The f_i that results in the best error reduction is
chosen as the splitting variable, f_best. The algorithm recurses with the splitting variable
removed from F. One instance of decision tree will add f_best = 0 to P and have
E partitioned with respect to 0 while the other instance will have f_best = 1 in P
and E partitioned with respect to 1.
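The recursion described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the GoldMine implementation: the error is taken to be the total absolute deviation from the mean (one reading of the description, chosen so that the summed partition errors can be subtracted directly), ties are broken in favor of the first input in F, and the four-sample trace is hypothetical.

```python
from typing import Dict, List, Tuple

Sample = Dict[str, int]
Assertion = Tuple[Dict[str, int], int]  # (antecedent propositions, value of z)


def mean(E: List[Sample], z: str) -> float:
    return sum(s[z] for s in E) / len(E) if E else 0.0


def error(E: List[Sample], z: str) -> float:
    """Assumed error: total absolute deviation of z from its mean."""
    m = mean(E, z)
    return sum(abs(s[z] - m) for s in E)


def decision_tree(F: List[str], P: Dict[str, int], E: List[Sample],
                  z: str, Ac: List[Assertion]) -> None:
    err = error(E, z)
    if err == 0:
        if E:  # all samples agree on z, so the leaf becomes a candidate
            Ac.append((dict(P), int(mean(E, z))))
        return
    best_reduction, f_best = 0.0, None
    for f in F:
        E0 = [s for s in E if s[f] == 0]
        E1 = [s for s in E if s[f] == 1]
        reduction = err - error(E0, z) - error(E1, z)
        if reduction > best_reduction:  # strict: first input wins a tie
            best_reduction, f_best = reduction, f
    if f_best is None:
        return  # no split reduces the error
    rest = [f for f in F if f != f_best]
    for v in (0, 1):
        decision_tree(rest, {**P, f_best: v},
                      [s for s in E if s[f_best] == v], z, Ac)


def mine(F: List[str], E: List[Sample], z: str) -> List[Assertion]:
    Ac: List[Assertion] = []
    decision_tree(F, {}, E, z, Ac)
    return Ac


# Hypothetical trace: z = a AND b, while c merely happens to track b.
TRACE = [
    {"a": 0, "c": 0, "b": 0, "z": 0},
    {"a": 0, "c": 1, "b": 1, "z": 0},
    {"a": 1, "c": 0, "b": 0, "z": 0},
    {"a": 1, "c": 1, "b": 1, "z": 1},
]

for antecedent, value in mine(["a", "c", "b"], TRACE, "z"):
    print(antecedent, "=> z =", value)
```

Because c precedes b in F and yields the same reduction, the sketch splits on c; passing only the inputs in the logic cone, mine(["a", "b"], TRACE, "z"), avoids that spurious choice.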
4.5 Formal Verifier
In order to check if the likely invariants generated by A-Miner are system invari-
ants, the design and candidate assertions are passed through a formal verification
engine. If a candidate assertion fails formal verification, a counterexample can be
generated for feedback to the A-Miner. We use SMV [44] and Incisive Formal
Verifier as our formal verification engines. The candidate assertions are attached
to the design for verification and checked at the positive edge of the clock cy-
cle. The reset signal of the design is constrained to “off” so as to prevent spurious
counterexamples. Although the aim of GoldMine is to minimize the human
effort in the assertion generation process, human intervention is still needed to differen-
tiate between a spurious candidate assertion that fails formal verification and
a genuine system invariant whose failure reports the existence of a bug.

Algorithm 1 Decision Tree Supervised Learning Algorithm
decision_tree(F, P, E)
    mean = mean(E, z)
    err = error(E, z)
    if err ≤ 0 then
        A_c = A_c ∪ {P ⇒ z = mean}
        return
    end if
    best_reduction = 0
    for each input f_i in F do
        reduction = err − error(E_{f_i=0}, z) − error(E_{f_i=1}, z)
        if reduction > best_reduction then
            best_reduction = reduction
            f_best = f_i
        end if
    end for
    if best_reduction > 0 then
        decision_tree(F − {f_best}, P ∪ {f_best = 0}, E_{f_best=0})
        decision_tree(F − {f_best}, P ∪ {f_best = 1}, E_{f_best=1})
    end if
4.6 A-Val: Evaluation and Ranking
Once the assertions have been generated through GoldMine, their evaluation is
extremely important to the process. This is because assertion generation has been
a completely manual process thus far in the system design cycle.
There are several ways for us to evaluate A-Miner’s performance. One basic
metric is the hit rate of true assertions. The hit rate of a run in GoldMine is
the ratio of true assertions to candidate assertions. This provides a very crude
indicator of performance. In addition, we can consider output hit rate, which is
the number of outputs for which GoldMine could generate a true assertion over
the total number of outputs.
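Both ratios are simple to compute once the formal verifier has classified the candidates. The following sketch uses hypothetical function names and made-up counts purely to illustrate the two definitions.

```python
from fractions import Fraction
from typing import Dict


def hit_rate(n_true: int, n_candidates: int) -> Fraction:
    """Ratio of true assertions to candidate assertions for one run."""
    return Fraction(n_true, n_candidates)


def output_hit_rate(true_per_output: Dict[str, int]) -> Fraction:
    """Outputs with at least one true assertion over all outputs."""
    covered = sum(1 for n in true_per_output.values() if n > 0)
    return Fraction(covered, len(true_per_output))


# Hypothetical run: 3 candidates, 2 proven true, spread over 4 outputs.
print(hit_rate(2, 3))
print(output_hit_rate({"out0": 2, "out1": 0, "out2": 0, "out3": 0}))
```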
Since there are no commercially used metrics for evaluating the coverage of an
assertion, we have devised a method to evaluate assertion coverage. It should be
noted that this metric has no relation to standard coverage metrics such as code,
branch, or path coverage. The reason for this is that those metrics are used for
judging the quality of a directed test suite, which means that they cannot be ap-
plied to a set of individual assertions. We can evaluate the coverage of an assertion
by considering the input space that is covered by the antecedent of the assertion.
If we consider the truth table with respect to some output, each entry that corre-
sponds to the propositions in the antecedent of an assertion is defined as covered
by that assertion. For example, if there is an assertion (a = 1 & b = 1 ⇒ c = 1),
we can consider the input space coverage to be 25% since we know that 25% of
the truth table entries contain a = 1, b = 1. The reasoning behind this thinking
is that if there is a set of assertions that covers each entry in the truth table of an
output, that output is well covered by the set of assertions. This metric is simple
to calculate since we can determine the percentage of the input space that an an-
tecedent of an assertion covers without knowing every single input combination.
The input space coverage is defined as 1/2^{|P|}, where |P| is the number of proposi-
tions in the antecedent. Based on this definition, it can be seen that the input space
coverage halves with each additional proposition in the antecedent. It should also
be noted that this notion of coverage can be extended to sequential designs. If we
consider an unrolled circuit where each signal, s, in the truth table is represented
at the current time, s[t], we can consider the signal at each time cycle before it,
s[t − 1], s[t − 2], ..., s[t − n]. Given that n is large enough, we will always be able
to represent this coverage accurately in these terms.
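The closed form can be checked against a brute-force count over the truth table. The sketch below is illustrative; the function names and the three-input example are assumptions, and each proposition is assumed to constrain a distinct Boolean input.

```python
from fractions import Fraction
from itertools import product
from typing import Dict, List


def input_space_coverage(antecedent: Dict[str, int]) -> Fraction:
    """Closed form: 1 / 2^|P| for |P| propositions on distinct inputs."""
    return Fraction(1, 2 ** len(antecedent))


def enumerated_coverage(antecedent: Dict[str, int],
                        inputs: List[str]) -> Fraction:
    """Fraction of truth table rows that satisfy the antecedent."""
    rows = [dict(zip(inputs, bits))
            for bits in product((0, 1), repeat=len(inputs))]
    covered = sum(all(r[k] == v for k, v in antecedent.items()) for r in rows)
    return Fraction(covered, len(rows))


antecedent = {"a": 1, "b": 1}
print(input_space_coverage(antecedent))                  # 1/4, i.e. 25%
print(enumerated_coverage(antecedent, ["a", "b", "d"]))  # same value by counting
```

The enumeration grows exponentially with the number of inputs, which is why the closed form is the practical choice.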
In order to bridge the gap between the human and the machine generated asser-
tions, human judgment can also be made a part of the GoldMine process where
the designer ranks the true assertions according to some pre-defined ranks. This
provides an objectification of an inherently subjective decision and can be used as
feedback into A-Miner, with a view to predict the ranking of a generated assertion
and optimize the process for achieving higher ranks.
4.7 Temporal Assertions
There are some single-cycle assertions which are interesting, but it can be even
more interesting to see assertions which span several cycles. These multi-cycle as-
sertions can be found without having to change the data mining algorithm. When
the simulation trace is produced, each signal in a sample refers to the value of
that signal at the current time, t. The maximum length of a temporal assertion is
user-specified as l. We want to represent the signals at previous time cycles t − 1,
t − 2, ..., t − l in this sample. For each signal, we can determine the previous val-
ues of that signal by checking the samples recorded at earlier times than the
current sample. Now that there is data representing each signal over a number of
cycles, the data mining algorithm can proceed as normal to look for relationships.
For example, consider a protocol that asserts ack = 1 two cycles after req = 1.
The simulation data for this module is shown in Table 4.1.
Table 4.1: The simulation data for a req/ack protocol
time  req  ack
0     0    0
1     1    0
2     0    0
3     0    1
4     0    0
We set our maximum assertion length l = 2 and perform the necessary data
transformation. The resulting simulation data are shown in Table 4.2. Since there
is no information for [t − 1] in cycle 0 or [t − 2] in cycle 0 or 1, we must discard
the data in cycle 0 and 1.
Table 4.2: The previous cycle information is added to enable temporal assertion mining
time  req[t]  ack[t]  req[t−1]  ack[t−1]  req[t−2]  ack[t−2]
0     0       0       x         x         x         x
1     1       0       0         0         x         x
2     0       0       1         0         0         0
3     0       1       0         0         1         0
4     0       0       0         1         0         0
The data mining algorithm used for single-cycle assertions can now be applied.
In cycle 3, there is a clear relationship between ack[t] and req[t − 2] which results
in the assertion req = 1XX => ack = 1, that is, ack is 1 two cycles after req is 1.
This assertion represents the expected behavior for the protocol.
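The data transformation amounts to widening each sample with the values from the l preceding cycles. The sketch below is a minimal illustration; the function name unroll and the column naming scheme are assumptions, and the trace is the req/ack data of Table 4.1.

```python
from typing import Dict, List

Sample = Dict[str, int]


def unroll(trace: List[Sample], l: int) -> List[Sample]:
    """Add columns sig[t-1] .. sig[t-l] to every sample.

    The first l cycles are discarded because they lack a full history.
    """
    rows = []
    for t in range(l, len(trace)):
        row = {}
        for k in range(l + 1):
            suffix = "[t]" if k == 0 else f"[t-{k}]"
            for sig, val in trace[t - k].items():
                row[sig + suffix] = val
        rows.append(row)
    return rows


# Simulation data from Table 4.1, indexed by cycle.
TRACE = [
    {"req": 0, "ack": 0},
    {"req": 1, "ack": 0},
    {"req": 0, "ack": 0},
    {"req": 0, "ack": 1},
    {"req": 0, "ack": 0},
]

rows = unroll(TRACE, 2)
for row in rows:
    print(row)

# Every unrolled sample with req[t-2] = 1 also has ack[t] = 1.
print(all(r["ack[t]"] == 1 for r in rows if r["req[t-2]"] == 1))
```

After this step each widened sample can be handed to the same single-cycle miner, with req[t-2] treated as just another input column.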
4.8 An Example Run through GoldMine
    always @ *
        if (int.valid && int.has_dreg)
            wb_valid0 = 1;
        else
            wb_valid0 = 0;
    always @ *
        int.L1_hit = int.has_dreg
[Trace table with columns int.valid, int.L1_hit, int.has_dreg, wb_valid0]
Figure 4.2: Example RTL and Data Generator traces
Consider the fragment of the Rigel processor RTL source code shown in Figure
4.2. This code implies that writeback on port 0, wb_valid0, is valid if the integer
writeback signal int.valid is set and a register is available (int.has_dreg).
This event updates the L1 cache hit rate. We now illustrate the GoldMine assertion
generation process on this code. The Data Generator runs a few simulations and
produces the simulation results shown in the table in Figure 4.2.
In the absence of any guidance from the static analyzer, A-Miner forms a de-
cision tree for the data. Figure 4.3 shows this process. The mean of wb_valid0
is 0.25 (the average of its values) and the error is the average absolute difference
from the mean, 0.375. The decision tree now tries to split based on the maximum
error reduction among all the input values. The values of error for the 0/1 values
of int.valid, int.L1_hit and int.has_dreg are (0, 0.5). Since all values
(0/1) of all inputs produce equal error, and in the absence of any guidance from
the static analyzer, the decision tree uses the simple heuristic of splitting on the
first variable in the list, int.valid. On the int.valid = 0 branch, error is
reduced to 0, making it a leaf node. A0: if (int.valid = 0) then (wb_valid0 = 0) is the
candidate assertion generated. Since the error value has not yet reached 0 on the
int.valid = 1 branch, the decision tree tries to split again. Although
int.has_dreg is the variable that actually affects the output of interest, the splitting
variable is int.L1_hit since the error reduction is equal for both remaining variables
and it is first in the list. Since both branches of the tree at this level reach error = 0,
the leaves produce A1: if (int.valid = 1) and (int.L1_hit = 0) then (wb_valid0 = 0) and
A2: if (int.valid = 1) and (int.L1_hit = 1) then (wb_valid0 = 1) as candidate assertions.
All candidate assertions A0, A1, A2 are passed to a formal verification engine,
which passes A0 and A1 but fails A2. The hit rate is 2/3 in this case. A2 fails due to the
false causality that is established by simulation data.
In the presence of the lightweight static analyzer, the logic cone-of-influence
information would suffice in this case. The logic cone establishes the part of the
design that is causal to wb_valid0, providing a list of variables to the decision
tree that excludes int.L1_hit. The corresponding decision tree is shown in
Figure 4.3. The candidate assertions produced now are A0 (same as in previous
case), A1: if (int.valid = 1) and (int.has_dreg = 0) then (wb_valid0 = 0) and A2: if (int.valid
= 1) and (int.has_dreg = 1) then (wb_valid0 = 1). All these candidate assertions are
passed by the formal verifier, with a consequent hit rate of 1.
There are three disadvantages of the temporal assertion mining method. The
first is that there must be a user-specified bound on the maximum number of cy-
cles in an assertion, l. The second is that as l increases, the runtime of the algo-
rithm increases since the number of signals that the data mining algorithm needs
to search has increased. The third disadvantage is that as l increases, the quality of
the generated assertions can decrease since the number of inputs can get so large
that making a good splitting decision is difficult. These disadvantages can be mit-
igated by using background knowledge of the design to choose a good maximum
cycle length, l, or testing several different values for l to optimize results.
Without Static Analysis:
wb_valid0 (Mean: .25, Error: .38)
    int.valid == 0: Mean: 0, Error: 0
    int.valid == 1: Mean: .5, Error: .5
        int.L1_hit == 0: Mean: 0, Error: 0
        int.L1_hit == 1: Mean: 1, Error: 0
With Static Analysis:
wb_valid0 (Mean: .25, Error: .38)
    int.valid == 0: Mean: 0, Error: 0
    int.valid == 1: Mean: .5, Error: .5
        int.has_dreg == 0: Mean: 0, Error: 0
        int.has_dreg == 1: Mean: 1, Error: 0
Figure 4.3: Example Decision Tree Output with and without Static Analysis
4.9 Applications of GoldMine
Though GoldMine is an interesting tool, it can be difficult to see how it can be
used in a realistic verification environment. Since GoldMine produces assertions
based on RTL which are then verified using formal verification, the
generated assertions trivially pass on the given RTL. The strength of this tool is that it
can be applied in a number of ways, including applications that have not
yet been developed.
One way to use GoldMine effectively is to use the assertions as a regression test
throughout development. The assertions that are true in one revision may fail in a
later revision. This can indicate that the assertions are no longer relevant, which
indicates that those assertions must be updated. However, it can also indicate that a
revision of the design introduced a bug which the assertion can help to locate. For
example, suppose GoldMine is run on an ALU and produces a set of assertions, and the
ALU is then revised to make a certain function faster. If there are any assertions
that fail, it likely indicates that there is a bug in the revised code.
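A regression check of this kind can be sketched by replaying mined assertions over the trace of a later revision. The code below is a simulation-level illustration only, not the formal check used by GoldMine; the function name regress, the assertion encoding, and both traces are hypothetical.

```python
from typing import Dict, List, Tuple

Sample = Dict[str, int]
Assertion = Tuple[Dict[str, int], Dict[str, int]]  # (antecedent, consequent)


def regress(assertions: Dict[str, Assertion],
            trace: List[Sample]) -> Dict[str, Tuple[int, int]]:
    """For each assertion, count samples where the antecedent holds
    and, among those, samples where the consequent is violated."""
    report = {}
    for name, (ante, cons) in assertions.items():
        triggered = [s for s in trace
                     if all(s[k] == v for k, v in ante.items())]
        failures = sum(any(s[k] != v for k, v in cons.items())
                       for s in triggered)
        report[name] = (len(triggered), failures)
    return report


# Hypothetical assertions mined from revision 1 of a gated output.
ASSERTIONS = {
    "a0": ({"en": 0}, {"out": 0}),
    "a1": ({"en": 1, "d": 1}, {"out": 1}),
}
REV1 = [{"en": 0, "d": 1, "out": 0}, {"en": 1, "d": 1, "out": 1},
        {"en": 1, "d": 0, "out": 0}]
# In revision 2 the output is no longer gated by en.
REV2 = [{"en": 0, "d": 1, "out": 1}, {"en": 1, "d": 1, "out": 1}]

print(regress(ASSERTIONS, REV1))  # no failures on the revision that was mined
print(regress(ASSERTIONS, REV2))  # a0 fails, pointing at the gating logic
```

A failing assertion still needs human judgment: it may expose a bug in the revision, or it may simply no longer describe intended behavior.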
When using random testing to verify a design, it can be difficult to determine
the number of cycles to simulate before declaring a unit fully verified. One way to
measure testing completeness is to use standard coverage metrics, but this method
only gives a very general idea of the coverage. GoldMine can also be used in ad-
dition to standard coverage metrics to increase confidence in a design. The trace
from the random test simulation can be mined for assertions using GoldMine. Any
assertion mined from this trace indicates behavior that is covered in the simulation
trace. This means that if the assertions generated in GoldMine have a high cover-
age, it is likely that a high percentage of design behavior has been covered in the
random test. If the assertions generated do not have high coverage, the simulation
likely needs to run for more cycles.
CHAPTER 5
A CASE STUDY OF A MULTI-CORE RTL
We first present the results of applying GoldMine to the 1000+ core Rigel RTL
design. Our intention is to use assertions from GoldMine to provide a regression
test suite for the Rigel RTL that is in the later stages of its evolution. We generated
assertions for three principal modules in Rigel: the writeback stage, the decode
stage and the fetch stage. The writeback stage is a combinational module with
interesting propositional properties. The decode and fetch stages are sequential
modules with many interesting temporal properties.
5.1 Subjective Ranking of Assertions by a Designer
We performed some experiments to help evaluate GoldMine’s assertions. We per-
formed an extensive designer ranking session for every phase of assertion gen-
eration of each module. Also, since the Rigel RTL does not have manual target
assertions to compare against, we performed a subjective, but intensive evaluation
strategy. Rankings were from 1 to 4, calibrated as below:
1. Trivial assertion that the designer would not write
2. Designer would write the assertion
3. Designer would write, captures subtle design intent
4. Complex assertion that designer would not write
This chapter includes work previously published in [1].
[Pie charts for all modules (sequential & combinational)]
(a) Complexity: Level 1: 70.59%, Level 2: 15.69%, Level 3 or more: 13.73%
(b) Ranking: Ranking 1: 17.65%, Ranking 2: 62.75%, Ranking 3: 1.96%, Ranking 4: 17.65%
Figure 5.1: GoldMine assertion complexity; ranking by designers
The results presented in Figure 5.1 show the distribution of these ranks for a
sample of representative assertions over all the modules. The algorithmic knobs
that produced the highest hit rate as well as the highest number of assertions were
turned on for this experiment. Most assertions in this analysis rank at 2. The
writeback module has some assertions ranked 3. The absence of 3 in the sequential
modules, according to the designers, is due to the fact that intra-module behavior
is not complicated enough to have many subtle relationships. For example, an
assertion ranked 1 is: If the halt signal in the integer, floating point, and memory unit is
set to 0, the halt signal is 0. In the RTL, the halt signal is a logical OR of the halt signals of the
integer, floating point, and memory units. GoldMine found a true, but over-constraining
rule. The designers ranked it 1, since they would not have written this rule. Now,
consider this RTL code:
    decode2mem.valid <= valid_mem &&
    !issue_halt && !branch_mispredict &&
    fetch2decode.valid && !follows_vld_branch
An assertion ranked 2 is: if branch_mispredict is high, decode2mem.valid will be low
in the next cycle. An assertion ranked 3 is: If an integer unit does not want to use the
first port, and the floating point unit does not want to use the second port, then the second
port remains unused.
5.2 Complex Assertions in GoldMine
Despite the small size of the modules, GoldMine achieved rank 4, i.e. it produced
assertions that capture complex relationships in the design. This is an advantage
of mechanically derived assertions: they are able to capture unintentional, but
true, relationships that can be excellent counter checks and can be brought to the
designer’s attention. We assessed complexity by the number of levels (depth) of
the design captured by assertions. In a few cases, the assertions capture tem-
poral relationships that are more than 6 logic levels deep in the design. This
provides a different perspective on the RTL, outside of the expectation, but may
provide avenues for optimizing or analyzing the RTL. For example, the RTL has
the following relationship:
if( choice_mem)
decode_packet <= decode_packet1;
An assertion ranked 4 is: if (reset=0) and (issue0=0) and (decode_packet_dreg=0),
and in the next cycle if (instr0_issued = 0), then decode_packet_dreg = 0. This assertion
relates a single field in the decode_packet variable to reset and instr0_issued, both of
which are related to choice_mem when the code is traversed beyond 6 levels of (se-
quential) logic. Such a relationship would have been extremely hard to decipher
through static analysis and code traversal. To the best of our knowledge, there is
no state-of-the-art tool/technique that can claim to decipher such complex asser-
tions. Figure 5.1 shows the distribution of assertions with respect to complexity.
5.3 Outputs Covered by GoldMine
Table 5.1: Percentage of outputs covered by GoldMine for Rigel
Module            Outputs Covered
Decode Stage      46.76%
Fetch Stage       35.71%
Writeback Stage   87.50%
Table 5.1 shows the percentage of outputs per module for which true assertions were
generated by GoldMine. Although candidate assertions were generated for all the
module outputs, the assertions that passed formal verification covered only a percent-
age of them. Figure 5.2 shows the cumulative distribution of true assertions per
output. At the 50% mark, there are approximately 4 to 5 unique assertions per
output in the decode module. Although we are not able to get a precise notion of
path coverage per output signal, the unique assertions per output are indicative of
high path coverage.
[Plot: Cumulative Distribution Function (%) versus # Assertions/Output on logarithmic axes; the 50% mark corresponds to a median of 4.6 assertions/output]
Figure 5.2: Distribution of unique assertions per output in all modules
5.4 The Acid Test: Regression Test Experiments
As a final evaluation of the entire regression suite of GoldMine assertions, we
appended them in the RTL and ran a new set of directed Rigel tests.
We will analyze the results for the writeback module, since the results for the fetch and de-
code modules are very similar. We used Synopsys VCS with RTL conditional coverage
for procuring coverage of the directed tests. We used the conditional coverage
metric since unique assertions in GoldMine pertain to different paths. This metric
is meaningful for us since it examines individual path conditions in generating an
output.
The writeback module directed tests achieved 76% conditional coverage, while
the random tests used to generate the GoldMine assertions achieved 100% con-
ditional coverage and generated 200 unique assertions. When the GoldMine as-
sertions were included in the directed test runs, 110 (55%) of the assertions were
triggered¹ by the directed tests. Therefore, the remaining 90 assertions (45%) refer to de-
sign behavior as yet untested by the directed tests. Figure 5.3 shows the overlap
of assertions with directed tests. This highlights the value of GoldMine, since it
provides significant coverage of the unexplored regions of the design at this early
stage.
Figure 5.3: The added coverage of design behavior through GoldMine assertions for the
writeback module
The overlapping assertions that coincide with the designer-crafted directed tests
can be used for static checking, formal verification, etc. However, the untouched
assertions can be used to improve the quality of the directed tests. They can be
used as regression checks as the test patterns mature and the regression test suite
evolves. It is probable that the manual assertion generation process would eventu-
ally get to this point after multiple iterations. In contrast, GoldMine, a mechanical
assertion generator, could explore the design space far beyond the human gen-
erated tests. The designers of Rigel have evaluated GoldMine’s contribution as
“covering a wide design space much earlier in the design cycle than typically
achievable” [45].
¹ An assertion is triggered if the antecedent condition evaluates to true.
CHAPTER 6
SCALING GOLDMINE TO INDUSTRIAL
DESIGNS: THE OPENSPARC T2
It is difficult to properly assess the utility of GoldMine without testing the tool on
an industrial-size design. Such designs are hard to obtain since most companies are protective
of their IP and will not distribute their HDL designs. However, Sun provides a few
open source designs for the UltraSparc series of CPUs. Sun’s OpenSparc T2 CPU
[46] is a many-threaded, open source design, which makes it a suitable example
to demonstrate GoldMine.
For our initial tests, we have isolated the memory management unit (MMU) of
the core for GoldMine assertion generation. This unit reads the TLB for the data
and instruction caches and performs a page table walk in the case of a miss. This
unit has 59 inputs, 54 outputs, and 313 internal signals. We searched for assertions
for the 16 outputs for which we could generate a significant number of samples
using random input vectors. The A-Miner searched for correlations between each
output and all of the inputs and internal signals. If no logic cone information is
used, the total number of bits that the decision tree can split on is nearly 3000,
making this a complex test for the A-Miner. We performed tests with 10,000 and
1 million cycles worth of simulation data for assertion generation.
6.1 Evaluation of True Assertion Success Rate
The first metric for gauging the success of GoldMine is to determine what per-
centage of outputs had at least one valid assertion generated. We compare several
different configurations of GoldMine for this statistic. Our first configuration uses
10,000 cycles of simulation data and no logic cone of influence. The second also
uses the same 10,000 cycles of simulation data, but includes the logic cone infor-
mation. The third and fourth configurations use 1,000,000 cycles of simulation
data. Similarly, the third configuration does not use logic cone information and
the fourth configuration does. In Figure 6.1, we can see that both increasing the
number of cycles of simulation data and using logic cone information can increase the
number of outputs with at least one true assertion. It is also interesting to note that
using the logic cone alone can result in more outputs being covered than increasing
the number of cycles of simulation data by two orders of magnitude.
Figure 6.1: Percentage of outputs for which at least one true assertion was generated
6.2 Evaluation of Assertion Input Space Coverage
In the process of evaluating the assertions generated for the Rigel design, we had
the liberty of having the generated assertions ranked by the actual designers of the
modules. However, we do not have the same ability to get subjective rankings on
the OpenSparc CPU. Because of this, we have to use an objective way to assess
the quality of the assertions. To do this, we use input space coverage as defined in
Section 4.6. For this experiment, we use simulation data generated using 10,000
and 1,000,000 cycles of random input stimulus. As shown in Figure 6.2, we can
see that GoldMine is able to produce assertions with good input space coverage
with only 10,000 cycles of simulation data. Using 1,000,000 cycles of
simulation data produces a set of assertions with even greater input space
coverage. However, as we can see in Figure 6.3, the total number of assertions
increases greatly to account for the new coverage.
Figure 6.2: Input space coverage for MMU
6.3 Evaluation of the Percentage of Complex
Assertions
To assess the complexity of the assertion sets, we again must use an objective mea-
sure since we do not have a designer to review the assertions. We can consider
the complexity of an assertion to be relative to the number of propositions in the
antecedent of an assertion. For this experiment, we made a statistic of the num-
ber of assertions that had more than 10 propositions based on the intuition that it
would be difficult for a verification engineer to develop an assertion with this com-
plexity. The percentage of complex true assertions out of the total number of true
assertions is shown in Figure 6.4. There are certain outputs which have a higher
percentage of complex assertions which can be attributed to the complexity of the
logic corresponding to that circuit. This figure also shows that the experiment
with 1,000,000 cycles tends to produce more complex assertions since complex
behavior is more likely to appear in a larger random input simulation trace.
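The complexity statistic is a simple threshold count over antecedent sizes. The sketch below is illustrative; the function name and the list of sizes are hypothetical, while the threshold of 10 propositions follows the text.

```python
from typing import List


def percent_complex(antecedent_sizes: List[int], threshold: int = 10) -> float:
    """Percentage of true assertions with more than `threshold` propositions."""
    if not antecedent_sizes:
        return 0.0
    n_complex = sum(1 for n in antecedent_sizes if n > threshold)
    return 100.0 * n_complex / len(antecedent_sizes)


# Hypothetical antecedent sizes for the true assertions of one output.
sizes = [3, 12, 11, 10]
print(percent_complex(sizes))  # 10 propositions is not "more than 10"
```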
Figure 6.3: Total number of true assertions generated for MMU - Logarithmic
6.4 Comparing the Generated Assertions with the
OpenSparc Specification
To further judge the quality of the assertions generated by GoldMine, we ex-
amine some candidate assertions generated for the L2 cache controller (L2T) of
the OpenSparc SoC. These candidate assertions are generated with respect to
the L2 pipeline stall signal. To understand these assertions, the circuit behav-
ior must first be understood. A stall (l2t_pcx_stall_pq) is signaled when the input
queue (IQ), which contains requests for the L2 cache, is full. There are two sig-
nals which control whether a request is added or removed from the queue. The
data ready signal (pcx_l2t_data_rdy_px2_d1) indicates that a request will be added
to the queue, causing an increase in queue size. The input queue select signal
(arb_iqsel_px2_d1) indicates if a request can be removed and processed by the L2
cache, causing a decrease in queue size.
A complication to this is that the data ready and IQ select signals are passed
through a series of flops before they are evaluated to determine whether or not
there is a stall. The input signal to the L2 cache for data ready is
pcx_l2t_data_rdy_px1, while the input signal for IQ select is arb_iqsel_px2.
Table 6.1 shows that the input for IQ select arrives one cycle before being
evaluated, while the input for data ready arrives three cycles before.
Figure 6.4: Percentage of true assertions which have greater than 10 propositions in the
antecedent
Table 6.1: The temporal relationship between signals
Signal      Cycle t-3      Cycle t-2          Cycle t-1         Cycle t
IQ Select   –              –                  arb_iqsel_px2     arb_iqsel_px2_d1
Data Ready  data_rdy_px1   data_rdy_px1_fnl   data_rdy_px2_d1   data_rdy_px2_d1
The candidate assertions generated by GoldMine are shown below.
Candidate Assertions:
    @(posedge gclk)
    gm1: l2t_pcx_stall_pq==1 |=> ##1 l2t_pcx_stall_pq==1;
    @(posedge gclk)
    gm2: pcx_l2t_data_rdy_px1==1 ##2 arb_iqsel_px2==0 &&
         l2t_pcx_stall_pq==0 |=> ##1 l2t_pcx_stall_pq==0;
    @(posedge gclk)
    gm3: pcx_l2t_data_rdy_px1==1 ##2 arb_iqsel_px2==1 &&
         l2t_pcx_stall_pq==0 |=> ##1 l2t_pcx_stall_pq==0;
    @(posedge gclk)
    gm4: pcx_l2t_data_rdy_px1==0 ##2 l2t_pcx_stall_pq==0 |=>
         ##1 l2t_pcx_stall_pq==0;
Based on the behavior described, we can now determine the validity and useful-
ness of the given assertions. Assertion gm1 indicates that if there
was a stall in the previous cycle, there will also be a stall in the current cycle. This
is clearly a spurious assertion. If the queue is currently full, a request may be
processed, causing the queue to no longer be full. The simula-
tion trace evidently contained many instances where a stall was followed by a stall, leading
to a correlation. However, the circuit does not always behave in this way, meaning gm1
is false.
What assertion gm2 indicates is that if stall is currently not active and a request
is added, the pipeline will not become stalled. If stall is inactive, the size of the
queue is currently below capacity. Based on the inputs, the size of the queue must
increase since data ready is true and IQ select is false. This assertion will be true
most of the time, but if the queue is one below capacity, an added request will
cause it to become full. This will result in a stall. This means that gm2 is also
false.
Assertion gm3 states that if there is no stall and the queue size does not change,
the queue will remain unstalled. The queue size remains the same because data
ready is true, indicating that a request is added, but IQ select is active, meaning
that a request is also processed. If the queue is below capacity and stays the same
size, it will remain below capacity, meaning that there is no stall. Assertion gm3
is true.
In assertion gm4, if there is currently no stall and there are no more requests
added to the queue, the pipeline will remain unstalled. Since data ready is false,
the queue size must stay the same or decrease (if IQ select is active). This means
that if the queue is below capacity, it must remain below capacity. This means that
assertion gm4 is also true.
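The reasoning above can be replayed on a toy model. The following is a minimal sketch, assuming a queue that gains one entry when data ready is high and loses one when IQ select is high; the capacity of 4 is a hypothetical value, and the flop delays of Table 6.1 are ignored, so this is not the OpenSparc logic itself.

```python
from itertools import product

CAPACITY = 4  # hypothetical queue depth, chosen only for illustration

def next_size(size, data_rdy, iqsel):
    """Queue size after one cycle: one request added, one removed."""
    return max(0, min(CAPACITY, size + int(data_rdy) - int(iqsel)))

def holds(start_stall, next_stall, data_rdy=None, iqsel=None):
    """Check 'stall state now implies stall state next' over every queue size.

    An input left as None is unconstrained and takes both values.
    """
    def choices(v):
        return (0, 1) if v is None else (v,)
    for size, d, q in product(range(CAPACITY + 1), choices(data_rdy), choices(iqsel)):
        if (size == CAPACITY) != start_stall:
            continue  # antecedent does not match this starting state
        if (next_size(size, d, q) == CAPACITY) != next_stall:
            return False  # counter-example found
    return True

results = {
    "gm1": holds(True, True),
    "gm2": holds(False, False, data_rdy=1, iqsel=0),
    "gm3": holds(False, False, data_rdy=1, iqsel=1),
    "gm4": holds(False, False, data_rdy=0),
}
print(results)
```

In this simplified model, gm1 fails when a full queue has a request removed, and gm2 fails when the queue is one entry below capacity, which mirrors the argument in the text.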
These results show that the assertions that GoldMine generates are interesting
and complex, making them good choices for including in an RTL design. It also
shows that, while it can take humans a long time to reason about circuit behavior,
GoldMine is able to do it much more quickly and efficiently.
6.5 Evaluation of the Runtime and Memory Usage of
GoldMine
Our last experiment evaluates the performance of GoldMine. In this experiment,
we compare Rigel and OpenSparc modules in terms of runtime and memory. First,
we present a comparison of the characteristics of our test modules as shown in
Table 6.2. We will later show how each of these factors affects the runtime and
memory consumption.
Table 6.2: The characteristics of each tested module
Module                    Inputs   Outputs   Area
Rigel - Decode Stage      2195     79        32735
Rigel - Fetch Stage       458      6         4165
Rigel - Writeback Stage   963      3         269
OpenSparc - MMU           3393     16        66395
We will first look at the runtime of GoldMine. We use a simulation trace of
10,000 cycles as well as a trace containing 1,000,000 cycles. These tests are
performed on a 2.66GHz Intel Core 2 Quad CPU with 4GB RAM. Figure 6.5
shows the runtimes without formal verification. This figure shows that, even on
a common desktop processor, GoldMine is able to produce candidate assertions
in a very short time. GoldMine runs in just minutes for both the 10,000 and
1,000,000 cycle simulation trace. It can also be observed that the runtime is not
related to the circuit size. Instead, it is the number of inputs, outputs, and
cycles of simulation data that affect the runtime. This means that GoldMine
scales well with design size.
Figure 6.5: GoldMine runtime; no formal verification
The runtime is also evaluated for GoldMine when formal verification is used.
For this test, we use a cluster of four six-core AMD Opteron 8435 CPUs and
enable parallel formal verification of assertions. Figure 6.6 shows the runtimes when
formal verification is used. The runtime is expectedly much higher, but even for
the complex OpenSparc MMU module, the 10,000 cycle test completes in only
one hour and the 1,000,000 cycle test completes in just over 2 hours. One of the
largest factors influencing the runtime when formal verification is enabled is the
number of candidate assertions that are generated, since each one must be verified.
A solution for the reduction of runtime would be to limit the number of candidate
assertions produced. Though this may limit the number of true assertions gener-
ated, it may be a viable choice when runtime is limited.
Figure 6.6: GoldMine runtime; formal verification enabled
Our last performance experiment is to record the maximum memory usage of
GoldMine. This test is performed on the Intel Core 2 Quad CPU. Since formal
verification does not affect the memory usage of GoldMine, it is disabled in this
test. Figure 6.7 shows the results of the test. From this figure, it is clear that
GoldMine is very efficient in terms of memory usage. Even in the worst case,
GoldMine does not exceed 1GB of memory usage in these tests. It can also be
observed that the memory usage is again not related to the area of the circuit. The
memory usage is actually related to the size of the simulation trace that must be
stored in memory, meaning that both the number of inputs and simulation cycles
affect the memory usage. The memory usage is also affected by the size of the
decision tree data structure which, in the worst case, can be exponential with the
maximum height of the tree. If we limit the height of the tree, we do not have
to worry about the tree size ever becoming a problem. Since memory usage is
not related to the area or complexity of a circuit, GoldMine scales well in
memory usage as well as in runtime.
Figure 6.7: GoldMine maximum memory usage
CHAPTER 7
THE EVOLUTION OF GOLDMINE
GoldMine has evolved continuously since the original concept for the tool was
developed.
7.1 Shaping GoldMine: Early Changes to the
GoldMine Methodology
In the initial phase of GoldMine development, we used an FP-Growth algorithm.
This took an unreasonable time (>10 hours) for reaching rules with just 3 predi-
cates for the decode module. We therefore resorted to the decision tree algorithm
for our purposes. The decision tree is a very fast data mining algorithm that does
not suffer from an exponential runtime like the FP-Growth algorithm does.
In the first iteration of the Data Generator, we used directed test simulations.
This data was insufficient (approximately 15 tests of 1000 samples each), produc-
ing a very low hit rate. We then used random input vector generation on the
RTL for the target modules. Even when using only 10,000 samples of simulation
data, this drastically increased the hit rate as well as number of true assertions,
demonstrating that the type and amount of data can greatly affect the results of
GoldMine. For the writeback module, we achieved a 100 percent hit rate with this
step alone.
Another aspect that had been changed is the stopping criterion of the decision
tree splitting. Our initial experiments continued the splitting process beyond the
point where the minimum error reduction was reached. This process gave us an
extremely high number of candidate assertions (>80,000) with many duplicates
(289 out of 300 in one test). In the later stages, we elected to end the decision
tree splitting when error was numerically equal to “0”, i.e. at the point of 100%
confidence, since nothing can be gained past this point.
Originally, GoldMine only worked with combinational circuits, which are
interesting, but not very useful to the average verification engineer. The reason for this
is that pattern recognition algorithms used in data mining look for correlations
that hold true in all samples, which is consistent with combinational behavior
since outputs change immediately. However, in sequential circuits, outputs do not
change until the positive edge of the clock. This means that if a sample is taken
before the clock edge, the output will contain the value determined by the inputs
at the previous clock edge, and not the inputs at the current time. This means that
no relationship can be found since the current inputs have not influenced the out-
put yet. If a sample is taken after the clock edge, the inputs have already changed
from the values that determined the current output, meaning that there is still no
relationship that can be inferred from the samples. This problem can be solved
without having to change the data mining algorithm. The data is only sampled
once per positive clock edge since that is when the interesting behavior happens.
The exact time at which the signal is sampled depends on the type of signal. If
the signal is an input, it is sampled right before the positive clock edge, and if the
signal is an output, it is sampled right after the clock edge. This makes it seem
as if the inputs and outputs have changed at the same time and the data mining
algorithm is able to find relationships between the inputs and outputs.
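The effect of the sampling point can be seen on a single registered output. The sketch below assumes a hypothetical register that captures a AND b at each positive clock edge; the input values are made up for illustration.

```python
# Hypothetical (a, b) input values, one pair per clock cycle.
INPUTS = [(1, 1), (0, 1), (1, 1), (0, 0), (1, 0)]

def sample(inputs):
    """Return two sample sets for a register computing z <= a & b."""
    z = 0
    naive, aligned = [], []
    for a, b in inputs:
        naive.append((a, b, z))    # every signal sampled just before the edge
        z = a & b                  # the register updates at the positive edge
        aligned.append((a, b, z))  # inputs from before the edge, output from after
    return naive, aligned

def consistent(samples):
    """True if z == a & b holds in every sample, as a miner would require."""
    return all(z == (a & b) for a, b, z in samples)

naive, aligned = sample(INPUTS)
print(consistent(naive), consistent(aligned))
```

Only the aligned samples expose the relationship between the inputs and the output, which is why no change to the mining algorithm itself is needed.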
In the next phase of GoldMine, we added lightweight static analyzer informa-
tion that was specific to the domain, such as logic cone-of-influence generation.
Although this increased the hit rate only marginally, it increased the number of
true assertions significantly. This shows that the static analysis information was
very useful in helping A-Miner focus on the relevant neighborhood of variables to
generate candidate assertions.
7.2 Performance Enhancements
The decision tree algorithm is very quick, but the formal verification in GoldMine
can take a long time when there are many assertions to verify. By using a com-
mercial tool for formal verification instead of SMV, we were able to achieve a
significant speedup. We have also used parallelism to increase the speed of the
formal verification step. Since each assertion can be verified concurrently, several
formal verification threads can be used for a significant speedup. Because there
is some overhead in creating the model in formal verification, a small batch of
assertions is verified in each thread.
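The batching scheme can be sketched as follows. This is an illustration only: check stands in for a call to a formal verification tool, and the names verify_batch and verify_all, the batch size, and the worker count are all assumptions rather than part of GoldMine.

```python
from concurrent.futures import ThreadPoolExecutor

def batches(items, size):
    """Split a list into consecutive batches of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]

def verify_batch(batch, check):
    # In a real flow the formal model would be built once here and then
    # reused for every assertion in the batch, amortizing the setup cost.
    return [(assertion, check(assertion)) for assertion in batch]

def verify_all(assertions, check, batch_size=4, workers=3):
    """Verify batches concurrently and return the assertions that passed."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda b: verify_batch(b, check),
                           batches(assertions, batch_size))
        return [a for batch in results for a, ok in batch if ok]

# Stand-in check: pretend that even-numbered candidates are true.
print(verify_all(list(range(10)), lambda a: a % 2 == 0))
```

A larger batch size reduces the number of times the model is built, while a smaller one spreads the work more evenly across threads.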
Since memory conservation is important for large problems, we have ported
our code from Java to C++. Since Java has automatic (garbage-collected) memory
management, it is difficult to control the memory usage and it can be difficult
to debug memory leaks. Since C++ requires manual memory management, it is
easier to keep the memory usage low and controlled.
7.3 Improving the Core GoldMine Algorithms
In addition to the changes made to the core of GoldMine, we have also extended
the GoldMine tool in several ways. The first extension to GoldMine was the
addition of counter-example feedback. When the formal verification step determines
that a candidate assertion is false, it also produces a counter-example to prove
that the assertion can be violated. We can use this information to our advan-
tage and give feedback to the data mining engine. What we do is convert the
counter-example into a data sample, as if that sample were included in the orig-
inal simulation trace. This forces the data miner to reconsider the confidence of
this assertion. Since a counter-example is added, the confidence can no longer be
100%, meaning that the decision tree can continue to split. This method allows the
decision tree to continue to produce new candidate assertions until all assertions
are true. This methodology is discussed in detail in [47].
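The effect of adding a counter-example on confidence can be illustrated with a small sketch. The four-sample trace, the output z, and the counter-example below are hypothetical, and confidence is computed here simply as the fraction of matching samples that agree with the consequent.

```python
# Hypothetical simulation trace for inputs a, b, c and output z.
TRACE = [
    {"a": 0, "b": 0, "c": 0, "z": 0},
    {"a": 0, "b": 1, "c": 1, "z": 0},
    {"a": 1, "b": 0, "c": 1, "z": 0},
    {"a": 1, "b": 1, "c": 0, "z": 1},
]

def confidence(antecedent, consequent, samples):
    """Fraction of samples matching the antecedent that satisfy the consequent."""
    out, val = consequent
    hits = [s for s in samples if all(s[k] == v for k, v in antecedent.items())]
    if not hits:
        return 0.0
    return sum(s[out] == val for s in hits) / len(hits)

# The candidate (a = 0) => (z = 0) has 100% confidence on the trace alone.
before = confidence({"a": 0}, ("z", 0), TRACE)

# A counter-example from the formal verifier, appended as one more sample.
counterexample = {"a": 0, "b": 1, "c": 0, "z": 1}
after = confidence({"a": 0}, ("z", 0), TRACE + [counterexample])
print(before, after)
```

Once the confidence drops below 100%, the node is no longer a leaf, so the tree is free to split further on the remaining variables.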
We have also explored alternative data mining algorithms to the decision tree
based supervised learning algorithm detailed in Section 4.3. A coverage guided
algorithm for generation of high-quality assertions is described and evaluated in
Chapters 8, 9, and 10.
CHAPTER 8
MOTIVATION FOR A COVERAGE
GUIDED APPROACH
While the decision tree supervised learning algorithm produces excellent results
and provides an excellent jumping-off point for the A-Miner in GoldMine, this
data mining algorithm has several flaws.
8.1 Why Decision Tree Assertions Need Improvement
In addition to the lack of assertion quality awareness, the decision tree has other
shortcomings. Due to its faithfulness to a (binary) tree structure, it explores every
value of each splitting variable. An assertion generated at a leaf node will neces-
sarily have all the splitting variables of the previous levels of the tree. This leads
to assertions that are over-constrained, or contain too many propositions (variable,
value pairs) in the antecedent. Intermittent poor splitting choices during tree con-
struction can result in irrelevant variables being added in the assertions as well.
For instance, a decision tree would create the assertion (request ∧ we ∧ rd ∧
branch) =⇒ (gnt), where the dependencies on the write enable and read sig-
nals are coincidental, but not causal. The desired assertion would be (request)
=⇒ (gnt). 1 Over-constraining restricts behavior and reduces the input space
behavioral coverage of assertions. It also decreases the readability of the asser-
tions. Since individual decision tree assertions have low input space coverage, a
large number of assertions is required to cover the design behavior. An increase
in the number of assertions is an undesirable side effect, since it implies overhead,
whether in pre-Silicon runtimes or in post-Silicon cost.
A subjective ranking distribution by the designers of the Rigel [48] processor
for decision tree generated GoldMine assertions is shown in Figure 5.1. Rank 1
represents a trivial assertion that would not be used in verification of the design.
1GoldMine produces assertions using Linear Temporal Logic [9]. The proposition on the left-
hand side of the implication operator is the antecedent and the right-hand side is the consequent.
Rank 2 represents a somewhat interesting assertion that may be used for verifica-
tion. Rank 3 represents an assertion that captures subtle design intent and would
be likely to be used in verification. Assertions ranked at 4 are complex assertions
which were too difficult for a human to judge.
In Chapter 5 in Figure 5.1, we had a designer rank assertions from 1 to 3 with
1 being the worst and 3 being the best. The designers ranked many assertions at
2 instead of 3, due to the over-constraining and lack of succinctness. The result is
that there is a very small percentage of rank 3 assertions created by the decision
tree algorithm. Our solution to this problem was the development of a coverage
guided mining algorithm to replace the decision tree based algorithm.
8.2 Coverage Guided Mining
We present the coverage guided mining algorithm, which is intended to increase
the number of rank 3 assertions and decrease the number of rank 2 assertions
produced by GoldMine. This coverage guided association miner replaces the
decision tree in the A-Miner phase of the GoldMine algorithm. It uses a combination
of association rule learning, greedy set covering, and formal verification. In each
iteration of the coverage guided mining algorithm, the association rule learning
finds each assertion that has higher coverage than a specified minimum coverage.
In successive iterations, the minimum coverage for each assertion is lowered. This
guarantees that the highest coverage assertions are added to the candidate asser-
tion set in a greedy manner at every iteration. In addition, a formal verifier is used
to verify that candidate assertions added to the solution set are true.
Algorithms based on association rule learning are typically not scalable due to
their nature of finding all relations between all variables exhaustively. However,
in our algorithm, we constrain the solution space of the association learning by
considering only those candidates that fulfill a coverage criterion. We also require
that the candidates should be true as attested by formal verification. We also use a
heuristic of having minimal propositions in an antecedent for our greedy selection
of high-coverage candidate assertions. These restrictions sidestep the exhaustive
nature of the association learning and result in an efficient, scalable approach.
Our approach produces succinct assertions, with higher expressiveness per as-
sertion. This upgrades the value added by an assertion. Since the value added
by an assertion can be quantitatively expressed as input space coverage, this
algorithm iteratively refines the set of assertions until it maximizes the coverage
achieved by them. The coverage guided mining algorithm, therefore, converges
to a set of assertions that are few in number, but high in coverage. A graphical
representation of these two methods is shown in Figure 8.1.
Figure 8.1: Comparison between assertions in decision tree and coverage guided
mining over time for a design output. The dots represent behavior points in the design.
Decision tree generated assertions are unaware of behavior coverage and do not
optimize the design points covered. Coverage guided mining is coverage conscious when
generating assertions and greedily picks the highest coverage ones.
Our experimental results are shown on the OpenSparc T2 [46], OR1200 [49],
SpaceWire [50], ITC benchmarks [51], and Rigel [48] processor RTL modules.
We show that coverage guided association mining performs competitively against
the decision tree method in terms of overall input space coverage and far better
than decision trees with respect to input space coverage per assertion, number of
propositions per assertion, and subjective designer rankings.
CHAPTER 9
THE COVERAGE GUIDED MINING
ALGORITHM
9.1 Background Concepts
Association rule mining [41] is a data mining method that attempts to generate
all possible correlations between items. Though this algorithm has an exponential
complexity in the worst case, high efficiency is achieved by applying constraints
and using pruning techniques.
The set covering problem refers to a case where there are many sets that each
cover several elements and one wishes to find the minimal number of sets that
cover all possible elements. The complexity of finding the optimally minimal set
cover is NP-Complete [52]. However, there are many approximation algorithms
which can find a near-optimal solution efficiently. The greedy set covering
algorithm works by repeatedly choosing the set that covers the largest number of
uncovered elements until all elements have been covered.
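The greedy heuristic can be written in a few lines. This is a minimal sketch of the textbook greedy set covering algorithm; the sets A to D are made up for illustration.

```python
def greedy_set_cover(universe, sets):
    """Pick sets one at a time, always taking the one covering the most
    still-uncovered elements. `sets` maps a name to a set of elements."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        best = max(sets, key=lambda name: len(sets[name] & uncovered))
        if not sets[best] & uncovered:
            break  # the remaining elements cannot be covered by any set
        chosen.append(best)
        uncovered -= sets[best]
    return chosen

# Hypothetical example: six elements and four candidate sets.
SETS = {"A": {1, 2, 3, 4}, "B": {3, 4, 5}, "C": {5, 6}, "D": {1, 6}}
print(greedy_set_cover(range(1, 7), SETS))
```

The result is not guaranteed to be the minimum cover in general, but each step is cheap and the cover is close to optimal in practice.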
Gain is a data mining concept that refers to the value of adding some rule to the
solution set of rules. In data mining, we only want to add a rule to our solution set
if its gain is higher than that of any other potential rule. This concept fits well with our
concept of input space coverage since we can define a notion of coverage gain.
The coverage gain of a rule (assertion) refers to the change in total coverage of a
set given that the rule is added to that set. For example, if a set of assertions has a
total input space coverage of 75% and an assertion with a coverage gain of 12.5%
is added, the new total coverage of that set will be 87.5%.
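Coverage gain can be computed directly from the truth table entries that each antecedent covers. The sketch below assumes three inputs named a, b, and c and represents an antecedent as a dictionary of variable values; both choices are illustrative.

```python
from itertools import product

VARS = ("a", "b", "c")  # hypothetical input variables

def covered(antecedent):
    """Truth table entries (rows over VARS) that match an antecedent."""
    return {row for row in product((0, 1), repeat=len(VARS))
            if all(row[VARS.index(v)] == val for v, val in antecedent.items())}

def coverage(antecedents):
    """Fraction of the input space covered by a set of antecedents."""
    return len(set().union(*map(covered, antecedents))) / 2 ** len(VARS)

def coverage_gain(solution, antecedent):
    """Change in total coverage if the antecedent joins the solution set."""
    return coverage(solution + [antecedent]) - coverage(solution)

solution = [{"a": 0}, {"b": 0}]      # together these cover 6 of 8 entries
new = {"a": 1, "b": 1, "c": 1}       # covers the single entry (1, 1, 1)
print(coverage(solution), coverage_gain(solution, new), coverage(solution + [new]))
```

Because the gain counts only entries that are not already covered, the new total is always the old total plus the gain.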
Typically, an association mining algorithm will try to exhaustively produce all
possible rules relating all input variables to all output variables. To restrict the
number of rules, we apply several constraints. Our first constraint, as in [1], is
that only rules with 100% confidence can be considered as candidate assertions
for association rule mining. We now include coverage feedback as a constraint.
We impose a minimum coverage gain to drastically limit the number of candidate
assertions. We then gradually relax this constraint until we have reached a desired
coverage value. The greedy set covering algorithm will always choose the highest
coverage assertions in each iteration.
As defined in Section 4.6, input space (or truth table) coverage is a metric
which has been adopted for the purpose of evaluating a set of assertions in relation
to some output. Because no alternative metric exists for evaluating the quality of
an assertion, we use this definition for the coverage guided mining algorithm. It
should be noted that if coverage is mentioned, it is assumed that it is input space
coverage.
9.2 Algorithm Explanation
We run this algorithm to generate assertions for a specified output in a design, z.
The assertions will be in the format where a set of propositions describing input
variables and their respectively assigned values imply that the output, z, will be a
certain value.
We define As as the solution set of assertions and c(As) as the expected total
input space coverage of As. We define g(As, A′s) as the input space coverage
gain between two sets of assertions, where A′s = As ∪ a and a is an assertion.
We also define gmin as the minimum coverage gain. The minimum coverage gain
ensures that any assertion that is mined must raise the total coverage of As by
at least gmin. We set a minimum coverage gain threshold gthreshold and a maximum
total coverage threshold cthreshold, either of which results in algorithm
termination when reached. Our goal is to maximize the expected total input space
coverage c(As) by maximizing g(As, A′s) in each iteration while minimizing the
total number of assertions and the number of propositions in the antecedent of
each assertion.
The basic flow of the algorithm is shown in Figure 9.1. As the algorithm is
explained, we apply it to the simulation trace in Table 9.1. We set the maximum
total coverage threshold to 99% and the minimum coverage gain threshold to 1%.
The algorithm starts by initializing gmin = 50%, As = {}, and c(As) = 0%.
We know that at least one proposition must be in the antecedent of the assertion,
which means that the maximum coverage gain is 50%. We do not consider
assertions without any propositions in the antecedent since those assertions are
trivial.
Figure 9.1: Coverage guided association mining algorithm
In the next step, gen_candidates, described in Algorithm 2, is invoked. In
gen_candidates, P refers to a set of {input variable, value}
pairs representing the antecedent of a potential assertion a. F refers to the set
of {input variable, value} pairs not in P, since we do not want to add the same
{input variable, value} pair to an antecedent twice. E refers to the simulation
trace and is represented as a set of signal values at each cycle. In our example,
F = {{a, 0}, {a, 1}, {b, 0}, {b, 1}, {c, 0}, {c, 1}}, P = {}, and E is the data in
Table 9.1.
Table 9.1: The dataset for the example function z = ((a|¬c)&b)
a   b   c   z
0   0   0   0
0   1   1   0
1   0   1   0
1   1   0   1
Essentially, gen_candidates recursively adds {input variable, value}
pairs to P. If all pairs in P are 100% correlated with the output pair {z, 0} or
{z, 1} in all cycles of the simulation trace represented by E, a candidate assertion
is generated based on that correlation and the algorithm returns. The algorithm
also returns when the coverage gain falls below the minimum coverage gain
because adding more propositions to the antecedent can only decrease the coverage
gain.
Algorithm 2 Association Miner
gen_candidates(F, P, E)
1: for each {input variable, value} pair in F, fi do
2:   if g(As, As ∪ assertion(P ∪ fi =⇒ {z, X})) ≥ gmin then
3:     if ∀ej ∈ E, P ∪ fi =⇒ {z, 0} then
4:       Ac = Ac ∪ assertion(P ∪ fi =⇒ {z, 0})
5:     else if ∀ej ∈ E, P ∪ fi =⇒ {z, 1} then
6:       Ac = Ac ∪ assertion(P ∪ fi =⇒ {z, 1})
7:     else
8:       gen_candidates(F − fi, P ∪ fi, E)
9:     end if
10:   end if
11: end for
In line 1, fi = {a, 0}. The coverage gain of the assertion (a = 0) =⇒ (z =
X) 1 is calculated to be 50% in line 2, which is equal to gmin. At line 3, we can see
that for the data in every cycle, ej, (a = 0) =⇒ (z = 0), which means that there
is a correlation between a = 0 and z = 0 which indicates a candidate assertion.
The candidate assertion a1 : (a = 0) =⇒ (z = 0) is added to Ac, the set of
candidate assertions, in line 4.
Now, back at line 1, fi = {a, 1}. Even though the coverage gain of assertion
(a = 1) =⇒ (z = X) is also 50%, neither the rule (a = 1) =⇒ (z = 0)
nor (a = 1) =⇒ (z = 1) is true for each cycle of data, ej. This means that the
conditions in lines 3 and 5 are not satisfied. The algorithm recurses at line 8 with
P = {{a, 1}} and F = {{b, 0}, {b, 1}, {c, 0}, {c, 1}}.
Now the coverage gains of assertions (a = 1 ∧ b = 0) =⇒ (z = X),
(a = 1 ∧ b = 1) =⇒ (z = X), (a = 1 ∧ c = 0) =⇒ (z = X), and
(a = 1 ∧ c = 1) =⇒ (z = X) are each 25% since each has two propositions in
the antecedent. The minimum coverage gain is never satisfied in line 2, and the
algorithm returns.
The algorithm is continued from line 1 for the remaining {input variable, value}
pairs resulting in the candidates a2 : (b = 0) =⇒ (z = 0) and a3 : (c = 1) =⇒
(z = 0) being added to Ac. The assertions in Ac are sorted by the number of
propositions to keep the number of propositions per assertion to a minimum. In
the example, the list remains unchanged since each candidate has the same number
of propositions.
1X refers to a “don’t care” value since the output does not affect the input space coverage
Algorithm 3 recalibrate_add
recalibrate_add(Ac, As)
1: for all a ∈ Ac do
2:   if g(As, As ∪ a) ≥ gmin then
3:     As = As ∪ a
4:   end if
5: end for
In the next step, recalibrate_add adds candidate assertions with coverage gain
greater than or equal to gmin to the solution set as shown in Algorithm 3. Because
coverage gain, g(As, A′s), is relative to the solution set As, as soon as the solution
set changes, the coverage gain of all assertions must be recalculated based on
the new solution set. For this reason, even though all assertions in Ac must have
coverage gain greater than or equal to gmin with respect to As before this
function is called, the coverage gain of any assertion may decrease below gmin as
other assertions are added to As. Because of this, Ac must be recalibrated with
regard to the coverage gain of each assertion before an assertion may be added to As.
In our example, a1 is added to the solution set, As, since As remains the same as
before the function was called. After adding that candidate to the solution set, the
coverage gain of the next candidate, a2, is recalculated based on the new As. Since
As contains assertion a1 with the antecedent (a = 0), it should be noted that the
truth table entries where a = 0 and b = 0 are already covered. Therefore, the
assertion a2 with antecedent (b = 0) can only cover the truth table entries where
a = 1 and b = 0, resulting in decreased coverage gain of only 25%. By the same
logic, the coverage gain of assertion a3 with antecedent (c = 1) is also reduced
to 25%. Since both candidates have coverage gain less than gmin, they are both
discarded.
In the final step of the first iteration, Ac is cleared and the minimum coverage
gain, gmin, is reduced by half. In the example, gmin is reduced from 50% to 25%,
which is still greater than the minimum gain threshold. The total coverage of As
is 50%, which is less than the maximum total coverage threshold, cthreshold. Since
neither threshold is passed, the algorithm continues to the second iteration.
In the second iteration, gen_candidates is performed again with the reduced
gmin. This generates the following candidate assertions:
a4 : (a = 1 ∧ b = 0) =⇒ (z = 0), a5 : (a = 1 ∧ b = 1) =⇒ (z = 1),
a6 : (a = 1 ∧ c = 0) =⇒ (z = 1), a7 : (a = 1 ∧ c = 1) =⇒ (z = 0),
a8 : (b = 0) =⇒ (z = 0), and a9 : (c = 1) =⇒ (z = 0). These candidate
assertions are added to Ac and then sorted by the number of propositions per
assertion, with a resulting order of a8, a9, a4, a5, a6, a7.
Assertion a8 : (b = 0) =⇒ (z = 0) is added to As. The coverage gain of the
remaining candidate assertions is recalculated, causing a6, a7, and a9 to each
drop to 12.5% and a4 to drop to 0%, since a8 already covers every truth table
entry where a = 1 and b = 0. This leaves only the assertion
a5 : (a = 1 ∧ b = 1) =⇒ (z = 1), which remains at 25% and is also added to As.
It should now be noted that the expected total input space coverage of As has
reached 100%, which is above the maximum total coverage threshold. This means that
the algorithm can exit, producing the following assertions: a1 : (a = 0) =⇒
(z = 0), a8 : (b = 0) =⇒ (z = 0), and a5 : (a = 1 ∧ b = 1) =⇒ (z = 1).
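The walkthrough above can be reproduced with a compact sketch of the two routines and the outer loop. This is an illustrative reading of Algorithms 2 and 3 on the four-row example trace, not the GoldMine implementation: the function mine, the tuple encoding of antecedents, the duplicate check, and the requirement that at least one trace row match an antecedent are all assumptions made for the example.

```python
from itertools import product

VARS = ("a", "b", "c")
TRACE = [(0, 0, 0, 0), (0, 1, 1, 0), (1, 0, 1, 0), (1, 1, 0, 1)]  # rows: a, b, c, z
PAIRS = tuple((v, val) for v in VARS for val in (0, 1))

def matches(row, antecedent):
    return all(row[VARS.index(v)] == val for v, val in antecedent)

def covered(antecedent):
    return {r for r in product((0, 1), repeat=len(VARS)) if matches(r, antecedent)}

def total_coverage(solution):
    return len(set().union(*(covered(p) for p, _ in solution))) / 2 ** len(VARS)

def gain(solution, antecedent):
    already = set().union(*(covered(p) for p, _ in solution))
    return len(covered(antecedent) - already) / 2 ** len(VARS)

def gen_candidates(free, prefix, solution, g_min, found):
    for pair in free:
        antecedent = tuple(sorted(prefix + (pair,)))
        if gain(solution, antecedent) < g_min:
            continue  # more propositions can only lower the gain
        outputs = {row[-1] for row in TRACE if matches(row, antecedent)}
        if len(outputs) == 1:  # 100% correlated with one output value
            candidate = (antecedent, outputs.pop())
            if candidate not in found:
                found.append(candidate)
        else:
            rest = tuple(p for p in free if p != pair)
            gen_candidates(rest, antecedent, solution, g_min, found)

def recalibrate_add(candidates, solution, g_min):
    for candidate in candidates:
        # The gain is recomputed against the solution set as it grows.
        if gain(solution, candidate[0]) >= g_min:
            solution.append(candidate)

def mine(g_threshold=0.01, c_threshold=0.99):
    solution, g_min = [], 0.5
    while g_min >= g_threshold:
        candidates = []
        gen_candidates(PAIRS, (), solution, g_min, candidates)
        candidates.sort(key=lambda c: len(c[0]))  # fewest propositions first
        recalibrate_add(candidates, solution, g_min)
        if total_coverage(solution) >= c_threshold:
            break
        g_min /= 2
    return solution

for antecedent, z in mine():
    print(antecedent, "=> z =", z)
```

On this trace the sketch stops after two iterations, once the expected coverage of the solution set reaches the coverage threshold.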
It should be noted that this algorithm can be applied to temporal assertions
much like in the decision tree algorithm [1]. For temporal assertions, the circuit
is unrolled a user-specified number of times. The number of times the circuit is
unrolled is known as the lookback amount. A separate set of inputs is created for
each clock cycle that the circuit is unrolled where each new set of inputs represents
the value of that signal relative to the current time. For example, a[t] represents
signal a in the current cycle and a[t− 1] represents the value of a in the previous
cycle. With this data transformation, the data mining algorithm can treat the newly
added signals as separate from the signals in the current time and use the same
algorithm as is used on combinational signals.
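The data transformation for temporal assertions can be sketched as follows. The function name unroll, the dictionary-per-cycle trace format, and the example values are assumptions for illustration; only the a[t] and a[t-1] naming follows the text.

```python
def unroll(trace, inputs, output, lookback):
    """Turn a per-cycle trace into rows with one column per input and delay.

    `trace` is a list of dicts, one per clock cycle. Each returned row holds
    the inputs for cycles t, t-1, ..., t-lookback and the output at cycle t.
    """
    rows = []
    for t in range(lookback, len(trace)):
        row = {}
        for d in range(lookback + 1):
            label = "[t]" if d == 0 else f"[t-{d}]"
            for name in inputs:
                row[name + label] = trace[t - d][name]
        row[output + "[t]"] = trace[t][output]
        rows.append(row)
    return rows

# Hypothetical trace in which z follows a with a one-cycle delay.
trace = [{"a": 0, "z": 0}, {"a": 1, "z": 0}, {"a": 0, "z": 1}]
rows = unroll(trace, inputs=["a"], output="z", lookback=1)
print(rows)
```

After the transformation, a[t-1] is just another column, so a miner that only understands combinational data can still find a rule relating a[t-1] to z[t].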
9.3 Integration of Formal Verification
In our greedy set covering approach, we only choose candidate assertions based
on coverage. Because these candidate assertions are only necessarily true with
respect to a simulation trace, it is possible that a spurious assertion may be added
to the solution set. Additionally, adding this spurious assertion to the solution set
will prevent true assertions that cover the same input space from being added to
the solution set, which adversely affects overall coverage.
Consider the example presented in Section 9.2. While a5 and a8 are true, a1 is
not. Even though the expected input space coverage of the solution set is 100%,
the actual coverage is reduced to 75% since a1 is untrue. We want to be able
to check whether any assertions are true before ever adding them to the solution
set.
Algorithm 4 recalibrate_add with Formal Verification
recalibrate_add(Ac, As)
1: for all a ∈ Ac do
2:   if g(As, As ∪ a) ≥ gmin then
3:     if FormalVerify(a) == True then
4:       As = As ∪ a
5:     end if
6:   end if
7: end for
The solution to this problem is to integrate the formal verifier into the algorithm
to validate candidate assertion choice. We modify the recalibrate_add function to
include a formal verification check as shown in Algorithm 4. After the association
rule miner produces the set of candidate assertions, the formal verifier is used to
prune the false candidates while retaining the true assertions. This guarantees
that any assertion that is added to the solution set is going to be true. If we use
this modified algorithm on our example presented in Section 9.2, we
produce the assertions (b = 0) =⇒ (z = 0), (a = 1 ∧ b = 1) =⇒ (z = 1),
(b = 1 ∧ c = 0) =⇒ (z = 1), and (a = 0 ∧ c = 1) =⇒ (z = 0), which results
in 100% input space coverage.
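The coverage figures for the example can be checked with a small sketch in which an exhaustive truth table check of z = ((a|¬c)&b) stands in for the formal verifier. The function names and the dictionary encoding of antecedents are assumptions; a real flow would call a model checker instead.

```python
from itertools import product

ROWS = list(product((0, 1), repeat=3))  # every (a, b, c) input combination

def z_true(a, b, c):
    """The example function z = ((a | not c) & b)."""
    return int((a or not c) and b)

def matches(row, antecedent):
    return all(row["abc".index(v)] == val for v, val in antecedent.items())

def formally_true(antecedent, z):
    """Stand-in for formal verification: check every matching input."""
    return all(z_true(*r) == z for r in ROWS if matches(r, antecedent))

def actual_coverage(assertions):
    """Input space coverage counting only assertions that are really true."""
    rows = {r for ant, z in assertions if formally_true(ant, z)
            for r in ROWS if matches(r, ant)}
    return len(rows) / len(ROWS)

unverified = [({"a": 0}, 0), ({"b": 0}, 0), ({"a": 1, "b": 1}, 1)]
verified = [({"b": 0}, 0), ({"a": 1, "b": 1}, 1),
            ({"b": 1, "c": 0}, 1), ({"a": 0, "c": 1}, 0)]
print(actual_coverage(unverified), actual_coverage(verified))
```

The spurious assertion (a = 0) =⇒ (z = 0) is refuted by the input a = 0, b = 1, c = 0, which never appears in the four-row simulation trace.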
It should be noted that the use of formal verification does present a scalability
concern. Large designs can result in a state space explosion, making verification
slow or even impossible. Though formal verification does have these disadvan-
tages, it does not mean that the coverage guided algorithm is crippled by them. To
date, we have discovered only one module that was so large that it was not possi-
ble to verify (OpenSparc L2 cache). In this case, there are several options. One
option is to individually verify the submodules of the limiting module. Another
option is to disable formal verification of candidate assertions. The candidate as-
sertions can then be simulated and manually checked by humans to determine if
they are valid.
9.3.1 Scalability
For $N$ input variables in a given simulation trace, searching through the space of
all antecedents ($3^N$) is not scalable. In our algorithm, however, the minimum
coverage gain helps guide and focus our antecedent search on important assertions.
By definition of coverage gain, an assertion with $k$ propositions in its antecedent
covers at most $\frac{1}{2^k}$ of the whole input space. In general, the number of
antecedents with $k$ propositions is $2^k \binom{N}{k}$ and their coverage gains are
at most $\frac{1}{2^k}$. Thus, if the minimum coverage gain is $\frac{1}{2^k}$, the
maximum number of possible antecedents in the search space is $O((2N)^k)$. For a
fixed $k$, each iteration runs in polynomial time in terms of $N$. In our algorithm,
we iteratively increase $k$ by 1, decreasing the minimum coverage gain $g_{min}$,
until that minimum coverage gain threshold, $g_{threshold}$, is reached. The maximum
iteration of $k$, $k_{max}$, is defined as the iteration when $g_{threshold}$ is
reached. This helps to limit the search space. The algorithm only increases the
search space if necessary. This results in the overall complexity of the algorithm
being $O((2N)^{k_{max}})$, which is polynomial for a fixed $g_{threshold}$.
Moreover, because of the search space pruning, the actual number of antecedents
searched in practice is much smaller than this theoretical bound.
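To make the bound concrete, the sketch below counts the antecedents with exactly
k propositions, 2^k * C(N, k), and compares the count with (2N)^k. It also derives
k_max from a coverage gain threshold, assuming that k_max is the largest k for
which 1/2^k is still at least the threshold. That reading of the stopping rule and
all function names are assumptions made for illustration only.

```python
from math import comb


def antecedents_with_k(n, k):
    """Number of antecedents with exactly k propositions over n binary inputs."""
    return (2 ** k) * comb(n, k)


def all_antecedents(n):
    """Total antecedents for k = 0..n; equals 3**n by the binomial theorem."""
    return sum(antecedents_with_k(n, k) for k in range(n + 1))


def k_max_for_threshold(g_threshold):
    """Largest k with 1 / 2**k >= g_threshold (assumed stopping rule)."""
    k = 0
    while 1.0 / 2 ** (k + 1) >= g_threshold:
        k += 1
    return k


if __name__ == "__main__":
    n = 27  # input bits of ITC b10 according to Table 10.1
    for k in range(1, 4):
        # exact count next to the (2N)^k upper bound
        print(k, antecedents_with_k(n, k), (2 * n) ** k)
    print(k_max_for_threshold(0.002))  # 8, since 1/256 >= 0.002 > 1/512
```

Summing the exact counts over all k reproduces the 3^N size of the unrestricted
search space, while capping k at k_max keeps the count polynomial in N.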
Our algorithm’s scalability is only restricted by formal verification. Although
formal verification technology is sensitive to state space, we find that in prac-
tice, we are able to effectively verify many modules of large designs, like the
OpenSparc MMU. So far, the only module that was too large to verify is the
OpenSparc L2 cache. The reason for this is that the L2 cache contains very many
RAM elements, which are difficult for the formal verifier to model. In these in-
frequent cases, there are several options. One option is to individually verify the
submodules of the limiting module. Another option is to disable formal verifi-
cation of candidate assertions. The candidate assertions can then be simulated to
determine if they are valid. The rest of the algorithm will proceed in the same
manner.
CHAPTER 10
A COMPARISON BETWEEN THE
COVERAGE GUIDED AND DECISION
TREE APPROACHES IN GOLDMINE
We compare the decision tree and coverage guided methods on multiple designs.
The designs used for testing include fetch stage and wb stage from Rigel [48],
b10, b13, and b15 from the International Test Conference Benchmark Suite [51],
b100, b101, b102, and b103 from the OpenRisc1200 CPU [49], and Transmitter,
Receiver, and SPW FSM from the European Space Agency SpaceWire codec [50].
We have also included results for the OpenSparc T2, which is an open source,
industrial size design. The number of input bits, the number of outputs, and the
area of each module are listed in Table 10.1.
Table 10.1: Characteristics of each module used for experiments

Module                     Inputs (bits)   Outputs   Area (µm²)
OR1200 - b100                        122         9          788
OR1200 - b101                        163        11         1178
OR1200 - b102                        234         9         1223
OR1200 - b103                        596         9         3324
Rigel - fetch stage                  458         6         4165
Rigel - wb stage                     963         3          269
Spacewire - SPW FSM                   46         7          342
Spacewire - Receiver                  75        15          979
Spacewire - Transmitter               96         5          896
ITC - b10                             27         2          282
ITC - b13                             55         6          720
ITC - b15                            534         4         9947
OpenSparc - MMU                     3393        16        66395

All tests were run on an Intel Core 2 Q6600 with 4GB of RAM. Each simulation
trace contains 10,000 cycles of data. The parameters are configured such that the
minimum support is set to 0.1%, the minimum coverage gain threshold is 0.2%,
and the coverage threshold is set to 99%.
10.1 Input Space Coverage as a Function of Iterations
In the first experiment, we show the number of iterations the algorithm takes to
converge. The results for this experiment are taken from the OR1200 data cache
controller module. The results are shown in Figure 10.1. It is clear that there is a
logarithmic increase in input space coverage over the iterations, since the minimum
coverage gain is decreased at each iteration.
Figure 10.1: Graph showing the number of iterations taken for each design to reach
100% input space coverage using the coverage guided mining algorithm
10.2 Runtime and Memory Requirements of Our
Algorithm
We applied the algorithm to several outputs from the OR1200 data cache con-
troller. For runtime, we measured the time from when the algorithm starts to
when it exits, as defined in Figure 9.1. Formal verification is enabled in
this test. To record the maximum memory usage, we used the Massif tool in
Valgrind [53]. The runtime is shown in Figure 10.2 and the maximum memory
usage is shown in Figure 10.3.
Though the runtime of the coverage guided mining algorithm is not as fast as
the decision tree (as shown in Figure 6.6), the tool is still very scalable, even
with formal verification enabled. If runtime is a concern, the formal verification
can be disabled. This produces assertions much more quickly although there will
be no feedback on the validity of the candidate assertions. Maximum memory
usage is also very low. This is due to memory usage scaling with the size of the
simulation trace (inputs × number of cycles). If a bigger simulation trace is used,
the maximum memory usage will increase linearly with the number of cycles.
Figure 10.2: The runtime of the coverage guided mining method. The highly complex
OpenSparc MMU module completes in a total of five hours.
10.3 Comparison of Input Space Coverage
We compare the total input space coverage of the assertions generated by the cov-
erage guided and decision tree algorithms. The input space coverage of a primary
output is defined as the sum of the input space coverage of each assertion gen-
erated with respect to that primary output. The average input space coverage is
calculated as an average of the input space coverage of each primary output in
the design. The results are shown in Figure 10.4. In every module, the coverage
guided algorithm produces an input space coverage comparable to the decision
tree method. In many cases, the coverage guided algorithm outperforms the deci-
sion tree algorithm. This indicates that in those tests, the decision tree made poor
splitting decisions while the coverage guided algorithm did not suffer from the
same problem.
Figure 10.3: The maximum memory usage of coverage guided mining. The memory
usage of this algorithm is negligible.
10.4 Comparison of Succinctness of Assertions
Since a primary intent of the coverage guided mining algorithm is to improve as-
sertion quality, we compare the average number of propositions in the antecedent
between the two algorithms. A low number of propositions in the antecedent indi-
cates a high input space coverage and also means that the assertion is more concise
and thus easier to read by a human. The results of the test are shown in Figure
10.5. These results show that the coverage guided mining algorithm produces a
lower average number of propositions in every module tested.
10.5 Comparison of Conciseness of Generated
Assertions
In this experiment, the total number of assertions generated for all primary out-
puts of each design is recorded for each algorithm. A lower number of assertions
in the final set when the input space coverage is the same indicates that the set of
assertions will occupy less time and area overhead for synthesis as well as simu-
lation. The results are in Figure 10.6. The set covering technique in the coverage
guided mining algorithm outperforms the decision tree. For the b10 module in
particular, the decision tree generates almost ten times more assertions than the
coverage guided method even though the coverage guided method has a higher
input space coverage. It should be noted that while the coverage guided method
generates more assertions for the SpaceWire modules (SPW FSM, Transmitter,
Receiver), it also achieves a significantly higher input space coverage.
Figure 10.4: Input space coverage comparison between the coverage guided mining and
decision tree algorithms
10.6 Comparison of Information per Unit: Average
Input Space Coverage per Assertion
It is interesting to see what the average input space coverage per assertion is.
This metric is based on the total input space coverage divided by the number of
assertions in the set. The results in Figure 10.7 show that the coverage guided
algorithm produces higher coverage assertions than the decision tree method.
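The coverage metrics used in this chapter reduce to simple arithmetic over
per-assertion coverage values. The sketch below is a minimal illustration. The
output names and coverage numbers are made up, and taking the total coverage over
all primary outputs of a design is an assumption about how the per-assertion
average is formed.

```python
def output_coverage(assertion_coverages):
    """Input space coverage of one primary output: sum over its assertions."""
    return sum(assertion_coverages)


def average_output_coverage(design):
    """Average of the per-output input space coverage over all primary outputs."""
    return sum(output_coverage(c) for c in design.values()) / len(design)


def coverage_per_assertion(design):
    """Total input space coverage divided by the number of assertions."""
    total = sum(output_coverage(c) for c in design.values())
    count = sum(len(c) for c in design.values())
    return total / count


if __name__ == "__main__":
    # Hypothetical design: output name -> coverage of each of its assertions.
    design = {"out0": [0.5, 0.25, 0.125], "out1": [0.5, 0.5]}
    print(average_output_coverage(design))  # 0.9375
    print(coverage_per_assertion(design))   # 0.375
```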
10.7 Comparison of Number of Assertions Triggered
in Directed Tests
In this experiment, we produce a set of assertions for the fetch stage and wb stage
of Rigel. We then run the directed test suite created by the designers to deter-
mine how many assertions are triggered. If an assertion is triggered, it indicates
that the assertion is checking behavior that would be likely to occur in a realistic
environment. The results of this test are shown in Figure 10.8.
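As a rough illustration of this measurement, the sketch below treats an assertion
as triggered when its antecedent holds in at least one cycle of a test trace and
reports the percentage of triggered assertions. The trace, the signal names, and
the restriction to non-temporal antecedents are assumptions made for the example.

```python
def is_triggered(antecedent, trace):
    """True if the antecedent holds in at least one cycle of the trace."""
    return any(
        all(cycle.get(s) == v for s, v in antecedent.items())
        for cycle in trace
    )


def percent_triggered(antecedents, trace):
    """Percentage of assertions whose antecedent is triggered by the trace."""
    hit = sum(is_triggered(a, trace) for a in antecedents)
    return 100.0 * hit / len(antecedents)


if __name__ == "__main__":
    # Hypothetical three-cycle trace over the signals "req" and "stall".
    trace = [
        {"req": 0, "stall": 0},
        {"req": 1, "stall": 0},
        {"req": 1, "stall": 0},
    ]
    antecedents = [
        {"req": 1},
        {"req": 1, "stall": 1},
        {"stall": 0},
        {"req": 0, "stall": 1},
    ]
    print(percent_triggered(antecedents, trace))  # 50.0
```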
Figure 10.5: Comparison of the average number of propositions per assertion between
each algorithm. The coverage guided mining method’s assertions have fewer
propositions, implying that they are concise and expressive.
10.8 The Final Test: Subjective Designer Rankings
For this experiment, we generated assertions for the fetch stage and wb stage of
Rigel and then asked a designer to rank a set of assertions generated by the deci-
sion tree method and a set generated by the coverage guided mining method. The
designer was not informed of the difference between the two sets. The rankings
were assigned from 1 to 3 as described below.
1. Trivial assertion that the designer would not write
2. Designer would write the assertion
3. Designer would write, captures subtle design intent
The results in Figure 10.9 show that the coverage guided algorithm produces
a much higher percentage of rank 3 assertions than the decision tree algorithm.
Any assertions that were good, but included more propositions in the antecedent
than necessary, were reduced from a rank 3 to a rank 2, which was the case for
many decision tree assertions. Overall, the designer commented that he would
use the set of assertions generated by the coverage guided method over the
assertions generated by the decision tree method.
Figure 10.6: Comparison of total number of assertions generated using each algorithm.
Coverage guided mining often produces a much smaller set of assertions while retaining
high input space coverage.
Figure 10.7: Comparison of the average input space coverage per assertion using each
algorithm. High input space coverage shows more information per assertion. In some
cases, the coverage guided mining algorithm assertions have average coverage per
assertion up to 20-30% more than the decision tree algorithm.
Figure 10.8: Comparison of both algorithms in terms of the percentage of assertions
triggered in the Rigel directed test suite. Assertions generated by coverage guided
mining are triggered at least once, meaning that they are more likely to be triggered in a
realistic environment than those generated by the decision tree algorithm.
Figure 10.9: Subjective ranking by a designer of the set of assertions generated by each
algorithm. All datapath assertions were considered a rank 1 by this designer because he
did not consider them valuable. The coverage guided mining algorithm produces a
significantly higher percentage of assertions which are at rank 3, which was the original
motivation of the technique.
CHAPTER 11
RESOURCES
This chapter contains resources on obtaining and using GoldMine.
11.1 Obtaining GoldMine
Currently, GoldMine is only available within the University of Illinois. However,
the GoldMine binary will soon be available for research purposes at
http://faculty.ece.illinois.edu/shobhav/
A subversion repository is maintained on the Coordinated Science Laboratory
AFS network. This repository can be downloaded using the command:
svn co file:///afs/crhc.illinois.edu/project/goldmine/common/SVNROOT/goldmine
The repository trunk is organized as follows. The java directory contains the
source for the original Java implementation of GoldMine. This Java version is
now deprecated and has been replaced with a C++ implementation. The cpp di-
rectory contains the C++ implementation of both the decision tree based (src) and
coverage guided mining (cgm) versions of GoldMine. The ruby folder contains
the various scripts that are used in conjunction with GoldMine. The tex directory
contains the LaTeX source files for various conference papers and articles. The
public directory contains the current distributable package of GoldMine.
11.2 Using the Decision Tree Based GoldMine
Implementation
To quickly see GoldMine in action, there are some examples included in the prob-
lem directory of the public distribution. GoldMine takes in the problem directory
as its only argument. GoldMine can be run on the OpenRisc1200 b100 module by
executing:
./bin/goldmine problem/b100
GoldMine’s output (including tree printouts) can be found in
work/(module name)/(output name)
The generated assertions can be found in the (output name).true file while the
tree printout can be found in (output name).tree.
11.2.1 The Problem Directory
GoldMine uses a problem directory which contains the input that GoldMine re-
quires to be run. These directories consist of the following components.
Background Knowledge
For each output, there should be a .bk (background knowledge) file which contains
the following parameters. Multiple .bk files are allowed.
• MODULE: Name of the module
• INPUTS: The primary inputs to use as antecedents in assertions
• REG: Registers or wires that are used as antecedents.
• OUTPUTS: The outputs to use as consequents for assertions
• LOOKBACK: Number of cycles to unroll a circuit for temporal assertions
• ASSERTION FORMAT: The format for the assertions. Supports SVA.
• FV: The formal verifier to use. Only supports Cadence IFV.
• RTL: The location of the design file.
Example Behavior
In the problem directory, there should be a directory named “csv” which contains
the example behavior of the design in comma-separated value format. Each col-
umn represents a signal and each row represents one cycle at the positive clock
edge.
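A trace in this layout is straightforward to load for experimentation. The Python
sketch below parses such a CSV file and computes the fraction of cycles in which a
candidate antecedent holds. The presence of a header row with signal names, the
single-bit integer values, and the signal names themselves are assumptions for
illustration, since the exact file layout is not specified here.

```python
import csv
import io


def load_trace(csv_text):
    """Parse CSV text into a list of {signal: int} dicts, one per cycle."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [{name: int(value) for name, value in row.items()} for row in reader]


def support(antecedent, trace):
    """Fraction of cycles in which every proposition of the antecedent holds."""
    hits = sum(
        all(cycle[s] == v for s, v in antecedent.items()) for cycle in trace
    )
    return hits / len(trace)


if __name__ == "__main__":
    # Hypothetical trace; the header row with signal names is an assumption.
    text = "a,b,z\n0,0,0\n1,1,1\n1,0,0\n1,1,1\n"
    trace = load_trace(text)
    print(len(trace))                        # 4
    print(support({"a": 1, "b": 1}, trace))  # 0.5
```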
Using Ruby to Automatically Create the Problem Directory
For automatic creation of the problem directory, both Ruby [54] and Synopsys
VCS [55] are required. In the public distribution, run
./bin/create problem.sh (verilog module)
Where (verilog module) represents the name of a module in the verilog folder
with all necessary files (submodules and library files) included using the “include”
directive. This script creates a random testbench for the specified module which
applies a random value to each input signal, then simulates the design with the cre-
ated testbench producing a VCD dump, converts that VCD file into CSV format,
and then produces a background knowledge (.bk) file for each primary output.
To create .bk files that support formal verification and counterexample refinement,
set $enable formal verification = true in ./lib/full signal generator.rb.
11.3 Using the Coverage Guided Mining GoldMine
Implementation
The coverage guided algorithm is based on the decision tree implementation, but
some conversion has to be done to prepare the problem directory for the coverage
guided algorithm. In the cgm subfolder of ruby, there is a script named do.rb.
To use this script, create a text file containing the module(s) that you would like
to run GoldMine on (located in the verilog directory) separated by line feeds. To
run this program, use
./do.rb (text file containing module names)
It should be noted that before the program is run the first time, one will need to
edit quick parse.rb and change HOME to the folder containing the verilog
directory and edit do.rb and change $my dir to the directory containing the
verilog and problem directories (likely the same as HOME).
This script automatically runs the coverage guided algorithm. The out-
put of GoldMine is piped to (module name).out.
CHAPTER 12
CONCLUSIONS
While the work done on GoldMine has been extensive, it only scratches the surface
of what can be done. Because of its speed, it can be deployed where static anal-
ysis fails. Because of its power, it is able to generate complex assertions where
competing tools can only generate trivial assertions. GoldMine also has the ver-
satility to be extended in a variety of ways and can serve many different uses for
the verification of hardware designs. GoldMine has a very good chance of being
developed into a tool that designers and verification engineers cannot live with-
out by maximizing productivity and minimizing human resources and cost in the
assertion generation process.
REFERENCES
[1] S. Vasudevan, D. Sheridan, D. Tcheng, S. Patel, W. Tuohy, and D. Johnson,
“GoldMine: Automatic assertion generation using data mining and static
analysis,” in Proc. of the Conf. on Design, automation and test in Europe,
2010, pp. 626–629.
[2] M. Boule, J.-S. Chenard, and Z. Zilic, “Assertion checkers in verification,
silicon debug and in-field diagnosis,” in ISQED ’07: Proc. of the 8th Intl.
Symposium on Quality Electronic Design, 2007, pp. 613–620.
[3] H. Foster, D. Lacey, and A. Krolnik, Assertion-Based Design. Norwell,
MA, USA: Kluwer Academic Publishers, 2003.
[4] A. A. Bayazit and S. Malik, “Complementary use of runtime validation and
model checking,” in Proc. of the 2005 IEEE/ACM Intl. Conf. on Computer-
aided design, Washington, DC, USA, 2005, pp. 1052–1059.
[5] M. Boulé and Z. Zilic, “Automata-based assertion-checker synthesis of PSL
properties,” ACM Trans. Des. Autom. Electron. Syst., vol. 13, no. 1, pp. 1–21,
2008.
[6] A. Gupta, “Assertion-based verification turns the corner,” IEEE Des. Test,
vol. 19, no. 4, pp. 131–132, 2002.
[7] D. Wang and J. Levitt, “Automatic assume guarantee analysis for assertion-
based formal verification,” in ASP-DAC ’05: Proceedings of the 2005 Asia
and South Pacific Design Automation Conference. New York, NY, USA:
ACM, 2005, pp. 561–566.
[8] J. W. Nimmer and M. D. Ernst, “Automatic generation of program specifi-
cations,” in Proc. of the International Symposium on Software Testing and
Analysis, 2002, pp. 232–242.
[9] A. Pnueli, “The temporal logic of programs,” in Proc. of the 18th Symp. on
Foundations of Computer Science, 1977, pp. 46–57.
[10] B. Wegbreit, “Heuristic methods for mechanically deriving inductive asser-
tions,” in IJCAI, 1973, pp. 524–536.
[11] S. Katz and Z. Manna, “A heuristic approach to program verification,” in
Third international conference on Artificial Intelligence, 1973, p. 500.
[12] M. Caplain, “Finding invariant assertions for proving programs,” in Proceed-
ings of the international conference on Reliable software. New York, NY,
USA: ACM, 1975, pp. 165–171.
[13] J. Misra, “Prospects and limitations of automatic assertion generation for
loop programs,” SIAM Journal on Computing, pp. 718–729, 1977.
[14] S. Bensalem and H. Saidi, “Powerful techniques for the automatic generation
of invariants,” in Computer-Aided Verification. Springer-Verlag, 1996, pp.
323–335.
[15] N. Bjørner, A. Browne, and Z. Manna, “Automatic generation of invariants
and intermediate assertions,” in Theoretical Computer Science. Springer-
Verlag, 1997, pp. 589–623.
[16] J. Stark and A. Ireland, “Invariant discovery via failed proof attempts,” in
Proc. LOPSTR 98, LNCS 1559. Springer-Verlag, 1998, pp. 271–288.
[17] A. Tiwari, H. Rueß, H. Saïdi, and N. Shankar, “A technique for invariant
generation,” in TACAS 2001: Proceedings of the 7th International Confer-
ence on Tools and Algorithms for the Construction and Analysis of Systems.
London, UK: Springer-Verlag, 2001, pp. 113–127.
[18] C. S. Pasareanu and W. Visser, “Verification of Java programs using sym-
bolic execution and invariant generation,” in Proc. of the SPIN Workshop on
Model Checking and Software, Barcelona, Spain, 2004, p. 2989.
[19] X. Cheng and M. S. Hsiao, “Simulation-directed invariant mining for soft-
ware verification,” in Proc. of the Conf. on Design, Automation and Test in
Europe. New York, NY, USA: ACM, 2008, pp. 682–687.
[20] G. Ammons, R. Bodík, and J. R. Larus, “Mining specifications,” SIGPLAN
Not., vol. 37, no. 1, pp. 4–16, 2002.
[21] M. D. Ernst, J. Cockrell, W. G. Griswold, and D. Notkin, “Dynamically
discovering likely program invariants to support program evolution,” IEEE
Transactions on Software Engineering, vol. 27, pp. 213–224, 2001.
[22] J. W. Nimmer and M. D. Ernst, “Invariant inference for static checking: An
empirical evaluation,” in Proceedings of the ACM SIGSOFT 10th Interna-
tional Symposium on the Foundations of Software Engineering (FSE 2002),
2002, pp. 11–20.
[23] M. D. Ernst, J. H. Perkins, P. J. Guo, S. McCamant, C. Pacheco, M. S.
Tschantz, and C. Xiao, “The DAIKON system for dynamic detection of
likely invariants,” Science of Computer Programming, vol. 69, no. 1–3, pp.
35–45, Dec. 2007.
[24] L.-C. Wang, M. S. Abadir, and N. Krishnamurthy, “Automatic generation
of assertions for formal verification of powerpc microprocessor arrays using
symbolic trajectory evaluation,” in DAC ’98: Proceedings of the 35th annual
conference on Design automation, 1998, pp. 534–537.
[25] A. Hekmatpour and A. Salehi, “Block-based schema-driven assertion gener-
ation for functional verification,” Asian Test Symposium, vol. 0, pp. 34–39,
2005.
[26] S. Hangal, N. Chandra, S. Narayanan, and S. Chakravorty, “Iodine: a tool to
automatically infer dynamic invariants for hardware designs,” in DAC ’05:
Proceedings of the 42nd Design Automation Conference, 2005, pp. 775–778.
[27] F. Rogin, T. Klotz, G. Fey, R. Drechsler, and S. Rulke, “Automatic genera-
tion of complex properties for hardware designs,” in Proc. of the Conf. on
Design, Automation, and Test in Europe, 2009, pp. 545–548.
[28] IEEE Standard System C Language Reference Manual, IEEE Standard 1666,
2006.
[29] IEEE Standard Hardware Description Language Based on the Verilog(R)
Hardware Description Language, IEEE Standard 1364, 1996.
[30] IEEE Standard VHDL Language Reference Manual, IEEE Standard 1076,
1988.
[31] R. W. Floyd, “Assigning meanings to programs,” in Proceedings of Sympo-
sium on Applied Mathematics, vol. 19, 1967, pp. 19–32.
[32] C. A. R. Hoare, “An axiomatic basis for computer programming,” Commun.
ACM, vol. 12, no. 10, pp. 576–580, 1969.
[33] E. M. Clarke, E. A. Emerson, and A. P. Sistla, “Automatic verification of
finite-state concurrent systems using temporal logic specifications,” ACM
Transactions on Programming Languages and Systems, vol. 8, pp. 244–263,
1986.
[34] B. Cohen, S. Venkataramanan, and A. Kumari, SystemVerilog Assertions
Handbook. Palos Verdes Peninsula, CA: VHDLCohen Publishing, 2005.
[35] Standard for SystemVerilog-Unified Hardware Design, Specification, and
Verification Language, IEEE Working Draft Proposed Standard, Rev. 1800,
2009.
[36] OpenVera, “OpenVera reference manual.” [Online]. Available:
http://www.open-vera.com/technical/OVAIPGuidelines.pdf
[37] IEEE Standard for Property Specification Language (PSL), IEEE Standard
1850, 2005.
[38] J. Han and M. Kamber, Data Mining: Concepts and Techniques. San Fran-
cisco, CA: Morgan Kaufmann, 2006.
[39] L. A. Breslow and D. W. Aha, “Simplifying decision trees: A survey,” in
Knowl. Eng. Rev., vol. 12. New York, NY, USA: Cambridge University
Press, January 1997, pp. 1–40.
[40] A. Atramentov, H. Leiva, and V. Honavar, “A multi-relational decision tree
learning algorithm - implementation and experiments,” in Proceedings of the
13th International Conference on Inductive Logic Programming (ILP 2003).
Springer-Verlag, 2003, pp. 38–56.
[41] R. Agrawal, T. Imielinski, and A. Swami, “Mining association rules be-
tween sets of items in large databases,” in Proc. of SIGMOD Conf., 1993,
pp. 207–216.
[42] R. P. Kurshan, Computer-Aided Verification of Coordinating Processes: The
Automata-Theoretic Approach. Princeton University Press, 1994.
[43] P.-N. Tan, V. Kumar, and J. Srivastava, “Selecting the right interestingness
measure for association patterns,” in KDD ’02: Proceedings of the Eighth
ACM SIGKDD International Conference on Knowledge Discovery and Data
Mining. New York, NY, USA: ACM, 2002, pp. 32–41.
[44] K. L. McMillan, “Symbolic model checking: An approach to the state ex-
plosion problem,” Ph.D. dissertation, Carnegie Mellon University, 1992.
[45] B. Touhy, private communication, 2009.
[46] “Sun OpenSparc T2.” [Online]. Available: http://www.opensparc.net
[47] L. Liu, D. Sheridan, W. Tuohy, and S. Vasudevan, “Towards coverage clo-
sure: Using GoldMine assertions for generating design validation stimulus,”
University of Illinois Urbana Champaign, Tech. Rep., 2011.
[48] J. H. Kelm, D. R. Johnson, M. R. Johnson, N. C. Crago, W. Tuohy, A. Ma-
hesri, S. S. Lumetta, M. I. Frank, and S. J. Patel, “Rigel: an architecture
and scalable programming interface for a 1000-core accelerator,” in ISCA
’09: Proceedings of the 36th Annual International Symposium on Computer
Architecture, 2009, pp. 140–151.
[49] “OpenRisc 1200.” [Online]. Available: http://opencores.org/openrisc,or1200
[50] “European Space Agency SpaceWire.” [Online]. Available:
http://spacewire.esa.int
[51] “International Test Conference Benchmarks.” [Online]. Available:
http://itc02socbenchm.pratt.duke.edu
[52] R. M. Karp, “Reducibility among combinatorial problems,” The Journal of
Symbolic Logic, vol. 40, no. 4, pp. 85–102, 1975.
[53] “Valgrind Dynamic Analysis Tools.” [Online]. Available:
http://valgrind.org/
[54] “Ruby.” [Online]. Available: http://www.ruby-lang.org
[55] “Synopsys VCS.” [Online]. Available: http://www.synopsys.com/
