Design debugging has become a resource-intensive bottleneck in modern VLSI CAD flows, consuming as much as 60% of the total verification effort. With typical design sizes exceeding the half-million synthesized gates mark, the growing number of blocks to be examined dramatically slows down the debugging process. The aim of this work is to prune the number of debugging iterations for finding all potential bugs, without affecting the debugging resolution. This is achieved by using structural dominance relationships between circuit components. More specifically, an iterative fixpoint algorithm is presented for finding dominance relationships between multiple-output blocks of the design. These relationships are then leveraged for the early discovery of potential bugs, along with their corrections, resulting in significant debugging speed-ups. Extensive experiments on real industrial designs show that 66% of solutions are discovered early due to dominator implications. This results in consistent performance gains in all cases and a 1.7x overall speed-up for finding all potential bugs, demonstrating the robustness and practicality of the proposed approach.
INTRODUCTION
Once functional verification discovers a discrepancy between a design and its specification, it returns a counter-example in the form of an error trace exhibiting an erroneous behavior of the design. Design debugging is the process of analyzing this counter-example and tracking down the bug(s) in the design. This is still a predominantly manual task in the industry. With the growing size and complexity of designs and error traces, bugs are increasingly difficult to locate. Hence, it comes as no surprise that today design debugging consumes as much as 60% of the total verification effort [1] .
With the aim of alleviating the design debugging cost, several methodologies have been proposed over the years to automate this process [2] [3] [4] [5] [6] . The output of an automated design debugger is a set of potential bug locations, referred to as solutions. Each solution denotes a set of RTL lines or blocks, where corrections can rectify the erroneous behavior in the given counter-example. The automated debugger must return all solutions, along with their corrections, with the engineer being given the final task of identifying the real bug and fixing it.
Modern debuggers make heavy use of formal tools, such as Binary Decision Diagrams [3], Boolean Satisfiability (SAT) [4], Quantified Boolean Formulas [5] and Maximum Satisfiability [6]. In all these techniques, finding each solution requires a separate call to the formal engine. With typical design sizes exceeding the half-million synthesized gates mark, discovering solutions one-by-one is computationally expensive and limits the effectiveness of automated debuggers. This work addresses this issue by generating implied solutions on-the-fly, thus reducing the number of iterations needed to return all solutions. This is done by using structural dominance relationships between circuit components.
A node u is said to be a (structural) single-vertex dominator of another node v if every path from v to a primary output passes through u. Single-vertex dominators can be found in linear time [7, 8] and have been used for optimizing various CAD tasks, e.g., test pattern generation [9, 10]. More recently, they have been leveraged in the gate-level debugger in [4], which performs an initial debugging pass on selected dominator gates. However, state-of-the-art automated design debuggers operate at the RTL level [11, 12], where bugs occur in multiple-vertex, multiple-output blocks in the circuit. As such, it is difficult to make use of single-vertex dominators at the RTL level. A multiple-vertex block a dominates another multiple-vertex block b if every path from every node in b to a primary output passes through a node in a. Unlike existing approaches for finding multiple-vertex dominators, where block boundaries are not specified in advance [13] [14] [15], we are interested in establishing dominance relationships among a fixed set of blocks, naturally provided in a hierarchical RTL design.
The initial contribution of this work is a fixpoint algorithm that iteratively calculates dominance relationships between a predefined set of multiple-vertex blocks in a design. Next, it is proven that for each (set of) block(s) returned as a solution by the automated design debugger, every corresponding (set of) dominator(s) is a separate implied solution. As such, applying our fixpoint algorithm as a preprocessing step, the number of design debugging iterations for finding all solutions can be significantly reduced. Furthermore, we prove that corrections for implied solutions can be automatically generated without explicitly analyzing these solutions. It is shown that dominator-based solution implications are guaranteed to be valid given any error cardinality.
The proposed method is conveniently presented and implemented on top of a SAT-based automated design debugging framework [4, 11] . However, it is also applicable to simulation-based and other formal diagnosis techniques. An extensive set of experiments on real industrial designs obtained by our partners demonstrates the consistent benefits of the presented framework. It is shown that 66% of solutions are discovered early due to dominator implications. This results in a 1.7x overall speed-up in solving time in an industrial environment, demonstrating the robustness of the proposed approach.
The paper is organized as follows. Section 2 contains preliminaries on automated design debugging and dominators. Section 3 presents the iterative fixpoint algorithm for computing dominance relationships between blocks. Section 4 shows how to leverage block dominators for early solution implications in design debugging. Section 5 gives experimental results and Section 6 concludes the paper.
PRELIMINARIES
The following notation is used throughout the paper. Given a sequential circuit C, the symbol l denotes the set of all nodes in C. The symbols x, y and s label (possibly overlapping) subsets of l, respectively referring to the sets of primary inputs, primary outputs and state elements (flip-flops) of C. For each z ∈ {x, y, s, l}, the Boolean variable z i denotes the ith element in the set z.
To simplify the presentation, we consider designs with a single clock domain, although the described theory is applicable to multiple clock domains [16]. Time-frame expansion for k clock cycles is the process of replicating, or unrolling, the combinational component of C k times, such that the next-state of each time-frame is connected to the current-state of the next time-frame, thus modeling the sequential behavior of C. For any variable (or set of variables) $z_i$ (or $z$), the symbol $z_i^t$ (or $z^t$) denotes the corresponding variable (or set of variables) in time-frame t of the unrolled circuit. The behavior of C during the t-th clock cycle is formalized using the transition relation predicate $T(s^t, s^{t+1}, x^t, y^t)$, which describes the dependence of the primary outputs $y^t$ and next-state $s^{t+1}$ on the primary inputs $x^t$ and current-state $s^t$. The transition relation T can be extracted from C and is normally given in Conjunctive Normal Form (CNF), using the set of nodes $l^t$ as auxiliary variables.
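The structural side of time-frame expansion can be illustrated with a minimal sketch. The edge-list representation, the function name `unroll` and the toy circuit below are assumptions made for illustration only; they are not taken from the paper's implementation.

```python
def unroll(comb_edges, next_state_edges, k):
    """Return the edge list of a k-time-frame expansion (illustrative sketch).

    comb_edges:       (u, v) pairs inside the combinational logic
    next_state_edges: (d, s) pairs, where node d drives the next value of
                      state element s
    Unrolled nodes are (name, t) pairs with t = 1..k.
    """
    edges = []
    for t in range(1, k + 1):
        # Replicate the combinational component in time-frame t.
        edges += [((u, t), (v, t)) for u, v in comb_edges]
        # Connect the next-state of frame t to the current-state of frame t + 1.
        if t < k:
            edges += [((d, t), (s, t + 1)) for d, s in next_state_edges]
    return edges


# Hypothetical toy circuit: gate g reads input x and state s; g is latched into s.
comb = [("x", "g"), ("s", "g")]
nxt = [("g", "s")]
unrolled = unroll(comb, nxt, k=3)
print(len(unrolled))  # 3 frames * 2 combinational edges + 2 frame-crossing edges
```

Each copy of a node is tagged with its time-frame, mirroring the $z_i^t$ notation above.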
The sequential circuit C can also be represented as a directed graph. For convenience, we add an artificial sink node r to this graph, such that the set of nodes V = l ∪ {r} and the set of edges
We reserve the letters u and v to refer to nodes in
Furthermore, let the nodes l of C be grouped into (possibly overlapping) blocks. Each block consists of the synthesized gates of a given block of RTL code, such as an always block in Verilog. Let B = {b 1 , b 2 , . . . , b |B| } denote the set of all blocks, where each b i ⊆ l is a collection of nodes. Note that the same node l i can belong to more than one block because of the hierarchical nature of RTL. The set out(b i ) denotes the outputs of block b i . In the unrolled circuit, the set b t i (out(b t i )) contains the (output) nodes of block b i in time-frame t. Finally, for each node v, we let out −1 (v) = {b j |v ∈ out(b j )} denote the set of blocks in which v is an output.
Consider the sequential circuit in Figure 1 (a). The blocks
Note that y 1 and y 2 are primary output labels for g 3 and g 2 , respectively, and do not represent separate nodes. 
Single-Vertex Dominators
In a directed graph C = (V, E, r) with a single output sink r ∈ V , a node u ∈ V is said to be a structural single-vertex postdominator, or simply dominator, of a node v ∈ V , if every path from v to the sink r passes through u. The set dom(v) = {u ∈ V |u dominates v} consists of nodes that dominate v. As a convention, we consider that a node dominates itself. Furthermore, to ease the presentation, we assume that every node has a path to r (i.e., all dangling logic has been removed).
The immediate dominator of a node v (v ≠ r), denoted by idom(v), is a provably unique node u (u ≠ v) that dominates v and is dominated by all the nodes in dom(v) − {v}. It can be shown that for all v ≠ r, dom(v) = {v} ∪ dom(idom(v)) [17]. Therefore it is sufficient to compute all immediate dominators, which can be done in O(|E| + |V|) time [7, 8]. In the directed graph shown in Figure 1(b) , 
In this work, we are interested in finding dominance relationships between blocks in B, rather than between nodes in V . Section 3 outlines our approach, and discusses why methods for computing single-vertex dominators, as well as existing techniques for computing multiple-vertex dominators are not applicable in a design debugging setting.
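The single-vertex dominator sets defined in this subsection can be computed with a simple iterative set-intersection scheme. The sketch below is not one of the linear-time algorithms of [7, 8]; it is the slower textbook data-flow iteration, and the graph, node names and function names are hypothetical. It assumes, as in the text, that every node has a path to the sink r.

```python
def postdominators(fanout, sink="r"):
    """Iteratively compute dom(v) for every node (illustrative sketch).

    fanout maps each non-sink node to the list of its successors.
    """
    nodes = set(fanout) | {w for ws in fanout.values() for w in ws}
    dom = {v: set(nodes) for v in nodes}   # start from "everything dominates"
    dom[sink] = {sink}
    changed = True
    while changed:
        changed = False
        for v in nodes - {sink}:
            new = set(nodes)
            for w in fanout[v]:
                new &= dom[w]              # common dominators of all successors
            new.add(v)                     # a node dominates itself
            if new != dom[v]:
                dom[v] = new
                changed = True
    return dom


def immediate_dominator(dom, v):
    """Return the node u with dom(u) = dom(v) - {v}, or None for the sink."""
    strict = dom[v] - {v}
    for u in strict:
        if dom[u] == strict:
            return u
    return None


# Hypothetical graph: x fans out to g1 and g2, which reconverge at g3.
graph = {"x": ["g1", "g2"], "g1": ["g3"], "g2": ["g3"], "g3": ["r"]}
dom = postdominators(graph)
print(sorted(dom["x"]), immediate_dominator(dom, "x"))
```

In this toy graph neither g1 nor g2 dominates x on its own, which is the situation that motivates block-level dominators.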
Design Debugging
This section describes SAT-based design debugging and introduces relevant notation, which is used throughout the paper. Given an erroneous design, a counter-example and an error cardinality N, the task of an automated design debugger is to find all sets of N blocks that can potentially be responsible for the counter-example. More precisely, each returned set of N blocks $\{b_{i_1}, \ldots, b_{i_N}\}$, where $\{i_1, \ldots, i_N\} \subseteq \{1, \ldots, |B|\}$, can be modified to rectify the erroneous behavior exhibited in the counter-example. We refer to each such set of N blocks as a solution of cardinality N. These solutions help manage the tremendous debugging complexity of modern designs [18] by significantly limiting the potentially buggy lines in the RTL.
SAT-based automated design debugging [4, 11] encodes the debugging problem as a propositional formula whose satisfying assignments correspond to debugging solutions. The encoding process consists of several steps. Figure 2 illustrates a design debugging encoding for the circuit in Figure 1(a) and a two-cycle counter-example.
First, a set of error-select variables e = {e 1 , . . . , e |B| } are added to the circuit, such that setting e i = 1 disconnects gates in out(b i ) from their fanins, making them free variables, whereas setting e i = 0 does not modify the circuit. This can be achieved by inserting special multiplexers or switches at block outputs or by directly modifying the CNF of the transition relation. Next, this enhanced circuit is replicated using time-frame expansion for the length of the counter-example k, and such that for all time-frames t, outputs out(b t i ) are controlled by the same error-select variable e i . Figure 2 illustrates this, where each e i is shown as an enable on the side of gates in out(b t i ), across all time-frames t. This allows the SAT solver to modify the outputs of block b i across all time-frames by setting e i = 1 to "fix" any potential errors in
Then, a set of constraints is applied to the initial state, the primary inputs and the primary outputs in order to ensure that, given the initial state $\Phi_S(s^1)$ and primary input values $\Phi_X(x^1, \ldots, x^k)$ in the counter-example, the primary outputs yield their expected values $\Phi_Y(y^1, \ldots, y^k)$ given by the specifications. $\Phi_Y$ can also be expressed as a set of properties. Finally, an error cardinality constraint $\Phi_N(e)$ is added, setting $\sum_{i=1}^{|B|} e_i$ to a pre-specified constant N. The resulting propositional formula is given by:
$$Debug = \Phi_S(s^1) \wedge \Phi_X(x^1, \ldots, x^k) \wedge \Phi_Y(y^1, \ldots, y^k) \wedge \Phi_N(e) \wedge \bigwedge_{t=1}^{k} T_{en}(s^t, s^{t+1}, x^t, y^t, e) \qquad (1)$$
Figure 2: Design debugging formulation
where $T_{en}(s^t, s^{t+1}, x^t, y^t, e)$ refers to the transition relation predicate of the enhanced circuit at time-frame t. Each assignment to $e = \{e_1, \ldots, e_{|B|}\}$ satisfying Debug (1) corresponds to a debugging solution, and the SAT solver must find all such satisfying assignments to e. This is normally done by iteratively blocking each satisfying assignment using a blocking clause and re-solving Debug until the problem becomes unsatisfiable. In a satisfying assignment where some $e_i = 1$, the values of $out(b_i^t)$ across all time-frames t represent a sequence of corrections, which fix the erroneous behavior in the counter-example. Note that Debug (1) allows these corrections to be non-deterministic functions of the applied primary inputs.
Example 1 Consider the sequential circuit in Figure 1(a) to be a buggy implementation. We are also given a two-cycle counter-example with initial state 0, inputs $\langle x_1, x_2 \rangle = \langle 0, 1 \rangle, \langle 0, 1 \rangle$ and expected outputs $\langle y_1, y_2 \rangle = \langle 1, 1 \rangle, \langle 0, 1 \rangle$, demonstrating a mismatch in the second time-frame at the output $y_1$.
The corresponding design debugging formulation is illustrated in Figure 2 
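The meaning of a debugging solution can be illustrated without a SAT solver. The sketch below replaces the SAT engine by brute-force enumeration on a hypothetical two-gate combinational netlist (not the circuit of Figure 1), where each gate forms its own block. Freeing a block's output plays the role of setting its error-select variable to 1. Since the toy netlist has no state elements, each vector of the trace is treated as an independent time-frame, which is a simplification.

```python
from itertools import combinations, product

# Hypothetical toy netlist, listed in topological order: name -> (function, fanins).
GATES = {
    "g1": (lambda a, b: a | b, ("x1", "x2")),  # injected bug: intended a & b
    "g2": (lambda a, b: a | b, ("g1", "x3")),
}
OUTPUTS = ("g2",)


def simulate(inputs, freed):
    """Evaluate one time-frame; gates in `freed` take the supplied free value."""
    val = dict(inputs)
    for g, (fn, fanins) in GATES.items():
        val[g] = freed[g] if g in freed else fn(*(val[f] for f in fanins))
    return tuple(val[o] for o in OUTPUTS)


def is_solution(blocks, trace):
    """True if, in every frame, some values on `blocks` yield the expected outputs."""
    for inputs, expected in trace:
        if not any(simulate(inputs, dict(zip(blocks, vals))) == expected
                   for vals in product((0, 1), repeat=len(blocks))):
            return False
    return True


def all_solutions(trace, n):
    """Enumerate every set of n blocks that can rectify the trace."""
    return [set(c) for c in combinations(GATES, n) if is_solution(c, trace)]


# Counter-example: the first vector exposes the bug (expected 0, circuit gives 1).
trace = [({"x1": 1, "x2": 0, "x3": 0}, (0,)),
         ({"x1": 1, "x2": 1, "x3": 0}, (1,))]
print(all_solutions(trace, 1))
```

Both gates are returned as cardinality-1 solutions here, even though only g1 contains the injected error; deciding which solution is the real bug is left to the engineer, as described above.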
DOMINANCE BETWEEN BLOCKS
In this section, an iterative fixpoint algorithm is presented for finding all dominance relationships among a fixed set of multiple-vertex blocks, which are naturally defined in a hierarchical RTL design. Assuming that internal (non-output) block nodes cannot be primary outputs, any path to a primary output exiting a block must pass through one of its outputs. Furthermore, all primary outputs are connected to the artificial sink r. As such, the block dominator relation D ⊆ B × B can be formalized using restricted quantifier notation [19] as follows: (2) .
Consider the sequential circuit given in Figure 1(a). Although $x_2$ is not dominated by $g_1$ or $g_2$ separately, block $b_2 = \{x_2\}$ is dominated by block
The relation D on the blocks B of C in Figure 1(b) is illustrated in Figure 3. Unlike single-vertex dominators, a block does not necessarily have a unique immediate dominator block. This can be seen for block $b_1$ in Figure 3. As such, algorithms for calculating single-vertex immediate dominators cannot be used for computing block dominators. On the other hand, in existing approaches for computing so-called generalized or multiple-vertex dominators [13] [14] [15], block boundaries are not defined in advance. Instead, nodes are assembled into multiple-vertex dominators on-the-fly according to certain conventions, e.g., the smallest subset of fanout(v) collectively dominating a node v [13, 14]. This is not applicable in a design debugging setting, where circuit blocks are defined a priori by the hierarchical RTL design.
In this work, the block dominator relation D on the set of blocks B is computed in two steps. First, the block dominators of each node v ∈ V are computed. Then, these block-to-node dominators are used to compute the block-to-block dominator relation D. The block-to-node dominator relation d ⊆ B × V can be formalized as :
We let the set d(v) = {b j |b j dv} consist of blocks that dominate node v. For instance, in Figure 1 
Algorithm 1 shows our pseudocode for computing the block dominator relation D. It first computes the sets d(v) for every v ∈ V (lines 1 to 21). This is done using a fixpoint algorithm, where the set of block dominators of each node is initialized to all blocks B and iteratively refined until it converges to its actual block dominators. These block-to-node dominators are subsequently used on line 23 to compute
On line 1, $C^T$ denotes the transpose of the directed graph C (i.e., C with its edges reversed). The function reversePostordering($C^T$, r) performs a Depth-First Search of $C^T$ starting from r, and sorts the nodes in decreasing order of finishing time. In general, a reverse postordering is not unique. For instance, for C given in Figure 1(b), reversePostordering($C^T$, r) can return $\langle r, g_2, g_3, s_1, g_1, x_2, x_1 \rangle$. Traversing V in reverse postorder guarantees, for each node u ∈ V, that at least one v ∈ fanout(u) has already been visited by the time u is traversed. This reduces the number of iterations needed to reach a fixpoint when computing the sets d(v) later in the algorithm.
Lines 3 to 6 calculate the sets $out^{-1}(v)$ for each node v. The iterative fixpoint algorithm for computing the sets d(v) for all nodes v (lines 8 to 20) is based on the traditional data-flow analysis algorithm for finding single-vertex dominators [17, 20]. Lines 8 and 9 initialize each dominator set d(v) to all blocks B for v ∈ V − {r}, and to the empty set for v = r. In each iteration of the while loop, the nodes are traversed in reverse postorder (as calculated on line 1) and a refined set of dominator blocks is computed for each node on line 14. This computation on line 14 is the main difference from the data-flow analysis algorithm for single-vertex dominators: the new set of dominator blocks of a node u ∈ V is the intersection, over all v ∈ fanout(u), of the union of the dominator blocks of v and the blocks in which v is an output. If any of the sets d(v) changes during an iteration (i.e., the if condition on line 15 is true), the while loop is executed again. The while loop terminates after an iteration in which all block-to-node dominator sets remain unchanged. Line 21 adds the blocks in which node v is an output to the dominators of v. 
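The steps described above can be sketched in Python as follows. This is an illustrative reading of the prose description of Algorithm 1, not the authors' code; the dictionary-based graph representation, the function names and the example graph with blocks b1, b2 and b3 are assumptions. As in the text, every node is assumed to have a path to the sink r.

```python
def reverse_postorder(preds, root):
    """DFS of the transposed graph from root; nodes by decreasing finishing time."""
    seen, order = set(), []

    def visit(u):
        seen.add(u)
        for w in preds[u]:
            if w not in seen:
                visit(w)
        order.append(u)

    visit(root)
    return order[::-1]


def block_dominators(fanout, block_outputs, sink="r"):
    """Return (d, D): block-to-node and block-to-block dominator sets."""
    nodes = set(fanout) | {v for vs in fanout.values() for v in vs}
    preds = {v: [] for v in nodes}                 # edges of the transpose
    for u, vs in fanout.items():
        for v in vs:
            preds[v].append(u)
    order = reverse_postorder(preds, sink)

    out_inv = {v: set() for v in nodes}            # blocks in which v is an output
    for b, outs in block_outputs.items():
        for v in outs:
            out_inv[v].add(b)

    blocks = set(block_outputs)
    d = {v: set(blocks) for v in nodes}            # initialize to all blocks
    d[sink] = set()
    changed = True
    while changed:                                 # iterate to a fixpoint
        changed = False
        for u in order:
            if u == sink:
                continue
            new = set(blocks)
            for v in fanout[u]:
                new &= d[v] | out_inv[v]           # refinement step ("line 14")
            if new != d[u]:
                d[u], changed = new, True
    for v in nodes:
        d[v] |= out_inv[v]                         # final step ("line 21")

    D = {}
    for b, outs in block_outputs.items():          # intersect over block outputs
        D[b] = set(blocks)
        for v in outs:
            D[b] &= d[v]
    return d, D


# Hypothetical sequential-style graph with a feedback path through s.
fanout = {"x": ["g1", "g2"], "g1": ["r"], "g2": ["r", "s"], "s": ["g1"]}
block_outputs = {"b1": {"g1", "g2"}, "b2": {"x"}, "b3": {"g1"}}
d, D = block_dominators(fanout, block_outputs)
print({b: sorted(s) for b, s in D.items()})
```

In this example block b2 is dominated by b1, whose outputs g1 and g2 jointly cover every path from x to the sink, but not by b3, which only covers the paths through g1.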
Proof. In [21] , the authors describe a class of iterative dataflow analysis algorithms. They use a very general lattice theoretic framework to analyze the termination and computation of this class of algorithms, which have a variety of applications (e.g., in compiler optimization [22] ) and are not restricted to calculating dominators. We will use the conclusions of [21] to analyze the computation of the block-to-node dominator relation d in Algorithm 1.
Due to lack of space, we will avoid using lattice algebra, and will instead present the relevant results of [21] in our specific context. The class of algorithms described in [21] have a common structure, essentially conforming to lines 8 to 20 of Algorithm 1, but such that line 14 is replaced by:
with certain conditions specifying the types of functions fv that are allowed. In this proof, we will show that using
(as done in Algorithm 1) satisfies the conditions put forth in [21] , in order to prove the termination and correctness of our own algorithm. Let F refer to any set of functions mapping sets of blocks to sets of blocks. Formally, F refers to a set of functions f of the form:
where P(B) refers to the power set of B (i.e., the set of all subsets of B) and B' ⊆ B is any arbitrary set of blocks. In [21], a set of such functions F is said to be admissible if and only if the following four conditions are satisfied:
1. All functions in F are distributive over ∪:
2. F has an identity function:
Given a set of such admissible functions F and a directed graph C = (V, E, r) with output sink r, the authors of [21] map each vertex v ∈ V to some function in F , which they call fv. This mapping does not have to be one-to-one (i.e., each fv is not necessarily unique) or onto (i.e., {fv|∀v ∈ V } ⊆ F ). They prove that if F is admissible, then any algorithm that conforms to lines 8 to 20 in Algorithm 1, with line 14 replaced by (4), terminates. Furthermore, they show that in such a scenario, at the completion of this while loop, for each v ∈ V , we get:
In our case, we use the following set of admissible functions:
We leave it to the reader to verify that F * is indeed admissible (i.e., it satisfies the four conditions given above), due to lack of space. Next, as done in [21] , we map each node v ∈ V to some function f * v ∈ F * , where:
Replacing the functions f * v in (4) yields line 14 in our algorithm. Since f * v 's are drawn from the admissible set of functions F * , the while loop in Algorithm 1 terminates. Furthermore, using (6), at the completion of this while loop, we have:
Finally, on line 21, out −1 (v) is added to each d(v). As such, by the end of the foreach loop on line 21, we have:
As such, the computed sets d(v) satisfy the definition of the block-to-node dominator relation d given in (3).
Theorem 1 Algorithm 1 correctly computes the block dominator relation D.

Proof. D(b i ) is computed on line 23 as
. Using Lemma 1, we get:
which satisfies the definition of the block dominator relation D given in (2).
The overall run-time of Algorithm 1 is normally dictated by the run-time of the while loop from line 11 to 20. Furthermore, during each iteration of the while loop, line 14 clearly dominates computation time. We assume that all dangling logic has been removed during preprocessing (i.e., every node has a path to r), and as such |V | = O(|E|). Using an aggregate analysis of all executions of line 14 during a single iteration of the while loop, it can be seen that line 14 performs a total of O(|E|) intersections and unions between two sets of size at most |B| (since d(v), out −1 (v) ⊆ B) . We assume that all sets are implemented using ordered lists and therefore intersections and unions can be done in linear time. As such, in a single iteration of the while loop, line 14 takes O(|B| · |E|) time.
Let c denote the loop-connectedness of the directed graph C, which refers to the maximum number of back edges in any cycle-free path in C. The back edges are defined according to the Depth-First Search performed in reversePostordering($C^T$, r) on line 1. It is proven in [21] that the number of iterations of the while loop for the general class of such fixpoint algorithms is bounded by c + 2, if and only if the following condition holds:
We show that (8) holds for our set of admissible functions F* given in the proof of Lemma 1. Consider any two functions f*, g* ∈ F* such that f*(B') = B' ∪ B'' and g*(B') = B' ∪ B''', where B'', B''' ⊆ B are arbitrary sets of blocks. We have:
clearly satisfying (8) . Therefore, our fixpoint algorithm takes O(c · |B| · |E|) time.
LEVERAGING BLOCK DOMINANCE IN DESIGN DEBUGGING
In this section, we show how to leverage the relation D to imply solutions early in the design debugging iterations. In effect, given a solution consisting of a set of blocks, we show that we can replace each block by any of its dominator blocks to get another solution. Formally, it is proven that for each known solution of Debug (1) of the form {b i 1 , . . . , b i N }, every set of the form
is also a solution of Debug (1). Furthermore, it is shown that corrections for each implied solution can also be obtained automatically from the satisfying assignment of the original solution.
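The implication step can be sketched as a Cartesian product over the dominator sets of the blocks in a known solution. The function name and the relation D below are hypothetical. Dropping candidate sets with fewer than N distinct blocks (which arise when two blocks are replaced by the same dominator) is an assumption made here so that every implied set still has cardinality N.

```python
from itertools import product


def implied_solutions(solution, D):
    """Return all sets obtained by replacing each block with one of its dominators."""
    sol = sorted(solution)
    n = len(sol)
    candidates = {frozenset(c) for c in product(*(D[b] for b in sol))}
    # Assumption: keep only sets that still contain N distinct blocks,
    # and exclude the original solution itself.
    return {s for s in candidates if len(s) == n} - {frozenset(sol)}


# Hypothetical dominator relation: D[b] is the set of blocks dominating b.
D = {"b1": {"b1"}, "b2": {"b1", "b2"}, "b3": {"b1", "b3"}}
print(implied_solutions({"b2"}, D))
print(implied_solutions({"b2", "b3"}, D))
```

Because a block dominates itself, the product always regenerates the original solution, which is why it is removed from the returned set.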
First, due to the fixed length of a given counter-example, we must define the following, slightly modified concept of domination. (respectively b j 1 , . . . , b j N ) . 
Definition 3 We say that a block
Lemma 2 $\bigwedge_{n=1}^{N} (b_{j_n} D\, b_{i_n}) \Rightarrow \left(\bigcup_{n=1}^{N} b_{j_n}\right) D \left(\bigcup_{n=1}^{N} b_{i_n}\right)$
Proof. If $\forall n\,[1 \le n \le N]$,
Lemma 3 b j Db
Proof. If b j Db i then every path from b i to a primary output passes through b j . In particular, all paths to a primary output with at most k state elements also pass through b j . N , if {b i 1 , . . . , b i N } is a solution of Debug (1) and
Lemma 4 If
Proof. The theorem can be formalized as:
where we refer to the left-hand-side (right-hand-side) formula of the implication as the LHS (RHS). Let U refer to the k-time-frame expanded circuit obtained from C as described in Subsection 2.2.
across all time-frames in U . Also, let out(I) (respectively out(J)) refer to the set of outputs of I (respectively J). We will partition the nodes in U into three parts, U I , U J and U R , as follows.
Let $U_J$ denote the transitive fanout of out(J) in U. Let $U_I$ denote the nodes in U that are in the transitive fanout of out(I), but not in $U_J$. Finally, let $U_R$ consist of the remaining nodes in U, outside $U_I$ and $U_J$. We know that $\bigwedge_{n=1}^{N} (b_{j_n} D\, b_{i_n})$, and by Lemma 2 and Lemma 3, we get
Given this and Lemma 4, we can imply that any path from out(I) to a primary output must pass through out(J). As a result, these partitions of U can be represented by the diagram shown in Figure 4 .
Figure 4: Partition of U
Note that in Figure 4 , the output constraints are separated into two subsets:
denotes the output constraints applied at the outputs of U J (respectively U R ). This separation is only needed for this proof and is not required by our method.
We know that given $e_{i_1} = 1, \ldots, e_{i_N} = 1$, there exist assignments to the nodes in $U_I$, $U_J$ and $U_R$ satisfying the LHS. Let $\pi(U_I)$, $\pi(U_J)$ and $\pi(U_R)$ refer to these assignments. We want to find assignments $\pi'(U_I)$, $\pi'(U_J)$ and $\pi'(U_R)$, such that given $e_{j_1} = 1, \ldots, e_{j_N} = 1$, the RHS is satisfied. These assignments are found as follows.
First consider the subset of output constraints applied at the outputs of $U_R$, denoted by $\Phi_Y^R$ in Figure 4. Since $\pi(U_R)$ satisfies $\Phi_Y^R$ and the input constraints to $U_R$ (i.e., $\Phi_S \wedge \Phi_X$) are the same in the LHS and the RHS, setting $\pi'(U_R) = \pi(U_R)$ will also satisfy $\Phi_Y^R$ in the RHS. Next, consider $U_I$. Note that any path from out(I) to a primary output must pass through out(J). Also, setting $e_{j_1} = 1, \ldots, e_{j_N} = 1$ in the RHS disconnects out(J) from their fanins. Therefore, there are no output constraints applied on $U_I$ (i.e., $U_I$ is dangling logic in the RHS). As such, $\pi'(U_I)$ can simply "propagate" the values of $\pi'(U_R)$ in $U_I$.
Finally, since the nodes in out(J) are disconnected from their fanins in the RHS, the SAT solver is free to pick any assignment for these variables. Furthermore, setting $\pi'(U_R) = \pi(U_R)$ already assigned any inputs to $U_J$ coming from $U_R$ to the same values as in the LHS. Therefore, we can simply pick $\pi'(U_J) = \pi(U_J)$, which will satisfy $\Phi_Y^J$ in Figure 4. This completes the satisfying assignment $\pi'$ to all the variables in $U_I$, $U_J$ and $U_R$ in the RHS. Therefore, the RHS is SAT.
Corollary 1 Given a solution $\{b_{i_1}, \ldots, b_{i_N}\}$ and its corresponding satisfying assignment $\pi$ of Debug (1), a sequence of corrections for each implied solution $\{b_{j_1}, \ldots, b_{j_N}\}$
In the proof of Theorem 2, we showed how to build a satisfying assignment $\pi'$ of the RHS of (9) given a satisfying assignment $\pi$ of the LHS. In particular, we showed that the subset of $\pi'$ corresponding to $U_J$ is the same as the subset of $\pi$ corresponding to $U_J$. In other terms, $\pi'(U_J) = \pi(U_J)$. Since $U_J$ is simply the transitive fanout of out(J) in U, the subset of $\pi'$ corresponding to out(J) is also the same as the subset of $\pi$ corresponding to out(J). As such, given a satisfying assignment $\pi$ for the original
Overall Flow
The flowchart in Figure 5 illustrates the overall design debugging flow using on-the-fly dominator implications. Algorithm 1 is first run to compute D(b i ) for every block b i ∈ B. Next, the automated debugger builds the original debugging problem, Debug (1), and passes it to the SAT solver. If it is UNSAT, the flow terminates. Otherwise, a solution {b i 1 , . . . , b i N } is returned. A simple implication engine takes in this solution, and using the pre-computed block dominator relation D, generates all newly implied solutions. A blocking clause is added to Debug for each of these implied solutions, as well as the original solution. The resulting debugging instance is given again to the automated debugger, and this process is repeated until the problem becomes UNSAT. Figure 1 
Example 2 Consider the sequential circuit in
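The overall flow of this section can be sketched as a loop around a solver callback. The callback `find_solution` stands in for the SAT solver working on Debug with blocking clauses; it, the relation D and the list of solutions are hypothetical placeholders, and no real SAT library is used.

```python
from itertools import product


def debug_with_implications(find_solution, D):
    """Enumerate all solutions, blocking each found solution and its implications.

    find_solution(blocked) returns a solution (a frozenset of blocks) that is
    not in `blocked`, or None when the instance has become UNSAT.
    """
    blocked, solver_calls = set(), 0
    while True:
        solver_calls += 1
        sol = find_solution(blocked)
        if sol is None:                        # UNSAT: all solutions were found
            return blocked, solver_calls
        n = len(sol)
        implied = {frozenset(c) for c in product(*(D[b] for b in sol))}
        # The product includes the original solution because a block dominates itself.
        blocked |= {s for s in implied if len(s) == n}


# Hypothetical instance: three cardinality-1 solutions and a dominator relation.
D = {"b1": {"b1"}, "b2": {"b1", "b2"}, "b3": {"b1", "b3"}}
all_sols = [frozenset({"b2"}), frozenset({"b3"}), frozenset({"b1"})]


def find_solution(blocked):
    """Stand-in for the SAT solver: first solution that is not blocked yet."""
    return next((s for s in all_sols if s not in blocked), None)


found, calls = debug_with_implications(find_solution, D)
print(len(found), calls)
```

In this toy run the solution {b1} is never returned by the solver itself; it is implied as soon as {b2} is found, so one solver call is saved compared with enumerating the three solutions one at a time.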
EXPERIMENTAL RESULTS
This section presents the experimental results for the proposed dominator-based design debugging flow. All experiments are run using a single core of a Core 2 Quad 2.66 GHz workstation with 8 GB of RAM and a timeout of 3600 seconds. The proposed debugging framework is implemented using a state-of-the-art hierarchical SAT-based debugger based on [4, 11], with a Verilog front-end to allow for RTL diagnosis. Minisat-v2.2 [23] is used to solve all SAT instances.
Seven industrial Verilog RTL designs from OpenCores [24] and three commercial designs provided by our industrial partners are used in our experiments. For each design, several debugging instances are generated by inserting different errors into the design. The RTL errors that are injected are based on the experience of our industrial partners. These are common designer mistakes such as wrong state transitions, incorrect operators or incorrect module instantiations. The erroneous design is then run through an industrial simulator with the accompanying testbench, where a failure is detected and a counter-example is recorded. Each block b_i ∈ B consists of the synthesized gates corresponding to a (set of) line(s) in the RTL implementing an assignment, an if statement, a module definition, an instantiation, etc. Experiments are conducted with and without dominator implications. dbg-trad refers to the "traditional" debugging flow (without an implication engine), and dbg-dom refers to our extended debugging flow using dominator implications, illustrated in Figure 5.
Table 1 shows the circuit characteristics of each design debugging instance. The first column gives the instance name, which consists of the design name and an appended number indicating a different inserted error. The following four columns respectively show the number of gates |l|, the number of blocks |B|, the number of clock cycles k in the counter-example, and finally the error cardinality N.
Table 2 shows the results of all our experiments. The first column gives the instance name. Columns overhead and total #sols respectively refer to the run-time overhead for setting up the problem (i.e., generating the CNF of Debug) and the total number of returned solutions. The overhead run-time includes graph optimizations such as dangling logic removal. The overhead and the total #sols are common to both dbg-trad and dbg-dom.
Note that the number of solutions for instances with N = 2 can be greater than |B| (e.g., usb_funct-3) because each two-block combination $\{b_{i_1}, b_{i_2}\}$ that can be modified to correct the counter-example is a solution.
Column four (dbg) shows the total SAT solver run-time using dbg-trad for finding all debugging solutions. The remaining columns present the results of our proposed framework, dbg-dom. Column avg |D| shows the average size of the sets D(b_i) computed by Algorithm 1. Next, columns #impl and %impl respectively show the number of implied solutions for each instance and the percentage of implied solutions among all solutions. Column dom shows the run-time of Algorithm 1 for computing the block dominator relation D. Column dbg gives the total SAT solver run-time using dbg-dom, while column dom+dbg adds to this the dominator computation run-time of Algorithm 1. Finally, column impr shows the speed-up achieved by dom+dbg over dbg-trad, first excluding then including the common overhead.
Figure 6 plots the ratio of implied solutions for each instance, sorted in increasing order. On average, 66% of all solutions are implied. In other terms, the number of calls to the SAT solver is reduced by a factor of 2.9 due to the early discovery of solutions using our approach. For each solution found by the SAT solver, about 2.6 more solutions are implied on average. This number is significantly less than the average number of dominators of each block, which is 19.5, because many implied solutions in later iterations might have already been found (or implied) in previous iterations.
Figure 7 plots the number of found solutions versus run-time for both dbg-trad and dbg-dom for design1-2. It can be seen that while dbg-trad returns solutions at roughly equal time intervals, dbg-dom initially discovers solutions at a fast rate due to new implications, but the rate of discovery of new solutions decreases with time. Returning most solutions early is beneficial because the designer can start examining returned solutions earlier, while the debugger continues to run.
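As a rough sanity check on the reported reduction in SAT solver calls, one can reason in aggregate terms. This is an approximation introduced here, not a computation from the paper: it treats the 66% figure as a single fraction p of a total of S solutions and ignores the final UNSAT call. If a fraction p of the solutions is implied, only (1 − p)·S solutions require their own solver call, so

$$\frac{S}{(1 - p)\,S} = \frac{1}{1 - p} = \frac{1}{1 - 0.66} \approx 2.9,$$

which is consistent with the stated factor. The per-instance averages reported in this section need not obey the same aggregate relation.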
The average speed-up in total SAT run-time from dbg-trad to dbg-dom is 1.8x. In many cases, a higher percentage of implied solutions means fewer debugging iterations, which results in less total SAT solving time. For instance, in vga-1, 21 out of 23 solutions (91%) are implied, yielding a 2.4x speed-up in total SAT run-time, compared to the averages of 66% implied solutions and a 1.8x speed-up. However, this is not always the case because of the unpredictable behavior of SAT solvers. Furthermore, we have not found any clear relationship between problem parameters, such as the size of the circuit or the length of the counter-example, and the speed-up due to solution implications. Including the time to compute the dominator relation D, the speed-up from dbg-trad to dbg-dom is about 1.7x disregarding common overhead, and 1.4x including common overhead. Figure 8 plots the run-times of our approach (dom+dbg) versus those of dbg-trad on a logarithmic scale, along with the 1x, 2x, 3x and 10x lines, showing that the proposed method is faster in all cases.
CONCLUSION
We present an iterative fixpoint algorithm for computing dominance relationships between multiple-output blocks of a design. We then show how to leverage these dominance relationships to reduce the number of design debugging iterations for finding all potential bug locations, or solutions, in a design. Furthermore, we prove that corrections for implied solutions can be automatically generated without explicitly analyzing these solutions. Finally, an extensive set of experiments on real industrial designs demonstrates the consistent benefits of the presented framework.
