Binary Decision Diagrams (BDDs) 
Introduction
Given a propositional formula, the Boolean Satisfiability Problem (commonly abbreviated as SAT) consists of determining a variable assignment such that the formula evaluates to true, or establishing that no such assignment exists. Although SAT is NP-complete, and no polynomial-time algorithm to solve it is known, SAT solvers have received considerable research attention, and large practical instances have been solved thanks to efficient implementations [1, 2]. Their applications range from EDA to ATPG, and from logic synthesis to verification [3]. In the verification domain, SAT techniques are mainly used for Bounded Model Checking (BMC), looking for bugs (and counter-examples) of limited length.
BDDs have often been used in the same fields since they started gaining interest at the end of the '80s. Nevertheless, they have never been able to deal with the largest models and problem instances, because of the so called "BDD blow-up" (or memory explosion) problem.
Several recent papers have compared BDD and SAT methodologies on sequential verification within the unbounded and bounded model checking frameworks. Even though no definite conclusion can be drawn, researchers and engineers agree that SAT tools are complementary to BDD-based ones and that the quest for efficient and comprehensive combinational and sequential verification methods is still not completed.
In this work we explore a new way to make BDD-based and SAT-based tools cooperate. Our target is to improve the efficiency of SAT-based BMC with the help of "cheap" and affordable BDD-based operations. In this respect, "approximate" traversals may deal with larger circuits than "exact" ones, at the expense of exactness. Moreover, as the degree of approximation can be tuned, it is always possible to trade off memory and time against the accuracy of the result. Unfortunately nothing comes for free, and the limit of approximate techniques in verification is that they are not complete, i.e., over-approximate reachability can prove correctness, but it cannot disprove it.
Our driving idea is to complement the initial over-approximate BDD information with a final SAT-solver search, using BDDs to prune and focus the search. In a first phase, we compute (in the forward and/or backward directions) an over-approximate estimate of the traces connecting the initial state set to the target one. Then the estimate is combined, as an additional constraint, with the Bounded Model Checking problem, to be solved by a SAT tool.
Our target is to obtain an efficient pruning of the SAT solver search space, which somehow mimics the contribution of "conflict clauses", generated by means of "conflict analysis" in state-of-the-art SAT tools. Each new conflict clause specifies a sub-set of the state space in which there exists no solution. Similarly, our over-approximate knowledge of reachable states restricts the SAT solver state space. We presently implement this extra information as an initial pre-processing or "learning" phase. On the one hand, we might lose some optimizations achievable through a tighter and more dynamic interleaving with the SAT solving tool. On the other hand, our method is quite simple and it is compatible with any SAT solver, since we do not require any interaction with the inner steps of SAT algorithms. As far as we know, this is the first time that symbolic BDD-based over-approximate information is used to prune a SAT-solver search space.
A further minor contribution of our work is to introduce a set of strategies to store a BDD (in a monolithic or conjoined form) as a CNF formula. These methods will be compared in terms of the compactness of the resulting CNF problem (number of variables, literals and clauses) and of their influence on the SAT engine (pruning efficiency).
The remainder of this paper is organized as follows. In Section 2 we introduce some preliminary concepts on notation, SAT problems and reachability analysis. Section 3 is dedicated to the related works. Section 4 introduces our approach and Section 5 describes our technique to store BDDs as CNF problems. Section 6 presents the experimental results. Finally, Section 7 concludes the paper with a brief summary and some hints on possible future work.
Background

Model and Property Definition
The sequential systems we address are usually modeled as Finite State Machines (FSMs). Each FSM is described by a Transition Relation TR, which indicates its present-next state behavior, and an initial state set S.
An invariant property¹ P is checked by attempting to prove (or disprove) the reachability of its complement T (the target state set, $T = \overline{P}$) from S.
SAT-Based Model Checking
For an overview of SAT solvers and a complete list of references, the reader can refer to the tutorial [3].
SAT-based BMC considers only paths of bounded length k and builds a propositional formula that is satisfiable iff there is a counter-example (a path from S to T) of the same length. For this reason the technique works well in falsification and partial verification, whereas full verification is usually achieved by BMC with longer and longer bounds, possibly augmented with inductive proofs, when proving correctness rather than seeking bugs.
SAT solvers generally operate on problems in which the formula is specified in Conjunctive Normal Form (CNF). This form is a two-level decomposition: the logical AND of one or more clauses, each of which consists of the logical OR of one or more literals. A literal is merely an instance of a variable or its complement.
In order to decide whether the formula is satisfiable, most solvers adopt variants of the basic Davis-Putnam recursive algorithm. At each step of recursion, the algorithm basically proceeds through the following three steps:
Variable Decision: Assign a value to an unassigned variable, thus exploring new regions of the search space.
Boolean Constraint Propagation: Carry out all possible implications due to the previous assignment.
Conflict Analysis: Check for "conflicting clauses", i.e., clauses whose literals are all assigned the value zero; if one is discovered, undo the current assignment (so that another assignment can be tried). In this phase, "conflict clauses", i.e., clauses which identify previous conflicts, are also added to the clause database for early detection (and pruning) of bad decisions and/or variable assignments.
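As an illustration only, the three steps above can be sketched in Python as follows. This is a minimal sketch under our own assumptions, not the procedure of any specific solver: the names `propagate` and `dpll` are hypothetical, literals follow the DIMACS integer convention, the decision heuristic is naive, and conflict-clause learning is omitted (a conflict simply causes chronological backtracking).

```python
from typing import Dict, List, Optional

Clause = List[int]  # DIMACS-style literals: 3 means x3, -3 means NOT x3


def propagate(clauses: List[Clause], assign: Dict[int, bool]) -> Optional[List[Clause]]:
    """Boolean Constraint Propagation.

    Simplifies the clauses under `assign` and applies unit implications
    (extending `assign` in place). Returns the remaining clauses, or None
    if a conflicting clause (all literals false) is found.
    """
    changed = True
    while changed:
        changed = False
        simplified = []
        for clause in clauses:
            remaining, satisfied = [], False
            for lit in clause:
                var, want = abs(lit), lit > 0
                if var not in assign:
                    remaining.append(lit)
                elif assign[var] == want:
                    satisfied = True
                    break
            if satisfied:
                continue
            if not remaining:
                return None                      # conflicting clause
            if len(remaining) == 1:
                assign[abs(remaining[0])] = remaining[0] > 0   # implication
                changed = True
            else:
                simplified.append(remaining)
        clauses = simplified
    return clauses


def dpll(clauses: List[Clause],
         assign: Optional[Dict[int, bool]] = None) -> Optional[Dict[int, bool]]:
    """Return a (possibly partial) satisfying assignment, or None if unsatisfiable."""
    assign = dict(assign or {})
    clauses = propagate(clauses, assign)
    if clauses is None:
        return None                              # conflict: caller tries the other value
    if not clauses:
        return assign                            # every clause is satisfied
    var = abs(clauses[0][0])                     # Variable Decision (naive heuristic)
    for value in (True, False):
        result = dpll(clauses, {**assign, var: value})
        if result is not None:
            return result
    return None
```

Variables missing from the returned assignment are don't-cares. State-of-the-art solvers replace the recursion with an explicit trail and add learned conflict clauses to the clause database, which this sketch does not attempt.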
BDD-Based Model Checking
A standard BDD-based forward reachability analysis procedure is a breadth-first visit of the state space that starts from R = S and proceeds through a least fix-point (lfp) iteration:
$$FR = \mathrm{lfp}\, R.\,\bigl(S \cup \mathrm{IMG}(TR, R)\bigr)$$
which returns FR, i.e., the set of forward reachable states. The method is based on the iterated application of the IMG function, to compute symbolic images of the R state set. We indicate with $R_i$ the state sets generated at each traversal iteration (the so-called frontier sets).
¹ Or AG CTL property.
As T may be reached before the fix-point, it is possible to avoid a full computation of FR with on-the-fly tests for intersection with T. A counter-example is eventually computed starting from the array FR of frontier sets R.
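The traversal just described can be illustrated with a small explicit-state sketch. It is an assumption-laden stand-in: Python sets replace BDDs, the transition relation is a set of (present, next) pairs, and the names `img`, `forward_reach` and `counter_example` are ours, not those of any BDD package.

```python
from typing import Hashable, List, Optional, Set, Tuple

State = Hashable
TR = Set[Tuple[State, State]]   # explicit transition relation: (present, next) pairs


def img(tr: TR, states: Set[State]) -> Set[State]:
    """Image: all successors of `states` under `tr`."""
    return {t for (s, t) in tr if s in states}


def forward_reach(tr: TR, init: Set[State],
                  target: Set[State]) -> Tuple[List[Set[State]], Optional[int]]:
    """Breadth-first least fix-point iteration with on-the-fly target test.

    Returns (frontiers, hit): frontiers[i] holds the states first reached at
    step i; hit is the index of the first frontier intersecting `target`,
    or None if the fix-point is reached without touching it.
    """
    reached = set(init)
    frontiers = [set(init)]
    while True:
        if frontiers[-1] & target:
            return frontiers, len(frontiers) - 1
        new = img(tr, frontiers[-1]) - reached
        if not new:
            return frontiers, None               # fix-point
        reached |= new
        frontiers.append(new)


def counter_example(tr: TR, frontiers: List[Set[State]],
                    hit: int, target: Set[State]) -> List[State]:
    """Walk the array of frontier sets backwards to extract one S-to-T path."""
    path = [next(iter(frontiers[hit] & target))]
    for i in range(hit - 1, -1, -1):
        # Every state of frontier i+1 has at least one predecessor in frontier i.
        path.insert(0, next(s for s in frontiers[i] if (s, path[0]) in tr))
    return path
```

A backward traversal is obtained from the same code by swapping the roles of the initial and target sets and reversing each pair of the transition relation.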
CTL model checking procedures are often implemented as backward traversal procedures, computing BR sets in the backward direction. This is easily expressed by swapping the S and T sets, and replacing the IMG function with the PREIMG computation.
Approximate Traversals [4, 5] are a popular way to extend the applicability of reachability analysis to larger circuits.
The approach is based on the approximate image ($\mathrm{IMG}^+$) operator, returning over-estimations of exact images:
$$\mathrm{IMG}(TR, R) \subseteq \mathrm{IMG}^+(TR, R)$$
Notice that although the over-approximate set $R^+$ represents more states, its BDD representation is usually much smaller and simpler than R, as many mutual interactions and dependences among state variables disappear because of the approximation. As a final remark, let us remember that the limit of approximate techniques is that they allow a sufficient but not necessary check, i.e., they can prove equivalence but they cannot disprove it.
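One simple way to see why an over-approximation loses the interactions among state variables is to project a state set onto groups of variables and take the product of the projections. The sketch below does this on explicit sets of bit-tuples. It is only an illustration: the names `approximate` and `img_plus` are hypothetical, the groups are assumed to partition the variable indices, and real approximate traversals compute each projection directly on BDDs without building the exact image first.

```python
from itertools import product
from typing import List, Set, Tuple

State = Tuple[int, ...]          # one bit per state variable


def img(tr: Set[Tuple[State, State]], states: Set[State]) -> Set[State]:
    """Exact image of `states` under the explicit transition relation `tr`."""
    return {t for (s, t) in tr if s in states}


def approximate(states: Set[State], groups: List[List[int]]) -> Set[State]:
    """Over-approximate `states` by the product of its projections on `groups`.

    `groups` must partition the variable indices 0..n-1. Correlations between
    variables that belong to different groups are lost.
    """
    n = sum(len(g) for g in groups)
    parts = [{tuple(s[i] for i in g) for s in states} for g in groups]
    result = set()
    for combo in product(*parts):
        state = [0] * n
        for group, values in zip(groups, combo):
            for i, v in zip(group, values):
                state[i] = v
        result.add(tuple(state))
    return result


def img_plus(tr: Set[Tuple[State, State]], states: Set[State],
             groups: List[List[int]]) -> Set[State]:
    """Approximate image: always a superset of the exact image."""
    return approximate(img(tr, states), groups)
```

With one group per variable the approximation is coarsest; with a single group containing all variables it coincides with the exact image, which mirrors the tunable degree of approximation discussed in the Introduction.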
Related Works
With the advent of SAT-based BMC tools, many researchers have tried to compare SAT-based methods with more traditional BDD-based methods. In this respect, different researchers agree that the two approaches are essentially complementary. For example, in [6] the author compares BDD-based and SAT-based methods using a new algorithm for solving unbounded model checking problems. As a conclusion, he shows that performance strictly depends on the problem instances and that no clear winner can be declared, at least at the moment. Driven by the same conclusions, other researchers have tried to combine the two approaches. In [7] the authors perform reachability analysis by using a SAT engine to create and manipulate a disjunctive partition of the transition relation, and BDDs to represent and manipulate state sets.
To extend approximate traversals to complete checks, many researchers have mixed approximate, exact, forward, and backward traversals [8, 9]. In [8] the authors use a combination of approximate forward and backward reachability analysis. The proposed algorithm attempts to prove the mutual reachability between initial and failure states by iteratively performing over-approximate forward and backward traversals. Each new traversal increases the accuracy of the approximation, and the property is proved whenever a forward (backward) traversal reaches a fix-point outside its target.
Our work shares with these works the idea of focusing and guiding a final search with previous, cheaper, approximate ones. In any case, our method ends with a SAT solver call, as we use the frontier sets of the reachability analysis to guide the SAT solver search.
Proposed Methodology
The main flow of our methodology is represented in Figure 1. Figure 1(a) shows a graphical representation of the standard SAT-based BMC. As introduced in Section 2, to find a path of length k between S and T, a combinational unrolling of the circuit representation, TR, of length k is generated. By adding the expressions for S and T, and performing a proper variable relabeling for TR, a propositional formula is generated in CNF format. The SAT engine is finally run on the resulting problem to solve it. It is useful here to remember that the value of the bound k is generally unknown. This represents one of the drawbacks of the method, as a complete verification requires checks with increasing values of k, usually reaching computational limits.
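The unrolling with variable relabeling can be sketched as follows. This is a toy illustration under our own conventions, not the encoding produced by any of the tools used in the experiments: state bit v at time frame t is mapped to CNF variable t*n + v, the names `shift`, `bmc_cnf` and `brute_sat` are hypothetical, and a brute-force enumeration stands in for the SAT engine.

```python
from itertools import product
from typing import List, Optional, Tuple

Clause = List[int]   # DIMACS-style literals


def shift(clauses: List[Clause], offset: int) -> List[Clause]:
    """Relabel variables by adding `offset` to every variable index."""
    return [[l + offset if l > 0 else l - offset for l in c] for c in clauses]


def bmc_cnf(init: List[Clause], trans: List[Clause], target: List[Clause],
            n: int, k: int) -> List[Clause]:
    """Unroll `trans` k times between `init` (frame 0) and `target` (frame k).

    `init` and `target` use variables 1..n; `trans` uses 1..n for the present
    state and n+1..2n for the next state.
    """
    cnf = [list(c) for c in init]
    for t in range(k):
        cnf += shift(trans, t * n)
    cnf += shift(target, k * n)
    return cnf


def brute_sat(cnf: List[Clause], nvars: int) -> Optional[Tuple[bool, ...]]:
    """Exhaustive satisfiability check, usable only on tiny examples."""
    for bits in product([False, True], repeat=nvars):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in cnf):
            return bits
    return None


# Hypothetical 2-bit counter: x1' = NOT x1, x2' = x2 XOR x1.
N = 2
INIT = [[-1], [-2]]                       # S: state 00
TARGET = [[1], [2]]                       # T: state 11
TRANS = [[1, 3], [-1, -3],                # x3 <-> NOT x1
         [-1, -2, -4], [1, 2, -4], [1, -2, 4], [-1, 2, 4]]   # x4 <-> x1 XOR x2
```

For this counter the target is reached only at bound 3, so the formulas for bounds 0, 1 and 2 are unsatisfiable. This is the situation in which a complete check has to retry with increasing bounds.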
Our basic idea is to help the SAT solver with information coming from a BDD-based reachability analysis tool. In the simplest version we perform a standard forward breadth-first traversal, as indicated in Figure 1(b), or a backward breadth-first traversal, as indicated in Figure 1(c). In reality the applied approach is a little more sophisticated, and it is detailed in Section 4.1. This phase gives an over-estimate of the paths (a "cylinder") leading from S to T, so that all possible real paths are included in this over-approximation. Notice that, at this stage, we work with BDD tools, so each set of states is represented by means of BDDs². From a SAT solver point of view these sets of states constitute a guide for the search. Whereas standard "conflict clauses" define impossible assignments, our state sets define possible assignments, so they can potentially reduce the search space for the SAT solver drastically.
Notice that a minor contribution of applying approximate reachability analysis before the satisfiability analysis is to identify early terminations³, saving computation time. Moreover, the approximate search may also be useful to identify a lower limit for the value of the bound k. This can be useful to avoid useless searches for impossible values of k. Finally, the over-approximation proves particularly useful for high values of the bound.
² Depending on the kind of BDD representation/decomposition used, which we do not discuss for lack of space, the state sets may be represented in monolithic, disjunctive or conjunctive form.
³ Whenever the over-estimation of the reachable state set $R^+$ does not intersect the target set of states T, the property will "pass", and these are also the hardest cases to be proved by the SAT engine. In these cases the approximate reachability phase would consist of a preliminary and free-of-charge check.
The Approximate Reachability Analysis Phase
The approximate traversal phase is very important as far as the resulting space simplification is concerned. From a theoretical point of view, the more accurate the approximate traversal, the closer the sets of states are to the exact ones and the more the SAT solver search space is reduced. From a practical point of view, the more accurate the approximate traversal, the larger the BDDs representing the R sets, and the more likely the translation to CNF formulas (see Section 5) is to introduce a larger number of temporary variables and clauses. As a consequence there is a trade-off between accuracy and usability of the result, and this is also balanced by the cost of computing the over-estimation.
Our basic approach is to perform a forward or a backward over-approximate analysis from S to T or from T to S, as described in Section 4. In more detail, we follow previous approaches on the use of forward and backward approximations [8, 10], and we adopt an iterative refinement process based on a sequence of alternate forward and backward traversals to produce better estimates. The pseudo-code of the procedure is shown in Figure 2. It proceeds through a cycle computing least fix-points in the forward and backward directions, yielding the forward (FR) and backward (BR) reachable state sets. Each forward fix-point is computed by restricting the search with the BR state set, and vice versa. The iteration stops when no further simplification occurs or when the traversal costs, in terms of CPU time and memory usage, exceed a user-defined threshold. The COSTEVALUATE function checks hardware constraints, memory and time costs after each iteration, and the process is stopped whenever the costTh threshold is exceeded.
  [...]
  while (FR ≠ BR AND COSTEVALUATE() ≤ costTh)
  return (BR)
Figure 2: Over-approximate Forward/Backward Traversal.
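The alternating refinement described above can be sketched on explicit state sets as follows. This is not a reconstruction of Figure 2: exact images on Python sets stand in for approximate BDD images, a simple iteration limit (`max_iters`, an assumed parameter) stands in for the COSTEVALUATE/costTh test, and all function names are hypothetical.

```python
from typing import Callable, Hashable, Set, Tuple

State = Hashable
TR = Set[Tuple[State, State]]


def img(tr: TR, states: Set[State]) -> Set[State]:
    return {t for (s, t) in tr if s in states}


def preimg(tr: TR, states: Set[State]) -> Set[State]:
    return {s for (s, t) in tr if t in states}


def lfp(step: Callable[[Set[State]], Set[State]],
        start: Set[State], care: Set[State]) -> Set[State]:
    """Least fix-point of `step` from `start`, restricted to the `care` set."""
    reached = set(start) & care
    while True:
        new = (step(reached) & care) - reached
        if not new:
            return reached
        reached |= new


def fwd_bwd(tr: TR, init: Set[State], target: Set[State],
            universe: Set[State], max_iters: int = 10) -> Set[State]:
    """Alternate forward and backward traversals, each restricted by the other."""
    fr, br = set(universe), set(universe)
    for _ in range(max_iters):               # stand-in for the cost threshold
        new_fr = lfp(lambda r: img(tr, r), init, br)
        new_br = lfp(lambda r: preimg(tr, r), target, new_fr)
        if (new_fr, new_br) == (fr, br):
            break                            # no further simplification
        fr, br = new_fr, new_br
    return br
```

With exact images the result converges to the set of states lying on some path from the initial set to the target set; with approximate images it would be a superset of that set, which is the "cylinder" handed to the SAT solver.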
Using Symbolic State Sets Information
Once we have our set of states over-estimating a path from S to T we can apply two possible strategies:
Adding the new information to the original CNF problem.
Using each single "ring" of the estimate, $R^+$, to simplify the circuit representation at the corresponding time frame and then create the CNF problem using the simplified instances.
In the first case the approach is straightforward, as it is enough to append the sets of states, represented as monolithic or decomposed BDDs, as a set of CNF clauses. In the second case, we try to perform some simplifications before the CNF problem formulation, using cofactor-based techniques on the BDD representations of the circuit and the state sets.
Notice that, in this respect, the paths found by the over-approximation are always shorter than the exact ones.
Proceedings of the Design, Automation and Test in Europe Conference and Exhibition (DATE'03) 1530-1591/03 $17.00 © 2003 IEEE
Notice that in all the cases the counter-example eventually obtained possibly includes some temporary variables generated by the BDD to CNF translation process. So we need to bring it back to the original representation space, by quantifying out the temporary variables (see Section 5).
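The first strategy, appending the state sets to the original CNF problem, can be illustrated with the following toy sketch. Everything in it is an assumption made for illustration: each ring is given directly as clauses over the state variables 1..n (skipping the BDD-to-CNF translation of Section 5), frame i maps state bit v to variable i*n + v, and `shift`, `add_rings` and `models` are hypothetical helper names.

```python
from itertools import product
from typing import List, Tuple

Clause = List[int]


def shift(clauses: List[Clause], offset: int) -> List[Clause]:
    """Relabel variables by adding `offset` to every variable index."""
    return [[l + offset if l > 0 else l - offset for l in c] for c in clauses]


def add_rings(bmc: List[Clause], rings: List[List[Clause]], n: int) -> List[Clause]:
    """Append ring i (clauses over state variables 1..n) at time frame i."""
    extra: List[Clause] = []
    for i, ring in enumerate(rings):
        extra += shift(ring, i * n)
    return bmc + extra


def models(cnf: List[Clause], nvars: int) -> List[Tuple[bool, ...]]:
    """Brute-force enumeration of satisfying assignments (tiny examples only)."""
    return [bits for bits in product([False, True], repeat=nvars)
            if all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in cnf)]


# Hypothetical 1-bit toggle x' = NOT x, unrolled for bound 2 (variables 1, 2, 3).
N = 1
TRANS = [[1, 2], [-1, -2]]
BMC = [[-1]] + shift(TRANS, 0) + shift(TRANS, 1) + [[-3]]   # S: x = 0, T: x = 0
RINGS = [[[-1]], [[1]], [[-1]]]     # state set allowed at frames 0, 1 and 2
CONSTRAINED = add_rings(BMC, RINGS, N)
```

Because every ring contains all the states that a real path can occupy at its time frame, the added clauses never remove a genuine counter-example; they only give the solver unit or short clauses from which to propagate early.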
Dumping BDDs as CNF formulas
Given a BDD representing a function f in monolithic or conjunctive form, we develop three possible ways to store it as a CNF formula.
1. The first method, which we call Single-Node-Cut, models each BDD node, except those with both children equal to the constant node 1, as a multiplexer. Each multiplexer has two data inputs (i.e., the children nodes), a selection input (i.e., the node variable) and one output (i.e., the function value) whose value is assigned to an additional CNF variable. The final number of variables is equal to the number of original BDD variables plus the number of "internal" nodes of the BDD.
2. The No-Cut method creates clauses starting from the "off-set" (i.e., the set of cubes from the root to the terminal node zero) of the function f. Within the BDD for f, such clauses are found by following all the paths from the root node of the BDD to the constant node 0. The final number of variables is equal to the number of original BDD variables.
3. The Auxiliary-Variable-Cut method is a trade-off between the first two strategies. Internal variables, i.e., cut points, are added in order to decompose the BDD into multiple sub-trees, each of which is stored following the second strategy. The trade-off is guided by a cut point selection strategy, and we experimented with two methodologies. In the first one, a new CNF variable is inserted in correspondence with the shared nodes of the BDD, i.e., the nodes which have more than one incoming edge. This technique, albeit reducing the total number of literals stored, can produce clauses with a high number of literals. To avoid this drawback, the second method introduces all the previously indicated cut points plus the ones necessary to limit the length of each path to a maximum (user-selected) value.
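As an illustration of the No-Cut idea, the following sketch enumerates the paths from the root to the terminal node 0 and negates each path cube to obtain a clause. It assumes a simplified BDD representation of our own (a dictionary of named nodes, terminals 0 and 1, no complement edges) rather than the data structures of a real BDD package, and `no_cut` and `satisfies` are hypothetical names.

```python
from itertools import product
from typing import Dict, List, Sequence, Tuple, Union

Node = Union[int, str]                    # 0 and 1 are the terminal nodes
BDD = Dict[str, Tuple[int, Node, Node]]   # name -> (variable, low child, high child)


def no_cut(bdd: BDD, root: Node) -> List[List[int]]:
    """One clause per root-to-0 path; no auxiliary variables are introduced."""
    clauses: List[List[int]] = []

    def walk(node: Node, path: List[int]) -> None:
        if node == 0:
            # The path is a cube of the off-set; its negation is a clause.
            clauses.append([-lit for lit in path])
        elif node != 1:
            var, low, high = bdd[node]
            walk(low, path + [-var])      # variable set to 0 along this edge
            walk(high, path + [var])      # variable set to 1 along this edge

    walk(root, [])
    return clauses


def satisfies(cnf: List[List[int]], bits: Sequence[bool]) -> bool:
    """True if the assignment `bits` (bits[i] is the value of variable i+1) satisfies `cnf`."""
    return all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in cnf)


# Hypothetical example: f = x1 AND (x2 OR x3), with variable order x1, x2, x3.
EXAMPLE: BDD = {
    "n3": (3, 0, 1),
    "n2": (2, "n3", 1),
    "n1": (1, 0, "n2"),
}
CNF = no_cut(EXAMPLE, "n1")
```

The recursion visits every path explicitly, so its running time (like the number of clauses) can grow exponentially with the number of BDD nodes when many paths share nodes.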
Actually, all the methods described above can be brought back to the basic idea of possibly breaking the BDD through the use of additional cutting variables and dumping the paths between the root of the BDD, the cutting variables and the terminal nodes. Such internal cutting variables are added always (for each node), never, or sometimes, respectively. The Single-Node-Cut method minimizes the length of the clauses produced, but it also requires the highest number of CNF variables, whereas the No-Cut technique minimizes the number of CNF variables required. This advantage is counter-balanced by the fact that in the worst case the number of clauses, as well as the total number of literals, produced is exponential in the BDD size (in terms of number of nodes). The application of this method is then limited to the cases in which the off-set of the represented function has a small cardinality. The Auxiliary-Variable-Cut strategy is a trade-off between the first two methods and the one which gives the most compact results. As a final remark, notice that for us the compactness of the formula takes second place after the efficiency of the SAT engine on the formula itself.
(The number of literals in a clause is bounded above by the number of variables of the BDD, i.e., by the longest path from the root to the terminal node.)
Figure 3 shows an example of how our procedure works to store a small monolithic BDD. Figure 3(a) represents the BDD. BDD variables are named after integer numbers starting from 1, to have an easy-to-follow correspondence with the CNF variables. Figure 3(b), (c) and (d) show the corresponding CNF representations generated by our three methods. As in the standard format, p indicates the total number of variables used (the number of BDD variables is the minimum value), and cnf the total number of clauses.
Example 1
As a final remark, notice that for this specific example the "No-Cut" approach is the one which gives the most compact CNF representation, but also the clause with the largest number of literals.
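For comparison, the Single-Node-Cut idea (one multiplexer, hence one auxiliary CNF variable, per internal node) can be sketched as below. As before this is an illustrative sketch under our own assumptions: the BDD is a dictionary of named nodes with terminals 0 and 1 and no complement edges, nodes whose children are both constants are replaced by a plain literal instead of a multiplexer, and `single_node_cut` and `neg` are hypothetical names. It does not reproduce the example of Figure 3.

```python
from itertools import product
from typing import Dict, List, Tuple, Union

Node = Union[int, str]                    # 0 and 1 are the terminal nodes
BDD = Dict[str, Tuple[int, Node, Node]]   # name -> (variable, low child, high child)


def neg(lit):
    """Negate a CNF literal or a Boolean constant."""
    return (not lit) if isinstance(lit, bool) else -lit


def single_node_cut(bdd: BDD, root: Node, nvars: int) -> Tuple[List[List[int]], int]:
    """Return (clauses, number of CNF variables) for the function rooted at `root`."""
    lit = {0: False, 1: True}     # node -> Boolean constant or CNF literal
    clauses: List[List[int]] = []
    count = [nvars]               # index of the last CNF variable in use

    def emit(clause) -> None:
        if any(l is True for l in clause):
            return                                    # clause already satisfied
        clauses.append([l for l in clause if l is not False])

    def encode(node: Node):
        if node in lit:
            return lit[node]
        var, low, high = bdd[node]
        l, h = encode(low), encode(high)
        if l is False and h is True:
            lit[node] = var       # both children constant: no multiplexer needed
        elif l is True and h is False:
            lit[node] = -var
        else:
            count[0] += 1
            z = count[0]          # multiplexer output: z <-> (h if var else l)
            emit([-var, neg(h), z])
            emit([-var, h, -z])
            emit([var, neg(l), z])
            emit([var, l, -z])
            lit[node] = z
        return lit[node]

    emit([encode(root)])          # assert that the function evaluates to 1
    return clauses, count[0]


# Hypothetical example: f = x1 AND (x2 OR x3), with variable order x1, x2, x3.
EXAMPLE: BDD = {
    "n3": (3, 0, 1),
    "n2": (2, "n3", 1),
    "n1": (1, 0, "n2"),
}
CNF, NVARS = single_node_cut(EXAMPLE, "n1", 3)
```

On this small function the encoding uses two auxiliary variables and clauses of at most three literals, whereas the path-based encoding needs no auxiliary variable; on BDDs with many shared nodes the balance typically shifts in favor of the multiplexer encoding, as discussed above.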
Experimental Results
Our experimental set-up is made up of three distinct phases. During the first phase, we start from the ISCAS'89 and ISCAS'89-addendum benchmark circuits in Verilog or blif format, and the properties taken from [10]. From these source files we generate the BMC-CNF formulation of the problem using four different tools: the publicly available VIS, NUSMV and BMC, and a home-made generator. Our package is able to generate CNF formulas both from the original network of the circuit and from its transition relation representation. The generated CNF problem is stored as a standard DIMACS CNF file. While NUSMV, VIS and our tool produce similar results, BMC usually produces CNF files that are 20-30% more compact, both in terms of clauses and (intermediate) variables. Nevertheless, the BMC tool does not store the variable correspondence, which we need in order to be consistent with our reachability analysis information; as a consequence we do not report further experiments with this tool at the moment. Among the others, we always report the best results in terms of performance of the SAT solver.
During the second phase, we generate the set of approximate reachable state sets for the circuit. In this phase we use both the VIS tool and, again, our home-made tool. Although VIS implements almost all the approximate traversal algorithms presented in the literature, we need the over-approximation of the reachable state set at the same bound level for all sub-machines. As a consequence we need a variant of the original Machine By Machine (MBM) algorithm, with or without overlapping projections. Our tool, implemented on top of the Colorado University Decision Diagram (CUDD) package, implements the approximate verification method presented in Section 4.1. Once we have generated the BDDs for the over-approximation, we store them in different formats following the methodology reported in Section 5.
During the third and last phase we run the CHAFF SAT engine (both the MCHAFF and the ZCHAFF versions) on the two problem instances, i.e., the original problem formulation and the one generated by merging in the information coming from the reachability analysis phase. Notice that in all cases CHAFF is run with the default settings.
Our experiments ran on a Pentium IV 1700 MHz Workstation with 1 GByte of main memory, running RedHat Linux 7.1. Tables 1 and 2 report our results. The meaning of the columns for the two tables is the following. # SV is the number of state variables in the model. # Clauses, # Vars and # Lits represent respectively the number of clauses, variables and literals in the problem (they are reported as absolute or relative values). D.M. represents the method we used to dump the BDDs for the states obtained during the approximate traversal ("S" stands for the Single-Node-Cut method, "N" for the No-Cut and "A" for the Auxiliary-Variable-Cut).
# Decs and # Confl. represent the total number of decisions taken and conflicts produced by the SAT solver. Finally, Mem. and Time indicate respectively memory occupation (in MBytes) and CPU time (in seconds). As far as our technique is concerned, the Setup time includes the time to perform the reachability analysis phase and the time to generate the new problem formulation, and the reported memory is mainly the one used by the SAT solver, as the one used during the traversal phase is usually much smaller.
For circuit s1512 we report some comparison among different possible settings. First of all, for the property P3, we store reachable state sets as CNF formulas using the different implemented methods (rows labeled A, S, and N). Secondly, we present some results targeting the influence of the accuracy of R on the CNF problem size. Rows S*, N* and A* report results obtained with a more approximate traversal than the one used for rows S, N and A. For this property the set of generated clauses is larger than the original set, by more than 1000% in the worst case. More approximate reachable state sets give more compact representations, as initially supposed. In Table 2 we report only the best SAT results, which correspond to the A storing method of Table 1.
Among the other experiments performed, we try to add BDD traversal information to the original problem formulation in different ways (at the bottom of the file, at the top, each reachable state set exactly before or after the corresponding transition relation, following the original order (from S to T) or a reverse order, etc.). We obtain small differences, in the range of 10-20% in terms of CPU time, and, for lack of space, we do not report evidence on that issue.
Moreover, we also try to guide the SAT solver variable selection, in the Variable Decision phase, with some indications on the variable order and variable coupling derived from the reachability analysis tool. Also in this case we observe slight variations in the SAT engine performance, in the order of about 10%, and we again do not report evidence on that.
Conclusions and Future Works
In this paper we propose to exploit inexpensive symbolic approximate forward and/or backward reachability analysis to restrict the overall search space of a SAT-solver engine.
We experimentally compare the resulting problem formulation with the original one and show its power in terms of problem simplification and generality.
Among the possible future work, we surely need some more experimental work on public domain and industrial benchmarks and difficult-to-find bugs. Moreover, we would like to investigate smarter ways to prune the SAT-engine search space using the information coming from the over-approximate estimate.
