Abstract-This paper proposes a new approach to multilevel logic optimization based on automatic test pattern generation (ATPG). It shows that an ordinary test generator for single stuck-at faults can be used to perform arbitrary transformations in a combinational circuit and discusses how this approach relates to conventional multilevel minimization techniques based on Boolean division. Furthermore, effective heuristics are presented to decide which network manipulations are promising for minimizing the circuit. By identifying indirect implications between signals in the circuit, transformations can be derived which are "good" candidates for the minimization of the circuit. A main advantage of the proposed approach is that it operates directly on the structural netlist description of the circuit so that the technical consequences of the performed transformations can be evaluated easily, permitting better control of the optimization process with respect to the specific goals of the designer. Therefore, the presented technique can serve as a basis for optimization techniques targeting nonconventional design goals. This has already been shown for random pattern testability [11] and low-power consumption [28]. This paper only considers area minimization, and our experimental results show that the method presented is competitive with conventional technology-independent minimization techniques. For many benchmark circuits, our tool, the Hannover implication tool based on learning (HANNIBAL), achieves the best minimization results published to date. Furthermore, the optimization approach presented is shown to be useful in formal verification. Experimental results show that our optimization-based verification technique works robustly for practical verification problems on industrial designs.
I. INTRODUCTION

MULTILEVEL logic optimization figures prominently in the synthesis of highly integrated circuits. The goal of multilevel logic optimization is to transform an arbitrary combinational circuit into a functionally equivalent circuit that is less expensive according to some cost function. The cost function typically incorporates area, speed, power consumption, and testability as the main objectives of the optimization procedure. This research focuses on optimizing a given circuit with respect to its area, since a minimal-area representation of the circuit is a good basis for subsequent steps targeting high speed, low power consumption, and high testability.
The field of multilevel logic optimization is not as well delineated as the field of two-level optimization [7], and there exist many different views on the multilevel optimization problem. An early systematic approach was proposed by Ashenhurst and Curtis [2], [13] and is known as functional decomposition. Functional decomposition, in general terms, is the process of expressing a switching function of n variables as a composition of a number of functions, each depending on fewer than n variables. Due to their complexity, early methods based on functional decomposition have been of limited use in practice. However, research in this area is still active. Recent contributions [19], [25], [26], [38] are encouraging and have proven to be very useful in field programmable gate array (FPGA) synthesis.
Presently, the most flexible and powerful synthesis techniques for combinational circuits are based on Boolean and algebraic manipulations of Boolean networks, pioneered by Brayton et al. [8] . Since they provide good optimization results and can handle circuits of realistic size, these methods have become widely accepted.
Even with much recent progress, e.g., [3], [8], [10], [14], [16], [18], [19], [25]-[27], [31], [32], [34], and [38], the size and complexity of today's integrated circuits leave multilevel logic optimization a major challenge in the field of computer-aided circuit design. In particular, high memory requirements represent the dominating limitation for many methods.
An important attribute of most common synthesis procedures is that they divide the synthesis process into a technology-independent minimization phase and a cell-binding procedure which maps the design to a specific target technology (technology mapping). However, the strict separation of logic minimization from the specific technical design information can sometimes be a disadvantage, since the powerful concepts for deriving circuit transformations cannot be guided by the specific technical data.
Therefore, an important goal of this research is to work toward general logic minimization techniques which operate directly on the structural gate netlist description of the circuit so that the specific technological information of the given gate library is immediately available to guide the optimization process.
Our work is motivated by recent advances in test generation. Over the years, considerable progress has been achieved in combinational ATPG, and it seems wise to utilize the power of modern ATPG methods in synthesis as well. ATPG methods are attractive for two reasons. First, in order to obtain effective test sets, ATPG techniques operate directly on a gate-level description of the circuit. Second, ATPG methods are very memory efficient and typically have memory requirements linear in the size of the gate-level description.
0278-0070/97$10.00 © 1997 IEEE
An important contribution exploiting testing techniques in logic minimization has recently been presented by Entrena and Cheng [16] . They propose an extension to redundancy removal (see, e.g., [1] ) and describe an effective method which is based on adding and removing connections in the circuit. The approach to be presented in this paper can be seen as a generalization of the technique in [16] applied to combinational circuits. The advantage of operating directly on the gate netlist has also been recognized by Rohfleisch and Brglez [32] who presented a technique based on permissible bridges which can effectively optimize a circuit after technology mapping.
The methods of [16] and [32] have been shown to be very useful for postprocessing networks that were preoptimized by traditional techniques. On the other hand, they only consider a restricted set of possible network manipulations and, therefore, do not provide the same reasoning power and flexibility as traditional technology-independent synthesis methods.
Therefore, our goal is to develop a multilevel logic minimization method which is competitive with technology-independent minimization techniques such as [8] and which uses a test generator as the basic Boolean reasoning engine. In this paper, we present a method which is general in the sense that, in principle, it can derive arbitrary transformations in a combinational network. The second contribution of this paper is to introduce a new heuristic guidance for logic optimization. We show how logic minimization (for area) can be guided effectively by a single heuristic concept: the optimization process can be controlled by analyzing implications between circuit nodes. The complexity of reasoning required to derive logic implications is seen to be related to the optimality of the circuit structure and is used in optimization.
A major strength of our method is that it efficiently identifies or creates permissible functions [27] . Therefore, it relates to Muroga's transduction method. There are two main ingredients to our method: the D-calculus of Roth [33] and Recursive Learning [22] . The latter, which is discussed briefly in Section II, is used to derive logic implications in combinational circuits. Analyzing implications is crucial for deriving good circuit transformations. In this aspect our method also relates to [3] and [18] .
Throughout the paper, we attempt to relate the concepts of our ATPG-based method to common concepts in logic synthesis ("division," "permissible functions," "don't cares," and "common kernel extraction").
As pointed out in [8], "division" is central to Boolean/algebraic methods of logic optimization. For example, take the function . A simpler representation of the same function is . This representation can be obtained by defining a division operation ' ' such that . The expression is referred to as a "divisor" of . When developing an ATPG-based method for logic minimization, the following two central issues have to be addressed.
1) How can an ATPG-based method perform Boolean division?
2) How can an ATPG-based method provide "good" divisors?
The paper is outlined as follows. The first of the above questions will be addressed in Section III, and the second question is discussed in Section IV. Section V describes and illustrates the general flow of our optimization procedure. Section VI is dedicated to an unconventional application of an optimization technique; we formulate the formal logic verification problem as an optimization problem and demonstrate how the described method can be tailored for logic verification. Section VII shows experimental results.
II. INDIRECT IMPLICATIONS
The optimization method to be presented heavily depends on analyzing implications derived by recursive learning [22] . A more general method to determine implicants in a multilevel network based on AND/OR graphs has been presented recently in [36] . The optimization procedure described in the following sections does not yet exploit the concepts of [36] but only uses recursive learning. Some previous results and terminology are briefly summarized. Recursive learning is a method to determine all value assignments which are necessary for the detection of a single stuck-at fault in a combinational circuit. This involves finding all value assignments necessary for the consistency of a given assignment of values to a set of nodes in the circuit. Determining value assignments necessary for the consistency of a given set of value assignments is often referred to as performing implications.
Consider the gate-level circuit of Fig. 1 . Assume that the value assignments have been made in the circuit. By considering the truth table of an AND-gate, we imply . The variable is an input variable of , and by another implication, we obtain . Variables and are input variables of , and we perform the implications and . In [22] , this type of implication has been referred to as direct implication.
As defined in [22] , direct implications are identified by evaluating the value assignments at each gate and by propagating the signal values according to the connectivity in the circuit. An implication which cannot be determined in this simple way has been called indirect.
While the performance of direct implications is a straightforward procedure, it is more difficult to perform implications which are not direct. Reconsider the circuit in Fig. 1 and assume a value assignment of . A closer study reveals that implies [35] . The implication is not direct, and more sophisticated techniques are required to derive such indirect implications. Recursive learning as presented in [22] represents a technique which allows us to derive all direct and indirect implications for a given situation of value assignments.
Indirect implications play an important role in our strategy for circuit optimization. As will be shown, indirect implications identify promising divisors for transforming the circuit. For a more detailed description on how the reasoning in recursive learning can be used to identify circuit transformations, see also [36] .
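The difference between direct and indirect implications can be made concrete with a small sketch. The code below is not the recursive learning routine of [22]; it is a simplified, depth-limited illustration written for this section, and the circuit `CIRCUIT`, the function names `direct_imply` and `learn`, and the `Conflict` exception are all hypothetical. Direct implications are obtained by applying local gate rules until nothing changes; the learning step tries every justification of an unjustified gate and keeps the assignments common to all consistent justifications.

```python
CTRL = {"AND": 0, "OR": 1}  # controlling input value for each gate type

class Conflict(Exception):
    """Raised when a value assignment is inconsistent."""

def direct_imply(circuit, assign):
    """Apply local gate rules (forward and backward) until a fixed point."""
    vals = dict(assign)
    changed = True

    def put(sig, v):
        nonlocal changed
        if vals.get(sig, v) != v:
            raise Conflict(sig)
        if sig not in vals:
            vals[sig] = v
            changed = True

    while changed:
        changed = False
        for out, (typ, ins) in circuit.items():
            if typ == "NOT":
                (i,) = ins
                if i in vals:
                    put(out, 1 - vals[i])
                if out in vals:
                    put(i, 1 - vals[out])
                continue
            c = CTRL[typ]
            known = [vals.get(i) for i in ins]
            if c in known:                 # one controlling input fixes the output
                put(out, c)
            elif None not in known:        # all inputs non-controlling
                put(out, 1 - c)
            if vals.get(out) == 1 - c:     # output forces all inputs
                for i in ins:
                    put(i, 1 - c)
            elif vals.get(out) == c:       # last free input must justify the output
                unknown = [i for i in ins if i not in vals]
                if c not in known and len(unknown) == 1:
                    put(unknown[0], c)
    return vals

def learn(circuit, assign, depth):
    """Depth-limited learning: keep what is common to all justifications."""
    vals = direct_imply(circuit, assign)
    if depth == 0:
        return vals
    progress = True
    while progress:
        progress = False
        for out, (typ, ins) in circuit.items():
            if typ == "NOT" or vals.get(out) != CTRL[typ]:
                continue
            c = CTRL[typ]
            if any(vals.get(i) == c for i in ins):
                continue                   # gate is already justified
            branches = []
            for i in ins:
                if i in vals:
                    continue
                try:
                    branches.append(learn(circuit, {**vals, i: c}, depth - 1))
                except Conflict:
                    pass                   # this justification is inconsistent
            if not branches:
                raise Conflict(out)
            common = {s: v for s, v in branches[0].items()
                      if all(b.get(s) == v for b in branches)}
            if any(s not in vals for s in common):
                vals = direct_imply(circuit, {**vals, **common})
                progress = True
    return vals

# Hypothetical example circuit (not Fig. 1): y = (a AND b) OR (NOT a AND b)
CIRCUIT = {
    "na": ("NOT", ["a"]),
    "p": ("AND", ["a", "b"]),
    "q": ("AND", ["na", "b"]),
    "y": ("OR", ["p", "q"]),
}
```

In this toy circuit, `direct_imply(CIRCUIT, {"y": 0})` assigns only `p` and `q`, whereas `learn(CIRCUIT, {"y": 0}, 1)` additionally finds `b = 0`. The implication from `y = 0` to `b = 0` is therefore indirect in the sense used above, and it exists only because the circuit realizes the function `y = b` in a redundant way.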
III. MANIPULATING COMBINATIONAL NETWORKS BY ATPG
Assume we are given a combinational circuit with primary inputs and primary outputs and containing only the primitive gates AND, OR, and NOT. The AND- and OR-gates can only have two inputs. These restrictions are made in order to simplify the theoretical analysis of our method. In the following, such circuits will be referred to as combinational networks. (Of course, a reasonable implementation of our approach can also handle multi-input gates including NAND, NOR, and possibly XOR.) Furthermore, signals in the circuit can have constant values of '0' or '1.' All gates in the circuit have unique labels, and their output signals realize Boolean functions with , where the variables correspond to the primary input signals of the circuit . Following the usual representation of a combinational circuit as a directed acyclic graph (DAG), we say, as in [8], that a signal lies in the transitive fanout of if and only if there exists a directed path from to in the image of as a DAG. Avoiding formalism, and depending on the context, we will refer to the primary input signals and the output signals of the gates in circuit as "signals," "functions," or "nodes." Furthermore, we assume that there are no external don't cares; the function of the combinational network with is completely specified. An extension of our method using external don't cares is possible but will not be considered further in this work.
Shannon's expansion can be understood as a special case of an orthonormal expansion [6] where the functions represent an orthonormal basis, i.e., i) ;
ii) 
The terms and denote the cofactors of this expansion. In the special case of Shannon's expansion, the cofactors are chosen by restricting the original function with respect to a particular variable, as in (1) . We obtain the cofactor for with respect to a variable by setting in the expression for , similarly, the cofactor for results when setting . Note that there is no such simple rule in the more general case of (2) .
Let the cofactors be denoted , with . Further, let denote an incompletely specified function . The cofactors in (2) must be chosen such that the following equation holds: if X (don't care) otherwise. (3)
It is easy to see why (3) is true. Assume that the truth table of is divided into two parts such that is false for all rows in the first part and true for all rows in the second part. If we first consider the part of the truth table of for which is true, we can set to the don't care value for all rows in which is false. This means that the cofactor function must only have the same value as in those rows where is not don't care. Therefore, any valid cofactor for the expansion of (2) covers (denoted ' ') the incompletely specified function as given by (3). This first part of the function is described by the expression . In the second part, we are looking at those rows of the truth table for which is false and obtain .
Equation (2) is the basis of our approach to transforming a combinational network. In order to relate our approach to the Boolean/algebraic techniques of [8], we can refer to function as divisor of . Similarly, can be referred to as quotient, and represents the remainder of the division. Further, note that the combined don't care sets of the two cofactors in (3) are identical to the don't care set passed to a minimization algorithm for Boolean division, described in [8]. The main issue in our approach, as well as in [8], is to find appropriate (divisor) functions such that the internally created don't cares as given by (3) provide "degrees of freedom" in the combinational network which can be exploited to minimize its area.
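For orientation, the two-function expansion discussed above can be written out in one possible notation. The symbols used here are placeholders introduced for this summary and need not match the symbols of (1)-(3): $y$ is the function being expanded, $f$ the divisor, $g_1$ and $g_0$ the chosen cofactors, and $y_1$, $y_0$ the incompletely specified functions that the cofactors must cover.

```latex
\begin{align*}
  f \cdot \bar{f} &= 0, \qquad f + \bar{f} = 1
      && \text{(the basis $\{f, \bar{f}\}$ is orthonormal)} \\
  y &= f \cdot g_1 + \bar{f} \cdot g_0
      && \text{(expansion of $y$ in terms of $f$)} \\
  y_i(x) &=
    \begin{cases}
      y(x) & \text{if } f(x) = i \\
      X \ (\text{don't care}) & \text{otherwise}
    \end{cases}
      && i \in \{0, 1\} \\
  g_i(x) &= y(x) \ \text{ whenever } f(x) = i
      && \text{($g_i$ covers $y_i$)}
\end{align*}
```

Under these assumptions, Shannon's expansion is the special case in which $f$ is a single input variable $x_k$ and the cofactors are obtained by restriction, $g_1 = y|_{x_k = 1}$ and $g_0 = y|_{x_k = 0}$. The trivial choice $g_1 = g_0 = y$ is also a valid cover, since it simply ignores the don't cares.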
Obviously, the result of such an orthonormal expansion (or Boolean division) depends on how the don't cares are used in order to minimize the circuit. (Boolean division is not unique.) In [8], the don't cares are explicitly passed to an optimization run by ESPRESSO. The approach to be described here proceeds in a different way and uses a test generator to determine the cofactors in the above expansion. As already observed by Brand in [4], circuitry tends to have an increased number of untestable single stuck-at faults if it is not properly optimized with respect to a given don't care set. This suggests that the don't cares created by the expansion of (2) can also cause untestable stuck-at faults which can be removed by the standard procedure of redundancy elimination. In fact, redundancy elimination is a simple way to minimize the circuit with respect to don't care conditions. Note that redundancy elimination does not require any explicit knowledge about the don't care sets. Throughout this paper, transformations are examined that create internal don't cares. However, these don't care conditions are not explicitly calculated or represented. They are only considered in our theoretical analysis to illuminate where the redundant faults to be eliminated come from.
Example 3.1: To illustrate how don't cares as given by (3) lead to untestable stuck-at faults, consider Shannon's expansion as an example, i.e., take the special case where the divisor is some variable . Note that the original function is a possible cover for both and so that according to (2) we can form the expression . In Fig. 2 (b), this is implemented as combinational circuit for the example, and . Note that the choice of the original function as a (trivial) cofactor ignores the don't care conditions as given by (3) . The fact that the cofactors are not optimized with respect to these don't cares leads to untestable stuck-at faults as indicated. (The cofactors are shaded grey). It is determined by ATPG that , stuck-at-one, and , stuck-at-zero, in the respective cofactor are untestable and can be removed by setting to a constant one or zero, respectively. Fig. 2(c) shows the circuit after redundancy removal. Redundancy removal in this case obviously corresponds to setting to one or zero in the respective cofactors of (1) .
By viewing redundancy elimination as a method to set signals in cofactors to constant values, we have just described an ATPG-based method to perform a Shannon expansion. Clearly, it is not sensible to use a test generator in order to prove that in the Shannon expansion of (1) can be set to constant values. However, this ATPG interpretation of Shannon's expansion is quite useful in the more general case of (2), i.e., when we expand in terms of some arbitrary function . In the general case, it is a priori not known if and what signals in the cofactors can be set to constant values. This, however, can be determined by means of a test generator.
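The ATPG view of Shannon's expansion can be checked exhaustively on a toy function. The sketch below is an illustration only: the function `y`, the helper names `expanded` and `untestable`, and the way a stuck-at fault is injected are assumptions made for this example and do not reproduce the circuit of Fig. 2. A fault is treated as untestable if no input vector distinguishes the faulty circuit from the fault-free function, which is what a complete test generator would establish.

```python
from itertools import product

def y(x, a, b):
    """Hypothetical example function: y = x*a + (NOT x)*b."""
    return (x & a) | ((1 - x) & b)

def expanded(x, a, b, x_in_pos=None, x_in_neg=None):
    """Trivial expansion x*y + (NOT x)*y with the original y as both cofactors.

    x_in_pos / x_in_neg optionally force the copy of x that feeds the
    positive / negative cofactor to a constant (a stuck-at fault there).
    """
    xp = x if x_in_pos is None else x_in_pos
    xn = x if x_in_neg is None else x_in_neg
    return (x & y(xp, a, b)) | ((1 - x) & y(xn, a, b))

def untestable(**fault):
    """True if no input vector exposes the injected fault at the output."""
    return all(expanded(x, a, b, **fault) == y(x, a, b)
               for x, a, b in product((0, 1), repeat=3))
```

Here `untestable(x_in_pos=1)` and `untestable(x_in_neg=0)` both hold, so the variable can be replaced by constant one in the positive cofactor and by constant zero in the negative cofactor, which is exactly the restriction performed by Shannon's expansion. The opposite faults, for example `untestable(x_in_pos=0)`, are testable and must not be removed.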
Let be an arbitrary node in a combinational network and be some Boolean function represented as a combinational network. The variables of may or may not be nodes of the combinational network . A new combinational network is constructed as follows. We duplicate all nodes in the transitive fanin of so that there are two implementations of node . This has been illustrated in Fig. 2(a) and 2(b). One version is ANDed with , the other version is ANDed with , and the outputs of the AND gates are combined by an OR gate whose output replaces the node in the original network. In the following, this construction will be represented by the equation . Letting and be Boolean functions represented as a combinational network, we propose to expand function in terms of function by the following method:
2) redundancy elimination with an appropriate fault list. (4)
This expansion can also be understood as a special ATPG-based transduction [27] as it consists of a transformation and a reduction. In the following, we use the terms expansion and transduction synonymously. Since this ATPG-based transduction is one out of many possibilities to perform a Boolean division or orthonormal expansion in a combinational network, it is important to investigate what network transformations are theoretically possible using it. In the following theorem, we prove that the construction of (4)
Equation (4) redundancy elimination for stuck-at-one fault at in second summand:
redundancy elimination for stuck-at-one at signal with constant one: 7) Unary operation: for:
Equation (4) redundancy elimination:
redundancy elimination: 1 Equation (4) redundancy elimination:
In order to complete the proof, it must be shown that the above expansion also allows arbitrary sharing of logic. This follows easily from the following construction. Let be the original network and be the target network. Further, let denote a network that has tree structure and results from if all sharing of logic is removed by duplication. Similarly, let denote the tree version of the target network. Consider the following construction. First we remove all sharing of logic between the different output cones of the original network so that we obtain . It is easy to derive by the above expansion. Let be some internal fanout branch and assume its stem is the output of an AND gate with input signals and . By choosing a divisor and by performing the above expansion with an appropriate fault list, the AND gate is duplicated, and the fanout point is moved to the inputs of the AND gate. For other gate types, the procedure is analogous. This process is repeated until no more internal fanout points exist and has been obtained. After all sharing of logic has been removed, each output cone is isomorphic to a Boolean expression that can be manipulated arbitrarily as shown using the above axioms. Therefore, it is also possible to obtain the network by the above expansion. The target network results if the duplicated logic is removed. This can be accomplished if equivalent nodes are substituted. If node is to be substituted by node , this can be accomplished by selecting and performing the above expansion. This process can be repeated for well-selected nodes in until network is reached.
Suppose is the given combinational network and is the combinational network which is optimal with respect to the given cost function. Theorem 3.1 states that there always exists a sequence of the specified expansion operations such that the optimal combinational network is obtained.
However, it does not say which divisors shall be used when applying (4) . As stated in the theorem, if the network has gates with no more than two inputs, it is sufficient to only consider divisors created as function of two nodes in the network. This reduces the number of divisors that (theoretically) have to be examined. Of course, this restriction does not imply that more complex divisors are of no use in the presented expansion scheme. If more complex divisors are used, the network is transformed in bigger steps. Theorem 3.1 does not put any restriction on the choice of divisors to transform the network. Further degrees of freedom for the expansions lie within redundancy elimination. The result of redundancy elimination depends on what faults are targeted and in which order they are processed.
Theorem 3.1 represents the theoretical basis of a general ATPG-based framework for logic optimization. As mentioned, redundancy elimination and the transformation of (4) per se do not represent an optimization technique. However, they provide the basic tool kit to modify a combinational network. In order to obtain good optimization results, efficient heuristics have to be developed to decide which divisors to choose and how to set up the fault list for redundancy elimination. This will be described in the following.
IV. IDENTIFYING DIVISORS BY IMPLICATIONS
Our method of identifying divisors has been motivated by an observation first mentioned in [30] . Indirect implications indicate suboptimality in the circuit. This is illustrated in Fig. 3 .
In the left circuit of Fig. 3, we consider as the initial situation of value assignments for which we can indirectly imply . This can be accomplished by means of recursive learning. Note that the existence of the indirect implication is due to the fact that the circuit is not properly optimized. In the optimized right circuit, which is functionally equivalent to the left circuit, we note that the implication is direct. One may verify that all examples of indirect implications shown in [22] or [35] are also due to poorly optimized circuitry. Apparently, indirect implications are a key to identifying and optimizing suboptimal circuitry.
Before developing an optimization strategy based on distinguishing between direct and indirect implications, we first study the role of implications in general for multilevel minimization.
Consider again the example of Fig. 3. For the above expansion, the circuit transformation of (4) requires that all combinational circuitry in the transitive fanin of is duplicated before redundancy elimination is applied. This seems impractical, and in the following we therefore consider special cases of the expansion where only one cofactor has to be considered. These special cases are obtained if only those divisor functions are considered which follow from by implication. For the following lemmas, let and be nodes of the combinational network such that is not in the transitive fanout of . (This restriction ensures that the circuit remains combinational after the transformation.)
Lemma 4.1: Consider the transformation . Then if and only if the implication is true.
Proof:
in Eq. (2) can be set to '1' and we obtain: with respect to which function has only one cofactor. In other words, in a combinational network, the expansion of Theorem 3.1 can be simplified without any circuit duplication if the specified implications are present. It is interesting to note that this does not sacrifice the generality of the approach.
Theorem 4.1: Let be a node of a combinational network . The gates in the combinational network can have no more than two inputs. Further, let be a divisor which is represented as combinational network and realizes a Boolean function of no more than two variables which may or may not be nodes in such that
1
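The single-cofactor special cases rest on elementary containment facts that can be verified exhaustively. The sketch below is an illustration under stated assumptions: functions are represented as truth-table bit vectors over two hypothetical inputs, the names `implies`, `single_cofactor_forms`, and `check_all` are invented here, and the assignment of the four identities to the individual lemmas is not taken from the text. The restriction on the transitive fanout is not modeled, since no circuit structure is involved.

```python
from itertools import product

N_ROWS = 4                      # truth-table rows for two input variables
MASK = (1 << N_ROWS) - 1        # bit i of a table is the value in row i

def implies(y, vy, f, vf):
    """True if 'y = vy' implies 'f = vf' in every truth-table row."""
    rows_y = y if vy else (~y & MASK)
    rows_f = f if vf else (~f & MASK)
    return (rows_y & ~rows_f & MASK) == 0

def single_cofactor_forms(y, f):
    """Candidate replacements for y, keyed by the implication that permits them."""
    nf = ~f & MASK
    return {
        (1, 1): f & y,    # y = 1 implies f = 1:  y = f AND y
        (0, 0): f | y,    # y = 0 implies f = 0:  y = f OR y
        (1, 0): nf & y,   # y = 1 implies f = 0:  y = (NOT f) AND y
        (0, 1): nf | y,   # y = 0 implies f = 1:  y = (NOT f) OR y
    }

def check_all():
    """Each form equals y exactly when the corresponding implication holds."""
    for y, f in product(range(MASK + 1), repeat=2):
        for (vy, vf), expr in single_cofactor_forms(y, f).items():
            if implies(y, vy, f, vf) != (expr == y):
                return False
    return True
```

Running `check_all()` over all 256 pairs of two-variable functions confirms the "if and only if" character of these cases: each one-sided form reproduces the original function exactly when the matching implication is valid, so no second cofactor and no duplication of circuitry is needed.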
Note that Lemmas 4.1-4.4 only cover those cases where a node in a combinational network can be replaced by some equivalent function . A function at node can also be replaced by some nonequivalent function if this does not change the function of the combinational network as a whole. Such functions are called permissible functions [27] . By considering permissible functions rather than only equivalent functions as candidates for substitution at each node, we exploit additional degrees of freedom as given by observability don't cares [8] . Permissible functions can also be obtained by recursive learning:
Definition 4.1: For an arbitrary node in a combinational network , assume the single fault stuck-at-. If is a value assignment at a node which is necessary to detect the fault at at least one primary output of , then follows from by "D-implication" and is denoted . The conventional implications are a special case of such D-implications. Replacing the implications in Lemmas 4.1-4.4 by D-implications, we obtain the following generalization.
. In this special case, if then is sufficient to produce a "faulty" signal '1' at node . Now consider the set of all test vectors for stuck-at-one in the original circuit that produce . Every such test will result in a faulty response of the transformed circuit. Therefore, the transformation is only allowed if such a test does not exist. However, if a test for stuck-at-one exists in general, it is required that there is none which produces . This means that is necessary for fault detection and must be true. If this condition is necessary for the special case that , it is also necessary for the general statement since is one of the possible choices to implement .
(2), they also provide simplified cases of (4). As will be illustrated in Section V, the constructions based on (4) and the above theorems provide good candidates for the expansion of Theorem 3.1.
Recursive learning can be used to determine all value assignments necessary to detect a single stuck-at fault, i.e., it is a technique to perform all D-implications. This is accomplished by the two routines make_all_implications() and fault_propagation_learning() as given in [22] if they are performed for the five-valued logic alphabet of Roth [33]. Therefore, by recursive learning it is possible to derive all cases where Theorems 4.2-4.5 apply.
The number of implications and D-implications can be very large so that it is impossible to examine all transformations. At this point, however, we come back to the observation discussed earlier. Implications which can only be derived by "great effort" represent the promising candidates for the transformations as given in Theorems 4.2-4.5. These indirect implications are only a small fraction of all possible implications. In the following, we refer to a D-implication as indirect if it can be derived neither by direct implication nor by unique sensitization [17] at the dominators [21] of . In other words, all those necessary assignments obtained by the learning case of the routines fault_propagation_learning() and make_all_implications() are implied indirectly and provide the set of promising candidates for the circuit transformations.
As it turns out, the concept of relating the complexity of the implication problem to minimality of the combinational network permits a new and promising approach to guiding logic minimization techniques.
V. OPTIMIZATION PROCEDURE
The described concepts have been implemented as part of the HANNover Implication Tool Based on Learning (HANNIBAL) tool system. Table I summarizes the general program flow for circuit optimization. HANNIBAL performs logic optimization by applying the described concepts stepwise to all nodes in the combinational network. The optimization procedure moves from node to node in the combinational network. Experiments showed that the optimization results are only moderately sensitive to the order in which the different circuit nodes are processed. However, the best results were generally obtained by processing the nodes according to their topological level, moving from the primary inputs toward the primary outputs. For a selected node, recursive learning is used to derive promising divisor functions. The candidates found promising are stored in lists and tried in sequence. When identifying implications, it is important that we run recursive learning only for one node at a time and then transform the given node by the implications obtained. Therefore, after modifying the circuit, we have to update the data only for the current node.
For each candidate implication, the circuit is transformed according to the rules given in Section IV. After each transformation, redundancy elimination is employed. To make this process as fast as possible, the deterministic test set is always maintained for the most recent version of the circuit. After each circuit transformation, this test set is simulated to quickly discard many faults from further consideration so that only a few faults have to be targeted explicitly by deterministic ATPG. After redundancy elimination has been completed, it is checked whether the circuit has become smaller. If it has, the current circuit is kept; otherwise the previous version is restored. This is continued for all nodes in the network until no more improvements can be found. In HANNIBAL, several runs are made through the circuit, varying the recursion depth and the number of candidate implicants tried at each node in different runs.
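The accept-or-undo control flow described above can be summarized in a short sketch. This is not the program flow of Table I or the HANNIBAL implementation; the function `optimize` and all of its callable parameters (`nodes_of`, `candidates_for`, `transform`, `remove_redundancies`, `cost`) are hypothetical stand-ins for the node ordering, the recursive-learning candidate search, the transformation rules, the redundancy elimination, and the area measure.

```python
def optimize(circuit, nodes_of, candidates_for, transform,
             remove_redundancies, cost):
    """Greedy sweep: keep a transformed circuit only if it is cheaper."""
    improved = True
    while improved:                       # repeat sweeps until nothing improves
        improved = False
        for node in nodes_of(circuit):    # e.g. in topological order
            for cand in candidates_for(circuit, node):
                trial = remove_redundancies(transform(circuit, node, cand))
                if cost(trial) < cost(circuit):
                    circuit = trial       # accept the smaller circuit
                    improved = True
                    break                 # candidates were derived for the old circuit
                # otherwise the previous version is kept unchanged
    return circuit

def demo():
    """Toy run in which a 'circuit' is just its gate count."""
    return optimize(
        10,
        nodes_of=lambda c: range(1),
        candidates_for=lambda c, n: [c - 3, c + 1] if c >= 3 else [c + 1],
        transform=lambda c, n, cand: cand,
        remove_redundancies=lambda c: c,
        cost=lambda c: c,
    )
```

In the toy run, `demo()` shrinks the count from 10 to 1 and then stops, because the only remaining candidate would enlarge the circuit and is rejected. A real implementation would also have to refresh the node list after an accepted change and maintain the test set between steps, both of which this sketch leaves out.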
For each step of redundancy removal, we determine the fault list as follows:
1) include in the fault list both stuck-at faults at all signals that were "touched" by recursive learning when deriving the current divisor;
2) exclude from the fault list all faults in the circuitry added for the current transformation.
Limiting the fault list to signals being processed by the event-driven recursive learning routine proved to be a very good heuristic to speed up fault simulation and ATPG (up to a factor of 4) without significantly sacrificing optimization quality.
Example 5.1-"Good" Boolean Division: Consider Fig. 4. By recursive learning, it is possible to identify the indirect implication . (Please refer to [22] for details of recursive learning.) The fact that the implication is indirect means that it is promising to attempt a Boolean division at node using the divisor . This could be performed by any traditional method of Boolean division. Instead, we use the ATPG-based expansion introduced in Section III.
Applying Theorem 4.4, we obtain the combinational network as shown in Fig. 5. Actually, in this case we could also apply Lemma 4.3 since is obtained without using any requirements for fault propagation. Note that Theorem 4.4 states that is a permissible function for . (In this case, and are equivalent.) By transformation as shown in Fig. 5, we introduce the node . Since is used as a cover for , it is likely that the internal don't cares result in untestable single stuck-at faults. This is used in the next step (reduction). By ATPG, the untestable faults indicated in Fig. 5 can be identified. Performing redundancy removal (e.g., [1]) results in the minimized combinational network as shown in Fig. 6. Note that we have to exclude the stuck-at faults in the added circuitry in the shaded area of Fig. 5. If we performed redundancy elimination on line in Fig. 5, we would return to the original network.
In the example, node in Fig. 4 is implemented by . By indirect implication, we identified the Boolean divisor as "promising" and performed the (nonunique) division , resulting in in Fig. 6. Note that this is a Boolean (as opposed to algebraic) division [8]. As the example shows, indirect implications help to identify good divisors that justify the effort to attempt a Boolean division.
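The fault-list rule stated before Example 5.1, together with the redundancy check used in the reduction step, can be illustrated on a very small netlist. The sketch below is not the ATPG procedure of the paper: it replaces deterministic ATPG by exhaustive simulation, models only faults on whole signals (not on individual fanout branches), and uses a hypothetical dictionary-based netlist format. The circuit y = a + ab is an assumed toy example and is not taken from Figs. 4-6.

```python
from itertools import product

# Gate evaluation on a list of 0/1 fanin values.
GATES = {
    "AND": lambda v: int(all(v)),
    "OR": lambda v: int(any(v)),
    "NOT": lambda v: 1 - v[0],
    "XOR": lambda v: sum(v) % 2,
}


def simulate(netlist, assignment, fault=None):
    """Evaluate all signals. netlist maps signal -> (gate, fanins) in
    topological order; fault = (signal, value) forces a stuck-at value."""
    values = dict(assignment)
    if fault and fault[0] in values:
        values[fault[0]] = fault[1]
    for name, (gate, fanins) in netlist.items():
        values[name] = GATES[gate]([values[f] for f in fanins])
        if fault and fault[0] == name:
            values[name] = fault[1]
    return values


def is_testable(netlist, inputs, outputs, fault):
    """Exhaustive stand-in for ATPG: look for any detecting input vector."""
    for bits in product((0, 1), repeat=len(inputs)):
        assignment = dict(zip(inputs, bits))
        good = simulate(netlist, assignment)
        bad = simulate(netlist, assignment, fault)
        if any(good[o] != bad[o] for o in outputs):
            return True
    return False


def fault_list(touched, added):
    """Both stuck-at faults at touched signals, minus added circuitry."""
    return [(s, v) for s in touched if s not in added for v in (0, 1)]


# Assumed toy circuit: t = a AND b, y = a OR t, i.e., y = a + ab = a.
netlist = {"t": ("AND", ["a", "b"]), "y": ("OR", ["a", "t"])}
faults = fault_list(touched=["t", "y"], added={"y"})
redundant = [f for f in faults
             if not is_testable(netlist, ["a", "b"], ["y"], f)]
print(redundant)  # [('t', 0)]
```

Since t stuck-at-zero is untestable, t can be replaced by a constant 0, which reduces the toy circuit to y = a. Restricting the fault list in this way only limits which redundancies are looked for; it does not affect the correctness of any redundancy that is removed.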
Example 5.2-"Common Kernel Extraction":
Consider the circuitry of Fig. 7 . The circuit implements two Boolean functions:
and , each of which cannot be optimized any further. Note, however, that the two functions have a common kernel, , which can be extracted and shared so that a smaller circuit is obtained with with It is interesting to examine how the suboptimality of the original circuit is reflected by the indirectness of implications.
Consider Fig. 7. By recursive learning, it is possible to identify the D-implication . Remember that this means that is necessary for detection of , stuck-at-one. As can be noted, the necessary assignment is not "obvious." It can neither be derived by direct implications nor by sensitization at the dominators of . The reader may verify that can be obtained by the learning case of recursive learning using fault_propagation_learning() [22]. Now the transduction is performed in the usual way. According to Theorem 4.3, the circuit can be modified as shown in Fig. 8, and redundancy elimination yields the optimized circuit in Fig. 9.
Note that our method can perform transformations which cannot be performed by the method of Entrena and Cheng [16] and the method of [10]. To the best of our understanding, in the above example, the minimization cannot be obtained by only adding and removing connections as in [10] and [16]. This is because the methods of [10], [16] require the existence of gates of a certain type at the location where the added connection (gate) is anchored. Based on the expansion described in Section III, our approach uses a wider spectrum of circuit transformations. This could possibly impose higher computational costs; however, our results show that the heuristic strategy of only using indirect implications for circuit transformation can effectively limit the search space.
As presented in [22]
Limitations: 1) The examples also show the limitation of our method. By implication analysis, we only consider divisors that are already present as nodes in the network. Therefore, we do not completely utilize the generality of our basic approach as given by Theorem 4.1. Extensions are under way to derive implicants and D-implicants [36] for a given node in the network which are not explicitly present as nodes in the network. AND/OR graphs, with which such implications can be derived, have been introduced in [36].
2) Our techniques operate on a gate-level netlist description. As mentioned, this is an advantage if specific technical information is to be considered in the optimization process. However, it has not yet been considered how the presented techniques can handle circuits with complex gates in an efficient way. Our tool, HANNIBAL, is at this point limited to handling only the basic gate types AND, OR, NAND, NOR, INV, and XOR. Future work will therefore extend our techniques to handling complex gates so that arbitrary libraries can be processed.
VI. APPLICATION TO LOGIC VERIFICATION
The described minimization approach can also be applied to logic verification. Formal logic verification of integrated circuits has become of great interest for many industrial designers and manufacturers of highly integrated circuits. Especially in safety-critical applications, it is of great importance to verify that the implemented logic circuit is equivalent to its specification. When verifying digital circuits, an important subproblem is to check whether two combinational circuits are functionally equivalent. Traditionally, this problem is approached by generating a canonical (unique) form of the circuits to be verified. The circuits are equivalent if their canonical forms are isomorphic. Unfortunately, canonical forms of Boolean functions may grow extremely large even for relatively small designs. The most compact canonical forms known to date are Reduced Ordered Binary Decision Diagrams (ROBDD's) [9] and related graph representations of Boolean functions. Therefore, binary decision diagrams (BDD's) have become very popular for solving logic verification problems. Some classes of circuits, however, are not amenable to a BDD analysis, since the size of the BDD's grows exponentially with the size of the circuit.
More recently, to overcome the limitations of BDD-based approaches, a different approach to logic verification has been proposed in [5], [23] which exploits the structural "similarity" between the designs. Instead of producing canonical forms, these techniques extract the similarity between designs by ATPG and implications between signals in the two circuits. These techniques have only modest memory requirements and proved successful in verifying circuits that cannot be verified by BDD-based approaches. Further developments based on these techniques have been proposed in [15], [20], [29], and [37]. Note, however, that such techniques may require excessive amounts of central processing unit (CPU)-time if the circuits have little structural similarity. Therefore, it is an important problem to study how to exploit the "similarity" between designs as efficiently as possible. In this section, we propose to use the presented optimization procedure for this purpose. There is a wealth of powerful synthesis methods, and it should be noted that many of these methods can also be useful in logic verification.
A. Logic Verification by Optimization
Logic verification as proposed by [5], [23] relies on combining the circuits to be verified as shown in Fig. 10. This construction has been called a miter in [5] and represents a circuit which maps the verification problem to solving the satisfiability problem for the output line . In [5] and [23], a test generator is used for this purpose. Proving whether the output of the miter is satisfiable or not is generally a very complex problem. To overcome this difficulty, the approaches in [5], [23] make use of the fact that structural similarity between the two designs can help to break the problem down. In [23], implications are identified between different signals of the subcircuits, and these implications are stored at the respective nodes. Similarly, the complexity of the verification problem can be reduced by identifying signals in one circuit which can be used to substitute signals in the other circuit [5].
Making physical connections between the circuits and storing implications have a similar effect. Both simplify the reasoning for the satisfiability solver by introducing "short cuts" between the circuits so that the satisfiability solver does not necessarily need to fully exhaust both circuits. This has been shown in [5] and [23] if the satisfiability solver is a test generator and in [20] and [29] if the satisfiability solver is based on BDD's.
Note that this type of approach works well if the circuits for comparison have a certain degree of similarity, but it may fail otherwise. Therefore, it is important to investigate what techniques can capture a wide spectrum of similarity in an efficient way. The techniques of [5] and [23] rely on relatively strict requirements. The approach of [5] requires that lines in one circuit can be replaced by lines in the other circuit exploiting observability don't cares. In [20] or [23], it is required that there exist logic implications, e.g., in circuit implies in circuit . This can be a looser requirement than demanding a substitution, but on the other hand, [20] and [23] do not exploit observability don't cares.
Taking all of this into account suggests that the verification problem should be simplified effectively by performing logical transformations in the miter so that logic common to the two designs can be extracted and shared. If the circuits are equivalent, then one circuit must eventually be merged into the other circuit. As a special case, the substitutions of [5] perform such an operation. More generally, any known synthesis technique can be used to accomplish this task. The general goal is to optimize the miter. If the miter is reduced to a constant zero, the two circuits are proved equivalent. If this is not (or only partially) possible, then it must be attempted to generate a distinguishing vector using ATPG.
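The role of the miter can be made concrete with a small sketch. It assumes the usual construction in which corresponding outputs of the two circuits are compared by XOR gates whose results are combined by an OR gate. Exhaustive enumeration is used here only as a stand-in for ATPG or miter optimization, and the example functions `f`, `g`, and `h` are hypothetical.

```python
from itertools import product


def miter(circ_a, circ_b):
    """Return the miter output function: 1 iff some output pair differs."""
    def m(x):
        return int(any(a != b for a, b in zip(circ_a(x), circ_b(x))))
    return m


def distinguishing_vector(circ_a, circ_b, n_inputs):
    """Return an input vector that sets the miter output to 1, or None if
    the miter is constant 0 (i.e., the circuits are equivalent)."""
    m = miter(circ_a, circ_b)
    for x in product((0, 1), repeat=n_inputs):
        if m(x):
            return x
    return None


# Hypothetical single-output circuits over inputs x[0], x[1], x[2].
f = lambda x: ((x[0] & x[1]) | (x[0] & x[2]),)   # ab + ac
g = lambda x: (x[0] & (x[1] | x[2]),)            # a(b + c), equivalent to f
h = lambda x: (x[0] | (x[1] & x[2]),)            # a + bc, not equivalent

print(distinguishing_vector(f, g, 3))  # None
print(distinguishing_vector(f, h, 3))  # (0, 1, 1)
```

Enumeration is exponential in the number of inputs, which is exactly why the text proposes to shrink the miter by sharing logic before a distinguishing vector has to be searched for.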
If the circuits have a fair amount of structural similarity, this means that the miter can be optimized by a sequence of fairly local circuit transformations. If the circuits become less similar, then deriving these transformations becomes more and more complex, and it becomes important to fully exploit the range and power of modern synthesis techniques. The advantage of formulating verification as a miter optimization problem is that the power of modern synthesis techniques becomes available to the difficult problem of logic verification.
As experimentally confirmed in Section VII, circuit transformations derived by indirect implications cover a large spectrum of the circuit manipulations performed in standard synthesis procedures such as [8]. Further, since implications permit an easy and effective guidance of the optimization process, we base our verification procedure on the optimization procedure of Section V.
B. Heuristic Guidance in a Miter
Optimization in a miter has special characteristics which are discussed in this section with respect to the optimization procedure of Section V.
Selecting Implications: Using our approach, it must be attempted to identify implications that are valid between two nodes that belong to different subcircuits of the miter. If the corresponding transformations are performed, this introduces a sharing of logic between the circuits. Enforcing a sharing of logic between the circuits has two beneficial effects. It generally reduces the size of the miter, and it tends to increase the degree of similarity in the remaining, unshared parts of the circuits if the original circuits are equivalent. If the two networks are forced to share the same subfunctions, this leaves less "freedom" for the implementation of the remaining parts. This is illustrated in the following example.
Example 6.1: Fig. 11 shows two circuit examples that shall be verified to be equivalent. The circuits are combined to form a miter. For reasons of clarity, we depict the circuits without the extra logic to form the miter. Consider signal in the upper circuit and signal in the lower circuit. By recursive learning, it is possible to identify the D-implication . The reader may verify that any test for , stuck-at-one produces the value assignment . According to Section VI, we can perform the circuit transformation as shown in Fig. 12.
In the transformed circuit, untestable faults can be identified as indicated in Fig. 12 . Removing these redundancies leads to the circuit in Fig. 13 . Note that this transformation has not only introduced a sharing of logic between the two circuits and reduced the size of the miter, it has also increased the degree of structural similarity in the remaining unshared portions of the circuit. As a matter of fact, in this example, the remaining circuit portions are now structurally identical and can be shared by a sequence of very simple circuit transformations.
Substitution: Often, a lot of CPU-time can be saved by restricting the circuit transformations to node substitutions. Notice in Example 6.1 that node in the upper circuit is substituted by node in the lower circuit after removing the redundancy , stuck-at-zero. Often it may be sufficient to restrict all transformations to only finding such substitutions [5] . In this case, if a transformation has been performed for an implication between two nodes and as given in Theorems 4.2-4.5, redundancy elimination needs to be performed only for the appropriate fault at signal . This is faster than considering all faults in the circuit, but on the other hand, it overlooks miter transformations which cannot be obtained by a simple substitution. Therefore, we pass through the circuit several times. In the early passes of our verification procedure, we restrict the redundancy check to the node to be substituted. In the later passes we perform redundancy elimination in the whole circuit.
ATPG in a Miter: Finally, another important aspect should be considered when running the optimization approach of Section V in a miter. The described method heavily relies on evaluating circuit transformations by ATPG. However, for many target faults in the circuits for comparison, the ATPG problem becomes severely more difficult if the circuits are connected to form a miter. In fact, a large number of faults become redundant, but proving these redundancies practically has the same complexity as the verification problem itself. The reason for this is the global reconvergence created by the miter. Therefore, the ATPG tool may waste a lot of time on numerous target faults which eventually have to be aborted.
The effect of the global miter reconvergence on the ATPG process can be eliminated by the following trick. When performing ATPG or fault simulation, faults are declared "detected" as soon as the fault signal has reached the outputs of the subcircuits, i.e., if it has reached the inputs of the XOR-tree that forms signal . In Fig. 10 , these signals are labeled to . Alternatively, the XOR-portion of the miter could be removed during the ATPG-procedure. Note that this is extremely important for an efficient ATPG-process.
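The effect of the observation point can be seen in a toy miter. The following sketch is an assumed example, not one of the circuits of Fig. 10: both subcircuits share the signal `s`, so a fault at `s` reaches both subcircuit outputs but cancels at the XOR that forms the miter output. The netlist format and the function names are hypothetical.

```python
# Gate evaluation on a list of 0/1 fanin values.
GATES = {
    "AND": lambda v: int(all(v)),
    "OR": lambda v: int(any(v)),
    "XOR": lambda v: sum(v) % 2,
}


def simulate(netlist, assignment, fault=None):
    """Evaluate all signals; fault = (signal, value) forces a stuck-at."""
    values = dict(assignment)
    if fault and fault[0] in values:
        values[fault[0]] = fault[1]
    for name, (gate, fanins) in netlist.items():
        values[name] = GATES[gate]([values[f] for f in fanins])
        if fault and fault[0] == name:
            values[name] = fault[1]
    return values


def detected(netlist, assignment, fault, observe):
    """True if the fault effect reaches any of the observed signals."""
    good = simulate(netlist, assignment)
    bad = simulate(netlist, assignment, fault)
    return any(good[s] != bad[s] for s in observe)


# Assumed toy miter with shared logic s; ya and yb are the subcircuit
# outputs, m is the miter output.
miter_netlist = {
    "s": ("AND", ["a", "b"]),
    "ya": ("OR", ["s", "c"]),
    "yb": ("OR", ["c", "s"]),
    "m": ("XOR", ["ya", "yb"]),
}
vector = {"a": 1, "b": 1, "c": 0}
fault = ("s", 0)  # s stuck-at-zero

at_subcircuit_outputs = detected(miter_netlist, vector, fault, ["ya", "yb"])
at_miter_output = detected(miter_netlist, vector, fault, ["m"])
print(at_subcircuit_outputs, at_miter_output)  # True False
```

Observing the inputs of the XOR stage lets the fault be dropped after a single simulated vector, whereas proving it redundant at the miter output would require reasoning through the global reconvergence.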
VII. EXPERIMENTAL RESULTS
The described methods have been implemented by making extensions to the HANNIBAL tool system. For efficient fault simulation, we integrated the public domain fault simulator FSIM [24] into HANNIBAL. HANNIBAL contains the recursive learning technique of [22] and has options to apply this technique to test generation [22], logic verification [23], and logic optimization. Section VII-A shows the results for logic optimization. The results for logic verification using implications and BDD's have been shown in [29]. In Section VII-B, we show results for the verification part of HANNIBAL enhanced by the optimization approach presented in this paper.
A. Results for Logic Minimization
We compare HANNIBAL with other state-of-the-art optimization tools. For a fair comparison, it is very important to take into account that several different ways of measuring the area costs are currently common practice. HANNIBAL and RAMBO [16] operate on a gate netlist description and measure the area in terms of the number of connections. A connection is an input to a gate with at least two inputs, i.e., single-input gates (inverters, buffers) are not counted. Technology-independent optimization tools like SIS measure the area in terms of the number of literals. A literal is a variable or its complement used to describe the Boolean function at a node in the Boolean network. In a gate netlist, the number of literals can be obtained by counting the number of inputs of the fanout-free zones (FFZ's) in the network. For a fair evaluation of our tool, we present our results in terms of both the number of connections and the number of literals. For RAMBO and HANNIBAL, the number of literals (factored form) has been obtained by reading the optimized circuits into SIS and postprocessing them such that a technology-independent factored form is obtained. For this purpose, we used a SIS script obtained from [12] which performs some standard network manipulations. To count connections for SIS, we map the optimized circuit to a generic library which contains the basic gates that are allowed in our netlist description.
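The two area measures can be illustrated by a short sketch. It is only one possible reading of the definitions above and is not the SIS-based postprocessing used for the reported numbers: the dictionary-based netlist format is hypothetical, and a fanout-free zone is assumed to be rooted at every primary output and at every gate output with more than one fanout.

```python
def count_connections(netlist):
    """Inputs of gates with at least two inputs (inverters/buffers skipped)."""
    return sum(len(fanins) for _, fanins in netlist.values()
               if len(fanins) >= 2)


def count_literals(netlist, outputs):
    """Inputs of the fanout-free zones, under the assumptions stated above."""
    fanout = {}
    for _, fanins in netlist.values():
        for f in fanins:
            fanout[f] = fanout.get(f, 0) + 1
    # Zone roots: primary outputs and gate outputs with multiple fanout.
    roots = {s for s in netlist if s in outputs or fanout.get(s, 0) > 1}

    def zone_inputs(signal):
        total = 0
        for f in netlist[signal][1]:
            if f not in netlist or f in roots:
                total += 1                 # primary input or another zone
            else:
                total += zone_inputs(f)    # still inside the same zone
        return total

    return sum(zone_inputs(r) for r in roots)


# Assumed toy netlist: u = a AND b, y = u OR c, i.e., y = ab + c.
netlist = {"u": ("AND", ["a", "b"]), "y": ("OR", ["u", "c"])}
print(count_connections(netlist))       # 4
print(count_literals(netlist, ["y"]))   # 3
```

The example shows why the two measures can differ: the internal connection from `u` to `y` is counted as a connection but not as a literal, because `u` lies inside the fanout-free zone rooted at `y`.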
Note that comparing connections or literals may slightly bias the results. Since RAMBO and HANNIBAL optimize in terms of connections whereas SIS uses literals, comparing connections can bias the results in favor of HANNIBAL and RAMBO. Comparing literals gives a certain advantage to SIS. Therefore, for all circuits we always present both area measures.
In all experiments, HANNIBAL passes through the circuit four times, performing expansions at every node where recursive learning can identify indirect implications. The recursion depth is 'one' for the first two passes and 'two' for the final two passes. We also experimented with higher depths of recursion. It turned out that a recursion depth higher than 'two' did not lead to improved optimization results because a transformation that can be derived by high recursion depth can usually also be obtained by a sequence of local transformations derived by small recursion depth. (In any case, for the larger designs a recursion depth of 'four' and more is usually not affordable in terms of CPU-time.) Table II shows results for SIS 1.2, RAMBO C, and HANNIBAL. SIS 1.2 is run using script.rugged, which includes the powerful techniques of [31] and [34]. No preoptimization is used to process the circuits in RAMBO and HANNIBAL. As can be noted, for most benchmark circuits, HANNIBAL produces the smallest circuits. This is quite remarkable because it shows that most circuit manipulations performed by conventional technology-independent minimization techniques are covered by the netlist transformations presented in this paper. In particular, heuristic guidance by indirect implications proved surprisingly powerful.
In the next experiment, it is examined how much optimization is possible by HANNIBAL if the circuits are preprocessed by SIS. As shown in Table III, substantial area gains are possible in many cases. For seven out of 25 circuits, the gain is more than 20%. Also note that the CPU-times for HANNIBAL are significantly shorter in many cases if the circuits are run through a technology-independent minimization first, as in the experiment of Table III. Finally, we also compare our results with [10]. In [10], the circuit is mapped to a library with only two-input gates, and results are only shown after preprocessing with SIS. Table IV shows the results for RAMBO (taken from [10]), PERTURB/SIMPLIFY [10], and HANNIBAL if the area is measured in terms of two-input gates. We take the subset of the above benchmarks for which results are shown in [10], and all circuits are preoptimized by SIS. As can be noted, for all circuits HANNIBAL obtains circuits that are smaller than or equal in size to those of RAMBO and PERTURB/SIMPLIFY. Note that the results of HANNIBAL and RAMBO could be somewhat improved when compared with [10] if their cost function were changed to optimize the number of two-input gates.
Further experiments confirmed the heuristic that indirect implications indicate promising divisors. We examined how many indirect D-implications existed in the circuits before and after optimization. For the ISCAS85 circuits, Table V shows the number of indirect D-implications that have been identified by recursive learning with depth '2' for the original circuits as well as for the optimized circuits. We note that HANNIBAL reduces the number of indirect D-implications drastically for all circuits. It is interesting that this is also true for SIS in most cases, which confirms that optimization in general is related to reducing the number of indirect implications in the circuit. The results of Table V reflect that many (but not all) "good" divisors for optimization can be obtained by indirect implication. Also, note that Table V explains why the CPU-times for HANNIBAL are generally shorter if HANNIBAL is run after SIS. If SIS is used first, there are fewer indirect implications and, hence, fewer expansions need to be performed.
B. Results for Logic Verification
We demonstrate the performance of our verification technique based on optimization by means of the public domain multiplier c6288, which we verified against its optimized version. The optimized version has been obtained by SIS 1.2 [8] using script.rugged. The other circuits listed in Table VI have been obtained from the Mentor Graphics Autologic II Logic Synthesis Team. The designs are highly datapath oriented, contain multipliers and rotators, and were created in Verilog and synthesized by Autologic II to a commercial ASIC vendor library. The designs were synthesized with different design goals in mind (such as area or performance). The test cases also contain (intentionally) nonequivalent designs. The results show that logic verification based on the optimization procedure performs efficiently and robustly for these practical verification problems. The proposed technique may be outperformed by techniques such as [5] and [23] if the circuits have a high degree of similarity. On the other hand, for circuits with less structural similarity, logic verification by optimization provides a general framework for a more robust verification approach. Our results show that the optimization procedure of Section V can be tailored for efficient miter optimization. In all examined cases, the miter could be minimized to a constant signal '0' within short CPU-times.
As described in Section VI, our verification approach uses two phases. The first phase performs substitution only, and the second phase considers more general transformations by running redundancy elimination in the whole circuit. In all cases, the first phase helped to significantly reduce the size of the miter before starting the more CPU-time expensive phase two. All circuit transformations have been derived by recursive learning with recursion depth '2'.
VIII. CONCLUSION
This work has introduced an ATPG-based generalization of Shannon's expansion that provides an adequate theoretical description for ATPG-based logic synthesis. Our research was originally motivated by the observation that indirect implications indicate suboptimal circuitry. We have presented an ATPG-based approach to logic optimization deriving circuit transformations from implications. It has been shown that implications can be used to determine for each node those functions in the network with respect to which this node has only one cofactor. Furthermore, it has been shown that the complexity of performing implications can be related to potential area reduction by Boolean division. This introduces new heuristic guidance and a different view on logic optimization problems. Our experimental results demonstrate the potential of our method. They also show that our notion of "indirect" implications is indeed helpful in identifying good Boolean divisors.
As has been shown, netlist optimization by HANNIBAL is competitive with technology-independent minimization techniques. Future work will, therefore, exploit the main advantage of this approach. Optimization on the gate netlist provides much better insight into the technical properties of the design and, therefore, permits better guidance when trying to achieve specific optimization goals. It has already been shown that the presented approach is very useful when optimizing for random pattern testability [11] and for low-power consumption [28].
Further, we formulated logic verification as an optimization problem and demonstrated the usefulness of our optimization approach for logic verification. Our verification method successfully verified a number of industrial designs. Current research examines extending the set of circuit transformations using the concept of AND/OR graphs [36] . This is expected to improve the capabilities of HANNIBAL for both logic synthesis and formal verification.
