We compare the complexity of "internal" and "external" equivalence 
INTRODUCTION
Equivalence checking (EC) has become an important part of design verification [7]. This success can be attributed to the good scalability of state-of-the-art equivalence checkers. In turn, this scalability is due to two factors. First, logic synthesis tools usually do not re-encode state variables, so EC of two sequential circuits reduces to EC of combinational circuits bounded by registers and/or primary inputs and outputs. (For this reason, in this paper we consider only EC of combinational circuits, and when we refer to a "circuit" we mean a "combinational circuit".) Second, in many cases circuits can be represented by small BDDs [4]. Then EC of the corresponding combinational circuits of the two designs to be compared can be performed efficiently.
EC of large circuits
The existence of large combinational circuits poses a problem in EC. Informally, we consider a circuit "large" if its BDD cannot be built efficiently. Typically, this happens when a circuit has a large width [1]. The width of a circuit N describes the amount of communication between different parts of N, a multiplier being a classical example of a "wide" circuit.
A lot of research in EC has been focused on handling large combinational circuits. In [9], the idea of combining SAT and BDD based methods was explored. The case when circuits to be checked for equivalence have compact BDDs under different variable orders was studied in [13]. The most popular method of handling large circuits is based on employing cut points [2] [3]. The idea is to prove functional equivalence of circuits N 1 and N 2 inductively. First, equivalence of some subcircuits of N 1 and N 2 is established. The outputs of equivalent subcircuits are considered as cut points and new subcircuits are tested for equivalence, the inputs of these subcircuits being cut points. This goes on until equivalence of the outputs of N 1 and N 2 is proven in terms of some cut points. This idea was further developed in [10] and successfully used in many equivalence checkers (e.g. [5] [6]). It may happen that even though N 1 and N 2 are equivalent, they appear to be inequivalent in terms of the chosen cut points. This situation is usually called a false negative. The problem of false negatives was addressed in [12] [14].
Internal and external EC
All the methods described above are meant for "external" EC. Informally, EC is external if no explicit assumptions are made about the origin of the combinational circuits N 1 and N 2 to be compared. We will say that EC is "internal" if it is meant for verification of a logic synthesis transformation by which circuit N 2 is obtained from N 1 . So, by definition, internal EC "knows" the relation between N 1 and N 2 . For simple transformations, internal EC is performed implicitly by using some informal reasoning. Suppose, for example, that internal points w and w′ of a circuit N 1 are functionally equivalent. Then one can optimize N 1 by removing w′ and feeding its fan-out nodes with w (instead of w′). No explicit procedure is run for EC of N 1 and N 2 because in this particular case the equivalence of N 1 and N 2 is "obvious".
Development of new "non-trivial" methods for internal EC is extremely important because these methods enable new logic synthesis procedures. A powerful method of internal EC was introduced in [8] where logic synthesis and EC of circuits with a common specification were considered. A specification of a circuit N is just a partition of N into subcircuits. Two circuits N 1 and N 2 have a common specification if they can be partitioned into k subcircuits N 1 1 ,.., N 1 k and N 2 1 ,.., N 2 k such that these subcircuits are connected in "the same way" in N 1 and N 2 and corresponding subcircuits N 1 i , N 2 i are toggle equivalent. We will refer to the EC procedure of [8] as EC_TE where TE stands for toggle equivalence.
The importance of EC_TE is twofold. First, it enables a powerful logic synthesis procedure. (We will refer to this procedure as LS_TE where LS stands for logic synthesis and TE for toggle equivalence). of N 1 has m outputs, then the number of m-output subcircuits that are toggle equivalent to N 1 i is huge even for small m. So LS_TE enjoys great flexibility even for specifications of very small granularity.
The second reason why the introduction of EC_TE is important is as follows. Usually, external EC is assumed to be as powerful as internal EC. The results of [8] imply that, most probably, there is an exponential gap between external and internal EC. The reason is that finding a common specification of N 1 
Our contribution and structure of the paper
In this paper, we further develop the ideas of [8]. Our contribution is threefold. In [8], no concrete procedure that, given a subcircuit N 1 i , generates a toggle equivalent counterpart N 2 i , was introduced. So one could consider the performance gap between internal and external EC as a mathematical curiosity. The first contribution of this paper is that we give experimental data showing the promise of LS_TE, which makes the theory of [8] much more tangible.
The second contribution is that we show that after a slight modification, EC_TE enables a logic synthesis procedure much more powerful than LS_TE. Given a circuit N 1 and specification N 1 1 ,..,N 1 k , this procedure is to replace each subcircuit N 1 i with a subcircuit N 2 i that implies toggling of N 1 i . We will refer to this procedure as LS_TI (where TI stands for toggle implication). LS_TI offers much more flexibility than LS_TE while its complexity is the same as the complexity of LS_TE. Showing that LS_TI "widens" the gap between internal and external EC is our third contribution.
This paper is structured as follows. In Section 2 we give basic notions. Section 3 recalls the EC_TE and LS_TE procedures of [8]. In Section 4 we describe an EC procedure enabling a logic synthesis procedure LS_TI more powerful than LS_TE. Section 5 gives experimental data showing the power of LS_TE and explains why LS_TI is more powerful than LS_TE. In Section 6 we discuss the complexity of external EC of circuits produced by LS_TE and LS_TI. We conclude with Section 7.
BASIC NOTIONS

Toggle equivalence of Boolean functions
In this subsection, we recall the notion of toggle equivalence and its properties. All the propositions given in this section are either proven in [8] implicitly specifies the one-to-one mapping K between output vectors produced by N 1 and N 2 . Namely K*(y 1 , y 2 ) is equal to 1 iff y 1 =K(y 2 ).
Implication of toggling
In this subsection, we introduce the notion of implication of toggling.
Definition 5. Let f 1 and f 2 be two multi-output functions with the same set of variables X={x 1 ,…,x n }. Toggling of function f 1 implies toggling of f 2 (denoted as f 1 ≤ f 2 ) if, for any pair of assignments x′, x″ to X, f 1 (x′) ≠ f 1 (x″) implies f 2 (x′) ≠ f 2 (x″).
Toggling of a multi-output function f 1 (x 1 ,..,x n ) strictly implies toggling of a multi-output function f 2 (x 1 ,..,x n ) (denoted as f 1 < f 2 ) if f 1 ≤ f 2 holds but f 2 ≤ f 1 does not.
Remark 1. We will denote by N 1 ≤ N 2 (respectively N 1 < N 2 ) the fact that toggling of the function implemented by Boolean circuit N 1 implies toggling of (respectively strictly implies toggling of) the function implemented by Boolean circuit N 2 .
Let f 1 : {0,1}^n → {0,1}^m and f 2 : {0,1}^n → {0,1}^k be m-output and k-output Boolean functions of the same set of variables. Let
Note that unless f 1 and f 2 are toggle equivalent, the function K is not invertible.
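The role of the function K can be illustrated by brute force on tiny functions. The sketch below is only an illustration under stated assumptions: it reads f 1 ≤ f 2 as "whenever f 1 changes its output, f 2 changes its output too", represents functions as Python callables over bit tuples, and tries to tabulate a map K with f 1 (x) = K(f 2 (x)). The names `build_K`, `is_invertible`, `parity`, `identity` and `swap` are hypothetical and do not come from [8].

```python
from itertools import product

def build_K(f1, f2, n):
    """Tabulate K with f1(x) == K[f2(x)] for every n-bit input x.

    Returns None when no such K exists, i.e. when some pair of inputs
    toggles f1 but not f2 (so f1 <= f2 fails under the assumed reading).
    """
    K = {}
    for x in product((0, 1), repeat=n):
        y1, y2 = f1(x), f2(x)
        # setdefault stores y1 on first sight of y2, otherwise returns the stored value
        if K.setdefault(y2, y1) != y1:
            return None
    return K

def is_invertible(K):
    """K is invertible iff distinct keys map to distinct values."""
    return len(set(K.values())) == len(K)

# Hypothetical 2-input examples (assumptions made for illustration only)
parity = lambda x: (x[0] ^ x[1],)   # single-output XOR
identity = lambda x: x              # 2-output identity circuit
swap = lambda x: (x[1], x[0])       # 2-output permutation of the inputs

K_strict = build_K(parity, identity, 2)  # parity <= identity, not vice versa
K_equiv = build_K(swap, identity, 2)     # swap and identity are toggle equivalent
```

In this toy setting `K_strict` exists but maps two identity outputs to each parity output, so it cannot be inverted, while `K_equiv` is a bijection.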
Testing toggle implication and toggle equivalence
In this subsection, we describe how toggle equivalence and implication of toggling can be tested. Let N 1 and N 2 be two Boolean circuits to be checked for implication of toggling. Let X={x 1 ,.., x n } be the set of input variables of N 1 , N 2 . Let Y={y 1 ,…, y m } and Z={z 1 ,.., z k } be the sets of output variables of N 1 and N 2 respectively. Then N 1 ≤ N 2 iff the function
S = N 1 (X,Y) ∧ N* 1 (X*,Y*) ∧ N 2 (X,Z) ∧ N* 2 (X*,Z*) ∧ Neq(Y,Y*) ∧ Eq(Z,Z*)
is unsatisfiable (i.e. it is a constant 0). Here N* 1 and N* 2 are copies of circuits N 1 and N 2 , with input variables X*={x* 1 ,.., x* n } and output variables Y*={y* 1 ,…, y* m } and Z*={z* 1 ,.., z* k } respectively.
The value of Eq(y, y*) where y and y* are assignments to Y and Y* respectively is equal to 1 iff y=y*.
The function Neq(Y, Y*) is the negation of Eq(Y, Y*).
Indeed, S=1 means that for a pair of input vectors x and x*, circuit N 1 toggles (which sets Neq(Y, Y*) to 1) while N 2 does not (which sets Eq(Z,Z*) to 1).
From Proposition 4 it follows that checking for toggle equivalence reduces to two satisfiability checks (SAT-checks for short).
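For very small circuits, the satisfiability check described above can be mimicked by enumerating all pairs of input vectors instead of calling a SAT solver. The following sketch is a brute-force stand-in, not the procedure of [8]: circuits are modeled as Python callables over bit tuples, and the names `find_violation`, `implies_toggling` and `toggle_equivalent` are hypothetical.

```python
from itertools import product

def find_violation(f1, f2, n):
    """Search for a pair (x, x*) that would make S equal to 1.

    S = 1 means f1 toggles (Neq on its outputs holds) while
    f2 does not (Eq on its outputs holds). Returns the pair or None.
    """
    inputs = list(product((0, 1), repeat=n))
    for x in inputs:
        for xs in inputs:
            neq_y = f1(x) != f1(xs)   # plays the role of Neq(Y, Y*)
            eq_z = f2(x) == f2(xs)    # plays the role of Eq(Z, Z*)
            if neq_y and eq_z:
                return x, xs
    return None

def implies_toggling(f1, f2, n):
    """Toggling of f1 implies toggling of f2 iff no violating pair exists."""
    return find_violation(f1, f2, n) is None

def toggle_equivalent(f1, f2, n):
    """Two checks, one per direction, as in the reduction to two SAT-checks."""
    return implies_toggling(f1, f2, n) and implies_toggling(f2, f1, n)

# Hypothetical 2-input examples (assumptions made for illustration only)
and2 = lambda x: (x[0] & x[1],)
nand2 = lambda x: (1 - (x[0] & x[1]),)
identity = lambda x: x
```

Here `toggle_equivalent(and2, nand2, 2)` holds because negating an output does not change when it toggles, whereas `find_violation(identity, and2, 2)` returns a pair of inputs that the identity circuit distinguishes but the AND gate does not. The enumeration is quadratic in 2^n, so it only serves to make the definition concrete.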
Correlation function
In this subsection, we use the notion of correlation function to extend definitions of toggle implication and toggle equivalence to the case when functions f 1 
LOGIC SYNTHESIS AND EC OF CIRCUITS WITH COMMON SPECIFICATION
In this section, we recall LS_TE and EC_TE procedures of [8] . From now on we assume that circuit N 1 to be optimized has only one output. (If a circuit to be optimized has more than one output, then the LS_TE procedure can be separately applied to every subcircuit feeding an output of N 1 ).
Logic synthesis preserving toggle equivalence
In this subsection, we recall the procedure of Logic Synthesis preserving Toggle Equivalence (abbreviated as LS_TE) introduced in [8] . The pseudocode of the LS_TE procedure is shown in Figure 1 . Since Spec(N 1 ) is topological, one can assign levels to subcircuits N 1 i . We assume that subcircuits 
EC of circuits with a common specification
The EC of N 1 and a circuit N 2 obtained from N 1 by LS_TE can be done by the EC_TE procedure of [8] whose (slightly changed) pseudocode is shown in Figure 3 . EC stands for equivalence checking and TE stands for toggle equivalence. Recall that N 1 is a single-output circuit and, as we mentioned above, LS_TE builds a circuit N 2 that has one output too.
The input to the EC_TE procedure are circuits N 1 and N 2 and their specifications Spec (N 1 )={N 1 1 ,..,
is 'equivalence_function') return('equivalent'); 11 else return('inequivalent');} 
NEW EC AND LOGIC SYNTHESIS PROCEDURES
In this section, we describe an extension of the EC_TE and LS_TE procedures called EC_TI and LS_TI respectively (where TI stands for toggle implication). First, we give a generic method for introducing a logic transformation through an "enabling" procedure and explain how this method works for EC_TE and LS_TE. Then we introduce EC_TI and LS_TI.
A generic method for introducing logic transformation
The idea of the method is to introduce a logic transformation through an Enabling internal EC procedure (we will refer to it as an EEC). The input to an EEC is an original circuit N 1 , a modified circuit N 2 and some information about the transformation T used to obtain N 2 from N 1 . An EEC has to be sound. That is, if the EEC says that N 1 and N 2 are equivalent (or inequivalent), it has to give the right answer. Besides, the EEC should be able to recognize if N 2 cannot be obtained from N 1 by the transformation T. After designing an EEC, one formulates a logic synthesis procedure that, given a circuit N 1 , generates a circuit N 2 whose equivalence to N 1 can be verified by this EEC. We will say that this synthesis procedure is enabled by this EEC.
Let us illustrate how this method works for introducing LS_TE. The EC_TE procedure satisfies the requirements above for an EEC. Indeed, the input to EC_TE consists of circuits N 1 and N 2 and partitions Spec(N 1 ), Spec(N 2 ) as information about the synthesis transformation to be enabled. The soundness of EC_TE trivially follows from the fact that EC_TE returns the 'equivalent' (or 'inequivalent') answer only if it correctly derived the equivalence (respectively inequivalence) function relating the outputs of N 1 and N 2 . It is not hard to see that LS_TE is exactly the procedure enabled by EC_TE. Indeed, EC_TE checks in topological order if subcircuits N 1 i and N 2 i of Spec(N 1 ) and Spec(N 2 ) are toggle equivalent. In turn, LS_TE builds N 2 by generating toggle equivalent counterparts of subcircuits N 1 i in topological order.
Introduction of LS_TI
Let EC_TI be an EC procedure that is different from EC_TE only in one aspect. Instead of checking if ). Under such a restriction, LS_TI has the same complexity as LS_TE i.e. it is linear in the number of subcircuits in Spec(N 1 ) and exponential in the granularity of Spec(N 1 ).)
LS_TI is a synthesis procedure enabled by EC_TI. That is, if for a given N 1 and specification Spec(N 1 ), LS_TI builds a circuit N 2 with specification Spec(N 2 ), EC_TI will prove N 1 and N 2 to be equivalent.
BIG PROMISE OF LOGIC SYNTHESIS PRESERVING COMMON SPECIFICATION
In this section, we give some experimental data showing the power of the LS_TE procedure and discuss the potential of LS_TI, which should be much more powerful than LS_TE.
Power of LS_TE
The key procedure of LS_TE is called in the loop (line 4 of Figure 1 ) k times to generate a subcircuit N 2 i that is toggle equivalent to N 1 i ,i=1,..,k. We will refer to it as the TEP procedure (TEP stands for Toggle Equivalence Preserving). Such a procedure has been developed in [17] .
Let M′ and M″ denote subcircuits N 1 i and N 2 i respectively. Given M′ and a constraint (correlation function) imposed on the inputs of M′ and M″, the TEP procedure of [17] builds M″ as follows. It constructs a sequence of circuits
(Here M 1 is an identity circuit I p and p is the number of inputs in M″.) That is, M i+1 toggles at least "as much" as M′, but "strictly less" than M i . Since every circuit M i+1 of this sequence loses at least one toggle of M i , this sequence converges to a circuit M s such that M s ≤ M′ and M′ ≤ M s . This means that M s is toggle equivalent to M′ and so M s is the final circuit M″. (A more detailed description of the TEP procedure is beyond the scope of this paper.) In this paper we give some experimental data on our implementation of the TEP procedure for optimization of multi-output circuits just to show that LS_TE has great practical potential. So the fact that external EC of circuits produced by LS_TE is problematic is significant. 
In Table 1 we compare the results of optimization of some MCNC benchmarks by SIS [15] and by the TEP procedure. The name of the circuit and the number of inputs and outputs are shown in the first three columns of Table 1. The results of optimization by SIS with the script 'rugged' followed by technology decomposition (to obtain a circuit of two-input AND gates and inverters) are shown in the fifth column. The results of using TEP to build a toggle equivalent circuit are shown in the fourth (the number of outputs) and sixth (the number of gates) columns.
For the majority of circuits TEP was able to find much smaller toggle equivalent counterparts. In two cases (5xp1 and f51m) , TEP removed all the logic. This means that, for example, for different input assignments, circuit 5xp1 generates different output assignments. So the identity circuit I 7 is toggle equivalent to 5xp1.
Of course, such re-encoding of output assignments of the original circuit requires changing the surrounding logic. To explain why re-encoding may still lead to significant logic reduction let us consider the following example. Suppose that a circuit N 1 to be optimized consists of two subcircuits, N ). Table 2 shows results of applying LS_TE with the heuristic above to logic optimization of some arithmetic expressions with an integer variable x, x ≥ 0. (The second column of Table 2 gives the number of bits in x.) Each circuit N 1 of Table 2 It is not hard to see that x^2 < C 1 is equivalent to x < C′ 1 where C′ 1 is the constant equal to ceiling(square_root(C 1 )). Similarly, C 1 *x < C 2 is equivalent to x < C′ 2 where C′ 2 = ceiling(C 2 /C 1 ). So there is a very simple circuit implementation of either Boolean function.
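The two arithmetic equivalences stated above can be checked exhaustively for small bit widths. The sketch below is a sanity check only, under the assumptions that x ranges over non-negative integers of a given bit width, C 1 ≥ 1 and C 2 ≥ 0. The helper names `ceil_sqrt`, `ceil_div` and `equivalences_hold` are hypothetical, and integer arithmetic is used to avoid floating-point rounding.

```python
from math import isqrt

def ceil_sqrt(c):
    """Exact ceiling(square_root(c)) for an integer c >= 1."""
    return isqrt(c - 1) + 1

def ceil_div(a, b):
    """Exact ceiling(a / b) for integers a >= 0 and b >= 1."""
    return -(-a // b)

def equivalences_hold(bits, c1, c2):
    """Check both equivalences for every x representable in `bits` bits."""
    for x in range(2 ** bits):
        # x^2 < C1  <=>  x < ceiling(square_root(C1))
        if (x * x < c1) != (x < ceil_sqrt(c1)):
            return False
        # C1*x < C2  <=>  x < ceiling(C2 / C1)
        if (c1 * x < c2) != (x < ceil_div(c2, c1)):
            return False
    return True

# Exhaustive check over a small, arbitrarily chosen range of constants
all_ok = all(
    equivalences_hold(6, c1, c2)
    for c1 in range(1, 40)
    for c2 in range(0, 200)
)
```

Both comparisons therefore collapse to a single comparison of x against a precomputed constant, which is the reason a very small implementation exists.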
The results of optimization by SIS are shown in the fifth column. The first number of this column gives the number of gates in the circuit obtained after applying the script 'rugged' and technology decomposition. The number in parentheses gives the number of gates in the resulting circuit after applying the script 'rugged' repeatedly until the solution stabilizes and then running technology decomposition.
The results of applying LS_TE (that used the TEP procedure of [17] . So LS_TE picked the simplest such circuit, namely I m .
Potential of LS_TI
LS_TI is more powerful than LS_TE because toggle implication is a more general relation than toggle equivalence. So LS_TI is much more flexible than LS_TE (while its complexity is the same as that of LS_TE).
Let 
). Since LS_TI builds a circuit N 2 that is functionally equivalent to N 1 , these unmatched toggles of N 2 i do not reach the output of N 2 . The blocking of the unmatched toggles is done "automatically".
Let us illustrate the advantage of LS_TI over LS_TE with a simple example. Suppose that one needs to implement the Boolean function f(x) < 9 where x is an m-bit integer. Let f(x) be equal to x^2 at all the 2^m points except for the point x=4 where f(4) is equal to 25 (instead of 16). It is not hard to see that the expression f(x) < 9 is equivalent to x < 3. Suppose that f(x) < 9 is implemented by a circuit N 1 that is a composition of subcircuits N 
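The function f of this example can be examined directly for a small m. The sketch below does not reproduce the remainder of the example; it only checks, by enumeration, facts that follow from the definition of f and from the assumed reading of toggle implication (g ≤ h when every input pair that toggles g also toggles h). The choice m = 4 and the names `toggling_implies` and `identity` are assumptions made for illustration.

```python
def f(x):
    """x^2 everywhere except at x = 4, where the value is 25."""
    return 25 if x == 4 else x * x

def toggling_implies(g, h, domain):
    """True if g(a) != g(b) implies h(a) != h(b) for all a, b in domain."""
    return all(h(a) != h(b) for a in domain for b in domain if g(a) != g(b))

m = 4                       # assumed bit width; any m >= 3 includes x = 4 and x = 5
domain = range(2 ** m)
identity = lambda x: x      # models an identity circuit on the bits of x

# f(x) < 9 and x < 3 agree on every m-bit input
same_predicate = all((f(x) < 9) == (x < 3) for x in domain)

# f(4) == f(5) == 25, so f does not toggle between 4 and 5
f_le_identity = toggling_implies(f, identity, domain)
identity_le_f = toggling_implies(identity, f, domain)
```

Under these assumptions toggling of f implies toggling of the identity function but not the other way round, so the two are related by strict toggle implication rather than by toggle equivalence.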
EC of circuits produced by LS_TE
In [8], a top commercial tool was used for "external" EC of circuits with a common specification (Table 2 of [8]). These results showed that even for circuits with a common specification of small granularity, their EC was too hard for that tool (even with a 10-hour time limit). On the other hand, all examples were solved by EC_TE within 1-2 minutes.
One can always pick circuits N 1 and N 2 with a common specification that will "break" current EC algorithms. The reason is that an external checker inevitably makes implicit assumptions that can be easily broken. For example, algorithms based on computing cut-points make an assumption that N 1 and N 2 have functionally equivalent internal points. However, if N 2 is produced from N 1 by LS_TE, N 1 and N 2 , in general, have no functionally equivalent points. Algorithms based on BDD computation make an implicit assumption that N 1 and N 2 have a small width while LS_TE can be used for optimization of circuits of arbitrary width. EC based on recursive learning [11] assumes that implications relating points of N 1 and N 2 can be obtained inductively by a computation of small "recursion depth". This assumption can be easily broken as well. The method of [12] also makes a breakable assumption that N 1 and N 2 do not have a large number of reconvergent fan-outs.
In terms of proof sizes (computed with respect to a concrete proof system like resolution), the problem with existing (and most likely any) external equivalence checkers is as follows. It may well be the case that any proofs of equivalence different from the ones generated by EC_TE are much "longer". On the other hand, to find a proof generated by EC_TE one needs to build partitions Spec(N 1 ) and Spec(N 2 ), which is very hard. The reason is that finding a pair of subcircuits 
EC of circuits produced by LS_TI
Although external verification of circuits built by LS_TE looks infeasible, verification of circuits produced by LS_TI is "even harder". The reason is as follows. Suppose that N 2 with specification Spec(N 2 ) is produced from N 1 with specification Spec(N 1 ) by LS_TE. Suppose an external equivalence checker somehow managed to find subcircuits N 1 i and N 2 i that are toggle equivalent. Then, it has to decide whether this toggle equivalence is "accidental" or N are toggle equivalent "accidentally", then one cannot use outputs of N 1 i and N 2 i as "cut-points" to find subcircuits that are toggle equivalent in terms of previous cut-points (because the wrong choice of cutpoints leads to false negatives). However, it is conceivable that toggle equivalence of subcircuits of N 1 and N 2 is a "rare" occasion and so N 1 i and N 2 i are subcircuits of Spec(N 1 ), Spec(N 2 ) with some reasonable probability.
The situation with LS_TI is vastly different. If , N 2 i is extremely unlikely and hence finding Spec(N 1 ) and Spec(N 2 ) looks even "more impossible" than in the case of LS_TE.
CONCLUSION
In this paper, we discuss how "external" equivalence checkers can be affected by the appearance of new powerful logic synthesis procedures. Our results imply that the increasing power of synthesis procedures may make external equivalence checking problematic if not impossible.
