We describe a complete method for the latch mapping problem that is based on the efficient integration of previously proposed techniques for latch mapping as well as novel optimizations for further improvement. The highlights of the proposed approach include a new method of integrating complete methods and incomplete methods for latch mapping, the use of incremental reasoning to optimize the overall algorithm and the use of a conventional combinational equivalence checking tool as the core engine. Experiments confirm that the proposed method retains much of the efficiency and capacity of incomplete methods while providing the completeness of complete methods and derives significant performance improvements from the proposed optimizations.
INTRODUCTION
Combinational equivalence checking (CEC) is a mature and practical technology [2, 7, 8, 9, 11] that is commonly used in current verification methodology. Latch mapping is used to transform the problem of checking sequential equivalence into a combinational equivalence checking problem. Recent work on latch mapping [1, 3, 4, 5, 14] has offered some promising solutions. However, as combinational equivalence checking technology becomes more pervasive in commercial verification flows, there is a need for more powerful and efficient latch mappers to complement them.
Methods for latch mapping can be divided into incomplete methods and complete methods. Incomplete methods use heuristics to group promising matches without providing any guarantees on the correctness or completeness of the matching. They can be function-based or non-function based. Non-function based incomplete methods [4] use name or structural comparisons to group latches. Function-based methods such as those proposed in [1, 4] use random simulation [4] or ATPG-based search [1] to generate inequivalence information, which is used to group latches. Complete methods, on the other hand, are guaranteed to produce a latch mapping if one exists, given sufficient computational resources. Almost all complete methods for latch mapping proposed in the literature [3, 5, 14] employ a functional fixed-point iteration to refine the universe of all latches into a provably correct and complete grouping. Also, related work on general sequential equivalence checking [6, 12, 13] uses techniques that may be applicable to latch mapping.
In this paper we describe a methodology for the latch mapping problem that is based on the efficient integration of previously proposed techniques as well as novel optimizations to further improve these techniques. This method has been deployed and extensively tested in an industrial setting and has proved to be quite successful. Our methodology for latch mapping is premised on the following observations, based in part on the experiences of other researchers who have worked on this topic.
Incomplete methods can usually match a large percentage of the latches in practical designs, efficiently and correctly [1]. Therefore, any efficient methodology for latch mapping needs to make use of such methods. However, complete methods are the only option for difficult instances of latch mapping, such as those produced by heavy optimization of a design by a tool which does not preserve name correspondences [3], or cases where one of the circuits under comparison has significant redundancies and/or replicated portions of logic which the second circuit does not have. This could lead to matchings involving groups of more than two latches or complex intra-circuit groupings. Incomplete methods are usually designed to detect simple, pairwise inter-circuit matchings. Thus, an effective method for latch mapping should be one which combines incomplete and complete methods in a symbiotic manner. Another benefit of using a complete latch mapping method is that if the given problem is not a latch mapping problem (and hence not solvable through CEC), the complete method could return a partial set of true latch equivalences which can subsequently be utilized by a more general sequential equivalence verifier.
In the light of the above, the major contributions of this paper can be summarized as follows:
• We formulate and discuss the problem of combining incomplete and complete methods for latch mapping and propose a comprehensive and efficient solution to this problem.
• We use an industrial-strength, mixed-engine combinational equivalence checker [9] to implement a complete algorithm for latch mapping, similar to the fixed-point iteration reported in [3, 14]. We note that previous approaches have either reported results using a single engine (BDDs in [14] and ATPG in [1]) or not given details of the engine used [3]. Thus, this is the first reported application and experimental evaluation of a proven combinational equivalence checking technology for latch mapping.
• We propose two novel techniques to substantially improve the performance of the above complete method. The key idea is to optimize the use of the CEC tool within the latch mapping framework, rather than using it as a black box.
The rest of the paper is organized as follows. The next section reviews some notation and basic algorithms used in the rest of the exposition. In Section 3 we describe the details of our latch mapping methodology. Section 4 presents an experimental validation of our approach. We present our conclusions in Section 5.
PRELIMINARIES
Sequential circuits can be represented as finite state machines 
The Latch Mapping Problem
Let the two sequential circuits being checked for equivalence be represented by FSMs $M_{Spec}$ (specification FSM) and $M_{Impl}$ (implementation FSM). Further, to simplify the exposition we assume the circuits have a single clock, the same inputs and outputs, and exactly one initial state, denoted $S_{0,Spec}$ and $S_{0,Impl}$ respectively. We note that the methods presented in this paper can be extended for the case of multiple initial states using the treatment in PI. Thus 
The relation $R_L$ is designed to group together latches that are equivalent, under some notion of sequential equivalence. The equivalences that hold form the refined relation $R_L^{i+1}$. Note that the primary outputs $O$ of $M_{Spec}$ and $M_{Impl}$ are ignored when solving the latch mapping problem.
FL(S)
As mentioned in [3, 14], the overall efficiency of the procedure can be significantly improved by refining $R_L^0$ through random simulation, before entering the fixed-point iteration.
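The fixed-point refinement referred to above can be illustrated with a small Python sketch. This is a toy under stated assumptions, not the implementation evaluated in this paper: latches are Boolean, next-state functions are given as Python callables, and an exhaustive enumeration stands in for the equivalence-checking engine. The names `refine_to_fixed_point`, `toy_next` and `toy_init` are hypothetical.

```python
from itertools import product

def refine_to_fixed_point(next_state, init, inputs):
    """Refine a latch partition until it is stable (toy, exhaustive check).

    next_state: latch name -> function(state, inp) returning the next value
    init:       latch name -> initial Boolean value
    inputs:     names of the primary inputs
    """
    # Start from classes of latches that agree in the initial state.
    by_init = {}
    for latch in sorted(next_state):
        by_init.setdefault(init[latch], []).append(latch)
    partition = sorted(by_init.values())

    while True:
        refined = []
        for cls in partition:
            by_signature = {}
            for latch in cls:
                signature = []
                # Impose the current relation: one shared value per class.
                for class_vals in product([False, True], repeat=len(partition)):
                    state = {m: v for c, v in zip(partition, class_vals) for m in c}
                    for in_vals in product([False, True], repeat=len(inputs)):
                        inp = dict(zip(inputs, in_vals))
                        signature.append(next_state[latch](state, inp))
                # Latches whose next-state functions differ are split apart.
                by_signature.setdefault(tuple(signature), []).append(latch)
            refined.extend(by_signature.values())
        refined = sorted(refined)
        if refined == partition:      # fixed point reached
            return partition
        partition = refined

# Hypothetical four-latch example with one primary input "in".
toy_next = {
    "a": lambda s, x: s["a"] ^ x["in"],
    "b": lambda s, x: s["b"] ^ x["in"],
    "c": lambda s, x: s["c"] | x["in"],
    "d": lambda s, x: s["c"] ^ x["in"],
}
toy_init = {"a": False, "b": False, "c": False, "d": False}
classes = refine_to_fixed_point(toy_next, toy_init, ["in"])
print(classes)
```

In this example latch `d` survives the first iteration together with `a` and `b`, and is only separated in the second iteration, once `c` has left the class. This is why the refinement must be iterated until nothing changes.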
PROPOSED METHOD
We believe that a latch mapping methodology needs to employ a combination of incomplete and complete methods in order to provide a comprehensive but efficient solution to industrial latch mapping problems. Our method is based on such a combination. Our tool includes a set of efficient structural and functional incomplete methods for latch mapping. We believe these methods are competitive with the state of the art in incomplete methods. However, these methods are not the contribution of this paper and hence will not be discussed here. Our contributions are 1.) a novel method of combining incomplete and complete methods for latch mapping (discussed in Section 3.1) and 2.) an optimized complete method for latch mapping (described in Section 3.2). The complete method is a variant of van Eijk's algorithm [14], implemented using powerful engines and further enhanced with novel pruning techniques.
A Hybrid Method for Latch Mapping
Previous works on latch mapping have proposed either complete methods [3, 5, 14] or incomplete methods [1, 4] for the problem. In the following we formulate and discuss the problem of combining these two kinds of approaches and propose a comprehensive and efficient solution. To the best of our knowledge, this is the first reported work to explicitly address this aspect of latch mapping. Figure 2 shows a possible flow in which incomplete and complete methods could be used together. If $R_1$ and $R_2$ are the partial matches produced by incomplete and complete methods, respectively, the final match can be computed as $\langle R_1 \vee R_2 \rangle$, where $\langle R \rangle$ denotes the transitive closure of relation $R$. The key element is in generating the modified circuit model (denoted $P^*$) to be supplied as input to the complete method.
More concretely, this problem can be formulated as follows. Suppose $R_{L_{sub}}$ is a partial latch mapping given to us, i.e. $R_{L_{sub}}$ is an equivalence relation over the set of latches $L_{sub} \subseteq L$. In practice, the partial match may be generated by a combination of incomplete methods and constraints specified by the designer. Further, we assume that $R_{L_{sub}}$ is a true relation, i.e. $R_{L_{sub}} \subseteq R_L^{max}$. In other words, the equivalences specified in $R_{L_{sub}}$ are true in any correct and complete solution to the given latch mapping problem. Our objective is to use a complete method to find a complete latch correspondence, taking advantage of the information provided by $R_{L_{sub}}$. A simple method of producing the modified circuit model $P^*$ from $P$ and $R_{L_{sub}}$ is the following. The equivalences represented by $R_{L_{sub}}$ are imposed on the present-state variables in $P$ by merging all specified equivalent latches. Further, the next-state functions of the latches $L_{sub}$ and the cones of logic exclusively feeding them are removed from $P$. The following example illustrates the problem with this approach.
EXAMPLE 3.1. Suppose the exact latch correspondence solution has latches $l_i$ and $l_j$ from $M_{Spec}$ mapped together with latch $l_k$ from $M_{Impl}$. Further, suppose that the supplied partial match $R_{L_{sub}}$ is such that $(l_i, l_k) \in R_{L_{sub}}$ but $l_j \notin L_{sub}$. Then, the $P^*$ produced as above will not have latches $l_i$ and $l_k$. Hence the complete method will not be able to match latch $l_j$ to anything. Consequently, the overall solution (final match) will be missing part of the equivalence $l_i \equiv l_j \equiv l_k$ and therefore be incomplete.
Our method is a modification of the above solution. [...] is precisely the maximum latch correspondence $R_L^{max}$ that would be computed by van Eijk's algorithm executed on the original problem, $P$.
Efficient Implementation of Van Eijk's Algorithm
As described in Section 2.2, van Eijk's algorithm involves a fixed-point iteration. In each iteration, latch equivalences of the current latch mapping relation are imposed on the inputs of the circuit model $P$ and the same equivalences are verified at the outputs.
This computation can be easily reduced to a combinational equivalence checking problem. Additionally, the two circuits $M_{Spec}$ and $M_{Impl}$ often exhibit a lot of structural similarity. Therefore, we have chosen to implement this computation with a state-of-the-art, industrial-strength combinational equivalence checking (CEC) tool [9]. This tool implements a well-tuned combination of random simulation, BDDs, ATPG and structural pruning techniques and compares very favorably to the state of the art in combinational equivalence checking.
Like most industrial combinational equivalence checkers today, our CEC tool exploits the structural similarity between the two circuits under comparison. The general approach is to partition the overall equivalence check into a set of smaller equivalence checks. A set of candidate equivalences (referred to as potentially equivalent nodes (PENs) in the sequel) is initially composed from internal nodes of the two circuits. Then the algorithm sweeps from the primary inputs to the outputs, successively resolving these candidate equivalences and taking advantage of internal equivalences proved thus far, until the output equivalences are resolved.
We note that previous implementations of van Eijk's algorithm and related work have been reported using a single engine [1, 14] or a dedicated combination of engines [3, 6]. By using an off-the-shelf CEC tool we hope to seamlessly leverage the powerful technology built into current CEC tools and also bypass the development effort required to build a dedicated tool of comparable performance.
In our CEC tool PENs are maintained and manipulated as follows. Initially, PENs are composed by simulating the circuit with random vectors. Internal nodes with the same simulation signature are grouped together. Each such group is a PEN set. The interpretation here is that nodes in the same group are potentially equivalent, hence PENs. Nodes falling in different groups cannot be combinationally equivalent. These classes are refined by successively validating PEN pairs from each group, until finally all PENs have been resolved (i.e. either proved equivalent or proved inequivalent and split apart by a witness vector). When used in van Eijk's algorithm, the PEN sets are populated by the internal nodes $W = \{w_1, w_2, \ldots, w_n\}$ of the circuit model $P$. In each iteration $i$, the latch inputs $L$ are constrained by the corresponding relation $R_L^{i-1}$. We will denote this instance of $P$ as $P^i$. Further, if $w$ is an arbitrary internal node in $P$, $w^i$ denotes its instance in $P^i$. Given an internal node $w$ in $P$, TFI($w$) and TFO($w$) denote the set of nodes in the transitive fanin and transitive fanout of $w$ respectively.
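The initial composition of PEN sets by random simulation can be sketched as follows. This is only an illustration under assumptions: the netlist format (a dictionary in topological order with `AND`, `OR` and `NOT` gates), the 64-bit parallel simulation word and the names `simulate` and `pen_sets` are hypothetical and do not describe the data structures of the CEC tool used in this paper.

```python
import random
from collections import defaultdict

WIDTH = 64                      # number of random vectors simulated in parallel
MASK = (1 << WIDTH) - 1

def simulate(netlist, inputs, rng):
    """Bit-parallel simulation; netlist must be listed in topological order."""
    values = {name: rng.getrandbits(WIDTH) for name in inputs}
    for node, (op, fanins) in netlist.items():
        args = [values[f] for f in fanins]
        if op == "AND":
            values[node] = args[0] & args[1]
        elif op == "OR":
            values[node] = args[0] | args[1]
        elif op == "NOT":
            values[node] = ~args[0] & MASK
        else:
            raise ValueError(f"unknown gate type: {op}")
    return values

def pen_sets(netlist, inputs, seed=0):
    """Group internal nodes that share a simulation signature."""
    values = simulate(netlist, inputs, random.Random(seed))
    groups = defaultdict(list)
    for node in netlist:
        groups[values[node]].append(node)
    # Singleton groups have no candidate partner, so they are dropped.
    return [group for group in groups.values() if len(group) > 1]

# Hypothetical netlist: n2 is the De Morgan form of n1 = a AND b.
toy = {
    "na": ("NOT", ["a"]),
    "nb": ("NOT", ["b"]),
    "n_or": ("OR", ["na", "nb"]),
    "n2": ("NOT", ["n_or"]),
    "n1": ("AND", ["a", "b"]),
    "n3": ("OR", ["a", "b"]),
}
candidates = pen_sets(toy, ["a", "b"])
print(candidates)
```

Equivalent nodes always land in the same group, while a shared signature is only a hint: each candidate pair still has to be proved or refuted by the equivalence-checking engines.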
In practice, the application of the constraints $R_L^{i-1}$ on the present-state latch inputs $L$ is implemented as follows. As mentioned in Section 2.1, $R_L$ is represented as a set of equivalence classes. For each such class $C_j$, one latch $l_{rep}^j$ is chosen as the representative and all fanouts of other latches in $C_j$ are re-routed from $l_{rep}^j$.
In the sequel we describe two novel optimizations to substantially improve the performance of the CEC tool for latch mapping. Although these optimizations are described in the context of our CEC tool, they are equally applicable to other CEC tools.
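The representative-based implementation of the latch constraints described above can be sketched in a few lines of Python. The fanin-list netlist format and the function name `merge_to_representatives` are assumptions made for illustration, and picking the lexicographically smallest latch as representative is an arbitrary choice.

```python
def merge_to_representatives(fanins, classes):
    """Re-route every latch reference to the representative of its class.

    fanins:  node name -> list of fanin names (latches or other nodes)
    classes: list of latch equivalence classes
    """
    rep_of = {}
    for cls in classes:
        rep = min(cls)          # assumption: any deterministic choice works
        for latch in cls:
            rep_of[latch] = rep
    # Names without a class (gates, primary inputs) are left untouched.
    rerouted = {
        node: [rep_of.get(f, f) for f in ins] for node, ins in fanins.items()
    }
    return rerouted, rep_of

# Hypothetical example: l1 and l2 are currently assumed equivalent.
toy_fanins = {"g1": ["l1", "x"], "g2": ["l2", "x"], "g3": ["l3", "g1"]}
rerouted, rep_of = merge_to_representatives(toy_fanins, [["l1", "l2"], ["l3"]])
print(rerouted)
```

After the merge, `g1` and `g2` read the same present-state variable, so a structural comparison is already enough to see that they compute the same function under the current relation.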
Incremental Reasoning
As per the above description, in each iteration of van Eijk's algorithm, the CEC tool creates and verifies a number of internal node PENs. The notion of incremental reasoning seeks to reduce some of this effort. It is motivated by the following observations we made while using van Eijk's algorithm in an industrial setting: 
Ri-1
From the above it is plausible that a large fraction of the PEN equivalences (and inequivalences) that were true in iteration $i$ would also hold in iteration $i + 1$. Therefore, our proposal is to efficiently extract PEN equivalences (and inequivalences) that remain invariant under the refinement of $R_L^{i-1}$ to $R_L^i$, performed in iteration $i$.
This information is then supplied as axioms to the CEC tool in iteration i + 1, which can bypass the verification of these PENs in this iteration. In this manner, the CEC tool only re-verifies information that has potentially changed since the last iteration.
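The idea of carrying verdicts from one iteration to the next can be illustrated with a deliberately simple rule. The sketch below is not the bookkeeping used in our tool; it is a conservative stand-in under two assumptions: (1) an inequivalence can always be kept, because refinement only relaxes the latch constraints and so the old witness vector remains valid, and (2) an equivalence is kept only if no latch in the transitive fanin of either node changed its representative. All function and variable names are hypothetical.

```python
def latch_support(node, fanins, latches, memo):
    """Set of latches in the transitive fanin of a node (memoized DFS)."""
    if node in memo:
        return memo[node]
    if node in latches:
        result = {node}
    else:
        result = set()
        for f in fanins.get(node, []):   # primary inputs have no fanins
            result |= latch_support(f, fanins, latches, memo)
    memo[node] = result
    return result

def carry_over(proved_equal, proved_unequal, fanins, old_rep, new_rep):
    """Select the PEN verdicts that can be passed on as axioms."""
    latches = set(old_rep)
    moved = {l for l in latches if old_rep[l] != new_rep[l]}
    memo = {}
    keep_equal = [
        (u, v) for u, v in proved_equal
        if not (latch_support(u, fanins, latches, memo)
                | latch_support(v, fanins, latches, memo)) & moved
    ]
    # Inequivalences survive any refinement of the latch relation.
    return keep_equal, list(proved_unequal)

# Hypothetical iteration: the class {l3, l4} is split, {l1, l2} is not.
toy_fanins = {"w1": ["l1", "x"], "w2": ["l2", "x"],
              "w3": ["l3", "x"], "w4": ["l4", "x"]}
old_rep = {"l1": "l1", "l2": "l1", "l3": "l3", "l4": "l3"}
new_rep = {"l1": "l1", "l2": "l1", "l3": "l3", "l4": "l4"}
keep_equal, keep_unequal = carry_over(
    [("w1", "w2"), ("w3", "w4")], [("w1", "w3")], toy_fanins, old_rep, new_rep
)
print(keep_equal, keep_unequal)
```

Only the pair `(w3, w4)` has to be re-verified here, since the split of `l3` and `l4` is the only change that reaches either node.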
Our method has two aspects: 1.) using PEN inequivalence information, and 2.) using PEN equivalence information.
Using PEN Equivalence Information: Lemma 3.1 provides a way of leveraging equivalences proved in previous iterations to bypass some such checks in the current iteration. However, the conditions stated in Lemma 3.1 are computationally difficult to check. Therefore, we have implemented a safe approximation of them which can be represented, maintained and checked by simple structural operations on the circuit. The implementation requires two data entities to be maintained for each node $w$ in $P$, namely a single-bit [...] $w_1^j \neq w_2^j$ for any $j \geq i$.
THEOREM 3.3. Given internal nodes $w_1$ and $w_2$ in $P$ and some iteration $i$, if $w_1^{i-1} = w_2^{i-1}$ and, after updating the changed and affected data as per Algorithm 2, one of the following holds:
1. affected($w_1^i$) = NULL $\wedge$ changed($w_1^i$) = 0, or
2. affected($w_2^i$) = NULL $\wedge$ changed($w_2^i$) = 0, or
3. changed($w_1^i$) = 0 $\wedge$ changed($w_2^i$) = 0 $\wedge$ (affected($w_1^i$) = affected($w_2^i$) $\neq$ NULL),
then $w_1^i = w_2^i$.
Theorem 3.3 and Algorithm 2 form the basis for our method to bypass certain PEN equivalence checks during a run of our CEC tool in van Eijk's algorithm. Algorithm 3 describes our incremental version of van Eijk's algorithm.
[...] The final PEN sets obtained after this step constitute the relation $Q^i_{final}$.
vi. Verify $R_L^{i-1}$ at the NS latch variables, checking equivalences pair by pair, as with internal PENs in the last step. The relation obtained after these checks is $R_L^i$.
3. If $R_L^i = R_L^{i-1}$ stop and return $R_L^{max} = R_L^i$, else iterate Step 2 with $i \leftarrow i + 1$.
PEN Rejection during CEC
We have developed the following simple, but useful optimization to further enhance the performance of the CEC tool during van Eijk's algorithm. This is applicable when van Eijk's algorithm is used in a hybrid flow as in Section 3.1.
As described in Section 3.1, the latches ($L_{sub}$) comprising the partial match $R_{L_{sub}}$ are condensed into a set of representatives [...] van Eijk's algorithm, and an arbitrary equivalence class $C$ of the current latch relation, the optimization is as follows: [...] i.e. all the latches of this class are representative latches, discard $C$ and remove from $P$ all internal nodes that fan out only to latches in $C$. This optimization simplifies the problem being solved by the CEC tool. It is derived from our practical experience with the typical anatomy of partial matches $R_{L_{sub}}$. Our experience is that partial matches specified in $R_{L_{sub}}$ are either complete to begin with or they are to be matched with latches that are unmatched so far (i.e. not part of $L_{sub}$). Matchings between latches in $L_{sub}$ are either not observed (other than the ones stated in $R_{L_{sub}}$) or not important from the point of view of the latch mapping problem.
Note that while in theory this optimization makes the overall method incomplete, in practice, in extensive experiments, we have never found this to be the case. On the contrary, this simple optimization provides significant runtime improvements in many (but not all) cases. In any case, this is an innocuous optimization which can easily be omitted from the method for a particular problem, if there are any concerns of completeness, without impacting the correctness of the rest of the algorithm.
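A minimal sketch of this rejection step is given below. It reflects one reading of the rule, stated here as an assumption: a class is discarded when every latch in it is a representative of the partial match, and an internal node is removed when every latch it feeds belongs to a discarded class. The map `latch_fanout` (internal node to the set of latches whose next-state inputs it reaches) and the function name `reject_classes` are hypothetical.

```python
def reject_classes(classes, representatives, latch_fanout):
    """Drop classes made up only of representative latches.

    classes:         list of latch equivalence classes
    representatives: set of representative latches of the partial match
    latch_fanout:    internal node -> set of latches it feeds
    """
    kept, dead_latches = [], set()
    for cls in classes:
        if set(cls) <= representatives:
            dead_latches.update(cls)     # class is rejected
        else:
            kept.append(cls)
    # Nodes observed only through rejected latches need not be checked.
    removable = {
        node for node, reached in latch_fanout.items()
        if reached and reached <= dead_latches
    }
    return kept, removable

# Hypothetical example: r1, r2, r3 are representatives, u1 is unmatched.
kept, removable = reject_classes(
    classes=[["r1", "r2"], ["r3", "u1"]],
    representatives={"r1", "r2", "r3"},
    latch_fanout={"n1": {"r1"}, "n2": {"r1", "u1"}, "n3": {"r2"}},
)
print(kept, removable)
```

Node `n2` is retained because it still feeds the unmatched latch `u1`. As noted above, the step trades theoretical completeness for speed, so a flow may wish to make it optional.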
EXPERIMENTAL RESULTS
In the following we present experimental results on some representative circuits to validate the following two claims regarding the techniques proposed in this paper.
1. The hybrid method of Section 3.1 is an improvement over the individual incomplete or complete methods.
2. The optimizations proposed in Sections 3.2.1 and 3.2.2 substantially improve the complete method.
Our benchmark suite consists of several large industrial circuits (a few hundred to a few thousand latches) as well as the largest of the ISCAS89 sequential circuits. In the case of the ISCAS89 examples the two circuits (to be verified) were obtained by optimizing the original benchmark with the script.rugged combinational optimization script in SIS [10]. The industrial examples are obtained from a wide variety of design scenarios including 1.) combinational optimization of one gate-level design by hand and/or by a CAD tool, and 2.) two designs generated from different synthesis paths from RTL (e.g. one by hand and the other by a CAD tool).
All experiments are reported on a 450MHz Sun UltraSparc-I11 machine, using implementations of the algorithms and techniques described in Section 3.
It is noteworthy that the time taken to solve a given latch mapping problem depends not only on the size (in terms of number of latches) of the problem but also on the difficulty of the problem. As in combinational equivalence checking, this depends on how different the two sequential designs being compared are. In our experience almost 50-60% of typical latch mapping problems are relatively simple because the two designs have the same number of latches and are structurally very similar. In such cases the latch mapping solution consists of a pair-wise matching of latches from the two circuits, which can be efficiently and correctly obtained by incomplete methods alone [1, 4]. Our contributions are aimed at the remaining 40-50% of latch mapping problems. Figure 3(a) presents a comparison of the latch mapping results between incomplete methods (a set of proprietary, in-house incomplete methods, as mentioned in Section 3), our implementation of a complete method (see Section 3.2) and the combination of these two as explained in Section 3.1. Note that in almost all the examples the two designs have different numbers of latches. Columns 2 and 3 show the number of latches in the two designs being matched. Columns 4, 5 and 6 show the match times for the three methods. The number of matches has been noted in parentheses whenever the method was unsuccessful in obtaining a complete match. In all but two of the examples the incomplete method was unsuccessful in finding a complete match. In all but one case, the complete method is able to find the complete match but is computationally much more expensive. Our proposed hybrid scheme, on the other hand, is able to find a complete match in all cases, and much faster than the individual complete method. In the two cases where the incomplete method finds a complete match, the combined scheme does not incur any additional cost.
The example Bench 1 provides an interesting case where the complete method cannot find a complete match whereas the combined method does. In this case some of the equivalence checks in the complete method aborted due to resource constraints. When using the combined method, these equivalence checks were circumvented by the partial match supplied by the incomplete methods.
The second set of results, presented in Figure 3(b), demonstrates the speed-up obtained by using the optimizations presented in Sections 3.2.1 and 3.2.2. The speedup number is a ratio of runtimes of our implementation of the complete method, with and without the optimizations. The experiments cover two scenarios: 1.) where the complete method is run on the modified circuit model $P^*$ (as it would be in the case of the combined method), and 2.) where the complete method is run on the original problem $P$ (these cases are marked as (full)). In all cases (except one where there is a slight slow-down) the optimizations provide a speedup of 1.5-2.5X. The example Bench 11 is a case where the two circuits were not equivalent. The results show that the proposed optimizations can be useful in a variety of contexts. The improvements obtained are significant considering the fact that the proposed optimizations carry virtually no computational overhead and provide a speed-up even when the complete algorithm has as few as 2-3 iterations (this was the case for most benchmarks with the combined method).
CONCLUSIONS
Combinational equivalence checkers are widely deployed in industry today. However their application is often plagued by ineffective solutions to the latch mapping problem. We recognize that incomplete methods for latch mapping are inadequate for a large percentage of such problems arising in industry. Our experience indicates that complete methods, when applied alone to such problems, are unduly expensive. In this work we have proposed a complete and efficient methodology that seamlessly integrates an efficient and scalable incomplete method with a powerful complete method based on an in-house, state-of-the-art CEC tool. We further recognize that the application of the complete method as a black box is unable to utilize the information that the underlying problem is latch mapping. We have proposed several inexpensive, yet effective optimizations which further tune the complete method to latch mapping and extract significant performance gains.
We have proved the efficacy of our approach with results on large and complex industrial designs that have undergone extensive manual and automated optimization. In addition, we have presented results on several large public domain benchmarks. We believe that the methodology presented in this paper, along with previous complementary techniques [1, 4], can provide effective latch mapping support to CEC-based verification flows.
