Sequential clock-gating can lead to easier equivalence checking problems, compared to the general sequential equivalence checking (SEC) problem. Modern sequential clock-gating techniques introduce control structures to disable unnecessary clocking. This violates combinational equivalence but maintains sequential equivalence between the original and revised circuits. We propose the use of characteristic graphs (CGs) to extract essential LTL clock-gating properties, which when proved imply certain sequential redundancies. The extraction, proof and subsequent removal of the implied redundancies lead to an efficient SEC procedure for clock-gated circuits. Experiments show that the proposed SEC procedure substantially outperforms existing methods in terms of speed and scalability when applied to several difficult cases.
INTRODUCTION
In a modern VLSI design flow, equivalence checking is used between circuit designs in different stages [11]. As the use of sequential synthesis techniques increases, efficient sequential equivalence checking (SEC) methods become more necessary; indeed, they are what enables the use of sequential synthesis in the first place. General SEC is PSPACE-complete, and hence harder than combinational equivalence checking (CEC), which is only NP-complete. This work is motivated by the fact that some sequential synthesis methods only modify a circuit by introducing control structures which are sequentially redundant [9]. Hence equivalence checking can be based on detecting these redundancies, eliminating them and then performing SEC between the resulting circuit and a golden model. We propose such a method, apply it to sequentially clock-gated circuits, and report experiments comparing the new method against existing techniques.
The contributions of this paper are summarized below:
1. We formulate a graph representation (CG) of an AIG (R) circuit. The CG expresses the overlaying control structure of R. An algorithm is provided that constructs CG from R.
5. This method was implemented and applied to a number of academic and industrial clock-gated circuits to SEC them against the original circuits. Experimental results show that this method is much more efficient than existing methods for performing SEC on clock-gated circuits.

The remainder of this paper is organized as follows: some basic notation is reviewed, and the most relevant previous works are discussed in Section 2. In Section 3, we describe a method for representing the essential features of a clock-gated circuit using a characteristic graph. Section 4 discusses the connection between the characteristic graph and sequential redundancy in clock-gated circuits. The overall flow of our SEC method for clock-gated circuits is given in Section 5, and Section 6 compares the performance of the proposed method with previous works using several sets of experiments. Section 7 summarizes the results and some future directions.
PRELIMINARIES
Sequential Circuits
A sequential circuit, as shown in Figure 1, consists of a combinational logic part (CL), and sets of primary inputs (PIs), primary outputs (POs) and memory components (flip-flops, FFs). The combinational part represents a functional mapping from PIs and current states of FFs, to POs and the next state of each FF. Combinational and sequential synthesis techniques [10] are applied to sequential circuits in modern VLSI design flows to minimize chip area, reduce power consumption, optimize the clock period, etc. Combinational synthesis preserves the functional mapping to reduce the cost functions, while sequential synthesis provides more flexibility by possibly changing the next state functions and thereby producing further reductions in the costs.
Sequential Synthesis for Low-Power Circuits
Static power refers generally to the power required to maintain the state of a circuit, and dynamic power refers to power needed during switching activity. Here we only focus on the dynamic power consumption.
To reduce the power consumed for updating FFs, synthesis tools apply forward and backward clock-gating [9]. Forward clock-gating is used to disable the clock of a FF when all of its supporting dependencies (supports) are known to remain unchanged during the current clock period. Thus, a FF need not be updated if the FFs in its support remain the same as in the previous timeframe. Backward clock-gating turns off the clocks of FFs when the updates of the FFs can never be observed at the outputs.
Generally, forward and backward clock-gating techniques modify the clocks of FFs by using enabling signals. These are sequentially redundant in that they can be removed without modifying any observed behavior. These redundancies are used to minimize the frequency of updating some of the FFs, hence reducing dynamic power consumption. Therefore, the clock-gated circuit will keep the same sequential properties but with fewer updates of the memory components.
As shown in Figure 2, clock-gating a FF can be modeled with a feedback loop through a multiplexer, where the enable signal, E, controls whether O takes the value of I or keeps its old value. Thus the modeling of clock-gated circuits only requires general FFs.
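The MUX-feedback model can be written down directly. The following is a minimal sketch, assuming single-bit signals; the function names `gated_ff_next` and `simulate` are illustrative and do not come from the paper.

```python
def gated_ff_next(enable: int, data_in: int, current: int) -> int:
    """Next state of a clock-gated FF modeled as a MUX feedback loop.

    When enable is 1 the FF loads data_in (I); otherwise it keeps its old value.
    """
    return data_in if enable else current


def simulate(enables, inputs, initial=0):
    """Return the sequence of FF outputs O for the given E and I sequences."""
    state, outputs = initial, []
    for e, i in zip(enables, inputs):
        outputs.append(state)          # O is the current state of the FF
        state = gated_ff_next(e, i, state)
    return outputs


# The FF loads 1 in the first cycle, then holds it while E = 0.
trace = simulate(enables=[1, 0, 0, 1], inputs=[1, 0, 0, 0])
```

Because the hold behavior is expressed purely through the next-state function, no special gated-clock primitive is needed in the model.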
Previous Work on Sequential Equivalence Checking
After synthesis, we need to ensure that the circuits before and after synthesis are sequentially equivalent [4]. SEC for general circuits is typically formulated as a model checking problem on the miter between two sequential circuits (where the PO pairs are XORed to form the outputs of the miter circuit). Thus, sequential model checking techniques, including induction [15], bounded model checking (BMC) [7] and property-directed reachability (PDR) [8], can be applied to check if the outputs of the two circuits are always identical. If an output of the miter can ever become 1, sequential equivalence is violated, and model checking can provide an input sequence leading to the violation. Otherwise, the two circuits are proved sequentially equivalent.
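As an illustration of the miter formulation, the sketch below enumerates all input sequences up to a fixed depth and reports the first one that drives a miter output to 1. It is a toy stand-in for BMC, not a proof procedure: a result of `None` only means that no difference exists within the bound. The circuit encoding (a `step` function returning the next state and the PO values) and the three example circuits are assumptions made for this example.

```python
from itertools import product


def miter_output(pos_a, pos_b):
    """OR of the pairwise XORs of corresponding primary outputs."""
    return int(any(a ^ b for a, b in zip(pos_a, pos_b)))


def bounded_check(step_a, init_a, step_b, init_b, num_inputs, depth):
    """Search all input sequences of up to `depth` frames for a miter output of 1.

    step(state, inputs) must return (next_state, outputs).
    Returns a counter-example as a list of input tuples, or None.
    """
    frames = list(product((0, 1), repeat=num_inputs))
    for length in range(1, depth + 1):
        for seq in product(frames, repeat=length):
            sa, sb = init_a, init_b
            for inputs in seq:
                sa, out_a = step_a(sa, inputs)
                sb, out_b = step_b(sb, inputs)
                if miter_output(out_a, out_b):
                    return list(seq)
    return None


def toggle(q, inputs):
    """FF that toggles when x = 1; the PO is the current state."""
    return q ^ inputs[0], (q,)


def toggle_mux(q, inputs):
    """Same behavior written as a MUX: load NOT q when x = 1, else hold."""
    return ((1 - q) if inputs[0] else q), (q,)


def sticky(q, inputs):
    """A different circuit: once set, the FF stays at 1."""
    return q | inputs[0], (q,)
```

Here `bounded_check(toggle, 0, toggle_mux, 0, 1, 4)` finds no difference, while comparing `toggle` with `sticky` yields an input sequence that exposes the mismatch.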
However, because model checking is PSPACE-complete, applying it to SEC problems may be too hard. Of course, CEC can be tried first, and if it succeeds, the two circuits are also sequentially equivalent. However, most effective sequential synthesis techniques tend to change the next state functions, in which case CEC fails.
Savoj et al. [13, 14] proposed a combinational approach to SEC for clock-gating synthesis. This approach aimed at circuits synthesized using satisfiability and observability don't cares (SDC and ODC) [12]. Although it worked well compared to existing methods, there are several weaknesses in this approach: (1) it cannot conclude non-equivalence, and therefore cannot supply a witness to help the designer understand the reason for this, (2) it requires unrolling and there is no suggested number of timeframes for unrolling, (3) it still has a scalability problem, especially when the combinational logic is extremely complicated.
To remedy these problems and improve the efficiency of SEC for clock-gated circuits, we propose the concept of the characteristic graph for clock-gated circuits.
CHARACTERISTIC GRAPH
The characteristic graph (CG) is an abstraction of its corresponding circuit. It is a high-level description, which represents only the essential properties needed for SEC.
Characteristic Graph
A characteristic graph G = (V, E) is a directed graph, where a vertex stands for a group of PIs, POs, FFs or internal signals in the corresponding circuit. Each directed edge represents a signal dependency from one group to another. There are three types of edges: selection-edge, on-edge and off-edge. A selection-edge connects exactly one signal to the target group, to indicate the conditional switch of the group dependency. On-edges and off-edges connect the different sets of support groups to the target group if the selection signal is 1 or 0, respectively. The selection-edge of each PI is driven by a constant True, which means the value of each input signal is unconditionally updated. A group containing PIs cannot be driven by other groups. A group covering POs can have edges to other vertices, and such POs are treated as support FFs as well. A precise algorithm is given in Figure 5 for constructing the CG of a sequential circuit.

Figure 3 depicts a sequential circuit with its corresponding characteristic graph. A line with a white arrow represents a selection-edge, while solid and dotted lines with black arrows are on-edges and off-edges, respectively. The circles (vertices) stand for sets of signals in the original circuit, including the PIs E, A and B, the PO Q, and the FFs F1 and F2. Note that there is no on-edge or off-edge into any PI, and their selection-edges are driven by constant True. Thus the value of each PI is not driven by any other signal, and it is updated at every clock tick. For the output, Q, the value of F1 (selection-edge) determines if its value is driven by B (on-edge) or F2 (off-edge). Finally, F1 is only driven by E, so it has no off-edge and its selection is True.
To sum up, a characteristic graph abstracts the data dependency among 'essential' signals, while ignoring irrelevant combinational logic parts. Although this abstraction is especially motivated by the SEC problem for clock-gated circuits, it can be used in similar problems.
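One possible in-memory form of a CG is sketched below, assuming one signal per vertex for brevity (a real vertex may group several signals). The `Vertex` class and the `figure3_example` helper are illustrative names. The edges of Q and F1 follow the description of Figure 3 in the text; the support of F2 is not described there, so driving it from A is an assumption.

```python
from dataclasses import dataclass, field

TRUE = "True"  # constant selection signal of unconditionally updated vertices


@dataclass
class Vertex:
    name: str
    selection: str = TRUE                    # source of the selection-edge
    on: set = field(default_factory=set)     # supports when selection is 1
    off: set = field(default_factory=set)    # supports when selection is 0


def figure3_example():
    """Build a CG following the textual description of Figure 3."""
    cg = {n: Vertex(n) for n in ("E", "A", "B")}   # PIs: no on/off edges
    cg["F1"] = Vertex("F1", on={"E"})              # only driven by E
    cg["Q"] = Vertex("Q", selection="F1", on={"B"}, off={"F2"})
    # Assumption: the text does not state what drives F2; A is a guess.
    cg["F2"] = Vertex("F2", on={"A"})
    return cg


cg = figure3_example()
```

Keeping on-edges and off-edges as separate sets makes it straightforward to follow only one side of a selection, which the property generation in later sections relies on.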
Construction of Characteristic Graph
Given a sequential gate-level circuit, we construct the characteristic graph as follows: (1) recognize selection signals, (2) create vertices, (3) build dependencies.
Recognize selection signals: When the input circuit is generated from a synthesis tool, in which 2-to-1 multiplexers (MUXes) are supported and explicitly expressed, the selector inputs of the MUXes are easily recognized and designated as selection signals. If the input circuit is an and-inverter graph (AIG), then we need to recognize instances of the MUX structure shown in Figure 4. The output signal O is controlled by S and conditionally switched between A and B. Therefore, the signal S here is recognized as the selection signal for the group consisting of output O. This structural matching can be performed over the AIG very quickly. However, it is possible that some essential MUX controls could be missed.

Create vertices: Initially, all POs and FFs are grouped by their common selection signals, while those signals without selection conditions are put into individual groups. For example, in Figure 3, E, A, B and F1, F2 are put in individual vertices. Each selection signal occupies an individual vertex. Each signal can be included in no more than one vertex.
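The structural MUX matching described under "Recognize selection signals" can be sketched as follows. Figure 4 is not reproduced here, so the sketch assumes the usual AIG encoding of a MUX, O = NOT(AND(NOT(AND(S, A)), NOT(AND(NOT S, B)))), and an AIGER-style literal encoding (literal = 2 * variable + complement bit). All function names are illustrative.

```python
def lit(var, negated=False):
    """AIGER-style literal: 2 * variable index + complement bit."""
    return 2 * var + int(negated)


def neg(l):
    return l ^ 1


def var_of(l):
    return l >> 1


def is_neg(l):
    return bool(l & 1)


def recognize_mux(ands, node):
    """Return (sel, then_lit, else_lit) if NOT(node) is a MUX, else None.

    `ands` maps the variable of each AND node to its two fanin literals.
    On success, NOT(node) == (then_lit if sel else else_lit).
    """
    if node not in ands:
        return None
    l0, l1 = ands[node]
    if not (is_neg(l0) and is_neg(l1)):      # both fanins must be complemented
        return None
    p, q = var_of(l0), var_of(l1)
    if p not in ands or q not in ands:       # both fanins must be AND nodes
        return None
    for i in (0, 1):
        for j in (0, 1):
            if ands[p][i] == neg(ands[q][j]):  # shared signal, opposite phase
                return ands[p][i], ands[p][1 - i], ands[q][1 - j]
    return None


# Variables: 1 = S, 2 = A, 3 = B, 4 = AND(S, A), 5 = AND(NOT S, B), 6 = AND(NOT 4, NOT 5)
ands = {
    4: (lit(1), lit(2)),
    5: (lit(1, True), lit(3)),
    6: (lit(4, True), lit(5, True)),
}
match = recognize_mux(ands, 6)
```

A purely structural match like this is fast, but it misses MUXes whose structure has been altered by earlier optimization, which is the limitation noted in the text.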
Given a sequential circuit, Cir, with the sets of primary inputs PI, outputs PO and flip-flops FF, the algorithm for constructing the characteristic graph G = (V, E) is shown in Figure 5. The function recognize(Cir) at Line 2 is used to detect the MUX structures in Cir and collect the set of selection signals S. Based on S, PI, PO and FF, vertices are created and added into V from Line 3 to Line 11. The function findTarget(Cir, s_i) is used to collect the set of signals which are controlled by s_i. From Line 12 to Line 26, the vertices are connected according to the data dependencies in Cir. For the function connect(...) at Lines 14, 19, 21 and 26, the first argument is the target vertex, the second is the support vertex, and the third is the edge type. The function getVertex(V, sup) returns the vertex which covers sup.
The function backtrack(...) at Lines 16, 17 and 24 goes back one timeframe from target t and returns the supports (PIs or FFs) on the boundaries. If the third argument, s, and the fourth argument, on or off, are specified, this function only backtracks the specified input side of each target MUX, and collects the corresponding supports.
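A minimal sketch of such a one-timeframe backtrack is given below. The netlist encoding (tuples tagged "pi", "ff", "po", "and", "not", "mux") is an assumption made for this example and is not the representation used in the paper; only the function name and argument order follow the text.

```python
def backtrack(cir, target, sel=None, side=None):
    """Collect the PI/FF supports of `target` across one timeframe.

    `cir` maps a signal name to one of
      ("pi",), ("ff", next_state), ("po", driver),
      ("and", a, b), ("not", a), ("mux", s, then_in, else_in).
    `target` must be an FF or a PO. If `sel` and `side` ("on" or "off") are
    given, only that data input of every MUX selected by `sel` is followed.
    """
    start = cir[target][1]          # next-state signal of an FF, driver of a PO
    supports, seen, stack = set(), set(), [start]
    while stack:
        sig = stack.pop()
        if sig in seen:
            continue
        seen.add(sig)
        node = cir[sig]
        if node[0] in ("pi", "ff"):
            supports.add(sig)       # boundary of the timeframe reached
        elif node[0] == "mux" and sel is not None and node[1] == sel:
            stack.append(node[2] if side == "on" else node[3])
        else:
            stack.extend(node[1:])  # follow every fanin of the gate
    return supports


# F2 is clock-gated by E: it loads A when E = 1 and holds otherwise.
cir = {
    "E": ("pi",),
    "A": ("pi",),
    "m": ("mux", "E", "A", "F2"),
    "F2": ("ff", "m"),
}
```

With this netlist, the on-side of F2 is supported by A alone, the off-side by F2 itself, and an unrestricted backtrack also reaches the selector E.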
Once the characteristic graph is constructed, it is used to detect sequential redundancy candidates.
SEQUENTIAL REDUNDANCY AND CLOCK-GATING
Algorithm: Characteristic Graph Construction
Input: Cir: a gate-level sequential circuit with the sets of primary inputs PI, primary outputs PO and flip-flops FF
Output: G = (V, E): characteristic graph for Cir, with the sets of vertices, V, and edges, E
07. T = findTarget(Cir, s_i) // T is the set of FFs and POs controlled by s_i
13. if v is controlled by selection signal s
14. connect(v, getVertex(V, s), selection)
15. for each t covered by v
16. Sup_on = Sup_on ∪ backtrack(Cir, t, s, on)
18. for each support sup_on in Sup_on
23. for each t covered by v
25. for each support sup in Sup

Given a sequential circuit, sequential redundancy refers to a set of signals which can be replaced by other signals or constant values (1 or 0) while preserving sequential equivalence to the given circuit. In other words, the fanouts of such signals can be moved to other existing signals or to constants without changing the observed behavior.
Clock gating synthesis can be either backward or forward. During this, additional control signals are placed into a sequential circuit to reduce the frequency of updating the FFs. These extra signals by definition, must be sequentially redundant in order to preserve sequential equivalence.
In this section, we will define and prove sufficient conditions for legal forward and backward clock-gating on sequential circuits. We use CGs to identify potential sequential redundancies.
Forward Clock-Gating
Forward clock-gating aims at turning off clocks for FFs when their support FFs remain at their previous states. Thus, the clock-gating is legal if all target FFs are guaranteed to update their states when their support FFs are updated. Otherwise, if none of the supports are updated, it is immaterial (don't care) if the target FFs are updated. Under proper initial states, the signal for disabling a clock is redundant and can be set to a constant.
Given a CG and a target signal Cf, which is the selection signal for a vertex Vf, the algorithm in Figure 6 is used to formulate a set of sufficient properties for Cf to be sequentially redundant to a constant 1. The function supportVertex(...) at Lines 10 and 12 returns the set of supports which drive Vf through on-edges and off-edges. Sup_n stands for the supports of Vf obtained by backtracking exactly n timeframes. Notice that here we only collect supports from on-edges for the first timeframe, but for the further backtracking, those from off-edges are also included. Each sufficient property is added into the set P of accumulated properties at Line 16.
Theorem: If Cf is 1 at the first timeframe, a sufficient condition across 1 timeframe for Cf to be sequentially redundant to a constant 1 is

$$G\Big(\bigvee_{i} C^{1}_{v_i} \Rightarrow X\,C_f\Big) \qquad (1)$$

where $\{C^{1}_{v_i}\}$ is the set of updating conditions for the support vertices of Vf, and XCf is the value of Cf in the next clock cycle.
Notice that if Vf is driven by PIs, those inputs are taken as supports with control signals as True, which means $C^{1}_{v_i}$ equals 1.

Algorithm: Forward Properties
Input: Cf // the target signal, which must be a control signal
Input: CG = (V, E) // characteristic graph containing Cf
Output: P // set of sufficient properties for Cf being stuck-at-1
13. for each v_k in Sup_n

Proof: For timeframe 0, if the initial state sets Cf to 1, it is safe to replace the selector of Vf with constant 1. Then, the LTL property in Equation 1 guarantees that the FFs covered by Vf (called FFf) must be updated in the next timeframe whenever any support FF of FFf gets updated in the current timeframe. If the support includes a PI, then at least one of the $C_{v_i}$ is 1, so the formula states that FFf is updated. In all other cases, all support FFs are unchanged from the previous clock cycle and thus the new value for FFf is the same as the old, and we don't care if FFf is updated or not. Thus Cf can be 1 or 0 in those cases. By choosing it to be 1 in those cases as well, Cf becomes constant 1, i.e. Cf is stuck-at-1 sequentially redundant. Q.E.D.
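The argument of the proof, including the role of the initial-state requirement, can be checked exhaustively on a toy circuit. The sketch below is an illustration under stated assumptions, not the paper's benchmark: F1 loads input a when S1 = 1, F2 follows NOT F1, and the gated version updates F2 only when S2 = 1, where S2 is S1 delayed by one FF so that G(S1 ⇒ XS2) holds by construction. The names `run` and `equivalent` are illustrative.

```python
from itertools import product


def run(gated, s2_init, seq):
    """Return the sequence of F2 values for a list of (a, s1) input pairs."""
    f1, f2, s2 = 0, 0, s2_init
    trace = []
    for a, s1 in seq:
        trace.append(f2)
        # Golden design: F2 always loads NOT F1. Gated design: only if S2 = 1.
        new_f2 = (1 - f1) if (s2 or not gated) else f2
        f1 = a if s1 else f1      # F1 is updated only when S1 = 1
        s2 = s1                   # S2 is S1 delayed by one FF
        f2 = new_f2
    return trace


def equivalent(s2_init, depth=6):
    """Compare gated and golden F2 traces on all input sequences up to depth."""
    frames = list(product((0, 1), repeat=2))
    return all(
        run(True, s2_init, seq) == run(False, s2_init, seq)
        for n in range(1, depth + 1)
        for seq in product(frames, repeat=n)
    )
```

With S2 initialized to 1 the two versions agree on every enumerated sequence. With S2 initialized to 0 they already differ in the second timeframe, because F2 misses its first update; this is why the theorem requires Cf to be 1 at the first timeframe.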
Note that Equation 1 is not a necessary condition because the support FFs may change to new states, but the combinational logic may compute a next state for F Ff which is the same as its current state.
The above condition states the legality of a single timeframe forward clock gating. For multi-timeframe clock-gating, supports are collected by backtracking more than one timeframe.
Theorem 1: When Cf is proved to be 1 at timeframes 0 to n − 1, the following property is sufficient for Cf to be sequentially stuck-at-1 redundant:

$$G\Big(\bigvee_{i} C^{n}_{v_i} \Rightarrow X^{n} C_f\Big) \qquad (2)$$

where $\{C^{n}_{v_i}\}$ is the set of updating conditions for the support vertices, $V^{n}_{i}$, obtained by backtracking CG across n timeframes as detailed in Figure 6.

Proof: If $X^{n} C_f$ is 0, Equation 2 implies that all $C^{n}_{v_i}$ are 0 in the n-th previous timeframe, which indeed guarantees that all FFs covered by $V^{n}_{i}$ remain in the same states. Therefore, Cf can be 1 or 0 (Vf can be updated or not) in these cases. By choosing it to be 1 in those cases, Cf becomes constant 1. Based on the assumption that Cf is 1 at timeframes 0 to n − 1, Cf is sequentially redundant stuck-at-1. Q.E.D.
The theorem generates a set of properties, any one of which is sufficient for Cf to be sequentially redundant under the corresponding condition on the first n states. Note that the algorithm in Figure 6 stops generating new properties when the condition at Line 17 is violated. The while loop from Line 5 to Line 16 terminates when the n-th set of supports contains a primary input, which is always updated at each timeframe. Also, if the n-th set is the same as any previous support set, there is no need to backtrack the CG any further. This while loop is guaranteed to terminate because the number of vertices in the CG is finite.
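The backtracking loop and its two stopping criteria can be sketched as follows. This is a reading of the textual description, not a transcription of Figure 6: in particular, the sketch assumes that no property is emitted for a support set that contains a PI, and that a gated FF has an off-edge from itself. The CG encoding, the function name `forward_properties` and the three-stage example graph are likewise assumptions.

```python
def forward_properties(cg, pis, target):
    """Generate forward clock-gating properties for the selection of `target`.

    `cg` maps a vertex to (selection, on_supports, off_supports).
    Returns the properties as strings, one per backtracked timeframe.
    """
    sel_f = cg[target][0]
    props, seen, n = [], [], 1
    sup = set(cg[target][1])              # first timeframe: on-edges only
    # Stop when a PI is reached or a support set repeats.
    while sup and not (sup & pis) and sup not in seen:
        conds = " | ".join(sorted({cg[v][0] for v in sup}))
        props.append(f"G(({conds}) -> {'X' * n} {sel_f})")
        seen.append(sup)
        # Further timeframes: follow both on-edges and off-edges.
        sup = set().union(*(cg[v][1] | cg[v][2] for v in sup))
        n += 1
    return props


# Assumed three-stage chain A -> F1 -> F2 -> F3 -> Q with enables S1, S2, S3.
CHAIN = {
    "A": ("True", set(), set()),
    "F1": ("S1", {"A"}, {"F1"}),
    "F2": ("S2", {"F1"}, {"F2"}),
    "F3": ("S3", {"F2"}, {"F3"}),
    "Q": ("True", {"F3"}, set()),
}
PIS = {"A"}
```

On this chain the last stage yields one property per timeframe until the PI is reached, while the first stage yields none because its only support is a PI.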
Theorem 2: For a target signal Cf , if there exists some n, where the corresponding n-timeframe forward clock-gating condition is satisfied, the target signal Cf is sequentially stuck-at-1 redundant.
The assumption that Cf is 1 in the first n timeframes cannot be checked by the above LTL properties directly. Section 4.4 explains how the above theorems are applied in practice.
Backward Clock-Gating
Backward clock-gating is used to disable updating FFs when these updates are not observable.
Given a target control signal Cb, which determines if a set of FFs, Vb, is updated, the algorithm in Figure 7 formulates a set of sufficient properties for Cb to be sequentially stuck-at-1 redundant.
The function targetVertex(...) at Line 9 starts from v_i to find the target vertices (those driven by v_i) in CG, and returns (vertex, type) pairs, where type can be on-edge or off-edge. When the input argument n is 1, it only collects those driven by on-edges (because we are checking this selection signal). If v_i is driving v_k through an off-edge, we need to negate the control signal $C^{n}_{v_k}$ in a property, as Line 15 shows. This algorithm stops generating new properties when the n-th set of targets repeats any previous set (Line 23), which means no new properties can be formulated.
Algorithm: Backward Properties
Input: Cb // the candidate signal, which must be a control signal
Input: CG = (V, E) // characteristic graph containing Cb
Output: P // set of properties for proving Cb is stuck-at-1
07. FanoutMap_n = ∅ // map of (vertex, type)
08. for each v_i in Target_{n−1}
09. FanoutMap_n.add(targetVertex(v_i, CG, n))
10. Target_n = ∅
11. C^n_v = True // control signals for FFs
12. C^n_o = True // control signals for POs
13. for each pair (v_k, type_k) in FanoutMap_n
Based on this algorithm, the sufficient property across one timeframe for Cb being sequentially redundant to constant 1 is

$$G\Big(\bigwedge_{i}\big(X\,C^{1}_{v_i} \Rightarrow C_b\big) \wedge \bigwedge_{i}\big(X\,C^{1}_{o_i} \Rightarrow C_b\big)\Big) \qquad (3)$$

where $\{C^{1}_{v_i}\}$ stands for the updating conditions of target vertices, and $\{C^{1}_{o_i}\}$ is for target vertices containing POs, which are driven by Vb across one timeframe (because any PO is observable).
Proof: The LTL property in Equation 3 guarantees that the FFs (FFb) contained in Vb are updated when any of its targets $V^{1}_{i}$ gets updated in the next timeframe. If Cb is 0, this property implies that none of the $V^{1}_{i}$ gets updated in the next timeframe. This blocks any new state of FFb from being observed at any output. Thus Cb can be 0 or 1 in these cases. By choosing it to be 1, Cb is sequentially stuck-at-1 redundant.
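The observability argument can also be checked exhaustively on a toy pipeline. In the sketch below, which is an illustration and not taken from the paper, F2 captures F1 only when S2 = 1, S2 is the input en delayed by one FF, and the PO is Q = F2. Gating F1 with S1 = en satisfies G(XS2 ⇒ S1) by construction, whereas gating it with S1 = NOT en violates the property. The names `run` and `same_outputs` are illustrative.

```python
from itertools import product


def run(s1_of_en, seq):
    """Return the PO sequence Q = F2 for a list of (a, en) input pairs.

    `s1_of_en` maps the input en to the enable S1 of F1.
    """
    f1 = f2 = s2 = 0
    trace = []
    for a, en in seq:
        trace.append(f2)
        s1 = s1_of_en(en)
        # F1 loads a when S1 = 1; F2 loads F1 when S2 = 1; S2 is en delayed.
        f1, f2, s2 = (a if s1 else f1), (f1 if s2 else f2), en
    return trace


def same_outputs(s1_of_en, depth=6):
    """Compare against the golden design, where F1 is updated every cycle."""
    def golden(en):
        return 1

    frames = list(product((0, 1), repeat=2))
    return all(
        run(s1_of_en, seq) == run(golden, seq)
        for n in range(1, depth + 1)
        for seq in product(frames, repeat=n)
    )
```

The legally gated version matches the golden outputs on every enumerated sequence even though the internal state of F1 differs at times, since those differences are never captured by F2.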
Similarly, we can write down a sufficient condition for backward clock-gating across n timeframes, which justifies the target signal, Cb, as sequentially stuck-at-1:

$$G\Big(\bigwedge_{i}\big(X^{n} C^{n}_{v_i} \Rightarrow C_b\big) \wedge \bigwedge_{i}\big(X^{n} C^{n}_{o_i} \Rightarrow C_b\big)\Big) \qquad (4)$$

where $C^{n}_{v_i}$ stands for the updating condition of the i-th target vertex, $V^{n}_{i}$, which is driven by Vb across n timeframes, while $C^{n}_{o_i}$ is for the i-th target vertex containing a PO. The two properties in Equations 3 and 4 are formulated at Line 20 in Figure 7. Notice that the property across n timeframes is sufficient only when the following property holds for all m, 1 ≤ m ≤ n − 1:

$$G\Big(\bigwedge_{i}\big(X^{m} C^{m}_{o_i} \Rightarrow C_b\big)\Big) \qquad (5)$$

where $C^{m}_{o_i}$ refers to the updating condition of the i-th vertex containing POs, $V^{m}_{i}$, which is driven by Vb across m timeframes. If one of these properties is violated, the clock-gating condition (Cb equal to 0) on Vb is observable at an output (at least one of the $X^{m} C^{m}_{o_i}$ equals 1), so it might be an invalid clock-gating case.
Theorem 3: If all the properties of Equation 5 hold for 1 ≤ m ≤ n − 1, then Equation 4 is sufficient for Cb to be sequentially stuck-at-1 redundant.
Proof : When Cb = 0, Equations 4 and 5 imply that none of the POs for the next n timeframes is observable. In addition when Cb = 0, the first part of Equation 4 implies that none of the final target FFs driven across n timeframes is observable as well. Since these represent all possible ways that the value in controlled vertex Vb can be observed, it would be correct if Cb were 1 as well. Hence Cb is sequentially stuck-at-1 redundant. Q.E.D.
Note that the stopping criterion for n is when the target set Target_n is equal to one of the previous sets Target_j. This is because continuing to trace forward would just reproduce one of the conditions already seen in Equation 4 for some n.
Theorem 4: For a target signal Cb, if there exists some n such that the n-timeframe backward clock-gating condition (that is, all corresponding properties in Equations 4 and 5) is satisfied, then the target signal Cb is sequentially stuck-at-1 redundant.
Note that there is no initial state requirement for backward clock-gating cases; the n-timeframe backward clock-gating condition alone is sufficient for redundancy.
Using the Characteristic Graph
A characteristic graph exposes the essential properties of the corresponding circuit, including signal dependency and control signals. It contains information required to formulate properties for the legality of a clock-gated circuit. The on-edges and off-edges of a CG connect the targets and supports across each timeframe, while each selection-edge indicates the updating condition for a group of signals. As discussed in Sections 4.1 and 4.2, sufficient LTL properties for the legality of forward and backward clock-gating can be formulated. Each selection signal associated with a proved property has its corresponding signal in the original circuit which can be replaced by 1.
Figure 8: A CG with three stages of FFs. The vertex A is a PI, where the selection-edge is driven by True. The signals S1, S2 and S3 represent the updating conditions for FFs F1, F2 and F3, respectively.

Consider the CG in Figure 8, where F1, F2 and F3 are FFs, Q is a PO and S1, S2 and S3 are the corresponding control signals. The control signal S1 can only be a backward clock-gating case, where a sufficient condition for S1 being sequentially stuck-at-1 is G(XS2 ⇒ S1). Because the PO is not driven by F1 across one timeframe, another sufficient condition is G((XXS3 ⇒ S1) ∧ (XXS2 ⇒ S1)).
S2 can be a forward clock-gating case with the property G(S1 ⇒ XS2), or a backward clock-gating case with G(XS3 ⇒ S2). S3 can only be a forward clock-gating case, where sufficient properties are either G(S2 ⇒ XS3) or G(S1 ∨ S2 ⇒ XXS3).
As implied by the LTL safety properties, only the fanin cones of those control signals need to be considered by model checking, and hence irrelevant combinational logic will be excluded automatically by state-of-the-art model checking methods. Thus, when model checking the generated properties on the original circuit, the effective problem size is much smaller.
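The effect of restricting the proof to the fanin cones of the control signals can be illustrated with a simple cone-of-influence computation. The netlist below is a made-up example in which a wide datapath block (`mult`) feeds the output but not the control signals; the function name `cone_of_influence` and all signal names are illustrative.

```python
def cone_of_influence(fanins, roots):
    """Return every signal that can affect `roots`, across any number of timeframes.

    `fanins` maps a signal to the signals it reads; an FF maps to the
    signals read by its next-state function.
    """
    cone, stack = set(), list(roots)
    while stack:
        sig = stack.pop()
        if sig not in cone:
            cone.add(sig)
            stack.extend(fanins.get(sig, ()))
    return cone


fanins = {
    "S1": ["en"],               # enable of F1
    "S2": ["d"],                # enable of F2, a delayed copy of en
    "d": ["en"],
    "F1": ["A", "S1", "F1"],
    "F2": ["F1", "S2", "F2"],
    "mult": ["F1", "F2"],       # stands in for complex datapath logic
    "Q": ["mult"],
}
control_cone = cone_of_influence(fanins, ["S1", "S2"])
```

A property over S1 and S2 only involves the signals in `control_cone`, so the datapath logic never enters the model checking problem.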
There are some limitations to finding sequential redundancy on CGs. Currently each vertex of a CG covers all signals controlled by the same selection signal, so some sequentially redundant points may be missed. For example, if one set of FFs is clock-gated by a forward condition with a certain control signal, while another set of FFs is clock-gated by a backward condition with the same signal, both cases are legal but cannot be proved by the current formulation. This issue can be resolved by separating all FFs into different vertices, but that may result in duplicate properties to prove.
Property Modeling and Proving
We describe the circuit-based modeling and proving of the LTL properties in Equations 1 to 5. In sequential circuits, the next operator (X) is represented by adding FFs in front of other signals. For example, the property in Equation 1 becomes

As mentioned in Section 4.1, to check the value of Cf from timeframe 0 to n − 1 in forward clock-gating properties, the initial states of the extra FFs should be 1. For backward clock-gating, the X's are on the left-hand side of Equations 4 and 5, so extra FFs need to be added to delay the Cb signal on the right-hand side by n cycles. These FFs need to be initialized to 1 to avoid spurious counter-examples.
The candidate redundant signals that are proved in this way are then used to simplify the clock-gated circuit.
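The idea of realizing X with extra FFs can be sketched as a trace-level monitor for a forward property of the form G(antecedent ⇒ X^n consequent). The antecedent is delayed through n FFs initialized to 1, so the monitor also forces the consequent to be 1 in timeframes 0 to n − 1. This is one plausible reading of the construction; the function name `forward_monitor` and the list-based trace encoding are assumptions.

```python
def forward_monitor(antecedent, consequent, n):
    """Check G(antecedent => X^n consequent) on finite 0/1 traces.

    The antecedent passes through n extra FFs whose initial state is 1.
    Returns the first timeframe at which the monitor output is 1 (a
    violation), or None if the traces satisfy the property.
    """
    delay = [1] * n                       # extra FFs, all initialized to 1
    for t, (a, c) in enumerate(zip(antecedent, consequent)):
        delayed = delay[-1] if n else a   # antecedent from n timeframes ago
        if delayed and not c:
            return t
        if n:
            delay = [a] + delay[:-1]      # shift the delay chain
    return None
```

For example, `forward_monitor([0, 0], [0, 1], 1)` reports timeframe 0: the antecedent never fires, but the initial value 1 of the extra FF requires the consequent to be 1 in the first timeframe.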
OVERALL FLOW
Given two sequential circuits, golden and clock-gated designs (G and R), with mapped PIs and POs, SEC verifies if the output sequences are identical when the same input sequences are applied. If R is known to be clock-gated from G, instead of applying general SEC methods, the difficulties of SEC can be reduced using the CG approach. Here we outline the overall flow of this method.
As our use-model, we assume that the golden model G may already be clock-gated in the RTL possibly manually by the designer. Therefore in comparing G and R, we reduce them both with the CG method and apply the proved redundancies to get G' and R'.
Comparisons between Characteristic Graphs
Comparisons between the characteristic graphs CG and CR of the golden G and clock-gated R circuits can be used to identify candidates for sequential redundancy. Since it is assumed that a correspondence between flops and POs in G and R is given, we can create a correspondence between nodes in CG and CR leading to pairs {(VG, VR)}. The following situations may exist for the pairs.
1. VG and VR are driven by exactly the same set of vertices (set of signals) with identical edges, which implies the two sets of signals might be combinationally equivalent.
2. VG and VR are driven by the same set of vertices with on-edges (off-edges), but the vertices driving off-edges (on-edges) are different. In this case, the selection signals are regarded as candidates for sequential redundancy.
3. VG is only driven by a set of vertices with on-edges, while the supports of VR are conditionally switched between this support set and VR itself. In this case, the selection signal of VR is a candidate for sequential redundancy.
4. VG and VR are driven by different sets of vertices with on-edges, but there is no selection signal. If the paired signals are combinationally equivalent, then the two FFs are already proved sequentially equivalent. Otherwise, the difference must be resolved by model checking, but possibly on a greatly simplified problem where other proved redundancies are used.

Figure 10 outlines an algorithm to perform SEC between two circuits, G and R, and report if they are sequentially equivalent (EQ) or not (NON-EQ).
Algorithm Flow
The function CEC(...) at Line 1 performs the general combinational equivalence checking between corresponding signals in each pair and returns a set of unproved pairs. The function charGraph(...) returns the corresponding characteristic graph for the specified circuit.
The loop between Line 7 and Line 15 verifies each candidate and revises the corresponding CG by the proved redundancy, one by one. At Line 7, the function comparison(...) analyzes these unresolved pairs as described in Section 5.1 and proposes a candidate of sequential redundancy. The function defineProperty(...) applies the ideas in Section 4 for each target signal, and generates the set of corresponding LTL properties (P) to be proved. Notice that not only R but also G can contain sequential redundancy, so from Line 8 to Line 11, properties are defined for C_G or C_R, respectively.

08. if candidate ∈ G
09. P = defineProperty(candidate, C_G)
10. else
11. P = defineProperty(candidate, C_R)
12. proof = multiProve(P)
13. if isLegal(proof)
14. revise(...)

The function multiProve(...) at Line 12 verifies a circuit with multiple outputs. Given a set of properties P, multiProve(P) verifies all properties simultaneously, and then returns the result set proof, which lists both proved and disproved properties. At Line 13, isLegal(proof) analyzes the result and determines if candidate is sequentially redundant using Theorems 1 to 4. Due to the possible dependencies among candidates, the corresponding CG should be revised by the proved redundancy before the next run. All proved candidates are used to simplify G and R into G' and R' at once (Line 16).
Finally, we perform SEC(G', R') to check if G' and R' are sequentially equivalent. If G' and R' are proved NON-EQ, SEC(G', R') can return a counter-example, which is also valid for G and R. Therefore, the proposed algorithm can provide counter-examples to users for debugging.
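The flow described in this section can be summarized as a driver loop. The sketch below is a hedged outline rather than a transcription of Figure 10: every step is passed in as a callable through the dictionary `ops`, whose keys reuse the function names from the text, while `inGolden`, `simplify`, the `tried` set and the early return when CEC leaves no unproved pairs are assumptions added to make the loop self-contained.

```python
def cg_sec(G, R, ops):
    """Outline of the CG-based SEC flow; returns the verdict of the final SEC.

    `ops` maps step names to callables standing in for the functions named
    in the text (CEC, charGraph, comparison, defineProperty, multiProve,
    isLegal, revise, SEC) plus the hypothetical helpers inGolden and simplify.
    """
    non_eq = ops["CEC"](G, R)                 # unproved PO/FF pairs
    if not non_eq:
        return "EQ"                           # assumption: nothing left to resolve
    cg_g, cg_r = ops["charGraph"](G), ops["charGraph"](R)
    proved, tried = [], set()
    while True:
        cand = ops["comparison"](non_eq, cg_g, cg_r, tried)
        if cand is None:                      # no further candidate proposed
            break
        tried.add(cand)
        graph = cg_g if ops["inGolden"](cand) else cg_r
        proof = ops["multiProve"](ops["defineProperty"](cand, graph))
        if ops["isLegal"](proof):
            ops["revise"](graph, cand)        # update the CG before the next run
            proved.append(cand)
    G2, R2 = ops["simplify"](G, R, proved)    # apply all proved redundancies
    return ops["SEC"](G2, R2)
```

Passing the steps in as callables keeps the outline independent of any particular model checker, and it makes explicit that the final verdict always comes from the SEC call on the simplified circuits.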
It might be that previous synthesis done on R has destroyed some of the MUX structure in the circuit, and then some MUXes might not be recognized during the CG construction. Thus, our algorithm might fail to identify all sequentially redundant points. In this case, we identify as much redundancy as possible and still use SEC for the final checking.

Figure 11: The revised circuit for Figure 3 with its characteristic graph, where the inputs, outputs and FFs are perfectly mapped to those in the golden design.
To show how this algorithm works, consider the circuits in Figures 3 and 11 as G and R. At Line 1, only the pair for F2 fails CEC and is added into nonEQ. From Line 4 to Line 7, E, the selection signal of F2, is identified as a candidate for sequential stuck-at-1 redundancy.
Since it can only be a backward clock-gating case, the corresponding property, G(XF1 ⇒ E), is created and proved. Then R is simplified into R' by replacing the selector of the MUX of F2 by constant 1, while G' is the same as G. Therefore G' and R' become identical to Figure 3, and can be proved equivalent by SEC easily. Finally, the algorithm returns that G and R are sequentially equivalent (EQ).
EXPERIMENTAL RESULTS
We compare the CG method against two state-of-the-art methods: (1) the model checker super_prove [6], which won the single-output track in the Hardware Model Checking Competition 2014 (HWMCC'14) [2], and (2) Absec, a command implemented in ABC [5] which performs the algorithm in [14].
The CG method, SEC(G, R), is implemented in ABC. The multiProve(...) function used is multi_prove, which won the multi-output track in HWMCC'13 [1] (this track was not held in HWMCC'14). We also apply super_prove to the final SEC between G' and R'.
All experiments were performed on a 16-core 2.60GHz Intel(R) Xeon(R) CPU with a 1500 second time limit. The example circuits were clock-gated at the RTL, and then synthesized into AIGs to create R. Each input for super_prove is a multi-output miter between a golden design (G) and its clock-gated circuit (R). The inputs for Absec and the CG method are G and R given before mitering.
Performance for General Clock-Gated Cases
First we examine the applicability and efficiency of the three methods applied for general clock-gated cases. Table 1 lists five cases with their circuit sizes, along with how they are clock-gated. The first three circuits were downloaded from OpenCores [3] (G) and modified (R), while the last two cases were created (both G and R) for this comparison. The CG results are separated into two stages: Simplify includes Line 1 to 16 in Figure 10 , while SEC refers to the final SEC at Line 17.
Because Absec is implemented in ABC only for backward clock-gating, it is not applicable to the forward clock-gating cases (indicated with N/A in Table 1).
As can be seen in Table 1, the proposed CG method significantly outperforms the other two methods. Although the general model checking method super_prove can prove some of the forward and backward clock-gating cases, it requires much more run time than the specialized methods. We see that Absec can reduce the sequential complexity and prove equivalence for backward cases. The final two columns of Table 1 show that the CG method is very efficient in both the redundancy finding phase and the final SEC proof after the redundancies have been removed.
Comparisons of Scalability
The second set of experiments compares the scalability of the above methods by applying them to the same design (qmult, taken from OpenCores [3]), but with varying bit-widths. As the widths increase, the combinational part gets more complex. All of these cases were modified using only backward clock-gating in order to allow Absec to be applied. Table 2 shows that as the complexity of the circuit increases, the runtimes of super_prove and Absec increase sharply. In contrast, the CG method is not affected by the increased complexity because the complicated combinational logic is effectively excluded.
CONCLUSIONS
This paper presents a novel SEC method for clock-gated circuits. The proposed method is based on constructing a characteristic graph (CG) to model essential signals of a circuit. It uses CG to formulate sufficient properties for sequential redundancies. These properties are proved and used to simplify the circuit after which SEC becomes easy. The experimental results show that the CG method is scalable and effective, and substantially outperforms existing techniques.
Future research will include experiments on benchmarks which have been further modified before or after clock-gating, to see how missing some essential signals in the CG will affect performance. We also need to experiment with errors inserted in the clock-gating to see how these can be detected and how useful counter-examples can be fed back.
In addition, we want to explore how the CG concept might be applicable to other verification and synthesis problems. For example, based on the ideas in Section 4, a CG might be useful to synthesize clock-gating on a circuit. It might be used or extended to extract control paths from circuits with many arithmetic operators, and then used to aid equivalence checking for these.
ACKNOWLEDGMENTS
This work is supported in part by SRC contract 2265.001. We also thank industrial sponsors of BVSRC: Altera, Atrenta, Cadence, Calypto, IBM, Intel, Jasper, Mentor Graphics, Microsemi, Real Intent, Synopsys, Tabula, and Verific for their continued support.
