to as power management. At the system level, this involves shutting down blocks of hardware during a period of time for which they are not being used.
Several methods have been presented that shut down a section of the circuit on a clock-cycle basis. Depending on some input conditions, the clock driving some of the registers in the circuit is inhibited, thereby reducing the switching activity in the fanout of those registers. These techniques are referred to as data-dependent power management techniques.
The use of data-dependent power management techniques creates some interesting testability problems. In order to detect the input conditions that allow for the disabling of the clock signal, some extra circuitry has to be added to the original circuit. Since the functionality of the circuit is not being altered, this logic is redundant, thus dramatically reducing the observability of some of the nodes in the circuit.
In this paper, we first describe an approach for the complete testing of power-optimized circuits using data-dependent power management. The approach we propose is based on a two input vector sequence. The first input vector guarantees that we can always set the output of the latches to the required value, even if the clock is disabled in the second input vector.
Using this testing strategy, we then present results that show that, using state-of-the-art techniques, automatic test pattern generation for the power managed circuit is not significantly more difficult than for the original circuit.
POWER MANAGEMENT TECHNIQUES
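In a synchronous digital system controlled by a global clock, a generally accepted model for the power dissipated by a gate is given by:

$$P = \frac{1}{2} \, C \, V_{DD}^{2} \, f \, N \qquad (1)$$

where $P$ is the average power, $C$ is the capacitance switched at the gate output, $V_{DD}$ is the supply voltage, $f$ is the clock frequency, and $N$ is the expected number of gate output transitions per global clock cycle (Najm 1993).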
It follows directly from Equation 1 that one way to reduce power consumption is to reduce the switching activity in a circuit. So-called power management techniques, which shut down hardware for periods of time in which it is not producing useful data, are effective methods of reducing the power consumption of a circuit. Shutdown can be accomplished either by turning off the power supply or by disabling the clock signal.
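The dependence of power on switching activity can be illustrated with a small numerical sketch. It assumes the commonly used form of the gate power model, P = 1/2 C V_DD^2 f N; the function name `gate_power` and all numerical values are hypothetical and chosen only for illustration.

```python
def gate_power(c_load, v_dd, f_clk, n_transitions):
    """Average dynamic power of one gate, assuming P = 0.5 * C * Vdd^2 * f * N."""
    return 0.5 * c_load * v_dd ** 2 * f_clk * n_transitions


# Hypothetical values: 50 fF load, 3.3 V supply, 20 MHz clock.
baseline = gate_power(c_load=50e-15, v_dd=3.3, f_clk=20e6, n_transitions=0.5)
# Same gate when shutdown lowers the expected transitions per cycle to 0.2.
managed = gate_power(c_load=50e-15, v_dd=3.3, f_clk=20e6, n_transitions=0.2)

# Relative saving depends only on the ratio of switching activities.
saving = 1.0 - managed / baseline
print(f"power saving: {saving:.0%}")
```

Since all other factors are unchanged, the saving equals the relative reduction in the expected number of transitions.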
System-level approaches identify idle periods for entire modules and turn off the clock for these modules for the duration of the idle periods (Chandrakasan, Sheng & Brodersen 1992, Chapter 10). Detection and shutdown of unused hardware is done automatically in current generations of Pentium and PowerPC processors. The trend is for future generations of processors to provide software controls for selective hardware shutdown, a feature already available in the Fujitsu SPARClite processor.
Scheduling algorithms that maximize the shutdown period of execution units in a system have been presented (Monteiro, Devadas, Ashar & Mauskar 1996). Given throughput constraints and the execution units available, operations are scheduled such that those that generate controlling signals are computed first, thus indicating the flow of data through the circuit. Power is saved by only activating hardware modules involved in computing the final result, all other modules being shut down.
At the logic level, some shutdown techniques, such as precomputation (Alidina, Monteiro, Devadas, Ghosh & Papaefthymiou 1994), guarded evaluation (Tiwari, Ashar & Malik 1995) and gated-clock finite state machines (Benini, Siegel & Micheli 1996), have been proposed recently and are among the most efficient power optimization techniques at this design level. These techniques are named data-dependent power management techniques because power management is achieved on a clock-cycle basis, as a function of the input conditions at the beginning of each clock cycle.
The precomputation method (Alidina et al. 1994) adds a simple combinational circuit (the precomputation logic) to the original circuit. Under certain input conditions, the precomputation logic disables the loading of all or a subset of the input registers. Under these input conditions, no power is dissipated in the portions of the original circuit with only disabled registers as inputs. We will analyze this technique in more detail in the next section.
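As a concrete illustration of the idea, consider a comparator computing C > D, a frequently used example for precomputation. The sketch below is a behavioral model under stated assumptions, not the authors' implementation: the class `PrecomputedComparator` and its fields are hypothetical, the most significant bits are always loaded, and the remaining bits are loaded only when the two most significant bits are equal.

```python
class PrecomputedComparator:
    """Behavioral model of an n-bit comparator (C > D) with precomputation.

    Assumption: the precomputation logic is the XOR of the two most
    significant bits. When they differ, the output is already determined,
    so the registers holding the lower bits are not loaded.
    """

    def __init__(self, n_bits):
        self.n = n_bits
        self.msb_c = self.msb_d = 0   # registers that are always loaded
        self.low_c = self.low_d = 0   # registers that can be disabled
        self.low_loads = 0            # cycles in which the low registers load

    def clock(self, c, d):
        msb_c = (c >> (self.n - 1)) & 1
        msb_d = (d >> (self.n - 1)) & 1
        disable = msb_c ^ msb_d       # precomputation logic
        self.msb_c, self.msb_d = msb_c, msb_d
        if not disable:
            mask = (1 << (self.n - 1)) - 1
            self.low_c, self.low_d = c & mask, d & mask
            self.low_loads += 1

    def output(self):
        # The comparator sees the registered values, possibly stale low bits.
        c = (self.msb_c << (self.n - 1)) | self.low_c
        d = (self.msb_d << (self.n - 1)) | self.low_d
        return c > d


comp = PrecomputedComparator(3)
results = []
for c in range(8):
    for d in range(8):
        comp.clock(c, d)
        results.append(comp.output() == (c > d))
```

In this model the output is correct for every input pair, although the low-order registers are loaded in only half of the cycles. The stale low-order values are exactly what makes some input combinations of the comparator unreachable in a single cycle.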
Guarded evaluation (Tiwari et al. 1995) identifies cones internal to the circuit that can be shut down under certain input conditions. In the process, it creates new transition barriers (guards) in the form of additional latches or OR/AND gates. Instead of adding the precomputation logic to generate the clock disabling signal, guarded evaluation uses signals already existing in the circuit.
The gated-clock finite state machine (FSM) approach (Benini et al. 1996) is based on identifying self-loops in a Moore FSM. If the FSM enters a state with a self-loop, the clock is turned off. In this situation, the inputs to the combinational logic block do not switch, and thus we have virtually zero power dissipation in that block. When the input values cause the FSM to make a state transition, the clock signal is again allowed to function normally. Techniques to locally transform a Mealy machine into a Moore machine are presented so that the opportunity for gating the clock is increased.
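The self-loop condition can be sketched on a toy state transition table. The FSM, the function names `self_loops` and `run`, and the input sequence below are all hypothetical; the sketch only counts the cycles in which the next state equals the present state, that is, the cycles in which the clock could be gated.

```python
def self_loops(transitions):
    """Map each state to the set of input symbols that keep the FSM in it."""
    loops = {}
    for (state, symbol), next_state in transitions.items():
        if next_state == state:
            loops.setdefault(state, set()).add(symbol)
    return loops


def run(transitions, start, inputs):
    """Simulate the FSM; count cycles in which the clock could be gated."""
    state, gated = start, 0
    for symbol in inputs:
        next_state = transitions[(state, symbol)]
        if next_state == state:
            gated += 1          # self-loop: state registers need not be clocked
        state = next_state
    return state, gated


# Hypothetical two-state machine: (present state, input) -> next state.
fsm = {
    ("IDLE", 0): "IDLE",
    ("IDLE", 1): "RUN",
    ("RUN", 1): "RUN",
    ("RUN", 0): "IDLE",
}
loops = self_loops(fsm)
final_state, gated_cycles = run(fsm, "IDLE", [0, 0, 1, 1, 1, 0])
```

For this input sequence, four of the six cycles are self-loops, so the clock would be active in only two cycles.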
All these techniques achieve power reduction by stopping transitions at the inputs from propagating to combinational logic blocks. However, this has the undesirable consequence of making the testing of the circuit more difficult. Since for some input combinations the loading of the registers is disabled, it is not possible to have all combinations at the inputs of the combinational logic blocks.
We present a method for the automatic test pattern generation of data-dependent power managed circuits in Section 4. Although the problem is similar for the three techniques presented above, we will focus on the precomputation technique in the remainder of this paper. For this reason, in the next section we analyze this technique in more detail.
In a Moore FSM, the outputs are completely defined by the present state, whereas in a Mealy FSM the outputs depend on both the present state and the primary input lines.
Methods to automatically determine the precomputation logic of a circuit are described in (Alidina et al. 1994).
Using these same principles, different precomputation architectures have been proposed in (Alidina et al. 1994, Monteiro, Rinderknecht, Devadas & Ghosh 1995).
ATPG TECHNIQUES FOR POWER MANAGED CIRCUITS
We address the problem of generating test patterns to detect faults in circuits using data-dependent power management techniques. These techniques, overviewed in Section 2, all use mechanisms to prevent transitions in some logic signals from propagating to combinational logic circuits, thus reducing power consumption. As a consequence, some input combinations to the combinational logic block may no longer be possible, hence dramatically reducing the effectiveness of ATPG programs.
Definition of the Problem
We will assume that we have full controllability of the inputs and observability of the outputs of the original circuit, i.e., the circuit before the power management techniques have been applied. Our objective is to measure how the testability of the circuit before (Figure 1) and after (Figure 2) power management compares.
Suppose the combinational logic block in Figure 1 has a given total number of stuck-at faults. Since we are assuming full controllability and observability of the inputs and outputs respectively, standard ATPG techniques can be used for test generation.
After we apply power management, extra logic has been added to the circuit. In general, the total number of stuck-at faults in the power managed circuit is not much larger than in the original circuit. Thus, it is not the extra number of faults that makes test generation significantly more difficult.
The major test generation problems arise from the fact that the precomputation logic will prevent some input combinations to the logic block from happening. For this reason, in order to fully test the power managed circuit, we may need two input vectors. We describe this approach in Section 4.3.
Still, even using two input vectors for testing, the precomputation logic introduces many redundant faults. Redundant faults can create difficulties for most ATPG programs as they try to prove that the fault is indeed redundant. In Section 5, we describe an ATPG algorithm that can efficiently handle all the redundant faults introduced by the precomputation logic.
Testing using Scan Techniques
Before we describe our testing approach based on a two input vector sequence, it is worth mentioning the case of circuits using scan test techniques. With scan techniques, the registers in the circuit are connected in series. During the testing process, these registers can be directly loaded with the desired values and their contents can also be read. We therefore have full observability and controllability of the inputs and outputs of the combinational blocks in the circuit. For circuits with scan techniques, precomputation does not create any significant additional testing problem as all registers can be directly loaded, thus circumventing the precomputation logic. The only extra concern is the test of the precomputation logic, which as previously stated should be a very small fraction of the total logic.
However, there is some overhead associated with scan techniques. This overhead is generally too expensive for all registers in the circuit to be included in the scan chain. In practice, partial scan is used, where only a fraction of the registers are in the scan chain. Under partial scan, a sequence of input vectors may have to be generated in order to set the output of registers not in the scan chain to some desired value. Hence the motivation for the two input vector based testing approach that we present next.
Testing using a Two Input Vector Sequence
As described before, whenever power management is asserted, some input transitions are prevented from reaching a portion of some combinational logic block. This may impede certain input combinations at this logic block from happening. We propose to solve this controllability problem by using a two input vector sequence. As observability of the outputs is assumed, the result of the test can be verified after the second input vector.
There are two situations to consider. First, consider a fault that can be detected using some input combination that disables the precomputation logic. Then, the first input vector is of no relevance since the second input vector will be loaded into the input registers, and thus we are able to set the inputs of the combinational logic to any value that is required to detect the fault. Now, consider that some fault can only be detected by input combinations that assert the precomputation logic. In this case, we build the first input vector such that precomputation is disabled and load the desired values into the registers disabled by precomputation. Next, the second input vector need only have the correct values for the remaining registers, since the registers loaded by the first vector will already have the correct values and will not be disturbed because the precomputation logic will be active.
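The second situation can be sketched with a small register-level simulation. Everything in it is a hypothetical example: the circuit has three inputs, input a is the only input of the precomputation logic, and the precomputation logic is asserted when a = 1, in which case the registers for b and c are not loaded.

```python
def clock_cycle(registers, vector, precomp_inputs, disable_fn):
    """Apply one input vector; return the register contents after the clock edge."""
    disabled = disable_fn(vector)
    updated = dict(registers)
    for name, value in vector.items():
        # Precomputation inputs always load; the others load only if enabled.
        if name in precomp_inputs or not disabled:
            updated[name] = value
    return updated


def precomputation_asserted(vector):
    """Hypothetical precomputation logic: asserted whenever a = 1."""
    return vector["a"] == 1


precomp_inputs = {"a"}
unknown_state = {"a": 0, "b": 0, "c": 1}   # some previous register contents
target = {"a": 1, "b": 1, "c": 0}          # values needed at the logic block

# One vector: a = 1 asserts precomputation, so b and c are never loaded.
one_vector = clock_cycle(unknown_state, target, precomp_inputs,
                         precomputation_asserted)

# Two vectors: the first disables precomputation and loads b and c,
# the second only has to supply the value of a.
v1 = {"a": 0, "b": 1, "c": 0}
v2 = {"a": 1, "b": 0, "c": 0}
after_v1 = clock_cycle(unknown_state, v1, precomp_inputs, precomputation_asserted)
after_v2 = clock_cycle(after_v1, v2, precomp_inputs, precomputation_asserted)
```

With a single vector the logic block sees stale values for b and c, whereas after the two vector sequence the registers hold exactly the target combination, regardless of the values of b and c in the second vector.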
Generating the Two Input Vector Sequence
We present a circuit transformation technique for the automatic generation of the two input vector sequence used in the testing of power managed circuits. This will allow us to use combinational ATPG techniques, thus avoiding the more computationally expensive sequential test generators (Abramovici, Breuer & Friedman 1990). The proposed transformation is shown in Figure 3, obtained from the precomputed circuit of Figure 2. We have duplicated the inputs of the original circuit. Note that the first-vector values corresponding to the inputs used in the precomputation logic are not defined by the transformation. Yet, for the transformation to make sense, we have to make sure that they disable the precomputation logic so that the first-vector values can be loaded into the registers. This condition is also shown in Figure 3.
We can now run standard ATPG tools on the circuit after transformation to obtain values for the two input vectors. However, we have doubled the number of inputs, dramatically increasing the input search space of the ATPG tool. Additionally, the circuit after transformation will have a significant amount of redundancy, further complicating the problem for the ATPG tool. In the next section we describe an ATPG tool that can efficiently handle this problem.
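The effect of the transformation can be sketched as a purely combinational function of the duplicated inputs, searched exhaustively in place of a real ATPG tool. The circuit f = a OR (b AND c), the fault, and the precomputation condition a = 1 are hypothetical choices for illustration and do not correspond to Figure 3.

```python
from itertools import product


def good(a, b, c):
    return a | (b & c)


def faulty(a, b, c):
    # Hypothetical fault: the connection from a to the OR gate is stuck-at-0.
    return 0 | (b & c)


def precomp_asserted(a):
    # Hypothetical precomputation logic: a = 1 already determines the output.
    return a == 1


def transformed(circuit, a1, b1, c1, a2, b2, c2):
    """Combinational view of two cycles; None if the first vector is illegal."""
    if precomp_asserted(a1):
        return None               # first vector must disable precomputation
    # Registers for b and c keep the first-vector values if the second
    # vector asserts the precomputation logic.
    b = b1 if precomp_asserted(a2) else b2
    c = c1 if precomp_asserted(a2) else c2
    return circuit(a2, b, c)


# Exhaustive search over the duplicated inputs (a1, b1, c1, a2, b2, c2).
tests = [
    bits for bits in product((0, 1), repeat=6)
    if transformed(good, *bits) is not None
    and transformed(good, *bits) != transformed(faulty, *bits)
]
```

In this toy example every test found has a = 1 in the second vector, so the fault is detectable only with the precomputation logic asserted and with values of b and c supplied by the first vector.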
Testability Analysis of Circuits using Data-Dependent Power Management
ALGORITHMS FOR ATPG
As described in the previous sections, data-dependent power management based on precomputation can introduce a large number of redundant faults. In addition, the solution of using two input vector sequences for detecting faults in the resulting circuit doubles the number of inputs, enlarging the search space. As a result, circuits using data-dependent power management are expected to be significantly harder for test pattern generation tools. This is indeed the case, and traditional ATPG algorithms are likely to be unable to yield acceptable fault coverages for circuits using data-dependent power management. In particular, this is the case with the D-algorithm, PODEM, FAN and SOCRATES (Abramovici et al. 1990) and with recent implementations of these algorithms (Lee & Ha 1993).
Nevertheless, given the relationship between Propositional Satisfiability (SAT) and ATPG, recent work on efficient search algorithms for SAT (Silva & Sakallah 1996) can potentially enable the development of ATPG algorithms specifically suited for circuits with many hard-to-detect faults.
Satisfiability-Based ATPG
It is well known that fault detection problems can be cast as instances of SAT (Larrabee 1992, Stephan, Brayton & Sangiovanni-Vincentelli 1996). Basically, the valid assignments to the nodes of a circuit can be represented by a Conjunctive Normal Form (CNF) formula. For ATPG, we just need to consider two replicas of a given circuit, one denoting the good circuit and the other denoting the faulty circuit (on which the given fault must be activated). By creating the OR of the XORs of the primary outputs of the two circuits, and by requiring the output of the OR gate to assume value 1, we create a satisfiability problem whose solution is a test pattern for the given fault (Larrabee 1992). In general, additional information is added to the CNF representation in order to prune the amount of search.
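A minimal sketch of this formulation for a single AND gate with one input stuck-at-0 is given below. The variable numbering, the helper names `and_gate`, `xor_gate` and `solve`, and the exhaustive solver are illustrative assumptions; a practical tool would use a real SAT algorithm. With a single primary output, the OR of the XORs reduces to one XOR.

```python
from itertools import product


def and_gate(out, a, b):
    """CNF clauses for out = a AND b (literals are signed integers)."""
    return [[-out, a], [-out, b], [out, -a, -b]]


def xor_gate(out, a, b):
    """CNF clauses for out = a XOR b."""
    return [[-out, a, b], [-out, -a, -b], [out, -a, b], [out, a, -b]]


def solve(clauses, n_vars):
    """Exhaustive SAT check, adequate only for this tiny example."""
    for bits in product((False, True), repeat=n_vars):
        def is_true(lit):
            return bits[abs(lit) - 1] == (lit > 0)
        if all(any(is_true(lit) for lit in clause) for clause in clauses):
            return bits
    return None


# Variables: shared inputs X, Y; good output; faulty copy of X; faulty output.
X, Y, Z_GOOD, X_FAULTY, Z_FAULTY, DIFF = range(1, 7)
clauses = (
    and_gate(Z_GOOD, X, Y)                # good circuit
    + [[-X_FAULTY]]                       # fault: input X stuck-at-0
    + and_gate(Z_FAULTY, X_FAULTY, Y)     # faulty circuit
    + xor_gate(DIFF, Z_GOOD, Z_FAULTY)    # outputs must differ ...
    + [[DIFF]]                            # ... so DIFF is required to be 1
)
model = solve(clauses, 6)
test_pattern = (model[X - 1], model[Y - 1])
```

The only satisfying assignment sets both inputs to 1, which is the expected test for a stuck-at-0 fault on an AND gate input.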
It is generally accepted that SAT-based ATPG algorithms have a few significant drawbacks. First, representing each fault detection problem as an instance of SAT is extremely time-consuming. Indeed, known experimental results indicate that CNF formula creation can take as much as 75% of the total testing time (Stephan et al. 1996) . Second, since all clauses in a CNF formula must be satisfied, test patterns may become over specified. Consequently, SAT-based ATPG algorithms may yield test sets larger than necessary.
Despite these drawbacks, SAT-based ATPG algorithms are particularly versatile in that improvements to SAT algorithms can be readily applied and extended for ATPG.
The GRASP SAT Algorithm
The GRASP SAT algorithm is detailed in (Silva & Sakallah 1996) , and it is basically a backtrack search algorithm. However, GRASP is able to analyze the causes of conflicts, i.e. situations of the search in which one or more clauses have all literals set to 0. Analysis of the causes of conflicts can be used for implementing several powerful pruning techniques:
By analyzing the causes of conflicts we can backtrack directly to the cause of each conflict. Hence, GRASP implements non-chronological backtracking.
Given that the causes of conflicts are identified, they can be recorded as new clauses, and so these new clauses can be used to augment the original CNF formula. Hence, we can prevent known conflicts from being identified again during the search.
Careful analysis of the structure of conflicts permits identifying variable assignments which are deemed necessary for a solution to be found. Since GRASP identifies more necessary assignments due to the causes of conflicts, the search is further pruned.
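The first two techniques can be illustrated with a deliberately simplified sketch. It is not the GRASP procedure of (Silva & Sakallah 1996): the functions `propagate` and `analyze`, the clause set, and the decision order are hypothetical, and the recorded clause is simply the negation of the decisions responsible for the conflict.

```python
def propagate(clauses, assignment, reason):
    """Unit propagation. Returns a clause with all literals 0, or None."""
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            if any(assignment.get(abs(lit)) == (lit > 0) for lit in clause):
                continue                      # clause already satisfied
            free = [lit for lit in clause if abs(lit) not in assignment]
            if not free:
                return clause                 # conflict
            if len(free) == 1:                # implied assignment
                assignment[abs(free[0])] = free[0] > 0
                reason[abs(free[0])] = clause
                changed = True
    return None


def analyze(conflict, assignment, reason):
    """Trace a conflict back to the decisions that caused it."""
    seen, stack, causes = set(), [abs(lit) for lit in conflict], set()
    while stack:
        var = stack.pop()
        if var in seen:
            continue
        seen.add(var)
        if var in reason:                     # implied: follow its antecedents
            stack.extend(abs(lit) for lit in reason[var] if abs(lit) != var)
        else:                                 # decision variable
            causes.add(var)
    return sorted((-v if assignment[v] else v for v in causes), key=abs)


clauses = [[-1, 4], [2, 7], [-3, 5], [-4, -5, 6], [-4, -5, -6]]
assignment, reason, conflict = {}, {}, None
decisions = [1, 2, 3]                         # decision levels 1, 2, 3
for var in decisions:
    assignment[var] = True
    conflict = propagate(clauses, assignment, reason)
    if conflict is not None:
        break

learned = analyze(conflict, assignment, reason)
level = {var: i + 1 for i, var in enumerate(decisions)}
backtrack_level = max(level[abs(lit)] for lit in learned if abs(lit) != var)
```

In this example the conflict depends only on the decisions at levels 1 and 3, so the search can return directly to level 1 without revisiting the decision at level 2, and the recorded clause prevents the same conflict from recurring.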
As illustrated in (Silva & Sakallah 1996) , new pruning techniques can be easily incorporated into GRASP. In addition, preliminary experimental results strongly suggest that GRASP is one of the most efficient SAT algorithms for highly structured instances of SAT.
Moreover, instances of SAT obtained from fault detection problems for stuck-at or bridging faults are highly structured, and GRASP performs particularly well on these benchmarks. As a result, circuits having a significant number of hard-to-detect faults are potentially amenable to an ATPG tool based on GRASP. The experimental results given in Section 6 strongly support this motivation.
The proposed ATPG algorithm, named TG-GRASP, basically encodes fault detection problems using the approach given in (Stephan et al. 1996) and uses GRASP as the back-end search engine. In addition, a few additional features have been incorporated into TG-GRASP:
To prevent the over-specification of test patterns, TG-GRASP implements syntactic satisfiability, which permits early identification of sufficient conditions for satisfiability before satisfying all clauses in the CNF formula. This technique can be viewed as equivalent to restricted forms of dynamic head line identification that can be identified by circuit-based ATPG tools (Silva & Sakallah 1994).
Because GRASP records new clauses during the search as a by-product of conflict analysis, some of these clauses, in particular the ones solely associated with variables in the good circuit, can be re-used for subsequent faults, thus potentially pruning the amount of search for subsequent fault detection problems. We refer to these clauses as pervasive clauses.
Syntactic satisfiability reduces computation time for detectable faults, whereas pervasive clauses, because they add additional constraints to the search, potentially reduce the computation time for all faults.
EXPERIMENTAL RESULTS
In this section we compare the testability of a subset of the circuits in the MCNC'91 combinational benchmark set before and after adding data-dependent power management. In Table 1 we present the statistics for the circuits used. Under the column Original we give the number of primary inputs (PI), the number of primary outputs (PO), the number of gates (Gates) and literals (Lits) in the circuit and its power dissipation (Power). Also in Table 1, we show the power savings obtained after precomputation is applied to each circuit. Under Precomp., we give the number of literals in the precomputation logic and the power of the power managed circuit. We can see that the precomputation logic is in general much smaller than the original circuit. The last column of Table 1 indicates the percentage savings in power obtained through precomputation. As we can observe, power reductions of up to 77% are possible.
To measure the testability of the circuits, we have used two SAT-based ATPG tools, TEGUS (Stephan et al. 1996) and TG-GRASP, described in Section 5. The CPU times reported are for a Sun 5/85 machine with 64 MByte of physical memory. All tools were compiled with the same optimization options.
In Table 2 the ATPG results for the original MCNC'91 benchmark circuits are shown. The transformation described in Section 4.4 was applied to the precomputed circuits and the ATPG programs were run on the modified circuit. The ATPG results after precomputation are given in Table 3. In each table, F, D, R, A and CPU denote, respectively, the total number of faults, the number of detected faults, the number of redundant faults, the number of aborted faults and the CPU time for each tool. For each benchmark all faults are targeted. This solution permits a larger set of faults to be studied and guarantees that each ATPG tool is presented with exactly the same set of faults. For both the original and precomputed circuits, TEGUS and TG-GRASP have comparable and acceptable performance in all circuits. Nevertheless, for a few circuits the pruning techniques used in TG-GRASP make the difference and lead to reasonably smaller CPU times. Note that for benchmark dalu, TEGUS aborts one fault. These results, and the fact that TG-GRASP aborts no faults, illustrate the robustness of the TG-GRASP algorithmic solution. Even though precomputation introduces a large number of redundant faults, there exist ATPG tools which can test the resulting circuits with 100% fault coverage.
