Introduction
Domino logic circuits [1] are extensively used in high-performance microprocessors [2-6] since they offer several advantages over static CMOS logic, namely higher speed, reduced transistor count (resulting in reduced die area), and hazard-free operation. However, with technology scaling, designers find it difficult to deploy dynamic logic [7] because it has an increased susceptibility to switch failure (i.e., an erroneous gate transition) due to noise and process variations. Static CMOS, on the other hand, is very robust to switch failure and is more likely to exhibit only delay failures due to noise. In this paper, we study noise-induced switch failures in domino logic and describe a methodology to derive test vectors that can be used to test a domino logic circuit robustly for such failures.
Switch Failures
We define a switch failure [8,9] as an irreversible and erroneous transition of a domino gate output. Several sources of noise can cause switch failures in domino logic circuits. Of these sources, crosstalk noise caused by capacitive coupling to neighboring lines, subthreshold leakage, and charge sharing are especially important. With technology scaling, the importance of crosstalk noise and subthreshold leakage will increase. Although the effect of charge sharing is not expected to increase [6], it is a significant problem for domino circuits in current technologies and may combine with other sources to cause a switch failure.
To generate a test that satisfies the complicated timing and logic requirements to detect a switch failure, we use the algorithm TEST-GEN outlined in Figure 1. TEST-GEN follows a PODEM-like [12] approach. Initially, logic x is assigned to all primary inputs. Then the algorithm proceeds in a recursive fashion as follows. At every level of recursion, it assigns a logic value at time t = 0 to a previously unspecified primary input and then calculates all the logic and timing implications due to that assignment. (The calculation of the timing and logic changes due to any primary input assignment is performed in the TimedImply routine of Figure 1.)
The newly calculated logic and timing information is used to determine if a switch failure is possible at the targeted gate using a method detailed in Section 3. (This is performed in the conflictdetect routine of Figure 1.) If the failure is still possible, the algorithm assigns another primary input at the next level of recursion. If the failure is no longer possible, the algorithm backtracks from the last decision and returns to the previous level of recursion. The algorithm returns a test pattern if all the conditions for switch failure are satisfied and the resulting error has been propagated to a primary output within the duration of the circuit's specified clock cycle. TEST-GEN has two major differences from a standard PODEM-based test-generation approach [12]. Each difference is clarified in detail next.
TEST-GEN
1. The algorithm uses a time-based ATPG as opposed to a logic-only ATPG. This means the algorithm maintains a timing window (i.e., the minimum and maximum time at which a signal can transition) for every signal line, along with the traditional logic value of 0, 1, or x. The timing window changes dynamically during the test generation process and is computed in a manner similar to that described in [11]. Gate delays also vary with the switching activity of neighboring wires (due to the effect of crosstalk on delay), so this effect is calculated dynamically during the ATPG process as well [13].
2. To determine if a switch failure has occurred at a gate output, we calculate the maximum possible noise effect at a dynamic node of a gate from the current signal line values and transition windows. The noise calculation is described in greater detail in the next section.
Maximizing Noise
Consider the complex domino gate shown in Figure 2. Assume that at a given stage of ATPG, we know that A=1, B=0, and there is a crosstalk glitch present on line B. For this circuit, the crosstalk effect is maximized if the remaining inputs are assigned C=1, either D=0 or E=0, and one of F, G and H is 0. Leakage is maximized when C=1, D⊕E=0, and only two of the signals F, G and H are logic 1. Charge sharing is maximized if the circuit values are C=H=0 and D=E=F=G=1. Some of the requirements for maximizing one noise effect conflict with the requirements of another. For example, the requirements for maximum charge sharing conflict with those of leakage. In our work, we independently calculate the maximum noise effect due to each noise source and combine their effects. Calculating the effect of each noise source independently therefore overestimates the total noise effect to some degree.
However, this overestimation is reduced during ATPG as more and more inputs are specified. Next, we describe how the maximum noise effect due to each source is determined.
Crosstalk
Crosstalk noise depends on how many aggressors (i.e., signal lines capacitively coupled to a victim line) transition and on the relative timing of their transitions. A detailed procedure for conservatively estimating the impact of crosstalk due to a partially-specified vector is presented in [11].
In addition, we utilize the methodology presented in [11] to calculate the maximum discharge $\Delta Q_{\text{crosstalk}}$ from the dynamic node.
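To illustrate why the relative timing of aggressors matters, the following Python sketch finds the largest total coupling capacitance of aggressors whose timing windows share a common instant. This is a simplified illustration, not the procedure of [11]: the function name `worst_case_coupling`, the tuple layout, and the use of summed coupling capacitance as a proxy for noise are all assumptions.

```python
from typing import List, Tuple

# (earliest transition time, latest transition time, coupling capacitance)
Aggressor = Tuple[float, float, float]


def worst_case_coupling(aggressors: List[Aggressor]) -> float:
    """Largest summed coupling of aggressors that can switch at one instant."""
    events = []
    for t_min, t_max, cap in aggressors:
        events.append((t_min, 0, cap))   # window opens
        events.append((t_max, 1, -cap))  # window closes
    # At equal times, opens (kind 0) sort before closes (kind 1), so windows
    # that merely touch are conservatively treated as overlapping.
    events.sort()
    best = total = 0.0
    for _, _, delta in events:
        total += delta
        best = max(best, total)
    return best


# Assumed example: the first two windows overlap, the third stands alone.
example = [(0.0, 2.0, 1.0), (1.0, 3.0, 2.0), (4.0, 5.0, 2.5)]
worst = worst_case_coupling(example)
```

Here the two overlapping aggressors together outweigh the isolated third one, so the worst case comes from their alignment. As timing windows shrink during ATPG, fewer windows overlap and the estimate tightens.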
Charge Sharing
To estimate the maximum voltage drop due to charge sharing for a partially-specified vector $V_p$, the partially-specified vector is converted to a fully-specified vector $V_f$. For example, assume a partially-specified $V_p$ has established gate inputs A=1 and B=0 for the gate shown in Figure 3a.
The fully-specified vector $V_f$ of A=C=D=E=G=1 and B=F=0 maximizes charge sharing since this $V_f$ connects the dynamic node to the maximum number of intermediate nodes (i.e., 2, 3 and 4) without creating a discharge path to ground. (See Figure 3b.)
The conversion of a partially-specified vector to a fully-specified vector can be achieved using two depth-first traversals of a gate's transistor schematic. Once the fully-specified vector is obtained, we calculate the voltage drop due to charge sharing using the model presented in [10]. In 
where Q. is the initial stored charge at the dynamic node after precharge. In [14] , it is shown that this model can accurately predict switch failure due to charge sharing.
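The goal of the vector completion can be made concrete with a small Python sketch. It deliberately swaps in brute-force enumeration of the unspecified inputs rather than the two depth-first traversals mentioned above, so it only illustrates the objective: connect the dynamic node to as many intermediate nodes as possible without opening a path to ground. The node names `dyn` and `gnd`, the transistor tuple layout, and the example network are hypothetical and do not reproduce Figure 3.

```python
from itertools import product
from typing import Dict, List, Optional, Set, Tuple

Transistor = Tuple[str, str, str]  # (node, node, controlling input)


def reachable(transistors: List[Transistor], values: Dict[str, int],
              start: str = "dyn") -> Set[str]:
    """Nodes connected to start through transistors whose input is 1."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for u, v, gate in transistors:
            if values[gate] != 1:
                continue  # off transistor: no connection
            for a, b in ((u, v), (v, u)):
                if a == node and b not in seen:
                    seen.add(b)
                    stack.append(b)
    return seen


def max_charge_sharing_vector(transistors: List[Transistor],
                              partial: Dict[str, int]) -> Optional[Dict[str, int]]:
    """Complete partial so that dyn reaches the most nodes but never gnd."""
    inputs = sorted({gate for _, _, gate in transistors})
    free = [g for g in inputs if g not in partial]
    best, best_count = None, -1
    for bits in product((0, 1), repeat=len(free)):
        values = {**partial, **dict(zip(free, bits))}
        nodes = reachable(transistors, values)
        if "gnd" in nodes:
            continue  # a discharge path is not charge sharing
        count = len(nodes) - 1  # exclude the dynamic node itself
        if count > best_count:
            best, best_count = values, count
    return best


# Hypothetical two-branch evaluate network: dyn-A-n1-B-gnd and dyn-C-n2-D-gnd.
network = [("dyn", "n1", "A"), ("n1", "gnd", "B"),
           ("dyn", "n2", "C"), ("n2", "gnd", "D")]
full = max_charge_sharing_vector(network, {"A": 1, "B": 0})
```

For this toy network the completion turns C on and keeps D off, so both intermediate nodes share charge with the dynamic node. Enumeration is exponential in the number of free inputs, which is acceptable only for a single gate; a weighted variant could maximize total node capacitance instead of node count.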
Leakage
Using the BSIM2 MOSFET model [15], the subthreshold leakage current through a transistor is given by an expression in which $W$ and $L$ are the channel width and length of the transistor, respectively, $A$ is a constant, $v_{gs}$ is the gate-to-source voltage of the transistor, $V_{TH0}$ is the threshold voltage, $v_{ds}$ is the drain-to-source voltage across the transistor, $kT/q$ is the thermal voltage, $\gamma$ is the body-effect coefficient, and $\eta$ is the DIBL coefficient. From this expression, we can observe that the leakage increases with $v_{ds}$. Thus, when multiple off transistors are stacked in series, their leakage decreases significantly since $v_{ds}$ across each transistor is reduced. Consequently, the maximum leakage current through a transistor $t_i$ occurs when all other transistors in series with $t_i$ are turned on. In addition, the leakage depends only on the $W/L$ ratio of the off transistor $t_i$. Since $L$ is identical for all transistors in digital circuits, the problem reduces to maximizing the cumulative $W$ of all off transistors $t_i$ with the remaining series transistors being on. This optimization is easily mapped to identifying the maximum cut of a graph model of the evaluate chain. We have shown that the max-cut of a graph corresponding to a domino logic evaluate chain can be found in polynomial time [14].
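One way to see why this search can be polynomial is to assume the evaluate chain is a series-parallel network in which every transistor has its own independent input. Under that assumption, a series connection contributes the best of its parts (one part is off, the rest are on), while a parallel connection contributes the sum of its parts (every branch must be off). The Python sketch below follows that recursion. It is an illustration under the stated assumptions, not necessarily the algorithm of [14], and the class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Transistor:
    name: str
    width: float


@dataclass
class Series:
    parts: List["Network"]


@dataclass
class Parallel:
    parts: List["Network"]


Network = Union[Transistor, Series, Parallel]


def max_leak_width(net: Network) -> float:
    """Maximum cumulative width of off transistors that each see full v_ds."""
    if isinstance(net, Transistor):
        return net.width
    if isinstance(net, Series):
        # Exactly one series part blocks the path; the others conduct.
        return max(max_leak_width(p) for p in net.parts)
    # Parallel: every branch must block, and their leakage currents add.
    return sum(max_leak_width(p) for p in net.parts)


# Hypothetical chain: A in series with (D-E in series) parallel (F-G-H in series).
chain = Series([
    Transistor("A", 2.0),
    Parallel([
        Series([Transistor("D", 1.0), Transistor("E", 3.0)]),
        Series([Transistor("F", 1.0), Transistor("G", 1.0), Transistor("H", 2.0)]),
    ]),
])
width = max_leak_width(chain)
```

In this toy chain the widest cut turns off E and H and leaves every other transistor on. The sketch ignores inputs already fixed by ATPG; a fuller version would restrict each choice to transistors whose inputs are still free or already match the required value.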
Combination of Noise Sources
Our methodology for integrating noise due to multiple noise sources is based on the following observation: if the reduction of voltage at the dynamic node is not too severe, then the various noise mechanisms act independently. For example, in Figure 5 some amount of charge $\Delta Q$ is removed from the dynamic node due to crosstalk. Simultaneously, there is also a reduction of the dynamic node voltage due to charge sharing among circuit nodes 1, 2, 3 and 5. If the amount of voltage reduction at the dynamic node due to noise leaves the victim transistor MNA in saturation, the current due to crosstalk is independent of the voltage at the dynamic node. From Figure 5, it is also observed that the victim transistor affected by a crosstalk glitch is always outside any charge-sharing path, meaning that crosstalk discharge occurs through a separate path to ground. In other words, crosstalk discharge does not involve any of the nodes that participate in charge sharing. Thus, the charge loss due to crosstalk is independent of the charge redistribution due to charge sharing. Given that the initial charge stored in the dynamic node is $Q_{id}$, crosstalk drains $\Delta Q$ from the dynamic node and charge sharing causes a redistribution of the dynamic node charge, causing a reduction of the voltage by some factor $K$.
The two effects can be combined to give
$$V_D = \frac{Q_{id} - \Delta Q}{C_{in}} \times K \qquad (3)$$
where $V_D$ is the voltage at the dynamic node and $C_{in}$ is the input capacitance of the static inverter connected to the dynamic node. Hence, to obtain the final voltage due to the combined effect of all three noise sources, we independently derive the charge loss $\Delta Q_{\text{crosstalk}}$ due to crosstalk, $\Delta Q_{\text{leak}} = \int I_{\text{leak}}\,dt$ due to leakage, and $C_{VDD}$ and $C_{GND}$ due to charge sharing, and combine them using equation 3.
Figure 5: An example domino gate that has both crosstalk discharge due to a glitch on line A and charge sharing due to device capacitances at nodes 1, 2, 3 and 5.
If the dynamic node voltage $V_D$ predicted by equation 3 is less than or equal to the switching threshold of the output inverter, the failure is possible. Otherwise, the failure is not possible and the test generation process has to backtrack from a previous circuit input assignment.
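The check can be sketched numerically in Python. The sketch assumes the combined model takes the form $V_D = K\,(Q_{init} - \Delta Q_{\text{crosstalk}} - \Delta Q_{\text{leak}})/C$, with the leakage charge simply added to the crosstalk charge loss; that additive treatment, the function names, and all numeric values are assumptions made for illustration, not figures from the paper.

```python
def dynamic_node_voltage(q_init: float, dq_crosstalk: float, dq_leak: float,
                         c_inv: float, k_share: float) -> float:
    """Dynamic node voltage after charge loss and charge-sharing scaling.

    q_init       initial charge on the dynamic node (C)
    dq_crosstalk charge drained by the crosstalk glitch (C)
    dq_leak      charge drained by subthreshold leakage (C)
    c_inv        capacitance used to convert charge to voltage (F)
    k_share      voltage reduction factor from charge sharing (0..1)
    """
    return (q_init - dq_crosstalk - dq_leak) / c_inv * k_share


def failure_possible(v_dynamic: float, v_switch: float) -> bool:
    """A switch failure is possible once the node is at or below threshold."""
    return v_dynamic <= v_switch


# Assumed values: 10 fF node precharged to 1.8 V, 0.9 V inverter threshold.
C_INV = 10e-15
Q_INIT = C_INV * 1.8
v_mild = dynamic_node_voltage(Q_INIT, 4e-15, 1e-15, C_INV, 0.75)
v_severe = dynamic_node_voltage(Q_INIT, 8e-15, 2e-15, C_INV, 0.75)
mild_fails = failure_possible(v_mild, 0.9)
severe_fails = failure_possible(v_severe, 0.9)
```

With these assumed numbers the milder case stays at roughly 0.975 V and the ATPG would backtrack, while the more severe case falls to about 0.6 V and remains a candidate failure. Because each term is a worst-case bound, the result is pessimistic until more inputs are specified.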
Simulation Results
We applied our method to a dual-rail domino Wallace tree multiplier circuit [16], implemented in a 2-metal, 0.18 µm, 1.8 V technology. The multiplier consists of 1806 transistors, arranged in 43 identical adder cells that form a total of 172 domino gates. A layout of the multiplier was generated automatically using an industrial place and route tool. A netlist containing parasitic capacitances was extracted from the layout using Space [17]. In order to validate our methodology for combined analysis of charge sharing, crosstalk and leakage, we inserted several noise failures into the multiplier circuit. The method for inserting a noise failure is as follows. We first select a test vector that creates a small crosstalk glitch on a victim line. No charge sharing exists at the destination gate of the victim line since device capacitances are initially too small. Also, the glitch by itself is not large enough to cause a switch failure at any of the destination gates of the victim line. Next, we incrementally increase the device capacitances of the transistors of the victim gate to increase charge sharing. The increase of the device capacitances is continued until we observe an error at one of the destination gates. Once we observe the error in Hspice, the modified netlist that exhibits the failure is used as input to our test generation tool. If the test-generation methodology is sound, we expect the test generation tool to identify the failure. In addition, the tool should generate all the tests, including the original input vector used for creating the failure, that can both activate the failure and propagate the resulting error to a circuit output. The outcome of test generation for three failures is listed in Table 1.
Column one of Table 1 shows the failure identification number. Column two shows the total number of test vectors generated for the failure. Column three shows the number of generated vectors that succeeded in detecting the switch failure. For each failure, our test generator identifies the failure at the victim line. The test generator also provides a set of test vectors for each failure and in every case, the test set subsumes the original test 
