In this paper, a new leakage-tolerant circuit design technique for high fan-in domino circuits is presented. The technique uses the stacking effect to reduce the leakage of the evaluation network of domino gates. It also uses a current mirror in parallel with the evaluation network to reduce the evaluation delay. Depending on the fan-in, the proposed technique exhibits a 2.0X to 17.7X improvement in leakage and noise tolerance compared to standard domino counterparts designed in a 70-nm technology node.
Introduction
Dynamic logic has been widely used to achieve very high performance that cannot be achieved with static logic styles [1]. However, dynamic logic styles are more sensitive to noise than static logic styles. The poor noise immunity of domino circuits is due to their low switching threshold voltage, which is equal to the threshold voltage of the NMOS devices in their evaluation networks. As technology scales down, the supply voltage is reduced for low power; this in turn requires scaling the threshold voltage (Vth) to maintain high performance. Threshold voltage reduction results in lower noise immunity for domino logic gates. Moreover, a reduced threshold voltage exponentially increases subthreshold leakage, as illustrated in Fig. 1. Besides the leakage increase, input noise sources such as crosstalk and supply noise also increase with technology scaling [2]. Due to all of these trends, domino logic circuits suffer from poor leakage and noise immunity in deep-submicron regimes. Leakage immunity is more problematic in high fan-in domino circuits because of the larger leakage due to more parallel evaluation paths. Since the leakage current is proportional to the fan-in of domino OR gates, the noise immunity decreases as fan-in increases. Leakage and noise tolerance are major issues for wide domino OR gates because the evaluation transistors are all in parallel, leaking charge from the precharge node [3].
To reduce leakage power while maintaining system performance, dual-Vth designs were proposed [4], [5]. The dual-Vth techniques use high Vth off the critical paths of the circuit to reduce leakage current and low Vth on the critical paths to obtain high performance. High Vth cannot be used to improve the leakage/noise immunity of domino circuits, because domino circuits are used in the critical paths of a design to achieve high performance. Moreover, dual Vth requires an expensive process technology. We propose a new circuit technique for domino logic design that improves the leakage immunity of high fan-in domino logic gates without using dual Vth.
Fig. 2 shows the conventional standard domino circuits for high fan-in OR gates. The scheme in Fig. 2(a) is a footed standard domino (SD) logic, whereas the one in Fig. 2(b) is a footless standard domino logic. The footed structure typically shows better noise and leakage tolerance because the stacking effect reduces leakage in the evaluation path [6]. In the conventional domino circuits (Fig. 2(a), (b)), robustness can be improved by upsizing the keeper transistor [3], [7]. The keeper ratio (K) is defined as the ratio of the current drivability of the keeper transistor to that of the evaluation transistor:
$$K = \frac{\mu_p \, (W/L)_{\text{keeper}}}{\mu_n \, (W/L)_{\text{evaluation}}} \qquad (1)$$
where W and L are the transistor width and length, and \mu_n and \mu_p are the electron and hole mobilities.
The rest of the paper is organized as follows. Section II describes the metric used for noise immunity measurement in our experiments. Section III describes our proposed circuit, and Section IV presents the simulation results. Finally, Section V concludes the paper.
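The two leakage trends named in this introduction (exponential growth as the threshold voltage drops, and linear growth with the fan-in of a wide OR gate) can be illustrated with a first-order subthreshold model. The following is only a sketch: the prefactor `i0`, the slope factor `n`, the temperature, and the threshold values are illustrative assumptions, not parameters of the 70-nm technology used in this paper.

```python
import math

def subthreshold_leakage(vth, i0=1e-7, n=1.5, temp_k=300.0):
    """Leakage (A) of one off transistor, first-order model at V_GS = 0.

    I = i0 * exp(-vth / (n * kT/q)).  i0, n and temp_k are assumed,
    illustrative values, not data from the paper.
    """
    thermal_voltage = 8.617333e-5 * temp_k  # kT/q in volts
    return i0 * math.exp(-vth / (n * thermal_voltage))

def or_gate_leakage(fan_in, vth, **model_params):
    """Pull-down leakage of a wide OR gate: fan_in parallel off transistors."""
    return fan_in * subthreshold_leakage(vth, **model_params)

# Tabulate the trend for two assumed threshold voltages.
for vth in (0.30, 0.20):
    for fan_in in (8, 16, 32, 64):
        total = or_gate_leakage(fan_in, vth)
        print(f"Vth={vth:.2f} V  fan-in={fan_in:2d}  leakage={total:.3e} A")
```

In this model, doubling the fan-in doubles the leakage drawn from the dynamic node, while lowering the threshold voltage multiplies it by an exponential factor. This is why wide OR gates in scaled technologies are the hardest case for a keeper of fixed size.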
Noise Immunity Metric
To compare different circuit techniques for robustness to leakage and noise, we apply identical noise pulses to all inputs of the evaluation network during the evaluation phase and measure the amplitude of the resulting noise pulse at the output. The noise immunity metric is the Unity Noise Gain (UNG), defined as the amplitude of the input noise that causes the same amplitude of noise at the output. In this technique, a pulse noise emulates crosstalk-type noise at the input. The input noise level can be increased by increasing either the noise pulse duration or its amplitude. In our experiments, we change the input noise level by changing its amplitude only.
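The UNG definition above amounts to finding the input amplitude at which the noise-transfer curve of the gate crosses the line output = input. A minimal sketch of that search is shown below. The function `output_noise_amplitude` is a hypothetical stand-in for a circuit simulation (its shape and the parameters `v_switch` and `sharpness` are assumptions for illustration only); in practice each evaluation would be one transient simulation run.

```python
def output_noise_amplitude(v_in, vdd=1.0, v_switch=0.3, sharpness=6):
    """Hypothetical noise-transfer curve (illustrative, not simulated data).

    Maps an input noise amplitude to an output noise amplitude with a
    smooth threshold-like shape around v_switch.
    """
    return vdd * v_in**sharpness / (v_in**sharpness + v_switch**sharpness)

def unity_noise_gain(transfer, lo=0.01, hi=0.5, tol=1e-6):
    """Bisect on g(v) = transfer(v) - v to find where output noise equals input noise."""
    def gap(v):
        return transfer(v) - v
    if gap(lo) >= 0 or gap(hi) <= 0:
        raise ValueError("bracket must have gain < 1 at lo and gain > 1 at hi")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) < 0:
            lo = mid   # output still smaller than input: UNG lies above mid
        else:
            hi = mid   # output exceeds input: UNG lies below mid
    return 0.5 * (lo + hi)

ung = unity_noise_gain(output_noise_amplitude)
print(f"UNG = {ung:.4f} V (for the assumed transfer curve)")
```

Only the amplitude is varied here, matching the choice made in the experiments; a sweep over pulse duration would need a second search dimension.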
III. Proposed Leakage Tolerant Domino
Our proposed circuit technique for leakage-tolerant domino is illustrated in Fig. 4. The transistor M7 is added to provide a stacking effect for leakage reduction in the evaluation phase. However, the increased height of the transistor stack in the evaluation path increases the evaluation delay. To reduce the evaluation delay, a current mirror (M8) is added in parallel with the evaluation network to increase the discharging current when the evaluation results in discharging of the dynamic node. Transistor M9 provides feedback from the output to the dynamic node. It is added so that the dynamic node can be fully discharged (avoiding short-circuit current in the static inverter) when the dynamic node is discharged in the evaluation phase (the output switches to high).
Fig. 4. Proposed leakage tolerant domino
The proposed circuit works as follows. When the clock is low, the circuit is in the precharge phase, and the dynamic node (dyn-node) is precharged to high.
The footer transistor (M6) is off, and therefore the current mirror (M8) is also off, pulling no current from the dynamic node (dyn-node). In the evaluation phase, when the clock is high, this circuit shows excellent noise immunity due to the stacking effect offered by transistor M7. When the clock is high and all the inputs are zero, this stacking effect reduces the leakage of the evaluation network. However, if at least one of the inputs switches to high, then the mirror transistor pulls a large current from the dynamic node, resulting in a high-to-low transition on the dynamic node. In this case, the output of the gate transitions to high and the NMOS transistor M9 is turned on to fully discharge the dynamic node.
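The precharge and evaluation behavior described above can be summarized with a logic-level model. This is a sketch only: the function name and signature are hypothetical, and the model deliberately ignores the analog effects that the circuit is designed around (leakage, noise, stacking, and delay). It captures just the ideal state of the dynamic node and the output.

```python
def domino_or_step(clk, inputs, dyn_prev=True):
    """One clock phase of an idealized dynamic OR gate.

    clk      -- False for the precharge phase, True for the evaluation phase
    inputs   -- iterable of input logic values
    dyn_prev -- dynamic-node value before this step (True = high)
    Returns (dyn_node, output).
    """
    if not clk:
        dyn = True        # precharge: dynamic node is pulled high
    elif any(inputs):
        dyn = False       # evaluation with a high input: the mirror (M8)
                          # discharges the node and the feedback (M9)
                          # completes the discharge
    else:
        dyn = dyn_prev    # evaluation with all inputs low: node holds
    output = not dyn      # static inverter at the output
    return dyn, output

# Precharge, then evaluate with one input high.
dyn, out = domino_or_step(clk=False, inputs=[0, 0, 0, 0])
print(dyn, out)
dyn, out = domino_or_step(clk=True, inputs=[0, 1, 0, 0], dyn_prev=dyn)
print(dyn, out)
```

Once the dynamic node has been discharged during evaluation, the model keeps it low until the next precharge, which reflects the monotonic behavior of domino logic.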
IV. Simulation Results
The dynamic OR gates based on the footless standard domino (SD) (Fig. 2(b)), conditional keeper logic (CKL) (Fig. 3), and the proposed circuit (Fig. 4) for fan-ins of 8, 16, 32, and 64 were designed in HSPICE at the worst-case Ioff corner of the 70-nm Berkeley Predictive Technology node [10]. For the standard domino logic, the keeper ratio (defined in (1)) is increased from 0.18 to 0.8 in order to extract different data points for delay and UNG. For our proposed circuit, the UNG-delay trade-off is made by increasing the size of the mirror transistor, M8 (Fig. 4).
The curves of UNG vs. delay for these circuits are illustrated in Fig. 6. As observed from this figure, our proposed circuit exhibits a significant increase in UNG compared to the conventional techniques for all fan-ins. It is also evident from Fig. 6 that the effectiveness of the conventional keeper upsizing method is limited in terms of UNG improvement, especially for higher fan-ins, and results in considerable performance degradation. Moreover, for each fan-in, if the UNG is required to be larger than a certain amount, the proposed technique exhibits better performance. For example, it can be observed from Fig. 6 that if the UNG of a 16-, 32-, or 64-input domino OR gate is required to be greater than 0.2, then the proposed implementation shows better performance and robustness compared to the standard and conditional-keeper domino designs. In the standard and conditional-keeper domino gates, the UNG drops considerably as fan-in increases; however, the UNG does not drop with increasing fan-in in the case of the proposed domino. This is due to the fact that the voltage drop across the footer transistor (M7 in Fig. 4
