ABSTRACT
INTRODUCTION
Domino circuits are a common logic style in high-performance chip design and can be classified into footerless and footed domino [1] [2] [3]. The footed domino has better timing characteristics because the footer transistor isolates the pull-down network (PDN) from ground during the precharge phase, so the dynamic node does not discharge through the PDN. The footerless domino has a shorter evaluation delay and consumes less power. Owing to these different characteristics, both footerless and footed dominos are extensively used in high-performance microprocessors. In a multistage domino, the first stage is typically footed and the remaining stages in the chain are footerless [3].
With aggressive scaling of CMOS devices, the reduction of the threshold voltage (V t ) is accompanied by an exponential increase of the subthreshold leakage current (I sub ), which is a concern not only for leakage power consumption but also for noise immunity. Many circuit-level techniques have been proposed to address the I sub problem, including input vector control [4], body-bias control [5], dual-V t [6], the transistor-stack effect [7] and so on.
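The exponential dependence of I sub on V t can be illustrated with a minimal sketch of the textbook first-order subthreshold model at zero gate-to-source voltage, where the current is proportional to exp(-V t / (n·v T )). The slope factor n = 1.5 and the two threshold values are assumptions chosen for illustration, and the temperature dependence of V t itself is ignored, so the numbers indicate the trend only and are not simulation results.

```python
import math

BOLTZMANN_OVER_Q = 8.617333e-5  # k/q in volts per kelvin


def thermal_voltage(temp_c):
    """Thermal voltage v_T = kT/q in volts for a temperature in degrees Celsius."""
    return BOLTZMANN_OVER_Q * (temp_c + 273.15)


def relative_isub(vt, temp_c, n=1.5):
    """Relative subthreshold current at V_GS = 0, proportional to exp(-Vt / (n * v_T)).

    n is an assumed subthreshold slope factor; the prefactor is dropped,
    so only ratios of the returned values are meaningful.
    """
    return math.exp(-vt / (n * thermal_voltage(temp_c)))


def isub_ratio(vt_low, vt_high, temp_c, n=1.5):
    """How many times more a low-Vt device leaks than a high-Vt device."""
    return relative_isub(vt_low, temp_c, n) / relative_isub(vt_high, temp_c, n)


# Illustrative threshold voltages (assumed values for this sketch).
for temp_c in (25.0, 110.0):
    print(temp_c, isub_ratio(0.22, 0.423, temp_c))
```

In this simplified model the leakage of each device rises with temperature, while the ratio between the low-V t and high-V t devices shrinks.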
The gate oxide leakage current (I gate ) likewise increases exponentially as the oxide thickness (t ox ) is scaled down. The 2003 International Technology Roadmap for Semiconductors (ITRS) predicts that t ox will decrease from 13Å for the 65nm generation to 9Å for 35nm [8]. With such thin t ox , I gate is becoming a significant contributor to the total leakage current as CMOS processes advance into the sub-65nm regime. The probability of electron tunneling through the silicon dioxide used as gate oxide in bulk CMOS technology is much higher than the probability of hole tunneling. Simulation results, shown in Fig. 1, indicate that I gate of a PMOS device is much lower than I gate of an NMOS device with similar physical dimensions (width, length and t ox ) in a 65 nm technology and at the same potential difference across the gate insulator. The I gate produced by an NMOS transistor is 81.5 times higher than that of a PMOS transistor at a supply voltage of 1.2V and 16 times higher at a supply voltage of 0.2V. The difference in I gate between NMOS and PMOS transistors grows with the supply voltage, as illustrated in Fig. 1. In idle mode or at low temperature most of the power is consumed by I gate , and in non-idle mode or at high temperature most of the power is consumed by I sub . A new circuit technique should therefore reduce I gate at low temperatures and I sub at high temperatures. Kao et al. [6] indicated that the high clock and high input (CHIH) state is preferable for reducing I sub in a sleep-mode dual-V t footerless domino gate. However, the CHIH sleep state produces a large gate oxide leakage current through the PDN transistors in both footed and footerless dominos. The most recent comprehensive analysis of the total leakage of footerless dominos at 65nm, including I sub and I gate , was carried out by Z. Liu et al. [9].
Considering the impact of I gate on the total leakage current, that study indicates that the high clock and low input (CHIL) state is preferable in dual-V t footerless dominos, particularly at low sleep temperatures.
In this paper, a new circuit technique is proposed that reduces both the I gate and I sub leakage currents through a suitable combination of input and clock signals. The proposed circuit consumes less active power at both low and high die temperatures, at the cost of more delay and area overhead compared with the standard dual-threshold (dual-V t ) domino logic circuit.
The paper is organized as follows: Section 2 surveys the characteristics of leakage current in domino circuits. Section 3 explains the proposed lector dual-V t domino circuit. Simulation results are given in Section 4, followed by the conclusion in Section 5.
CHARACTERISTICS OF LEAKAGE CURRENT IN DOMINO CIRCUIT
This section is divided into two subsections, 2.1 and 2.2. Section 2.1 compares the subthreshold and gate oxide leakage currents produced by low-V t and high-V t PMOS and NMOS transistors. Section 2.2 discusses the operation of the standard dual-V t domino.
I sub and I gate current analysis of a single transistor
The maximum gate oxide leakage and subthreshold leakage currents produced by PMOS and NMOS transistors are shown in Fig. 2. Fig. 2(a) shows the four components of I gate : gate-to-channel tunneling current (I gc ), gate-to-source tunneling current (I gs ), gate-to-drain tunneling current (I gd ) and gate-to-body tunneling current (I gb ) [9]. I gs and I gd are the edge tunneling currents from the gate to the source and drain terminals respectively, through the gate-to-source and gate-to-drain overlap areas. I gc is shared between the source and drain terminals [10]. I gb is typically several orders of magnitude smaller than the other three components of the gate tunneling current. As shown in Fig. 2(a), the maximum gate oxide leakage current flows when the transistor is turned ON and the gate-to-source and gate-to-drain potential differences are at their maximum. As shown in Fig. 2(b), the maximum subthreshold leakage current flows when the transistor is turned OFF and the potential difference between the source and drain terminals is at its maximum.
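The decomposition described above can be summarized as a simple sum. The split of I gc into a source share and a drain share (written I gcs and I gcd here; these symbol names are assumed for illustration) reflects the statement that I gc is shared between the two terminals.

```latex
I_{gate} = I_{gc} + I_{gs} + I_{gd} + I_{gb}, \qquad I_{gc} = I_{gcs} + I_{gcd}
```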
A comparison of normalized gate oxide and subthreshold leakage currents produced by NMOS and PMOS transistors for low-V t and high-V t in a 65nm dual-V t CMOS technology is listed in Table 1. The data are given for low and high die temperatures. The transistor parameters are: length = 65nm, width = 1µm, low-V t = 0.22V, high-V t = 0.423V, V DD = 1V. For each temperature, leakage currents are normalized to the subthreshold leakage current produced by a high-V t PMOS transistor.
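The normalization used for the table can be sketched as follows. The function divides every current by a chosen reference entry, here the subthreshold current of the high-V t PMOS transistor. The dictionary keys and the current values are hypothetical placeholders chosen for illustration; they are not the entries of Table 1.

```python
def normalize_leakage(currents, reference_key):
    """Divide every leakage current by the reference current.

    currents: dict mapping a (device, threshold, component) tuple to a current
    in any consistent unit. reference_key selects the normalizing entry.
    """
    reference = currents[reference_key]
    if reference <= 0:
        raise ValueError("reference current must be positive")
    return {key: value / reference for key, value in currents.items()}


# Hypothetical placeholder currents in nA (NOT the values of Table 1).
example_currents = {
    ("PMOS", "high-Vt", "Isub"): 2.0,
    ("NMOS", "low-Vt", "Isub"): 50.0,
    ("NMOS", "low-Vt", "Igate"): 8.0,
}

normalized = normalize_leakage(example_currents, ("PMOS", "high-Vt", "Isub"))
for key, value in normalized.items():
    print(key, value)
```

Normalizing separately at each temperature, as the text describes, amounts to calling the function once per temperature with that temperature's set of currents.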
Firstly, as illustrated in Table 1, the I gate produced by a low-V t NMOS is 81x and 72.5x higher than the I gate produced by a low-V t PMOS at 110 °C and 25 °C respectively. This shows that the probability of hole tunneling through the gate insulator is much smaller than the probability of electron tunneling. Therefore, the I gate produced by a PMOS device is much smaller than that produced by an NMOS device with similar physical dimensions (width, length and t ox ) in a 65 nm technology and at the same potential difference across the gate insulator [11].
Secondly, the I gate produced by a low-V t NMOS is 9.1x higher at 110 °C and 9x higher at 25 °C than the I gate produced by a high-V t NMOS transistor. This paper exploits the relatively higher gate tunneling barrier for electrons in a high-V t NMOS transistor by using high-V t NMOS transistors at the inputs of the domino circuit, which reduces the gate oxide leakage current overhead of the proposed dual-V t domino circuit technique.
Standard Dual-V t Domino Logic
The standard dual-V t domino logic is shown in Fig. 3. The first dual-V t domino logic circuit was proposed by Kao [12], employing dual-V t transistors to reduce the subthreshold leakage current. To maintain the same delay as the standard footerless domino circuit, the critical signal transitions during the evaluation phase must occur through low-V t transistors. In contrast, signal transitions during the precharge phase are not critical to the performance of the circuit, so the transistors that are active during the precharge phase are high-V t transistors [13]. A feedback keeper transistor, placed in parallel with the precharge transistor and with its gate driven by the output voltage, maintains the dynamic node voltage against coupling noise, charge sharing and subthreshold leakage current [14].
The standard dual-V t domino circuit works as follows. When the clock is low, the precharge transistor MP 1 (high-V t ) is ON and charges the dynamic node; this is called the precharge phase. During the precharge phase the output node goes low and the MP 2 (high-V t ) transistor turns ON, holding the dynamic node in the high state. The output of the domino logic is independent of the inputs applied to the evaluation network; only the leakage current depends on the applied input vector. When the clock is high, transistor MP 1 is OFF and the state of transistor MP 2 depends on the output of the domino circuit; this is called the evaluation phase. Whether the dynamic node stays charged or is discharged depends on the applied input vector, and the output node goes low or high accordingly. The subthreshold and gate oxide leakage also depend on the applied input vector.
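The precharge and evaluation behaviour described above can be captured in a minimal logic-level sketch. It models only the logic values of the dynamic node and the output; it ignores leakage, timing, noise and any contention between the precharge transistor and the pull-down network. The function names and the callable used to represent the pull-down network are assumptions made for this illustration.

```python
def domino_step(clk, inputs, pull_down, dynamic_prev):
    """Return (dynamic_node, output) for one clock level of a domino gate.

    clk: 0 for the precharge phase, 1 for the evaluation phase.
    pull_down: callable returning True when the pull-down network conducts.
    dynamic_prev: logic value of the dynamic node before this step.
    """
    if clk == 0:
        dynamic = 1  # precharge: MP1 is ON and charges the dynamic node
    elif pull_down(inputs):
        dynamic = 0  # evaluation: conducting PDN discharges the dynamic node
    else:
        dynamic = dynamic_prev  # evaluation: node keeps its value (keeper MP2 holds a high)
    output = 1 - dynamic  # static inverter at the output
    return dynamic, output


def or2(inputs):
    """Pull-down network of a 2-input OR: parallel NMOS transistors."""
    return any(inputs)


def and2(inputs):
    """Pull-down network of a 2-input AND: series NMOS transistors."""
    return all(inputs)


dyn, out = domino_step(0, (0, 1), or2, dynamic_prev=0)    # precharge
dyn, out = domino_step(1, (0, 1), or2, dynamic_prev=dyn)  # evaluate
print(dyn, out)
```

Calling the function first with the clock low and then with the clock high reproduces one precharge and evaluation cycle of the gate.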
LECTOR DUAL-V t DOMINO LOGIC
The proposed circuit technique reduces subthreshold and gate oxide leakage simultaneously. The proposed circuit is illustrated in Fig. 4. The approach reduces leakage power through effective stacking of transistors in the path from the supply voltage to ground. It is based on the observation in [15], [16] and [17] that a state in which only one transistor is OFF in a path from the supply voltage to ground is more leaky than a state in which more than one transistor is OFF in that path. In our approach, two low-V t leakage control transistors (LCTs), MP 4 (PMOS) and MN 2 (NMOS), are introduced between the precharge and evaluation networks, and the gate of each of these transistors is controlled by the source of the other. The drain nodes of MP 4 and MN 2 are connected together to form the input of the inverter. In this configuration, the switching of transistors MP 4 and MN 2 depends on the voltage at nodes N 2 and N 1 respectively. Thus, for any input combination in the pull-down network, one of the LCTs operates near its cut-off region and increases the resistance between the V DD and ground rails, which reduces the leakage current. High-V t NMOS transistors replace the low-V t input transistors of the pull-down network to reduce the gate oxide leakage current.
The proposed domino gate operates similarly to the standard dual-V t domino gate. When the clock signal goes low, the dynamic node is charged high through transistors MP 1 (high-V t ) and MP 4 (low-V t ). The charging of the dynamic node is almost independent of the input state preceding the clock transition. If the inputs are low before the clock goes low, node N 2 is at a low potential and transistor MP 4 offers a low-resistance path for charging the dynamic node. If the inputs are high before the clock goes low, the voltage at node N 2 is not sufficient to turn MP 4 completely OFF (MP 4 operates near its cut-off region); its resistance is lower than its OFF resistance, which still allows the dynamic node to charge high. This charging of the dynamic node is called the precharge phase. In this phase the output of the domino circuit is independent of the inputs applied to the evaluation network; only the leakage current depends on the applied input vector. The low clock and low input (CLIL) and low clock and high input (CLIH) combinations are shown in Fig. 5 and Fig. 6 respectively. When the clock goes high, or in standby mode, the circuit is in the evaluation phase, and the dynamic node remains charged or is discharged depending on the inputs. If all the inputs are low, the dynamic node is not discharged by the evaluation network, the output of the inverter is low, and transistor MP 2 (high-V t ) turns ON. The voltage at node N 1 turns ON transistor MN 1 (high-V t ), but the voltage at node N 2 does not cut off transistor MP 4 ; MP 4 operates near its cut-off region, offering a high-resistance path between V DD and ground and reducing the subthreshold and gate leakage currents. In this CHIL sleep state, shown in Fig. 7, all the transistors in the evaluation network exhibit both I sub and I gate leakage currents simultaneously.
In this case I sub dominates I gate ; therefore CHIL is not a leakage-reducing sleep state for the proposed circuits. In the other sleep state, CHIH, shown in Fig. 8, the dynamic node is discharged through the evaluation network and the output of the inverter is high. Transistor MP 2 turns OFF, and the voltage at node N 1 makes transistor MN 2 operate near its cut-off region, again offering a high-resistance path. The potential at node N 2 turns ON transistor MP 4 . Thus, introducing the low-V t LCTs increases the resistance between V DD and ground, but it also increases the propagation delay of the domino circuit. The propagation delay can be controlled by sizing the LCTs. Lector stacking retains the logic state during standby mode, as in the standard dual-V t domino logic. In CHIH only I gate flows through the evaluation transistors, and therefore this is the leakage-reducing state in standby mode.
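The four clock and input combinations named in the text (CLIL, CLIH, CHIL and CHIH) can be generated with a small helper, sketched below as a bookkeeping aid. The function name and the lookup table are assumptions made for illustration; the table records only what the text states about the leakage components in the evaluation transistors for the two sleep states.

```python
def state_label(clk, inputs):
    """Name a clock/input combination, for example CHIL = clock high, inputs low.

    Only all-low and all-high input vectors have a four-letter label.
    """
    if not inputs:
        raise ValueError("at least one input is required")
    if all(inputs):
        level = "H"
    elif not any(inputs):
        level = "L"
    else:
        raise ValueError("mixed input vectors have no four-letter label")
    return "C" + ("H" if clk else "L") + "I" + level


# Leakage components in the evaluation transistors, as stated in the text
# for the two sleep states of the proposed circuit.
PDN_LEAKAGE_IN_SLEEP = {
    "CHIL": ("Isub", "Igate"),
    "CHIH": ("Igate",),
}

label = state_label(1, (1, 1, 1, 1))
print(label, PDN_LEAKAGE_IN_SLEEP[label])
```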
SIMULATION RESULTS
The BSIM4 device model [18] is used to simulate the standard dual-V t domino logic and the proposed circuits for accurate estimation of subthreshold and gate oxide leakage currents. The following circuits are simulated with the HSPICE tool [19] [20] in a 65nm CMOS technology (V tnlow =|V tplow |=0.22V, V tnhigh =0.423V, |V tphigh |=0.365V, V DD =1V and output capacitance C out =1fF): a 2-input domino AND gate (AND2) and 2-input, 4-input and 8-input domino OR gates (OR2, OR4 and OR8 respectively). All these circuits are designed with both the standard dual-V t domino technique and the proposed dual-V t technique. For a fair comparison, the NMOS and PMOS transistor sizes are equal in the circuits of both techniques. To measure active power consumption, a clock pulse of 30ns is applied and the power is measured for low and high inputs at low and high die temperatures.
The total leakage power consumption of all the circuits is compared for both techniques in idle and non-idle mode, for low and high inputs, at 25 °C and 110 °C. The all-low and all-high input states cover the worst-case leakage scenarios, with the intermediate input states falling between them, because the maximum number of OFF transistors occurs when all the inputs are low and the minimum number of OFF transistors occurs when all the inputs are high.
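The argument about the number of OFF transistors can be checked by enumeration. The sketch below assumes one NMOS input transistor per input, which is OFF when its input is low, and counts the OFF transistors for every input vector of a 4-input gate; the function name is an assumption made for this illustration.

```python
from itertools import product


def off_counts(n_inputs):
    """Map every input vector to its number of OFF NMOS input transistors.

    Assumes one NMOS transistor per input, OFF when the input is 0.
    """
    return {vector: vector.count(0) for vector in product((0, 1), repeat=n_inputs)}


counts = off_counts(4)
all_low = (0, 0, 0, 0)
all_high = (1, 1, 1, 1)

print("vectors:", len(counts))
print("max OFF:", max(counts.values()), "at all-low:", counts[all_low])
print("min OFF:", min(counts.values()), "at all-high:", counts[all_high])
```

Every mixed input vector has an OFF count strictly between the two extremes, which is why only the all-low and all-high cases are simulated.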
Active Power Consumption
The active power consumption of the domino circuits at 25 °C and 110 °C is shown in Fig. 9. The results show that the active power of the lector dual-V t circuits is lower than that of the standard dual-V t domino circuits. At 25 °C the active power consumption decreases by 39.6% in AND2, 41.3% in OR2, 49.7% in OR4 and 57.9% in OR8; at 110 °C it decreases by 36.2% in AND2, 32.4% in OR2, 40.3% in OR4 and 38.5% in OR8. The lector dual-V t technique produces slightly weak logic levels, which leads to the reduction in active power consumption. The same applies to other domino circuits.
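The reported reductions can be tabulated and their ranges extracted with a short sketch. The percentages in the dictionary are the values quoted in the text; the helper percent_reduction shows how such a figure would be obtained from a pair of power values, and the pair used in the example call is a hypothetical placeholder, not a simulation result.

```python
# Active power reductions (percent) as reported in the text, by temperature in °C.
REPORTED_REDUCTION_PCT = {
    25: {"AND2": 39.6, "OR2": 41.3, "OR4": 49.7, "OR8": 57.9},
    110: {"AND2": 36.2, "OR2": 32.4, "OR4": 40.3, "OR8": 38.5},
}


def percent_reduction(standard, proposed):
    """Reduction of the proposed value relative to the standard value, in percent."""
    return 100.0 * (standard - proposed) / standard


def reduction_range(temp_c):
    """Smallest and largest reported reduction at the given temperature."""
    values = REPORTED_REDUCTION_PCT[temp_c].values()
    return min(values), max(values)


print(reduction_range(25))
print(reduction_range(110))
# Hypothetical power values in arbitrary units, for illustration only.
print(percent_reduction(10.0, 6.0))
```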
Leakage Power Consumption at 25 °C
In this part, it is assumed that the sleep period is long and the sleep temperature has fallen to room temperature. At low temperature in sub-65nm technology I gate is dominant over I sub . Based on the simulation results for the proposed domino circuits shown in Table 2, the CHIL and CLIH states are preferred for reducing the leakage current.
Leakage Power Consumption at 110 °C
In this part, it is assumed that the sleep period is short and the temperature remains at 110 °C throughout it. At high temperature, I sub is dominant over I gate . Based on the simulation results for the proposed domino circuits shown in Table 3, the CHIL and CLIH states are again preferred for reducing the leakage current, as at 25 °C.
CONCLUSION
In the sub-65nm technologies both the gate dielectric and subthreshold leakage currents must be suppressed for reducing power consumption. Therefore, a new domino technique is proposed for simultaneously reducing gate oxide and subthreshold leakage currents in domino logic circuits at different temperatures.
The proposed domino circuit technique exploits the lector stacking effect between the precharge and evaluation networks and the characteristics of high-V t NMOS transistors used as the input transistors of domino circuits. The results show a reduction of active power by 39.6% to 57.9% at low and 32.4% to 40.3% at high die temperatures. At both low and high temperatures the CLIH state is preferred for wide OR gates. Compared with standard dual-V t domino circuits, the proposed technique reduces the total leakage power by 69.3% to 92.68% at low temperature and by 58.83% to 72.73% at high temperature. This technique can be used for very high speed low power applications.
