# Advances in Radio Science

## Gate Leakage Reduction by Clocked Power Supply of Adiabatic Logic Circuits

Ph. Teichmann, J. Fischer, E. Amirante, St. Henzler, A. Bargagli-Stoffi, Ch. Otte, and D. Schmitt-Landsiedel

Lehrstuhl für Technische Elektronik, Technische Universität München, Theresienstrasse 90, D-80290 Munich, Germany

Abstract. Losses due to gate leakage currents become more dominant in new technologies, as gate leakage currents increase exponentially with decreasing gate oxide thickness. The most promising Adiabatic Logic (AL) families use a clocked power supply with four states. Hence, the full $V_{DD}$ voltage drops across an AL gate for only a quarter of the clock cycle, causing full gate leakage for only a quarter of the clock period. The rising and falling ramps of the clocked power supply lead to additional energy consumption by gate leakage. This energy is smaller than the fraction caused by the constant $V_{DD}$ drop, because the gate leakage depends exponentially on the voltage across the oxide. To obtain smaller energy consumption, Improved Adiabatic Logic (IAL) has been introduced. IAL swaps all n- and p-channel transistors. The logic blocks are built of p-channel devices, which show significantly smaller gate tunneling currents than n-channel devices. Using IAL instead of conventional AL allows an additional reduction of the energy consumption caused by gate leakage. Simulations based on a 90 nm CMOS process show a lowering in gate leakage energy consumption for AL by a factor of 1.5 compared to static CMOS. For IAL the factor is up to 4. The achievable reduction varies depending on the considered AL family and the complexity of the gate.

#### 1 Introduction

Future applications raise the need for computing complexity. Device scaling increases the integration density, but with smaller devices and thinner gate oxides, leakage currents are no longer negligible, and leakage reduction methods are of concern in modern CMOS design methodologies. Many proposals have been presented to suppress leakage currents in static CMOS, e.g. Henzler et al. (2004, 2005); Drazdziulis et al. (2003); Narendra et al. (2001); Hamzaoglu et al. (2002).

Adiabatic Logic is a promising low-power circuit technique that reduces the energy dissipation in digital logic by using a constant current to charge a capacitance efficiently. For this purpose, a clocked power supply (power clock) consisting of four states is used. The whole supply voltage $V_{DD}$ drops across the gate during only one of the four states. Hence, the power clock implicitly reduces the leakage currents in AL circuits.

In this paper the reduction of gate leakage currents through the power clock is investigated. The following Sect. 2 gives a short overview of the physical principles of gate leakage. Section 3 deals with the adiabatic power clock leading to an estimation of the savings in gate leakage in Sect. 4. A single MOS device is taken into account, driven by a constant supply voltage on the one hand and the dynamic power clock of an adiabatic system on the other hand. Section 5 presents the results of simulations in a 90 nm CMOS technology. Three different adiabatic logic topologies are simulated and compared to an implementation in static CMOS.

#### 2 Gate leakage

In state-of-the-art devices with oxide thicknesses below 2 nm, gate leakage currents become a noticeable part of static energy dissipation. Depending on the voltage from gate to substrate, gate tunneling can be divided into two mechanisms: Fowler-Nordheim tunneling (FNT) and direct tunneling (DT). FNT occurs when the charges tunnel through a triangular potential barrier, which requires a high voltage. In normal device operation only DT results in a significant gate leakage current. For an n-channel device, the electrons tunnel from the conduction band of the substrate through a trapezoidal oxide potential barrier (see Fig. 1). The equation describing the DT current density is given by Schuegraf et al. (1994):

$$J_{DT} = A E_{ox}^2 \exp\left(\frac{-B\left[1 - \left(1 - \frac{V_{ox}}{\phi_{ox}}\right)^{3/2}\right]}{E_{ox}}\right) \tag{1}$$

$$A = q^3 / 16\pi^2 \hbar \phi_{ox} \tag{2}$$

$$B = 4\sqrt{2m^*}\phi_{ox}^{3/2}/3\hbar q \,, \tag{3}$$

where $\phi_{ox}$ is the barrier height, $E_{ox}$ the electric field in the oxide, $V_{ox}$ the voltage across the oxide, $q$ the elementary charge, $\hbar$ the reduced Planck constant and $m^*$ the effective mass of the tunneling carrier. As $\phi_{ox}$ for holes in the valence band is 4.5 eV compared to 3.1 eV for the electrons in the conduction band, the gate leakage current in n-MOS devices is higher than in p-MOS devices. From simulations with parameters of a 90 nm technology we obtained gate leakage currents in n-MOS devices that are about 10 times higher than in p-MOS devices. This suggests that designs using predominantly p-channel devices suffer less from gate leakage than those using n-channel devices.

**Fig. 1.** Direct tunneling of electrons from the conduction band of the substrate into the conduction band of the polysilicon. The electrons tunnel through a trapezoidal oxide potential barrier.

#### 3 Adiabatic power clock

Adiabatic logic families suitable for building digital systems use a power clock consisting of four phases (see Fig. 2). A phase is separated into four states (Fig. 3a) named Evaluate (**E**), Hold (**H**), Recover (**R**) and Wait (**W**). The duration of each state is T/4, where T is the period of the clock cycle. During **E** the internal nodes of the AL gate are charged depending on the outputs of the preceding gate. The outputs of the gate are valid during **H**. Charge is recovered from the internal nodes to the oscillator in the **R** state. The state **W** has been introduced for symmetry reasons, which makes the waveform easier to generate.

The gates are connected to the power clock in such a way that, if a gate is in the **E** state, its preceding gate is in the **H** state. In Fig. 2 the arrows symbolize the way data is transferred. So the gate connected to phase $\phi_1$ evaluates the inputs that are delivered by the preceding gate at phase $\phi_0$.

Looking at the power clock, the full supply voltage drops across the gate for just a quarter of a cycle. During **E** and **R** the voltage is ramped from 0 to $V_{DD}$ and vice versa, which leads to leakage power dissipation as well. Compared to a static CMOS gate, where the supply voltage $V_{DD}$ is constant the whole time, a reduction in static power dissipation through gate leakage is expected. Figure 3b shows the gate leakage current for a phase of the adiabatic power clock and for its static CMOS counterpart.
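The four states and the quarter-period shift between phases can be made concrete with a short sketch. It assumes ideal linear ramps and a phase index counted from 0; the names `power_clock` and `state` are hypothetical and only serve this illustration.

```python
def _local_time(t, period, phase):
    """Time within the current cycle of the given phase (phase k lags by k*T/4)."""
    return (t - phase * period / 4.0) % period


def power_clock(t, v_dd, period, phase=0):
    """Ideal trapezoidal power-clock voltage of one phase at time t."""
    tau = _local_time(t, period, phase)
    quarter = period / 4.0
    if tau < quarter:                 # E: linear ramp 0 -> V_DD
        return v_dd * tau / quarter
    if tau < 2 * quarter:             # H: hold at V_DD
        return v_dd
    if tau < 3 * quarter:             # R: linear ramp V_DD -> 0
        return v_dd * (3 * quarter - tau) / quarter
    return 0.0                        # W: wait at 0


def state(t, period, phase=0):
    """Name of the state (E, H, R or W) the given phase is in at time t."""
    tau = _local_time(t, period, phase)
    # min() guards against tau rounding up to exactly one period.
    return "EHRW"[min(3, int(4 * tau / period))]


# While the gate on phase 1 evaluates, its predecessor on phase 0 holds.
print(state(0.3, 1.0, phase=1), state(0.3, 1.0, phase=0))
```

In this idealized waveform the full $V_{DD}$ is applied only during **H**, which is the property the leakage estimate relies on.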

#### 4 Estimation of the savings through adiabatic power clock

**Fig. 2.** The power clock for the considered adiabatic logic families. It consists of four phases. The arrows show when data is transferred from one gate to the consecutive gate.

To predict the savings through the adiabatic power clock compared to static CMOS with a constant $V_{DD}$ supply, a single MOSFET device is characterized here. For this purpose, the gate leakage energy dissipation factor (GLEDF) $\eta$ is introduced as the ratio of the energies dissipated by gate leakage in the two arrangements of Fig. 4.

$$\eta = \frac{E_{gl,CMOS}}{E_{gl,AL}} \tag{4}$$

Taking Eq. (1) and the waveform u(t) of a power clock phase, which rises linearly as $u(t) = 4 V_{DD} t / T$ during **E** and falls symmetrically during **R**, we can calculate $\eta$ as follows.

$$\begin{aligned}
\eta &= \frac{\int_{T} V_{DD} J_{DT}(V_{DD})\, dt}{\int_{T} u(t) J_{DT}(u(t))\, dt} \\
&= \frac{V_{DD} J_{DT}(V_{DD})\, T}{2 \int_{0}^{T/4} u(t) J_{DT}(u(t))\, dt + \int_{0}^{T/4} V_{DD} J_{DT}(V_{DD})\, dt}
\end{aligned} \tag{5}$$

$$\begin{aligned}
\eta &= \frac{J_{DT}(V_{DD})\, T}{\frac{8}{T} \int_0^{T/4} t\, J_{DT}(u(t))\, dt + \frac{T}{4} J_{DT}(V_{DD})} \\
\eta^{-1} &= \frac{\frac{8}{T} \int_0^{T/4} t\, J_{DT}(u(t))\, dt}{J_{DT}(V_{DD})\, T} + \frac{1}{4}
\end{aligned} \tag{6}$$

Looking at $\eta$ in Eq. (6), we see two terms in the denominator. The first term represents the energy dissipated by gate leakage during the ramps **E** and **R** of the power clock phase; the second term arises from the constant $V_{DD}$ drop during the **H** state. Neglecting the first term in the denominator, the energy dissipation factor $\eta$ would obviously be four. Considering the whole equation and assuming a leakage power that depends only linearly on the voltage across the oxide, i.e. a voltage-independent gate leakage current, $\eta$ would be as low as two. According to Eq. (1), however, the gate leakage current depends exponentially on the voltage across the oxide, so less energy is dissipated during the ramps than in this linear case. The expected value therefore lies between $\eta=2$ and $\eta=4$.
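The two limits can be checked with a simple trial model that is not part of the original analysis: assume $J_{DT}(u) \propto u^n$ on the linear ramp $u(t) = 4 V_{DD} t / T$. Under this assumption Eq. (6) can be evaluated in closed form:

$$\eta^{-1} = \frac{8}{T^2} \int_0^{T/4} t \left(\frac{4t}{T}\right)^{n} dt + \frac{1}{4} = \frac{1}{2(n+2)} + \frac{1}{4}$$

For $n=0$ (a voltage-independent current) this gives $\eta=2$, and for $n \to \infty$ (an arbitrarily steep voltage dependence) $\eta$ approaches 4, so any monotonically increasing leakage characteristic of this form falls between the two bounds.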

The GLEDF depends on the supply voltage $V_{DD}$. In the inverse GLEDF $\eta^{-1}$, the numerator of the first term is the integral over the gate leakage power during the ramps. The denominator of this term comes from the integration over the gate leakage power during the **H** state. For higher values of $V_{DD}$, the integral of the power during the ramps becomes less significant compared to the integral over the gate leakage power during the **H** state.

**Fig. 3.** (a) A phase of the adiabatic power clock is separated into four states. (b) Respective gate leakage current at a single transistor.

**Fig. 4.** The estimation for the savings is performed on a single transistor. On the left one can see the circuit in case of static CMOS and on the right the circuit for the clocked power supply of an adiabatic system.

A Matlab simulation was performed to calculate the factor $\eta$. The parameters for modeling the gate leakage current according to Eq. (1) were taken from a 90 nm CMOS technology. As expected, the results in Fig. 5 show that $\eta$ depends on the supply voltage. Higher supply voltages result in a higher reduction of gate leakage currents in adiabatic logic compared to static CMOS.
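The trend can be reproduced qualitatively with a few lines of Python. This is only a sketch, not the Matlab model used here: the oxide thickness, the effective mass and the uniform field $E_{ox} = V_{ox}/t_{ox}$ are assumed values, so the numbers will differ from Fig. 5. The function names `j_dt` and `gledf` are hypothetical.

```python
import math

Q = 1.602176634e-19      # elementary charge [C]
HBAR = 1.054571817e-34   # reduced Planck constant [J s]
M0 = 9.1093837015e-31    # free-electron mass [kg]

PHI_OX_EV = 3.1          # electron barrier height [eV], as stated in Sect. 2
T_OX = 1.5e-9            # oxide thickness [m], assumed for illustration
M_EFF = 0.4 * M0         # effective mass in the oxide, assumed for illustration


def j_dt(v_ox):
    """Direct-tunneling current density of Eq. (1) in A/m^2, for 0 < v_ox < 3.1 V."""
    phi_j = PHI_OX_EV * Q                                        # barrier in joules
    e_ox = v_ox / T_OX                                           # assumed uniform field
    a = Q**3 / (16 * math.pi**2 * HBAR * phi_j)                  # Eq. (2)
    b = 4 * math.sqrt(2 * M_EFF) * phi_j**1.5 / (3 * HBAR * Q)   # Eq. (3)
    shape = 1 - (1 - v_ox / PHI_OX_EV) ** 1.5
    return a * e_ox**2 * math.exp(-b * shape / e_ox)


def gledf(v_dd, steps=4000):
    """Eta of Eq. (6) with T = 1, integrated with the midpoint rule."""
    h = 0.25 / steps
    ramp = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        ramp += t * j_dt(4.0 * v_dd * t) * h    # linear ramp u(t) = 4 V_DD t / T
    return 1.0 / (8.0 * ramp / j_dt(v_dd) + 0.25)


for v_dd in (0.4, 0.8, 1.2):
    print(f"V_DD = {v_dd:.1f} V  ->  eta = {gledf(v_dd):.2f}")
```

With these assumed parameters the computed factor stays between 2 and 4 and grows with $V_{DD}$, in line with the behaviour described above.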

**Fig. 5.** A Matlab simulation of Eq. (6) using parameters of a 90 nm CMOS technology. The energy dissipation factor $\eta$ is dependent on the supply voltage.

The factor $\eta$ is an estimate of the savings at a single transistor. For each logic family, the topology and the logic function have to be taken into account. Efficient Charge Recovery Logic (ECRL) (Moon et al., 1996) consists of a pair of cross-coupled p-channel devices (Fig. 6a). The logic function blocks F and $\overline{F}$ are connected between the outputs of the gate and GND. Positive Feedback Adiabatic Logic (PFAL) (Fig. 6b) uses a latch as its basic cell, with the function blocks connected between the power clock and the outputs of the gate. These two families use logic function blocks consisting of n-channel devices. To additionally reduce the adiabatic losses, the Improved PFAL (IPFAL) family (Fig. 6c) has been introduced (Fischer et al., 2003). IPFAL, like PFAL, consists of a cross-coupled latch, but the logic blocks are implemented with p-channel devices only. This is advantageous because gate leakage currents, as mentioned in Sect. 2, are almost a decade lower in p-channel devices. All three families have in common that an inverter cell uses at least twice as many MOSFET devices as the corresponding CMOS gate. Hence, the estimated factor is influenced by the logic family and the implemented logic function. A device quantity factor (DQF) fn is defined as the ratio of the n-channel device counts. The p-channel devices are neglected, because their gate leakage currents are almost a decade lower than those of n-channel MOSFETs.

$$fn = \frac{n_{AL}}{n_{CMOS}} = \frac{\text{transistor count}_{n,AL}}{\text{transistor count}_{n,CMOS}} \tag{7}$$

Three logic functions are investigated: an inverter (INV), a NAND and a function called LOGIC5. LOGIC5 is a five-input logic function in which the logic function trees F and $\overline{F}$ are balanced, i.e. the number of parallel and serial devices is equal in the two logic function blocks. The count of n-channel devices and the corresponding factor fn for the investigated logic functions are listed in Table 1.

ECRL uses n-channel devices for its logic blocks, so the factor fn stays 2, independent of the implemented logic function. For PFAL, the INV uses four n-MOSFETs: two as input transistors in the logic function blocks and two more in the latch. The CMOS inverter uses just one n-channel device. Hence, the two additional n-channel devices in the latch of a PFAL gate result in a drastic overhead for logic functions with few input signals (e.g. INV, NAND, NOR). For PFAL gates with a higher input signal count, the overhead caused by the two additional devices in the latch is smaller: for the LOGIC5 gate fn=2.4, compared to fn=4 for the INV gate. As the logic function blocks in IPFAL are built with p-channel devices, the count of n-MOSFETs in IPFAL stays 2, independent of the size of the logic function. So the factor fn decreases for IPFAL with increasing size of the logic function.
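The device counts can be summarized in a small sketch. The counting rules below (twice the CMOS n-channel count for ECRL, the same plus two latch devices for PFAL, only the two latch devices for IPFAL) are a pattern read off from the counts given for the three investigated functions, not a general statement about arbitrary gates; the function names are hypothetical.

```python
def n_channel_count(family, n_cmos):
    """n-channel devices per adiabatic gate, following the counts in Table 1."""
    if family == "ECRL":
        return 2 * n_cmos        # F and F-bar are both built from n-channel devices
    if family == "PFAL":
        return 2 * n_cmos + 2    # F and F-bar plus two n-channel latch devices
    if family == "IPFAL":
        return 2                 # logic blocks are p-channel, only the latch counts
    raise ValueError(f"unknown family: {family}")


def dqf(family, n_cmos):
    """Device quantity factor fn of Eq. (7)."""
    return n_channel_count(family, n_cmos) / n_cmos


# n-channel device counts of the static CMOS reference gates
N_CMOS = {"INV": 1, "NAND": 2, "LOGIC5": 5}

for name, n in N_CMOS.items():
    print(name, [dqf(family, n) for family in ("ECRL", "PFAL", "IPFAL")])
```

The sketch makes the trend explicit: the fixed latch overhead of PFAL is diluted as the function grows, while the IPFAL numerator stays constant.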

**Table 1.** Transistor counts and DQF (fn) of the investigated logic families and functions.

|        | n<sub>CMOS</sub> | n<sub>ECRL</sub> | n<sub>PFAL</sub> | n<sub>IPFAL</sub> | fn<sub>ECRL</sub> | fn<sub>PFAL</sub> | fn<sub>IPFAL</sub> |
|--------|------------------|------------------|------------------|-------------------|-------------------|-------------------|--------------------|
| INV    | 1                | 2                | 4                | 2                 | 2                 | 4                 | 2                  |
| NAND   | 2                | 4                | 6                | 2                 | 2                 | 3                 | 1                  |
| LOGIC5 | 5                | 10               | 12               | 2                 | 2                 | 2.4               | 0.4                |

**Fig. 6.** Schematics of (a) ECRL, (b) PFAL and (c) IPFAL cell. The logic function blocks consist of p-channel devices for IPFAL and n-channel devices otherwise.

**Table 2.** Effective gate leakage energy dissipation factor (effective GLEDF) at a supply voltage of 0.8 V.

| $\eta_{eff}$ @ 0.8 V | ECRL | PFAL  | IPFAL |
|----------------------|------|-------|-------|
| INV                | 1.43 | 0.715 | 1.43  |
| NAND               | 1.43 | 0.95  | 2.86  |
| LOGIC5             | 1.43 | 1.19  | 7.15  |

With the knowledge of fn an effective GLEDF  $\eta_{eff}$  can be calculated.

$$\eta_{eff} = \frac{\eta}{fn} \tag{8}$$

With Eq. (8) the factor  $\eta$ , obtained from the estimation of a single device, is adapted to the used logic family and the implemented logic function.

The arrangement of the devices in the logic function influences the energy dissipation induced by gate leakage. For PFAL, an additional effect reduces the gate leakage caused by the input transistors. If the gate of a PFAL inverter input transistor is at $V_{DD}$, the phase is connected to drain and source of the input device, so that the voltages $V_{GS}$ and $V_{GD}$ decrease as the phase rises. Hence, the gate leakage is reduced with the rising voltage of the phase. As the DQF contains no information about the arrangement of the devices inside the logic function blocks, $\eta_{eff}$ is only a rough estimate.

The effective GLEDF  $\eta_{eff}$  can be calculated with the factor  $\eta$ . For the supply voltage of 0.8 V,  $\eta$  is 2.86 (see Fig. 5). With that,  $\eta_{eff}$  @0.8 V can be calculated using the information on fn printed in Table 1, leading to Table 2.
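The calculation behind Table 2 is a single division per entry. A minimal sketch, using the fn values of Table 1 and $\eta = 2.86$ at 0.8 V; the dictionary layout and the name `effective_gledf` are only illustrative.

```python
ETA_AT_0V8 = 2.86   # single-device GLEDF at V_DD = 0.8 V, read from Fig. 5

# Device quantity factors fn taken from Table 1
FN = {
    "ECRL":  {"INV": 2.0, "NAND": 2.0, "LOGIC5": 2.0},
    "PFAL":  {"INV": 4.0, "NAND": 3.0, "LOGIC5": 2.4},
    "IPFAL": {"INV": 2.0, "NAND": 1.0, "LOGIC5": 0.4},
}


def effective_gledf(eta, fn):
    """Eq. (8): scale the single-device factor by the device quantity factor."""
    return eta / fn


eta_eff = {
    family: {func: effective_gledf(ETA_AT_0V8, fn) for func, fn in row.items()}
    for family, row in FN.items()
}

for family, row in eta_eff.items():
    print(family, {func: round(value, 3) for func, value in row.items()})
```

Values below 1, as for the PFAL inverter and NAND, mean that the estimate predicts more gate leakage energy than in the static CMOS gate, because the additional n-channel devices outweigh the benefit of the power clock.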

#### 5 Simulation results

The simulation arrangement for the INV function is a chain of five gates (Fig. 7). The gates S1 and S2 provide a realistic input signal at the device under test S3, and the gates S4 and S5 provide a realistic load. The chain is terminated with capacitances of 0.1 fF. The transistor model is BSIM4 with parameters of an industrial 90 nm CMOS technology. The supply voltage is 0.8 V.

The simulation results are shown in Fig. 8. The effective GLEDF  $\eta_{eff}$  is plotted against the frequency. First we take a look at the influence of the logic functions on  $\eta_{eff}$  for each adiabatic family.

ECRL has approximately the same value for all logic functions. For the INV function, ECRL reaches a factor $\eta_{eff}=1.5$. For the NAND function, the simulated value is worse than the prediction. This is because parallel-connected devices cause more gate leakage than series-connected devices. The ECRL and the CMOS NAND both contain a series connection of two n-channel devices, but the ECRL NAND additionally includes a parallel connection of n-channel devices in the $\overline{F}$ block. The different gate leakage of series- and parallel-connected devices is not considered in the estimation, which leads to a deviation from the predicted value.

For PFAL the factor $\eta_{eff}$ increases for functions with more inputs, as the overhead due to the two additional n-MOSFETs inside the latch becomes less important. The advantageous location of the logic blocks in PFAL results in a higher $\eta_{eff}$ for LOGIC5 in PFAL ($\eta_{eff} \approx 1.4$) than in ECRL ($\eta_{eff} \approx 1.3$). PFAL reaches its maximum $\eta_{eff}$ for functions with higher input counts.

**Fig. 7.** The simulation arrangement for the INV is a chain of five gates. For the functions with higher input counts, each input uses a chain of two inverters to condition the signal.



**Fig. 8.** Energy dissipation factor  $\eta_{eff}$  for the investigated logic functions.

IPFAL follows the trend in the prediction, but the estimated factor  $\eta_{eff}$ =7.15 for LOGIC5 is not reached. Actually the value for the IPFAL LOGIC5 is close to 4.
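These factors can also be read as relative savings. Reading $\eta_{eff}$ as the energy ratio of Eq. (4) for a complete gate, the adiabatic gate dissipates $E_{gl,CMOS}/\eta_{eff}$ through gate leakage, so that

$$\frac{E_{gl,CMOS} - E_{gl,AL}}{E_{gl,CMOS}} = 1 - \frac{1}{\eta_{eff}}$$

For example, $\eta_{eff} \approx 1.4$ corresponds to a saving of about 29 %, and $\eta_{eff} \approx 4$ to a saving of 75 %.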

#### 6 Conclusion

In this paper we have shown that the power clock used in the investigated adiabatic logic families reduces gate leakage currents implicitly. The simulation results show that ECRL and PFAL save up to 30% and IPFAL up to 75% of the energy dissipated through gate leakage, compared to static CMOS. The effective gate leakage energy dissipation factor stays nearly constant for ECRL, independent of the size of the function blocks. PFAL and IPFAL save most for gates with large logic blocks. IPFAL reaches the highest effective GLEDF $\eta_{eff}$, since an IPFAL gate is built mainly of p-MOSFETs. An estimation method based on the simulation of a single transistor has been presented and adapted to the investigated logic families and the implemented functions.

The power clock implements a power-down at gate level for a quarter of the clock period. Hence, all leakage currents are suppressed by the adiabatic power clock in the **W** state. Therefore, subthreshold leakage is also reduced by the clocked power supply in adiabatic logic circuits.

Acknowledgement. This work is supported by the German Research Foundation (DFG) under the grant SCHM 1478/1-3.

#### References

- Blotti, A., Di Pascoli, S., and Saletti, R.: Simple model for positive-feedback adiabatic logic power consumption estimation, Electronics Letters, Vol. 36, No. 2, 2000.
- Drazdziulis, M. and Larsson-Edefors, P.: A Gate Leakage Reduction Strategy for Future CMOS Circuits, ESSCIRC European Solid State Circuit Conference, 2003.
- Fischer, J., Amirante, E., Bargagli-Stoffi, A., and Schmitt-Landsiedel, D.: Improving the Positive Feedback Adiabatic Logic Family, Kleinheubacher Berichte, 2003.
- Hamzaoglu, F. and Stan, M. R.: Circuit-Level Techniques to Control Gate Leakage for sub-100 nm CMOS, ISLPED, 2002.
- Henzler, St., Berthold, J., Georgakos, G., and Schmitt-Landsiedel, D.: Single Supply Voltage High-Speed Semi-Dynamic Level-Converting Flip-Flop With Low Power And Area Consumption, PATMOS International Workshop on Power and Timing Modeling, Optimization and Simulation, 392–401, 2004.
- Henzler, St., Nirschl, Th., Skiathitis, S., Berthold, J., Fischer, J., Teichmann, P., Bauer, F., Georgakos, G., Schmitt-Landsiedel, D.: Sleep Transistor Circuits for Fine-Grained Power Switch-Off with Short Power-Down Times, accepted for publication at ISSCC, 2005.
- Moon, Y. and Jeong, D.-K.: An Efficient Charge Recovery Logic Circuit, IEEE Journal of Solid-State Circuits, Vol. 31, No. 4, 514–522, 1996.
- Narendra, S., Borkar, Sh., De, V., Antoniadis, D., and Chandrakasan, A.: Scaling of Stack Effect and its Application for Leakage Reduction, ISLPED, 2001.
- Roy, K., Mukhopadhyay, S., and Mahmoodi-Meimand, H.: Leakage Current Mechanisms and Leakage Reduction Techniques in Deep-Submicrometer CMOS Circuits, Proc. of the IEEE, Vol. 91, No. 2, 305–327, 2003.
- Schuegraf, K. F. and Hu, Ch.: Hole Injection SiO<sub>2</sub> Breakdown Model for Very Low Voltage Lifetime Extrapolation, IEEE Transactions on Electron Devices, Vol. 41, No. 5, 761–767, 1994.