The rising demand for portable systems is increasing the importance of low power as a design consideration. In this context, leakage power is increasing much faster than dynamic power at smaller dimensions. Peak values of supply current are related to the noise injected into the substrate and/or propagated through the supply network, limiting the performance of the sensitive analog and RF portions of mixed-signal circuits. This paper analyses how these three aspects (dynamic power, leakage power and peak power) can be considered together to optimize the sizing and design of basic cells with reduced performance degradation. The benefits of the proposed sizing technique are validated through simulation results on 130 nm NAND, NOR and inverter cells.
Introduction
Power consumption and power-related issues have become a first-order concern for most designs because of the rising demand for portable systems [1] [2] [3]. Many techniques for low-power design of CMOS VLSI circuits, targeting both the dynamic and leakage components of power dissipation, have recently been presented [1] [2] [3]. The primary method used to date for reducing power has been supply voltage (V_DD) reduction, although this technique begins to lose its effectiveness as voltages drop into the sub-one-volt range, where further reductions in the supply voltage begin to create more problems than they solve [1] [2] [3]. For this reason, it is important to optimize cell geometries for a specific V_DD.
A well-known expression [1] for the power consumed in a CMOS circuit is shown in eq. (1).
(1)
Eq. (1) includes dynamic, short-circuit and leakage power. Likewise, power is related to the supply current waveform: the average power is set by the (dis)charge (dynamic) current plus the short-circuit current, together with the leakage current. Besides this, the maximum value of supply current for a given pattern, known as peak current, has traditionally been related, indirectly, to switching or di/dt noise [4, 5], which limits the performance of the sensitive analog and RF portions of mixed-signal circuits.
This work has been sponsored by the Spanish MEC TEC2007-65105 TICOCO and the Junta de Andalucía TIC2006-635 Projects.
This paper proposes a simulation-based optimization procedure that yields geometries for basic cells with reduced peak current and reduced dynamic and leakage power consumption.
The organization of the paper is as follows: Section 2 analyzes the supply current related issues; Section 3 presents the optimization methodology; Section 4 includes demonstrative simulation results. Finally, the main conclusions are presented.
Supply current related issues
Much work has been done to establish precise models for power consumption and its different components [1] [2] [3] [7]. In most cases the resulting models are empirical rather than analytical, because of the multifactorial dependencies involved. We are interested in analysing and exploiting the dependencies on V_DD and transistor sizes. A brief discussion follows.
Peak of supply current
The maximum value of supply current (I_peak) depends on several factors, including output load, input-output coupling capacitance and input slope, which has led to several empirical models for I_peak [6, 7].
Geometry optimization in basic CMOS cells for improved power, leakage, and noise performances
Consider a CMOS inverter under the assumption that the gate has been sized such that the pull-up and pull-down network transconductances are roughly equal. The peak current in the load device will then occur at a gate voltage of approximately V_DD/2 for usual input transitions. At this time, the load transistor experiences a small drain-to-source voltage, caused by the active transistor, which begins switching the capacitive load once the input voltage exceeds its threshold voltage. In nanometric devices, the saturation voltage is typically much smaller than the expected long-channel value of V_GS - V_TH. When V_GS = V_DD/2, we expect that a drain-to-source voltage in the range of (V_GS - V_TH)/2 ≈ (V_DD/2 - V_DD/4)/2 = V_DD/8 will be sufficient to saturate the load device. This approximation allows the peak current to be estimated at V_GS = V_DD/2. Based on the alpha-power law model [8] for I_DS, the expression in (2) is obtained for an NMOS transistor, where I_DS,Sat is calculated from the empirical formulation for deep-submicron devices [9] given in (3).
Substituting the nominal approximate values V_GS = V_DD/2, α = 1.3, V_DD = 1.2 V and V_TH = 0.265 V yields the expression in (4) for the peak supply current: roughly 24% of the device's saturation current, with a linear dependence on W/L and a non-linear dependence on the supply voltage V_DD.
(4)
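As a rough cross-check of the order of magnitude quoted above, the following sketch evaluates the ratio between the drain current at V_GS = V_DD/2 and at V_GS = V_DD. It assumes the textbook saturation form of the alpha-power law, I ∝ (V_GS - V_TH)^α, which may differ in detail from eqs. (2) and (3); the function name `alpha_power_ratio` is illustrative only.

```python
# Sketch under an assumed model: textbook alpha-power saturation law,
# I ~ (V_GS - V_TH)**alpha. The paper's eqs. (2)-(3) may differ in detail.

def alpha_power_ratio(v_gs, v_dd, v_th, alpha):
    """Return I(v_gs) / I(v_dd) for a device assumed to be in saturation."""
    return ((v_gs - v_th) / (v_dd - v_th)) ** alpha

# Nominal values quoted in the text
V_DD, V_TH, ALPHA = 1.2, 0.265, 1.3

ratio = alpha_power_ratio(V_DD / 2, V_DD, V_TH, ALPHA)
print(f"I(V_DD/2) / I(V_DD) = {ratio:.3f}")
```

Under this assumed form the ratio comes out at about a quarter of the full-drive current, which is the same range as the roughly 24% stated in the text; the residual difference would be expected from the empirical formulation used in (3).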
It should be noted that this value can fluctuate greatly depending on the output load of the gate and the input slope. For instance, a very small load at the output of the gate means that the output voltage can swing more quickly, so that the load transistor will be even deeper in the saturation regime, leading to a larger peak current. On the other hand, with a large load and a quick input transition, the output will not have had time to switch significantly, so the load device remains in the linear region of operation. However, when the load is moderate, equations (2) and (3) yield adequate representations of the load device current. It was noted in [8] that short-channel devices exhibit more short-circuit current than long-channel MOSFETs. This is because the device saturates at a smaller drain-to-source voltage, allowing a larger current to flow in the time interval corresponding to the transition. In this analysis, the effects of the resistive-capacitive-inductive supply distribution network have been neglected for simplicity.
Leakage Power
In many new high performance designs, the leakage component of power consumption is higher than the switching component (up to 70% or even higher percentage) [1] [2] [3] . This percentage will increase with technology scaling unless effective techniques are introduced to bring leakage under control.
There are four main sources of leakage current in a MOS transistor:
1. Reverse-biased junction leakage current (IREV)
2. Gate induced drain leakage (IGIDL)
3. Gate direct-tunnelling leakage (IG)
4. Subthreshold leakage (ISUB)
For current technologies, ISUB is the dominant component among the four components of I_leakage [1].
A usual expression for ISUB for an NMOS transistor is presented in eq. (5), where Vt is the thermal voltage, n is an experimentally fitted parameter, and η accounts for the DIBL effect, which is neglected here for simplicity (η = 0) [1].
As follows from (5), leakage depends linearly on W/L, and exponentially on (V_GS - V_TH) and -V_DS.
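The dependencies just listed can be illustrated numerically. The sketch below uses a generic textbook form of the subthreshold current, I = I0·(W/L)·exp((V_GS - V_TH + η·V_DS)/(n·Vt))·(1 - exp(-V_DS/Vt)), which is assumed here and is not necessarily identical to eq. (5). The prefactor `i0` and the slope factor `n` are placeholder values, not technology data.

```python
import math

def subthreshold_current(w_over_l, v_gs, v_ds, v_th,
                         i0=1e-7, n=1.5, v_t=0.0259, eta=0.0):
    """Generic subthreshold model (assumed form, placeholder parameters).

    i0  : hypothetical technology prefactor in amperes
    n   : hypothetical subthreshold slope factor
    v_t : thermal voltage near room temperature
    eta : DIBL coefficient, set to 0 as in the text
    """
    gate_term = math.exp((v_gs - v_th + eta * v_ds) / (n * v_t))
    drain_term = 1.0 - math.exp(-v_ds / v_t)
    return i0 * w_over_l * gate_term * drain_term

# Off-state NMOS (V_GS = 0) at nominal supply, W/L = 0.15/0.12
base = subthreshold_current(1.25, 0.0, 1.2, 0.265)
doubled = subthreshold_current(2.50, 0.0, 1.2, 0.265)
print(f"leakage ratio when W/L doubles: {doubled / base:.2f}")
```

Doubling W/L doubles the current in this model, while shifting (V_GS - V_TH) by n·Vt·ln(10) changes it by one decade, which is the linear and exponential behaviour described above.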
Dynamic (average) Power
Dynamic power consumption in CMOS circuits consists of short-circuit dissipation and the switching power consumed while charging and discharging load capacitances, according to eq. (1). The dynamic power is associated with the switching of logic states that is central to performing logic operations. (Dis)charge power is proportional to C·V_DD^2·f, where C is the output capacitance, V_DD is the supply voltage, and f is the clock frequency. This power dissipation is directly proportional to the computation rate, and so can be adjusted to meet application power requirements by adjusting the computation rate. It can also be adjusted, to a more limited extent, by adjusting the supply voltage. The dependence of (dis)charge power on transistor size arises through the parasitic capacitance C associated with the gate, which is proportional to W and L (excluding wiring capacitance). The short-circuit power depends strongly on the time during which both the pull-up and pull-down networks are simultaneously in saturation (set by the input slope), and is proportional to W/L and (V_GS - V_TH)^α, as stated in eq. (3).
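The quadratic dependence of (dis)charge power on supply voltage can be sketched as follows. The activity factor is taken as 1 and only a 6 fF wiring load at 100 MHz is considered (the values used later in the simulation set-up); gate and junction capacitances are ignored, so the absolute numbers are illustrative only.

```python
def switching_power(c_load, v_dd, freq):
    """(Dis)charge power C * V_DD**2 * f, activity factor assumed to be 1."""
    return c_load * v_dd ** 2 * freq

C_LOAD = 6e-15   # 6 fF wiring load (other capacitances ignored in this sketch)
FREQ = 100e6     # 100 MHz input pattern

for v_dd in (1.2, 0.6):
    p_uw = switching_power(C_LOAD, v_dd, FREQ) * 1e6
    print(f"V_DD = {v_dd} V -> {p_uw:.3f} uW")
```

Halving V_DD at fixed C and f reduces this component to one quarter of its nominal value.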
Optimization technique
It is clear from the previous section that, to a first approximation, I_peak, I_avg and I_leakage depend linearly on W/L. Likewise, there is a non-linear dependence on V_DD: quadratic for the dynamic power associated with the (dis)charge current, but polynomial for short-circuit power, peak power and leakage.
Additionally, the propagation delay associated with a simple CMOS cell can be evaluated [8] as in eq. (6), which shows an inverse linear dependence on W/L, a linear dependence on the load capacitance C_L and a polynomial dependence on V_DD. Parameter α is 1.3 for the selected technology.
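To give a feel for the delay penalty of supply scaling at fixed geometry, the sketch below assumes the usual alpha-power delay proportionality, t_p ∝ C_L·V_DD / ((W/L)·(V_DD - V_TH)^α), with C_L held fixed. This form is an assumption consistent with the dependencies described for eq. (6), not a transcription of it, and the swept supply factors are illustrative.

```python
def relative_delay(f_vdd, f_w=1.0, v_dd_nom=1.2, v_th=0.265, alpha=1.3):
    """Delay relative to nominal, assuming
    t_p ~ C_L * V_DD / (W * (V_DD - V_TH)**alpha) with fixed C_L.

    f_vdd : supply scale factor (V_DD = f_vdd * v_dd_nom)
    f_w   : width scale factor  (W    = f_w * W_nom)
    """
    v_dd = f_vdd * v_dd_nom
    drive_nom = (v_dd_nom - v_th) ** alpha
    drive = f_w * (v_dd - v_th) ** alpha
    return (v_dd / v_dd_nom) * drive_nom / drive

for f_vdd in (1.0, 0.9, 0.8, 0.7, 0.6, 0.5):
    # With C_L fixed, the width factor that restores the nominal delay
    # equals the delay penalty observed at f_w = 1.
    print(f"fvdd = {f_vdd:.1f}  tp/tp_nom = {relative_delay(f_vdd):.2f}")
```

In this simplified model the width factor needed to recover the nominal delay is simply the delay ratio at unit width; in a real cell the self-loading capacitance grows with W, so the required factor would be larger.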
The main idea is to exploit these dependencies to select suitable geometries and supply voltage for basic cells. A reduction of V_DD immediately reduces the different components of power, but increases the delay. However, the increase in delay can be compensated if the width of the transistors is increased accordingly. This increase in width also produces a quasi-linear increase in the power components, so a trade-off between area, delay and power components (leakage, peak and average) can be found.
The proposal is to increase the transistor width by a correction scale factor fw, obtained from a parametric analysis, as the supply voltage is reduced by a factor fvdd, so that the propagation delay stays almost constant while the overhead in the power parameters is kept small.
To do this, the iterative scheme in Fig. 1 is considered. The starting point of the procedure is the selection of the cells and the transistor dimensions, nominally minimum width and length. The gate is simulated and characterized for the selected technology under the nominal supply voltage (fvdd = 1). The propagation delay, measured as the average of the high-to-low and low-to-high delays (tp = (tphl + tplh)/2), the leakage current (I_leakage), and the peak (I_peak) and average (I_avg) of the supply current are measured by SPECTRE simulation. Once the gate is characterized, the fvdd parameter is reduced and a parametric analysis is performed to obtain a value of fw, scaling the width of the gate transistors to W·fw, such that the propagation delay for this choice of fw is approximately the same (within, e.g., a 0.1% tolerance) as that obtained for fvdd = 1. The length of the transistors is kept constant. The process is repeated with different values of fvdd until the supply voltage is reduced to half of its initial value, keeping the transistors out of the subthreshold region. In our case we have considered four different values of fvdd and obtained the corresponding fw values, saving power and reducing peaks in supply current, as will be shown in the next section.
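The iterative scheme can be sketched in a few lines. In the sketch below, `simulated_delay` is a hypothetical analytical stand-in for the SPECTRE characterization step (an alpha-power delay with a fixed load plus a self-loading term that grows with fw), and the width factor is found by bisection rather than by a parametric sweep. The capacitance weights, the search bound `f_w_max` and the four supply values are illustrative assumptions, not data from the experiments.

```python
def simulated_delay(f_vdd, f_w, v_dd_nom=1.2, v_th=0.265, alpha=1.3,
                    c_fixed=6.0, c_self=1.0):
    """Hypothetical stand-in for a circuit simulation (arbitrary units).

    Delay ~ C * V_DD / (f_w * (V_DD - V_TH)**alpha), where C is a fixed
    load plus a self-loading part proportional to the width factor.
    """
    v_dd = f_vdd * v_dd_nom
    c_load = c_fixed + c_self * f_w
    return c_load * v_dd / (f_w * (v_dd - v_th) ** alpha)

def find_width_factor(f_vdd, target, tol=1e-3, f_w_max=64.0, max_iter=100):
    """Bisect on f_w until the delay is within tol (relative) of target."""
    lo, hi = 1.0, f_w_max
    if simulated_delay(f_vdd, hi) > target:
        raise ValueError("target delay not reachable within f_w_max")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        t_p = simulated_delay(f_vdd, mid)
        if abs(t_p - target) <= tol * target:
            return mid
        if t_p > target:
            lo = mid   # still slower than nominal: widen further
        else:
            hi = mid   # faster than nominal: narrow
    raise RuntimeError("bisection did not converge")

# Nominal characterization (fvdd = 1, fw = 1) sets the delay target.
target = simulated_delay(1.0, 1.0)

# Illustrative supply factors corresponding to 1.2, 1.0, 0.8 and 0.6 V.
for f_vdd in (1.0, 1.0 / 1.2, 0.8 / 1.2, 0.5):
    f_w = find_width_factor(f_vdd, target)
    print(f"fvdd = {f_vdd:.2f}  fw = {f_w:.3f}")
```

Bisection is adequate here because the stand-in delay decreases monotonically with fw; with a real simulator in the loop, each evaluation of `simulated_delay` would be replaced by a transient analysis of the cell, and leakage, peak and average current would be recorded at the fw that meets the tolerance.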
Simulation results
To perform the SPECTRE simulations needed to run the procedure explained in the previous section, the simulation set-up of Fig. 2 has been considered. The basic cells selected for evaluating the procedure are the basic CMOS inverter, the 2-input NAND and the 2-input NOR gates, as shown in Fig. 2. The output inverter is used to simulate realistic load conditions for the cell under study, while the capacitive load (6 fF) matches the wiring load. The input patterns are square 100 MHz waveforms with 25 ps transition times. The selected technology was UMC 130 nm, with a nominal supply voltage of V_DD = 1.2 V. Dimensions are the minimum for the technology: W_n/L_n = 0.15/0.12 µm (N1 and N3 in Fig. 2) and W_p/L_p = 0.3/0.12 µm (P1 in Fig. 2), being scaled when the transistors are serially connected: W_n/L_n = 0.3/0.12 µm (N2 in Fig. 2) and W_p/L_p = 0.6/0.12 µm (P3 in Fig. 2).
The results for the basic gates considering only the variation in V_DD and excluding the optimization procedure (fw = 1) are shown in Table I. From the results obtained, it is clear that, as V_DD decreases, tp increases roughly as in eq. (6); I_peak decreases almost linearly, but quantitatively different as expected from eq. (4), because the maximum value of I_peak occurs when ; I_avg decreases linearly (average power quadratically), as expected; and I_leakage decreases exponentially. The results are very similar for the three gates considered. Obviously, these results are due only to variations in V_DD, because the geometries remain unchanged (fw = 1).
The results obtained after applying the optimization procedure are shown in Table II. The values obtained for fw are shown in the third column, and indicate the area overhead, because this factor multiplies the width of all the transistors in the cell. The table shows that the value of fw grows exponentially as V_DD is reduced, in order to keep the propagation delay almost constant (eq. (6)). The behavior of the supply current parameters is now quite different, because of the simultaneous dependence on W (now W·fw) and V_DD. This dependence produces an increase in leakage, peak and average current with respect to the equivalent values in Table I, because of the increase in W/L, but this increase is small compared with the benefit in operating speed, since the delay remains constant. This is the main consequence of the optimization process. The trend for the average current is almost linear, except at low V_DD (0.6 V), where the average current is higher for the NAND and NOR gates, showing a local minimum near 0.8 V. The peak current behaves similarly: for the NAND and NOR gates its maximum value occurs at the lowest V_DD.
These results are depicted graphically in Fig. 3 for the inverter, Fig. 4 for the 2-input NAND and Fig. 5 for the 2-input NOR, before and after the optimization process. As the most interesting result, the optimization process ensures a constant delay with only a slight increase in supply current (peak, leakage and average) relative to the unsized cells at the same V_DD, yielding an effective power saving.
Conclusions
This paper has presented a simulation-based optimization procedure for selecting the geometries of basic cells, providing reduced values of peak current and of both dynamic and leakage power consumption. The procedure has been applied to three basic cells in a 130 nm technology, increasing the size of their transistors while keeping the propagation delay constant and saving power (dynamic and leakage), with an additional relative reduction of the peaks in supply current. These results have been obtained for the different cells, with the NAND and NOR gates showing optimum behavior at about 2/3 of the nominal V_DD. Future work will be devoted to extending the procedure to more complex cells and to introducing more design parameters, such as output load, into the optimization process.
