Abstract. The logical effort method has proved very convenient for fast estimation and optimization of single paths. However, it requires a calibration of all the gates of the library and appears to be sub-optimal for a complex implementation. This is due to the inability of this model to capture I/O coupling and input ramp effects. In this paper, we introduce a physically based extension of the logical effort model that considers I/O coupling capacitance and input ramp effects. This extension of the logical effort model is deduced from an analysis of the supply gate switching process. Validation of this model is performed on 0.18µm and 0.13µm STM technologies. An application is given to the definition of a compact representation of CMOS library timing performance.
Introduction
In their seminal book [1], I. E. Sutherland et al. introduced a simple and practical delay model. They developed a logical effort method that gives designers good insight into the optimization mechanisms, using easy hand calculations. However, this model suffers from limited accuracy when strong timing constraints have to be imposed on combinatorial paths. An empirical extension of the logical effort model has been proposed in [2] to account for the input ramp effect. However, the I/O coupling capacitance effect, first introduced by Jeppson [3], remains neglected in this method.
Moreover, with the dramatic increase of the leakage current, designers may be willing to design with multiple threshold voltage CMOS (MTCMOS) and several supply voltage values. As a consequence, it is necessary to develop a delay model allowing designers to deal with multiple V_TH and V_DD values, with a reduced calibration time penalty.
In this paper we introduce a physical extension of the logical effort model. Both the I/O coupling and input ramp effects are considered, together with the timing performance sensitivities to the V_TH and V_DD values.
The rest of the paper is organized as follows. In section 2, starting from Sakurai's alpha power law [4], we first physically justify and then extend the logical effort model. Section 3 is devoted to the validation of the extended model and to its application to the definition of a new timing performance representation. Section 4 briefly discusses how to increase accuracy when choosing the sampling points to be reported in traditional look up tables, before concluding in section 5.
Timing Performance Model
In the method of logical effort [1] the delay of a gate is defined by

$$t = \tau \cdot (p + g \cdot h) \qquad (1)$$

where τ is a constant that characterizes the process to be used and p is the gate parasitic delay, which mainly depends on the source/drain diffusion capacitance of the transistors. g, the logical effort of the gate, depends on the topology of the gate. h, the electrical effort or gain, is the ratio of the gate output loading capacitance to its input capacitance. It characterizes the drive capability of the gate.
Using the inverter as a reference, these parameters can be determined from electrical simulation of each library cell. However, although this simple expression is very useful for optimization purposes, it gives no consideration to the input slew effect on the delay or to the input-to-output coupling [3]. As a result, the parameter values cannot be assumed constant over the design space.
As can be shown from [5], eq.1 characterizes the gate output transition time. However, timing performance must be specified in terms of cell input (output) transition time and input-to-output propagation delay. A realistic delay model must be input slope dependent and must distinguish between falling and rising signals. In order to get a more complete expression for the gate timing performance, let us develop eq.1, starting from a physical analysis of the switching process of an inverter. We first consider a model for the transition time.
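As an illustration of the basic logical effort expression described above, delay = τ·(p + g·h), the following minimal Python sketch evaluates it for two cells. The function name and the numerical value of τ are hypothetical; the inverter values (g = 1, p = 1) and two-input NAND values (g = 4/3, p = 2) are the usual textbook figures from the logical effort method [1], not values calibrated on the processes studied here.

```python
def logical_effort_delay(tau, p, g, c_load, c_in):
    """Gate delay from the basic logical effort expression tau * (p + g * h)."""
    if c_in <= 0:
        raise ValueError("input capacitance must be positive")
    h = c_load / c_in  # electrical effort (gain): output load over input capacitance
    return tau * (p + g * h)


# Hypothetical unit delay (ps) and capacitances (fF), for illustration only.
TAU_PS = 5.0
d_inv = logical_effort_delay(TAU_PS, p=1.0, g=1.0, c_load=12.0, c_in=3.0)
d_nand2 = logical_effort_delay(TAU_PS, p=2.0, g=4.0 / 3.0, c_load=12.0, c_in=4.0)
print(d_inv, d_nand2)
```

Because p and g are fixed numbers here, the sketch also makes the limitation discussed in the text visible: nothing in this expression depends on the input slew or on the input-to-output coupling.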
Inverter Transition Time Modeling
Following [5], the elementary switching process of a CMOS structure, and thus of an inverter, can be considered as an exchange of charge between the structure and its output loading capacitance. The output transition time (defining the input transition time of the following cell) can then be directly obtained from the modeling of the charging (discharging) current that flows during the switching process of the structure and from the amount of charge (C_L-TOT·V_DD) to be exchanged with the output node as
V_DD is the supply voltage value and C_L-TOT the total output load, which is the sum of two contributions: C_L-TOT = C_INT + C_EXT. C_INT is the internal load, proportional to the gate input capacitance; it consists of the coupling capacitance, C_M, between the input and output nodes [3] and of the transistor diffusion parasitic capacitance, C_DIFF. Note that C_M can be evaluated as one half the input capacitance of the P(N) transistor for an input rising (falling) edge [6], or directly calibrated on Hspice simulation. C_EXT is the load of the output node, including the interconnect capacitive component.
In eq.2 the output voltage variation has been assumed linear and the driving element is considered as a constant current generator delivering the maximum current available in the structure. Thus the key point here is to determine a realistic value of this maximum current. Two controlling conditions must be considered.
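The charge-exchange picture described above can be sketched numerically. Under the stated assumptions (linear output voltage variation, constant current equal to the maximum available current), the transition time is taken here as the exchanged charge C_L-TOT·V_DD divided by that current. This form is inferred from the text rather than copied from eq.2, and all function names and numerical values below are hypothetical.

```python
def total_load(c_m, c_diff, c_ext):
    """Total output load: internal part (coupling + diffusion) plus external load."""
    c_int = c_m + c_diff
    return c_int + c_ext


def transition_time(c_l_tot, v_dd, i_max):
    """Time to exchange the charge c_l_tot * v_dd at a constant current i_max."""
    if i_max <= 0:
        raise ValueError("maximum current must be positive")
    return c_l_tot * v_dd / i_max


# Hypothetical values: capacitances in farads, supply in volts, current in amperes.
c_tot = total_load(c_m=1e-15, c_diff=2e-15, c_ext=7e-15)
t_out = transition_time(c_tot, v_dd=1.2, i_max=100e-6)
print(t_out)  # seconds
```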
Fast input control conditions
In the Fast input range, the input signal reaches its maximum value before the output begins to vary. In this case the switching current exhibits a constant and maximum value:
This is directly deduced from Sakurai's representation [4], α_N,P being the velocity saturation indexes of the N and P transistors and K_N,P an equivalent conduction coefficient to be calibrated on the process. From (2, 3) we obtain the expression of the transition time for a Fast input control condition as
τ and R, defined below, are respectively a unit delay characterizing the process and the dissymmetry between N and P transistors, while k is the transistor P/N width ratio. C_IN is the gate input capacitance.
Finally p_HL,LH and g_HL,LH are the parameters characterizing the topology of the inverter under consideration. They are defined by:
As clearly shown, eq.4-6 constitute an explicit representation of the logical effort model [1] for non-symmetrical inverters.
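The Fast input case can be illustrated with a short sketch. It assumes the common alpha-power-law form for the saturation current with full gate drive, I_MAX = K·W·(V_DD − V_T)^α; the exact grouping of K and the width W in eq.3 is an assumption here, as are the function names and every numerical value.

```python
def fast_max_current(k, width, v_dd, v_t, alpha):
    """Maximum switching current for a Fast input, assuming the alpha-power-law
    form K * W * (V_DD - V_T) ** alpha (the gate is driven at full V_DD)."""
    overdrive = v_dd - v_t
    if overdrive <= 0:
        raise ValueError("supply voltage must exceed the threshold voltage")
    return k * width * overdrive ** alpha


def fast_transition_time(c_l_tot, v_dd, i_max):
    """Charge c_l_tot * v_dd delivered at the constant maximum current."""
    return c_l_tot * v_dd / i_max


# Hypothetical device: K in A/(um * V**alpha), width in um, voltages in volts.
i_sat = fast_max_current(k=1e-4, width=2.0, v_dd=1.2, v_t=0.4, alpha=1.0)
t_fast = fast_transition_time(c_l_tot=10e-15, v_dd=1.2, i_max=i_sat)
print(i_sat, t_fast)
```

In this sketch the transition time is independent of the input transition time, which is the defining property of the Fast input range.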
Slow input control conditions
In the slow input range, the transistor is still in saturation when its current reaches the maximum value, but its gate-source voltage is smaller. This results in a smaller value of the maximum switching current. Let us evaluate this value. From the alpha power law model we can write
where τ_IN is the transition time of the signal applied to the gate input. This leads to
Defining ∆t as the time spent by the transistor to deliver its maximum current, (8) becomes:
Then under the approximation that the current variation is symmetric with respect to its maximum value, we can evaluate the total charge removed at the output node as:
Combining eq.9 and 10, we obtain the value of the maximum current for a slow input ramp applied to the input as
Finally, by combining eq.2 and 11, the expression of the transition time for slow input control condition is obtained as 
Defining the supply voltage effort as
results in a logical-effort-like expression of the output transition time for a slow input ramp condition
Note here that for advanced processes, in which the carrier speed saturation dominates (α_N,P = 1), this expression can be simplified
Unifying Fast and Slow domain representation
Let us now consider the case of a rising edge applied to a gate input. As can be deduced from (4) and (14), the logical effort delay model
can be used as a metric for both Fast and Slow input ramp domains. Indeed, considering the sensitivity of the different expressions to the input slope, we can express the normalized inverter output transition time as
where σ_HL,LH is equivalent to an input slew effort
Eq.17 is of great interest, since it clearly shows that the output transition time of any inverter of a given library can be represented by a single expression. As shown, the right-hand side of eq.17 is independent of the design parameters. As a result, the complete set of drive strengths of an inverter in a library can be characterized by a single one-line look up table. This is obtained by using a representation relative to the slew effort parameter σ. As a validation of these results we represent in Fig.1 the output transition time variation (with respect to σ_HL) of 7 inverters of a 0.13µm process. As expected, all the curves collapse onto a single one, representing the output transition time sensitivity to the slew effort σ_HL.
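The idea of a one-line look up table indexed by the slew effort σ can be sketched as follows. The table entries are placeholders chosen only to make the example run; they are not characterization data from this work. The interpolation routine and its names are hypothetical, and clamping outside the table range is a design choice of this sketch.

```python
from bisect import bisect_right


def interpolate(xs, ys, x):
    """Piecewise-linear interpolation in a one-line table (xs strictly increasing).
    Values outside the table range are clamped to the first or last entry."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("table needs at least two matching points")
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, x)  # xs[i - 1] <= x < xs[i]
    x0, x1 = xs[i - 1], xs[i]
    y0, y1 = ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# Placeholder table: slew effort values and normalized output transition times.
SIGMA = [0.0, 1.0, 2.0, 4.0]
NORM_T_OUT = [1.0, 1.0, 1.5, 2.5]

print(interpolate(SIGMA, NORM_T_OUT, 3.0))
```

Since the normalized curve is shared by all drive strengths, one such table would serve every inverter of the library, with the normalization of eq.17 applied afterwards to recover the actual transition time.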
Extension to Gates
Following [7], the extension to gates is obtained by reducing each gate to an equivalent inverter. For that we consider the worst-case situation. The current capability of the N (P) parallel array of m transistors is evaluated as the maximum current of an inverter with identically sized transistors. The array of N (P) series-connected transistors is modeled as a voltage controlled current generator with a current capability reduced by a factor DW_HL,LH. This reduction factor (DW) is defined as the ratio of the current available in an inverter to that of a series-connected array of transistors of identical size.
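The definition of the reduction factor as a ratio of currents can be written down directly. In the minimal sketch below, the two currents would come from simulation or from the model; the function names and numerical values are hypothetical.

```python
def reduction_factor(i_inverter, i_series):
    """DW: inverter current divided by the current of a series-connected array
    of identically sized transistors."""
    if i_series <= 0:
        raise ValueError("series-array current must be positive")
    return i_inverter / i_series


def series_array_current(i_inverter, dw):
    """Current capability of the equivalent inverter standing in for the series array."""
    if dw <= 0:
        raise ValueError("reduction factor must be positive")
    return i_inverter / dw


# Hypothetical currents in amperes.
dw = reduction_factor(i_inverter=200e-6, i_series=125e-6)
print(dw, series_array_current(200e-6, dw))
```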
DW corresponds to the explicit form of the logical effort [3]. Let us evaluate the expression of DW_HL,LH. In the Fast input range, the maximum current that an array of n series-connected transistors can provide is defined by:
where R_N,P is the resistance of the bottom transistors working in linear mode. Using a first order binomial decomposition of (20), the reduction factor for Fast input controlling conditions is obtained as
which is a generalization of the result introduced in [7] for a DSM process, with full carrier speed saturation (α=1), where the value of the reduction factor reduces to
Using this reduction factor of the maximum current (21) to evaluate the output transition time (2) 
As shown, p and g, the parasitic contribution to the gate delay and the logical effort, strongly depend on the topology of the considered cell through the parameters k and DW. Expression (23) constitutes an extension of the logical effort model considering the input ramp effect.
Gate Propagation Delay Model
A realistic delay model must be input slope dependent and must distinguish between falling and rising signals. As developed by Jeppson in [3], considering the input-to-output coupling effect, the input slope effect can be introduced in the propagation delay θ_HL,LH as
Normalizing (25) gives eq.26. In this equation α_HL,LH are the Meyer coefficients [6] for falling and rising edges, which can be calibrated on the process (the average value is α = 0.5). The parameter A is related to the drain diffusion capacitance (C_DIFF = A·C_IN).
For a typical cell, the second term of eq.26 can be neglected for values of the electrical effort h greater than 3 or, equivalently, for large values of the input slew effort σ_HL,LH.
Although this term becomes quickly negligible, we have to note here that it can have a significant effect when imposing a small value of the electrical effort. It directly shows the minimum limit of the electrical effort value to be reasonably imposed on a path. Indeed, for small values of h (<2), the first and third terms of (25) become smaller than the second one. Any further increase of the transistor width along the path is inefficient in improving the corresponding gate speed that is limited by the I/O coupling effect and the parasitic content of the gate.
Combining finally (25) and (23) gives a general expression of the normalized propagation delay of any combinational gate as:
This expression is of great interest. Apart from the last term, which is negligible in a typical design range (h between 3 and 6), eq.27 clearly indicates that the propagation delay of any gate of a library can be represented with only one equation.
Validation
In order to validate this model we first compare the performance of an inverter designed in a 0.18µm process, as calculated from eq.3, 11 and 17, to values obtained from Hspice simulations. Different supply voltage conditions have been considered. The value of the velocity saturation index has been obtained from direct calibration on the transistor current simulation. Fig.2-3 give some examples of the calculated and simulated evolutions of both the inverter maximum switching current and output transition time with respect to σ_HL. As shown, the accuracy of the model is satisfactory. Note that the underestimation of the switching current for high supply voltage is mainly due to the short circuit current. This has a minor impact on the evaluation of the output transition time, which is determined from the 40% and 60% points of the signal voltage swing.
Discussion
In submicron processes the transition time and the propagation delay exhibit a non-linear variation with respect to the controlling and loading conditions (Fig. 1,4-6). This non-linear range must be characterized with closely spaced simulation steps, because interpolating in this range may induce significant errors. The relative accuracy obtained with a tabular method is strongly dependent on the granularity of the table, which is not necessarily constant but must cover a significant part of the design range. For that, indications for defining the granularity and the coverage of the design space must be available. As illustrated in Fig. 1,4-6, the non-linearity corresponds to the Fast/Slow input ramp domain boundary. Thus the evaluation of this boundary is of great interest to determine the data points to be sampled and reported in the look up tables. The proposed model offers an easy way to determine this boundary. It is just necessary to find the controlling and loading conditions for which eq.
Then, to increase the accuracy of the tabular approach, it is necessary to fix the granularity of the table in such a way that the data points belonging to the diagonal of the table respect the condition defined by eq.28.
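One possible way to act on this observation is to place the table sampling points more densely around the Fast/Slow boundary than elsewhere. The sketch below is only an illustration of that idea: the boundary value is assumed to be already known (for instance from the boundary condition of the model), and the geometric spacing rule, the function name and the numbers are all hypothetical choices, not the procedure used in this work.

```python
def clustered_points(lo, hi, boundary, n_side, ratio=2.0):
    """Sampling points on [lo, hi] whose spacing shrinks geometrically toward
    `boundary`. Returns n_side points on each side plus the boundary itself."""
    if not lo < boundary < hi:
        raise ValueError("boundary must lie strictly inside (lo, hi)")
    if n_side < 1 or ratio <= 1.0:
        raise ValueError("need n_side >= 1 and ratio > 1")
    # Fractions of the distance to each end: smallest first, 1.0 last.
    fracs = [ratio ** -(n_side - 1 - k) for k in range(n_side)]
    below = [boundary - f * (boundary - lo) for f in reversed(fracs)]
    above = [boundary + f * (hi - boundary) for f in fracs]
    return below + [boundary] + above


# Hypothetical axis (e.g. input slew effort) with an assumed boundary at 4.0.
points = clustered_points(lo=0.0, hi=8.0, boundary=4.0, n_side=3)
print(points)
```

With these inputs the spacing is 1 next to the boundary and 2 at the ends of the range, so the non-linear region receives the finest steps while the table still spans the whole interval.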
Conclusion
We have introduced an explicit extension of the logical effort model in order to consider both the I/O coupling and the input ramp effects, and defined a simple but accurate representation of the timing performance of simple CMOS structures. Validation of the model has been done on 0.18µm and 0.13µm processes. An application has been given to the definition of a compact representation of CMOS library timing performance. As discussed, this representation should be of great interest in defining the granularity and the evolution of the look up tables used to represent the timing performance of a CMOS library.
