Abstract-Efficient physical timing models for complex CMOS AND-OR-Inverter (AOI) and OR-AND-Inverter (OAI) gates have been developed. In extensive comparisons with SPICE simulation results, the models show a maximum error of 30% for long-channel and small-geometry CMOS AOI/OAI gates over wide ranges of channel dimensions, capacitive loads, logic input patterns, circuit configurations, device parameter variations, and noncharacteristic input waveform excitations. The error can be further reduced to 16% for commonly used device dimensions. The timing models are applied to the autosizing of CMOS AOI/OAI gates, and the results show good accuracy and reasonable CPU time consumption.
Furthermore, all these models have reasonable accuracy and consume little computer memory and CPU time. They can also provide a deep insight into the speed nature of digital MOS IC's.
Although CMOS has become a dominant technology in digital VLSI/ULSI [23], only a few timing models have so far been proposed [5]-[7], [14]-[17], [19]-[22]. Among them, the step-response models [14] do not guarantee the required accuracy because they neglect the strong influence of input signals on delay times [16].
There are two approaches to cope with the input waveshape effects [17]. One is the table-driven technique, as in Crystal [5]. Crystal is a timing simulator in which transistor resistances can be adjusted according to the input waveforms and device operating regions to obtain a higher accuracy. This approach, however, has been shown to have some limitations [17], for example, the accuracy problem due to resistance extractions for gates with different beta ratios, sizes, etc., the problem due to table interpolations, and the difficulties in optimization. The other approach is entirely based on device equations. In this approach, resistance extractions are not required and the table interpolation problem is avoided. Moreover, this approach is quite suitable for optimization and autosizing. For efficient design automation and optimization, good analytical delay macromodels [6], [7], [16]-[22] entirely based on device equations are required [17].
Generally, accurate and efficient timing models entirely based on device equations are very useful in various CAD applications in VLSI, such as timing verification, optimization, logic simulation, and autosizing [17].
So far, timing models for inverters, NAND gates, and NOR gates [5]-[7], [14]-[16], [19]-[22] have been developed. However, timing models for AND-OR-Inverter (AOI) and OR-AND-Inverter (OAI) gates, which are commonly used in the design of CMOS digital IC's, have not yet been reported. Because of their complicated structures, the computer time consumed in simulating circuits that contain those gates is very long. Efficient timing models of those CMOS gates, therefore, are more urgently required than those of other simple combinational logic gates.
0278-0070/90/0900-1002$01.00 © 1990 IEEE
Fig. 1. A chain of identical CMOS 4-3-2-1-input AOI gates under the worst-case timing condition.
It is the aim of this paper to develop efficient physical timing models for CMOS AOI/OAI gates with wide ranges of channel dimensions, capacitive loads, circuit configurations, input excitations, and device parameters. Compared with SPICE [24] simulation results, the maximum error of the model calculation results is 30% within a large applicable range. Fine tuning within a special range can reduce the error to 16%. By using the developed models, the signal timing of the commonly used CMOS AOI/OAI gates with fewer than 4 PMOSFET's or 4 NMOSFET's in series can be quickly computed. Moreover, autosizing of CMOS AOI/OAI gates can be performed through the use of the timing models.
The model formulations are described in Section II. Comparisons between model calculations and SPICE simulations are given in Section III. An application example of the developed models in autosizing is given in Section IV. Finally, conclusions are drawn.
II. TIMING MODELS
As an illustrative example for model formulations, a 3.5-µm CMOS 4-3-2-1-input AOI gate under a worst-case timing condition is considered. The AOI gate circuit is shown in Fig. 1 where the input voltage Vi drives the MOSFET's Mp41 and Mn41. It is known that after the signal passes through several stages, the rising or the falling waveforms at the output node of a stage gradually become the same, being independent of input excitations. Such waveforms are named characteristic waveforms [16]. The typical characteristic waveforms obtained from the SPICE simulations for the falling input voltage Vi are plotted in Fig. 2. As the name of characteristic waveforms implies [16], the waveforms of the AOI gate in 1.5-µm CMOS technology are very similar to those of

During the rise time period TR, the operating region of each MOSFET can be determined from its drain-source voltage VDS and drain-source saturation voltage VDSAT generated from its corresponding gate-source voltage VGS and drain-source voltage VDS. From the simulated curves in Fig. 2, it is found that the MOSFET's in

Fig. 2. According to our observations, this intersection point is located outside the rise time region for typical AOI gates even when the device channel length is down to 1.5 µm. This means that during TR the drain-source voltage |VDD - Vo3| = |VDS| is always smaller than VDSATp in the PMOSFET Mp41. Thus it is operated in the linear region during TR, and in those complex AOI gates only the linear-region current is involved in deriving the formula of TR. This is true for AOI/OAI gates with long-channel and small-geometry MOSFET's. Under similar considerations, the PMOSFET Mp41 of the load stage is mostly in the linear region during TR whereas the NMOSFET Mn41 of the load stage is mostly in the saturation region. Both devices in the load stage are treated as a capacitive load and their capacitances are calculated according to their operating regions.
From the determined operating region and the large-signal equivalent circuit of each MOSFET, the overall large-signal equivalent circuit of the 4-3-2-1-input AOI gate during the characteristic rise time TR can be built as shown in Fig. 3 where the load stage is effectively represented by a capacitive load to the driver stage. Both the capacitive load and the fixed capacitor CL are included in the capacitance C12R.
The expressions of all the capacitances in the equivalent circuit are listed in Table I where the device

The linearization point in this case is optimally determined at the center point with t = tc where Vo(tc) = VDD/2. All the other node voltages at this point can be found from the characteristic waveforms. After the above linearization procedure, all the capacitances listed in Table I become linear and have fixed values. Since the p-n junction reverse-bias capacitance is not a strong function (e.g., an exponential function) of voltage, the linearization leads to a limited error. The linear-region drain current used in the modified SPICE program is expressed in Table II. By applying the same technique [6], the voltage-dependent mobility and the nonlinear terms in the current expression can be linearized. The resultant expressions of the linearized drain currents in the equivalent circuit of Fig. 3 are listed in Table III. Because the drain currents in the equivalent circuit are those of the linear region, the linearization does not lead to an excessive error. With the linearized capacitances and drain currents, the equivalent circuit of Fig. 3 becomes a linear circuit.
Since the internal characteristic waveform in a digital circuit strongly depends upon the circuit structure rather than the input excitation, it can be well characterized by the poles and zeros of the corresponding equivalent circuit. Complicated circuits like CMOS AOI/OAI gates tend to have a dominant pole. Thus the signal timing can be analytically modeled from the dominant pole only. Dominant-pole calculation in the s domain, however, is too tedious for a complex logic gate. In this paper, the zero-value time constant method [26] is applied to simplify the dominant-pole calculation while obtaining a satisfactory result.
To compute the dominant pole of the equivalent circuit in Fig. 3, the input port is short circuited. Thus the output resistance looking into each capacitor with all others taken away can be found. The expressions of all the output resistances are given in Table IV. From these expressions, it can be seen that the effective resistance of a given MOSFET differs among the associated output resistance expressions, for example, R4R, R5R, and R12R.
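As a rough illustration of the zero-value time constant idea described above, the following Python sketch estimates a dominant pole as the negative reciprocal of the sum of the R-C products, one product per capacitor port. It is only a sketch under stated assumptions: the function name `dominant_pole` and all component values are hypothetical placeholders, and the actual resistance and capacitance expressions are those of Tables I and IV, which are not reproduced here.

```python
# Sketch of a zero-value time constant estimate of a dominant pole.
# All names and component values are illustrative assumptions.


def dominant_pole(resistances, capacitances):
    """Return the estimated dominant pole (rad/s) as -1 / sum(R_i * C_i).

    resistances[i] is the resistance seen by capacitor i with the input
    port shorted and every other capacitor taken away.
    """
    if len(resistances) != len(capacitances):
        raise ValueError("need one resistance per capacitor")
    tau = sum(r * c for r, c in zip(resistances, capacitances))
    if tau <= 0:
        raise ValueError("sum of time constants must be positive")
    return -1.0 / tau


if __name__ == "__main__":
    # Hypothetical values for three capacitor ports (ohms and farads).
    r = [2.0e3, 5.0e3, 8.0e3]
    c = [50e-15, 80e-15, 200e-15]
    print(f"estimated dominant pole: {dominant_pole(r, c):.3e} rad/s")
```

Because each port contributes an independent R-C product, the port with the largest product (often the output node with its load) tends to dominate the estimate.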
TABLE I THE EXPRESSIONS OF THE CAPACITANCES IN THE EQUIVALENT CIRCUIT USED FOR THE CHARACTERISTIC RISE TIME CALCULATION OF
(table body not recoverable; devices listed: p41, p31, p21, p1, n44, n43, n42, n33, n32, n22)
TABLE IV THE EXPRESSIONS OF THE OUTPUT RESISTANCES LOOKING INTO THE CAPACITOR PORTS IN THE EQUIVALENT CIRCUIT OF A CMOS 4-3-2-1-INPUT
(table body not recoverable)
Because the output voltage is assumed to have a single-pole response, the characteristic rise time TR can be written as
Similarly, the large-signal equivalent circuit for the characteristic fall time calculation can be obtained as shown in Fig. 4. The characteristic fall pole Pf and fall time TF can be written as
Both Pr and Pf are nonlinear equations of Pr/Pf. However, they can be easily solved by using numerical iterations.
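The numerical iteration mentioned above can be sketched as a simple fixed-point loop on the coupled pair of poles. This is a minimal sketch, not the procedure used in the paper: the functions `example_rise_pole` and `example_fall_pole` are hypothetical stand-ins for the pole expressions (which depend on the tables of linearized currents and resistances), and the fixed-point scheme itself is an assumed choice of iteration.

```python
# Fixed-point iteration for two poles that each depend on the ratio Pr/Pf.
# The two example functions are hypothetical placeholders.


def example_rise_pole(ratio):
    # Hypothetical dependence of the rise pole on Pr/Pf (rad/s).
    return -1.0e9 / (1.0 + 0.2 * ratio)


def example_fall_pole(ratio):
    # Hypothetical dependence of the fall pole on Pr/Pf (rad/s).
    return -2.0e9 / (1.0 + 0.1 / ratio)


def solve_poles(rise_pole_fn, fall_pole_fn, p_r0, p_f0, tol=1e-9, max_iter=200):
    """Iterate Pr = f(Pr/Pf), Pf = g(Pr/Pf) until both stop changing."""
    p_r, p_f = p_r0, p_f0
    for _ in range(max_iter):
        ratio = p_r / p_f
        new_r = rise_pole_fn(ratio)
        new_f = fall_pole_fn(ratio)
        converged = (abs(new_r - p_r) <= tol * abs(new_r)
                     and abs(new_f - p_f) <= tol * abs(new_f))
        p_r, p_f = new_r, new_f
        if converged:
            return p_r, p_f
    raise RuntimeError("pole iteration did not converge")


if __name__ == "__main__":
    p_r, p_f = solve_poles(example_rise_pole, example_fall_pole, -1.0e9, -1.0e9)
    print(f"Pr = {p_r:.4e} rad/s, Pf = {p_f:.4e} rad/s")
```

Convergence of such a loop depends on how strongly each pole varies with the ratio; a damped update or a Newton step would be a natural alternative if the plain iteration oscillates.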
The rise propagation delay TPLH as defined in Fig. 2 can be expressed as
(5)
For simplicity, empirical laws for the initial delay times tdr and tdf were found. As a result, the rise propagation delay TPLH and the pair delay Tp can be reformulated by the simple relations
T p
Note that the above equations are universal and can be used to calculate the delay times for various AOI gates with satisfactory accuracy, as will be verified in the following section. The delay equations (6) and (7) are universal only when TR and TF are calculated from the derived formulas.
For the 4-3-2-1-input AOI gate under a nonworst-case timing condition, the input voltage Vi drives an NMOSFET and a PMOSFET other than Mn41 and Mp41, respectively. The delay times can be similarly modeled but with different capacitances and currents.
In the PMOSFET part of the 4-3-2-1-input AOI gate, there are four different branches with 1, 2, 3, and 4 PMOSFET's in parallel, respectively. The four branches can be connected in series to form 24 different configurations. The delay times of these different AOI gates can be characterized by using the developed modeling technique. Besides the 4-3-2-1-input AOI gate with 24 different configurations, the modeling technique can also be applied to other AOI gates with fewer than 4 NMOSFET's or 4 PMOSFET's in series. The timing models for all these AOI gates have been built with the delay time equations (6) and (7).
A six-digit code is designed to represent various AOI gates for easy identification. Each of the first four most significant digits (MSD's) represents the number of parallel PMOSFET's in a branch. The branch farthest from the output node corresponds to the first MSD, and so on. The fifth digit denotes the position of the PMOSFET branch driven by the input voltage: the driven PMOSFET branch farthest from the output node is denoted as 1 and that nearest to the output node as 4. The sixth digit denotes the position of the driven NMOSFET in a series NMOSFET branch. Starting with 1 for the top MOSFET, each of the lower MOSFET's is assigned a successively larger number. According to the above description, the AOI gate in Fig. 1 can be represented by 4-3-2-1-1-4. More examples are given in Fig. 5(a)-(d) where the AOI gates are shown by symbolic diagrams. A similar code system is used to represent OAI gates. Two examples are given in Fig. 5(e) and (f).

A 4-3-2-1-input OAI gate under the worst-case timing condition is shown in Fig. 6 where the input voltage drives the MOSFET's Mn41 and Mp41. Similar to the AOI gates, the NMOSFET branches in series can be arranged to form 24 different configurations. For all these different configurations of OAI gates and other simpler OAI gates with fewer than 4 PMOSFET's or 4 NMOSFET's in series, the worst-case and the nonworst-case timing can be characterized by applying the same modeling technique. The timing models for all these OAI gates have been built. The universal delay time equations for the OAI gates are

where TR and TF are the calculated characteristic rise and fall times of OAI gates, respectively. Note that the developed timing models can be applied to CMOS AOI/OAI gates with both long-channel and small-geometry devices, because a small-geometry AOI/OAI gate has an inherent output voltage waveform nearly independent of the device channel lengths and its timing can be characterized similarly as in the long-channel case. However, this similarity does not exist in simple CMOS inverters and NAND/NOR gates [6], [22].
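The six-digit gate code for AOI gates and the count of 24 series configurations described above can be illustrated with a short Python sketch. The names `GateCode`, `parse_gate_code`, and `series_orderings` are hypothetical and introduced only for illustration; the sketch merely splits a code into its fields and enumerates the orderings of the four branches, and it assumes that a digit of 0 in the first four positions marks an absent branch.

```python
# Illustrative decoding of the six-digit AOI gate code.
# Class and function names are assumptions, not taken from the paper.
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class GateCode:
    branches: tuple      # parallel-PMOSFET count per series branch, farthest from output first
    driven_branch: int   # 1 = driven PMOSFET branch farthest from the output node
    driven_nmos: int     # 1 = top NMOSFET of the series NMOSFET branch


def parse_gate_code(text):
    """Split a code such as '4-3-2-1-1-4' into its three fields."""
    digits = [int(part) for part in text.split("-")]
    if len(digits) != 6:
        raise ValueError("expected six digits separated by '-'")
    if not 1 <= digits[4] <= 4:
        raise ValueError("fifth digit must lie between 1 and 4")
    if digits[5] < 1:
        raise ValueError("sixth digit counts from 1 at the top MOSFET")
    return GateCode(tuple(digits[:4]), digits[4], digits[5])


def series_orderings(branches):
    """Enumerate every distinct series ordering of the given branches."""
    return sorted(set(permutations(branches)))


if __name__ == "__main__":
    print(parse_gate_code("4-3-2-1-1-4"))
    print(len(series_orderings((4, 3, 2, 1))))
```

Enumerating the orderings of the four distinct branches gives 4! = 24 entries, matching the number of configurations stated in the text.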
III. COMPARISONS WITH SPICE SIMULATIONS
To verify the accuracy of the developed analytical timing models, extensive comparisons between theoretical calculations and SPICE simulations were made for AOI/OAI gates with different configurations, device sizes, device parameters, capacitive loads, and input excitations. Fig. 7(a) shows the comparison for the 3.5-µm CMOS 3-2-1-0-1-3 AOI gates with CL = 5.0 pF and normal device parameters, whereas Fig. 7(b) shows that for the 4-3-2-1-1-4 OAI gates with CL = 0 pF and VTOn reduced to 0.3 V. It is found that the maximum error is 30% in the calculated delay times for the AOI/OAI gates with fixed load capacitors.
The input waveform effect on the output signal timing is incorporated into our model through the term VGS·VDS in the drain-current expression of the driven MOSFET's. For example, in the rise time calculation of the AOI gate shown in Fig. 1, the drain current Idp41 of the driven PMOSFET Mp41 has a term VGS·VDS which is related to Vi·Vo3 and is then linearized as Vi(tc)·Vo3(t). Since tc is related to the output rise pole Pr and Vi to the input fall pole Pf, the term Vi(tc) in Xp41 of the resultant linearized drain current Idp41 and in all the resistances is a function of Pf/Pr, as may be seen from Tables III and IV. Finally, the output rise pole Pr becomes a function of the input fall pole Pf, and the input waveform effect is included. Thus the developed model can predict the output responses under noncharacteristic input waveform excitations. Table V shows the results for the 3.5-µm CMOS 1-2-3-4-4-2 AOI gates driven by the step input and by input waveforms with rise and fall times twice as large as those in the characteristic waveform case. The ability to calculate the noncharacteristic waveform timing makes the developed models more practical and versatile in computing the timing of CMOS AOI/OAI gates.
There is a compromise between the model accuracy and the applicable ranges of the models. Thus the maximum error of 30% can be reduced if the applicable range is slightly confined to those gates with commonly used device dimensions. For example, the maximum error of delay times for the CMOS AOI gates with commonly used device dimensions can be reduced to 16% by properly tuning the universal constants in (6) and (7).
TABLE V (column headings only; table body not recoverable): Input Excitation, Data Type, TR, TPLH, TF, TPHL
Similarly, the tuned equations for the CMOS OAI gates are
As expected, the developed timing models can also be applied to complex small-geometry CMOS AOI/OAI gates. It is found that the maximum error is still 16% for various 1.5-µm CMOS AOI/OAI gates with commonly used device dimensions. Some of the comparisons are shown in Fig. 8(a) and (b) for 1.5-µm CMOS AOI and OAI gates with CL = 0.0 pF, respectively.
IV. APPLICATIONS OF THE TIMING MODELS IN AUTOSIZING
To demonstrate the application of the developed timing models in autosizing, the timing models are implemented in an experimental autosizing program called the Timing Synthesis and Analysis (TISA). In the TISA, two popular CMOS design strategies are adopted in synthesizing the device sizes. One strategy is that all MOSFET's of the same type in series in a logic gate are designed with equal channel widths, and so are all MOSFET's in parallel [10]. The other is that only the input excitation pattern which leads to the worst-case timing of a logic gate is considered in sizing. This results in a safe design, so that the actual chip delay is always smaller than that synthesized. Under these two strategies, the optimal device sizes which lead to the minimum total delay time can be obtained by solving the developed timing equations through the numerical optimization algorithm [27].
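The sizing step described above, choosing widths that minimize a total delay computed from timing equations, can be sketched in Python as follows. This is only an illustrative sketch under strong assumptions: `toy_chain_delay` is a hypothetical ratio-based delay expression standing in for the developed timing equations, and the coordinate-descent loop with a golden-section line search is an assumed optimizer, not the algorithm of [27] or the TISA implementation.

```python
# Toy autosizing loop: minimize a total delay over the free stage widths.
# The delay expression and the optimizer are illustrative assumptions.
import math


def golden_section_minimize(f, lo, hi, tol=1e-6):
    """Minimize a unimodal scalar function f on the interval [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while (b - a) > tol:
        if fc < fd:
            # Minimum lies in [a, d]; reuse c as the new upper probe.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:
            # Minimum lies in [c, b]; reuse d as the new lower probe.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return 0.5 * (a + b)


def toy_chain_delay(widths, load=64.0):
    """Hypothetical delay: each stage costs (capacitance driven) / (own width)."""
    total = sum(widths[k + 1] / widths[k] for k in range(len(widths) - 1))
    return total + load / widths[-1]


def size_chain(delay_fn, widths, free_indices, w_min, w_max, sweeps=50):
    """Coordinate descent: tune one free width at a time to reduce the delay."""
    w = list(widths)
    for _ in range(sweeps):
        for i in free_indices:
            def along_axis(x, i=i):
                return delay_fn(w[:i] + [x] + w[i + 1:])
            w[i] = golden_section_minimize(along_axis, w_min, w_max)
    return w


if __name__ == "__main__":
    sized = size_chain(toy_chain_delay, [1.0, 1.0, 1.0], [1, 2], 0.5, 100.0)
    print("widths:", [round(x, 3) for x in sized])
    print("delay :", round(toy_chain_delay(sized), 3))
```

In this toy model the first width is held fixed and the delay is convex along each free width, so the line search is well behaved; with real timing equations, the search bounds and the optimizer would need to be chosen to suit the equations.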
As shown in Fig. 9, a 1-b full adder is designed. The synthesized device sizes are listed in Table VI. Table VII lists the comparisons between SPICE simulations and model calculations under the consideration that the input pattern ABC changes from 100 to 110. It is found that the errors are 10.8% and 7.2% for the outputs CARRY and SUM, respectively.
V. CONCLUSION
Efficient physical timing models for complex 1.5- and 3.5-µm CMOS AOI/OAI gates have been developed to calculate the signal timing without performing troublesome SPICE transient simulations. Under the characteristic waveform consideration, the rise time and fall time equations in the developed models are first derived from the dominant pole of the linearized large-signal equivalent circuit of the gate. To find the dominant pole efficiently, the zero-value time constant method [26] is adopted. Universal laws are then found to calculate the delay times from the calculated rise and fall times.
Extensive comparisons between theoretical computations and SPICE simulations were made. It is found that the developed timing models have a maximum error of 30% in calculating the signal timing of the CMOS AOI/OAI gates with wide ranges of device dimensions, capacitive loads, device parameter variations, logic input patterns, and input excitation waveforms not deviating much from the characteristic waveform. The same maximum error is found in the timing calculations of all different configurations of the CMOS AOI/OAI gates with fewer than 4 NMOSFET's or 4 PMOSFET's in series. However, the model error can be further tuned to 16% for 1.5- and 3.5-µm CMOS AOI/OAI gates with commonly used device dimensions.
The application of the developed timing models in autosizing has also been demonstrated. With the aid of the developed timing models, the sized gates show a much smaller deviation in delay times from the simulated values. This gives a more accurate design than the use of a rough timing model. The consumed CPU time remains in a reasonable range.
