Efficient, generalized delay and power equations are proposed for the analysis and optimization of large-scale CMOS circuits through transistor and interconnect wire minimization. The proposed model equations make it possible to analyze the entire power-delay trade-off with less complexity and shorter computation time. The new equations can be used to optimize transistor and interconnect wire sizes concurrently. A single-stage CMOS circuit and a clock generation block fabricated in a 0.48 µm CMOS process are given as experimental examples.
INTRODUCTION
For CMOS digital circuits and systems with a few million transistors, transistor-level analysis with a circuit simulator requires a powerful computer, long computation time, and special numerical methods to handle convergence problems. Intensive design optimization of such circuits can be infeasible, because the optimization process iterates the circuit analysis until the optimization goal is reached. What is required is the best of two simulation approaches: the high accuracy of circuit simulation at the transistor level and the speed of logic simulation at the gate level. Unified equations for transistors and interconnect wires are needed to satisfy design requirements simultaneously using optimization tools such as [1-3] for transistor sizing, [4] for interconnect wire sizing only, and [5] for gate sizing and buffer insertion. The model formulation section derives the delay and power equations using the distributed Elmore delay model [6], the multilevel empirical interconnect model [7], and the input clock delay model [8]. The model parameter extraction section introduces the method used to extract the key parameters from the MOSFET BSIM model.
Two examples are presented in the experimental results section. Final results are discussed in the conclusion.
MODEL FORMULATION
Preliminaries
Two resistance parameters (PMOS and NMOS channel resistance) and four capacitance parameters (drain capacitance, source capacitance, PMOS gate capacitance, and NMOS gate capacitance) are used to formulate the model equations. Interconnect resistance and capacitance parameters are added to the delay and power model equations to account for the effect of the interconnect wire. The model parameters and definitions used in the derivation of the delay and power model equations are explained below.

In this paper, we use $R_{p(p)}$ and $R_{n(p)}$ to denote the PMOS and NMOS channel resistance. Let $C_{dp(p)}$, $C_{dn(p)}$, $C_{dps(p)}$, $C_{dns(p)}$, $C_{sp(p)}$, $C_{sn(p)}$, $C_{sps(p)}$, $C_{sns(p)}$, $C_{n(p)}$, $C_{gp(p)}$, $C_{gn(p)}$, and $C_{g(p)}$ denote, respectively, the PMOS drain capacitance, NMOS drain capacitance, PMOS drain sum capacitance, NMOS drain sum capacitance, PMOS source capacitance, NMOS source capacitance, PMOS source sum capacitance, NMOS source sum capacitance, sum of $C_{dn}$ and $C_{sn}$, PMOS gate capacitance, NMOS gate capacitance, and gate sum capacitance of $C_{gp}$ and $C_{gn}$, where $(p)$ is the number of series transistors. $C_A$ is expressed as $pC_{sp(p)} + C_{dn(p)} + C_{w(j)}/2$.

For the interconnect wires, $R_{L(j)}$, $C_{w(j)}$, and $C_{N(j)}$ denote the wire routing resistance, wire routing capacitance, and branch capacitance in the metal routing path, where $N(j)$ is the number of branches in the metal routing path. $C_{L(j)}$ is the wire routing sum capacitance, given by $C_{L(j)} = C_{w(j)}/2 + C_{w(j+1)}/2 + C_{N(j)}$.

The transistor channel resistance is expressed as $R_{n(\text{or } p)k} = A_{n(\text{or } p)k}/W_{n(\text{or } p)k}$, where $A_{n(\text{or } p)k}$ is the NMOS (or PMOS) unit resistance. The sum $(C_{dp} + C_{sp})$ is $F_{1k}W_{n(\text{or } p)k}$, where $F_{1k}$ is the unit capacitance of that sum. The interconnect resistance and capacitance as functions of the wire size are $R_{L(j)k} = D_{(j)k}/W_{w(j)k}$, where $D_{(j)}$ is the unit-width wire resistance, and $C_{w(j)k} = E_{(j)k}W_{w(j)k}$, where $E_{(j)}$ is the unit-width wire capacitance. The load capacitance at the end node is expressed as $C_{L(j)k} = (E_{(j)k}/2)\,W_{w(j)k} + C_{L(load)}$.
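The size dependence of these parameters can be illustrated with a short Python sketch. The class name, function names, and all numerical values are hypothetical and chosen only for illustration; only the functional forms ($A/W$, $D/W_w$, $E W_w$, and the end-node load) follow the definitions above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UnitParams:
    """Unit parameters of the size model (values used below are hypothetical)."""
    A: float  # unit channel resistance, ohm * um
    D: float  # unit-width wire resistance, ohm * um
    E: float  # unit-width wire capacitance, F / um


def channel_resistance(p: UnitParams, w_transistor: float) -> float:
    """R = A / W: channel resistance falls as the transistor widens."""
    return p.A / w_transistor


def wire_resistance(p: UnitParams, w_wire: float) -> float:
    """R_L = D / W_w: wire resistance falls as the wire widens."""
    return p.D / w_wire


def wire_capacitance(p: UnitParams, w_wire: float) -> float:
    """C_w = E * W_w: wire capacitance grows with wire width."""
    return p.E * w_wire


def end_node_load(p: UnitParams, w_wire: float, c_load: float) -> float:
    """C_L = (E / 2) * W_w + C_L(load): half the wire capacitance plus the load."""
    return 0.5 * p.E * w_wire + c_load


# Hypothetical numbers: a 20 um transistor driving a 2 um wide wire.
params = UnitParams(A=10_000.0, D=50.0, E=0.2e-15)
print(channel_resistance(params, 20.0))
print(wire_resistance(params, 2.0))
print(end_node_load(params, 2.0, 1.0e-15))
```

Widening a wire lowers its resistance but raises its capacitance, which is why transistor and wire widths have to be treated together in the delay and power equations.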
The total capacitance at node A shown in Fig. 2, denoted as $C_{An(\text{or } p)}$, is expressed as follows:
The model parameters defined in this section are obtained by the method described in the parameter extraction section. Figure 1 shows the single-input delay model including the interconnect wire. Rise and fall times are estimated with an RC chain approximation derived from the Elmore delay model, which gives the delay time with reasonable accuracy. Modeling by resistance and capacitance approximation allows VLSI circuits to be analyzed with less computation time, and it is especially useful for optimizing design parameters when the model equations are non-linear functions of many variables. The average delay time is obtained by averaging the rise and fall delay times. The delay of a logic inverter can be expressed as

FIGURE 2 NAND delay and power models having N inputs and J interconnect wires.
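As an illustration of the RC chain approximation, the following Python sketch computes the textbook Elmore delay at the far end of an RC ladder. It is a generic form, not the paper's specific delay equation; the function name and the component values are hypothetical.

```python
def elmore_delay(resistances, capacitances):
    """Elmore delay to the last node of an RC ladder.

    resistances[i] is the series resistance feeding node i, and
    capacitances[i] is the capacitance from node i to ground.
    Each capacitor contributes C_i times the total resistance between
    the source and node i.
    """
    if len(resistances) != len(capacitances):
        raise ValueError("need one capacitance per resistance")
    delay = 0.0
    upstream_resistance = 0.0
    for r, c in zip(resistances, capacitances):
        upstream_resistance += r
        delay += upstream_resistance * c
    return delay


# Hypothetical driver and wire: a 500 ohm channel resistance followed by a
# 50 ohm wire. Node 0 carries the drain capacitance plus half of the wire
# capacitance; node 1 carries the other half plus the load.
r_chain = [500.0, 50.0]
c_chain = [10e-15, 20e-15]
print(elmore_delay(r_chain, c_chain))
```

For the values shown, the delay is 500 Ω × 10 fF + 550 Ω × 20 fF = 16 ps, with the wire resistance affecting only the capacitance beyond it.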
Delay Model Equations
The delay model equation for two inputs (NAND or NOR) is similar to the inverter delay model except that it has additional resistance and capacitance.
The general rise time with p inputs and j interconnect wires is approximated using the rise model shown in Fig. 2,
The general equation for fall delay time is expressed as
By taking the average of rise and fall delay times the generalized average delay with p inputs and j interconnect wires is given by
By substituting the model parameters defined in 2.1 Preliminaries into Eq. (1), the delay equation is expressed as a function of transistor and interconnect wire size. If one metal wire is used between drivers,
where
Next, the M-stage, N-input delay model equation including the interconnect wire effect is derived. It is a general equation that can be applied to any large CMOS circuit. The M-stage, N-input delay model is shown in Fig. 3. When the effect of the input transition is incorporated, the total stage delay along the critical path, including gate delay and interconnect wire delay, is given as
where $a = (1 + 2|V_T|/V_{DD})/3$, $T_r$ is the clock slope, and $D_m$ is the stage delay at stage $m$.
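The coefficient $a$ can be evaluated directly from its definition. The sketch below does only that; the function name and the voltage values are hypothetical.

```python
def slope_coefficient(v_t: float, v_dd: float) -> float:
    """Input-transition coefficient a = (1 + 2|V_T| / V_DD) / 3."""
    if v_dd <= 0.0:
        raise ValueError("v_dd must be positive")
    return (1.0 + 2.0 * abs(v_t) / v_dd) / 3.0


# Hypothetical values: |V_T| = 0.7 V with a 3.3 V supply.
print(slope_coefficient(0.7, 3.3))
```

For $0 \le |V_T| \le V_{DD}$ the coefficient lies between 1/3 and 1, so a larger threshold voltage relative to the supply increases the weight given to the clock slope $T_r$.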
Power Model Equations
Power consumption for CMOS circuits is expressed as the sum of the dynamic power component, the short-circuit power component, and the DC leakage component, given as
where $k$ is the switching factor, $C_L$ is the output load capacitance, $V_{DD}$ is the supply voltage, $f_P$ is the operating frequency, $I_{SC}$ is the short-circuit current, and $I_{leakage}$ is the DC leakage current. Dynamic power is consumed when current from the supply charges the load capacitance, which is then discharged to ground. The switching factor is related to the transition probability of the input signals. In static logic the switching factor depends on the previous logic state, while in dynamic logic it depends on the input signal probability. Another source of power consumption is the direct current that flows briefly between the power supply and ground. When the input voltage is in the range $V_{tn} < V_{in} < V_{DD} - |V_{tp}|$, the P-type and N-type transistors are on simultaneously, and current flows through a direct path between the power supply and ground. The short-circuit current depends on the rise and fall times of the input signal, the threshold voltage, and the ratio of channel width to channel length.
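A minimal Python sketch of the three power components follows. It assumes the standard textbook form $P = k C_L V_{DD}^2 f_P + I_{SC} V_{DD} + I_{leakage} V_{DD}$, which matches the symbols defined above. The function name and all numerical values are hypothetical.

```python
def power_components(k, c_load, v_dd, f_p, i_sc, i_leakage):
    """Return (dynamic, short_circuit, leakage) power in watts.

    Assumed textbook form:
        dynamic       = k * C_L * V_DD**2 * f_P
        short_circuit = I_SC * V_DD
        leakage       = I_leakage * V_DD
    """
    dynamic = k * c_load * v_dd ** 2 * f_p
    short_circuit = i_sc * v_dd
    leakage = i_leakage * v_dd
    return dynamic, short_circuit, leakage


# Hypothetical operating point: 100 fF load, 3.3 V supply, 100 MHz clock.
dynamic, short_circuit, leakage = power_components(
    k=0.5, c_load=100e-15, v_dd=3.3, f_p=100e6, i_sc=1e-6, i_leakage=1e-9
)
print(dynamic, short_circuit, leakage, dynamic + short_circuit + leakage)
```

With these illustrative numbers the dynamic term dominates by more than an order of magnitude, consistent with folding the short-circuit and leakage terms into process-dependent correction parameters.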
The average short-circuit current is minimized when the input rise and fall times are much shorter than the clock period. There are two types of DC leakage current: the sub-threshold leakage current and the reverse-bias leakage current. Sub-threshold leakage is caused by carrier diffusion between the source and the drain. The source-drain current in the sub-threshold region is particularly important for short-channel low-power circuits, because it affects the standby static power consumption. The second leakage component is the parasitic diode leakage current between the drain (or source) and the substrate. The short-circuit and leakage power can be incorporated in the power model by adding the process-dependent parameters $h_1$ and $h_2$ to the total power. To obtain the dynamic power, assume that the rise and fall times of the input signals are much shorter than the clock period, and that there are $V_t$ drops in the series MOSFET channel resistances as shown in Fig. 2. After integrating $P_{d(ave)} = \frac{1}{T}\int_{t_1}^{t_2} v(t)\,i(t)\,dt$, the average dynamic power is given as
The transmission of logic one ($V_{DD}$) is degraded as it passes through the NMOS gate. When the $V_t$ drop in the transistor series is considered, the average dynamic power is
The dynamic power is expressed as a function of the transistor size and interconnect wire size.
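The averaging integral for the dynamic power can be checked numerically. The sketch below is a simplified, hypothetical setup rather than the paper's derivation: a constant supply $V_{DD}$ charges a capacitance $C$ through a fixed resistance $R$ once per period, and the $V_t$ drop is modeled, as an assumption, by a reduced swing $V_{DD} - V_t$. Under these assumptions the average supply power approaches $C V_{DD}(V_{DD} - V_t)/T$ when $T \gg RC$.

```python
import math


def average_supply_power(r, c, v_dd, period, v_t=0.0, steps=100_000):
    """Trapezoidal estimate of (1/T) * integral of v(t) * i(t) dt over one period.

    Assumptions (illustrative only): the supply voltage is constant at v_dd,
    the node charges from 0 to (v_dd - v_t), and the charging current is
    i(t) = ((v_dd - v_t) / r) * exp(-t / (r * c)).
    """
    tau = r * c
    dt = period / steps
    energy = 0.0
    for n in range(steps + 1):
        t = n * dt
        current = (v_dd - v_t) / r * math.exp(-t / tau)
        weight = 0.5 if n in (0, steps) else 1.0  # trapezoid end-point weights
        energy += weight * v_dd * current * dt
    return energy / period


# Hypothetical values: 500 ohm, 100 fF, 3.3 V supply, 10 ns period (T >> RC).
numeric = average_supply_power(500.0, 100e-15, 3.3, 10e-9, v_t=0.7)
closed_form = 100e-15 * 3.3 * (3.3 - 0.7) / 10e-9
print(numeric, closed_form)
```

The result does not depend on $R$ as long as the period is long compared with $RC$, which is why the dynamic power can be written in terms of capacitance, and therefore transistor and wire size, alone.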
Accordingly, the dynamic power for one stage is given as
The total power model equation is obtained from the dynamic power and two parameters, $h_1$ and $h_2$.
Here $h_1$ and $h_2$ are process-dependent parameters that include the leakage and short-circuit power components. The single-stage dynamic power is a non-linear function, because the interconnect capacitance is a non-linear function of the interconnect wire size. However, it can be approximated by a piece-wise linear function. The M-stage, N-input total power is obtained by summing the power consumption of each stage.
is the total power at m stage.
MODEL PARAMETER EXTRACTION
Once the drain current is determined, the channel resistance is calculated by Ohm's law. However, due to the complexity of the current equation, the channel resistance is a non-linear function of many variables. Exact analytical and semi-analytical transistor models are used for high accuracy in CMOS circuit simulations. However, as the number of transistors in CMOS VLSI circuit designs grows, it becomes necessary to find a stable solution without numerical convergence problems, and simulations with exact models usually require a long computation time. An empirical model for the channel resistance is therefore obtained from the process-based MOSFET models. One such model, useful in approximating the voltage-current characteristic of the MOSFET, relates the channel resistance to the gate size. The transistor channel resistance is expressed as $R_{(on)} = V_{ds}/I_{ds} \approx A/W$, where the unit resistance, $A$, is determined by SPICE simulation. The unit channel resistance is shown in Fig. 4.
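One way to obtain the unit resistance $A$ from simulated data is a least-squares fit of $R_{(on)} \approx A/W$. The sketch below assumes that fitting procedure, which the text does not specify. The function name and the sample points are hypothetical and stand in for SPICE results.

```python
def fit_unit_resistance(widths, r_on):
    """Least-squares fit of R_on = A / W through the origin in x = 1 / W.

    Minimizing sum((R_i - A * x_i)**2) gives A = sum(x_i * R_i) / sum(x_i**2).
    """
    if len(widths) != len(r_on) or not widths:
        raise ValueError("need matching, non-empty width and resistance lists")
    xs = [1.0 / w for w in widths]
    numerator = sum(x * r for x, r in zip(xs, r_on))
    denominator = sum(x * x for x in xs)
    return numerator / denominator


# Hypothetical (width in um, R_on in ohm) samples standing in for SPICE data.
widths_um = [5.0, 10.0, 20.0, 40.0]
r_on_ohm = [2050.0, 1010.0, 495.0, 255.0]
unit_a = fit_unit_resistance(widths_um, r_on_ohm)
print(unit_a, unit_a / 20.0)
```

The fitted $A$ then gives the channel resistance for any width in the fitted range as $A/W$.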
The gate capacitance is given by the sum of the oxide ($C_{ox}$) and semiconductor ($C_s$) capacitances, and is a non-linear equation. The empirical model parameters for the gate capacitance are obtained from $C = Q_{(t)}/\Delta V$, where the charge, $Q$, is calculated by SPICE simulation. The total gate capacitance, $C_{g(tot)}$, is composed of $C_{gn} = B_n W_n$ and $C_{gp} = B_p W_p$, where $B_n$ and $B_p$ are the NMOS and PMOS unit capacitances, respectively. They are shown in Fig. 5.
The unit gate capacitance is used as a basic parameter in the delay and power model equations. The parameters for the drain and source capacitances are obtained with the same model used for the gate. The drain and source capacitances are expressed in terms of the transistor size as $C_{dsn} = K_c W_n$ and $C_{dsp} = J_c W_p$, where $K_c$ and $J_c$ are the drain/source unit capacitances for the NMOS and PMOS transistors, shown in Fig. 6.
The multi-level interconnect wire resistance, which models the interconnection between the drivers, is expressed as $R_{(tot)} = r_{s1}(L_1/W_1) + r_{s2}(L_2/W_2) + \cdots + r_{sn}(L_n/W_n)$, where $r_s$ is the sheet resistance. The interconnect resistance is obtained simply by multiplying the sheet resistance by the ratio of the metal length to the metal width of each interconnect wire. The interconnect resistance can be expressed in terms of the metal wire width, $W_w$.
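The sum over metal segments translates directly into code. In the sketch below, the function name, the sheet resistances, and the segment geometries are hypothetical.

```python
def total_wire_resistance(segments):
    """R_tot = sum of r_s * (L / W) over all metal segments.

    Each segment is a tuple (sheet_resistance_ohm_per_sq, length, width),
    with length and width in the same unit.
    """
    total = 0.0
    for r_sheet, length, width in segments:
        if width <= 0.0:
            raise ValueError("segment width must be positive")
        total += r_sheet * length / width
    return total


# Hypothetical route: two M1 segments and one wider M2 segment.
route = [
    (0.07, 1000.0, 1.0),  # M1: 1000 squares
    (0.07, 500.0, 1.0),   # M1: 500 squares
    (0.04, 2000.0, 2.0),  # M2: 1000 squares
]
print(total_wire_resistance(route))
```

Because each term scales as $1/W$, doubling the width of a segment halves its contribution to the total resistance.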
Capacitance models for multilevel interconnect wiring are quite complex. Three-dimensional field simulators can be used to calculate the metal capacitance. However, they are limited to a few physical configurations and are too complex to handle many different configurations. Empirical modeling approaches overcome these problems with reasonable accuracy. Multilevel empirical models [6] are used to calculate the parasitic capacitance of M1, M2 and M3. The approach uses three primitive structures to construct multilevel metal configurations. The three basic capacitances are the line-to-line, line-to-ground, and cross-over capacitances. Figure 7 shows the capacitance components and geometrical parameters. The interconnect capacitance is a non-linear function of the metal width. However, it is expressed as a linear function using a piece-wise linear approximation:
where $E_m$ is the unit metal capacitance. Figure 8 shows the M1, M2 and M3 interconnect capacitance per unit area.
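A piece-wise linear approximation of a non-linear capacitance curve can be sketched as follows. The curve `example_capacitance`, its coefficients, and the breakpoints are hypothetical stand-ins, not the empirical model used in this paper; only the interpolation technique is illustrated.

```python
import bisect


def piecewise_linear(breakpoints, values):
    """Return a function that linearly interpolates between (breakpoint, value) pairs."""
    if len(breakpoints) != len(values) or len(breakpoints) < 2:
        raise ValueError("need at least two matching breakpoints and values")

    def approx(x):
        if not breakpoints[0] <= x <= breakpoints[-1]:
            raise ValueError("x is outside the fitted range")
        # Index of the segment containing x, clamped so the last breakpoint
        # falls into the final segment.
        i = min(bisect.bisect_right(breakpoints, x) - 1, len(breakpoints) - 2)
        x0, x1 = breakpoints[i], breakpoints[i + 1]
        y0, y1 = values[i], values[i + 1]
        return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

    return approx


def example_capacitance(width_um):
    """Hypothetical non-linear capacitance per unit length (fF/um) versus width."""
    return 0.03 * width_um + 0.08 * width_um ** 0.25


widths = [0.5, 1.0, 2.0, 4.0]
approx_cap = piecewise_linear(widths, [example_capacitance(w) for w in widths])
print(example_capacitance(1.5), approx_cap(1.5))
```

Within each segment the capacitance is linear in the width, so a unit capacitance such as $E_m$ can be treated as constant over that width range. Adding breakpoints reduces the approximation error at the cost of more model parameters.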
EXPERIMENTAL RESULTS
As an example, consider a single-stage CMOS circuit having an interconnect routing wire and an output load (P/N = 40 µm/20 µm). Delay, power, and power-delay product are plotted as a function of the transistor width in Figs. 9-11. Open circles are SPICE simulation data, and open triangles are data calculated from the delay and power equations.
Calculation results of the model equations closely match the SPICE simulations. The mismatch between the model and SPICE is about 3-8%, depending on the transistor size, $W_{n(\text{or } p)}$. Computing delay and power with the new model equations is much faster than the SPICE simulation. A practical clock generation block is presented as the second example to provide comparison data among the model, SPICE, and silicon. Figure 12 shows the critical path of the test clock circuit.

FIGURE 7 Interconnect wire capacitance components and geometrical parameters for three metal layers.

FIGURE 8 M1, M2, and M3 interconnect capacitance per unit area versus transistor width.
The SPICE simulation gives a smaller delay (~3.3 ns) than the measured silicon data (~3.7 ns). This is because the routing wire RC delay is not properly included in the SPICE simulation model. The SPICE simulations are shown in Fig. 13. The clock delay calculated by the model (~3.859 ns) is close to the measured data because more accurate interconnect wire models are included in the model equations. A photomicrograph of the fabricated test clock circuit is shown in Fig. 14. The test clock circuit is fabricated in a 0.48 µm CMOS triple-metal process technology. Figure 15 shows the waveforms generated from the XY data of a digital oscilloscope.
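The deviations from the measured delay can be quantified with a short calculation using the approximate values quoted above. The function name is hypothetical.

```python
def percent_error(estimate, reference):
    """Signed deviation of an estimate from a reference, in percent."""
    return 100.0 * (estimate - reference) / reference


measured_ns = 3.7   # measured silicon delay (approximate)
spice_ns = 3.3      # SPICE simulation delay (approximate)
model_ns = 3.859    # delay calculated by the model equations

print(round(percent_error(spice_ns, measured_ns), 1))  # -10.8
print(round(percent_error(model_ns, measured_ns), 1))  # 4.3
```

With these figures the SPICE result is about 10.8% below the measurement, while the model result is about 4.3% above it.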
When the effect of the transition time is incorporated, the delay along the clock path is calculated by 
CONCLUSION
The two experimental examples show that data calculated with the new delay and power equations closely match the SPICE results. The small discrepancy between the model equations and the SPICE simulation results from the linear and piece-wise linear approximations of the model parameters extracted from the BSIM transistor models. This implies that the accuracy of the new equations can be improved by adding more model parameters, depending on the range of transistor and interconnect wire sizes. Computing the delay-power equations is much faster than SPICE simulation, whose run time depends on the size of the circuit. The model equations can be used to optimize the power-delay trade-off and to analyze the static gate delay and the interconnect wire delay simultaneously. Future work includes developing an optimization algorithm that can handle the large number of design parameters expressed by the delay-power equations and improving the accuracy of the model parameters (Table I).
