Study on modeling techniques for CMOS gate delay calculation in VLSI timing analysis by Jiang Minglü
Study on Modeling Techniques for
CMOS Gate Delay Calculation in
VLSI Timing Analysis
July 2011
JIANG, Ming Lu
Waseda University
Contents
Abstract
Chapter 1 Introduction
1.1 Background
1.1.1 Static Timing Analysis
1.1.2 Basic Conceptions of Gate Delay Model
1.2 Dissertation Motivations and Contributions
References
Chapter 2 Overview of Conventional Gate Delay Models
2.1 The Empirical Method for Gate Delay
2.2 The RC-π Model
2.3 Effective Capacitance Model for Gate Delay
2.4 Equivalent Gate Model for Gate Delay
2.5 Efficiency Improved Gate Delay Model
References
Chapter 3 Effective Capacitance Model for Gate Delay Considering Input Waveform Effect
3.1 Introduction
3.2 Proposed Method
3.2.1 Analytical Expressions
3.2.2 Procedure for Calculating Ceff(actual)
3.2.3 Driving Output Resistance Calculation
3.3 Tests and Comparisons
3.3.1 Experimental Results for Various Rd
3.3.2 Experimental Results for Various tin
3.3.3 Experimental Results with Various Gates and RC-π Loads
3.4 Conclusions
References
Chapter 4 Accurate Effective Capacitance Model for Gate Delay with RC Loads Based on the Thevenin Model
4.1 Introduction
4.2 Proposed Algorithm
4.2.1 Analytical Expressions for Effective Capacitance
4.2.2 Algorithm for Key Parameters t20 and t80
4.2.3 Procedure for Calculating Ceff and Gate Delay
4.3 Tests and Comparisons
4.3.1 Experimental Results for Various Rd and tin
4.3.2 Experimental Results for Various Capacitance Values
4.3.3 Experimental Results for Various Gates
4.4 Conclusions
References
Chapter 5 A Non-iterative Method for Delay Calculation of CMOS Gates
5.1 Introduction
5.2 Preliminaries
5.3 Proposed Model
5.3.1 Analytical Derivation for Non-iterative Algorithm
5.3.2 Error Analysis and Algorithm for Key Parameter
5.3.3 Gate Delay Calculation with Non-iterative Method
5.4 Tests and Comparisons
5.4.1 Experimental Results for Various R
5.4.2 Experimental Results for Various tin
5.4.3 Experimental Results for Various Gates and RC-π Loads
5.5 Conclusions
References
Chapter 6 Conclusions
6.1 Dissertation Conclusions
6.2 Future Works
Related Papers
Acknowledgments
Publication List
Abstract
In VLSI design, timing analysis is required to estimate whether a circuit can operate at
the specified frequency. Although this analysis can in principle be carried out by circuit
simulation, simulating all timing conditions of a design with several million gates is too
slow. In contrast, static timing analysis (STA) provides a fast and exhaustive verification
of all timing checks of a design. A crucial task in STA is the calculation of gate delay.
Because a CMOS gate is composed of non-linear components, it is difficult to obtain a gate
delay model that is both precise and efficient. This dissertation therefore focuses on
improving the accuracy and efficiency of gate delay calculation. Conventional gate delay
methods simply assume that the input signal of each gate is a linear ramp. However, the
actual signal becomes increasingly nonlinear as it passes through many gates and
interconnects, so treating a non-ramp input as a ramp causes a significant computation
error. The input waveform effect should therefore be considered in gate delay calculation.
In cell-level delay calculation, an equivalent gate model called the Thevenin model, which
represents each non-linear gate as a combination of two linear components, is widely used.
Most conventional gate delay methods are based on the condition that the actual load and
the corresponding equivalent capacitive load hold the same charge. This condition is
accurate for the actual circuit. With the Thevenin model, however, there is a charge
difference between the capacitive load and the interconnect load, which has a large
influence on the calculated gate delay. To improve the accuracy, a new condition for the
Thevenin model is required. In addition, most previous works use iterative algorithms to
ensure accuracy, and these are too slow for gate delay calculation in modern VLSI designs,
while the existing non-iterative methods suffer from low accuracy or rely on an
over-simplified gate model.
To overcome these drawbacks, this dissertation proposes three new models, each addressing
one of the issues above. First, an advanced gate delay model is proposed that includes the
effect of a non-ramp input waveform. Second, a simple and accurate method is proposed for
calculating gate delay with the Thevenin model, in which the effect of the charge
difference is considered. Last, a non-iterative method is presented that improves the
efficiency of gate delay estimation without significant loss of accuracy. The dissertation
is organized into six chapters as follows. Chapter 1 (Introduction) briefly introduces the
background and some basic concepts of this research, then presents the motivations and
contributions of the dissertation. The last section of the chapter describes the
organization of the dissertation. Chapter 2 (Overview of conventional gate delay models)
reviews the development of gate delay calculation and several types of conventional
methods, and discusses their accuracy and efficiency issues. It then states the purpose of
this research, which is to improve the accuracy and efficiency of gate delay calculation.
Chapter 3 (Effective capacitance model for gate delay considering input waveform effect)
proposes an advanced model for calculating the effective capacitance, which is commonly
used to compute CMOS gate delay. The model considers both the interconnect load effect and
the non-ramp input waveform effect. First, the non-ramp input effect is illustrated with
actual examples and the resulting computation error is analyzed. Then an analytical method
for handling the non-ramp input is proposed and its procedure is given in detail. The
nonlinear influence of the input waveform, which can increase the gate delay, is modeled as
one part of the effective capacitance used for calculating the gate delay. The experimental
results show that the average error of the proposed method is only about 3.7%, whereas that
of the conventional method exceeds 15% when the input is non-ramp.
Chapter 4 (Accurate effective capacitance model for gate delay with RC loads based on the
Thevenin model) proposes a method that addresses the charge difference between the
effective capacitance load and the interconnect load in the Thevenin model. First, the
conventional gate delay methods based on the Thevenin model are reviewed, the charge
difference problem in the Thevenin model is described, and the errors of the conventional
methods are analyzed. Then the proposed algorithm for solving the charge difference problem
and the procedure for gate delay calculation are presented in detail. The proposed method
relies on simple and accurate approximations that add little computational complexity.
With an average error of 1.3%, it is considerably more accurate than the conventional
method, whose average error is 7.3%.
Chapter 5 (A non-iterative method for delay calculation of CMOS gates) presents a
non-iterative method for improving the efficiency of effective capacitance calculation. In
the proposed method, a simple polynomial approximation is used to modify the nonlinear
effective capacitance equation, and a detailed error analysis of this approximation is
given. With the proposed method, the effective capacitance and the gate delay can be
computed without any iteration, which clearly improves the efficiency of gate delay
calculation: compared with the conventional iterative model, our explicit method reduces
the CPU time by half. Meanwhile, the proposed method maintains a relatively high accuracy,
with an error of 2.8%.
In Chapter 6 (Conclusions), the conclusions of this dissertation are given.
Chapter 1
Introduction
Integrated circuit products are now present everywhere in daily life. They make life more
comfortable and have become critically important worldwide. Since the invention of the
integrated circuit (IC) in 1958, semiconductor technology has developed enormously. With
the minimum feature size reaching 32 nanometers, more than one billion transistors can be
integrated on a chip, providing far more powerful functionality.
The basic operating principles of large and small transistors are the same. However, the
electrical parameters of small transistors (those with channels at or below the micrometer
scale) are quite different from those of larger transistors. At the same time, many
physical and chemical phenomena that are negligible in large MOSFETs, such as the
short-channel effect and negative bias temperature instability (NBTI), are becoming
increasingly important in determining the performance of deep-submicron MOSFETs. For these
reasons, the performance of VLSI designs differs considerably between process technologies.
Because device performance is not constant and VLSI designs are more complicated than ever,
circuit verification is increasingly important in modern integrated circuit design and
research. Verification helps designers save a great deal of cost and design time. The IC
design process consists of defining the circuit function, hand calculation, circuit
simulation, layout of the circuit, simulation with parasitic parameters, reevaluation of
the circuit function, fabrication, and chip testing [1]. Once a circuit has been designed,
it must be verified. Verification is the process of going through each stage of a design
and ensuring that it will work under the specification requirements. In any complicated
design, it is very likely that problems will be found at this stage, and overcoming them
may require a large amount of redesign work. In fact, over 50% of the resources invested in
developing systems are reportedly spent on verification [2].
IC design can be divided into the broad categories of digital and analog IC design.
Different types of circuits require different models and methodologies for verification. A
high-performance digital IC contains many kinds of gates and interconnects, and the signal
delay of each gate must be calculated in order to estimate whether the system can operate
at the specified frequency. With the rapid development of IC technology, the feature size
of IC designs is becoming smaller and smaller, which makes the interconnect resistance
larger than ever [41]. The larger resistance in turn has a larger effect on the gate delay,
which is very important for IC performance [41]. In addition, modern IC designs integrate
more and more gates on a chip and are becoming more complicated, so the efficiency of gate
delay calculation must be improved to keep pace with the rapid development of IC designs.
The main subject of this work is to find advanced models for calculating gate delay
accurately and efficiently; these models are introduced in the subsequent chapters.
In the remainder of this chapter, Section 1.1 introduces more detailed background on gate
delay calculation. Section 1.2 then presents the motivation of this dissertation. Finally,
Section 1.3 outlines the rest of the dissertation.
1.1 Background
1.1.1 Static Timing Analysis
In digital circuit design, timing analysis is required to estimate whether a VLSI circuit
can operate at the specified frequency. Static timing analysis (STA) is a method of
computing the expected timing of a digital circuit without requiring any circuit
simulation. As shown in Fig. 1.1, STA plays an important role in the digital design flow
and must be incorporated into the inner loop of timing optimizers at various phases of
design, such as logic synthesis, floor-planning, and layout (placement and routing).
Although such timing measurements can in principle be obtained by circuit simulation, that
approach consumes a large amount of time. Moreover, it is difficult for circuit simulation
to verify all timing conditions exhaustively, for example when evaluating the effect of
noise. STA is therefore an appropriate method for fast and reasonably accurate measurement
of circuit timing. Its benefits include providing quick and efficient information for
enhancing design performance and easing the design debugging procedure.
Figure 1.1 Static timing analysis in digital circuit design flow.
In timing analysis, we need to check that all signals arrive at certain points within a
prescribed time interval. To do this, timing information is propagated through the network.
In a digital IC design, if all the bits at the registers and primary inputs remain constant
from one clock cycle to the next, the voltage remains constant everywhere in the circuit.
But if at least one bit changes, this information must be propagated through the network.
The information that we propagate is called a signal. A signal contains: 1) whether the
voltage is rising (changing from low potential to high potential) or falling (changing from
high potential to low potential); 2) the time at which the voltage change occurs relative
to the reference time of the clock cycle; 3) a measure of how fast the voltage changes; and
4) the origin of the signal (a primary input or register) [3]. Items 2) and 3) are the
standard information in static timing analysis: the characteristics of a voltage change
over time are encoded by two numbers, the arrival time (usually the time of the 50%
transition point) and the slew (usually the difference between the 10% and 90%, or the 20%
and 80%, transition times). Among all paths of a VLSI design, the one with the largest
propagation delay is called the longest path, also referred to as the critical path. The
maximum operating frequency is set by this path. For example, the longest path of a full
adder runs from A to Cout, shown as the gray path in Fig. 1.2.
Figure 1.2 The longest path of a full adder.
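The longest-path idea described above can be sketched in a few lines of Python. This is only an illustration, assuming the circuit is given as a directed acyclic graph whose edges carry one lumped stage delay each. The function name `longest_path`, the node names, and all delay values are hypothetical and are not taken from Fig. 1.2.

```python
from collections import defaultdict, deque


def longest_path(edges, sources):
    """Propagate arrival times through a timing DAG and trace the longest path.

    edges   -- iterable of (u, v, delay): delay of the gate/wire stage from u to v
    sources -- dict mapping each start node to its initial arrival time
    """
    succ = defaultdict(list)
    indeg = defaultdict(int)
    nodes = set(sources)
    for u, v, d in edges:
        succ[u].append((v, d))
        indeg[v] += 1
        nodes.update((u, v))

    arrival = {n: float("-inf") for n in nodes}
    arrival.update(sources)
    pred = {}

    # Kahn's algorithm: a node is processed only after all of its fan-in.
    queue = deque(n for n in nodes if indeg[n] == 0)
    while queue:
        u = queue.popleft()
        for v, d in succ[u]:
            if arrival[u] + d > arrival[v]:
                arrival[v] = arrival[u] + d   # keep the latest arrival
                pred[v] = u
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)

    end = max(arrival, key=arrival.get)       # endpoint with the latest arrival
    path = [end]
    while path[-1] in pred:
        path.append(pred[path[-1]])
    return arrival, path[::-1]


# Hypothetical stage delays in ps for a small adder-like network.
edges = [
    ("A", "x1", 35), ("B", "x1", 30),
    ("x1", "S", 30), ("Cin", "S", 30),
    ("x1", "a2", 20), ("Cin", "a2", 20),
    ("A", "a1", 20), ("B", "a1", 20),
    ("a1", "Cout", 25), ("a2", "Cout", 25),
]
arrival, path = longest_path(edges, {"A": 0, "B": 0, "Cin": 0})
print(path, arrival[path[-1]])
```

Each node is visited once in topological order, so the cost is linear in the number of stages. A practical STA tool would additionally track rising and falling transitions and slews separately, which this sketch omits.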
Figure 1.3 Example of signal propagation delay in digital circuit.
In STA, the crucial task is to calculate the signal propagation delay of the circuit. The
example in Fig. 1.3 shows that the signal propagation delay is the sum of the delays of the
logic gates and the interconnects. Techniques for calculating interconnect delay accurately
and efficiently, such as the AWE method [4], [5] and the PVL method [6], are being
developed rapidly. In contrast, it is difficult to obtain a precise and efficient gate
delay model, because a CMOS gate is composed of non-linear components. Moreover, as the
feature size of VLSIs shrinks into the deep-submicron region, the characteristics of the
interconnect load have changed. Thinner interconnect wires have larger resistance. Because
the load resistance shields part of the load capacitance from the gate, a larger
interconnect resistance has a larger effect on the signal delay of the gate [41]. The
interconnect load effect must therefore be considered when evaluating gate delay in STA,
which makes the gate delay model more complicated. This dissertation accordingly focuses on
improving the accuracy and efficiency of the gate delay model. The basic concepts of the
gate delay model are introduced in detail in the following subsection.
1.1.2 Basic Conceptions of Gate Delay Model
In digital IC design, a logic gate is a physical device that usually consists of several
transistors or diodes. It performs a logical operation on one or more logic inputs and
produces a single logic output. There are many kinds of logic gates, such as the NOT, NOR,
XOR, and NAND gates. The NOT gate, also called the inverter, is the basic building block of
VLSIs. Once the operating principle of the inverter is well understood, more complex
modules such as adders, multipliers, and microprocessors are much easier to design [41].
The electrical behavior of these complex circuits can be almost completely derived by
extrapolating the results obtained for inverters [7]. Moreover, in gate delay calculation,
the timing model of the inverter can be used for more complex gates, since several fast
methods [8] have been proposed for reducing a gate to an equivalent inverter. Using these
techniques, the propagation delay of a gate can be computed quickly and accurately with the
inverter timing model, without the complications of trying to generalize the inverter-based
model to complex gates [9]. For these reasons, this research focuses on the gate delay
model of the CMOS inverter.
Figure 1.4 The CMOS inverter.
Figure 1.4 shows an inverter connected to a load capacitor CL. The inverter consists of two
transistors, one NMOS and one PMOS. In Fig. 1.4, VDD is the supply voltage, which sets the
full output swing, and Vin(t) and Vout(t) are the input and output voltages as functions of
time t. The output load CL consists of the gate capacitance of the inverter, the total gate
capacitance of the fan-out gates driven by the inverter, and the interconnect load
capacitance. The inverter gate capacitance is the sum of the gate-to-drain capacitances of
both transistors, each of which consists of the gate-to-drain overlap capacitance and a
part of the gate-to-channel capacitance. It is calculated using the parameters Cox
(gate-oxide capacitance per unit area) and Cgdo (gate-drain overlap capacitance per unit
channel width) [10]. The total gate capacitance of the fan-out gates is mainly the sum of
the gate-to-source capacitance (Cgs) and gate-to-drain capacitance (Cgd) of each fan-out
gate.
Figure 1.5 Definition and waveform of inverter propagation delay.
During inverter operation, the current needs some time to charge or discharge the load
capacitance when the input changes. The propagation delay of a gate is therefore defined as
the time taken to transmit a signal from the input to the output of the logic gate. On
manufacturers' datasheets, this usually refers to the time between the input reaching 50%
of its final level and the output reaching 50% of its final level. This definition is
illustrated in Fig. 1.5. For an inverter, because the device parameters of the NMOS and
PMOS differ, the response times for rising and falling input waveforms also differ [41].
The output rise and fall times are labeled tr and tf, respectively, and are usually defined
as the time between the 20% and 80% points, or the 10% and 90% points, of the output
waveform. The gate delays between the 50% points of the input and output are labeled tPHL
and tPLH, depending on whether the output changes from a high voltage to a low voltage or
from low to high. These definitions are extremely important in characterizing the
time-domain behavior of digital circuits. The rise and fall times are closely related to
the gate delay and can usually be obtained in the course of calculating it. They also serve
as the input transition times for the gate delay calculation of the next stage.
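The definitions above can be applied directly to sampled waveforms, for example the output of a circuit simulator. The following sketch measures the 50%-to-50% propagation delay and the 20%-80% transition time by linear interpolation between samples. The helper names and the synthetic ramp waveforms are assumptions made for illustration only.

```python
def crossing_time(t, v, level, rising=True):
    """Linearly interpolate the first time the sampled waveform crosses `level`."""
    for i in range(1, len(t)):
        lo, hi = v[i - 1], v[i]
        crossed = (lo < level <= hi) if rising else (lo > level >= hi)
        if crossed:
            frac = (level - lo) / (hi - lo)
            return t[i - 1] + frac * (t[i] - t[i - 1])
    raise ValueError("waveform never crosses the requested level")


def propagation_delay(t, vin, vout, vdd):
    """Delay between the 50% points of input and output (tPHL or tPLH)."""
    t_in = crossing_time(t, vin, 0.5 * vdd, rising=vin[-1] > vin[0])
    t_out = crossing_time(t, vout, 0.5 * vdd, rising=vout[-1] > vout[0])
    return t_out - t_in


def transition_time(t, v, vdd, low=0.2, high=0.8):
    """Time between the `low` and `high` fractions of the swing (tr or tf)."""
    rising = v[-1] > v[0]
    t_lo = crossing_time(t, v, low * vdd, rising)
    t_hi = crossing_time(t, v, high * vdd, rising)
    return abs(t_hi - t_lo)


# Synthetic example: a 100 ps rising input ramp and a falling output ramp
# that starts 60 ps later (time axis in ps, purely illustrative numbers).
vdd = 1.0
t = [float(k) for k in range(201)]
vin = [vdd * min(1.0, tk / 100.0) for tk in t]
vout = [vdd * (1.0 - min(1.0, max(0.0, (tk - 60.0) / 100.0))) for tk in t]

t_phl = propagation_delay(t, vin, vout, vdd)
t_f = transition_time(t, vout, vdd)
print(f"tPHL = {t_phl:.1f} ps, tf(20-80%) = {t_f:.1f} ps")
```

Passing `low=0.1, high=0.9` gives the 10%-90% variant of the transition time. Real waveforms may be non-monotonic, in which case taking the first crossing, as done here, is only one possible convention.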
The overall gate propagation delay td equals the average of tPHL and tPLH [10]. The rise
time tr and fall time tf can be calculated simply as follows. Let CL be the total load
capacitance of an inverter (the sum of the input capacitance of the next gates, the output
capacitance of this gate, and the routing capacitance), and let Vthn and Vthp be the
threshold voltages of the NMOS and PMOS, respectively. Let Vdd be the full swing voltage,
and let βn and βp be the current gains of the NMOS and PMOS. Then the output fall time
between the 90% and 10% points can be expressed as [10]

$$
\begin{aligned}
t_f &= \frac{2C_L}{\beta_n (V_{dd}-V_{thn})^2} \int_{V_{dd}-V_{thn}}^{0.9V_{dd}} dV_{out}
     + \frac{C_L}{\beta_n (V_{dd}-V_{thn})} \int_{0.1V_{dd}}^{V_{dd}-V_{thn}}
       \frac{dV_{out}}{V_{out} - \dfrac{V_{out}^2}{2(V_{dd}-V_{thn})}} \\
    &= \frac{2C_L}{\beta_n V_{dd}(1-n)} \left[ \frac{n-0.1}{1-n} + \frac{1}{2}\ln(19-20n) \right],
\end{aligned}
\tag{1.1}
$$

where n = Vthn/Vdd. In the same way, the rise time can be expressed as [10]

$$
t_r = \frac{2C_L}{\beta_p V_{dd}(1-p)} \left[ \frac{p-0.1}{1-p} + \frac{1}{2}\ln(19-20p) \right],
\tag{1.2}
$$

where p = |Vthp|/Vdd. The relationship between tf and tr can be approximated as

$$
\frac{t_r}{t_f} \approx \frac{\beta_n}{\beta_p}.
\tag{1.3}
$$

To obtain approximately equal rise and fall times for an inverter, we need

$$
\frac{\mu_n}{\mu_p} = \frac{W_p}{W_n},
\tag{1.4}
$$

where μn and μp are the electron and hole mobilities, and Wn and Wp are the channel widths
of the NMOS and PMOS, respectively.
Generally, the channel width of the PMOS should be increased to approximately two to three
times that of the NMOS to make tr equal to tf [11]. In the performance evaluation of
digital circuits, the response speed, signal noise, and energy consumption of the circuits
are related to the gate delay and signal slopes, which are determined by the parameters td,
tr and tf. Designers must consider these effects when they try to improve the performance
of designs, such as circuit life, performance stability, working accuracy, and handling
capacity [12] [41]. The following section presents the motivations and contributions of
this research.
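As a numerical check of the closed-form expressions in Eqs. (1.1) to (1.3), the sketch below evaluates the fall and rise times for one set of device values. The function name, the validity check on the threshold ratio, and all numbers are assumptions chosen for illustration. They are not measured data from this dissertation.

```python
import math


def transition_time_10_90(c_load, beta, v_dd, v_th):
    """10%-90% output transition time from the closed form of Eq. (1.1)/(1.2).

    beta is the current gain of the transistor that drives the transition
    (NMOS for the fall time, PMOS for the rise time); v_th is its threshold.
    """
    n = abs(v_th) / v_dd
    if not 0.1 <= n <= 0.9:
        # Outside this range the saturation and linear integration regions
        # of the derivation are no longer ordered as the formula assumes.
        raise ValueError("formula assumes 0.1 <= |Vth|/Vdd <= 0.9")
    bracket = (n - 0.1) / (1.0 - n) + 0.5 * math.log(19.0 - 20.0 * n)
    return 2.0 * c_load / (beta * v_dd * (1.0 - n)) * bracket


# Hypothetical values: 50 fF load, 1.8 V supply, |Vth| = 0.45 V for both devices.
c_load = 50e-15
v_dd = 1.8
beta_n, beta_p = 200e-6, 100e-6        # A/V^2
t_f = transition_time_10_90(c_load, beta_n, v_dd, 0.45)
t_r = transition_time_10_90(c_load, beta_p, v_dd, -0.45)

print(f"tf = {t_f * 1e12:.0f} ps, tr = {t_r * 1e12:.0f} ps, tr/tf = {t_r / t_f:.2f}")
```

Because n = p in this example, the ratio tr/tf reduces exactly to βn/βp, as Eq. (1.3) states. With different threshold ratios the relationship holds only approximately.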
1.2 Dissertation Motivations and Contributions
In digital circuit design, gate delay estimation is so important and fundamental that much
work has been devoted to gate delay models. Achieving high accuracy and improving
efficiency are the two key research issues in this field. Both are difficult, however,
because a CMOS gate is composed of non-linear components.
For high accuracy, the gate delay model should consider as many as possible of the factors
that significantly affect the delay calculation. At the same time, since VLSI technology
develops rapidly, the gate delay model must also be improved to meet the requirements of
new technologies. When the gate output resistance is much larger than the interconnect
resistance, a single capacitive load equal to the total load capacitance can be used to
calculate the gate delay. Today, as the feature size of VLSIs decreases, interconnect wires
become thinner and thinner, so the interconnect resistance is much larger than before. The
capacitance seen by the gate is then clearly smaller than the total load capacitance,
because the resistance obstructs the flow of charge into the load capacitance [13] [41]. If
the total load capacitance is used directly to compute the delay, this simple model can
produce a large error of more than 50%. In this situation, the gate delay model should
therefore take the interconnect resistance effect into account and quantify it. The single
total-capacitance load of the gate delay model is replaced by a structure of two capacitors
and one resistor, called the RC-π model. The RC-π model is widely used in gate delay
calculation, since it has been found that the gate response with the RC-π load (π-load) can
be used in place of that with a general interconnect net [14] [41].
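The shielding effect described above can be illustrated with a small numerical experiment. The sketch below is a simplification, not the gate model of this dissertation: the gate is replaced by a step source behind a fixed resistance `r_d`, and the π-load is integrated with forward-Euler steps. The function name and every element value are hypothetical.

```python
import math


def t50_pi_load(r_d, c1, r_w, c2, v_dd=1.0, dt=1e-14):
    """Time for the driver output node to reach 50% of v_dd with a pi load.

    Assumed linear driver: step source v_dd behind resistance r_d.
    Load: c1 at the driver output, wire resistance r_w, c2 at the far end.
    """
    v1 = v2 = 0.0
    t = 0.0
    while v1 < 0.5 * v_dd:
        i_w = (v1 - v2) / r_w                        # current through the wire
        dv1 = dt * ((v_dd - v1) / r_d - i_w) / c1    # near-end node
        dv2 = dt * i_w / c2                          # far-end node
        v1 += dv1
        v2 += dv2
        t += dt
    return t


# Hypothetical values: 1 kOhm driver, 20 fF near, 2 kOhm wire, 80 fF far.
r_d, c1, r_w, c2 = 1e3, 20e-15, 2e3, 80e-15
t_pi = t50_pi_load(r_d, c1, r_w, c2)
t_lumped = math.log(2.0) * r_d * (c1 + c2)   # single capacitor C1 + C2
print(f"pi load: {t_pi * 1e12:.1f} ps, lumped C1+C2: {t_lumped * 1e12:.1f} ps")
```

With a wire resistance comparable to or larger than the driver resistance, the far-end capacitor charges slowly, so the driver output reaches its 50% point well before the lumped estimate predicts. As `r_w` approaches zero the two results converge, which corresponds to the case in which a single total capacitance is adequate.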
The empirical k-factor model has traditionally been used for gate delay calculation. In
this model, the output response waveform and the gate delay are pre-characterized as
functions of the input condition (tin) and the capacitive load (CL) [15]. In the empirical
method, the gate load is a single capacitive load CL, whereas the RC-π model contains two
capacitances and one resistance, so the RC-π load cannot be used directly in empirical
methods. The concept of the effective capacitance Ceff was proposed to overcome this
problem [16]. An equivalent capacitive load Ceff can be found such that the gate output
response with Ceff matches that with the original π-load. Using this Ceff in place of CL,
the gate delay and the output waveform of the gate can be obtained by the empirical method.
Various approaches have been proposed for gate delay estimation based on generating the
effective capacitance Ceff [13]-[30].
When computing gate delay, conventional models usually take a ramp waveform as the input
signal [13]-[28], [31]-[40]. VLSI systems are very complicated and contain large numbers of
gates and interconnect wires. Even when the original input is a ramp waveform, the signal
passes through many gates and sometimes long wires, and its waveform deviates more and more
from a ramp. This distorted waveform is the input signal for the delay calculation of the
later-stage gates. As a result, the gate delay evaluation has a large error when designers
simply assume a ramp in place of the actual non-ramp waveform. The input waveform effect
should therefore also be taken into account when calculating the effective capacitance for
gate delay. In this dissertation, an advanced effective capacitance model is presented that
considers both the non-ramp input and the interconnect load effects. The influence of the
non-ramp input signal is modeled as one part of the effective capacitance value. The test
results of our advanced model are very close to those of HSPICE, with an error within 4%.
Another subject of this dissertation is the charge difference problem that arises when gate
delay is computed with the Thevenin model. The Thevenin model is a very important
equivalent model in gate delay calculation; it represents each gate as a combination of the
gate driving output resistance and a step voltage source. With the increasing effect of
interconnect resistance, gate output waveforms become increasingly non-digital and can no
longer be modeled as saturated ramps. A solution to this problem is to use the Thevenin
model based on the effective capacitance Ceff concept [13], [17]. The Thevenin model also
has the advantage of being simpler than the empirical method. With the Thevenin model, the
gate delay and gate output response can be analyzed through the effective capacitance and
the input transition time. The Thevenin model is therefore widely used in cell-level delay
estimation [13], [16]-[23]. Most conventional methods for obtaining Ceff assume that the
RC-π load and the Ceff load receive the same charge from the initial output time to the 50%
output time. With the actual gate model, this condition holds. With the Thevenin model,
however, it is not accurate [23]. In other words, there is a charge difference that the
foregoing methods do not consider. The method proposed in this dissertation considers the
varying influence of the charge difference in the Thevenin model under different circuit
conditions. Test results show a relatively high accuracy, with an average error of 1.3%
relative to SPICE simulation results. Moreover, the proposed method works by modifying the
conventional charge condition for the effective capacitance and does not add much
computational cost.
To improve the efficiency of gate delay calculation, attention must be paid to reducing the
computation time and algorithmic complexity of the gate delay model. Most previous works
use iterative algorithms to ensure accuracy [7], [13], [16]-[19], [22]-[30]. As the feature
size of process technology scales down, even a single VLSI system becomes more complicated
and contains more gates, and iterative methods for computing the gate delays of such
systems become inefficient. In iterative methods, the procedure is repeated until
convergence. Generally, three or four iterations are needed, but the number of iterations
increases greatly when the initial value is poor. Furthermore, most of these methods do not
consider the convergence conditions of their algorithms, so the algorithms may fail to
converge in some cases. Moreover, when an iterative method is applied in a tight
synthesis-analysis loop of circuit delay estimation, the evaluation procedure may need to
be repeated hundreds of times for each design modification because of the gate interconnect
effect. The runtime for gate delay estimation may then become unacceptable. As an
alternative, non-iterative methods for calculating the effective capacitance have been
proposed [20], [21], [33]. However, these methods either give clearly lower accuracy [20],
[21] or use an over-simplified gate model that ignores the influence of the gate
interconnect load [33]. In this dissertation, an effective non-iterative approach for gate
delay calculation is proposed, which overcomes the low accuracy of conventional
non-iterative methods and improves on the efficiency of iterative methods. The proposed
method requires no iteration to obtain the gate delay and has an average error within 2.8%
relative to SPICE results. With its relatively high efficiency and accuracy, the proposed
method is therefore suitable for circuit optimization loops.
References
[1] R. J. Baker, H. W. Li and D. E. Boyce, CMOS circuit design, layout and
simulation, IEEE press series on microelectronic systems, 1997.
[2] IBM Verification Technology Research,
https://researcher.ibm.com/researcher/view_pic.php?id=158
[3] -Aided
Design of Integrated Circuits and Systems, vol. 25, no. 9, pp. 1876-1885, Sept.
2006.
[4]
-Aided Design of Integrated Circuits and
Systems, vol. 9, no. 4, pp. 352-366, Apr. 1990.
[5]
-Aided Design of Integrated Circuits
and Systems, vol. 13, no. 6, pp. 763-776, June 1994.
[6]
-Aided
Design of Integrated Circuits and Systems, vol. 14, no. 5, pp. 639-649, May
1995.
[7]
Fundamentals, vol. E-88A, no. 10, pp. 2562-2569, Oct. 2005.
[8] A. Nabavi- s of CMOS gates for supply
-Aided Design of
Integrated Circuits and Systems, vol. 13, no. 10, pp. 1271-1279, Oct. 1994.
[9]
response and propagation delay evaluation of the CMOS inverter for
short- -State Circuit, vol. 33, no. 2, pp. 302-306,
Feb. 1998.
[10] N. H. E. Weste and K. Eshraghian, Principles of CMOS VLSI Design: A
System Perspective, New York: McGraw-Hill, pp. 183-191, 1992.
[11] -connected MOSFET
-State Circuit, vol. 26, no. 2, pp. 122-131, Feb. 1991.
[12]
r
Trans. on Very Large Scale Integration (VLSI) Systems, vol. 13, no. 10, pp.
1113-1126, Oct. 2005.
[13]
precharacterized
Computer-Aided Design of Integrated Circuits and Systems, vol. 15, no. 5, pp.
544- 553, May 1996.
[14] -characteristic of
resistive interconnect for accurat
Conference on Computer-Aided Design, pp. 512-515, Nov. 1989.
[15] N. H. E. Weste and K. Eshraghian, Principles of CMOS VLSI Design:
Empirical Delay Models, 2nd ed. Reading, MA: Addison-Wesley, pp. 213,
1992.
[16] J.
-Aided Design of
Integrated Circuits and Systems, vol.13, no.12, pp. 1526-1535, Dec. 1994.
[17] F. Dartu, N. Menezes, J. Qian, -delay model for high
pp. 576-580, June 1994.
[18]
RC interconnect in VDSM technology,
Design Automation Conference, pp. 43-48, Jan. 2003.
[19]
Computer-Aided Design, pp. 224-229, Oct. 1997.
[20]
on VLSI Design, pp. 578-582, Jan. 1999.
[21] thms for computing effective
pp. 147-151, April 1998.
[22]
E Design Automation
Conference, pp. 866- 869, June 2002.
[23]
International Conference on Communications, Circuits and Systems, pp.
2474-2477, June 2006.
[24]
pp. 2795-2798, May 2005.
[25] S. Mei, J. Kawa, C. Chiang, and Y. I. Ismail,
International Workshop on System-on-Chip for Real-Time Applications, pp.
99-104, July 2004.
[26] method
for calculating the effective capacitance with RC loads based on the Thevenin
-A, no.10, pp. 2531-2539, Oct.
2009.
[27]
delay
Design, pp. 296-300, Mar. 2001.
[28]
Trans. Fundamentals, vol. E88-A, no. 12, pp. 3367-3374, Dec. 2005.
[29]
IEEE International Conference on Communications, Circuits and Systems, pp.
1221-1225, May 2008.
[30]
effective capacitance model for calculating gate delay considering input
-639,
Oct. 2008.
[31] -form RC and RLC delay models
2001-2010, Sept. 2007.
[32]
due to random-
Solid-State Circuit, vol. 40, no. 9, pp. 1787-1796, Sept. 2005.
[33] M. Shao, M. D. F.Wong, H. Cao, Y. Gao, L. -P. Yuan, L. -D. Huang, and S.
International Symposium on Physical Design, pp. 32-38, Apr. 2003.
[34] Z. Huang, A. Kurokawa, Y. Yang, H. Yu, and Y. Inoue, "Modeling the
influence of input-to-output coupling capacitance on CMOS inverter delay,"
IEICE Trans. on Fundamentals, vol. E89-A, No. 4, pp. 840-846, Apr. 2006.
[35] Z. Huang, A. Kurokawa, M. Hashimoto, T. Sato, M. Jiang, and Y. Inoue,
-Aided Design of
Integrated Circuits and Systems, vol. 29, no. 2, pp. 250-260, Feb. 2010.
[36] A. Kabbani, D. Al-Khalili, and A. J. Al-
Circuits Devices Syst., vol. 152, no. 5, pp. 433-440, Oct. 2005.
[37]
fins and geometry aspect ratio of 16-nm multi-
1st Asia Symposium on Quality Electronic Design, pp. 122-125, Aug. 2009.
[38] A. Kabbani, D. Al-Khalili, and A. J. Al-
-Aided
Design of Integrated Circuits and Systems, vol. 24, no. 6, pp. 937-947, June
2005.
[39] P. Maurine, M. Rezzoug, N. Azemard, and
-Aided
Design of Integrated Circuits and Systems, vol. 21, no. 11, pp. 1352-1363,
Nov. 2002 .
[40] -based compact delay
mo
51, no. 7, pp. 1301-1311, July 2004.
[41] Huang, Zhangcai Study on modeling, analysis and design techniques for
nonlinear circuits and systems DSpace at Waseda University, 2009.
Chapter 2
Overview of Conventional Gate Delay
Models
With the development of VLSI techniques, various gate delay models have been proposed
to meet design requirements. To aid understanding, this chapter gives an overview of
the development of gate delay models and introduces some typical conventional methods.
In general, gate delay models must be modified to address the new problems caused by
process scaling. For a gate with an interconnect load, the effects of both the CMOS
gate and its interconnect on gate delay time should be evaluated. When the gate driving
output resistance is much larger than the interconnect resistance, the effect of
interconnect resistance can be ignored. The single load capacitance for gate delay
calculation can then simply equal the sum of the total gate capacitance and the
interconnect capacitance. As interconnect dimensions shrink, the thinner and more
complicated interconnects in modern designs result in a larger load resistance that is
comparable to the output resistance of a gate. In this situation, the interconnect
resistance has a significant effect on gate delay calculation, because it shields some
of the interconnect capacitance. With this change, the general RC tree load is replaced
by a π-load to reflect the effect of interconnect resistance. In order to calculate
gate delay with the empirical method, which has only a single load capacitance, methods
of reducing the π-load to a single effective capacitance appeared [41].
The following sections present, in turn, the empirical gate delay model, the RC-π load,
the effective capacitance concept, and the equivalent gate models that are commonly
used in gate delay calculation. Methods that aim to improve model efficiency are also
introduced.
2.1 The Empirical Method for Gate Delay
In order to shorten the design time, digital systems are usually designed at the gate
or cell level. In contrast to designing at the transistor level, the performance of the
gates and cells can be pre-characterized to significantly speed up circuit performance
analysis. Specifically, gate delay is pre-characterized for static timing analysis, and
short-circuit power dissipation (the power dissipated during the short period of the
operation cycle in which the p-channel and n-channel transistors are simultaneously on)
is modeled empirically for circuit power analysis [1]. In the empirical method for gate
delay, the delay values are obtained from a two-dimensional lookup table indexed by
input transition time and load capacitance, as shown in Fig. 2.1. In this table, the
gate load is a purely capacitive load that consists of the gate capacitance and the
interconnect capacitance. In gate delay estimation, the gate output signal is the input
of the next-stage gate. Therefore, the output transition time also needs to be obtained
from this kind of table.
Figure 2.1 Two dimensional lookup table for gate delay.
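As an illustration of how such a table is typically queried, the following is a minimal
Python sketch that interpolates bilinearly between the four surrounding table entries.
The function name, axis values, and delay values are hypothetical and are not taken
from any characterized library; points outside the table are linearly extrapolated from
the nearest cell.

```python
from bisect import bisect_right

def lookup_delay(t_in, c_load, t_axis, c_axis, table):
    """Bilinear interpolation; table[i][j] is the delay at (t_axis[i], c_axis[j])."""
    def lower_index(axis, x):
        # Lower grid index of the cell containing x, clamped to the table range.
        return max(0, min(bisect_right(axis, x) - 1, len(axis) - 2))

    i, j = lower_index(t_axis, t_in), lower_index(c_axis, c_load)
    # Normalized position of the query point inside the selected cell.
    u = (t_in - t_axis[i]) / (t_axis[i + 1] - t_axis[i])
    v = (c_load - c_axis[j]) / (c_axis[j + 1] - c_axis[j])
    return ((1 - u) * (1 - v) * table[i][j] + (1 - u) * v * table[i][j + 1]
            + u * (1 - v) * table[i + 1][j] + u * v * table[i + 1][j + 1])

# Hypothetical characterization data: transition times in ps, loads in fF, delays in ps.
T_AXIS = [20.0, 100.0, 400.0]
C_AXIS = [1.0, 10.0, 50.0]
DELAY = [[15.0, 40.0, 150.0],
         [25.0, 52.0, 165.0],
         [60.0, 90.0, 210.0]]

print(lookup_delay(60.0, 5.5, T_AXIS, C_AXIS, DELAY))
```

A second table of the same shape would be used in the same way for the output
transition time.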
In the lookup table, when the load capacitance is constant, the gate delay time
increases as the input transition time increases. The relationship between gate delay
td and input transition time tin can be approximated as linear. Figure 2.2 shows this
linear relationship between td and tin in actual cases. When the gate sizes and load
capacitance are fixed, the gate delay time varies almost as a straight line with the
input transition time. Similarly, the gate delay td also has an approximately linear
relationship with the load capacitance CL, as shown in Fig. 2.3.
Figure 2.2 Actual cases of the relationship between td and tin.
Figure 2.3 Actual cases of the relationship between td and CL.
                      Load capacitance
    Input transition    CL1    CL2
          tin1          td1    td2
          tin2          td3    td4
    (table entries are delay times)
Figure 2.4 Determine the coefficients of k-factor equation with lookup table.
With the approximately linear relationships, the lookup table can be generalized to an
equation (often called the k-factor equation because the polynomial coefficients are
the k's [2])
\[ t_d = k(t_{in}, C_L) = k_1 + k_2 C_L + k_3 t_{in} + k_4 C_L t_{in}, \tag{2.1} \]
where k1~k4 are the coefficients. We can use the values of td, tin, and CL in the
lookup table to determine these coefficients, as shown in Fig. 2.4. In this figure,
td1~td4 are the gate delay times obtained for the different values of input transition
time (tin1, tin2) and load capacitance (CL1, CL2). The coefficients are then obtained
from the following equations
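\[
\begin{aligned}
k_1 &= (t_{d4} C_{L1} t_{in1} - t_{d3} C_{L2} t_{in1} - t_{d2} C_{L1} t_{in2} + t_{d1} C_{L2} t_{in2}) / W, \\
k_2 &= (t_{d3} t_{in1} - t_{d4} t_{in1} - t_{d1} t_{in2} + t_{d2} t_{in2}) / W, \\
k_3 &= (t_{d2} C_{L1} - t_{d4} C_{L1} - t_{d1} C_{L2} + t_{d3} C_{L2}) / W, \\
k_4 &= (t_{d1} - t_{d2} - t_{d3} + t_{d4}) / W,
\end{aligned}
\tag{2.2}
\]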
where
\[ W = (C_{L1} - C_{L2})(t_{in1} - t_{in2}). \tag{2.3} \]
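The coefficient formulas can be checked numerically. The sketch below assumes the
corner convention td1 = td(tin1, CL1), td2 = td(tin1, CL2), td3 = td(tin2, CL1), and
td4 = td(tin2, CL2), and uses hypothetical table values. It computes k1~k4 and
evaluates the k-factor equation; the function names are illustrative only.

```python
def k_factor_coefficients(t_in1, t_in2, c_l1, c_l2, t_d1, t_d2, t_d3, t_d4):
    """Coefficients of t_d = k1 + k2*C_L + k3*t_in + k4*C_L*t_in from four table entries.

    Assumed corner convention: t_d1 at (t_in1, C_L1), t_d2 at (t_in1, C_L2),
    t_d3 at (t_in2, C_L1), t_d4 at (t_in2, C_L2).
    """
    w = (c_l1 - c_l2) * (t_in1 - t_in2)
    k1 = (t_d4 * c_l1 * t_in1 - t_d3 * c_l2 * t_in1
          - t_d2 * c_l1 * t_in2 + t_d1 * c_l2 * t_in2) / w
    k2 = (t_d3 * t_in1 - t_d4 * t_in1 - t_d1 * t_in2 + t_d2 * t_in2) / w
    k3 = (t_d2 * c_l1 - t_d4 * c_l1 - t_d1 * c_l2 + t_d3 * c_l2) / w
    k4 = (t_d1 - t_d2 - t_d3 + t_d4) / w
    return k1, k2, k3, k4

def k_factor_delay(k, t_in, c_l):
    """Evaluate the k-factor equation for one (t_in, C_L) point."""
    k1, k2, k3, k4 = k
    return k1 + k2 * c_l + k3 * t_in + k4 * c_l * t_in

# Hypothetical table corner values (ps and fF), used only to exercise the formulas.
K = k_factor_coefficients(20.0, 100.0, 1.0, 10.0, 15.0, 40.0, 25.0, 52.0)
print(k_factor_delay(K, 60.0, 5.5))
```

By construction, the fitted equation reproduces the four table entries exactly and
interpolates bilinearly between them.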
2.2 The RC-π Model for Gate Delay
In the empirical gate delay model, the gate load is a purely capacitive load CL whose
value equals the sum of all gate capacitance and interconnect capacitance, also called
the total capacitance Ctot. Figure 2.5 shows an example of a gate driving general
loads. The gate loads include many fan-out gates and interconnect wires. These
interconnects have not only capacitance but also resistance. When the output resistance
of the driving gate dominates the resistance of the interconnect loads, the error of
using Ctot for gate delay is not significant. However, as VLSI technologies rapidly
improve, the interconnect wires become much thinner and occupy more layers. In Fig.
2.6, the interconnect wires of a 90nm CMOS process have eight layers. The wire layers
are separated from each other by insulating layers, so two neighboring wires form a
capacitor. Because the insulating layers also become thinner and the wires occupy more
layers, the interconnect capacitance stays the same or becomes larger while the
interconnect wires become thinner [3]. At the same time, the resistance of the
interconnect wires increases greatly as the wires become thinner. The interconnect
resistance Rw can be calculated as
\[ R_w = \rho \, \frac{l}{w \, t}, \tag{2.4} \]
Figure 2.5 Example of general gate loads model.
Figure 2.6 Interconnect wires in a 90nm CMOS process.
where ρ is the resistivity, w is the interconnect width, t is the interconnect
thickness, and l is the interconnect length. The resistivity ρ is a constant physical
parameter of the metal wires. As the interconnect width and thickness become smaller,
the interconnect resistance becomes larger for the same length. In modern VLSI designs,
the interconnect resistance can easily reach hundreds or even thousands of ohms. For
example, in a 45nm CMOS process, given the resistivity ρ (in Ω-cm) and a local wire
with w = 0.054 µm and t = 0.0972 µm [4], the resistance of the local wire is 419 Ω. In
this situation, the resistance of the interconnect loads is comparable to or larger
than the output resistance of the driving gate. The interconnect resistance then has a
larger influence: it shields some of the load capacitance from the driver, particularly
on long interconnects such as clock or bus lines [5]. Therefore, using the total
capacitance Ctot causes the gate delay to be overestimated.
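The calculation in Eq. (2.4) can be sketched in a few lines of Python. The width and
thickness are the values quoted above; the resistivity (2.2 µΩ-cm) and the wire length
(100 µm) are assumptions chosen for illustration, since their values are not
recoverable from the text. With these assumed values the result is close to the quoted
419 Ω.

```python
def wire_resistance(rho_ohm_m, length_m, width_m, thickness_m):
    """Interconnect resistance R_w = rho * l / (w * t), all quantities in SI units."""
    return rho_ohm_m * length_m / (width_m * thickness_m)

RHO = 2.2e-8          # assumed resistivity: 2.2 micro-ohm-cm expressed in ohm-m
LENGTH = 100e-6       # assumed wire length: 100 um
WIDTH = 0.054e-6      # local wire width from the text: 0.054 um
THICKNESS = 0.0972e-6 # local wire thickness from the text: 0.0972 um

R_W = wire_resistance(RHO, LENGTH, WIDTH, THICKNESS)
print(f"R_w = {R_W:.1f} ohm")
```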
In order to model the interconnect admittance at the gate output accurately, the
authors of [6] presented a one-segment RC-π model, shown in Fig. 2.7 (c). The three
components of the RC-π load (π-load) are used to match the first three moments of the
gate driving-point admittance. The output waveform of a gate with the RC-π load is
reasonably close to that of a gate with the actual RC tree load. The RC-π load consists
of one resistance R and two capacitances C1 and C2. The sum of C1 and C2 in the π-load
always equals the total value of the gate capacitance and interconnect capacitance.
In general, the load of a gate contains many modules such as interconnects and logic
gates. This kind of general load is called the RC tree load. Figure 2.7 shows the
procedure of reducing the RC tree load to the much simpler RC-π load. In the s domain,
the pulse input Vin(t) can be expressed as Vin(s) = 1. Meanwhile, the corresponding
current through the voltage source is I(s) [6] [41]. Then, the moments of the
admittance Y(s) at the input can be obtained as [6] [41]
Figure 2.7 The RC-π model of the driving-point admittance of a general RC tree model.
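\[ Y(s) = \frac{I(s)}{V_{in}(s)} = y_0 + y_1 s + y_2 s^2 + y_3 s^3 + \cdots. \tag{2.5} \]
The parameters y0, y1, y2, ... are the coefficients of the s-domain expansion. For an RC
tree load there is no DC path to ground, so y0 = 0 and the first three nonzero moments
are y1, y2, and y3. Figure 2.7 (c) shows the gate with an RC-π load, which is used to
simplify the gate delay model. The admittance of the RC-π approximation can be
expressed as [6]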
\[ Y_{pi}(s) = sC_1 + \frac{sC_2}{1 + sRC_2}
   = (C_1 + C_2)\,s + \sum_{i=2}^{\infty} (-1)^{i-1} R^{\,i-1} C_2^{\,i} s^{i}. \tag{2.6} \]
When the moments of Ypi are matched to those of Y(s), the parameters (C1, C2, and R) of
the RC-π load are obtained from the following three equations [6] [41]
\[ C_1 = y_1 - \frac{y_2^2}{y_3}, \tag{2.7} \]
\[ C_2 = \frac{y_2^2}{y_3}, \tag{2.8} \]
\[ R = -\frac{y_3^2}{y_2^3}. \tag{2.9} \]
In actual cases, because the resistance and capacitance values of the gate load are
positive, the coefficients y1, y2, and y3 have the following characteristics
\[ y_1 > 0, \quad y_2 < 0, \quad y_3 > 0, \quad y_1 y_3 - y_2^2 > 0. \tag{2.10} \]
With the RC-π model, the general RC tree load of the gate is reduced to a three-element
load. Meanwhile, the resistance effect is included in the gate delay calculation. Since
the RC-π model has proven accurate, this dissertation also uses it to approximate the
interconnect load of the gate, as in the conventional methods [1], [5], [7]-[13].
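The moment-matching relations of Eqs. (2.7)-(2.9) can be sketched as a round trip in
Python: the first three moments of a known π-load are computed from the series
expansion of Eq. (2.6), and the π-load is then recovered from those moments. The
function names and the element values are hypothetical.

```python
def pi_moments(c1, r, c2):
    """First three admittance moments of a pi-load: Y(s) = y1*s + y2*s^2 + y3*s^3 + ..."""
    return c1 + c2, -r * c2 ** 2, r ** 2 * c2 ** 3

def pi_from_moments(y1, y2, y3):
    """Recover (C1, R, C2) of the pi-load from the moments, following Eqs. (2.7)-(2.9)."""
    c2 = y2 ** 2 / y3
    c1 = y1 - c2
    r = -y3 ** 2 / y2 ** 3
    return c1, r, c2

# Hypothetical pi-load: C1 = 20 fF, R = 500 ohm, C2 = 80 fF.
Y1, Y2, Y3 = pi_moments(20e-15, 500.0, 80e-15)
C1_REC, R_REC, C2_REC = pi_from_moments(Y1, Y2, Y3)
print(C1_REC, R_REC, C2_REC)
```

For a real RC tree, y1, y2, and y3 would come from the moments of the tree's
driving-point admittance rather than from a known π-load; the sign conditions of
Eq. (2.10) guarantee that the recovered elements are positive.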
2.3 Effective Capacitance Model for Gate Delay
In the RC-π model, the gate load has three parameters. If we use a lookup table for
gate delay as in the empirical method, such tables need four indexes (tin, C1, R, and
C2), which is costly in terms of computer memory and computation. As the π-load is not
suitable for the empirical model, a method of reducing the π-load to a single
capacitive load is presented in [5].
To develop an accurate model for computing the effective capacitance that includes the
effect of load resistance, the RC-π load is converted into a pure capacitance that
results in the same gate delay time as the original load. In [5], this is done by
requiring that the pure capacitance load and the RC-π load have the same average
current (and therefore the same total charge transfer). Figure 2.8 shows a logic gate
driving a π-load and an equivalent Ceff, respectively. Figure 2.9 shows SPICE
simulation results of the gate response when a pulse input is applied to the two
structures in Fig. 2.8. The output waveform for the RC-π load is very close to that for
the Ceff load from the initial time to the 50% point, and the two curves intersect at
the 50% point, at time t = t50.
Figure 2.8 Gate with the π-load and the equivalent load Ceff.
Figure 2.9 Signal shape of gate response when the load is Ceff type and RC-π type
respectively.
Referring to Fig. 2.8, we can equate the average currents for the waveform Vout(t) up
to the 50% time point t50 [5], [7]. Thus we get
\[ \frac{1}{t_{50}} \int_0^{t_{50}} I_{C_{eff}}(t)\,dt
   = \frac{1}{t_{50}} \int_0^{t_{50}} I_{\pi}(t)\,dt. \tag{2.11} \]
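In [7] [14], CMOS gates are modeled using a combination of quadratic and linear
functions for the output waveform. Following this reasoning, [7] uses the following
wave-shape assumption:
\[ V_{out}(t) = \begin{cases}
   V_i - c\,t^2, & 0 \le t \le t_{20}, \\
   a + b\,(t - t_{20}), & t_{20} \le t \le t_{50}.
   \end{cases} \tag{2.12} \]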
Starting at the initial voltage Vi, the wave-shape is quadratic up to the 20% point t20
and linear from t20 to t50. The constants a, b, and c are determined by the quantities
that must be solved for when computing the gate delay time. One simplifying assumption
is that the voltage waveform and its first derivative are continuous at t20 [7], thus
\[ a = V_i - c\,t_{20}^2, \qquad b = -2\,c\,t_{20}. \tag{2.13} \]
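Then with the assumption of waveform shape, the average current of the effective
capacitance is
\[ \bar{I}_{C_{eff}} = \frac{1}{t_{50}} \left( \int_0^{t_{20}} 2 C_{eff}\, c\, t \, dt
   + \int_{t_{20}}^{t_{50}} 2 C_{eff}\, c\, t_{20} \, dt \right)
   = \frac{2 C_{eff}\, c\, t_{20}}{t_{50}} \left( t_{50} - \frac{t_{20}}{2} \right). \tag{2.14} \]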
Similarly, the average current of capacitance C1 in the RC-π model is given by [7]
[14]
\[ \bar{I}_{C_1} = \frac{2 C_1\, c\, t_{20}}{t_{50}} \left( t_{50} - \frac{t_{20}}{2} \right). \tag{2.15} \]
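At the same time, the average current in C2 for the interval (0, t50) is
\[ \bar{I}_{C_2} = \frac{2\, c\, C_2\, t_{20}}{t_{50}} \left[ t_{50} - \frac{t_{20}}{2} - R C_2
   + \frac{(R C_2)^2}{t_{20}}\, e^{-\frac{t_{50} - t_{20}}{R C_2}}
   \left( 1 - e^{-\frac{t_{20}}{R C_2}} \right) \right]. \tag{2.16} \]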
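Then the effective capacitance can be solved by equating the sum of Eq. (2.15) and
Eq. (2.16) to Eq. (2.14) [7]:
\[ C_{eff} = C_1 + C_2 \left[ 1 - \frac{R C_2}{t_{50} - \frac{t_{20}}{2}}
   + \frac{(R C_2)^2}{t_{20} \left( t_{50} - \frac{t_{20}}{2} \right)}\,
   e^{-\frac{t_{50} - t_{20}}{R C_2}} \left( 1 - e^{-\frac{t_{20}}{R C_2}} \right) \right]. \tag{2.17} \]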
Consistent with physical behavior, the value of Ceff lies between C1 and C1+C2 and is
determined by the parameters t50, t20, and R. When R is zero, the value of Ceff is
equal to C1+C2, and when R is infinite, the value of Ceff equals C1 [7]. With this
method, the effective capacitance of an RC-π load can be easily obtained. We can then
use this effective capacitance instead of the load capacitance CL in the empirical
method for gate delay.
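To make the behavior of Eq. (2.17) concrete, the following Python sketch evaluates it
for given t50 and t20 and checks the two limiting cases described above. The function
name and all numerical values are hypothetical; in an actual flow t50 and t20 depend on
Ceff themselves, so this evaluation would sit inside an iteration.

```python
import math

def effective_capacitance(c1, c2, r, t50, t20):
    """Evaluate Eq. (2.17) for a pi-load (C1, R, C2) and given time points t50, t20."""
    if r == 0.0:
        return c1 + c2  # no resistive shielding: the driver sees the total capacitance
    tau = r * c2
    d = t50 - t20 / 2.0
    # -expm1(-x) equals 1 - exp(-x) but keeps precision when x is small (large R).
    shielding = (1.0 - tau / d
                 + tau ** 2 / (t20 * d)
                 * math.exp(-(t50 - t20) / tau)
                 * -math.expm1(-t20 / tau))
    return c1 + c2 * shielding

# Hypothetical values: C1 = 20 fF, C2 = 80 fF, t50 = 40 ps, t20 = 16 ps.
C1, C2, T50, T20 = 20e-15, 80e-15, 40e-12, 16e-12
for r in (1e-3, 100.0, 500.0, 1e7):
    print(r, effective_capacitance(C1, C2, r, T50, T20))
```

With these values the result moves from nearly C1+C2 for a very small R toward C1 for
a very large R, which matches the limits stated for Eq. (2.17).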
2.4 Equivalent Gate Model for Gate Delay
In order to simplify the gate delay model, a switch-resistor model, also called the
Thevenin model, was presented. The Thevenin model has been shown to be convenient for
gate delay evaluation with a general RC load [1] [16]. Thus the Thevenin model is
widely used in cell level gate delay calculation. Figure 2.10 (b) shows the structure
of the switch-resistor model with a π-load. In the switch-resistor model, each logic
gate is converted into a voltage source Vin(t) and the gate resistance Rd [16]. Thus,
the nonlinear gate is replaced by linear components.
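Figure 2.10 (a) Timing analysis with RC-π load; (b) switch-resistor model with π-load;
(c) switch-resistor model with the RC-π load replaced by Ceff.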
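The net M voltage in Fig. 2.10 (b) and the net N voltage in Fig. 2.10 (c) can be
expressed as [12]
\[ V_M(t) = \begin{cases}
   \dfrac{V_{dd}}{t_{in}} \left[ t - B + A\, e^{-\alpha t} \cosh(\beta t + \varphi) \right],
     & 0 \le t \le t_{in}, \\[2ex]
   V_{dd} + \dfrac{V_{dd}}{t_{in}}\, A \left[ e^{-\alpha t} \cosh(\beta t + \varphi)
     - e^{-\alpha (t - t_{in})} \cosh\!\big(\beta (t - t_{in}) + \varphi\big) \right],
     & t > t_{in},
   \end{cases} \tag{2.18} \]
\[ V_N(t) = \begin{cases}
   \dfrac{V_{dd}}{t_{in}} \left[ t - R_d C_{eff} + R_d C_{eff}\, e^{-\frac{t}{R_d C_{eff}}} \right],
     & 0 \le t \le t_{in}, \\[2ex]
   V_{dd} + \dfrac{V_{dd}}{t_{in}}\, R_d C_{eff} \left[ e^{-\frac{t}{R_d C_{eff}}}
     - e^{-\frac{t - t_{in}}{R_d C_{eff}}} \right],
     & t > t_{in},
   \end{cases} \tag{2.19} \]
where
\[ \varphi = \tanh^{-1} \frac{R_d (C_1 + C_2)^2 + R C_2 (C_2 - C_1)}
   {(C_1 + C_2) \sqrt{R_d^2 (C_1 + C_2)^2 + 2 R_d R C_2 (C_2 - C_1) + R^2 C_2^2}}, \tag{2.20} \]
\[ A = \frac{R_d (C_1 + C_2)}{\cosh \varphi}, \quad B = R_d (C_1 + C_2), \quad
   \alpha = \frac{R_d (C_1 + C_2) + R C_2}{2 R_d R C_1 C_2}, \tag{2.21} \]
\[ \beta = \frac{\sqrt{R_d^2 (C_1 + C_2)^2 + R^2 C_2^2 + 2 R_d R C_2 (C_2 - C_1)}}
   {2 R_d R C_1 C_2}. \tag{2.22} \]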
An accurate approximation of the effective capacitance can be obtained by minimizing
the error between VM(t) and VN(t) from 0.2Vdd to 0.8Vdd (Vdd is the full-swing
voltage). Then we have
\[ \frac{\partial}{\partial C_{eff}} \int_{0.2V_{dd}}^{0.8V_{dd}} (V_M - V_N)^2 \, dV
   = -2 \int_{0.2V_{dd}}^{0.8V_{dd}} (V_M - V_N)\, \frac{\partial V_N}{\partial C_{eff}} \, dV = 0. \tag{2.23} \]
Finally, we obtain [12]
1 2
1
1
1
d eff
d eff
t
eff eff d t
R C
d
t
t
R C
Ae Cosh t B
C f C ,R ,t
R e
Cosh t
e
Cosh
C C .
e
(2.24)
With this effective capacitance equation, we can use an iterative procedure to calculate
the approximate value of Ceff. The gate delay time is obtained once the value of Ceff
has converged during the procedure.
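The iterative procedure can be sketched as a plain fixed-point loop in Python. The
update function used here is a hypothetical stand-in chosen only so that the example
runs; it is not Eq. (2.24). The element values, the 0.69 factor, and the tolerance are
likewise assumptions.

```python
def iterate_ceff(update, c_init, rel_tol=1e-6, max_iter=50):
    """Repeat c <- update(c) until the relative change falls below rel_tol.

    Returns (converged value, number of iterations) and raises if no convergence.
    """
    c = c_init
    for n in range(1, max_iter + 1):
        c_next = update(c)
        if abs(c_next - c) <= rel_tol * abs(c_next):
            return c_next, n
        c = c_next
    raise RuntimeError("Ceff iteration did not converge")

# Hypothetical pi-load and driver resistance.
C1, C2, R, RD = 20e-15, 80e-15, 500.0, 1000.0

def toy_update(c_eff):
    """Stand-in for an effective capacitance equation (NOT Eq. (2.24))."""
    t50 = 0.69 * RD * c_eff                  # crude single-pole estimate of the 50% time
    return C1 + C2 * t50 / (t50 + R * C2)    # less of C2 is seen when t50 is short

C_EFF, N_ITER = iterate_ceff(toy_update, C1 + C2)
print(C_EFF, N_ITER)
```

Starting from the total capacitance C1+C2 is a common choice of initial value; the
explicit iteration cap makes non-convergence visible instead of looping indefinitely.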
Besides the Thevenin model, the authors of [13] proposed an equivalent gate model
called the current source cell model for VDSM (Very Deep Sub-Micron) delay calculation.
In contrast to the Thevenin model, the current source cell model represents each gate
as a current source Ig, the gate driving output resistance Rg, and the equivalent gate
parasitic capacitance Cg connected in parallel, as shown in Fig. 2.11. As the
interconnect resistance has a larger influence on gate delay, the current source is
replaced by a time-varying source in order to improve the accuracy of the gate delay
model. However, both the time-varying current source and the parallel resistor Rg must
be derived iteratively, which makes the calculation procedure much more complicated.
Figure 2.11 Current source cell model for gate delay.
2.5 Efficiency Improved Gate Delay Model
In gate delay calculation, improving efficiency is one of the main topics. It is also a
difficult task because of the nonlinear behavior of the CMOS gate. The problem is that
accuracy and efficiency always trade off against each other in a gate delay model. In
[9] and [10], the authors proposed an iteration-less effective capacitance model for
gate delay. Using the voltage at the output pin of the gate or cell, this model gives a
fast, non-iterative method for calculating the effective capacitance that matches the
output waveform in the range from 0.3Vdd to 0.6Vdd. Nevertheless, this method gives
results of obviously lower accuracy, with errors that may exceed 15%. Besides, the
non-iterative method in [15] uses an over-simplified gate model that does not consider
the effect of gate interconnects.
In order to improve the efficiency of gate delay calculation without much loss of
accuracy, an accurate low-iteration algorithm for effective capacitance computation was
proposed in [11]. Figure 2.12 shows the gate load model with RC interconnect that is
used in [11], together with the relevant parameters.
Figure 2.12 Gate load model with RC interconnect [11].
As the gate delay equals t50 - tin/2, the delay time td can be obtained when t50 is
known. Denote by A the product RdRC1C2 and by B the sum RdC1+RdC2+RC2. Then a simple
algorithm for calculating time points is
32
50 1 2
1 1
1 0746 0 2928 0 0911,M
mmt . m . . ,
m m
(2.25)
where
1 2
2
2
2 1
3 2
2
3 2 1
2
6 2
24 6
f
f f
f f
t
m B RC
t t RC
m A Bm .
t t RC
m Bm Am
(2.26)
The prediction can be as much as 15% off the theoretical values. Thus it cannot be
used directly to predict the 50% time point of VM(t). With the following modification,
the accurate time point can be expressed as [11]
2
50 50 50 50
5050
50
2,M ,M ,M ,MM M M M dd
,M,M
,MM
V t V t V t V t V
t t .
V t
(2.27)
At the same time the output voltage at net M can be written as
2
2
1
2
2
1
1 0
1 1
i
i fi
t
dd i i f
i f
M
tt
dd i i f
i
tV k RC e c t t
t
V t ,
V k RC e e t t
(2.28)
where 1 and 1 are the two roots of quadratic equation 2+ +1=0, and
2
1
1 2 1
1
2
2 1 2
2
1 21
f
f
f
k ,
t
k ,
t
RCc k k .
t
(2.29)
In Eq. (2.27), the parameter tf can be obtained from a k-factor equation of the form
\[ C_{eff} = k_{C_{eff}}(t_f, t_{50,N}). \tag{2.30} \]
Because the effective capacitance load results in the same 50% output time as the
RC-π load, t50,M = t50,N. Therefore, with an initial value of Ceff, we can obtain the tf
that is used to calculate t50,M with Eq. (2.27). Then t50,M is substituted into
Eq. (2.30) to update tf until the value converges. This procedure usually needs one or
two iterations, fewer than the other iterative methods in previous works. The algorithm
produces gate delays with a 4% average error relative to SPICE results.
References
[1] mance computation for
Computer-Aided Design of Integrated Circuits and Systems, vol. 15, no. 5, pp.
544- 553, May 1996.
[2] N. H. E. Weste and K. Eshraghian, Principles of CMOS VLSI Design:
Empirical Delay Models, 2nd ed. Reading, MA: Addison-Wesley, pp. 213,
1992.
[3] A. Kurokawa, H. Masuda, J. Fujii, T. Inoshita, A. Kasebe, Z. Huang, and Y
Determination of interconnect structural parameters for best- and
worst-case delays CE Trans. Fundamentals, vol. E89-A, no. 4, pp.
856-864, April 2006.
[4] International technology roadmap for semiconductors 2003: Semiconductor
Industry Association.
[5] -interconnect
effects in a hierarchical timing
Circuits Conference, pp. 15.6.1-15.6.4, May 1992.
[6] -characteristic of
Conference on Computer-Aided Design, pp. 512-515, Nov. 1989.
[7]
-Aided Design of
Integrated Circuits and Systems, vol.13, no. 12, pp. 1526-1535, Dec. 1994.
[8]
Design, pp. 296-300, Mar. 2001.
[9] capacitance computations for
on VLSI Design, pp. 578-582, Jan. 1999.
[10]
nternational Symposium on Physical Design,
pp. 147-151, April 1998.
[11]
International Workshop on System-on-Chip for Real-Time Applications, pp.
99-104, July 2004.
[12]
Design Automation Conference, pp. 43-48, Jan. 2003.
[13] A. Korshak and J. Lee An effective current source cell model for VDSM
delay calculation 2001 International Symposium on Quality Electronic
Design, pp. 296-300, Mar. 2001.
[14]
transistor m
IEEE Int. Conf. Computer-Aided Design, pp. 546-549, Nov. 1990.
[15] M. Shao, M. D. F. Wong, H. Cao, L. P. Yuan, L. D. Huang, and S. Lee,
Explicit gate delay model for timing evaluation Proc. of ACM/IEEE
International Symposium on Physical Design, pp. 32-38, Apr. 2003.
[16] S. Fang, Z. Hua Calculating the effective
capacitance Proc.
International Conference on Communications, Circuits and Systems, pp.
2474-2477, June 2006.
Chapter 3
Effective Capacitance Model for Gate
Delay Considering Input Waveform
Effect
In this chapter, the proposed method, which considers both the input waveform effect
and the interconnect effect, is described in detail. Predicting gate delay time is an
important task for Static Timing Analysis in very deep submicron designs.
The gate delay with interconnect effect is usually calculated based on the effective
capacitance Ceff [12]. Conventionally, the input signal to the gate is always assumed
to be a ramp waveform. However, the input signal is also the output of the previous
gate and is not a ramp waveform. Thus the simple ramp assumption introduces a
significant error into the gate delay calculation.
In this chapter, an advanced effective capacitance model is proposed to consider both
the input waveform effect and the interconnect loads, where the nonlinear influence of
the input waveform is modeled as one part of the effective capacitance for calculating
the gate delay. In the proposed model, the difference between the input charge with a
ramp input waveform and that with a non-ramp input is analyzed. With some simple
assumptions, the influence of this difference is incorporated into the effective
capacitance.
The experimental results show that the accuracy of our model is greatly improved
because the nonlinear effect of the input waveform is considered. The proposed method
has an error of only about 3.7%, while the errors of the conventional methods in [1],
[9], and [15] are much larger.
3.1 Introduction
In the previous gate delay models, the simplest method, which sets the load equal to
the sum of interconnect capacitance, was proposed in [1], and the authors of [2] then
proposed a more accurate RC-π model. As the RC-π model has been shown to model the RC
tree load of an actual circuit with good accuracy, this simpler structure is widely
used in STA [18]. Nevertheless, the π-load contains one resistance and two capacitances
and therefore cannot match the single load CL of the empirical model [18]. To solve
this mismatch, a method of converting the π-load into a single equivalent Ceff load was
presented in [3]. Since then, many effective capacitance algorithms have been published
for calculating gate delay with interconnect loads [4]-[16].
In [4], a gate delay model was obtained by dividing the whole output waveform into two
pieces. In [5], [8], and [9], each CMOS gate was replaced by a linear resistor to
simplify the gate delay calculation. In [6], an algorithm was presented that yields an
effective capacitance for both the gate delay and the 0-0.8Vdd output transition time.
In [7], the effective capacitance was obtained from pre-simulation tables. In [10], the
authors proposed a generalized gate/cell-modeling approach for general RLC loads. In
[11], [12], the nonlinear waveform of the gate response was modeled as several linear
pieces for calculating the equivalent load Ceff of the original interconnect load. In
[14], [15], an iteration-less approach was used for effective capacitance computation,
in both the step input and ramp input regimes. In [13], [16], the authors designed a
model that approximates not only cell timing but also slew sensitivity.
All the conventional effective capacitance models account only for the influence of
interconnect loads, while the input is always assumed to be a linear waveform. However,
test results of delay evaluation show large differences between the effects of linear
and nonlinear inputs. Therefore, it is necessary to develop a method that considers the
input waveform effect. In this chapter, an advanced effective capacitance model is
proposed that considers both the input waveform effect and the interconnect wire load,
where the nonlinear influence of the input waveform is modeled as one part of the
effective capacitance used to compute the gate delay. The experimental results show
that the accuracy of our model is greatly improved when the input waveform effect is
considered.
In STA, the gate delay is defined as the time difference between the input and the
output reaching the 50% point of full swing, which is called the 50% point delay.
Figure 2.9 shows that the output waveforms with the π-load and with the equivalent Ceff
load almost coincide below the 50% point of full voltage, and the curves intersect at
the 50% point, at time t = t50. Therefore, the equivalent load Ceff is an accurate
simplification of the π-load in the delay model of ICs. In order to obtain Ceff from
the π-load, the conventional method assumes that the charge QCeff of the Ceff load is
equal to the charge Qπ of the π-load from time zero to t50 [3], [4]. Thus we get
\[ Q_{C_{eff}} = Q_{\pi}. \tag{3.1} \]
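The 50% point delay defined above can be measured from sampled waveforms, for example
from simulator output, with a small Python sketch such as the following. The function
names and the sample data are hypothetical; the crossing time is found by linear
interpolation between adjacent samples.

```python
def crossing_time(times, volts, level):
    """First time at which a sampled waveform crosses `level` (linear interpolation)."""
    samples = list(zip(times, volts))
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        # A crossing lies in this segment if the two samples straddle the level.
        if (v0 - level) * (v1 - level) <= 0.0 and v0 != v1:
            return t0 + (level - v0) * (t1 - t0) / (v1 - v0)
    raise ValueError("waveform never crosses the requested level")

def delay_50(times, v_in, v_out, vdd):
    """50% point delay: output 50% crossing time minus input 50% crossing time."""
    half = 0.5 * vdd
    return crossing_time(times, v_out, half) - crossing_time(times, v_in, half)

# Hypothetical samples: rising input and falling output of an inverting gate.
TIMES = [0.0, 1.0, 2.0, 3.0, 4.0]
V_IN = [0.0, 0.5, 1.0, 1.0, 1.0]
V_OUT = [1.0, 1.0, 0.75, 0.25, 0.0]
print(delay_50(TIMES, V_IN, V_OUT, vdd=1.0))
```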
Figure 3.1 The ramp signal goes through many gates and long interconnect wires.
Figure 3.2 The simulation results of waveform shape after the ramp input transferring
through the complex circuit.
The conventional methods for calculating the effective capacitance perform well only
when the input signal is a ramp waveform. In practice, current VLSI systems are very
complicated. Even when the original input is a ramp signal, it must pass through many
gates and sometimes very long wires, as shown in Fig. 3.1. In this figure, Vin(t) is a
ramp input voltage, and Vin'(t) is the input signal to the next-stage circuit cell, that
is, the output signal produced by Vin(t) passing through several gates and long wires.
What happens to the input voltage Vin(t) after it passes through these circuit
structures? Figure 3.2 answers this question by showing simulation results of Vin'(t) in
actual situations. It can be seen that the output signals become nonlinear curves. When
the delay of the next stage is calculated by conventional methods, Vin'(t) is
approximated by a ramp signal at the next stage's input, as shown in Fig. 3.3. In
Fig. 3.3, t20(in) and t80(in) are the times at which the input signal Vin'(t) reaches
the 20% point and the 80% point of the full swing, respectively. The points t20(in) and
t80(in) are connected by a straight line, which is extended from 0V to the full voltage
Vdd to form the waveform Vin"(t). The signal Vin"(t) is then regarded as the modeled
input signal for calculating the delay time of the next-stage circuit, where tin is the
input transition time of Vin"(t). However, this method results in a significant error,
which can be more than 20%.
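The ramp construction described above can be sketched in a few lines of Python. This is a minimal illustration and not the dissertation's implementation; the function names `fit_ramp` and `ramp_voltage` and the example time points are assumptions introduced here.

```python
def fit_ramp(t20_in, t80_in, vdd=1.0):
    """Fit the modeled ramp Vin''(t) through the 20% and 80% points of Vin'(t).

    Returns (t0, t_in): the time at which the ramp leaves 0 V and the
    0-to-Vdd transition time of the ramp.
    """
    if t80_in <= t20_in:
        raise ValueError("t80_in must be later than t20_in")
    slope = 0.6 * vdd / (t80_in - t20_in)   # the two points are 0.6*Vdd apart
    t_in = vdd / slope                      # time for a full 0 -> Vdd swing
    t0 = t20_in - 0.2 * vdd / slope         # extend the line back down to 0 V
    return t0, t_in


def ramp_voltage(t, t0, t_in, vdd=1.0):
    """Evaluate the modeled ramp Vin''(t), clamped to the range [0, Vdd]."""
    return vdd * min(max((t - t0) / t_in, 0.0), 1.0)


# Hypothetical measurement (in ps): Vin'(t) crosses 20% at 100 and 80% at 400.
t0, t_in = fit_ramp(100.0, 400.0)
print(t0, t_in)
```

Only the two threshold crossings enter the fit, so any curvature of Vin'(t) outside them is discarded, which is the source of the error discussed in the text.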
Figure 3.3 Modeling the actual input signal Vin'(t) as a ramp input Vin"(t).
Figure 3.4 Applying the modeled input Vin"(t) and actual input Vin'(t) to the same Ceff circuit.
To confirm this error, we show a simulation example using the circuits shown in
Fig. 3.4. The interconnect loads of the two circuits are the same, and Rd is the gate
driving output resistance in the Thevenin model. The waveform Vout"(t) is the output
signal obtained with the modeled ramp input, and Vout'(t) is the output signal obtained
with the actual input. The simulation result is shown in Fig. 3.5. In Fig. 3.5, the input
signals Vin'(t) and Vin"(t) are aligned so that they reach 50% of Vdd at the same time;
td1 is the delay time with the modeled input signal and td2 is the actual delay time with
the actual input signal. As shown in Fig. 3.5, td1 < td2, and the error rate
(td2 − td1)/td2 is about 16.7% in this case. Therefore, it is necessary to develop a
method to eliminate this error.
Figure 3.5 Simulation results of the gate delay time using the modeled input Vin"(t) and
actual input Vin'(t) for the same circuit respectively.
3.2 Proposed Method
In this section, a new method that addresses the input waveform effect is proposed and
described in detail.
3.2.1 Analytical Expressions
In order to convert the three components of the RC-π load into the equivalent load Ceff,
most of the conventional methods are based on Eq. (3.1). Therefore, the amount of charge
is very important in calculating the effective capacitance. Similarly, we also consider
the amount of input charge to solve the problem of the input waveform effect.
As shown in Fig. 3.5, the actual delay time td2 is larger than the modeled delay time
td1 when the same effective capacitance load is used. Therefore, when the modeled ramp
input signal Vin"(t) is used to calculate the effective capacitance, some charge
difference ΔQ should be added to the modeled input charge QVin" so that the delay time
td1 caused by the modeled input signal Vin"(t) equals the delay time td2 caused by the
actual input signal Vin'(t). Since the charge Q is directly proportional to the effective
capacitance Ceff (Q ∝ Ceff), we can add a capacitance ΔCeff to Ceff of the circuit, as
shown in Fig. 3.6, so that the two delay times are the same when the input signals are
Vin'(t) and Vin"(t), respectively. In this figure, the currents flowing through the
voltage sources are i"(t) and i'(t).
Figure 3.6 Adding ΔCeff so that the delay time using the modeled input Vin"(t) is equal
to the delay time using the actual input Vin'(t).
Figure 3.7 Adding ΔCeff so that the delay times are the same.
Figure 3.7 shows the simulation example when an appropriate ΔCeff is added, where
t50(in) is the time when the input signal Vin'(t) reaches the 50% point of full swing and
t50 is the time when the output voltage reaches 50% of full swing. The delay time equals
t50 − t50(in). From this figure, we can see that the output waveforms Vout'(t) and
Vout"(t) are almost the same. The gate delay time with the modeled input Vin"(t) is equal
to that with the actual input Vin'(t). From Fig. 3.6 (a), when the output signal Vout"(t)
reaches the time t50, Vout"(t50) = Vdd/2. Therefore, we can establish the following
equation:
$$\int_0^{t_{50}} i''(t)\,dt = \int_0^{t_{50}} \frac{V_{in}''(t) - V_{out}''(t)}{R_d}\,dt = \frac{V_{dd}}{2}\left(C_{eff} + \Delta C_{eff}\right) = Q_{V_{in}''}. \tag{3.2}$$
Similarly, from Fig. 3.6 (b), when the input signal is the actual input Vin'(t), the
equation can be expressed as
$$\int_0^{t_{50}} i'(t)\,dt = \int_0^{t_{50}} \frac{V_{in}'(t) - V_{out}'(t)}{R_d}\,dt = \frac{V_{dd}}{2}\,C_{eff} = Q_{V_{in}'}. \tag{3.3}$$
From Eqs. (3.2) and (3.3), if the gate delays with the modeled input and the actual
input are to be the same, the input charge with the modeled input signal must exceed that
with the actual input signal by (Vdd ΔCeff)/2. In other words, the input charge with the
modeled input QVin" is equal to the sum of the input charge with the actual input QVin'
and the additional charge ΔQ. Therefore, the actual effective capacitance Ceff(actual)
with the ramp input waveform can be expressed as
$$C_{eff}(actual) = C_{eff} + \Delta C_{eff} = C_{eff}\left(1 + \frac{\Delta C_{eff}}{C_{eff}}\right) = C_{eff}\left(1 + \frac{\Delta Q}{Q_{V_{in}'}}\right). \tag{3.4}$$
In Eq. (3.4), the values of ΔQ and QVin' need to be determined, but they are difficult
to calculate directly. Therefore, we consider the difference between the two input
voltages Vin'(t) and Vin"(t), which can be converted to a difference of areas as shown in
Fig. 3.8. The areas S1 and S2 can be expressed as
$$S_1 = \int_0^{t_{50}} V_{in}'(t)\,dt, \qquad S_2 = \int_0^{t_{50}} V_{in}''(t)\,dt. \tag{3.5}$$
Figure 3.8 Charge difference of modeled input and actual input.
In Fig. 3.8, the areas S3 and S4 make up the area difference between S1 and S2. As
shown in Fig. 3.9, the circuit (b) is the equivalent circuit of the Ceff circuit (a) in
the Thevenin model, where Vin"(t) is the modeled input signal, i"(t) is the input
current, and Zload(t) is the equivalent resistance of the driving output resistance Rd
and the capacitive load Ceff. From the circuit (a), we have
$$i''(t) = \frac{V_{in}''(t) - V_{out}''(t)}{R_d}. \tag{3.6}$$
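In Eq. (3.6), the output voltage Vout"(t) can be obtained analytically as
$$V_{out}''(t) = \begin{cases} \dfrac{V_{dd}}{t_{in}}\left(t - R_d C_{eff} + R_d C_{eff}\, e^{-\frac{t}{R_d C_{eff}}}\right), & 0 \le t \le t_{in}, \\[2ex] \dfrac{V_{dd}}{t_{in}}\left[t_{in} - R_d C_{eff}\left(e^{-\frac{t - t_{in}}{R_d C_{eff}}} - e^{-\frac{t}{R_d C_{eff}}}\right)\right], & t > t_{in}, \end{cases} \tag{3.7}$$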
where tin is the input transition time of the ramp input signal and Vdd is the full
voltage. Since the delay model is analyzed in the same way for rising and falling inputs,
only the rising case is discussed in this dissertation. With Eq. (3.6) and the circuit
(b) in Fig. 3.9, the equivalent resistance Zload(t) can be obtained as
$$Z_{load}(t) = V_{in}''(t) / i''(t). \tag{3.8}$$
Thus, the expression of ΔQ is
$$\Delta Q = \int_0^{t_{50}} \frac{V_{in}''(t) - V_{in}'(t)}{Z_{load}(t)}\,dt. \tag{3.9}$$
Figure 3.9 The equivalent circuit of the Ceff circuit in the Thevenin model.
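The quantities in Eqs. (3.6) to (3.8) can be evaluated numerically as in the following sketch. It assumes a ramp that starts rising at t = 0 and uses the standard closed-form response of an Rd-Ceff circuit to a saturated ramp; the function names and the numerical values are illustrative assumptions, not taken from the dissertation.

```python
import math


def vin_ramp(t, t_in, vdd=1.0):
    """Modeled ramp input Vin''(t) that starts rising at t = 0."""
    return vdd * min(max(t / t_in, 0.0), 1.0)


def vout_ramp(t, t_in, rd, ceff, vdd=1.0):
    """Output of the Rd-Ceff circuit driven by the ramp, following Eq. (3.7)."""
    tau = rd * ceff
    if t <= 0.0:
        return 0.0
    if t <= t_in:
        return vdd / t_in * (t - tau + tau * math.exp(-t / tau))
    return vdd / t_in * (
        t_in - tau * (math.exp(-(t - t_in) / tau) - math.exp(-t / tau))
    )


def input_current(t, t_in, rd, ceff, vdd=1.0):
    """Current i''(t) through Rd, Eq. (3.6)."""
    return (vin_ramp(t, t_in, vdd) - vout_ramp(t, t_in, rd, ceff, vdd)) / rd


def z_load(t, t_in, rd, ceff, vdd=1.0):
    """Equivalent resistance Zload(t) = Vin''(t) / i''(t), Eq. (3.8); needs t > 0."""
    return vin_ramp(t, t_in, vdd) / input_current(t, t_in, rd, ceff, vdd)


# Illustrative values only: Rd = 500 ohm, Ceff = 0.2 pF, tin = 200 ps.
RD, CEFF, TIN = 500.0, 0.2e-12, 200e-12
for t in (50e-12, 100e-12, 200e-12, 300e-12):
    print(t, vout_ramp(t, TIN, RD, CEFF), z_load(t, TIN, RD, CEFF))
```

Because the output voltage never exceeds the input during a rising transition, Zload(t) computed this way stays above Rd.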
In Fig. 3.8, the area difference ΔS is divided into two regions, where S3 is in
Region 1 and S4 is in Region 2. We also calculate the charge difference ΔQ separately in
the two regions. Region 1 extends from the initial point to the time t50(in), and
Region 2 extends from t50(in) to the time t50. The average value of Zload(t) in Region 1
can therefore be expressed as
$$Z_1 = \frac{\int_0^{t_{50}(in)} \left[V_{in}''(t) / i''(t)\right] dt}{t_{50}(in) - 0}, \tag{3.10}$$
where Z1 is the average value of the equivalent resistance in Region 1. Similarly, in
Region 2, the average value Z2 of the equivalent resistance Zload(t) can be expressed as
$$Z_2 = \frac{\int_{t_{50}(in)}^{t_{50}} \left[V_{in}''(t) / i''(t)\right] dt}{t_{50} - t_{50}(in)}. \tag{3.11}$$
We define ΔQ1 as the part of the charge difference ΔQ in Region 1, and ΔQ2 as the part
of ΔQ in Region 2. With Eq. (3.5), the charges ΔQ1 and ΔQ2 can be obtained as
$$\Delta Q_1 = \frac{1}{Z_1}\int_0^{t_{50}(in)} \left[V_{in}''(t) - V_{in}'(t)\right] dt = \frac{1}{Z_1} S_3, \tag{3.12}$$
$$\Delta Q_2 = \frac{1}{Z_2}\int_{t_{50}(in)}^{t_{50}} \left[V_{in}''(t) - V_{in}'(t)\right] dt = \frac{1}{Z_2} S_4. \tag{3.13}$$
Since the charge difference ΔQ equals the sum of ΔQ1 and ΔQ2, it can be obtained as
$$\Delta Q = \frac{S_3}{Z_1} + \frac{S_4}{Z_2}. \tag{3.14}$$
In Eq. (3.14), the areas S3 and S4 need to be determined. However, it is difficult to
calculate their values directly, so we use the following approximation. First, the area
S3 can be regarded as a triangle, as shown in Fig. 3.10. In Fig. 3.10, the points
t20(in), t50(in) and t80(in) are the times when the actual input signal Vin'(t) reaches
20%, 50% and 80% of full swing, respectively. The point t0 is the time when the modeled
input signal Vin"(t) begins to rise from 0V. As the points t20(in), t50(in), t80(in) and
the modeled input signal Vin"(t) are already known, the triangle S3 is determined by the
three points t20(in), t50(in) and t0. Let A, B, and C be the three sides of the triangle
S3 and denote L = (A+B+C)/2; the value of S3 can then be obtained by Heron's formula as
$$S_3 = \sqrt{L\,(L - A)(L - B)(L - C)}. \tag{3.15}$$
Figure 3.10 The approximate areas for calculating the values of S3 and S4.
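As a small illustration of Eq. (3.15), the triangle area can be computed from three vertex coordinates in the (time, voltage) plane. The vertex values below are hypothetical. Expressing time in ps rather than seconds is a choice made here so that the two axes have comparable magnitudes, which keeps Heron's formula numerically well behaved for thin triangles.

```python
import math


def heron_area(p1, p2, p3):
    """Area of the triangle with vertices p1, p2, p3 using Heron's formula."""
    a = math.dist(p1, p2)
    b = math.dist(p2, p3)
    c = math.dist(p3, p1)
    semi = (a + b + c) / 2.0                   # semi-perimeter L in Eq. (3.15)
    # max(..., 0.0) guards against a tiny negative product from rounding
    return math.sqrt(max(semi * (semi - a) * (semi - b) * (semi - c), 0.0))


# Hypothetical vertices (time in ps, voltage as a fraction of Vdd):
# t0 on the 0 V axis, t20(in) at 20% and t50(in) at 50% of full swing.
s3 = heron_area((0.0, 0.0), (100.0, 0.2), (130.0, 0.5))
print(s3)
```

The result is in units of ps times Vdd and has to be scaled accordingly before it is used in Eq. (3.14).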
Second, we calculate the value of the area S4. In Fig. 3.10, t50(Ceff) is the time at
which the output signal reaches 50% of Vdd when the input signal is the modeled input
Vin"(t). The value of t50(Ceff) − t50(in) is the delay time when the input signal is the
modeled input signal Vin"(t), and the value of t50 − t50(in) is the delay time when the
input signal is the actual input signal Vin'(t). From the time t50(Ceff) to the time t50,
the input current and the area difference between Vin'(t) and Vin"(t) become very small
in almost all cases, so the influence of this part on the delay calculation is very
small. Since the value of t50 is unknown and t50(Ceff) is a known quantity, we use the
time t50(Ceff) as the end time instead of t50. Therefore, Region 2 becomes the interval
from t50(in) to t50(Ceff).
In Fig. 3.10, we draw a straight line through the point t50(in) and the point t80(in);
this line intersects the ramp input waveform Vin"(t) at the point ta. We define tb as
the point on the waveform Vin"(t) at the time t50(Ceff), and we connect the point
t80(in) and the point tb. The area S4' is then determined by the points t50(in), ta, tb
and tc as shown in Fig. 3.10, where tc is the point at which Vin"(t) reaches the full
voltage Vdd. The area S4' can be divided into two triangles, (t50(in), ta, tc) and
(t80(in), ta, tb). Since the points t50(in), ta, tb, tc and t80(in) are known
quantities, the value of S4' can be obtained easily. As the error rate ΔCeff/Ceff is
within 20% in most cases and S4 ≈ S4', the error in the actual effective capacitance
Ceff(actual) is negligible even when the area S4' is used instead of S4.
Therefore, the value of S4 can be obtained as
$$S_4 = \frac{1}{2}\cdot\frac{V_{dd}}{2}\,(t_a - t_c) + \frac{1}{2}\cdot\frac{V_{dd}}{5}\,(t_b - t_a). \tag{3.16}$$
Figure 3.11 Corresponding areas S21 and S22 of modeled input signal Vin"(t) in Region 1 and
Region 2 respectively.
When tb ≤ tc, the area S4' becomes a triangle whose three vertices are the points
t50(in), t80(in) and tb. As the three vertices are known, the value of S4' can be
calculated easily. Similarly, the modeled input charge QVin" should also be divided
between the two regions, Region 1 and Region 2. The corresponding areas of the two charge
parts are S21 and S22, as shown in Fig. 3.11. Thus QVin" can be obtained as
$$Q_{V_{in}''} = Q_{V_{in}'} + \Delta Q = \frac{S_{21}}{Z_1} + \frac{S_{22}}{Z_2}. \tag{3.17}$$
Then we can obtain
$$Q_{V_{in}'} = \frac{S_{21} - S_3}{Z_1} + \frac{S_{22} - S_4}{Z_2}. \tag{3.18}$$
With Eqs. (3.4), (3.14), and (3.18), the actual effective capacitance Ceff(actual) can be
obtained as
$$C_{eff}(actual) = C_{eff}\left(1 + \frac{\dfrac{S_3}{Z_1} + \dfrac{S_4}{Z_2}}{\dfrac{S_{21} - S_3}{Z_1} + \dfrac{S_{22} - S_4}{Z_2}}\right). \tag{3.19}$$
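Once the areas and average resistances are available, the correction of Eq. (3.19) is a single expression. The sketch below assumes the form Ceff(actual) = Ceff (1 + ΔQ/QVin'), with ΔQ from Eq. (3.14) and QVin' from Eq. (3.18); the function name and the numbers in the example are placeholders chosen only to exercise the formula.

```python
def ceff_actual(ceff, s3, s4, s21, s22, z1, z2):
    """Corrected effective capacitance following Eq. (3.19).

    s3, s4   : area differences between modeled and actual input (Regions 1, 2)
    s21, s22 : areas under the modeled input Vin''(t) in Regions 1 and 2
    z1, z2   : average equivalent resistances in Regions 1 and 2
    """
    delta_q = s3 / z1 + s4 / z2                      # Eq. (3.14)
    q_actual = (s21 - s3) / z1 + (s22 - s4) / z2     # Eq. (3.18)
    if q_actual <= 0.0:
        raise ValueError("actual input charge must be positive")
    return ceff * (1.0 + delta_q / q_actual)


# Placeholder numbers in consistent (arbitrary) units.
print(ceff_actual(ceff=1.0, s3=1.0, s4=1.0, s21=6.0, s22=6.0, z1=1.0, z2=1.0))
```

When the actual input coincides with the modeled ramp, S3 and S4 vanish and the function returns Ceff unchanged.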
3.2.2 Procedure for calculating Ceff(actual)
In order to calculate the actual effective capacitance Ceff(actual), we must first
obtain the value of Ceff, which is computed directly with the modeled ramp input signal
Vin"(t). In this chapter, we calculate the value of Ceff by the method in [12].
Figure 3.12 shows the procedure for obtaining the equivalent load Ceff from the original
RC-π load. The charge Qπ is the sum of the charges QCap1 and QCap2 transferred to C1 and
C2, respectively. Let the single capacitance Ceff equal the sum of Ceff1 and C1. As a
result, we obtain
$$Q_{C_{eff}} = Q_{Cap1} + Q_{C_{eff1}}, \tag{3.20}$$
where QCeff1 is the value of charge in Ceff1 [12]. Meanwhile, the expression of Qπ is
written as
$$Q_{\pi} = Q_{Cap1} + Q_{Cap2}. \tag{3.21}$$
Substituting (3.20) and (3.21) into (3.1), the following relationship is found:
$$Q_{C_{eff1}} = Q_{Cap2}. \tag{3.22}$$
Meanwhile, Eq. (3.22) can be rewritten as
$$C_{eff1}\,V_{C_{eff1}}(t_{50}) = C_2\,V_{C2}(t_{50}), \tag{3.23}$$
where the voltages on C2 and Ceff1 are expressed as VC2(t) and VCeff1(t), respectively.
The voltages VCeff1(t) and Vout"(t) are the same at the time t50. Here t50 is the time at
which Vout"(t) reaches 50% of Vdd when the input signal is the modeled input Vin"(t).
Then, the equation for Ceff can be written as follows [12]:
$$C_{eff} = C_1 + C_2 \left.\frac{V_{C2}(t)}{V_{C_{eff1}}(t)}\right|_{t = t_{50}}. \tag{3.24}$$
Figure 3.12 The process of converting the RC-π load into Ceff.
Using Kirchhoff's voltage law, we can obtain VC2(t) at the time t = t50 as [12]
$$V_{C2}(t_{50}) = e^{-\frac{t_{50}}{RC_2}}\left(\frac{1}{RC_2}\int_0^{t_{50}} V_{out}''(t)\, e^{\frac{t}{RC_2}}\,dt + k\right), \tag{3.25}$$
where the constant k is equal to zero, because VC2(t) = 0 when the time t = 0. Finally,
the effective capacitance Ceff is expressed as [12]
$$C_{eff} = C_1 + C_2\left[1 - \frac{3RC_2}{5\,(t_{50} - t_{20})}\left(1 - e^{-\frac{5\,(t_{50} - t_{20})}{3RC_2}}\right)\right], \tag{3.26}$$
where t20 denotes the time at which Vout"(t) reaches 20% of full swing. Other effective
capacitance algorithms can also be used to obtain this Ceff. The method of [12] referred
to above uses an iterative procedure for the effective capacitance, so the proposed
method also needs an iterative procedure for the effective capacitance and the gate delay
time. The effective capacitance is computed not only for the gate delay but also for the
output response of the gate.
In the iterative procedure, an initial value of the effective capacitance is needed to
compute some parameters related to the gate delay. These parameters are then used to
update the effective capacitance until the Ceff value or the gate delay time converges.
For gate delay under the ramp input condition, the method in [12] has an average error
within 3% of the HSPICE results, which is one of the error sources in our test results.
The other error source is the set of approximations made in our proposed model.
The procedure for calculating Ceff(actual) is as follows:
1) Let Ceff equal the sum of C1 and C2 at the initial time.
2) Calculate the time points (t20 and t50) by the k-factor equation or lookup table
method.
3) Apply the values of the time points in Eq. (3.26) to find a new Ceff.
4) Go back to step 2 with the new Ceff until the iteration converges.
5) Calculate the value of the actual effective capacitance Ceff(actual) by Eq. (3.19).
6) Use the value of Ceff(actual) to calculate t50 by the k-factor equation or lookup
table method as in [12]; the gate delay time is then obtained as td = t50 − tin/2.
This procedure usually converges within four iterations. With the effective
capacitance value, the gate delay and the gate output response can be obtained during
this iterative procedure.
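Steps 1 to 4 and step 6 of the procedure can be sketched as follows. This is a hedged illustration under one important assumption: instead of the k-factor equation or lookup table used in [12], the time points t20 and t50 are taken from the closed-form Thevenin response of Eq. (3.7) by bisection. Step 5 is omitted because it needs the measured input waveform. All function names and parameter values are assumptions made for this example.

```python
import math


def vout_ramp(t, t_in, rd, c, vdd=1.0):
    """Thevenin driver (ramp source plus Rd) into a single capacitor, Eq. (3.7)."""
    tau = rd * c
    if t <= 0.0:
        return 0.0
    if t <= t_in:
        return vdd / t_in * (t - tau + tau * math.exp(-t / tau))
    return vdd / t_in * (
        t_in - tau * (math.exp(-(t - t_in) / tau) - math.exp(-t / tau))
    )


def crossing_time(frac, t_in, rd, c, vdd=1.0):
    """Time at which Vout''(t) reaches frac * Vdd, found by bisection."""
    lo, hi = 0.0, t_in + 50.0 * rd * c
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if vout_ramp(mid, t_in, rd, c, vdd) < frac * vdd:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def ceff_update(c1, c2, r, t20, t50):
    """One evaluation of Eq. (3.26)."""
    tx = 5.0 * (t50 - t20) / 3.0
    tau2 = r * c2
    return c1 + c2 * (1.0 - tau2 / tx * (1.0 - math.exp(-tx / tau2)))


def iterate_ceff(c1, c2, r, rd, t_in, rel_tol=1e-6, max_iter=50):
    """Steps 1-4: returns (ceff, number of iterations used)."""
    ceff = c1 + c2                                    # step 1
    for n in range(1, max_iter + 1):
        t20 = crossing_time(0.2, t_in, rd, ceff)      # step 2
        t50 = crossing_time(0.5, t_in, rd, ceff)
        new_ceff = ceff_update(c1, c2, r, t20, t50)   # step 3
        if abs(new_ceff - ceff) <= rel_tol * ceff:    # step 4
            return new_ceff, n
        ceff = new_ceff
    raise RuntimeError("Ceff iteration did not converge")


def gate_delay(ceff, rd, t_in):
    """Step 6: td = t50 - tin/2 for a ramp that starts at t = 0."""
    return crossing_time(0.5, t_in, rd, ceff) - t_in / 2.0


# Illustrative values: C1 = 0.1 pF, C2 = 0.2 pF, R = 400 ohm, Rd = 500 ohm, tin = 200 ps.
ceff, n_iter = iterate_ceff(0.1e-12, 0.2e-12, 400.0, 500.0, 200e-12)
print(ceff, n_iter, gate_delay(ceff, 500.0, 200e-12))
```

The tolerance used here is much tighter than a practical flow would need, so the iteration count printed by this sketch is not directly comparable with the four iterations quoted in the text.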
3.2.3 Driving Output Resistance Calculation
Figure 3.13 shows the structure of the Thevenin model. Our proposed method is based on
this equivalent model, which is widely used in STA. The value of the load capacitance CL
has a relatively weak correlation with the gate resistance Rd, and therefore several
simplifying assumptions may be used during its calibration [9].
Figure 3.13 The structure of the Thevenin model, which is the equivalent model of the
actual gate.
When the gate drives a capacitive load under a step input, the falling output waveform
is assumed to be of the form [9]
$$V_o(t) = V_o\, e^{-\frac{t}{R_d C_L}}. \tag{3.27}$$
Then the value of Rd can be obtained from the 50% and 90% time points (denoted as t50 and
t90, respectively) as [17]
$$R_d = \frac{t_{90}(C_L, t_{in}) - t_{50}(C_L, t_{in})}{C_L \ln 5}. \tag{3.28}$$
The reason for choosing t50 and t90 rather than any other threshold is a result of
empirical observations [9].
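Eq. (3.28) follows from the exponential form because the output falls to 50% at t50 = Rd CL ln 2 and to 10% (the 90% point of the transition) at t90 = Rd CL ln 10, so that t90 − t50 = Rd CL ln 5. The short sketch below applies the formula to synthetic time points; the function name and the numerical values are assumptions for illustration only.

```python
import math


def driver_resistance(t50, t90, c_load):
    """Driving resistance Rd from the 50% and 90% time points, Eq. (3.28)."""
    return (t90 - t50) / (c_load * math.log(5.0))


# Synthetic check: generate t50 and t90 from an ideal exponential discharge
# with Rd = 500 ohm and CL = 1 pF, then recover Rd from them.
RD_TRUE, C_LOAD = 500.0, 1.0e-12
tau = RD_TRUE * C_LOAD
t50 = tau * math.log(2.0)     # output has fallen to 50% of its initial value
t90 = tau * math.log(10.0)    # output has fallen to 10% (90% of the transition)
print(driver_resistance(t50, t90, C_LOAD))
```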
In Eq. (3.28), the load capacitance CL is an unknown parameter. The authors of [9]
suggested a reasonable value of CL, namely the largest capacitance that the gate is
expected to drive. The value of Rd can then be approximately modeled as a constant that
is independent of the load and the input signal [9].
3.3 Tests and Comparisons
Several examples are tested with the proposed method, and the experimental results are
presented for various Rd, various tin, various gates, and various RC-π loads. The test
results show that the proposed method is very close to the HSPICE simulation and much
more accurate than the methods in [1], [9], and [15].
3.3.1 Experimental Results for Various Rd
Figure 3.14 Test results of gate delay calculation with different Rd.
The accuracy of the proposed method and of [9] for delay calculation, compared with
HSPICE simulation, is shown in Fig. 3.14. In this experiment, the input signal first
passes through three series-connected inverters; the resulting non-ramp output signal is
used as the input voltage of the Rd and RC load based on the Thevenin model. The
transistor widths of a CMOS inverter are Wp = 2 μm for the p-channel and Wn = 1 μm for
the n-channel. The parameters of input and load are
tin = 300ps, R , and C1/C2 = 0.8pF/1.2pF. The resistor Rd is the driving output
resistance, which is calculated from the actual gate. From Fig. 3.14, we can see that
although the gate resistance Rd varies over a wide range ( - ), the proposed
method is always close to the HSPICE results. Meanwhile, the error of our model for
gate delay is smaller than that of [9].
Table 3.1 Test results of HSPICE, Ctot [1], and our model with increasing Rd.

Rd (Ω)    Gate Delay (ps)                    Error (%)
          HSPICE    Ctot [1]    Proposed     Ctot [1]    Proposed
100       21.4      29.6        19.6         38.3        7
200       40.4      55.5        37.8         37.4        3.7
300       57.6      77.5        55.1         34.5        2.6
400       73.3      96.7        69.7         31.9        3.5
500       87.8      114.9       84.5         30.9        2.6
600       101.4     134.1       98.3         32.2        2.6
700       116.7     153.3       114.2        31.4        3.5
800       134.8     173.3       129.5        28.6        5.4
900       154.8     193.3       149.2        24.9        4.3
1000      175.8     213.5       168.6        21.4        3
Average Error                                31.2        3.8
Table 3.1 shows the delay time computed by the proposed method and by [1] as Rd
increases (100 Ω-1000 Ω). The parameters of input and RC-π load are set as tin = 200ps, R
, and C1/C2 = 0.1pF/0.2pF. The conventional method [1] computes the gate
delay from the total capacitance Ctot, which is equal to the sum of the load
capacitances. The test results show that our proposed method is always close to HSPICE
for the different values of Rd. Under the same conditions, the average error of the Ctot
method in [1] is more than 30%, while the average error of our algorithm is only 3.8%.
3.3.2 Experimental Results for Various tin
Figure 3.15 shows the test results of our model and the conventional models as the
input transition time tin increases (300ps-1000ps). The ramp input signal is applied to
an inverter with the resistor Rd and the RC-π load. After passing through the inverter,
the signal becomes a non-ramp waveform, which is used as the input signal for Rd and the
RC-π load. The transistor widths of the inverter are Wp = 4 μm/Wn = 2 μm. The parameters
for the delay
algorithm are C1/C2 = 0.3pF/0.6pF, R , and Rd . From the data of Fig.
3.15, we can see that our model is much more accurate than [9]. Compared with the
simulation results, the error of our model remains in a small range for the different
values of tin.
Figure 3.15 Test results of gate delay calculation with different tin.
In Table 3.2, the transition time tin is changed in steps of 100ps. In this experiment,
the gate resistance is Rd . The parameters of interconnect load are R ,
C1=0.1pF/C2 = 0.2pF. The data in Table 3.2 show that the proposed method has
relatively high accuracy compared with HSPICE when tin is changed over a
wide interval (100ps-1000ps). The error of the Ctot model [1] increases rapidly as tin
decreases and can be more than 50%. In this experiment and in the one shown in
Table 3.1, the test circuits do not include the non-ramp input effect. Even so,
the results show that the proposed model is much more accurate than the total
capacitance method in [1]. When the non-ramp input effect cannot be neglected in the
timing analysis of complex circuits, the error of the method in [1] will be larger. In
contrast, the proposed method can avoid this kind of additional error.
Table 3.2 Test results of gate delay calculation with increasing tin.

tin (ps)  Gate Delay (ps)                    Error (%)
          HSPICE    Ctot [1]    Proposed     Ctot [1]    Proposed
100       54.2      86.6        53.4         59.8        2.6
200       73.3      96.7        72.1         31.9        1.6
300       85.8      105.7       83.2         23.2        3.4
400       94.5      111.1       91.8         17.6        2.9
500       100.8     114.2       97.9         13.3        1.9
600       105.4     116.2       101.5        10.2        3.7
700       108.9     117.6       105.2        8           2.5
800       111.4     118.5       104.1        6.4         6.6
900       113.5     118.9       105.4        4.8         7.1
1000      114.9     119.3       112.2        3.8         2.3
Average Error                                17.9        3.5
3.3.3 Experimental Results with Various Gates and RC-π Load
Figure 3.16 Test circuit for calculating the gate delay time with various conditions.
Table 3.3 provides the test results of using the proposed algorithm, HSPICE, and
[15] to calculate the delay time of the circuit shown in Fig. 3.16. In this circuit, the
size of the first inverter is fixed at Wp/Wn = 5/2 μm. The values of the R1, R2, C3 and C4
are R1 = R2 and C3 = C4 = 0.3pF. The input transition time is tin = 400ps.
In addition, the size of the second inverter and the values of Rd and the RC-π load are
varied as shown in Table 3.3. The output signal of the second inverter is a non-ramp
signal, which is used as the input signal of Rd and the RC-π load. The table shows that
the average error in estimating gate delay is only about 3.7% with the proposed
method, while the average error of the method in [15] is about 15%.
Table 3.3 Test results of HSPICE, [15], and proposed method with different circuit
parameters.

Wp/Wn     Rd/C1/C2/R           Gate Delay (ps)              Error (%)
(μm)      (Ω/pF/pF/Ω)          HSPICE   [15]   Proposed     [15]    Proposed
2/1       400/0.1/0.3/400      150      128    148          14.7    1.3
2/1       500/0.3/0.5/1000     370      311    356          15.9    3.8
5/2       800/0.2/0.5/400      471      397    457          15.7    3.0
5/2       1000/0.8/0.2/800     880      709    850          19.4    3.4
5/2       1000/0.5/1.2/800     1150     842    1083         26.8    5.8
10/5      800/0.2/0.5/800      347      293    338          15.6    2.6
10/5      1000/0.3/0.8/600     870      722    818          17.0    6.0
20/10     400/0.5/1.2/1000     280      249    272          11.1    2.9
20/10     800/0.5/0.5/500      600      518    581          13.7    3.2
50/25     500/1.2/0.3/1000     580      508    562          12.4    3.1
50/25     1000/0.6/0.2/500     630      568    609          9.8     3.3
100/50    1000/0.5/0.8/800     810      706    770          12.8    4.9
100/50    800/0.3/0.6/400      530      465    506          12.3    4.5
Average Error                                               15.2    3.7
3.4 Conclusions
In this chapter, the proposed gate delay model with the non-ramp input effect is based
on generating the effective capacitance Ceff. The proposed model uses an iterative
procedure, and the gate delay time is obtained during the process of computing the
effective capacitance.
The non-ramp input effect in timing analysis is described in the introduction. In
previous research on timing analysis, gate delay models always use a ramp input signal
characterized by the input transition time. Although the initial input signal is a ramp,
it must pass through many logic gates and long interconnect wires. After this process,
the linear signal becomes more and more nonlinear. At the same time, the gate delay must
be calculated stage by stage, because the output of a logic gate is also the input of the
following gate. In this case, if the nonlinear input waveform is simply assumed to be a
ramp for gate delay calculation, the results will contain significant errors due to the
non-ramp input effect.
To overcome this problem, we analyze the difference between the ramp and non-ramp input
conditions. Based on the difference in the charge stored in the effective capacitance
load, the proposed model modifies the charge condition to derive a new effective
capacitance algorithm that accounts for the influence of the non-ramp input on the gate
delay. Moreover, the influence of the non-ramp input signal is modeled as one part of the
effective capacitance. This has the merit that the proposed method is easy to implement
within conventional effective capacitance algorithms and is therefore practical. In the
proposed model, because the additional part of the effective capacitance related to the
non-ramp input effect is difficult to calculate directly, we instead calculate the input
charge difference between the ramp and non-ramp conditions. With some reasonable
approximations, the proposed model obtains relatively high accuracy without adding much
computational complexity.
To validate the accuracy improvement of the proposed method, we tested many examples
with our new model and showed the test results and comparisons in figures and tables. The
test conditions of the gate and interconnect loads differ in each experiment, and the
parameter values are those commonly used in actual VLSI designs. In the comparisons of
the proposed method with the methods in [1], [9], and [15], the SPICE simulation results
are used as the reference values. The results show that the proposed method achieves
good accuracy for gate delay under the non-ramp input condition compared with the
conventional methods.
References
[1]
-Aided Design of Integrated Circuits and
Systems, vol. CAD-2, pp. 202-211, July 1983.
[2] -characteristic of
Conference on Computer-Aided Design, pp. 512-515, Nov. 1989.
[3] L. -interconnect
Conference, pp. 15.6.1-15.6.4, May 1992.
[4] apacitance for the
-Aided Design of
Integrated Circuits and Systems, vol.13, no.12, pp. 1526-1535, Dec. 1994.
[5]
delay metric f
Computer-Aided Design, pp. 229-234, Nov. 2000.
[6]
Design Automation Conference, pp. 43-48, Jan. 2003.
[7]
Integrated Circuits Conference, pp. 313-316, May 1998.
[8] F. Dartu, N. -delay model for high
pp. 576-580, June 1994.
[9]
precharacterized CMOS
Computer-Aided Design of Integrated Circuits and Systems, vol. 15, no. 5, pp.
544- 553, May 1996.
[10]
Conference on
Computer-Aided Design, pp. 224-229, Nov. 1997.
[11]
pp.2795-2798, May 2005.
[12] Z. Huang, A. Kurokawa
Trans. Fundamentals, vol. E88-A, no.10, pp. 2562-2569, Oct. 2005.
[13] ve
Trans. Fundamentals, vol. E88-A, no.12, pp. 3367-3374, Dec. 2005.
[14]
Proc. the 12th International Conference
on VLSI Design, pp. 578-582, Jan. 1999.
[15]
147-151, April 1998.
[16] B.
Conference, pp. 866-869, June 2002.
[17] S. Sapatnekar, Timing, Kluwer Academic Publishers, 2004.
[18] S. Fang, Z. Hua Calculating the effective
capacitance Proc.
International Conference on Communications, Circuits and Systems, pp.
2474-2477, June 2006.
Chapter 4
Accurate Effective Capacitance Model
for Gate Delay with RC Loads Based on
the Thevenin Model
In the last chapter, we discussed the non-ramp input effect in gate delay calculation.
This chapter focuses on the charge difference problem in delay estimation with the
Thevenin model.
As described in the previous chapter, the Thevenin model treats each gate as a
combination of two simple linear components (the input Vin(t) and the gate resistance
Rd). A VLSI system can be divided into many cells, and one cell consists of many logic
gates. The delay time of each gate and cell must be calculated. In cell delay
calculation, a cell can also be modeled as a special type of gate, because the
input/output of a cell can be treated like that of a gate. The internal structure of this
special gate does not need to be considered in the timing analysis.
In modern IC designs, the interconnects of logic gates, particularly long wires, have a
significant influence on gate output waveforms. As the load resistance increases, the
output waveform becomes more and more nonlinear, which makes delay evaluation difficult.
The Thevenin model (switch-resistor model) is a good choice for solving this problem
[17]. Since the clock and bus wires between different cells are usually very long, the
Thevenin model is widely used in cell-level timing analysis.
When the Thevenin model is used for delay calculation, the gate load is always modeled
as the RC-π load [4], [6]-[10], [14]-[17]. The effective capacitance must be calculated
from the RC-π load. The conventional methods for effective capacitance are usually based
on the condition that the charge stored in the RC-π load equals that stored in Ceff from
the initial time to the output 50% time. This condition is very accurate for the actual
gate model. However, it has an obvious error in the Thevenin model. Conventional methods
do not take into account that the charges of the RC-π load and Ceff are not equal in the
Thevenin model. Therefore, we modify the conventional condition for effective capacitance
to improve the accuracy of delay computation.
In the following, the charge difference problem in the Thevenin model and its influence
are introduced. Then the proposed model for eliminating the error of the conventional
condition is presented. Finally, the proposed model is validated by experimental results
and comparisons.
4.1 Introduction
With the continuous development of VLSI techniques, the width and thickness of
interconnect wires become smaller and smaller. As a result, the resistance of an
interconnect of a given length is now much larger than before. When the load resistance
becomes increasingly larger than the gate driving output resistance, the gate output
waveforms become more and more nonlinear [17]. To overcome this difficulty, it is
suitable to use the switch-resistor model instead of the actual gate. The Thevenin model
with the Ceff concept has proved to be a simple and effective model for delay evaluation
of a gate with a general RC load.
From the RC tree load [1] to the RC-π load [2], the model of the gate load has become
both accurate and simple. However, the empirical method cannot directly use a load that
is not purely capacitive to obtain the delay time and output waveform of a gate. To solve
this problem, the gate load is further reduced to an effective capacitance [3]. With the
effective capacitance, we can obtain not only the gate delay time but also the gate
output response, which is the input parameter for the timing analysis of the next-stage
gate [13]. Moreover, the effective capacitance is a very important parameter for power
consumption estimation of digital circuits [5]. Subsequently, many effective capacitance
algorithms have been proposed for calculating gate delay with interconnect loads [4],
[6]-[10], [14]-[17].
In [4], a linear equivalent gate model was generated that captures the delays at the
interconnect fan-out nodes. In [6], an algorithm was presented that yields an effective
capacitance for both the gate delay and the 0-0.8Vdd output transition time. In [7], a
low-iteration method for effective capacitance is described to improve the efficiency of
the gate delay model. In [8] and [9], the authors computed the gate delay by replacing
the CMOS gate with a linear resistor. In [10], the authors proposed a generalized
gate/cell-modeling approach for general RLC loads. In [14] and [15], an iteration-less
approach was used for effective capacitance computation, in both the step input and ramp
input regimes. In [16], the author designed a Thevenin model to approximate not only cell
timing but also slew sensitivity. The Thevenin model is used so widely because it makes
the delay calculation much simpler than other models.
Most of these conventional methods for converting the RC-π load to the equivalent load
Ceff rely on the assumption that the charges stored in the two kinds of load are the same
at the time when the output reaches 0.5Vdd [17]. In the actual gate model, this condition
holds. However, with the Thevenin model, the basic charge condition is not exact, and
this problem has been overlooked in the foregoing models. Therefore, the results contain
errors when the effective capacitance Ceff is calculated. In this chapter, an advanced
effective capacitance model is proposed that considers the charge difference between the
RC-π load and Ceff to improve the accuracy of gate delay calculation based on the
Thevenin model.
Figure 4.1 The Thevenin model with the RC-π load and the equivalent load Ceff.
First, the charge difference problem of the Thevenin model should be analyzed to
understand its influence on the timing analysis. Figure 4.1 shows that the RC-π load
structure of a logic gate is converted into the single equivalent capacitive load Ceff
with the Thevenin model, while the two circuits have the same gate delay. The input
transition time and the parameters of the RC-π load are known. Once the value of the
Ceff load is obtained, the gate delay time is easy to compute by the empirical method or
other numerical methods. Most of the conventional methods for effective capacitance are
based on the condition
$$Q_{C_{eff}}(t_{50}) = Q_{\pi}(t_{50}). \tag{4.1}$$
Eq. (4.1) means that the charge transferred into the RC-π load is equal to that
transferred into the effective capacitance from the initial time to the 50% output time.
This condition holds for the gate delay with the actual gate, as
shown in Fig. 2.9. The experimental results of the conventional methods [11], [12],
[18], and [19] have shown that the above condition is accurate for effective
capacitance calculation. However, this condition cannot be used directly to compute
the value of Ceff in the switch-resistor model.
In the switch-resistor model, the logic gate is replaced by the input source Vin(t) and
the gate resistance Rd, as shown in Fig. 4.1. In contrast, the actual gate behaves like a
current source in its operating region, which differs from the voltage
source in the Thevenin model. The Thevenin model therefore cannot represent the actual gate
completely, and the charge QCeff(t) is not equal to the charge Qπ(t) during the time
interval from the initial time to t50. Consequently, delay calculation with the conventional
condition has an unavoidable error when the Thevenin model is used.
Figure 4.2 The analysis of gate output voltages with the circuits in Fig. 4.1.
Figure 4.3 The analysis of gate output currents with the circuits in Fig. 4.1.
To show the influence of the charge difference in the Thevenin model clearly, we use
an example to illustrate the charge difference and the delay error it causes.
Figures 4.2 and 4.3 show the output voltages and the output
currents of a logic gate when the load is the RC-π structure and the equivalent capacitor Ceff,
respectively. In this example, the 50% output time t50 of the RC-π model equals that
of the effective capacitance model. As shown in Fig. 4.2, the output voltages of the RC-π
model and the Ceff model are not always equal from t = 0 to t = t50. The two outputs
intersect only at the initial time point and at t50; at all other time points they
differ. Because the input waveforms of the RC-π model and the
Ceff model are the same, the charges that pass through the resistor Rd in the two models
are therefore not equal. The difference is more obvious in the currents of Fig. 4.3 than
in the voltages of Fig. 4.2, and Fig. 4.3 shows the charge difference ΔQ. In Fig. 4.3,
the two solid lines are the current waveforms
at the gate output when the load is the RC-π structure and the equivalent capacitor Ceff,
respectively. They have the same 50% output time, which means that the gate delays caused
by the two loads are equal. The charge difference ΔQ is the area between the two current
waveforms from t = 0 to t = t50. If we use only the conventional charge-equality
condition to calculate the effective capacitance Ceff(old), the output current waveform
is the dashed line in Fig. 4.3. The time t = t'50 is the 50% output time obtained
with the effective capacitance Ceff(old). It is clear that t'50 is smaller than t50.
Since t50 is the time we want to obtain for the gate delay, the gate delay error
of the conventional charge-equality condition is (t50 − t'50)/t50, which can be more than
30% in some cases [17]. However, the conventional methods that compute Ceff with
the switch-resistor model neglect the charge difference [4], [6]-[10], and
[14]-[16]. Therefore, errors in these methods are inevitable, and the gate delay
model should be improved to eliminate the error caused by the charge
difference.
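The charge difference described above can be reproduced numerically. The following is a minimal sketch, not the author's experiment: it integrates the Thevenin-driven RC-π load and a single-capacitor load with forward Euler, bisects for the Ceff that gives the same t50, and compares the charge delivered through Rd up to t50. All component values, the function names, and the time step are illustrative assumptions.

```python
# Units: ohm, pF, ps (ohm * pF = ps); voltages are normalised to Vdd = 1,
# so charges come out in pF * Vdd. All numbers below are assumed for illustration.

def run_to_half_swing(rd, tin, c1, r=None, c2=0.0, dt=0.01):
    """Forward-Euler simulation of a ramp source behind Rd driving either an
    RC-pi load (C1, R, C2) or a single capacitor (r=None, c1 plays the role of Ceff).
    Returns (t50, charge delivered through Rd from t = 0 to t50)."""
    v_out = v_c2 = 0.0
    t = charge = 0.0
    while v_out < 0.5:
        v_in = min(t / tin, 1.0)                 # saturated ramp input
        i_rd = (v_in - v_out) / rd               # current through the driver resistance
        i_r = (v_out - v_c2) / r if r is not None else 0.0
        v_out += dt * (i_rd - i_r) / c1
        if r is not None:
            v_c2 += dt * i_r / c2
        charge += dt * i_rd
        t += dt
    return t, charge


def matched_ceff(rd, tin, c1, r, c2):
    """Bisect for the Ceff whose 50% time equals that of the RC-pi load."""
    t50_pi, q_pi = run_to_half_swing(rd, tin, c1, r, c2)
    lo, hi = c1, c1 + c2
    for _ in range(30):
        mid = 0.5 * (lo + hi)
        t50_mid, _ = run_to_half_swing(rd, tin, mid)
        if t50_mid < t50_pi:
            lo = mid
        else:
            hi = mid
    ceff = 0.5 * (lo + hi)
    _, q_ceff = run_to_half_swing(rd, tin, ceff)
    return ceff, t50_pi, q_pi, q_ceff


ceff, t50, q_pi, q_ceff = matched_ceff(rd=1000.0, tin=50.0, c1=0.1, r=1000.0, c2=0.3)
delta_q = q_ceff - q_pi
print(f"Ceff = {ceff:.4f} pF, t50 = {t50:.1f} ps")
print(f"Q_pi(t50) = {q_pi:.4f}, Q_Ceff(t50) = {q_ceff:.4f}, ratio = {delta_q / q_pi:.3f}")
```

With these assumed values the delay-matched Ceff receives more charge by t50 than the RC-π load, which is the positive charge difference the section describes. A production implementation would use an exact or higher-order integrator instead of forward Euler.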
4.2 Proposed Algorithm
In this section, a convenient method is introduced in detail to solve the
problem of the charge difference between the RC-π load and the Ceff load in the Thevenin
model.
4.2.1 Analytical Expressions for Effective Capacitance
Figure 4.4 Calculation of Ceff from the RC-π model in the Thevenin model.
Figure 4.4 shows the procedure of reducing the original RC-π load to a single
equivalent capacitance Ceff in the Thevenin model, where Vout(t) is the output voltage
with the RC-π load and V'out(t) is the output voltage with the Ceff load. The two output
signals differ, as shown in Fig. 4.2. In addition, the effective capacitance Ceff
equals the sum of C1 and Ceff1 in Fig. 4.4(b). The principle of calculating the
effective capacitance Ceff is that the charge transferred into Ceff equals the charge
transferred into the RC-π load. Taking into account the charge
difference ΔQ in Fig. 4.3, we obtain the following charge equation:

$$Q_{C_{eff}}(t_{50}) = Q_{\pi}(t_{50}) + \Delta Q, \tag{4.2}$$

where ΔQ is the charge difference from time t = 0 to time t = t50 as shown in Fig. 4.3.
In [17], the authors presented an empirical method that computes the value of
Ceff through Eq. (4.2), which can be written as

$$Q_{C_{eff}}(t_{50}) = (1 + \delta)\, Q_{\pi}(t_{50}), \qquad \delta = \frac{\Delta Q}{Q_{\pi}(t_{50})}. \tag{4.3}$$

The parameter δ quantifies the effect of the charge difference. Once δ is
determined, Eq. (4.3) can be used instead of Eq. (4.1) to obtain Ceff. Since the value of
δ is difficult to obtain directly, the authors of [17] use an empirical value of δ = 6% to
reduce the error in gate delay calculation. This empirical value is only a typical value
taken from simulation results based on commonly used gate sizes and interconnects in a small
region. The limitations of this method are therefore clear.
Figure 4.5 Simulation results of the charge-difference ratio δ with variable Rd and R.
First, the empirical value depends on the process technology. It is obtained from
simulation results with one type of process technology, so the charge-difference
ratio δ = ΔQ/Qπ(t50) must be reevaluated whenever the design process changes. Moreover,
the empirical value of δ captures only the interconnect values and gate sizes of a small
region. In fact, this value varies widely, from zero to more than 30%, with different
gates and interconnects. Figure 4.5 shows the experimental results of δ with
different Rd and R in the Thevenin model. In this experiment, the input transition time
and the capacitors in the RC-π load are constant while the resistors Rd and R are varied.
For some values of Rd and R, δ is almost
35%, much larger than the empirical value of 6%. The value of δ also
varies when the input transition time and the capacitors are changed. The last drawback
of the empirical-value method is that the gate delay error becomes larger when the
effect of the charge difference is very small. For example, as shown in Fig.
4.5, for some values of Rd and R the value of δ should be almost
zero. Using the empirical value for the gate delay in such a case adds an error of
about 6%.
In this dissertation, we use an advanced method to calculate Ceff. To
calculate Ceff, the value of ΔQ needs to be determined. However, it is difficult to
obtain ΔQ analytically. Therefore, we look for a method that includes the influence of the
charge difference in Ceff without computing the value of ΔQ. As shown in Fig. 4.6,
after the time point t50, the current flowing through the RC-π model becomes larger
than that through the Ceff model. From Fig. 4.6, we can choose a time point tE (tE > t50)
to get the following expression:

$$Q_{C_{eff}}(t_E) - Q_{C_{eff}}(t_{50}) = Q_{\pi}(t_E) - Q_{\pi}(t_{50}) - \Delta Q', \tag{4.4}$$

where QCeff(tE) and Qπ(tE) are the charges transferred into Ceff and the RC-π load from time
t = 0 to t = tE, respectively, and ΔQ' is the difference between the
charge transferred into the RC-π load and that transferred into the Ceff load from time t = t50
to t = tE. Substituting Eq. (4.2) into Eq. (4.4), we have

$$Q_{C_{eff}}(t_E) = Q_{\pi}(t_E) + \Delta Q - \Delta Q'. \tag{4.5}$$

If ΔQ = ΔQ' in Eq. (4.5), we obtain

$$Q_{C_{eff}}(t_E) = Q_{\pi}(t_E). \tag{4.6}$$
Figure 4.6 The charge used to counteract the effect of ΔQ.
As Eq. (4.6) and Eq. (4.1) have the same form, we can use the same method as [11],
[12] to obtain the effective capacitance Ceff, as shown in the following. In
Fig. 4.6, the delay times of a logic gate with the RC-π load and the Ceff load are equal.
The remaining work is to determine the time tE. An appropriate tE must satisfy two
requirements: it must be accurate enough that ΔQ' = ΔQ, and it must be easy to obtain.
To meet both requirements, we let tE = t80, where t80 is the time when the output
voltage V'out(t) in the Ceff model is at the 80%
point of full swing. From a large number of simulations, we find that the value of ΔQ'
is always close to the value of ΔQ under different values of the circuit parameters
shown in Fig. 4.4(a). Moreover, the time point t80 is not difficult to calculate.
Therefore, when tE = t80, the charge difference ΔQ' can be used to counteract the error
in the effective capacitance calculation that results from ΔQ. Figure 4.4 shows the
procedure of obtaining the effective capacitance in the Thevenin model. With tE = t80,
the charge Qπ(tE) is

$$Q_{\pi}(t_{80}) = C_1 V_{out}(t_{80}) + C_2 V_{C2}(t_{80}), \tag{4.7}$$
where Vout(t80) is the voltage of capacitor C1 and VC2(t80) is the voltage of
capacitor C2 at the time t80 in Fig. 4.4(a). At the same time, the charge QCeff(tE) can be
written as

$$Q_{C_{eff}}(t_{80}) = C_1 V'_{out}(t_{80}) + C_{eff1} V'_{out}(t_{80}), \tag{4.8}$$

where V'out(t80) is the voltage of C1 and Ceff1 at the time t80 in Fig. 4.4(b). The value of
V'out(t80) is 0.8Vdd. Substituting Eqs. (4.7) and (4.8) into Eq. (4.6), we have

$$C_1 V'_{out}(t_{80}) + C_{eff1} V'_{out}(t_{80}) = C_1 V_{out}(t_{80}) + C_2 V_{C2}(t_{80}). \tag{4.9}$$

From Eq. (4.9) and the condition V'out(t80) = 0.8Vdd, the capacitance Ceff1 is
obtained as

$$C_{eff1} = C_1 \frac{V_{out}(t_{80}) - 0.8 V_{dd}}{0.8 V_{dd}} + \frac{C_2 V_{C2}(t_{80})}{0.8 V_{dd}}. \tag{4.10}$$

In Fig. 4.4, the equivalent load Ceff equals the sum of C1 and Ceff1, so the
expression of Ceff is

$$C_{eff} = C_1 \frac{V_{out}(t_{80})}{0.8 V_{dd}} + \frac{C_2 V_{C2}(t_{80})}{0.8 V_{dd}}. \tag{4.11}$$
Figure 4.7 The simulation results of the voltage difference ΔV in the different cases.
In Eq. (4.10), the value of Vout(t80) is difficult to compute because its expression
is complicated. If the exact expression of Vout(t80) is inserted into Eq. (4.10),
the final integral equation for the effective capacitance cannot be solved. To
reduce the computational complexity, we adopt an assumption that
simplifies the Vout(t80) expression without much loss of accuracy.
The voltage difference between Vout(t80) and V'out(t80) is
defined as ΔV = 0.8Vdd − Vout(t80). Figure 4.7 shows the simulation results of ΔV in
different cases, where ΔV1, ΔV2 and ΔV3 are the voltage differences in Case_1,
Case_2, and Case_3, respectively. In the three cases, the values of the RC-π model are
different, and the three cases have different 50% gate delay times. The values of
ΔV1, ΔV2 and ΔV3 are very small, as shown in Fig. 4.7. A large number
of experiments show that the ratio of ΔV to V'out(t80) is within 2% in most cases.
Therefore, the voltage difference ΔV can be neglected to simplify the computation
and save time in gate delay estimation. Moreover, the experimental
results in Sect. 4.3 show that this assumption has only a very small influence on the
accuracy of the effective capacitance. Letting Vout(t80) = V'out(t80) = 0.8Vdd,
Eqs. (4.10) and (4.11) reduce to

$$C_{eff} = C_1 + \frac{C_2 V_{C2}(t_{80})}{0.8 V_{dd}}. \tag{4.12}$$
Applying Kirchhoff's voltage law to the gate load shown in Fig. 4.4(a) gives

$$V_{out}(t) = R\, I_{C2}(t) + V_{C2}(t), \tag{4.13}$$

where IC2(t) is the current of C2, which can be expressed as [12]

$$I_{C2}(t) = C_2 \frac{dV_{C2}(t)}{dt}. \tag{4.14}$$

Using Eq. (4.14), Eq. (4.13) becomes

$$V_{out}(t) = R C_2 \frac{dV_{C2}(t)}{dt} + V_{C2}(t). \tag{4.15}$$

From Eq. (4.15), we obtain

$$V_{C2}(t) = e^{-\frac{t}{R C_2}} \left[ \frac{1}{R C_2} \int_0^t V_{out}(\tau)\, e^{\frac{\tau}{R C_2}}\, d\tau + p \right], \tag{4.16}$$

where the constant p is equal to 0, because VC2(t) = 0 when t = 0. Once
VC2(t) is obtained, the effective capacitance follows from Eq. (4.12). The
voltage of C2 at the time t80 can then be expressed as

$$V_{C2}(t_{80}) = \frac{1}{R C_2} \int_0^{t_{80}} V_{out}(t)\, e^{\frac{t - t_{80}}{R C_2}}\, dt. \tag{4.17}$$
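Here, the expression of VC2(t80) is an integral equation, which can be rewritten as

$$V_{C2}(t_{80}) = \frac{1}{R C_2} \lim_{N \to \infty} \sum_{i=1}^{N} V_{out}(i \Delta t)\, k_i\, \Delta t = \frac{1}{R C_2} \lim_{N \to \infty} \left[ k_1 V_{out}(\Delta t) \Delta t + k_2 V_{out}(2 \Delta t) \Delta t + \cdots + k_N V_{out}(N \Delta t) \Delta t \right], \tag{4.18}$$

where

$$k_i = e^{\frac{i \Delta t - t_{80}}{R C_2}}, \qquad \Delta t = \frac{t_{80}}{N}, \qquad 0 < k_1 < k_2 < \cdots < k_N = 1. \tag{4.19}$$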
Then we can use the integration approximation method introduced in [11] and [12]
to calculate the value of VC2(t80). The output waveform is approximated as a
straight line from the 20% time t20 to the 80% time t80. The integral equation of
VC2(t80) can then be solved as in [11], [12] to give

$$V_{C2}(t_{80}) = 0.8 V_{dd} \left[ 1 - \frac{3 R C_2}{4 (t_{80} - t_{20})} \left( 1 - e^{-\frac{4 (t_{80} - t_{20})}{3 R C_2}} \right) \right]. \tag{4.20}$$
Applying Eq. (4.20) to Eq. (4.12), the effective capacitance is obtained
as

$$C_{eff} = C_1 + C_2 \left[ 1 - \frac{3 R C_2}{4 (t_{80} - t_{20})} \left( 1 - e^{-\frac{4 (t_{80} - t_{20})}{3 R C_2}} \right) \right]. \tag{4.21}$$

Eq. (4.21) reflects the effect of the interconnect
resistance shielding part of the capacitance C2. When the interconnect
resistance R increases from zero to infinity, the value of Ceff decreases from C1+C2 to
C1. Since Eq. (4.21) has only two unknown parameters (the times at which the output reaches
20% and 80% of Vdd), this algorithm does not increase the computational complexity
compared with the conventional algorithms.
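The limiting behaviour of Eq. (4.21) is easy to check numerically. The sketch below evaluates the formula for several interconnect resistances; the function name and all values are hypothetical, and t20 and t80 are held fixed only to expose the trend in R (in the actual procedure they depend on Ceff).

```python
import math

def ceff_eq_4_21(c1, c2, r, t20, t80):
    """Effective capacitance from Eq. (4.21). Units: pF, ohm, ps."""
    if r <= 0.0:
        return c1 + c2                       # no shielding without interconnect resistance
    x = 4.0 * (t80 - t20) / (3.0 * r * c2)
    shielded = -math.expm1(-x) / x           # (1 - e^-x) / x, the shielded fraction of C2
    return c1 + c2 * (1.0 - shielded)


# Assumed example values, not taken from the experiments in the text.
C1, C2, T20, T80 = 0.1, 0.3, 100.0, 250.0
resistances = (0.0, 30.0, 300.0, 3000.0, 1e9)
sweep = [ceff_eq_4_21(C1, C2, r, T20, T80) for r in resistances]
for r, c in zip(resistances, sweep):
    print(f"R = {r:>14.1f} ohm -> Ceff = {c:.4f} pF")
```

Using `math.expm1` avoids cancellation when R is very large and the exponent is close to zero.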
4.2.2 Algorithm for Key Parameters t20 and t80
In our method for calculating the effective capacitance, two key parameters (t20 and t80)
need to be determined, which are the times at which the output response reaches 0.2Vdd
and 0.8Vdd, respectively. The conventional methods for these time points usually use
lookup tables or k-factor equations. However, building the
data tables or high-accuracy k-factor equations is time consuming, and this kind
of data usually has to be supplied by the manufacturer. In this dissertation, we
introduce a simple iterative method that calculates t20 and t80 without empirical
data. The output voltage V'out(t) with the effective capacitance
load shown in Fig. 4.4(c) can be expressed as follows:

$$V'_{out}(t) = \begin{cases} \dfrac{V_{dd}}{t_{in}} \left[ t - R_d C_{eff} + R_d C_{eff}\, e^{-\frac{t}{R_d C_{eff}}} \right], & 0 \le t \le t_{in}, \\[2ex] \dfrac{V_{dd}}{t_{in}} \left[ t_{in} - R_d C_{eff} \left( e^{\frac{t_{in}}{R_d C_{eff}}} - 1 \right) e^{-\frac{t}{R_d C_{eff}}} \right], & t > t_{in}, \end{cases} \tag{4.22}$$
where tin is the transition time of the input signal. From Eq. (4.22), we can obtain the
following expressions for t20 and t80. As the analysis of Eq. (4.22) is the same for
rising and falling inputs, we discuss only the rising case here [21].
When t20 ≤ tin, t20 can be written as

$$t_{20} = 0.2\, t_{in} + R_d C_{eff} \left( 1 - e^{-\frac{t_{20}}{R_d C_{eff}}} \right), \tag{4.23}$$

and when t20 > tin, it can be written as

$$t_{20} = R_d C_{eff} \ln \frac{R_d C_{eff} \left( e^{\frac{t_{in}}{R_d C_{eff}}} - 1 \right)}{0.8\, t_{in}}. \tag{4.24}$$

Similarly, when t80 ≤ tin, t80 can be determined by

$$t_{80} = 0.8\, t_{in} + R_d C_{eff} \left( 1 - e^{-\frac{t_{80}}{R_d C_{eff}}} \right), \tag{4.25}$$

and when t80 > tin, it becomes

$$t_{80} = R_d C_{eff} \ln \frac{R_d C_{eff} \left( e^{\frac{t_{in}}{R_d C_{eff}}} - 1 \right)}{0.2\, t_{in}}. \tag{4.26}$$
Figure 4.8 The flow chart of computing two key points: t20 and t80.
Figure 4.9 Step signal for switch-resistor circuit model.
In Eqs. (4.23)-(4.26) for t20 and t80, the values of tin and Rd are known
quantities, while t20 and t80 are the unknown parameters. Since it is difficult to
compute t20 and t80 directly, we use a simple iterative method to find approximate
values [17]. In Fig. 4.8, the flow chart of the iterative procedure, tx stands for t20 or t80,
which are obtained in the same way. The detailed procedure of the iterative method is
as follows. First, set the initial Ceff equal to the sum of C1 and C2. The value of t20 or t80 is
computed using Eq. (4.24) or Eq. (4.26), respectively. If the result is larger than tin,
this t20 or t80 is the appropriate value for our delay model. Otherwise,
t20 or t80 is recalculated by Eq. (4.23) or Eq. (4.25). To improve the efficiency of the
computation, effective initial values for t20 and t80 are given in the following way [17]:

$$t_{x,1} = t_x(\mathrm{step}) + \frac{x\, t_{in}}{100}, \tag{4.27}$$

and

$$t_{x,2} = t_x(\mathrm{step}) + t_{in}, \tag{4.28}$$

where tx(step) is the value of t20 or t80 with a step input, and x correspondingly
represents 20 or 80 [17]. Figure 4.9 shows the analysis of the switch-resistor model
with the effective capacitance and a step input. From Fig. 4.9, we get the
output voltage V'out(t) as

$$V'_{out}(t) = V_{in}(\mathrm{step}) \left( 1 - e^{-\frac{t}{R_d C_{eff}}} \right), \tag{4.29}$$

where Vin(step) is the step input voltage. Using Eq. (4.29), we obtain

$$t_{20}(\mathrm{step}) = R_d C_{eff} \ln 1.25, \tag{4.30}$$

$$t_{80}(\mathrm{step}) = R_d C_{eff} \ln 5. \tag{4.31}$$

Then the effective initial value is

$$t_{x,0} = \frac{t_{x,1} + t_{x,2}}{2}. \tag{4.32}$$

Therefore, tx,0 can be inserted into Eqs. (4.23) and (4.25) to compute the new values of
t20 and t80. With these initial values, only one or two iterations are needed to obtain the
key time points in most cases [17].
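The branch selection of Fig. 4.8 can be sketched in a few lines. This is an illustrative sketch, not the author's code: it assumes that Eqs. (4.27) and (4.28) are the sums tx(step) + x·tin/100 and tx(step) + tin, it uses a plain fixed-point iteration on Eqs. (4.23)/(4.25), and the function names, tolerance, and component values are hypothetical.

```python
import math

# Units: ohm, pF, ps (ohm * pF = ps); Vdd normalised to 1.

def vout_ceff(t, rd, ceff, tin, vdd=1.0):
    """Eq. (4.22): response of the Rd-Ceff circuit to a saturated ramp input."""
    tau = rd * ceff
    if t <= tin:
        return vdd / tin * (t - tau + tau * math.exp(-t / tau))
    return vdd - vdd / tin * tau * (math.exp(-(t - tin) / tau) - math.exp(-t / tau))


def step_time(x, rd, ceff):
    """Eqs. (4.30)-(4.31): time for a step response to reach x% of Vdd."""
    return rd * ceff * math.log(100.0 / (100.0 - x))


def crossing_time(x, rd, ceff, tin, rel_tol=1e-9, max_iter=200):
    """Time at which the output reaches x% of Vdd (x = 20, 50 or 80)."""
    tau = rd * ceff
    a = tin / tau
    # Branch t_x > tin, Eqs. (4.24)/(4.26). log(e^a - 1) is written as
    # a + log(1 - e^-a) so that a large tin/tau cannot overflow.
    t_late = tau * (math.log(tau / ((1.0 - x / 100.0) * tin)) + a
                    + math.log1p(-math.exp(-a)))
    if t_late > tin:
        return t_late
    # Branch t_x <= tin: fixed-point iteration on Eqs. (4.23)/(4.25),
    # started from the midpoint of the two initial estimates, Eq. (4.32).
    t_x1 = step_time(x, rd, ceff) + x * tin / 100.0
    t_x2 = step_time(x, rd, ceff) + tin
    t = 0.5 * (t_x1 + t_x2)
    for _ in range(max_iter):
        t_next = x / 100.0 * tin + tau * (1.0 - math.exp(-t / tau))
        if abs(t_next - t) <= rel_tol * tin:
            return t_next
        t = t_next
    return t


RD, CEFF, TIN = 300.0, 0.3, 300.0            # assumed example values
t20 = crossing_time(20, RD, CEFF, TIN)
t80 = crossing_time(80, RD, CEFF, TIN)
print(f"t20 = {t20:.2f} ps, t80 = {t80:.2f} ps")
```

With these assumed values t20 falls on the ramp (t20 ≤ tin) and t80 falls after it (t80 > tin), so both branches are exercised. The iteration count depends on the tolerance chosen here and is not meant to reproduce the one or two iterations reported in [17].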
4.2.3 Procedure for Calculating Ceff and Gate Delay
After obtaining the values of t20 and t80, the equivalent load Ceff can be computed by
the following iterative procedure.
1) Let the initial Ceff equal the sum of C1 and C2.
2) Use Ceff to calculate t20 and t80 by the method shown in Fig. 4.8.
3) Apply the values of t20 and t80 to Eq. (4.21) for obtaining a new Ceff.
4) Go back to step 2 with the new Ceff until the iteration converges.
In this method, we define ε = |Ceff,n − Ceff,n−1|, where Ceff,n−1 is the effective capacitance
calculated in the (n−1)th iteration and Ceff,n is the new effective capacitance calculated
from Ceff,n−1. The stopping condition of the iteration is ε < 0.01. The above iterative
method converges within three iterations in most cases and sometimes needs
four. With the resulting Ceff, the key time point t50 that determines
the gate delay can be obtained by the same method as for t20 and t80. When t50 ≤
tin, it can be expressed as

$$t_{50} = 0.5\, t_{in} + R_d C_{eff} \left( 1 - e^{-\frac{t_{50}}{R_d C_{eff}}} \right), \tag{4.33}$$

and when t50 > tin, it can be written as

$$t_{50} = R_d C_{eff} \ln \frac{R_d C_{eff} \left( e^{\frac{t_{in}}{R_d C_{eff}}} - 1 \right)}{0.5\, t_{in}}. \tag{4.34}$$

First, we calculate t50 by Eq. (4.34). If the result is larger than the input transition time
tin, it is the desired result. Otherwise, we use Eq. (4.33) to
calculate t50 with several iterations. This method is very simple and does not need
lookup tables to obtain the gate delay time, although it adds a
small error to the gate delay time. After the value of t50 is obtained, the 50% signal delay
time td is easily computed by [12]

$$t_d = t_{50} - \frac{t_{in}}{2}. \tag{4.35}$$
In the Thevenin model, the gate driving output resistance Rd is determined as
follows. The Thevenin model uses linear components (input Vin(t) and
resistance Rd) to replace the original gate, and the linear resistance Rd is the driving
output resistance of the gate [17]. The value of Rd can be computed using the
largest-possible-load-capacitance method proposed in [9]. The details of
this method are given in Chapter 3 of this dissertation.
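The four-step procedure of this subsection, followed by Eqs. (4.33)-(4.35), can be put together as below. This is a minimal sketch under stated assumptions rather than the author's implementation: the stopping threshold 0.01 is interpreted as an absolute change in pF, the starting guess of the fixed-point iteration is a simple choice made here, and the function names and component values are hypothetical.

```python
import math

# Units: ohm, pF, ps (ohm * pF = ps).

def crossing_time(frac, tau, tin):
    """Time at which the Rd-Ceff output reaches frac*Vdd (Eqs. 4.23-4.26, 4.33-4.34)."""
    a = tin / tau
    t_late = tau * (math.log(tau / ((1.0 - frac) * tin)) + a
                    + math.log1p(-math.exp(-a)))
    if t_late > tin:                             # crossing happens after the input ramp ends
        return t_late
    t = frac * tin + tau                         # simple starting guess (an assumption)
    for _ in range(200):
        t_next = frac * tin + tau * (1.0 - math.exp(-t / tau))
        if abs(t_next - t) <= 1e-9 * tin:
            break
        t = t_next
    return t_next


def ceff_from_times(c1, c2, r, t20, t80):
    """Eq. (4.21)."""
    x = 4.0 * (t80 - t20) / (3.0 * r * c2)
    return c1 + c2 * (1.0 + math.expm1(-x) / x)


def gate_delay(rd, r, c1, c2, tin, tol=0.01, max_iter=50):
    """Returns (Ceff, 50% delay td, number of iterations used)."""
    ceff = c1 + c2                                       # step 1
    for n in range(1, max_iter + 1):
        tau = rd * ceff
        t20 = crossing_time(0.2, tau, tin)               # step 2
        t80 = crossing_time(0.8, tau, tin)
        new_ceff = ceff_from_times(c1, c2, r, t20, t80)  # step 3
        converged = abs(new_ceff - ceff) < tol           # tol assumed to be in pF
        ceff = new_ceff
        if converged:                                    # step 4
            break
    t50 = crossing_time(0.5, rd * ceff, tin)
    return ceff, t50 - tin / 2.0, n                      # Eq. (4.35)


C1, C2 = 0.1, 0.3                                        # assumed example values
ceff, td, iterations = gate_delay(rd=300.0, r=300.0, c1=C1, c2=C2, tin=300.0)
print(f"Ceff = {ceff:.4f} pF, td = {td:.1f} ps after {iterations} iteration(s)")
```

Because Eq. (4.21) always returns a value between C1 and C1+C2, the iterate stays bounded. A tighter `tol` gives a more fully converged Ceff at the cost of extra iterations.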
4.3 Tests and Comparisons
Meanwhile, the parameters of load capacitance
and resistance are taken from [20]. The test results clearly show that the proposed
method is more accurate than the total load capacitance model
(C_total) and the methods in [4], [6], [15], [17].
The method in [4] has an average error of about 7.3%, as shown in Table 4.3. Its
procedure also converges within four iterations in most cases.
However, the computation in each iteration of [4] is very complicated. In contrast, the
proposed method for effective capacitance is very simple: only two parameters,
t20 and t80, need to be determined in each iteration. As a typical method among the
foregoing works, the method in [4] is suitable for comparison with the proposed
method. Therefore, comparisons of the proposed method with [4] are given in
Figs. 4.10 and 4.11 and in Table 4.4. The method in [6], with an average error of
about 4.4% (Table 4.3), is more accurate than that in [4]. However, the parameters of
the effective capacitance algorithm in [6] take much time to obtain. The paper [15]
uses an iteration-less method to simplify the procedure. Nevertheless,
it sacrifices accuracy, and its average error is much larger than that of the proposed method.
The detailed comparison data are given in Table 4.3. In [17], the charge-difference
parameter of Eq. (4.3) represents the influence of the charge difference, and its value is a fixed
number. The error of the calculated gate delay can be more than 18% in the worst
case. Compared with the method in [17], our proposed method captures the varying
influence of the charge difference in different circuits and has an average error within
1.3% of the SPICE simulation results.
4.3.1 Experimental Results for Various Rd and tin
CMOS process. In these experiments, the gate is represented by the gate
driving output resistance Rd of the Thevenin model. The interconnect load of the gate is
converted into the RC-π model.
The accuracy of the proposed method, the C_total model, and
the method in [4] relative to HSPICE is shown in Fig. 4.10. For the gate delay
calculation, the input transition time is set to tin = 200ps and the interconnect
resistance R is fixed. The ratio of C1 to C2 is
1/2, with C1 equal to 0.1pF. Fig. 4.10 shows that although the gate
output resistance Rd varies, the result of the proposed
method is always close to the SPICE simulation. The error of the proposed
model for gate delay is much smaller than the error of the C_total model, which can be more
than 35%. Furthermore, under the same conditions, the accuracy of our model is also
better than that of [4].
Figure 4.10 Test results of gate delay evaluation with different Rd.
Figure 4.11 shows the test results of the proposed and conventional methods with the
input transition time tin varying over a wide interval (100ps - 800ps). The ratio of C1 to C2
is 1/2, with C1 equal to 0.1pF. The resistances R and Rd are fixed.
The data in Fig. 4.11 show that the proposed
model is more accurate than [4]. Compared with the simulation results, the error of our
model remains in a small range for the different values of tin. With the same
parameters, the C_total model has a larger error that can be more than 60%.
Figure 4.11 Test results of gate delay evaluation with different tin.
4.3.2 Experimental Results for Various Capacitance Values
Some examples are tested for various RC-π loads, and the results are shown in the
following tables. In the first experiment, the input
transition time is tin = 300ps and the driving output resistance Rd is fixed. Table 4.1 shows the
delay times and their errors for various values of C1 and C2 when R =
300 Ω. These values are simulated by SPICE and calculated by the proposed
method and [17]. The test results show that the accuracy of [17] is not stable: in
some cases the error is small, while in other cases it becomes very large (more than 18%).
The reason is that [17] uses a constant empirical value to evaluate the influence
of the charge difference, which cannot capture its variation under different
gate and interconnect conditions. In contrast, the proposed method is more accurate
than [17], and its error is always within 3%, so its range of variation is very small. In
the results of the proposed method, the errors of some cases are zero. This is because the
errors of the proposed method are very small in these cases and the gate
delay values are given with a precision of 1ps, so the small error is not visible. Compared
with [17], the clear advantage of the proposed method is that it keeps a relatively high
accuracy with different gates and loads.
Table 4.1 Gate delay calculation for various RC-π loads.

| C1/C2 (pF/pF) | Ceff [17] (pF) | Ceff Proposed (pF) | Delay SPICE (ps) | Delay [17] (ps) | Delay Proposed (ps) | Error [17] (%) | Error Proposed (%) |
|---|---|---|---|---|---|---|---|
| 0.1/0.1 | 0.193927 | 0.188809 | 82 | 85 | 82 | 3.7 | 0 |
| 0.1/0.3 | 0.276461 | 0.313563 | 122 | 111 | 124 | 9.0 | 1.6 |
| 0.1/0.6 | 0.341228 | 0.428011 | 165 | 135 | 161 | 18.2 | 2.4 |
| 0.2/0.3 | 0.384439 | 0.420953 | 155 | 146 | 159 | 5.8 | 2.6 |
| 0.2/0.6 | 0.45848 | 0.557853 | 209 | 178 | 205 | 14.8 | 1.9 |
| 0.2/0.35 | 0.400819 | 0.446449 | 166 | 152 | 167 | 8.4 | 0.6 |
| 0.3/0.1 | 0.385826 | 0.390738 | 147 | 146 | 147 | 0.7 | 0 |
| 0.3/0.3 | 0.493244 | 0.527889 | 195 | 184 | 196 | 5.6 | 0.5 |
| 0.3/0.6 | 0.580499 | 0.665422 | 247 | 212 | 242 | 14.2 | 2.0 |
| 0.3/0.9 | 0.661401 | 0.780317 | 291 | 240 | 285 | 17.5 | 2.1 |
| 0.05/0.15 | 0.161467 | 0.174568 | 77 | 72 | 77 | 6.5 | 0 |
| 0.08/0.2 | 0.217213 | 0.237677 | 99 | 92 | 99 | 7.1 | 0 |
| 0.03/0.09 | 0.104826 | 0.110288 | 50 | 48 | 51 | 4.0 | 2.0 |
| Average Error | | | | | | 8.9 | 1.2 |
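The error columns of Table 4.1 can be recomputed from the delay columns. The short check below assumes that the error is defined as |model − SPICE| / SPICE; that definition is not stated explicitly in the text, but it reproduces the tabulated averages. The delay values are copied from Table 4.1.

```python
# (SPICE, [17], proposed) gate delays in ps, copied from Table 4.1.
rows = [
    (82, 85, 82), (122, 111, 124), (165, 135, 161), (155, 146, 159),
    (209, 178, 205), (166, 152, 167), (147, 146, 147), (195, 184, 196),
    (247, 212, 242), (291, 240, 285), (77, 72, 77), (99, 92, 99), (50, 48, 51),
]


def percent_error(model, reference):
    """Relative error of a model delay against the SPICE reference, in percent."""
    return abs(model - reference) / reference * 100.0


err_ref17 = [percent_error(m17, spice) for spice, m17, _ in rows]
err_prop = [percent_error(prop, spice) for spice, _, prop in rows]
avg_ref17 = sum(err_ref17) / len(rows)
avg_prop = sum(err_prop) / len(rows)
print(f"average error of [17]: {avg_ref17:.1f} %, worst case {max(err_ref17):.1f} %")
print(f"average error of proposed method: {avg_prop:.1f} %, worst case {max(err_prop):.1f} %")
```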
Table 4.2 also compares [17] and the proposed method for
various values of C1 and C2. The gate output resistance is Rd = 200 Ω, the
interconnect resistance is R = 450 Ω, and the input transition time is tin = 200ps. In this
experiment, the interconnect resistance R is more than twice Rd. The
interconnect effect is therefore strong with this type of load, which shields more of the
capacitance C2. However, the proposed method still captures the delay time accurately. In this
comparison, the average error of [17] is 7.2% while that of the proposed method is
only 1.3%.
Table 4.2 Gate delay calculation for various RC-π loads.

| C1/C2 (pF/pF) | Ceff [17] (pF) | Ceff Proposed (pF) | Delay SPICE (ps) | Delay [17] (ps) | Delay Proposed (ps) | Error [17] (%) | Error Proposed (%) |
|---|---|---|---|---|---|---|---|
| 0.02/0.1 | 0.080112 | 0.092847 | 18 | 16 | 18 | 11.1 | 0 |
| 0.02/0.05 | 0.058907 | 0.062986 | 12 | 11 | 12 | 8.3 | 0 |
| 0.03/0.08 | 0.083189 | 0.092325 | 18 | 17 | 18 | 5.6 | 0 |
| 0.03/0.15 | 0.112549 | 0.123336 | 24 | 22 | 23 | 8.3 | 4.2 |
| 0.12/0.28 | 0.22295 | 0.252396 | 46 | 41 | 45 | 10.9 | 2.2 |
| 0.05/0.18 | 0.132266 | 0.152986 | 30 | 28 | 29 | 6.7 | 3.3 |
| 0.28/0.48 | 0.414631 | 0.442043 | 72 | 69 | 72 | 4.2 | 0 |
| 0.6/0.88 | 0.757086 | 0.818064 | 122 | 115 | 121 | 5.7 | 0.8 |
| 0.15/0.25 | 0.241569 | 0.273068 | 49 | 45 | 48 | 8.2 | 2.0 |
| 0.8/1.2 | 0.972051 | 1.055126 | 156 | 142 | 153 | 9.0 | 1.9 |
| 0.5/0.3 | 0.612665 | 0.650287 | 98 | 93 | 98 | 5.1 | 0 |
| 0.8/0.4 | 0.940123 | 0.987931 | 144 | 139 | 143 | 3.5 | 0.7 |
| Average Error | | | | | | 7.2 | 1.3 |
4.3.3 Experimental Results for Various Gates
Table 4.3 provides the comparison when the gate and the values of the RC-π load are
varied, using a 0.1 μm CMOS process. In this experiment, two inverters are
connected in series. The transistor widths of the first inverter are fixed at Wp =
20 μm/Wn = 10 μm. In the second inverter, the n-channel width Wn is varied as listed
in Table 4.3, and the width ratio of p-channel to n-channel is kept at Wp/Wn = 2/1. The
parameters of the interconnect load (C1, C2 and R) are varied as shown in Table 4.3, and
the transition time of the input waveform is tin = 300ps. The table shows that the
average error of the proposed method is only 1.3%, while the average errors of the
methods [4], [6] and [15] are about 7.3%, 4.4% and 11.4%, respectively.
Table 4.3 Test results of gate delay evaluation with varied circuit parameters.

| Wp/Wn | C1/C2/R | Delay SPICE (ps) | Delay [4] (ps) | Delay [6] (ps) | Delay [15] (ps) | Delay Proposed (ps) | Error [4] (%) | Error [6] (%) | Error [15] (%) | Error Proposed (%) |
|---|---|---|---|---|---|---|---|---|---|---|
| 10/5 | 0.1/0.25/300 | 70.2 | 74.3 | 73.5 | 62.8 | 69.7 | 5.8 | 4.7 | 10.5 | 0.7 |
| 10/5 | 0.3/0.8/280 | 161.1 | 175.3 | 168 | 143.2 | 164.3 | 8.8 | 4.3 | 11.1 | 2.0 |
| 20/10 | 0.15/0.35/260 | 60.8 | 63.9 | 63.2 | 52.7 | 59.3 | 5.1 | 3.9 | 13.3 | 2.5 |
| 20/10 | 0.3/0.7/620 | 77.1 | 82.6 | 80.3 | 70.3 | 76.7 | 7.1 | 4.2 | 8.8 | 0.5 |
| 40/20 | 0.2/0.3/450 | 63.9 | 67.8 | 66.2 | 56.2 | 63.1 | 6.1 | 3.6 | 12.1 | 1.3 |
| 40/20 | 0.5/0.9/800 | 83.3 | 89.2 | 87.5 | 73.8 | 84.2 | 7.1 | 5.0 | 11.4 | 1.1 |
| 80/40 | 0.1/0.25/200 | 76.2 | 81.5 | 79.9 | 69.6 | 75.6 | 7.0 | 4.9 | 8.7 | 0.8 |
| 80/40 | 0.35/0.6/1000 | 85.1 | 91.7 | 89.1 | 74.2 | 86 | 7.8 | 4.7 | 12.8 | 1.1 |
| 100/50 | 0.4/0.5/410 | 93.7 | 101.7 | 97.4 | 81.3 | 94.6 | 8.5 | 3.9 | 13.2 | 1.0 |
| 100/50 | 0.9/0.3/320 | 110.3 | 120.8 | 115.3 | 96.7 | 112.2 | 9.5 | 4.5 | 12.3 | 1.7 |
| Average Error | | | | | | | 7.3 | 4.4 | 11.4 | 1.3 |
In Table 4.4, the n-channel transistor width Wn is varied, and the width ratio of
n-channel to p-channel Wn/Wp is
always equal to 1/2. The other parameters for delay calculation are tin = 200ps and
C1/C2 = 0.3pF/0.4pF, with the interconnect resistance R fixed. The
average error of the proposed method is only 1.1%, which is much smaller than the
7.7% error of method [4]. With the same parameters, the error of the C_total model is
36.9%. Although the C_total model is efficient, its accuracy is unacceptable in
modern IC designs. Thus, the advanced model of this dissertation is
more accurate than the C_total model and method [4], even
when both the process technology and the circuit parameters are changed.
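Table 4.4 Test results of delay evaluation with different sizes of gate.

| Wp/Wn | Delay HSPICE (ps) | Delay C_total (ps) | Delay [4] (ps) | Delay Proposed (ps) | Error C_total (%) | Error [4] (%) | Error Proposed (%) |
|---|---|---|---|---|---|---|---|
| 1/0.5 | 2841 | 2916 | 2800 | 2826 | 2.6 | 1.4 | 0.5 |
| 2/1 | 1319 | 1393 | 1278 | 1309 | 5.6 | 3.1 | 0.8 |
| 5/2.5 | 493 | 568 | 462 | 489 | 15.2 | 6.3 | 0.8 |
| 10/5 | 231 | 307 | 221 | 228 | 32.9 | 4.3 | 1.3 |
| 20/10 | 115 | 180 | 124 | 112 | 56.5 | 7.8 | 2.6 |
| 30/15 | 89 | 138 | 99 | 88 | 55.1 | 11.2 | 1.1 |
| 40/20 | 77 | 117 | 85 | 76 | 51.9 | 10.4 | 1.3 |
| 50/25 | 70 | 105 | 77 | 69 | 50 | 10 | 1.4 |
| 60/30 | 65 | 96 | 72 | 64 | 47.7 | 10.8 | 1.5 |
| 70/35 | 62 | 88 | 67 | 61 | 41.9 | 8.1 | 1.6 |
| 80/40 | 59 | 84 | 64 | 59 | 42.3 | 8.5 | 0 |
| 90/45 | 57 | 80 | 62 | 56 | 40.4 | 8.8 | 1.8 |
| 100/50 | 56 | 77 | 61 | 56 | 37.5 | 8.9 | 0 |
| Average Error | | | | | 36.9 | 7.7 | 1.1 |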
4.4 Conclusions
This research focuses on improving the accuracy of effective capacitance
calculation in timing analysis. In STA, the Thevenin model, which replaces the
nonlinear gate with linear components, is widely used to calculate the cell-level
delay. Most of the conventional methods for the effective capacitance Ceff use the
condition that the charges flowing into the RC-π load and the Ceff load are equal from the
initial time to the 50% output time. This condition has been shown to be accurate for the
actual gate model. However, the charges stored in the two
kinds of load differ when the Thevenin model is used. One reason is that the Thevenin model
uses a voltage source as the drive stage, while the actual gate behaves like a current source
in its operating region. The charge difference between the RC-π load and the Ceff load has a
significant influence on the effective capacitance. The charge difference and the delay
error it causes are described in detail in Sect. 4.1.
To solve the charge difference problem without adding much complexity,
the characteristics of the charge flowing into the RC-π load and the Ceff load are
used to form a new charge condition for the effective capacitance. With the new
condition and some assumptions, a simple and practical effective capacitance algorithm
is obtained. In the proposed method, only two parameters, t20 and t80, need to be
determined in the iterative procedure for the effective capacitance. We also
give a method to calculate t20 and t80 without empirical
methods (lookup tables or k-factor equations), which need a large amount of test
data and costly memory space.
Many tests and comparisons have been carried out to check the accuracy of the proposed
method with different CMOS process technologies. Compared with conventional
methods, the results of the proposed method are more accurate
and much closer to HSPICE results (the average error is within 1.3%). Furthermore,
the accuracy of the proposed method is stable under various conditions.
References
[1]
Computer-Aided Design of Integrated Circuits and
Systems, vol. 2, no. 3, pp. 202-211, July 1983.
[2] P. Modeling the driving-point characteristic of
resistiv Proc. IEEE International
Conference on Computer-Aided Design, pp. 512-515, Nov. 1989.
[3] C. L. Ratzlaff, Modeling the RC-interconnect
effect Proc. IEEE Custom Integrated Circuits
Conference, pp. 15.6.1-15.6.4, May 1992.
[4] J. Qia Modeling the effective capacitance for the
RC in Trans. on Computer-Aided Design of
Integrated Circuits and Systems, vol. 13, pp. 1526-1535, Dec. 1994.
[5] J. L. Rossello and J. Segur -based analytical model for the
Trans.
on Computer-Aided Design of Integrated Circuits and Systems, vol. 21, no. 4,
pp. 433-448, Apr. 2002.
[6] Calculating the effective capacitance for the
RC in c
Design Automation Conference, pp. 43-48, Jan. 2003.
[7]
algorithm for effective
International Workshop on System-on-Chip for Real-Time Applications, pp.
99-104, July 2004.
[8] F. Dartu, N. Menezes, J. Q delay model for
high-s ACM/IEEE Design Automation
Conference, pp. 576-580, June 1994.
[9] F. Dartu, Performance computation for
precharacterized CMOS gates with RC loads, IEEE Trans. on
Computer-Aided Design of Integrated Circuits and Systems, vol. 15, no. 5, pp.
544-553, May 1996.
[10] R. Arunachalam CMOS gate delay models for
Proc. IEEE International Conference on
Computer-Aided Design, pp. 224-229, Oct. 1997.
[11] Z. Hu ive capacitance for gate delay
Proc. IEEE International Symposium on Circuits and System,
pp.2795-2798, May 2005.
[12] Z. Huang, A. A novel model for computing
the effective capacitance of CMOS gates with interconnect
Trans. Fundamentals, vol. E88-A, no.10, pp.2562-2569, October 2005.
[13] Z. Huang, A. Modeling the effective
capacitance of interconnect loads for predicting IEICE
Trans. Fundamentals, vol. E88-A, no.12, pp. 3367-3374, Dec. 2005.
[14] ective capacitance computations for
use in Proc. the 12th International Conference
on VLSI Design, pp. 578-582, Jan. 1999.
[15] A. B. Kahng and S. New efficient algorithms for computing effective
Proc. ACM/IEEE International Symposium on Physical Design,
pp. 147-151, April 1998.
[16] Osculating Thevenin model for predicting delay and slew of
capacitively characterize 39th ACM/IEEE Design Automation
Conference, pp. 866-869, June 2002.
[17] S. Fang, Z. Hua Calculating the effective
capacitance Proc.
International Conference on Communications, Circuits and Systems, pp.
2474-2477, June 2006.
[18] A new algorithm for computing the effective
capacitan Proc. IEEE Custom Integrated
Circuits Conference, pp. 313-316, June 1998.
[19] C. V. Kashyap An effective capacitance based
on del Proc. IEEE International Conference on
Computer-Aided Design, pp. 229-234, Nov. 2000.
[20] International technology roadmap for semiconductors 2003: Semiconductor
Industry Association.
[21] Huang, Zhangcai Study on modeling, analysis and design techniques for
nonlinear circuits and systems DSpace at Waseda University, 2009.
[22] Z. Huang, A. Kurokawa, Y. Yang, H. Yu, and Y. Inoue, "Modeling the
influence of input-to-output coupling capacitance on CMOS inverter delay,"
IEICE Trans. on Fundamentals, vol. E89-A, No. 4, pp. 840-846, Apr. 2006.
[23] Z. Huang, A. Kurokawa, M. Hashimoto, T. Sato, M. Jiang, and Y. Inoue,
delay analysis in
-Aided Design of
Integrated Circuits and Systems, vol. 29, no. 2, pp. 250-260, Feb. 2010.
Chapter 5
A Non-iterative Method for Delay Calculation of CMOS Gates
The preceding chapters of this dissertation introduced methods for computing gate
delay accurately. This chapter addresses efficiency, the other main concern of a gate
delay model besides accuracy, and proposes a non-iterative method to improve it.
As the feature size of VLSIs decreases into the nanometer region, VLSI designs
become more complicated, with more gates and interconnects. A 32nm CMOS
process can have 11 layers of metal wire, which increases the coupling capacitance of
the interconnects. Thus, obtaining an accurate gate delay value becomes more difficult
and time-consuming than ever. Conventional methods usually use iterative algorithms
to ensure the accuracy of the effective capacitance, which is a key parameter for
evaluating the delay of a logic gate and capturing the output signal shape of the real
gate response in STA [13]. Efficiency is sacrificed accordingly, because each gate
calculation needs several iterations. We discuss the problem in detail in Sects. 5.1
and 5.2.
In this dissertation, an accurate and efficient approach is proposed for gate delay
estimation. Using the linear relationship between the gate output time points and the
effective capacitance Ceff, a polynomial approximation allows the nonlinear effective
capacitance equation to be solved without iteration. SPICE simulation results show
that the polynomial is quite close to the original function and does not add significant
error to the gate delay value. The proposed method can be implemented in both the
actual gate model and the Thevenin model. The detailed analysis and derivation are
presented in Sect. 5.3.
Compared with the conventional methods, the proposed method improves the
efficiency of gate delay calculation. At the same time, it keeps a relatively high
accuracy, with an average error of 2.8% relative to SPICE simulation results. The
experimental results and comparisons are shown in Sect. 5.4.
5.1 Introduction
In high-performance digital IC designs, timing is critical to correct logic function.
Designers therefore have to estimate whether a VLSI circuit can operate at the
specified frequency. Because circuit simulation consumes a large amount of time,
Static Timing Analysis (STA) is an appropriate method for computing the expected
timing of a circuit quickly and accurately without any circuit simulation. In STA,
the crucial task is to calculate the signal propagation delay, which is the sum of the
gate delay and the interconnect delay [26]. Techniques for calculating the interconnect
delay accurately and efficiently are developing rapidly, such as the AWE method [5],
[6] and the PVL method [7]. In contrast, it is difficult to obtain a precise and efficient
gate delay model, because a CMOS gate is composed of nonlinear components.
Therefore, modeling the gate delay accurately and efficiently is very important for
both performance estimation and optimization of high-performance integrated circuits.
Various approaches to gate delay estimation are based on generating the effective
capacitance Ceff, which is the equivalent load of the actual interconnect. Most of
these conventional methods use iterative algorithms to ensure accuracy [8]-[14],
[17]-[19], [21], [22]. As the feature size of process technology scales down, even a
single VLSI system becomes more powerful, and the design has more gates and is
more complicated. Iterative methods for computing the gate delay of such VLSI
systems are inefficient. In an iterative method, the procedure is repeated until the
effective capacitance converges to an acceptable value. Generally, this takes three or
four iterations. However, the number of iterations increases greatly when the initial
value is poor. Furthermore, most of these methods do not consider the convergence
conditions of their algorithms, so the algorithms fail to converge in some cases.
Moreover, when an iterative method is applied in a tight synthesis-analysis loop of
circuit delay estimation, the evaluation procedure may need to be repeated hundreds
of times after each design modification because of the gate interconnect effect.
Consequently, the runtime for gate delay estimation may become unacceptable in such
situations. In contrast, non-iterative methods for calculating the effective capacitance
were proposed in [15], [16], and [20]. However, these non-iterative methods either
gave results of clearly lower accuracy [15], [16] or used an over-simplified gate model
that does not consider the influence of the gate interconnect load [20].
In this dissertation, an effective non-iterative approach for gate delay calculation is
proposed. It overcomes the low accuracy of conventional non-iterative methods and
improves on the efficiency of iterative methods. The proposed method does not
require any iterations to obtain the gate delay and has an average error within 3%
compared to SPICE results. With relatively high efficiency and accuracy, the proposed
method is therefore suitable for circuit optimization loops.
5.2 Preliminaries
Most conventional approaches to gate delay modeling fall into two types. In the
first type, the gate delay and the output signal are analyzed through the load
capacitance and the input transition time by using equivalent gate models, namely the
switch-resistor model [4] and the current source cell model [19]. The switch-resistor
model treats each actual gate as a combination of the gate driving output resistance
Rd and a step voltage source, as shown in Fig. 5.1(a). In contrast, the current source
cell model treats each gate as a current source, the gate driving output resistance Rg
and the equivalent gate parasitic capacitance Cg in parallel, as shown in Fig. 5.1(b).
In the second type, empirically derived expressions for the gate delay and the output
waveform are pre-characterized as functions of the input condition tin and the
capacitive load CL in the actual gate model (k-factor equation) [1], [8].
Figure 5.1 Equivalent gate models: (a) switch-resistor model; (b) current source cell model.
As the feature size of IC technology scales down, the resistance of interconnect
loads becomes comparable to or larger than the gate output resistance, and obtaining
an accurate gate delay becomes more and more difficult. To further improve the
accuracy of the gate model, the step voltage source in the switch-resistor model is
replaced with a time-varying voltage source [10], [20]. However, in this model both
the time-varying voltage and the series resistor Rd have to be derived iteratively.
Similarly, although the current source cell model can be made time-varying to capture
the effect of the interconnect load resistance, this modification makes the model much
more complicated. Besides, the equivalent gate models are usually used in cell level
delay calculation. Compared with cell level delay estimation, gate level delay
computation consumes more time. Thus, we first derive the non-iterative method for
the effective capacitance based on the actual gate model.
When the load of a gate is a pure capacitor, the gate delay can be completely
pre-characterized as a function of the input signal transition time tin and the load
capacitance CL [1], [8]. In STA, the gate delay td is defined as the difference between
the time at which the input reaches 50% of the full voltage (0.5tin) and the time at
which the output reaches 50% of the full voltage (t50); this is called the 50% point
delay. The delay time td is fitted to the k-factor equation, which approximates td as a
constant plus k times the load capacitance CL for a given tin [1], [8]:
$$t_d = k(t_{in}, C_L). \tag{5.1}$$
In order to replace the pure capacitance CL in Eq. (5.1), the original π-load of a logic
gate should be converted into an equivalent capacitive load Ceff [8], [26]. The RC-π
model uses a structure of two capacitors and a resistor instead of the general
interconnect wire net, because the gate output with the π-load captures the gate output
waveform with the actual interconnect load well [2], [26]. Since the RC-π load has
been proven accurate, this chapter also uses it to approximate the interconnect loads
of a gate, in the same way as the conventional methods [2], [8]-[19], [21], [22]. In
order to obtain the effective capacitance from the RC-π load, we can use the
following condition:
$$Q_{\pi}(t_{50}) = Q_{C_{eff}}(t_{50}). \tag{5.2}$$
In the actual gate model, Eq. (5.2) is proven accurate enough for gate delay
calculation. Based on Eq. (5.2), an accurate effective capacitance algorithm was
proposed in [13] and can be written as
$$C_{eff} = C_1 + C_2\left[1 - \frac{3RC_2}{5(t_{50}-t_{20})}\left(1 - e^{-\frac{5(t_{50}-t_{20})}{3RC_2}}\right)\right], \tag{5.3}$$
where t20 is the time at which the gate response Vout(t) reaches 20% of the full
voltage. In Eq. (5.3), Ceff is the result that we want to obtain, and t50 and t20 are
unknown parameters that are themselves determined by Ceff. Therefore, Eq. (5.3)
cannot be solved for Ceff explicitly, and an iterative procedure is needed to compute
an approximate result.
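The iterative procedure implied by Eq. (5.3) can be sketched as a plain fixed-point loop. This is only an illustration and not the algorithm of [13]: the linear functions t50_of and t20_of, the starting value C1 + C2, the tolerance and all numeric values are assumptions made for the example (times in ns, capacitances in pF, resistance in kOhm).

```python
import math

def ceff_from_times(c1, c2, r, t50, t20):
    """Right-hand side of Eq. (5.3) for given output time points."""
    z = 5.0 * (t50 - t20) / (3.0 * r * c2)  # z > 0 because t50 > t20
    return c1 + c2 * (1.0 - (1.0 - math.exp(-z)) / z)

def iterate_ceff(c1, c2, r, t50_of, t20_of, rel_tol=1e-6, max_iter=50):
    """Fixed-point iteration on Ceff; returns (Ceff, iterations used)."""
    ceff = c1 + c2  # assumed starting guess: the total capacitance
    for count in range(1, max_iter + 1):
        updated = ceff_from_times(c1, c2, r, t50_of(ceff), t20_of(ceff))
        if abs(updated - ceff) <= rel_tol * (c1 + c2):
            return updated, count
        ceff = updated
    raise RuntimeError("Ceff iteration did not converge")

# Hypothetical gate characterization (not taken from the dissertation).
def t50_of(c):
    return 0.05 + 0.8 * c

def t20_of(c):
    return 0.02 + 0.3 * c

ceff, iterations = iterate_ceff(0.05, 0.2, 0.2, t50_of, t20_of)
print(f"Ceff = {ceff:.4f} pF after {iterations} iterations")
```

Each pass re-evaluates t50 and t20 for the current Ceff, which is the repeated work that the non-iterative method of Sect. 5.3 is designed to avoid.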
[Flowchart blocks: modify gate and interconnect; extract interconnect data; calculate
driver model data and Ceff data; convergence? (yes/no); calculate delay and output
transition time; evaluate the circuit performance (does it work well? yes/no);
complete design. The "no" branch of the performance check forms the optimization loop.]
Figure 5.2 Iterative procedure for delay calculation in the circuit optimization loop.
Like most iterative methods for gate delay, this procedure usually converges within
several iterations. For a single gate calculation, the iterative procedure does not
impair the efficiency much. However, for a cell or a system that has hundreds of
thousands of gates, the procedure spends much time on the iterations. Moreover, the
problem is much worse in the tight circuit optimization loop of delay estimation.
Figure 5.2 shows the iterative procedure for delay estimation in the tight circuit
optimization loop. In the optimization loop, designers need to modify the gates and
interconnect loads of a cell or a system many times to obtain good circuit
performance. During gate delay calculation, the output of the previous gate is the
input of the next gate, and the interconnect loads have a notable effect on the gate
delay. Therefore, designers should recompute the total cell delay after every
modification. When an iterative method is used, the optimization loop spends a large
amount of time on the procedure iterations. Besides, the method has the following
problem. Figure 5.3 shows the trace of Eq. (5.3), where the component
5(t50 - t20)/(3RC2) is taken as the variable. In Fig. 5.3, as 5(t50 - t20)/(3RC2)
increases, Ceff increases while its rate of increase decreases. This means that the
convergence speed of the iterative procedure changes with the value of
5(t50 - t20)/(3RC2). In cases where the convergence is slow, the number of iterations
increases and three or four iterations are not enough.
Figure 5.3 The trace of Ceff with the various values of component 5(t50 - t20)/(3RC2).
(Vertical axis marks: C1 and C1+C2.)
5.3 Proposed Model
In this section, an accurate non-iterative method is introduced in detail to improve
the efficiency of the procedure for Ceff. Since the analysis for a falling input signal
is the same as that for a rising input signal, we focus on the rising input signal for
simplicity.
5.3.1 Analytical Derivation for Non-iterative Algorithm
In order to obtain a non-iterative method for Ceff without adding complexity, we
modify the original Eq. (5.3) into a new equation that can be solved explicitly. First,
Eq. (5.3) can be simplified as follows. Let x = -5(t50 - t20)/(3RC2); then the
algorithm for Ceff becomes
$$C_{eff} = C_1 + C_2\left[1 + \frac{1}{x}\left(1 - e^{x}\right)\right], \tag{5.4}$$
where x < 0. If Ceff can be expressed in terms of x, Eq. (5.4) becomes an equation
with only one unknown parameter x. When the interconnect loads of a gate are reduced
to the pure capacitance Ceff, the effective capacitance Ceff can replace the load
capacitance CL in the k-factor equation. At the same time, the gate delay time can be
expressed as td = t50 - 0.5tin. Then, from the theory of the k-factor equation, the
relationship between the time t50 and the effective capacitance Ceff can be described
as a straight line. Figure 5.4 shows this linear relationship between t50 and Ceff in
actual cases. When the gate parameters and the input transition time are fixed, t50
varies linearly with Ceff. Thus, t50 can be expressed as
$$t_{50} = k_1 + k_2 C_{eff}, \tag{5.5}$$
where k1 and k2 are the coefficients of the straight line. With two known points on the
line, the values of k1 and k2 can be determined. Because the unknown parameter t20,
like t50, has a linear relationship with Ceff, we apply the same k-factor approach, that
is, gate pre-characterization, to this time value [13]. Figure 5.5 shows that the
relationship between t20 and Ceff is also a straight line in actual cases. Then t20 can
be written as
$$t_{20} = k_3 + k_4 C_{eff}, \tag{5.6}$$
where k3 is the value of t20 when Ceff = 0, and k4 is the slope of the line. For a given
gate, the values of t50 and t20 are stored in two characterized output time tables,
respectively. Both tables are indexed by tin and Ceff, as shown in Fig. 5.6.
Figure 5.4 Actual cases of the linear relationship between t50 and Ceff.
Figure 5.5 Actual cases of the linear relationship between t20 and Ceff.
Figure 5.6 The t50 and t20 tables with the indexes of Ceff and tin. (The table is
indexed by tin (ps) and Ceff (fF); the marked cell is the place where the value of t50
or t20 is stored when tin = 100ps and Ceff = 100fF.)
With the t50 table, we can obtain values of t50 and the corresponding Ceff when the
input transition time tin is known. Substituting these values into Eq. (5.5), k1 and k2
can be obtained. In the same way, k3 and k4 can be calculated from the t20 table and
Eq. (5.6). With Eqs. (5.5) and (5.6), the difference between t50 and t20 can be
expressed as
$$t_{50} - t_{20} = (k_1 - k_3) + (k_2 - k_4) C_{eff}. \tag{5.7}$$
In the proposed method, the t50 and t20 tables are used to calculate the coefficients
of the straight lines. Unlike the lookup table method or the k-factor equation, these
tables need not store large amounts of data to achieve high accuracy in the gate delay
times. The reason is that the t50 line and the t20 line have similar characteristics over
various Ceff values, so the errors of these coefficients are largely cancelled by the
subtraction in Eq. (5.7).
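As a small illustration of how the line coefficients of Eqs. (5.5) and (5.6) follow from two table entries each, consider the sketch below. The table values are invented for the example (Ceff in fF, times in ps), and line_coefficients is a hypothetical helper name, not part of the original method description.

```python
def line_coefficients(point_a, point_b):
    """Intercept and slope of the straight line through two (Ceff, time) points."""
    (c_a, t_a), (c_b, t_b) = point_a, point_b
    slope = (t_b - t_a) / (c_b - c_a)
    return t_a - slope * c_a, slope

# Hypothetical table entries for one fixed tin: (Ceff in fF, time in ps).
k1, k2 = line_coefficients((10.0, 60.0), (100.0, 240.0))  # t50 line, Eq. (5.5)
k3, k4 = line_coefficients((10.0, 30.0), (100.0, 120.0))  # t20 line, Eq. (5.6)

def t50_minus_t20(ceff):
    """Difference of the two lines, Eq. (5.7)."""
    return (k1 - k3) + (k2 - k4) * ceff

print(k1, k2, k3, k4, t50_minus_t20(50.0))
```

Only two entries per table are needed for a given tin, which is why the tables can stay small.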
Substituting Eq. (5.7) into the equation x = -5(t50 - t20)/(3RC2), we can obtain
$$C_{eff} = -\frac{3RC_2}{5(k_2 - k_4)}\,x + \frac{k_3 - k_1}{k_2 - k_4}. \tag{5.8}$$
In order to simplify the expression, let k = -3RC2/[5(k2 - k4)] and
u = (k3 - k1)/(k2 - k4), so that Eq. (5.8) reads Ceff = kx + u. Substituting Eq. (5.8)
into Eq. (5.4), we can obtain
$$kx + u = C_1 + C_2\left[1 + \frac{1}{x}\left(1 - e^{x}\right)\right]. \tag{5.9}$$
If x is solved from Eq. (5.9), the effective capacitance Ceff can be calculated from
Eq. (5.4). However, because Eq. (5.9) contains the term e^x, x cannot be solved
explicitly. Therefore, we expand the function f(x) = e^x by Taylor's theorem as
$$f(x) = f(x_0) + f'(x_0)(x - x_0) + \frac{f''(x_0)}{2!}(x - x_0)^2 + \cdots + \frac{f^{(n)}(x_0)}{n!}(x - x_0)^n + R_n(x), \tag{5.10}$$
where n! denotes the factorial of n (n is an integer and n ≥ 0), and f^(n)(x0) denotes
the nth derivative of f(x) evaluated at the point x = x0. Without the remainder, the
right-hand side is called the degree-n Taylor polynomial of f(x) centered at x = x0.
The term Rn(x) is the Lagrange remainder, which is neglected in the function
approximation. This remainder term is defined as
$$R_n(x) = \frac{f^{(n+1)}(\xi)}{(n+1)!}(x - x_0)^{n+1}, \tag{5.11}$$
where ξ lies between x and x0. This equation is used only to estimate the error when
the original function is replaced by the Taylor polynomial. Substituting Eq. (5.10)
into Eq. (5.9), a degree-n polynomial equation in x can be obtained as
$$kx + u = C_1 + C_2\left[1 + \frac{1}{x}\left(1 - \sum_{i=0}^{n}\frac{f^{(i)}(x_0)(x - x_0)^i}{i!}\right)\right]. \tag{5.12}$$
As f(x) = e^x, we have f^(n)(x0) = e^(x0). By Taylor's theorem, as n increases, the
error between the Taylor polynomial and the original function decreases. However, if
n ≥ 5, there is no explicit method to solve Eq. (5.12) for x. In order to keep the
algorithm accurate and simple, we let n = 3, which is good enough for this problem.
Error analysis is given later. Therefore, we obtain a cubic equation as
$$kx + u = C_1 + C_2\left[1 + \frac{1}{x}\left(1 - \sum_{i=0}^{3}\frac{e^{x_0}(x - x_0)^i}{i!}\right)\right]. \tag{5.13}$$
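From this cubic equation, the value of x can be solved by Cardano's formula without
any iterative approach. Multiplying Eq. (5.13) by x, dividing by C2 and collecting
powers of x, we have
$$\frac{e^{x_0}}{6}x^3 + \left[\frac{(1 - x_0)e^{x_0}}{2} + \frac{k}{C_2}\right]x^2 + \left[\left(1 - x_0 + \frac{x_0^2}{2}\right)e^{x_0} + \frac{u - C_1 - C_2}{C_2}\right]x + \left(1 - x_0 + \frac{x_0^2}{2} - \frac{x_0^3}{6}\right)e^{x_0} - 1 = 0. \tag{5.14}$$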
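Let coefficients a, b, c, and d be defined as
$$\begin{aligned}
a &= \frac{e^{x_0}}{6}, \\
b &= \frac{(1 - x_0)e^{x_0}}{2} + \frac{k}{C_2}, \\
c &= \left(1 - x_0 + \frac{x_0^2}{2}\right)e^{x_0} + \frac{u - C_1 - C_2}{C_2}, \\
d &= \left(1 - x_0 + \frac{x_0^2}{2} - \frac{x_0^3}{6}\right)e^{x_0} - 1.
\end{aligned} \tag{5.15}$$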
Meanwhile, let the parameters p and q be defined as
$$p = \frac{b^3}{27a^3} + \frac{d}{2a} - \frac{bc}{6a^2}, \qquad q = \frac{c}{3a} - \frac{b^2}{9a^2}. \tag{5.16}$$
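Then the roots x1, x2, x3 of Eq. (5.14) can be expressed as
$$\begin{aligned}
x_1 &= -\frac{b}{3a} + m_1 + m_2, \\
x_2 &= -\frac{b}{3a} + \omega m_1 + \omega^2 m_2, \\
x_3 &= -\frac{b}{3a} + \omega^2 m_1 + \omega m_2,
\end{aligned} \tag{5.17}$$
where
$$m_1 = \sqrt[3]{-p + \sqrt{p^2 + q^3}}, \quad m_2 = \sqrt[3]{-p - \sqrt{p^2 + q^3}}, \quad \omega = \frac{-1 + \sqrt{3}\,j}{2}. \tag{5.18}$$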
Here ω is a complex number, and j is the imaginary unit. In Eq. (5.17), only one root
is the correct value for delay calculation. The way of selecting the suitable root is
discussed in Sect. 5.3.3. Then, with the resulting x, the effective capacitance can be
calculated from Eq. (5.4).
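The closed-form solution of Eqs. (5.15)-(5.18) can be sketched as follows. This is an illustrative implementation under stated assumptions rather than the author's code: the function names are hypothetical, and m2 is computed from the identity m1*m2 = -q instead of as an independent cube root, so that the two complex cube roots stay on matching branches. The numeric values in the usage lines are invented (pF, with k and u taken from an assumed characterization).

```python
import cmath
import math

def cubic_coefficients(c1, c2, k, u, x0):
    """Coefficients a, b, c, d of Eq. (5.15)."""
    e0 = math.exp(x0)
    a = e0 / 6.0
    b = (1.0 - x0) * e0 / 2.0 + k / c2
    c = (1.0 - x0 + x0**2 / 2.0) * e0 + (u - c1 - c2) / c2
    d = (1.0 - x0 + x0**2 / 2.0 - x0**3 / 6.0) * e0 - 1.0
    return a, b, c, d

def cardano_roots(a, b, c, d):
    """All three roots of a*x^3 + b*x^2 + c*x + d = 0, Eqs. (5.16)-(5.18)."""
    p = b**3 / (27.0 * a**3) + d / (2.0 * a) - b * c / (6.0 * a**2)
    q = c / (3.0 * a) - b**2 / (9.0 * a**2)
    s = cmath.sqrt(p * p + q**3)
    # Use the larger of -p +/- s for m1 to limit cancellation, then get m2
    # from m1 * m2 = -q so both cube roots lie on consistent branches.
    w = -p + s if abs(-p + s) >= abs(-p - s) else -p - s
    m1 = w ** (1.0 / 3.0)  # principal complex cube root
    m2 = -q / m1 if m1 != 0 else 0j
    omega = complex(-0.5, math.sqrt(3.0) / 2.0)
    shift = -b / (3.0 * a)
    return [shift + m1 + m2,
            shift + omega * m1 + omega**2 * m2,
            shift + omega**2 * m1 + omega * m2]

# Hypothetical parameters: C1, C2 in pF; k, u and x0 from an assumed gate.
coeffs = cubic_coefficients(0.05, 0.2, -0.048, -0.06, -5.0)
roots = cardano_roots(*coeffs)
print([f"{r.real:.4f}{r.imag:+.1e}j" for r in roots])
```

The function returns all three roots; choosing the physically meaningful one is a separate step described in Sect. 5.3.3.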
Furthermore, the structure of the effective capacitance algorithm in Chapter 4 is
similar to Eq. (5.3) and the algorithm in Chapter 4 is based on the Thevenin model.
Thus the proposed non-iterative method also can be implemented in the Thevenin
model, which is usually used in the cell level delay calculation.
5.3.2 Error Analysis and Algorithm for Key Parameter
To use Eq. (5.14) to calculate the value of x, the expansion point x0 must be
determined. The value of x0 has a large influence on the accuracy of the Taylor
polynomial approximation. An example with x0 = 0 is shown in Fig. 5.7. The function
y2 is the Taylor polynomial used to approximate the function y1 = e^x. From Fig. 5.7,
we can see that y2 approximates y1 only when x is close to 0. The error becomes
larger as x moves away from 0.
Figure 5.7 Example of Taylor polynomial approximation.
The remainder term describes the error more precisely. For example, assume that the
error should be smaller than 0.01. Let ξ = θx with 0 < θ < 1, so that ξ lies between
x0 = 0 and x. Using Eq. (5.11) and n = 3, we can obtain
$$R_3(x) = \frac{e^{\theta x}}{4!}x^4 < 0.01. \tag{5.19}$$
When x < 0, we can obtain
$$\frac{e^{\theta x}}{4!}x^4 < \frac{e^{0}}{4!}x^4 < 0.01. \tag{5.20}$$
Solving Eq. (5.20) gives the range x ∈ (-0.7, 0), which ensures that the error of the
approximating polynomial is smaller than 0.01. When x0 is far from the solution x, the
error of the polynomial approximation becomes larger.
Therefore, the accuracy of the approximation depends on how the parameter x0 is
determined. In choosing a suitable x0, two requirements should be considered. One is
high accuracy: since the actual solution x changes with the model parameters, x0
should always be close to the solution x for different gates and interconnect loads.
The other is high efficiency: we cannot spend much time on finding x0, so it should
be easy to obtain.
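The error estimate of Eqs. (5.19) and (5.20) can be checked numerically with a few lines. The sketch below is illustrative only; the helper names and the sample points are chosen for the example, and the bound uses the largest value of e^ξ on the interval between x and x0.

```python
import math

def taylor_exp(x, x0, n=3):
    """Degree-n Taylor polynomial of e^x centered at x0 (Eq. (5.10) without R_n)."""
    return math.exp(x0) * sum((x - x0) ** i / math.factorial(i) for i in range(n + 1))

def remainder_bound(x, x0, n=3):
    """Upper bound on |R_n(x)| from Eq. (5.11), taking the largest e^xi on the interval."""
    return math.exp(max(x, x0)) * abs(x - x0) ** (n + 1) / math.factorial(n + 1)

for x in (-0.1, -0.4, -0.69, -1.0):
    err = abs(math.exp(x) - taylor_exp(x, 0.0))
    print(f"x={x:5.2f}  error={err:.5f}  bound={remainder_bound(x, 0.0):.5f}")
```

With x0 = 0 the bound stays below 0.01 only for |x| less than about 0.7, which is why x0 has to track the solution x rather than being fixed.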
In this section, an empirical method is presented to calculate the key parameter x0.
In Eq. (5.8), the equivalent capacitance Ceff is obtained from the parameter x, which
is the solution of Eq. (5.9), and Ceff is linear in x. If we can find a Ceff0 that is
always close to Ceff, then using this Ceff0 instead of Ceff in Eq. (5.8) yields an x0.
Because of the linear relationship between Ceff0 and x0, this x0 is always close to the
solution x. Therefore, the task of determining x0 becomes that of finding the
corresponding Ceff0. In selecting Ceff0, it is useful to consider the effect of the
resistance R shielding part of the capacitance C2 in the RC-π load. Because of this
effect, the effective capacitance decreases from C1 + C2 to C1 as the interconnect
resistance R increases. In order to capture this effect, a Ceff0 close to Ceff can be
evaluated empirically as
$$C_{eff0} = C_1 + \frac{R_d}{R_d + R}C_2, \tag{5.21}$$
where Rd is the gate driving output resistance. The value of Rd can be computed by
the largest possible load capacitance method proposed in [11]. A detailed introduction
to this method is given in Chapter 3 of this dissertation. Substituting Ceff0 into
Eq. (5.8), the value of x0 can be computed as
$$x_0 = \frac{C_1 - u}{k} + \frac{R_d C_2}{k(R_d + R)}. \tag{5.22}$$
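Equations (5.21) and (5.22) translate directly into code. The sketch below is a minimal illustration; expansion_point is a hypothetical name, and the numeric values (pF and kOhm, with k and u from an assumed characterization) are invented for the example.

```python
def expansion_point(c1, c2, r, rd, k, u):
    """Return (Ceff0, x0) from Eqs. (5.21) and (5.22)."""
    ceff0 = c1 + rd / (rd + r) * c2                 # resistive shielding of C2
    x0 = (c1 - u) / k + rd * c2 / (k * (rd + r))    # Ceff0 mapped through Eq. (5.8)
    return ceff0, x0

# Hypothetical values: C1 = 0.05 pF, C2 = 0.2 pF, R = 0.2 kOhm, Rd = 0.5 kOhm.
ceff0, x0 = expansion_point(0.05, 0.2, 0.2, 0.5, k=-0.048, u=-0.06)
print(f"Ceff0 = {ceff0:.4f} pF, x0 = {x0:.4f}")
```

Because x0 is obtained from a single closed-form expression, it satisfies the efficiency requirement stated above; its accuracy is examined in the tests that follow.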
To verify the validity of the above empirical method, we ran tests under different
parameter conditions using a 32nm CMOS PTM model [23], [24]. Figures 5.8, 5.9 and
5.10 show the results of using the Taylor polynomial y2 to approximate the original
function y1. The parameters Wp and Wn are the widths of the p-channel and n-channel
MOSFETs in the test inverters, respectively. The values of resistance and capacitance
are calculated from the parameters in [25].
Figure 5.8 Taylor polynomial approximation using empirical value of x0 for various R.
(Axes: y versus R (Ω).)
Figure 5.9 Taylor polynomial approximation using empirical value of x0 for various C2.
(Axes: y versus C2 (pF).)
Figure 5.10 Taylor polynomial approximation using empirical value of x0 for various tin.
(Axes: y versus tin (ps).)
In the example shown in Fig. 5.8, the interconnect resistance R is varied from 50 Ω
to 500 Ω. The capacitance C2 of the π-load ranges from 0.01pF to 0.28pF in Fig. 5.9.
In Fig. 5.10, the transition time tin increases from 100ps to 1000ps. In these tests,
the values of x0 that correspond to the different test parameters are obtained from
Ceff0. From the results shown in the three figures, we can see that the Taylor
polynomial with n = 3 is quite accurate when x0 is determined by the above empirical
method.
In the above tests, the gate sizes and loads are those commonly used in digital designs
in a 32nm process. In the proposed method, a key point is to choose a suitable x0 to
ensure high accuracy of the Taylor polynomial approximation. If x0 is far from the
solution x, the error of the Taylor polynomial approximation becomes larger. The
values of x0 and the solution x correspond to Ceff0 and the actual effective capacitance
Ceff, respectively. Here, the error of Ceff0 is defined as
Cerror = |(Ceff - Ceff0)/Ceff|. Cerror increases as x0 moves away from the solution x,
and the error of the Taylor polynomial approximation grows with it.
Figure 5.11 The trace of error between Ceff and Ceff0 with different gate size.
(Horizontal axis: Wp/Wn (um) = 0.5/0.25, 1/0.5, 2/1, 4/2, 8/4, 16/8.)
In this test, we assume that the input transition time tin and the parameters of the
RC-π load are constant. The transistor lengths of the test gate are set to
Lp = Ln = 40nm. Figure 5.11 shows the relationship between the transistor widths and
the value of Cerror. From Fig. 5.11, we can see that Cerror increases as the transistor
widths Wp and Wn increase. Moreover, in the empirical algorithm for Ceff0, the error
source is the component RdC2/(Rd+R). Thus, the error between Ceff and Ceff0 becomes
larger as C2 or R increases. In order to further check the accuracy of the polynomial
approximation, we run tests with relatively large gates and long interconnect wires.
Figure 5.12 Taylor polynomial approximation using empirical value of x0 for various R
(large size case). (Axes: y versus R (Ω).)
Figure 5.13 Taylor polynomial approximation using empirical value of x0 for various C2
(large size case). (Axes: y versus C2 (pF).)
Figure 5.14 Taylor polynomial approximation using empirical value of x0 for various tin
(large size case). (Axes: y versus tin (ps).)
The test conditions and results are shown in Figs. 5.12, 5.13 and 5.14. The results
show that the Taylor polynomial approximation with the empirical value of x0 is still
very accurate, even when the test circuit is a large gate with heavy loads. Therefore,
with the empirical x0, the right-hand side of Eq. (5.13) accurately approximates the
original function around the solution x of Eq. (5.9), and this polynomial
approximation does not add significant error to the Ceff calculation.
5.3.3 Gate Delay Calculation with Non-iterative Method
After obtaining the parameters of the proposed algorithm, we can compute the gate
delay by the following procedure.
1) Compute the value of x0 by Eq. (5.22).
2) Substitute x0 into Eq. (5.14) and solve Eq. (5.14) for x.
3) Compute the effective capacitance Ceff using Eq. (5.4) with x.
4) Obtain the gate delay by Ceff and k-factor equation.
Eq. (5.14) has more than one solution x, and the cubic equation may have complex
roots. A complex Ceff cannot be used to calculate the gate delay. Therefore, we should
choose a correct value of x for Ceff, which satisfies the following conditions: 1) x is
a real number; 2) x < 0; and 3) among the negative real roots, x is the one closest to
x0. These conditions were verified on a large number of examples.
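The three selection conditions, followed by step 3 of the procedure, can be sketched as below. The function names, the tolerance used to decide that a root is real, and the sample roots are assumptions made for the illustration; the roots would normally come from the closed-form solution of Eq. (5.14).

```python
import math

def select_root(roots, x0, imag_tol=1e-9):
    """Pick the root that is real, negative and closest to x0."""
    candidates = [z.real for z in roots
                  if abs(z.imag) <= imag_tol * max(1.0, abs(z)) and z.real < 0]
    if not candidates:
        raise ValueError("no negative real root available")
    return min(candidates, key=lambda x: abs(x - x0))

def ceff_from_x(c1, c2, x):
    """Eq. (5.4); -expm1(x) equals 1 - e^x and is accurate for small |x|."""
    return c1 + c2 * (1.0 - math.expm1(x) / x)

# Hypothetical roots of Eq. (5.14) and a hypothetical x0.
roots = [complex(-5.73, 0.0), complex(-0.56, 0.0), complex(201.9, 0.0)]
x = select_root(roots, x0=-5.27)
print(x, ceff_from_x(0.05, 0.2, x))
```

In this example two negative real roots exist, and the rule of taking the one closest to x0 resolves the ambiguity without any iteration.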
For a cubic equation such as Eq. (5.14), when the coefficients a, b, c and d are real,
the equation must have at least one real solution. In the proposed method, the Taylor
polynomial is used to approximate Eq. (5.9) so that the solution x can be calculated
explicitly. In Eq. (5.9), let Y1 = kx + u and Y2 = C1 + C2[1 + (1 - e^x)/x]. The
abscissa of the intersection point P12 of Y1 and Y2 is the solution x that we want to
obtain. In an actual gate output signal, t50 is always larger than t20, and the slope of
Eq. (5.5) is also larger than that of Eq. (5.6) for different Ceff. This means that
k1 > k3 and k2 > k4. Therefore, k < 0 and u < 0. Under the condition x < 0, the curves
of Y1 and Y2 are sketched in Fig. 5.15.
Figure 5.15 The curve characteristics of functions Y1 and Y2.
In the effective capacitance calculation, whatever the parameters of the gates and the
RC-π load are, Y1 and Y2 must have an intersection point in the second quadrant.
Because Y1 and Y2 are monotonic, they have only one intersection point. In Eq. (5.13),
let the right-hand side be defined as Y3. With the empirical value x0, Y3 accurately
approximates the part of Y2 around the solution x. Then the functions Y3 and Y1 also
intersect at a point P13, and the abscissa of this point is very close to the actual
solution x. Therefore, we use the abscissa of P13, obtained from Eq. (5.14), instead of
the actual solution x to calculate Ceff. Based on a large number of tests with different
gates and interconnect loads, we found that Eq. (5.14) always has a solution that is
very close to the actual solution x. The curve of the function in Eq. (5.14) usually has
the form shown in Fig. 5.16. When x < 0, Eq. (5.14) has two real solutions. The reason
is that Eq. (5.13) is multiplied by x to eliminate the 1/x term. We therefore use the
empirical rule above to choose the true solution.
Figure 5.16 The general curve form of the function in Eq. (5.14).
5.4 Tests and Comparisons
In order to verify the performance of the proposed non-iterative method, many
examples were tested with a 32nm PTM model [23]. The test results show that our
non-iterative method is more efficient than the method of [14] and more accurate than
the method of [15]. Although [15] uses an iteration-less method, it sacrifices accuracy,
with an average error of about 13%. In [14], a low-iteration algorithm is used to
compute the effective capacitance within 4% of SPICE results. However, the procedure
still needs one or two iterations. The detailed comparison data are given in
Figs. 5.17-5.20 and Table 5.1.
5.4.1 Experimental Results for Various R
Figure 5.17 Test results of non-iterative model for gate delay evaluation with different R.
(Horizontal axis: R (Ω); curves: SPICE, proposed method, [15].)
Figure 5.17 compares the accuracy of the proposed model and the method of [15] for
delay calculation against HSPICE simulation. In this experiment, the transistor widths
of the test inverter are Wp = 2 μm and Wn = 1 μm. The other parameters for delay
calculation are set as tin = 200ps and C1/C2 = 0.05pF/0.2pF. From the results of
Fig. 5.17, we can find that although the load resistance R varies over a wide range
(5 - 5 ), the proposed method has the good performance on accuracy that is
always close to HSPICE. In contrast, the method in [15] results in a large error value
that can be more than 15% when the value of R increases.
In the experiment shown in Fig. 5.18, the transistor widths of the test inverter are
Wp = m/Wn = m. Meanwhile, the other parameters of test are tin = 200ps, C1/C2
= 0.2pF/0.5pF. From this test, we can find that our non-iterative delay model keeps a
higher accuracy than [15] when the gate size is bigger and loads are larger.
Figure 5.18 Test results of non-iterative model for gate delay evaluation with large values of
gate and loads. (Horizontal axis: R (Ω); curves: SPICE, proposed method, [15].)
5.4.2 Experimental Results for Various tin
Figure 5.19 shows the test results of the proposed and conventional methods as the
input transition time tin increases from 100ps to 1000ps. The load parameters of this test are R =
C1/C2 = 0.022pF/0.1pF. The transistor widths of test inverter are Wp = m/Wn
= m. From the data in Fig. 5.19, we can find that the proposed model is much
more accurate than [15]. Compared with HSPICE results that are the standard, the
error of our algorithm remains in a small range with different tin.
Here, we also use the large size gate and heavy loads to check the accuracy of
proposed method. The load resistance R is . The ratio of C1 to C2 is 1/4, while C1
is equal to 0.1pF. Furthermore, the transistor widths of the test inverter are Wp =
m/Wn = m. From the results shown in the figure, we can see that the proposed
method has a good accuracy for the very wide value ranges of gate size and loads.
Figure 5.19 Test results of non-iterative model for gate delay evaluation with increasing tin.
(Horizontal axis: tin (ps); curves: SPICE, Proposed Method, [15].)
Figure 5.20 Test results of non-iterative gate delay model for large values of gate and loads
with varied tin. (Horizontal axis: tin (ps); curves: SPICE, Proposed Method, [15].)
5.4.3 Experimental Results for Various Gates and RC-π Loads
Table 5.1 Test results of non-iterative model for gate delay evaluation with varied
parameters of test circuit.

Wp/Wn     C1/C2/R           Gate Delay (ps)                     Error (%)               Iteration Number
(μm)      (pF/pF/Ω)         SPICE   [14]     [15]     Proposed  [14]   [15]   Proposed  of [14]
0.4/0.2   0.05/0.1/50       910     876.3    776.2    884.5     3.7    14.7   2.8       1
0.4/0.2   0.01/0.05/200     385     370.4    335.3    378.1     3.8    12.9   1.8       2
0.4/0.2   0.1/0.2/200       1770    1686.8   1481.5   1747      4.7    16.3   1.3       1
0.5/0.25  0.01/0.02/30      182     174.4    210.9    180       4.2    15.9   1.1       1
0.5/0.25  0.02/0.05/200     363     349.2    408      355       3.8    12.4   2.2       2
0.5/0.25  0.015/0.03/150    250     243.3    288.3    241.3     2.7    15.3   3.5       2
1/0.5     0.015/0.06/800    192     185.1    171.6    187.8     3.6    10.6   2.2       2
1/0.5     0.008/0.016/60    105     100.9    114.8    102.1     3.9    9.3    2.8       1
1/0.5     0.1/0.3/300       920     907.1    1012.9   887.8     1.4    10.1   3.5       2
2/1       0.05/0.3/100      443     426.2    382.3    433.3     3.8    13.7   2.2       1
2/1       0.2/0.5/280       784     761.3    683.6    755       2.9    12.8   3.7       2
4/2       0.2/0.4/400       333     324.7    280.4    324       2.5    15.8   2.7       2
4/2       0.01/0.015/50     95      91.1     80.9     91        4.1    14.8   4.2       1
5/2.5     0.1/0.15/150      191     187      165.2    183.6     2.1    13.5   3.9       2
5/2.5     0.01/0.025/80     103     100.3    87.9     99.4      2.6    14.7   3.5       2

Table 5.1 provides the comparison when the gate and the values of the RC-π load are
varied. In this experiment, two inverters are connected in series. The transistor widths
of the first inverter are fixed at Wp = m/Wn = m. In the second inverter,
the n-channel width Wn increases (0.2 μm to 2.5 μm), and the width ratio of p-channel
to n-channel is kept at Wp/Wn = 2/1. The input condition of this test is tin = 300ps and
the parameters of the modeled π-load are varied as shown in Table 5.1. The error value
in the table is defined as Err = |(Di - DSPICE)/DSPICE|, i = 1, 2, 3, expressed in percent.
The parameter DSPICE is the gate delay obtained by SPICE simulation. Moreover, D1, D2
and D3 are the gate delays calculated by the methods of [14], [15] and the proposed
method, respectively.
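As a quick check of the error definition, the following minimal sketch recomputes the error entries of the first row of Table 5.1 from its delay values. The function and variable names are illustrative only and are not taken from the dissertation.

```python
def relative_error_percent(delay_ps: float, spice_ps: float) -> float:
    """Return |(D_i - D_SPICE) / D_SPICE| expressed in percent."""
    return abs((delay_ps - spice_ps) / spice_ps) * 100.0


# First row of Table 5.1 (Wp/Wn = 0.4/0.2, C1/C2/R = 0.05/0.1/50); delays in ps.
spice = 910.0
delays = {"[14]": 876.3, "[15]": 776.2, "proposed": 884.5}

errors = {name: relative_error_percent(d, spice) for name, d in delays.items()}
for name, err in errors.items():
    print(f"{name}: {err:.1f}%")
```

Rounded to one decimal place, the three values agree with the 3.7, 14.7 and 2.8 listed in the table for this row.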
Table 5.1 shows that the proposed non-iterative method has an average error of only
2.8%, while the average errors of the methods of [14] and [15] are about 3.3% and 13.5%,
respectively. Moreover, the method of [14] needs one or two iterations, which the
proposed method does not require.
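The quoted averages can be reproduced directly from the Error (%) columns of Table 5.1. The short sketch below does this; the list names are illustrative, and the values are copied from the table in row order.

```python
from statistics import mean

# Error (%) columns of Table 5.1, one entry per table row.
err_14 = [3.7, 3.8, 4.7, 4.2, 3.8, 2.7, 3.6, 3.9, 1.4, 3.8, 2.9, 2.5, 4.1, 2.1, 2.6]
err_15 = [14.7, 12.9, 16.3, 15.9, 12.4, 15.3, 10.6, 9.3, 10.1, 13.7, 12.8, 15.8, 14.8, 13.5, 14.7]
err_proposed = [2.8, 1.8, 1.3, 1.1, 2.2, 3.5, 2.2, 2.8, 3.5, 2.2, 3.7, 2.7, 4.2, 3.9, 3.5]
# Iteration counts reported for method [14] (last column of the table).
iterations_14 = [1, 2, 1, 1, 2, 2, 2, 1, 2, 1, 2, 2, 1, 2, 2]

avg_14 = mean(err_14)
avg_15 = mean(err_15)
avg_proposed = mean(err_proposed)

print(f"[14]: {avg_14:.1f}%  [15]: {avg_15:.1f}%  proposed: {avg_proposed:.1f}%")
print("iterations used by [14]:", sorted(set(iterations_14)))
```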
To evaluate the efficiency of the proposed method quantitatively, the CPU times
of the proposed method and the method of [14] were measured for 1000 gates on an
HP Compaq dc5800 PC with a 2.66GHz processor and 2GB of memory. In these gates,
the width ratio of p-channel to n-channel keeps Wp/Wn = 2/1, and n-channel widths Wn
are Meanwhile,
each type of gate is connected to a different RC-π load. In the modeled
load, the values of C1 and C2 are randomly chosen from 0.01pF to 0.5pF, and the
resistance R is varied from 100 Ω to 1000 Ω. The CPU time of the method of [14] is
1.663s with a total of 1688 iterations. With the proposed method, the CPU time is 0.821s.
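To put these measurements in relative terms, the following small sketch derives the speedup and the average iteration count per gate from the figures reported above. Only the four input numbers come from the text; the derived quantities and variable names are illustrative.

```python
# Measurements reported in the text for 1000 gates.
cpu_time_ref14_s = 1.663      # method [14]
cpu_time_proposed_s = 0.821   # proposed non-iterative method
num_gates = 1000
total_iterations_ref14 = 1688

# Derived quantities (not stated in the text, computed here for illustration).
speedup = cpu_time_ref14_s / cpu_time_proposed_s
time_saved_percent = (1.0 - cpu_time_proposed_s / cpu_time_ref14_s) * 100.0
avg_iterations_per_gate = total_iterations_ref14 / num_gates

print(f"speedup: {speedup:.2f}x")
print(f"CPU time saved: {time_saved_percent:.1f}%")
print(f"average iterations per gate in [14]: {avg_iterations_per_gate:.3f}")
```

Under these numbers the proposed method takes roughly half the CPU time of [14], which on average spends between one and two iterations per gate.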
In [13], the authors showed that their method for calculating the gate delay has an
average error of 2.02%, which is slightly less than the error of the proposed method in
this work. However, the iterative procedure of [13] typically needs four or more
iterations for one gate. The computational complexity of each iteration in [13] is
comparable to that of the proposed method. Therefore, the proposed method achieves
higher efficiency in gate delay calculation without any significant accuracy loss
compared with [13].
5.5 Conclusions
In STA, one of the most important tasks is to evaluate the delay time of logic gates
with interconnects accurately and efficiently [26]. With the improvement of VLSI
technology, VLSI designs become more and more powerful. The circuit structure
becomes more complicated and more gates are integrated on a single chip. The
conventional methods, which usually compute the effective capacitance iteratively,
have efficiency issues. In order to improve computing efficiency, an accurate and
efficient non-iterative algorithm has been presented to calculate Ceff.
In our method, we use a simple polynomial approximation to modify the nonlinear
Ceff equation. With the proposed method, the value of Ceff can be computed
directly without iteration. Furthermore, with an empirical algorithm for
x0, the proposed polynomial is quite close to the original equation. This approximation
is simple and accurate, and it does not add much error to the effective capacitance
algorithm. In the main text, we give a detailed error analysis and many examples to
demonstrate the validity of the proposed polynomial approximation. At the same time,
the proposed method can be implemented with both the actual gate model and the
Thevenin model, because this non-iterative method is independent of the circuit
structure model.
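The general idea of trading an iterative solve for a closed-form polynomial solve can be illustrated on a toy problem. The sketch below does not use the Ceff equation of this chapter. It assumes a hypothetical scalar equation x = A + B*exp(-x) with made-up coefficients, solves it once by fixed-point iteration, and once by replacing the exponential with its second-order Taylor polynomial about an expansion point x0 so that only a quadratic has to be solved.

```python
import math

# Hypothetical coefficients of the toy equation x = A + B*exp(-x).
A, B = 1.0, 0.5


def solve_iteratively(x_start: float, tol: float = 1e-9, max_iter: int = 100):
    """Fixed-point iteration x <- A + B*exp(-x); returns (root, iterations used)."""
    x = x_start
    for k in range(1, max_iter + 1):
        x_next = A + B * math.exp(-x)
        if abs(x_next - x) < tol:
            return x_next, k
        x = x_next
    raise RuntimeError("fixed-point iteration did not converge")


def solve_with_polynomial(x0: float) -> float:
    """Approximate exp(-x) by its 2nd-order Taylor polynomial about x0 and
    solve the resulting quadratic in d = x - x0 in closed form."""
    e0 = math.exp(-x0)
    # x0 + d = A + B*e0*(1 - d + d**2/2)  ->  qa*d**2 + qb*d + qc = 0
    qa = 0.5 * B * e0
    qb = -(1.0 + B * e0)
    qc = A + B * e0 - x0
    disc = qb * qb - 4.0 * qa * qc
    d = (-qb - math.sqrt(disc)) / (2.0 * qa)  # smaller root, near the expansion point
    return x0 + d


x_iter, n_iter = solve_iteratively(x_start=A)
x_poly = solve_with_polynomial(x0=A)
print(f"iterative: x = {x_iter:.6f} after {n_iter} iterations")
print(f"polynomial: x = {x_poly:.6f}, difference = {abs(x_poly - x_iter):.2e}")
```

In this toy case the closed-form result agrees with the iterated solution to within about 1e-4, and the quality of the agreement depends on how close x0 is to the true solution, which mirrors the role the text gives to the empirical choice of x0.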
Finally, we show many test examples with different gate sizes and interconnect
loads. The results and comparisons with the conventional methods show
that the proposed method is more efficient for gate delay estimation because
it does not need any iteration. Meanwhile, the proposed method keeps good
accuracy even though the test conditions are chosen from very wide
ranges.
References
[1] N. H. E. Weste and K. Eshraghian, Principles of CMOS VLSI Design:
Empirical Delay Models, 2nd ed. Reading, MA: Addison-Wesley, pp. 213,
1992.
[2] -characteristic of
resistive interconnect for accura
Conference on Computer-Aided Design, pp. 512-515, Nov. 1989.
[3] C. L. Ratzlaff, Modeling the RC-interconnect
effect Proc. IEEE Custom Integrated Circuits
Conference, pp. 15.6.1-15.6.4, May 1992.
[4] CMOS
IEEE Trans. on Computer-Aided Design of
Integrated Circuits and Systems, vol. 7, no. 12, pp. 1237-1249, Dec. 1988.
[5]
-Aided Design of Integrated Circuits and
Systems, vol. 9, no. 4, pp. 352-366, Apr. 1990.
[6] interconnect circuit evaluation
-Aided Design of Integrated Circuits
and Systems, vol. 13, no. 6, pp. 763-776, June 1994.
[7]
approximation via the Lan -Aided
Design of Integrated Circuits and Systems, vol. 14, no. 5, pp. 639-649, May
1995.
[8]
Computer-Aided Design of
Integrated Circuits and Systems, vol.13, no. 12, pp. 1526-1535, Dec. 1994.
[9] Calculating the effective capacitance for the
RC in c
Design Automation Conference, pp. 43-48, Jan. 2003.
[10] F. Dartu, N. Menezes, J. Qian, and L. T. Pillage, -delay model for high
31st ACM/IEEE Design Automation Conference,
pp. 576-580, June 1994.
[11] F. Dartu, N. Menezes, and L. T.
precharacterized CMOS gates with RC
Computer-Aided Design of Integrated Circuits and Systems, vol. 15, no. 5, pp.
544-553, May 1996.
[12] Z. Huang, A. Kurokawa, and Y. or gate delay
Proc. IEEE International Symposium on Circuits and System,
pp.2795-2798, May 2005.
[13] Z. Huang, A. Kurokawa, Y Inoue, and computing
the effective capacitance of CMOS gates with Interconnect IEICE
Trans. Fundamentals, vol. E88-A, no.10, pp. 2562-2569, Oct. 2005.
[14]
International Workshop on System-on-Chip for Real-Time Applications, pp.
99-104, July 2004.
[15] ective capacitance computations for
use in Proc. the 12th International Conference
on VLSI Design, pp. 578-582, Jan. 1999.
[16] A. B. Kahng an New efficient algorithms for computing effective
Proc. ACM/IEEE International Symposium on Physical Design,
pp. 147-151, April 1998.
[17]
capacitively charact
Conference, pp. 866- 869, June 2002.
[18] method
for calculating the effective capacitance with RC loads based on the Thevenin
Trans. Fundamentals, vol. E92-A, no.10, pp.2531-2539, Oct.
2009.
[19]
Design, pp. 296-300, Mar. 2001.
[20] M. Shao, M. D. F. Wong, H. Cao, L. P. Yuan, L. D. Huang, and S. Lee,
"Explicit gate delay model for timing evaluation," Proc. of ACM/IEEE
International Symposium on Physical Design, pp. 32-38, Apr. 2003.
[21] tive
capacitance of interconnect loads for predicting CMOS gate
Trans. Fundamentals, vol. E88-A, no.12, pp. 3367-3374, Dec. 2005.
[22] model for
calculating the effective capacitance considering input waveform eff
IEEE International Conference on Communications, Circuits and Systems, pp.
1221-1225, May 2008.
[23] Predictive Technology Model (PTM) [Online]. Available:
http://www.eas.asu.edu/~ptm/
[24] W. Zhao and for
sub- , vol. 53,
no. 11, pp. 2816-2823, Nov. 2006.
[25] International technology roadmap for semiconductors 2009: Semiconductor
Industry Association.
[26] Z. Huang, "Study on modeling, analysis and design techniques for
nonlinear circuits and systems," DSpace at Waseda University, 2009.
[27] Z. Huang, A. Kurokawa, Y. Yang, H. Yu, and Y. Inoue, "Modeling the
influence of input-to-output coupling capacitance on CMOS inverter delay,"
IEICE Trans. on Fundamentals, vol. E89-A, No. 4, pp. 840-846, Apr. 2006.
[28] Z. Huang, A. Kurokawa, M. Hashimoto, T. Sato, M. Jiang, and Y. Inoue,
nanome -Aided Design of
Integrated Circuits and Systems, vol. 29, no. 2, pp. 250-260, Feb. 2010.
Chapter 6
Conclusions
6.1 Dissertation Conclusions
High-performance integrated circuits have traditionally been characterized by the
clock frequency at which they operate. To check that a circuit operates at the
specified speed, the timing of each stage must be analyzed one by one.
One timing analysis method is circuit simulation with a hardware
description language such as VHDL or Verilog HDL. However, simulation
can only verify the portions of the circuit that are exercised by the stimulus. Simulating
and verifying all timing conditions of a design with 100 million gates is unacceptably
slow, and even then the timing cannot be verified completely. Thus, it is very difficult
to do exhaustive verification through simulation. In contrast, static timing analysis
(STA) is a fast and exhaustive method for all kinds of timing analysis of a digital
circuit. STA is static because the analysis of the design is carried out statically and
does not depend on the data values applied at the input pins. Moreover, timing
verification by simulation cannot deal with the issues caused by crosstalk,
noise and parameter variations, which can be incorporated into timing analysis by STA.
In STA, the crucial task is to calculate the signal propagation delay, which equals the
sum of the gate delay time and the interconnect delay time. This dissertation focuses on
gate delay modeling, because it is difficult to obtain a precise and efficient delay
model for the non-linear logic gate. During the development of STA, the effective
capacitance concept has been widely used to compute the signal transmission delay of
a logic gate and to capture the output response waveform. Thus, this dissertation aims
to improve the accuracy and efficiency of gate delay calculation based on the
effective capacitance.
In the first chapter of the dissertation, we give the background and motivations of our
research. Then the development of gate delay models and some typical
conventional methods for gate delay are reviewed to aid understanding. Some
basic concepts such as gate delay, the RC-π load, and effective capacitance are
introduced. The main part of this dissertation presents the proposed methods for
improving both the accuracy and efficiency of the gate delay model.
First, an accurate gate delay model that considers the input waveform effect
is presented in Chapter 3. In the conventional methods for gate delay, the input signal
is always assumed to be an ideal ramp waveform. The actual input, however, is a non-ramp
waveform after passing through many gates and interconnects. In our proposed
model, the non-ramp input effect is considered and modeled as one part of the
effective capacitance. Second, in Chapter 4, we propose an accurate method to solve
the charge difference problem of the Thevenin model, which is widely used in cell-level
delay calculation. In the proposed method, the original charge condition is modified
based on some simple and accurate approximations. With the modified charge
condition, we can largely improve the accuracy of the effective capacitance in the
Thevenin model without adding much computational complexity. Last, a non-iterative
method is proposed in Chapter 5 to improve the efficiency of gate delay calculation.
Compared with the conventional iterative methods, our method directly computes
the effective capacitance for gate delay without requiring any iteration. Meanwhile,
the proposed method keeps a relatively high accuracy (the average error is only 2.8%
relative to SPICE results) while saving computation time.
In Chapter 6, this dissertation is concluded and future work is briefly
introduced.
6.2 Future Works
As the feature size of CMOS process technology decreases to the nanometer region,
the parameter variations in semiconductor devices and interconnects are increasing,
which poses serious challenges to the performance reliability of integrated circuits (ICs).
IC performance reliability is affected by fabrication-induced process variation and
run-time aging effects. Feature size reduction makes it more difficult to keep the
fabrication process precise and stable. The physical and electrical parameter
variations in fabrication change the effective channel length and threshold voltage of
devices, which has a significant impact on IC timing analysis. Therefore, a suitable
method for VLSI timing analysis under process variation is statistical static timing
analysis, which replaces the normal deterministic timing of gates and interconnects
with probability distributions, and gives a distribution of possible circuit outcomes
rather than a single outcome.
Process variation is generally divided into two types: global variation and
local variation. In previous works, the effects of process variation are not
completely considered in the statistical models for delay calculation. The conventional
works either consider only the effects of global variation or focus only on mismatch
(local variation) effects. Thus, research that establishes a complete and
optimized statistical model for delay calculation under process variation
is very meaningful. Moreover, to make the statistical model more accurate
and reliable, the effects of environmental variations such as power supply variation and
temperature variation should also be included in delay calculation.
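To make the distinction between global and local variation concrete, the following Monte Carlo sketch samples the delay of a short gate chain. It is an illustration only and not a model proposed in this dissertation: the stage delays, the Gaussian distributions, the sigma values and the multiplicative form delay*(1 + g + l) are all assumptions chosen for the example.

```python
import random
import statistics


def sample_path_delays(nominal_ps, sigma_global, sigma_local, n_samples, seed=0):
    """Sample total path delay with one global term per chip and one local
    (mismatch) term per gate; both are assumed zero-mean Gaussian."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        g = rng.gauss(0.0, sigma_global)         # shared by all gates of this sample
        total = 0.0
        for d in nominal_ps:
            loc = rng.gauss(0.0, sigma_local)    # independent for every gate
            total += d * (1.0 + g + loc)
        samples.append(total)
    return samples


# Hypothetical nominal stage delays in ps.
nominal = [100.0, 120.0, 80.0, 200.0]
samples = sample_path_delays(nominal, sigma_global=0.05, sigma_local=0.03,
                             n_samples=5000)

mean_ps = statistics.mean(samples)
std_ps = statistics.stdev(samples)
print(f"nominal path delay: {sum(nominal):.1f} ps")
print(f"sampled mean: {mean_ps:.1f} ps, standard deviation: {std_ps:.1f} ps")
```

The result is a distribution of path delays rather than a single number. Because the global term is common to all gates, it does not average out along the path, whereas the independent local terms partly cancel, which is one reason both kinds of variation need to be modeled together.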
Related Papers
[1] JIANG Minglu, HUANG Zhangcai, KUROKAWA Atsushi, LI Na, INOUE
17,
No. 4, pp. 633-639, Oct. 2008.
[2] Minglu JIANG, Zhangcai HUANG, Atsushi KUROKAWA, Shuai FANG and
Yasuaki INOUE method for calculating the effective capacitance
with RC loads based on the Thevenin m Fundamentals,
Vol. E92-A, No. 10, pp. 2531-2539, Oct. 2009.
[3] Minglu JIANG, Zhangcai HUANG, Atsushi KUROKAWA, Qiang LI, Bin
LIN, and Yasuaki INOUE -iterative method for calculating the
Trans. Fundamentals, Vol. E94-A, No. 5, May. 2011.
[4] Minglu JIANG, Zhangcai HUANG, Atsushi KUROKAWA, and Yasuaki
INOUE, advanced model for calculating the effective capacitance
considering input waveform e
Communications, Circuits and Systems, pp. 1221-1225, May 2008.
[5] Minglu JIANG, Qiang LI, Zhangcai HUANG, and Yasuaki INOUE,
non- iterative effective capacitance model for CMOS gate delay c
Proc. IEEE International Conference on Communications, Circuits and
Systems, pp. 896-900, July 2010.
[6] Minglu JIANG, Zhangcai HUANG, Atsushi KUROKAWA, and Yasuaki
INOUE, ffective capacitance model for CMOS gate with
interconnect l The 23rd Workshop on Circuits and Systems in Karuizawa,
pp. 257-262, April 2010.
Acknowledgments
First and foremost, I am most grateful to my supervisor, Professor Yasuaki Inoue,
who has offered me valuable suggestions in my academic studies. In the preparation
of the dissertation, he spent much time reading through each draft and provided
me with inspiring advice. Without his patient instruction, insightful criticism and
expert guidance, the completion of this dissertation would not have been possible.
Moreover, in the past five years, his wisdom encouraged me whenever frustration
arose, and his solicitude helped me both in research and in daily life.
I also thank the co-examiners, Professor Tsutomu Yoshihara, Professor Toshihiko,
and Professor Atsushi Kurokawa, for their valuable instructions and suggestions on
my dissertation as well as their careful reading of the manuscript. I thank them for
sparing their precious time to serve on my Ph.D. committee. In addition,
special thanks go to Professor Atsushi Kurokawa for his kind help and patient guidance
during my research. His suggestions and revisions played a very important role in
improving my published papers.
I would like to express my gratitude to Dr. Zhangcai Huang, a respectable,
responsible and resourceful scholar, who has provided me with valuable guidance in
every stage of my research. Without his enlightening instruction, impressive kindness
and patience, I could not have completed my dissertation. In research, he is like an
instructor to me; in life, he is my good friend.
I also thank Professor Satoshi Goto, Professor Takeshi Yoshimura, and Professor
Takeshi Ikenaga of Waseda University, the leader and members of the
Global-COE program, for providing me with the chance to learn from a wide range of
fields and with financial support.
I owe much to my friends and classmates of Inoue Lab for their valuable
suggestions and critiques, which have been of great help and importance in my studies.
When I was a rookie, veterans such as Dr. Zhangcai Huang, Dr. Hong Yu, and Dr. Shuaiqi
Wang really helped me when I was truly clueless. Thanks to my research group
members Jinpeng Yu and Li Ding for doing tests, writing reports, and discussing the
research. I am also deeply indebted to my other lab mates Na Li, Sui Huang, Guangming
Hu, Shiyu Du, Renyuan Zhang, Qin Luo, Qiang Li, and all the current students, for
their direct and indirect help, which made my school life more enjoyable.
Finally, I would like to thank my parents, whose support made it possible for me
to study abroad. I am deeply indebted to my dear wife Jingqiu Huang for her love,
support, and encouragement. Also, I would like to give my sincere thanks to all the
relatives who care about me.
Publication List
[1] JIANG Ming-lu, Zhang Xiao-bing, and Lei Wei, "Study on the electron
transfer rate of field emission hopping electron cathode," Chinese Journal of
Electron Devices, Vol. 29, No. 1, pp. 65-68, Mar. 2006.
[2] JIANG Minglu, HUANG Zhangcai, KUROKAWA Atsushi, LI Na, INOUE
l. 17,
No. 4, pp. 633-639, Oct. 2008.
[3] Minglu JIANG, Zhangcai HUANG, Atsushi KUROKAWA, Shuai FANG and
Yasuaki INOUE method for calculating the effective capacitance
with RC loads based on the Thevenin m Fundamentals,
Vol. E92-A, No. 10, pp. 2531-2539, Oct. 2009.
[4] Minglu JIANG, Zhangcai HUANG, Atsushi KUROKAWA, Qiang LI, Bin
LIN, and Yasuaki INOUE -iterative method for calculating the
Trans. Fundamentals, Vol. E94-A, No. 5, May. 2011.
[5] Zhangcai HUANG, Minglu JIANG
and wide input range four-quadrant CMOS analog multiplier using active
Electronics, Vol. E92-C, No. 6, pp. 806-814, June
2009.
[6] Qiang Li, Zhangcai Huang, Renyuan Zhang, Minglu Jiang, Bin Lin and
low voltage CMOS rectifier for low power battery-less
d Nonlinear Theory and Its Applications. Vol. 1, No. 1,
pp. 186-195, Oct. 2010.
[7] Zhangcai HUANG, Atsushi KUROKAWA, Masanori HASHIMOTO, Takashi
SATO, Minglu JIANG
Trans. on Computer-Aided Design of Integrated Circuits and Systems, Vol. 29,
No. 2, pp. 250-260, Feb. 2010.
[8] Minglu JIANG, Zhangcai HUANG, Atsushi KUROKAWA, and Yasuaki
INOUE, advanced model for calculating the effective capacitance
considering input waveform e
Communications, Circuits and Systems, pp. 1221-1225, May 2008.
[9] Minglu JIANG, Qiang LI, Zhangcai HUANG, and Yasuaki INOUE,
non- iterative effective capacitance model for CMOS gate delay c
Proc. IEEE International Conference on Communications, Circuits and
Systems, pp. 896-900, July 2010.
[10] Minglu JIANG, Zhangcai HUANG, Atsushi KUROKAWA, and Yasuaki
INOUE, effective capacitance model for CMOS gate with
interconnect l The 23rd Workshop on Circuits and Systems in Karuizawa,
pp. 257-262, April 2010.
[11] Dan NIU, Zhangcai HUANG, Minglu JIANG
sub-
International Midwest Symposium on Circuits and Systems (MWSCAS),
August 2011.
[12] Li DING, Zhangcai HUANG, Minglu JIANG, Atsushi KUROKAWA, and
-input gate in
on Circuits and Systems (MWSCAS), August 2011.
[13] Na LI, Zhangcai HUANG, Minglu JIANG, and Yasuaki INOUE
efficiency four-phase all PMOS charge pump without body effects,
International Conference on Communications, Circuits and Systems, pp.
1216-1220, May 2008.
[14] Na LI, Zhangcai HUANG, Minglu JIANG, and Yasuaki INOUE -phase
all PMOS charge pump without body effects in standard CMOS t
The 21st Workshop on Circuits and Systems in Karuizawa, pp. 409-414, April
2008.
[15] Qiang LI, Zhangcai HUANG, Minglu JIANG, Renyuan ZHANG, and Yasuaki
INOUE low voltage CMOS rectifier for low power battery- less d
The 23rd Workshop on Circuits and Systems in Karuizawa, pp. 312-315, April
2010.
[16] Renyuan ZHANG, Qiang LI, Zhangcai HUANG, Minglu JIANG, and Yasuaki
INOUE efficient charge pump based on Cockcroft-Walton s
The 23rd Workshop on Circuits and Systems in Karuizawa, pp. 316-321, April
2010.
