Interconnect delay estimation models by Abou-Seido, Arif Ishaq
Retrospective Theses and Dissertations Iowa State University Capstones, Theses and Dissertations 
1-1-2001 
Interconnect delay estimation models 
Arif Ishaq Abou-Seido 
Iowa State University 
Recommended Citation 
Abou-Seido, Arif Ishaq, "Interconnect delay estimation models" (2001). Retrospective Theses and 
Dissertations. 21036. 
https://lib.dr.iastate.edu/rtd/21036 
Interconnect delay estimation models 
by 
Arif Ishaq Abou-Seido 
A thesis submitted to the graduate faculty 
in partial fulfillment of the requirements for the degree of 
MASTER OF SCIENCE 
Major: Electrical Engineering 
Major Professor: Chris Chu 
Iowa State University 
Ames, Iowa 
2001 
Copyright © Arif Ishaq Abou-Seido, 2001. All rights reserved. 
Graduate College 
Iowa State University 
This is to certify that the Master's thesis of 
Arif Ishaq Abou-Seido 
has met the thesis requirements of Iowa State University 
Signatures have been redacted for privacy 
To my parents 
Ishaq and Samira Abou-Seido 
TABLE OF CONTENTS 
ABSTRACT 
CHAPTER 1. INTRODUCTION 
1.1 Overview 
1.2 The organization of this thesis 
CHAPTER 2. FITTED ELMORE DELAY MODEL 
2.1 Fitted Elmore Delay model formulation 
2.2 Technology parameters 
2.3 Hspice simulations 
2.4 The FED coefficients 
2.5 Extension to wire sizing 
2.6 Extension to interconnect trees 
2.7 Transformed Elmore Delay model 
CHAPTER 3. IMPROVED EFFECTIVE CAPACITANCE 
3.1 Resistive shielding
3.2 Driving point admittance approximation 
3.3 Effective Capacitance Metric (ECM) 
3.4 Improved Effective Capacitance Metric (IECM) 
3.5 ECM and IECM comparison 
CHAPTER 4. DISCUSSION AND FUTURE WORK 
4.1 Discussion 
4.2 Future work 
REFERENCES 
ACKNOWLEDGMENTS 
ABSTRACT 
With the continuous scaling down of very large scale integrated (VLSI) technologies 
and increased die size, the transistors are much smaller, and hence much faster. On the other 
hand, interconnects are narrower. So they are more resistive and slower in transmitting 
signals. This trend has led the interconnect delay to become a significant factor in 
determining the performance of VLSI designs. As the die size becomes larger, global 
interconnect length becomes longer. Thus, the global interconnect delay is beginning to 
dominate a larger portion of the overall system performance. In order to take the impact of 
interconnect delay into account, it is very important to have computationally inexpensive and 
accurate interconnect delay models. 
The primary contribution of this thesis is to present two new interconnect delay 
models, called the Fitted Elmore Delay (FED) and the Improved Effective Capacitance 
Metric (IECM). The FED model is a simple, efficient and reasonably accurate interconnect 
performance estimation model. This model uses a curve fitting technique to approximate the 
accurate Hspice delay data. The functional form used in the curve fitting is derived based on 
the Elmore Delay (ED) model. Thus, our model has all the advantages of the Elmore Delay 
model in addition to better accuracy. It has a closed form expression as simple as the ED 
model and is extremely efficient to compute. More importantly, it is significantly more 
accurate than the ED model. In fact, because of its striking similarity to the ED model, 
optimization of the delay with respect to the design parameters can be easily done. When 
applied to interconnect optimization techniques (i.e., wire sizing), the FED model is three to 
four times more accurate than the Elmore Delay model. On the other hand, like the ED 
model, the FED model has the limitation of ignoring the resistive shielding problem. This 
problem occurs when the gate no longer sees the total net capacitance due to the high 
interconnect resistance. 
The Improved Effective Capacitance Metric (IECM) overcomes the resistive shielding
problem. We adopt the methodology of O'Brien and Savarino [9] to compute the first three
Taylor series coefficients of the driving-point admittance in the circuit. The IECM uses these
Taylor coefficients to derive a closed form solution for the effective capacitance that captures
the resistive shielding characteristics. The IECM can be computed as simply as
the Elmore Delay model. We have tested the IECM on a single-load circuit and multiple tree
topologies. Experiments show that our model is significantly more accurate than other
existing interconnect delay models in capturing the resistive shielding characteristics.
CHAPTER 1. INTRODUCTION 
1.1 Overview 
An interconnect is a conductive path that connects a circuit element (transistor, 
resistor, capacitor, inductor, etc.) to the rest of the circuit. In VLSI, interconnect wires are 
divided into three classes: local, intermediate, and global wires. Local interconnects connect 
the gates and transistors within the functional unit block. Intermediate wires connect the 
functional unit blocks with each other. Global wires provide communication between large 
modules (units) and also provide clock distribution in the chip. Figure 1.1 shows a picture of 
a VLSI circuit. 
Figure 1.1 VLSI circuit
[Figure labels: Unit, Functional Unit Block (FUB), Local Interconnects, Intermediate Interconnects, Global Interconnects]
With the continuous scaling down of very large scale integrated (VLSI) technologies
and increased die size, the gates, which are composed of transistors, are much smaller, and hence
much faster. On the other hand, interconnects are narrower, so they are more resistive and
slower in transmitting signals. This trend has led the interconnect delay to become a
significant factor in determining the performance of VLSI designs.
As the die size becomes larger, global interconnect length grows
in proportion to the die size. Thus, the global interconnect delay is beginning to account for a
larger portion of the overall system performance.
New materials, such as copper and low dielectric constant (low-K) materials, have been
used to improve the performance of interconnect. However, at the global interconnect level,
the benefit of material changes alone is insufficient to meet the overall system performance
requirements. Even with the help of copper and low-K materials [21], interconnect delay is still
predicted to dominate chip performance beyond the 0.18µm technology. We can therefore
expect the significance of interconnect delay to increase rapidly in the near future [4], and
interconnect optimization and planning now affect all aspects of the VLSI
design cycle.
Currently, interconnect optimization and planning are usually performed at very late
stages in the design cycle. The most popular place to optimize interconnects is
still the physical design level [4]. Figure 1.2 shows the VLSI design cycle. In Figure 1.2,
the physical level deals with floorplanning, placement, and routing, where the wire length is
primarily determined.
Circuit designers can no longer ignore the impact of interconnect while designing
circuits. The new trend in the VLSI design flow is to extend interconnect optimization and
planning into the logic synthesis level in order to accurately predict the impact of
interconnect on the system performance in earlier stages. As a result, high-level synthesis,
logic synthesis, and physical layout tools are becoming more interconnect-centric. In order
to take the impact of interconnect delay into account, it is very important to have
computationally inexpensive and accurate interconnect delay models.
Figure 1.2 VLSI design cycle [5]
[Figure stages, in order: System Specification, Architectural Design, Functional Design, Logic Design, Circuit Design, Physical Design, Fabrication, Packaging and Testing]
Currently, circuit simulators like Hspice [6] are used to compute delays very
accurately, but they are quite inefficient since VLSI design involves millions of delay calculations.
Therefore, there is a need for an interconnect delay model that is both
accurate and computationally inexpensive. Many interconnect delay models have been
proposed by analyzing the moments of the impulse response [7]. Asymptotic waveform
evaluation [8] is a generalized approach to response approximation by moment matching.
It is very accurate but computationally very expensive. Hence, many moment-matching
variants using the first two to four moments have been proposed [9, 10, 11, 12]. Those
variants are much more efficient but less accurate. Nevertheless, they may still be
too expensive to be used within the tight optimization loops of design synthesis and layout
tools. Moreover, for all the models above, the delay is either computed by an iterative
procedure or expressed as a sophisticated implicit function of the design parameters.
Sensitivity information (e.g., the optimal width) cannot be easily calculated due to the complexity
of the model. Therefore, these models provide little insight into determining the design
parameters during design or optimization. Closed form delay equations are preferred since
they are efficient and easy to implement, as long as the model has enough accuracy.
As a result, the Elmore Delay (ED) metric [13], which is the first moment of
the impulse response, is the most widely used interconnect delay model during synthesis and
layout [14]. It can be written as a simple closed form expression in terms of the design
parameters, which is very beneficial for design optimization. It is extremely
efficient to compute and it provides useful insights for the optimization algorithms. It has
also been shown to have good fidelity with respect to Hspice simulation [15, 16, 17]. The
primary disadvantage of the Elmore Delay model is that it has limited accuracy. It always
over-estimates the delay [17]. So a commonly used approach is to scale the Elmore Delay by
ln(2), or about 0.69 [7]. For a single-pole RC circuit, this scaling factor gives the time at which
the output voltage reaches Vdd/2. We call this metric the Scaled Elmore Delay (SED) model. The Scaled Elmore
Delay model is more accurate than the Elmore Delay model. However, the SED model can
significantly underestimate a large portion of delays.
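The ln(2) factor can be checked on the simplest possible case. As a sketch, assume a single lumped resistance R driving a single capacitance C from an ideal step of height V_dd (these symbols are illustrative and are not taken from the thesis). The Elmore Delay of this circuit is RC, and solving the step response for the 50% crossing gives:

```latex
v(t) = V_{dd}\left(1 - e^{-t/RC}\right), \qquad
v(t_{50}) = \frac{V_{dd}}{2}
\;\Longrightarrow\;
t_{50} = RC \ln 2 \approx 0.69\, RC
```

For circuits with more than one pole, the 50% delay is in general no longer exactly ln(2) times the Elmore Delay, so the scaled value is only an approximation there.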
We present a new closed form interconnect delay model called the Fitted Elmore
Delay (FED) model. The FED is a simple, efficient and reasonably accurate interconnect
performance estimation model. For an interconnect wire of length l and width w connecting
a driver with driver resistance R_D and a load with load capacitance C_L, the Fitted Elmore
Delay model is given by:

$$T_{FED}(R_D, C_L, l, w) = A R_D l w + B R_D l + C R_D C_L + D l^2 + E \frac{l^2}{w} + F \frac{C_L l}{w} \tag{1.1}$$

The coefficients A, B, C, D, E, and F are determined by a curve fitting
technique to approximate the accurate Hspice delay data. The functional form (1.1) used in
the curve fitting is derived based on the Elmore Delay model. Thus, our model has all the
advantages of the Elmore Delay model. It has a closed form expression as simple as the
Elmore Delay model and is extremely efficient to compute. Most importantly, it is
significantly more accurate than the Elmore Delay model. In fact, because of its striking
similarity to the Elmore Delay model, optimization of the delay with respect to the design
parameters can be easily done. When applied to interconnect optimization techniques (e.g.,
wire sizing), the FED is shown to be more accurate than the Elmore Delay model.
The FED model has the limitation of ignoring the resistive shielding problem. That
is, as the gate resistance gets smaller and the metal resistance gets larger, the gate no longer
"sees" the total net capacitance and the gate delay may be significantly less than expected
[3]. O'Brien and Savarino [9] proposed a resistor-capacitor (RC) π-type model that
approximates the entire interconnect tree while still considering resistive shielding effects.
Their approximation is based on computing the first three moments of the driving point
admittance at a given node.
There are several proposed methods [3,19] that deal with the resistive shielding
problem and use the RC π-type model of O'Brien and Savarino [9]. Pillage and coauthors [3] make
the π-type model proposed by [9] compatible with k-factor delay formulas (i.e., the delay
table modeling methodology) in order to approximate the interconnect tree by an effective
capacitance that captures the resistive shielding in the circuit. The approach proposed by [3]
requires a costly iterative procedure that is repeated until it converges. Kashyap, Alpert and
Devgan [19] use the methodology of [9] to derive a closed form solution for the effective
capacitance based on a unit-step voltage source. The approach of [19] does not provide
very accurate delay results in comparison with Hspice simulation.
We present the Improved Effective Capacitance Metric (IECM), which overcomes the
resistive shielding problem. It accounts for resistive shielding by computing an
effective capacitance to model the downstream capacitance in the circuit. We adopt the
methodology of O'Brien and Savarino [9] to compute the three components of the RC π-model
to capture the resistive shielding. The IECM can be computed as simply
as the Elmore Delay model. We have tested the IECM on a single-load circuit and multiple
tree topologies. Experiments show that the IECM is significantly more accurate than the
Elmore Delay model and the model proposed by [19].
1.2 The organization of this thesis 
In chapter 2, the Fitted Elmore Delay (FED) model is introduced in detail.
Experimental results are presented to show that the FED model is 3 to 5 times more accurate
than the Scaled Elmore Delay (SED) model. The FED model is tested on wire sizing as an
interconnect optimization technique. Experimental results are presented to show that our
model is 3 to 4 times more accurate than the Elmore Delay model in terms of wire sizing.
FED is also tested on interconnect trees. We also introduce the Transformed Elmore Delay
(TED) model, which has basically the same form as the Elmore Delay model but is almost as
accurate as the FED model. In chapter 3, we discuss the resistive shielding problem and
introduce the Improved Effective Capacitance Metric (IECM). We test the IECM on
different tree topologies. We compare the IECM with existing delay models and
show that the IECM gives more accurate delay results. Chapter 4 summarizes the work and
discusses future research directions.
CHAPTER 2. FITTED ELMORE DELAY MODEL 
In this chapter, we present the derivation and some experimental results for the Fitted
Elmore Delay (FED) model. The basic idea is to approximate accurate delay data by
curve fitting to an equation. In order to have a simple closed form model, the functional
form used in the curve fitting is derived based on the Elmore Delay model. Although the
Elmore Delay model is not very accurate by itself, it provides useful insights into the
dependence of interconnect delay on various design parameters.
2.1 Fitted Elmore Delay model formulation 
In the FED formulation, we model the driver as a resistor since the driver is assumed
to be in the linear region. A wire is modeled as a π-type resistor-capacitor (RC) model. Figure
2.1 shows an interconnect wire of length l and width w connecting a driver with a driver
resistance R_D and a load capacitance C_L. The values R_int and C_int represent the
interconnect resistance and capacitance, respectively, and are calculated by equations 2.1 and
2.2.

$$C_{int} = c_a l w + c_f l \tag{2.1}$$

$$R_{int} = \frac{r l}{w} \tag{2.2}$$

Figure 2.1 Single-load circuit
The functional form of the FED model is based on the Elmore Delay model. The
Elmore Delay at a given node is given by:

$$T_{ED} = \sum_{k} R_k C_k \tag{2.3}$$

where the sum runs over the nodes k on the path from the driver to the given node, R_k is
the local resistance at node k, and C_k is the downstream capacitance from
node k, including the capacitance at node k. Applying equation 2.3 to Figure 2.1 gives the
delay at the load capacitance C_L:

$$T_{ED} = R_D \left( c_a l w + c_f l + C_L \right) + \frac{r l}{w} \left( \frac{c_a l w + c_f l}{2} + C_L \right) \tag{2.4}$$

$$T_{ED} = A R_D l w + B R_D l + C R_D C_L + D l^2 + E \frac{l^2}{w} + F \frac{C_L l}{w} \tag{2.5}$$

where A = c_a, B = c_f, C = 1, D = r c_a/2, E = r c_f/2, and F = r. If we modify the
values of A, B, C, D, E, and F in equation 2.5, we can get a better approximation to accurate
delay data. The Fitted Elmore Delay model is therefore defined as follows:

$$T_{FED}(R_D, C_L, l, w) = A R_D l w + B R_D l + C R_D C_L + D l^2 + E \frac{l^2}{w} + F \frac{C_L l}{w} \tag{2.6}$$

where the coefficients A, B, C, D, E, and F are determined by a multiple linear
regression [23] on accurate delay data. These accurate delay data points are generated with the
Hspice simulation tool. We run the Hspice simulations for multiple VLSI technologies.
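The relationship between the Elmore Delay of the single-load circuit and the six-term FED functional form can be checked numerically. The following is a minimal sketch, not the thesis's implementation: the function names are hypothetical, the operating point is an arbitrary illustration, and r_g is converted from kΩ to Ω so that resistance times capacitance comes out in femtoseconds (an assumption about units made only for this example).

```python
import math

def elmore_delay(rd, cl, l, w, r, ca, cf):
    """Elmore Delay of the single-load circuit: driver rd, wire (l, w), load cl."""
    c_int = ca * l * w + cf * l      # wire capacitance: area part plus fringing part
    r_int = r * l / w                # wire resistance from the sheet resistance
    return rd * (c_int + cl) + r_int * (c_int / 2 + cl)

def fed_delay(rd, cl, l, w, coef):
    """Six-term FED functional form with coef = (A, B, C, D, E, F)."""
    A, B, C, D, E, F = coef
    return (A * rd * l * w + B * rd * l + C * rd * cl
            + D * l**2 + E * l**2 / w + F * cl * l / w)

def elmore_coefficients(r, ca, cf):
    """Coefficient values for which the FED form reduces to the Elmore Delay."""
    return (ca, cf, 1.0, r * ca / 2, r * cf / 2, r)

# Illustrative operating point (0.18 µm values from Table 2.1; r_g in ohms).
r, ca, cf, cg, rg = 0.0679, 0.0596, 0.0641, 0.234, 17.1e3
rd, cl, l, w = rg / 100, cg * 100, 5000.0, 6 * 0.18
t_ed = elmore_delay(rd, cl, l, w, r, ca, cf)
t_form = fed_delay(rd, cl, l, w, elmore_coefficients(r, ca, cf))
```

With the Elmore coefficient values the two functions agree to rounding error; the FED keeps the same six terms and only changes the values of A through F.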
2.2 Technology parameters 
In order to test the Fitted Elmore Delay model, we consider 0.25µm, 0.18µm, 
0.13µm, and 0.07µm technologies. The values of the device parameters in our derivation are 
based on the 1999 National Technology Roadmap for Semiconductors [21]. The notations of 
technology parameters are listed below: 
W_min: minimum wire width, in µm
r: sheet resistance, in Ω/□
c_a: unit area capacitance, in fF/µm²
c_f: unit effective fringing capacitance, in fF/µm
c_g: input capacitance of a minimum device, in fF
r_g: output resistance of a minimum device, in kΩ
The values of technology parameters are listed in Table 2.1. 
Table 2.1 Technology parameters
Tech. (µm)  W_min (µm)  r (Ω/□)  c_a (fF/µm²)  c_f (fF/µm)  c_g (fF)  r_g (kΩ)
0.25        0.25        0.0733   0.0589        0.0819       0.282     16.2
0.18        0.18        0.0679   0.0596        0.0641       0.234     17.1
0.15        0.15        0.0733   0.0542        0.0538       0.220     17.3
0.13        0.13        0.0806   0.0461        0.0433       0.135     22.1
0.10        0.10        0.0917   0.0531        0.0448       0.072     23.4
0.07        0.07        0.0952   0.0558        0.4040       0.066     22.1
2.3 Hspice Simulations 
In the simulations, each interconnect is modeled as a distributed RC π-type model.
The distributed RC model gives a more realistic view of an actual metal line on a chip.
However, an ideal distributed line has an infinite number of segments, which makes circuit
simulation very difficult. Thus, each
interconnect is modeled as 30 π-type RC segments, which is sufficient to capture
the true delay.
The accuracy of our model is limited by the accuracy of the Hspice data generated,
so it is very important to generate accurate Hspice data points. We use the
"ACCURATE" option and set the "MEASDGT" option to ten significant figures in the
Hspice simulations, which can make a 2-3% difference in the values generated.
To properly curve fit the Hspice data points, we must generate enough data points in
the region of interest for each process technology. We notice that a few values for each design
parameter (R_D, C_L, l, w) are enough to capture the proper characteristic of the model. The
driver resistance R_D and the load capacitance C_L are sized using equations 2.7 and 2.8.

$$R_D = \frac{r_g}{size} \tag{2.7}$$

$$C_L = c_g \times size \tag{2.8}$$

For driver size, load size, and wire width, we use six values each, uniformly
distributed in the region of interest. However, for the wire length, we observe that
our model has a larger relative error when the wire length is short (i.e., the delay is small).
So ten values are used for the wire length, most of them chosen at short wire
lengths. We start from l = 450µm so that l = 500µm will not be on the boundary of our model.
In particular, we use

$$l_i = 450\, \gamma^{2^i - 1}, \qquad i = 0, \ldots, 9, \qquad \gamma = (18000/450)^{1/511}$$

to generate the length data points.
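The length schedule described above, in which the exponent doubles at each step so that most points fall at short lengths, can be sketched as follows. The function name and its default arguments are illustrative choices; the exponent 1/511 arises because 2^9 - 1 = 511 for the last of the ten points.

```python
def wire_lengths(l_min=450.0, l_max=18000.0, n=10):
    """Return n lengths l_i = l_min * gamma**(2**i - 1), with l_0 = l_min and l_(n-1) = l_max."""
    gamma = (l_max / l_min) ** (1.0 / (2 ** (n - 1) - 1))
    return [l_min * gamma ** (2 ** i - 1) for i in range(n)]

lengths = wire_lengths()
for i, length in enumerate(lengths):
    print(f"l_{i} = {length:9.1f} um")
```

Under this schedule seven of the ten points lie below 1000µm, which matches the stated intent of sampling short wires more densely.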
For each technology, we run the Hspice simulations on all combinations of the design
parameter values (i.e., 6 × 6 × 6 × 10 = 2160 points) as shown in Table 2.2. The total CPU
time for each technology's Hspice simulation is about 2 hours on an HP C360 machine with a
367 MHz processor and 512 MB of memory. After the Hspice delay data points are
generated for all technologies, the statistical package SAS [24] is used to perform a multiple
regression to determine the coefficients of the FED model in equation 2.6.
Table 2.2 Region of interest and number of points used for design parameters
Design parameter         Region of interest          Number of points
Driver size (r_g/R_D)    (10 to 510) × min. device   6
Load size (C_L/c_g)      (10 to 510) × min. device   6
Wire width (w)           (1 to 20) × W_min           6
Wire length (l)          500 to 18000µm              10
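The regression step can be illustrated with a short least-squares sketch. This is not the SAS procedure used in the thesis: it uses NumPy's `numpy.linalg.lstsq` instead, and because no Hspice data are available here, the "measured" delays are stand-ins computed from the Scaled Elmore Delay with the 0.18µm values of Table 2.1 (r_g converted to ohms). The function names and the column scaling are choices made for this example.

```python
import itertools
import math
import numpy as np

def fed_design_matrix(rd, cl, l, w):
    """One column per FED term: RD*l*w, RD*l, RD*CL, l^2, l^2/w, CL*l/w."""
    return np.column_stack([rd * l * w, rd * l, rd * cl, l**2, l**2 / w, cl * l / w])

def fit_fed(rd, cl, l, w, delay):
    """Least-squares estimate of (A, B, C, D, E, F) with no intercept term."""
    X = fed_design_matrix(rd, cl, l, w)
    scale = np.abs(X).max(axis=0)            # scale columns to improve conditioning
    coef, *_ = np.linalg.lstsq(X / scale, delay, rcond=None)
    return coef / scale

# Sampling grid following Table 2.2 (6 x 6 x 10 x 6 = 2160 points).
r, ca, cf, cg, rg, wmin = 0.0679, 0.0596, 0.0641, 0.234, 17.1e3, 0.18
sizes = np.linspace(10, 510, 6)
widths = np.linspace(1, 20, 6) * wmin
gamma = (18000 / 450) ** (1 / 511)
lengths = 450 * gamma ** (2.0 ** np.arange(10) - 1)
grid = np.array(list(itertools.product(rg / sizes, cg * sizes, lengths, widths)))
rd, cl, l, w = grid.T

# Stand-in for Hspice data (assumption): the Scaled Elmore Delay itself.
delay = math.log(2) * (rd * (ca * l * w + cf * l + cl)
                       + (r * l / w) * ((ca * l * w + cf * l) / 2 + cl))
coef = fit_fed(rd, cl, l, w, delay)
```

Because the stand-in data lie exactly in the span of the six terms, the fit recovers ln(2) times the Elmore coefficients; with real simulation data the same call would return the fitted values instead.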
2.4 The FED coefficients 
The coefficients of the Fitted Elmore Delay model for the 0.25µm, 0.18µm, 0.13µm, and
0.07µm technologies, as determined with SAS [24], are listed in
Table 2.3. After determining the coefficients, we compare our model with the Scaled
Elmore Delay (SED) model for delay estimation, since the SED model is more accurate than the Elmore
Delay model due to the scaling factor ln(2), or about 0.69. For each technology, the Scaled
Elmore Delay and the Fitted Elmore Delay are computed for 3800 points covering the
whole region of interest. A large portion of these points is on the boundary.
Table 2.3 Coefficients for the Fitted Elmore Delay model 
B 4.57129 X 10- 2.84521 X 10-
C 0.69545000000 0.69610000000 0.69960000000 0.69682000000 
D 1.67958 X 10- 1.66789 X 10- 1.45498 X 10- 2.09524 X 10-
E 2.29445 X 10- 1.66789 X 10- 1.33342 X 10- 1.44502 X 10-
F 0.05296000000 0.04924000000 0.05886000000 0.07011000000 
The maximum relative error and average relative error are reported in Table 2.4. One
can see that for our model, the maximum relative error is only 2% and the average relative
error is less than 0.8%. Figure 2.2 shows the delay by Hspice, our model, and the Scaled
Elmore Delay model for R_D = r_g/100, C_L = c_g × 100, and w = 6 × W_min. For the 0.18µm
technology, our model is virtually indistinguishable from the Hspice data. Figure 2.3 shows an enlarged
portion of Figure 2.2.
Table 2.4 Relative errors in delay for Scaled Elmore model and FED
            Maximum error       Average error
Tech. (µm)  SED       FED       SED       FED
0.25        8.48%     1.68%     2.82%     0.69%
0.18        8.48%     1.79%     3.13%     0.73%
0.13        8.49%     1.94%     3.53%     0.79%
0.07        8.49%     2.00%     4.88%     0.73%
Figure 2.2 Comparison of the FED and the Scaled Elmore Delay model with Hspice delay
[Plot of delay versus wire length (µm), 0 to 18000; curves: our model, Scaled Elmore, Hspice]
Figure 2.3 Enlarged portion of Figure 2.2
[Curves: our model, Scaled Elmore, Hspice]
We notice that our model is still very accurate for points outside the region of interest.
For each technology, we generate 500 random points such that the driver size and load size
are from 5× to 1020× min. device, the wire width is from 0.5× to 40× W_min, and the wire length is
from 500 to 36000µm. The maximum and average relative errors in delay are reported in
Table 2.5. There is no significant difference from the results in Table 2.4.
Table 2.5 Relative errors in delay for points outside the region of interest
            Maximum error       Average error
Tech. (µm)  SED       FED       SED       FED
0.25        8.42%     1.57%     2.28%     0.69%
0.18        8.47%     1.91%     2.51%     0.80%
0.13        8.48%     1.92%     2.89%     0.90%
0.07        8.49%     2.41%     4.05%     0.99%
2.5 Extension to wire sizing 
In this section, we apply the Fitted Elmore Delay (FED) model to an interconnect
optimization technique known as wire sizing. We consider two wire sizing problems.
The first problem is to optimize a uniform wire width to minimize the delay. The second
problem is to minimize the wire width subject to a delay bound. The delay bound is set to 10%
over the optimal delay. All four technologies are tested. To fairly represent all possible
design parameters, 100 random points in the region of interest are generated. Note that to
minimize delay, the optimal widths by the Scaled Elmore Delay model and FED can be
found by differentiating equations 2.4 and 2.6, respectively, with respect to w.
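The optimal width derived from the Scaled Elmore Delay model is given by:

$$\frac{\partial T_{SED}}{\partial w} = \ln(2) \left( c_a R_D l - \frac{r c_f}{2} \frac{l^2}{w^2} - \frac{r C_L l}{w^2} \right) = 0$$

$$w_{SED\_optimal} = \sqrt{\frac{r \left( c_f l / 2 + C_L \right)}{c_a R_D}}$$

The optimal width derived from the FED model is given by:

$$T_{FED}(R_D, C_L, l, w) = A R_D l w + B R_D l + C R_D C_L + D l^2 + E \frac{l^2}{w} + F \frac{C_L l}{w}$$

$$\frac{\partial T_{FED}}{\partial w} = A R_D l - E \frac{l^2}{w^2} - F \frac{C_L l}{w^2} = 0$$

$$w_{FED\_optimal} = \sqrt{\frac{E l + F C_L}{A R_D}}$$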
The minimum wire width subject to a delay bound can likewise be written in simple closed
form for both the Scaled Elmore Delay model and the Fitted Elmore Delay model.
To find the optimal wire width in Hspice, a binary search is used to obtain the
optimal wire width for every process technology. The results in delay minimization are
summarized in Table 2.6. On average, for the 0.18µm technology, the FED model produces a
3.4× improvement.
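Both wire sizing problems have closed form solutions under the FED functional form, and they can be sketched in a few lines. This is an illustration rather than the thesis's code: the function names are hypothetical, and because the fitted coefficients are not reproduced here, the coefficients below are stand-ins equal to the Scaled Elmore values for the 0.18µm technology (r_g in ohms). The delay-bound solver uses the fact that T(w) = a·w + K + c/w, so T(w) = bound is a quadratic in w.

```python
import math

def fed_delay(rd, cl, l, w, coef):
    """Six-term FED functional form with coef = (A, B, C, D, E, F)."""
    A, B, C, D, E, F = coef
    return (A * rd * l * w + B * rd * l + C * rd * cl
            + D * l**2 + E * l**2 / w + F * cl * l / w)

def fed_optimal_width(rd, cl, l, coef):
    """Width that minimizes the delay: root of dT/dw = 0."""
    A, _, _, _, E, F = coef
    return math.sqrt((E * l + F * cl) / (A * rd))

def fed_min_width_for_bound(rd, cl, l, coef, t_bound):
    """Smallest width whose delay meets t_bound, or None if the bound is infeasible."""
    A, B, C, D, E, F = coef
    a = A * rd * l                                    # coefficient of w
    b = B * rd * l + C * rd * cl + D * l**2 - t_bound # width-independent part minus bound
    c = E * l**2 + F * cl * l                         # coefficient of 1/w
    disc = b * b - 4 * a * c
    if disc < 0:
        return None
    return (-b - math.sqrt(disc)) / (2 * a)           # smaller root of a*w^2 + b*w + c = 0

# Stand-in coefficients (assumption): Scaled Elmore values, not fitted values.
r, ca, cf, cg, rg = 0.0679, 0.0596, 0.0641, 0.234, 17.1e3
coef = tuple(math.log(2) * x for x in (ca, cf, 1.0, r * ca / 2, r * cf / 2, r))
rd, cl, l = rg / 13.59, cg * 121.40, 2674.0
w_opt = fed_optimal_width(rd, cl, l, coef)
t_opt = fed_delay(rd, cl, l, w_opt, coef)
w_min = fed_min_width_for_bound(rd, cl, l, coef, 1.10 * t_opt)
```

The example reuses the driver, load, and length of the random case plotted in Figure 2.4 and sets the delay bound to 10% over the optimal delay, as in the experiment described above.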
Table 2.6 Relative errors in optimal wire width for Scaled Elmore model and our model
            Maximum error       Average error
Tech. (µm)  SED       FED       SED       FED
0.25        6.32%     2.44%     5.40%     1.55%
0.18        6.28%     2.62%     5.41%     1.61%
0.13        6.31%     2.68%     5.37%     1.68%
0.07        6.30%     2.97%     5.13%     1.81%
The delay versus width for one of the random cases on the 0.18µm technology is
plotted in Figure 2.4. For this case, R_D = r_g/13.59, C_L = c_g × 121.40, and l = 2674µm. This
case generates an error of 6.16% for the Scaled Elmore model and an error of 2.47% for the
Fitted Elmore Delay model when compared with Hspice. This gives us a 2.5× improvement
over the Scaled Elmore Delay model for this case. Figure 2.5 shows an enlarged version of
Figure 2.4. The results on wire width minimization subject to delay bound are summarized
in Table 2.7.
Table 2.7 Relative errors in wire width for wire width minimization subject to delay bound
            Maximum error       Average error
Tech. (µm)  SED       FED       SED       FED
0.25        23.67%    2.40%     18.19%    1.32%
0.18        23.96%    2.67%     18.34%    1.34%
0.13        24.11%    2.85%     18.53%    1.28%
0.07        23.69%    4.28%     19.24%    0.83%
Figure 2.4 Comparison between the FED and the Scaled Elmore and Hspice
[Plot of delay versus wire width (µm); curves: Hspice, Elmore, model]
Figure 2.5 Enlarged portion of Figure 2.4
[Plot of delay versus wire width (µm), 0.24 to 0.4; curves: our model, Scaled Elmore, Hspice]
As Table 2.7 shows, the Scaled Elmore Delay (SED) model performs poorly in this
experiment. The average errors in wire width are more than 18%. In fact, because SED
tends to underestimate the delay, all the wire widths computed according to SED are
significantly less than those by Hspice. In other words, none of the solutions by SED
satisfies the delay bound. If the Elmore Delay (ED) model is used instead, since ED always
significantly overestimates delay, there is no feasible solution (i.e., the delay bound is not
achievable by ED) in most cases. However, if a feasible solution is found, that solution is
guaranteed to satisfy the delay bound. The FED model underestimates the delay in about
half the cases. However, since FED is much more accurate, we observe that in all cases,
FED solutions violate the delay bound by less than 1%.
2.6 Extension to interconnect tree 
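In this section, we extend the Fitted Elmore Delay model to handle an interconnect
with tree topology as shown in Figure 2.6. The Elmore Delay value for node 2 in Figure 2.6
is given by:

$$\begin{aligned}
T_{ED} = {} & R_D \left( c_a l_1 w_1 + c_f l_1 + c_a l_2 w_2 + c_f l_2 + c_a l_3 w_3 + c_f l_3 + C_{L2} + C_{L3} \right) \\
& + \frac{r l_1}{w_1} \left( \frac{c_a l_1 w_1 + c_f l_1}{2} + c_a l_2 w_2 + c_f l_2 + c_a l_3 w_3 + c_f l_3 + C_{L2} + C_{L3} \right) \\
& + \frac{r l_2}{w_2} \left( \frac{c_a l_2 w_2 + c_f l_2}{2} + C_{L2} \right)
\end{aligned}$$

After arranging the terms:

$$\begin{aligned}
T_{ED} = {} & c_a R_D \left( l_1 w_1 + l_2 w_2 + l_3 w_3 \right) \\
& + c_f R_D \left( l_1 + l_2 + l_3 \right) \\
& + R_D \left( C_{L2} + C_{L3} \right) \\
& + \frac{r c_a}{2} \left( \frac{l_1}{w_1} l_1 w_1 + \frac{2 l_1}{w_1} l_2 w_2 + \frac{2 l_1}{w_1} l_3 w_3 + \frac{l_2}{w_2} l_2 w_2 \right) \\
& + \frac{r c_f}{2} \left( \frac{l_1}{w_1} l_1 + \frac{2 l_1}{w_1} l_2 + \frac{2 l_1}{w_1} l_3 + \frac{l_2}{w_2} l_2 \right) \\
& + r \left( \frac{l_1}{w_1} C_{L2} + \frac{l_1}{w_1} C_{L3} + \frac{l_2}{w_2} C_{L2} \right)
\end{aligned}$$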
Figure 2.6 Routing tree
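The Fitted Elmore Delay model (FED) for interconnect trees is obtained by replacing
the coefficients of the six terms above by the constants A, B, C, D, E, F found by multiple
linear regression for a single wire.

$$\begin{aligned}
T_{FED} = {} & A R_D \left( l_1 w_1 + l_2 w_2 + l_3 w_3 \right) \\
& + B R_D \left( l_1 + l_2 + l_3 \right) \\
& + C R_D \left( C_{L2} + C_{L3} \right) \\
& + D \left( \frac{l_1}{w_1} l_1 w_1 + \frac{2 l_1}{w_1} l_2 w_2 + \frac{2 l_1}{w_1} l_3 w_3 + \frac{l_2}{w_2} l_2 w_2 \right) \\
& + E \left( \frac{l_1}{w_1} l_1 + \frac{2 l_1}{w_1} l_2 + \frac{2 l_1}{w_1} l_3 + \frac{l_2}{w_2} l_2 \right) \\
& + F \left( \frac{l_1}{w_1} C_{L2} + \frac{l_1}{w_1} C_{L3} + \frac{l_2}{w_2} C_{L2} \right)
\end{aligned}$$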
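The idea above can be generalized to trees with any topology. For a general tree, let
T be the set of indices of all tree edges. Let T(i) be the set of indices of tree edges
downstream of edge i. Let S be the set of indices of all sinks. Let S(i) be the set of
indices of sinks downstream of edge i. Let P(k) be the set of indices of tree edges
along the path from the driver to sink k.
The Fitted Elmore Delay for node k is then:

$$\begin{aligned}
T_{FED}(k) = {} & A R_D \sum_{i \in T} l_i w_i \\
& + B R_D \sum_{i \in T} l_i \\
& + C R_D \sum_{j \in S} C_{Lj} \\
& + D \sum_{i \in P(k)} \frac{l_i}{w_i} \Big( l_i w_i + 2 \sum_{j \in T(i)} l_j w_j \Big) \\
& + E \sum_{i \in P(k)} \frac{l_i}{w_i} \Big( l_i + 2 \sum_{j \in T(i)} l_j \Big) \\
& + F \sum_{i \in P(k)} \frac{l_i}{w_i} \Big( \sum_{j \in S(i)} C_{Lj} \Big)
\end{aligned}$$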
In order to test the accuracy of the Fitted Elmore Delay (FED) model, the Scaled 
Elmore Delay (SED) and the FED of several trees with different number of sinks are 
calculated on the 0.18µm technology. The relative error in delay with respect to the Hspice 
simulation is reported in Table 2.8. 
Table 2.8 Relative errors in delay for Scaled Elmore model and FED using interconnect trees
               Maximum error       Average error
Tree           SED       FED       SED       FED
T1 (2 sinks)   3.06%     1.23%     2.34%     0.65%
T2 (3 sinks)   4.66%     0.32%     4.53%     0.17%
T3 (4 sinks)   9.36%     0.26%     9.32%     0.17%
2. 7 Transformed Elmore Delay model 
Almost all previous algorithms and programs based on the Elmore Delay model can
use the FED directly instead. However, for some results that depend heavily on the
functional form of the Elmore Delay model (e.g., [27,28]), it is not obvious whether the
FED can replace the Elmore Delay model. A model with the same form as the Elmore
Delay model is therefore desirable. In this section, we present the Transformed Elmore
Delay (TED) model of the form:
$$
T_{TED}(R_D, C_L, l, w) = \alpha R_D\,(c_a l w + c_f l + \beta C_L) + \frac{r\,l}{w}\Big(\frac{c_a l w}{2} + \frac{c_f l}{2} + \beta C_L\Big) \tag{2.9}
$$
This model has basically the same form as the Elmore Delay model shown in
equation 2.4. The only differences are that the technology parameters are changed and that
the driver resistance and the load capacitance are scaled. As a result, all programs and
algorithms based on the Elmore Delay model can be changed to use our model very easily
and obtain more accurate results. Moreover, in the worst case, the Transformed model can
use the same parameter values as the Elmore Delay model, so it is never worse than the
Elmore Delay model.
In order to obtain the coefficients α, c_a, c_f, β and r so that T_TED is a good
approximation to the Hspice data, we can match the terms of equation 2.9 against the FED
coefficients A, B, C, D, E and F, which gives the following equalities. We have to be very
careful not to confuse these new coefficients (c_a, c_f, and r) with the ones from the 1999
Roadmap technologies in Table 2.1.
The equalities are:
$$
\alpha c_a = A, \qquad \alpha c_f = B, \qquad \alpha \beta = C, \qquad
r c_a = 2D, \qquad r c_f = 2E, \qquad r \beta = F
$$
By taking the base-10 logarithm of these six equalities, we have the following system 
of linear equations: 
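$$
Mx = b, \quad \text{where} \quad
M = \begin{bmatrix}
1 & 0 & 0 & 1 & 0 \\
1 & 0 & 0 & 0 & 1 \\
1 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 1 & 0 \\
0 & 0 & 1 & 0 & 1 \\
0 & 1 & 1 & 0 & 0
\end{bmatrix}, \quad
x = \begin{bmatrix} \log \alpha \\ \log \beta \\ \log r \\ \log c_a \\ \log c_f \end{bmatrix}, \quad
b = \begin{bmatrix} \log A \\ \log B \\ \log C \\ \log 2D \\ \log 2E \\ \log F \end{bmatrix}
$$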
We have six equations but only five unknowns, so the system is over-determined and we
cannot expect to find an x that satisfies it exactly. Instead, we seek the x that minimizes
||Mx - b||_2. This is the least-squares problem, which can be solved by QR
factorization [22]. The parameters obtained by QR factorization for each technology are
listed in Table 2.9.
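The least-squares step can be sketched in a few lines of Python with NumPy. This is an illustration under stated assumptions, not the procedure actually used to produce Table 2.9. The function name `fit_ted` is hypothetical. One point to keep in mind: multiplying α and r by any k > 0 while dividing c_a, c_f and β by k leaves equation 2.9 unchanged, so the six equalities determine only four independent quantities. The sketch therefore pins β = 1 (an assumption made here, chosen to agree with the β row of Table 2.9) and solves for the remaining four unknowns.

```python
import numpy as np


def fit_ted(A, B, C, D, E, F):
    """Least-squares TED coefficients from FED coefficients, with beta pinned to 1.

    Returns (alpha, beta, r, c_a, c_f).
    """
    # Unknowns: log10 of (alpha, r, c_a, c_f). log10(beta) = 0 by assumption.
    M = np.array([
        [1, 0, 1, 0],   # alpha * c_a  = A
        [1, 0, 0, 1],   # alpha * c_f  = B
        [1, 0, 0, 0],   # alpha * beta = C
        [0, 1, 1, 0],   # r * c_a      = 2D
        [0, 1, 0, 1],   # r * c_f      = 2E
        [0, 1, 0, 0],   # r * beta     = F
    ], dtype=float)
    b = np.log10([A, B, C, 2 * D, 2 * E, F])
    Q, R = np.linalg.qr(M)            # reduced QR: Q is 6x4, R is 4x4 upper triangular
    x = np.linalg.solve(R, Q.T @ b)   # least-squares solution of M x = b
    alpha, r, c_a, c_f = 10.0 ** x
    return alpha, 1.0, r, c_a, c_f
```

When the six FED coefficients are exactly consistent with some choice of (α, r, c_a, c_f), the function recovers that choice; otherwise it returns the best fit in the logarithmic domain.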
Table 2.9 Coefficients for Transformed Elmore Delay model 
Tech. (µm)   0.25              0.18              0.13              0.07
α            0.682475          0.684586          0.686041          0.689715
β            1.000000          1.000000          1.000000          1.000000
r            0.053827          0.050068          0.059774          0.070832
c_a          6.129341×10^-17   6.243642×10^-17   4.786529×10^-17   5.823654×10^-17
c_f          8.540268×10^-17   6.669960×10^-17   4.468428×10^-17   4.102596×10^-17
The relative error in delay for the Transformed Elmore Delay model is reported in
Table 2.10. For the maximum relative error, the Transformed Elmore Delay model is only
0.8-0.89% worse than the Fitted Elmore Delay model. For the average relative error, the
Transformed Elmore Delay model is only 0.13-0.55% worse than the Fitted Elmore Delay
model. We can see that β is equal to 1 in all cases, which means that scaling the load
capacitance is not useful in improving this delay model.
Table 2.10. Relative errors in delay for the Transformed Elmore Delay model 
Tech. (µm)   Maximum relative error   Average relative error
0.25         2.85%                    0.78%
0.18         3.05%                    0.81%
0.13         3.37%                    0.86%
0.07         3.69%                    0.93%
We have seen that the FED and TED are significantly more accurate than the Elmore
Delay model. However, like the Elmore Delay model, the accuracy of the FED and TED is
adversely affected by the resistive shielding problem. That is, as the gate "resistance" gets
smaller and the metal resistance gets larger, the gate no longer "sees" the total net
capacitance and the gate delay may be significantly less than expected [3]. The resistive
shielding problem is discussed in Chapter 3.
CHAPTER 3. IMPROVED EFFECTIVE CAPACITANCE 
3.1 Resistive shielding 
In the previous chapters, we have studied the advantage of the Elmore Delay model in
providing a simple and computationally inexpensive delay estimate. It also provides good
fidelity with respect to the accurate Hspice delay. On the other hand, we have found that
improving the accuracy of the Elmore Delay model is critical, since the Elmore Delay
model does not take the resistive shielding problem into consideration. That is, as the gate
"resistance" gets smaller and the metal resistance gets larger, the gate no longer "sees" the
total net capacitance and the gate delay may be significantly less than expected [3]. For
example, if we increase the interconnect resistance of the load and keep the gate output
resistance constant, the total gate delay at the output will decrease since the interconnect
resistance shields some of the load capacitance. Figure 3.1 shows a single-load circuit with
the resistive shielding problem. Let R_int >> R_D. Under this condition, the interconnect
shields some of the load capacitance, and the driver sees an effective capacitance instead of
the total capacitance. The shielded amount of the capacitance is captured by the scaling
factor α. The nodes close to the driver are called near-end nodes (e.g., node 1) and the nodes
far from the driver are called far-end nodes (e.g., node 2).
Figure 3.1 Single-load circuit 
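The delay at node 1 (near-end node) is given by:
$$\text{Elmore}_1 = R_D \left( C_{int}/2 + C_{int}/2 + C_L \right) \tag{3.1}$$
$$\text{Actual\_Delay}_1 = R_D \left( C_{int}/2 + \alpha\, C_{int}/2 + \alpha\, C_L \right) \tag{3.2}$$
$$\text{Effective Capacitance} = \alpha \left( C_{int}/2 + C_L \right)$$
The delay at node 2 (far-end node) is given by:
$$\text{Elmore}_2 = R_D \left( C_{int}/2 + C_{int}/2 + C_L \right) + R_{int} \left( C_{int}/2 + C_L \right) \tag{3.3}$$
$$\text{Actual\_Delay}_2 = R_D \left( C_{int}/2 + \alpha\, C_{int}/2 + \alpha\, C_L \right) + R_{int} \left( C_{int}/2 + C_L \right) \tag{3.4}$$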
From equations 3.1 and 3.3, the Elmore Delay model assumes that the driver always sees
the total capacitance in the circuit regardless of the resistive shielding. From equations 3.2
and 3.4, the driver actually sees only a portion of the total capacitance; this portion is called
the effective capacitance. Equations 3.2 and 3.4 converge to equations 3.1 and 3.3,
respectively, when the circuit does not have resistive shielding (α = 1). We can also observe
that node 2 (far-end) does not suffer from resistive shielding as much as node 1 (near-end).
Since R_int >> R_D, the term R_int(C_int/2 + C_L) in equation 3.4 is larger than
R_D(C_int/2 + α C_int/2 + α C_L), and it does not depend on α. Thus, near-end nodes are
more sensitive to resistive shielding than far-end nodes.
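A small numerical example makes the difference between near-end and far-end sensitivity concrete. The Python sketch below simply evaluates equations 3.1 to 3.4. The function name `single_load_delays` and all component values, including the shielding factor α = 0.5, are hypothetical and chosen only so that R_int >> R_D; they are not taken from the experiments in this thesis.

```python
def single_load_delays(r_d, r_int, c_int, c_l, alpha):
    """Evaluate equations 3.1-3.4 for the single-load circuit of Figure 3.1."""
    total = c_int / 2 + c_int / 2 + c_l            # capacitance assumed by the Elmore model
    seen = c_int / 2 + alpha * (c_int / 2 + c_l)   # capacitance the driver actually sees
    wire = r_int * (c_int / 2 + c_l)               # interconnect term, not scaled by alpha
    return {
        "elmore_1": r_d * total,
        "actual_1": r_d * seen,
        "elmore_2": r_d * total + wire,
        "actual_2": r_d * seen + wire,
    }


def relative_overestimate(elmore, actual):
    """Relative error of the Elmore value with respect to the shielded value."""
    return (elmore - actual) / actual


# Hypothetical values: 10-ohm driver, 1-kilohm wire, 20 fF wire cap, 5 fF load, alpha = 0.5.
d = single_load_delays(r_d=10.0, r_int=1000.0, c_int=20e-15, c_l=5e-15, alpha=0.5)
near_err = relative_overestimate(d["elmore_1"], d["actual_1"])
far_err = relative_overestimate(d["elmore_2"], d["actual_2"])
```

With these assumed values the Elmore overestimate at the near-end node is far larger in relative terms than at the far-end node, because the unshielded R_int term dominates the far-end delay.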
In this chapter, we present the derivation and some experimental results for the
Improved Effective Capacitance Metric (IECM), which overcomes the resistive shielding
problem. It accounts for resistive shielding by computing an effective capacitance to
model the downstream capacitance in the circuit. We adopt the methodology of O'Brien and
Savarino [9] for computing the driving-point admittance in order to capture the resistive
shielding. The IECM can be computed with the same simplicity as the Elmore Delay
model. We have tested the IECM on a single-load circuit and multiple tree topologies.
Experiments show that the IECM is significantly more accurate than the Elmore Delay model
and the Effective Capacitance Metric (ECM) proposed by [19].
3.2 Driving point admittance approximation 
O'Brien and Savarino [9] propose a method for computing the first three Taylor
series coefficients of the driving-point admittance Y_j(s) at node j. These coefficients are
used to calculate the three components of their proposed π-model (R_π,j, C_n,j, C_f,j) at
node j, as shown in Figure 3.2.
Figure 3.2 (R_π,j, C_n,j, C_f,j) π-model
Their algorithm starts at the leaf nodes of an RC tree and works back to the source in a
finite sequence of steps [9], in a bottom-up fashion. Suppose that the input admittance of the
circuit downstream of node j, Y_j(s), can be expanded in a Taylor series about s = 0 as follows:
$$Y_j(s) = y_{0,j} + y_{1,j}\, s + y_{2,j}\, s^2 + y_{3,j}\, s^3 + \cdots \tag{3.5}$$
Only the first three moments (y_1,j, y_2,j and y_3,j) of the admittance in equation 3.5 are
used in this algorithm. According to [9], either the second- or third-order moment provides
sufficient accuracy in all cases of practical interest. The moment y_0,j in equation 3.5 is
assumed to be zero since there is no DC path to ground. The admittance Y_j(s) is also
assumed to be zero if node j is a leaf node.
Figure 3.3 illustrates the computation of the driving-point admittance moments for an
RC segment. Figure 3.4 shows how to combine k parallel branches (k >= 2) in the case of
multiple-load circuits. The values R_j and C_j are the resistance and capacitance,
respectively, seen by node j as shown in Figure 3.5.
Figure 3.3 Computation of moments across an RC segment 
Figure 3.4 Combining parallel admittances
Figure 3.5 RC segment equivalent 
In Figure 3.3, the first three moments of the driving-point admittance at node j are
calculated in a bottom-up fashion as follows:
$$y_{1,j} = y_{1,j+1} + C_j \tag{3.6}$$
$$y_{2,j} = y_{2,j+1} - R_j \left( y_{1,j+1}^2 + C_j\, y_{1,j+1} + \tfrac{1}{3} C_j^2 \right) \tag{3.7}$$
$$y_{3,j} = y_{3,j+1} - R_j \left( 2\, y_{1,j+1}\, y_{2,j+1} + C_j\, y_{2,j+1} \right)
+ R_j^2 \left( y_{1,j+1}^3 + \tfrac{4}{3} C_j\, y_{1,j+1}^2 + \tfrac{2}{3} C_j^2\, y_{1,j+1} + \tfrac{2}{15} C_j^3 \right) \tag{3.8}$$
When the bottom-up recursion reaches a split in the circuit with k branches
(k >= 2), as in Figure 3.4, the parallel admittances simply add together as follows:
$$y_{1,j} = \sum_{m=1}^{k} y_{1,j,m} \tag{3.9}$$
$$y_{2,j} = \sum_{m=1}^{k} y_{2,j,m} \tag{3.10}$$
$$y_{3,j} = \sum_{m=1}^{k} y_{3,j,m} \tag{3.11}$$
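The recursion of equations 3.6 to 3.11 maps naturally onto two small functions. The following Python sketch is illustrative only: the names `segment_moments` and `merge_branches` are hypothetical, and the update uses the distributed-RC form of the moment equations (with y_2 negative), which is the form consistent with a positive resistance in equation 3.12. A leaf that carries a load capacitor C_L is represented here by the moments (C_L, 0, 0), which is an assumption of this sketch.

```python
def segment_moments(y_down, r, c):
    """Propagate admittance moments (y1, y2, y3) upstream across one RC segment.

    `y_down` holds the moments at node j+1; `r` and `c` are the segment's total
    resistance and capacitance. Returns the moments at node j.
    """
    y1, y2, y3 = y_down
    n1 = y1 + c
    n2 = y2 - r * (y1 ** 2 + c * y1 + c ** 2 / 3.0)
    n3 = (
        y3
        - r * (2.0 * y1 * y2 + c * y2)
        + r ** 2 * (y1 ** 3 + 4.0 / 3.0 * c * y1 ** 2
                    + 2.0 / 3.0 * c ** 2 * y1 + 2.0 / 15.0 * c ** 3)
    )
    return n1, n2, n3


def merge_branches(branches):
    """Add the moments of parallel branches that meet at one node."""
    return tuple(sum(m) for m in zip(*branches))


# Hypothetical two-branch example: each branch is one segment ending in a load capacitor.
branch_a = segment_moments((5e-15, 0.0, 0.0), r=60.0, c=1e-15)
branch_b = segment_moments((3e-15, 0.0, 0.0), r=60.0, c=1.2e-15)
at_split = merge_branches([branch_a, branch_b])
at_driver = segment_moments(at_split, r=80.0, c=0.5e-15)
```

A useful sanity check on the update is that splitting a uniform segment into two halves and applying it twice gives the same three moments as applying it once to the whole segment.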
After the first three moments have been computed in a bottom-up fashion up to the
near-end node (j = 1), the three components of the π-model at node j = 1, as in Figure 3.6,
are given by:
$$R_{\pi,1} = -\frac{y_{3,1}^2}{y_{2,1}^3} \tag{3.12}$$
$$C_{f,1} = \frac{y_{2,1}^2}{y_{3,1}} \tag{3.13}$$
$$C_{n,1} = y_{1,1} - C_{f,1} \tag{3.14}$$
Figure 3.6 The π-model of the driving-point admittance for a driver resistance R_D
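Equations 3.12 to 3.14 translate into a three-line function. The Python sketch below is a minimal illustration; the name `pi_model` is hypothetical. It assumes that y_2 is negative and y_3 is positive, as is the case for the moments of an RC tree, so that the resulting resistance is positive.

```python
def pi_model(y1, y2, y3):
    """Return (R_pi, C_near, C_far) from the first three admittance moments."""
    c_far = y2 ** 2 / y3          # equation 3.13
    r_pi = -(y3 ** 2) / y2 ** 3   # equation 3.12, positive because y2 < 0
    c_near = y1 - c_far           # equation 3.14
    return r_pi, c_near, c_far
```

For an unloaded uniform line with total resistance R and capacitance C, whose moments are (C, -R C^2/3, 2 R^2 C^3/15), this yields the familiar reduction R_π = 12R/25, C_f = 5C/6 and C_n = C/6.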
The methodology of O'Brien and Savarino [9] is used by many existing models
[3, 19]. Kashyap, Alpert and Devgan [19] adopt the method of [9] to derive the Effective
Capacitance Metric (ECM), which captures the resistive shielding effect.
3.3 Effective Capacitance Metric (ECM)
The authors of [19] compute the three components of the π-model (R_π,j, C_n,j and
C_f,j) to capture the resistive shielding. Their mathematical derivation considers a unit-step
voltage source driving an RC circuit [19]. They model R_π,j and C_f,j at node j by a single
effective capacitance C_eff,j, as shown in Figure 3.7. The effective capacitance C_eff,j has a
value less than or equal to C_f,j, depending on the resistive shielding in the circuit.
Figure 3.7 C_eff model
According to the ECM model, the effective capacitance at node j, based on a unit-step
voltage source, is given by:
$$C_{eff,j}(\mathrm{ECM}) = C_{f,j} \left( 1 - e^{-T_j / \tau_j} \right) \tag{3.15}$$
where T_j is the most pessimistic choice for the delay at node j, which can be
approximated by the Elmore Delay value at that node, and τ_j is the product of
R_π,j and C_f,j.
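Equation 3.15 is simple enough to evaluate directly. The Python sketch below is only an illustration of the step-input formula; the function name `ecm_ceff_step` and its argument names are hypothetical.

```python
import math


def ecm_ceff_step(c_far, r_pi, t_j):
    """Effective capacitance of equation 3.15 for a unit-step input.

    `t_j` is the delay estimate at the node (for example its Elmore Delay value).
    """
    tau = r_pi * c_far
    return c_far * (1.0 - math.exp(-t_j / tau))
```

The result approaches C_f,j when T_j is much larger than τ_j (little shielding) and approaches zero when T_j is much smaller than τ_j (strong shielding).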
For a ramp voltage source with a driver transition time of tr, the effective capacitance 
at node j is: 
3.16 
Once the effective capacitance values have been calculated at all nodes, the ECM
calculates the delay in a top-down fashion, starting at the near-end node (j = 1) and
proceeding to the leaf nodes of the circuit. The delay at node j, proposed by the ECM, is
given by:
3.17
Where:
ECM_p(j): ECM delay of the predecessor of node j
R_j: local resistance seen by node j
C_j: local capacitance at node j
We can see that the ECM has the same form as the Elmore Delay model, with one
exception: the ECM sees the effective capacitance at node j, which varies between zero and
C_f,j depending on the resistive shielding of the circuit. The Elmore Delay model, on the
other hand, always sees the total capacitance at node j regardless of the resistive shielding.
Although the ECM provides more accurate results than the Elmore Delay model, we
introduce the IECM, which provides more accurate delay results than the ECM.
3.4 Improved Effective Capacitance Metric (IECM) 
We use the same three components of the π-model (R_π,j, C_n,j and C_f,j) proposed
by [9]. We model R_π,j and C_f,j at node j by a single effective capacitance C_eff,j, which has
a different mathematical form from the one presented by the ECM [19]. We assume that the
driving input signal has a 20%-80% transition time of t_r. Thus, we consider a ramp voltage
source driving an RC tree in our mathematical derivation, whereas the ECM derivation is
based on a unit-step voltage source. We will show that the IECM produces more accurate
delay values than the ECM.
We start our mathematical derivation from the near-end node, where j = 1. Consider a
ramp voltage source driving an RC circuit at node j = 1, as shown in Figure 3.7.
$$v(t) = V_{dd} \left( 1 - e^{-at} \right) u(t) \tag{3.18}$$
The ramp voltage source has a transition time t_r from 20% to 80% of V_dd, as in
Figure 3.8.
Figure 3.8 Ramp voltage source 
The ramp voltage source at t_1 is:
$$0.2\, V_{dd} = V_{dd} \left( 1 - e^{-a t_1} \right) \tag{3.19}$$
The ramp voltage source at t_2 is:
$$0.8\, V_{dd} = V_{dd} \left( 1 - e^{-a t_2} \right) \tag{3.20}$$
Solving equations 3.19 and 3.20 with t_r = t_2 - t_1 gives the constant a = ln(4)/t_r.
Equation 3.18 becomes:
$$v(t) = V_{dd} \left( 1 - e^{-\ln(4)\, t / t_r} \right) u(t) \tag{3.21}$$
Where:
t_r: 20%-80% transition time of the driver.
V_dd: steady-state voltage of the ramp function.
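The value of the constant a follows from the two threshold conditions in a few steps. The short derivation below is added for completeness and uses only the 20% and 80% crossing times t_1 and t_2 defined above, with t_r = t_2 - t_1.

```latex
\begin{aligned}
0.2 &= 1 - e^{-a t_1} &&\Rightarrow\quad t_1 = \frac{\ln(5/4)}{a}, \\
0.8 &= 1 - e^{-a t_2} &&\Rightarrow\quad t_2 = \frac{\ln 5}{a}, \\
t_r &= t_2 - t_1 = \frac{\ln 5 - \ln(5/4)}{a} = \frac{\ln 4}{a}
&&\Rightarrow\quad a = \frac{\ln 4}{t_r}.
\end{aligned}
```

The same waveform reaches 50% of V_dd at t = ln(2)/a = t_r/2.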
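The Laplace transform [26] of equation 3.21 is:
$$V(s) = V_{dd} \left( \frac{1}{s} - \frac{1}{s + \ln(4)/t_r} \right) \tag{3.22}$$
The current into the RC circuit of Figure 3.6 is:
$$I(s) = V(s)\,Y(s) = V_{dd} \left( \frac{1}{s} - \frac{1}{s + \ln(4)/t_r} \right) \left( \frac{s\, C_{f,1}}{s R_{\pi,1} C_{f,1} + 1} \right) \tag{3.23}$$
The inverse Laplace transform of I(s) is:
$$i(t) = \ln(4)\, V_{dd}\, C_{f,1} \left( \frac{-e^{-\ln(4)\, t / t_r} + e^{-t/(R_{\pi,1} C_{f,1})}}{\ln(4)\, R_{\pi,1} C_{f,1} - t_r} \right) \tag{3.24}$$
The total charge dumped into the RC circuit up to time D_1 is given by:
$$q(D_1) = \int_0^{D_1} i(t)\, dt \tag{3.25}$$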
The general form of the effective capacitance at node j, as shown in Figure 3.7, is
given by:
$$C_{eff,j}(\mathrm{IECM}) = \frac{q(D_j)}{V(D_j)}
= \frac{C_{f,j} \left( t_r - \ln(4)\, R_{\pi,j} C_{f,j}\, \dfrac{1 - e^{-D_j/(R_{\pi,j} C_{f,j})}}{1 - e^{-\ln(4)\, D_j / t_r}} \right)}{t_r - \ln(4)\, R_{\pi,j} C_{f,j}} \tag{3.26}$$
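The step from the current waveform to the effective capacitance can be written out explicitly. The derivation below is a sketch of the intermediate algebra and is not copied from the thesis; the shorthand a = ln(4)/t_r and τ = R_π,1 C_f,1 is introduced here only to keep the expressions compact, and a τ ≠ 1 is assumed.

```latex
% Shorthand (introduced here): a = \ln(4)/t_r, \quad \tau = R_{\pi,1} C_{f,1}
\begin{aligned}
i(t) &= \frac{a\, V_{dd}\, C_{f,1}}{1 - a\tau} \left( e^{-at} - e^{-t/\tau} \right), \\
q(D_1) &= \int_0^{D_1} i(t)\, dt
        = \frac{V_{dd}\, C_{f,1}}{1 - a\tau}
          \left[ \left( 1 - e^{-aD_1} \right) - a\tau \left( 1 - e^{-D_1/\tau} \right) \right], \\
V(D_1) &= V_{dd} \left( 1 - e^{-aD_1} \right), \\
\frac{q(D_1)}{V(D_1)} &= \frac{C_{f,1}}{1 - a\tau}
          \left[ 1 - a\tau\, \frac{1 - e^{-D_1/\tau}}{1 - e^{-aD_1}} \right]
        = \frac{C_{f,1} \left( t_r - \ln(4)\, \tau\, \dfrac{1 - e^{-D_1/\tau}}{1 - e^{-\ln(4) D_1 / t_r}} \right)}
               {t_r - \ln(4)\, \tau}.
\end{aligned}
```

The last expression is obtained by multiplying numerator and denominator by t_r, and it has the same form as equation 3.26 with j = 1.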
Figure 3.8 shows the input (v_1) and output (v_2) signals of Figure 3.6, and helps us
decide on the value of D_j. We approximate the 50%-50% delay by the Elmore Delay value
(E_j) at node j. The input signal reaches 50% of V_dd at time t_r/2. Hence:
D_j = E_j + t_r/2
Equation 3.26 becomes:
$$C_{eff,j}(\mathrm{IECM}) = \frac{C_{f,j} \left( t_r - \ln(4)\, R_{\pi,j} C_{f,j}\, \dfrac{1 - e^{-(E_j + t_r/2)/(R_{\pi,j} C_{f,j})}}{1 - e^{-\ln(4)\,(E_j + t_r/2)/t_r}} \right)}{t_r - \ln(4)\, R_{\pi,j} C_{f,j}} \tag{3.27}$$
Figure 3.8 Voltage versus time
We can see that equation 3.27 depends on the values of E_j and t_r. These two values
capture the resistive shielding effects in the circuit. If E_j = ∞ or t_r = ∞, then the effective
capacitance C_eff,j converges to C_f,j, since the total capacitance will be seen by the driver.
This behavior agrees with the behavior seen in the Effective Capacitance Metric (ECM)
model.
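The limiting behavior described above is easy to check numerically. The Python sketch below evaluates the effective capacitance of equation 3.27; it is an illustration only, and the function name `iecm_ceff` and its argument names are hypothetical. The expression is singular when t_r equals ln(4) R_π,j C_f,j, and the sketch does not handle that special case.

```python
import math


def iecm_ceff(c_far, r_pi, elmore, t_r):
    """Effective capacitance of equation 3.27 for an input with 20%-80% time t_r.

    `elmore` is the delay estimate E_j at the node. The formula is undefined
    when t_r == ln(4) * r_pi * c_far; that case is not handled here.
    """
    ln4 = math.log(4.0)
    tau = r_pi * c_far
    d = elmore + t_r / 2.0                      # D_j = E_j + t_r / 2
    ratio = (1.0 - math.exp(-d / tau)) / (1.0 - math.exp(-ln4 * d / t_r))
    return c_far * (t_r - ln4 * tau * ratio) / (t_r - ln4 * tau)
```

For finite E_j and t_r the returned value lies strictly between zero and C_f,j, and it approaches C_f,j as E_j grows, in line with the text.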
Once we have calculated the effective capacitance values, the IECM calculates the
delay from the near-end node to the leaf nodes, as the ECM does. The delay at node j
proposed by the IECM is given by:
3.28
Where:
IECM_p(j): IECM delay of the predecessor of node j
R_j: local resistance seen by node j
C_j: local capacitance at node j
We assume that IECM_p(j) is zero at the driving node (node 1), since the signal
has not been delayed yet. The ECM makes the same assumption. In equation 3.28, we can
see how the delay value IECM_j is affected by the effective capacitance value. Thus, we have
improved the accuracy of the IECM by modifying the effective capacitance in
equation 3.27: we replace E_j with (IECM_p(j) + SE_j) for all nodes. The value SE_j is
the downstream Scaled Elmore Delay value at node j, without the upstream delay
of node j, since IECM_p(j) captures that value. Equation 3.27 becomes:
$$C_{eff,j}(\mathrm{IECM}) = \frac{C_{f,j} \left( t_r - \ln(4)\, R_{\pi,j} C_{f,j}\, \dfrac{1 - e^{-(SE_j + IECM_{p(j)} + t_r/2)/(R_{\pi,j} C_{f,j})}}{1 - e^{-\ln(4)\,(SE_j + IECM_{p(j)} + t_r/2)/t_r}} \right)}{t_r - \ln(4)\, R_{\pi,j} C_{f,j}} \tag{3.29}$$
The algorithm for computing the delay value at each node in the proposed IECM is:
1. Perform a bottom-up traversal of the tree:
1.1- Perform moment calculations up to the driving node.
1.2- Compute the three components of the π-model at each node.
2. Perform a top-down traversal of the tree: 
2.1- Compute the effective capacitance value at each node. 
2.2- Compute the delay at each node. 
3.5 ECM and IECM comparison 
For the 0.18 µm technology, the Scaled Elmore Delay model, the IECM, and the ECM are
compared with the true delay provided by Hspice. First, we tested the ECM and the IECM
on a single-load circuit with 10 RC segments, as shown in Figure 3.9. This test is quite
challenging for any delay model, since the nodes at the near end have significant resistive
shielding. In our test, every RC segment is modeled as an RC π-type model as in Figure 3.4.
The values of the resistance and the capacitance are randomly chosen between 1-20 kΩ and
1-20 fF, respectively. The random process is performed ten times. For each node, we
measure the average relative error of the Scaled Elmore Delay model, the IECM, and the
ECM with respect to the true Hspice delay. The average relative error results are shown in
Table 3.1.
Figure 3.9 Single-load circuit 
Table 3.1 Average Relative errors of single-load circuit 
2 110.3 166.3 201.5 98.23 156.7 198.3 52.10 89.22 109.8 
3 31.35 76.31 98.21 24.96 48.28 64.83 12.03 37.41 41.26 
4 19.56 41.03 52.43 13.52 26.74 30.92 4.31 17.16 20.74 
5 17.75 38.41 49.01 8.20 24.51 28.71 2.12 15.52 17.32 
6 15.82 35.62 47.84 7.85 23.97 27.62 2.04 15.12 17.10 
7 14.63 34.96 46.37 7.19 22.12 26.68 1.93 14.84 17.02 
8 14.03 34.17 46.13 7.05 21.91 26.02 1.35 13.72 16.94 
9 13.43 33.80 45.91 6.82 20.73 25.87 1.09 12.26 15.78 
10 12.91 30.21 45.04 6.51 20.04 24.96 0.96 11.14 14.36 
11 9.01 28.07 44.86 6.13 19.87 24.63 0.82 9.34 12.37 
Table 3.1 presents the average relative error at each node of the single-load circuit.
We can see that the Scaled Elmore Delay model performs the worst of all models. The
IECM is clearly superior to both the ECM and the Scaled Elmore Delay model for different
driver transition times at all nodes. The IECM is within 9.01% of the true delay for all
far-end nodes. The IECM is better than both models for near-end and far-end nodes. We can
also see that the IECM performs better for higher driver transition times, since its
mathematical derivation is based on a ramp function.
We then explored the Improved Effective Capacitance Metric (IECM) on multiple
tree topologies for the 0.18 µm technology. We tested the ECM and the IECM on a two-load
circuit with 7 RC segments, as shown in Figure 3.10. We chose the driver resistance to be
within 10 Ω to allow resistive shielding in the circuit. For each RC segment, the values of the
resistance and the capacitance are shown in Table 3.2. The relative error results are shown in
Table 3.3.
Figure 3.10 Two-load circuit 
Table 3.2. RC values for Figure 3.10 
n    R_n (Ω)   C_n (fF)
1    80        0.5
2    60        1
3    60        1
4    60        1
5    60        1.2
6    60        1
7    60        1.2
Table 3.3. Relative errors for two-load circuit 
2 31.74 53.11 72.83 28.91 56.40 72.57 15.69 44.83 45.82 
3 9.37 23.78 37.84 7.37 25.91 37.78 6.29 31.61 31.94 
4 3.89 14.85 26.41 2.36 16.48 26.40 1.53 22.25 23.15 
5 2.45 11.90 22.22 1.14 13.33 22.19 1.02 18.58 19.6 
6 2.19 11.09 20.80 0.96 12.41 20.77 0.89 17.39 18.3 
7 17.28 32.42 47.98 15.30 34.78 47.84 11.37 34.41 35.36 
8 14.89 28.35 42.33 13.14 30.46 42.20 9.88 30.62 31.60 
In Table 3.3, our model (IECM) is always better than the ECM and the Scaled Elmore
Delay model for different driver transition times at all nodes. The IECM performs best for
both far-end and near-end nodes, and it still performs better for higher driver transition
times. The ECM, on the other hand, loses its stability for higher transition times.
We then tested the two models on a three-load circuit, as shown in Figure 3.11, for the
0.18 µm technology with 10 RC segments. We chose the driver resistance to be within 10 Ω
to allow resistive shielding in the circuit. For each RC segment, the values of the resistance
and the capacitance are shown in Table 3.4. The relative error results are shown in Table 3.5.
Figure 3.11 Three-load circuit 
Table 3.4. RC values for Figure 3.11 
n    R_n (Ω)   C_n (fF)
1    80        0.5
2    60        1
3    60        1.2
4    60        1
5    60        1
6    60        1.2
7    60        1.2
8    60        1
9    60        1
Table 3.5. Relative errors for three-load circuit 
2 14.29 32.09 49.80 9.96 35.98 49.67 8.26 35.44 32.98 
3 4.12 13.89 26.30 2.50 16.26 26.24 1.89 22.21 21.00 
4 3.63 12.43 33.15 1.75 14.59 33.09 1.37 21.14 27.96 
5 24.92 178.66 167.11 19.32 183.8 166.2 10.07 124.1 97.02 
6 8.44 102.21 98.96 4.96 107.4 98.66 2.41 94.34 73.49 
7 6.61 88.85 84.73 3.62 91.96 84.42 2.51 83.54 65.07 
8 3.63 12.43 33.15 1.75 14.59 23.09 1.37 21.14 27.96 
In Table 3.5, we can see that the ECM is worse than the Scaled Elmore Delay model at
several nodes (e.g., nodes 5 to 7). This behavior occurs because the ECM derivation is based
on a unit-step function. Our model, on the other hand, is consistently better than the ECM
and the Scaled Elmore Delay model for both near-end and far-end nodes. The IECM is
within 6.61% of the true delay for all far-end nodes.
We finally tested the two models on a four-load circuit, as shown in Figure 3.12, for the
0.18 µm technology with 10 RC segments. For each RC segment, the values of the resistance
and the capacitance are shown in Table 3.6. The relative error results are shown in Table 3.7.
Figure 3.12 Four-load circuit 
Table 3.6. RC values for Figure 3.12 
n R_n (Ω) C_n (fF)
1 80 1 
2 60 1 
3 60 1 
4 60 1 
5 60 1.2 
6 60 1 
7 60 1.2 
8 60 1 
9 60 1.2 
10 60 1.2 
Table 3.7. Relative errors for four-load circuit 
Relative error for delay
Node   0.01 ps                      100 ps                       1000 ps
       IECM   ECM   Scaled Elmore   IECM   ECM   Scaled Elmore   IECM   ECM   Scaled Elmore
2 50.43 72.58 88.52 48.48 75.01 88.28 33.43 68.12 67.51 
3 15.75 30.02 41.38 14.47 31.51 41.34 11.91 37.64 38.02 
4 6.94 16.89 26.05 6.05 17.94 26.03 4.47 23.17 24.50 
5 5.84 14.86 23.31 5.03 15.81 23.29 3.61 20.62 21.95 
6 5.53 14.17 22.26 4.76 15.08 22.06 3.40 19.69 20.99 
7 33.52 50.12 63.39 32.07 51.97 63.24 23.39 50.83 51.17 
8 29.84 44.83 56.93 28.25 46.51 56.81 20.98 45.97 46.38 
9 5.84 14.86 23.31 5.03 15.81 23.29 3.61 20.62 21.95 
10 5.53 14.17 22.26 4.76 15.08 22.06 3.40 19.69 20.99 
11 43.47 62.83 76.92 41.77 64.96 76.73 28.86 59.60 59.29 
In Table 3.7, the IECM still outperforms the other models for all transition times at
all nodes. The four circuit topologies show that the IECM is consistently more accurate
than the ECM for different driver transition times. The Improved Effective Capacitance
Metric (IECM) outperforms the ECM for both near-end and far-end nodes in all cases.
The IECM captures the resistive shielding of a circuit better than the ECM for the
following reasons:
• The IECM uses a ramp voltage source, which gives a more realistic view of the actual
signal. The ECM uses a unit-step source.
• The IECM uses the Scaled Elmore Delay value. The ECM uses the Elmore Delay
value.
• The IECM uses its own delay value from the previous node (IECM_p(j) + SE_j), which
is passed on to the next node. The ECM uses the Elmore Delay value at all nodes.
CHAPTER 4. DISCUSSION AND FUTURE WORK 
4.1 Discussion 
We introduced the Fitted Elmore Delay (FED) model in Chapter 2. We compared the
FED model with existing interconnect delay models such as the Scaled Elmore Delay model
and showed that the FED gives more accurate delay results. The FED model provides
excellent results for single-load circuits. Our model also provides good results on tree
topologies as long as there is no resistive shielding in the circuit. We introduced the
Improved Effective Capacitance Metric (IECM) in Chapter 3. The IECM overcomes the
resistive shielding problem. We compared the IECM with existing interconnect delay models
such as the Scaled Elmore Delay model and the ECM proposed by [19], and showed that the
IECM gives more accurate delay results.
The critical point now is to know when to use the FED and when to use the IECM. These
are the guidelines:
• For uniform-line circuits:
We must use the FED model regardless of the resistive shielding. The curve-fitting
technique can overcome the problem of resistive shielding for simple circuits such as
uniform-line circuits.
• For tree topologies: 
If the circuit suffers from resistive shielding, we have to use the IECM. Otherwise, 
we use the FED model. 
• If we have information about the driver transition time tr, we must use the IECM. 
There are two main contributions in this thesis. The first contribution is the Fitted 
Elmore Delay model. The FED model has several advantages over the previous interconnect 
delay models such as: 
• The FED is as efficient to compute as the Elmore Delay model. Other accurate
interconnect models are at least hundreds of times slower than our model.
• The FED is significantly more accurate than the Elmore Delay model. 
• The FED is written as a simple, explicit formula containing design parameters. This 
feature is very useful when designing interconnect optimization algorithms. 
• Most previous interconnect optimization algorithms based on the Elmore Delay 
model can use the FED model without much change. 
We believe that the FED model can be used in many applications, including, but not
limited to, the following:
• Interconnect planning 
• Interconnect optimization 
• Timing-driven placement 
The second contribution of this thesis is the Improved Effective Capacitance Metric 
(IECM). The IECM has several advantages such as: 
• The IECM overcomes the resistive shielding problem. 
• The IECM can be computed with similar simplicity as the Elmore Delay model. 
• The IECM is significantly more accurate than the Elmore Delay model and the
ECM proposed by [19].
• The IECM can be used in many applications such as interconnect planning and
timing-driven placement.
• The IECM is easy to implement, but not as easy as the FED model.
• The IECM takes the driver transition time t_r as an input parameter. The FED does not.
4.2 Future work 
We would like to find an alternative to the O'Brien and Savarino [9] methodology for
moment matching because of its inefficiency. We would like to extend the Improved
Effective Capacitance Metric (IECM) to interconnect optimization techniques. We would
also like to combine the FED and the IECM into one model. The combined model would be
applicable to any tree topology, whether or not it has a resistive shielding problem. We plan
to explore models to compute slew rates (driver transition times). Such models would make
it possible to consider slew rates in the optimization techniques and may result in more
accurate delay approximations. We plan to extend our models to incorporate inductive
parasitics, because of their importance in recent VLSI technologies. The simple RLC delay
model [25] can be used instead of the Elmore Delay model. We also plan to extend our
models to consider other optimization techniques, such as buffer insertion and sizing, and
simultaneous buffer insertion/sizing and wire sizing.
REFERENCES 
[1] Jason Cong and David Pan, "Interconnect Delay Estimation Models for Synthesis and 
Design planning", Asia and South Pacific Design Automation Conference, Jan. 1999, 
pp. 97-100. 
[2] Jan M. Rabaey, "Digital Integrated Circuits: A Design Perspective", 1st edition,
Prentice Hall, 1996.
[3] J. Qian, S. Pullela, and L. Pillage, "Modeling the "Effective Capacitance" for the RC
Interconnect of CMOS Gates", IEEE Transactions on Computer-Aided Design, 1994,
13(12), pp. 1526-1535.
[4] Y. Mo, "Post-Layout interconnect optimization algorithms", M.S. Thesis, Iowa State
University, Ames, August 2000.
[5] Naveed A. Sherwani, Algorithms For VLSI Physical Design Automation, 3rd edition,
Kluwer Academic Publishers, 1999.
[6] L. Nagel, "SPICE2, A computer program to simulate semiconductor circuits." TR
ERL-M520, UC-Berkeley, May 1975.
[7] L. Pileggi, "Timing Metrics for Physical Design of Deep Submicron Technologies", 
International Symposium on Physical Design, 1998, pp. 28-33. 
[8] L. T. Pillage and R. A. Rohrer, "Asymptotic Waveform Evaluation for Timing 
Analysis", IEEE Transactions on Computer-Aided Design, April 1990, 9(4), pp. 352-
366. 
[9] P. R. O'Brien and T. L. Savarino, "Modeling the Driving-Point Characteristic of
Resistive Interconnect for Accurate Delay Estimation", IEEE ICCAD, 1989,
pp. 512-515.
[10] B. Tutuianu, F. Dartu, and L. Pileggi, "An Explicit RC-Circuit Delay Approximation
Based on the First Three Moments of the Impulse Response", ACM/IEEE Design
Automation Conference, 1996, pp. 611-616.
[11] R. Kay and L. Pileggi, "Primo: Probability Interpretation of Moments for Delay 
Calculation", ACM/IEEE on Design Automation Conference, 1998, pp. 463-468. 
[12] C. Alpert, A. Devgan, and C. Kashyap, "A Two Moment RC Delay Metric for 
Performance Optimization", International Symposium on Physical Design, 2000, pp. 
69-74. 
[13] W. C. Elmore, "The transient response of damped linear network with particular 
regard to wideband amplifiers", J. Applied Physics 19, 1948, pp.1-94. 
[14] K.D. Boese, A. B. Kahng, B. A. McCoy, and G. Robins, "Fidelity and near-optimality 
of Elmore-based routing constructions", IEEE Transactions on Computer-Aided 
Design, 1993, pp. 81-84. 
[15] Jason Cong and Lei He, "Optimal Wiresizing for Interconnects with Multiple Sources", 
ACM Transactions on Design Automation of Electronic Systems, October 1996, 1(4), 
pp. 568-574 
[16] J. Cong, A. B. Kahng, C.-K. Koh, and C.-W. A. Tsao, "Bounded-Skew Clock and
Steiner Routing under Elmore Delay", IEEE/ACM International Conference on
Computer-Aided Design, 1995, pp. 66-71.
[17] R. Gupta, B. Krauter, B. Tutuianu, J. Willis, and L. T. Pillage, "The Elmore Delay As a
Bound for RC Trees with Generalized Input Signals", ACM/IEEE Design Automation
Conference, June 1993, pp. 364-369.
[18] Jason Cong and Kwok-Shing Leung, "Optimal Wire Sizing Under Elmore Delay 
Model", IEEE Transactions on Computer-Aided Design, 1995, 14(3), pp. 321-336. 
[19] C. Alpert, A. Devgan, and C. Kashyap, "An "Effective" Capacitance Based Delay 
Metric for RC Interconnect", ICCAD, 2000, pp. 229-234. 
[20] A.B. Kahng and S. Muddu, "Efficient Gate Delay Modeling for Large Interconnect 
Loads", IEEE Multi-Chip Module Conference, Feb. 1996, pp. 202-207. 
[21] Semiconductor Industry Association. The International Technology Roadmap for 
Semiconductor. 1999. http://www.itrc.net/ntrs/publntrs.nsf 
[22] David Watkins, "Fundamentals of Matrix Computations", John Wiley & Sons, New 
York, 1991 
[23] R. Lymon Ott, "An Introduction to Statistical Methods and Data Analysis", Duxbury, 
4th Edition, 1993. 
[24] Ravindra Khattree and Dayanand N. Naik, "Applied multivariate Statistics with SAS 
Software", SAS Institute, NC, 2nd Edition, 1999. 
[25] Y. I. Ismail, E. G. Friedman, and J. L. Neves, "Equivalent Elmore Delay for RLC 
Trees", IEEE Transaction on Computer-Aided Design, 1999, 19(1), pp. 83- 97. 
[26] D. V. Widder, "The Laplace Transform", Princeton University Press, Princeton, 11(1), 
1941. 
[27] Chung-Ping Chen, Yao-Ping Chen, and D.F Wong, "Optimal Wire-Sizing formula 
under the Elmore Delay Model", IEEE/ACM Design Automation Conference, 1996, 
pp. 487-490. 
[28] Chung-Ping Chen and D.F Wong, "Optimal Wire-Sizing Function with Fringing
Capacitance Consideration", IEEE/ACM Design Automation Conference, 1997, pp.
604-607.
ACKNOWLEDGMENTS 
I would first like to express my sincere appreciation and thanks to my major
professor, Dr. Chris Chu, for his guidance and tolerance during my stay at Iowa State
University. Dr. Chu has always encouraged and motivated me to be the best I can be.
I would also like to thank my committee for their helpful discussions. I would also like to 
thank Brian Nowak for his help in testing the FED model and his great friendship. 
I would like to thank my family, who have always been supportive of me during my
years at school. In particular, I wish to thank my brother Ehab Abou-Seido for his guidance
and support.
Finally, I would also like to thank my best friends, Ahmed Ismail, Akram Kamaschi, 
Elias Radi, Raed Adhami, Sana Akili, and Tala Khashram for their great friendship during 
my stay at Iowa. 
