Modeling and Analysis of Noise and Interconnects for On-Chip Communication Link Design by Tuuna, Sampo
TURUN YLIOPISTON JULKAISUJA
ANNALES UNIVERSITATIS TURKUENSIS
SARJA - SER. A I OSA - TOM. 428
ASTRONOMICA - CHEMICA - PHYSICA - MATHEMATICA
Modeling and Analysis of Noise and Interconnects



















Department of Microtechnology and Nanoscience




Department of Electrical and Computer Engineering
National Institute for Applied Sciences (INSA)
Toulouse, France
Opponent
Distinguished Professor Eby Friedman
Department of Electrical and Computer Engineering
University of Rochester




Painosalama Oy – Turku, Finland 2011
Abstract
This thesis considers modeling and analysis of noise and interconnects in on-
chip communication. Besides transistor count and speed, the capabilities of a
modern design are often limited by on-chip communication links. These links
typically consist of multiple interconnects that run parallel to each other for
long distances between functional or memory blocks. Due to the scaling of
technology, the interconnects have considerable electrical parasitics that affect
their performance, power dissipation and signal integrity. Furthermore, because
of electromagnetic coupling, the interconnects in the link need to be considered
as an interacting group instead of as isolated signal paths. There is a need for
accurate and computationally effective models in the early stages of the chip
design process to assess or optimize issues affecting these interconnects. For
this purpose, a set of analytical models is developed for on-chip data links in
this thesis.
First, a model is proposed for crosstalk and intersymbol interference.
The model takes into account the effects of inductance, initial states and
bit sequences. Intersymbol interference is shown to affect crosstalk voltage and
propagation delay depending on bus throughput and the amount of inductance.
Next, a model is proposed for the switching current of a coupled bus. The
model is combined with an existing model to evaluate power supply noise. The
model is then applied to reduce both functional crosstalk and power supply
noise caused by a bus as a trade-off with time. The proposed reduction method
is shown to be effective in reducing long-range crosstalk noise.
The effects of process variation on encoded signaling are then modeled. In
encoded signaling, the input signals to a bus are encoded using additional sig-
naling circuitry. The proposed model includes variation in both the signaling
circuitry and in the wires to calculate the total delay variation of a bus. The
model is applied to study level-encoded dual-rail and 1-of-4 signaling.
In addition to regular voltage-mode and encoded voltage-mode signaling,
current-mode signaling is a promising technique for global communication. A
model for energy dissipation in RLC current-mode signaling is proposed in the
thesis. The energy is derived separately for the driver, wire and receiver termi-
nation. The location where the energy is dissipated in current-mode signaling
is shown to vary as a function of wire width. All proposed models in the thesis
include inductive effects and they are verified with SPICE simulations.
Acknowledgements
This work was carried out at the Department of Information Technology, Uni-
versity of Turku. I want to express my sincere gratitude to the individuals and
institutions who have supported and enabled the research in this doctoral thesis.
First and foremost, I am grateful to my supervisor, Professor Jouni Isoaho, for
his continuous support and encouragement during the work leading to this the-
sis. I am also grateful to Professor Hannu Tenhunen for his support and many
helpful discussions. I also wish to thank Professor Kjell Jeppson and Professor
Etienne Sicard for their time and effort in reviewing this thesis.
I also appreciate the good company of the people at the Department of
Information Technology. Especially, I want to acknowledge all the current and
former fellow doctoral students with whom I shared many interesting discussions
relating to science and other topics as well.
I am grateful to everyone who has co-authored papers with me. In particular,
I wish to thank D. Sc. Ethiopia Nigussie for fruitful co-operation.
I gratefully acknowledge the funding from the Graduate School in Electron-
ics, Telecommunications and Automation (GETA) and the Academy of Finland
research project. In addition, I wish to express my gratitude to the Nokia






1.1 On-Chip Global Communication
1.2 Major On-Chip Noise Sources
1.3 Thesis Objectives
1.4 Thesis Organization and Contribution
2 On-Chip Interconnect Modeling Methods
2.1 Interconnect Approximations in Physical Design
2.2 Model Order Reduction
2.3 Decoupling Method
2.4 Driver Modeling
3 Modeling of Crosstalk and Intersymbol Interference
3.1 Noise Model Derivation
3.1.1 Determination of Maximum Noise and Propagation Delay
3.2 Model Verification
3.3 Case Study
3.3.1 Switching Patterns
3.3.2 Signal Phases and Rise Time
3.3.3 Intersymbol Interference
3.3.4 Implications of the Case Study
3.4 Chapter Summary
4 Modeling of Switching Current and Its Impact on Power Grid Noise
4.1 Modeling of Bus Switching Current
4.2 Modeling of Power Supply Network
4.3 Verification
4.4 Noise Reduction by Skewing
4.5 Chapter Summary
5 Reduction of Functional Crosstalk and Power Supply Noise
5.1 Reduction of Crosstalk and Power Supply Noise
5.2 Bus and Power Supply Noise Modeling
5.2.1 Crosstalk Noise and Delay under Bus Skewing
5.2.2 Power Supply Noise under Bus Skewing
5.3 Model Verification
5.4 Case Study and Implementation
5.4.1 Reduction of Inductive Crosstalk Noise
5.4.2 Reduction of Power Supply Noise using Different Methods
5.4.3 Implementation and Reduction of Crosstalk using Different Methods
5.4.4 Influence of the Number of Skewing Times
5.5 Chapter Summary
6 Modeling of Process Variation Effects in Encoded Signaling
6.1 Bus Model
6.2 Signaling Techniques
6.3 Verification and Case Study
6.4 Chapter Summary
7 Energy Modeling in RLC Current-Mode Signaling
7.1 Current-Mode Driving Point Impedance
7.2 Modeling of Energy Dissipation
7.3 Verification
7.3.1 Single-Ended Current-Mode Signaling
7.3.2 Differential Current-Mode Signaling
7.4 Case Study




A Coefficients for Equations in Chapter 3




Due to continuous advances in technology scaling, modern integrated circuits
consist of billions of transistors [1]. Traditionally, the operating speed of an
integrated circuit was assumed to be proportional to the speed of a logic gate.
The interconnects between the gates were considered as ideal conductors that
propagated signals instantaneously and had little effect on circuit operation.
Such approximations are however no longer adequate, since the physical di-
mensions of interconnects have been greatly reduced while the operating speeds
have increased. For example, in a modern 32 nm technology [2] the width and
thickness of local wires are measured in only tens of nanometers, while the
clock frequency is in the range of several GHz. Due to this scaling, the per-
formance of interconnects is increasingly affected by their electrical parasitics,
i.e. resistance, capacitance and inductance. These parasitics may result in long
propagation delays for signals traveling on interconnects or in signals that have
been distorted by noise. The transmission of such a signal requires charging or
discharging the wire capacitances which in turn consumes energy. This energy
dissipated in the interconnect structure is projected to grow dramatically due
to higher frequencies and increases in the number of metal layers [3]. For ex-
ample, in [4] over 50% of the dynamic power consumption of a microprocessor
was determined to be consumed by interconnects. In addition to transmitting
data signals, on-chip wires are also used to distribute an operating voltage and
the clock signal. The wires need to provide a constant operating voltage across
the chip despite the increasing switching speeds and device count. The design
of digital systems is further complicated by the fact that both wires and devices
also suffer from process variations, i.e. their manufactured properties differ from
the ideal designed values. Overall, due to these growing delay, signal integrity
and energy issues in interconnects, there has been a shift of focus from devices
to wires, or from computation to communication. This has resulted in a need
for novel design tools and models that can be used to analyze and optimize
on-chip interconnects.
Figure 1.1: Delay for local (Metal 1) and global wiring versus feature size [3].
1.1 On-Chip Global Communication
The interconnects in an integrated circuit can be loosely divided into local, in-
termediate and global interconnects depending on their length, size and metal
layer. An integrated circuit today often contains several large intellectual prop-
erty (IP) blocks, such as memory, processing elements and interfaces. These
IP blocks need to communicate with each other over long distances and they
are linked by wide global interconnects that span at least one block or at most
the length of the chip edge. While these global interconnects are routed in
the top metal layers, the lower metal layers in turn are used by local narrow
interconnects that connect neighboring gates.
The aforementioned scaling issues do not affect all interconnect types in an
equal manner, as illustrated in Fig. 1.1. Unlike gate delays which are reduced as
their dimensions become smaller, the delay of a fixed-length wire increases when
its dimensions are scaled [5]. For local wires this delay increase is alleviated by
the fact that their length is reduced with scaling since they need to connect
nearby gates whose sizes diminish with scaling. However, the length of global
wires is not scaled with technology since they may need to run across the chip.
This has resulted in a growing delay gap between gates and global interconnects.
Despite such efforts as increased aspect ratios, low-resistivity wire materials like
copper, and low-κ (permittivity) dielectric, global signaling often remains a
major bottleneck in modern integrated circuits.
Figure 1.2: An on-chip communication link consisting of multiple parallel wires
(drivers at z=0, a wire segment, and repeaters or receivers at z=h).
In order to provide a high bandwidth, on-chip communication links are
normally constructed of multiple wires. Among common communication
architectures are point-to-point links, buses and a network-on-chip (NoC) [6, 7, 8]. In
practice, buses are often implemented using techniques such as bus splitting [9]
to reduce the total wire load. NoC links on the other hand are typically modular
and structured interconnects running between routers. In addition, because of
delay and signal integrity issues interconnects are commonly broken into
segments with repeaters [10]. Therefore, at the physical level the communication
often reduces to multiple wires running in parallel. In this thesis, the focus is
on long, multiple parallel wires that typically form a part of a communication
link as depicted in Fig. 1.2.
A common way to implement a long on-chip communication link is by using
voltage-mode signaling with buffering. The delay of an RC interconnect in-
creases quadratically with length since both resistance and capacitance increase
linearly with wire length. The basic principle behind buffering is to reduce this
delay increase to linear by inserting repeaters along the wire. The total delay
then becomes equal to the number of wire segments multiplied by the individual
segment delay. In addition to delay reduction, buffering can be used to reduce
noise. In order to achieve the desired objective, the repeaters need to be both
spaced and sized appropriately.
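The quadratic-to-linear argument above can be sketched numerically. The following is a minimal illustration, not a model from this thesis: it assumes the commonly cited 0.38·R·C approximation for the 50% delay of a distributed RC wire and a fixed delay per repeater, and it ignores repeater input capacitance and output resistance. The function names and all parameter values are hypothetical.

```python
def unbuffered_delay(r_per_mm, c_per_mm, length_mm):
    """Approximate 50% delay of a distributed RC wire (0.38 * R * C, assumed)."""
    total_r = r_per_mm * length_mm
    total_c = c_per_mm * length_mm
    return 0.38 * total_r * total_c


def buffered_delay(r_per_mm, c_per_mm, length_mm, segments, repeater_delay):
    """Total delay = number of segments * (segment wire delay + repeater delay)."""
    segment_length = length_mm / segments
    segment_delay = unbuffered_delay(r_per_mm, c_per_mm, segment_length)
    return segments * (segment_delay + repeater_delay)


if __name__ == "__main__":
    r, c = 100.0, 0.2e-12  # ohm/mm and F/mm, illustrative values only
    for length in (2.0, 4.0, 8.0):
        plain = unbuffered_delay(r, c, length)
        # One repeater per millimetre keeps the segment length constant.
        buffered = buffered_delay(r, c, length, int(length), 5e-12)
        print(f"{length:.0f} mm: unbuffered {plain:.3e} s, buffered {buffered:.3e} s")
```

With a constant segment length, doubling the wire length doubles the buffered delay, whereas the unbuffered delay grows by a factor of four.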
In addition to the common voltage-mode signaling, other signaling techniques
for global on-chip communication have also been proposed. These include e.g.
encoded, current-mode, and differential signaling. The objective is typically to
enhance signaling speed, power dissipation, signal integrity or a combination of
these. Bus encoding uses additional bus wires and encoding and decoding logic
to alter the signals to be transmitted on a bus. The encoding is used to avoid
certain bit patterns that would result in high noise, delay or power. On the other
hand, in differential signaling, a signal is transmitted over a pair of wires where
the second wire is carrying the complement of the original signal. A differential
signal acts as its own receiver reference and offers improved noise immunity
by rejecting common mode noise. The signal swing is also effectively doubled,
thus increasing noise margins and improving speed as the rise and fall times
at the receiver are reduced [11]. In voltage-mode signaling, the interconnects
need to be fully charged to propagate a signal. This is avoided in current-mode
signaling, where the interconnects are terminated with a resistor. Because of
the resistive termination, there is a current flow that the receiver detects to
determine the transmitted logic value. It has been shown that for high data
rates current sensing can be very speed and power efficient in comparison to
voltage sensing [12]. In addition to the above-mentioned signaling techniques
addressed in this thesis, there are also emerging on-chip interconnect paradigms
such as carbon nanotubes [13, 14], optical [15] and RF communications [16].
These interconnects however have several issues that need to be resolved before
they can be used in on-chip communication, and they are not evaluated in this
thesis.
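The common-mode rejection and doubled swing of differential signaling described above can be illustrated with an idealized sketch. It assumes ideal wires and a receiver that simply subtracts the two wire voltages; the function name and the voltage values are hypothetical.

```python
def differential_receive(bit, swing, common_mode_noise):
    """Differential voltage at an ideal receiver.

    bit: transmitted logic value (0 or 1)
    swing: single-ended signal swing in volts
    common_mode_noise: noise voltage added equally to both wires
    """
    v_true = (swing if bit else 0.0) + common_mode_noise
    v_complement = (0.0 if bit else swing) + common_mode_noise
    return v_true - v_complement


if __name__ == "__main__":
    for noise in (0.0, 0.3):
        high = differential_receive(1, 1.0, noise)
        low = differential_receive(0, 1.0, noise)
        print(f"noise {noise} V: logic 1 -> {high:+.2f} V, logic 0 -> {low:+.2f} V")
```

The noise term cancels in the subtraction, and the difference between the two logic levels at the receiver is twice the single-ended swing.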
1.2 Major On-Chip Noise Sources
Noise can be defined as the deviation of a signal from its intended or ideal value.
In digital systems, most noise is generated by the system itself [11]. Crosstalk
is noise that is caused by one signal interfering with another signal. The in-
terference is caused by unwanted coupling between a wire and its neighbor. In
integrated circuits, this coupling can be both capacitive and inductive. Mutual
capacitance is the coupling of two or more conductors via an electric field be-
tween them, while mutual inductance is coupling by means of a magnetic field.
A change in the voltage on one wire will inject a current onto another coupled
wire. The affected wire is often referred to as the victim, while the switching
wire is referred to as an aggressor. The induced current for capacitive and in-
ductive coupling is proportional to the rate of change of the aggressor voltage.
This effect increases the importance of crosstalk as the operating speed of cir-
cuits increases. Additionally, the scaling of technology increases the significance
of coupling capacitance since wire height is not scaled as much as the distance
between the wires. The adverse effects of crosstalk include increased delay and
delay variation, voltage peaks on quiet wires and increased energy dissipation
due to coupling capacitance. Signals may also be affected by intersymbol inter-
ference. Unlike crosstalk, where the source of interference is a signal traveling on
another wire, intersymbol interference is caused by successive signals. A signal
can be distorted if the wire does not reach steady state between transitions.
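Intersymbol interference can be illustrated with a deliberately simple sketch. It assumes a single-pole (RC) wire response with time constant tau, which is far simpler than the RLC models developed in this thesis; the function name and parameter values are hypothetical.

```python
import math


def end_of_bit_voltages(bits, bit_period, tau, vdd=1.0, v_start=0.0):
    """Wire voltage at the end of each bit for a single-pole (RC) wire model."""
    decay = math.exp(-bit_period / tau)
    voltages = []
    v = v_start
    for bit in bits:
        target = vdd if bit else 0.0
        # Exponential settling from the previous voltage toward the new target.
        v = target + (v - target) * decay
        voltages.append(v)
    return voltages


if __name__ == "__main__":
    for period in (1.0, 10.0):  # bit period in units of tau
        after_zeros = end_of_bit_voltages([0, 0, 1], period, 1.0)[-1]
        after_ones = end_of_bit_voltages([1, 1, 1], period, 1.0)[-1]
        print(f"T = {period} tau: {after_zeros:.3f} V vs {after_ones:.3f} V")
```

When the bit period is comparable to tau, the voltage reached at the end of the last bit depends on the preceding bits. When the bit period is much longer than tau, the wire reaches steady state between transitions and the history has almost no effect.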
Another major issue in nanoscale integrated circuits is process variation.
Deviations in wire and device parameters affect issues such as timing, signal
integrity and performance [17, 18, 19]. For future technology nodes, a similar or
larger amount of process variation is expected. In addition to the amount
of variation, the sensitivity of transistor performance to process variations
becomes more significant in the nanometer regime [20]. Traditionally, process
variation has been taken into account by using corner-based analysis with
best-case, nominal, and worst-case parameters. The design is required to meet
the specifications at all process corners. While this type of corner analysis has
been successfully employed to model variations between dies (i.e. in inter-die
variation), it is not able to accurately model variations within a single die (i.e. in
intra-die variation). Using a worst-case analysis for within-die variations leads
to very pessimistic analysis results since it assumes that all devices on a die have
worst-case characteristics, ignoring their inherent statistical variation [21]. To
overcome these limitations, statistical methods have been proposed for parasitic
extraction, static timing and signal integrity analysis.
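The pessimism of worst-case analysis for within-die variation can be sketched for a path of identical stages. The sketch assumes that stage delays are independent and Gaussian with the same mean and standard deviation; real within-die variation is partly correlated, so this is an illustration only. The function names and numbers are hypothetical.

```python
import math


def worst_case_delay(stages, mean, sigma, k=3.0):
    """Corner-style bound: every stage sits at its k-sigma slow corner."""
    return stages * (mean + k * sigma)


def statistical_delay(stages, mean, sigma, k=3.0):
    """k-sigma path delay when the stage delays vary independently.

    The variances add, so the path sigma grows with sqrt(stages).
    """
    return stages * mean + k * sigma * math.sqrt(stages)


if __name__ == "__main__":
    n, mean_ps, sigma_ps = 16, 10.0, 1.0
    print("worst case :", worst_case_delay(n, mean_ps, sigma_ps), "ps")
    print("statistical:", statistical_delay(n, mean_ps, sigma_ps), "ps")
```

Under these assumptions the margin added by the worst-case bound grows linearly with the number of stages, while the statistical margin grows only with its square root.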
A stable operating voltage is required by on-chip devices. Delivering and
maintaining this voltage has however become increasingly difficult as the num-
ber of devices and their operating speeds have increased. The two major com-
ponents of on-chip power supply noise are RI and Ldi/dt noise. The RI drop
is caused by voltage losses due to the resistive component of the power distri-
bution network, while Ldi/dt noise is caused by rapid current changes and the
inductive component of the power distribution network. The amount of power
supply noise is determined by the properties of the power distribution network,
such as the amount of bypass capacitance and the size of power wires, and by
the properties of the load current, such as its magnitude and shape.
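The two noise components can be estimated with a lumped sketch in which the power distribution network is reduced to one series resistance and one series inductance. This reduction, the function name and the numeric values are assumptions made for illustration and do not come from the thesis.

```python
def supply_noise(resistance, inductance, current, di_dt):
    """Return (RI drop, L di/dt noise, their sum) in volts for a lumped supply path."""
    ri_drop = resistance * current
    ldidt_noise = inductance * di_dt
    return ri_drop, ldidt_noise, ri_drop + ldidt_noise


if __name__ == "__main__":
    # Illustrative values: 0.1 ohm, 10 pH, 0.5 A drawn with a 100 ps ramp.
    peak_current = 0.5
    ramp_time = 100e-12
    ri, ldidt, total = supply_noise(0.1, 10e-12, peak_current, peak_current / ramp_time)
    print(f"RI = {ri:.3f} V, L di/dt = {ldidt:.3f} V, total = {total:.3f} V")
```

The sketch shows why both the magnitude and the shape of the load current matter: the RI term depends on the current level, while the Ldi/dt term depends on how quickly the current changes.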
1.3 Thesis Objectives
The aim of this thesis is to analyze and provide analytical models for on-chip
communication links for use in the early stages of a design flow. While SPICE-
like circuit simulators offer excellent accuracy, their computational cost is too
high to be used in automated design tools for today’s complex integrated cir-
cuits. Model order reduction methods [22, 23] provide a speed improvement
over circuit simulators at the cost of a slightly reduced accuracy, but their com-
putational cost is still high for the iterative optimization loops during physical
design. During the physical design stages such as floorplanning and global rout-
ing, interconnect area, delay, power and noise need to be quickly estimated
and optimized. For this task, another class of analytical models has been
proposed [24] that provides a high simulation speed at the cost of somewhat
reduced accuracy or other limitations to its applicability. Such analytical
models need to be developed for multiple issues affecting interconnects, such as
crosstalk noise, process variation, and energy dissipation. Further, the models
need to be developed for communication links consisting of multiple parallel
wires, such as the links between routers in NoCs. Because of electromagnetic
coupling, the models also need to consider the interconnects in the link as an in-
teracting group instead of as isolated signal paths. In addition, the models need
to take into account alternative signaling techniques such as encoded or current-
mode signaling that are promising approaches to global communication links.
Inclusion of inductance in the models is also desirable since wires in the upper
metal layers are wide and they can exhibit significant inductive effects [25]. All
such novel models have to be verified by a comparison to a circuit simulator.
These analytical models can also be applied to case studies to rapidly evaluate
the influence of different design or circuit parameters. Furthermore, reduction
or optimization of issues such as noise is an application for a developed model.
1.4 Thesis Organization and Contribution
This thesis addresses signaling in long on-chip interconnects. More specifically,
the focus is on multiple parallel coupled interconnects that are commonly a
part of a communication link. This structure is referred to as a bus in this thesis
and several issues relating to it are addressed as described below. The models
developed in this thesis are intended for the early stages of a design flow, where
high speed and analytical equations are preferred in order to use the models e.g.
in iterative optimization loops. All models in the thesis are derived for RLC
signaling. The thesis is organized into eight chapters.
In Chapter 2, an overview of different on-chip interconnect modeling methods
is provided. Emphasis is given to the decoupling method that is the approach
used in this thesis.
In Chapter 3, an analytical model that for the first time evaluates both
crosstalk and intersymbol interference in buses is proposed. The model takes
into account aspects that have not been included in a single previous analytical
model such as inductive coupling, phases, initial states and bit sequences. The
model is then verified and applied to study crosstalk and intersymbol interference
in a bus under different switching patterns and operating speeds. Intersymbol
interference is shown to affect crosstalk voltage and propagation delay depending
on bus throughput and the amount of inductance.
In Chapter 4, a model for the switching current of a coupled on-chip bus is
proposed. While models for the simultaneous switching noise of a gate with a
simple capacitive load have previously been presented, the proposed model
includes the effects of long coupled interconnects. This coupling is shown to
affect the switching current. The influence of skewed inputs is included and the
model is combined with an existing power grid model to evaluate induced power
supply noise in different locations of the power grid.
In Chapter 5, intentional skewing of bus inputs is used to reduce functional
crosstalk noise and power supply noise. Unlike in previously existing methods,
the reduction is achieved as a trade-off with time or timing slack. Models
proposed in the previous chapters are used to demonstrate that the method can be
used to reduce long-range inductive crosstalk. The skewing method is implemented
and compared to other crosstalk and power supply reduction methods.
In Chapter 6, a model for analyzing the effects of process variation on an
encoded bus is proposed. The proposed model includes variation in both the
signaling circuitry and in the wires to calculate the total delay variation of a
bus. Characterization of encoding circuitry is used together with analytical
interconnect modeling to rapidly analyze level-encoded dual-rail (LEDR) and
1-of-4 signaling. Wire width variation is demonstrated with the model to affect
LEDR signaling more than 1-of-4 or regular signaling.
In Chapter 7, a model for energy dissipation in RLC current-mode signaling is
proposed. A realizable driving point Π model is presented for an RLC
current-mode transmission line. The energy dissipation is derived separately for
the driver, wire and receiver termination. The model is applied to differential
current-mode signaling. The location where energy is dissipated in current-mode
signaling is shown to depend on wire width.
Finally,





In general, an interconnect behaves as a waveguide that can be analyzed using
Maxwell’s equations. An interconnect can also be analyzed using transmission
line equations, if it is assumed that the waves on the line propagate in the trans-
verse electromagnetic (TEM) mode [26, 27]. In TEM mode, both electric field
and magnetic field vectors lie in a plane perpendicular to the axis of propagation
as shown in Fig. 2.1.
Conductors that have electrically large cross-sectional dimensions support
other modes of propagation in addition to the TEM mode [28]. Also, if the
conductor is lossy, i.e. it has a non-zero resistance, the assumption of solely
TEM mode is invalidated since the current flowing through the conductor
creates an electric field in the direction of propagation. However, if the conductor
losses are small, lossy transmission lines are still assumed to represent the situ-
ation. An inhomogeneous surrounding medium also invalidates the TEM mode
assumption, because a TEM field structure must have only one velocity of wave
propagation. Transmission lines can nonetheless be used assuming that the
velocities are not substantially different. The usage of transmission lines to
represent lossy conductors and/or conductors having an electrically large cross-
section and/or inhomogeneous surrounding medium is generally referred to as
the quasi-TEM assumption. In addition to transmission lines, simpler lumped
segments can also be used to represent interconnects, although the accuracy may
be reduced depending on wire length and signal frequency. A complete solution
that does not assume a TEM mode can be obtained with so-called full-wave so-
lutions of Maxwell’s equations [29]. These solutions generally require numerical
methods that are very time-consuming and therefore impractical in the design of
integrated circuits. Interconnect models based on transmission lines or lumped
RC or RLC segments are thus normally used in on-chip interconnect analysis.
In the following, an overview of different approaches to on-chip interconnect







Figure 2.1: Electromagnetic field structure of TEM propagation. Electric field
E and magnetic field intensity H lie in a plane perpendicular to the axis of
propagation z.
ing the physical design are presented. Second, model-order reduction methods
are reviewed. Third, an overview of the decoupling method that is the modeling
approach taken in this thesis is presented. Finally, the characterization and
modeling of drivers is also discussed.
2.1 Interconnect Approximations in Physical Design
Accurate interconnect modeling is an issue in post-layout verification, where
high accuracy is needed. However, in deep-submicron design there is also a
need for another class of interconnect modeling tools for the early stages of
the design flow, where high simulation speed is needed for design optimization.
During the physical design, interconnect area, delay, power and noise are esti-
mated and optimized as a trade-off between different design parameters. For
example, in floorplanning the major functional blocks of a chip are tentatively
placed using criteria such as chip area and interconnect length. Optimizations
during this process include the insertion of buffers into interconnects to reduce
delay and crosstalk noise [30, 31], and reduction of peak temperatures due to
interconnects [32].
Since there is little physical information available during floorplanning, the
optimization possibilities are limited, and the optimization therefore continues
in other design stages when more physical information is available. For example,
after routing, the routes, layers and relative positions of nets are known. Dur-
ing global routing, where the approximate path of each net is planned, crosstalk
noise reduction can be achieved with shield insertion and buffering [33].
Interconnect process variation such as dishing and erosion can also be reduced
during global routing [34]. After routing, techniques such as buffer insertion
or wire perturbation are not desirable since they may require rerouting. In-
stead, techniques such as gate sizing can be applied to reduce crosstalk [35]
and interconnect delay [36]. Spacing of the wires can also be applied to reduce
interconnect power and delay [37].
In order to achieve a high simulation speed in the iterative optimization
loops during these design stages, analytical interconnect models are required.
These models are often derived by employing different approximations to an in-
terconnect topology in order to obtain a simpler circuit that is then analytically
modeled. For example, in [38], two coupled RC interconnects are reduced to a
lumped two-node circuit that is then analyzed for crosstalk noise. The coupling
is included with a single capacitor. In [24], RC interconnects are reduced to
a six-node lumped template circuit instead. In addition to the wire structure,
the input signals to the wires can also be approximated. Buffering and shielding
are performed in [33] based on a crosstalk metric that approximates the input
signal as an infinite ramp, while in [34] a delay based on an RC step response is
used for global routing. Besides lumped circuits, transmission line structures
can also be used as in [39] where two parallel RLC transmission lines are used
to calculate crosstalk noise.
2.2 Model Order Reduction
Model order reduction algorithms provide a speed improvement over circuit
simulators while preserving good accuracy, and they are useful for post-layout
verification where accuracy is a key requirement [24]. In this section, the fun-
damentals of model order reduction of interconnects are reviewed.
Any interconnect structure consisting of resistors, capacitors and inductors
is a linear time-invariant (LTI) system. Continuous-time, LTI systems are often
described using a state-space realization. The state-space model of a multi-input
multi-output system is
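$$\frac{d\mathbf{x}(t)}{dt} = \mathbf{A}\mathbf{x}(t) + \mathbf{B}\mathbf{u}(t), \qquad \mathbf{y}(t) = \mathbf{C}\mathbf{x}(t) + \mathbf{D}\mathbf{u}(t) \tag{2.1}$$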
where x is the state vector, u is the input vector, y is the output vector and A,
B, C and D are matrices.
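As a minimal sketch of how a state-space model is used, the following code integrates (2.1) for a single RC node driven by a unit step, where all matrices reduce to scalars: A = -1/(RC), B = 1/(RC), C = 1 and D = 0. The forward-Euler integration scheme, the function name and the component values are illustrative assumptions.

```python
import math


def simulate_rc_step(r, c, t_end, steps):
    """Forward-Euler integration of dx/dt = A x + B u for one RC node.

    The state x is the capacitor voltage and the input u is a unit step.
    """
    a = -1.0 / (r * c)  # scalar A
    b = 1.0 / (r * c)   # scalar B
    dt = t_end / steps
    x = 0.0
    for _ in range(steps):
        x += dt * (a * x + b * 1.0)  # u(t) = 1 for t >= 0
    return x  # output y = C x + D u with C = 1 and D = 0


if __name__ == "__main__":
    r, c = 1e3, 1e-12  # 1 kohm and 1 pF, so RC = 1 ns
    numeric = simulate_rc_step(r, c, 1e-9, 10000)
    exact = 1.0 - math.exp(-1.0)
    print(f"numeric {numeric:.5f}, exact {exact:.5f}")
```

After one time constant the numerical result approaches the exact step response 1 - exp(-1), with an error that depends on the step size.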
In linear circuit simulation Modified Nodal Analysis (MNA) [40] is widely
used to form the circuit equations in the form of [41]
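$$\mathbf{C}\frac{d\mathbf{x}(t)}{dt} = -\mathbf{G}\mathbf{x}(t) + \mathbf{B}\mathbf{u}(t), \qquad \mathbf{y}(t) = \mathbf{L}^{T}\mathbf{x}(t) \tag{2.2}$$
where G and C represent conductance and energy storage matrices, respectively,
vector x includes MNA variables, and B and L are mapping matrices. Taking
the Laplace transform of (2.2) with zero initial conditions gives the transfer
function matrix
$$\mathbf{H}(s) = \mathbf{L}^{T}(\mathbf{G} + s\mathbf{C})^{-1}\mathbf{B}. \tag{2.3}$$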
The entries of H(s) can be shown to be in the form of rational polynomials of s
$$H_{ij}(s) = \frac{b_{ij1} + b_{ij2}s + \ldots + b_{ijm}s^{m}}{1 + a_{1}s + \ldots + a_{n}s^{n}}. \tag{2.4}$$
For many interconnect circuits, the number of poles of H(s) can be very
large. Some of these poles have an insignificant contribution to circuit perfor-
mance, and the circuit can be adequately described with a group of dominant
poles. In model order reduction, the objective is to reduce the complexity with
a lower-order state-space system while preserving, or approximating the original
input-output behavior. Asymptotic Waveform Evaluation (AWE) [22] approx-
imates the behavior of a linear circuit by generating moments, or Taylor series
coefficients of H(s), and matching them to form a lower order transfer func-
tion. To overcome the numerical problems of AWE, several other model order
reduction methods have been presented, such as Complex Frequency Hopping
(CFH) [42], Padé-via-Lanczos (PVL) [43], PRIMA [23], Truncated Balanced
Realization (TBR) [44] and parameterized model order reduction [45].
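As a minimal illustration of moment generation, the following Python sketch
computes the first two moments of H(s) = L^T (G + sC)^{-1} B for a two-section
RC ladder. The circuit, its element values and the function names are
assumptions made for this illustration only and are not taken from the cited
works. Expanding (G + sC)^{-1} around s = 0 gives the moments
m_k = L^T (-G^{-1} C)^k G^{-1} B, so m_0 is the dc gain and -m_1 is the Elmore
delay.

```python
# Moments of H(s) = L^T (G + sC)^{-1} B for a two-section RC ladder.
# Hypothetical circuit: source -> R1 -> node 1 (C1) -> R2 -> node 2 (C2).
# The voltage source and R1 are replaced by their Norton equivalent, so the
# unknowns are the two node voltages.

def solve2(a, b):
    """Solve the 2x2 linear system a x = b with Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [(b[0] * a[1][1] - a[0][1] * b[1]) / det,
            (a[0][0] * b[1] - a[1][0] * b[0]) / det]


def moments(r1, r2, c1, c2, count=2):
    """Return the first `count` moments of the node-2 voltage transfer function."""
    g = [[1.0 / r1 + 1.0 / r2, -1.0 / r2],
         [-1.0 / r2, 1.0 / r2]]      # conductance matrix G
    c = [c1, c2]                     # diagonal of the capacitance matrix C
    b = [1.0 / r1, 0.0]              # input mapping B
    sel = [0.0, 1.0]                 # output mapping L (selects node 2)
    vec = solve2(g, b)               # G^{-1} B
    result = []
    for _ in range(count):
        result.append(sel[0] * vec[0] + sel[1] * vec[1])
        # Next vector in the recursion: -G^{-1} C vec
        vec = solve2(g, [-c[0] * vec[0], -c[1] * vec[1]])
    return result


if __name__ == "__main__":
    m0, m1 = moments(r1=100.0, r2=100.0, c1=50e-15, c2=50e-15)
    print(f"dc gain m0 = {m0:.3f}, Elmore delay -m1 = {-m1 * 1e12:.2f} ps")
```

Each additional moment costs one more solve with the same matrix G, which is
the repeated dc analysis mentioned later in this section.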
Model order reduction has become an established part of interconnect anal-
ysis. It has been used e.g. for power grid verification [46], interconnect power
consumption [47] and static timing analysis [48]. However, model order reduction
also has some drawbacks. Its efficiency decreases as the number of circuit
input-output terminals increases [49, 50], which complicates
the analysis of e.g. power distribution networks and large data buses.
Many model order reduction algorithms are not adapted to handle more than
a few tens of terminals [51]. Numerical problems also remain an issue in many
model order reduction methods [52]. Furthermore, the generation of moments
requires successive analyses of an equivalent dc circuit of an interconnect tree
or the application of MNA.
2.3 Decoupling Method
The electrical properties of multiple coupled parallel wires, as in a bus, are often
represented in a concise form using transmission line matrices. The capacitance


















where Cnn is the total capacitance seen by the line n and Cmn is the coupling

















where Lnn is the self-inductance for line n and Lmn is the mutual inductance
between lines m and n. Unlike in the capacitance matrix, Lnn is not the sum

















where Rmm define the resistive losses of each conductor and Rmn are due to the
current return path. For a return path with a large area, Rmn will be close to
zero. For high frequencies, the return current will flow near the signal line to
reduce the impedance of the loop and the non-diagonal terms of the resistance
matrix will be non-zero [53].

































where Vi(z, t) is the voltage at point z of the ith transmission line and Ii(z, t)
is the current through parallel elements.
The transmission line equations for n lines are
∂








The problem of solving multiple coupled lines can be reduced to solving a num-
ber of equations for isolated lines by using a matrix transformation. The cou-
pling represented by non-diagonal elements can be eliminated by diagonalizing
the transmission line matrices. This can be achieved by using a congruence
transformation [54] or by using a similarity transformation with improved effi-
ciency and numerical stability [55]. The similarity transformation of matrix A
to matrix Λ is
\[ \mathbf{M}^{-1}\mathbf{A}\mathbf{M} = \mathbf{\Lambda}. \tag{2.8} \]
If the n× n matrix A has n linearly independent eigenvectors, Λ is a diagonal
matrix whose entries are the eigenvalues of A [56]. M is a square matrix whose
n columns are the eigenvectors of A.
In the decoupling method the mode voltages and currents Vm and Im are
defined as
\[ \mathbf{V}(z,t) = \mathbf{M}_{V}\mathbf{V}_{m}(z,t) \tag{2.9} \]
and
\[ \mathbf{I}(z,t) = \mathbf{M}_{I}\mathbf{I}_{m}(z,t) \tag{2.10} \]
where MV and MI are n×n transformation matrices. L and C are diagonaliz-
able since they are real, symmetric and positive definite [26]. The diagonalized
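inductance and capacitance matrices L̂ and Ĉ are
\[ \hat{\mathbf{L}} = \mathbf{M}_{V}^{-1}\mathbf{L}\mathbf{M}_{I} \tag{2.11} \]
\[ \hat{\mathbf{C}} = \mathbf{M}_{I}^{-1}\mathbf{C}\mathbf{M}_{V}. \tag{2.12} \]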
Fig. 2.2 shows a typical cross-section of a microprocessor with wire and
dielectric layers. Of the global layers, the topmost one is normally used for power
distribution, while the others are used for global signaling. The dielectric
surrounding these wires has the same relative permittivity throughout, except
for the vias and the thin etch stop and dielectric capping layers. The
surrounding dielectric is therefore
approximated as homogeneous, in which case [26]
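\[ \mathbf{M}_{I} = \mathbf{M} \tag{2.13} \]
\[ \mathbf{M}_{V} = \mathbf{M} \tag{2.14} \]
\[ \mathbf{M}^{-1} = \mathbf{M}^{T} \tag{2.15} \]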
where T denotes a transpose. If the relative permittivity is not constant, as
in the case of an inhomogeneous embedded low-κ dielectric, an effective
relative permittivity can be used instead. The effective relative permittivity is
determined so that if the inhomogeneous surrounding medium were replaced
by a homogeneous medium having the effective relative permittivity, none of the
properties of the line would be changed [57]. The diagonalized inductance and
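capacitance matrices are then
\[ \hat{\mathbf{L}} = \mathbf{M}^{T}\mathbf{L}\mathbf{M} \tag{2.16} \]
\[ \hat{\mathbf{C}} = \mathbf{M}^{T}\mathbf{C}\mathbf{M}. \tag{2.17} \]
Figure 2.2: Typical cross-section of a microprocessor [3].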
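Assuming lines with the same source and load impedance and the same
per-unit-length resistance, the boundary conditions can be included as [58]
\[ \hat{R}_{S} = R_{S} \tag{2.18} \]
\[ \hat{C}_{L} = C_{L} \tag{2.19} \]
\[ \hat{\mathbf{V}}_{S} = \mathbf{M}\mathbf{V}_{S} \tag{2.20} \]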
where RS is the driver source resistance, CL is the receiver load capacitance and
VS is the driver voltage source. The resistance matrix is assumed to be diagonal
in the first place, i.e.
\[ \hat{\mathbf{R}} = \mathbf{R}. \tag{2.21} \]
The calculated responses of the decoupled lines can then be combined into the
response of the coupled system using (2.13) and (2.14).
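The decoupling steps can be sketched for the simplest case of two identical
coupled lines, where the eigenvectors are known in closed form (even and odd
modes). The sketch below is illustrative only: the per-unit-length values, the
sign convention of the capacitance matrix (negative off-diagonal coupling
terms) and all function names are assumptions and not data from this chapter.
For two identical lines the inductance and capacitance matrices share their
eigenvectors regardless of the dielectric, so a single orthogonal matrix M
diagonalizes both as in (2.16) and (2.17).

```python
import math


def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    """Return the transpose of a nested-list matrix."""
    return [list(row) for row in zip(*a)]


def decouple(mat, m):
    """Return M^T * mat * M, cf. (2.16) and (2.17)."""
    return matmul(matmul(transpose(m), mat), m)


# Assumed per-unit-length parameters of two identical lines.
CS, CC = 50e-15, 70e-15   # capacitance to ground and coupling capacitance [F/mm]
LS, LM = 5.0e-9, 3.0e-9   # self and mutual inductance [H/mm]

C = [[CS + CC, -CC], [-CC, CS + CC]]   # diagonal: total capacitance of a line
L = [[LS, LM], [LM, LS]]

# Orthonormal eigenvectors shared by both matrices:
# even mode (1, 1) and odd mode (1, -1).
K = 1.0 / math.sqrt(2.0)
M = [[K, K], [K, -K]]

C_hat = decouple(C, M)
L_hat = decouple(L, M)

if __name__ == "__main__":
    print("even mode: C =", C_hat[0][0], "L =", L_hat[0][0])
    print("odd mode:  C =", C_hat[1][1], "L =", L_hat[1][1])
```

The even mode sees only the capacitance to ground and the sum of self and
mutual inductance, while the odd mode sees the ground capacitance plus twice
the coupling capacitance and the difference of the inductances. Each mode can
then be solved as an isolated line.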
The decoupling method, or modal analysis, has been used in different variations
in recent years for on-chip interconnect analysis, e.g. for analysis of periodic
signals [59], modeling of multi-walled carbon nanotubes [60] and signal integrity
verification of inductively dominated lines [61]. The method is chosen as the
modeling approach in this thesis since it is applicable to more wires and
more complex interconnect topologies than straightforward template circuits.
There is also a trend towards network-on-chip architectures with structured
parallel links between routers that are suitable for modeling with the decoupling
method. The method also enables analytical models and requires no moment
generation and is therefore suitable for analysis early in the design flow.
2.4 Driver Modeling
In addition to modeling the wires themselves, it is necessary to include the
influence of the gates driving the wires. Common approaches to including these
nonlinear devices in the interconnect models are described in this section.
The first gate models characterized the gate simply as a fixed delay. Later
single-parameter models obtained this delay based on the capacitive load driven
by the gate. Then, because of technology scaling, it became necessary to record
also the rise time or slew of the gate output in addition to the gate delay. This
data was needed to accurately model gate properties since the rise time of a
gate input signal and the gate load determine the gate delay and output rise
time. These two-parameter gate models represent the gate delay and output
rise time as a function of input rise time and output load capacitance. The
characterization of a gate where these parameters are collected is performed
using a circuit simulator such as SPICE. The characterization data can be stored
in a look-up table or more compactly in the form of k-factor equations fitted to
the characterization data [62].
The concept of effective capacitance was introduced in [63] since a driver
sees only a part of the total interconnect capacitance due to resistive shielding.
The connection between the driver and the interconnect is obtained iteratively
by calculating the effective capacitance and then applying the two-parameter
model to obtain the gate delay and output slew. The effective capacitance is
calculated by equating the mean current into an interconnect with the mean
current into a single capacitor.
To capture the combined effect of a gate and an interconnect, switch-resistor
models can also be used. The switch-resistor model consists of a voltage source
and a linear resistor that represent a gate. The connection between the driver
and the interconnect is easily modeled by including the driver resistance in the
interconnect RLC circuit. Although this facilitates the analysis of the combined
gate and interconnect, the accuracy is limited by the need to map a gate on-
resistance to a linear resistor.
In order to improve model accuracy, extensions of the two-parameter models
have been proposed. A two-ramp model based on two effective capacitances
was proposed in [64], while in [65] the effective capacitance was matched for
both 50% delay and 80% transition time. Also, models that store the gate
output waveform have been adopted. For example, the effective current source
model (ECSM) by Cadence stores the driving point voltage waveforms of each
input slew and load capacitance combination, while the composite current source
(CCS) model by Synopsys stores the characterization data as currents.
Recently, current-source models (CSM) have been proposed [66, 67, 68] to
further improve accuracy by addressing issues such as complex interconnect
loads and non-linear input waveforms. These models are based on a nonlinear
voltage-controlled current source that approximates the current drawn by a
gate for a certain value of input voltage, time, output voltage, etc. [69]. A
CSM is a major departure from the previous models, since to determine delay
and slew (or voltage response) a circuit simulation must be performed. Instead
of propagating only the delay and slew, CSM propagates the whole voltage
waveform. The high accuracy of CSMs makes them attractive for employment




Modeling of Crosstalk and
Intersymbol Interference
A major source of on-chip noise is crosstalk, which is caused by capacitive and
inductive coupling between wires. Crosstalk noise avoidance is especially im-
portant for on-chip buses, since in buses several interconnects run parallel to
each other for long distances. Signals traveling on buses may also corrupt later
ones if the bus does not reach steady state between signals. This intersymbol
interference is caused by stored energy in reflections, circuit ringing and charge
storage [11]. Ringing and temporary charge storage in interconnects are prob-
lematic in buses, where interconnects are wide resulting in a large capacitive load
and inductive noise. The amount of crosstalk noise and intersymbol interference
depends not only on the electrical properties of interconnects, but also on the
signal transitions. Capacitive coupling causes an increase in propagation delay
when coupled interconnects are switching in the opposite direction, and a de-
crease when they are switching in the same direction. Also the induced crosstalk
voltage on a quiet interconnect depends on the transition activity of neighbor-
ing interconnects. Intersymbol interference, on the other hand, is dependent on
successive transitions. Certain bit sequences can cause an interconnect not to
reach steady state. Signals may also arrive at different phases due to unbalanced
signal paths, or because of deliberate timing intervals to reduce crosstalk noise
or peak current draw [71]. The relative input timing of aggressor and victim
nets strongly influences the coupling noise on a victim net [72, 73].
Over the last decade several analytical models have been proposed for the
estimation of crosstalk noise in coupled interconnects. In [74] a model for up to
five coupled lines has been presented. However, the model is based on a single
L-segment and does not consider inductance. In [75, 76] a Π-model is used,
but inductance is neglected. A model for a bus consisting of distributed RC
lines has been suggested in [77], but signal rise times and inductance are not
included. It has also been assumed that every other wire in the bus carries the
same signal. A propagation delay model for a bus has been presented in [78],
but the model does not consider inductance and is also based on switch factor
analysis. In [79] a model for two distributed RLC wires has been presented,
but inductive coupling has been ignored. It has been shown that the effects of
inductive coupling can be significant for long interconnects [80]. The accuracy of
RC models in crosstalk evaluation is also no longer sufficient for deep sub-micron
circuits [81].
Figure 3.1: Two coupled 2 mm long interconnects with initial states. Curves:
input to the 1st wire, output of the 1st wire, input to the 2nd wire, output of
the 2nd wire.
None of the mentioned models take into account both crosstalk and inter-
symbol interference. In this chapter, an analytical RLC Π-model for crosstalk
and intersymbol interference is proposed [82]. The model includes different
phases, signal rise times, initial conditions and bit sequences. The model also
considers both capacitive and inductive coupling between interconnects.
3.1 Noise Model Derivation
The input and output voltages of two coupled interconnects are represented in
Fig. 3.1 to demonstrate the impact of intersymbol interference. Initially both
wires are at quiescent state. In this case the 50% propagation delay of the first
wire is 112 ps. However, when the wire switches up the second time, it has not
reached steady state, and the delay is increased to 120 ps. Additionally, in the
first upwards transition, the crosstalk noise peak on the other line is 202 mV.
Because of the initial state, the second noise peak, at about 2.2 ns, reaches
222 mV. The relative increases are 7.1% and 9.9%, respectively.
For any lumped linear time-invariant (LTI) circuit the output can be written as
\[
Y(s) = \sum_{m=1}^{M} H_{em}(s)X_{m}(s) + \sum_{n=1}^{N} H_{in}(s)\frac{\lambda_{n}(0)}{s} \tag{3.1}
\]
where Y (s) is the Laplace transform of the circuit output, Xm(s) is the Laplace
transform of the mth independent external voltage or current source, M is
the number of external sources, λn(0)/s is the Laplace transform of the source
describing the effect of the value λn(0) of the nth state variable at t = 0, N is
the order of the system and Hem(s) and Hin(s) are functions that relate each
external source or initial condition source to the output. If it is assumed that
the circuit is initially in quiescent state, the transfer function Hem(s) relates
the output to the input. In the modeling of crosstalk the initial state of the
interconnects is generally ignored to enable the use of a transfer function and
thus simplify the calculations.
In the following an analytical model for a coupled bus including the initial
conditions is derived. The bus is assumed to consist of interconnects that have
the same per-unit-length resistance. The spacing between interconnects does
not need to be uniform. The receivers are assumed to have the same load ca-
pacitance. The source resistances of the drivers need to be similar. However,
the driver rise times can be different. This makes it possible to model the influ-
ence of different driver resistances, since from the perspective of a downstream
wire, a slow input driven into a source resistor is almost indistinguishable from
a fast input driven into a larger source resistor [5]. For example, with suitably
selected rise times the aggressor driver resistance can be increased while the
victim resistance is maintained. To obtain a closed-form time-domain solution and





















Figure 3.2: Equivalent circuit for an interconnect considering initial conditions.
The driver is modeled as a linear voltage source with source resistance Rs.
The receiver is modeled as a capacitive load. In Fig. 3.2, R and L are the
total resistance and inductance of the interconnect while C1 is half of the total
capacitance of the interconnect and C2 is the sum of the receiver capacitance
and half of the total interconnect capacitance. The possible initial charges in the

















Figure 3.3: The voltage source vs and its components.
using their s-domain equivalent circuits, which have been marked in the figure
with dotted lines. The voltage source vs is modeled as a superposition of three
components vsc1, vsc2 and vsc3 as shown in Fig. 3.3. The components are used
to obtain different input phases for interconnects as well as an input signal with
a non-zero rise time.















\[ v_{sc3}(t) = v_{i}\,[u(t) - u(t - t_{1})] \tag{3.4} \]
where t2 − t1 is the rise time and vi and vf are the initial and final values of
the input signal. t1 is the phase of the input signal and u(t) is the unit step
function. The response of the circuit in Fig. 3.2 to the voltage source vs can be
solved with the following s-domain nodal equations.

















\[ I_{1} = I_{2} + I_{3} \tag{3.7} \]
In the equations above V0 is the initial voltage at the end of the line and V02
is the initial voltage of capacitor C1. I0 is the initial current flowing through
the inductor. The equations can be used to derive an expression for current I2.





























By substituting them into the expression for I2 and using partial fractions
and inverse Laplace transform the time-domain equation for the current î2 of a




























The voltage v̂3(t) can be obtained in a similar manner and written as



























where v̂out(t) is the response of the decoupled line to input vs. It can be written
as






























Figure 3.4: Use of bit sequences and phases in the model. Labels in the figure:
bit 1, bit 2, bit 3, bit 4, T, T2.
The expressions ai, bi and ci are composed of the variables that are obtained
from (3.5)-(3.7) during the derivation, and they are presented in the Appendix.
Equations (3.12)-(3.14) give the response of a decoupled interconnect to a single voltage
source vs. The total response of the decoupled interconnect is calculated using
superposition since its input v̂si is a combination of the original voltage sources
vs1, . . . , vsn, as shown in (2.20). The voltages v3(t) and vout(t) and current
i2(t) of a coupled interconnect can then be calculated by using (2.9) and (2.10),
respectively.
To model bit sequences, the state of each coupled wire at the end of a clock
cycle is passed onto the next calculation by setting the final values of vout(t),
i2(t) and v3(t) of an interconnect as its initial values V0, I0 and V02, respectively.
A voltage different from zero (or Vdd) or a non-zero current in these variables also
indicates the presence of intersymbol interference. The usage of bit sequences is
demonstrated in Fig. 3.4, where solid lines represent input signals vs1, . . . , vsn
to the wires. The wire that is switching the earliest is used as a reference wire
that determines the start and end of bits. In Fig. 3.4, wire 1 is the reference
wire and T2 and T3 are the phases of wires 2 and 3, respectively. The dotted
lines in the figure mark the instant when the final values of vout(t), i2(t) and
v3(t) are evaluated. Two successive similar input bits are obtained by setting
the initial voltage vi and final voltage vf equal to each other.
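The state-passing scheme can be illustrated with a small numerical sketch. It
is not the closed-form solution derived in this chapter: the lumped Π-circuit
of one decoupled line is integrated here with a fourth-order Runge-Kutta
method, and the element values, the time step and all names are assumptions
chosen for illustration. The state variables are the voltage of C1, the
inductor current and the output voltage, and their values at the end of one bit
become the initial values V02, I0 and V0 of the next bit.

```python
from dataclasses import dataclass


@dataclass
class State:
    v3: float = 0.0     # voltage of capacitor C1 (initial value V02)
    i2: float = 0.0     # inductor current (initial value I0)
    vout: float = 0.0   # voltage at the end of the line (initial value V0)


# Assumed element values of the lumped Pi-circuit of one decoupled line.
RS, R, L, C1, C2 = 100.0, 50.0, 5e-9, 100e-15, 150e-15


def ramp(t, vi, vf, t1, t2):
    """Input source: vi until the phase t1, linear ramp until t2, then vf."""
    if t <= t1:
        return vi
    if t >= t2:
        return vf
    return vi + (vf - vi) * (t - t1) / (t2 - t1)


def derivs(s, vs):
    """Time derivatives of (v3, i2, vout) for the source voltage vs."""
    return (((vs - s[0]) / RS - s[1]) / C1,
            (s[0] - R * s[1] - s[2]) / L,
            s[1] / C2)


def simulate_bit(state, vi, vf, period, t1=0.0, rise=10e-12, dt=0.5e-12):
    """Integrate one bit period with RK4 and return the state at its end."""
    s = (state.v3, state.i2, state.vout)
    for n in range(int(round(period / dt))):
        t = n * dt
        u0 = ramp(t, vi, vf, t1, t1 + rise)
        u1 = ramp(t + dt / 2, vi, vf, t1, t1 + rise)
        u2 = ramp(t + dt, vi, vf, t1, t1 + rise)
        k1 = derivs(s, u0)
        k2 = derivs(tuple(x + dt / 2 * k for x, k in zip(s, k1)), u1)
        k3 = derivs(tuple(x + dt / 2 * k for x, k in zip(s, k2)), u1)
        k4 = derivs(tuple(x + dt * k for x, k in zip(s, k3)), u2)
        s = tuple(x + dt / 6 * (a + 2 * b + 2 * c + d)
                  for x, a, b, c, d in zip(s, k1, k2, k3, k4))
    return State(*s)


def run_sequence(bits, period, vdd=1.0, start=0):
    """Pass the final state of each bit on as the initial state of the next."""
    state = State(v3=start * vdd, i2=0.0, vout=start * vdd)
    prev = start
    history = []
    for bit in bits:
        state = simulate_bit(state, prev * vdd, bit * vdd, period)
        history.append(state)
        prev = bit
    return history


if __name__ == "__main__":
    for period in (1e-9, 100e-12):
        for k, st in enumerate(run_sequence([1, 0, 1, 1], period), start=1):
            print(f"T = {period * 1e12:.0f} ps, bit {k}: "
                  f"vout = {st.vout:.3f} V, i2 = {st.i2 * 1e3:.3f} mA")
```

With a long bit period the line settles before the next bit starts, whereas
with a short period the end-of-bit voltage differs from the rail or the
current is non-zero, which is the signature of intersymbol interference
described above.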
3.1.1 Determination of Maximum Noise and Propagation Delay
The resulting voltage waveforms can have multiple peaks at various times because
of inductive ringing and different phases. This complicates the task of
finding the maximum induced crosstalk noise, since there can be several local
maxima in the voltage waveform. The global maximum must also be found
in as few iterations as possible to achieve the efficiency required by VLSI design
tools. It is not possible to construct an algorithm that will find the global
maximum for an arbitrary function, but in this case the physical properties of the
system can be analyzed to simplify the task. The noise peak on a victim occurs
at approximately the same time as the aggressor voltage reaches its maximum. A
method to evaluate the propagation delay tpd of a single RLC line has been





























and where Rt, Lt and Ct are the total resistance, inductance and capacitance,
respectively. The time tpd is not accurate when there is crosstalk noise and/or
intersymbol interference present, but it can nevertheless be used as a starting
point for a search. Newton’s method is an efficient method to find a local max-
imum since the method converges quadratically. Unfortunately, its stability is
very dependent on the starting point. Therefore, simulated annealing is used to
further improve the initial starting point tpd, and to avoid nearby local maxima.
This way the number of necessary iterations could be kept low. The probability
of taking a downhill step in annealing was calculated using the Boltzmann
probability distribution.
The suitable parameters for simulated annealing were found empirically. A
fast cooling was used since the starting point was already close to the maximum.
The cooling schedule was T ← 0.2 T, where T is the system temperature.
The temperature was reduced twice and for each temperature four random
moves were calculated. After that, two iterations of Newton’s method were
performed, resulting in a total of ten iterations to obtain the maximum noise
induced by an aggressor. The phase of an aggressor was taken into account
by adding it to tpd. The error of this method was compared to an exhaustive
sweep of the complete waveform for maximum voltage. The comparison was
performed for 5000 randomly generated 8-bit buses where electrical parameters
and signal rise times and phases were varied. The wire resistances were 5-780 Ω;
capacitances were 25-4100 fF; inductances were 0.03-9.8 nH; source resistances
were 50-750 Ω; load capacitances were 50-500 fF; rise times were 1-300 ps; and
the phases were 0-300 ps. The switching activity of the bus consisted of random
rising transitions. The results are shown as a histogram in logarithmic scale in
Fig. 3.5. The error was under 2% in 93% of test buses. As can be seen, the
error remained very small in the vast majority of cases, although in some cases
the error was about 100 percent. This happened when the annealing algorithm
was stuck at zero voltage. However, these cases are easily spotted and can be
corrected by rerunning the algorithm with more iterations.
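The search procedure described above can be sketched as follows. This is an
illustration under stated assumptions, not the implementation used for the
reported results: the initial temperature, the move size, the fixed random
seed, the finite-difference derivatives and the example waveform are all
hypothetical. Only the overall structure follows the text, namely two
temperature levels with four random moves each, cooling by a factor of 0.2 and
two Newton iterations, for ten iterations in total.

```python
import math
import random


def find_peak(f, t_start, step, temp=1.0, cooling=0.2, seed=0):
    """Search for a maximum of f near t_start; return (t_best, f_best)."""
    rng = random.Random(seed)
    t_cur, f_cur = t_start, f(t_start)
    t_best, f_best = t_cur, f_cur
    for _ in range(2):                    # two temperature levels
        for _ in range(4):                # four random moves per level
            t_new = t_cur + rng.uniform(-step, step)
            f_new = f(t_new)
            delta = f_new - f_cur
            # Uphill moves are always taken, downhill moves are taken
            # with the Boltzmann probability exp(delta / temp).
            if delta >= 0 or rng.random() < math.exp(delta / temp):
                t_cur, f_cur = t_new, f_new
            if f_cur > f_best:
                t_best, f_best = t_cur, f_cur
        temp *= cooling                   # fast cooling, T <- 0.2 T
    h = step * 1e-3                       # step for finite differences
    for _ in range(2):                    # two Newton iterations on f'(t) = 0
        d1 = (f(t_best + h) - f(t_best - h)) / (2 * h)
        d2 = (f(t_best + h) - 2 * f_best + f(t_best - h)) / (h * h)
        if d2 >= 0:                       # not close to a maximum
            break
        t_new = t_best - d1 / d2
        f_new = f(t_new)
        if f_new > f_best:                # keep the step only if it improves
            t_best, f_best = t_new, f_new
    return t_best, f_best


def example_noise(t, tau=200e-12, omega=2 * math.pi / 400e-12):
    """Hypothetical ringing noise waveform (normalized), zero before t = 0."""
    if t < 0:
        return 0.0
    return math.exp(-t / tau) * math.sin(omega * t)


if __name__ == "__main__":
    t_peak, v_peak = find_peak(example_noise, t_start=100e-12, step=50e-12)
    print(f"peak of {v_peak:.3f} at {t_peak * 1e12:.1f} ps")
```

Because the best point found so far is retained and a Newton step is accepted
only if it improves the result, the returned value is never lower than the
waveform value at the starting point.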


















Figure 3.5: Error caused by the search for global maximum for 5000 random
buses.
To find the 50% propagation delay of an interconnect, a combination of
bisection method and Newton’s method was used. Bisection method was used
to find a suitable initial point for Newton’s method. The search interval for
the bisection method was the time between the input phase of the interconnect
under study and the end of the clock cycle. An initial point between 20% and
70% of the operating voltage was found to result in good convergence
for Newton’s method. An average of three iterations of the bisection method
was required to reach this interval. After this, Newton’s method was run twice.
A histogram for the error of this method for 5000 random 8-bit buses is shown
in Fig. 3.6. The electrical parameters were the same as in peak crosstalk noise
evaluation, with both rising and falling transitions on the bus. The error was
within 2% in 96% of test buses. Both proposed methods thus have good
accuracy and require a small number of iterations.
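A minimal sketch of the combined bisection and Newton search is given below.
It assumes a monotonically rising waveform and uses a central finite
difference for the slope. The function name, the default arguments and the
single-pole test waveform are assumptions for illustration only.

```python
import math


def delay_50(v, t_lo, t_hi, vdd=1.0, newton_steps=2, max_bisect=50):
    """Estimate the time at which the rising waveform v(t) crosses 50% of vdd."""
    target = 0.5 * vdd
    t = 0.5 * (t_lo + t_hi)
    for _ in range(max_bisect):
        val = v(t)
        if 0.2 * vdd <= val <= 0.7 * vdd:   # good starting point for Newton
            break
        if val > target:
            t_hi = t                        # crossing lies earlier
        else:
            t_lo = t                        # crossing lies later
        t = 0.5 * (t_lo + t_hi)
    h = 1e-4 * (t_hi - t_lo)                # step for the finite difference
    for _ in range(newton_steps):
        slope = (v(t + h) - v(t - h)) / (2 * h)
        if slope == 0.0:
            break
        t -= (v(t) - target) / slope        # Newton update on v(t) - target = 0
    return t


if __name__ == "__main__":
    tau = 100e-12                           # assumed time constant
    t50 = delay_50(lambda t: 1.0 - math.exp(-t / tau), 0.0, 1e-9)
    print(f"50% delay = {t50 * 1e12:.2f} ps")
```

For the single-pole waveform the exact answer is tau ln 2, which gives a
simple check of the search. A falling transition would need the comparison in
the bisection step to be reversed.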
3.2 Model Verification
The accuracy of the model was verified by comparing it to HSPICE and previous
RC crosstalk models [24, 85, 86]. The HSPICE model consisted of 100 segments.


















Figure 3.6: Error caused by the search for propagation delay for 5000 random
buses.
Fig. 3.7 shows the crosstalk voltage on the victim line when one wire is switching
and the other is quiet. The 2 mm long wires were modeled using an RC model.
The self and coupling capacitances of the two wires were 52.77 fF/mm and
71.33 fF/mm, respectively. Resistance was 23.61 Ω/mm. The input signal was
assumed to be a step input. As can be seen, the model is in close agreement
with HSPICE. The induced noise for a 5 mm long interconnect with the same
parameters is depicted in Fig. 3.8. The waveforms from the model and HSPICE
were again nearly identical.
The case when two wires are switching in opposite directions was also verified.
This situation is presented in Fig. 3.9. The RC wires were 2 mm long and had
a rise time of 50 ps. The phase difference was 100 ps. The model was again in
close agreement with HSPICE.
The model was further verified by comparing it to HSPICE using RLC mod-
eling. The self and mutual inductance were 5.15 nH/mm and 3.46 nH/mm, re-
spectively. The resistance and capacitance values remained the same. Fig. 3.10
shows the voltage waveforms when both wires are switching in the same direc-
tion with a rise time of 100 ps. The length of the wires was 2 mm. The induced
crosstalk voltage on a quiet RLC interconnect is depicted in Fig. 3.11. The rise
time of the aggressor was 100 ps.
The capability of the model to represent successive transitions was also ver-
ified. Fig. 3.12 shows the results when a square pulse is applied to two coupled
RLC interconnects. The phase difference between the interconnects was 500 ps
and rise time was 100 ps.


















Figure 3.7: Crosstalk noise waveform on a 2 mm interconnect by different models.


















Figure 3.8: Crosstalk noise waveform on a 5 mm interconnect by different models.



















Figure 3.9: Voltage waveforms on two 2 mm interconnects switching in opposite
directions.


















Figure 3.10: Voltage waveform on a 2 mm RLC interconnect.



















Figure 3.11: Crosstalk noise waveform on a 2 mm RLC interconnect.




















Figure 3.12: Square pulse on two 5 mm RLC interconnects with a 500 ps phase
difference.
Transmission line behavior becomes significant when the rise time of a signal
is less than or comparable to the time-of-flight delay of the line [87]. This can
be seen from Table 3.1 where the model and HSPICE are compared. The table
shows the peak crosstalk voltages that have been induced by an aggressor onto
a quiet victim line at different wire lengths and rise times. The coupling capaci-
tance and self capacitance of the interconnects were 120 fF/mm and 80 fF/mm,
while the mutual inductance and self inductance were 1.5 nH/mm and 4 nH/mm,
respectively. The wire resistance was 60 Ω/mm, and source resistance and load
capacitance were 300 Ω and 150 fF, respectively. The error of the Π-model in-
creased as aggressor rise time became shorter and wire lengths increased. This
was caused by the inability of lumped circuits to model wave reflections. How-
ever, wide global interconnects present a large load to the driver that slows down
the signal rise time. Furthermore, on-chip global interconnects with a length of
more than 1-2 mm are usually divided with buffers into shorter segments. The
accuracy of an RLC π-circuit in global interconnect modeling has been verified
also in [88].
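The rise-time criterion can be made concrete with a rough estimate of the time
of flight. The sketch below uses the lossless-line expression, length times
sqrt(LC), with the self inductance quoted above. Taking the sum of the self
and coupling capacitance as the line capacitance and ignoring mutual
inductance are simplifying assumptions made for this illustration, so the
numbers indicate an order of magnitude only.

```python
import math


def time_of_flight(length_mm, l_per_mm, c_per_mm):
    """Lossless-line estimate of the time of flight: length * sqrt(L * C)."""
    return length_mm * math.sqrt(l_per_mm * c_per_mm)


L_SELF = 4e-9                # self inductance [H/mm]
C_TOTAL = 80e-15 + 120e-15   # assumed: self plus coupling capacitance [F/mm]

if __name__ == "__main__":
    for length in (0.5, 1.0, 4.0):
        tof = time_of_flight(length, L_SELF, C_TOTAL)
        print(f"{length} mm: {tof * 1e12:.1f} ps")
```

Under these assumptions the time of flight is roughly 28 ps per millimeter, so
a 4 mm line has a time of flight above 100 ps, which is comparable to or longer
than the shorter rise times considered here.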
Table 3.1: Comparison of peak crosstalk voltages on a quiet wire. The values
are normalized to Vdd.
Rise time [ps]   Length [mm]   Model   HSPICE   Error [%]
5                0.5           0.099   0.097     2.1
5                1             0.142   0.149    -4.7
5                4             0.231   0.220     5.0
50               0.5           0.093   0.093     0
50               1             0.140   0.141    -0.7
50               4             0.230   0.220     4.5
100              0.5           0.088   0.088     0
100              1             0.134   0.134     0
100              4             0.229   0.219     4.6
300              0.5           0.057   0.057     0
300              1             0.102   0.102     0
300              4             0.219   0.212     3.3
The accuracy of the proposed model in different operating speeds and wire
lengths was further assessed by comparing the calculated results and HSPICE
simulations for four coupled copper interconnects. The global interconnects
were sized at 0.6 µm × 1.2 µm. The clock frequency was varied between 100
MHz and 2 GHz while the length of the interconnects was between 0.5 mm and
10 mm. The simulation was run for seven clock cycles with the wires switching
in opposite directions. The rise time of the interconnects was set to 10 percent
of the clock cycle, i.e. from 50 ps to 1000 ps. The results are illustrated in
Fig. 3.13. The error was calculated as the average of the difference between
the waveforms from the model and HSPICE. The error of the proposed model
increased with operating speed and wire length. This was again due to the
limited accuracy of the lumped Π-model. However, the error remained below
four percent. The run time to obtain the data for Fig. 3.13 was 50 minutes

































Figure 3.13: Difference between the proposed model and HSPICE simulations
for four coupled interconnects.
3.3 Case Study
In high-speed data transmission the amount of noise is affected by multiple fac-
tors. Crosstalk noise and propagation delay are dependent on both the physical
properties of interconnects and the switching activity on them. In this sec-
tion, the model is applied to evaluate the influence of factors such as switching
activity and phases on crosstalk noise and propagation delay in an 8-bit bus.
The influence of intersymbol interference is also considered. The voltages are
evaluated at the far-end of the parallel interconnects.
3.3.1 Switching Patterns
The amount of crosstalk noise and delay variation depends on the switching
activity of coupled interconnects. The model was used to study the influence of
switching activity on crosstalk noise on a bus consisting of eight 2.5 mm long
parallel wires. The wires were sized at 0.6 µm × 1.2 µm and separated by 1.5 µm.
The 8 × 8 transmission line matrices for the interconnects were extracted using
Linpar [89] and FastHenry [90]. The total capacitance and self inductance of
a single interconnect were 84 fF/mm and 7.6 nH/mm, respectively. Resistance
was 17.7 Ω/mm. The wires were numbered from one to eight starting from the
left. The fourth wire was used as a reference wire in the measurements, since it
was most susceptible to crosstalk noise. The simulation results for propagation
delay and crosstalk noise are shown in Tables 3.2 and 3.3. In these tables, the
first eight columns describe the switching status of the wires. The symbols ‘↑’
and ‘↓’ in the tables represent upward and downward transitions, respectively,
while the symbol ‘-’ represents no transition on the wire. In all cases the wires
switched simultaneously with a rise time of 100 ps. The results are given for
both RLC and RC models of the interconnect. The RC model was obtained from
the original by reducing the amount of inductance by two orders of magnitude.
As can be seen from patterns 1 and 2 in Table 3.2, the propagation
delay for the RC wire varied between 60 ps and 217 ps. The minimum delay
was obtained when all wires were switching in the same direction. On the other
hand, the maximum delay was obtained when the other wires were switching
in the opposite direction to the fourth wire. For the RLC wire, however, the
delay varied between 82 ps and 201 ps (patterns 4 and 3). The minimum delay was
attained when wires 3 and 5 were switching in the same direction as wire 4 to
maximize the speed-up from capacitive coupling, while the other wires were
switching in the opposite direction to maximize the speed-up from long-range
inductive coupling.
Table 3.2: Propagation delay of the 4th wire in an 8-bit bus for different
switching patterns
             Wire                Propagation delay
             1 2 3 4 5 6 7 8     RLC model   RC model
Pattern 1    ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑     154 ps      60 ps
Pattern 2    ↓ ↓ ↓ ↑ ↓ ↓ ↓ ↓     90 ps       217 ps
Pattern 3    ↑ ↑ ↓ ↑ ↓ ↑ ↑ ↑     201 ps      149 ps
Pattern 4    ↓ ↓ ↑ ↑ ↑ ↓ ↓ ↓     82 ps       96 ps
Pattern 5    - - - ↑ - - - -     133 ps      116 ps
Pattern 6    ↑ - - ↑ - - ↑ ↑     167 ps      103 ps
Pattern 7    ↓ - - ↑ - - ↓ ↓     98 ps       130 ps
Pattern 8    - - ↑ ↑ ↓ - - -     131 ps      115 ps
The long-range effects of inductive coupling were also seen in propagation
delay variation caused by interconnects further away. The propagation delay of
a single switching interconnect was 133 ps and 116 ps for RLC and RC models,
respectively. As shown in patterns 6 and 7 in the table, switching activity by
wires 1, 7 and 8 caused the delay to vary between 98 ps and 167 ps for the
RLC model and between 103 ps and 130 ps for the RC model. Wires can also
cancel the effects of other wires on propagation delay variation. This situation
is shown in pattern 8, where wires 3 and 5 cancel each other’s influence on the
propagation delay of the 4th wire.
The crosstalk voltage induced on the fourth wire when others are switching
is shown in Table 3.3. The maximum induced noise was 0.42 and 0.27 for the RLC
and RC models, respectively. The influence of distant wires was more prominent
in RLC modeling, due to the long-distance effects of inductive coupling. The
influence of distant wires can be used to determine a suitable number of wires
to include in the model and thus further increase the efficiency of modeling of
wide buses. Also in crosstalk noise modeling wires can cancel the influence of
other wires. This is depicted in patterns 4 and 5.
Table 3.3: Crosstalk noise on the 4th wire in an 8-bit bus for different switching
patterns. The values are normalized to Vdd
            Wire                      Crosstalk noise
            1  2  3  4  5  6  7  8    RLC model   RC model
Pattern 1   ↑  ↑  ↑  -  ↑  ↑  ↑  ↑    0.42        0.27
Pattern 2   -  -  ↑  -  ↑  -  -  -    0.20        0.17
Pattern 3   -  ↑  ↑  -  ↑  ↑  -  -    0.29        0.23
Pattern 4   -  -  ↓  -  ↑  -  -  -    <0.01       <0.01
Pattern 5   -  ↓  ↓  -  ↑  ↑  -  -    <0.01       <0.01
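One way to read Table 3.3 when choosing how many wires to include in the model is to compare the noise from a truncated set of aggressors with the noise from all aggressors. The sketch below only restates the table values; the names `noise` and `captured_fraction` are hypothetical.

```python
# Peak crosstalk noise on the quiet 4th wire from Table 3.3, normalized to Vdd.
# Keys are the aggressor wires switching upward; values are (RLC model, RC model).
noise = {
    (3, 5): (0.20, 0.17),                 # pattern 2: nearest neighbours only
    (2, 3, 5, 6): (0.29, 0.23),           # pattern 3: two neighbours per side
    (1, 2, 3, 5, 6, 7, 8): (0.42, 0.27),  # pattern 1: all other wires
}

def captured_fraction(aggressors, model):
    """Share of the all-aggressor noise reproduced by a truncated aggressor set."""
    full = noise[(1, 2, 3, 5, 6, 7, 8)][model]
    return noise[tuple(aggressors)][model] / full

for subset in [(3, 5), (2, 3, 5, 6)]:
    rlc = captured_fraction(subset, 0)
    rc = captured_fraction(subset, 1)
    print(subset, f"RLC {rlc:.0%}", f"RC {rc:.0%}")
```

For both subsets the RC model reaches a larger share of its worst-case noise than the RLC model, which matches the observation that distant wires matter more when inductance is included.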
3.3.2 Signal Phases and Rise Time
Crosstalk noise and propagation delay depend not only on the switching patterns
of interconnects, but also on the relative arrival time of signals. The influence of
timing was studied by increasing the phase of the fourth interconnect in an 8-bit
bus consisting of global interconnects. The other wires switched simultaneously
with upward or downward transitions. All wires had a rise time of 100 ps. The
minimum and maximum propagation delays of the fourth wire for RC and RLC
models are shown in Fig. 3.14. The delay values include the phase of the fourth
wire. The minimum and maximum propagation delays were calculated from all
possible upward and downward transitions of the 8-bit bus. The delay variation
for the RC modeling was greatest when there was no phase difference between
interconnects. As the phase of the fourth wire was increased, the variation in
delay was reduced. However, the reduced delay variation was achieved at the
cost of increased total propagation delay.
The delay variation was much greater when the RLC model was used. This
was due to the ringing crosstalk voltage on the fourth wire, which either
increased or decreased the delay variation depending on the phase of the fourth
wire. For both RC and RLC models the maximum and minimum propagation delays of
the fourth wire approached each other as the phase was increased. This is due to
the fact that a crosstalk noise pulse has a finite duration.
Figure 3.14: Influence of phase on the propagation delay of the fourth wire.
The amount of noise induced on a quiet victim interconnect is heavily dependent
on the rise time of the aggressors. Fig. 3.15 shows the amount of crosstalk
noise at different rise times on the fourth wire when all other wires are
switching. As can be seen from the figure, an assumption of a step input at
the driver can result in a clear overestimation of noise.
3.3.3 Intersymbol Interference
Two noise forms, such as crosstalk noise and intersymbol interference, can be
cumulative. The initial state of an interconnect affects its propagation delay
and the amount of crosstalk noise induced into it. The influence of intersymbol
interference on propagation delay variation was evaluated by simulating the bus
for two cycles. The initial voltage on all wires of the bus was zero. The bit se-
quence on wires 2, 3, 5 and 6 was ‘10’, and ‘01’ on the fourth wire. Other wires
were quiet. This caused a voltage peak to be induced on the fourth wire during
the first clock cycle. This situation is depicted in Fig. 3.16. To increase the
throughput of the bus, the clock cycle of the bus was shortened, while keeping
all other parameters, such as rise time and bus length, constant. At a through-
put of 250 MB/s, the bus was able to return to steady state between cycles,
leading to a propagation delay of 145 ps at the fourth wire. However, at higher
operating speeds intersymbol interference caused variation in propagation delay,
as shown in Table 3.4. The propagation delay was both increased and decreased
at different operating speeds, depending on whether there was overshoot or un-
dershoot on the fourth wire at the beginning of the next cycle. However, the RC
model induced a positive voltage peak on the fourth wire that did not oscillate,
thus causing a steady decrease in the propagation delay.
Intersymbol interference also influenced the amount of crosstalk voltage induced
on a quiet wire. The induced voltage was measured at different operating
speeds as in the delay variation measurements. The bit sequence on wires 2 and
3 was ‘11’, and ‘01’ on wires 5 and 6. Other wires were quiet. The induced
crosstalk voltage on the fourth wire during the second cycle was therefore influ-
enced by whether the previous voltage peak had already vanished. The results
are shown in Table 3.5. As can be seen, the induced voltage peak was both
increased and decreased for the RLC model at different operating speeds, due
to the oscillation at the previous cycle. However, the induced voltage peak only
increased with operating speed when using the RC model.
Table 3.4: Propagation delay variation caused by intersymbol interference









Table 3.5: Peak crosstalk noise (normalized to Vdd) variation caused by inter-
symbol interference









Figure 3.15: Influence of rise time on peak crosstalk noise.
Figure 3.16: Voltage waveforms for three coupled RLC interconnects.
3.3.4 Implications of the Case Study
Crosstalk noise and propagation delay variation in an 8-bit bus were studied.
Simultaneous switching patterns strongly influenced the amount of noise induced
on a quiet victim and the propagation delay of coupled interconnects.
Rise times and phases also contributed to crosstalk noise and propagation
delay variation.
Certain bit sequences on the other hand caused intersymbol interference that
further increased propagation delay variation and summed up with crosstalk
noise. Propagation delay and crosstalk noise variation became more pronounced
at high operating speeds. These variations need to be considered in the veri-
fication and optimization of high speed on-chip communication. Inductance
modeling is also required since ringing and the long-range effects of inductive
coupling made interconnects especially vulnerable to crosstalk noise and inter-
symbol interference.
3.4 Chapter Summary
In this chapter, an analytical time-domain model to evaluate crosstalk and inter-
symbol interference in capacitively and inductively coupled buses was proposed.
The model can be used in design tools for high-performance buses since it takes
into account the effects of inductance, initial states, and bit sequences. Signal
rise times and phases were also included in the model. It was also shown that the
model achieves good accuracy when compared to previous models and HSPICE.
The model was applied to an 8-bit bus to study the amount of crosstalk noise
and intersymbol interference in different switching and timing conditions. Inter-
symbol interference was shown to affect crosstalk noise and propagation delay




Current and Its Impact on
Power Grid Noise
A stable supply voltage is a necessity for the correct operation of an integrated
circuit. Maintaining this operating voltage has become increasingly difficult as
the integration density and the number of devices on a chip increase. Rigor-
ous models for the simultaneous switching noise caused by a CMOS logic gate
have been proposed [91, 92]. An increasing portion of power is however con-
sumed by interconnects. Over 50% of the dynamic power consumption of a
130 nm microprocessor was consumed by interconnects and about half of this
interconnect-power was consumed by global wires [4]. These long wires cannot
be accurately modeled as a single load capacitor to the driver. The modeling of
the power supply grid is also needed since the amount of noise varies in differ-
ent locations of the power supply grid. The on-chip power supply network and
global communication design starts early in the design flow. The current draw
of the on-chip buses needs to be known in order to specify the power supply net-
work. The current draw of the buses in turn depends on their design properties
such as width, length and switching activity. Rapid modeling and exploration
of power supply network and bus design parameters is therefore very beneficial
for system level optimization.
In the previous chapter, models and analysis for crosstalk and intersymbol
interference in a bus were presented. In this chapter, power supply noise caused
by a bus is modeled [93]. The switching current of a coupled bus is derived
and the bus model is combined with a power supply network model. The bus
is represented by an analytical RLC transmission line model. The model also
takes into account different switching patterns and coupling between wires that
can both have a considerable effect on the current draw of the bus. A method







Figure 4.1: Two-port model for an interconnect with source and load
impedances.
4.1 Modeling of Bus Switching Current
In addition to modeling the power distribution network, it is necessary to model
or approximate the load on the network. Switching events cause current spikes,
whose shape and magnitude affect the amount of noise on the power distribution
network. Large current spikes are caused by on-chip communication, since buses
are often driven with large drivers and buffered heavily. In buses, coupling
also becomes pronounced since the wires run parallel to each other over long
distances. This necessitates the inclusion of coupling in wire models. Simulation
of power distribution noise in a chip is typically done in two steps: first, the
switching currents of active devices are simulated separately assuming a perfect
supply voltage. Second, the noise in the power distribution network is simulated
using as loads piecewise-linear current sources that approximate the switching
currents. This method helps to keep the analysis computationally feasible. In
this section, transmission line analysis in s-domain is used to derive the switching
current of a coupled on-chip bus under different switching conditions.
A transmission line with a source impedance ZS and load impedance ZL can
be modeled as a two-port circuit as in Fig. 4.1. The terminal equations of the
two-port network are
\begin{align}
V_1 &= a_{11} V_{out} + a_{12} I_{out} \tag{4.1} \\
I_s &= a_{21} V_{out} + a_{22} I_{out} \tag{4.2} \\
V_s &= V_1 + I_s Z_S \tag{4.3} \\
V_{out} &= I_{out} Z_L. \tag{4.4}
\end{align}
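The relation between the voltage Vout at the end of the interconnect and the
source voltage Vs follows from (4.1) to (4.4) as
\[
\frac{V_{out}}{V_s} = \frac{Z_L}{(a_{11} + Z_S a_{21}) Z_L + a_{12} + Z_S a_{22}}. \tag{4.5}
\]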
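The relation (4.5) is obtained by eliminating V1, Is and Iout from the terminal equations (4.1) to (4.4). As a quick numerical sanity check, the sketch below walks the terminal equations backwards from an assumed Vout and compares the result with the closed form. The function names and the test values are hypothetical and are not taken from the thesis.

```python
def vout_over_vs_direct(a11, a12, a21, a22, zs, zl):
    """Walk the terminal equations (4.1)-(4.4) backwards from Vout = 1."""
    vout = 1.0
    iout = vout / zl                 # (4.4)
    v1 = a11 * vout + a12 * iout     # (4.1)
    i_s = a21 * vout + a22 * iout    # (4.2)
    vs = v1 + i_s * zs               # (4.3)
    return vout / vs

def vout_over_vs_closed(a11, a12, a21, a22, zs, zl):
    """Closed form: ZL over the denominator that appears in (4.5)."""
    return zl / ((a11 + zs * a21) * zl + a12 + zs * a22)

# Arbitrary complex test values (hypothetical, chosen only for the check).
params = (1.02 + 0.1j, 95.0 + 6.0j, 2.5e-3j, 1.02 + 0.1j, 250.0 + 0.3j, -400.0j)
direct = vout_over_vs_direct(*params)
closed = vout_over_vs_closed(*params)
print(abs(direct - closed))
```

The two results agree to rounding error for any parameter set with a nonzero load impedance, since the closed form is only an algebraic rearrangement of the four terminal equations.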
A transmission line can be thought to consist of numerous infinitesimally small
RLC segments. A cascade connection of these segments is conveniently analyzed
using ABCD parameters, since the ABCD matrix of the cascade system
is simply the matrix product of the individual matrices. As the number of
the RLC segments approaches infinity, the ABCD parameter matrix of a single
interconnect becomes
\[
\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}
=
\begin{bmatrix} \cosh(\theta h) & Z_0 \sinh(\theta h) \\ Z_0^{-1} \sinh(\theta h) & \cosh(\theta h) \end{bmatrix}
\tag{4.6}
\]
where $Z_0 = \sqrt{(r + sl)/(sc)}$ and $\theta = \sqrt{(r + sl)sc}$ and where r, l and c are
the per-unit-length resistance, inductance and capacitance, respectively, of the
interconnect of length h. The driver is modeled as an exponential voltage source with a








where tr is the exponential signal rise time and u(t) is the unit step function. τ









The current drawn from the power supply network is equal to the current at
the driver end of the interconnect. The current pulse at the receiver end of the
interconnect is smaller since a part of the current is lost charging the intrinsic
capacitance of the interconnect. The current Is entering the interconnect can
be derived by substituting Iout = Vout/ZL into (4.2) and writing it as
\[
I_s = \left( a_{21} + \frac{a_{22}}{Z_L} \right) V_{out}. \tag{4.9}
\]
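By substituting (4.5) into (4.9), the relation between the current and voltage
of the source, $I_s$ and $V_s$, is obtained as
\[
\frac{I_s}{V_s} = \frac{a_{21} Z_L + a_{22}}{(a_{11} + Z_S a_{21}) Z_L + a_{12} + Z_S a_{22}}. \tag{4.10}
\]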
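The cascade argument and the driver current can be cross-checked numerically. The sketch below multiplies the ABCD matrices of many identical lumped RLC segments, compares the product with the standard closed-form ABCD matrix of a uniform line, and then evaluates the driver current per source volt from the terminal equations (4.1) to (4.4). All numerical values and function names are assumptions chosen for illustration, and the segment topology (series element followed by shunt element) is one possible discretization.

```python
import cmath

def mat_mul(a, b):
    """Product of two 2x2 matrices stored as ((a11, a12), (a21, a22))."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )

def cascade_abcd(r, l, c, h, s, doublings=16):
    """ABCD matrix of 2**doublings identical lumped RLC segments in cascade."""
    dx = h / 2 ** doublings
    z = (r + s * l) * dx          # series impedance of one segment
    y = s * c * dx                # shunt admittance of one segment
    m = ((1 + z * y, z), (y, 1))  # series element followed by shunt element
    for _ in range(doublings):    # each squaring doubles the number of segments
        m = mat_mul(m, m)
    return m

def line_abcd(r, l, c, h, s):
    """Closed-form ABCD matrix of a uniform line (the limit of the cascade)."""
    z0 = cmath.sqrt((r + s * l) / (s * c))
    theta = cmath.sqrt((r + s * l) * s * c)
    ch, sh = cmath.cosh(theta * h), cmath.sinh(theta * h)
    return ((ch, z0 * sh), (sh / z0, ch))

def is_over_vs(abcd, zs, zl):
    """Driver current per source volt, from (4.1)-(4.4) with Vout = 1."""
    (a11, a12), (a21, a22) = abcd
    iout = 1.0 / zl
    i_s = a21 + a22 * iout
    vs = a11 + a12 * iout + i_s * zs
    return i_s / vs

# Hypothetical per-metre parameters: 50 ohm/mm, 0.5 nH/mm, 0.2 pF/mm, 2 mm line.
r, l, c, h = 50e3, 0.5e-6, 0.2e-9, 2e-3
s = 2j * cmath.pi * 1e9            # evaluate at 1 GHz
zs = 250.0                         # source resistance (assumed)
zl = 1.0 / (s * 10e-15)            # 10 fF capacitive load (assumed)
lumped = cascade_abcd(r, l, c, h, s)
exact = line_abcd(r, l, c, h, s)
y_lumped = is_over_vs(lumped, zs, zl)
y_exact = is_over_vs(exact, zs, zl)
print(abs(y_lumped - y_exact) / abs(y_exact))
```

Repeated squaring is used only to keep the segment count high without a long loop; a plain loop over the segments would give the same product.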
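For a distributed transmission line with source resistance and inductance and
a capacitive load $C_L$, (4.10) becomes
\[
\frac{I_s}{V_s} =
\frac{Z_0^{-1} \sinh(\theta h) + s C_L \cosh(\theta h)}
{(1 + s C_L Z_S) \cosh(\theta h) + (Z_S Z_0^{-1} + s C_L Z_0) \sinh(\theta h)}
\tag{4.11}
\]
where $Z_S = R_S + s L_S$. The current cannot be solved from the equation
analytically, but it can be approximated using a series expansion. The hyperbolic
functions can be written in series form as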









+ . . . (4.12)






+ . . . (4.13)
and






+ . . . (4.14)
The accuracy of a series expansion depends in general on the number of terms.
A fourth degree approximation was used since the fourth degree polynomials
are the highest that can be solved analytically. A fourth degree approximation







b4s4 + b3s3 + b2s2 + b1s+ 1
. (4.15)




2 + n1s+ n0)e
−τs
d4s4 + d3s3 + d2s2 + d1s+ d0
. (4.16)
The coefficients ni and di are presented in the Appendix. By applying partial








where sp are the roots of the denominator of (4.16).
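The step from a rational s-domain expression to a time-domain sum of exponentials can be sketched as follows. The example assumes that the poles are simple and already known, uses the residue formula N(p)/D'(p), and treats a factor exp(-tau s) as a pure time shift. The polynomial, the poles and all function names are hypothetical and only stand in for the actual coefficients given in the Appendix.

```python
import cmath

def poly_eval(coeffs, s):
    """Evaluate a polynomial given highest-degree-first coefficients (Horner)."""
    acc = 0j
    for a in coeffs:
        acc = acc * s + a
    return acc

def poly_derivative(coeffs):
    """Coefficients of the derivative polynomial, highest degree first."""
    n = len(coeffs) - 1
    return [a * (n - k) for k, a in enumerate(coeffs[:-1])]

def residues(num, den, poles):
    """Residue of num(s)/den(s) at each simple pole p: num(p)/den'(p)."""
    dden = poly_derivative(den)
    return [poly_eval(num, p) / poly_eval(dden, p) for p in poles]

def time_response(num, den, poles, t, tau=0.0):
    """Inverse transform of num(s)/den(s)*exp(-tau*s): exponentials starting at tau."""
    if t < tau:
        return 0.0
    ks = residues(num, den, poles)
    total = sum(k * cmath.exp(p * (t - tau)) for k, p in zip(ks, poles))
    return total.real  # imaginary parts cancel for conjugate pole pairs

# Hypothetical quartic denominator (s+1)(s+2)(s+3)(s+4) with unit numerator.
num = [1.0]
den = [1.0, 10.0, 35.0, 50.0, 24.0]
poles = [-1.0, -2.0, -3.0, -4.0]
print(residues(num, den, poles))
print(time_response(num, den, poles, t=1.0))
```

The formula applies only when the numerator degree is lower than the denominator degree and no pole is repeated; repeated poles would need additional terms.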
In order to model the current draw of a bus, the equations are now extended
to multiple coupled interconnects. The bus was modeled as capacitively and
inductively coupled RLC transmission lines. The circuit model of the bus is
shown in Fig. 4.2.
As can be seen from (2.10), the current on a coupled interconnect is a sum
of the currents of decoupled interconnects. The inputs of these decoupled inter-
connects are in turn a sum of the original inputs of coupled interconnects. By
combining these equations, the total current draw of the bus can be derived as













where $M_{ki}$ and $M^T_{ij}$ are the values of the eigenvalue matrix and its transpose at
the indexes k,i and i,j, respectively. $I_i(t, \tau_j)$ is the input current of a single
decoupled interconnect (4.17). $\tau_j$ is the time when the jth driver starts switching.






































































Figure 4.2: Circuit model for a distributed RLC bus consisting of n intercon-
nects.
inductance values of R̂ii, Ĉii, and L̂ii, respectively. In summing the currents,
their direction needs to be taken into account, i.e. whether the driver is switch-
ing up or down. A downward transition is included in the sum with a minus
sign. Also, due to coupling there can be a small current pulse on a non-switching
wire. This current flows either into the power distribution network or ground
depending on the state of the driver.
4.2 Modeling of Power Supply Network
Many existing power grid models focus on post-layout verification. In [94], a
model for evaluating the RLC power grid noise in the early stages of the design
flow has been presented. This model was therefore chosen to be used with the
derived switching current model. The models were combined as follows. The























i=1,i6=j χi,j + (1/2)
∑k
i=1,i6=j Ci,j + C
load





j is the load capacitance at node j and ts is the switching time of
node.
The charge transferred from the power supply network when an on-chip bus
is switching can be obtained by integrating (4.18). If all drivers are switching






















If there are drivers switching at different times the integration is performed
for the corresponding time intervals to take into account the direction of currents
as discussed above. The load capacitance corresponding to this charge was then
obtained as Cload = Qtot/Vdd.
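The conversion from a current waveform to an equivalent load capacitance can be illustrated numerically. The sketch below integrates a sampled current with the trapezoidal rule and divides the charge by Vdd. The waveform is a hypothetical RC charging pulse, chosen because its charge is known to be C times Vdd; it is a stand-in for the bus current and covers only the case where all current flows in one direction.

```python
import math

def trapezoid(samples, dt):
    """Trapezoidal integral of uniformly spaced samples."""
    return dt * (sum(samples) - 0.5 * (samples[0] + samples[-1]))

def equivalent_load_capacitance(current, dt, vdd):
    """Cload = Qtot / Vdd, with Qtot the integral of the supply current."""
    return trapezoid(current, dt) / vdd

# Hypothetical stand-in for the bus current: an RC charging pulse whose
# integral is C * Vdd, so the recovered capacitance can be checked.
vdd, r_drv, c_true = 1.0, 250.0, 200e-15
tau = r_drv * c_true
dt = tau / 1000
current = [vdd / r_drv * math.exp(-k * dt / tau) for k in range(20001)]
c_load = equivalent_load_capacitance(current, dt, vdd)
print(c_load)
```

When drivers switch in both directions, the integration would be split into intervals and the sign of each contribution handled as described in the text.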
4.3 Verification
The bus model was verified by comparing it to HSPICE. The wire properties
were set according to ITRS 65 nm technology node for global wiring. The
width and separation distance of the wires were 145 nm. The resistance and
inductance values were extracted using the field solver FastHenry [90], while the
capacitance values were extracted using Linpar [89]. The driver rise time was
100 ps.
Fig. 4.3 shows the switching current of a 1 mm long 8-bit bus. Every other
driver is switching up while the others are switching down in order to maximize
the load caused by capacitive coupling. The downward switching drivers start
switching 200 ps after the upward ones. The influence of coupling between the
wires is seen as a jump in the current drawn from the power supply network.
As can be seen, the model and HSPICE were close to each other. Fig. 4.4
shows the current draw of a 3 mm long 32-bit bus when all drivers are switching
up simultaneously. The wire sizes remained the same. The model and HSPICE
were again in good agreement with each other.
The simulation runtimes for the calculation of a bus current draw are shown
in Table 4.1. The simulations were run on a 2.8 GHz Pentium 4. The model was
implemented with Matlab 6 while the HSPICE simulations were performed using
the W-element lossy transmission line model. The runtimes were measured for
three different bus widths, i.e. 8-bit, 32-bit and 64-bit. As can be seen, there
is a clear speed-up over HSPICE, especially for the wide 64-bit bus where the
speed-up is over 100. Further speed increases can likely be obtained with a
compiled executable instead of source code interpreted at run-time in Matlab.
Table 4.1: Simulation times for the calculation of current draw curves for dif-
ferent bus widths























Figure 4.3: Current draw of a 1 mm long 8-bit bus. Half of the drivers are
switching up while the others start switching down at 200 ps.




















Figure 4.4: Current draw of a 3 mm long 32-bit bus. All drivers are switching
up simultaneously.
1 2 3 4 5 6 7 8 9 10
11 12 13 14 15 16 17 18 19 20
21 22 23 24 25 26 27 28 29 30
31 32 33 34 35 36 37 38 39 40
41 42 43 44 45 46 47 48 49 50
51 52 53 54 55 56 57 58 59 60
61 62 63 64 65 66 67 68 69 70
71 72 73 74 75 76 77 78 79 80
81 82 83 84 85 86 87 88 89 90
91 92 93 94 95 96 97 98 99 100
Figure 4.5: A 10×10 power supply grid. Each segment is 100 µm long.
The bus model was also verified together with the power supply grid model.
The power grid was modeled as a square grid as shown in Fig. 4.5. Each 100 µm
long segment was modeled as an RLC circuit. The nodes 1, 10, 91 and 100
were modeled as package pins with a constant operating voltage. The power
grid wires were 2 µm wide and 0.319 µm thick. The 3 mm long 32-bit bus
whose current draw is shown in Fig. 4.4 was placed at node 55. The worst case
operating voltage in each node is shown in Fig. 4.6. The maximum difference
in noise voltages between the model and HSPICE was below 8%.
The runtimes for the simulation of a power supply grid are shown in Ta-
ble 4.2. In HSPICE simulations the load on the supply grid was a piecewise
linear current source representing the current draw curve of a bus. The power
grid simulation was approximately 30 times faster than HSPICE. For larger grid
sizes sparse matrix solvers may be utilized as discussed in [94].
Table 4.2: Simulation times for different power supply grid sizes





























Figure 4.6: Worst case node voltages on the power supply grid.
4.4 Noise Reduction by Skewing
The bus model can be used to analyze the influence of different timing condi-
tions. The current peaks caused by switching buses can be reduced by skewing
the relative switching time of drivers [71]. Fig. 4.7 shows the worst case
operating voltage when half of the drivers of the 3 mm long 32-bit bus switch
0 ps, 200 ps, or 400 ps after the other drivers. When there was no skewing, the
maximum noise at node 55 was 4.3% of Vdd. With a skewing time of 200 ps
the maximum noise was reduced by 16% to 3.6% of Vdd. A 400 ps skew further
reduced the noise to 3% of Vdd resulting in a total reduction of 30% in the
power supply noise. The maximum noise was reduced also in the other nodes.
Further increases in skew no longer reduced the noise due to the loss of correla-
tion between the switching drivers. In this way the power supply noise can be
reduced as a trade-off between noise and skewing delay. The maximum possible
skewing time depends on how much slack there is available for a particular bus
when compared to the system level operating frequency.
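As a worked check, the quoted relative reductions follow from the maximum noise values at node 55 (4.3%, 3.6% and 3% of Vdd), rounded to whole percent:

```latex
\[
\frac{4.3 - 3.6}{4.3} \approx 0.16,
\qquad
\frac{4.3 - 3.0}{4.3} \approx 0.30
\]
```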
Figure 4.7: Node voltages on the power supply grid with different bus driver
skewing times.
4.5 Chapter Summary
In this chapter, an analytical model for the switching current of an on-chip
coupled bus was proposed. The model was combined with a power supply
grid model in order to be able to model the worst case power supply noise in
different parts of the power supply grid. The model was verified by comparing it
to HSPICE. The maximum error was below 8%. The reduction in power supply
noise caused by skewing of the drivers was demonstrated. In the case study the
maximum power supply noise caused by a 32-bit bus was reduced by 30% with
a 400 ps skewing time. In the next chapter, the derived switching current model







In digital circuits a large number of logic elements and drivers switch nearly si-
multaneously at the clock edge. This switching places a burden on maintaining
a stable operating voltage. Significant switching currents are caused by global
on-chip communication, since buses are driven with large drivers. Current peaks
in turn cause resistive RI drop and inductive Ldi/dt noise in the power distribu-
tion network. On-chip decoupling capacitors help to reduce the burden on the
power distribution network by acting as temporary charge storage. However,
these capacitors may occupy a considerable area on the chip. In [95, 96, 97] alter-
native methods for peak current or power reduction in buses have therefore been
proposed. The reduction is achieved with additional encoding circuits or wires.
The impact on noise in the surrounding power supply network is also not mod-
eled. In addition to peak current reduction, crosstalk reduction is an important
part of bus design. The adverse effects of crosstalk include increased delay or
delay variation, voltage peaks on quiet wires and increased energy consumption.
Methods for reducing the effects of crosstalk include bus encoding [98, 99, 100],
wire spacing adjustments [101], buffer insertion [33], shielding [102, 103] and
gate sizing [35, 104].
Wire spacing and shielding may dramatically increase the area of a bus.
Buffer insertion in turn increases power consumption and area. Bus encod-
ing requires a smaller area than simple shielding but also additional logic. In
principle, the reduction in crosstalk in these methods is achieved as a trade-off
between noise and circuit area or power. On the other hand, gate sizing, i.e.
altering of driver strengths, is problematic in buses since in a typical bus the
wires have the same size and separation distance, which causes them to act
equally as both an aggressor and a victim.
In this chapter, both functional crosstalk and power supply noise are simultaneously
reduced using primarily another system resource, i.e. time. In
the previous chapters, the crosstalk noise on a bus and the power supply noise
caused by a bus have been modeled and analyzed. In this chapter, the models
are applied to reduce these two noises. Time, or timing slack, has been pre-
viously used in buses for other purposes such as in [105, 106] where the delay
of a coupled bus was reduced by intentionally skewing the timing of adjacent
wires. Skewing was again applied in [107] to reduce the energy dissipation of a
coupled bus and in [108] where it was used to reduce bus peak power. It should
be noted that while intentional skewing has previously been used to reduce such
crosstalk effects as crosstalk induced delay increase [105] and crosstalk energy
dissipation [107] in buses, in this chapter crosstalk voltage induced on quiet
victim wires is reduced as first proposed in [71, 109]. Both inductive and ca-
pacitive crosstalk are also analyzed with an analytical RLC bus model instead
of an RC model. In addition, the power supply noise caused by an on-chip bus
is simultaneously reduced. Unlike many other methods [95, 96, 108] that have
reduced the peak current or power of a switching on-chip bus, the actual impact
of the peak current reduction on noise in the power supply grid is included in
the model. Since the method is primarily based on a trade-off between noise
and delay instead of circuit area or power, it is well suited for area limited
cases, where wire shielding or additional encoding wires are often not available.
Another problematic issue for crosstalk reduction is inductive coupling due to
its long range effects that reduce the effectiveness of wire shielding and spacing.
The proposed method, however, is demonstrated to be effective also for reducing
inductive noise.
5.1 Reduction of Crosstalk and Power Supply
Noise
Skewing can be applied in several ways to a bus. For example, in [105] the
skewing to reduce the delay of a coupled bus was performed by adding a relative
static delay between adjacent wires. In effect, the bus was divided into two parts:
every other wire switched normally at the clock edge and the others after an
imposed delay. This division is however not necessarily effective in reducing
functional crosstalk noise, since the two closest aggressors of a victim wire that
cause the majority of capacitive noise are still switching simultaneously. This
problem can however be avoided by dividing the wires into several groups as
depicted in Fig. 5.1. The relative switching times of the bus drivers are skewed
by inserting a static delay to part of the wires. This also eases the burden on the
power supply network, since there are fewer drivers switching simultaneously.
The skewing time of each driver is a multiple of the interval time Tint. On
the left hand side of the figure, the wires are divided into two groups with two
different skewing times, namely, 0 and Tint. On the right, there are five different
skewing times; 0, Tint, 2Tint, 3Tint and 4Tint. Three and four skewing times














Figure 5.1: Skewed inputs to a bus using 2-5 different skewing times. The panels
show 2, 3, 4 and 5 skewing times, with switching instants at multiples of Tint
starting from t = 0.
two closest aggressors on both sides of any victim wire do not switch at the
same time when there are three or more different skewing times. The crosstalk
pulse induced on a quiet interconnect is thus lowered. This is demonstrated in
Fig. 5.2. The figure shows the noise waveform induced on the quiet 4th wire in
the middle of an 8-bit bus when all other wires are switching. There were four
different skewing times as demonstrated in Fig. 5.1. The wires were 2 mm long
and had a rise time of 100 ps. The noise was calculated using three different
interval times Tint. With a zero interval time, all wires switch simultaneously.
In this case, the maximum crosstalk noise on the quiet interconnect was 35%
of Vdd. With an interval time of 250 ps, the maximum noise was reduced to
27% of Vdd. By increasing the interval time to 500 ps, the maximum noise was
further reduced to 19% of Vdd.
Fig. 5.3 shows the switching current of the same 8-bit bus. When all in-
terconnects switched simultaneously, the peak current was 3.6 mA. With an
interval time of 250 ps, the peak current was reduced to 1.8 mA, while an in-
terval time of 500 ps reduced the peak current further to 1.4 mA. The current
waveform formed four distinct peaks, since there were four different skewing
times.
Figure 5.2: Crosstalk voltage on a quiet interconnect in an 8-bit bus with interval
times of 0 ps, 250 ps, and 500 ps.
Figure 5.3: Switching current of an 8-bit bus with interval times of 0 ps, 250 ps,
and 500 ps.
5.2 Bus and Power Supply Noise Modeling
5.2.1 Crosstalk Noise and Delay under Bus Skewing
To be able to determine a suitable skewing time, the crosstalk noise and power
supply noise need to be evaluated efficiently and accurately. The skewing time
should be selected as small as possible to avoid excessive delays, while still
fulfilling signal integrity and power supply noise requirements. In the previous
chapter, the switching current of a coupled bus was derived. The model is used
in this chapter for a bus with skewing. The delay and crosstalk voltage on a
quiet wire in a skewed bus can be calculated as follows.
The drivers are modeled as exponential voltage sources VS with a source






where $\tau_j$ is the skewing time when the jth driver starts switching as shown in
Figure 5.1, u is the unit step function and $t_r$ is the exponential rise time. The
skewing time of each interconnect is defined as a multiple of the interval time $\tau_{int}$
\[
\tau_j = [(j - 1) \bmod p] \, \tau_{int} \tag{5.2}
\]
where p is the number of different skewing times.
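The assignment rule (5.2) is easy to tabulate. The sketch below lists the skewing time of each wire and checks whether any wire has both adjacent aggressors in the same skew group, which is the situation the grouping is meant to avoid. The function names are hypothetical and times are in ps.

```python
def skew_times(n_wires, p, t_int):
    """Skewing time of wires 1..n_wires following (5.2): [(j - 1) mod p] * t_int."""
    return [((j - 1) % p) * t_int for j in range(1, n_wires + 1)]

def neighbours_switch_together(n_wires, p):
    """True if some wire has both adjacent aggressors in the same skew group."""
    groups = [(j - 1) % p for j in range(1, n_wires + 1)]
    return any(groups[k - 1] == groups[k + 1] for k in range(1, n_wires - 1))

print(skew_times(8, 4, 250))
for p in (2, 3, 4, 5):
    print(p, neighbours_switch_together(8, p))
```

With two groups the wires on either side of any interior wire fall into the same group, whereas with three or more groups they never do, because their group indices differ by 2 modulo p.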
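The receiver is modeled as a capacitive load. The relation between the input
voltage $V_S$ and the output voltage $V_{out}$ of a single wire is
\[
\frac{V_{out}}{V_S} = \frac{Z_L}{(a_{11} + Z_S a_{21}) Z_L + a_{12} + Z_S a_{22}} \tag{5.3}
\]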
where a are defined in (4.6) and ZS and ZL are the source and load impedances,
respectively. Similarly to the calculation of current in the previous chapter, the
voltage Vout at the end of a single wire is obtained by combining (5.1) in s-








For multiple coupled wires, by combining (5.4), (2.9) and (2.20) the output










where Vi(t, τj) is the output voltage (5.4) of the ith decoupled interconnect as
a function of time t and skewing time τj .
The maximum crosstalk noise on a quiet victim wire is obtained when all
other wires are switching. In a bus the interconnect in the middle of the bus is
generally the most susceptible to crosstalk noise. The crosstalk noise waveform
on any interconnect can be obtained from (5.5). To determine the maximum
value of crosstalk noise, the maximum of that equation needs to be found.
However, it cannot be derived analytically. The maximum was searched for with
Halley’s method. Halley’s method was applied instead of the Newton-Raphson
method since in practical experiments it was less sensitive to the starting point.
As a starting point for the search (3.15) was used. Due to the intentional
skewing the aggressors start switching at different times and thus the voltage
waveform on the victim can have several peaks depending on the number of
different skewing times as demonstrated in Fig. 5.2. The starting points for the
search were therefore set as the sum of (3.15) and the time when each group of
aggressors started switching.
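A minimal sketch of such a search is given below. It applies Halley's iteration to the derivative of a waveform, so that the root found is the time of the peak. The double-exponential pulse is a hypothetical stand-in for the crosstalk waveform of (5.5), chosen because its peak time is known analytically; the starting point is likewise an arbitrary assumption rather than the value from (3.15).

```python
import math

def halley(g, dg, d2g, x0, tol=1e-12, max_iter=50):
    """Find a root of g with Halley's iteration, starting from x0."""
    x = x0
    for _ in range(max_iter):
        gx, dgx, d2gx = g(x), dg(x), d2g(x)
        step = 2.0 * gx * dgx / (2.0 * dgx * dgx - gx * d2gx)
        x -= step
        if abs(step) < tol * max(1.0, abs(x)):
            break
    return x

# Hypothetical stand-in for a crosstalk pulse: v(t) = exp(-t/t1) - exp(-t/t2),
# times in ps. Its peak is where the derivative g(t) = v'(t) crosses zero.
t1, t2 = 200.0, 50.0
g = lambda t: -math.exp(-t / t1) / t1 + math.exp(-t / t2) / t2
dg = lambda t: math.exp(-t / t1) / t1**2 - math.exp(-t / t2) / t2**2
d2g = lambda t: -math.exp(-t / t1) / t1**3 + math.exp(-t / t2) / t2**3

t_peak = halley(g, dg, d2g, x0=50.0)
t_exact = math.log(t1 / t2) * t1 * t2 / (t1 - t2)   # analytic peak time
print(t_peak, t_exact)
```

For a skewed bus the same routine would be called once per aggressor group, each time with a starting point shifted by the switching time of that group, and the largest of the resulting peaks kept.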
The 50% propagation delay of any wire in the bus can be obtained by set-
ting (5.5) equal to 0.5Vdd. The equation can be used to analyze all driver
switching patterns by including downward switching drivers with a minus sign
in the sum. The solution to the equation was again obtained with Halley’s
method. The starting point was obtained from (3.15). Since the equation is
for a single wire, a better starting point was obtained by modifying the total
wire capacitance term in it by multiplying the coupling capacitance with the
appropriate Miller coupling factor depending on the activity of adjacent wires.
5.2.2 Power Supply Noise under Bus Skewing
In order to evaluate analytically the effect of the reduced number of simulta-
neously switching drivers on power supply noise, the power supply grid was
modeled as a network of RLC segments as in [94] as discussed in the previous
chapter. Since the original power grid model does not include non-simultaneous
switching, the characterization of the switching devices was modified in order to
evaluate the change in power supply noise as a function of skewing [110]. The
switching devices in the power grid model are characterized as switching ca-
pacitors with a pre-characterized load capacitance of Cload and switching time
ts. An on-chip bus to which skewing can be applied was instead characterized
using two pre-characterized values of Cload and ts. One characterization was
performed normally with HSPICE for the bus with no skewing, while the other
was performed for a chosen interval time Tint. When the interval time of a bus
changes, its load capacitance also changes since there is a change in the Miller
coupling capacitance. The load capacitance Cload for any interval time was
calculated using (4.18). The corresponding new switching time ts of the load
capacitor for any interval time was then obtained from the two pre-characterized
values by approximating the dependence between Cload and ts as linear. In the
simulations, in addition to the regular characterization of the bus with zero
skew, the bus was also characterized using an interval time of 500 ps.
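The linear approximation between the two characterizations amounts to a two-point interpolation. The sketch below shows the idea; the characterization values are hypothetical placeholders and do not come from the thesis, and `switching_time` is an illustrative name.

```python
def switching_time(c_load, char_a, char_b):
    """Linearly interpolate ts for a given Cload from two (Cload, ts) points."""
    (c_a, ts_a), (c_b, ts_b) = char_a, char_b
    slope = (ts_b - ts_a) / (c_b - c_a)
    return ts_a + slope * (c_load - c_a)

# Hypothetical characterization points (Cload in pF, ts in ps), standing in for
# the zero-skew and Tint = 500 ps characterizations.
no_skew = (4.0, 150.0)
skew_500 = (3.0, 1650.0)
print(switching_time(3.5, no_skew, skew_500))
```

Here `c_load` would be the value computed from (4.18) for the interval time of interest, so only two HSPICE characterizations are needed for the whole sweep.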
5.3 Model Verification
The accuracy of the model was verified by comparing it to HSPICE. The verifi-
cation was performed using several different driver rise times, source resistances,
load capacitances, bus lengths and interval times. Two different buses and power
grids were used in the verification. The RLC parameters for the bus and power
grid were extracted using FastHenry [90] and Linpar [89]. The first case was
an 8-bit bus whose wires were 145 nm wide, and 319 nm thick with a separa-
tion distance of 145 nm. The power supply network was a 10×10 grid whose










































Figure 5.4: Maximum crosstalk noise on a 2 mm 32-bit bus as a function of wire
separation distance and interval time.
placed in the middle of the power grid. The verification results for this setup
are shown in the upper half of Table 5.1. Verification results were calculated for
maximum crosstalk voltage, worst-case power grid voltage and worst-case 50%
propagation delay. The delay of the skewed bus was calculated from the first
input switching to the last output switching. Four different skewing times were
used.
The other verification case was a 16-bit bus whose wires were 300 nm wide and
319 nm thick, with a separation distance of 400 nm. The power distribution
network was now a 20×20 grid whose sides were 1 mm long. The verification
results for this case are shown in the lower half of Table 5.1. The average error
between the model and HSPICE was 1.4%, while the maximum error was 12.9%.
5.4 Case Study and Implementation
5.4.1 Reduction of Inductive Crosstalk Noise
Inductive coupling has become a concern in global interconnects due to its long
range. Unlike capacitive coupling, inductive coupling is reduced only slowly
with distance or signal line insertion [111]. Even shielding with Vdd or ground
lines may eliminate only part of the inductive coupling [112]. In this section,
the applicability of the proposed skewing method is demonstrated in reducing
inductive crosstalk noise.
Table 5.1: Verification results for crosstalk voltage, worst-case supply voltage,
and propagation delay using different bus sizes and interval times
Columns: source resistance (Ω), load capacitance (fF), rise time (ps), bus
length (mm), interval time (ps), maximum crosstalk voltage (V, two columns),
worst-case supply voltage (V, two columns), propagation delay (ps, two columns).
Each two-column pair holds the model and HSPICE values.
8-bit bus: wire width 145 nm, wire distance 145 nm, wire thickness 319 nm
(Ω) (fF) (ps) (mm) (ps) (V) (V) (V) (V) (ps) (ps)
250 10 25 2 0 0.460 0.469 0.925 0.925 401 406
500 10 25 2 50 0.441 0.451 0.949 0.952 601 608
1000 10 25 3 100 0.450 0.458 0.963 0.968 1475 1471
250 50 50 3 150 0.371 0.373 0.956 0.960 1214 1218
500 50 50 4 200 0.405 0.409 0.964 0.967 2104 2105
1000 50 50 4 0 0.412 0.416 0.958 0.958 2152 2149
250 100 100 5 50 0.380 0.379 0.951 0.952 2436 2435
500 100 100 5 100 0.381 0.382 0.961 0.962 2833 2832
1000 100 100 6 150 0.396 0.398 0.969 0.971 4571 4559
250 10 25 6 200 0.490 0.492 0.948 0.952 3257 3229
500 10 25 7 0 0.495 0.497 0.952 0.952 4225 4115
1000 10 25 7 50 0.487 0.492 0.955 0.957 5124 4868
250 50 50 8 100 0.462 0.460 0.945 0.947 5309 5272
500 50 50 8 150 0.461 0.460 0.957 0.962 5878 5802
16-bit bus: wire width 300 nm, wire distance 400 nm, wire thickness 319 nm
250 10 25 2 0 0.144 0.160 0.915 0.915 134 144
500 10 25 2 50 0.130 0.142 0.943 0.952 299 306
1000 10 25 3 100 0.134 0.140 0.965 0.969 686 694
250 50 50 3 150 0.122 0.108 0.962 0.962 710 718
500 50 50 4 200 0.127 0.126 0.969 0.967 1120 1127
1000 50 50 4 0 0.127 0.131 0.964 0.964 708 718
250 100 100 5 50 0.120 0.125 0.947 0.947 906 917
500 100 100 5 100 0.120 0.124 0.960 0.960 1171 1181
1000 100 100 6 150 0.124 0.127 0.971 0.972 1895 1905
250 10 25 6 200 0.152 0.153 0.948 0.954 1460 1467
500 10 25 7 0 0.153 0.158 0.936 0.936 1422 1422
1000 10 25 7 50 0.148 0.152 0.957 0.962 1874 1852
250 50 50 8 100 0.146 0.141 0.936 0.946 1890 1899
500 50 50 8 150 0.144 0.147 0.952 0.960 2261 2265
Fig. 5.4 shows the influence of wire separation distance and skewing on
crosstalk noise in a 2 mm 32-bit bus. The wires were 1.2 µm wide and 0.319 µm
thick. The rise time was 100 ps. The separation distance between the wires
was varied at 0.4 µm intervals from 0.4 µm to 2.4 µm. Skewing was applied
to the bus with the interval time Tint varying from 0 to 250 ps. Four different
skewing times were used. The RLC matrices were extracted using field solvers
for each separation distance, and the maximum crosstalk noise was calculated
with the model for each case. To obtain the maximum crosstalk voltage, the
closest wires on both sides of the quiet victim were switching up to maximize
capacitive crosstalk noise, while wires farther away were switching down to max-
imize inductive crosstalk noise. The maximum crosstalk noise with no skewing
and a separation distance of 0.4 µm was 0.20 V. Since the wires were wide with
a low per-unit-length resistance and ground capacitance dominating coupling
capacitance, the majority of crosstalk noise was caused by inductive coupling.
Setting the inductive coupling terms to zero showed that only 0.05 V of the
0.20 V maximum crosstalk noise was caused by capacitive coupling.
Fig. 5.4 shows that by increasing the separation distance between wires,
while keeping the wire properties otherwise unaltered, the noise was reduced.
The benefit of increased wire separation was, however, largest initially, with
less further reduction in noise at large wire separation distances. As
can be seen, skewing was capable of reducing even the inductive noise. With
an interval time of 250 ps, the maximum crosstalk noise was reduced to 0.05 V.
Further increases in the interval time would not have yielded more reduction in
crosstalk noise since the aggressor waveforms induced on the victim were not
overlapping any more. It is possible to use skewing together with increased wire
separation distance according to available system resources such as routing area
and timing slack. For example, as seen in the figure, the maximum noise could
be reduced to 0.05 V also with a combination of separation distance of 0.8 µm
and interval time of 150 ps.
5.4.2 Reduction of Power Supply Noise using Different Methods
Many different methods may be applied to mitigate power supply noise. In
this section, the applicability of the model to such analysis is demonstrated and
skewing is compared to other methods. The analysis was performed for a square
10×10 power distribution grid with a segment length of 100 µm. Two 16-bit
buses were placed in the grid. The nodes were numbered as in Fig. 4.5. The
driver end of the first bus was placed at node 25, while the driver end of the
second bus was at node 58. The bus at node 25 was 2 mm long, while the bus
at node 58 was 4 mm long. Both buses had a wire width of 300 nm and a wire
separation distance of 400 nm. The rise time of the drivers was 50 ps. The
power grid wires were 1 µm wide and the corner nodes of the grid, i.e. nodes
1, 10, 91 and 100, had a constant operating voltage simulating connections to
package pins. The resulting worst case drop in the operating voltage due to
the switching buses was calculated with the model and is illustrated in Fig. 5.5.
The average noise in the power grid nodes was 53.2 mV.
Figure 5.5: The effects of different power distribution noise reduction methods.
The difference in maximum propagation delay between the 2 mm and 4 mm
buses was 400 ps. Assuming that both buses operate under the same clock,
there is slack available in the shorter bus to apply bus skewing. The available
slack was used for skewing the shorter bus with four different skewing times.
This reduced power distribution noise in the vicinity of the 2 mm bus as seen
in Figure 5.5. The average noise in the grid nodes was reduced by 9.8 mV.
Since there was no slack available for the longer bus, power distribution noise
caused by it was reduced with decoupling capacitors. Two 10 pF decoupling
capacitors were placed at nodes 59 and 48, which reduced noise in the nodes
close to the 4 mm bus. The average noise reduction was 7.0 mV. The effect of
power grid lines was analyzed by increasing the power grid line width from 1 µm
to 2 µm. This resulted in a considerable overall reduction in power distribution
noise of 24.1 mV, albeit at a high cost in circuit area. Finally, all methods
were applied together. The reduction in average noise was 31.5 mV, and the
worst-case operating voltage was 0.95 V or more in all nodes, including the
hotspots.
5.4.3 Implementation and Reduction of Crosstalk using Different Methods
In order to apply the skewing method, delays need to be inserted into most of
the lines. A common way to create a delay is a straightforward inverter chain.
Intentional delays for buses have previously been implemented as additional
wire doglegs [113], or with flipflops with adjustable dynamic delays [107]. For












Figure 5.6: Implementation of skewed bus. The driver side flipflops numbered 2-
4 are delayed. The dotted area in the lower left corner shows the implementation
of the power and ground lines used in the HSPICE analysis.
delayed, precharacterized flipflops. The delay was achieved by adjusting the
flipflop transistor sizes. Implementation of the skewed bus using four skewing
times is shown in Fig. 5.6. It should be noted that while process variations can
affect timing, the skewing is applied to neighboring gates that typically have a
strong spatial correlation, thus limiting the effect on the relative delays between
the lines.
The skewing was implemented using 65 nm technology. In order to determine
the influence of mismatch and power noise on the relative delays, the flipflops
and drivers were simulated with HSPICE using a mismatch technology library and
noisy power and ground rails. The outputs of the flipflops and drivers are shown
in Fig. 5.7. As can be seen, the different delays are clearly distinguishable. In
order to avoid significant changes in driver rise/fall times, the output transistors
of the flipflops were not altered, although this would have provided more delay.
The 20%–80% rise times of the skewed drivers remained between 48 ps and 58 ps.
If flipflop adjustments are not sufficient in smaller technologies, simple
inverter chains could be used instead of, or in combination with, the delayed
flipflops.
The skewing method was also implemented together with other typical cross-
talk reduction methods, namely shielding and increased wire separation dis-
tance. A 2 mm long 8-bit bus was used. The drivers were 40x inverters and
receivers 10x inverters. In order to rapidly simulate the power supply noise
caused by the switching bus drivers on the power supply network, the power
and ground rails were modeled as in [91]. The resistance, capacitance and
inductance of the power and ground rails shown in Fig. 5.6 were 2 Ω, 0.2 pF and
2 nH, respectively.
Figure 5.7: Skewed flipflop and driver outputs under mismatch and noisy power
and ground rails. Interval time is 100 ps and operating temperature 75 °C.
The simulation results are shown in Table 5.2. The results for using the
proposed skewing method alone are shown in the first section. When the interval
time Tint was zero, the bus acted as a regular bus. As can be seen, the maximum
crosstalk noise was reduced from 0.63 V to 0.50 V when the interval time was
increased to 100 ps. Since the drivers switched at different times, the load on
the power rails was also reduced. The original worst case Vdd was improved
from 0.90 V to 0.95 V with a 100 ps interval time, thus effectively halving the
power supply noise. The energy required to transmit a byte across the 8-bit bus
was obtained as the average of all possible switching combinations (up, down,
quiet). As can be seen, the delayed flipflops increased the energy consumption
slightly by 0.6%. The average energy consumption of the wires themselves did
not change when skewing was applied. This was because, as skewing was
increased, the Miller coupling factor increased for wires switching in the same
direction and decreased for wires switching in opposite directions, so the two
effects canceled each other. Further analysis of bus energy consumption under
delayed inputs
can be found in [107].
The second section of the table shows the results for the combined use of
skewing and shield wires. Shield wires were inserted between signal wires. The
area of the bus was consequently doubled. The shielding alone effectively re-
duced crosstalk from 0.63 V to 0.08 V. Use of skewing helped to reduce the
crosstalk noise further to 0.03 V. Shielding did not improve the worst-case
operating voltage, but skewing again improved it effectively, to 0.96 V.
Table 5.2: HSPICE simulation results for the skewing method alone and in
combination with shielding and increased wire separation distance
Tint Max. crosstalk Worst case Vdd Avg. energy/byte Peak power
(ps) (V) (V) (pJ) (mW)
Skewing only
0 0.63 0.90 0.988 5.14
15 0.62 0.91 0.989 5.22
50 0.59 0.94 0.990 4.85
100 0.50 0.95 0.994 4.19
Combination of skewing and shielding
0 0.08 0.89 0.988 4.59
15 0.06 0.90 0.988 4.45
50 0.05 0.94 0.989 4.03
100 0.03 0.96 0.993 3.14
Combination of skewing and increased wire separation distance
0 0.34 0.89 0.513 3.98
15 0.33 0.90 0.514 3.92
50 0.30 0.94 0.514 3.36
100 0.25 0.96 0.519 2.72
The third section of the table shows the same area-doubled bus with the
shield wires removed, so that there is an increased separation distance between
the signal wires. Increasing the separation distance was not as efficient in reduc-
ing crosstalk noise as shielding, as the crosstalk noise was reduced from 0.63 V
to 0.34 V. The dissipated energy was however clearly lower, since the total ca-
pacitance of wires was smaller due to the increased separation distance between
them. When skewing was added the crosstalk noise was further reduced from
0.34 V to 0.25 V. The increased wire separation distance had no effect on the
worst case operating voltage, while skewing improved it to 0.96 V as previously.
The increase in average energy consumption due to skewing was 1.2%.
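The percentages quoted above follow directly from the Table 5.2 entries at
Tint = 0 ps and Tint = 100 ps. The short check below is only an arithmetic
illustration; the helper name is hypothetical, and the nominal supply voltage
of 1.0 V is an assumption inferred from the 0.1 V and 0.05 V noise figures
given in the chapter summary.

```python
NOMINAL_VDD = 1.0  # assumption: inferred from the 0.1 V -> 0.05 V noise figures

# (Tint = 0 ps, Tint = 100 ps) average energy per byte in pJ, from Table 5.2
energy_pj = {
    "skewing only": (0.988, 0.994),
    "skewing and shielding": (0.988, 0.993),
    "skewing and increased separation": (0.513, 0.519),
}
# Worst case Vdd for skewing only at Tint = 0 ps and Tint = 100 ps
worst_vdd_skewing_only = (0.90, 0.95)


def percent_increase(before, after):
    """Relative increase of `after` over `before`, in percent."""
    return 100.0 * (after - before) / before


for name, (e0, e100) in energy_pj.items():
    print(f"{name}: {percent_increase(e0, e100):.1f}% more energy per byte")

noise_before = NOMINAL_VDD - worst_vdd_skewing_only[0]
noise_after = NOMINAL_VDD - worst_vdd_skewing_only[1]
print(f"supply noise ratio after/before: {noise_after / noise_before:.2f}")
```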
The use of shield wires was the most effective method in reducing crosstalk
noise, although at the cost of a doubled bus area. Doubling the bus area and
using it for increasing the wire separation distance was less effective in crosstalk
reduction, but the average dissipated energy was clearly lower. The skewing
method did not require an increase in the bus area besides the resizing of the
flipflops, and it was the only one to improve power supply noise, but the prop-
agation delay was increased instead. The bus was skewed using four different
skewing times for the drivers, i.e. the maximum skewing time was 3 × Tint.
The maximum propagation delay occurred when the drivers were switching in
opposite directions to maximize the influence of crosstalk, while the minimum
delay occurred when drivers were switching in the same direction. Skewing
of the drivers reduced the correlation between the switching events, thus
altering the Miller capacitance. Because of this, the overall increase in the
propagation delay of the bus was less than or equal to 3 × Tint.
Figure 5.8: Worst-case propagation delays as a function of interval time for
three different bus structures; normal bus, bus with shield wires, and a bus with
increased separation distance between wires.
The change in the propagation delay of the analyzed 8-bit bus is shown in
Figure 5.8. The delays are shown as a function of interval time for the original
bus with skewing, the bus with shield wires and skewing, and the bus with in-
creased separation distance and skewing. The dashed lines show the theoretical
maximum propagation delay due to the applied skewing, that is, an increase
of 3 × Tint, while the solid lines show the actual delay. The difference between
the two is largest for the original bus, which had the largest coupling between
signal wires, while for the shielded bus the lines are nearly identical since there
is very little coupling between the signal wires. The total performance penalty
depends on how much timing slack there is available for the bus or the use of
techniques such as a multicycle time-borrowing bus. In some cases skewing can
even be applied to reduce the overall propagation delay [105]. The presented
model can be applied in the analysis to calculate the propagation delay of the
bus under skewing.
5.4.4 Influence of the Number of Skewing Times
The maximum crosstalk noise and power supply noise do not depend solely on
the interval time, but also on the number of different skewing times. It is possible
to apply the bus and power grid model to any number of different skewing times.
Figure 5.9: Maximum crosstalk noise as a function of interval time using 2-6
different skewing times in a 1-mm 32-bit bus.
The minimum number of different skewing times is one, in which case all drivers
switch simultaneously as normal. On the other hand, the maximum number of
different skewing times is equal to the number of wires, in which case all drivers
switch at a different time. The effect of the number of skewing times on the
maximum crosstalk voltage on a quiet wire in a 1 mm 32-bit bus was calculated
using the model and is shown in Fig. 5.9.
The bus wires were 145 nm thick with an equal separation distance; the rise
time was 100 ps and the source resistance 200 Ω. As can be seen, an increase in
the number of skewing times helped to further reduce crosstalk noise, although
after four or more skewing times the effectiveness was reduced. The reason for
this can be seen from Fig. 5.1: when three or more skewing times are used, the
two neighboring aggressors on either side of any victim wire do not switch at
the same time; instead, there is a time of at least Tint between them. When the
number of skewing times is increased
to four, the time between the two aggressors switching is increased to 2Tint,
helping to further reduce crosstalk voltage. When the number of skewing times
is increased to five or six, the time between the two closest aggressors in the
worst case remains at 2Tint, and the slight reduction in maximum crosstalk over
four skewing times seen in Fig. 5.9 is gained from aggressors farther away.
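The spacing argument above can be reproduced with a few lines of code. The
sketch assumes that the skewing times are assigned to the wires cyclically,
i.e. wire i is delayed by (i mod k) × Tint for k skewing times. This assignment
is an assumption made for illustration, since Fig. 5.1 is not reproduced here,
and the function name is hypothetical.

```python
def min_neighbor_separation(num_wires, num_skew_times):
    """Smallest switching-time gap, in units of Tint, between the two
    nearest aggressors of any interior victim wire.

    Assumes wire i is delayed by (i mod num_skew_times) * Tint.
    """
    slots = [i % num_skew_times for i in range(num_wires)]
    # For victim v, the nearest aggressors are wires v - 1 and v + 1.
    gaps = [abs(slots[v - 1] - slots[v + 1]) for v in range(1, num_wires - 1)]
    return min(gaps)


for k in range(1, 7):
    print(k, min_neighbor_separation(32, k))
```

Under this assumption the worst-case gap is zero for one or two skewing times,
Tint for three, and 2Tint for four, five and six, which matches the behavior
described in the text.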
Figure 5.10: Worst case operating voltage caused by a 1-mm 32-bit bus as a
function of interval time using 2-8, 16 and 32 different skewing times.
Fig. 5.10 shows the worst-case operating voltage caused by the same 1-mm 32-bit
bus as a function of interval time. The bus was located in a power
grid with 1 µm wide power wires. Power was supplied from the corners of the
grid and all drivers were switching from low to high. The number of different
skewing times used was 2-8, 16 and 32. The results were calculated with the
model. The power supply noise caused by the bus was reduced as the number
of skewing times was increased. This was due to the reduction in the number of
drivers switching simultaneously. For example, with two skewing times half of
the drivers, i.e. 16, switched simultaneously, while with eight skewing times the
number was reduced to four. The lowest noise was achieved with 32 skewing
times, when all drivers switched at different times. In this case the total delay
however becomes impractically large, and the implementation of 32 different
delays is also problematic. Four different skewing times could be seen as a
reasonable compromise that also yielded an efficient reduction in crosstalk
noise, as seen in Fig. 5.9.
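The trade-off between simultaneously switching drivers and total added delay
can be tabulated with a small helper. This is an illustrative sketch only: the
function name is hypothetical, the wires are assumed to be spread evenly over
the skewing times, and the 100 ps interval time is just an example value.

```python
def skew_summary(num_wires, num_skew_times, t_int_ps):
    """Return (max drivers switching together, total skew in ps).

    Assumes the wires are spread as evenly as possible over the skewing
    times and that consecutive skewing times are t_int_ps apart.
    """
    simultaneous = -(-num_wires // num_skew_times)  # ceiling division
    total_skew_ps = (num_skew_times - 1) * t_int_ps
    return simultaneous, total_skew_ps


for k in (2, 4, 8, 16, 32):
    print(k, skew_summary(32, k, 100))
```

For a 32-bit bus this gives 16 simultaneous drivers with two skewing times and
four with eight, as stated above, while the total skew grows linearly with the
number of skewing times.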
5.5 Chapter Summary
In this chapter, an optimization method based on analytical RLC models was
proposed for the simultaneous reduction of both functional crosstalk noise and
power supply noise caused by on-chip buses. The reduction was achieved with
skewing
that does not require additional encoding/decoding circuitry or extra wires, but
instead acts mainly as a trade-off between time and noise. The static delays
were implemented using resized flipflops. This makes the method well suited
for area limited cases. The modeling of the bus crosstalk noise and propagation
delay and the power supply noise caused by the bus make it possible to find
a suitable value for the skewing time in the trade-off process. The model is
applicable to any number of bus wires and takes into account both capacitive
and inductive coupling between wires. The reduction in power supply noise due
to skewing in the surrounding RLC power grid was also included in the model.
The model was verified by comparing it to HSPICE in 65 nm technology. The
average error was 1.4%.
The method was found to be effective in reducing problematic long range
inductive crosstalk noise in a case study where the maximum crosstalk noise in
an inductance dominated bus was reduced from 0.20 V to 0.05 V. Since the data
or bus layout are not changed, the proposed method can be applied individu-
ally or together with other methods depending on area and timing constraints
to reach targeted crosstalk noise values. The skewing method was implemented
together with shielding and increased wire separation distance in 65 nm tech-
nology. HSPICE simulations showed an increase in bus energy dissipation due
to skewing of less than 1.2%. Skewing was capable of further reducing the func-
tional crosstalk noise levels achieved with the other methods, while also being
able to reduce the worst case power supply noise from 0.1 V to 0.05 V. The
reduction in power supply noise in a power grid was compared to decoupling
capacitors and double-width power lines. Skewing using a 400 ps slack reduced
the original average power supply noise of 53.2 mV by 9.8 mV, while for two
10 pF decoupling capacitors the reduction was 7.0 mV and for the wide power
grid lines 24.1 mV. When skewing was used together with the two other methods
the reduction in power supply noise was 31.5 mV. The influence of the number of
skewing times was studied, and four different skewing times were found to be a
reasonable compromise between the total delay and implemen-








Previously, two noise sources on buses, crosstalk and intersymbol interference,
were modeled and analyzed. The performance of a bus can also be affected
by another source, i.e. process variation. This variation affects both devices
and interconnects. On-chip buses are often driven and buffered with simple
inverters. However, in recent years, due to increasing delay, power and signal
integrity problems, bus encoding has been proposed to alleviate these issues [114,
115, 116, 117]. In bus encoding, there is additional logic that is used to encode
the input signals to the bus and decode them at the receiver end. This logic is
susceptible to process variation.
The devices are affected by e.g. variation in effective channel length, ox-
ide thickness, dopant concentration and threshold voltage [118], while the in-
terconnects are affected by thickness and width variation due to the copper
damascene chemical mechanical polishing (CMP) process and interference in
lithography [119]. CMP is used to planarize the interconnect or inter-layer di-
electric between adjacent metal layers, while the damascene CMP process is
used with copper interconnects to polish the deposited metal. This polishing
results in variation in wire thickness due to differences in wire densities across
the die [120]. In lithography, the main sources of process variation are focus,
exposure dose and mask variations [121].
In this chapter, a model for analyzing signaling over an on-chip bus consisting
of encoding circuitry, drivers, transmission lines, receivers and decoding circuitry
is proposed [122]. The wires are modeled as capacitively and inductively coupled
distributed RLC transmission lines. The driving point effective capacitance for
a bus driver is derived for the decoupling method. The delay of the signaling
circuitry and rise time of the drivers are characterized as a function of the load
capacitance. The effects of process variation are taken into account in both
the characterization of the signaling circuitry and in the wire analysis. The
overall delay variation of the bus due to process variation is then calculated.
This combination of analytical interconnect models and characterized logic can
be used to speed up statistical Monte Carlo simulations to analyze the effects
of process variation. The model is verified by comparing it to HSPICE. The
derived model is applied to analyze regular voltage mode, level-encoded dual-
rail (LEDR), and 1-of-4 signaling. The implementation and analysis are done
in 45 nm technology.
6.1 Bus Model
The bus structure considered in this chapter is shown in Fig. 6.1. The bus con-
sists of a number of input signals that are encoded and sent using voltage-mode
signaling over an arbitrary number n of wires to the receivers and decoder. The
total delay of the bus includes the delay of possible encoding circuitry, drivers,
wires, receivers, and possible decoding circuitry. Changes in the physical prop-
erties of a wire caused by process variation also affect its electrical properties,
i.e. resistance, inductance and capacitance. The interaction between a driver
and an interconnect with varying electrical properties can be modeled by em-
pirically precharacterizing the driver delays and rise times as a function of load
capacitance and input rise time. Circuit simulators such as SPICE can be used
in this characterization of non-linear devices. The interconnect can then be
analyzed by modeling the driver as a voltage source with the delay and rise
times corresponding to the load. The load to a driver has traditionally been
modeled as the total capacitance of an interconnect. Because of the scaling of
the interconnect sizes, the actual load seen by a driver is smaller due to resis-
tive and inductive shielding [123]. An RC [63] or RLC [124] input admittance
can be mapped to such an effective capacitance in order to maintain
compatibility with the existing efficient empirical driver models. In moment
matching
methods such as AWE [22], the input admittance models for the driving points
are generated at the same time as the approximate transfer functions. In this
section, the effective capacitance is derived instead for the decoupling method
that is used in this thesis.
In general, the delay td of a driver and the rise time tr of its output are
functions of the load capacitance and the rise time of its input, i.e.
$t_d = f(t_r^{in}, C_{load})$ and $t_r = g(t_r^{in}, C_{load})$. The driver
output waveform is modeled as a saturated ramp as shown in Fig. 6.2. The
encoder and driver were characterized with HSPICE by determining their rise
time tr and delay td from the encoder input to the driver output as a function
of load capacitance. In order to simplify the characterization without loss of
generality, it is assumed that the rise time $t_r^{in}$ of the encoder input
signals is constant. In order to connect the encoder and driver
characterization with the bus model, the driving point effective capacitance
seen by the drivers is derived. The circuit model of the bus is shown
in Fig. 6.3.
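A saturated ramp source of this kind can be written in a few lines. The sketch
below is a generic illustration: the function and parameter names are
hypothetical, and how the ramp start is placed relative to the characterized
delay td follows Fig. 6.2, which is not reproduced here, so `t_start` is left
as a free parameter.

```python
def saturated_ramp(t, v_dd, t_start, t_rise):
    """Voltage of a ramp that starts at t_start, rises linearly for t_rise
    seconds and then stays at v_dd."""
    if t_rise <= 0:
        raise ValueError("t_rise must be positive")
    fraction = (t - t_start) / t_rise
    return v_dd * min(1.0, max(0.0, fraction))


# Example values (placeholders): 1.0 V swing, ramp starting at 20 ps, 50 ps rise
samples = [saturated_ramp(t * 1e-12, 1.0, 20e-12, 50e-12) for t in (0, 20, 45, 70, 100)]
print(samples)
```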

































Figure 6.2: The saturated ramp model of the driver output waveform.
between them. The wires were modeled as distributed RLC transmission lines.
The bus is driven by the saturated ramp voltage source in Fig. 6.2. CL is the
load capacitance due to the receiver at the end of the wire.








In Chapter 4, the current flowing into a transmission line using a driver mapped
to linear circuit components was derived as (4.10). We apply it here to a prechar-
acterized driver instead in order to evaluate the effects of device process varia-





(a11 +RSa21)ZL + a12 + RSa22
. (6.2)
It should be noted that the value of RS is different from the one in Chapter 4,






































































Figure 6.3: Circuit model of the bus.
and (2.20), and applying (2.10), the current Ik flowing into the k-th wire in a























where m is the order of the terms included in the derivation and u(t) is the
unit step function. The current Ii is calculated using the diagonalized resis-
tance, capacitance and inductance values of R̂ii, Ĉii and L̂ii, respectively. The
effective capacitance seen by a driver was calculated by equating the average
currents drawn by the interconnect Ik and the effective capacitance IC over a











The initial rise time tr and delay td were set according to the total capac-
itance of the wire, and the effective capacitance seen by the driver was then
acquired iteratively using (6.5). In practice three to five iterations were needed
for the effective capacitance to converge. In order to determine the propagation
delay of the interconnects, (5.3) is used. By using (2.20) and (2.9), the far-end
voltage of the k-th wire in an n-bit bus is









where Vi is obtained by substituting (6.1) into (5.3) and using partial fraction









sp(t−tr) +Bm−1(t− tr) +Bm
)
u(t− tr). (6.7)
Possible non-switching bus drivers are not included in the summations of
(6.6) and (6.3), while downward switching is included with a minus sign. The
RLC transmission line matrices of the interconnects were extracted using analyt-
ical equations from [125] in order to rapidly evaluate the influence of different
wire properties. The cross-section view of the wire configuration is shown in
Fig. 6.4.
Figure 6.4: Cross-section view of wire configuration.
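The iterative effective capacitance calculation described earlier in this
section can be illustrated with a deliberately simplified load. The sketch
below does not use expression (6.5) for the coupled distributed RLC bus; it
swaps in a single lumped series R-C load, for which equating the average
currents over the ramp gives a closed form. The characterization line and all
numeric values are hypothetical placeholders.

```python
import math


def effective_capacitance(c_total, r_wire, rise_time_of, tol=1e-6, max_iter=100):
    """Fixed-point iteration for the effective capacitance of a lumped
    series R-C load driven by a saturated ramp.

    rise_time_of(c) is the pre-characterized driver rise time for load c.
    Returns (effective capacitance, number of iterations used).
    """
    c_eff = c_total  # initial guess: the driver sees the total capacitance
    for iteration in range(1, max_iter + 1):
        t_r = rise_time_of(c_eff)
        x = t_r / (r_wire * c_total)
        # Average current into the R-C load over the ramp, equated with the
        # average current c_eff * Vdd / t_r drawn by a pure capacitor.
        c_new = c_total * (1.0 - (1.0 - math.exp(-x)) / x)
        if abs(c_new - c_eff) <= tol * c_total:
            return c_new, iteration
        c_eff = c_new
    raise RuntimeError("effective capacitance did not converge")


def rise_time(c_load):
    """Hypothetical linear characterization: 20 ps plus 0.5 ps per fF."""
    return 20e-12 + 0.5e-12 * (c_load / 1e-15)


c_eff, n_iter = effective_capacitance(200e-15, 500.0, rise_time)
print(c_eff, n_iter)
```

The resistive shielding shows up as an effective capacitance below the total
capacitance; the number of iterations depends on the tolerance and on the
placeholder values chosen here.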
6.2 Signaling Techniques
The model is applied to two different asynchronous signaling techniques, namely
level-encoded two-phase dual-rail encoding (LEDR) [126] and two-phase 1-of-4
encoding [127]. In the asynchronous design approach, no global clock is used
and synchronization is applied instead. Two-phase or nonreturn-to-zero hand-
shaking uses signal transitions to assert data validity and reception. Two-phase
handshaking is preferred for long interconnects since it uses half of the number
of transitions of four-phase or return-to-zero handshaking.
The benefits of LEDR signaling include improved throughput and power
since it requires no reset phase and since only one transition occurs on a wire
per data bit transmission [128]. LEDR uses two wires to encode one bit of data.
One of the wires is the data wire which holds the bit value in standard single
wire encoding while the other wire indicates phase by its parity relative to the
data wire. The encoding in LEDR alternates between odd and even phases. The
encoding of a bit 1 is 01 in odd phases or 11 in even phases while the encoding of

























Figure 6.6: LEDR encoder implementation.
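The LEDR parity rule described above can be sketched in software as follows.
This is a behavioral illustration and not the gate-level encoder of Fig. 6.6.
Two points are assumptions: the two-digit codes are read as (phase rail, data
rail), which is consistent with bit 1 being 01 in odd phases and 11 in even
phases, and the sequence is taken to start in an even phase.

```python
def ledr_encode(bits):
    """Encode a bit sequence as (phase_rail, data_rail) pairs.

    Phases alternate even, odd, even, ... starting with an even phase
    (assumption). The data rail carries the bit; the phase rail is chosen
    so that the parity of the two rails is even in even phases and odd in
    odd phases.
    """
    codewords = []
    for index, bit in enumerate(bits):
        odd_phase = index % 2
        phase_rail = bit ^ odd_phase
        codewords.append((phase_rail, bit))
    return codewords


def transitions(previous, current):
    """Number of rails that change between two consecutive codewords."""
    return sum(a != b for a, b in zip(previous, current))


codes = ledr_encode([1, 1, 0, 1, 0, 0])
print(codes)
print([transitions(a, b) for a, b in zip(codes, codes[1:])])
```

Whatever the data, exactly one of the two rails changes between consecutive
codewords, which is the property that removes the reset phase.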
The encoder used in the analysis is shown in Fig. 6.6. The encoder takes
the request and data bits in single-rail encoding and converts them into LEDR
encoding [122]. An inverter was used as both driver and receiver in this signaling.
In LEDR signaling the decoded data is obtained directly from the output of the
data wire. The completion detection for N wires was performed with a C-element








Figure 6.7: LEDR completion detector implementation.
1-of-4 encoding uses a group of four wires to transmit two bits of informa-
tion per symbol. A symbol is one of the two-bit codes 00, 01, 10 or 11 and is
transmitted through activity on one of the wires. 1-of-4 encoding is therefore
less sensitive to crosstalk than single-line encoding. The straightforward imple-
mentation of the encoder is shown in Fig. 6.8. An inverter was used as both
driver and receiver also in this signaling. The decoder and completion detector
implementation are shown in Fig. 6.9. The gates and latches were used to de-





























Figure 6.8: 1-of-4 encoder implementation.
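The two-phase 1-of-4 scheme can be sketched behaviorally as follows. This is
not the gate-level encoder of Fig. 6.8; the function name is hypothetical, and
the mapping from symbol value to wire index is an assumption made for
illustration.

```python
def one_of_four_send(symbols, initial_levels=(0, 0, 0, 0)):
    """Return the wire levels after each two-bit symbol is sent.

    Two-phase signaling: each symbol toggles exactly one of the four wires.
    The symbol-to-wire mapping (symbol value = wire index) is an assumption.
    """
    levels = list(initial_levels)
    history = []
    for symbol in symbols:
        if not 0 <= symbol <= 3:
            raise ValueError("a symbol carries two bits, so it must be 0..3")
        levels[symbol] ^= 1  # a transition on one wire conveys the symbol
        history.append(tuple(levels))
    return history


history = one_of_four_send([0b10, 0b10, 0b01, 0b11])
print(history)
```

Each symbol costs a single wire transition for two bits of data, regardless of
which symbol is sent.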
6.3 Verification and Case Study
Channel length and threshold voltage are seen as the most important device
variation sources, while effective mobility is also emerging as an additional key
variation source [129]. The focus of the transistor process variation analysis in
this chapter is on threshold voltage variation, although the analysis can be
performed similarly for other sources. The dependence of the delay and rise
time of each signaling circuit on load capacitance was characterized using
HSPICE. The
HSPICE transistor models for 45 nm technology were obtained using Predictive



















Figure 6.9: 1-of-4 decoder implementation.
the threshold voltage. In addition to the normal 0.18 V threshold voltage,
0.15 V and 0.21 V were also used. Fig. 6.10 shows the rise time tr of the
LEDR encoder and driver as a function of load capacitance. The 50% delay
td of the LEDR encoder and driver is shown in Fig. 6.11. The curve-fitted
results were used in the effective capacitance calculations in the model.
Regular voltage mode and 1-of-4 signaling circuits were characterized in a similar
manner. A change in the physical properties of a wire, e.g. width or thickness
due to process variation, also caused the effective capacitance to change. Due to
the precharacterization of encoding circuitry as a function of load capacitance,
all changes in wire properties could be analyzed analytically instead of using
SPICE.
The voltage waveforms obtained with the model for driver output and wire
far-end are compared to HSPICE in Figs. 6.12 and 6.13. The comparison
was performed for a 16-bit bus, using the wire in the middle of the bus.
The results are shown for regular voltage mode signaling, where
the driver was a 200x inverter and the receiver was a 20x inverter. The receiving
inverter was represented in the model as a capacitive load CL at the end of the
wire, and this capacitance was extracted with HSPICE. In the first case, the
comparison was performed for a 1 mm long bus where the wire width W and
separation distance S were set to 68 nm, as approximated in the ITRS roadmap
for minimum global wiring pitch in 45 nm technology. The comparison was also
performed for a longer 3 mm bus where wire width and separation distance
were 135 nm. The wire thickness T and distance to ground H were 162 nm in both
cases. The input signal $t_r^{in}$ to the driver inverter was a falling ramp with a
50 ps fall time, as shown in the figures. The model and HSPICE results were close
to each other. The propagation delay of the wire was slightly underestimated
since the saturated ramp did not accurately capture the exponential tail of the
driver output. If needed, the exponential tail can be modeled with a source





























Figure 6.10: The rise time of the LEDR encoder and driver as a function of load
capacitance with different transistor threshold voltages.
























Figure 6.11: 50% delay of the LEDR encoder and driver as a function of load
capacitance with different transistor threshold voltages.
resistor as in [63].
The total delay variation of the bus was obtained by adding the delay of
the receiver and decoder to the 50% delay at the far-end of the wire. The
receiver and decoder delay was also characterized with HSPICE for different
threshold voltages. Table 6.1 shows the amount of delay variation in the 16-bit
bus for different signaling techniques. The delay variation was calculated as
the difference between the delay acquired in the presence of process variation
and the delay acquired without process variation. All signaling techniques had
a 200x inverter as a driver and a 20x inverter as the receiver. The length of
the bus was 3 mm and the wire width and separation distance were 135 nm.
The data to the encoders consisted of all inputs switching. For LEDR and 1-
of-4 signaling the 16-bit bus was encoded into 32 wires. The regular voltage
mode signaling had no encoding or decoding circuitry. As shown in the table,
the model was further verified by comparing the delay variation obtained with
HSPICE and the model. The delay variation was accurately modeled. The first
part of the table shows the delay variation due to threshold voltage variation.
A lower threshold voltage decreased the delay of the bus while a higher voltage
increased it. The second part of the table shows the delay variation when only
the wire properties vary. It was assumed that the wire pitch remained constant
while the wire width varied. As can be seen, for regular voltage mode signaling
the effect of wire variation on bus delay was clearly larger than the effect of
device variation. On the other hand, the LEDR and 1-of-4 signaling techniques
suffered more delay variation from transistor variation due to their encoding
and decoding circuitry. The third part of the table shows the delay variation
when both the threshold voltage and wire properties vary simultaneously.
Table 6.1: The total delay variation of the bus for different signaling techniques
                         Regular     Regular     LEDR        1-of-4
Variation source         (model)     (HSPICE)    (model)     (model)
Vth = 0.15 V             -2.2 ps     -2.7 ps     -28.8 ps    -40.2 ps
Vth = 0.21 V             +3.4 ps     +3.6 ps     +34 ps      +50.7 ps
Wire width -10%          +27.6 ps    +27.3 ps    +34 ps      +31 ps
Wire width +10%          -22.7 ps    -21.9 ps    -26 ps      -21 ps
Wire thickness -10%      +54.3 ps    +55.4 ps    +68 ps      +63 ps
0.15 V, width +10%       -25.1 ps    -24.7 ps    -53.9 ps    -61.2 ps
0.21 V, thickness -10%   +57.7 ps    +58.9 ps    +103 ps     +115 ps
The presented analytical model was also applied to demonstrate its use in
analyzing the statistical effects of wire variation. Conventional deterministic
static timing analysis (STA) is today often seen as inadequate due to the
rising importance of process variation. In STA, process variation is modeled by
running the analysis multiple times for different process conditions, thus
creating the so-called corner files [131]. The number of corner files and the
simulation time increase rapidly as the number of variation sources increases.
Also, STA























Figure 6.12: Comparison between HSPICE and the model for driver output
voltages and wire far-end voltages in a 16-bit 3 mm bus. Wire width and
separation distance are 135 nm.























Figure 6.13: Comparison between HSPICE and the model for driver output
voltages and wire far-end voltages in a 16-bit 1 mm bus. Wire width and
separation distance are 68 nm.
cannot accurately model within-die variations. To overcome these limitations,
statistical static timing analysis (SSTA) has been proposed. In probabilistic
SSTA [132], signal delays are treated as random variables or probability distri-
bution functions and propagated by performing statistical sum and maximum
operations. In Monte Carlo SSTA [133, 134], statistical samples are generated
with conventional STA methods to obtain the delay distribution. In this chap-
ter, Monte Carlo analysis was used to analyze the delay variation since the
presented model enables fast calculation of samples. Fig. 6.14 shows the de-
lay variation of the bus for different signaling techniques when the wire width
varies. The wire width was varied with a 3-sigma variation of 10% so that the
wire pitch remained constant. The Monte Carlo analysis was done with 1000
samples.
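The Monte Carlo procedure described above can be sketched in Python as follows.
This is a minimal sketch under stated assumptions, not the analytical model of
this chapter: `toy_delay_ps` is a hypothetical placeholder that uses the 0.38RC
delay of a distributed RC line with an assumed resistivity and an assumed
capacitance per millimetre that is held constant. Only the nominal width, the
thickness, the bus length, the 3-sigma value of 10% and the sample count are
taken from the text.

```python
import random
import statistics

NOMINAL_WIDTH_NM = 135.0     # 3 mm bus case from the text
THICKNESS_NM = 162.0         # wire thickness from the text
LENGTH_MM = 3.0              # bus length from the text
RESISTIVITY_OHM_M = 2.2e-8   # assumed effective copper resistivity
CAP_PER_MM_F = 0.2e-12       # assumed total wire capacitance per mm


def toy_delay_ps(width_nm):
    """Placeholder delay model: 0.38*R*C of a distributed RC line."""
    length_m = LENGTH_MM * 1e-3
    area_m2 = (width_nm * 1e-9) * (THICKNESS_NM * 1e-9)
    r_total = RESISTIVITY_OHM_M * length_m / area_m2
    c_total = CAP_PER_MM_F * LENGTH_MM  # held constant (simplification)
    return 0.38 * r_total * c_total * 1e12


def monte_carlo_delay_variation(samples=1000, three_sigma=0.10, seed=1):
    """Sample the wire width and return delay variations in ps."""
    rng = random.Random(seed)
    sigma_nm = three_sigma * NOMINAL_WIDTH_NM / 3.0
    nominal = toy_delay_ps(NOMINAL_WIDTH_NM)
    return [
        toy_delay_ps(rng.gauss(NOMINAL_WIDTH_NM, sigma_nm)) - nominal
        for _ in range(samples)
    ]


def summarize(variations):
    """Mean and sample standard deviation of the delay variation."""
    return statistics.mean(variations), statistics.stdev(variations)
```

Replacing `toy_delay_ps` with a call to the analytical bus model would give a
delay distribution of the kind plotted in Fig. 6.14; the value of a fast
analytical model is that each of the 1000 samples costs only one function
evaluation.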


















Figure 6.14: Bus propagation delay variation due to wire width variation in
regular, LEDR and 1-of-4 signaling.
Although the LEDR and 1-of-4 signaling techniques had in general larger
delay variation than regular voltage mode signaling, they are able to operate
correctly regardless of the delay since they are delay-insensitive. The correct
operation of regular synchronous voltage mode signaling is, however, susceptible
to delay variation. The proposed model can be applied to evaluate the increase
in delay variation caused by encoded signaling and to determine the need for
delay-insensitive encoding techniques.
6.4 Chapter Summary
In this chapter, a model to analyze the effects of process variation on delay
in on-chip bus signaling was developed. The model combined the variation
in signaling circuitry and in the wires to calculate the total delay variation
of the bus. The wires were modeled as distributed RLC transmission lines
including capacitive and inductive coupling between them. The effective load
capacitance was derived for the decoupling method. The signaling circuitry was
characterized as a function of its load capacitance. The driver delay and rise
time corresponding to the derived effective capacitance were used to calculate
the far-end voltage of a transmission line. The effects of process variation were
taken into account in the characterization of the signaling circuitry and in the
wire analysis. The overall delay variation of the bus due to process variation was
then calculated. The model was verified by comparing it to HSPICE. The delay
variation of regular voltage mode, level-encoded dual-rail and 1-of-4 signaling




Energy Modeling in RLC
Current-Mode Signaling
In the previous chapter, encoded signaling was addressed. In addition to
encoded signaling, there are other alternative signaling techniques for global
interconnects, such as current-mode signaling [12, 135, 136, 137, 138]. In
current-mode signaling, the major difference compared with voltage-mode
signaling is the resistive termination at the receiver end, because of which
there is no full voltage swing. In addition, the distributed wire capacitances
are not uniformly charged. Familiar voltage-mode models are therefore not
applicable, and accurate and efficient models need to be developed for
current-mode signaling for issues such as energy dissipation and propagation
delay.
In voltage-mode signaling, the dynamic energy dissipation has traditionally
been obtained from the well-known equation $E = C V_{dd}^2$, where C is the total
capacitance. Since an increasing portion of energy is dissipated in interconnects,
more accurate and extensive models have recently been presented for
voltage-mode signaling [139, 140, 141]. The traditional $C V_{dd}^2$ model fails to
predict the energy dissipation at high clock speeds where the signal transients
do not settle to a steady-state value [139]. The energy dissipated by wire
resistances during the transient switching is also not accurately included [140].
For a lumped RC circuit with a step input, the energy dissipated in the resistor
is equal to the energy needed to charge the capacitor, i.e. $(1/2) C V_{dd}^2$.
However, in practice an input to a wire has a finite rise time, which reduces the
resistive energy component. This energy dependence on rise time is intentionally
applied in adiabatic charging [142, 143]. In addition, a separate evaluation of
the driver and interconnect contributions can be useful to understand the sources
of energy dissipation and to analyze local temperature increases [141]. High wire
temperatures affect wire delays and electromigration reliability. Global wires
are especially susceptible to higher temperatures since they are farthest away
from the substrate and the heat sink, and since they are surrounded by low-κ
dielectrics that have poor thermal conductivity. Also, due to their large
geometries they have a high ability to retain heat [144].
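The dependence of the resistive energy on rise time noted above for the lumped
RC circuit can be illustrated with a short Python sketch. The closed form used
here is the standard textbook result for a saturated-ramp input and is not taken
from this thesis; the function names and the component values in the usage note
are arbitrary assumptions. A simple forward-Euler integration is included as an
independent check.

```python
import math


def resistor_energy_ramp(r, c, vdd, rise_time):
    """Energy dissipated in R when a lumped RC is driven by a saturated ramp.

    With tau = R*C and T = rise_time:
    E_R = C*Vdd^2 * (tau/T) * (1 - (tau/T) * (1 - exp(-T/tau))).
    For a step input (T = 0) this reduces to C*Vdd^2 / 2.
    """
    if rise_time == 0:
        return 0.5 * c * vdd ** 2
    tau = r * c
    ratio = tau / rise_time
    return c * vdd ** 2 * ratio * (1.0 - ratio * (1.0 - math.exp(-rise_time / tau)))


def resistor_energy_numeric(r, c, vdd, rise_time, steps=100000):
    """Forward-Euler integration of v_R^2 / R as a cross-check."""
    tau = r * c
    t_end = rise_time + 20.0 * tau  # long enough for the transient to settle
    dt = t_end / steps
    v_cap = 0.0
    energy = 0.0
    for k in range(steps):
        t = k * dt
        v_in = vdd if t >= rise_time else vdd * t / rise_time
        v_res = v_in - v_cap
        energy += v_res * v_res / r * dt
        v_cap += v_res / tau * dt
    return energy
```

For example, with the assumed values R = 100 Ω, C = 1 pF and Vdd = 1 V, a rise
time of ten time constants reduces the resistive energy from one half to roughly
one tenth of $C V_{dd}^2$, which is the effect exploited in adiabatic charging.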
In [145, 146], models for the dynamic and static power dissipation of an RC
current-mode line have been presented. The models do not however include the
aforementioned aspects that have been introduced in later voltage-mode energy
models. In addition, the inclusion of inductive effects is desirable for today’s
high speed circuits [141]. The current-mode driver in [145, 146] is modeled using
a switch-resistor model. More accurate modeling can however be achieved by
characterizing the driver as a function of its load. In voltage-mode signaling,
the behavior of a driver is commonly characterized as a function of its capacitive
load or effective capacitance [63]. More accurate Π-type circuits for modeling the
interconnect load have also been presented [147]. These effective capacitance
and Π-models are, however, not compatible with current-mode signaling since
they contain no resistive path to ground. In current-mode signaling an accurate
representation of the interconnect load is needed since current-mode signaling
is typically used for long-range communication where resistive and inductive
parasitics are often significant.
In this chapter, a novel analytical model for energy dissipation in RLC
current-mode signaling is derived [148]. The energy is derived separately for
the driver, wire and receiver termination components. The effects of transient
and static resistive power and different rise times and clock cycles are included.
A realizable Π-model is derived to model the driving point impedance of an
RLC current-mode transmission line. The Π-model is used to characterize the
driver and in the energy calculations. The output current of a current-mode
RLC transmission line is derived. The modeling is extended to multiple coupled
RLC lines with capacitive and inductive coupling between them, and applied to
model differential current-mode signaling.
7.1 Current-Mode Driving Point Impedance
In this section, a realizable RLC Π-model for the driving point impedance of
a resistively terminated transmission line is derived in order to analyze energy
dissipation in current-mode signaling. The model was also used to character-
ize the drivers. The Π-model parameters are calculated directly from the total
interconnect resistance, capacitance, inductance and receiver resistance. Al-
though Π-models introduce more parameters than a simple capacitance model
into the characterization process, it should be noted that the total number of
gates to characterize is smaller in current-mode signaling since current-mode
gates are usually used only for global communication, and not as e.g. logic
gates. The driving point impedance $Z_{in}^{tl}$ of a terminated RLC transmission line





where ZL is the load impedance at the end of the line and a are defined in
(4.6). h is the length of the line and r, l and c are the resistance, inductance
and capacitance of the line per unit length, respectively. By applying a series
expansion to the hyperbolic functions in (7.1), as in (4.12)–(4.14), the driving
point impedance can be written as
\[ Z_{in}^{tl} = \frac{a_0 + a_1 s + \dots}{b_0 + b_1 s + b_2 s^2 + \dots} \tag{7.2} \]
where
\[ \begin{aligned}
a_0 &= R_{rec} + R_{tot} \\
a_1 &= L_{tot} + \tfrac{1}{6} R_{tot}^2 C_{tot} + \tfrac{1}{2} R_{tot} C_{tot} R_{rec} \\
b_0 &= 1
\end{aligned} \tag{7.3} \]








where Rtot = rh, Ltot = lh, Ctot = ch and ZL = Rrec which is the resistance
of the current-mode receiver. The driving point impedance of a resistance-




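\[ Z_{in}^{\Pi} = \frac{n_0 + n_1 s + \dots}{d_0 + d_1 s + d_2 s^2 + O(s^3)} \tag{7.4} \]
where
\[ \begin{aligned}
n_0 &= R_1 + R_2 + R_{rec} \\
n_1 &= R_1 C_2 (R_2 + R_{rec}) + L_1 \\
d_0 &= 1 \\
d_1 &= C_2 (R_2 + R_{rec}) + C_1 (R_2 + R_{rec}) + C_1 R_1 \\
d_2 &= C_1 R_1 C_2 (R_2 + R_{rec}) + C_1 L_1.
\end{aligned} \tag{7.5} \]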
In order to obtain the RC-model for the driving point impedance of a
resistance-terminated RC transmission line, we set $Z_{in}^{tl} = Z_{in}^{\Pi}$
while setting inductive values



























rec + 6RrecRtot +R
2
tot)









rec + 3RrecRtot +R
2
tot)









Figure 7.1: RLC Π model for current-mode driving point impedance. R2 and
Rrec are drawn separately for clarity.
where γ = 3Rrec + Rtot. In order to obtain an RLC-model, a fifth condition is
needed to solve the inductance term L1, while also obtaining realizable values.
We set C1 + C2 = Ctot, which is desirable for energy calculations. Similarly,
we already have R1 + R2 = Rtot from (7.3) and (7.5). The values for the RLC






totCtot + 4RtotCtotRrec + 12Ltot)






totCtot + 8RtotCtotRrec + 12Ltot)




totCtot + 3RtotCtotRrec + 6Ltot)









9R2totCtot + 24RtotCtotRrec + 36Ltot
where $\alpha = R_{tot}^2 C_{tot} R_{rec} + 15 R_{tot} L_{tot} + 24 L_{tot} R_{rec}$.
As can be seen, the RLC values are always positive. It can be noted that the value
of L1 and the inductance seen by the driver depend on the amount of resistive
attenuation, or total resistance of the wire.
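The low-order coefficients a0 and a1 in (7.3) can be checked numerically with a
short Python sketch. It assumes the standard input impedance of a uniform line
terminated with Rrec, written as [Rrec cosh(θ) + Z0 sinh(θ)] / [cosh(θ) +
(Rrec/Z0) sinh(θ)], which is presumed to be equivalent to (7.1); the function
names are hypothetical. The example values are the 3 mm wire of Table 7.1 with
a 150 Ω termination.

```python
import cmath


def numerator_exact(s, r_tot, l_tot, c_tot, r_rec):
    """Rrec*cosh(theta) + Z0*sinh(theta) for a uniform line.

    theta = sqrt(Z*Y) and Z0 = sqrt(Z/Y), with Z = Rtot + s*Ltot, Y = s*Ctot.
    """
    z_series = r_tot + s * l_tot
    y_shunt = s * c_tot
    theta = cmath.sqrt(z_series * y_shunt)
    z0 = cmath.sqrt(z_series / y_shunt)
    return r_rec * cmath.cosh(theta) + z0 * cmath.sinh(theta)


def series_coefficients(r_tot, l_tot, c_tot, r_rec):
    """a0 and a1 exactly as written in (7.3)."""
    a0 = r_rec + r_tot
    a1 = l_tot + r_tot ** 2 * c_tot / 6.0 + r_tot * c_tot * r_rec / 2.0
    return a0, a1


# 3 mm wire from Table 7.1 (per-mm values times length) and a 150 ohm receiver
R_TOT = 80.6 * 3.0
L_TOT = 1.22e-9 * 3.0
C_TOT = 243e-15 * 3.0
R_REC = 150.0
```

Evaluating `numerator_exact` at a small real value of s and comparing it with
a0 + a1 s shows agreement up to the neglected second-order term, which supports
the truncated expansion for frequencies well below the line's first resonance.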
7.2 Modeling of Energy Dissipation
In order to be able to use both pre-characterized drivers and switch-resistor
drivers, the input voltage to the wire in Fig. 7.2 is used as the input signal. The
input Vin to the wire is modeled as an exponential voltage
\[ V_{in}(t) = V_{max}\left[1 - e^{-t/t_r}\right] \tag{7.8} \]
where Vmax is the characterized or calculated maximum voltage of the input











Figure 7.2: RLC transmission line model for current-mode signaling.
not have a full voltage swing. Instead, the swing depends on driver, wire and
receiver properties. The exponential rise time tr and gate delay were also used
to characterize the driver with the Π model. In case of a driver represented as a
switch-resistor with source resistance Rs along with a voltage source with finite
rise time, Vmax = Vdd(Rtot+Rrec)/(Rs+Rtot+Rrec), and tr is easily calculated









The total energy Etot dissipated by the transmission line in Fig. 7.2 during one





where tc is the clock cycle time. The input current can be obtained from
$I_{in} = V_{in}/Z_{in}^{\Pi}$. For energy calculations a 1-pole form of the input
current is used in order to obtain compact and computationally effective models
while maintaining








































The output current Iout of the transmission line in Fig. 7.2 can be calculated




a11Rrec + a12 + a22Rrec
. (7.14)
By applying partial fraction expansion and taking an inverse Laplace transform
the output current becomes






A four pole form of (7.15) is the longest that can be obtained analytically.
This form is applied to capture the complex waveform of the output current in











r a1+a0 . (7.16)





where Vout = IoutRrec. The receiver termination energy dissipation is then









t−1r a1 + a0
. (7.19)




\[ E_{wire} = \int_0^{t_c} V_{in}(t)\, I_{in}(t)\, dt - E_{rec} \tag{7.20} \]
where the integral term is the energy dissipated by the wire and the receiver































The energy Edr dissipated in the driver is then obtained as
Table 7.1: Wire properties and RLC parameters used in simulations
             Wire width (µm)   R (Ω/mm)   L (nH/mm)   C (fF/mm)
1 mm wire    0.5               161        1.31        164
3 mm wire    1.0               80.6       1.22        243




\[ E_{dr} = \int_0^{t_c} \left[V_{dd} - V_{in}(t)\right] I_{in}(t)\, dt = E_{tot} - E_{rec} - E_{wire}. \tag{7.22} \]
The previous analysis has been for a rising transition. In case of a falling
transition, the wire capacitances are discharged through the receiver and driver.
The input to the wire is then
\[ V_{in}(t) = V_{max}\, e^{-t/t_f} \tag{7.23} \]
where tf is the fall time of the input signal. Assuming that the system has
reached steady state between signals, the steady state current and voltage values












For multiple coupled wires as in a bus, the coupling capacitance in Ctot was
taken into account in the characterization of the driving-point impedance and
in the calculation of energy by multiplying it with the appropriate switching
factor, which depends on the transition activity of neighboring wires.
Equations (2.10) and (2.20) were then used to calculate the response of the
current-mode system.
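The use of a switching factor for the coupling capacitance can be sketched as
follows. The factors 0, 1 and 2 for neighbors switching in the same direction,
staying quiet, and switching in the opposite direction are the conventional
values for simultaneous transitions and are an assumption here; the exact
factors used with the decoupling method of Chapter 2 are not reproduced. The
function name and the capacitance values in the usage note are hypothetical.

```python
def effective_total_capacitance(c_ground, c_couple, victim, neighbours):
    """Capacitance seen by a switching wire, in the units of the inputs.

    Transitions are encoded as +1 (rising), -1 (falling) or 0 (quiet).
    Each neighbour contributes factor * c_couple, where the factor is
    0 for the same direction, 1 for a quiet wire and 2 for the opposite
    direction (conventional values, assumed here).
    """
    if victim not in (1, -1):
        raise ValueError("the analysed wire must be switching")
    total = c_ground
    for neighbour in neighbours:
        if neighbour not in (1, 0, -1):
            raise ValueError("transitions must be +1, 0 or -1")
        factor = 1 - neighbour * victim
        total += factor * c_couple
    return total
```

With an assumed ground capacitance of 100 fF/mm and a coupling capacitance of
41 fF/mm per neighbor, a wire between two oppositely switching neighbors sees
100 + 4 x 41 = 264 fF/mm, while the same wire sees only 100 fF/mm when both
neighbors switch with it.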
7.3 Verification
7.3.1 Single-Ended Current-Mode Signaling
Fig. 7.3 shows a comparison between the driving point voltages of the
RLC Π-model and a transmission line. The driver was a 50x inverter and the
transmission line consisted of 40 RLC segments and was terminated with a
150 Ω resistor. The comparison was done using HSPICE in 65 nm technology.
Three different wire types, whose RLC parameters were extracted using field
solvers [89, 90], were used. The wire parameters are shown in Table 7.1. Wire
thickness was 319 nm for all cases. As can be seen from Fig. 7.3, the Π driving
point model followed the transmission line well, although as expected the
accuracy was slightly reduced as the wire length increased.

























Figure 7.3: Driving point voltages of the presented RLC Π-model and a trans-
mission line. Three different wires were used.
Fig. 7.4 shows a comparison between the model and HSPICE for the
power dissipation of a 50x inverter with a rising output driving a 3 mm long
wire with a termination resistance of 150 Ω. The results are plotted separately
for the driver, wire, receiver and total power. The power components were
calculated as a product of voltage and current as in (7.10), (7.17), (7.20) and
(7.22). As can be seen, the model and HSPICE are in good agreement. It
should be noted that none of the power dissipation curves falls to zero, since
in current-mode signaling there is a resistive path from Vdd to ground, which
causes a steady state current for rising signals. The brief short-circuit power
in the driver during transitions is not included, but it can be estimated as
approximately 10% of dynamic power [150].
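The non-zero steady-state power noted above can be illustrated with the
switch-resistor driver representation of Section 7.2, for which the driver,
wire and receiver form a simple series path at DC. The following Python sketch
uses the Vmax expression given in the text; the function name and the numerical
values for Vdd and Rs in the usage note are assumptions, and the sketch does not
model the 50x inverter used in Fig. 7.4.

```python
def static_operating_point(vdd, r_s, r_tot, r_rec):
    """DC current, wire input voltage and static power of each component."""
    r_path = r_s + r_tot + r_rec
    i_dc = vdd / r_path
    v_max = vdd * (r_tot + r_rec) / r_path  # expression from Section 7.2
    return {
        "i_dc": i_dc,
        "v_max": v_max,
        "p_driver": i_dc ** 2 * r_s,
        "p_wire": i_dc ** 2 * r_tot,
        "p_receiver": i_dc ** 2 * r_rec,
        "p_total": vdd * i_dc,
    }
```

For instance, calling `static_operating_point(1.0, 100.0, 241.8, 150.0)` with an
assumed 1 V supply and 100 Ω source resistance, the 3 mm wire of Table 7.1 and a
150 Ω termination gives three static power components that sum to the total
power drawn from the supply, so none of them vanishes while the output is high.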
Tables 7.2 and 7.3 show a comparison between the model and HSPICE
for energy and propagation delay. The values were calculated using the wire
properties in Table 7.1. The driver and receiver were varied as shown. The
total energy dissipation was calculated for two different cycle times, i.e. 0.5 ns
and 1 ns. The threshold used in delay calculation was 0.4 mA. The delay was
calculated using two different forms of (7.15), i.e. 1 pole and 4 poles. The
energy dissipation was calculated using (7.13). As can be seen, the 4-pole
approximation of the delay was in good agreement with the HSPICE results
with an average error of 1.9%, while the shorter 1-pole approximation had an
average error of 13.3%. The energy dissipation had an average error of 1.9%.
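The average errors quoted above can be reproduced from the entries of Tables 7.2
and 7.3 with a few lines of Python. The sketch assumes that the error is the
mean absolute relative deviation from the HSPICE value, which the text does not
state explicitly; the function and variable names are hypothetical.

```python
def mean_relative_error_percent(model, reference):
    """Mean of |model - reference| / reference, in percent."""
    errors = [abs(m - r) / r for m, r in zip(model, reference)]
    return 100.0 * sum(errors) / len(errors)


# Table 7.3: propagation delays in ps, one entry per table row
DELAY_1_POLE = [210, 414, 1998, 81, 843, 4766]
DELAY_4_POLES = [199, 414, 1523, 86, 666, 4301]
DELAY_SPICE = [200, 430, 1549, 88, 665, 4439]

# Table 7.2: total energies in pJ, row by row (tc = 0.5 ns, then tc = 1 ns)
ENERGY_MODEL = [0.224, 0.466, 0.490, 0.860, 0.866, 1.375,
                0.436, 0.891, 0.649, 0.931, 0.250, 0.498]
ENERGY_SPICE = [0.225, 0.467, 0.491, 0.873, 0.900, 1.460,
                0.437, 0.889, 0.682, 0.941, 0.242, 0.495]

delay_error_1_pole = mean_relative_error_percent(DELAY_1_POLE, DELAY_SPICE)
delay_error_4_poles = mean_relative_error_percent(DELAY_4_POLES, DELAY_SPICE)
energy_error = mean_relative_error_percent(ENERGY_MODEL, ENERGY_SPICE)
```

Under this definition the three values come out at about 13.3%, 1.9% and 1.9%,
consistent with the figures stated in the text.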
Verification results using a switch-resistor driver are shown in Fig. 7.5. A
histogram with 1500 random samples was generated by comparing the model























Figure 7.4: Comparison between the model and HSPICE for the power dissi-
pation of a 50x inverter driving a 3 mm wire with a termination resistance of
150Ω.
Table 7.2: Verification results for total energy dissipation using different wire,
driver and receiver sizes
Driver   Receiver   Wire len.   E_tot^model (pJ)           E_tot^spice (pJ)
(size)   (Ω)        (mm)        tc = 0.5 ns   tc = 1 ns    tc = 0.5 ns   tc = 1 ns
10x      500        1           0.224         0.466        0.225         0.467
25x      1000       3           0.490         0.860        0.491         0.873
50x      2000       5           0.866         1.375        0.900         1.460
25x      500        1           0.436         0.891        0.437         0.889
50x      2000       3           0.649         0.931        0.682         0.941
10x      1000       5           0.250         0.498        0.242         0.495
Table 7.3: Verification results for propagation delay using different wire, driver
and receiver sizes
Driver   Receiver   Wire len.   t_delay^model (ps)     t_delay^spice (ps)
(size)   (Ω)        (mm)        1 pole     4 poles
10x      500        1           210        199         200
25x      1000       3           414        414         430
50x      2000       5           1998       1523        1549
25x      500        1           81         86          88
50x      2000       3           843        666         665
10x      1000       5           4766       4301        4439
results to HSPICE over a wide range of parameters. The figure shows the
error distribution for Etot and, for comparison, also for the model
in [145]. The parameters used were: voltage source rise time 1–500 ps, h 0.5–8 mm,
c 20–500 fF/mm, r 10–500 Ω/mm, l 0.01–5 nH/mm, Rs 10–1000 Ω, Rrec
10–4000 Ω, and tc 150–6300 ps or at least 75% of output signal rise time. The
mean and standard deviation are also shown in the figure for total, driver, wire
and receiver energy components calculated with the presented model. The
results were calculated using equations (7.13), (7.18), (7.21) and (7.22).
7.3.2 Differential Current-Mode Signaling
The model was applied to the analysis of differential current-mode signaling.
The implementation of the differential signaling is shown in Fig. 7.6. The driver
was implemented as two 50x inverters while the receiver was a modified clamped
bit-line sense amplifier [151]. The 200 Ω termination resistances in the lower-
right corner were implemented as in [152]. The termination transistor lengths
were 65 nm and their widths were 3.2 µm. The wires were 3 mm long and the
separation distance was 0.25µm. The coupling capacitance was 41 fF/mm and
mutual inductance 0.94 nH/mm. The output and input currents of the wires are
shown in Fig. 7.7. As can be seen, the model results and HSPICE were close to
each other. A comparison between the model and HSPICE for the rising input
energy dissipation of the differential system is shown in Fig. 7.8. As can be
seen, the results were again in good agreement.
7.4 Case Study
Fig. 7.9 shows the energy dissipation of a 3 mm wire with a 50x inverter
as a driver and a 200 Ω termination. The energy dissipation was calculated
with the model as a function of tc and wire widths ranging from 0.25 µm to
1.5 µm at 0.25 µm intervals. The RLC parameters were extracted with field
solvers for each case. In voltage-mode signaling, the energy dissipation of an
interconnect depends linearly on capacitance, as seen from the $CV^2$ model. In






































Figure 7.5: Error comparison for Etot using the model and [145]. The mean µ,





Figure 7.6: Implementation of differential current-mode signaling.




























Figure 7.7: Comparison between the model and HSPICE for the input and
output currents of the 3 mm wires using differential current-mode signaling.



























Figure 7.8: Comparison between the model and HSPICE for rising wire energy












































































































Figure 7.9: Energy as a function of wire width and clock cycle time tc for a
3 mm wire with width ranging from 0.25 µm to 1.5 µm using a 50x inverter as
a driver and a 200Ω termination.
current-mode signaling, the amount of energy and where the energy is dissipated
vary depending on the wire width as seen in Fig. 7.9. For example, the energy
dissipated at the receiver grew when the wire width was increased. This was
caused by the reduced wire resistance that increased the output voltage swing.
The overall reduced resistance also resulted in an increase in the total energy
and driver energy dissipation. The energy dissipation in the wire varied with
wire width based on the static power dissipation and the voltage swing that
influences the capacitive energy dissipation. The voltage swing varies depending
on the driver, wire and receiver properties. The significant effect of the clock
cycle time on all energy components is also seen in the figure.
7.5 Chapter Summary
In this chapter, an analytical model for the energy dissipation of RLC trans-
mission lines in current-mode signaling was proposed. The energy was derived
separately for the driver, wire and receiver termination. The model included
the effects of different clock cycles, rise times, transient resistive power and in-
ductance. In addition, a realizable Π-model for the driving point impedance of
a current-mode RLC transmission line was derived. The output current at the
end of an RLC current-mode line was also derived. The models were developed
for both switch-resistor and pre-characterized drivers. The energy, driving-point
and output current models were verified by comparison to HSPICE in 65 nm
technology. The average error was 1.9% for delay using 4-pole output current
and -0.9% for total energy with a standard deviation of 1.9% using 1-pole form.
The model was extended to multiple interconnects with capacitive and inductive
coupling and applied to model differential current-mode signaling. The compo-





This thesis considered modeling and analysis of noise and interconnects in on-
chip communication links. These long links are used to connect IP blocks to-
gether and they are often a bottleneck in modern integrated circuits. Analysis
and optimization of the links requires accurate and computationally effective
interconnect models that can be used in the iterative refinement loops of the
early stages of a design flow. For this purpose, several models were proposed
in this thesis. Specifically, the models were developed for multiple parallel in-
terconnects that commonly form a bus segment or a link in a NoC. Because of
significant electromagnetic coupling in such links, the interconnects were mod-
eled as an interacting group instead of as isolated signal paths. The developed
analytical models addressed issues such as signal integrity, energy dissipation,
and alternative signaling techniques. Furthermore, the models were also applied
to optimize signal integrity problems.
A major source of noise that affects interconnects running close to each other
is crosstalk. Another interconnect noise source is intersymbol interference that
influences successive signals. An analytical model was proposed in the thesis to
evaluate both crosstalk and intersymbol interference. The model includes multi-
ple aspects that affect these noise sources such as both inductive and capacitive
coupling, phases and different bit sequences. The model was then applied in
case studies to show that intersymbol interference affects crosstalk voltage and
propagation delay depending on bus throughput and the amount of inductance.
On-chip communication consumes a significant portion of total power and
therefore places a burden on the power supply network. In order to determine
the induced power supply noise, it is necessary to know the load on a power
supply network. A model for the switching current of a bus was therefore
developed in the thesis. The switching current was shown to depend on the
interconnect coupling and switching patterns. The model was then combined
with an existing power supply network model and applied to demonstrate the
reduction of power supply noise with bus input skewing.
This intentional skewing was then applied to reduce both functional crosstalk
and power supply noise. Unlike in previous methods, this was achieved as a
trade-off with time or timing slack and without encoding or additional wires,
which makes the method well suited for area-limited cases. Models proposed in
the previous chapters were used to calculate crosstalk and power supply noise
as a function of skewing. The skewing method was shown to be effective in
reducing long-range inductive crosstalk. The method was implemented and
compared to other crosstalk and power supply reduction methods. The impact
of the implementation on energy dissipation was less than 1.2% in simulations.
Skewing was also demonstrated to be usable alone or together with other noise
reduction methods.
In long buses, encoding is a promising method to avoid problematic switch-
ing patterns. This is however achieved with additional logic circuitry that is
susceptible to process variation together with the wires. A model was thus pro-
posed for analyzing the effects of process variation on encoded buses. The model
includes variation in both the signaling circuitry and in the wires to calculate
the delay variation of the encoded bus. The model was applied to level-encoded
dual-rail and 1-of-4 signaling using Monte Carlo analysis to evaluate the statis-
tical influence of wire width variation.
Another promising signaling method for long interconnects is current-mode
signaling that employs current sensing together with a resistive termination. An
analytical model for energy dissipation in RLC current-mode signaling was pro-
posed in the thesis. The energy was derived separately for the driver, wire and
receiver termination. The model was developed for both switch-resistor and
pre-characterized drivers and it included the effects of different clock cycles,
rise times, transient resistive power and inductance. A realizable driving-point
impedance for an RLC current-mode line was also derived. The energy model
was applied to differential signaling and to analyze the changes in energy dissi-
pation as a function of wire width. The location where energy is dissipated in
current-mode signaling was shown to depend on wire width.
To conclude, a set of analytical models was proposed for the analysis and
optimization of on-chip communication links. The proposed models were also
applied in noise reduction. All models proposed in the thesis included the effects
of inductance and they were verified by a comparison to HSPICE using RLC
parameters extracted with field solvers. A significant speedup over HSPICE was
also demonstrated. In addition to the wires themselves, signaling circuitry
was also considered in the modeling. While the focus of the thesis was on the
analysis and modeling of on-chip communication links, possible future research
directions
can be outlined. One future possibility could be to integrate the developed mod-
els into a simulation tool, e.g. for design space exploration or communication
optimization purposes. Besides simulation tool integration, another possible fu-
ture direction could be to adjust or expand the developed models for emerging
on-chip interconnects such as through-silicon via bundles [153, 154] in 3-D inte-
grated circuits, or even carbon nanotubes whose interconnect behavior has been
shown to be analyzable with RLC transmission line circuits [155, 156].
Bibliography
[1] N. A. Kurd, S. Bhamidipati, C. Mozak, J. L. Miller, T. M. Wilson, M. Ne-
mani, and M. Chowdhury. Westmere: A family of 32nm IA processors. In
Digest of IEEE Int. Solid-State Circuits Conference, pages 96–97, 2010.
[2] S. Natarajan et al. A 32nm logic technology featuring 2nd-generation high-
k + metal-gate transistors, enhanced channel strain and 0.171µm2 SRAM
cell size in a 291Mb array. In Proc. IEEE Int. Electron Devices Meeting,
pages 1–3, 2008.
[3] International Technology Roadmap for Semiconductors. Interconnect,
2005, 2007 and 2009 editions. Online, http://www.itrs.net.
[4] N. Magen, A. Kolodny, U. Weiser, and N. Shamir. Interconnect-power
dissipation in a microprocessor. In Proc. Int. Workshop on System-Level
Interconnect Prediction, pages 7–13, 2004.
[5] R. Ho, K. W. Mai, and M. A. Horowitz. The future of wires. Proceedings
of the IEEE, 89(4):490–504, Apr. 2001.
[6] S. Kumar, A. Jantsch, J.-P. Soininen, M. Forsell, M. Millberg, J. Öberg,
K. Tiensyrjä, and A. Hemani. A network on chip architecture and design
methodology. In Proc. IEEE Computer Society Annual Symp. on VLSI,
pages 105–112, 2002.
[7] T. Bjerregaard and S. Mahadevan. A survey of research and practices of
network-on-chip. ACM Computing Surveys, 38(1):1–51, Mar. 2006.
[8] S. Vangal et al. An 80-tile 1.28TFLOPS network-on-chip in 65nm CMOS.
In Digest of IEEE Int. Solid-State Circuits Conf., pages 98–99, 2007.
[9] C.-T. Hsieh and M. Pedram. Architectural energy optimization by bus
splitting. IEEE Transactions on Computer-Aided Design of Integrated
Circuits and Systems, 21(4):408–414, Apr. 2002.
[10] A. Narasimhan and R. Sridhar. Variability aware low-power delay optimal
buffer insertion for global interconnects. IEEE Transactions on Circuits
and Systems I, 57(12):3055–3063, Dec. 2010.
[11] W. J. Dally and J. W. Poulton. Digital Systems Engineering. Cambridge,
UK: Cambridge University Press, 1998.
[12] A. Katoch, E. Seevinck, and H. Veendrick. Fast signal propagation for
point to point on-chip long interconnects using current sensing. In Proc.
28th European Solid-State Circuits Conference, pages 195–198, Sept. 2002.
[13] K. Banerjee, H. Li, and N. Srivastava. Current status and future perspec-
tives of carbon nanotube interconnects. In Proc. IEEE Conf. on Nan-
otechnology, pages 432–436, 2008.
[14] S. Pasricha, F. J. Kurdahi, and N. Dutt. Evaluating carbon nanotube
global interconnects for chip multiprocessor applications. IEEE Transac-
tions on Very Large Scale Integration (VLSI) Systems, 18(9):1376–1380,
Sept. 2010.
[15] M. Haurylau, G. Chen, H. Chen, J. Zhang, N. A. Nelson, D. H. Al-
bonesi, E. G. Friedman, and P. M. Fauchet. On-chip optical interconnect
roadmap: Challenges and critical directions. IEEE Journal of Selected
Topics in Quantum Electronics, 12(6):1699–1705, Nov./Dec. 2006.
[16] M. F. Chang, J. Cong, A. Kaplan, M. Naik, G. Reinman, E. Socher,
and S.-W. Tam. CMP network-on-chip overlaid with multi-band RF-
interconnect. In Proc. IEEE Int. Symp. on High Performance Computer
Architecture, pages 191–202, 2008.
[17] A. Agarwal, D. Blaauw, and V. Zolotov. Statistical timing analysis for
intra-die process variations with spatial correlations. In Proc. Int. Conf.
on Computer Aided Design, pages 900–907, 2003.
[18] R. Gandikota, D. Blaauw, and D. Sylvester. Modeling crosstalk in statis-
tical static timing analysis. In Proc. 45th Design Automation Conference,
pages 974–979, 2008.
[19] E. Humenay, D. Tarjan, and K. Skadron. Impact of process variations on
multicore performance symmetry. In Proc. Design, Automation and Test
in Europe, pages 1–6, 2007.
[20] W. Zhao and Y. Cao. New generation of predictive technology model
for sub-45nm early design exploration. IEEE Transactions on Electron
Devices, 53(11):2816–2823, Nov. 2006.
[21] A. Agarwal, D. Blaauw, V. Zolotov, and S. Vrudhula. Computation and
refinement of statistical bounds on circuit delay. In Proc. 40th Design
Automation Conference, pages 348–353, 2003.
[22] L. T. Pillage and R. A. Rohrer. Asymptotic waveform evaluation for tim-
ing analysis. IEEE Transactions on Computer-Aided Design of Integrated
Circuits and Systems, 9(4):352–366, Apr. 1990.
[23] A. Odabasioglu, M. Celik, and L. T. Pileggi. PRIMA: Passive reduced-
order interconnect macromodeling algorithm. IEEE Transactions on
Computer-Aided Design of Integrated Circuits and Systems, 17(8):645–
654, Aug. 1998.
[24] L. Ding, D. Blaauw, and P. Mazumder. Accurate crosstalk noise modeling
for early signal integrity analysis. IEEE Transactions on Computer-Aided
Design of Integrated Circuits and Systems, 22(5):627–634, May 2003.
[25] Y. I. Ismail, E. G. Friedman, and J. L. Neves. Figures of merit to
characterize the importance of on-chip inductance. IEEE Transactions on
Very Large Scale Integration (VLSI) Systems, 7(4):442–449, Dec. 1999.
[26] C. R. Paul. Decoupling the multiconductor transmission line equations.
IEEE Transactions on Microwave Theory and Techniques, 44(8):1429–
1440, Aug. 1996.
[27] H. W. Johnson and M. Graham. High-Speed Signal Propagation: Advanced
Black Magic. Upper Saddle River, NJ: Prentice Hall, 2003.
[28] C. R. Paul. Analysis of Multiconductor Transmission Lines. New York,
NY: John Wiley & Sons, 1994.
[29] F. J. German and R. W. Johnson. Full wave three-dimensional simulation
of Maxwell’s equations for the electrical characterization of high-speed
interconnects. IEEE Transactions on Components, Hybrids, and Manu-
facturing Technology, 13(2):341–346, June 1990.
[30] J. Cong, T. Kong, and D. Z. Pan. Buffer block planning for interconnect-
driven floorplanning. In Proc. Int. Conf. on Computer-Aided Design,
pages 358–363, 1999.
[31] Y.-H. Cheng and Y.-W. Chang. Integrating buffer planning with floor-
planning for simultaneous multi-objective optimization. In Proc. Asia and
South Pacific Design Automation Conference, pages 624–627, 2004.
[32] W.-L. Hung, G. M. Link, Y. Xie, N. Vijaykrishnan, and M. J. Irwin.
Interconnect and thermal-aware floorplanning for 3D microprocessors. In
Proc. 7th Int. Symp. on Quality Electronic Design, pages 104–109, 2006.
[33] T. Zhang and S. S. Sapatnekar. Simultaneous shield and buffer insertion
for crosstalk noise reduction in global routing. IEEE Transactions on Very
Large Scale Integration (VLSI) Systems, 15(6):624–636, Jun. 2007.
[34] M. Cho, D. Z. Pan, H. Xiang, and R. Puri. Wire density driven global
routing for CMP variation and timing. In Proc. Int. Conf. on Computer-
Aided Design, pages 487–492, 2006.
[35] M. R. Becer, D. Blaauw, I. Algor, R. Panda, C. Oh, V. Zolotov, and I. N.
Hajj. Postroute gate sizing for crosstalk noise reduction. IEEE Trans-
actions on Computer-Aided Design of Integrated Circuits and Systems,
23(12):1670–1677, Dec. 2004.
[36] N. Hanchate and N. Ranganathan. Post-layout gate sizing for interconnect
delay and crosstalk noise optimization. In Proc. 7th Int. Symp. on Quality
Electronic Design, pages 97–102, 2006.
[37] K. Moiseev, A. Kolodny, and S. Wimer. Power-delay optimization in VLSI
microprocessors by wire spacing. ACM Transactions on Design Automa-
tion of Electronic Systems, 14(4):55:1–55:28, Aug. 2009.
[38] A. Vittal and M. Marek-Sadowska. Crosstalk reduction for VLSI. IEEE
Transactions on Very Large Scale Integration (VLSI) Systems, 16(3):290–
298, Mar. 1997.
[39] K. Agarwal, D. Sylvester, and D. Blaauw. Modeling and analysis of
crosstalk noise in coupled RLC interconnects. IEEE Transactions on
Computer-Aided Design of Integrated Circuits and Systems, 25(5):892–
901, May 2006.
[40] C.-W. Ho, A. Ruehli, and P. Brennan. The modified nodal approach to
network analysis. IEEE Transactions on Circuits and Systems, 22(6):504–
509, June 1975.
[41] M. Celik, L. Pileggi, and A. Odabasioglu. IC Interconnect Analysis. Nor-
well, MA: Kluwer Academic Publishers, 2002.
[42] E. Chiprout and M. S. Nakhla. Analysis of interconnect networks using
complex frequency hopping (CFH). IEEE Transactions on Computer-
Aided Design of Integrated Circuits and Systems, 14(2):186–200, Feb.
1995.
[43] P. Feldmann and R. W. Freund. Efficient linear circuit analysis by Padé
approximation via the Lanczos process. IEEE Transactions on Computer-
Aided Design of Integrated Circuits and Systems, 14(5):639–649, May
1995.
[44] J. R. Phillips and L. M. Silveira. Poor man’s TBR: A simple model reduc-
tion scheme. IEEE Transactions on Computer-Aided Design of Integrated
Circuits and Systems, 24(1):43–55, Jan. 2005.
[45] K. C. Sou, A. Megretski, and L. Daniel. A quasi-convex optimization
approach to parameterized model order reduction. IEEE Transactions on
Computer-Aided Design of Integrated Circuits and Systems, 27(3):456–
469, Mar. 2008.
[46] H. Yu, Y. Shi, and L. He. Fast analysis of structured power grid by
triangularization based structure preserving model order reduction. In
Proc. 43rd Design Automation Conference, pages 205–210, 2006.
[47] Y. Shin and T. Sakurai. Power distribution analysis of VLSI interconnects
using model order reduction. IEEE Transactions on Computer-Aided De-
sign of Integrated Circuits and Systems, 21(6):739–745, Jun. 2002.
[48] B. Franzini, C. Forzan, D. Pandini, P. Scandolara, and A. Dal Fabbro.
Crosstalk aware static timing analysis: a two step approach. In Proc.
IEEE Int. Symp. on Quality Electronic Design, pages 499–503, 2000.
[49] P. Feldmann. Model order reduction techniques for linear systems with
large numbers of terminals. In Proc. Design, Automation and Test in
Europe, pages 944–947, 2004.
[50] P. Liu, S. X.-D. Tan, B. McGaughy, L. Wu, and L. He. TermMerg: An
efficient terminal-reduction method for interconnect circuits. IEEE Trans-
actions on Computer-Aided Design of Integrated Circuits and Systems,
26(8):1382–1392, Aug. 2007.
[51] P. Feldmann and F. Liu. Sparse and efficient reduced order modeling of
linear subcircuits with large number of terminals. In Proc. IEEE/ACM
Int. Conf. on Computer-Aided Design, pages 88–92, 2004.
[52] S. Aaltonen. Order Reduction of Interconnect Circuits. Licentiate thesis,
Helsinki University of Technology, 2003.
[53] S. Yu, D. M. Petranovic, S. Krishnan, K. Lee, and C. Y. Yang. Loop-based
inductance extraction and modeling for multiconductor on-chip intercon-
nects. IEEE Transactions on Electron Devices, 53(1):135–145, Jan. 2006.
[54] G.-T. Lei, G.-W. Pan, and B. K. Gilbert. Examination, clarification,
and simplification of modal decoupling method for multiconductor trans-
mission lines. IEEE Transactions on Microwave Theory and Techniques,
43(9):2090–2100, Sept. 1995.
[55] F. Szidarovszky and O. A. Palusinski. Clarification of a decoupling method
for multiconductor transmission lines. IEEE Transactions on Microwave
Theory and Techniques, 47(5):648–649, May 1999.
[56] R. J. Lopez. Advanced Engineering Mathematics. Boston, MA: Addison-
Wesley, 2001.
[57] C. R. Paul. Introduction to Electromagnetic Compatibility. Hoboken, NJ:
John Wiley & Sons, 2006.
[58] J. Chen and L. He. A decoupling method for analysis of coupled RLC
interconnects. In Proc. IEEE/ACM Int. Great Lakes Symposium on VLSI,
pages 41–46, 2002.
[59] G. Chen and E. G. Friedman. An RLC interconnect model based on
Fourier analysis. IEEE Transactions on Computer-Aided Design of Inte-
grated Circuits and Systems, 24(2):170–183, Feb. 2005.
[60] S. Subash, M. S. Rahaman, and M. H. Chowdhury. Compact model for
carbon nanotubes interconnects using Fourier series analysis. In Proc.
IEEE Int. Midwest Symp. on Circuits and Systems, pages 1175–1178,
2009.
[61] Y. Eo, S. Shin, W. R. Eisenstadt, and J. Shim. Generalized traveling-
wave-based waveform approximation technique for the efficient signal in-
tegrity verification of multicoupled transmission line system. IEEE Trans-
actions on Computer-Aided Design of Integrated Circuits and Systems,
21(12):1489–1497, Dec. 2002.
[62] J. Cong, L. He, K.-Y. Khoo, C.-K. Koh, and Z. Pan. Interconnect design
for deep submicron ICs. In Proc. IEEE/ACM Int. Conf. on Computer-
Aided Design, pages 478–485, 1997.
[63] J. Qian, S. Pullela, and L. Pillage. Modeling the ’effective capacitance’ for
the RC interconnect of CMOS gates. IEEE Transactions on Computer-
Aided Design of Integrated Circuits and Systems, 13(12):1526–1535, Dec.
1994.
[64] K. Agarwal, D. Sylvester, and D. Blaauw. An effective capacitance based
driver output model for on-chip RLC interconnects. In Proc. 40th Design
Automation Conference, pages 376–381, 2003.
[65] S. Abbaspour and M. Pedram. Calculating the effective capacitance for
the RC interconnect in VDSM technologies. In Proc. Asia and South
Pacific Design Automation Conference, pages 43–48, 2003.
[66] A. Goel and S. Vrudhula. Current source based standard cell model for
accurate signal integrity and timing analysis. In Proc. Design, Automation
and Test in Europe, pages 574–579, 2008.
[67] N. Menezes, C. Kashyap, and C. Amin. A ’true’ electrical cell model for
timing, noise, and power grid verification. In Proc. 45th Design Automa-
tion Conference, pages 462–467, 2008.
[68] J. F. Croix and D. F. Wong. Blade and Razor: Cell and interconnect delay
analysis using current-based models. In Proc. 40th Design Automation
Conference, pages 386–389, 2003.
[69] I. Keller, K. H. Tam, and V. Kariat. Challenges in gate level modeling
for delay and SI at 65nm and below. In Proc. 45th Design Automation
Conference, pages 468–473, 2008.
[70] S. Nazarian, H. Fatemi, and M. Pedram. Accurate timing and noise analy-
sis of combinational and sequential logic cells using current source model-
ing. IEEE Transactions on Very Large Scale Integration (VLSI) Systems,
19(1):92–103, Jan. 2011.
[71] P. Liljeberg, J. Tuominen, S. Tuuna, J. Plosila, and J. Isoaho. Self-timed
approach for noise reduction in NOC. In J. Nurmi, H. Tenhunen, J. Isoaho,
and A. Jantsch, editors, Interconnect-Centric Design for Advanced SOC
and NOC, pages 285–313. Kluwer, Dordrecht, The Netherlands, 2004.
[72] T. Sato, Y. Cao, K. Agarwal, D. Sylvester, and C. Hu. Bidirectional
closed-form transformation between on-chip coupling noise waveforms and
interconnect delay-change curves. IEEE Transactions on Computer-Aided
Design of Integrated Circuits and Systems, 22(5):560–572, May 2003.
[73] M. Becer and I. N. Hajj. An analytical model for delay and crosstalk
estimation in interconnects under general switching conditions. In Proc.
IEEE Int. Conf. Electronics, Circuits and Systems, pages 831–834, 2000.
[74] G. Servel and D. Deschacht. On-chip crosstalk evaluation between adja-
cent interconnections. In Proc. IEEE Int. Conf. Electronics, Circuits and
Systems, pages 827–834, 2000.
[75] A. B. Kahng, S. Muddu, N. Pol, and D. Vidhani. Noise model for mul-
tiple segmented coupled RC interconnects. In Proc. Int. Symp. Quality
Electronic Design, pages 145–150, 2001.
[76] J. Cong, D. Zhigang Pan, and P. V. Srinivas. Improved crosstalk model-
ing for noise constrained interconnect optimization. In Proc. Asia South
Pacific Design Automation Conf., pages 373–378, 2001.
[77] H. Kawaguchi and T. Sakurai. Delay and noise formulas for capacitively
coupled distributed RC lines. In Proc. Asia and South Pacific Design
Automation Conf., pages 35–43, 1998.
[78] D. Pamunuwa, L.-R. Zheng, and H. Tenhunen. Maximizing throughput
over parallel wire structures in the deep submicrometer regime. IEEE
Transactions on Very Large Scale Integration (VLSI) Systems, 11(2):224–
243, Apr. 2003.
[79] I. B. Dhaou, K. Parhi, and H. Tenhunen. Energy efficient signaling in
deep submicron CMOS technology. Special Issue on Timing Analysis and
Optimization for Deep Sub-Micron ICs, VLSI Design Journal, 15(3):563–
586, 2002.
[80] S. H. Choi, B. C. Paul, and K. Roy. Dynamic noise analysis with capacitive
and inductive coupling. In Proc. IEEE Int. Conf. VLSI Design, pages 65–
70, 2002.
[81] G. Servel, D. Deschacht, F. Saliou, J.-L. Mattei, and F. Huret. Inductance
effect in crosstalk prediction. IEEE Transactions on Advanced Packaging,
25(3):340–346, Aug. 2002.
[82] S. Tuuna, J. Isoaho, and H. Tenhunen. Analytical model for crosstalk and
intersymbol interference in point-to-point buses. IEEE Transactions on
Computer-Aided Design of Integrated Circuits and Systems, 25(7):1400–
1411, July 2006.
[83] W. McC. Siebert. Circuits, Signals and Systems. Cambridge, MA: The
MIT Press, 1986.
[84] Y. I. Ismail and E. G. Friedman. Effects of inductance on the propagation
delay and repeater insertion in VLSI circuits. IEEE Transactions on Very
Large Scale Integration (VLSI) Systems, 8(2):195–206, Apr. 2000.
[85] Y. Eo, W. R. Eisenstadt, J. Y. Jeong, and O.-K. Kwon. A new on-
chip interconnect crosstalk model and experimental verification for CMOS
VLSI circuit design. IEEE Transactions on Electron Devices, 47(1):129–
140, Jan. 2000.
[86] W.-Y. Chen, S. K. Gupta, and M. A. Breuer. Analytical models for
crosstalk excitation and propagation in VLSI circuits. IEEE Trans-
actions on Computer-Aided Design of Integrated Circuits and Systems,
21(10):1117–1131, Oct. 2002.
[87] H. B. Bakoglu. Circuits, interconnections and packaging for VLSI. Read-
ing, MA: Addison-Wesley, 1990.
[88] P. Heydari, S. Abbaspour, and M. Pedram. Interconnect energy dissi-
pation in high-speed ULSI circuits. IEEE Transactions on Circuits and
Systems I, 51(8):1501–1514, Aug. 2004.
[89] A. Djordjevic, M. Bazdar, T. Sarkar, and R. Harrington. LINPAR for
Windows: Matrix parameters for multiconductor transmission lines, soft-
ware and user’s manual, version 2.0. Artech House, Boston, 1999.
[90] M. Kamon, M. J. Tsuk, and J. K. White. FASTHENRY: A multipole-accelerated
3-D inductance extraction program. IEEE Transactions on
Microwave Theory and Techniques, 42(9):1750–1758, Sep. 1994.
[91] K. T. Tang and E. G. Friedman. Simultaneous switching noise in on-chip
CMOS power distribution networks. IEEE Transactions on Very Large
Scale Integration (VLSI) Systems, 10(4):487–493, Aug. 2002.
[92] Y. Eo, W. R. Eisenstadt, J. Y. Jeong, and O.-K. Kwon. New simultaneous
switching noise analysis and modeling for high-speed and high-density
CMOS IC package design. IEEE Transactions on Advanced Packaging,
23(2):303–312, May 2000.
[93] S. Tuuna, L.-R. Zheng, J. Isoaho, and H. Tenhunen. Modeling of on-chip
bus switching current and its impact on noise in power supply grid. IEEE
Transactions on Very Large Scale Integration (VLSI) Systems, 16(6):766–
770, Jun. 2008.
[94] L.-R. Zheng and H. Tenhunen. Fast modeling of core switching noise
on distributed LRC power grid in ULSI circuits. IEEE Transactions on
Advanced Packaging, 24(3):245–254, Aug. 2001.
[95] M. R. Stan and W. P. Burleson. Bus-invert coding for low-power I/O.
IEEE Transactions on Very Large Scale Integration (VLSI) Systems,
3(1):49–58, Mar. 1995.
[96] H. Kaul, D. Sylvester, M. A. Anders, and R. K. Krishnamurthy. Design
and analysis of spatial encoding circuits for peak power reduction in on-
chip buses. IEEE Transactions on Very Large Scale Integration (VLSI)
Systems, 13(11):1225–1238, Nov. 2005.
[97] C. Raghunandan, K. S. Sainarayanan, and M. B. Srinivas. Bus-encoding
technique to reduce delay, power and simultaneous switching noise (SSN)
in RLC interconnects. In Proc. ACM Great Lakes Symp. on VLSI, pages
371–376, 2007.
[98] B. Victor and K. Keutzer. Bus encoding to prevent crosstalk delay. In
Proc. IEEE Int. Conf. on Computer-Aided Design, pages 57–63, 2001.
[99] C.-G. Lyuh and T. Kim. Low-power bus encoding with crosstalk de-
lay elimination. IEE Proceedings Computers and Digital Techniques,
153(2):93–100, Mar. 2006.
[100] Z. Khan, T. Arslan, and A. T. Erdogan. Low power system on chip
bus encoding scheme with crosstalk noise reduction. IEE Proceedings
Computers and Digital Techniques, 153(2):101–108, Mar. 2006.
[101] E. Macii, M. Poncino, and S. Salerno. Combining wire swapping and
spacing for low-power deep-submicron buses. In Proc. ACM Great Lakes
Symp. on VLSI, pages 198–202, 2003.
[102] J. Zhang and E. G. Friedman. Effect of shield insertion on reducing
crosstalk noise between coupled interconnects. In Proc. Int. Symp. on
Circuits and Systems, pages 529–532, 2004.
[103] M. Mutyam. Selective shielding technique to eliminate crosstalk transi-
tions. ACM Transactions on Design Automation of Electronic Systems,
14(3):43:1–43:20, May 2009.
[104] N. Hanchate and N. Ranganathan. Simultaneous interconnect delay and
crosstalk noise optimization through gate sizing using game theory. IEEE
Transactions on Computers, 55(8):1011–1023, Aug. 2006.
[105] K. Hirose and H. Yasuura. A bus delay reduction technique considering
crosstalk. In Proc. Design Automation and Test in Europe, pages 441–445,
2000.
[106] M. Ghoneima, Y. I. Ismail, M. M. Khellah, J. W. Tschanz, and V. De.
Reducing the effective coupling capacitance in buses using threshold volt-
age adjustment techniques. IEEE Transactions on Circuits and Systems
I, 53(9):1928–1933, Sept. 2006.
[107] M. Ghoneima and Y. I. Ismail. Utilizing the effect of relative delay on
energy dissipation in low-power on-chip buses. IEEE Transactions on Very
Large Scale Integration (VLSI) Systems, 12(12):1348–1359, Dec. 2004.
[108] Y. M. Lee and K. H. Park. Mesochronous bus for reducing peak I/O power
dissipation. IEE Electronics Letters, 37(5):278–279, Mar. 2001.
[109] S. Tuuna, J. Isoaho, and H. Tenhunen. Skewing-based method for re-
duction of functional crosstalk and power supply noise caused by on-chip
buses. IET Computers & Digital Techniques, in press.
[110] L.-R. Zheng and S. Tuuna. Power distribution modeling and integrity
analysis. In R. Nair and D. Bennett, editors, Power Integrity Analysis
and Management for Integrated Circuits, pages 221–258. Prentice Hall,
Boston, MA, 2010.
[111] Y. Ogasahara, M. Hashimoto, and T. Onoye. Measurement and analysis
of inductive coupling noise in 90 nm global interconnects. IEEE Journal
of Solid-State Circuits, 43(3):718–728, Mar. 2008.
[112] Y. Massoud, S. Majors, J. Kawa, T. Bustami, D. MacMillen, and J. White.
Managing on-chip inductive effects. IEEE Transactions on Very Large
Scale Integration (VLSI) Systems, 10(6):789–798, Dec. 2002.
[113] P. Gupta and A. B. Kahng. Wire swizzling to reduce delay uncertainty
due to capacitive coupling. In Proc. Int. Conf. on VLSI Design, pages
431–436, 2004.
[114] B. Victor and K. Keutzer. Bus encoding to prevent crosstalk delay. In
Proc. Int. Conf. on Computer Aided Design, pages 57–63, 2001.
[115] M. Lampropoulos, B.M. Al-Hashimi, and P. Rosinger. Minimization of
crosstalk noise, delay and power using a modified bus invert technique. In
Proc. Design, Automation and Test in Europe, pages 1372–1373, 2004.
[116] S.-W. Tu, Y.-W. Chang, and J.-Y. Jou. RLC coupling-aware simulation
and on-chip bus encoding for delay reduction. IEEE Transactions on
Computer-Aided Design of Integrated Circuits and Systems, 25(10):2258–
2264, Oct. 2006.
[117] R. R. Rao, H. S. Deogun, D. Blaauw, and D. Sylvester. Bus encod-
ing for total power reduction using a leakage-aware buffer configura-
tion. IEEE Transactions on Very Large Scale Integration (VLSI) Systems,
13(12):1376–1383, Dec. 2005.
[118] Y. Cao, P. Gupta, A.B. Kahng, D. Sylvester, and J. Yang. Design sensi-
tivities to variability: extrapolations and assessments in nanometer VLSI.
In Proc. ASIC/SoC Conference, pages 411–415, 2002.
[119] V. Mehrotra. Modeling the effects of systematic process variation on cir-
cuit performance. Ph.D. thesis, Massachusetts Institute of Technology,
2001.
[120] A. B. Kahng and K. Samadi. CMP fill synthesis: A survey of recent stud-
ies. IEEE Transactions on Computer-Aided Design of Integrated Circuits
and Systems, 27(1):3–19, Jan. 2008.
[121] P. Yu, S. X. Shi, and D. Z. Pan. True process variation aware optical
proximity correction with variational lithography modeling and model cal-
ibration. SPIE Journal of Micro/Nanolithography, MEMS, and MOEMS,
6(3):1–36, Sept. 2007.
[122] S. Tuuna, E. Nigussie, J. Isoaho, and H. Tenhunen. Analysis of delay
variation in encoded on-chip bus signaling under process variation. In
Proc. 21st Int. Conf. on VLSI Design, pages 228–234, 2008.
[123] M. A. El-Moursy and E. G. Friedman. Shielding effect of on-chip inter-
connect inductance. IEEE Transactions on Very Large Scale Integration
(VLSI) Systems, 13(3):396–400, Mar. 2005.
[124] R. Arunachalam, F. Dartu, and L. T. Pileggi. CMOS gate delay models
for general RLC loading. In Proc. Int. Conf. on Computer Design, pages
224–229, 1997.
[125] N. Delorme, M. Belleville, and J. Chilo. Inductance and capacitance
analytic formulas for VLSI interconnects. IEEE Electronics Letters,
32(11):996–997, May 1996.
[126] M. E. Dean, T. E. Williams, and D. L. Dill. Efficient self-timing with level-
encoded 2-phase dual rail (LEDR). In Proc. Univ. of California/Santa
Cruz Conf. on Advanced Research in VLSI, pages 55–70, 1991.
[127] W. J. Bainbridge and S. B. Furber. Delay insensitive system-on-chip in-
terconnect using 1-of-4 data encoding. In Proc. 7th Int. Symp. on Asyn-
chronous Circuits and Systems, pages 118–126, 2001.
[128] A. Mitra, W. F. McLaughlin, and S. M. Nowick. Efficient asynchronous
protocol converters for two-phase delay-insensitive global communication.
In Proc. 13th Int. Symp. on Asynchronous Circuits and Systems, pages
186–195, 2007.
[129] W. Zhao, F. Liu, K. Agarwal, D. Acharyya, S. R. Nassif, K. J. Nowka,
and Y. Cao. Rigorous extraction of process variations for 65-nm CMOS
design. IEEE Transactions on Semiconductor Manufacturing, 22(1):196–
203, Feb. 2009.
[130] W. Zhao and Y. Cao. New generation of predictive technology model
for sub-45nm design exploration. In Proc. Int. Symposium on Quality
Electronic Design, pages 585–590, 2006.
[131] D. Blaauw, K. Chopra, A. Srivastava, and L. Scheffer. Statistical timing
analysis: From basic principles to state of the art. IEEE Transactions on
Computer-Aided Design of Integrated Circuits and Systems, 27(4):589–
607, Apr. 2008.
[132] A. Agarwal, D. Blaauw, V. Zolotov, S. Sundareswaran, M. Zhao, K. Gala,
and R. Panda. Statistical delay computation considering spatial correla-
tions. In Proc. Asia and South Pacific Design Automation Conference,
pages 271–276, 2003.
[133] A. Singhee, S. Singhal, and R. A. Rutenbar. Practical, fast Monte Carlo
statistical static timing analysis: Why and how. In Proc. Int. Conf. on
Computer-Aided Design, pages 190–195, 2008.
[134] J. Jaffari and M. Anis. Practical Monte-Carlo based timing yield estima-
tion of digital circuits. In Proc. Int. Conf. on Design, Automation and
Test in Europe, pages 807–812, 2010.
[135] T. Wang and F. Yuan. A new current-mode incremental signaling scheme
with applications to Gb/s parallel links. IEEE Transactions on Circuits
and Systems I, 54(2):255–267, Feb. 2007.
[136] A. P. Jose, G. Patounakis, and K. L. Shepard. Pulsed current-mode sig-
naling for nearly speed-of-light intrachip communication. IEEE Journal
of Solid-State Circuits, 41(4):772–780, Apr. 2006.
[137] L. Zhang, J. M. Wilson, R. Bashirullah, L. Luo, J. Xu, and P. D. Franzon.
A 32-Gb/s on-chip bus with driver pre-emphasis signaling. IEEE Transac-
tions on Very Large Scale Integration (VLSI) Systems, 17(9):1267–1274,
Sept. 2009.
[138] E. Nigussie, J. Plosila, S. Tuuna, J. Isoaho, and H. Tenhunen. Energy
efficient semi-serial on-chip link through circuit optimizations and inte-
gration of signaling techniques. IEEE Transactions on Very Large Scale
Integration (VLSI) Systems, in press.
[139] P. Heydari, S. Abbaspour, and M. Pedram. Interconnect energy dissi-
pation in high-speed ULSI circuits. IEEE Transactions on Circuits and
Systems I, 51(8):1501–1514, Aug. 2004.
[140] M. Alioto, G. Palumbo, and M. Poli. Energy consumption in RC tree
circuits. IEEE Transactions on Very Large Scale Integration (VLSI) Sys-
tems, 14(5):452–461, May 2006.
[141] M. Alioto, G. Palumbo, and M. Poli. Analysis and modeling of energy
consumption in RLC tree circuits. IEEE Transactions on Very Large Scale
Integration (VLSI) Systems, 17(2):278–291, Feb. 2009.
[142] W. C. Athas, L. J. Svensson, J. G. Koller, N. Tzartzanis, and E. Y.-C.
Chou. Low-power digital systems based on adiabatic-switching princi-
ples. IEEE Transactions on Very Large Scale Integration (VLSI) Systems,
2(4):398–407, Dec. 1994.
[143] S. Paul, A. M. Schlaffer, and J. A. Nossek. Optimal charging of capaci-
tors. IEEE Transactions on Circuits and Systems I, 47(7):1009–1016, July
2000.
[144] K. Sundaresan and N. R. Mahapatra. An analysis of timing violations
due to spatially distributed thermal effects in global wires. In Proc.
IEEE/ACM 44th Design Automation Conference, pages 515–520, 2007.
[145] R. Bashirullah, W. Liu, and R.K. Cavin. Current-mode signaling in deep
submicrometer global interconnects. IEEE Transactions on Very Large
Scale Integration (VLSI) Systems, 11(3):406–417, June 2003.
[146] A.N. Irfansyah, T. Lehmann, and S. Nooshabadi. Energy delay optimiza-
tion methodology for current-mode signaling for on-chip interconnects. In
Proc. IEEE Integrated Circuit Design and Technology Conf., pages 147–
150, 2008.
[147] C. Kashyap and B.L. Krauter. A realizable driving point model for on-chip
interconnect with inductance. In Proc. Design Automation Conference,
pages 190–195, 2000.
[148] S. Tuuna, E. Nigussie, J. Isoaho, and H. Tenhunen. Modeling of energy
dissipation in RLC current-mode signaling. IEEE Transactions on Very
Large Scale Integration (VLSI) Systems, in press.
[149] J. A. Dobrowolski. Microwave Network Design Using the Scattering Ma-
trix. Artech House, Norwood, MA, 2010.
[150] K. Nose and T. Sakurai. Analysis and future trend of short-circuit power.
IEEE Transactions on Computer-Aided Design of Integrated Circuits and
Systems, 19(9):1023–1030, Sept. 2000.
[151] A. Maheshwari and W. Burleson. Differential current-sensing for on-chip
interconnects. IEEE Transactions on Very Large Scale Integration (VLSI)
Systems, 12(12):1321–1329, Dec. 2004.
[152] G. Moon, M.E. Zaghloul, and R.W. Newcomb. An enhancement-mode
MOS voltage-controlled linear resistor with large dynamic range. IEEE
Transactions on Circuits and Systems, 37(10):1284–1288, Oct. 1990.
[153] R. Weerasekera, M. Grange, D. Pamunuwa, and H. Tenhunen. On sig-
nalling over through-silicon via (TSV) interconnects in 3-D integrated
circuits. In Proc. Conf. on Design, Automation and Test in Europe, pages
1325–1328, 2010.
[154] K. Salah, H. Ragai, Y. Ismail, and A. E. Rouby. Equivalent lumped
element models for various n-port through silicon vias networks. In Proc.
Asia and South Pacific Design Automation Conference, pages 176–183,
2011.
[155] P. J. Burke. An RF circuit model for carbon nanotubes. IEEE Transac-
tions on Nanotechnology, 2(1):55–58, Mar. 2003.
[156] A. Raychowdhury and K. Roy. Modeling of metallic carbon-nanotube in-
terconnects for circuit simulations and a comparison with Cu interconnects
for scaled technologies. IEEE Transactions on Computer-Aided Design of





In the equations below, $s_1$, $s_2$ and $s_3$ are the roots of the third degree equation






) $+ \frac{1}{R_s L C_1 C_2}$. It has been assumed in
the derivation of the inverse Laplace transform that all roots are distinct.
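The distinct-root assumption matters because each pole then contributes a single exponential term whose weight is a simple residue. A minimal sketch of that partial-fraction step follows, written in Python with only the standard library. The function names, the unit numerator and the pole values -1, -2 and -3 are illustrative assumptions and are not taken from the thesis; in the model the poles would be the roots s1, s2 and s3 of the cubic above.

```python
import cmath


def residues_distinct(numerator, poles):
    """Residues of N(s) / prod_k (s - p_k), valid only for distinct poles."""
    residues = []
    for i, p in enumerate(poles):
        denom = 1.0 + 0j
        for j, q in enumerate(poles):
            if j == i:
                continue
            # A repeated pole would make the simple residue formula divide by zero.
            if abs(p - q) < 1e-12 * max(1.0, abs(p), abs(q)):
                raise ValueError("poles must be distinct")
            denom *= (p - q)
        residues.append(numerator(p) / denom)
    return residues


def time_response(numerator, poles):
    """Inverse Laplace transform as a sum of residue-weighted exponentials."""
    residues = residues_distinct(numerator, poles)

    def f(t):
        total = sum(r * cmath.exp(p * t) for r, p in zip(residues, poles))
        # For real poles or complex-conjugate pole pairs the result is real.
        return total.real

    return f


# Hypothetical example: F(s) = 1 / ((s + 1)(s + 2)(s + 3)).
f = time_response(lambda s: 1.0, [-1.0, -2.0, -3.0])
```

With a repeated root the residue formula above breaks down, so the sketch raises an error instead of returning a value; that case would need the higher-order partial-fraction terms that the distinct-root assumption avoids.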
Coefficients for Equation (3.12):

− s2s3 + s1s2
(A.2)
Coefficients for Equation (3.13):
$b_1 = \beta_4 - \beta_2 - \frac{\beta_8}{\beta_7} - \frac{\beta_3 (s_1 s_3 - s_1 s_2)}{s_2 s_3 - s_1 s_2}$

(s2 − s3 − (s1s3−s1s2)(s1−s3)s2s3−s1s2 )
)−1
β4 = V02 − V0 + I0LRsC1 −
RsRC1I0+LI0
RsC1






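$\beta_6 = -\frac{I_0}{R_s C_1 C_2}$
$\beta_7 = s_3 - s_1 + \frac{(s_2 - s_1) s_2 s_3 - (s_2 - s_1) s_1 s_2}{s_1 s_3 - s_2 s_3}$
$\beta_8 = \beta_5 + \beta_4 (s_3 + s_2) - \frac{(s_2 - s_1)(\beta_6 - \beta_4 s_2 s_3)}{s_1 s_3 - s_2 s_3}$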
(A.4)
Coefficients for Equation (3.14):
$c_1 = \gamma_1 + \gamma_4$
$c_2 = -\gamma_1 - \gamma_2 - \gamma_3 - \frac{\gamma_4 s_3 + \gamma_5 (s_3 - s_2)}{s_3 - s_1}$
$c_3 = \gamma_2 + \gamma_5$











c9 = (γ6 + γ7)(
(s2s3−s1s3)(s3−s1)
s1−s2









c14 = (γ8 + γ9)(
(s2s3−s1s3)(s3−s1)
s1−s2

CLRS + 1 (B.8)
$d_0 = \frac{1}{t_r}$
(B.9)
