Statistical Power Supply Dynamic Noise Prediction in Hierarchical Power Grid and Package Networks by Graziano, Mariagrazia & Piccinini, Gianluca
M. Graziano *, G. Piccinini
Dipartimento di Elettronica, Politecnico di Torino,
Corso Duca degli Abruzzi, 24 Torino, I10129 Italy
Abstract
One of the most crucial challenges in the design of high performance
Systems-on-Chip is to cope with their susceptibility to power supply noise,
which is due to high frequencies, the huge number of functional blocks and
technology scaling. Marking a difference from traditional post physical design
static voltage drop analysis, a priori dynamic voltage drop evaluation is the
focus of this work. It takes into account transient currents and on-chip and
package RLC parasitics while exploring the power grid design solution space:
Design countermeasures can thus be defined early and long post physical design
verification cycles can be shortened. As shown by an extensive set of results,
a carefully extracted and modular grid library assures a realistic evaluation
of the impact of parasitics on noise and facilitates the power network
construction; furthermore, statistical analysis guarantees a correct current
envelope evaluation and Spice simulations endorse reliable results.
Key words: Interconnects; Power supply noise; IR drop; Switching activity
1 Introduction
The urgency to integrate an increasing number of functional units in the same
circuit enhances the design complexity of the System-on-Chip (SoC) scenario.
One of the most critical concerns is related to the routing and sizing of the
interconnects delivering both signal and power supply to the functional units.
In fact, the compelling scaling down of transistor feature sizes that makes the
SoC integration level achievable is strictly entangled with signal and power
supply integrity issues, which aggressively challenge the design of
interconnect systems. In particular, the Vdd and Gnd signals are exposed to
deviations from their nominal values because of the mutual impact of two
factors: the increasing currents to be delivered to the huge number of active
devices, and the parasitics of both on-chip and package-to-die wires, which are
less and less negligible due to scaling down and to rising frequencies.
* Corresponding author.
Email address: mariagrazia.graziano@polito.it (M. Graziano).
The phenomena at the basis of Power Supply Noise (PSN) are voltage drop
(IR drop) and switching noise (L dI/dt) [1]. The former is due to the high
amount of current needed by the power hungry blocks and to the wire
resistance, which tends not to decrease proportionally in scaled technologies.
The tradeoff between interconnect density requirements, urging toward smaller
wires, and the material and geometrical countermeasures proposed by process
engineers is the key to limiting the IR drop impact in future technology
nodes. The latter is related to the increasing current transients allowed by
scaled transistors and required by the frequency constraints of
performance-demanding applications. In addition, the inductive behavior of
interconnects is less and less negligible as frequencies increase, thus
enhancing the detrimental effect of switching noise.
Both IR drop and L dI/dt are related to the on-chip power supply network as
well as to the package-to-die power delivery system. As reviewed in [2], the
package level parasitic inductance has traditionally dominated the total
inductance of the power distribution network. Conversely, the on-chip wire
resistance has been recognized in past years as having the most aggressive
impact on the total PSN drop. This classification no longer seems suitable for
modern high-performance SoCs, as the forecasted rate of increase of the
transient current is more than double that of the average current [3].
Furthermore, the use of flip-chip package technology in place of wirebonding
balances the on-chip/package inductance impact ratio. These aspects imply that
switching noise will in the future be less negligible with respect to IR drop
[4], and that on-chip and package parasitics have influences on PSN which are
difficult to disentangle [5], [6], [7].
An evolved PSN classification has been recently introduced: It differentiates
the Static Voltage Drop (SVD), that is $I_{avg} R$, associated with the average
gate/block current, from the Dynamic Voltage Drop (DVD), that is
$i(t) R + L \, di(t)/dt$, due to the transient gate/block current. The latter
includes not only the resistive drop, but also the switching noise and thus
the on-chip and package inductive impact. DVD evaluation is considered at the
time of writing the most trustworthy indication of PSN, as it accurately takes
into account transient currents. In fact, it has been shown in [8] that the
use of an average IR drop for all the gates in a circuit, using corner analyses
(worst/best case power supply voltage) or the derating factor methodology
(gate delay linearly varying with the average power supply voltage variation),
leads to completely inaccurate results in terms of circuit timing analysis.
For example, in a medium performance industrial design (340K gates),
synthesised using a 0.13 µm technology, the critical path analysis performed
using the derating factor method led to approximately 50% underestimation of
the noise effect on timing, if compared to an accurate transient Spice
simulation. The importance of taking into account transient currents for both
IR drop and switching noise when evaluating PSN effects is thus beyond doubt.
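The SVD/DVD distinction can be illustrated with a minimal Python sketch that evaluates both quantities for a single triangular current pulse. All the numerical values used here (R, L, pulse peak and timing) are illustrative assumptions and are not data from this work; the function names are hypothetical.

```python
def triangle(t, t0, tp, tf, peak):
    """Triangular current pulse: 0 at t0, `peak` at tp, back to 0 at tf."""
    if t <= t0 or t >= tf:
        return 0.0
    if t <= tp:
        return peak * (t - t0) / (tp - t0)
    return peak * (tf - t) / (tf - tp)


def svd_and_dvd(r, l, t0, tp, tf, peak, t_clk, steps=3000):
    """Return (SVD, max |DVD|) over one clock period of length t_clk."""
    dt = t_clk / steps
    samples = [triangle(k * dt, t0, tp, tf, peak) for k in range(steps + 1)]
    i_avg = sum(samples) * dt / t_clk            # average current over the cycle
    svd = i_avg * r                              # static drop: I_avg * R
    dvd_max = 0.0
    for k in range(1, steps + 1):
        didt = (samples[k] - samples[k - 1]) / dt    # finite-difference di/dt
        dvd = samples[k] * r + l * didt              # dynamic drop: i(t) R + L di/dt
        dvd_max = max(dvd_max, abs(dvd))
    return svd, dvd_max


# ASSUMED values: 1 ohm, 1 nH, 1 mA peak, 0.2 ns rise and fall, 3 ns clock.
svd, dvd = svd_and_dvd(r=1.0, l=1e-9, t0=0.0, tp=0.2e-9, tf=0.4e-9,
                       peak=1e-3, t_clk=3e-9)
print(f"SVD = {svd * 1e3:.3f} mV, max |DVD| = {dvd * 1e3:.3f} mV")
```

With these assumed numbers the L di/dt term alone is about 5 mV during the rising edge, far above the static drop obtained from the average current, which is why an average-based analysis can miss most of the noise.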
Another important point concerns the design flow stage in which PSN is taken
into account. Traditional design methodologies use, as much as possible,
predefined and overdesigned power grids. This is risky in current and future
technology nodes, and not practicable in crowded high performance modern
designs, in which interconnect resources must be carefully assigned. In more
accurate design methodologies, physical designers create the power network,
verify the power supply voltage variations with back-end tools, and adjust the
power grid sizes and/or the block placement. In high performance design this
is not a one-shot phase; on the contrary, it is repeated many times until the
constraints are met, leading to intolerable time-to-market delays. Even if
this accurate back-end analysis phase cannot be avoided, a prediction of the
power grid design criteria would help the designer close the loop in a shorter
time. This is even more important when not only the impact of the on-chip
power network on supply noise is considered, but the wirebonding or flip-chip
influence is taken into account as well. As a matter of fact, the choice of
the connection points between package and on-chip grid, their parasitics and
their layout mutually affect PSN. Even if an on-chip power grid has been
accurately designed, its DVD performance may be completely compromised once
package parasitics are included in the analysis. At the time of writing, the
possibility of knowing the DVD amount and its connection with the power grid
parameters is confined to the back-end analysis. This means that a post
physical design netlist must be extracted (for both interconnects and logic
gates) and a time consuming vector based Spice-like simulation must be
executed. If, on the one hand, this step is sometimes feasible as a final
verification step, on the other hand it is not suitable for a trial-and-error
design method. Furthermore, the package impact is often neglected at this
design stage. The aim of this work is thus to assess noise statistics and
their dependency on package and on-chip design parameters. Furthermore, a
methodology is proposed for estimating, in an early design phase, the
potential DVD and its relation to the design variables. The solution space can
thus be explored before the physical design step is taken. In this way, when
designing the on-chip power grid, near-to-optimal solution criteria can be
adopted, and rapid design closure can be achieved. The prediction regards
dynamic voltage drop and concerns technology node, geometry, topology and
on-chip and package power supply design alternatives.
The rest of the paper is organized as follows: In section 2 previous works on the
subject are reviewed and in section 3 the proposed methodology is described.
In section 4 the structure used for the PSN evaluation is analyzed and in
section 5 the achieved results are discussed. Conclusions are drawn in section
6.
2 Previous works
The most important PSN aspects to be analyzed are: the electrical behaviour of
the power supply network, the current envelopes flowing through the metal
lines, the power grid topology and its connection to the package, and the
package type and parasitics.
In most of the works addressing PSN analysis, the power supply grid parasitics
are initially extracted and the corresponding network is then, in most cases,
simplified to reduce the computational resources necessary for executing the
electrical simulation. The challenging tasks concern precisely both the
parasitics extraction accuracy and the power grid modeling. The goal is to
achieve a good trade-off between precision, and thus reliability of the
results, and the possibility of reasonably managing the complexity of the
extracted grid. Past works have pursued this goal with different
methodologies. They differ in the way in which the grid parasitics are
extracted and in whether the extraction considers or neglects the inductance
(RLC vs. RC parasitic networks). For example, in [9] macromodels for grid
subsets are created, while in [10] the grid is reduced to a coarser structure
that is then mapped back to the original grid. In [11] and [12] the grid is
modeled using transmission line theory and, in particular, in [4] lossy
transmission lines are used for modeling power grid blocks with frequency
dependent properties. In [13] a finite difference time domain method is used,
based on the solution of Maxwell's equations in the time domain.
The circuit switching activity has a strong impact on the amount and
distribution of IR drop and switching noise. Depending on the accuracy
required and on the use for which this information is derived, different
analysis approaches can be applied. The simplest way to assess the current
value in a macroblock is to sum up the worst case currents of all the gates in
the block, under the assumption that their activity timing windows are
superposed. This assumption is the more realistic the more the considered
block is made of dynamic logic and the fewer combinatorial paths are present.
In any case, it has been shown that this method leads to an extremely
pessimistic evaluation of the wire width necessary to overcome
electromigration and IR drop, especially for big blocks or circuits. Two
problematic assumptions are at the base of this approach: the coincidence of
the switching windows for all the gates, and the equal value and "direction"
of all the switching currents. This method was abandoned when routing
congestion became a concern and simple directives to routing tools were no
longer sufficient to solve the problem. As an alternative, in [14] the use of
transistor level or gate level simulations is proposed to find the current
waveform drawn by the circuit block, applying a user defined or a random input
vector. In some other works genetic algorithms are used to obtain an input set
that will produce a worst case voltage drop at all the power bus nodes. Most
of the time these approaches are resource consuming; furthermore, it may not
be true that the maximum total instantaneous current produces the maximum
voltage drop at all power bus nodes. Another approach, frequently found in the
literature, is instead input independent; it differs from the previous one
because it performs a static timing analysis to find the minimum and the
maximum time at which a gate may switch. Results may still be pessimistic
because gates are assumed to switch in the same direction. If detailed
information is needed, the only reliable approach is to take into account both
the polarity and the timing of the current spikes. Within this methodology
different solutions have been proposed: In [15] temporal and spatial
correlations and directions of switching currents are accounted for using a
constrained graph approach; in [16] the dependency of the block current
waveform on its input vector is captured using frequency domain current
macro-models, while in [17] current signatures are generated using complex
compression methods.
The other point strictly related to DVD is the influence of package
parasitics. Switching noise became a concern when the parasitic inductance of
the bonding wires connecting the core pads to the package reached critical
values. The circuit pads are normally directly connected to signal buffers
driving huge amounts of current, and thus inducing dangerous glitches on the
on-chip power distribution wires. The flip-chip (C4) full grid array
technology has in many cases replaced wirebonding in high performance ICs. The
inductance assumes smaller values in this case. Furthermore, the power
distribution technique has changed from the traditional interdigitated style
to a hierarchical full area power supply pin connection. This allows a better
distribution of the supply voltage to all the core points. In this scenario it
is thus necessary, when modeling the power grid, to take into account the
power supply voltage connection to the package, and the equivalent model of
the flip-chip or bonding wire interconnects. Many works in the literature have
characterized the differences between the flip-chip and the wirebonding
technology. In particular, in [18] the electrical performance of the two is
analyzed using a 3D electromagnetic simulator: Results show that C4 bumps
perform better than wirebonding. However, the mutual impact of package and
on-chip parasitics on PSN is not frequently taken into account. In [19] the
effective resistance between circuit blocks and power supply bumps is taken
into account during the floorplanning design stage. In [20] the package is
analyzed in connection with the on-chip power supply distribution, while the
impact of decoupling capacitors on package and on-chip DVD is discussed in
[21]. However, in the last two works the focus is on package parasitics, while
both the on-chip grid and the current sources of the functional blocks are
modeled in a straightforward way. This is too simple for assessing either the
impact of the on-chip grid design parameters on both global and local PSN, or
the mutual package-die influence on PSN.
Even if apparently not related to the previous discussion, the
electromigration (EM) problem should also be taken into account when dealing
with power grid design, as it is strictly tied to DVD for several reasons.
First, a wire stressed by EM has a growing resistance, and thus a higher
impact on IR drop; second, the EM mechanism involves Joule heating, that is,
net lifetime is a function of the real metal temperature. This depends on the
RMS current drawn by the net and on the metal resistance, which is itself
dependent on temperature. Lifetime and PSN are thus related to one another by
current, metal characteristics and temperature: Design directives for solving
the two problems could be in opposition. The methodology proposed in this
paper for early PSN assessment does not consider the EM problem, but EM can be
easily taken into account using, for example, the formulations expressed in
[22].
3 Proposed methodology
The most widespread back-end PSN analysis tools [23], [24] use RC extracted
values of the power grid networks (only a few cases include inductance), and
compute an average SVD using estimated switching activity factors and clock
frequency. The results thus consist only of post physical-design average
values: The information obtained is late, and neither DVD nor package aware. A
different perspective is at the root of this work: Its goal is not the
generation of an accurate back-end analysis algorithm; on the contrary, it
aims at assessing the DVD behavior and its dependency on the design
parameters. We propose to use this methodology for rapidly predicting the
expected noise when a cluster of design conditions is given. These include:
technology, power grid parameters (metal layers, wire geometry, grid density,
hierarchical topology, parasitics), circuit parameters (number of gates,
operating frequencies, current switching activities), package variables
(flip-chip or wirebonding parasitics, number and location of connections) and
decoupling capacitor insertion (total capacitance amount, radius of effect,
on-chip or off-chip decap).
Given the abovementioned parameters, the power grid and package structure is
generated in a modular way using a library of simple grid and package
components and current generators described using the SPICE syntax. The
library grid components differ in technology, metal type and geometry. Their
layouts are designed and their parasitics are extracted, so that the final
library of RLC block parasitics is used for the simulations. The current
generators in the library are statistically described, so that the switching
activities of complex circuit blocks are meaningfully modeled as transient
events. Furthermore, the statistical variables chosen reflect the parameters
of the circuit whose PSN is to be predicted. The grid based on these library
components is generated on the basis of the chosen topology and hierarchy. The
package and decaps are included in the structure as well, again using ad-hoc
library elements. These models are, however, simple, as they are included in
the work with the aim of analysing the package impact on the on-chip DVD and
not the detailed flip-chip or wirebonding behavior itself.
Once the configuration to be analyzed is chosen and the grid is hierarchically
described, an electrical simulation is performed using the Monte-Carlo
analysis feature embedded in the SPICE engine, so that the statistical
parameters of the current generators are varied and the final DVD data
generated are statistically meaningful.
Fig. 1. Example of a hierarchical power grid Vdd structure (metal layers M4 and M6).
In this paper we consider a few grid configurations and report DVD values and
their dependency on the chosen parameters. To the best of the authors'
knowledge, there are no previous works focusing on an a priori assessment of
power grid dynamic voltage drop noise taking into account both on-chip and
package parameters, as well as realistic transient current switching
activities. The results give the designer approximate but reliable indications
of the expected circuit DVD on the basis of the chosen configuration, and thus
directives for the subsequent power grid physical design step.
4 DVD analysis structure
The purpose of this work is thus to build a framework in which various
parameters are easily used for noise estimation. The relations between the
noise figures found and the variable set help in working out guidelines to
predict and reduce power grid noise in early design phases. For this reason
the trade-off between accuracy and flexibility suggested the creation of
libraries composed of simple but highly configurable structures for
interconnects, current sources and package, for which a brief description is
given in the following. It is important to underline that the variables and
configurations considered in this context do not exhaust all the real
possibilities, as the aim is to show the feasibility of the methodology and to
assess the DVD noise amount in typical cases. However, the framework
organization is flexible, and further variables and/or structures can be
easily added to enlarge the library. For example, one of the variables is
technology: We consider here 0.25 µm (hereinafter T1), 0.18 µm (T2) and
0.13 µm (T3), as these were fully available at the time of writing as far as
metal data and cell library information are concerned. Once these data (not
only ITRS estimates but real foundry data) are available for other up-to-date
and future technology nodes, the analyses reported here can be performed using
the same methodology.
4.1 Power grid library
In high performance designs the power grid is often organized as sketched in
figure 1, where, for example, a top layer (M6) is used for distributing the
power supply voltage to macro-blocks. Each of them has a denser power grid
wired on a lower layer (for example M4), which distributes the supply to
smaller blocks. Often the metal layers used for the horizontal and vertical
directions are different, and Vdd and Gnd are routed on the same layer and
interleaved.
We thus consider as a basic structure for the power supply mesh the L-shaped
layout in figure 2.
Fig. 2. Basic L-shaped component of a power supply mesh (wires of length l and width w on metal i and metal j, joined by an Mj-Mi via).
The metal used can be the same for both directions or can be different: The
variables i and j identify the metal layers used for the vertical and
horizontal wires respectively, while the Mi-Mj via identifies the simple or
stacked via between the two layers. When building the library we consider all
the plausible (i, j) combinations of layers in this structure (metal layers
from 1 to 6 are allowed by the technologies considered in this work). As far
as the geometry is concerned, in the restricted library considered for this
work we used two possible widths (w in the figure): One is the minimum allowed
for power supply stripes by the technology taken into account ($w_{min}$) and
the other is five times larger than the minimum ($5 w_{min}$) (see footnote 1).
The width $w_{min}$ may not be realistic in real power networks; however, it is
used here for comparing different technologies at a reference point, the
minimum width allowed for power supply wires being the only fixed width for
each technology. The wire length is another variable, which in this work
assumes two values: $l_1$ = 10 µm and $l_2$ = 100 µm (the same note as for w is
valid here). The length granularity is based on the typical sizes of an
"average gate" in a library of the technology considered as a starting point,
that is T1 = 0.25 µm.
1 Obviously, a complete library should comprise other values for w, but even
these two values are sufficient to show the feasibility of the methodology.
We assume that an "average gate" $G_T$ in the library for technology T has
each parameter $P$ (area, width, current peak and width, etc.) averaged among
the real gate parameters in a library of K gates:
$$G_T = \mathrm{avg}_K \{ G_k \}_T$$
where $G_k$ is a library gate; thus
$$P_{G_T} = \mathrm{avg}_K \{ P_{G_k} \}_T$$
is an average parameter $P$ of the average gate $G_T$. In the case of the
L-shaped structure, we assume that the area embraced within the L-shaped
square in the minimum length case $l_1$ corresponds to the area of two gates of
average size for technology T1 = 0.25 µm, that is
$A^{T1}_1 (l_1 \times l_1) = 2 A(G_{T1})$. The other cases in terms of length
and of technology are a consequence of this assumption: Given $l_l$, where
$l = 1, 2$ in our case, the number of gates embraced for the other technologies
in the corresponding $l_l \times l_l$ area is computed considering the
"average gate" area:
$$A^{T}_l (l_l \times l_l) = N^{T1}_l A(G_{T1}) = N^{T2}_l A(G_{T2}) = N^{T3}_l A(G_{T3})$$
where $N^{T1}_1 = 2$. These choices are clearly neither general nor extremely
accurate, but they make it easy to build the basic blocks to be used for
creating a complex grid in a modular way.
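The gate-count bookkeeping can be sketched in a few lines of Python. The only values taken from the text are the technology nodes, the two lengths and the condition that the $l_1 \times l_1$ area holds two average T1 gates. The average gate areas of T2 and T3 are not given here, so the sketch ASSUMES, purely for illustration, that the average gate area scales with the square of the feature size; the function names are hypothetical.

```python
FEATURE_UM = {"T1": 0.25, "T2": 0.18, "T3": 0.13}   # technology nodes from the text
L_UM = {1: 10.0, 2: 100.0}                          # l1 and l2 in micrometres

# From the text: the l1 x l1 area holds two average T1 gates.
AREA_GATE_T1 = L_UM[1] ** 2 / 2.0                   # A(G_T1) in um^2


def avg_gate_area(tech):
    """ASSUMPTION (not from the paper): A(G_T) scales with feature size squared."""
    scale = FEATURE_UM[tech] / FEATURE_UM["T1"]
    return AREA_GATE_T1 * scale ** 2


def gates_in_cell(tech, l_index):
    """N_l^T = (l_l x l_l) / A(G_T): average gates embraced by one L-shaped cell."""
    return L_UM[l_index] ** 2 / avg_gate_area(tech)


for tech in FEATURE_UM:
    print(tech, [round(gates_in_cell(tech, l), 1) for l in (1, 2)])
```

Under this assumed scaling the same physical area embraces more gates in the more advanced nodes, which is the effect the equality above is meant to capture; real values would come from the cell libraries.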
The layout of these L-shaped structures is generated using Cadence [25]; the
corresponding GDSII files are used for extracting the parasitics, using
SPACE [26] for resistance and capacitance and FASTHENRY [27] for inductance.
The equivalent circuit is thus a T-RLC structure in each branch of the
L-shaped block, sketched in figure 3 (the parameters of the current generators
are described in the following section).
Fig. 3. Equivalent electrical circuit for the L-shaped structure (branch elements R_i, L_i, R_j, L_j, ground capacitances C_ground,i and C_ground,j, via resistance R_i-j via, current sources I_Source).
In table 1 some of the values extracted in the $l_1$ = 10 µm and $l_2$ = 100 µm
cases are reported for the three technologies. For $l_2$ only the M4/M6 and
M5/M6 cases are reported, as they are the most typical for this length, which
in fact represents a higher hierarchical block. The Mi-Mj via resistance
values are not reported for the sake of brevity. The equivalent circuit of the
L-shaped structure is used to build, in a regular and modular way, a mesh of
the desired shape and size, so that we can emulate a mesh powering a circuit
with a high number of gates. An example is reported in figure 4, in which a
certain number of L-shaped structures of length $l_1$ (or $l_2$) is used for
creating a mesh so that a global $l^V_{mesh} \times l^O_{mesh}$ area is
covered.

Table 1
Parasitic parameters extracted for the basic l1 and l2 structures.
                  l1                                                    l2
                  M1       M2       M3       M4       M5       M6       M4/M6      M5/M6
Technology T1 = 0.25 µm
R, wmin [Ω]       11.791   2.594    2.594    2.594    0.607    0.607    19.423     7.725
R, 5 wmin [Ω]     2.062    0.459    0.459    0.459    0.089    0.089    5.683      2.171
C, wmin [fF]      1.078    0.924    0.867    0.832    1.002    0.920    7.2/8.1    7.3/8.1
C, 5 wmin [fF]    1.794    1.405    1.271    1.195    1.708    1.475    9.6/12.4   10.2/12.4
L, wmin [pH]      7.1      7.1      7.1      7.1      7.1      7.1      115.5      115.6
L, 5 wmin [pH]    5.2      5.2      5.2      5.2      5.2      5.2      96.6       96.6
Technology T2 = 0.18 µm
R, wmin [Ω]       4.294    4.294    4.294    4.294    0.998    0.998    30.80      11.34
R, 5 wmin [Ω]     0.774    0.774    0.774    0.774    0.163    0.163    8.395      2.655
C, wmin [fF]      0.960    0.857    0.812    0.782    0.938    0.847    6.8/7.8    6.8/7.8
C, 5 wmin [fF]    1.511    1.272    1.170    1.106    1.415    1.213    9.1/10.7   8.9/10.7
L, wmin [pH]      7.4      7.4      7.4      7.4      7.4      7.4      119.7      119.7
L, 5 wmin [pH]    5.6      5.6      5.6      5.6      5.6      5.6      100.9      100.9
Technology T3 = 0.13 µm
R, wmin [Ω]       9.155    5.825    5.825    5.825    5.825    0.938    36.32      35.32
R, 5 wmin [Ω]     1.733    1.089    1.089    1.089    1.089    0.162    8.51       7.39
C, wmin [fF]      0.881    0.847    0.786    0.751    0.728    0.961    6.7/8.0    6.5/8.0
C, 5 wmin [fF]    1.324    1.227    1.101    1.032    0.986    1.319    8.8/10.1   8.4/10.1
L, wmin [pH]      8.2      8.2      8.2      8.2      8.2      8.2      127.6      127.6
L, 5 wmin [pH]    6.5      6.5      6.5      6.5      6.5      6.5      109.7      109.7

Fig. 4. Example of a modular power mesh of horizontal length $l^O_{mesh}$ and vertical length $l^V_{mesh}$ based on the simple L-shaped structure of length $l_1$ or $l_2$.

The equivalent circuit of this mesh is created automatically using the
hierarchical constructs allowed by the SPICE language. This is clearly an
approximation, as the "inter-segment" capacitances and the mutual inductances
are neglected. However, as said before, this is not a back-end analysis
methodology. Moreover, two further points should be underlined. First, the
neglected parasitics would in most cases have a damping effect on noise, that
is, a worst case is considered here. Second, such parasitics are almost
negligible: For example, in an accurate extraction of a sample mesh (whose
results are not reported here for the sake of brevity), compared to the
approximated one, the inter-segment capacitances turned out to be more than
10 times smaller than the $C_{ground}$ capacitance of the single wire segment.
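As an illustration of how such a modular mesh could be assembled from the basic block, the following Python sketch emits a SPICE subcircuit for one L-shaped cell and instantiates it on a regular grid. It is a sketch under stated assumptions and not the generator used in this work: the T topology (half of R and L on each side of a mid node carrying the ground capacitance), the port and node names, the mesh size and the via resistance are all hypothetical, and the current sources are omitted. Only the R, L and C values are taken from Table 1 (T1, $w_{min}$, $l_1$, with M6 horizontal and M4 vertical).

```python
def t_rlc(prefix, a, b, r, l, c):
    """T-RLC model of one wire branch between nodes a and b (ASSUMED topology)."""
    mid = f"{prefix}m"
    return [
        f"R{prefix}1 {a} {prefix}a {r / 2:g}",
        f"L{prefix}1 {prefix}a {mid} {l / 2:g}",
        f"C{prefix} {mid} 0 {c:g}",                 # ground capacitance at the T node
        f"L{prefix}2 {mid} {prefix}b {l / 2:g}",
        f"R{prefix}2 {prefix}b {b} {r / 2:g}",
    ]


def l_shaped_subckt(name, horiz, vert, r_via):
    """One L-shaped cell: horizontal branch, via, vertical branch."""
    lines = [f".subckt {name} ch cv east north"]
    lines += t_rlc("h", "ch", "east", *horiz)       # horizontal wire on metal j
    lines.append(f"Rvia ch cv {r_via:g}")           # via between the two layers
    lines += t_rlc("v", "cv", "north", *vert)       # vertical wire on metal i
    lines.append(f".ends {name}")
    return lines


def mesh_instances(cell, rows, cols):
    """Tile the cell: east ports chain horizontally, north ports vertically."""
    return [
        f"X{i}_{j} h{i}_{j} v{i}_{j} h{i}_{j + 1} v{i + 1}_{j} {cell}"
        for i in range(rows) for j in range(cols)
    ]


M6_H = (0.607, 7.1e-12, 0.920e-15)   # R [ohm], L [H], C [F] from Table 1 (T1, wmin, l1)
M4_V = (2.594, 7.1e-12, 0.832e-15)
R_VIA = 1.0                          # HYPOTHETICAL: via values are not reported

netlist = l_shaped_subckt("lcell", M6_H, M4_V, R_VIA) + mesh_instances("lcell", 3, 4)
print("\n".join(netlist))
```

Keeping the cell as a subcircuit means that changing technology, width or layer pair only requires swapping the three-value tuples, which mirrors the library-based approach described above.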
4.2 Blocks current sources
As far as the block current is concerned, we use a current generator connected
to each intersection point of the T in the T-RLC model (see figure 3). This
does not mean that the gate is really connected to that point of the wire, as
it will probably be joined to an M1 stripe at many contact points. The current
source represents the current delivered to an ensemble of gates connected to
one another and placed in some way within the considered area, and contacted
to the Vdd metal line of the grid approximately at the T intersection point of
the model.
To illustrate the current model used, figure 5 reports the Vdd current
envelopes of gates from a 0.25 µm library implementing the connectivity of one
of the ISCAS85 benchmarks [28]. The current activity corresponding to one of
the possible input vectors is shown here as an example.
Fig. 5. Current envelopes on the Vdd node of an ISCAS85 benchmark due to a random input vector variation sequence (current [mA] versus time [ps]).
It is important to note that within a clock cycle each gate switching
contributes to the global current envelope at a time depending on the
propagation delay through a path, and with positive or negative polarity.
Moreover, it should be underlined that the current can be different from zero
even if the gate output is not switching. Exhaustive details on this point are
not shown here for the sake of brevity. However, a simple example can clarify
the point: Suppose that the inputs a and b of an AND gate are initially 0 and
1 respectively, and that at instant $t_0$ they go to 1 and 0 respectively. The
output should not change, but an internal current transient will occur, which
is not negligible with respect to the "active" current case, as the on-off and
off-on transistor switching is not instantaneous. The importance of correctly
modeling the current generators is thus clear, in order to avoid the typical
errors that lead to an extreme overestimation when the delays among switching
events are not considered or when only the worst case is taken into account.
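The size of this overestimation can be illustrated with a small Python sketch that compares the worst-case sum of the gate peaks (all pulses aligned and with the same direction) with the peak of the envelope obtained when each gate has its own polarity and delay. The number of gates, the pulse width, and the mean and spread of the delay are assumptions chosen only for illustration; the random draws are made in Python, not by a circuit simulator.

```python
import random


def triangle(t, t0, tp, tf, peak):
    """Triangular current pulse: 0 at t0, `peak` at tp, back to 0 at tf."""
    if t <= t0 or t >= tf:
        return 0.0
    if t <= tp:
        return peak * (t - t0) / (tp - t0)
    return peak * (tf - t) / (tf - tp)


def block_peaks(n_gates, sigma_peak, t_clk, width, seed, steps=600):
    """Return (worst-case sum of peaks, peak of the summed envelope)."""
    rng = random.Random(seed)
    pulses = []
    for _ in range(n_gates):
        peak = rng.gauss(0.0, sigma_peak)                 # sign encodes polarity
        t0 = rng.gauss(t_clk / 2, t_clk / 6)              # ASSUMED delay statistics
        t0 = min(max(t0, 0.0), t_clk - width)             # keep pulse inside the cycle
        pulses.append((t0, t0 + width / 2, t0 + width, peak))
    worst_case = sum(abs(p[3]) for p in pulses)           # aligned, same direction
    dt = t_clk / steps
    envelope = max(abs(sum(triangle(k * dt, *p) for p in pulses))
                   for k in range(steps + 1))
    return worst_case, envelope


worst, stat = block_peaks(n_gates=50, sigma_peak=0.33e-3, t_clk=3e-9,
                          width=0.4e-9, seed=1)
print(f"worst-case sum = {worst * 1e3:.2f} mA, envelope peak = {stat * 1e3:.2f} mA")
```

By the triangle inequality the envelope peak can never exceed the worst-case sum, and with spread delays and mixed polarities it is usually far below it, which is the pessimism the text refers to.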
In this work, thus, the current pulse is shaped as a triangle, whose
parameters (initial time, peak time, peak value, final time) are described
such that they can be varied within fixed ranges using a statistical (normal)
distribution. In particular, both the peak value and its activation delay are
associated with a normal distribution, while the other parameters are
dependent values (for the sake of simplicity). The parameters are chosen such
that the current envelope can shift within a clock cycle and can have positive
or negative polarity. When the circuit is simulated, the statistical engine
embedded in SPICE, that is the Monte-Carlo analysis feature, explores the
probability space with many iterations (it has been shown in past works that
30 iterations are enough to achieve statistically meaningful results). An
example of the statistical variation of the current for one of the generators
is shown in figure 6.
Fig. 6. Statistical variation of a gate current envelope assured by 30 Monte-Carlo analyses (current [mA] versus time [ps]): In each iteration peak value and activation delay are statistically varied.
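The statistical description of one generator can be mimicked outside the simulator with the following Python sketch, which draws 30 triangular pulses and prints a few of them as SPICE piecewise-linear (PWL) current sources. This replaces the SPICE-embedded Monte-Carlo feature with explicit sampling and is therefore only an approximation of the flow described above. The clock period, rise time and peak spread are the T1 values given in this section; the mean and spread of the activation delay, the equal rise and fall times, and all names are assumptions.

```python
import random

T_CLK = 3e-9            # clock period for T1, from the text
T_RISE = 0.2e-9         # rise time for T1, from the text
SIGMA_PEAK = 1e-3 / 3   # 3-sigma limit of 1 mA for T1, from the text


def sample_pulse(rng, mean_delay, sigma_delay):
    """Draw (initial time, peak time, peak value, final time) for one run."""
    peak = rng.gauss(0.0, SIGMA_PEAK)                  # normal peak, either polarity
    t0 = rng.gauss(mean_delay, sigma_delay)            # normal activation delay
    t0 = min(max(t0, T_RISE), T_CLK - 2 * T_RISE)      # keep the pulse in the cycle
    # Dependent parameters; ASSUMES fall time equal to rise time.
    return t0, t0 + T_RISE, peak, t0 + 2 * T_RISE


def pwl_source(name, node, pulse):
    """Format one pulse as a SPICE PWL current source drawing current from node."""
    t0, tp, peak, tf = pulse
    return (f"{name} {node} 0 PWL(0 0 {t0:.4g} 0 {tp:.4g} {peak:.4g} "
            f"{tf:.4g} 0)")


rng = random.Random(0)
pulses = [sample_pulse(rng, mean_delay=T_CLK / 2, sigma_delay=T_CLK / 6)
          for _ in range(30)]                          # 30 iterations, as in the text
for run, pulse in enumerate(pulses[:3]):
    print(pwl_source(f"Isrc{run}", "vdd_node", pulse))
```

Only the peak value and the activation delay are sampled; the peak time and final time follow from them, in line with the choice of treating the other parameters as dependent values.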
The peak and timing values chosen for the current generators are related to
the number of "average gates" comprised in the L-shaped structure for each
$l_l \times l_l$ area, and are coherent with the previously defined number
$N^T_l$. In the case of the simple block of length $l_1$ and for T1,
$I_{source}$ models the activity of one gate only; thus, the ranges for the
current peak are averaged among the most used gates chosen in the 0.25 µm cell
library. For modeling the current shapes of a generic block of gates, that is,
an $l^T_l \times l^T_l$ or $l^{V,T}_{mesh} \times l^{O,T}_{mesh}$ structure, the
number of gates embraced in the corresponding area is reckoned using the
previously defined $N^T_l$. We then performed a characterisation of the
possible current activities of the library gates and of the typical delays and
clock cycles, using some of the ISCAS85 benchmarks [28] having a similar
number of gates ($N^T_l$). The current and timing values are used for defining
the ranges of the normal distribution parameters. We aim at decoupling our
analysis from a given circuit connectivity: For this reason we collapse the
characterisation into the parameters of a normal distribution. For example, in
figure 7 we report a normal PDF reckoned after a characterization associated
with the current peak parameter for an "average gate" in 0.13 µm technology
($Peak_{G_{T3}}$).
Fig. 7. Normal Probability Distribution Function (PDF) of the current peak parameter for an "average" 0.13 µm gate (PDF versus peak current [mA]; mean -52.67 µA, sigma 0.161 mA).
For the $l_1$ case, the values associated with the $3\sigma$ limit of the
normal distribution used to vary the current peaks are 1 mA for T1, 0.8 mA for
T2, and 0.48 mA for T3, while in the three cases the mean value is around 0.
This means that the majority of the current values are chosen, e.g. for T1,
between -0.33 mA and +0.33 mA (that is, within one standard deviation of the
mean). The timing parameters of the current envelopes are described in a more
complicated way and are not reported here for brevity. In any case, they are a
function of the clock cycle and of the rise and fall times; the quantities
used for these two values are $T_{ck}$ = 3 ns and $t_r$ = 0.2 ns for T1,
$T_{ck}$ = 1.2 ns and $t_r$ = 65 ps for T2, and $T_{ck}$ = 0.5 ns and
$t_r$ = 30 ps for T3.
In summary, each of the generators in the circuit is associated with its own
statistical parameters (one for the peak and one for the activation delay) so
that a Monte-Carlo simulation of the circuit can be performed. In this way we
emulate the mesh as connected to a combinational circuit in which the gates
switch in either direction and at any time within a clock cycle, with
meaningful current peak and timing values. The analysis is then independent
of the circuit logical connectivity.
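
As a minimal sketch of the sampling step summarised above, the following Python
fragment draws, for each Monte-Carlo iteration, one peak and one activation
delay per generator. The 3σ peak limits and the clock periods are the ones
quoted in the text for the l1 case; the function name, the seed and, in
particular, the uniform law used for the activation delay are assumptions made
only for illustration, since the timing distributions are not reported here.

```python
import random

# 3-sigma limits of the peak-current distribution for the l1 case (from the text), in mA.
THREE_SIGMA_PEAK_MA = {"T1": 1.0, "T2": 0.8, "T3": 0.48}
# Clock periods quoted in the text, in ns.
CLOCK_PERIOD_NS = {"T1": 3.0, "T2": 1.2, "T3": 0.5}


def sample_generators(tech, n_generators, n_iterations=30, seed=0):
    """Return n_iterations lists of (peak_mA, delay_ns) pairs, one pair per generator."""
    rng = random.Random(seed)
    sigma = THREE_SIGMA_PEAK_MA[tech] / 3.0
    t_ck = CLOCK_PERIOD_NS[tech]
    runs = []
    for _ in range(n_iterations):
        run = []
        for _ in range(n_generators):
            # Zero-mean normal peak: both current polarities are equally likely.
            peak = rng.gauss(0.0, sigma)
            # ASSUMPTION: activation delay uniform within one clock cycle.
            delay = rng.uniform(0.0, t_ck)
            run.append((peak, delay))
        runs.append(run)
    return runs
```

Each returned list would parameterise the current generators of one transient
simulation; the thirty resulting waveforms can then be superposed.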
4.3 Package library
The package elements are composed, as suggested in the literature, of a
resistance and an inductance connected in series. Their values depend on the
type of package, e.g. wirebonding or flip-chip, and in this work are chosen
among the measured values reported in [18] and [20]. For the wirebonding case
(WB) the suggested resistance is around 140 mΩ, while the inductance is 3.5 nH.
In the flip-chip (C4) case the chosen resistance is 100 mΩ and the inductance
is 1.5 nH. Their connections to the on-chip mesh differ depending on the
modular circuit created, as explained in the following sections.
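
To give a feeling for the weight of these parasitics, the sketch below
evaluates the voltage across one series R-L package element, v = R i + L di/dt,
with the WB and C4 values quoted above. The stimulus (1 mA flowing while the
current ramps by 1 mA in 30 ps) is a hypothetical example and is not taken from
the simulations of this work.

```python
# Series R-L package element values quoted in the text.
PACKAGES = {
    "WB": {"R_ohm": 0.140, "L_H": 3.5e-9},
    "C4": {"R_ohm": 0.100, "L_H": 1.5e-9},
}


def package_drop(pkg, current_a, di_dt_a_per_s):
    """Voltage across a series R-L element: v = R*i + L*di/dt."""
    p = PACKAGES[pkg]
    return p["R_ohm"] * current_a + p["L_H"] * di_dt_a_per_s


# Hypothetical stimulus: 1 mA flowing, ramping by 1 mA in 30 ps.
current = 1e-3
slope = 1e-3 / 30e-12
for name in PACKAGES:
    print(name, f"{package_drop(name, current, slope) * 1e3:.1f} mV")
```

With such fast edges the inductive term dominates the resistive one, which is
why the package inductance is the parameter to watch in the dynamic analysis.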
5 Results
The simulations have been performed in a few phases. Firstly, the attention
has been focused on geometrical and technological parameters (section 5.1),
while in the following steps the analysis has been directed toward realistic
cases of power meshes (section 5.2), of package (section 5.3) and decoupling
capacitors choices (section 5.4).
5.1 Impact of geometry and technology
We compare here mesh topologies that power an increasing number of gates and
have different shapes. The structures are sketched in figure 8. In a first
simple case (A) the L-shaped component is connected horizontally five times;
this means that the shape is $l_l \times 5 l_l$. The wire width is also varied
and can assume, in the whole structure, either the value $w_{min}$ or
$5 w_{min}$. The number of gates powered is then 10 for the $l_1$ and 0.25 µm
technology case, while 10 is the number of gate blocks (GB) for the other
lengths $l_l^T$, where the number of gates is a function of $N_l^T$.
Hereinafter we refer to GB in all cases for the sake of simplicity.
[Figure 8 sketch: structures A, B, C and D.]
Fig. 8. Structures used in the geometry assessment phase: A, five l blocks
connected horizontally; B, five A blocks connected vertically; C, five B blocks
connected horizontally; D, five C blocks connected vertically.
Structure B has a square shape, as five A blocks are connected vertically. Its
size is then $5 l_l \times 5 l_l$ and thus the GB number is 50. This structure
is used to hierarchically create structure C through a horizontal connection,
leading to a mesh that powers 250 GB and measures $5 l_l \times 25 l_l$.
Finally, a bigger square mesh, D, based on the B one, spans
$25 l_l \times 25 l_l$ and powers 1250 GB. The nominal power supply in these
simple experiments is connected only to the left side of the mesh (supposing an
interdigitated distribution from higher layers) routed in metal 2, while, in
this first simulation set, M1 is used for both metals $M_i$ and $M_j$.
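
The gate-block counts quoted for the four structures follow directly from
their footprints. The short sketch below reproduces them; the factor of two
gate blocks per elementary $l_l \times l_l$ cell is inferred from the statement
that the five cells of structure A power 10 GB, and the names used are
illustrative only.

```python
# Inferred from the text: the 5 elementary cells of structure A power 10 gate blocks.
GB_PER_CELL = 2

# Footprints in units of the elementary length l_l (height, width), as given in the text.
STRUCTURES = {"A": (1, 5), "B": (5, 5), "C": (5, 25), "D": (25, 25)}


def gate_blocks(name):
    """Number of gate blocks (GB) powered by the named structure."""
    height, width = STRUCTURES[name]
    return GB_PER_CELL * height * width


for name in STRUCTURES:
    print(name, gate_blocks(name), "GB")
```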
[Figure 9 plots: voltage [V] versus time [ps], from 100 ps to 400 ps. Left
panel: voltage waveforms for 30 Monte-Carlo iterations. Right panel: voltage
waveform for one of the 30 Monte-Carlo iterations.]
Fig. 9. Example of noise waveforms measured at the worst case node (far from
the nominal power supply) of the C structure, at the Vdd node, for T3. Thirty
waveforms from the Monte-Carlo iterations are superposed in the left graph,
while one of them is extracted in the right graph.
Figure 9 reports an example of noise waveforms at node Vdd for the C structure
(T3, $l_1$, $5 \times w_{min}$) due to the statistical variation of the current
generator parameters performed by the Monte-Carlo engine. The voltage is
measured at the grid node found to be the worst one, that is, at the side
opposite to the nominal power supply connection points. It is important to
note (from both the left and the right graph) that this is a dynamic noise,
that is, both under- and over-voltages are reached with similar probability.
This happens because the current values change polarity with the same
probability in the two directions and because of the inductance in the
electrical model. Although normally only the under-voltage is considered in
PSN analyses, the over-voltage should not be neglected, as it implies a delay
variation of the powered gates which is not necessarily less critical than the
one caused by the under-voltage [8].
In the following results, only the maximum and minimum Vdd values measured
among all the 30 iterations are reported, instead of all the values, for
readability. Figure 10 refers to T1, figure 11 to T2, and figure 12 to T3.
The nominal power supplies are 2.5 V for T1, 1.8 V for T2 and 1.2 V for T3. In
each of these three figures we compare in two graphs the results for the four
structures A, B, C and D based on the $l_1$ (bottom) and $l_2$ (top) lengths.
Moreover, for each value we compare the results measured for meshes where the
wire width is the minimum one ($w_{min}$) allowed by the technology for power
supply delivery, and where it is 5 times the minimum. As expected, when the
number of gate blocks increases but the shape is maintained (as from 10 to
250 GB, or from 50 to 1250 GB), the worst case noise is higher; for example in
the $l_2^{T1}$ case, a 10 to 250 GB increase (rectangular shape with 1:25
increasing rate) leads to an over/under-voltage 3.5 times higher. The same
noise enhancement is maintained when the powered gate blocks go from 50 to
1250 (square shape with 1:25 increasing rate). This trend is similar in the
other two technologies as well (figures 11 and 12).
[Figure 10 plots: Vpp [V] for 10, 50, 250 and 1250 GB, with Wmin and 5xWmin
wire widths, 0.25 µm technology. Panels: l = 10 µm and l = 100 µm.]
Fig. 10. Noise peaks measured on worst case node (T1). Bottom graph: l1; top
graph: l2. Comparison among increasing number of GB powered by the mesh
(structures A, B, C and D in figure 8) and between two wire widths. Error bars
display the variability of the measured data.
[Figure 11 plots: Vpp [V] for 10, 50, 250 and 1250 GB, with Wmin and 5xWmin
wire widths, 0.18 µm technology. Panels: l = 10 µm and l = 100 µm.]
Fig. 11. Noise peaks measured on worst case node (T2). Bottom graph: l1; top
graph: l2. Comparison among increasing number of GB powered by the mesh
(structures A, B, C and D in figure 8) and between two wire widths. Error bars
display the variability of the measured data.
[Figure 12 plots: Vpp [V] for 10, 50, 250 and 1250 GB, with Wmin and 5xWmin
wire widths, 0.13 µm technology. Panels: l = 10 µm and l = 100 µm.]
Fig. 12. Noise peaks measured on worst case node (T3). Bottom graph: l1; top
graph: l2. Comparison among increasing number of GB powered by the mesh
(structures A, B, C and D in figure 8) and between two wire widths. Error bars
display the variability of the measured data.
The increment is thus similar for the rectangular and the square shapes, but
the absolute values differ between the two. In fact, it is important to note
that, going from 10 to 50 GB, the noise peak does not increase, but gets even
lower (1.96 times smaller at T1). The same can be noted when comparing 250 to
1250 GB (1.92 times smaller under-voltage). This is clearly due to the shape
of the structures and to the fact that in this simulation set the nominal
power supply is distributed from one side only. The square case, even if
bigger in terms of number of gate blocks, has a lower drop because the maximum
distance from the nominal power supply does not increase. Furthermore, the
overall capacitance of the metal wires tends to equalize the overvoltage. This
is an important result from the designer's point of view, as these noise
figures give not only general directives on the most convenient shape to adopt
when a cluster of gates must be placed and its power mesh designed, but also
quantitative values for the expected DVD consequence of this design choice.
This trend holds in the other two technologies as well. For an easier
comparison among the three technologies, the percentage variations with
respect to the nominal power supply are reported in table 2 for the $l_2$,
$w_{min}$, M1 case with 250 and 1250 GB. Results show that scaling down has a
negative effect on noise, and this is also confirmed by the results in the
following.
Table 2
Comparison among the three technologies: percentage variations with respect to
the nominal power supply in the l2, wmin, M1 case with 250 and 1250 GB.
Technology    250 GB               1250 GB
              min       max        min       max
T 0.25 µm     -7.28%    7.56%      -3.78%    3.41%
T 0.18 µm     -12.03%   12.50%     -6.26%    5.65%
T 0.13 µm     -18%      18.83%     -11.83%   7.23%
Further simulations have been performed considering the other metal layers.
For the sake of brevity we report here only the results obtained for the
1250 GB mesh ($l_1$). A synoptic view for the three technologies is given in
figure 13. The first metal layer behaves as the worst one in all the
technologies, even if this is more pronounced in the 0.25 µm one. As
underlined before, technology scaling does not necessarily imply a noise
reduction: especially in the realistic case of the larger wire, for metal
layers other than the first, noise increases when scaled technologies are
taken into account. In this case, the cause is the non-linear reduction of
resistivity with technology scaling. From the data reported in table 1 it
should be noted that, even if the material used for the metallization is the
same in the three technologies, that is a Cu-Al alloy, the impact of the
high-resistance barrier in damascene metallization processes results in a
higher effective resistance in more scaled processes. It is interesting to
point out that while in 0.25 µm technology both M5 and M6 can be used to
strongly reduce PSN, in 0.13 µm the difference introduced by the use of
different metal layers is less evident, with the exception of M6. This kind of
information can be used when planning the hierarchy of the power grid
distribution in an early design phase.
[Figure 13 plots: Vpp [V] for metal layers Metal 1 to Metal 6, with Wmin and
5xWmin wire widths. Three panels: technology 0.25 µm, 0.18 µm and 0.13 µm.]
Fig. 13. DVD comparison for the three technologies in the l1 1250 GB case for
different metal layers, at minimum and 5X minimum wire width. Error bars
display the variability of the measured data.
5.2 Impact of topology, activity and hierarchy
In this set of simulations the minimum block size used is the 1250 GB one.
The results have been obtained for the three technologies, but only the
0.13 µm ones, being the most interesting, are reported in the following.
The first analysis, reported in figure 14, compares the effect of different
nominal power supply delivery systems. The left-sided distribution, used in
the first simulation step (section 5.1), is compared with a similar case in
which two opposite sides are powered with nominal Vdd and, furthermore, with
the optimal case of nominal distribution all around the block perimeter. As
expected, the worst case noise reduces: by 67% from one side to two sides and
by 83% from one to four sides. Also as expected, the point suffering the worst
noise peak (reported in the figure) shifts from the side to the center of the
block. The possibility of evaluating information of this kind dynamically and
rapidly could greatly help the designer when planning how to distribute the
supply voltage through the grid powering macros of different sizes and numbers
of gates. The designer can thus choose the best solution by trading off noise
constraints, grid design complexity and wire resource allocation.
[Figure 14 plot: Vpp [V] versus sides used for nominal power distribution
(1 side, 2 opposite sides, perimeter), with Wmin and 5xWmin wire widths.]
Fig. 14. Worst case noise for the 1250 GB T3 structure in three different
types of nominal power distribution: one side, two opposite sides, along the
full perimeter.
Another analysis concerns the effective switching activity of gates: this
simulation has been carried out by deactivating some of the generators (with
the active percentage going from 100% down to 20%) uniformly through the
circuit area. Table 3 reports the voltage variations in percentage with
respect to the nominal supply voltage. The structure used is the 1250 GB one
with the nominal power supply distributed around the block perimeter.
Table 3
Worst case node under (-NS) and over (+NS) voltage reduction with respect to
nominal supply (NS) for different percentages of active current generators
(5×wmin, l1, T3).
Active generators    -NS       +NS
100% → 80%           -7.7%     -13%
100% → 60%           -10.7%    -34%
100% → 40%           -17%      -39%
100% → 20%           -29%      -55%
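
One possible way to select the generators that stay active in this experiment
is sketched below. The text only states that generators are deactivated
uniformly through the circuit area; drawing the active subset uniformly at
random, as well as the function name and the seed, are assumptions made for
illustration.

```python
import random


def choose_active(n_generators, active_percent, seed=0):
    """Return the sorted indices of the generators left active.

    ASSUMPTION: the active subset is drawn uniformly at random, so that the
    deactivated generators are spread over the whole circuit area on average.
    """
    rng = random.Random(seed)
    n_active = n_generators * active_percent // 100
    return sorted(rng.sample(range(n_generators), n_active))


# Activity levels considered in the text, here with one generator per gate block.
for percent in (100, 80, 60, 40, 20):
    print(percent, len(choose_active(1250, percent)))
```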
As a general comment, reducing the activity of gates, for example using the
clock gating technique, gives perceptible results, but only a strong reduction
in the generators' activity leads to important improvements in noise. This
could have been expected. In fact, the current causing the overvoltages is the
global current shape on each grid branch, so the impact of the inactivity of
some generators on the global current envelope may be small. Moreover, in this
work, each generator is modeled with a statistical variation within extremes
that include small current values as well, so that a realistic gate behaviour
is already taken into account. This is important especially because one of the
main problems in previous works was that gate activity was modeled on the
worst case only, which leads to a strong overestimation of power grid
noise [15].
We then performed a further simulation step on blocks which emulate bigger
square circuits, closer to up-to-date high performance ICs. Using the D
structure previously defined, we simulated three more square structures
organized respectively as E = 5D × 5D (31k GB), F = 10E × 10E (3.1M GB) and
G = 5F × 5F (78M GB). The power supply is distributed in a hierarchical way:
the lines of the D blocks are routed in metal M4 ($l_1$,
$w = 5 \times w_{min}$), while higher level meshes are routed in M6. The
nominal power supply is distributed along the whole circuit perimeter. Figure
15 reports the worst case DVD (measured at the circuit center) for the four
structures D, E, F and G and for the three technologies. As pointed out
before, the 0.13 µm case shows the worst noise, as better metal
characteristics do not completely compensate for the increased number of gates
and the shorter current transients. Furthermore, these results show that
increasing circuit complexity is a critical parameter: in the F and G cases,
which include most of the state of the art SoC sizes (156 mm² and 38 cm²
respectively), the dynamic noise suffered increases up to about 15% and 30%
respectively. According to the ITRS this trend is expected to be confirmed by
the 0.090 µm, 0.065 µm and 0.040 µm nodes.
[Figure 15 plot: noise comparison among technologies (0.13 µm, 0.18 µm,
0.25 µm) with increasing block size (D, E, F, G). Left axis: Vdd noise vs.
nominal power supply [V], from -0.4 to 0.4; right axis: percentage variation,
from -33.3 to +33.3.]
Fig. 15. Worst case noise for the three technologies (max-min with respect to
the nominal power supply voltage). Comparison among square blocks of
increasing size: E = 5D × 5D, F = 10E × 10E, G = 5F × 5F. Absolute and
percentage values are on the left and right y-axis respectively.
5.3 Impact of package
The results reported in the previous steps have been obtained without
considering the influence of the package. Two impacts must be considered: the
number and position of the connection points from die to package, and the
package parasitics. If a wirebonding technology is chosen (WB), the
connections are peripheral, and the parasitic inductance has a relevant
impact. On the contrary, the flip-chip (C4) package allows a less inductive
connection and makes it possible to distribute the external power supply
connection points over the whole die area.
The number of connections to the external pads is a further variable, even
if related to the package technology used. In this work, as we are presenting
a methodology, only a fixed number of connections is considered (with the
exception of one case) for the sake of brevity. In any case, thanks to the
modularity of the approach, a variable number of Vdd and Gnd pads can be
introduced in the analysis.
To show the impact of these effects on noise, in this simulation set we used
structure G in the 0.13 µm case. It is interesting to consider the
distribution of noise over the chip area, as the package connection style and
parasitics influence this aspect as well. Figure 16 thus shows the noise
waveforms (for one Monte-Carlo simulation) at six measure points. Supposing
that the G square is partitioned into 25 identical squares (5 × 5), we measure
the voltage at the center of each square, and identify these points as 1.1,
1.2, ..., 1.5 for the first row, 2.1, 2.2, ... for the second and so on. Since
the structure is perfectly regular, we report here only the points detailed in
the figure, which belong to one of the eight triangles included in square G.
In this case the connections to the nominal power supply are peripheral (WB
package), but no package parasitics are involved in this first simulation. As
expected, voltage variations worsen as the measure point gets nearer to the
die center.
[Figure 16 plot: voltage [V] versus time [ps], from 150 ps to 550 ps; Vdd
noise at locations (1.1), (1.2), (1.3), (2.2), (2.3), (3.3).]
Fig. 16. Noise measured after one Monte-Carlo simulation run for the circuit G
with peripheral connections to nominal power supply (wirebonding package
style), without package parasitics. Measure points are on one of the eight
identical slices in which we ideally partition the circuit die.
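
The choice of measure points can be checked with a short sketch: treating
points that map onto each other under the eight symmetries of the square as
equivalent, a 5 × 5 partition leaves six distinct positions, which match the
points used in figure 16. The function below is illustrative only and is not
part of the original flow.

```python
def unique_points(n=5):
    """One representative (row, col), 1-based, per symmetry class of an n x n partition."""
    seen = set()
    for r in range(1, n + 1):
        for c in range(1, n + 1):
            # All images of (r, c) under mirroring of each index and transposition,
            # i.e. the eight symmetries of the square.
            images = set()
            for a, b in ((r, c), (c, r)):
                for x in (a, n + 1 - a):
                    for y in (b, n + 1 - b):
                        images.add((x, y))
            # The smallest image is used as the class representative.
            seen.add(min(images))
    return sorted(seen)


print(unique_points())
```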
Consider now in figure 17 a comparison between the WB and C4 connection
styles, for two measure points (corner, that is 1.1, and center, that is 3.3),
still without package insertion. In the C4 case the ideal Vdd generators are
distributed not only at the periphery but within the circuit as well. It is
interesting to note how a uniform distribution of connection points allows a
better noise equalization (0.72 V at the center for WB and 0.94 V for C4).
More complete results are in table 4, in which the voltages at the six measure
points are reported for the two packages (without parasitics) after 30
Monte-Carlo iterations. As expected, both the overvoltages and the
undervoltages worsen in the WB case as the center is approached, while C4
reduces both the positive and the negative deviations from the nominal power
supply.
[Figure 17 plot: voltage [V] versus time [ps], from 600 ps to 1000 ps; voltage
at (1.1) and at (3.3), for WB and C4 without package.]
Fig. 17. Noise measured after one Monte-Carlo simulation run of the circuit G
with uniform distribution of connections to nominal power supply (C4 package),
without package parasitics. Measure points are at the circuit corner (1.1) and
center (3.3).
Table 4
Minimum and maximum voltages at Vdd after 30 Monte-Carlo simulations.
Comparison between WB and C4 style connection to nominal power supply. No
package parasitics included in simulation. Measured points: (1.1, 1.2, 1.3,
2.2, 2.3, 3.3). Reported results are in Volts, as min / max; rows and columns
give the two indices of each measure point.
Wirebonding, no package (5×wmin)
       1              2              3
1      1.06 / 1.31    0.89 / 1.38    1.00 / 1.45
2                     0.77 / 1.42    0.72 / 1.43
3                                    0.93 / 1.54
C4, no package (5×wmin)
       1              2              3
1      1.08 / 1.29    0.96 / 1.32    1.06 / 1.39
2                     1.00 / 1.31    0.93 / 1.34
3                                    1.03 / 1.35
Table 5 shows the same results for 30 Monte-Carlo iterations when the
parasitics are taken into account, as described in section 4.3. The effect of
inserting the WB inductance and resistance is more evident in the first row,
with a percentage variation of about +20% at the die corner (point 1.1: 1.58 V
with, 1.31 V without package). If the C4 parasitics are taken into account,
performance worsens as well: at the corner, an undervoltage variation of about
-13% is present. However, the worsening is smaller, and there is still a
better voltage distribution through the circuit. Table 5 also reports further
results obtained by doubling the C4 connections over the die surface (dense C4
package): in this way the noise values are almost superposed on the ones in
table 4, in which no parasitics were taken into account.
Table 5
Minimum and maximum voltages at Vdd for structure G. Comparison between WB and
C4 style connection (sparse and dense). Package parasitics included in
simulation. Measured points: (1.1, 1.2, 1.3, 2.2, 2.3, 3.3). Reported results
are in Volts, as min / max; rows and columns give the two indices of each
measure point.
Wirebonding package (5×wmin)
       1              2              3
1      0.87 / 1.58    0.77 / 1.51    0.83 / 1.61
2                     0.63 / 1.53    0.65 / 1.50
3                                    0.76 / 1.58
C4 package (5×wmin)
       1              2              3
1      0.94 / 1.44    0.83 / 1.44    0.97 / 1.53
2                     0.81 / 1.50    0.78 / 1.48
3                                    0.98 / 1.46
Dense C4 package (5×wmin)
       1              2              3
1      1.09 / 1.29    0.97 / 1.31    1.06 / 1.28
2                     1.06 / 1.37    0.95 / 1.32
3                                    1.04 / 1.32
5.4 Impact of decoupling capacitors
When facing power supply noise, geometry and topology are not the only means
used to reduce its effect: decoupling capacitors are also inserted within the
die [29], [21] and in the first level package [30]. In this work, we suppose
that the physical design step allows decoupling capacitors to be placed not
only at the die periphery, but wherever possible in the die area. For the
simulations we use the capacitors $C_{ground}$ in the equivalent circuit of
the L-shaped structure in figure 3. In the previous simulations these model
the intrinsic network capacitance only, while in this case their value is
increased to include both the intrinsic net capacitance and the decoupling
capacitance. Several simulations have been carried out, but only two results
are reported in the following for the sake of brevity: a distributed
decoupling capacitance of 50 fF and a double capacitance of 100 fF for each
$C_{ground}$ capacitor. Table 6 reports the simulation results for the G
structure. Noise is markedly reduced in the WB case, as an average overvoltage
decrease of 14% is obtained, improving to a 17% reduction if the distributed
capacitance is doubled; furthermore, a better voltage distribution is obtained
within the circuit. In the C4 case noise is reduced by about 4%, and by up to
11.6% in the double capacitance case. This result is comparable to the dense
C4 case. Optimal results can thus be achieved by trading off these two
strategies, taking routing congestion into account as well.
Table 6
Decoupling capacitance insertion. Minimum and maximum voltages at Vdd for
structure G. Comparison between WB and C4 style connection. Package parasitics
included in simulation. Measured points: (1.1, 1.2, 1.3, 2.2, 2.3, 3.3).
Reported results are in Volts, as min / max; rows and columns give the two
indices of each measure point.
WB, distributed decap (5×wmin)
       1              2              3
1      1.05 / 1.35    0.95 / 1.34    1.02 / 1.39
2                     0.98 / 1.32    0.93 / 1.35
3                                    1.04 / 1.35
WB, double distributed decap (5×wmin)
       1              2              3
1      1.09 / 1.32    1.02 / 1.29    1.05 / 1.33
2                     1.03 / 1.28    0.98 / 1.33
3                                    1.06 / 1.30
C4, distributed decap (5×wmin)
       1              2              3
1      1.01 / 1.37    1.07 / 1.40    1.03 / 1.35
2                     1.01 / 1.43    1.07 / 1.41
3                                    1.06 / 1.40
C4, double distributed decap (5×wmin)
       1              2              3
1      1.10 / 1.29    1.02 / 1.29    1.06 / 1.31
2                     1.05 / 1.28    1.01 / 1.31
3                                    1.06 / 1.29
Finally, figure 18 gives a synoptic view of all the cases considered for
circuit G. Worst case overvoltage and undervoltage are shown in percentage
with respect to nominal Vdd. It is evident that a power supply noise
estimation carried out without considering package parasitics is not reliable:
in the WB case the undervoltage may worsen from 22% to 36%, while in the C4
case the overvoltage may grow from 12% to 21%. Furthermore, the C4 density is
a variable to be taken into account, as a dense connection to C4 solder bumps
assures a dynamic voltage drop reduction comparable to that of the insertion
of decoupling capacitors. This last technique is normally adopted at the end
of the design flow, so that capacitors are inserted wherever possible in the
design. A prediction of the total distributed capacitance needed to reduce the
dynamic voltage drop could be of help if available before the design stage. As
a general comment, it is interesting to note that the capability to analyze
these noise figures early, as a function of the design parameters, allows
design countermeasures to be defined without the need for long post physical
design simulations.
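
The percentages quoted above can be related to the tabulated voltages with a
one-line conversion. The sketch below applies it to the values listed for the
die center (point 3.3) in tables 4 and 5, which appear to be the ones behind
the figures in the text; this correspondence is an inference, and the names
used are illustrative.

```python
VDD_NOMINAL = 1.2  # V, nominal supply for T3

# (min, max) voltages at the die center, point 3.3, copied from tables 4 and 5.
CENTER_V = {
    "WB no package": (0.93, 1.54),
    "WB package": (0.76, 1.58),
    "C4 no package": (1.03, 1.35),
    "C4 package": (0.98, 1.46),
}


def percent_deviation(voltage, nominal=VDD_NOMINAL):
    """Signed deviation from the nominal supply, in percent."""
    return 100.0 * (voltage - nominal) / nominal


for case, (v_min, v_max) in CENTER_V.items():
    print(f"{case}: {percent_deviation(v_min):+.1f}% / {percent_deviation(v_max):+.1f}%")
```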
[Figure 18 plot: minimum and maximum percentage variation (axis from -40 to
40) for nine cases: WB no package, C4 no package, WB package, C4 package,
dense C4 package, WB pack decap, C4 pack decap, WB pack double decap, C4 pack
double decap.]
Fig. 18. Comparison among different simulation results for structure G.
Variations vs. nominal power supply.
6 Conclusions
This paper presented a new methodology for estimating the Dynamic Power
Supply Noise in an early design phase, so that the often repeated post
physical design verification step can be executed only when near-optimal
choices on power grid design have already been taken. Differently from
previous methodologies, the proposed one allows dynamic noise to be predicted,
and not only static IR drop, as transient currents are considered and
inductance is included in the analysis as well. The prediction is made
possible by the creation of a library used for the automatic generation of
power grid structures, whose geometrical and technological parameters are
varied on the basis of the complexity of the circuit to be analyzed: many grid
structures, topologies and hierarchies can be simulated thanks to the library
modularity. Package parasitics and the connection to the on-chip grid can be
included in the analysis as well. Their mutual impact can thus be simulated,
bringing together two aspects which are considered separately in previous
power supply noise evaluation methods. Currents drawn by the circuits are
modeled with current generators whose shapes are varied with statistical
Monte-Carlo analysis: frequency, rise times, current peaks and gate switching
delay within the clock cycle can be inserted as parameters, so that DVD is
estimated using realistic gate switching activities. Although a restricted
variety of parameters has been used in this work, the validity of the
methodology has been extensively shown, as worst case Dynamic Voltage Drop
predictions have been accomplished for various power grid structures and
working conditions. Results demonstrate that a realistic evaluation of the
impact of parasitics is assured by a grid library whose parasitics have been
carefully extracted, and whose modularity facilitates the global power network
construction. A correct current envelope evaluation is guaranteed by
statistical analyses. Furthermore, reliable results are endorsed by accurate
Spice simulations.
References
[1] H. Chen and D. Ling, "Power supply noise analysis methodology for
deep-submicron VLSI chip design," in Proceedings of Design Automation
Conference, Anaheim, CA, USA, November 1997, pp. 638–643.
[2] A. V. Mezhiba and E. G. Friedman, "Scaling trends of on-chip power
distribution noise," IEEE Transactions on Very Large Scale Integration (VLSI)
Systems, vol. 12, pp. 386–394, Apr. 2004.
[3] Semiconductor Industry Association, "(2001) international roadmap of
semiconductors," 2001. [Online]. Available: http://public.itrs.net
[4] A. Deutsch, H. H. Smith, B. J. Rubin, and B. L. Krauter, "New methodology
for combined simulation of delta-I noise interaction with interconnect noise
for wide, on-chip data-buses using lossy transmission-line power-blocks," IEEE
Transactions on Advanced Packaging, vol. 29, pp. 11–20, Feb. 2006.
[5] A. V. Mezhiba and E. G. Friedman, "Inductive characteristics of power
distribution grids in high speed integrated circuits," in Proc. IEEE
International Symposium on Quality Electronic Design, Mar. 2002, pp. 316–321.
[6] J. L. A. Arledge and W. T. Lynch, "Scaling and performance implications for
power supply and other signal routing constraints imposed by I/O limitations,"
in Proc. IEEE International Symposium IC/Package Design Integration, Feb.
1998, pp. 45–50.
[7] S. R. Nassif and O. Fakhouri, "Technology trends in power-grid-induced
noise," in Proc. ACM International Workshop System Level Interconnect
Prediction, Apr. 2002, pp. 55–59.
[8] M. Graziano, C. Forzan, and D. Pandini, "Including power supply variations
into static timing analysis: Methodology and flow," in IEEE International
SOCC Conference, September 2005, pp. 229–232.
[9] M. Zhao, R. Panda, S. Sapatnekar, and D. Blaauw, "Hierarchical analysis of
power distribution networks," Transactions on Computer-Aided Design of
Integrated Circuits and Systems, vol. 21, no. 2, pp. 159–168, Feb. 2002.
[10] J. Kozhaya, S. Nassif, and F. Najm, "A multigrid-like technique for power
grid analysis," Transactions on Computer-Aided Design of Integrated Circuits
and Systems, vol. 21, no. 10, pp. 1148–1160, Oct. 2002.
[11] Y. Lee and C. Chen, "Power grid transient simulation in linear time
based on transmission-line-modeling-alternating-direction-implicit method,"
Transactions on Computer-Aided Design of Integrated Circuits and Systems,
vol. 21, no. 11, pp. 1343–1352, Nov. 2002.
[12] L. Zheng and H. Tenhunen, "Fast modeling of core switching noise on
distributed LRC power grid in ULSI circuit," Transactions on Advanced
Packaging, vol. 24, pp. 245–254, Aug. 2001.
[13] J. Choi, L. Wan, M. Swaminathan, B. Beker, and R. Master, "Modeling of
realistic on-chip power grid using the FDTD method," in IEEE Int. Symp. on
Electromagnetic Compatibility, vol. a, Minneapolis, MN, Aug. 2002, pp. 238–243.
[14] Y. Jiang and K. T. Cheng, "Analysis and performance impact caused by power
supply noise in deep submicron devices," in Proceedings of Design Automation
Conference, New Orleans, LA, USA, June 1999, pp. 760–765.
[15] G. Bai and I. N. Hajj, "Simultaneous switching noise and resonance
analysis of on-chip power distribution network," in Proceedings of the
International Symposium on Quality Electronic Design, San Jose, California,
Mar. 2002, pp. 163–168.
[16] S. Bondapati and F. Najm, "High-level current macro-model for power grid
analysis," in Proceedings of Design Automation Conference, New Orleans, LA,
USA, June 2002, pp. 385–390.
[17] R. Chaundhry, D. Blaauw, R. Panda, and T. Edwards, "Current signature
compression for IR-drop analysis," in Proceedings of Design Automation
Conference, Los Angeles, CA, USA, June 2000, pp. 162–167.
[18] C.-T. Chiu, S.-M. Wu, and C.-P. Hung, "High speed electrical performance
comparison between bump with RDL and wire bond technologies," in IEEE
International Symposium on Electronic Materials and Packaging, Kaohsiung,
Taiwan, 2002, pp. 83–88.
[19] H.-M. Chen, L. Huang, I. Liu, and M. D. F. Wong, "Simultaneous Power
Supply Planning and Noise Avoidance in Floorplan Design," Transactions on
Computer Aided Design, vol. 24, pp. 578–587, Apr. 2005.
[20] H. Hashemi and D. J. Herrel, "Power distribution fidelity of wirebond
compared to flip chip devices in grid array packages," IEEE Transactions
on Components, Packaging and Manufacturing Technology, Part B: Advanced
Packaging, vol. 20, pp. 272–278, Aug. 1997.
[21] N. Na, T. Budell, E. Tremble, and I. Wemple, "The effects of on-chip and
package decoupling capacitors and an efficient ASIC decoupling methodology,"
in IEEE Electronic Components and Technology Conference, June 2004, pp.
556–567.
[22] M. R. Casu, M. Graziano, G. Masera, G. Piccinini, and M. Zamboni, "An
electromigration and thermal model of power wires for a priori high-level
reliability prediction," IEEE Transactions on Very Large Scale Integration
(VLSI) Systems, vol. 12, no. 4, pp. 349–358, April 2004.
[23] SoC Encounter, Cadence Inc., 2004.
[24] Star-RCXT, Synopsys Inc., 2004.
[25] ICFB Virtuoso Layout Editor, Cadence Inc., 2003.
[26] Space layout to circuit extractor, Delft University of Technology. [Online].
Available: http://www.space.tudelft.nl
[27] Multipole-accelerated inductance analysis program. [Online]. Available:
http://www.rle.mit.edu/cpg/research codes.htm
[28] [Online]. Available: www.cbl.ncsu.edu
[29] H. H. Chen, J. S. Neely, M. F. Wang, and G. Co, "On-Chip Decoupling
Capacitor Optimization for Noise and Leakage Reduction," IEEE Symposium on
Integrated Circuits and Systems Design, 2003.
[30] N. Pham, M. Cases, D. N. de Araujo, B. Mutnury, E. Matoglu, B. Herrman,
and P. Patel, "Embedded Capacitor in Power Distribution Design of High-end
Server Packages," IEEE Electronic Components and Technology Conference,
2006.
