The design of power distribution networks in high performance integrated circuits has become significantly more challenging with recent advances in process technology. As on-chip currents exceed tens of amperes and circuit clock periods are reduced well below a nanosecond, the signal integrity of the on-chip power supply has become a primary concern in integrated circuit design. The existing work on power distribution noise scaling is reviewed and extended to include the scaling of the inductance of the on-chip global power distribution networks in high performance flip-chip packaged integrated circuits. As the dimensions of the on-chip devices are scaled by $S$, where $S > 1$, the resistive voltage drop across the power grids remains constant and the inductive voltage drop increases by $S$, if the metal thickness is maintained constant. Consequently, the signal-to-noise ratio decreases by $S$ in the case of resistive noise and by $S^2$ in the case of inductive noise. As compared to the constant metal thickness scenario, ideal interconnect scaling in the global power grid mitigates the unfavorable scaling of the inductive noise but exacerbates the scaling of resistive noise by a factor of $S$. On-chip inductive noise will therefore become of greater significance with technology scaling. Careful tradeoffs between the resistance and inductance of the power distribution networks will be necessary in nanometer technologies to achieve minimum power supply noise levels.
Introduction
CMOS technology scaling is forecast to continue for at least another ten years [1]. The ongoing miniaturization of integrated circuit (IC) feature size has placed significant requirements on the power and ground distribution networks. Circuit integration densities rise with each very deep submicrometer (VDSM) technology generation due to smaller devices and larger dies; the current density and the total current increase accordingly. At the same time, the higher switching speed of smaller transistors produces faster current transients in the power distribution network. The higher currents cause large ohmic $IR$ voltage drops while the fast current transients cause large inductive $L\,dI/dt$ voltage drops ($\Delta I$ noise) in the power distribution networks. Power distribution networks must be designed to minimize these voltage drops, maintaining the local supply voltage within specified design margins. If the power supply voltage sags too low, the performance and functionality of the circuit will be severely compromised. Alternatively, excessive overshoot of the supply voltage can affect circuit reliability. Further exacerbating these problems is the decrease in noise margins with each new generation of VDSM process technology.
Ensuring adequate signal integrity of the power supply has become a primary design issue in high performance, high complexity digital integrated circuits. A significant fraction of the on-chip resources is dedicated to achieving this objective. Global on-chip power distribution networks are typically designed at the early stages of the design process, when little is known about the power demands at specific locations on an IC. Furthermore, allocating additional wiring resources for the global power distribution network at the later stages of the design process in order to improve the local electrical characteristics of the power network is likely to create routing problems and can also be prohibitively expensive. For these reasons, power distribution networks tend to be conservatively designed [2], sometimes using more than a third of the on-chip metal resources [3, 4].
Power distribution networks in high performance digital ICs are commonly structured as a multilayer grid. In such a grid, straight power/ground (P/G) lines in each metalization layer span the entire die (or a large functional unit) and are orthogonal to the lines in the adjacent layers. The power and ground lines typically alternate in each layer. Vias are used to connect a power (ground) line to another power (ground) line at the overlap sites. The power grid concept is illustrated in Fig. 1, where three layers of interconnect are depicted with the power lines shown in dark grey to distinguish them from the ground lines.
The scaling trend of noise in high performance power distribution grids is, therefore, of practical interest. In addition to noise magnitude constraints, electromigration reliability considerations limit the maximum current density in on-chip interconnect. The scaling of the peak current density in power distribution grids is also of practical interest. The results of this scaling analysis depend upon various assumptions. The published scaling analyses of power distribution noise are reviewed and compared along with the relevant assumptions. The scaling of the inductance of an on-chip power distribution network extends the existing work. Scaling trends of on-chip power supply noise in high performance flip-chip packaged circuits are the focus of this investigation.
The paper is organized as follows. Related existing work is reviewed in Section 2. The interconnect characteristics assumed in the analysis are discussed in Section 3. The scaling of power noise and electromigration reliability is described in Section 4. Implications of the scaling analysis are discussed in Section 5. Some conclusions are offered in Section 6.
Background
Ideal scaling of CMOS transistors was first described by Dennard et al. in 1974 [5]. Assuming a scaling factor $S$, where $S > 1$, all transistor dimensions uniformly scale as $1/S$, the supply voltage scales as $1/S$, and the doping concentrations scale as $S$. This "ideal" scaling scenario maintains the electric field within the device constant throughout the scaled process and ensures a proportional scaling of the I-V characteristics. Under the ideal scaling paradigm, the transistor current scales as $1/S$, the transistor power decreases as $1/S^2$, and the transistor density increases as $S^2$. The transistor switching time decreases as $1/S$, the power per circuit area remains constant, and the current per circuit area scales as $S$. The die dimensions increase by a chip dimension scaling factor $S_C$. The total capacitance of the on-chip devices and the circuit current increase by $S S_C^2$ while the circuit power increases by $S_C^2$. The scaling of interconnect was first described by Saraswat and Mohammadi [6]. Ideal scaling behavior is listed in Table 1.
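The ideal scaling relations listed above can be collected in a short sketch. The class and key names below are illustrative and not from the original analysis; the factors themselves are those stated in the preceding paragraph. The sketch also checks that the per-area quantities follow from the per-transistor quantities and the transistor density.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IdealScaling:
    """Ideal (constant electric field) scaling factors as stated in the text.

    S   : device dimension scaling factor, S > 1
    S_C : chip dimension scaling factor
    """
    S: float
    S_C: float

    def factors(self):
        S, S_C = self.S, self.S_C
        return {
            "device dimensions": 1 / S,
            "supply voltage": 1 / S,
            "doping concentration": S,
            "transistor current": 1 / S,
            "transistor power": 1 / S**2,
            "transistor density": S**2,
            "switching time": 1 / S,
            "power per area": 1.0,
            "current per area": S,
            "total capacitance": S * S_C**2,
            "total current": S * S_C**2,
            "total power": S_C**2,
        }


f = IdealScaling(S=2.0, S_C=1.5).factors()
# Per-area quantities are per-transistor quantities times transistor density.
print(f["transistor power"] * f["transistor density"])    # power per area
print(f["transistor current"] * f["transistor density"])  # current per area
```

The two printed products reproduce the constant power per area and the current per area scaling as $S$ quoted in the text.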
Alternatively, the power supply voltage can be maintained constant during the scaling process.
Several research results have been published on the impact of technology scaling on the integrity of the IC power supply [7, 8, 9, 10]. The published analyses differ in the assumptions made about the on-chip and package level interconnect characteristics. The analyses can be classified according to several categories: whether resistive $IR$ or inductive $L\,dI/dt$ noise is considered, whether wire-bond or flip-chip packaging is assumed, and whether packaging or on-chip interconnect parasitic impedances are assumed dominant. Traditionally, the package-level parasitic inductance (the bond wires, lead frames, and pins) has dominated the total inductance of the power distribution system while the on-chip resistance of the power lines has dominated the total resistance of the power distribution system. The resistive noise has therefore been associated with the resistance of the on-chip interconnect and the inductive noise has been associated with the inductance of the off-chip packaging [9, 11, 12].
Scaling of the resistive voltage drop in a wire-bonded integrated circuit of constant size has been investigated by Song and Glasser in [7]. Assuming that the interconnect thickness scales as $1/S$, the ratio of the supply voltage to the resistive noise, i.e., the signal-to-noise ratio (SNR) of the power supply voltage, scales as $1/S^3$ under ideal scaling (as compared to $1/S^4$ under constant voltage scaling). Song and Glasser proposed a multilayer interconnect stack to address this problem. Assuming that the top metal layer has a constant thickness, the scaling of the power supply signal-to-noise ratio improves by a power of $S$ as compared to standard interconnect scaling.
Bakoglu [8] investigated the scaling of both resistive and inductive noise in wire-bonded ICs taking into account the increase in the die size by $S_C$ with each technology generation. Under the assumption of ideal interconnect scaling (i.e., the number of interconnect layers remains constant and the thickness of each layer is reduced as $1/S$), the SNR of the resistive noise decreases as $1/(S^4 S_C^2)$. The SNR of the inductive noise due to the parasitic impedances of the packaging decreases as $1/(S^4 S_C^3)$. These estimates of the SNR are made under the assumption that the number of interconnect levels increases as $S$. This assumption scales the on-chip capacitive load, average current, and, consequently, the SNR of both the inductive and resistive noise by a factor of $S$. Bakoglu also considered an improved scaling scenario where the number of chip-to-package power connections increases as $S S_C^2$, implying flip-chip packaging. In this scenario, the resistive $\mathrm{SNR}_R$ scales as 1 assuming that the thickness of the upper metal levels is inversely scaled as $S$. The inductive $\mathrm{SNR}_L$ scales as $1/S$ under the assumption that the effective inductance per power connection scales as $1/S^2$.
A detailed overview of modeling and mitigation of package-level inductive noise is presented by Larsson [9]. The SNR of the inductive noise is shown to decrease as $1/(S^2 S_C)$ under the assumption that the number of interconnect levels remains constant and the number of chip-to-package power/ground connections increases as $S S_C$. The results and key assumptions of the power supply noise scaling analyses are summarized in Table 2.
The effect of the flip-chip pad density on the resistive drop in power supply grids has been investigated by Arledge and Lynch in [10]. All other conditions being equal, the maximum resistive drop is proportional to the square of the pad pitch. Based on this trend, a pad density of 4000 pads/cm$^2$ is the minimum density required to assure an acceptable on-chip $IR$ drop and I/O signal density at the 50 nm technology node.
Interconnect characteristics
The thickness of the top interconnect layers (where the conductors of the global power distribution networks are located) is assumed in this analysis to remain constant, as these layers have not been scaled through several recent technology generations due to power distribution noise and interconnect delay considerations. The impact of this assumption on the results as compared to ideal interconnect scaling is discussed in Section 5.
The number of metal layers and the fraction of the metal resources dedicated to the power distribution network are also assumed constant. The ratio of the diffusion barrier thickness to the copper interconnect core is assumed to remain constant with scaling. The increase in the copper resistivity of the interconnect due to electron scattering at the interconnect surface interface (significant at line widths below 45 nm [1] ) is neglected for relatively thick global power lines. Overall, the total sheet resistance of a global power distribution network remains constant with technology scaling.
In a flip-chip package, the integrated circuit and the package are interconnected via an area array of solder bumps mounted onto the on-chip I/O pads [13]. The power supply current enters the on-chip power distribution network from the power/ground pads. A view of the on-chip area array of power/ground pads is shown in Fig. 2. All of the power/ground pads of a flip-chip packaged IC are assumed to be equipotential, i.e., the variation in the voltage levels among the pads is considered negligible as compared to the noise within the on-chip power distribution network. For the purpose of this scaling analysis, a uniform power consumption per die area is assumed. Under these assumptions, each power (ground) pad supplies power (ground) current only to those circuits located in the area around the pad, as shown in Fig. 2. This area is referred to as a power distribution cell (or power cell). The edge dimensions of each power distribution cell are proportional to the pitch of the power/ground pads. The size of the power cell area determines the effective span of the on-chip power distribution network. The power distribution scaling analysis becomes independent of the die size as the number of power connections scales as $S_C^2$.
An important element of this analysis is the scaling of flip-chip technology. According to the ITRS Roadmap [1], at the 150 nm line half-pitch technology node the pad pitch $P$ is 160 µm. At the 35 nm node the pad pitch is forecast to be 80 µm. That is, the linear density of the pads doubles for a fourfold reduction in circuit feature size. The pad size and pitch $P$ scale, therefore, as $1/\sqrt{S}$ and the area density ($\propto 1/P^2$) of the pads increases as $S$ with each technology generation. Interestingly, one of the reasons given for this relatively infrequent change in the pad pitch (as compared with the introduction of new CMOS technology generations) is the cost of the test probe head [1]. The maximum density of the flip-chip pads is assumed to be limited by the pad pitch. Although the number of on-chip pads is forecast to remain constant, some recent research has predicted that the number of on-chip power/ground pads will increase due to electromigration and resistive noise considerations [10, 14].
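As a rough check of the pad pitch scaling quoted above, the exponent implied by the two ITRS data points can be computed directly. This is a minimal sketch; the function name is illustrative, and the only inputs are the node and pitch values given in the text.

```python
import math


def implied_pitch_exponent(node_a_nm, pitch_a_um, node_b_nm, pitch_b_um):
    """Return k such that pitch scales as S**k, where S = node_a / node_b."""
    S = node_a_nm / node_b_nm
    return math.log(pitch_b_um / pitch_a_um) / math.log(S)


# ITRS values quoted in the text: 160 um at the 150 nm node, 80 um at the 35 nm node.
k = implied_pitch_exponent(150, 160, 35, 80)
print(f"pitch exponent   k = {k:.2f}")       # close to -0.5, i.e. P ~ 1/sqrt(S)
print(f"density exponent   = {-2 * k:.2f}")  # pad area density ~ 1/P^2, close to S
```

The implied exponent is about -0.48, which is consistent with the $1/\sqrt{S}$ approximation used in the analysis.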
On-chip capacitors are used to decouple the parasitic impedance of the power grid lines from the power load. The decoupling capacitors provide a low impedance path between the on-chip power and ground at high frequencies, lowering the impedance of the on-chip power distribution networks. A simple model of an on-chip power distribution grid with a power load and a decoupling capacitor is shown in Fig. 3 .
To allow charge sharing during the switching transient, the decoupling capacitors are placed electrically close to the power load (the logic gates), i.e., $R_{decoup} \ll R_{grid}$. The charge on the decoupling capacitance decreases during the switching transient due to charge sharing with the load capacitance. To prevent excessive power noise, the voltage across the decoupling capacitor should be restored to the nominal voltage before the next transition, i.e., within a clock period. The charge on the decoupling capacitor is replenished by the current flowing through the on-chip power network. The on-chip decoupling capacitors, therefore, do not lower the effective impedance of the power distribution networks at signal frequencies comparable to or below the clock frequency of the circuit.
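The charge sharing argument can be illustrated with a small numerical sketch. It assumes ideal capacitors, a load capacitance that is fully discharged before the transition, and no current from the grid during the transient; the function names and the component values are hypothetical and chosen only for illustration.

```python
def shared_voltage(v_dd, c_decap, c_load):
    """Voltage after the decoupling capacitor shares charge with a discharged load."""
    return v_dd * c_decap / (c_decap + c_load)


def replenish_current(v_dd, c_decap, c_load, t_clk):
    """Average grid current needed to restore the nominal voltage within one clock period."""
    v_shared = shared_voltage(v_dd, c_decap, c_load)
    charge = (c_decap + c_load) * (v_dd - v_shared)  # equals c_load * v_dd
    return charge / t_clk


# Hypothetical values: 1 V supply, 9 nF decoupling, 1 nF switched load, 1 ns clock.
v = shared_voltage(1.0, 9e-9, 1e-9)
i = replenish_current(1.0, 9e-9, 1e-9, 1e-9)
print(f"voltage after charge sharing: {v:.3f} V")
print(f"average replenishing current: {i:.3f} A")
```

In this simplified model the replenishing charge equals the charge delivered to the load, regardless of the decoupling capacitance. This is why the decoupling capacitors reduce the fast transient seen by the grid but not the current drawn through the grid over a clock period.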
Power supply noise scaling
A power distribution cell can be modeled as a circle of radius $r_c$ with a constant current per area $I_A$ [10], as shown in the corresponding figure.
The resistive voltage drop $\Delta V_R$ is proportional to the product of the total cell current $I_{cell}$ and the effective sheet resistance $\rho_\square$, i.e., $\Delta V_R = C\, I_{cell}\, \rho_\square$, with the coefficient $C$ dependent only on the ratio $r_c/r_p$ of the cell radius to the pad radius.
The cell current $I_{cell}$ is the product of the area current density $I_A$ and the cell area $\pi r_c^2$. The current per area $I_A$ scales as $S$; the area of the cell is proportional to $P^2$, which scales as $1/S$. The cell current $I_{cell}$, therefore, remains constant (i.e., scales as 1). The sheet resistance $\rho_\square$ of the power distribution grid and the ratio of the pad pitch to the pad size remain constant with scaling. The resistive drop $\Delta V_R$, therefore, scales as $I_{cell}\,\rho_\square \propto 1 \cdot 1 \propto 1$. The resistive SNR of the power supply voltage, consequently, decreases with scaling as $\mathrm{SNR}_R = V_{dd}/\Delta V_R \propto (1/S)/1 \propto 1/S$, where $V_{dd}$ is the supply voltage.
This coincides with the trend described by Bakoglu in the improved scaling scenario [8]. A faster scaling of the on-chip current as described by Bakoglu is offset by increasing the interconnect thickness by $S$, which reduces the sheet resistance $\rho_\square$ by $S$. This trend is more favorable as compared to the $1/S^2$ dependence established by Song and Glasser [7]. The improvement is due to the decrease in the power cell area of a flip-chip IC by a factor of $S$ with scaling whereas a constant die area is assumed in [7].
The inductive properties of power distribution grids are investigated in [15]. It is shown that the inductance of the power grids with alternating power and ground lines behaves similarly to the grid resistance. That is, the grid inductance increases linearly with the grid length and decreases inversely linearly with the number of lines in the grid. This linear behavior is due to the periodic structure of the alternating power and ground grid lines. The long range inductive coupling of a specific (signal or power) line to a power line is cancelled out by the coupling to the ground lines adjacent to the power line, which carry current in the opposite direction [15]. Inductive coupling in periodic grid structures, therefore, is effectively a short range interaction. Similar to the grid resistance, the grid inductance can be conveniently expressed as a dimension independent grid sheet inductance [16]. The grid sheet inductance is determined by the line pitch, height, and thickness, which are assumed constant in this analysis. Therefore, analogous to the resistive voltage drop $\Delta V_R$ discussed above, the inductive voltage drop $\Delta V_L$ is proportional to the product of the sheet inductance of the global power grid and the magnitude of the cell transient current.
The inductive voltage drop $\Delta V_L$ in a power distribution cell is proportional to $L_\square\, dI_{cell}/dt$, where $L_\square$ is the sheet inductance of the power distribution grid, and $dI_{cell}/dt$ is the transient current of a power distribution cell.
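The scaling of the inductive drop can be sketched in the same manner as the resistive case. The derivation below is a reconstruction from the assumptions stated in this paper (constant sheet inductance, constant cell current, and a switching time that scales as $1/S$) and agrees with the trends summarized in the abstract and conclusions. The symbols $t_{sw}$ for the switching time and $V_{dd}$ for the supply voltage are notation introduced here for illustration.

```latex
\begin{align*}
\frac{dI_{cell}}{dt} &\propto \frac{I_{cell}}{t_{sw}} \propto \frac{1}{1/S} = S, \\
\Delta V_L &\propto L_\square \, \frac{dI_{cell}}{dt} \propto 1 \cdot S = S, \\
\mathrm{SNR}_L &= \frac{V_{dd}}{\Delta V_L} \propto \frac{1/S}{S} = \frac{1}{S^2}.
\end{align*}
```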
The maximum current density in the power grid occurs where the power grid lines contact the power pads. At these locations, the current delivered through the solder bump enters the power grid and spreads out. The area of contact between a power pad and the power grid is proportional to the pad perimeter, which scales as $1/\sqrt{S}$. The power cell current $I_{cell}$ supplied by the pad remains constant as described above. The maximum current density, therefore, scales as $I_{cell}/(1/\sqrt{S}) \propto \sqrt{S}$.
The average current density in power distribution lines scales similarly. Note that the current density scaling is more favorable as compared to the scaling of the resistive $\mathrm{SNR}_R$ ($\propto 1/S$). Therefore, if the metal capacity of the on-chip power distribution grid is increased by $S$ so as to maintain the resistive $\mathrm{SNR}_R$ constant, the average current density in the power lines will decrease as $1/\sqrt{S}$.
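The current density trend can be expressed as a short sketch. The function and the `metal_increase` parameter are illustrative names introduced here; the model simply divides a constant cell current by a conducting cross section that is proportional to the pad perimeter and to any additional metal allocated to the grid.

```python
import math


def current_density_factor(S, metal_increase=1.0):
    """Relative current density after scaling by S.

    The cell current is constant, the pad perimeter scales as 1/sqrt(S),
    and metal_increase is a hypothetical multiplier on the metal capacity.
    """
    cell_current = 1.0
    cross_section = metal_increase / math.sqrt(S)
    return cell_current / cross_section


S = 4.0
print(current_density_factor(S))                    # constant metal: grows as sqrt(S)
print(current_density_factor(S, metal_increase=S))  # metal increased by S: falls as 1/sqrt(S)
```

With the metal capacity held constant the density grows as $\sqrt{S}$; increasing the metal capacity by $S$ reverses the trend to $1/\sqrt{S}$, as stated above.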
Implications of noise scaling
The amplitude of both the resistive and inductive noise increases with technology scaling relative to the power supply voltage. A number of techniques have been proposed to mitigate the unfavorable scaling of power distribution noise. These techniques are briefly summarized below.
To maintain a constant supply voltage to resistive noise ratio, the effective sheet resistance of the global power distribution grid should be reduced. There are two ways to allocate additional metal resources to the power distribution grid. One option is to increase the number of metalization layers. This approach adversely affects fabrication time and yield and, therefore, increases the cost of manufacturing. The ITRS Roadmap forecasts only a moderate increase in the number of interconnect levels, from eight levels at the 130 nm line half-pitch node to eleven levels at the 32 nm node [1] . The second option is to increase the fraction of metal area per metal level allocated to the power grid. This strategy decreases the amount of wiring resources that can be allocated for global signal routing.
The sheet inductance of the power distribution grid, similar to the sheet resistance, can be lowered by increasing the number of interconnect levels. Furthermore, wide metal trunks typically used for power distribution at the top levels can be replaced with narrow interdigitated power/ground lines. Although this configuration substantially lowers the grid inductance, it increases the grid resistance and, consequently, the resistive noise [16] .
Alternatively, circuit techniques can be employed to limit the peak transient power current demands of the digital logic. Current steering logic, for example, produces a minimal variation in the current demand between the transient response and the steady state response. In synchronous circuits, the maximum transient currents typically occur at the beginning of a clock period, when, immediately after the arrival of a clock signal at the latches, a signal begins to propagate through the blocks of sequential logic. Clock skew scheduling can be exploited to spread in time the periods of peak current demand [17].
Relative to the supply voltage, the inductive noise $\Delta V_L$ scales as $S^2$ and therefore increases faster by a factor of $S$ than the resistive noise $\Delta V_R$, which scales as $S$. The estimates of inductive and resistive noise described by Bakoglu also differ by a factor of $S$ [8]. The increase in the significance of inductance of the power distribution interconnect is similar to that noted in signal interconnect [18, 19]. The trend is, however, delayed by several technology generations as compared to signal interconnect. As discussed in Section 3, the high frequency harmonics are filtered out by the on-chip decoupling capacitance and the power current has a comparatively lower frequency content.
The significance of the inductive noise relative to the resistive noise is, therefore, increasing with technology scaling. This conclusion is in agreement with the trends forecast by the ITRS [1]. The forecast demands in power current of high performance microprocessors are shown in Fig. 5. Both the average current and the transient current are rising exponentially with technology scaling. The rate of increase in the transient current is more than double the rate of increase in the average current as indicated by the slope of the trend lines depicted in Fig. 5. The faster rate of increase in the transient current as compared to the average current is due to rising circuit clock frequencies. The transient current of modern high performance processors is approximately one teraampere per second ($10^{12}$ A/s) and is expected to rise, reaching hundreds of teraamperes per second. Such a high magnitude of the transient current is due to switching hundreds of amperes within a fraction of a nanosecond.
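The order of magnitude of the transient current can be illustrated with a simple estimate. The values of 100 A and 0.1 ns below are illustrative round numbers, not figures taken from the ITRS data.

```latex
\[
\frac{dI}{dt} \approx \frac{\Delta I}{\Delta t}
  = \frac{100\ \mathrm{A}}{0.1\ \mathrm{ns}}
  = 10^{12}\ \mathrm{A/s}
\]
```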
In order to translate the projected current requirements into supply noise voltage trends, an interconnect structure, shown in Fig. 6, is assumed. The voltage differential across this structure caused by the average and transient currents produces a resistive and inductive noise, respectively. The grid consists of interdigitated power and ground lines of 1 µm × 1 µm cross section with 1 µm line spacing. The length and width of the grid are equal to the size of a power distribution cell. The size of the power cell is assumed to be twice the pitch of the flip-chip pads, reflecting that only half of the total number of pads are used for the power and ground distribution as forecast by the ITRS for high performance ASICs. Note that the resistance and inductance of the square grid are independent of the dimensions [16] (as long as the dimensions are much greater than the line pitch). The average and transient currents flowing through the grid are, however, scaled from the IC current requirements shown in Fig. 5 in proportion to the area of the grid. The current flowing through the square grid is therefore the same as the current distributed through the power grid within the power cell.
The power current enters and leaves from the same side of the grid, assuming the power load is connected at the opposite side. The square grid has the same inductance to resistance ratio as the global distribution grid with the same line pitch, thickness, and width. Hence, the square grid has the same inductive to resistive noise ratio. The square grid model also produces the same rate of increase in the noise because the current is scaled proportionately to the area of the power cell. The resulting noise trends are illustrated in Fig. 7. The area of the grid scales as $1/S$. The current area density increases as $S$. The total average current of the grid remains, therefore, constant. The resistive noise also remains approximately constant, as shown in Fig. 7.
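The independence of the square grid resistance from the grid dimensions can be checked with a minimal sketch. The model treats the lines of one net as identical parallel resistors and ignores vias, orthogonal layers, and current crowding. The bulk copper resistivity of 0.017 Ω·µm and the same-net pitch of 4 µm (power and ground lines alternating at a 2 µm line pitch) are assumptions made here for illustration.

```python
def grid_resistance(length_um, width_um, net_pitch_um, r_per_um):
    """Resistance of identical parallel lines of one net spanning a grid.

    length_um    : line length along the current direction
    width_um     : grid extent across the current direction
    net_pitch_um : distance between adjacent lines of the same net
    r_per_um     : resistance per unit length of a single line (ohm/um)
    """
    n_lines = width_um / net_pitch_um  # treated as a continuous quantity
    return r_per_um * length_um / n_lines


RHO_CU = 0.017                 # ohm*um, approximate bulk copper (assumed)
R_PER_UM = RHO_CU / (1.0 * 1.0)  # 1 um x 1 um line cross section
NET_PITCH = 4.0                # um, assumed same-net pitch

r_small = grid_resistance(160.0, 160.0, NET_PITCH, R_PER_UM)
r_large = grid_resistance(320.0, 320.0, NET_PITCH, R_PER_UM)
print(r_small, r_large)  # equal up to rounding: the square grid has a fixed resistance
```

Doubling both the length and the width doubles the resistance of each line and doubles the number of parallel lines, so the resistance of the square is unchanged. The same argument applies to the sheet inductance as long as the grid is much larger than the line pitch.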
The inductive noise, alternatively, rises steadily and becomes comparable to the resistive noise approximately at the 45 nm technology node. Note that the structure depicted in Fig. 6 has a lower inductance to resistance ratio as compared to practical power distribution grids because the power and ground lines are relatively narrow and placed adjacent to each other, reducing the area of the current loop and increasing the grid resistance [15, 16] . The inductive to resistive noise ratio is therefore somewhat optimistic.
The rise of the inductive noise is mitigated if ideal interconnect scaling is assumed and the thickness, width, and pitch of the global power lines are scaled as $1/S$. In this scenario, the density of the global power lines increases as $S$ and the sheet inductance $L_\square$ of the global power distribution grids decreases as $1/S$, improving the inductive noise and $\mathrm{SNR}_L$ by a factor of $S$. The sheet resistance of the power distribution grid, however, increases as $S$, degrading the resistive noise and $\mathrm{SNR}_R$ by a factor of $S$. Presently, the resistive parasitic impedance dominates the total impedance of the on-chip power distribution networks. Ideal scaling of the upper interconnect levels will therefore increase the power distribution noise. However, as technology approaches the nanometer range and the inductive and resistive noise become comparable, judicious tradeoffs between the resistance and inductance of the power networks will be necessary to achieve the minimum noise level.
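The two interconnect scenarios can be compared side by side in a short sketch. The function name and the dictionary keys are illustrative; all quantities are relative scaling factors under the assumptions of this analysis (supply voltage scaling as $1/S$, constant cell current, switching time scaling as $1/S$).

```python
def snr_scaling(S, ideal_interconnect=False):
    """Relative resistive and inductive SNR after scaling by S.

    ideal_interconnect=False : constant thickness of the global power lines
    ideal_interconnect=True  : thickness, width, and pitch scaled as 1/S
    """
    v_dd = 1.0 / S                  # supply voltage
    i_cell = 1.0                    # cell current remains constant
    di_dt = i_cell * S              # switching time scales as 1/S
    r_sheet = S if ideal_interconnect else 1.0
    l_sheet = 1.0 / S if ideal_interconnect else 1.0
    dv_r = i_cell * r_sheet         # resistive drop
    dv_l = l_sheet * di_dt          # inductive drop
    return {"SNR_R": v_dd / dv_r, "SNR_L": v_dd / dv_l}


for ideal in (False, True):
    print("ideal interconnect" if ideal else "constant thickness",
          snr_scaling(2.0, ideal_interconnect=ideal))
```

For $S = 2$ the constant thickness case yields relative SNR values of $1/S$ and $1/S^2$, while ideal interconnect scaling exchanges them, trading inductive noise for resistive noise as described above.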
Conclusions
An analysis of the scaling of power distribution noise in flip-chip packaged high performance ICs is presented in this paper. Published scaling analyses of power distribution noise are reviewed and the analysis assumptions are discussed. If the number of interconnect levels and the fraction of metal area dedicated to routing the power and ground are maintained constant, the resistive voltage drop across the power grids remains approximately constant, while the inductive drop increases by $S$. Consequently, the SNR decreases by $S$ in the case of resistive noise and by $S^2$ in the case of inductive noise. Additional interconnect resources will be required to offset these trends. As compared to the resistive noise, the on-chip inductive noise will increase faster and become more significant with technology scaling. The maximum current density in power grids will also increase by $\sqrt{S}$, exacerbating reliability concerns in high performance power distribution grids. Ideal interconnect scaling of the upper metal levels improves the inductive $\mathrm{SNR}_L$ by $S$ and worsens the resistive $\mathrm{SNR}_R$ by $S$. Careful tradeoffs between the resistance and inductance of the power distribution networks will be necessary in nanometer technologies to achieve minimum power supply noise levels.
