Abstract. The effect of power supply noise in on-chip power grids and its implications for the path delay in digital circuits are examined. The simulation results show that IR-Drop and the resulting path delay are strongly affected by the layout of the circuit. Power grid design measures to reduce IR-Drop, as well as their area and performance implications, are discussed.
Introduction
With the ongoing scaling of CMOS technologies, process and environmental variations gain significance. Since supply voltages have been scaled down over the last technology nodes, power supply integrity has become a first-class design issue (Benoit et al., 1998; Lin and Chang, 2001). Variations of the supply voltage are caused either by the inductance of the power line multiplied by the time derivative of the current, so-called dI/dt noise, or by the voltage drop across the finite resistance of the power grid, caused by the current flowing through it.
In this paper the implications of power supply noise on the path delay are examined. In Sect. 2 an overview of the different types of variations in modern technologies is given, and the challenges imposed by environmental variations are described. Following Sect. 3 on IR-Drop, the modeling of the power grid and the simulation setups are explained in Sect. 4. In Sect. 5 the propagation of the voltage drop over the power grid is explained for two different simulation setups. Simulation results for the implications of power supply noise on the path delay are shown in Sect. 6. In Sect. 7 design measures for reducing power supply noise, as well as their implications for performance and area overhead, are shown. Finally, conclusions are drawn in Sect. 8.
Correspondence to: M. Eireiner (eireiner@tum.de)
Variations
In deep submicron technologies, variations, both process and environmental, gain significance. In the following, both will be discussed briefly.
Process variations
Process variations in active and passive devices, such as changes in doping concentration, oxide thickness, wire resistance and capacitance, as well as length and width of transistors, influence device performance. Process variations which are correlated between transistors on a die, such as lot-to-lot, wafer-to-wafer and chip-to-chip variations, can be characterized by full-speed tests and speed monitors. After the characterization of the circuits, appropriate countermeasures, such as supply voltage binning or body bias adjustment, can be carried out to increase the parametric yield (Tschanz et al., 2002).
Random variations, which are not correlated, are very hard to detect and characterize. Up to now, only a few approaches have been published to address this problem (Ernst et al., 2003).
Since global or correlated variations are dominant in current technologies, speed monitors and binning of frequency and supply voltage are being used successfully. Since random variations become more and more significant with ongoing technology scaling, more research on that topic is to be expected.
Environmental variations
With the scaling of technology nodes, not only process variations but also environmental variations, such as cross talk (X-Talk), changes in temperature and power supply distortions, become more and more significant. This is due in part to the continuous decrease of supply voltage.
In principle, environmental variations are deterministic, but since the complexity is too high and most of the variations are strongly influenced by the layout of the chip, these variations are modeled statistically, if at all. Another big challenge of environmental variations is testing and characterizing them, because the switching pattern and the activity of the circuit determine the environmental changes. Scan tests, which are used for characterizing process variations, are not suitable for characterizing these variations, since in scan tests only a few paths are triggered during one clock cycle and the switching patterns are therefore not representative of real operation. Worst-case analyses and tests are one possibility to cope with these variations. Because these analyses and tests are very pessimistic, however, the overall yield decreases.
Power supply integrity
Since the paper addresses the problem of power supply noise, its origins will be explained here. Power supply noise is caused by two different mechanisms.
The first is called dI/dt noise. Transient changes of current cause a voltage drop across the inductance of the power grid, as described by Eq. 1, where $V_{dI}$ is the change in supply voltage due to dI/dt, $L$ the inductance of the power grid and $I$ the current through the power grid.

$$V_{dI} = L \, \frac{dI}{dt} \qquad (1)$$

This voltage drop can either increase or decrease the effective supply voltage for the gates. The effect therefore becomes visible especially in scenarios of block activation, clock gating or frequency change, where large changes of current occur. For low power applications, on-chip inductance can still be neglected compared to the inductance of the bond wires (Piguet, 2004). Since this paper considers only on-chip effects of power integrity, inductance is omitted in our analysis and is not modeled in the simulation setup.
The second effect of power supply integrity is called IR-Drop. As Ohm's law states (Eq. 2), the current $I$ through the power grid causes a voltage drop $V_{IR}$ across its finite resistance $R$ and therefore decreases the effective supply voltage seen by the gates.

$$V_{IR} = I \cdot R \qquad (2)$$

For applications with low $V_{DD}$ and high power density, high current values intensify this problem.
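The two noise mechanisms can be compared numerically with a minimal sketch, assuming the usual forms V_dI = L * dI/dt for Eq. 1 (with the derivative approximated by a finite difference) and V_IR = I * R for Eq. 2. The function names and all numeric values in the usage lines are hypothetical and are not taken from the paper's simulation setup.

```python
def didt_noise(inductance_h, delta_i_a, delta_t_s):
    """Eq. 1 with dI/dt approximated by delta_i / delta_t: V_dI = L * dI/dt."""
    return inductance_h * delta_i_a / delta_t_s


def ir_drop(current_a, resistance_ohm):
    """Eq. 2 (Ohm's law): V_IR = I * R."""
    return current_a * resistance_ohm


def effective_supply(vdd_v, current_a, resistance_ohm):
    """Supply voltage seen by the gates after a purely resistive drop."""
    return vdd_v - ir_drop(current_a, resistance_ohm)


# Hypothetical example values, chosen only for illustration.
print(didt_noise(1e-9, 0.1, 1e-9))       # 1 nH, 100 mA step within 1 ns
print(effective_supply(1.2, 0.05, 0.5))  # 1.2 V supply, 50 mA through 0.5 Ohm
```

The sign of the dI/dt term follows the sign of the current change, which reflects the statement that this noise can raise or lower the effective supply, whereas the resistive term always lowers it.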
IR-Drop

Resistance of power grid
As Eq. 2 describes, the voltage drop caused by IR-Drop is proportional to the resistance of the power grid which the current must pass through. This resistance depends on various parameters. The first is the specific resistance of the metallization material used. To reduce the specific resistance of the metallization, aluminum has been replaced by copper in modern technologies. Another factor is the thickness of the metal layers. Together with the specific resistance, the thickness of the metal layer determines the sheet resistance $R_S$ of a metal layer. These factors are technology dependent and therefore cannot be altered by the chip designer.
The designer of a circuit has two primary possibilities to influence the resistance of the power grid. The first follows from the resistance of a wire, $R_{wire}$, which is given by Eq. 3, where $L_{wire}$ and $W_{wire}$ are the length and width of the wire.

$$R_{wire} = R_S \, \frac{L_{wire}}{W_{wire}} \qquad (3)$$
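As a small worked example, the following sketch evaluates Eq. 3 under the assumption that it has the usual form R_wire = R_S * L_wire / W_wire. The sheet resistance (0.125 Ohm per square) and the two widths (250 nm and 600 nm) are the values quoted in this paper for the lowest metal layer; the rail length of 86 µm is only an assumption, borrowed from the layout length of the critical path, and the function name is hypothetical.

```python
def wire_resistance(sheet_res_ohm_sq, length_m, width_m):
    """Resistance of a rectangular wire: R_wire = R_S * L_wire / W_wire."""
    if width_m <= 0:
        raise ValueError("width must be positive")
    return sheet_res_ohm_sq * length_m / width_m


R_S_LOWEST = 0.125      # Ohm per square, lowest metal layer (from the text)
RAIL_LENGTH = 86e-6     # m, assumed rail segment length (not from the text)

for width in (250e-9, 600e-9):  # narrowest and widest lowest-layer wires
    r = wire_resistance(R_S_LOWEST, RAIL_LENGTH, width)
    print(f"W = {width * 1e9:.0f} nm -> R_wire = {r:.1f} Ohm")
```

Because the resistance scales with 1/W, widening a wire by a given factor lowers its resistance by the same factor, at the cost of routing area.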
The length of a power line to a specific circuit is usually given by the floorplan. Therefore, the only parameter of Eq. 3 which can be directly altered by the designer is the width of the power grid wires. The topology of the power grid is the second factor through which the designer can influence the resistance of the power grid. Usually a power grid consists of three to five metal layers, each orthogonal to its neighboring metal layers. At their overlapping points, the different metal layers are connected through multiple vias in parallel, so-called via stacks. Figure 1 shows a top view of an axially symmetric cut-out of a four-stage power grid, as it is used in our analysis. Going from bottom to top, the lowest metal layer is contacted by the second lowest metal layer in a fixed pitch. In the same manner, the second lowest metal layer is connected to the second highest metal layer in a fixed pitch, and so on. The higher the metal layer, the wider the wires are. Besides the pitches and the widths of the metal lines, the number of parallel vias per overlap point determines the resistance of the power grid. Solder balls, through which the highest metal layer of the on-chip power grid is connected to the external power supply, are depicted as red circles in Fig. 1.
Adv. Radio Sci., 4, 197-205, 2006, www.adv-radio-sci.net/4/197/2006/
Outlook on future technologies
As the power density will continue to increase and the supply voltage will continue to decrease, even if more slowly than previously expected, on-chip currents have to rise (http://www.itrs.net, 2003). This said, it is clear that power supply integrity is a problem that will become more and more challenging with ongoing technology scaling. The most likely solution to this problem is that more metallization area and more pins will be dedicated to power distribution to reduce IR-Drop and dI/dt effects. Even today, about half of the I/O pins of a high performance microprocessor are already dedicated to power distribution (Rusu et al., 2006).
Modeling of the power grid and simulation setup
In our analysis we use a linear RC model for the four-stage power grid. A schematic of the power grid model is depicted in Fig. 2. For simplicity, only the lower three stages are shown.
The values for sheet resistances, sheet capacitances, widths, pitches, and via counts are derived from a typical digital CMOS ASIC in a 90 nm low power technology. Widths of the metal lines are in the range of 250 nm to 600 nm for the lowest and 50 µm to 150 µm for the highest metal layer. Pitches range from 3 µm to 15 µm for the lowest and from 300 µm to 1500 µm for the highest layer. The resistance of a via is between 0.25 Ω to 20 Ω and 0.02 Ω to 0.2 Ω, and the number of vias per crossing point (via count) ranges from 5 to 25 up to 50 to 250, from the lowest to the highest metal layer. The sheet resistance of the different layers is between 0.02 Ω/sq and 0.125 Ω/sq from the highest to the lowest metal layer.
Since setup violations due to power noise first occur in the critical paths of a circuit, a critical path replica of an ARM9 core, with a depth of 22 stages, was taken as test vehicle in our analysis. The layout of the path has a length of 86 µm. The basic structure of the critical path and the placement of the gates in the power grid are depicted in Fig. 3. The orientation of the layout is from left to right, and all gates of the critical path are located on the same power rail. In the simulation, each gate is connected to the power grid once, so that neighboring gates are separated by the sum of their half lengths. Apart from the critical path, small and medium paths are also used in the simulation to model a more heavily loaded power grid. The schematic of the medium and small paths, as well as their placement in the grid, is analogous to that of the critical path shown in Fig. 3. All simulations are done in a low-power 90 nm technology, with $V_{DD}$ = 1.2 V and a temperature of 25 °C.
As reference for the following simulations, the critical path is connected directly to a constant voltage source, and the path delay, the CP-Q path delay, is simulated. In the following, a power rail refers to a power line on the lowest metal layer. For clarity, a power rail is defined to end at a low-ohmic connection through a via stack to the upper metal layer.
For the first simulation setup the critical path is placed in the power grid and no additional paths are added, as shown in Fig. 4. The resulting IR-Drop at all nodes of the grid as well as the resulting CP-Q path delay is simulated. This results in a maximum supply voltage drop of 8.2 mV and an increase of 1 % in CP-Q path delay.
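To illustrate how a voltage drop builds up along a single power rail, the following is a minimal static sketch and not the paper's model: it is purely resistive and DC, whereas the analysis here uses a linear RC model with transient simulation. It assumes a rail whose two ends are tied to the supply through ideal via stacks, equal segment resistances between neighboring gate connections, and constant sink currents. The function name and all numeric values are hypothetical.

```python
def solve_rail_drops(seg_res_ohm, sink_currents_a):
    """Voltage drops (VDD - V[k]) at the interior nodes of one rail.

    Both rail ends are held at VDD (ideal via stacks). KCL at interior node k
    gives -d[k-1] + 2*d[k] - d[k+1] = seg_res * I[k] with d = 0 at both ends.
    The tridiagonal system is solved with the Thomas algorithm.
    """
    n = len(sink_currents_a)
    if n == 0:
        return []
    diag = [2.0] * n
    rhs = [seg_res_ohm * i for i in sink_currents_a]
    # Forward elimination; sub- and super-diagonal entries are all -1.
    for k in range(1, n):
        w = -1.0 / diag[k - 1]
        diag[k] += w              # diag[k] - w * (-1)
        rhs[k] -= w * rhs[k - 1]
    # Back substitution.
    drops = [0.0] * n
    drops[-1] = rhs[-1] / diag[-1]
    for k in range(n - 2, -1, -1):
        drops[k] = (rhs[k] + drops[k + 1]) / diag[k]
    return drops


# Hypothetical rail: three gates drawing 1 mA each, 0.5 Ohm between neighbors.
drops = solve_rail_drops(0.5, [1e-3, 1e-3, 1e-3])
print([round(d * 1e3, 3) for d in drops])  # drops in mV, largest mid-rail
```

In this simplified picture the largest drop occurs at the node farthest from the via stacks, and adding sinks to the same rail raises the drop at every node, which is the qualitative behavior examined in the simulation setups.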
To increase the load on the power grid, additional medium paths are placed in the same power rail, as depicted in Fig. 5. The additional currents caused by the medium paths add up to a maximum IR-Drop of 13.2 mV, which results in a CP-Q path delay worsening of 1.3 %.
In real circuits inactive paths also exist, and power rails are very often shared between two standard cell rows. Therefore, in the next simulation setup additional inactive medium paths are placed in the shared power rails to simulate the effect of the junction capacitances on the IR-Drop. As shown in Fig. 6, the inactive paths act as a buffering capacitance $C_{MP}$ for the power rail of the critical path. The resulting maximum voltage drop is reduced to 10.3 mV and the resulting CP-Q path delay increase is only 1.1 %.
To further increase the load on the power grid, active medium paths are placed in the shared rail in the next simulation, as depicted in Fig. 7. For this simulation setup, a CP-Q path delay increase of 2.2 % and a maximum IR-Drop of 28.2 mV are simulated.
Fig. 7. Critical path with active medium paths in the same and the shared power rails.
In a next step, the horizontally neighboring rails are loaded with medium paths, as shown in Fig. 8. In contrast to the figures shown so far, the power rails of the second metal layer and the corresponding via stacks are shown in this figure. The increased load on the power rail results in a maximum voltage drop of 27.2 mV and a CP-Q path delay increase of 2.4 %.
In the next simulation the medium paths in the horizontally neighboring rails are replaced by critical paths, see Fig. 9. The maximum IR-Drop is simulated to be 30.2 mV; the path delay worsening does not increase any further and stays at 2.4 %, as in the case of medium paths in the neighboring rails.
Since in real circuits many critical paths are often placed next to each other, e.g. in data paths, this is modeled in the next simulation setup. Critical paths are placed not only in the next 10 neighboring rails, but also in their shared power rails, as illustrated in Fig. 10. A total of 60 critical paths is used in this simulation. The resulting CP-Q path delay increase is 4.1 %, and the maximum voltage drop is 49.8 mV.
Fig. 10. Critical path surrounded with medium and critical paths in the neighboring and shared rails.
In the next simulation setup, the orientation of the critical path in the power grid is changed. The gates of the path are no longer all placed in one power rail, as shown in Fig. 3, but are distributed over several rails. Assuming parallel critical paths, all flipflops of the critical paths are now in the first power rail, all first logic gates are placed in the next power rail, and so on. This is illustrated in Fig. 11. In our simulation we place 16 critical paths next to each other, about one quarter of the critical paths used in the last simulation. The change of orientation means that all gates in a rail now switch at the same time, which in turn increases the voltage drop and path delay.
Table 1. Descriptions of used abbreviations for simulation setups.
Since the layouts of the gates have different lengths, the longest cell, the flipflop, is taken as reference, and the logic gates are grouped to best match the length of the flipflop and thus fill the available area between the power rails. This setup results in a maximum IR-Drop of 60.8 mV and a delay increase of 5.8%.
In the last simulation, the number of parallel critical paths with changed orientation is increased to 48, to better match the number of critical paths used in the last simulation with regular orientation. This simulation results in an increase in path delay of 11.3% and a maximum supply voltage drop of 93.6 mV.
For easier understanding, all abbreviations which will be used later are briefly explained in Table 1, where the respective figures are also given.
Propagation of V_DD bounce
Reduction of complexity
Looking at our simulation setup, five independent variables can be identified: the two power supply rail voltages V_DD and V_SS, the coordinates x and y of the power grid nodes, and the time. In our analysis, the power grids for V_DD and V_SS are modeled identically. Therefore, the IR-Drops on both rails correspond to each other, even though the two power rails are stressed at different transitions of a gate: the V_DD rail at a low-high transition, the V_SS rail at a high-low transition. Hence, only the V_DD rail is observed in our analysis. It should be noted that equal modeling of the two rails is only correct for a triple well process, in which the bulk is not connected to the V_SS rail. In a twin well process, there exists an additional high ohmic path through the bulk, parallel to the V_SS rail. However, this effect is assumed to be of higher order and is therefore neglected in this work.
In general, no conclusions can be drawn about the path delay just by knowing the maximum occurring IR-Drop (Henzler et al., 2005). However, within one rail the shape of the IR-Drop is quite similar at different locations in the rail, as shown in Fig. 12, where the transients of two nodes within one rail are plotted. The first node is placed directly under a via stack at x = 150 µm; the other transient is taken at the node at which the highest IR-Drop, the worst case (WC), occurs. Both shapes are similar and differ only in amplitude. Therefore, the maximum supply voltage drop can be used as a metric for comparison between different nodes of the power grid within one simulation setup. Together with the change in path delay, it forms the figures of merit used in our work.
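The similarity argument can be made concrete with a small sketch. The following Python snippet is an illustration only and is not part of the simulation flow used in this work: it normalizes two sampled transients to their respective peaks and reports the largest pointwise deviation between them. The function names and the sample values are hypothetical and are not taken from Fig. 12.

```python
def peak_drop(drop_mv):
    """Maximum supply voltage drop of one transient, in mV."""
    return max(drop_mv)


def normalized(drop_mv):
    """Scale a transient so that its peak equals 1."""
    peak = peak_drop(drop_mv)
    return [v / peak for v in drop_mv]


def max_shape_deviation(a_mv, b_mv):
    """Largest pointwise difference between two peak-normalized transients."""
    return max(abs(x - y) for x, y in zip(normalized(a_mv), normalized(b_mv)))


# Hypothetical samples (assumed, not measured): same shape, different amplitude.
via_node_mv = [0.0, 1.0, 2.5, 1.5, 0.5]
worst_case_mv = [0.0, 9.6, 24.0, 14.4, 4.8]

deviation = max_shape_deviation(via_node_mv, worst_case_mv)
print(f"peak at via node: {peak_drop(via_node_mv)} mV")
print(f"peak at WC node:  {peak_drop(worst_case_mv)} mV")
print(f"shape deviation after normalization: {deviation:.3g}")
```

If the deviation is small compared to 1, the two transients differ essentially only in amplitude, and the peak value alone is a reasonable basis for comparing the nodes.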
Propagation of peak
In the following, a rail is considered which is terminated by via stacks at x = 150 µm and x = 300 µm. The y coordinate is y = 225 µm. The coordinates correspond to those in Fig. 1. Figure 13 shows the propagation of the peak in x direction for the RA-VDDActive simulation. The start position of the critical path is x = 170 µm. The signal in the critical path propagates from left to right, i.e. from lower to higher x-values. For better visualization only three rails are shown: the rail used by the critical path and the two neighboring ones. The highest voltage bounce occurs at the rail which is loaded by the critical path and by the medium paths in the same and in the shared rail. The second highest voltage peak occurs at the power rail which shares the V_SS rail and is loaded with medium paths. The third rail shown is not loaded at all; its voltage bounce is imposed by the loaded rails via the higher metal layers and the via stacks.
Figure 13 clearly shows the fast fading of the bounce towards the low ohmic via stacks. The asymmetry in the loading of the horizontally neighboring rails and of the peak bounce can be explained by the direction of signal propagation within the critical path, which is from left to right. Therefore, the point of highest stress appears in the right part of the power rail, and the grid is loaded asymmetrically.
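This qualitative behavior, a bounce that vanishes at the via stacks and peaks in the more heavily loaded part of the rail, can be reproduced with a strongly simplified DC model. The sketch below is not the power grid model of Sect. 4. It assumes that one rail is a chain of equal resistor segments whose two end nodes are ideal via stacks with zero drop, and it ignores capacitance, timing and neighboring rails. The segment resistance and the load currents are hypothetical values chosen for illustration.

```python
def rail_drop_mv(currents_ma, r_seg_ohm):
    """Static voltage drop (mV) at every node of a single rail.

    The rail is a resistor ladder; its first and last node are assumed to be
    ideal via stacks (drop = 0). `currents_ma` holds the load current drawn at
    each interior node. Kirchhoff's current law at interior node i gives
        2*d[i] - d[i-1] - d[i+1] = r_seg * I[i],
    a tridiagonal system that is solved with the Thomas algorithm.
    """
    n = len(currents_ma)
    if n == 0:
        return [0.0, 0.0]
    rhs = [r_seg_ohm * i for i in currents_ma]

    # Forward elimination (sub-diagonal = -1, diagonal = 2, super-diagonal = -1).
    c_prime = [0.0] * n
    d_prime = [0.0] * n
    c_prime[0] = -1.0 / 2.0
    d_prime[0] = rhs[0] / 2.0
    for i in range(1, n):
        denom = 2.0 + c_prime[i - 1]
        c_prime[i] = -1.0 / denom
        d_prime[i] = (rhs[i] + d_prime[i - 1]) / denom

    # Back substitution.
    drop = [0.0] * n
    drop[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        drop[i] = d_prime[i] - c_prime[i] * drop[i + 1]

    return [0.0] + drop + [0.0]  # add the two via-stack nodes


# Hypothetical example: 7 nodes, load only in the right part of the rail.
drops = rail_drop_mv([0.0, 0.0, 0.0, 1.0, 1.0], r_seg_ohm=1.0)
worst_node = drops.index(max(drops))
print([round(d, 3) for d in drops], "worst node:", worst_node)
```

In this toy example the drop is zero at both via stacks, rises linearly over the unloaded left part of the rail, and reaches its maximum inside the loaded right part, which mirrors the asymmetric peak position discussed for Fig. 13.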
For the same simulation, RA-VDDActive, the propagation of the maximum voltage drop in y direction is shown in Fig. 14. Again, the two loaded rails can be seen clearly. In contrast to the previous figure, the fading of the peak is much faster and more abrupt. This is due to the fact that in y-direction the rails are connected through the second metal layer, which is low ohmic compared to the lowest metal layer, and through the via stacks. The fading on the second metal layer corresponds to the fading on the lowest metal layer. The peak decreases approximately linearly from one via stack to the next one.
Fig. 13. Propagation of the peak voltage drop in dependence on the x-coordinate for the rail loaded by the critical path and its neighboring rails, for the RA-VDDActive simulation.
A 3-dimensional plot of the peak propagation is shown in Fig. 15 . For better illustration, a zoom-in at the via stacks at x = 150 µm and x = 300 µm is shown. The low ohmic contacts through the via stacks, as well as the fading away over the higher metal layers can be seen.
After the IR-Drop caused by two loaded rails has been examined, the IR-Drop for the CPallover simulation is now considered. In Fig. 16 the voltage drop in dependence on the x-direction is plotted for the three power rails at y = 200 µm, y = 205 µm, and y = 210 µm, where the rail at y = 200 µm is the first unloaded one and the rails at y = 205 µm and y = 210 µm are both loaded with critical paths. In contrast to Fig. 13, three peaks can be observed in Fig. 16 for each loaded rail. Additionally, the highest peak increases from 24 mV in Fig. 13 to 48 mV in Fig. 16, which is a factor of two. The peak value at the unloaded rail increases from 2.5 mV in Fig. 13 to 23 mV in Fig. 16, which is a factor of about 10. This reflects the heavier loading of the second metal layer, through which the unloaded rail is connected to the loaded ones.
Fig. 16. Propagation of the peak voltage drop in dependence on the x-coordinate for two loaded and one neighboring unloaded rail, for the CPallover simulation.
In Fig. 17 the propagation of the voltage drop in dependence on the y-direction is shown. The major difference to Fig. 14 is that not two, but 10 rails are loaded. As said before, the maximum voltage drop increases due to the heavier loading of the total power grid. The fading of the peak on the second layer is illustrated even better in this figure.
Fig. 17. Propagation of the peak voltage drop in dependence on the y-coordinate for 10 loaded and the neighboring unloaded rails, for the CPallover simulation.
Delay implications
Placing in the rail
The layout of the critical path has a length of 86 µm, while a rail in our setting has a length of 150 µm. To examine the impact of the placement of the critical path within one rail, all simulations were done for the starting positions x = 150 µm, x = 170 µm, x = 190 µm, and x = 210 µm in the rail ranging from x = 150 µm to x = 300 µm, as depicted in Fig. 18. Table 2 lists the path delays for the varying starting positions within one rail. The results are given for the falling edge, which is the slower one in the considered path. Two effects can be seen. Firstly, the starting position x = 190 µm is always the worst case. Secondly, the results differ by less than 1% between the best delay at x = 150 µm and the worst case delay. Therefore, the delay of the critical path is almost insensitive to the placement within the rail.
As already mentioned and expected, the path delay of the critical path increases with increased loading of the power grid. The simulation results for the path delay, as well as the relative change of the path delay, are displayed in Table 3 for the falling edge, which is the more critical one. The simulation without power grid serves as reference. As can be seen, the path delay increases by up to 4.1% for the CPallover simulation. In this simulation setup, the power grid is loaded with 60 critical paths, all placed horizontally in the power grid. However, if we compare the CPallover simulation with the OriChange16bit setup, we see that even though in the latter case the grid is only loaded with about a quarter of the gate count, the path delay degrades by up to 5.8%. This is due to the fact that in the OriChange16bit setup all gates within one power rail switch at the same time. Through the synchronous switching within one rail, the lowest metal layer is stressed more than in the CPallover case.
This higher stress of the lowest metal layer causes a higher IR-Drop, which in turn increases the delay degradation. For the OriChange48bit simulation, which again uses fewer critical paths than the CPallover simulation, the path delay degradation even increases to 11.1%. Even if this setup might seem unlikely to occur in real circuits, layouts like this can exist in data paths of semicustom designed circuits, where the placement of the cells is done automatically.
Design strategies
In the last section, the delay implications of IR-Drop were derived for different simulation setups. In this section, two design measures to reduce IR-Drop, and their implications on path delay and area overhead, are discussed. As noted earlier, there exist numerous possibilities to change the resistance of the power grid. Therefore, we examine two examples to show their different impact on path delay and area overhead. The first design measure discussed in this section is the increase of the line width of the lowest metal layer. Through the increased width of the power rail, its resistance decreases in inverse proportion to the width, which consequently results in a reduced IR-Drop.
A widening of the lowest power rails can be achieved in two ways. The first would be to increase the height of the standard cells. This has the advantage that the routability on the lowest level is not affected. However, the increase of line width results in a direct increase of the chip area. The other possibility would be to keep the standard cell height constant, but to decrease the routing capability in the lowest metal layer. The area impact of this method can hardly be estimated in general, since the area overhead depends on whether a reduction of routability in the lowest metal layer is acceptable and whether the routing can be done on higher metal layers. For the OriChange16bit simulation, a widening of the lowest power rail by about 17% reduces the maximum voltage drop by 15% to 51.4 mV. The path delay increase due to IR-Drop is reduced to 5.1%, compared to 5.8% with the narrower power rail.
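The order of magnitude of this reduction can be checked with a first-order estimate. The sketch below is a simplification and not the simulation used in this work: it assumes that the whole voltage drop scales with the resistance of the lowest rail, that this resistance is inversely proportional to the line width, and that the load current is unchanged. The function name is hypothetical; the 60.8 mV baseline and the widening of about 17% are the values given in the text.

```python
def scaled_drop_mv(drop_mv, width_increase):
    """First-order estimate: drop ~ rail resistance ~ 1 / line width."""
    return drop_mv / (1.0 + width_increase)


baseline_mv = 60.8      # OriChange16bit, original rail width
width_increase = 0.17   # about 17% wider lowest power rail

estimate_mv = scaled_drop_mv(baseline_mv, width_increase)
reduction_pct = 100.0 * (1.0 - estimate_mv / baseline_mv)
print(f"{estimate_mv:.1f} mV, {reduction_pct:.1f}% reduction")
# prints "52.0 mV, 14.5% reduction"
```

The simulated value of 51.4 mV lies close to this estimate. The remaining difference may stem from the rounded widening figure and from effects that the estimate ignores, such as the contribution of the higher metal layers and the dependence of the load current on the supply voltage.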
The second design measure to reduce IR-Drop is the doubling of the via count. Through this, the higher metal layers are better connected to the lower ones and the overall grid resistance is reduced.
The area impact of this design change is quite small, since on the lowest metal layer a via stack occurs only every 150 µm. Therefore, the routing capability on this layer is little affected. On the higher metal layers, things are even better, since routing usually is more relaxed on those layers and the pitches get even wider. The doubling of the via count results in a path delay increase of only 5.0% compared to 5.8% in the OriChange16bit simulation. The maximum voltage drop is reduced from 60.8 mV to 50.2 mV.
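To compare the two design measures side by side, the reported numbers can be put into relative terms. The following sketch only performs arithmetic on the values stated in the text for the OriChange16bit setup; the data structure and function names are hypothetical.

```python
# Values reported in the text for the OriChange16bit setup.
BASELINE = {"drop_mv": 60.8, "delay_increase_pct": 5.8}
MEASURES = {
    "wider lowest rail (about 17%)": {"drop_mv": 51.4, "delay_increase_pct": 5.1},
    "doubled via count": {"drop_mv": 50.2, "delay_increase_pct": 5.0},
}


def relative_drop_reduction_pct(baseline_mv, improved_mv):
    """Reduction of the maximum voltage drop relative to the baseline, in %."""
    return 100.0 * (baseline_mv - improved_mv) / baseline_mv


def summarize():
    """Return (name, drop reduction in %, delay gain in percentage points)."""
    rows = []
    for name, m in MEASURES.items():
        drop_red = relative_drop_reduction_pct(BASELINE["drop_mv"], m["drop_mv"])
        delay_gain = BASELINE["delay_increase_pct"] - m["delay_increase_pct"]
        rows.append((name, drop_red, delay_gain))
    return rows


for name, drop_red, delay_gain in summarize():
    print(f"{name}: drop -{drop_red:.1f}%, "
          f"delay penalty -{delay_gain:.1f} percentage points")
```

With these numbers, doubling the via count reduces the maximum voltage drop by roughly 17%, compared to roughly 15% for the wider rail, and recovers slightly more of the delay penalty.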
Therefore, better IR-Drop performance can be achieved with less area impact. This shows that for power grid optimization a detailed sensitivity analysis, which is beyond the scope of this work, has to be carried out for all design parameters to find the best trade-off between IR-Drop reduction and area impact.
Conclusions
In this paper, the effect of IR-Drop in on-chip power grids and its implications on the delay of a critical path have been examined. The simulations show a performance degradation of up to 4% for the regular orientation and up to 11% for the changed orientation of the critical paths in typical topologies. It should be noted that in a chip, effects of local IR-Drop are superposed, giving rise to higher worst-case voltage bounces. It was also shown that the layout of the circuit plays an important role, causing differences in IR-Drop of more than a factor of two in our settings. Finally, design countermeasures for reducing IR-Drop and a brief discussion of their area implications were presented. Altogether, the paper underlines the increasing importance of power supply noise in digital integrated circuit design.
