This paper presents a probabilistic approach to modeling power supply voltage fluctuations. Error probability calculations are shown for several digital circuits in a 90-nm technology. The analysis yields the timing violation error probability as a new design quality factor, in contrast to conventional techniques that assume an error-free circuit. The evaluation of the error bound can be useful for new design paradigms in which retry and self-recovering techniques are applied to the design of high performance processors. The method described here makes it possible to evaluate the performance of these techniques by calculating the expected error probability in terms of power supply distribution quality.
Introduction
Power supply voltage disturbances at the nodes of internal components in today's gigascale VLSI integrated circuits are a major bottleneck for the technological trend of such circuits and of CMOS technology in general. The decrease in supply voltage, together with the increase in power consumption in modern circuits, implies a large increase in the current levels feeding the integrated circuit. At the same time, the growth of both circuit complexity and speed is pushing the dI/dt factor toward higher levels. Despite the assignment of multiple power and ground pins, as well as technological efforts to reduce the resistive and inductive parasitic components in packages and power distribution networks, the resistive (IR-drop) and inductive (LdI/dt noise) disturbances that the supply current components generate in the package and the power distribution network are a main concern in integrated circuit design [7]. The resulting fluctuations of the power supply voltage of internal blocks cause power supply voltage noise and gate delay variations.
In general, a VLSI design objective is to keep voltage fluctuations within a given limit, usually taken to be between 5 and 10% [7], [3], [9], in order to bound their impact on performance. To achieve this goal, designers try to evaluate the packaging and integrated circuit power distribution parasitics together with the circuit energy demand. They then introduce a set of distributed internal and external decoupling capacitors to compensate for inductance, and they design an optimized power distribution grid to reduce resistance. However, for future complex circuits the evaluation of both parasitics and energy demand will become increasingly difficult because of, among other things, increasing fabrication tolerances in interconnections and devices for nanoscale circuits. The conventional design approach may then lead to a conservative design with an excessive and perhaps unaffordable cost of compensating components.
Previous work on the problem [8] identified voltage fluctuations as a key factor for high performance integrated circuits. Recently, the problem has received special attention because of the critical impact of power supply voltage noise on circuit performance [5]. This effect is expected to be even more relevant for the gigascale-integration multi-core processors of the coming years. In [10], Zheng and Tenhunen argue that the peak of the noise is the most relevant factor when investigating the impact on performance, while in [5], Saint-Laurent and Swaminathan argue that it is more general to consider the average supply voltage while a circuit is switching. In this paper a comprehensive approach is taken, based on modeling the voltage fluctuation as a random variable with an assumed distribution function. The mean value is determined by the average IR-drop and the variance is dominated by the fluctuation swing caused by LdI/dt noise. In complex gigascale circuits the unpredictable nature of the fluctuations is better handled by the approach presented here and, as will be shown, this approach allows the probability of potential transient faults caused by this unpredictable voltage noise to be determined. Previous papers [2], [1] investigated the effect of power supply voltage disturbances on the appearance of transient errors, modeled as delay faults. This paper gives the calculation of the error probability caused by such faults. With this approach, designers can make decisions about compensating techniques for a desired error bound. The evaluation of the error bound can be useful for new design paradigms where retry and self-recovering techniques are applied to the design of high performance processors [6], [4].
The structure of the paper is as follows. Section 2 discusses the characteristics of power supply voltage fluctuations. Section 3 investigates the impact of voltage fluctuation on circuit path delays, including an analysis of the impact on latch timing requirements. Section 4 presents the delay distribution for a generic circuit, assuming power supply voltage fluctuations as the only cause of random variation. Section 5 calculates the probability of error through a timing violation analysis, giving results on the impact of the mean and variance of the noise as well as of the circuit design parameters. Finally, Section 6 presents the main conclusions of the paper.
Power Supply Noise in Gigascale Circuits
This section briefly reviews the main characteristics of power supply voltage noise, based on the elements that cause it. Figure 1 shows a simplified schematic of the power distribution network of a VLSI circuit. There are RLC parasitics associated with the package and the on-chip power distribution lines. Essentially, the power supply voltage noise can be regarded as a forced voltage oscillation of the RLC components, where the excitation is the current consumption of the circuit. Therefore, the importance of power supply voltage noise depends on the current waveform and on its frequency spectrum relative to the resonance frequencies of the power distribution network.
The current consumption can be conceptually divided into the sum of two components: one high frequency waveform representing individual gates switching with times of the order of ps for modern and future technology nodes, and one piecewise linear current waveform that represents different functional blocks of the circuit turning on or off (with times of the order of 10 ns).
The decoupling capacitance (on-chip and on-package) reduces power supply voltage noise by lowering the resonance frequency of the power distribution network. Therefore, the high frequency current excitation makes only a small contribution to the noise, and the block current consumption generates voltage fluctuations that are slow compared to the clock period each time there is a change in current consumption (blocks turning on or off).
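The effect of decoupling capacitance on the resonance frequency can be illustrated with a minimal sketch. It assumes an ideal lossless LC loop (resistance ignored), and the inductance and capacitance values are hypothetical, chosen only for illustration; they are not taken from the paper.

```python
import math

def resonance_frequency_hz(inductance_h, capacitance_f):
    """Resonance frequency of an ideal LC loop: f = 1 / (2*pi*sqrt(L*C))."""
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance_h * capacitance_f))

# Hypothetical values for illustration only (not from the paper).
L_PKG = 0.1e-9     # 0.1 nH effective package inductance
C_SMALL = 10e-9    # 10 nF of decoupling capacitance
C_LARGE = 100e-9   # 100 nF after adding more decoupling capacitance

f_small = resonance_frequency_hz(L_PKG, C_SMALL)
f_large = resonance_frequency_hz(L_PKG, C_LARGE)

# A tenfold increase in capacitance lowers the resonance by sqrt(10).
print(f"{f_small / 1e6:.1f} MHz -> {f_large / 1e6:.1f} MHz")
```

In this idealized model the resonance frequency scales as the inverse square root of the capacitance, so added decoupling capacitance moves the resonance further below the gate switching frequencies.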
From the point of view of a digital synchronous circuit, power supply voltage noise implies that the value of V DD differs from one clock period to the next. Thus, it is possible to consider the value of V DD as a random variable with a probability density function that can be obtained experimentally from a histogram of the power supply voltage noise waveform with the circuit operating for a long time.
A gigascale circuit with a large number of complex blocks and modes of operation will have a histogram of V DD that can be reasonably approximated by a Gaussian distribution. In this work, we therefore assume that V DD is a random variable with a Gaussian distribution. In this distribution, the mean µ is essentially determined by the IR-drop (mean current over time, times the equivalent resistance of the power distribution network), and the standard deviation σ is mostly determined by the LdI/dt noise.
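The Gaussian model of V DD can be sketched as follows. This is only an illustration of the assumption stated above: one V DD value is drawn per clock period, the mean is the nominal voltage minus an IR-drop, and the histogram of the samples recovers the distribution. The mean current, equivalent resistance and standard deviation are hypothetical values, and the function names are introduced here for illustration.

```python
import random
import statistics

V_NOMINAL = 1.0    # V, nominal supply used in the text
I_MEAN = 10.0      # A, hypothetical mean current
R_EQ = 0.005       # ohm, hypothetical equivalent grid resistance
SIGMA_VDD = 0.02   # V, hypothetical spread attributed to L dI/dt noise

def sample_vdd(n_periods, seed=0):
    """Draw one V_DD value per clock period from the assumed Gaussian."""
    rng = random.Random(seed)
    mu = V_NOMINAL - I_MEAN * R_EQ   # mean lowered by the IR-drop
    return [rng.gauss(mu, SIGMA_VDD) for _ in range(n_periods)]

def histogram_density(samples, lo, hi, n_bins):
    """Normalized histogram (an empirical pdf estimate) over [lo, hi)."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in samples:
        if lo <= v < hi:
            counts[min(int((v - lo) / width), n_bins - 1)] += 1
    return [c / (len(samples) * width) for c in counts]

samples = sample_vdd(100_000)
mu_hat = statistics.fmean(samples)       # estimate of the mean
sigma_hat = statistics.stdev(samples)    # estimate of the standard deviation
density = histogram_density(samples, 0.85, 1.05, 40)
```

In a real design flow the samples would come from a measured or simulated supply waveform rather than from a random number generator.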
Impact of Power Supply Noise on Path Delays
The main effect of power supply voltage noise in a synchronous digital circuit is to cause a timing violation in a latch during a clock period in which V DD is below its nominal value. In this situation, the delay of the combinational path between latches increases. The latch timing characteristics are also affected by the value of V DD . In the following sections, both gate delay and latch characteristics are investigated with respect to V DD for a 90-nm technology.
Impact on Gate Delay
Because V DD varies slowly with respect to the clock period, we assume that all the gates in a combinational path see the same value of V DD , which varies from period to period. In order to obtain the distribution of gate delay due to power supply voltage noise, it is necessary to find the dependence of gate delay on V DD . This dependence can be obtained analytically only for very simple MOS models that are not accurate enough for nanometric devices. Therefore, we obtained this dependence from HSPICE simulations for a 90-nm technology with a 1 V nominal voltage.
We simulated a 3-stage ring oscillator structure to avoid the dependence of the delay on the input voltage waveform and to obtain more accurate results. Several gates are considered for the ring: NOT, NAND, NOR, and XOR. Each ring is simulated several times, each time with a different value of V DD , ranging from 0.65 V to 1.2 V. This is a wide range considering that the predicted percentage power supply voltage variation for this technology is 10% [3]. Figure 2 shows the delay of each gate type, obtained from the delay of one period of the ring oscillator output (measured at the 50% V DD crossing).
Impact on Latch Setup Time
Combinational paths like the ones studied above are not the only elements affected by the power supply voltage noise generated by the rest of the circuit. A latch exhibits meta-stability, and this disturbance may widen the zone of uncertainty of the latched value. Depending on the V DD value at the instant a clock transition takes place, the decision made by the latch will be different. There is therefore a certain probability of error at the output.
In our work, we have determined the critical time window in which the impact of power supply voltage noise on the error probability of a latch is maximized, which is around the critical setup time. It is important to point out that this time is not the same as the conventional setup time given by circuit manufacturers. Considering a noise-free environment, we define the critical setup time as the time that a value has to be present at the input of the latch before the clock transition in order to ensure that the value is latched correctly. It is the frontier between a correct output value and a latch error due to meta-stability. Around this threshold time there is a certain time range where the error depends strongly on noise. Manufacturers add a security margin when defining the setup time to almost completely avoid errors in the presence of noise. In order to evaluate the impact of power supply voltage noise on latch meta-stability, we simulated the latch in HSPICE, considering a master-slave scheme of the conventional latch shown in Figure 3 with a nominal supply voltage of 1 V. Around this value we applied the same range of V DD values as for the combinational gates (0.65 V to 1.2 V).
The results for the simulations are shown in Table 1 and plotted in Figure 2 where we can see that the dependence between power supply voltage and critical setup time is significant.
Timing Probability Distribution
The results of the previous section give a function relating the power supply voltage either to gate delay or to the timing characteristics of a latch. In order to obtain the desired error probability due to timing errors, it is necessary to obtain the probability density function (pdf) of the timing characteristics: gate delay and setup time of the components in the circuit.
As Section 3 shows, the relation between voltage and timing characteristics is nonlinear. Therefore, the pdf of the timing characteristics will not be Gaussian like the pdf of the power supply voltage, but must be derived from it either analytically or numerically. In our study, the voltage-timing dependence is obtained numerically, and so is the timing pdf.
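First, the probability that the delay of the combinational path lies between two arbitrary limits t 1 and t 2 is equal to the probability that the circuit operates between the corresponding voltages V 1 and V 2 :
$$\int_{t_1}^{t_2} p_t(t)\,dt = \int_{V_2}^{V_1} p_v(V)\,dV \qquad (1)$$
where p t (t) is the timing pdf, p v (V ) is the voltage pdf, t 1 < t < t 2 , V 1 > V > V 2 , and V 1 and V 2 are the voltages corresponding to t 1 and t 2 respectively. The timing pdf can be numerically approximated by the probability of finding a time between two close limits
$$\int_{t_1}^{t_2} p_t(t)\,dt \approx p_t(t)\,(t_2 - t_1) \qquad (2)$$
and
$$\int_{V_2}^{V_1} p_v(V)\,dV \approx p_v(V)\,(V_1 - V_2) \qquad (3)$$
so that the numerical version of expression (1) is obtained:
$$p_t(t) \approx p_v(V)\,\frac{V_1 - V_2}{t_2 - t_1} \qquad (4)$$
Under our assumption, p v (V ) is Gaussian, and so a simple Matlab script is used to numerically calculate the timing pdfs of the different gates considered in this paper.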
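The numerical change of variables from the voltage pdf to the timing pdf can be sketched in Python (the paper itself uses a Matlab script, which is not reproduced here). The delay curve `gate_delay_ps` is a placeholder with an alpha-power-law shape and hypothetical parameters; it stands in for the HSPICE-derived curves, which the text states cannot be captured accurately by simple analytical models. All function and parameter names are assumptions made for this illustration.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Gaussian probability density with mean mu and std deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def gate_delay_ps(vdd, k_ps=10.0, vth=0.3, alpha=1.3):
    """Placeholder delay curve (hypothetical); decreases as V_DD rises."""
    return k_ps * vdd / (vdd - vth) ** alpha

def timing_pdf(mu, sigma, v_min=0.65, v_max=1.2, n_steps=2000):
    """Timing pdf on small bins: p_t ~ p_v(V) * (V1 - V2) / (t2 - t1)."""
    dv = (v_max - v_min) / n_steps
    times, widths, densities = [], [], []
    for i in range(n_steps):
        v1 = v_max - i * dv            # higher voltage, shorter delay t1
        v2 = v1 - dv                   # lower voltage, longer delay t2
        t1, t2 = gate_delay_ps(v1), gate_delay_ps(v2)
        p_v = gaussian_pdf(0.5 * (v1 + v2), mu, sigma)
        times.append(0.5 * (t1 + t2))
        widths.append(t2 - t1)
        densities.append(p_v * (v1 - v2) / (t2 - t1))
    return times, widths, densities

# Mean of 1 V with a noise amplitude (3 sigma) of 10% of the mean.
times, widths, densities = timing_pdf(mu=1.0, sigma=0.1 / 3.0)
total_probability = sum(p * w for p, w in zip(densities, widths))
```

Because the bins in time are derived from uniform bins in voltage, the time bins are not uniform; the returned widths are therefore needed when the timing pdf is integrated.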
Error Probability Computation
Once the timing pdf is calculated, the error probability of a timing violation can be calculated by numerical integration. Let us consider a typical pipeline stage, with a combinational delay chain and a latch, as shown in Figure 5. In this structure, a timing error occurs when the combinational delay (t G ) plus the setup time of the latch (t S ) exceeds the clock period (T clk ). Under our approach, both t G and t S are random variables with a timing distribution derived as explained in Section 4. As a first approach, we do not consider clock skew or clock period variations.
Formulation
In general, calculating the error probability for a pipeline stage is a complex task as it is necessary to calculate the delay probability for each gate and then calculate the chain delay probability. However, it is possible to simplify this process if we assume two properties of these chains:
First, that the normalized timing pdfs are equivalent. It is expected to find a different pdf for each type of gate or latch. However, if we normalize the time axis by the mean time of the corresponding distribution for each gate, all the pdfs for the different gates are nearly identical for a given V DD gaussian distribution. Figure 4 shows the normalized pdf for NOT, NOR, NAND and XOR gates and also for the latch critical setup time. The pdfs are very similar and may reasonably be considered identical for the calculation of the error probability.
The second assumption is that the switching time of the gates is small compared to the period of the ground bounce noise. This allows all the gates in a chain to be considered as affected by the same noise value. Several simulations show that the error introduced by considering all the gates in the chain and the final latch to be affected by the noise average during the propagation delay is very small. Considering both facts, the condition for having a delay error simplifies to
$$\delta\,(t_G + t_S) > T_{clk} \qquad (5)$$
where δ stands for the normalized timing random variable with a pdf p(δ) derived from the Gaussian distribution of V DD as explained above. Parameters t G and t S are the mean values of the total gate delay and the latch setup time respectively. All these parameters depend on the power supply voltage distribution. As discussed in Section 2, we consider Gaussian power supply voltage noise distributions with a given mean (µ vdd ) and standard deviation (σ vdd ). With these considerations, the error probability P r(E) for a pipeline stage reads
$$Pr(E) = \int_{T_{clk}/(t_G + t_S)}^{\infty} p(\delta)\,d\delta \qquad (6)$$
In summary, given a Gaussian distribution of V DD with mean (µ vdd ) and standard deviation (σ vdd ), the procedure to calculate the error probability consists of four steps:
1. Obtain the absolute delay distributions of the pipeline components from the V DD gaussian distribution.
2. Obtain t G and t S from the mean of the delay distributions.
3. Generate the δ distribution by normalizing each delay distribution with respect to its mean.
4. Integrate the δ distribution taking as the lower integration limit T clk /(t G + t S ).
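The four steps above can be sketched in Python under loudly stated assumptions. The stage delay curve is a hypothetical placeholder (alpha-power-law shape with made-up parameters) scaled so that the delay at 1 V equals the 306 ps of the illustrative example; it is not the HSPICE data of the paper, so the numbers it produces should not be compared with the published figures. Following the two simplifying assumptions, a single normalized curve is used for gates and latch, and all elements see the same V DD .

```python
import math

VTH, ALPHA = 0.3, 1.3     # hypothetical placeholder-curve parameters
T_NOMINAL_PS = 306.0      # stage delay at 1 V (gates plus latch setup)
T_CLK_PS = 340.0          # clock period of the illustrative example

def gaussian_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def delay_shape(vdd):
    """Placeholder voltage dependence (hypothetical, decreasing in V_DD)."""
    return vdd / (vdd - VTH) ** ALPHA

def stage_delay_ps(vdd):
    """Total stage delay, scaled to T_NOMINAL_PS at V_DD = 1 V."""
    return T_NOMINAL_PS * delay_shape(vdd) / delay_shape(1.0)

def error_probability(mu, sigma, t_clk_ps=T_CLK_PS, n_steps=4000):
    # Discretize V_DD; the lower bound keeps the placeholder curve valid.
    v_lo = max(mu - 8.0 * sigma, VTH + 0.05)
    v_hi = mu + 8.0 * sigma
    dv = (v_hi - v_lo) / n_steps
    volts = [v_lo + (i + 0.5) * dv for i in range(n_steps)]
    weights = [gaussian_pdf(v, mu, sigma) * dv for v in volts]
    total = sum(weights)
    weights = [w / total for w in weights]
    # Step 1: absolute delay distribution (delay value per voltage bin).
    delays = [stage_delay_ps(v) for v in volts]
    # Step 2: mean delay, playing the role of t_G + t_S.
    mean_delay = sum(w * d for w, d in zip(weights, delays))
    # Step 3: normalized timing variable delta.
    deltas = [d / mean_delay for d in delays]
    # Step 4: integrate p(delta) above T_clk / (t_G + t_S).
    threshold = t_clk_ps / mean_delay
    return sum((w for w, d in zip(weights, deltas) if d > threshold), 0.0)

p_nominal = error_probability(mu=1.0, sigma=0.1 / 3.0)
p_low_mean = error_probability(mu=0.9, sigma=0.1 / 3.0)
```

With a single delay curve the normalization in step 3 cancels out of the comparison in step 4; it is kept here only to mirror the procedure, where it allows one normalized pdf to be shared by different gate types.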
Illustrative Example
By numerically integrating eq. (6) it is possible to calculate the error probability for a given power supply voltage pdf. In our illustrative example we consider a delay path composed of five gates (NOR, NAND, XOR and two NOTs). Including the latch setup time, the total delay under nominal conditions (V DD of 1 V) is 306 ps (3.27 GHz). Figure 6 depicts the timing error probability for this stage under different supply voltage conditions for a clock period of 340 ps (2.94 GHz), which gives a security margin of 10% of the nominal maximum frequency. We consider different mean values for the supply voltage (which account for the IR drop and the control actions of the supply system). Instead of the standard deviation, we use the normalized parameter 3σ vdd /µ vdd , which represents the variation relative to the mean value. The plot shows the error curves for mean V DD values ranging from 1.1 to 0.9 V with noise amplitudes (3σ vdd ) from 0 to 30% of the mean. To ensure correct behavior it is necessary to keep the noise amplitude below a certain value that is strongly affected by the mean V DD value. For example, under nominal conditions (mean V DD = 1 V), the noise amplitude must be less than 0.1 V for an error probability less than 0.22%. In contrast, if the mean value is 0.9 V, the allowable noise amplitude for a similar error probability (0.25%) must be below 0.009 V, while keeping the previous noise amplitude budget of 0.1 V increases the error probability to as much as 39%. The solution in this latter case would be to increase the clock period margin by decreasing the clock frequency, and therefore the circuit performance.
It is also possible to calculate the time security margin (T clk /(t G + t S )) required for given supply voltage noise characteristics. Figure 7 shows the error probability for these chains for a mean V DD value of 0.95 V and different noise amplitudes (from 1 to 30%). To ensure a stage error probability smaller than 0.20% with a 10% noise amplitude, it is necessary to provide a time security margin larger than 22%.
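The arithmetic behind the figures quoted in this example can be checked with a short sketch. The delay, clock period and noise amplitudes are the values given in the text; the helper function names are introduced here for illustration only.

```python
def frequency_ghz(period_ps):
    """Clock frequency in GHz for a period in ps (1 / 1 ps = 1000 GHz)."""
    return 1000.0 / period_ps

def frequency_margin(nominal_delay_ps, clock_period_ps):
    """Fraction by which the clock runs below the nominal maximum frequency."""
    return 1.0 - nominal_delay_ps / clock_period_ps

def normalized_amplitude(amplitude_v, mean_vdd):
    """Noise amplitude (3 sigma) relative to the mean supply voltage."""
    return amplitude_v / mean_vdd

f_max = frequency_ghz(306.0)             # about 3.27 GHz
f_clk = frequency_ghz(340.0)             # about 2.94 GHz
margin = frequency_margin(306.0, 340.0)  # 10% of the maximum frequency

sigma_nominal = 0.1 / 3.0                        # V, for a 0.1 V amplitude
ratio_nominal = normalized_amplitude(0.1, 1.0)   # 10% at a 1 V mean
ratio_low = normalized_amplitude(0.1, 0.9)       # about 11.1% at a 0.9 V mean
ratio_tight = normalized_amplitude(0.009, 0.9)   # 1% at a 0.9 V mean
```

Expressed in terms of the clock period rather than the frequency, the same margin corresponds to a period about 11% longer than the nominal stage delay (340/306).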
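The inverse calculation, from a target error probability to the required time security margin, can be sketched as follows. Because delay decreases monotonically with V DD , the stage fails exactly when V DD falls below a threshold voltage, so the margin is the delay at the voltage quantile matching the target probability, divided by the mean delay. The delay shape is again a hypothetical placeholder, not the HSPICE data, so the resulting margin will differ from the value read from Figure 7.

```python
from statistics import NormalDist

VTH, ALPHA = 0.3, 1.3   # hypothetical placeholder-curve parameters

def delay_shape(vdd):
    """Placeholder voltage dependence of delay (decreasing in V_DD)."""
    return vdd / (vdd - VTH) ** ALPHA

def mean_delay_shape(mu, sigma, n_steps=4000):
    """Mean of the delay shape under a Gaussian V_DD (midpoint rule)."""
    dist = NormalDist(mu, sigma)
    v_lo = max(mu - 8.0 * sigma, VTH + 0.05)
    v_hi = mu + 8.0 * sigma
    dv = (v_hi - v_lo) / n_steps
    num = den = 0.0
    for i in range(n_steps):
        v = v_lo + (i + 0.5) * dv
        w = dist.pdf(v) * dv
        num += w * delay_shape(v)
        den += w
    return num / den

def required_margin(mu, sigma, target_error):
    """Smallest T_clk / (t_G + t_S) giving Pr(E) <= target_error."""
    # Voltage below which V_DD falls with probability target_error.
    v_star = NormalDist(mu, sigma).inv_cdf(target_error)
    return delay_shape(v_star) / mean_delay_shape(mu, sigma)

# Mean of 0.95 V, 10% noise amplitude (3 sigma), 0.20% target error.
margin = required_margin(mu=0.95, sigma=0.95 * 0.10 / 3.0, target_error=0.002)
```

A margin of 1.0 means the clock period equals the mean stage delay; the excess over 1.0 is the time security margin as a fraction of that mean delay.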
Conclusions
This paper presented the calculations needed to evaluate the probability of a timing error in a digital synchronous circuit stage due to power supply voltage fluctuations. By considering the power supply voltage as a Gaussian random variable for each clock period, it was found that there is a nonzero probability of timing error, which becomes larger with decreasing power supply voltage quality (lower mean and larger standard deviation).
It was found in the examples shown that a decrease in the mean voltage requires an even larger reduction, or tighter control, of the standard deviation to maintain performance. This shows the connection between the IR-drop phenomenon and its countermeasures on the one hand and the LdI/dt noise on the other, and how designers should consider both in a comprehensive way: a great effort in reducing IR-drop is compensated by a slightly reduced effort in controlling LdI/dt noise.
The method described in this paper makes it possible to quantify the power supply characteristics required for a given allowed error probability. Traditionally, there are two ways to achieve the error probability target: one, to improve the power supply voltage quality (increasing cost) and two, to increase the clock period for a better safety margin (decreasing performance).
Several novel design techniques based on retry and self-recovery from errors try to avoid this rough cost-performance trade-off and therefore allow a lower cost power distribution network design at an improved performance. These techniques tolerate a given error probability at the cost of recalculating the data. The method described here gives a way to evaluate the performance of these techniques by calculating the expected error probability in terms of power supply distribution quality.
Therefore, the probabilistic approach considers the digital design problem from a more global point of view. It can also be applied to the problem of manufacturing process variations, which become increasingly important in future technologies and were not considered in this paper. Also, the clock period was considered deterministic in this paper. A further elaboration would consist of also treating it as a random variable, statistically independent of the delay variables.
