Abstract
Introduction
In deep submicron (DSM) technology, power supply analysis has become increasingly important in predicting the realistic worst-case delays in integrated circuits [1]. Fluctuations of 10% in power/ground supply voltage can cause the delay of standard gates to vary by up to 30% in 130 nm technology [2] [3]. Newer technologies have increased delay sensitivity to supply noise, due to reduced gate overdrive. In addition to reducing circuit performance, supply noise can cause functional failure. Therefore, the analysis of power supply noise has become an essential part of timing analysis.
Generally, the supply voltage noise is due to both the parasitic resistance (IR) and inductance ($L \cdot di/dt$) of on-chip and package interconnect. The on-chip power grid is predominantly resistive, with its noise produced by IR drop. Package interconnect has a higher inductance, so its noise is generated primarily by $L \cdot di/dt$ effects. At faster gate switching speeds and higher circuit density, on-chip inductance must be taken into account [3].
Power supply voltage analysis has been addressed through vector-based and vectorless approaches. Many vector-based approaches [1] [4] [5] [6] [7] use genetic or other algorithms to find a set of input vectors which cause the maximum voltage drop on certain targeted regions, whereas the vectorless approaches [2] [8] [9] use circuit timing and functional information, a superposition method and supply current constraints. Previously, supply noise analysis has often used logic simulation. However, in future DSM technology, the vectorless technique is expected to be favored due to the cost of simulation.
We propose two novel vectorless approaches to incorporating supply voltage noise analysis into static timing analysis (STA). These approaches use a set of vectors produced by input pattern generation methods to statistically estimate the realistic power supply noise. We use a supply noise modeling approach developed for delay test generation [4]. Because supply voltage noise is correlated between circuit blocks, we adopt the Principal Component Analysis (PCA) technique [11]. The PCA technique not only identifies a small set of uncorrelated parameters that explain most of the noise for the circuit blocks, but also transforms the set of correlated parameters into a set of uncorrelated parameters. Once we obtain the uncorrelated supply voltage variation across the chip, we can determine the delay distributions corresponding to the supply voltage distributions for all individual gates on the chip by using a linear delay model [10] and the sensitivity of delay to supply voltage. Then, we can perform statistical static timing analysis by propagating delay distributions along the longest paths on the chip, and finding the maximum of these distributions. We avoid the abbreviation SSTA, since it is usually associated with statistical process variations, rather than statistical supply voltage variations.
This paper is organized as follows. In Section 2, we introduce a static technique to estimate the supply voltage noise distribution across all regions on the chip. We then perform statistical static timing analysis with the supply voltage noise distribution in Section 3. Section 4 shows experimental results of our approach on ISCAS89 benchmark circuits. Finally, a conclusion is provided in Section 5.
Estimation of Voltage Noise Distributions
A fast, accurate estimation of the power supply noise distribution is essential for accurate timing analysis. In this section, we present a vectorless approach to estimate the power supply noise distribution based on both the simplified power region model and the circuit switching model proposed in [4] .
Power Region and Switching Models
Because RLC network analysis is expensive, a simplified power region has been proposed. The maximum voltage drop $\Delta V_{max}$ in a region during a clock cycle can be estimated with several approximations described in [4]:
where $C_d$ and $C_p$ are respectively the single lumped decoupling capacitance and the total parasitic capacitances of devices and interconnect connected to the power supply network in a region. The switching current, denoted as
We also employ the circuit switching model, which is similar to the approximated models proposed in [3] [10], to estimate the switching current. The switching current consists of leakage current and charging/discharging current. We do not discuss leakage current further, since we treat it as constant, and analyze its voltage impact with a one-time IR drop analysis. As shown in Figure 1, switching current is approximated by a piecewise linear current waveform, which is triangular for small load capacitances and trapezoidal for large load capacitances.
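The piecewise linear waveform described above can be sketched as follows. This is only an illustration: the function names and the parameters `i_peak`, `t_rise`, `t_flat`, and `t_fall` are assumptions introduced here, not values or symbols taken from [4]. A zero-length flat segment gives the triangular case, and a nonzero flat segment gives the trapezoidal case.

```python
def switching_current(t, i_peak, t_rise, t_flat, t_fall):
    """Piecewise linear current at time t (hypothetical parameterization).

    All durations use the same time unit as t.
    t_flat == 0 gives a triangle; t_flat > 0 gives a trapezoid.
    """
    if t < 0.0:
        return 0.0
    if t < t_rise:                       # rising edge
        return i_peak * t / t_rise
    if t < t_rise + t_flat:              # flat top (trapezoid only)
        return i_peak
    t_end = t_rise + t_flat + t_fall
    if t < t_end:                        # falling edge
        return i_peak * (t_end - t) / t_fall
    return 0.0


def switched_charge(i_peak, t_rise, t_flat, t_fall):
    """Area under the waveform, i.e. the charge drawn from the supply."""
    return i_peak * (0.5 * t_rise + t_flat + 0.5 * t_fall)
```

The area under the waveform is the charge drawn during the switching event, which is why a larger load capacitance (more charge at the same peak current) stretches the triangle into a trapezoid.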
Statistical Model for Supply Voltage Noise
The chip is divided into a rectangular grid, with gates assigned to the regions where they connect to the power grid. We run zero-delay logic simulation on the circuit with three different types of input patterns: MC (Monte Carlo approach), AAC (analytical approach with care bits), and AAR (analytical approach with random bits). We first generate test patterns propagating transitions along the top 200 longest paths in the circuit using the CodGen ATPG tool [14] . In the MC approach, all "don't care" bits in the test patterns are randomly filled. In the AAR approach, for the purposes of noise analysis, all input bits are random, including the care bits. In the AAC approach, a set of care bits are selected as follows: we choose the path, among the 200 longest paths, which had the highest probability of being the critical path among 1000 MC samples. The "don't care" bits are randomly filled.
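The difference between keeping and randomizing care bits can be illustrated with a small sketch. The pattern string format (`'0'`, `'1'`, and `'X'` for "don't care") and the helper name `fill_pattern` are assumptions made for illustration; they are not the CodGen output format.

```python
import random


def fill_pattern(pattern, rng, keep_care_bits=True):
    """Return a fully specified bit string from a pattern over '0', '1', 'X'.

    keep_care_bits=True  : only 'X' positions are randomized (MC / AAC style).
    keep_care_bits=False : every position is randomized (AAR style).
    """
    bits = []
    for ch in pattern:
        if ch in "01" and keep_care_bits:
            bits.append(ch)                  # preserve the care bit
        else:
            bits.append(rng.choice("01"))    # random fill
    return "".join(bits)


rng = random.Random(0)
atpg_pattern = "1XX0X1XX"   # hypothetical ATPG pattern with three care bits
mc_like = [fill_pattern(atpg_pattern, rng) for _ in range(4)]
aar_like = [fill_pattern(atpg_pattern, rng, keep_care_bits=False) for _ in range(4)]
```

In this sketch the MC and AAC fills differ only in which ATPG pattern supplies the care bits, while the AAR fill ignores the care bits entirely.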
The input patterns for MC, AAR, and AAC are simulated to obtain statistical parameters of supply voltage noise distributions for each region. We assume that the random variables for each region are Normal. Because of correlations in voltage noise distributions between regions, we employ the PCA technique. The PCA method transforms a set of correlated random variables $\vec{X}$ with a covariance matrix $M$ into a set of uncorrelated random variables $\vec{X}'$, such that any random variable $x_i \in \vec{X}$ can be expressed as a linear function of the principal components with zero mean and unit variance in $\vec{X}'$:
where $\mu_i$ and $\sigma_i$ are the mean and the standard deviation of $x_i$.
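The decorrelation step can be sketched with NumPy as below. This is a minimal illustration of standard PCA on the correlation matrix, assuming every region has nonzero variance; the function name `pca_decompose` and the array layout are assumptions, and the exact formulation in [11] may differ in detail.

```python
import numpy as np


def pca_decompose(samples):
    """samples: (n_patterns, n_regions) array of per-region voltage drops.

    Returns (mu, sigma, coeffs) such that region i is modeled as
        x_i = mu[i] + sigma[i] * sum_j coeffs[i, j] * p_j
    where the p_j are uncorrelated, zero-mean, unit-variance components.
    """
    mu = samples.mean(axis=0)
    sigma = samples.std(axis=0, ddof=1)
    corr = np.corrcoef(samples, rowvar=False)   # correlation matrix of regions
    eigvals, eigvecs = np.linalg.eigh(corr)     # symmetric eigen-decomposition
    eigvals = np.clip(eigvals, 0.0, None)       # guard tiny negative round-off
    coeffs = eigvecs * np.sqrt(eigvals)         # column j scaled by sqrt(lambda_j)
    return mu, sigma, coeffs
```

Scaling the coefficient rows by `sigma` and multiplying the result by its own transpose reproduces the sample covariance matrix, which is a convenient sanity check. Components with small eigenvalues can be dropped to keep only the few that explain most of the noise.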
STA with Power Supply Noise Variation
With the statistical parameters from the fast power supply noise analysis, we can statistically evaluate the performance of the circuit. Here, we consider temporal and spatial supply voltage noise variation.
Temporal and Spatial Voltage Variation
Because power supply noise and logic gate switching times are both uncertain, it is very difficult to determine the supply voltage at the time a logic gate switches. We adopt the approximation proposed by Wang [4] , using the average of the initial and worst-case supply voltages during the clock cycle.
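As a small worked illustration of this averaging, the sketch below assumes that the initial voltage is the nominal supply and that the worst-case voltage is the nominal supply minus the maximum drop in the region. Both the function name and this reading of "initial" are assumptions.

```python
def effective_supply(v_nominal, dv_max):
    """Average of the initial (assumed nominal) and worst-case supply voltage."""
    v_worst = v_nominal - dv_max
    return 0.5 * (v_nominal + v_worst)
```

For example, under these assumptions a 1.8 V supply with a 0.1 V maximum drop would be treated as an effective supply of about 1.75 V for the whole cycle.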
In addition to temporal variation, supply voltage has spatial variation. If driver and receiver gates are far enough apart, they can have different supply voltages. This can significantly affect the gate delay because the charging/discharging current heavily depends on the input supply voltage. Hashimoto [12] proposed PG level equalization: after equalizing input supply voltage and gate supply voltage, the output load capacitance is increased or decreased by the same ratio. However, we found that Hashimoto's method does not work well over our range of output loads and input slopes. We obtained more accurate results by equalizing the input and gate supply voltage without changing the output load capacitance.
[Figure 1: switching current waveform. Axes: Time (ns), Current (A); peak current I_peak.]
Gate Delay Model and Path Computation
We employ the gate delay model proposed in [12] 
where $V_{\mu}$ and $V_{\mu+\sigma}$ are the mean and (mean + standard deviation) of a random variable $V$, respectively, and $\delta$ is the sensitivity of delay versus voltage. Once individual gate delay random variables are computed, the longest path computation as well as circuit performance analysis with the longest paths can be easily done using the sum and max functions of PCA properties described in [11].
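The sum and max operations can be sketched as follows, assuming each delay is written as a mean plus a linear combination of uncorrelated unit-variance principal components. The first-order form `d = d_nominal + delta * (V - V_mu)` and all function names are assumptions for illustration. The max uses Clark's moment formulas for two jointly Normal variables, a standard technique that may not match the exact procedure in [11].

```python
import math
import numpy as np


def gate_delay(d_nominal, delta, sigma_v, pc_coeffs):
    """(mean, PC coefficients) of d = d_nominal + delta * (V - V_mu)."""
    return d_nominal, delta * sigma_v * np.asarray(pc_coeffs, dtype=float)


def path_sum(gates):
    """Sum of gate delays along a path: means add, PC coefficients add."""
    mean = sum(g[0] for g in gates)
    coeffs = sum(g[1] for g in gates)
    return mean, coeffs


def clark_max(a, b):
    """Mean and std of max(A, B) for jointly Normal A, B (Clark's moments)."""
    (mu_a, ca), (mu_b, cb) = a, b
    var_a, var_b, cov = ca @ ca, cb @ cb, ca @ cb
    theta = math.sqrt(max(var_a + var_b - 2.0 * cov, 0.0))
    if theta == 0.0:   # identical variation: the larger mean always wins
        return (mu_a, math.sqrt(var_a)) if mu_a >= mu_b else (mu_b, math.sqrt(var_b))
    alpha = (mu_a - mu_b) / theta
    pdf = math.exp(-0.5 * alpha * alpha) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(alpha / math.sqrt(2.0)))
    m1 = mu_a * cdf + mu_b * (1.0 - cdf) + theta * pdf
    m2 = ((mu_a ** 2 + var_a) * cdf + (mu_b ** 2 + var_b) * (1.0 - cdf)
          + (mu_a + mu_b) * theta * pdf)
    return m1, math.sqrt(max(m2 - m1 * m1, 0.0))
```

This sketch returns only the mean and standard deviation of the max. Folding the max over many paths would additionally require re-expressing the result in terms of the principal components, which is not shown here.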
Experimental Results
In this experiment, ISCAS89 benchmark circuits are implemented in 180 nm with 1.8 V static CMOS technology. We use the CodGen ATPG tool [14] to automatically generate a set of longest paths and corresponding set of path-dependent input patterns. The input patterns consist of "don't care" bits and care bits, where the care bits sensitize the longest paths.
In our first experiment, we validate the accuracy of the Monte Carlo approach, which employs the simplified power grid and gate delay models, by comparing its results with Cadence Spectre simulation, denoted as SS. For validation, we select the path with the highest probability of being the longest from the combinational version of ISCAS89 benchmark s1488. All "don't care" bits in the sensitizing pattern are randomly filled. A set of these randomly filled patterns is generated and simulated to obtain the voltage drop distribution.
Figure 2 shows the distributions of supply voltage drops across circuit s1488 using the MC and SS methods. Whereas the SS voltage drop distribution can be approximated as a Normal distribution, the MC distribution cannot. One possible reason the MC distribution is not Normal is the very small number of gates (673) in circuit s1488. Figure 3 shows that the MC distribution of voltage drops in the larger s38417 circuit is close to Normal. Although the means of the MC and SS voltage drop distributions are close, there is a large difference in their standard deviations. The simplified power region, the approximated circuit switching models, and the small number of gates in s1488 are possible explanations for this difference.
As shown in Figure 4, we compute the realistic worst-case delay distribution of the longest path in circuit s1488, using the voltage noise computed with the MC and SS approaches. The differences between the MC and SS worst-case delay distributions are much smaller than those between the voltage drop distributions in Figure 2. In other words, the differences in the standard deviations of the voltage drop distributions in Figure 2 have little impact on the delay distributions. This may be because of the relatively low sensitivity of delay to supply voltage in the 180 nm technology, or due to an averaging effect. We will investigate this further in much larger benchmark circuits in the future.
The results in Figure 2 and Figure 4 are summarized in Table 1.
In our second experiment, we apply the MC calculation for a large number of input patterns, to compute the circuit delay distribution. Circuit simulation is too expensive to generate this large number of samples. We then compare these results to our proposed analytical approaches. Unlike the first experiment, we use the 200 longest paths in each circuit, and also group gates into regions. Here, we use 9 regions for each benchmark circuit.
The MC approach computes the max of the 200 path delay distributions numerically in the STA computation. This should provide higher accuracy than the analytical max approximations commonly used in statistical static timing analysis, but at higher cost. Since the MC approach is time-consuming, we developed two types of analytical approaches: with and without considering the care bits required to propagate transitions on the longest circuit paths, denoted as AAC and AAR, respectively.
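The numerical max in the MC approach amounts to taking, for each simulated pattern, the largest of the path delays and then collecting statistics over patterns. A minimal sketch follows, assuming the per-pattern path delays are already available as an array; the function name and array layout are assumptions.

```python
import numpy as np


def mc_worst_delay_stats(path_delays):
    """path_delays: (n_patterns, n_paths) array, one row per simulated pattern.

    Returns the sample mean and standard deviation of the per-pattern
    maximum path delay.
    """
    worst = np.asarray(path_delays, dtype=float).max(axis=1)
    return worst.mean(), worst.std(ddof=1)
```

Because the max is taken sample by sample, no distributional assumption about the max is needed, which is where the accuracy advantage over an analytical max comes from. The cost is that every pattern must be simulated.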
As described in Section 2, AAC first performs a statistical voltage noise analysis with care bits. We then use PCA to transform the set of correlated voltage random variables across regions on the chip into a set of uncorrelated voltage random variables. Given the uncorrelated voltage random variables, we employ the gate delay model as well as the sensitivity model to compute the gate delay distribution. Then, we propagate all gate delay distributions using the sum operation along the 200 longest paths. Finally, we obtain the circuit performance by applying an analytical max operation to all 200 path delay distributions. Since we do voltage noise analysis and timing analysis statistically, we only need to perform this analysis once for each benchmark circuit. That is why this approach is very fast when compared to MC. The AAR approach is identical to AAC, except that all input bits are random. Table 2 shows the means and standard deviations of delay distributions calculated by the MC, AAC, and AAR approaches. The µ and σ from AAR for circuit s1488 are farther off from the MC values than those for circuits s35932 and s38417. As with the first experiment, this may be due to the small size of s1488. It can be seen that using random inputs rather than longest-path care bit values causes an underestimate in mean delay and overestimate in standard deviation.
We see a similar but less severe phenomenon in s35932. There is little difference in mean delay, but AAC reduces the error in σ by 36% compared to AAR. This is surprising considering that only 0.2% of the input bits are care bits. This small number of bits causes a noticeable change in supply noise and delay variation. Unlike in s1488 and s35932, the AAR results for the worst-case delay µ and σ match those of the MC results in circuit s38417.
Figure 5 illustrates the accuracy of the analytical AAC and AAR approaches versus the MC numerical approach in the s35932 benchmark circuit. One reason for the accuracy in s35932 may be that the larger number of gates causes the noise to appear more random, and the longer paths cause more averaging. The other reason may be that the care bits for the longest paths in this design have less impact on supply noise. Note that even though the noise-induced delay variation in Figure 5 is small, this is only due to the power grid design, and does not affect the accuracy of the analysis technique.
In circuits s35932 and s38417, the data in Table 2 show that the σ values for the AAC approach still differ from the σ values of the MC method. This is because in these circuits, there are multiple paths with a significant probability of being the longest path for any one random pattern. The standard deviation errors can be reduced by intelligently deciding which care bits should be used in the analysis. Since AAR and AAC take the same amount of CPU time, the AAC method is preferable whenever care bit information is available, because it increases the accuracy of the σ calculation. The CodGen ATPG time for these circuits was only a few seconds, so obtaining care bits should usually not be a problem.
Finally, we observe that both AAR and AAC are much faster than the MC approach, because the statistical power noise analysis and timing analysis are performed only once per circuit. Thus, the analytical approaches can be very helpful for quickly estimating the impact of supply noise during early design phases.
Conclusions
In deep submicron technology, power supply noise analysis must be performed during timing analysis. Supply noise analysis has often used a vector-based approach. However, this is very expensive, particularly during the early design phase. In this paper, we introduce novel vectorless approaches, with and without considering the care bits that sensitize the longest paths in the circuit. These methodologies can be used to estimate the delay increases due to power supply noise efficiently and accurately. Our experiments on ISCAS89 circuits also demonstrate the importance of a small number of care bits during power supply noise analysis.
