Statistical timing analysis under manufacturing variability requires a model of spatially-correlated variation. The common grid-based model of spatially-correlated variability involves a trade-off between accuracy and computational cost, especially the cost of PCA (principal component analysis). This paper proposes spatially interpolating variation coefficients to improve accuracy instead of refining the spatial grid. Experimental results show that the spatial interpolation yields a continuous expression of spatial correlation and reduces the maximum error of timing estimates that originates from sparse spatial grids. For the same accuracy, the proposed interpolation reduced the CPU time for PCA by 97.7% in a test case.
Introduction
To cope with worsening manufacturing variability, stochastic performance estimation before fabrication is in strong demand, and statistical static timing analysis (SSTA) has been studied intensively [1]-[4]. In SSTA, delay times are expressed as statistical distributions, and signal arrival times are computed statistically. With SSTA, the relation between performance and yield can be predicted before fabrication.
Implementing SSTA requires variability models of gates and interconnects. Manufacturing variability is often decomposed into die-to-die, within-die spatially-correlated, and random components [5]. Among these, the within-die spatially-correlated component is the least tractable, because accounting for it requires a large number of random variables and their covariance matrix, which severely limited the analyzable circuit size in the past [4]. Later, reference [2] proposed an SSTA that models the spatially-correlated component using a 2-D grid. The authors transform correlated random variables into uncorrelated ones by PCA (principal component analysis), which improves SSTA efficiency. Another approach is the quad-tree model proposed in [6]. The two approaches have different advantages and disadvantages [7], but they share the same issue: a trade-off between accuracy and the computational time required for modeling. As the number of spatial divisions increases, the spatially-correlated component is reproduced better, but the computational time increases as well.
This paper presents a technique that mitigates discretization error by spatially interpolating the coefficients of principal components in PCA-based SSTA [2]. The interpolation enables a continuous expression of correlation even when a grid-based model is used for spatially-correlated variability. This paper is organized as follows. Section 2 discusses the conventional grid-based model of manufacturing variability and points out its problem. Section 3 presents the accuracy enhancement of the grid-based model using coefficient interpolation. Experiments in Sect. 4 demonstrate the improvement in SSTA accuracy. The discussion is concluded in Sect. 5.
Problem of Grid-Based Variability Model
This section explains the grid-based model of spatially-correlated variability used in SSTA and demonstrates its problem.
Grid-Based Modeling of Spatially-Correlated Manufacturing Variability
Variation of a parameter $F$ that affects delay, e.g. gate length or threshold voltage, is often expressed as [5]

$$F = f_0 + X_g + X_s + X_r \tag{1}$$
where $f_0$ is the average value of $F$, and $X_g$, $X_s$, and $X_r$ are random variables with zero mean. $X_g$ represents die-to-die variability: it shifts the parameter uniformly within a chip, so all elements on a single chip share the same value of $X_g$. In contrast, $X_s$ and $X_r$ represent within-chip variability, and elements within a chip have different values. The within-chip variability consists of the spatially-correlated variation $X_s$ and the random variation $X_r$, which differs element by element (Fig. 1). $X_s$ is strongly correlated between neighboring elements, and the correlation decreases as the distance increases, whereas $X_r$ fluctuates independently of other elements. Because of the $X_s$ component, the relative placement of elements affects the correlation between their delays.
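To make the role of each component concrete, the correlation between the parameter values of two distinct elements $a$ and $b$ at distance $d$ can be written out. This is a sketch under assumptions not stated in the text: the three components are taken to be mutually independent, with standard deviations $\sigma_g$, $\sigma_s$, and $\sigma_r$ that are the same for both elements, and $\rho_s(d)$ denotes the correlation coefficient of $X_s$ at distance $d$. All of these symbols are introduced here for illustration only.

```latex
\[
\operatorname{Corr}(F_a, F_b)
  = \frac{\sigma_g^{2} + \rho_s(d)\,\sigma_s^{2}}
         {\sigma_g^{2} + \sigma_s^{2} + \sigma_r^{2}}
\]
```

Under these assumptions the die-to-die term raises the correlation for every pair, the random term lowers it, and only the $\rho_s(d)$ term depends on placement. When only $X_s$ is present, as in the experiments of Sect. 4, the expression reduces to $\rho_s(d)$.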
To take the spatially-correlated variability into account in SSTA, a model that reproduces the variability with reasonable accuracy is necessary. Reference [2] proposes an SSTA that considers the spatially-correlated manufacturing variability using PCA. We explain here how the variability is modeled in [2]. We first divide a chip spatially. The spatially-correlated component $X_s$ is discretized on a 2-D grid, and a random variable is assigned to each region. Within a region, the variability is assumed to be identical. After the variable assignment, a correlation coefficient matrix is constructed, and PCA is applied to the matrix. Random variable $p_i$ in region $i$ is expressed as a sum of orthogonalized variables $p'_j$:

$$p_i = \mu_i + \sigma_i \sum_{j=1}^{m} \sqrt{\lambda_j}\, v_{ij}\, p'_j \tag{2}$$
where $\mu_i$ is the average of $p_i$, $\sigma_i$ is the standard deviation of $p_i$, $\lambda_j$ is the $j$-th largest eigenvalue of the correlation coefficient matrix, $v_{ij}$ is the $i$-th element of the eigenvector corresponding to $\lambda_j$, and $m$ is the number of principal components. Applying the above grid-based modeling to $F$ in Eq. (1), $F$ is expressed as a linear sum of uncorrelated random variables:

$$F = f_0 + \sum_{j} k_{ij}\, x_j + \delta \tag{3}$$
where $x_j$ is an uncorrelated random variable with mean 0 and standard deviation 1; the $x_j$ include the orthogonalized variables of Eq. (2) and the die-to-die variation component. $k_{ij}$ is the coefficient of $x_j$ in region $i$, and $\delta$ corresponds to the random component $X_r$.
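The modeling steps above (divide the chip, build the correlation coefficient matrix, apply PCA, read off per-region coefficients) can be sketched in Python. This is an illustrative sketch, not the implementation of [2]: the 4 × 4 grid, the correlation function $e^{-2x}$ evaluated between region centers, the unit standard deviation, and all function names are assumptions made here, and a textbook cyclic Jacobi iteration stands in for a production eigen-solver.

```python
import math

def correlation_matrix(chip_mm=5.0, grid=4, decay=2.0):
    """Correlation between region centers, assuming rho(x) = exp(-decay * x)."""
    pitch = chip_mm / grid
    centers = [((ix + 0.5) * pitch, (iy + 0.5) * pitch)
               for iy in range(grid) for ix in range(grid)]
    return [[math.exp(-decay * math.hypot(a[0] - b[0], a[1] - b[1]))
             for b in centers] for a in centers]

def jacobi_eigen(matrix, sweeps=100, tol=1e-22):
    """Eigen-decomposition of a symmetric matrix by cyclic Jacobi rotations."""
    n = len(matrix)
    a = [row[:] for row in matrix]
    v = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                # Small-angle rotation that zeroes a[p][q].
                theta = math.pi / 4 if diff == 0.0 else 0.5 * math.atan(2.0 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):      # A <- A J (columns p and q)
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):      # A <- J^T A (rows p and q)
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
                for k in range(n):      # V <- V J accumulates the eigenvectors
                    vkp, vkq = v[k][p], v[k][q]
                    v[k][p], v[k][q] = c * vkp - s * vkq, s * vkp + c * vkq
    return [a[i][i] for i in range(n)], v   # column j of v pairs with eigenvalue j

def pca_coefficients(corr, sigma=1.0):
    """k[i][j] = sigma * sqrt(lambda_j) * v[i][j], ordered by decreasing lambda_j."""
    lam, vec = jacobi_eigen(corr)
    order = sorted(range(len(lam)), key=lambda j: -lam[j])
    return [[sigma * math.sqrt(max(lam[j], 0.0)) * vec[i][j] for j in order]
            for i in range(len(corr))]

corr = correlation_matrix()
k = pca_coefficients(corr)
# The dot product of two coefficient rows reproduces the modeled correlation.
print(sum(k[0][j] * k[1][j] for j in range(len(k))), corr[0][1])
```

Keeping all principal components, the dot product of the coefficient vectors of two regions equals their correlation coefficient, which is why the coefficient vectors are the natural quantity to interpolate later.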
Error due to Discretization
The correlation coefficient between two elements changes continuously in space by nature, whereas the grid-based model inevitably incurs discretization error. Figure 2 illustrates two examples of the discretization error. When elements A and B are placed adjacently, they usually have a strong correlation. However, if a grid boundary lies between them, different random variables are assigned to them, and the modeled correlation becomes weaker than the actual one. On the other hand, although elements A and C are placed far apart, they belong to the same region, and hence the correlation coefficient between them is modeled as one. In this case, the modeled correlation is stronger than the actual one. These discretization errors are especially significant when the number of discretized regions is small, because each region becomes larger and the correlation between two adjacent regions becomes weaker.
In fact, modeling spatially-correlated variability accurately requires finer spatial discretization, as will be shown in Sect. 4.
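The two error cases can be reproduced numerically with a small sketch. The coordinates of A, B, and C, the 10 × 10 grid on a 5 × 5 mm chip, the correlation function $e^{-2x}$, and the choice of measuring the modeled correlation between region centers are all assumptions made for illustration; they are not taken from Fig. 2.

```python
import math

def region_index(point, chip_mm=5.0, grid=10):
    """Grid region (ix, iy) that contains a point given in mm."""
    pitch = chip_mm / grid
    return tuple(min(int(c / pitch), grid - 1) for c in point)

def actual_corr(a, b, decay=2.0):
    """Continuous correlation, assumed to be exp(-decay * distance)."""
    return math.exp(-decay * math.hypot(a[0] - b[0], a[1] - b[1]))

def grid_corr(a, b, chip_mm=5.0, grid=10, decay=2.0):
    """Correlation seen by the grid model: distance between region centers."""
    pitch = chip_mm / grid
    ia, ib = region_index(a, chip_mm, grid), region_index(b, chip_mm, grid)
    return math.exp(-decay * pitch * math.hypot(ia[0] - ib[0], ia[1] - ib[1]))

A = (0.49, 0.25)   # just left of the boundary at x = 0.5 mm
B = (0.51, 0.25)   # just right of the same boundary
C = (0.01, 0.25)   # far from A but inside the same region

print("A-B actual", actual_corr(A, B), "modeled", grid_corr(A, B))
print("A-C actual", actual_corr(A, C), "modeled", grid_corr(A, C))
```

In this toy setting the pair A-B, only 0.02 mm apart, is modeled with the correlation of two regions 0.5 mm apart, while the pair A-C, 0.48 mm apart, is modeled as perfectly correlated.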
Computational Cost
A finer grid improves the modeling accuracy. The larger number of random variables, however, increases the CPU time of both PCA and SSTA.
The computational complexity of the PCA used in modeling is $O(n^3)$, where $n$ is the number of regions [2]. Table 1 lists the CPU time for PCA, performed using R [9] on a computer with a 2.4 GHz Opteron processor and 16 GB of memory. As the number of discretized regions increases, the required CPU time increases drastically. Memory usage is also a problem, since $O(n^2)$ memory is necessary to store the correlation coefficient matrix. The CPU time of SSTA is proportional to the number of principal components [2].
This CPU time problem becomes more severe when the chip area is large and the correlation distance is small. CPU time increases when more accurate analysis is necessary, and conversely the accuracy degrades when CPU time is saved. Given this trade-off between CPU time and accuracy, we have to model the spatially-correlated variability with a reasonable discretization.
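The scaling can be made tangible with a short sketch that assumes the cost follows the stated orders exactly, with no constant factors or lower-order terms. The 8-byte matrix entry and the function names are assumptions introduced here, and the numbers are relative estimates, not the measured values of Table 1.

```python
def pca_relative_cost(grid_per_side, reference_per_side):
    """Relative PCA time under an idealized O(n^3) model, n = regions in the grid."""
    n = grid_per_side ** 2
    n_ref = reference_per_side ** 2
    return (n / n_ref) ** 3

def correlation_matrix_bytes(grid_per_side, bytes_per_entry=8):
    """Memory for the dense n x n correlation matrix (8-byte entries assumed)."""
    n = grid_per_side ** 2
    return n * n * bytes_per_entry

for g in (5, 10, 20, 40):
    print(g, "x", g, "grid:",
          "relative PCA time vs 10 x 10 =", pca_relative_cost(g, 10),
          "matrix bytes =", correlation_matrix_bytes(g))
```

Because $n$ itself grows with the square of the number of divisions per side, doubling the divisions per side multiplies the idealized PCA time by 64 and the matrix memory by 16.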
Accuracy Enhancement by Interpolating Principal Component Coefficients
Grid-based modeling of spatially-correlated variability necessarily involves discretization error, as pointed out in Sect. 2. To solve this problem, we propose using spatial interpolation to express the continuous change of correlation and mitigate the discretization error. We interpolate the principal component coefficient $k_{ij}$ in Eq. (3) using two interpolation techniques that are often used in image processing: bilinear and bicubic interpolation [8]. We explain the coefficient interpolation here using bilinear interpolation as an example. The dotted lines in Fig. 3 represent grid boundaries, and let us compute the coefficient $k_j$ corresponding to an element at point A. Bilinear interpolation uses the values at the four neighboring points.
$\Delta x$ and $\Delta y$ represent the horizontal and vertical distances, respectively, from the center of the region $(i_x, i_y)$ that contains point A ($= (1, 1)$ in Fig. 3). The distance between adjacent regions is normalized to one, so $0 \le \Delta x, \Delta y \le 0.5$. When coefficients are simply interpolated using Eq. (4), the variance after the interpolation is underestimated. Figure 4 illustrates the reason. Let $a$ and $b$ be the coefficient vectors $(k_{i0}, k_{i1}, \ldots, k_{in})$ and $(k_{(i+1)0}, k_{(i+1)1}, \ldots, k_{(i+1)n})$, and for simplicity consider interpolating only $a$ and $b$, although bilinear interpolation uses four vectors. By the triangle inequality, the norm of the interpolated vector $\|ra + (1 - r)b\|$, where $r$ is a weighting factor determined by distance, is at most $r\|a\| + (1 - r)\|b\|$, and it is strictly smaller for $0 < r < 1$ unless $a$ and $b$ point in the same direction. The norm therefore becomes smaller than the correct value, which leads to underestimation of the variance.
We therefore compensate for the underestimation by multiplying the coefficient interpolated by Eq. (4) by a constant.
$\sigma_{org}$ is the standard deviation before the interpolation, and $k_j$ is the coefficient after the variance compensation. When bicubic interpolation is used, the values at the 16 neighboring points are used for interpolation [8]. The interpolation formula is given in the Appendix. The variance compensation (Eq. (5)) is applied in the same way after the interpolation.
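The shrinking norm is easy to check numerically. In the sketch below, the two unit-norm coefficient vectors and the weight $r = 0.5$ are arbitrary values chosen for illustration, not data from the paper.

```python
import math

def norm(vec):
    """Euclidean norm; equals the modeled standard deviation for unit-variance x_j."""
    return math.sqrt(sum(c * c for c in vec))

def blend(a, b, r):
    """Weighted interpolation r*a + (1 - r)*b of two coefficient vectors."""
    return [r * x + (1.0 - r) * y for x, y in zip(a, b)]

a = [0.8, 0.6, 0.0]   # coefficient vector of one region (norm 1)
b = [0.6, 0.0, 0.8]   # coefficient vector of its neighbor (norm 1)
halfway = blend(a, b, 0.5)

print(norm(a), norm(b), norm(halfway))
```

Both inputs have norm 1, yet the interpolated vector has a norm below 1, so an element placed halfway between the two regions would be assigned too small a variance unless the vector is rescaled.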
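The bilinear case with variance compensation might be sketched as follows. This is a hedged illustration rather than the paper's Eqs. (4) and (5): it uses the textbook bilinear weights on the four surrounding region centers, expresses positions in units of the grid pitch instead of $\Delta x$ and $\Delta y$, and rescales the interpolated vector so that its norm equals `sigma_org`. The data layout `k[iy][ix]`, the clamping outside the outermost centers, and the function names are assumptions.

```python
import math

def bilinear_coefficients(k, x, y):
    """Interpolate coefficient vectors k[iy][ix] at (x, y), in units of grid pitch.

    Region (ix, iy) is assumed to have its center at (ix + 0.5, iy + 0.5).
    The grid must have at least 2 x 2 regions.
    """
    ny, nx = len(k), len(k[0])
    u = min(max(x - 0.5, 0.0), nx - 1.0)   # clamp outside the outermost centers
    v = min(max(y - 0.5, 0.0), ny - 1.0)
    ix, iy = min(int(u), nx - 2), min(int(v), ny - 2)
    fx, fy = u - ix, v - iy
    corners = ((k[iy][ix], (1 - fx) * (1 - fy)),
               (k[iy][ix + 1], fx * (1 - fy)),
               (k[iy + 1][ix], (1 - fx) * fy),
               (k[iy + 1][ix + 1], fx * fy))
    m = len(k[0][0])
    return [sum(w * vec[j] for vec, w in corners) for j in range(m)]

def compensate(k_interp, sigma_org):
    """Rescale so that the modeled standard deviation equals sigma_org."""
    length = math.sqrt(sum(c * c for c in k_interp))
    if length == 0.0:
        return list(k_interp)
    return [c * sigma_org / length for c in k_interp]

# Toy 2 x 2 grid with unit-norm coefficient vectors (illustrative values).
k = [[[1.0, 0.0], [0.8, 0.6]],
     [[0.8, 0.6], [0.6, 0.8]]]
raw = bilinear_coefficients(k, 1.0, 1.0)   # the corner shared by all four regions
fixed = compensate(raw, sigma_org=1.0)
print(raw, fixed)
```

At a region center the interpolation returns that region's own vector unchanged, so the compensation only matters between centers, where the raw norm drops below `sigma_org`.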
We now demonstrate that the proposed coefficient interpolation reduces the error of the correlation coefficients. Based on the variability of a 90 nm technology [10], we assumed that the correlation coefficient of the spatially-correlated variability depends on distance and is expressed as $e^{-2x}$, where $x$ is the distance between two elements in mm. We chose two points in a chip randomly and compared the correlation coefficients estimated by the conventional grid-based model and by the proposed model with the correct value. The results are depicted in Figs. 5, 6 and 7. The chip size and the grid size were assumed to be 5 × 5 mm$^2$ and 10 × 10, respectively. Figure 5 shows the correlation coefficients expressed by the conventional grid-based model. Due to the discretization, only a few discrete values are expressible. In the upper part of the plot, there are many points where the estimated correlation coefficient is larger than the actual one. On the other hand, Figs. 6 and 7 depict the correlation coefficients estimated with bilinear and bicubic interpolation. With the interpolation, a continuous expression of correlation coefficients is attained. The large errors of the original grid-based model found in Fig. 5 are reduced in both Figs. 6 and 7. With bilinear interpolation, the modeled correlation coefficients tend to be larger than the actual ones, because the function $f(x) = e^{-2x}$ is convex, so linear interpolation between any two points overestimates the correlation. This tendency is suppressed with bicubic interpolation. Figure 8 shows the distribution of correlation coefficients with respect to the center point. The interpolation smooths the distribution. Figure 9 shows the distribution of the correlation coefficient with respect to the center point along $y = 2.5$, extracted from Fig. 8.
We can see that the bilinear interpolation overestimates the correlation coefficient over the whole range of $x$, whereas the bicubic interpolation reduces the overestimation where the correlation coefficient is between 0.25 and 0.8 and underestimates the correlation coefficient below that range. Figure 9 also suggests that the appropriate interpolation could depend on the function of the spatial correlation. Table 2 lists the RMS (root mean square) error of the correlation coefficients, that is, the RMS value of (modeled correlation coefficient) − (actual correlation coefficient). The bicubic interpolation achieved the lowest error, whereas the bilinear interpolation increased the RMS error. When the spatial correlation is expressed by an exponential function, bilinear interpolation is not appropriate.
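The RMS error metric and the one-sided bias of linear interpolation can be illustrated with a 1-D toy. Note the simplification: the sketch interpolates the correlation function $e^{-2x}$ directly between samples taken at a 0.5 mm pitch, whereas the proposed method interpolates principal component coefficients. The pitch, the sampling step, and the function names are assumptions, and the resulting number is not comparable to Table 2.

```python
import math

def rms_error(modeled, actual):
    """RMS of (modeled correlation coefficient) - (actual correlation coefficient)."""
    return math.sqrt(sum((m - a) ** 2 for m, a in zip(modeled, actual)) / len(actual))

def corr(x_mm, decay=2.0):
    """Assumed spatial correlation function exp(-decay * x)."""
    return math.exp(-decay * x_mm)

def linear_interp_corr(x_mm, pitch_mm=0.5):
    """Piecewise-linear interpolation of corr() between samples at the grid pitch."""
    i = int(x_mm / pitch_mm)
    x0 = i * pitch_mm
    t = (x_mm - x0) / pitch_mm
    return (1.0 - t) * corr(x0) + t * corr(x0 + pitch_mm)

xs = [0.01 * n for n in range(251)]          # distances from 0 to 2.5 mm
actual = [corr(x) for x in xs]
modeled = [linear_interp_corr(x) for x in xs]
errors = [m - a for m, a in zip(modeled, actual)]

print("RMS error:", rms_error(modeled, actual))
print("smallest signed error:", min(errors))
```

The signed error never goes meaningfully below zero, because the second derivative of $e^{-2x}$ is positive and every chord therefore lies on or above the curve.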
Experimental Results
In this section, the proposed modeling with interpolation is applied to SSTA, and the SSTA accuracy is discussed.
Experimental Conditions
We implemented the SSTA proposed in [2] in C++ and evaluated its accuracy. We assumed a 5 × 5 mm$^2$ chip in a 90 nm technology. As an example, we supposed that $V_{th}$ had spatial correlation and gave it a variation of $\sigma = 25$ mV. Note that the proposed interpolation is independent of the variation parameter that has the spatial correlation. The correlation coefficient was assumed to depend on the distance $x$ in mm and to be expressed as $e^{-2x}$. Other variability components, such as random and die-to-die, were not considered here. We used the benchmark circuit c1355 from the ISCAS85 benchmark set. The number of cells was 329. We obtained a cell placement using a commercial P&R tool and scaled the placement to two sizes: 0.25 × 0.25 mm$^2$ and 0.05 × 0.05 mm$^2$. When a circuit is placed in a smaller area, a more accurate model of the correlation coefficient is necessary for estimating timing. The grid size was varied from 2 × 2 to 20 × 20.
When the grid-based model is used, the analyzed result varies depending on the circuit placement even though the assumed variability is uniform, as pointed out using Fig. 2. This is because the timing estimates fluctuate with the position of the cells relative to the grid boundaries. We therefore placed the circuit at 8 × 8 positions within a single region, as depicted in Fig. 10, and evaluated the timing estimates at each position. Thus, the total number of timing estimates for each circuit is 64.
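The placement sweep can be sketched as below. The exact positions used in Fig. 10 are not given in the text, so the uniform offsets starting at the region corner are an assumption, as are the function names; `summarize` simply collects the mean, minimum, and maximum over the 64 runs.

```python
def placement_offsets(region_pitch_mm, steps=8):
    """steps x steps circuit origins spread uniformly over one grid region (assumed layout)."""
    step = region_pitch_mm / steps
    return [(ix * step, iy * step) for iy in range(steps) for ix in range(steps)]

def summarize(estimates):
    """Mean, minimum, and maximum of the timing estimates over all placements."""
    return sum(estimates) / len(estimates), min(estimates), max(estimates)

# Example: 10 x 10 grid on a 5 mm chip, so each region is 0.5 mm wide.
offsets = placement_offsets(5.0 / 10)
print(len(offsets), offsets[0], offsets[-1])
```

Each offset would shift the whole placement before one SSTA run, and the gap between the minimum and maximum returned by `summarize` measures the sensitivity to the grid boundaries.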
Without Interpolation
Figures 11 and 12 show the timing estimates when the conventional grid-based model was used. The left and right graphs represent the mean and the standard deviation (SD) of the circuit delay, respectively, over the 64 timing estimates. The error bars indicate the maximum and minimum values. The horizontal axis is the number of discretized regions per side. We can see that the estimates converge as the number of discretized regions becomes large. However, when the circuit area is small, the difference between the maximum and minimum is still large in Fig. 12. In this case, the 20 × 20 grid is not sufficient. The timing analysis of a circuit in a small area is sensitive to the discretization error.
Let us examine the above result from the viewpoint of chip designers. The timing analysis of a small block requires finer spatial discretization. In contrast, the above result suggests that sparser discretization is sufficient for timing verification of inter-block signals. Therefore, the required fineness of spatial discretization, that is, the physical dimension of a grid region, is determined by the error for the small blocks, and it is independent of the entire circuit size.
With Interpolation
Figures 13 and 14 show the estimates when the bicubic interpolation was applied. The difference between the maximum and minimum values becomes small, which means that the estimation is not sensitive to the position of the cells relative to the grid boundaries. Figure 15 shows the maximum errors of the average and standard deviation for the 0.05 × 0.05 mm$^2$ placement. The timing analysis with bicubic interpolation achieved more accurate estimation than that with the conventional grid-based model. Although the conventional model can reduce the error by increasing the number of discretized regions, the proposed model attained the same accuracy with a smaller number of discretized regions. For example, when the maximum acceptable error of the average and standard deviation is 3 ps, the proposed model requires only an 8 × 8 grid, whereas the conventional model needs a 15 × 15 grid. In this case, the proposed model can reduce the CPU cost for PCA by 97.7%.
For the chip size and correlation function used in this paper, the CPU time of PCA for a 15 × 15 grid is affordable. However, when the correlation distance becomes shorter and the chip size becomes larger, the number of grid regions increases and the CPU time of PCA explodes due to the $O(n^3)$ complexity. The SSTA run time also increases because it is proportional to the number of principal components [2]. In this situation, the proposed interpolation becomes more effective in reducing the PCA run time. In addition, although the interpolation itself adds to the SSTA run time, SSTA with the proposed interpolation uses fewer principal components. The overhead in SSTA run time is therefore expected to be insignificant.
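As a rough consistency check, the reduction can be estimated from the $O(n^3)$ complexity alone. This assumes the PCA time scales exactly as $n^3$ with no constant or lower-order terms, which is an idealization; the symbols $T_{8 \times 8}$ and $T_{15 \times 15}$ for the PCA times of the two grids are introduced here for illustration.

```latex
\[
\frac{T_{8 \times 8}}{T_{15 \times 15}}
  \approx \left(\frac{8^{2}}{15^{2}}\right)^{3}
  = \left(\frac{64}{225}\right)^{3}
  \approx 0.023,
\qquad
1 - 0.023 = 0.977
\]
```

Under this idealization the estimated saving is about 97.7%, in line with the reported reduction.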
Conclusion
This paper discussed modeling of spatially-correlated variability and presented an accuracy enhancement technique based on spatial interpolation of principal component coefficients. We experimentally demonstrated that the interpolation enables a continuous expression of correlation even when grid-based modeling is adopted. We also verified the accuracy improvement of SSTA. Even when analyzing a circuit placed in a small area, the proposed modeling provided accurate timing estimates with reasonable grid fineness. Viewed another way, the proposed model attained the same accuracy with a reduced number of discretized regions. In the test case, the CPU time for PCA was reduced by 97.7%. Future work includes further investigation for more advanced technologies.
