Abstract-Within-die process variations arise during integrated circuit (IC) fabrication in the sub-100nm regime. These variations are of paramount concern because they cause the performance of ICs to deviate from their designers' original intent. These deviations reduce the parametric yield and the revenue from integrated circuit fabrication. In this paper we provide a complete treatment of the subject of within-die variations. We propose a scan-chain based system, vMeter, to extract within-die variations in an automated fashion. We implement our system in a sample of 90nm chips and collect the within-die variation data. We then propose a number of novel statistical analysis techniques that accurately model the within-die variation trends and capture the spatial correlations. We propose the use of maximum-likelihood techniques to find the parameters required to fit the model to the data. The accuracy of our models is statistically verified through residual analysis and variograms. Using our modeling technique, we propose a procedure to generate synthetic within-die variation patterns that mimic real silicon data.
I. INTRODUCTION
Advanced sub-wavelength semiconductor fabrication techniques have produced nanometer feature sizes with a substantial amount of process variation. These process variations result from the inability to robustly print geometric features [3, 11] and the inability to precisely control the diffusion of dopants [3, 13]. Process variations translate to variations in the key electrical parameters of circuit devices and interconnects, which increase the uncertainty in the outcome of the design process and consequently jeopardize the parametric yield of the fabrication process. Process variations are typically divided into two components: inter-die and intra-die [3, 9]. Inter-die variations account for variations that arise between different chips in the same wafer or in different wafers, while intra-die, or within-die, variations account for variations that arise between different devices and interconnects that reside within the same chip.
To cope with process variations, designers attempt to characterize the underlying sources of variations and then either apply statistical techniques [1, 5, 10] or design guard bands [16]. It is also possible to cope with these variations after manufacturing and during operation [7]. Before developing or applying solutions to process variations, it is essential first to characterize process variation trends, i.e., to develop models for them, and then to use these models to drive design and manufacturing. While process variations are random in nature, within-die variations typically exhibit spatial correlations, i.e., devices that are spatially close to each other are likely to be more strongly correlated than devices that are spatially far from each other. This correlation has been the subject of a number of recent works [6, 2, 17, 8].
The objective of this paper is to develop a complete treatment of the subject of within-die process variations. We develop accurate statistical modeling techniques that fit realistic variability trends that we extract from 90nm chips. We also propose applications for our model. The contributions of this paper can be summarized as follows.
• We propose and implement a system, vMeter, to extract process variation data from sample 90nm chips. This extraction of within-die variations from actual silicon chips provides a solid basis for the development of realistic statistical modeling and estimation techniques.
• Based on the concepts of Gaussian random fields, we propose novel statistical analysis techniques that accurately fit a model to data. In particular, we propose a generic statistical model, the Matérn model, with enough flexibility to capture different within-die variation trends. We also propose the use of maximum-likelihood estimation techniques to calculate the required parameters of the proposed model. We thoroughly verify the accuracy of our models against the extracted data using statistical techniques such as the Kolmogorov-Smirnov test and statistical variograms.
• As an application of our model, we develop an algorithm that can generate synthetic variability trends that mimic within-die variations of real chips. Our algorithm is randomized and thus can be used to generate as many synthetic trends as required. The proposed algorithm provides a useful method to drive future research with realistic process variation models.
The organization of this paper is as follows. Section II overviews the previous research work that is related to this paper. Section III describes the components of our data extraction system (vMeter). Section IV develops the main concepts behind the statistical modeling techniques proposed by this work, and Section V shows how to use maximum-likelihood estimation techniques to find the best parameters for our models. Section VI proposes a number of techniques to thoroughly verify our model. In Section VII, we propose a method to generate synthetic within-die process variation trends that mimic realistic chips. Section VIII provides an extensive set of results from our model and ten sample chips. Finally, Section IX summarizes the main conclusions of this paper.
II. PREVIOUS WORK
A ring oscillator (RO) is a simple yet powerful device for measuring process variations. It consists of an odd number of inverters cascaded sequentially in a feedback loop. For a given ambient temperature and a given number of inverter stages, the frequency of oscillation of a RO depends on the physical manufacturing processes [14, 11] at the location of the RO. The frequency of a RO represents a lumped value of all process variations regardless of their source. To measure the within-die process variations over a chip, it is necessary to position ROs at a number of locations, and then use additional circuitry to connect the ROs in order to transfer the within-die variation data to where it is stored or processed [4, 12].
The measured variations from a test structure at a particular die location can be considered as a random variable that takes different values depending on the considered die. The collection of random variables that represent the results of all test structures forms a stochastic process, or random field, that is spatially indexed by the locations of the test structures. A number of recent efforts investigate how to model within-die variations [6, 2, 17, 8]. Friedberg et al. [6] design critical dimension test structures to capture the variations in gate length, and then model the correlations between the variables of the resultant random field using piece-wise linear functions. Bhardwaj et al. [2] propose reducing the number of variables in the random field by using the Karhunen-Loève expansion to write the field as a series expansion of uncorrelated random variables. The uncorrelated random variables are fewer in number than the original correlated random variables, which reduces the complexity of the problem. Xiong et al. [17] propose modeling the correlations in the random field using exponential and "general" functions (that are reducible to the "Matérn" model). The parameters of the model are found through constrained nonlinear optimization. As [8] notes, neither the model of [2] nor that of [17] is applied to measured data, so further validation is still to be carried out. Recently, Liu [8] proposed the use of correlograms and variograms to model the spatial variations, with the model parameters determined through generalized least-squares fitting that is solved using the Nelder-Mead simplex method. Table I gives a comparison between the previous approaches and the approach proposed in this paper.
III. VMETER: A SYSTEM TO EXTRACT WITHIN-DIE VARIATIONS
Realistic modeling of within-die process variations must start by first acquiring or extracting raw process variations data from silicon chips. In this section we briefly describe our process variation acquisition system vMeter.
Our main device for measuring the within-die variations is the ring oscillator (RO). As the output frequency of a ring oscillator is sensitive to the inherent process variations of the chip, a RO frequency provides a succinct signature that determines the speed of a die at any desired location [4, 12, 11]. To measure the variations across all locations on a die, it is necessary to cover the entire die with ROs and connect them in a way that facilitates the automated extraction of their signatures. Towards that goal, our RO circuitry (Figure 1) is designed similarly to a scan chain [4, 14], where the ROs are sequentially enabled one at a time for a sample period, during which the RO frequency is measured using a frequency counter and stored in a memory subsystem. Enabling one RO at a time ensures minimal current consumption, which reduces the runtime variations on the power supply network that can introduce noise in the measurements. Sequentially chaining all the ROs also reduces the bandwidth needed to transfer the signatures of all ROs to the analysis circuitry.
A block diagram of our overall extraction circuitry is given in Figure 1. The circuitry consists of $n_1 \times n_2$ ROs, where each RO occupies one of the numbered tiles and connects to its subsequent neighboring RO in the chain via two signals: the scan-enable signal, which turns on one RO at a time, and the scan-output signal, which carries the oscillatory output of each RO tile down the scan chain and ultimately to the frequency counter. The scan clock signal advances the enabled RO one tile at a time. The frequency counter counts the number of pulses it receives from the enabled RO tile for a sample period that is synchronized with the RO outputs via a synchronization unit. The ring oscillator tile, given in Figure 2, forms the backbone of the within-die measurements. Each RO tile uses only local interconnects, and thus any variations in the RO frequency are mainly contributed by physical device variations. Each tile consists of a string of an odd number of inverters and the control circuitry. The flip-flop holds the value of the scan-enable signal that controls the operation of the RO. The scan output and scan enable of the RO are connected to the scan input and scan enable signals of the subsequent RO tile as part of the scan chain.
To acquire accurate measurements and reduce the impact of switching noise on the power-ground network, we reduce the length of the RO scan chain using interleaving. A long scan chain implies that the output oscillation of each RO has to travel down the chain, creating unnecessary switching noise on the power supply network. Thus we interleave the outputs of the RO tiles by dividing the chain into columns as shown in Figure 1, where the scan output of each column is not chained to the scan output of the next column, but rather to the outputs of other odd (or even) RO columns. More complex schemes can be used to interleave the odd and even rows together.
We implement our system in ten sample chips at 90nm technology. Figure 3 visually shows the within-die process variations for four chips, where it is clear that the variation trends exhibit systematic and random components that are unique to each chip. Our goal in the subsequent sections is to develop a statistical analysis technique that can accurately model these trends.
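The counter-based readout described above can be illustrated with a minimal sketch. The function name `read_scan_chain`, the toy frequencies, and the sample period are all hypothetical; the only assumption about the hardware is that the counter registers whole pulses during one sample period, so the frequency estimate is the count divided by the period.

```python
def read_scan_chain(ro_frequencies_hz, sample_period_s):
    """Estimate each RO frequency from the pulse count seen during one sample period."""
    estimates = []
    for true_freq in ro_frequencies_hz:              # the scan clock enables one tile at a time
        pulses = int(true_freq * sample_period_s)    # the counter registers whole pulses only
        estimates.append(pulses / sample_period_s)   # frequency = count / sample period
    return estimates

# Toy values only (real RO frequencies are far higher): three tiles, 0.5 s sample period.
estimates = read_scan_chain([101.0, 99.0, 100.4], 0.5)
```

Under this simple model the resolution of each estimate is one pulse per sample period, so a longer sample period gives a finer frequency reading at the cost of a longer scan.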
IV. MODELING WITH GAUSSIAN RANDOM FIELDS
The die can be considered as consisting of a grid of $n_1$ by $n_2$ locations. A location on the chip will be denoted by $l = (x, y)$, where $x$ is the horizontal coordinate and $y$ is the vertical coordinate. The delay at a location $l$ on chip $i$ is denoted $D_i(l)$. Each $D_i(l)$ will be considered as a random variable with mean $\mu_i$, where $\mu_i$ does not depend on the location $l$ but is dictated by the inter-die variations and varies from chip to chip. The delays $D_i(l)$ and $D_i(l')$ at any two locations $l$ and $l'$ on the same chip $i$ will be correlated. The correlation is typically strong at nearby locations and weak for locations far apart. To fully describe the intra-die variations we assume that the collection of random variables
$$D_i = \{ D_i(l) : l \text{ a location on the } n_1 \times n_2 \text{ grid} \},$$
representing the delays at all different locations on chip $i$, forms a random field. The random field $D_i$ is assumed to be Gaussian with mean $\mu_i$. That is, for any vector of locations $(l_1, \dots, l_k)$, the vector of delays $(D_i(l_1), \dots, D_i(l_k))$ has a multivariate Gaussian distribution with mean $\mu_i$.
To impose further structure the Gaussian random field is assumed to be stationary and isotropic. This implies that the variance $\sigma_i^2$ of the random variable $D_i(l)$ does not depend on the location $l$, and that the covariance between two locations $l$ and $l'$ only depends on the (Euclidean) distance $h = \|l - l'\|$ between $l$ and $l'$. Then the distribution of $D_i$ is completely determined by its covariance function, which can be written as
$$C_i(h) = \mathrm{Cov}\big(D_i(l), D_i(l')\big) = \sigma_i^2 \rho_i(h).$$
The parameter $\sigma_i > 0$ is a scale parameter ($\sigma_i^2$ is the variance of $D_i(l)$) and the function $\rho_i$ is called the correlation function. Note that the mean $\mu_i$, the scale parameter $\sigma_i$, and the correlation function $\rho_i$ all depend on $i$ and may be different for different chips.
In many cases it is convenient to represent the random field $D_i$ as a random vector of length $n = n_1 \times n_2$. The random vector, which also will be denoted $D_i$, is constructed such that the delay $D_i(l)$ at each grid location $l = (x, y)$ occupies one fixed entry of the vector.
To fully specify the model for intra-chip variations it only remains to specify a valid correlation function $\rho_i$. Two models will be considered: the exponential model and the Matérn model, the first actually being a special case of the second.
A. The exponential model
A simple and natural model that allows for correlation between different locations is the exponential model. For this model the correlation function decays exponentially as a function of the distance $h = \|l - l'\|$, i.e.
$$\rho_i(h) = \exp(-\lambda_i h), \qquad \lambda_i > 0.$$
Note that as $\lambda_i$ increases the correlation decays faster as a function of the distance. In this respect $\lambda_i$ can be interpreted as the strength of correlation. Under this model, the random field $D_i$ has three parameters: the mean level $\mu_i$, the scale parameter $\sigma_i$, and the strength of correlation $\lambda_i$.
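As an illustration, the correlation matrix implied by the exponential model on an 18 × 11 grid can be assembled as in the sketch below. It assumes the correlation takes the form exp(-λh) with h the Euclidean distance in tile units, a row-by-row ordering of the locations, and a hypothetical value λ = 0.3; none of these choices come from the measured data.

```python
import math

def grid_locations(n1, n2):
    """Enumerate grid locations l = (x, y) row by row (an assumed ordering)."""
    return [(x, y) for y in range(n2) for x in range(n1)]

def exponential_correlation_matrix(n1, n2, lam):
    """Return the n-by-n matrix with entries exp(-lam * ||l - l'||), n = n1 * n2."""
    locs = grid_locations(n1, n2)
    return [[math.exp(-lam * math.dist(a, b)) for b in locs] for a in locs]

# Hypothetical decay rate; the grid size matches the 18 x 11 lattice of RO tiles.
omega = exponential_correlation_matrix(18, 11, lam=0.3)
```

Each row of `omega` holds the correlations between one tile and every other tile, with ones on the diagonal.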
B. The Matérn model
The exponential model is attractive because of its simplicity but it is not very flexible in capturing a wide range of correlation structures. Another popular and more flexible class of correlation functions is the Matérn class [15]. In contrast to the exponential class, the Matérn correlation function is parameterized by two parameters, $\theta_{1i} > 0$ and $\theta_{2i} > 0$.
Its functional form involves the modified Bessel function of the second kind of order $\alpha$, denoted $K_\alpha(\cdot)$, and the Gamma function $\Gamma(\cdot)$. The parameter $\theta_{1i}$ can be interpreted as the rate of decay of the correlation as a function of distance. Large values of $\theta_{1i}$ lead to faster decay of correlations as distance increases. The parameter $\theta_{2i}$ controls the general shape of the correlation function and in particular the behavior of the correlation at small distances. An attractive feature of the Matérn class is that it contains three important and widely used correlation functions as special cases: the linear model ($\theta_{2i} \to 0$), the exponential model ($\theta_{2i} = 0.5$), and the Gaussian model ($\theta_{2i} \to \infty$). Under the Matérn model the random field $D_i$ has four parameters: the mean level $\mu_i$, the scale parameter $\sigma_i$, and the correlation parameters $\theta_{1i}$ and $\theta_{2i}$.
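For reference, one widely used parameterization of the Matérn correlation function that agrees with the properties listed above is sketched below. The exact scaling of the distance argument is an assumption: other conventions rescale $h$ by a factor that depends on $\theta_{2i}$, which matters for the Gaussian limit.

```latex
\[
\rho_i(h) \;=\; \frac{1}{2^{\theta_{2i}-1}\,\Gamma(\theta_{2i})}\,
\bigl(\theta_{1i} h\bigr)^{\theta_{2i}}\,
K_{\theta_{2i}}\!\bigl(\theta_{1i} h\bigr),
\qquad h > 0, \qquad \rho_i(0) = 1 .
\]
% Check of the exponential special case, using K_{1/2}(z) = \sqrt{\pi/(2z)}\, e^{-z}
% and \Gamma(1/2) = \sqrt{\pi}:
\[
\theta_{2i} = \tfrac12:\qquad
\rho_i(h) \;=\; \frac{\sqrt{2}}{\sqrt{\pi}}\,(\theta_{1i} h)^{1/2}\,
\sqrt{\frac{\pi}{2\,\theta_{1i} h}}\; e^{-\theta_{1i} h}
\;=\; e^{-\theta_{1i} h}.
\]
```

In this form, setting $\theta_{2i} = 0.5$ recovers an exponential correlation with decay rate $\theta_{1i}$, matching the interpretation of $\theta_{1i}$ as a rate of decay.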
V. PARAMETER ESTIMATION USING MAXIMUM LIKELIHOOD
To fit the exponential model or the more general Matérn model to particular within-die measurements of a sample chip, one has to find the model parameters that provide the best fit. In the statistical literature one of the most popular approaches to estimating unknown parameters from observed data is the maximum likelihood estimation method. This entails maximizing the likelihood function (or equivalently the log-likelihood function) over all possible parameter values. The likelihood function is the probability density evaluated at the observed values. Representing the Gaussian random field $D_i$ as a random vector with mean $\mu_i$ and covariance matrix $\sigma_i^2 \Omega_i$, the logarithm of the likelihood function is, up to an additive constant,
$$\ell(\mu_i, \sigma_i^2, \Omega_i) = -\frac{n}{2}\log \sigma_i^2 - \frac{1}{2}\log|\Omega_i| - \frac{1}{2\sigma_i^2}\,(D_i - \mu_i \mathbf{1})^T \Omega_i^{-1} (D_i - \mu_i \mathbf{1}).$$
For the exponential model the correlation matrix $\Omega_i$ is completely determined by the unknown parameter $\lambda_i$. Then the following procedure leads to point estimates for $\mu_i$, $\sigma_i$, and $\lambda_i$. Maximizing with respect to $\mu_i$ and $\sigma_i^2$ yields the estimates
$$\hat{\mu}_i = \frac{\mathbf{1}^T \Omega_i^{-1} D_i}{\mathbf{1}^T \Omega_i^{-1} \mathbf{1}}, \qquad \hat{\sigma}_i^2 = \frac{1}{n}\,(D_i - \hat{\mu}_i \mathbf{1})^T \Omega_i^{-1} (D_i - \hat{\mu}_i \mathbf{1}),$$
with $\mathbf{1} = (1, \dots, 1)^T$. Plugging these expressions back into the log-likelihood function we obtain, again up to an additive constant, the so-called profile log-likelihood
$$\ell_p(\lambda_i) = -\frac{n}{2}\log \hat{\sigma}_i^2 - \frac{1}{2}\log|\Omega_i|,$$
which is to be maximized over $\lambda_i > 0$. This can be done using any standard numerical optimization algorithm. For the Matérn model the only difference is that $\Omega_i$ depends on two parameters $\theta_{1i}$ and $\theta_{2i}$ instead of $\lambda_i$. Then the profile log-likelihood function has to be maximized over all positive values of $\theta_{1i}$ and $\theta_{2i}$. This can also be done using standard numerical optimization. In our implementation, we use the MATLAB function fminsearch, which is an implementation of the Nelder-Mead algorithm for unconstrained nonlinear optimization.
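The profile-likelihood procedure for the exponential model can be sketched in plain Python as follows. This is an illustrative sketch, not the MATLAB implementation: it assumes the correlation exp(-λh), replaces Nelder-Mead with a simple grid search over hypothetical candidate values of λ, and uses made-up data on a 4 × 3 grid. The closed-form estimates of the mean and variance are the standard generalized least-squares expressions for a Gaussian vector with covariance σ²Ω.

```python
import math

def cholesky(m):
    """Lower-triangular A with A A^T = m (m symmetric positive definite)."""
    n = len(m)
    a = [[0.0] * n for _ in range(n)]
    for r in range(n):
        for c in range(r + 1):
            s = m[r][c] - sum(a[r][k] * a[c][k] for k in range(c))
            a[r][c] = math.sqrt(s) if r == c else s / a[c][c]
    return a

def forward_solve(a, b):
    """Solve A z = b for lower-triangular A."""
    z = []
    for r in range(len(b)):
        z.append((b[r] - sum(a[r][k] * z[k] for k in range(r))) / a[r][r])
    return z

def profile_loglik(lam, locs, d):
    """Return (profile log-likelihood up to a constant, mu_hat, sigma2_hat) for a given lam."""
    n = len(d)
    omega = [[math.exp(-lam * math.dist(p, q)) for q in locs] for p in locs]
    a = cholesky(omega)
    u = forward_solve(a, [1.0] * n)      # A^{-1} 1, so u.u = 1^T Omega^{-1} 1
    v = forward_solve(a, d)              # A^{-1} d, so u.v = 1^T Omega^{-1} d
    mu = sum(x * y for x, y in zip(u, v)) / sum(x * x for x in u)
    sigma2 = sum((y - mu * x) ** 2 for x, y in zip(u, v)) / n
    logdet = 2.0 * sum(math.log(a[r][r]) for r in range(n))   # log|Omega|
    return -0.5 * n * math.log(sigma2) - 0.5 * logdet, mu, sigma2

def fit_exponential(locs, d, candidates):
    """Grid search over lam; a simple stand-in for a Nelder-Mead optimizer."""
    return max(candidates, key=lambda lam: profile_loglik(lam, locs, d)[0])

# Made-up normalized delays on a hypothetical 4 x 3 grid.
locs = [(x, y) for y in range(3) for x in range(4)]
d = [1.02, 1.01, 0.99, 0.98, 1.03, 1.02, 1.00, 0.99, 1.04, 1.02, 1.01, 1.00]
lam_hat = fit_exponential(locs, d, [0.05 * k for k in range(1, 61)])
_, mu_hat, sigma2_hat = profile_loglik(lam_hat, locs, d)
```

Working with the Cholesky factor avoids forming the inverse of the correlation matrix explicitly and gives the log-determinant from the diagonal of the factor. For the Matérn model the same structure would apply with a two-parameter search.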
VI. MODEL VERIFICATION
In this section two measures of model verification, or goodness-of-fit tests, are presented. The proposed procedures are intended to evaluate if the suggested model is a believable model for measurement data. The main objective is to evaluate the explanatory power of the model.
A. Verification by residual analysis
An effective measure of goodness-of-fit is to perform a statistical test to investigate if the measured data is consistent with the model or not. To this end we will analyze the residuals of the Gaussian random fields model D i described in Section IV.
We will make use of the representation of $D_i$ as a Gaussian random vector with mean $\mu_i$ and covariance matrix $\sigma_i^2 \Omega_i$. Such a vector can be written as
$$D_i = \mu_i \mathbf{1} + \sigma_i A_i W_i, \qquad (2)$$
where $A_i$ is the Cholesky decomposition of $\Omega_i$ (the matrix $A_i$ such that $A_i A_i^T = \Omega_i$) and $W_i$ is a vector of $n$ independent $N(0, 1)$ random variables. To see this, note that with this representation $D_i$ is a linear combination of independent $N(0, 1)$ variables. Hence, it has a joint Gaussian distribution with mean $\mu_i$ and covariance matrix $\sigma_i^2 A_i A_i^T = \sigma_i^2 \Omega_i$.
Inverting the relation (2) yields the residuals $W_i$ as
$$W_i = \frac{1}{\sigma_i} A_i^{-1} (D_i - \mu_i \mathbf{1}).$$
If the model is adequate, the residuals computed from the measured data should behave like independent standard normal variables. This is tested with the Kolmogorov-Smirnov test, whose test statistic is
$$KS = \sup_x \bigl| \hat{F}_i(x) - \Phi(x) \bigr|,$$
where $\hat{F}_i$ is the empirical distribution function of $W_i = (W_{i1}, \dots, W_{in})$ and $\Phi(x)$ is the standard normal distribution function. If the null hypothesis, that the variables in $W_i$ are independent standard normal, is rejected then there is evidence that the variables in $W_i$ are not standard normal. Otherwise, if the null hypothesis is not rejected, the data is consistent with the standard normal assumption. This provides evidence in favor of the model. The hypothesis test will be performed at the 1% significance level, implying that there is only a 1% chance the null hypothesis will be (incorrectly) rejected when the null hypothesis is true.
In addition to the result of the hypothesis test it is also desirable to report the P-value. The P-value can be interpreted as the amount of support for the null hypothesis. It is computed as the probability (under the assumption that the null hypothesis is true) that the test statistic KS would take a value equal to or larger than the observed test statistic. It is intimately connected to the result of the hypothesis test, as the null hypothesis is rejected if the P-value is below 1%.
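A minimal sketch of the Kolmogorov-Smirnov check on a vector of residuals is given below. It assumes the usual one-sample statistic (the largest gap between the empirical distribution function and the standard normal distribution function) and, for the decision at the 1% level, the large-sample critical value 1.628/√n for a fully specified null distribution; the function names are hypothetical, and the handling of estimated parameters is not addressed here.

```python
import math

def std_normal_cdf(x):
    """Standard normal distribution function Phi(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def ks_statistic(residuals):
    """sup_x |F_n(x) - Phi(x)| for the empirical distribution F_n of the residuals."""
    w = sorted(residuals)
    n = len(w)
    stat = 0.0
    for k, x in enumerate(w):
        phi = std_normal_cdf(x)
        # F_n jumps from k/n to (k+1)/n at x, so the supremum is attained at a jump.
        stat = max(stat, (k + 1) / n - phi, phi - k / n)
    return stat

def reject_at_1pct(stat, n):
    """Large-sample decision rule (asymptotic critical value, an approximation)."""
    return stat > 1.628 / math.sqrt(n)
```

For a chip with 198 tiles, the approximate threshold is about 0.116, so residuals whose empirical distribution stays within that distance of the standard normal curve would not be rejected under this rule.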
B. Verification using variograms
One of the main features of the suggested model is to allow for spatial correlation between different locations on the chip. In this respect it is desirable to investigate if the suggested correlation structure captures the main features of the data. An exploratory analysis of the correlation structure can be performed by studying the variogram and the sample variogram. The variogram is a popular tool in spatial statistics, in particular in geostatistical analysis. It has also been suggested in connection with spatial variations [8]. For a stationary isotropic random field $D_i$ the variogram is defined by
$$\gamma_i(h) = \frac{1}{2}\, E\bigl[ (D_i(l) - D_i(l'))^2 \bigr], \qquad \|l - l'\| = h.$$
The sample version of the variogram, called the sample variogram, is based on observations of a random field. It is computed by
$$\hat{\gamma}_i(h) = \frac{1}{2 N_h} \sum_{\|l - l'\| = h} \bigl( D_i(l) - D_i(l') \bigr)^2,$$
where $N_h$ is the number of pairs $l, l'$ such that $\|l - l'\| = h$. If the model accurately captures the correlation structure, one expects the sample variogram for the measurement data to be similar to the theoretical variogram. However, one cannot expect a perfect correspondence. Because of the intrinsic randomness, the sample variogram will differ from the theoretical one.
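The sample variogram on a regular grid can be computed as in the following sketch. It assumes the convention with the factor 1/(2 N_h), counts each unordered pair of locations once, and measures distance in tile units; the function name and the tiny example field are hypothetical.

```python
import math
from collections import defaultdict

def sample_variogram(field):
    """field[y][x] is the measurement at grid location (x, y).
    Returns {h: gamma_hat(h)} with gamma_hat(h) = sum of squared differences / (2 N_h)."""
    locs = [(x, y) for y in range(len(field)) for x in range(len(field[0]))]
    sums = defaultdict(float)
    counts = defaultdict(int)
    for i, (x1, y1) in enumerate(locs):
        for (x2, y2) in locs[i + 1:]:
            h2 = (x1 - x2) ** 2 + (y1 - y2) ** 2   # squared distance: an exact integer key
            sums[h2] += (field[y1][x1] - field[y2][x2]) ** 2
            counts[h2] += 1
    return {math.sqrt(h2): sums[h2] / (2 * counts[h2]) for h2 in sorted(sums)}

# Tiny made-up 2 x 2 field.
variogram = sample_variogram([[0.0, 1.0], [1.0, 2.0]])
```

Grouping pairs by the squared integer distance avoids floating-point mismatches when deciding which pairs share the same lag h.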
VII. SYNTHETIC GENERATION OF WITHIN-DIE VARIATIONS
A very attractive feature of Gaussian random fields is that it is easy to generate pseudo-random samples from them on a computer. That is, given the model and values of the model parameters it is easy to generate a large number of synthetic chips. In this section we describe a simple algorithm to generate samples from Gaussian fields. We present the algorithm for the isotropic field $D_i$ of Section IV, but it is elementary to generalize to more complicated structures in which the mean level and the scale parameter depend on the location on the chip, as well as to more complex correlation functions.
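Suppose the Gaussian random field $D_i$ is represented as a Gaussian random vector of length $n = n_1 \times n_2$ with mean $\mu_i$ and covariance matrix $\sigma_i^2 \Omega_i$.
To generate a sample of $D_i$ the following algorithm is implemented.
• Compute the Cholesky decomposition $A_i$ of $\Omega_i$, so that $A_i A_i^T = \Omega_i$.
• Generate a column vector $W_i$ of $n$ independent $N(0, 1)$ random variables.
• Set $D_i = \mu_i \mathbf{1} + \sigma_i A_i W_i$.
Then $D_i$ has the representation (2) which shows that it has the desired distribution.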
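The generation procedure can be sketched in plain Python as below. The sketch assumes an exponential correlation exp(-λh) as a stand-in for whichever correlation function has been fitted, a row-by-row ordering of the grid locations, and hypothetical parameter values; the function `synthetic_chip` is illustrative and not part of the original system.

```python
import math
import random

def cholesky(m):
    """Lower-triangular A with A A^T = m (m symmetric positive definite)."""
    n = len(m)
    a = [[0.0] * n for _ in range(n)]
    for r in range(n):
        for c in range(r + 1):
            s = m[r][c] - sum(a[r][k] * a[c][k] for k in range(c))
            a[r][c] = math.sqrt(s) if r == c else s / a[c][c]
    return a

def synthetic_chip(n1, n2, mu, sigma, lam, rng):
    """Draw one synthetic n2-row by n1-column variation map D = mu*1 + sigma*A*W."""
    locs = [(x, y) for y in range(n2) for x in range(n1)]
    omega = [[math.exp(-lam * math.dist(p, q)) for q in locs] for p in locs]
    a = cholesky(omega)                                   # factor the correlation matrix
    w = [rng.gauss(0.0, 1.0) for _ in locs]               # independent N(0, 1) draws
    d = [mu + sigma * sum(a[r][k] * w[k] for k in range(r + 1))
         for r in range(len(locs))]                       # correlated Gaussian vector
    return [d[y * n1:(y + 1) * n1] for y in range(n2)]    # reshape back to the grid

# Hypothetical parameters; a seeded generator makes the pattern reproducible.
chip = synthetic_chip(18, 11, mu=0.0, sigma=1.0, lam=0.3, rng=random.Random(0))
```

Calling `synthetic_chip` repeatedly with different seeds yields as many independent synthetic patterns as needed, and the Cholesky factor could be cached when many chips share the same parameters.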
VIII. EXPERIMENTAL RESULTS
To evaluate the appropriateness and accuracy of the suggested statistical models, we first design the proposed vMeter system to extract within-die process variations. We then implement the system in ten sample chips (Altera's EP2C35 devices) manufactured in 90nm technology, with all chips belonging to the same speed bin (C6). Each chip holds 198 ring oscillator tiles organized as an 18 × 11 lattice, i.e., $n_1 = 18$ and $n_2 = 11$. Each tile is composed of 135 inverters that are organized in 3 × 3 logic blocks. As described earlier, Figure 3 gives the within-die variations from four of the ten sample chips. With the extracted data in hand, we carry out the following procedure based on the previous discussions.
• Depending on the modeling flexibility required, choose the appropriate model: exponential (Subsection IV.A) or Matérn (Subsection IV.B).
• Calculate the model parameters using maximum-likelihood estimation (Section V) and the provided extraction data.
• Verify the accuracy of the model using the Kolmogorov-Smirnov test (Subsection VI.A) and/or using the sum of squared errors from data variograms (Subsection VI.B).
A. Statistical analysis of measurement data
The first thing to explore is whether the Gaussian random field model is reasonable for describing the measured intra-chip variations. We certainly require that the model is able to capture the essential structure of the spatial correlations. For this purpose, we first plot the variograms calculated for the silicon measurements of the ten chips in Figure 4.
• The 100 random variograms generated from the model, whether using the exponential or the Matérn model, often deviate from the theoretical variogram (displayed by a thick black line). In particular, we observe that the Matérn model is flexible enough to describe a wider range of spatial correlations than the exponential model.
• All of our silicon chips have variograms that are consistent with the Matérn model with 'roughly' the suggested parameter values. That is, the 10 variograms computed from the silicon chips appear as a subset from the sample space variograms generated from the Matérn model.
B. Fitting silicon data to statistical models
The maximum likelihood estimation method described in Section V is used to estimate the parameters $\mu$, $\sigma$, and $\lambda$ of the exponential model for each of the ten chips. To evaluate the goodness of fit, the Kolmogorov-Smirnov test of the residuals is performed to check whether they appear to have a standard normal distribution. The results are summarized in Table II. For the Matérn model the corresponding maximum likelihood estimation and goodness-of-fit tests are also performed. The results are summarized in Table II and the resulting model variograms and sample variograms are illustrated in Figure 5. It is noteworthy that none of the Kolmogorov-Smirnov tests rejects the null hypothesis at the 1% level. This indicates a good fit. We also note that the mean value, as dictated by the inter-die variations, varies from chip to chip as expected. The maximum difference due to inter-die variations is
• From Figure 5, we observe that the Matérn model captures more of the correlation structure in comparison to the exponential model. For example, in chip 4, where the variogram deviates considerably from the exponential model, the Matérn model gives a better fit.
Besides generating synthetic variograms as in Figure 4, our method is capable of generating complete within-die variation patterns as outlined in Section VII. In Figure 6, we show four synthetic within-die variation trends ($\mu_i = 0$ and $\sigma_i = 1$). Besides the thorough statistical validation provided in the previous two subsections, we visually compare the synthetic trends in Figure 6 to the actual silicon trends of Figure 3. We find that our synthetic trends are visually similar to the actual trends. Our method for generating synthetic data can be of great value for researchers who would like to drive their process variation-based research with realistic within-die process variation models.
IX. SUMMARY AND CONCLUSIONS
In this paper we have developed a complete treatment of the subject of within-die process variations. We have designed and implemented a process variation extraction system in 90nm chips. To find a model that captures the extracted data, we have proposed a number of statistical models that accurately capture the correlation between the different spatial locations on the test chips. We have also described how to calculate the required parameters for our proposed models using maximum likelihood estimation, and thoroughly verified the correctness of our models and parameters. We have also proposed a procedure to generate synthetic variability trends that mimic realistic silicon chips. The procedure can be utilized by other researchers to generate accurate synthetic within-die variation trends for their experiments.
