Abstract
Introduction
Process technology and environment-induced variability of gates and wires in VLSI circuits makes timing analysis of such circuits a challenging task [1]. More precisely, advanced analysis tools must be developed that are capable of verifying changes in the circuit timing which stem from various sources of variation [2]. In block-based statistical timing analysis (σTA), every timing quantity of interest (e.g., delay and slew, arrival time and required arrival time) is represented as a function of global sources of variation (denoted by X_i) and independent random sources of variation (denoted by S_i) in the canonical first-order (CFO) form. The advantages of such a formulation are that a) it can capture all correlations and b) it can produce delay sensitivities due to changes in various environmental and process-related parameters [2]. Sources of variation have often been assumed to be Gaussian, which simplifies block-based σTA. However, it has recently been reported that certain process parameters exhibit non-Gaussian probability distributions [3].
Block-based σTA breaks its analysis into two parts: 1) variational interconnect timing analysis [4] [5] and 2) variational gate timing analysis. Unfortunately, block-based σTA is lacking in variation-aware gate timing analysis. The authors in [7] propose a modeling technique for gate delay variability considering multiple input switching. In [8], a model for calculating statistical gate delay variation caused by intra-chip and inter-chip variability is presented. These works do not provide an accurate means of analyzing the gate propagation delay and output slew as a function of the variational input transition, the variation-aware gate timing library, and the variational gate load. In this paper, a new framework is proposed for determining variational gate timing behavior. This is achieved by performing the following steps:
1. Given the variational resistive-capacitive load (where all resistances and capacitances are represented in the CFO form), an efficient and accurate algorithm is presented to calculate the variation-aware RC-π load. To perform the analysis, we calculate the variation-aware admittance moments (cf. section 3); as a result, the resistance and capacitances in the RC-π load can be written in the CFO form.
2. Based on the statistical RC-π load obtained in step 1, we calculate the variation-aware effective capacitance in the CFO form. To this end, a new approach for effective capacitance calculation in static timing analysis (STA) is proposed (cf. section 4).
We point out that although the remainder of this paper mainly focuses on CFO random variables to represent process and environmental sources of variation as well as the performance quantities of interest, the work itself is not limited to the first-order approximation of these quantities. In fact, it is straightforward to extend the approach to more complex (e.g., second-order) forms, for either Gaussian or non-Gaussian parameter variations.
The remainder of this paper is organized as follows. In section 2, we review the background of block-based σTA. We also show how to convert a quantity, which itself is a function of global and independent sources of variation, into the canonical first-order (CFO) form. The variation-aware RC-π calculation is presented in section 3. Section 4 explains the statistical gate timing analysis for the variational input rise time, variation-aware gate timing library, and variational RC-π load. In that section, a new statistical effective capacitance calculation is proposed and used for gate timing analysis, which is the key contribution of this paper. Section 5 presents experimental results. Finally, conclusions are given in section 6. We use the notation shown in Table 1 throughout the paper.
Background
As mentioned before, the sources of variation may exhibit non-Gaussian distributions. Therefore, in general, in addition to calculating the mean and variance of the electrical and timing parameters, we need to calculate the skewness of their distributions, i.e., we use the first three moments of the parameter variations.
Canonical first-order (CFO) model for timing and electrical parameters
In a block-based statistical timing analysis tool, a first-order variational model is employed for all timing quantities such as the gate and wire delays, arrival times, required arrival times, slacks, and slews, i.e., all timing quantities are expressed in the CFO form as:
\[ a = a_0 + \sum_{i=1}^{m} a_i\,\Delta X_i + a_{m+1}\,\Delta S_a \]
where a_0 is the nominal value, a_i is the sensitivity to the i-th of the m global sources of variation ΔX_i, and a_{m+1} is the sensitivity to the independent random source of variation ΔS_a.
Variation in the physical dimensions of the wire changes its resistance and capacitance, thereby causing the gate delay and slew as well as the wire delay and slew to vary accordingly [9]. Therefore, we need to capture the effect of geometric variations on the electrical parameters. For instance, resistance and capacitance in the CFO form are calculated as follows:
, which follows from the fact that the form of function f is independent of its input type (deterministic or variational).
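As an illustration of how the first three moments follow from a CFO quantity, here is a minimal sketch. It assumes that the global sources are mutually independent with zero mean and unit variance, so that variances and third central moments simply add. The class name `CFO` and its fields are hypothetical and are not taken from any timing tool.

```python
from dataclasses import dataclass

@dataclass
class CFO:
    """a = nominal + sum_i glob[i]*dX_i + rand*dS (canonical first-order form)."""
    nominal: float           # a_0
    glob: list               # a_1..a_m, sensitivities to the global sources dX_i
    rand: float              # a_{m+1}, sensitivity to the independent source dS
    rand_skew: float = 0.0   # skewness of dS

    def mean(self):
        # Every source has zero mean, so the mean is the nominal value.
        return self.nominal

    def variance(self):
        # Unit-variance, mutually independent sources (assumption of this sketch).
        return sum(g * g for g in self.glob) + self.rand ** 2

    def skewness(self, glob_skew):
        # Third central moments of independent terms add; glob_skew[i] is the
        # skewness of dX_i.
        third = sum(g ** 3 * k for g, k in zip(self.glob, glob_skew))
        third += self.rand ** 3 * self.rand_skew
        return third / self.variance() ** 1.5

# Example: a resistance with two global sensitivities and no independent term.
r = CFO(nominal=100.0, glob=[3.0, 4.0], rand=0.0)
```

For the example above, `r.variance()` is 25 and `r.skewness([0.5, 0.5])` is 45.5/125 = 0.364.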
Converting a variational function into CFO form
It is important to represent timing and electrical quantities in the CFO form. This in turn enables one to propagate first-order sensitivities to different sources of variation through the timing graph [2] [9]. In addition, it makes statistical computations efficient and practical and provides timing diagnostics at a very small cost in run time. The remaining question is how to convert a quantity of interest (which itself is a function of different CFO variables) into the CFO form.
The following subsection presents a method to answer the above question. We use an example to show the procedure. The problem we address is how to convert the gate output transition time into the CFO form. However, this method can be easily applied to any other quantity of interest.
2.2.1 Gate timing analysis for lumped capacitive load
Problem Statement I: Given is a variational CMOS driver whose input rise time, t_in, is in the CFO form and which drives an output capacitive load that is also in the CFO form. The distribution characteristics of all global and independent sources of variation (µ = 0, σ^2 = 1, κ) are given. The objective is to calculate the output transition time, t_r, in the CFO form:
\[ t_r = t_{r,0} + \sum_{i=1}^{m} t_{r,i}\,\Delta X_i + t_{r,m+1}\,\Delta S_{t_r} \]
i.e., calculate the nominal value (t_{r,0}) and the sensitivity coefficients (t_{r,i} and t_{r,m+1}) as well as the skewness of the distribution of ΔS_{t_r}.
The gate output transition time is a function of the input transition time, the logic gate characteristics (e.g., the W/L ratio, threshold voltage of transistors, V_dd, and temperature), and the output load. In commercial ASIC cell libraries, it is possible to characterize various output transition times (e.g., 10%, 50%, and 90%) as a function of the above variables, i.e.,
\[ t_r = TF(z, t_{in}, c_l) \]
where t_r is the output transition time and TF is the corresponding output transition time function. z captures the gate characteristics and environmental factors, t_in is the input transition time, and c_l is the output capacitive load. Based on the Invariant Functional Form Property, the form of function TF is independent of its input type (deterministic or variational). Hence, we extend the above equation to the variational case. In block-based σTA, t_in, c_l, and every parameter z are given in the CFO form as a function of m global sources of variation and exactly one independent random source of variation. Therefore, t_r itself is a complex (non-CFO) random variable. To represent t_r in the CFO form, we replace t_in, c_l, and z with their corresponding CFO models and collect terms. By differentiating with respect to the global and independent random sources of variation, t_r as a function of m global sources of variation and p independent random sources of variation can be approximated as:
Eqn. (2) can be re-written as:
By using Lemma 1:
In Lemma 2, we present how to calculate addition, multiplication, and division of two CFO forms in a new CFO form. 
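The conversion procedure of this subsection can be sketched numerically as follows. This is a hedged illustration, not the formulation of Lemma 1 or Lemma 2: it assumes that the partial derivatives of the function are estimated by central differences at the nominal point, that the global sensitivities follow from the chain rule, and that the p independent terms are merged into a single independent term by matching variance and third central moment. The function name `to_cfo` and the tuple layout are hypothetical.

```python
from math import sqrt

def to_cfo(func, inputs, rel_step=1e-4):
    """Project func(inputs) onto the CFO form.

    inputs: list of (nominal, glob, rand, rand_skew) tuples, one per argument
            of func; every glob list has the same length m.
    Returns (nominal, glob, rand, rand_skew) of the result.
    """
    noms = [x[0] for x in inputs]
    m = len(inputs[0][1])
    f0 = func(*noms)

    # Partial derivative of func w.r.t. each argument at the nominal point.
    grads = []
    for j, v in enumerate(noms):
        h = rel_step * (abs(v) or 1.0)
        up = list(noms); up[j] = v + h
        dn = list(noms); dn[j] = v - h
        grads.append((func(*up) - func(*dn)) / (2.0 * h))

    # Chain rule: sensitivity of the result to each global source dX_i.
    glob = [sum(g * x[1][i] for g, x in zip(grads, inputs)) for i in range(m)]

    # Merge the independent terms into one by matching variance and third
    # central moment (an assumption of this sketch).
    b = [g * x[2] for g, x in zip(grads, inputs)]
    var = sum(t * t for t in b)
    third = sum(t ** 3 * x[3] for t, x in zip(b, inputs))
    skew = third / var ** 1.5 if var > 0.0 else 0.0
    return f0, glob, sqrt(var), skew

# Hypothetical linear transition-time function (ps, fF), for illustration only.
tf = lambda t_in, c_l: 0.5 * t_in + 0.2 * c_l
t_in = (100.0, [5.0, 0.0], 3.0, 0.4)
c_l = (200.0, [0.0, 10.0], 20.0, 0.4)
t_r = to_cfo(tf, [t_in, c_l])
```

In practice `func` would be the library look-up `TF`; the linear `tf` above only serves to make the result easy to check by hand (nominal 90, global sensitivities 2.5 and 2.0).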
RC-π Load Calculation in the CFO Form
In VDSM technologies, one cannot neglect the effect of the interconnect resistance of the load on the gate delay and output transition time. In STA, an adequate approximation of an n-th order load seen by the gate (i.e., a load with n distributed capacitances to ground) is obtained by replacing the load with a second-order RC-π model [10]. Equating the first, second, and third moments of the admittance of the real load with the first, second, and third moments of the RC-π load, one can compute c_n, r_π, and c_f as [11]:
where Y_{k,in} is the k-th moment of the admittance of the real load. In σTA, it is required to consider the effect of variability of the load on the gate timing analysis, as detailed below.
Problem Statement II: Given is an RC network representation of the load of a logic gate in a design as exemplified in Figure 1(a), where each r and c is in the CFO form. The distribution characteristics of all global and independent sources of variation (µ = 0, σ^2 = 1, κ) are given. The objective is to calculate an equivalent variational RC-π load (i.e., c_n, r_π, and c_f of Figure 1(b) are in the CFO form) whose admittance matches the admittance of the real load in the frequency range of interest.
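For the nominal (deterministic) case, the three-moment match can be sketched as below. The sketch assumes the admittance expansion Y(s) = y1·s + y2·s² + y3·s³ + ..., and it assumes that c_n is the capacitor at the driver (near) end of the π network and c_f the one at the far end. The function name is hypothetical.

```python
def pi_model_from_moments(y1, y2, y3):
    """Return (c_n, r_pi, c_f) of the pi load matching moments y1, y2, y3.

    For a pi network, Y(s) = s*c_n + s*c_f / (1 + s*r_pi*c_f), which gives
    y1 = c_n + c_f, y2 = -r_pi*c_f**2, y3 = r_pi**2 * c_f**3.
    """
    c_f = y2 * y2 / y3           # far-end capacitance
    r_pi = -(y3 * y3) / y2 ** 3  # series resistance (y2 is negative)
    c_n = y1 - c_f               # near-end capacitance
    return c_n, r_pi, c_f

# Round trip on a known pi network: 100 fF, 200 ohm, 300 fF.
c_n0, r0, c_f0 = 100e-15, 200.0, 300e-15
moments = (c_n0 + c_f0, -r0 * c_f0 ** 2, r0 ** 2 * c_f0 ** 3)
c_n, r_pi, c_f = pi_model_from_moments(*moments)
```

In the variational setting, the same expressions would be evaluated on CFO moments and the result projected back onto the CFO form.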
c_n, r_π, and c_f are functions of the admittance moments, as seen from Eqn. (3). Hence, by calculating the variational admittance moments, we can calculate the CFO parameters of the RC-π load using the technique explained in section 2.2, i.e., by differentiating the expressions in Eqn. (3) with respect to the sources of variation. However, as will be shown next, the variational admittance moments are calculated recursively. Each recursion step produces a complex (non-CFO) random variable that feeds into the next step, which may increase the complexity of the calculations.
We therefore represent the admittance moments in the CFO form throughout the recursion. This keeps the complexity of the moment expressions under control as the recursion proceeds. The following shows how to calculate the input admittance moments of the real load in the CFO form. Consider the RCY segment shown in
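A deterministic version of the moment recursion can be sketched as follows. It assumes a segment made of a series resistance r followed by a node with shunt capacitance c and a downstream admittance with moments (y1, y2, y3), and it covers only an unbranched RC ladder; at a branching node the downstream moments of the branches would be added first. Function names are hypothetical, and the CFO projection applied at each step in the variational case is not shown.

```python
def upstream_moments(downstream, r, c):
    """Moments seen looking into series r, then shunt c in parallel with downstream.

    With Y' = Y_down + s*c and Y_up = Y' / (1 + r*Y'), expanding to third
    order in s gives the expressions below.
    """
    y1, y2, y3 = downstream
    a = y1 + c                            # first moment after adding shunt c
    u2 = y2 - r * a * a
    u3 = y3 - 2.0 * r * a * y2 + r * r * a ** 3
    return a, u2, u3

def ladder_moments(segments):
    """segments: list of (r, c) ordered from the driver to the far end."""
    y = (0.0, 0.0, 0.0)                   # open circuit beyond the last node
    for r, c in reversed(segments):
        y = upstream_moments(y, r, c)
    return y

single = ladder_moments([(200.0, 300e-15)])
double = ladder_moments([(100.0, 100e-15), (200.0, 300e-15)])
```

For a single segment the result reduces to (c, -r·c², r²·c³), which is the expansion of s·c / (1 + s·r·c).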
Gate Timing Analysis for the RC-π Load in Block-Based σTA
Problem Statement III: Given is a variational CMOS driver whose input rise time, t_in, is in the CFO form and which drives a variational RC-π load. The resistance and capacitances of this load are also in the CFO form. The distribution characteristics of all global and independent sources of variation (µ = 0, σ^2 = 1, κ) are given.
A new approach for effective capacitance calculation in static timing analysis
By definition, the effective capacitance is a pure capacitance that replaces an RC-π load and has the property that it gives the most accurate result from a timing model that is characterized with lumped capacitance. Typically, the effective capacitance stores the same amount of charge as the RC-π load until a certain point of the output voltage transition [11] [12] [13] (e.g., the 50% point of the output transition.) Figure 3 (a) depicts a typical CMOS driver with its input waveform and RC-π load. The output voltage waveform may be modeled as a weighted linear sum of ramp and exponential waveforms as shown in Figure 3(b) . We therefore assume that the actual c eff can be obtained as a weighted average of that obtained for the ramp output waveform and that obtained for the exponential output waveform.
In the following, we calculate c eff for ramp and exponential waveforms of the gate output voltage. 
Proof: Omitted for brevity.
Eqn. (12) is the iterative c_eff calculation under the nominal conditions of the circuit. Hence, c_eff,0 can be evaluated by using the conventional effective capacitance calculation [12] [13].
t_in,i, c_n,i, r_π,i, and c_f,i are given (cf. Eqns. (6)-(9)). To evaluate Eqns. (13) and (14), we must calculate the derivatives of function F (given in Eqn. (5)) with respect to t_r, c_n, r_π, and c_f, and evaluate these derivatives at the nominal values of the circuit parameters, i.e., with all sources of variation set to zero: (∂F/∂t_r)_nom, (∂F/∂c_n)_nom, (∂F/∂r_π)_nom, and (∂F/∂c_f)_nom. These terms are easy to evaluate. For the remaining terms, we need to calculate the derivatives of the output transition time (t_r) with respect to t_in and c_eff and evaluate them under the nominal condition of the circuit, i.e., (∂t_r/∂t_in)_nom and (∂t_r/∂c_eff)_nom. We propose two different solutions:
1. Updating the gate library look-up table and utilizing the additional data during σTA: The revised tables provide not only the timing quantity for each combination of t_in and c_l, but also the derivatives of the timing quantity (t_r) with respect to t_in and c_l for each combination of t_in and c_l.
nom only once and in a constant time. Therefore, the complexity of our method is dominated by the iterative effective capacitance calculation under the nominal conditions.
Using the existing gate library look-up
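The role of the nominal derivatives discussed above can be illustrated with a small sketch. It assumes that c_eff is defined implicitly by a fixed point c_eff = F(t_r(t_in, c_eff), c_n, r_π, c_f) and obtains its sensitivity to one source of variation by implicit differentiation, with all partial derivatives estimated by central differences at the nominal point. The functions `toy_F` and `toy_TR` are hypothetical stand-ins (units: ps, fF, kΩ) and are not the expressions of Eqn. (5) or of a real gate library.

```python
def solve_ceff(F, TR, t_in, c_n, r_pi, c_f, iters=200):
    """Fixed-point iteration c = F(TR(t_in, c), c_n, r_pi, c_f)."""
    c = c_n + c_f                      # start from the total capacitance
    for _ in range(iters):
        c = F(TR(t_in, c), c_n, r_pi, c_f)
    return c

def central_diff(f, x, rel=1e-5):
    h = rel * (abs(x) or 1.0)
    return (f(x + h) - f(x - h)) / (2.0 * h)

def ceff_sensitivity(F, TR, nom, sens):
    """nom = (t_in, c_n, r_pi, c_f); sens = their sensitivities to one source."""
    t_in, c_n, r_pi, c_f = nom
    c0 = solve_ceff(F, TR, *nom)
    t0 = TR(t_in, c0)
    dF_dtr = central_diff(lambda v: F(v, c_n, r_pi, c_f), t0)
    dF_dcn = central_diff(lambda v: F(t0, v, r_pi, c_f), c_n)
    dF_dr = central_diff(lambda v: F(t0, c_n, v, c_f), r_pi)
    dF_dcf = central_diff(lambda v: F(t0, c_n, r_pi, v), c_f)
    dtr_dtin = central_diff(lambda v: TR(v, c0), t_in)
    dtr_dceff = central_diff(lambda v: TR(t_in, v), c0)
    num = (dF_dtr * dtr_dtin * sens[0] + dF_dcn * sens[1]
           + dF_dr * sens[2] + dF_dcf * sens[3])
    # c_eff appears on both sides of the fixed point, hence the denominator.
    return c0, num / (1.0 - dF_dtr * dtr_dceff)

def toy_F(t_r, c_n, r_pi, c_f):
    # Hypothetical shielding model: slow edges see the full far-end capacitance.
    return c_n + c_f * t_r / (t_r + r_pi * c_f)

def toy_TR(t_in, c_eff):
    # Hypothetical linear transition-time model.
    return 0.5 * t_in + 0.2 * c_eff

nom = (100.0, 100.0, 0.5, 300.0)
sens = (10.0, 5.0, 0.02, 15.0)
c_eff0, c_eff_i = ceff_sensitivity(toy_F, toy_TR, nom, sens)
```

The derivatives are evaluated once at the nominal point, so the cost of the sketch is dominated by the nominal fixed-point iteration.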

Experimental Results
Our experiments use 90nm CMOS process parameters to model gates and interconnect parasitics. We assumed two different configurations for the experimental setup. The first one consists of two inverters connected in series whereas the second one is a CMOS inverter followed by a 2-input NAND gate. For both configurations, we apply a ramp input to the first inverter while its nominal value is chosen from the set (t in ) nom ={10ps,80ps,150ps,220ps,300ps}. For the first configuration, size of the first inverter is fixed at W p /W n =30/15µm whereas size of the second inverter is chosen to be one of W p /W n ={20/10, 50/25, 70/35, 100/50}µm. For the second configuration, size of the first inverter is again fixed at W p /W n =30/15µm whereas this time the size of the succeeding 2-input NAND gate is chosen to be one of W p /W n ={40/40, 50/50, 100/100}µm.
To characterize the timing behavior of the gate, a look-up-table-based library is employed which represents the gate delay and output transition time as a function of input rise time, output capacitive load, V_dd, and temperature. We apply different loading scenarios for the second-stage gate, as explained in the following subsections: a purely capacitive load and a general RC load. We have also considered four different global sources of variation (V_dd, temperature, Metal layer 1 width, and ILD) and one independent random source of variation for each electrical parameter (i.e., r and c) and timing parameter (for instance t_in) in the circuit. The sensitivity of each parameter to the sources of variation is chosen randomly, while the total σ variation of each parameter is chosen to be 10% or 15% of its nominal value. We also assumed that the sources of variation are skewed with different skewness values, as explained in each subsection. The mean, variance, and skewness of the effective capacitance, the gate 50% propagation delay, and the 10%-90% output transition time (slew) are calculated using the approaches presented in this paper.
To compare the results, we ran Monte Carlo simulation with 10^4 samples on each test scenario and derived the mean, variance, and skewness of the effective capacitance, the gate 50% propagation delay, and the 10%-90% output transition time. We report the average percentage errors in the mean, variance, and skewness of these three quantities between the Monte Carlo results and the results calculated with the statistical gate timing analysis approach.
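The kind of comparison described above can be sketched for a single CFO quantity as follows. The sketch assumes that each source of variation is drawn from a shifted and scaled gamma distribution, chosen here only because it gives zero mean, unit variance, and a prescribed positive skewness; the distribution used in the experiments is not specified. All names are hypothetical.

```python
import random
from math import sqrt

def skewed_unit_sample(rng, skew):
    """Zero-mean, unit-variance sample with the given positive skewness."""
    k = 4.0 / skew ** 2                  # gamma shape: skewness = 2/sqrt(k)
    return (rng.gammavariate(k, 1.0) - k) / sqrt(k)

def sample_moments(xs):
    n = len(xs)
    mu = sum(xs) / n
    var = sum((x - mu) ** 2 for x in xs) / n
    third = sum((x - mu) ** 3 for x in xs) / n
    return mu, var, third / var ** 1.5

def monte_carlo_cfo(nominal, glob, rand, skew, n=10_000, seed=0):
    """Sample a CFO quantity whose sources all share the same skewness."""
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        v = nominal + rand * skewed_unit_sample(rng, skew)
        v += sum(g * skewed_unit_sample(rng, skew) for g in glob)
        out.append(v)
    return sample_moments(out)

# Analytic moments of the same quantity, for comparison.
nominal, glob, rand, skew = 100.0, [3.0, 4.0], 2.0, 0.4
var_ref = sum(g * g for g in glob) + rand ** 2
skew_ref = skew * (sum(g ** 3 for g in glob) + rand ** 3) / var_ref ** 1.5
mc_mean, mc_var, mc_skew = monte_carlo_cfo(nominal, glob, rand, skew)
```

With 10^4 samples, the sampled mean and variance are expected to lie close to the analytic values, while the sampled skewness is noticeably noisier.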
A. Purely Capacitive Load
The load in this section is considered to be purely capacitive. Its nominal value is chosen from the set (C)_nom = {400, 500, 800, 1400} fF. The scaled distribution of the sources of variation is considered to have a skewness of 0.4, 0.6, and 0.8. We performed our experiments on both circuit configurations explained above. The results for the first configuration (where the second gate is an inverter) are presented in Table 2 (for a skewness of 0.4) and Table 3 (for a skewness of 0.8). The results for the second configuration are provided in Table 4 (for a skewness of 0.6). Experimental results indicate an average error of about 3% for the two σ values, i.e., 10% and 15%. As we increase the σ value (i.e., the total σ variation of each parameter, e.g., of t_in and c_l) from 10% to 15%, the error in the calculated mean, variance, and skewness of the delay and slew increases slightly. The sources of error can be mainly classified into two groups: 1) the inaccuracy of the gate library table look-up and 2) the linear first-order approximation of the timing and electrical parameters with respect to the sources of variation. On average, the proposed algorithm runs 89 times faster than the Monte Carlo based approach.
B. General RC Load
For this section, the load is considered to be an RC tree of varying topology. The nominal value of total load resistance is chosen from the set (R) nom = {150, 260, 300, 710, 1000}Ω and the nominal value of the total capacitance of the load is chosen to be from the set (C) nom ={400, 500, 800, 1400}fF. The scaled distribution of the sources of variation is considered to have a skewness of 0.5, 0.75, and 1.
Again, we performed the experiment on both circuit configurations explained before. The results for the first configuration (where the second gate is an inverter) are presented in Table 5 (for a skewness of 0.5) and Table 6 (for a skewness of 0.75). The results for the second configuration are provided in Table 7 (for a skewness of 1). Experimental results indicate an average error of about 6% for the different σ values. As we increase the σ value (i.e., the total σ variation of each parameter, e.g., of t_in, c_n, r_π, and c_f) from 10% to 15%, the error in the calculated mean, variance, and skewness of c_eff, the gate delay, and the output transition time increases slightly. Similarly, as the skewness (e.g., of t_in, c_n, r_π, and c_f) increases from 0.5 to 0.75, the error in the calculated mean, variance, and skewness of c_eff, as well as the error in delay and slew, increases slightly. The sources of error can be mainly classified into four groups: 1) the inaccuracy of the gate library table look-up, 2) the linear first-order approximation of the timing and electrical parameters with respect to the sources of variation, 3) the error in calculating the variational RC-π load, and 4) the error in the iterative effective capacitance equation proposed in section 4.1. On average, the proposed algorithm runs 95 times faster than the Monte Carlo based approach.
Conclusion
In this paper we presented a framework to handle variation-aware gate timing analysis in block-based σTA considering non-Gaussian sources of variation. First, we proposed an approach to calculate the variational RC-π load, which can be utilized in place of the actual variational RC load for gate timing analysis purposes. Next, we presented a new approach for calculating effective capacitance in STA. We used this technique to calculate the statistical c_eff in the CFO form, and thereby calculated the gate delay and output slew in that form. Experimental results show an average error of 6% with respect to Monte Carlo simulation with 10^4 samples.
