Abstract. With technology steadily progressing into nanometer dimensions, precise control over all aspects of the fabrication process becomes an area of increasing concern. Process variations have an immediate impact on circuit performance and behavior, and standard design and signoff methodologies have to account for such variability. In this context, timing verification, already a challenging task due to the sheer complexity of today's designs, becomes an increasingly difficult problem. Statistical static timing analysis has been proposed as a solution to this problem, but most of the work has focused on the development of timing engines for computing delay propagation. Such tools rely on the availability of delay formulas, accounting for both cell and interconnect delay, that take into account unpredictable variability effects. In this paper, we concentrate on the impact of interconnect on delay and propose an extension to the standard modeling strategies that is variation-aware and compatible with such statistical engines. Our approach, based on a specific type of perturbation analysis, allows for the analytical computation of the quantities needed for statistical delay propagation. We also show how perturbation analysis can be performed when only the standard delay table lookup models are available for the standard cells. This makes the proposed approach compatible with existing timing analysis frameworks. Results from applying our proposed modeling strategy to computing delays and slews in several instances accurately match similar results obtained using electrical-level simulation.
Introduction
The impact of process variation on circuit performance is an area of increasing concern, both in the semiconductor industry and in academic research. In the research community, considerable work has been devoted to the development of statistical static timing analysis [1, 2]. Nowadays, designers spend a considerable amount of their verification budget trying to make sure that their circuits will work under all possible settings. To achieve this, they target the worst possible scenarios by considering so-called pessimistic conditions, and design to ensure that such corner cases are accounted for. This analysis is usually based on assuming worst-case conditions on all possible variations simultaneously. Such a scenario is pessimistic and may lead to considerable over-design.
Improving this situation requires tools that are better suited to handle realistic variations and the complex inter-relations that exist between those variations. Not only should those tools directly make use of realistic process information, thus making them better suited to model the unpredictability of process parameter variations, but they should also be able to implicitly determine how such variations affect the circuit behavior. Such a formulation makes it possible to compute, in a single analysis, the circuit behavior not only for a given parameter setting, but for a variety of settings. The recent development and availability of statistical timers that are based on a parametric description of delay in terms of random process variables is an example of movement in this direction [1]. Other approaches targeting direct determination of the worst parameter settings with respect to delay also follow the same trend [3].
A timing analyzer consists of several component pieces. In a statistical context, the most well-studied part of the timing engine is the timing graph traversal, which manages the calculation of arrival times and slews at the level of abstraction of a timing graph. An equally important, if more mundane, component is the delay calculation engine. The delay calculator takes as input the cell and interconnect models and produces a delay expression in a form that can be consumed by the graph engine. This paper is concerned with a portion of the delay calculation step, the impact of interconnect on delay. We explore how commonly used interconnect modeling strategies can be extended to be compatible with the most recent generation of statistical timing analysis tools [1]. Specifically, we wish to produce cell and interconnect delay as affine functions of process parameters. We assume that one of several recently proposed approaches for interconnect reduction under process variation is available to generate tractably sized reduced-order models [4, 5, 6]. The key technology in our approach is a specific type of perturbation analysis. While digital circuits are strongly nonlinear with respect to the circuit inputs, cell delays are often close to linear with respect to process parameters. In this paper we adapt the general development of linear time-varying (LTV) perturbation theory [7, 8] for extraction of variation-aware delay models to the specific needs of delay calculation for precharacterized standard cells. LTV perturbation theory has been widely used in RF analysis with great success [9] and is at the heart of many interesting new developments. The advantage of this type of approach over, for example, differencing repeated delay calculation runs is that it is essentially an analytic method. Differencing-type approaches can suffer from severe robustness problems that make them difficult to use reliably. In addition, our technique can potentially be made very fast, handling parametric models with ten to twenty parameters at minimal penalty relative to a non-variational calculation.
The outline of this paper is as follows. In Section 2, we review the basics of delay computation and the general mechanics of the procedure, including cell and interconnect delay, assuming no variations are taken into account. Then, in Section 3, we introduce the general perturbation formulation and discuss the specialization of the more general technique to cell-level interconnect-related delay. In Section 4, we discuss how perturbation analysis can be performed when only delay table lookup models are available for the standard cells. A key point is that analytic expressions for delay sensitivities can be obtained without requiring closed-form expressions for the cell delay elements (however, see [10] for such closed-form expressions). Finally, in Section 5, we discuss the use of adjoint methods to accelerate the computation of timing models when large numbers of parameters are present. Results of using our proposed approach are shown in Section 6 and conclusions are drawn in Section 7.
Nominal Delay Calculation
Timing verification is an enabling methodology for optimizing performance and ensuring that circuits satisfy certain timing and frequency requirements. To that end, timing verifiers determine approximate but safe estimates of the worst-case delay through a circuit: for every input and output signal, there are many possible paths through the circuit, each path consisting of a set of interconnected network cells. Timing verification deals with the identification and analysis of the critical paths, also known as the longest delay paths in the circuit. In addition to finding critical-path delays, timing verifiers can also be used to perform other static analyses, such as finding high-speed components off the critical path that can be slowed down to save power. However, the most common usage is indeed to determine the worst-case paths. Computing the delay along a path requires the computation of the delay of every cell along that path, plus the added delay due to interconnect between the cells. In this section we review the standard computation of cell and interconnect delay.
Mechanics of Delay Computation
Timing analysis constitutes the foundation of any timing verification methodology. The typical timing analysis methodology consists of arrival time computation, which is concerned with computing the time instants at which signal transitions reach "interesting" nodes in the circuit, often corresponding to primary outputs or register inputs, where specific timing constraints must be enforced.
Two main approaches have been proposed for timing analysis: block-based and path-based. In the block-based approach, characterized by linear runtime, arrival times are pushed through the circuit in a levelized fashion, performing sum operations with cell or interconnect delays and min/max operations to compute the arrival times at the outputs of multi-input cells, assuming that the earliest/latest input transition determines the output transition. The alternative path-based approach consists of individually computing the delay of each path in the circuit by adding the delays of all the cells and interconnect along that path. Even though more accurate, this approach is computationally much more expensive than the former, since the number of paths is known to grow exponentially with the number of nodes. Clearly, any timing analysis approach requires the computation of cell and interconnect delays.
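As a minimal sketch of the block-based approach, the following Python fragment propagates latest arrival times through a timing graph in topological order, using a sum along each edge and a max at each multi-input node. The function name `latest_arrival_times`, the edge-list representation and the example delays are assumptions made for illustration; they are not taken from any particular timer.

```python
from collections import defaultdict, deque


def latest_arrival_times(edges, primary_inputs):
    """Block-based propagation of latest arrival times.

    edges: list of (src, dst, delay) tuples (cell or interconnect arcs).
    primary_inputs: dict mapping each primary input to its arrival time.
    Returns a dict mapping every node to its latest arrival time.
    """
    fanout = defaultdict(list)
    indegree = defaultdict(int)
    nodes = set(primary_inputs)
    for src, dst, delay in edges:
        fanout[src].append((dst, delay))
        indegree[dst] += 1
        nodes.update((src, dst))

    arrival = {n: float("-inf") for n in nodes}
    arrival.update(primary_inputs)

    # Levelized traversal: a node is processed once all its fan-ins are done.
    ready = deque(n for n in nodes if indegree[n] == 0)
    while ready:
        node = ready.popleft()
        for dst, delay in fanout[node]:
            # Sum along the arc, max over the arcs converging on dst.
            arrival[dst] = max(arrival[dst], arrival[node] + delay)
            indegree[dst] -= 1
            if indegree[dst] == 0:
                ready.append(dst)
    return arrival


# Hypothetical two-input gate g1 followed by an output node.
example = latest_arrival_times(
    [("a", "g1", 1.0), ("b", "g1", 2.0), ("g1", "out", 0.5), ("b", "out", 1.0)],
    {"a": 0.0, "b": 0.0},
)
```

Each arc is visited exactly once, which is the source of the linear runtime mentioned above; replacing `max` by `min` would give earliest arrival times.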
For timing analysis purposes, the digital IC topology is usually partitioned into cells and interconnect nets, as illustrated in Figure 1. Primary inputs and outputs are usually represented by the corresponding pads, which are a particular type of cell. Cell input and output pins are connected by interconnect nets. Each interconnect net can be seen as distributing the signal injected at its input, designated the port, to each of its outputs, designated the taps, which are connected to cell input pins. For typical cell and interconnect delay models, the slew of the input signal(s) is a required parameter. Accordingly, the slew of the output signal(s) is a result produced by the model. Therefore, once the circuit is properly partitioned and all the cell and interconnect delay models are in place, the task of the delay computation engine is to forward propagate the slews and invoke the appropriate delay models, which compute delays and output slews given the input slews and output loads.
Cell Delay and Cell Loading
Mainly for historical reasons, the most common modeling strategy for cell library characterization is based on delay look-up tables (LUTs), sometimes referred to as dot-lib (.lib) tables. This is a simplified model where delay and power information is maintained in the form of a few parameters. In this simplified model, the timing behavior of a cell is usually characterized by a set of lookup tables that, for each input/output pin pair, describe the delay and output slew of the cell as a function of the input slew and output load. Such a model is illustrated in Figure 2, where the standard definitions are also used: input and output slews are defined with respect to threshold voltage levels, in a manner analogous to the definition of noise margins. Similarly, delay is defined as the time it takes the output of a cell to reach its transition midpoint, measured from the time the cell input waveform reached its own midpoint. Cell characterization is performed by simulating the cell behavior as a function of input slew and loading capacitance. These results are then stored in look-up tables, which are accessed to determine delay and slew in specific instances.
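A table lookup of this kind is commonly implemented as a bilinear interpolation between the four surrounding characterization points. The sketch below assumes that interpolation scheme and a table indexed as `table[slew_index][load_index]`; the names `lut_lookup` and `_bracket` and the example numbers are hypothetical and not part of any library format.

```python
from bisect import bisect_right


def _bracket(axis, value):
    """Index i of the table interval [axis[i], axis[i+1]] used for `value`.

    Values outside the axis range reuse the first or last interval,
    which amounts to linear extrapolation.
    """
    i = bisect_right(axis, value) - 1
    return max(0, min(i, len(axis) - 2))


def lut_lookup(slews, loads, table, slew, load):
    """Bilinear interpolation of a delay (or output slew) table."""
    i = _bracket(slews, slew)
    j = _bracket(loads, load)
    u = (slew - slews[i]) / (slews[i + 1] - slews[i])
    w = (load - loads[j]) / (loads[j + 1] - loads[j])
    return ((1 - u) * (1 - w) * table[i][j]
            + u * (1 - w) * table[i + 1][j]
            + (1 - u) * w * table[i][j + 1]
            + u * w * table[i + 1][j + 1])


# Hypothetical 3x2 table sampled from d = 2*slew + 3*load + 1.
slew_axis = [0.0, 1.0, 2.0]
load_axis = [0.0, 1.0]
delay_table = [[2 * s + 3 * c + 1 for c in load_axis] for s in slew_axis]
delay = lut_lookup(slew_axis, load_axis, delay_table, 0.5, 0.25)
```

Because the example table samples a function that is linear in both arguments, the interpolated value coincides with the underlying function; real characterization data is only approximated between grid points.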
The outlined delay modeling strategy assumes a voltage source model for the cell characterization, as illustrated in Figure 2, since delay and slew values implicitly characterize the output voltage waveforms of the cell. However, in recent years, current source models have been gaining prominence, since they are more effective in handling complex interconnect loading effects. Even though throughout this paper we assume voltage source delay models, the proposed techniques can also be directly applied when using current source delay models.
In Figure 2, the output load is assumed to be a single lumped capacitance that models the capacitive effects introduced by the interconnect and by the input pins of the cells connected to the same net. In reality, however, the interconnect attached to the driver cell is a complex RC network that in deep submicron processes is very poorly modeled by a lumped capacitance. The loading effect of interconnect on the cell, i.e. the impact of downstream interconnect on the cell delay itself, cannot be accurately obtained simply by looking at the total capacitance on the net. To account for the effects of complex interconnect, while still preserving table-based cell models, the concept of effective capacitance [10, 11] has been widely adopted. For the remainder of this paper we will consider that the $C$ shown in Figure 2 is such an effective capacitance.
The idea behind the effective capacitance consists of determining the value of $C$ that, in a certain sense, approximates as accurately as possible the behavior of the original parasitic network. In Figure 2, the output stage of a cell (or more accurately, of an output pin of a cell) is modeled by a voltage source, producing a voltage ramp $v$ with slew $s$, and a series resistor with resistance $R$ that models the output resistance of the pin. The figure depicts the output stage of a cell loaded by the effective capacitance $C$ (top right), and by the original parasitic RC network, obtained by layout extraction (bottom right). In the following, without loss of generality and in order to simplify the description, we restrict ourselves to the case of rising output waveforms for non-inverting cells. Any other case can be derived in a similar manner.
The simple RC circuit at the top of Figure 2 is an approximate model of the output stage of a cell connected to an effective capacitance, which is itself an approximation of the interconnect load. For a given input slew $s_i$ and a given effective capacitance $C$, we can compute the estimated cell delay $d$ and the estimated output slew $s_o$ by a table lookup in the timing characterization of the cell. Using this information, we can compute the three time instants at which the waveform of the output voltage $v_o$ should cross $V_L$, $V_T$ and $V_H$, respectively,
Assuming the voltage $v$ to be a rising ramp of slew $s$, shifted in time by $k$,
the output voltage, $v_o$, produced by the simple RC circuit presented in Figure 2, with time constant $\tau = RC$, is given by,
In order to simplify our notation, in the following we will assume,
Using Eqn. (5), we can compute a waveform for $v$ (i.e. $s$ and $k$) and a resistance $R$, such that the waveform of the response $v_o$ crosses $(t_L, V_L)$, $(t_T, V_T)$ and $(t_H, V_H)$, thus matching the tabulated behavior of the cell and its output response. This problem can be stated by the following three equations,
The waveform of $v$ can be seen as the "ideal" output voltage of the cell, under a zero output load. We should not lose track of the fact that our goal is to determine an appropriate value for the effective capacitance $C$. The previous derivations assumed that such a value was somehow known. However, all that is required is that $C$ should approximate the behavior of the original parasitic network as accurately as possible. Several criteria [12] can be used when defining what effective capacitance provides a good approximation of the behavior of the original parasitic network. In this work we consider that the effective capacitance that best approximates the behavior of the original parasitic network is the one that draws the same average current over the transition period (i.e. while the output voltage switches from $V_L$ to $V_H$). Formally,
where
An analytical expression for $I_c$ can be derived. On the other hand, $I_m$ must be computed by numerically integrating the port current, obtained by interconnect simulation, as detailed in Section 2.3.
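One way to see why the average capacitor current admits a closed form, assuming the output waveform reaches $V_L$ at $t_L$ and $V_H$ at $t_H$ as defined above, is that the current into a lumped capacitor integrates exactly. The following is a sketch of that step under those assumptions, and is not necessarily the exact expression used in the original derivation:

```latex
I_c \;=\; \frac{1}{t_H - t_L}\int_{t_L}^{t_H} C\,\frac{dv_o}{dt}\,dt
    \;=\; \frac{C\,\bigl(v_o(t_H) - v_o(t_L)\bigr)}{t_H - t_L}
    \;=\; \frac{C\,(V_H - V_L)}{t_H - t_L}
```

In this form the average current depends on $C$ both directly and through the crossing times $t_L$ and $t_H$, which is one reason the matching condition has to be solved iteratively.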
From Eqns. (7), (8), (9), and (10) we can compute the value of $\phi$ that both matches the output waveform $v_o$ with the tabulated timing information at $t_L$, $t_T$ and $t_H$, and matches the average current drawn by the original parasitic network and the effective capacitance. Since Eqns. (7), (8), (9) and (10) contain nonlinear terms, an iterative method must be used to solve them. We have used Newton's method in this work. Once the value of the effective capacitance $C$ is known, we can compute the delay $d$ and output slew $s_o$ of the cell by a simple lookup in the timing characterization of the cell. This completely characterizes the cell output waveform within the constraints of the simple model. Such a waveform constitutes the input to the interconnect model.
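To make the waveform-matching step more concrete, the sketch below evaluates the response of a first-order RC stage to a saturated ramp and locates its threshold crossings by bisection. It assumes a ramp that rises from 0 to `vdd` over a duration `s` starting at time `k` (the slew used in the text may instead be defined between $V_L$ and $V_H$), and the threshold levels 0.2, 0.5 and 0.8 are hypothetical values chosen only for illustration.

```python
import math


def ramp_response(t, k, s, tau, vdd=1.0):
    """Output of an RC low-pass (time constant tau) driven by a saturated ramp."""
    tp = t - k
    if tp <= 0.0:
        return 0.0
    a = vdd / s  # ramp slope
    if tp <= s:
        return a * (tp - tau * (1.0 - math.exp(-tp / tau)))
    # After the ramp saturates: superposition of two shifted ramp responses.
    return vdd - a * tau * (math.exp(-(tp - s) / tau) - math.exp(-tp / tau))


def crossing(level, k, s, tau, vdd=1.0):
    """Bisection for the time at which the monotone response reaches `level`."""
    lo, hi = k, k + s + 50.0 * tau
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if ramp_response(mid, k, s, tau, vdd) < level:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical values: 100 ps ramp, 50 ps time constant, normalized swing.
t_l, t_t, t_h = (crossing(v, 0.0, 1e-10, 5e-11) for v in (0.2, 0.5, 0.8))
output_slew = t_h - t_l
```

A Newton iteration on the unknowns would repeatedly evaluate expressions of this kind and their derivatives; bisection is used here only because it needs no derivative and the response is monotone.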
Interconnect Delay
Assuming that the cell output voltage waveform has been computed, signals are then propagated along the path through an interconnect net. The input of such a net, the port, is tied to the output of a cell, and the net outputs, the taps, connect to the inputs of several other cells. At the timing level, we refer to the difference between the timing of the transition at the cell output (port) and at the next cell inputs (taps) as the intrinsic interconnect delay. There are various methods of computing the interconnect delay, ranging from closed-form expressions, descendants of the Elmore delay formula, to numerical solution of the underlying interconnect equations. In this work we assume that the circuit equations of the cell driver plus interconnect network are solved numerically, either via direct integration or an equivalent process such as recursive convolution. Likewise, the slew at the output nodes must be computed to be used in the analysis of the following cell.
The general state-space representation of a parasitic RC network (either in its original or reduced form) is
where $x \in \mathbb{R}^n$ is the vector of circuit state variables, $u$ is the input excitation, $y$ is the output response, $C$ and $G$ are the matrices describing the reactive (capacitances) and dissipative (conductances) parts of the circuit, and $N$ selects the output response.
Assuming a cell characterization in terms of voltage source models, as illustrated in Figure 2, the input excitation is the voltage waveform, $v_m$, and the output response consists of the voltage waveforms at the taps, $v_{tap}$. Therefore, we have,
where $B$ is a matrix describing the node where the input voltage is injected, and $L$ is an incidence-type matrix describing which voltage nodes are monitored (taps).
In the particular case of voltage source models, the current drawn by the parasitic network, $I_m$, is also relevant, both for computing the effective capacitance and the input voltage waveform. Hence, an additional equation should be added,
where M selects the output current out of the state vector x.
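As an illustration of direct numerical integration, the sketch below applies backward Euler to a two-node RC ladder driven by a voltage ramp through a driver resistance, and records both the tap voltage and the port current. The topology, the element values, the Norton-equivalent treatment of the source and the function names are all assumptions chosen for this example; they are not the formulation used in the paper.

```python
def solve2(a, b):
    """Solve the 2x2 linear system a x = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [(b[0] * a[1][1] - a[0][1] * b[1]) / det,
            (a[0][0] * b[1] - b[0] * a[1][0]) / det]


def simulate_rc_ladder(r1, c1, r2, c2, ramp_time, t_end, steps):
    """Backward Euler on C dx/dt + G x = B u for source-r1-node1-r2-node2.

    c1 and c2 are grounded capacitors at node 1 and node 2 (the tap).
    Returns (times, tap voltages, port currents).
    """
    g1, g2 = 1.0 / r1, 1.0 / r2
    h = t_end / steps
    # System matrix C/h + G, constant for a fixed timestep.
    a = [[c1 / h + g1 + g2, -g2],
         [-g2, c2 / h + g2]]
    x = [0.0, 0.0]
    times, v_tap, i_port = [0.0], [0.0], [0.0]
    for n in range(1, steps + 1):
        t = n * h
        u = min(t / ramp_time, 1.0)  # normalized saturated ramp
        rhs = [c1 / h * x[0] + g1 * u, c2 / h * x[1]]
        x = solve2(a, rhs)
        times.append(t)
        v_tap.append(x[1])
        i_port.append(g1 * (u - x[0]))  # current drawn from the driver
    return times, v_tap, i_port


# Hypothetical values: 100 and 200 ohm, 0.1 and 0.2 pF, 50 ps ramp.
times, v_tap, i_port = simulate_rc_ladder(100.0, 1e-13, 200.0, 2e-13,
                                          5e-11, 1e-9, 2000)
```

Averaging `i_port` over the interval between the low and high threshold crossings would give a numerical estimate of the average port current discussed in Section 2.2.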
Variation-Aware Methodology
General Perturbation Formulation
In this section, we discuss the parametric analysis of the intrinsic interconnect delay itself. The impact of the interconnect parameters on the cell delay (i.e. variation in cell loading effects) is taken up in the next section. The starting point of our analysis is the general formulation of time-varying linear perturbation theory (see [8] for details). We assume the existence of a set of nonlinear differential-algebraic equations whose topology is fixed, but whose constitutive relations depend in a continuous way on a set of parameters. Without loss of generality the basic circuit equations can be written as
where $x$ again represents the circuit state variables, for example, node voltages; $q \in \mathbb{R}^n$, the dynamic quantities such as stored charge; $i \in \mathbb{R}^n$, the static quantities such as device currents; $t$, time; and $u(t) \in \mathbb{R}^n$, the independent inputs such as current and voltage sources. In a departure from the usual case, we introduce a $p$-element parameter vector $\lambda \in \mathbb{R}^p$. These parameters represent properties of the circuit, such as wire width or thickness, that induce variation in the circuit behavior through the $q$ and $i$ functions.
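The perturbation approach to modeling the parameter variation treats the parameters as fluctuations $\Delta\lambda$ around a nominal value $\lambda_0$, and assumes the circuit response $x$ can be treated similarly, i.e.

$$\lambda = \lambda_0 + \Delta\lambda, \qquad (17)$$
$$x(t) = x_0(t) + \Delta x(t). \qquad (18)$$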
Expanding $i$ and $q$ as functions of $x$ and $\lambda$, and keeping the first-order variations, we get
Assuming that a solution to the nominal case, $x_0(t)$, has been obtained, that is
then substituting the perturbation expansions (19) and (20) into Eqn. (16) and using (21) to eliminate the nominal-case terms, we obtain the equations for the first-order perturbation expansion as
The simplest way to compute waveform sensitivities from Eqn. (22) is by solving it once for each parameter in turn, as
This gives the final expression
Once the sensitivities of the waveforms are known, the next step is to translate them into sensitivities of delay. As discussed, delay can be computed as $d = t_2 - t_1$, where $t_2$ and $t_1$ are the crossing times of the two waveforms of interest. The sensitivity of a crossing time can be related to the sensitivity of the waveform value $x(t)$ at that point via the slew, $\partial x/\partial t$. Suppose there is a small change $\Delta T$ in the crossing time of a given waveform. With a linear model, the corresponding change in the voltage is
Conversely, if the perturbation in the waveform ∆X can be computed, the change in crossing time is given by
Therefore we can compute the sensitivity of the delay as
Note that for this computation, the waveform sensitivity is only needed at a few points in time, a fact that can be used to speed up computations (see Section 5). This is the formulation for a general first-order perturbation analysis. In the following we restrict ourselves to the problem at hand, namely modeling the linear interconnect sub-circuits assuming variations in parameters affecting the interconnect elements.
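The relation between waveform sensitivity and crossing-time sensitivity can be checked on a case with a closed-form answer. The sketch below uses the unit step response of a single RC stage, whose 50% crossing time is $RC \ln 2$, and recovers the sensitivity of that crossing time to $R$ from the waveform sensitivity and the slope. The sign convention is an assumption of this example: the threshold is held fixed, so a waveform that is raised at the crossing point crosses earlier, and conventions in other formulations may differ.

```python
import math


def crossing_time(r, c, level=0.5):
    """Time at which x(t) = 1 - exp(-t/(r*c)) reaches `level` (closed form)."""
    return -r * c * math.log(1.0 - level)


def crossing_sensitivity_to_r(r, c, level=0.5):
    """d(crossing time)/dR from the waveform sensitivity and the slope."""
    t = crossing_time(r, c, level)
    decay = math.exp(-t / (r * c))
    slope = decay / (r * c)             # dx/dt at the crossing
    dx_dr = -t / (r * r * c) * decay    # dx/dR at fixed time t
    # Fixed threshold: slope * dT + dx_dr * dR = 0, hence dT/dR = -dx_dr / slope.
    return -dx_dr / slope


# Hypothetical values: 100 ohm, 0.1 pF.
sens = crossing_sensitivity_to_r(100.0, 1e-13)
reference = 1e-13 * math.log(2.0)  # analytic d(R*C*ln 2)/dR = C*ln 2
```

Only the waveform sensitivity at the crossing instant enters the computation, which illustrates why a few timepoints are sufficient.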
Specialization to Interconnect
Our concern in this document is with the special case of interconnect parameters, so simplifications of the general theory are possible. On-chip cell-level interconnect models are usually written in terms of capacitances and resistances, or equivalently, capacitances and conductances. Inductance is typically neglected at this level and, for the sake of simplicity, we will proceed likewise; it is however easy to see that the derivation is quite similar when inductance is involved. Therefore, in this case,

$$i(x, \lambda) = G(\lambda)\,x, \qquad q(x, \lambda) = C(\lambda)\,x.$$
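Let us then assume, for now, that for every element in the parasitic network (resistor or capacitor), a linear variational model is available. Such a model contains the nominal values for the elements and also the sensitivities to each parameter. Therefore, the conductance and the capacitance matrices have the form:

$$G(\lambda) = G_0 + \sum_{k=1}^{p} \frac{\partial G}{\partial \lambda_k}\,\Delta\lambda_k, \qquad (30)$$
$$C(\lambda) = C_0 + \sum_{k=1}^{p} \frac{\partial C}{\partial \lambda_k}\,\Delta\lambda_k, \qquad (31)$$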
where $G_0$ and $C_0$ are the nominal values of the elements in the interconnect network and the sensitivities $\partial G/\partial\lambda_k$ and $\partial C/\partial\lambda_k$ to each parameter $\lambda_k$ are given by
The nominal value corresponds to the solution of the equations with each $\Delta\lambda_k = 0$, that is $\lambda = \lambda_0$. Assuming the variational formulation for $G$ presented in Eqn. (30), and for $x$ presented in Eqn. (18), we obtain, for instance for $i(x, \lambda)$:
Simplifying and eliminating the (non-linear) cross-product terms, we obtain:
implying that:
An identical procedure can be applied to q(x, λ) leading, as expected, to:
and therefore, that:
Eqns. (21) and (22) which describe the general perturbation analysis framework, can therefore, in the specialization of parameter-varying interconnect, be written as:
The delay modeling problem is completed by adding the notion of inputs and outputs to form state-space models. In the case of cell-level interconnect, the inputs are represented by drivers, the output stages of cells. If the cell library is characterized using current source models, then the input is a fixed current source,
where $B$ is simply an incidence matrix indicating the node to which each driver is connected. Similarly, if the cell library is characterized using voltage source models (as in the case under study), we have
as in Eqn. (13), where $v_{drv} = v_m$. Other models may be used, such as nonlinear current source models [13, 14]. Recalling Eqn. (14), the full set of equations is now
These equations can be written more compactly if we define
where $x_0(t)$ is the nominal solution computed above. $s_k$ can be interpreted as the "equivalent source" that allows determination of the sensitivity to the $k$th interconnect parameter. With this definition, the final, complete set of equations is then rewritten as
Interconnect Sensitivity Calculation
The process of sensitivity calculation can now be concisely stated. First, solve Eqns. (46) and (47) to get the nominal case responses. Then, for each parameter k, solve
to get the sensitivity of the response waveforms. From the sensitivity waveforms, the delay sensitivity can be computed using Eqn. (27) at the appropriate timepoints. Of course, in practice, it is useful to diagonalize the state-space model above, i.e. to put the $C_0$, $G_0$ matrices into pole-residue form, since numerical solution of the multiple systems is then much more efficient.
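The two-step procedure can be sketched on a scalar example: one node with capacitance $c$ and conductance $g$ driven by a current source, so that $c\,\dot{x} + g\,x = u$. The sensitivity system reuses the nominal matrices and is driven by an equivalent source built from the nominal waveform. The sign convention $s_k = -(\partial G/\partial\lambda_k\,x_0 + \partial C/\partial\lambda_k\,\dot{x}_0)$, the backward Euler discretization and all numerical values are assumptions of this sketch.

```python
def simulate(g, c, h, steps, u=1.0):
    """Backward Euler for c*dx/dt + g*x = u with x(0) = 0."""
    x = [0.0]
    for _ in range(steps):
        x.append((c / h * x[-1] + u) / (c / h + g))
    return x


def sensitivity(g0, c0, dg, dc, h, steps, u=1.0):
    """Nominal waveform x0 and its sensitivity z to one parameter.

    dg and dc are the derivatives of g and c with respect to the parameter.
    The sensitivity obeys c0*dz/dt + g0*z = s_k with the equivalent source
    s_k = -(dg*x0 + dc*dx0/dt), solved with the same nominal matrices.
    """
    x0 = simulate(g0, c0, h, steps, u)
    z = [0.0]
    for n in range(1, steps + 1):
        dx0_dt = (x0[n] - x0[n - 1]) / h
        s_k = -(dg * x0[n] + dc * dx0_dt)
        z.append((c0 / h * z[-1] + s_k) / (c0 / h + g0))
    return x0, z


# Hypothetical element values and parameter derivatives.
G0, C0, DG, DC, H, STEPS = 0.01, 1e-13, 0.002, 3e-14, 1e-12, 200
x0, z = sensitivity(G0, C0, DG, DC, H, STEPS)

# Cross-check against differencing two full simulations.
EPS = 1e-6
xp = simulate(G0 + DG * EPS, C0 + DC * EPS, H, STEPS)
fd = [(a - b) / EPS for a, b in zip(xp, x0)]
```

The differencing check needs a second full simulation per parameter and a well-chosen step `EPS`, whereas the perturbation solve reuses the nominal system matrix and only changes the right-hand side.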
Cell Delay Sensitivity Calculation
In the preceding section, we have seen how to perform variation-aware delay computation, by computing the sensitivities of the response waveforms in interconnect blocks. However, it is also necessary to show that similar sensitivities can be computed at the output of cells, in particular assuming that cell delay computation is still based on delay table models.
To show this, we refer back to the derivation in Section 2 and in particular to Eqns. (7), (8), (9) and (10). If we perform an expansion around a nominal point $\phi_0$, keeping the first-order variations and eliminating the nominal-case terms, we obtain,
Noticing the dependence of $t_L$, $t_T$ and $t_H$ on $d$ and $s_o$, and their dependence on $s_i$ and $C$, we obtain the generic equation,
$t_X$ can be replaced by $t_L$, $t_T$ or $t_H$ to obtain Eqns. (52), (53), and (54), and all derivatives are computed at time $t_X$. For Eqn. (55) a similar expansion can be performed,
$\Delta s_i$ and $\Delta I_m$ are related to the parameter variation vector, $\Delta\lambda$, by the following expressions,
$\partial s_i/\partial\lambda$ results from the variational timing analysis on the interconnect of the input net, as described in Section 3.
$\partial I_m/\partial\lambda$ can be computed by integrating the sensitivities of the port current, $I_m$, over the transition period and dividing by its width. All the derivatives in $J$, $Q$ and $W$ can either be computed analytically or by accessing the timing characterization of the cell.
If $N_C = [0\ 0\ 0\ 1]$ is a vector that "selects" the capacitance row of $\Delta\phi$,
Acknowledging the dependence of the delay $d$ and the output slew $s_o$ on the input slew $s_i$ and the capacitance $C$, the following expressions can be derived,
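One plausible reading of this last step is a chain rule in which the table supplies the partial derivatives of delay with respect to input slew and load, and the preceding analysis supplies the sensitivities of input slew and effective capacitance to each parameter. The sketch below assumes a bilinear table interpolant on a single table cell; the function names and numerical values are hypothetical.

```python
def bilinear_with_partials(s0, s1, c0, c1, d00, d01, d10, d11, s, c):
    """Value and partial derivatives of a bilinear interpolant on one cell.

    dXY is the table entry at slew index X (s0 or s1) and load index Y
    (c0 or c1). Returns (value, d/ds, d/dc) evaluated at (s, c).
    """
    u = (s - s0) / (s1 - s0)
    w = (c - c0) / (c1 - c0)
    value = ((1 - u) * (1 - w) * d00 + (1 - u) * w * d01
             + u * (1 - w) * d10 + u * w * d11)
    dd_ds = ((1 - w) * (d10 - d00) + w * (d11 - d01)) / (s1 - s0)
    dd_dc = ((1 - u) * (d01 - d00) + u * (d11 - d10)) / (c1 - c0)
    return value, dd_ds, dd_dc


def delay_sensitivities(dd_ds, dd_dc, ds_dlam, dc_dlam):
    """Chain rule: dd/dlam_k = dd/ds_i * ds_i/dlam_k + dd/dC * dC/dlam_k."""
    return [dd_ds * a + dd_dc * b for a, b in zip(ds_dlam, dc_dlam)]


# Hypothetical cell sampled from d = 2*s + 3*c + 1 on [0, 1] x [0, 2].
value, dd_ds, dd_dc = bilinear_with_partials(0.0, 1.0, 0.0, 2.0,
                                             1.0, 7.0, 3.0, 9.0, 0.5, 0.5)
# Hypothetical sensitivities of input slew and effective load to 2 parameters.
sens = delay_sensitivities(dd_ds, dd_dc, [0.1, 0.0], [0.0, 0.2])
```

The same two functions apply unchanged to the output slew table, which yields the output slew sensitivities that are then propagated to the next stage.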
Optimizations for Large Numbers of Parameters
In this section, we discuss how using adjoint methods [15] can accelerate the computation of the timing models when large numbers of parameters are present.
Since, with adjoint methods, the computation time is nearly independent of the number of parameters, the sensitivities to a large number of parameters can be computed simultaneously. Thus, if only a few crossing times are of interest, the computation is very cheap on an information-gained basis. When device mismatch is of interest, the sensitivities to multiple model parameters, for every device in a circuit, are needed. Mismatch could be caused by purely stochastic mechanisms, such as dopant fluctuations in MOSFET channels. Systematic effects such as optical proximity printing errors may also lead to device-by-device parameter variations.
Let us suppose the nominal system in Eqn. (46) has been discretized into $M$ time-points and an operating point $x_0(t)$ has been computed. For a given timepoint $t_k$, $k = 1, \ldots, M$, let us introduce the capacitance and conductance matrices $C$, $G$ as follows:
Similarly, define the "source" functions s as
We "pack" the time-varying quantities into matrices and vectors with a block structure. If there are N equations in (22) and M timepoints, then the vectors
have $M$ sections, each section a vector of $N$ entries. The vector $X$ represents the waveforms of perturbations due to parameter fluctuation. The vectors $s^{(l)}$ will be used to form the $p$ columns (one for each parameter) of the matrix $S$,
Likewise the matrix
has $M \times M$ blocks, each block an $N \times N$ matrix. After time-discretization, a composite capacitance matrix $C$ may also be formed. The precise structure of this matrix depends on the discretization scheme used. For example, for a backward Euler discretization with timesteps $h_1, \ldots, h_M$, the matrix $C$ becomes
Eq. (22) can be written as one composite matrix equation
To extract the sensitivity of the waveforms to the parameter $\lambda_k$, we solve
where $e_k$ denotes the $k$th unit vector (all zero, except entry $k$, where it is unity). For the delay computation, the sensitivity of the waveform at a specific timepoint $j$ and node $k$ is needed. Construct the (block-structured) vector
with the vectors $E_l$ given by
Then the required sensitivity $a_k$ is $a_k = E^T X_k$. Note that $(C + G)$ is block-lower-triangular. This means that operations with $(C + G)^{-1}$ are cheap to compute. Of course, the matrices $C$, $G$ are never written down explicitly; we only perform implicit operations such as multiplying $(C + G)^{-1}$ by a vector. Clearly, to extract the full set of sensitivity information, we must perform $p$ solves, one for each parameter. This is acceptable if $p$ is small, but problematic if $p$ is large. On the other hand, from each solve we obtain the sensitivities of the waveforms at all nodes and all timepoints. The computational complexity is $O(pNM)$ for the solution.
The idea of adjoint analysis is to obtain the sensitivity of a voltage waveform at a single timepoint and single node to perturbations of all parameters simultaneously, at all timepoints. With the above notation, the procedure is simple to state. First we solve
Denoting the vector of sensitivities $\eta = [a_1, a_2, \ldots, a_p]$, we have
If the sensitivities for multiple timepoints or nodes are to be computed, there is one solve of Eqn. (82) for each such observation point. The computation of $S$ is done once, and shared across all solves. If $t$ is the number of such terminal points, the computational complexity is $O(tNM)$ for the matrix solution. Compared to the direct computation, savings are possible if $t < p$.
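The equivalence between the direct and adjoint computations can be illustrated on a small lower-triangular system. In the sketch below, `a` stands in for the composite matrix $(C + G)$, each entry of `s_columns` for one column of $S$ (with any sign convention folded in), and `e` for the selector vector $E$. The direct route performs one forward solve per parameter; the adjoint route performs a single transposed solve and then one inner product per parameter. All names and numbers are assumptions made for illustration.

```python
def forward_solve(a, b):
    """Solve a x = b for lower-triangular a (forward substitution)."""
    n = len(b)
    x = [0.0] * n
    for i in range(n):
        x[i] = (b[i] - sum(a[i][j] * x[j] for j in range(i))) / a[i][i]
    return x


def transpose_solve(a, b):
    """Solve a^T y = b for lower-triangular a (back substitution)."""
    n = len(b)
    y = [0.0] * n
    for i in reversed(range(n)):
        y[i] = (b[i] - sum(a[j][i] * y[j] for j in range(i + 1, n))) / a[i][i]
    return y


def direct_sensitivities(a, s_columns, e):
    """One forward solve per parameter, then select the observed entry."""
    return [sum(ei * xi for ei, xi in zip(e, forward_solve(a, s)))
            for s in s_columns]


def adjoint_sensitivities(a, s_columns, e):
    """One transposed solve in total, then one inner product per parameter."""
    y = transpose_solve(a, e)
    return [sum(yi * si for yi, si in zip(y, s)) for s in s_columns]


# Hypothetical 3-timepoint scalar system, three parameters, last point observed.
A = [[2.0, 0.0, 0.0], [-1.0, 2.0, 0.0], [0.0, -1.0, 2.0]]
S_COLUMNS = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0], [1.0, 2.0, 3.0]]
E = [0.0, 0.0, 1.0]
eta_direct = direct_sensitivities(A, S_COLUMNS, E)
eta_adjoint = adjoint_sensitivities(A, S_COLUMNS, E)
```

The transposed system is upper-triangular, so the adjoint solve runs backward in time from the observation point, and its cost does not grow with the number of parameters.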
We have not yet discussed the computation time for constructing the matrix $S$. At worst, this is $O(pDM)$, where $D$ is the number of devices. However, usually either the number of parameters is small, i.e. $p$ is $O(1)$, or each device depends on only a small number of parameters. In either case, the complexity becomes $O(DM) \sim O(NM)$ if the implementation is done so as to exploit such structure.
Experimental Results
A realistic circuit block was synthesized and mapped to an industrial 90nm technology. As process parameters, we considered the widths and thicknesses of the six metal layers needed to route the block. During parasitic extraction of the design, we computed the nominal values and sensitivities of each parasitic element (resistors and grounded capacitors), relative to each one of the 12 parameters.
In order to validate the interconnect delay and slew computations, we selected 3671 nets from the design, including nets in the internal logic, nets in the clock tree and nets in the pad wiring. For each of these nets, we computed the parametric delay and slew expressions for each of its taps (resulting in 13870 taps among all nets), while the port was excited by a rising voltage ramp. To assess the accuracy of the proposed methodology, the delay and slew sensitivities were compared to transistor-level simulations performed using the circuit simulator Spectre. In Figure 3 we present scatter plots of the sensitivities computed by both methods, for 4 parameters. In Figure 4 we present histograms of the relative errors for 4 other parameters. Both figures show that the computed sensitivities accurately match those obtained by simulation.
In order to validate the cell delay and output slew computations we proceeded as follows. For a given standard cell of that same 90nm technology, and using Spice-level models, we generated a dotlib-style lookup table of size 7x7, for delay and output slew, as a function of input slew and load. Using these tables, and applying the methodology proposed in Section 4, we computed the sensitivities of delay and output slew to all 12 parameters for one of the cell instances in the previously mentioned design, considering its loading net obtained from extraction. Next, varying the parameter values, a similar set of sensitivities was computed with Spectre, using accurate Spice-level models for the cell. The delay and output slew sensitivity values obtained using the proposed method were then assessed by computing their relative errors versus the Spectre-generated data. These relative errors are shown in Figure 5 (left plot). As can be observed, the errors are in general small, usually in the low percentage range. The only exception is the pathological case of the slew sensitivity to parameter #2, whose absolute value is the smallest of all the sensitivities and near machine precision. In order to investigate this behavior, we introduced a variation in the input slew depending on parameter #2, so that the delay and output slew sensitivity values to this parameter would become larger. When this was done, the relative error dropped to the normal range, as shown in Figure 5 (right plot). Considering that the size of the dotlib-style lookup table used was only 7x7 (a typical value), providing a rough approximation of the behavior of the cell, and that the parasitic network was also approximated by a single lumped capacitance, we believe that the accuracy of the computed values is fairly good.
Better accuracy should be obtained by using larger lookup tables, or by extending the proposed model for handling tables depending on other parameters.
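One plausible way to obtain sensitivities from a delay lookup table alone is to differentiate the table interpolant and apply the chain rule through the input slew and the lumped load. The sketch below assumes bilinear interpolation; this is an illustrative assumption and not necessarily the procedure of Section 4. The function names, the table values, and the parameter derivatives `ds_dp` and `dc_dp` are hypothetical.

```python
from bisect import bisect_right


def _cell_index(axis, x):
    """Index of the table interval containing x, clamped to the table range."""
    return min(max(bisect_right(axis, x) - 1, 0), len(axis) - 2)


def lookup_with_gradient(slews, loads, table, s, c):
    """Bilinear lookup of table[slew][load], with analytic partial derivatives."""
    i, j = _cell_index(slews, s), _cell_index(loads, c)
    s0, s1 = slews[i], slews[i + 1]
    c0, c1 = loads[j], loads[j + 1]
    u = (s - s0) / (s1 - s0)
    w = (c - c0) / (c1 - c0)
    t00, t01 = table[i][j], table[i][j + 1]
    t10, t11 = table[i + 1][j], table[i + 1][j + 1]
    value = ((1 - u) * (1 - w) * t00 + (1 - u) * w * t01
             + u * (1 - w) * t10 + u * w * t11)
    d_ds = ((1 - w) * (t10 - t00) + w * (t11 - t01)) / (s1 - s0)
    d_dc = ((1 - u) * (t01 - t00) + u * (t11 - t10)) / (c1 - c0)
    return value, d_ds, d_dc


def delay_sensitivities(slews, loads, table, s, c, ds_dp, dc_dp):
    """Chain rule: d(delay)/dp_k = dT/ds * ds/dp_k + dT/dc * dc/dp_k."""
    _, d_ds, d_dc = lookup_with_gradient(slews, loads, table, s, c)
    return [d_ds * a + d_dc * b for a, b in zip(ds_dp, dc_dp)]


# Hypothetical 3x3 table (a real dotlib table would typically be 7x7).
SLEWS = [0.1, 0.5, 1.0]
LOADS = [1.0, 5.0, 10.0]
DELAY = [[10.0, 14.0, 20.0],
         [12.0, 17.0, 24.0],
         [15.0, 21.0, 30.0]]
```

For example, `delay_sensitivities(SLEWS, LOADS, DELAY, 0.3, 3.0, [0.0, 0.02], [0.5, 0.1])` combines the table gradient with assumed slew and load sensitivities for two parameters. The interpolant is only piecewise smooth, so the derivative is discontinuous across table cell boundaries.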
Conclusions
In this paper we have developed an analytic delay calculation methodology suitable for use in a statistical static timing methodology. Our approach, based on a specific type of perturbation analysis, allows for the analytical computation of the quantities needed for statistical delay propagation. We also showed how perturbation analysis can be performed when only the standard cell delay table lookup models are available. The techniques proposed are robust and show good correlation with transistor-level calculations. Furthermore, they can be directly applied when cell characterization is based on either voltage or current source models. Future work will show how to develop models that include nonlinear contributions from the process parameters.
