Abstract-This paper presents a novel and flexible modeling technique to generate accurate linear and nonlinear driver models with applications in timing and noise analysis. The new technique, based on Galerkin's finite elements method, is very efficient because it relies on existing logic block characterization for timing, does not require additional nonlinear circuit simulations during modeling, and generates reusable models. The performance of the proposed modeling technique is exemplified in two different implementations: nonlinear driver models for delay noise analysis and piece-wise linear driver models for static-timing analysis.
I. INTRODUCTION

ONE OF THE problems that have gathered much attention recently is the effect of switching noise on chip-level timing (delay noise or dynamic noise). Static timing analysis determines the extremes of signal propagation and is the main tool used for predicting the speed performance of digital ICs. Since switching noise can overlap with and affect logic signals, it will directly impact the chip-level timing and the reliability of the final product. A good description of the different types of noise, their impact on circuit activity, and ways to model and analyze them is given in [1]. Other tools and methodologies for functional noise analysis are proposed in [2]-[4]. Special circuit modeling techniques to assess global noise impact have been proposed in [5]-[7].
The impact of switching noise on circuit performance is generally split into functional noise and delay noise. Functional noise is noise induced in quiet nets (victims) by switching neighbors (aggressors). For high levels of induced currents, it can cause unwanted logic activity and even functional failures. The delay noise is caused by the same switching activity on the neighboring nets, but it happens while the victim net is itself active. In this case, the noise can modify the time of flight and slew-rate of the useful signal and it can cause delay (timing) errors (Fig. 1) .
Switching noise analysis is performed in two steps: a first stage where all possible aggressors are considered, some of them filtered based on functional constraints, clock domains, timing windows, etc., and a second stage where the actual effect of noise on delay is determined through circuit simulation. Most of the research in this area has been focused on the first step, on the alignment of the aggressor noise signals for worst/best case analysis and convergence of the timing analysis in the presence of noise [8] - [15] . In this work, our attention is focused on the second step, mainly on the derivation of efficient and accurate logic block models suitable for delay noise analysis. In this area, the existing models can be separated into three groups: 1) linear timing models: in [13] , [16] , [17] the authors have developed linear logic block models that include current injected by aggressors, based on the static timing logic block models developed earlier by [18] , [19] ; 2) best-fit resistance models: an analytic model based on the equivalent resistance of the pull up/down transistor chain proposed in [20] and a transient holding resistive model proposed in [12] ; 3) large signal driver current models derived using DC gate output current measurements [21] , [22] . In the case of functional noise, since the victim net driver is holding low or high, the driver is well approximated by a linear (RC) model and the analysis is reduced to linear circuit simulation. In the case of delay noise, functional noise-like analysis is used to determine a worst-case alignment of the aggressor noise pulses which are then "merged" with the victim net logic signal. In the "merging" step, it is crucial to take into account the very complex nonlinear interaction between the driver logic block and noise injected from aggressors. 
In [13], [16], and [17], this complex interaction is modeled by an iterative process which matches the perturbation induced by the current (charge) injected into the driver by noise with a change in the effective load capacitance. In the case of [12], the area under the noise pulse must be matched by the area obtained with a transient holding resistance model of the driver. In the case of [20], the driver is modeled by a simple pull-up/down resistance derived from the physical devices. In the third approach [21], [22], small gates are characterized differently, using dc output pin current measurements. A large signal model of the gate is determined in this way, but it needs further processing to match the transient behavior of the gate.
0278-0070/04$20.00 © 2004 IEEE
In this paper, we present a new logic block modeling technique based on the Galerkin finite elements method (FEM) for solving differential equations. The proposed modeling technique offers a general framework for generating logic block models for static timing and delay noise analysis. Some of the distinguishing features of our modeling technique are: 1) the technique generates a behavioral model of the logic block output (driving) port, controllable by both input and output port signals; 2) our models can be derived for a range of input and output conditions so they can be reused; 3) the modeling process is based on existing timing characterization data and no additional characterization work is needed; 4) the modeling technique allows the user to control the accuracy and the complexity of the models that are generated. In Section II, we justify the need for nonlinear logic block models by analyzing the difference between functional noise and delay noise estimation. In Section III, we give a brief presentation of the existing logic block models used in static timing and noise analysis, and an introduction to the Galerkin FEM used at the core of our modeling technique. The modeling technique is described in detail in Section IV, followed by implementation results (Section V) in the context of delay noise and static-timing analysis. In Section VI, we review the major contributions of the proposed modeling technique and set goals for future work.
II. FUNCTIONAL NOISE VERSUS DELAY NOISE
To illustrate the difference between functional noise and delay noise, we consider the example of a medium-sized four-input NAND gate victim driver, with one active input signal while the other three inputs were tied to logic 1. The output of the gate is driving a long chip-level interconnect wire (Fig. 2). The aggressor net, with a strong inverter as driver, has been routed alongside the victim at minimum spacing such that approximately 40% of the total victim wire capacitance is coupling capacitance. The two nets are coupled for 75% of the victim net's length. The aggressor signal has been offset such that its effect overlaps the far-end victim signal obtained in the absence of noise. In Fig. 3(a), we show the victim input signal and the victim far-end signals in three cases: the signal in the absence of noise ("without noise"), the signal in the presence of noise ("with delay noise"), and the estimate of the noise-perturbed far-end signal obtained by superposing the functional noise pulse with the signal in the absence of noise. In Fig. 3(b), we compare the delay noise "pulse" signal (obtained by taking the voltage difference between the original signal and the noise-perturbed signal at the far end) and the functional noise pulse (obtained by holding the output of the gate at logic 0). It is immediately apparent that functional noise can be a very poor estimate of delay noise.
In Fig. 4 , we show the effective driver resistance seen at the output port during the output transition [for the input-output signals pair shown in Fig. 3(a) ]. The last stage of the logic block can be seen as comprised of two variable resistors: one modeling the P FETs and one for the N FETs. These two resistors will have opposite variation during the transition of the output and, as a consequence, any current injected in the output pin will see this variable resistive path to ground. Note how the effective resistance varies between 100 and 1000 ohms during the active interval. Since one of the existing methods to model the driver for delay noise [12] relies on computing a transient holding resistance, it is apparent from Fig. 4 that such a single-value resistor model will not be able to accurately capture the complex driver behavior. 
III. BACKGROUND
In Section III-A we give a succinct presentation of the different linear driver models currently used in static timing and delay noise analysis. In Section III-B we give a brief introduction to the Galerkin FEM, which is the core of our modeling technique.
A. Logic Block Models for Static-Timing and Delay Noise Analysis
In the timing precharacterization process of a logic block, detailed simulations of all the possible signal-propagation paths are performed for different input signals and output loads. The delay measurements are stored in table format or even postprocessed as delay equations. The delay data (equations) are usually generated for simple output capacitive loads. However, due to interconnect parasitic resistance and inductance, the output load of the logic block must be modeled by complex RLC circuits that vary from simple models to high-order models. During timing analysis, the simple delay data (equations) are used to generate driver equivalent circuit models using an iterative procedure (often called the C-effective technique). These models were first introduced in [18]; their accuracy was later greatly improved in [19].
In the C-effective technique, the driver is modeled by a Thévenin-like circuit: an ideal voltage source (step or saturated ramp) and a driver equivalent resistance (Fig. 5). The iterative procedure tries to determine an "effective" output capacitance load such that, for the output signal transition time interval, the total charge stored on the simple capacitance is the same as the total charge stored on the complex load, and the delays and rise times derived from precharacterized data for this simple effective capacitance match the ones obtained through the simulation of the linear driver model driving the RC load. These matching conditions are written in terms of the charge, the current, the delay time TD, the rise time TX, and the gate delay and rise time coming from precharacterized data.
An improved version of the C-effective algorithm has been presented by Dartu et al. in [19] . In this model, the internal voltage source is a saturated ramp with two parameters, arrival time, and rise time, used to match precisely the actual driver output signal 20% and 50% points for a C-effective load.
In practice, the C-effective techniques [18] , [19] are robust and converge rapidly, and have been in use for almost a decade with small variations in most static timers, commercial as well as corporate EDA tools. Switching noise and other effects (such as on-chip interconnect inductance and multiple source nets) are relatively recent issues for static timing and there has been significant work done to extend the C-effective modeling techniques to handle these cases. In [17] , the authors extend the algorithm to general output load models in reduced-order format [23] , [24] . In [25] and [26] , the authors developed an extension of the RC model to a stable RLC model with good accuracy for chip-level timing.
For delay noise analysis there are several models proposed in the literature:
1) An extension of the C-effective technique to model the injected current as additional capacitive load [16] . The C-effective technique is applied simultaneously to the victim and the aggressor resulting in a system of nonlinear equations solved efficiently using the successive-chord method. It is worth noting that the same algorithm can be applied to the situation of a single net with multiple sources.
2) The transient holding resistance model proposed in [12] models the reaction of the logic block to the injected current with the help of a fitted resistance. Each iteration contains the following steps:
a) for each aggressor in isolation (with victim and other aggressors grounded), the current injected in the victim is recorded;
b) a nonlinear simulation is performed to determine the response of the logic block with the induced current at its output. From the comparison of this output with the one obtained in the absence of noise, a delay noise pulse is obtained;
c) a transient resistance value for the victim driver is then computed to match the area of the delay noise with a functional noise pulse.
3) The large signal current models used in crosstalk analysis in [22] are based on dc current measurements.
In [12], the authors have compared the first two methods and reported much better results for the transient holding resistance technique. It is important to note that the transient resistance technique requires a nonlinear circuit simulation at each iteration. The third modeling technique requires special characterization and is not directly applicable to custom logic blocks. The existing literature [21], [22] does not provide detailed implementation information to allow a direct comparison with these models.
B. Introduction to FEM
The FEM is used extensively in engineering (e.g., for solving field equations, in various civil and mechanical engineering problems, and electronic device parameter modeling). The success of FEM comes from its simplicity and flexibility.
Furthermore, some forms of the method, such as the Galerkin method, are very efficient because they reduce nonlinear dynamic problems to simple linear systems of equations. In this section, we give a very brief introduction to finite elements tailored to the Galerkin method. This introduction follows [27] closely.
To illustrate the basic concepts of FEM, we take a simple differential equation with essential boundary conditions: find a real-valued function, defined on a finite real domain, that satisfies the differential equation (1) with the given boundary conditions (2).
Let us assume that we have a family of basis functions and that the solution to our problem is sought as a linear combination of these basis functions (3). In order for this set of basis functions to provide a reasonable approximation of the solution, each basis function must be continuous, bounded, and twice differentiable on the domain. The solution is defined only on a set of nodes within the domain. As a natural choice for a set of basis functions, we use the Lagrange interpolating polynomials: we define one basis function per node, such that each one is 1 at its own node and 0 at all others. The family of basis functions is defined in (4).
The Galerkin technique first transforms the differential equation (1) into an integral equation by noting that, if (1) is identically satisfied by the solution, then the integral form (5) holds for any test function. Integrating by parts the second-order derivative term of (5), we get (6). Equation (6) must hold for any choice of test function, which gives us the possibility to choose test functions that are equal to 0 at the boundary points, canceling the last two terms in (6). If the test function is also expressed using a set of basis functions, which must be continuous, bounded, and at least once differentiable on the domain, we can rewrite (6) as (7). Equation (7) must be valid for any choice of the test-function coefficients, thus generating a set of independent equations (8). Using (3), the system (8) can be rearranged as (9).
The differential equation has thus been reduced to a system of linear equations (9), which can be written in a compact matrix form. The system matrix entries are given by (10), the solution vector holds the unknown coefficients of (3), and the right-hand side vector entries are defined by (11). The evaluation of the integrals in (10) and (11) is done using Gaussian quadrature formulas.
There is a tradeoff between the accuracy of the approximation, which depends on the number of nodes, and the computational cost of the modeling process. In order to keep the computational cost low, the domain can be further split into contiguous elements. This allows the use of low-order basis functions and low-order Gaussian quadrature formulas [27].
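The Galerkin procedure outlined in this section can be illustrated with a minimal Python sketch. The model problem is an assumption made only for illustration and is not equation (1) of this paper: -u''(x) = f(x) on [0, 1] with u(0) = u(1) = 0. The sketch uses first-order Lagrange (hat) basis functions for both the solution and the test functions, splits the domain into contiguous elements, and evaluates the integrals with a two-point Gaussian quadrature. The function name `galerkin_1d` and the choice of f are hypothetical.

```python
import numpy as np

def galerkin_1d(f, n_elems):
    """Galerkin FEM for the assumed model problem -u'' = f on [0, 1],
    u(0) = u(1) = 0, with first-order Lagrange (hat) basis functions."""
    nodes = np.linspace(0.0, 1.0, n_elems + 1)
    n = n_elems + 1
    A = np.zeros((n, n))          # system matrix
    b = np.zeros(n)               # right-hand side vector
    # Two-point Gauss-Legendre rule on the reference interval [-1, 1]
    gp, gw = np.polynomial.legendre.leggauss(2)
    for e in range(n_elems):
        x0, x1 = nodes[e], nodes[e + 1]
        h = x1 - x0
        # Derivatives of the two local basis functions are constant
        dphi = np.array([-1.0 / h, 1.0 / h])
        for xi, w in zip(gp, gw):
            x = 0.5 * (x0 + x1) + 0.5 * h * xi      # map to the element
            phi = np.array([(x1 - x) / h, (x - x0) / h])
            jw = 0.5 * h * w                        # Jacobian * weight
            A[e:e + 2, e:e + 2] += jw * np.outer(dphi, dphi)
            b[e:e + 2] += jw * f(x) * phi
    # Essential boundary conditions: boundary node values are fixed at 0,
    # so only the interior unknowns are solved for.
    u = np.zeros(n)
    u[1:-1] = np.linalg.solve(A[1:-1, 1:-1], b[1:-1])
    return nodes, u

# Usage: compare with the exact solution sin(pi x) of the assumed problem
nodes, u = galerkin_1d(lambda x: np.pi**2 * np.sin(np.pi * x), 16)
print("max nodal error:", np.max(np.abs(u - np.sin(np.pi * nodes))))
```

Splitting the domain into elements keeps each local integral small (two basis functions, two quadrature points), which is the cost argument made above for low-order basis functions.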
IV. NONLINEAR DRIVER MODELS FOR TIMING AND NOISE ANALYSIS
The switching noise pulses inject/draw charge in/from the victim net, effectively changing the size of the interconnect load seen by the victim net driver. As a consequence, the driver response depends simultaneously on the input signal and the noise pulse, and it is not possible to separate these effects without incurring errors. Our solution is a simple nonlinear model which has either a Thévenin or a Norton form (shown in Fig. 6 ). In the following, the Thévenin-type model is used to present the modeling process and the Norton-type model is used to analyze its stability properties.
The main steps of our modeling technique are presented in Section IV-A, followed by a discussion on the properties of our models in Section IV-B.
A. Proposed Modeling Technique
In the Thévenin form, the driver model comprises a nonlinear voltage source, controlled simultaneously by the input pin voltage and the output pin voltage, and a fixed-value resistance modeling the holding high/low output port admittance (considered known/given for now). For any input signal and any output capacitive load, we can determine from the precharacterized data a set of output delay values on predefined measurement levels (e.g., 10%, 50%, and 90% delays). The circuit shown in Fig. 6(b) is described by the Kirchhoff current equation (12). We assume that the voltage source is defined over the two-dimensional (2-D) domain of input and output pin voltages (13). An example of a simple piece-wise linear voltage source function is shown in Fig. 7. Our goal is to determine the expression of the voltage source such that the output signal solution of (12) matches the predetermined output data measurements at the measurement points.
The current equation of the output node (12) can be rewritten in integral form (14), following (5). We must point out an important difference between the traditional FEM and our technique: the former is applied to solve a differential equation (i.e., to find the function under the differential operator), while we actually have the solution but do not know the functional coefficients of the equation (i.e., the voltage source). In some sense, we are applying the FEM in reverse. The source is sought in the interpolated form (15), as a sum of unknown node values weighted by 2-D Lagrange interpolation polynomials defined over the domain. Most often the 2-D basis functions are defined on rectangular elements (such is the case here) with uniform node distribution within elements, and are simply products of one-dimensional basis functions.
All other (time-dependent) functions are expressed in interpolated form. For a particular input-output signal pair (Fig. 8), the time domain is partitioned by the measurement points. The start of the time domain is defined by the starting point of the input signal, and its end is usually defined by the end point of the output signal. The input signal and the output signal are each expressed as interpolations over their measurement points. Note that both input and output are defined by measurements which are taken on predefined voltage levels, i.e., the voltage values at the measurement points are known a priori.
The test function is expressed using a set of basis functions which may be different from the basis functions used for signals. When all the functions are expressed in interpolated form, (14) becomes (16). Since we can choose any test functions, (16) must be identically satisfied for any choice of coefficients, generating an equivalent system of equations. Following the algebraic manipulation of the Galerkin method described in Section III-B, each equation can be described as (17). All the integration in (17) is performed using Gaussian quadrature. For example, to evaluate the 2-D basis functions, each signal is first evaluated at the quadrature abscissa points, and the basis functions are then evaluated at the resulting voltage pairs. Equation (17) can be concisely written as (18), which corresponds to one equation of the linear system (19). The matrix entries on each row are defined by (20), and the corresponding right-hand side vector entry is defined by (21). All the unknown voltage points are ordered in the solution vector. By solving the system of equations (19), we obtain the set of voltage point values that define our voltage source model.
The number of equations obtained in this process must be related to the number of elements needed for the voltage source function. Since one input-output signal pair provides only a limited number of equations for a given choice of test functions, we may have to consider more than one input-output signal pair in order to match the total number of unknowns. It is easier to understand this by visualizing every input-output signal pair as a trajectory in the input-output voltage domain. Fig. 9 shows a typical set of trajectories for an inverter with rising and falling input signals (and corresponding falling and rising outputs). These trajectories can be obtained by varying the input signal and/or the output pin capacitive load. To improve the ability of our model to capture the hold-up and hold-down driver resistance, we can use actual noise-pulse waveforms to cover the corresponding corners of the domain [Fig. 9(b)] or use precharacterized functional noise holding resistance values. The variety of basis functions and the flexibility in the choice of domain partitions provide us with adequate means of controlling the accuracy of our models. Depending on the application, the user can choose to fit a model to a larger number of data points (equations) and can use curve-fitting techniques such as singular value decomposition to generate optimal (in the least-squares sense) driver models. Or, as will be shown in Section V-B, the user can simplify the logic block model complexity and use the modeling framework to efficiently generate linear logic block models for static-timing analysis.
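The "FEM in reverse" idea of this section can be sketched in Python under several loudly stated assumptions. We assume that (12) has the normalized form tau * dv_out/dt = Vs(v_in, v_out) - v_out, with tau the product of the model resistance and the load capacitance. We use a single rectangular element with a bilinear (four-node) Lagrange basis instead of the nine-node element used later in the paper, rectangular test functions in time, trapezoidal integration instead of Gaussian quadrature, and synthetic waveforms in place of precharacterized timing data. All names (`basis`, `simulate`, `equations`, `true_nodes`) are hypothetical.

```python
import numpy as np

def basis(vin, vout):
    """Bilinear Lagrange basis on the unit square of (v_in, v_out);
    node order: (0,0), (1,0), (0,1), (1,1)."""
    return np.stack([(1 - vin) * (1 - vout), vin * (1 - vout),
                     (1 - vin) * vout, vin * vout], axis=-1)

def simulate(v_nodes, vin, v0, tau, dt):
    """Forward-Euler waveform for the ASSUMED form of (12):
    tau * dv_out/dt = Vs(v_in, v_out) - v_out. Stands in for timing data."""
    v = np.empty_like(vin)
    v[0] = v0
    for n in range(len(vin) - 1):
        vs = basis(vin[n], v[n]) @ v_nodes
        v[n + 1] = v[n] + dt * (vs - v[n]) / tau
    return v

def trapz(y, dt):
    """Trapezoidal integral along the first axis."""
    return dt * (y.sum(axis=0) - 0.5 * (y[0] + y[-1]))

def equations(vin, vout, tau, dt, windows):
    """One linear equation in the unknown node values per rectangular
    test function w(t) = 1 on the sample window [a, b]."""
    phi = basis(vin, vout)
    rows, rhs = [], []
    for a, b in windows:
        rows.append(trapz(phi[a:b + 1], dt))            # integral of w * N_k
        rhs.append(tau * (vout[b] - vout[a])            # integral of w * tau * dv/dt
                   + trapz(vout[a:b + 1], dt))          # integral of w * v_out
    return rows, rhs

dt = 1e-3
t = np.arange(0.0, 6.0 + dt, dt)
true_nodes = np.array([1.3, 0.0, 1.0, -0.3])   # hypothetical inverter-like source
A, b = [], []
for tau in (0.5, 1.0):                          # two loads
    for ramp in (0.5, 2.0):                     # two input transition times
        for rising_in in (True, False):
            x = np.clip(t / ramp, 0.0, 1.0)
            vin = x if rising_in else 1.0 - x
            v0 = 1.0 if rising_in else 0.0      # output starts at its hold value
            vout = simulate(true_nodes, vin, v0, tau, dt)
            half = len(t) // 2
            rows, rhs = equations(vin, vout, tau, dt,
                                  [(0, half), (half, len(t) - 1)])
            A += rows
            b += rhs
A, b = np.array(A), np.array(b)
fit, *_ = np.linalg.lstsq(A, b, rcond=None)     # SVD-based least squares
print("fitted node values:", fit)
```

Each trajectory contributes only two equations here, so several input-output pairs are combined into an overdetermined system, in the spirit of the trajectory discussion above. How well the individual node values are determined depends on how much of the domain the trajectories cover.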
B. Properties of the Proposed Nonlinear Driver Model
In a practical implementation like the delay noise analysis flow, we must pay attention to the stability and the robustness of our driver models. While it is not possible to guarantee the stability of the models generated using our technique, it is possible to check and determine the stable domains. It is also possible to take steps to improve the robustness of the model (such as input-signal time scaling and domain partitioning) based on the available measurements data.
The domain of existence was defined in (13) as the entire space of input-output port voltage values that the original circuit can take (22). For stability analysis, it is easier to work with a Norton-type model, while the Thévenin-type model was a more natural choice for presenting the modeling technique in the context of timing analysis (Fig. 10). The output port Kirchhoff current equation in Norton form is given by (23), in which the output port current exiting the logic block is defined for any input-output port voltage pair in the domain. The model resistance is derived using the holding high/low small-signal output port impedance of the logic block (and is the same as the one used in the Thévenin-format model).
The stability analysis is exemplified for the four-input NAND gate, for which we selected one timing arc from one input to output X. The other three input pins are at logic 1. In Fig. 11 (center), the dc output port current of the NAND gate is plotted with respect to input and output pin voltages. We also provide the contour plots of the current as a function of the output port voltage for fixed values of the input port voltage [in Fig. 11 (left)] and as a function of the input port voltage for fixed values of the output port voltage [in Fig. 11 (right)]. Note that at any given point of the domain the dc output current has a negative sensitivity with respect to a change in the output voltage. A two-pin element with a strictly negative current variation with respect to voltage is an incrementally passive element [28].
In Fig. 12, we show the output pin operating point voltages of the NAND4 gate as a function of the input pin dc voltage. The nonlinear model that we have generated is not going to match the dc port current of the original gate because it models the transient behavior rather than the steady-state one. In Fig. 13, we show the port current of our model, which has been obtained for a one-element partition of the domain and using second-order Lagrange interpolating polynomials as basis functions. From the contour plots, it can be seen that the port current is not monotonic inside the domain and that the lack of monotonicity results in multiple steady-state points for the same input-output voltage pair. The "incrementally passive" domain is the region where the current is strictly monotonic and has negative sensitivity with respect to the output port voltage. Within the full domain defined in (13) and (22), the incrementally passive region is defined by (24), which indicates that at any operating point the derivative of the port current must be negative. The restricted incrementally passive domain can be determined when the model is generated, allowing the user to decide whether the domain covers the expected simulation conditions.
In Fig. 14 , the steady-state points curve of the driver port model and stability region(s) are shown. The white region corresponds to all the points in given by (24) , where the model is incrementally passive. The black regions contain operating points from which the simulation will diverge. To better explain the above statement, consider the case where the output is switching from logic 0 to 1. When the input has reached logic 0, the output-node activity is governed by the output current characteristic (shown in Fig. 15) , which is the contour plot of the output current for in Fig. 13 . If the output is in a state between 0 and (where signifies the incrementally passive region boundary point), the circuit will evolve toward negative output port values because the current variation with output port voltage is positive. The point at negative is not a steady state value, in fact it is a point of divergence. Any transient simulation reaching this point will continue to diverge to . The point at positive is an attraction point, hence, a steady-state point. A possible solution to this problem is to use finite-elements integration with boundary conditions for both start and end port voltage values (essential boundary conditions). Such integration will in fact replicate the modeling process and will accept unstable behavior over small time intervals (the gray-shaded areas in Fig. 14) .
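The check for the incrementally passive region described in this section can be sketched in Python. The sketch assumes that condition (24) amounts to a strictly negative derivative of the port current with respect to the output port voltage, as stated in the text, and evaluates it by finite differences on a voltage grid. The function `toy_current` is a hypothetical port current chosen only because it has a non-monotonic region; it is not the model of Fig. 13.

```python
import numpy as np

def passive_mask(current, vin, vout):
    """Boolean grid (rows: v_in, columns: v_out) that is True where the
    port current has a negative sensitivity to the output port voltage."""
    VIN, VOUT = np.meshgrid(vin, vout, indexing="ij")
    i_port = current(VIN, VOUT)
    di_dvout = np.gradient(i_port, vout, axis=1)   # finite-difference derivative
    return di_dvout < 0.0

def toy_current(vin, vout):
    """Hypothetical port current, for illustration only."""
    return (1.0 - vin) - vout**2

vin = np.linspace(0.0, 1.0, 21)
vout = np.linspace(-0.5, 1.5, 41)
mask = passive_mask(toy_current, vin, vout)
print("fraction of grid points that are incrementally passive:", mask.mean())
```

Running such a check once, when the model is generated, lets the user see whether the expected simulation conditions fall inside the passive region before the model is used in a transient analysis.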
V. IMPLEMENTATION
In this section, we present two possible usage scenarios for our proposed modeling technique. In Section V-A, we describe the implementation of a nonlinear (quadratic) driver model for delay noise analysis, and in Section V-B, we present the implementation of a piece-wise linear driver model for static-timing analysis.
A. Logic Block Models for Delay Noise Analysis
One of the properties of the nonlinear logic block models generated using the proposed modeling technique is the ability to capture the nonlinear behavior of the output driving port of logic blocks in the context of delay noise. From a practical implementation standpoint, static-timing analysis complemented by functional noise perturbation information is a reasonably accurate approximation of the noise impact on timing that can be used effectively in a noise-aware design methodology. In the cases where functional noise is significant on critical timing paths, it is reasonable to generate nonlinear logic block models to verify the extent of noise induced timing errors. As a practical application of our modeling technique, we have implemented a critical timing path-delay noise-analysis tool that is using quadratic logic block models and SPICE-level circuit simulation to verify the effect of noise on delay.
In our analysis, we have used timing characterization for 10%, 20%, 50%, 80%, and 90% voltage threshold levels. In the 10%-50% and the 50%-90% ranges, we have used quadratic polynomial approximations; the 0% and 100% time points were determined through extrapolation. For the timing arcs that did not have both falling and rising timing information, we assumed symmetrical data for the missing output signal direction.
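The waveform reconstruction described above can be sketched in Python as follows. This is one possible reading and not necessarily the implementation used in this work: we assume that the crossing time is approximated as a quadratic function of the normalized voltage level, fitted through the 10%, 20%, and 50% points on the lower range and through the 50%, 80%, and 90% points on the upper range, and then evaluated at 0% and 100%. The function name `endpoints` and the sample crossing times are hypothetical.

```python
import numpy as np

# Characterized threshold levels, as fractions of the full voltage swing
LEVELS = np.array([0.1, 0.2, 0.5, 0.8, 0.9])

def endpoints(times):
    """Extrapolate the 0% and 100% crossing times from the five
    characterized crossing times (same order as LEVELS)."""
    times = np.asarray(times, dtype=float)
    lo = np.polyfit(LEVELS[:3], times[:3], 2)   # quadratic through 10/20/50%
    hi = np.polyfit(LEVELS[2:], times[2:], 2)   # quadratic through 50/80/90%
    return np.polyval(lo, 0.0), np.polyval(hi, 1.0)

# Usage with hypothetical crossing times (arbitrary time units)
t0, t100 = endpoints([12.0, 15.0, 24.0, 36.0, 42.0])
print("extrapolated 0% and 100% time points:", t0, t100)
```

Fitting time as a function of voltage level, rather than voltage as a function of time, makes the extrapolation a direct polynomial evaluation and avoids a root-finding step.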
We have implemented a Thévenin-type model on a 2-D domain with one element (similar to the one shown in Fig. 9) characterized by nine nodes (a 3 × 3 grid), two of them with known values, corresponding to the hold high and hold low conditions. So, seven node values were unknowns in the modeling process. We have used eight input-output signal pairs, four for rising output and four for falling output, using lumped capacitance loads that are representative of the complex load (total victim grounded capacitance, total capacitance, etc.). On each signal pair, we have isolated two equations using rectangular test functions (one equation corresponding to a test function that isolates the 0%-50% section of the output response and one equation corresponding to a test function that isolates the 10%-90% section of the output response). The resulting system of 16 equations with seven unknowns was solved using singular value decomposition (SVD). For convenience, the driver-model resistance was computed using the method outlined in [19] as the effective resistance using the 50%-90% rise time (25), an expression that also involves the total output load capacitance and the input-signal rise time. The model resistance value was chosen to be the smaller of the two effective driver resistances computed for rising and falling output signals. We have observed experimentally that the driver resistance value has a small effect on the accuracy of the model, but the higher the resistance, the more likely it is to generate larger unstable regions (Section IV-B). The modeling-stage complexity is dominated by the complexity of the SVD procedure, which depends on the number of unknowns [29].
In the rest of this section, we show a suite of examples based on the medium-sized four-input NAND gate model, mentioned in Section II, from Motorola's MPC755 microprocessor containing a PowerPC processor core.
Example 1: The first example is the test used in Fig. 3. The near- and far-end waveforms in the case "without noise" are shown in Fig. 16(a). The near- and far-end waveforms in the case with delay noise are shown in Fig. 16(b). The actual "delay noise" waveforms are shown in more detail in Fig. 17 for the far-end signals. In Figs. 16 and 17, we have also shown the delay noise estimates using the linear logic block timing model obtained based on the Dartu et al. [19] modeling technique (the waveforms marked "timing model"). It is apparent that although the signals in the absence of noise are accurate, in the presence of noise the timing logic block model has significant errors.
Example 2: We have used the algorithm proposed in [12] to generate the transient holding resistance for comparison with our model. Through full net simulation, a transient holding resistance has been determined such that the area of the noise pulse with resistance model matches within 0.004% the area of the real delay noise pulse at the output of the logic block (near end). The superposition of the quiet response and the noise pulse with resistance model produces the approximation of the noise impact on delay. The errors are tabulated in Table I . In Fig. 18 , all the near-end point "delay noise" pulses (using actual 
B. Logic Block Models for Static-Timing Analysis
Using our modeling technique, we can generate accurate piece-wise linear (PWL) logic block models that are suitable for fast timing simulation and compatible with the frequency-domain interconnect analysis models employed by most static-timing analysis tools [23]. Our approach is to start with the models proposed in [19], which makes our models compatible and easy to integrate with the existing static timing tool engines. In a second step, the internal voltage source of the model (see Fig. 5) is modified by a PWL correction term such that it accurately captures the logic block output signal for the effective capacitance load.
The main drawback of the Dartu et al. model [19] is that the second half of the driver-output response is modeled only indirectly, through the effective driver resistance value (25). In practice, this results in less precise output rise-time estimates, and the imprecision propagates down the timing path, ultimately affecting the accuracy of all delays and rise times. A possible extension of the model would allow the internal voltage source to have a flexible PWL form. In Fig. 19, we show a possible PWL logic block model. A very similar approach has been proposed in [17], but it has gained little support in practice because of the difficulty of precharacterizing the internal PWL voltage source. In a brute-force implementation, where the PWL source breakpoint values would be variables, the nonlinear system solver quickly becomes expensive as the number of breakpoints grows, and the procedure is not guaranteed to converge. In contrast, with our modeling technique, we are able to generate the PWL internal source parameters very efficiently, in linear complexity, without iterations and without extra precharacterization.
Using the method described in [19] as the first modeling stage provides the driver resistance value, the effective output capacitance value, and a good first approximation of the internal voltage source. At this point, the user has the ability to check the accuracy of the model on the second half of the output response against the value indicated by the timing rules (e.g., the 80% point error). If the error is significant, one can go to the second corrective step, which is described next.
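The accuracy check on the second half of the output response can be sketched as follows. This is a minimal illustration, assuming the model output is available as a sampled rising waveform normalized to the supply voltage; the function names `crossing_time` and `threshold_error`, the linear interpolation between samples, and all numeric values are hypothetical and are not taken from the paper.

```python
def crossing_time(times, volts, level):
    """Return the first time a sampled rising waveform crosses `level`.

    Linear interpolation between adjacent samples is an assumption of this sketch.
    """
    for k in range(1, len(times)):
        v0, v1 = volts[k - 1], volts[k]
        if v0 < level <= v1:
            frac = (level - v0) / (v1 - v0)
            return times[k - 1] + frac * (times[k] - times[k - 1])
    raise ValueError("waveform never crosses the requested level")


def threshold_error(times, volts, level, characterized_time):
    """Signed error of the model crossing time against the timing-rule value."""
    return crossing_time(times, volts, level) - characterized_time


# Hypothetical first-stage model output and a hypothetical characterized 80% point.
model_t = [0.0, 1.0, 2.0, 3.0]
model_v = [0.0, 0.5, 1.0, 1.0]
err_80 = threshold_error(model_t, model_v, 0.8, characterized_time=1.7)
```

If the magnitude of `err_80` exceeds whatever tolerance the user considers acceptable, the corrective step would be applied.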
The internal voltage source of (12) is expressed as (26), which combines the saturated ramp source provided by the first modeling step with a correction term. Note that the correction term is initially a function of the output port voltage. This allows us to define the correction term at specific output port voltage levels, independent of time, and in this way avoids the problem of selecting breakpoints that have both time and voltage as variables. Following the procedure outlined in Section IV, we select the driver-output signal for the effective capacitance load, and (12) becomes (27). To simplify notation, we collect all the terms of (27) that are fully known in the function defined by (28). We use a PWL representation of the correction term, (29), which is computationally efficient and sufficient for accurate timing analysis; note, however, that basis functions of higher order can also be used. In Fig. 20, we give an example of a three-node PWL correction term (first-order Lagrange basis functions) with two boundary nodes. Using the Galerkin method, we transform the differential equation (27) into the integral equation (30). Assuming that the test function is a linear combination of a set of basis functions, (30) can be rewritten as (31).
Because (31) must be satisfied by any test function, that is, for any choice of coefficients, it is equivalent to the set of independent equations (32), one for each basis function.
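To make the PWL representation concrete, the sketch below evaluates a function expanded in first-order Lagrange (hat) basis functions from its nodal coefficients. It is only an illustration under stated assumptions: the names `hat` and `pwl_value`, the normalized node positions, and the coefficient values are hypothetical, and the code does not reproduce (29) itself.

```python
def hat(i, nodes, x):
    """First-order Lagrange (hat) basis function of node i, evaluated at x."""
    xi = nodes[i]
    # Rising edge, from the previous node up to node i.
    if i > 0 and nodes[i - 1] <= x <= xi:
        return (x - nodes[i - 1]) / (xi - nodes[i - 1])
    # Falling edge, from node i down to the next node.
    if i < len(nodes) - 1 and xi <= x <= nodes[i + 1]:
        return (nodes[i + 1] - x) / (nodes[i + 1] - xi)
    return 0.0


def pwl_value(coeffs, nodes, x):
    """Evaluate the PWL expansion sum_i coeffs[i] * hat_i(x)."""
    return sum(c * hat(i, nodes, x) for i, c in enumerate(coeffs))


# Hypothetical equidistant nodes (normalized output voltage) and nodal values.
nodes = [0.0, 0.25, 0.5, 0.75, 1.0]
coeffs = [0.0, 0.04, -0.02, 0.03, 0.0]
mid_value = pwl_value(coeffs, nodes, 0.375)
```

Each basis function equals 1 at its own node and 0 at every other node, so the coefficients are simply the values of the PWL function at the nodes.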
In Fig. 21, we show the correlation between the position of the nodes of the finite element representation of the correction term and the selection of the linear test basis functions. Each node has a predefined voltage level (for simplicity, we have chosen equidistant nodes) which uniquely identifies a point on the output signal. The triangular pulse basis function associated with a node spans the time interval between the time points corresponding to its two neighboring nodes and equals 1 at that node. Using (28) and (29) in the system of equations (32), the resulting system matrix is tridiagonal, and the linear system of equations can be solved in linear complexity. Each entry in the system matrix and in the right-hand side vector is evaluated using Gaussian quadrature, so the effort to populate the matrix depends on the order of quadrature. Overall, computing the correction term is a procedure of linear complexity in the number of nodes. After the parameters of the correction term are determined from (35), we perform the last operation, in which the complete PWL form of the internal time-domain voltage source is computed. The correction term is evaluated at different time points within the interval in which the output signal is active (the time points marked in Fig. 21).
We have implemented this static-timing analysis specific version of the modeling technique described above and verified its accuracy on a set of 640 test cases designed to showcase difficult simulation conditions. The test driver is a strong inverter which is sensitive to input-signal shape. The input signals to the driver are "collected" along a pin-to-pin interconnect wire driven by a medium-sized inverter, and the driver-output loads are the input-port admittances to various interconnect wires with different resistive shielding conditions. As observed in [19], in order for the modeling technique to provide accurate signal estimations, one needs accurate timing arc characterization, beyond the customary 50%-50% delay and 20%-80% output rise time.
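The tridiagonal structure and the linear-time solve mentioned above can be illustrated with a generic Galerkin example. The sketch below is not the paper's system (32)-(35): as a stand-in, it projects a known function onto hat basis functions, which yields the same kind of tridiagonal matrix, fills the entries with a two-point Gauss-Legendre rule, and solves with the Thomas algorithm. The names `assemble` and `thomas_solve`, the node positions, and the sample function are all assumptions.

```python
import math

# Two-point Gauss-Legendre rule on [-1, 1]; exact for polynomials up to degree 3.
GAUSS_2 = [(-1.0 / math.sqrt(3.0), 1.0), (1.0 / math.sqrt(3.0), 1.0)]


def assemble(nodes, f):
    """Assemble the tridiagonal Galerkin system for projecting f onto hat functions."""
    n = len(nodes)
    lower, diag, upper, rhs = ([0.0] * n for _ in range(4))
    for e in range(n - 1):  # loop over elements [nodes[e], nodes[e + 1]]
        h = nodes[e + 1] - nodes[e]
        for xi, w in GAUSS_2:
            x = nodes[e] + 0.5 * h * (1.0 + xi)  # map quadrature point to the element
            shape = (0.5 * (1.0 - xi), 0.5 * (1.0 + xi))  # local hat functions
            wj = w * 0.5 * h  # quadrature weight times Jacobian
            for a in (0, 1):
                rhs[e + a] += wj * shape[a] * f(x)
                diag[e + a] += wj * shape[a] * shape[a]
            upper[e] += wj * shape[0] * shape[1]
            lower[e + 1] += wj * shape[1] * shape[0]
    return lower, diag, upper, rhs


def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system in O(n) by forward elimination and back substitution."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[n - 1] = d[n - 1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


# Hypothetical example: five equidistant nodes and a linear sample function.
nodes = [0.0, 0.25, 0.5, 0.75, 1.0]
coeffs = thomas_solve(*assemble(nodes, lambda x: 2.0 * x + 1.0))
```

Because the sample function is itself piece-wise linear on these nodes, the computed coefficients coincide with its nodal values, which gives a simple sanity check of both the assembly and the solver.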
In our analysis, we have used timing characterization for the 10%, 20%, 50%, 80%, and 90% voltage threshold levels. For all the input and output signals, we have used cubic spline representations based on the five-point characterization. The logic 0 (0%) and logic 1 (100%) time points were obtained by extrapolating the signal waveform from the 10%-20% range and the 80%-90% range, respectively. The characterization was verified to be accurate within % for individual point values (delays) and % for time differences between point values (rise times).
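The extrapolation of the 0% and 100% time points can be sketched as follows. The text does not state how the extrapolation is carried out, so the straight-line extrapolation used here is an assumption, as are the names `extrapolate` and `seven_point_waveform` and the sample crossing times; the cubic spline fit itself is omitted.

```python
def extrapolate(level, p, q):
    """Time at `level` on the straight line through p and q, each a (level, time) pair."""
    (v0, t0), (v1, t1) = p, q
    return t0 + (level - v0) * (t1 - t0) / (v1 - v0)


def seven_point_waveform(char):
    """Extend a five-point characterization with extrapolated 0% and 100% points.

    `char` maps the threshold fractions 0.1, 0.2, 0.5, 0.8, 0.9 to crossing times.
    Linear extrapolation is an assumption of this sketch.
    """
    t_start = extrapolate(0.0, (0.1, char[0.1]), (0.2, char[0.2]))
    t_end = extrapolate(1.0, (0.8, char[0.8]), (0.9, char[0.9]))
    return [(0.0, t_start)] + sorted(char.items()) + [(1.0, t_end)]


# Hypothetical crossing times for a rising signal (arbitrary time units).
char = {0.1: 1.0, 0.2: 1.2, 0.5: 2.0, 0.8: 3.0, 0.9: 3.5}
points = seven_point_waveform(char)
```

The resulting seven (level, time) pairs could then serve as the knots of a spline representation of the signal.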
We have implemented the modeling technique from [19] and the static-timing analysis version of our modeling technique. The distributions of delay and output rise-time errors for the Dartu et al. models are shown in Fig. 22. The distributions of errors for our modeling technique using five-node PWL correction term models are shown in Fig. 23. As expected, the improvement in delay accuracy is small, but the improvement in rise-time accuracy is significant.
For a practical implementation, one needs to consider the accuracy of the characterization data (affected by circuit simulation accuracy, delay measurement imprecision, gate internal state, mismatch between the actual timing gate input signal and the characterization signals, etc.) as a limiting factor on the accuracy of signal propagation and, consequently, choose a reasonable order of complexity for the model.
VI. CONCLUSION AND FUTURE WORK
In this paper, we have proposed a new modeling technique with applications to static timing and delay noise analysis. Our approach has a few distinct advantages over the existing modeling techniques for timing and/or noise.
1) The modeling process uses existing measurement data generated for static-timing analysis for each logic block; no new data or special characterization work is needed, although more accurate characterization (beyond delay and slew) is recommended.
2) The models have variable accuracy both in terms of the range of input rise time and output capacitance load that is covered and in terms of the error with respect to the actual measurement values used in the process.
3) The modeling technique is very flexible and can be adapted to specific modeling tasks; we have presented possible implementations for delay noise and static-timing analysis.
As drawbacks, we can point out the following.
1) The nonlinear delay noise models require SPICE-level simulation, which currently limits their usage to custom tools such as a critical timing path delay noise verification tool. The issue can be efficiently addressed by using special-purpose fast circuit simulators (such as ACES [30]).
2) The accuracy of the models depends on the assumptions made during characterization (the internal state of the logic block).
As future work, we want to investigate the possibility of taking advantage of the model structure and using a finite-element-based simulation engine as a fast circuit simulator. In addition, we would like to investigate the possibility of extending our modeling technique to static timing and noise analysis in the presence of inductive interconnect.
