I. INTRODUCTION
THE DRASTIC downscaling of layout geometries to 65 nm and below has resulted in a significant increase in the packing density and the operational frequency of VLSI circuits. An unfortunate side effect of this technology advancement has been the aggravation of noise effects, such as capacitive crosstalk noise. The conventional static timing analysis (STA) techniques model signal transitions as saturated ramps with known arrival and transition times, and propagate these timing parameters from the circuit primary inputs to the primary outputs. To check whether the circuit meets its timing goal, the required time for each circuit node is calculated by using a backward propagation method [1]. If the signal arrival time is less than its required time, the node is safe from a timing point of view. This signal model has also been used in statistical static timing analysis (SSTA), where the mean and variance of the arrival/transition times are calculated and propagated through the circuit for the purpose of timing analysis. Note that different waveforms with identical arrival time and slew (transition) time applied to the input of a logic gate or an interconnect line can result in very different propagation delays through the component, depending on the exact form of the applied signal waveform [2]. Therefore, the shape of the voltage waveforms should be considered in order to ensure accurate timing and noise analysis results in sub-90 nm CMOS designs.
In the application-specific IC (ASIC) design flow, combinational and sequential logic cells are precharacterized for the input-to-output propagation delay and output slew as a function of the input slew and effective output capacitance. We shall refer to this modeling technique as the voltage-based method throughout this paper. The voltage-based approach not only suffers from high timing inaccuracy due to approximation, but is also inherently incompatible with arbitrary shapes of voltage waveforms, and thus falls short when dealing with noisy inputs such as crosstalk-induced noisy waveforms. A current source model (CSM) is load independent, and can handle any electrical waveform at intermediate signal lines of the circuit; therefore, it overcomes the aforementioned shortcomings of the voltage-based models.
The incompatibility of the voltage-based precharacterization data with noisy waveforms necessitates additional waveform-aware characterization steps of the logic cells for the purpose of noise analysis. One aspect of the noise analysis is to determine whether a certain noise glitch causes a failure, meaning it is large enough to change the state of a memory element and result in a functional error. To perform noise analysis, first the noise glitch injected into the victim net by the aggressor net should be calculated. A mechanism based on noise failure criteria should then be used to determine whether the noise causes a failure. Noise failure criteria have been commonly modeled as either dc or ac transfer curves of the receiver logic cell to represent how immune a cell is to a noise glitch [3].
Accurate determination of noise failure criteria for sequential elements is very challenging because the final state of the memory element depends not only on the input noise height and width, but also on its alignment with the clock edge. Noise analysis is performed in [4] for feedback loops to check whether the noise transferred from the output back to the input is strong enough to change the state of the circuit. Considering the fairly complex architecture of sequential cells, especially the presence of feedback loops, typical noise analysis precharacterization is computationally very expensive. A key advantage of our CSM is that it can handle any type of input voltage waveform, including full-swing hazardous pulses and partial glitches, e.g., a crosstalk-induced noise glitch. Consequently, no extra characterization steps, such as the one in [4], are needed.
Before going into the existing CSMs, the two well-known vendor formats, namely effective CSM (ECSM) [6] and composite CSM (CCSM) [7], are briefly reviewed. For a given input slew and effective output capacitance, ECSM stores the times at which the output voltage waveform crosses certain predefined threshold points. In CCSM, the output current values at specified voltage-level points are stored. It is interesting to note that the stored current values in CCSM can be retrieved using ECSM-stored voltage values, and vice versa (from the capacitor relation $i = C\,dv/dt$); therefore, ECSM and CCSM are essentially identical models.
Both models can be regarded as generalizations of the conventional cell delay models, which only store three predefined voltage crossing points (such as 20%, 50%, and 80%) in the form of cell delay and output slew time as a function of input slew time and effective output capacitance. ECSM and CCSM fall short in the presence of noisy waveforms. This is why the electronic design automation (EDA) vendors have come up with other models and formats for noise analysis in VLSI interconnect.
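As a rough illustration of the threshold-crossing representation discussed above, the following sketch extracts the 20%, 50%, and 80% crossing times from a sampled rising waveform by linear interpolation. The function name `crossing_times`, the sample format, and the example waveform are hypothetical and are not part of ECSM, CCSM, or any vendor library format.

```python
from typing import List, Optional, Sequence, Tuple

def crossing_times(
    wave: Sequence[Tuple[float, float]],
    vdd: float,
    fractions: Sequence[float] = (0.2, 0.5, 0.8),
) -> List[Optional[float]]:
    """Return the first time a rising waveform crosses each fraction of vdd.

    `wave` is a list of (time, voltage) pairs sorted by time (illustrative format).
    A crossing that never occurs is reported as None.
    """
    result: List[Optional[float]] = []
    for frac in fractions:
        level = frac * vdd
        t_cross = None
        # Scan consecutive sample pairs for the first upward crossing.
        for (t0, v0), (t1, v1) in zip(wave, wave[1:]):
            if v0 < level <= v1:
                # Linear interpolation between the two samples.
                t_cross = t0 + (level - v0) * (t1 - t0) / (v1 - v0)
                break
        result.append(t_cross)
    return result

# Hypothetical example: an ideal ramp from 0 V to 1 V over 100 ps.
ramp = [(0.0, 0.0), (100.0, 1.0)]
t20, t50, t80 = crossing_times(ramp, vdd=1.0)
delay_point = t50          # 50% crossing, the usual delay reference
slew_20_80 = t80 - t20     # 20%-to-80% transition time
```

Two waveforms with very different shapes can share the same three crossing times, which is why storing only these points loses the waveform information that a CSM retains.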
The authors of [8] were among the first to present a true CSM of a CMOS logic cell (called Blade), in which a precharacterized CS is utilized to capture the nonlinear behavior of the cell with respect to the input and output voltage values. They model parasitics of the logic cell with a single capacitance at output node. The computed output voltage waveform is time shifted by a precharacterized value to compensate for the offset with respect to HSPICE. The Miller effect between the input and output nodes was ignored in this model. A Blade-based model is used in [9] , and the input and output voltage waveforms are approximated with Weibull functions. Keller et al. [10] presented a CSM for the purpose of crosstalk noise analysis. Similar to Blade, a precharacterized CS is used. The parasitic components, namely the Miller and the output capacitances, are assumed to be constant regardless of the input and output voltage values. In practice, these capacitive effects can vary by orders of magnitude, depending on cell input and output voltage values. This weakness is resolved by introducing a nonlinear output capacitance model in [11] . The authors in [12] presented a CSM in which the input and output pins as well as several chosen internal pins of the cell are modeled with a voltage-dependent CS and a nonlinear capacitor. Each component in this model generally depends on all the input voltages and the output voltages. We introduced in [13] nonlinear input, output, and Miller capacitors along with an output CS, all of which are functions of the input and output voltages.
CSM is compelling in the sense that instead of only propagating the delay and slew values, it can propagate the whole voltage waveform (in the form of a set of time-voltage pairs). CSM is able to do this propagation along the whole timing path from primary input to primary output. The high accuracy of the CSMs makes them attractive for employment inside a sign-off timing analysis tool. Once a set of critical paths is identified by a standard STA tool, CSMs of logic cells along a target critical path may be utilized to provide an accurate, yet highly efficient, evaluation of the timing criticality and/or noise susceptibility of the path in question. Close-to-SPICE accuracy combined with run-times that are orders of magnitude shorter than those of SPICE tools makes the CSM-based analysis very attractive.
All previous CSM approaches have targeted combinational logic cells. Lack of CSM for sequential circuit elements makes it impossible to have a complete CSM-based solution for performing the delay and noise analysis, and optimization steps. Our CSM for the sequential cells makes it feasible to construct the exact voltage waveforms for their outputs, and hence, drastically reduce the pessimism of timing arc calculations in the presence of noise.
We note that accurate characterization of setup and hold times of sequential cells is critical for timing analysis of CMOS digital circuits [19] . Optimism in setup/hold times can cause circuit failure, while pessimism results in performance degradation. As a result, full SPICE-level analysis of sequential circuits, using detailed device models, has been widely used in industry [20] . However, SPICE-level simulations in existing industrial practice can have tremendous run-times. Alternatively, any existing SPICE-based characterization methodology can adopt CSM to perform its necessary output voltage calculations for setup/hold characterization to achieve accuracies close to SPICE while speeding up the process (we have shown in [18] that output voltage calculation using our CSM is above three orders of magnitude faster than HSPICE.)
One of the deficiencies of conventional sequential cell models is that they report an unknown result for the output if the setup/hold time tests are violated [5], [17]. A key benefit of the proposed model for CMOS register cells is that the output waveform may be computed even when setup/hold time violations occur. This can be very useful for diagnostic purposes.
The major contributions of our work are as follows.
1) A more accurate CSM for combinational cells is presented.
2) CS modeling is introduced for sequential cells, e.g., latches and flip-flops. A thorough investigation is also conducted for different sequential circuit components, especially the feedback loop as the most challenging element.
3) The cell output voltage waveform can be constructed with close-to-SPICE accuracy (the average mean squared error with respect to SPICE for voltage waveforms was less than 2% of $V_{dd}$) with a speedup as high as 2000 times compared to HSPICE [14]. This is achieved because the cell parasitic effects such as the Miller capacitance, the nonlinearity of these parasitic effects, and the feedback and multistage loading effects of sub-circuits are captured by our precharacterized CSM.
4) The output of the cell can be predicted even when timing tests are violated. Voltage-based sequential models report "unknown" for the output if a timing constraint such as a setup check is not met.
The remainder of this paper is organized as follows. In Sections II and III, our CSMs for combinational and sequential logic cells are presented, respectively.
II. CS MODELING-COMBINATIONAL LOGIC CELLS
We start with our CSM for combinational logic cells. This will lead us to a better understanding of our proposed model for sequential circuit elements. In a CSM, a dc analysis step is performed to precharacterize the CS as a function of the input and output voltages of the cell. The difference between the existing combinational CSMs lies mainly in how they capture the parasitic effects. The main motivation for us to create a new CSM was that the existing models sometimes exhibit rather large errors compared to HSPICE, because they ignore parasitic effects altogether or make simplistic assumptions about them.
Our combinational CSM, which is shown in Fig. 1, consists of three nonlinear voltage-dependent capacitive components, namely, the input and output parasitic capacitances, $C_i$ and $C_o$, to model the parasitic effects at the input and output nodes of the cell, and the Miller capacitance, $C_M$, to model the Miller effect between the two nodes. The model also has $I_o$, a nonlinear voltage-controlled CS at the output node. Each component is a function of the input and output voltage values, $V_i$ and $V_o$. The following sections give details of our precharacterization steps for the CSM of Fig. 1 using HSPICE.
The CS at the output node captures the nonlinear resistive behavior of the combinational logic cell during an output transition. More precisely, the following KCL equation models the current at the output pin of the cell during switching:
(1)
A. Precharacterization
Model parameters, namely the output CS $I_o$ and the capacitances $C_i$, $C_o$, and $C_M$, should be calculated and stored in the logic cell delay library. To characterize $I_o$, the input and output pins are driven by dc values. Each pin is swept from $-\Delta$ to $V_{dd}+\Delta$, where the margin $\Delta$ accounts for cases where the input and output voltages under/overshoot beyond ground and $V_{dd}$. The current sourced by the output pin is measured in SPICE. As a result, a 2-D lookup table is constructed to store the values of $I_o$. Fig. 2 shows the characterization setup for calculating the model elements, which are then stored in the cell library. To characterize $I_o$ for the cell, CH1 and CH2 are dc voltage sources that are swept from $-\Delta$ to $V_{dd}+\Delta$. Since the input voltage $V_i$ and the output voltage $V_o$ do not change, all derivative terms in (1) become zero, which yields (2).
For a given input-output voltage pair $V_i$ and $V_o$, it is enough to monitor $i_o$, the current flowing into CH2, to determine $I_o$. Parasitic capacitances are precharacterized through a series of transient simulations, in which saturated ramp voltages are applied to the input and/or output nodes while the output current is monitored. More specifically, to characterize the Miller capacitance $C_M$, a saturated ramp is applied to CH1 and a dc voltage source is applied to CH2. Equation (1) is then simplified to (3), where $C_M$ is the only unknown parameter for each level of the input ramp and of the output dc voltage source. As $V_i$ changes along the ramp and for a constant $V_o$, the $C_M$ values are calculated. The experiment is repeated for each $V_o$ in the sweep range. To characterize the output capacitance $C_o$, a dc source is connected to the input while a saturated ramp drives the output. Equation (1) then becomes (4). With the already characterized $C_M$ values, $C_o$ is the only unknown parameter, which is easily calculated as before.
In our model, the parasitic capacitances (e.g., $C_M$ and $C_o$) are dependent only on the input and output voltages. Therefore, according to this model, the slew of the ramp signal waveforms used in the transient analysis should not affect the capacitance values. However, in (3), for example, the derivative term represents the slope of the ramp signal applied to the input, which may assume different values. If we change the input slew, the measured current value (for the same levels of the input and output voltages) will also change. These two variations, in the ramp slope and in the measured current, tend to counter each other so that the change in the calculated capacitance for different input slews is small. More importantly, the sensitivity of the output voltage waveform to the capacitance variation as a function of input slew is quite weak. To be more precise, we have noted that the capacitance value can change by up to 5% for different input slews ranging from 50 to 500 ps, whereas the change in the output voltage waveform for the same range of input slews is only 0.2%. We have thus opted to ignore the effect of input slews on parasitic capacitance characterizations. In practice, we examine ramp signals with different slopes and use the average parameter values for all the ramps to fill up the lookup tables.
The following KCL equation, (5), is used to characterize $C_i$, the parasitic capacitance seen at the input. To characterize $C_i$, a dc source is connected to the output while a saturated ramp drives the input, resulting in (6).
The only unknown parameter in this equation is $C_i$, which is easily determined. This characterization is done for voltage values across the same sweep range used for the other model components. It takes less than a second for each parameter of our CSM to be characterized.
The characterization steps for different combinational logic cells in a cell library are typically automated as part of a library characterization tool.
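The lookup-table construction described in this subsection can be sketched as follows. This is a minimal illustration only: the class name `Table2D`, the bilinear interpolation scheme, the sweep range, and the `toy_dc_current` function (which stands in for a SPICE dc measurement) are all assumptions made for the example and do not come from the paper or from any characterization tool.

```python
import bisect
from typing import Callable, List, Tuple

class Table2D:
    """2-D lookup table over (input voltage, output voltage) with bilinear interpolation."""

    def __init__(self, v_min: float, v_max: float, n: int,
                 measure: Callable[[float, float], float]) -> None:
        step = (v_max - v_min) / (n - 1)
        self.axis: List[float] = [v_min + k * step for k in range(n)]
        # One "measurement" per grid point; in practice this would be a SPICE run.
        self.data = [[measure(vi, vo) for vo in self.axis] for vi in self.axis]

    def _locate(self, v: float) -> Tuple[int, float]:
        # Clamp to the characterized range, then find the enclosing grid cell.
        v = min(max(v, self.axis[0]), self.axis[-1])
        k = min(bisect.bisect_right(self.axis, v) - 1, len(self.axis) - 2)
        frac = (v - self.axis[k]) / (self.axis[k + 1] - self.axis[k])
        return k, frac

    def __call__(self, vi: float, vo: float) -> float:
        i, a = self._locate(vi)
        j, b = self._locate(vo)
        d = self.data
        return ((1 - a) * (1 - b) * d[i][j] + a * (1 - b) * d[i + 1][j]
                + (1 - a) * b * d[i][j + 1] + a * b * d[i + 1][j + 1])

def toy_dc_current(vi: float, vo: float) -> float:
    """Hypothetical stand-in for the dc output current of an inverter-like cell."""
    return 1e-3 * (1.2 - vi - vo)

# Placeholder sweep range with a small margin beyond the rails; 33 points per axis.
i_o_table = Table2D(-0.04, 1.24, 33, toy_dc_current)
sample = i_o_table(0.5, 0.3)
```

The same table structure could hold each capacitance component; only the measurement routine that fills the grid would differ.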
B. Output Voltage Calculation
A logic cell generally drives circuitry that includes one or more logic cells through a short or long piece of interconnect. This whole circuitry can be considered as a load. Typical cell delay models are forced to model this load as an effective capacitance to make the load compatible with the characterized cell library. However, using our CSM, we have the advantage of using any type of load model to increase the accuracy of cell delay analysis. KCL at the output node results in (7), where $i_L$ denotes the current drawn by the load. Equation (7) is essentially the same as (1), in which the output current $i_o$ has been replaced by $i_L$. In [10], it is shown how to use the Padé method to approximate the admittance function of an RC network with a reduced-order representation. As reported in [10], in most cases, only one Padé term (i.e., the model approximation) is sufficient for the error to be within 2%-3% of SPICE. In the rare cases where one pole is not sufficient, more Padé terms can be used. Note that $i_L$ is the admittance function of the load multiplied by the output voltage; therefore, in (7), the only unknown parameter is the output voltage of the logic cell, $V_o$. One can numerically calculate $V_o$ by any integration method (in our implementation, we use the Euler integration method [15]).
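The numerical integration step can be sketched with a forward-Euler loop. The code below is only an illustration under stated assumptions: the KCL form written in the docstring, the sign convention (the CS current is taken as flowing into the output node), the lumped capacitive load used instead of a Padé-reduced admittance, and all component values are hypothetical choices and are not taken from the paper.

```python
from typing import Callable, List, Sequence

Component = Callable[[float, float], float]  # maps (v_in, v_out) to a value

def simulate_output(vi_wave: Sequence[float], dt: float, vo_init: float,
                    i_out: Component, c_o: Component, c_m: Component,
                    c_load: float) -> List[float]:
    """Forward-Euler solution of an assumed output-node KCL:

        (C_o + C_M + C_load) dVo/dt = I_o(Vi, Vo) + C_M dVi/dt

    `vi_wave` holds input voltage samples spaced `dt` apart.
    """
    vo = vo_init
    out = [vo]
    for k in range(len(vi_wave) - 1):
        vi = vi_wave[k]
        dvi_dt = (vi_wave[k + 1] - vi) / dt
        miller = c_m(vi, vo)
        c_total = c_o(vi, vo) + miller + c_load
        dvo_dt = (i_out(vi, vo) + miller * dvi_dt) / c_total
        vo += dt * dvo_dt  # explicit Euler step
        out.append(vo)
    return out

# Hypothetical inverter-like components (constants here; lookup tables in practice).
VDD = 1.2
toy_i_out = lambda vi, vo: 1e-3 * (VDD - vi - vo)
toy_c_o = lambda vi, vo: 5e-15
toy_c_m = lambda vi, vo: 2e-15

# Input: 100 ps rising ramp followed by a flat segment, sampled every 1 ps.
vi_wave = [min(VDD, VDD * k / 100) for k in range(1001)]
vo_wave = simulate_output(vi_wave, 1e-12, VDD, toy_i_out, toy_c_o, toy_c_m, 13e-15)
```

In this toy setup the Miller term briefly pushes the output above the supply at the start of the input ramp, which is the kind of waveform detail that a delay/slew pair cannot represent.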
The CSM described before is used to model logic cells with a single channel-connected component (CCC) [16] . Examples of a single CCC are inverter, NAND, and NOR cells. For the case of multistage logic cells, such as OR and AND gates, the logic cell should be divided into multiple CCCs. For each CCC, a CSM should be generated. For example, AND (i.e., a NAND followed by an inverter) has two CCCs, therefore a cascade of two CSMs is used to model the AND gate. To calculate the output voltage of the AND cell, first the output voltage value of the NAND cell is calculated. This voltage is then input to the inverter cell to produce the output voltage of the AND cell.
III. CS MODELING-SEQUENTIAL LOGIC CELLS
We show how to construct CSMs for specific instances of sequential cells (i.e., a transmission-gate (TG) based latch and a master-slave (MS) flip-flop). This construction makes use of the circuit schematic of the flip-flop and requires understanding of the detailed operation of the flip-flop. The CSM construction process for other flip-flops (including, for instance, the monostable- or time-window-based ones), which is desirable from a practical viewpoint, has not been automated. Although this is an important undertaking, it falls outside the scope of this paper. Fig. 4 shows a simple latch with a data input D, a clock input CLK, the true output Q, and the complementary output $\overline{Q}$. The goal is to devise a CSM capable of computing the output voltage waveforms (for nodes Q and/or $\overline{Q}$) given the input voltage waveforms for the data and clock nodes. The feedback loop is the most challenging part of such a model, because the noise that has been transferred to the output node through the path from the inputs to the output can be magnified and fed back to the input. The model must be capable of accounting for this feedback-magnification effect.
We first construct the CSM for a TG, which is commonly found in sequential circuit elements [cf., Fig. 5(a)]. The TG essentially acts as a nonlinear resistor with the resistance value adjusted by its control input voltages (G and $\overline{G}$) as well as its input and output voltage levels. The nonlinear resistance behavior of the TG can be effectively modeled by a CS [cf., Fig. 5(b)].
Each capacitance in Fig. 5 (b) models the parasitic effects seen at the respective node. There also exists the Miller effect between every two nodes. The corresponding Miller capacitance between every pair of nodes is decoupled and merged into the capacitance of each node.
It is necessary to consider the effect of both G and $\overline{G}$; therefore, the dependency becomes 4-D. However, examining the TG closely, we see that the model components corresponding to the NMOS (PMOS) transistor do not depend on the $\overline{G}$ (G) voltage value. This makes all model components 3-D, with each component dependent on the input voltage, the output voltage, and exactly one of the G or $\overline{G}$ voltages. The TG characterization setup is shown in Fig. 6 and is performed in two steps: one with respect to node G and the other with respect to $\overline{G}$. CH1-CH4 are the voltage sources used during characterization. In the first step, $\overline{G}$ (CH4) is forced to a HIGH voltage level to turn off the PMOS transistor while the NMOS transistor is characterized. Each component in this part is dependent on three voltage values, namely the input, output, and G voltages (CH1-CH3, respectively). The second step of the characterization is conducted similarly to model the PMOS transistor by forcing G (CH3) to a LOW voltage level, thereby turning off the NMOS transistor. Each component in this part is dependent on the input, output, and $\overline{G}$ voltages. To construct the complete model, the components of the aforementioned parts are combined as depicted in Fig. 5(c).
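The benefit of splitting a 4-D dependency into two 3-D tables can be seen from simple arithmetic. The sketch below assumes, purely for illustration, that every voltage axis is sampled at 33 points (the grid size reported for the experiments in Section IV); the helper name `table_entries` is hypothetical.

```python
def table_entries(points_per_axis: int, dims: int) -> int:
    """Number of entries in a dense lookup table with `dims` voltage axes."""
    return points_per_axis ** dims

n = 33                                  # assumed samples per voltage axis
full_4d = table_entries(n, 4)           # one table over input, output, G, and G-bar
split_3d = 2 * table_entries(n, 3)      # one NMOS table plus one PMOS table
reduction = full_4d / split_3d

print(full_4d, split_3d, reduction)     # 1185921 71874 16.5
```

Under this assumption, the two 3-D tables need 16.5 times fewer entries (and characterization points) than a single 4-D table for each model component.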
The following set of equations defines the components in Fig.  5(c) : (8) where sets and represent the NMOS and PMOS model components, respectively. and are connected in parallel, and hence, they can be added into . Similarly, and consist of their respective parallel-connected components, as shown in equation set (8) .
The CS can be decoupled into two CSs at the input and output ( and , respectively.) Similarly, can be decoupled into and . and are parallel to each other and can be added to . Similarly, and can simply be added into . The resulting model with decoupled CSs is shown in Fig. 5(d) . Note that similarly to what was done for CS characterization of combinational cells (cf., Section II), the TG CSs are characterized using dc voltage sources. In addition, parasitic capacitances are characterized through transient simulations. For example, for the model components, a transition is applied to the output (input) voltage while the input (output) voltage is connected to a dc source.
A. Mode-Based Analysis of a Latch
At any time instance, the latch can be in one of the three modes: transparent, opaque (hold), or transition. In order to have an accurate CSM, the behavior of the latch in each mode should be investigated. In the following, we introduce the CSM for each mode. We will present a complete CSM (Fig. 9) , which covers all different modes, and is able to adapt itself and calculate the output voltage in any mode.
1) Transparent Mode: In this mode, CLK is HIGH (and CLK_bar is LOW), the latch is transparent, i.e., Q follows D, and TG1 is conducting while TG2 is OFF (cf., Fig. 5). The inverter between Q and $\overline{Q}$ passes the inverted D into $\overline{Q}$ [cf., Fig. 7(a)]. The latch CSM in this mode can be obtained by connecting the CSMs for the inverter and TG1 in series, resulting in the model depicted in Fig. 7(b).
Note that the CSM for TG has decoupled elements at its input and output, as was shown in Fig. 5(d) . However, in Fig. 7(b 
2) Opaque Mode: In this mode, CLK is LOW (and CLK_bar is HIGH), making TG2 conducting while TG1 is OFF. A feedback loop is thereby established such that the two inverters feed one another around the loop, while the input data are disconnected from the rest of the latch circuit [Fig. 8(a)]. The inverter model of Fig. 1 is used back to back to construct the CSM for this case [Fig. 8(b)]. The scenario in which TG2 is partially conducting will be captured in the transition mode described shortly.
3) Transition Mode (CLK in Transition):
This mode exists when CLK (CLK_bar) makes a falling (rising) transition and is not in the steady (high or low) state (e.g., when a setup or hold time test is performed). In this case, the two TGs may be partially ON. In contrast to the opaque mode, where the feedback loop is closed and the two cross-coupled inverters are connected back to back, in the transition mode, the current into node Q through the feedback path is controlled by CLK (CLK_bar). If CLK is HIGH, then this current will be zero; otherwise, it will be equal to the output current of the feedback inverter, i.e., the CS at node Q in Fig. 8(b). To account for this controlling behavior of the CLK/CLK_bar signals, we should make this CS in Fig. 8(b) dependent on these signals. We convert the resulting 4-D CS to 3-D CSs, i.e., instead of using a single CS that depends on the Q, $\overline{Q}$, CLK, and CLK_bar voltages, we utilize one CS that depends on the Q, $\overline{Q}$, and CLK voltages and another that depends on the Q, $\overline{Q}$, and CLK_bar voltages. The transition mode must also work for the case when the feedback loop is open, i.e., CLK is HIGH. In this case, TG1 is conducting. Therefore, the CSM should be a superset of the CSMs in the transparent mode [Fig. 7(b)] and the opaque mode [Fig. 8(b)], with the feedback CS made dependent on CLK and CLK_bar.
A similar situation applies to the parasitic capacitance at node Q, meaning that this capacitance is controlled by CLK and CLK_bar, i.e., a CLK-dependent component and a CLK_bar-dependent component are considered. The resulting model for the output nodes of the latch is presented in Fig. 9. Note that node $\overline{Q}$ is isolated from the CLK and CLK_bar nodes by the inverter in the feedback loop; therefore, the component values at $\overline{Q}$ may be identified by their dependency on Q and $\overline{Q}$ only. The CSM of Fig. 9 can handle waveforms of arbitrary shapes at the D and CLK/CLK_bar inputs, and enables construction of voltage waveforms at nodes Q and $\overline{Q}$ for any operation mode of the latch.
B. Precharacterization
We explain how to precharacterize the CSM of Fig. 9 . The setup is shown in Fig. 10 . The latch is divided into two parts, and each part is characterized separately. In the first step [ Fig. 10(a) ], TG1 is characterized as explained earlier at the beginning of this section (cf., Figs. 5 and 6), and and are calculated. The second step of the characterization (see Fig. 10(b) and the corresponding circuit model in Fig. 11 ) is explained below.
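As stated earlier, the CS at node Q is divided into a CLK-dependent component and a CLK_bar-dependent component to reduce the dimension of the characterization tables. To characterize the CLK-dependent component, CH4 is forced to zero while the CH1-CH3 voltage values are swept over the characterization range. The current value sourced through CH1 is measured as the value of this component. Characterization of the CLK_bar-dependent component is done similarly. The characterization of the CS at node $\overline{Q}$ is only dependent on the CH1 and CH2 values. By forcing these two supplies to a certain dc voltage level, the current sourced through CH2 will be unique regardless of the values of CH3 and CH4; therefore, there is no CLK and CLK_bar dependency for this CS. To characterize the Miller capacitance and the parasitic capacitance at node $\overline{Q}$, we start from the KCL equation at the $\overline{Q}$ node of the model in Fig. 11, which is given by (10).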
A number of transient simulations are performed to characterize the capacitive elements of our CSM. To precharacterize the Miller capacitance, a saturated ramp input voltage is applied to node Q (CH1 voltage source in Fig. 11). Simultaneously, the CH2 voltage value is swept over the characterization range as a dc level. The terms containing the time derivative of the $\overline{Q}$ voltage in (10) will thus be zero. Next, with this setup, the current associated with CH2 is monitored.
The measured current is plugged into the equation for the corresponding voltage values of the Q and $\overline{Q}$ nodes. Since the CS at node $\overline{Q}$ has already been characterized, the only unknown parameter in (10) is the Miller capacitance, which is thereby calculated. A similar procedure is used for the characterization of the parasitic capacitance at node $\overline{Q}$. However, this time, a ramp voltage is applied to CH2 while CH1 is forced to dc voltage values. The CSM characterization for our latch example (Fig. 4) takes about 5 s. Considering the fact that the characterization process is performed only once for each cell, the cost of the cell library characterization step is almost negligible.
As explained earlier for the combinational cell characterization, we have observed that the slope of the ramp input has a minor impact on the characterization results. As before, we examine ramp signals with different slopes and use the average parameter values for all the ramps to fill up the lookup tables.
To characterize the parasitic capacitance at node Q, the KCL equation at the Q node of the model in Fig. 11 is written as (11). A saturated ramp voltage is applied to node Q (CH1) and CH2 is swept over the characterization range as a dc level. The term containing the time derivative of the $\overline{Q}$ voltage becomes zero; the Miller capacitance and the CS at node Q are also known from the aforementioned characterization steps. Therefore, the capacitance at node Q can be calculated as a function of CH1, CH2, and also CH3 for its CLK-dependent component. Similarly, its CLK_bar-dependent component is calculated as a function of CH1, CH2, and CH4.
C. Output Voltage Calculation
Fig. 12 shows the complete CSM for the latch of Fig. 4. The output voltage waveforms at nodes Q and $\overline{Q}$ can be constructed for given input voltage waveforms (for CLK and D) in the presence of an arbitrary load. Two KCL equations, (12) and (13), are used to calculate the voltage values at Q and $\overline{Q}$, where the load terms denote the currents drawn by the loads at nodes $\overline{Q}$ and Q, respectively. The effects of the latch input nodes appear in the TG model components. As seen in (12) and (13), the CSM components at the input nodes (D, CLK, and CLK_bar) are not required for the output voltage calculation, but their characterization can be done similarly to what was explained in the previous section. As before, a Padé approximation [10] is used to model the load and substitute the load current as a function of the output voltage. An Euler integration method is used to numerically solve for the two unknown voltages at Q and $\overline{Q}$ from (12) and (13). Solving (12) and (13) when there is no feedback for the sequential cell (i.e., CLK is HIGH, CLK_bar is LOW, and so the feedback current is zero) is similar to calculating the output voltage of an inverter for which the input comes from a TG. When the feedback is present (i.e., CLK is LOW, CLK_bar is HIGH, and so the TG1 current is zero), (12) and (13) update one another's CSs, which thus models the magnification effect of the feedback loop. The other mode of operation, which is the transition mode, is also captured by the dependency of the CS and the capacitance at node Q on CLK and CLK_bar.
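The coupled solution of the two latch node voltages can be illustrated with a small forward-Euler sketch. Everything in it is a hypothetical toy: the piecewise-linear inverter characteristic, the linear clock gating of the TG1 and feedback currents, the constant node capacitance, and the function names are assumptions chosen for readability, and they are not the characterized lookup tables or the KCL equations of the paper.

```python
from typing import List, Sequence, Tuple

def inverter_target(v_in: float, vdd: float, gain: float = 5.0) -> float:
    """Toy piecewise-linear inverter transfer curve, clipped to [0, vdd]."""
    return min(vdd, max(0.0, vdd / 2 - gain * (v_in - vdd / 2)))

def simulate_latch(d_wave: Sequence[float], clk_wave: Sequence[float],
                   dt: float, vdd: float, g: float = 1e-3,
                   c_node: float = 10e-15) -> List[Tuple[float, float]]:
    """Forward-Euler update of the Q and Q-bar node voltages of a toy latch."""
    v_q, v_qb = 0.0, vdd
    trace = [(v_q, v_qb)]
    for v_d, v_clk in zip(d_wave, clk_wave):
        on = v_clk / vdd                       # 1.0: transparent, 0.0: opaque
        i_tg1 = g * on * (v_d - v_q)           # data path through TG1
        i_fb = g * (1.0 - on) * (inverter_target(v_qb, vdd) - v_q)  # gated feedback
        i_inv = g * (inverter_target(v_q, vdd) - v_qb)              # forward inverter
        # Both nodes are advanced together from the same previous state.
        v_q, v_qb = (v_q + dt * (i_tg1 + i_fb) / c_node,
                     v_qb + dt * i_inv / c_node)
        trace.append((v_q, v_qb))
    return trace

# Hypothetical stimulus: 500 ps transparent with D high, then 500 ps opaque with D low.
d_wave = [1.0] * 500 + [0.0] * 500
clk_wave = [1.0] * 500 + [0.0] * 500
trace = simulate_latch(d_wave, clk_wave, dt=1e-12, vdd=1.0)
q_final, qb_final = trace[-1]
```

In this toy run the latch captures the high value while transparent and holds it after the clock falls, even though D has changed. Because intermediate clock levels simply scale both currents, the same loop also produces a waveform for a partially conducting TG instead of reporting an unknown output.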

D. CS Modeling-MS Flip-Flops
To develop the CSM for a master-slave flip-flop [17] (such as the one shown in Fig. 13) , the latch CSM model of Fig. 9 can be substituted for both the master and the slave latches. Therefore for a given input data and clock, the voltage values at and can be calculated similarly to the approach in Section III-C. Since, master and slave parts are not separated from each other and a transmission gate (which is a channel-connected component) is in between, the iterative approach should not separate the computation of from and they should be performed simultaneously. In the experimental results section, we shall present the cases in which and are iteratively and concurrently updated.
E. CS Modeling-SR Latches
In this section, we briefly explain how the CSM for a different type of latch, i.e., an SR latch can be created. Fig. 14(a) shows an SR latch implemented using a pair of cross-coupled NAND cells. We use a multiple-input switching CSM for each NAND, and then, combine them to create the CSM for the SR latch. The resulting CSM is depicted in Fig. 14(b) .
The CSs at nodes Q and $\overline{Q}$ are characterized by 3-D lookup tables. Although, in theory, the capacitances at the input and output nodes of the NAND are dependent on the voltage values of the combinational cell terminals, these values are not as sensitive to these voltages as the nonlinear CSs. Therefore, the number of entries in the capacitance lookup tables can be significantly smaller than that for the CS lookup tables. The voltage values at Q and $\overline{Q}$ can be calculated similarly to the approach in Section III-C. Note that, in general, the characterization process for all our combinational and sequential cells takes less than a second for each CSM parameter. It took less than 8 s for the SR latch in Fig. 14 to be characterized.
IV. EXPERIMENTAL RESULTS
Our CSM simulator was implemented using C and Perl languages. All the experiments discussed in this section were performed on a Sun Fire V880 machine with the Ultra-SPARC III 750-MHz processor running Sun Solaris operating system.
A. CSM Evaluation for Combinational Cells
In order to show the effectiveness of our CSM for combinational logic cells, we compared it with HSPICE. Waveforms of arbitrary shapes, ranging from a simple saturated ramp to crosstalk-induced noisy waveforms with voltage fluctuation as high as , were applied by using the setup of Fig. 15. The set of experiments involved various logic cells, such as simple inverter and NAND gates, multistage cells such as OR and AND, as well as complex cells such as and-or-invert (AOI). Fig. 16 compares our model with HSPICE for several crosstalk-induced noisy waveforms applied to a minimum-size inverter in our 130 nm cell library; the results match HSPICE very closely. Fig. 17 shows two further cases with multiple aggressors.
The KTV model [10] is one of the most representative models among the existing CSMs, mainly because it considers the Miller effect between the input and the output nodes. For KTV, we held its two table-based parameters constant, setting each to the average value of its lookup table. Fig. 18 illustrates the absolute delay error of our model and of KTV with respect to HSPICE for a minimum-size inverter in our 130 nm cell library. The input to the inverter is coupled through a 50 fF coupling capacitance to an aggressor net. Both the input of the inverter and the aggressor net are driven by minimum-sized inverters. The cell under consideration has a FO4 load. The signal arrival time at the input of the victim line driver is set to 0 ps, while that of the aggressor driver (i.e., the noise injection time) is swept from 100 to 200 ps with a time step of 1 ps. The slew values for the signal transition at the input of the victim and aggressor drivers are chosen from the range of 100 to 500 ps. In this way, we create noisy waveforms of different shapes at the input of the inverter cell of interest. Compared to KTV, the accuracy of delay calculation for the minimum-size inverter cell is improved by 8.8% (17.3%) on average (maximum). Fig. 19 shows the absolute delay error trend for a similar experiment performed on an AOI22 cell with size 10x, where x is the minimum size AOI22. The coupling value is 80 fF and the arrival time of the aggressor line input driver is swept from 100 to 250 ps with a time step of 1 ps. The accuracy improvement in this case is 52.1% (93.4%) on average (maximum).
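As an illustration of how a delay error of this kind can be measured from two sampled waveforms, the following minimal Python sketch extracts a propagation delay and compares a model against a reference. It is not taken from the paper: the function names, the 50% measurement threshold, and the choice of the last crossing as the reference point for noisy waveforms are all assumptions made for this example.

```python
import numpy as np

def last_crossing_time(t, v, level):
    """Time of the LAST crossing of `level`, found by linear interpolation.

    Using the last crossing is an assumption of this sketch; it is one common
    convention for noisy waveforms that cross the threshold several times.
    """
    t = np.asarray(t, dtype=float)
    v = np.asarray(v, dtype=float)
    above = v >= level
    # Indices i where the waveform crosses `level` between samples i and i+1.
    idx = np.nonzero(above[1:] != above[:-1])[0]
    if idx.size == 0:
        raise ValueError("waveform never crosses the requested level")
    i = idx[-1]
    frac = (level - v[i]) / (v[i + 1] - v[i])
    return t[i] + frac * (t[i + 1] - t[i])

def propagation_delay(t, v_in, v_out, vdd):
    """Delay between the 50% crossings of input and output (assumed metric)."""
    half = 0.5 * vdd
    return last_crossing_time(t, v_out, half) - last_crossing_time(t, v_in, half)

def abs_delay_error_pct(delay_model, delay_ref):
    """Absolute delay error of a model relative to a reference, in percent."""
    return abs(delay_model - delay_ref) / abs(delay_ref) * 100.0

# Hypothetical data: a rising input and a falling output sampled on one grid.
t = [0.0, 1.0, 2.0, 3.0, 4.0]
v_in = [0.0, 0.0, 1.2, 1.2, 1.2]
v_out = [1.2, 1.2, 1.2, 0.0, 0.0]
delay = propagation_delay(t, v_in, v_out, vdd=1.2)
```

Repeating such a measurement over every noise injection time in a sweep would give the kind of per-point error curve plotted against HSPICE, with the average and maximum taken over the sweep.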
The high accuracy of our model is mainly due to our accurate modeling of parasitic effects during cell characterization, where the dependence of such effects on the input and output voltage values is considered. In general, the error in cell propagation delay is less than 0.7% (2.4%) on average (maximum) compared to HSPICE for the cells in our 130 nm library. The shape of the waveform strongly affects the accuracy of timing analysis; therefore, delay and output slew metrics alone may not be sufficient to capture the shape of the waveform. We therefore use the root-mean-square error (RMSE) between the output voltage waveform of the logic cell computed by our model and the one computed by HSPICE as a measure of waveform similarity. For each experiment, the RMSE is evaluated over a window that starts at the time at which the noisy input starts to change and ends when the output reaches its stable final value (either high or low). We finally normalize the RMSE to the supply voltage to take out the effect of scaling. Note that RMSE has the same unit as the quantity being estimated, which, in this case, is voltage. As mentioned earlier, the noise injection time represents the skew between the arrival of the aggressor and that of the victim signal transition. Table I reports the RMSE values for a few noise injection times for the combinational cells used in our experiments. It shows that our model is able to compute close-to-HSPICE output waveforms in terms of their actual shape.
A voltage step size of 0.04 is considered for the HSPICE sweep in all of our precharacterization steps. With the input and output voltage ranges extended to cover the under/overshoot situations (as explained in Section II-A), the size of the lookup tables is 33 × 33 for all experimental results reported in this paper. Our experiments showed that increasing the size of the tables from 33 × 33 to 66 × 66 increases the waveform similarity by up to 17%. However, as reported in Table I, the waveform similarity obtained with 33 × 33 lookup tables is already very good; we therefore found this sizing to be a reasonable tradeoff between accuracy and run-time/memory efficiency.
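The normalized RMSE used as the waveform similarity metric can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes both waveforms are sampled on the same uniform time grid, that the window bounds are supplied by the caller, and that the normalization constant is the supply voltage; the names `normalized_rmse`, `t_start`, `t_end`, and `vdd` are hypothetical.

```python
import numpy as np

def normalized_rmse(t, v_ref, v_model, t_start, t_end, vdd):
    """RMSE between two sampled waveforms over [t_start, t_end], divided by vdd.

    Assumes v_ref and v_model share the uniform time grid `t` (an assumption
    of this sketch; resample first if the grids differ).
    """
    t = np.asarray(t, dtype=float)
    v_ref = np.asarray(v_ref, dtype=float)
    v_model = np.asarray(v_model, dtype=float)
    mask = (t >= t_start) & (t <= t_end)
    if not mask.any():
        raise ValueError("no samples fall inside the comparison window")
    err = v_model[mask] - v_ref[mask]
    rmse = np.sqrt(np.mean(err ** 2))  # same unit as the waveforms (volts)
    return float(rmse / vdd)           # dimensionless after normalization

# Hypothetical data: the model is offset from the reference by 12 mV.
t = np.linspace(0.0, 1.0e-9, 101)
v_ref = np.zeros_like(t)
v_model = v_ref + 0.012
score = normalized_rmse(t, v_ref, v_model, t_start=0.0, t_end=1.0e-9, vdd=1.2)
```

With a constant 12 mV offset and a 1.2 V supply, the score is 0.01, i.e., 1% of the supply voltage.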
B. CSM Evaluation for Sequential Cells
To evaluate our CSMs for the sequential cells, we also use HSPICE [14] to provide the "golden" result. In our experiments, we considered voltage waveforms with arbitrary shapes, from simple saturated ramps to crosstalk-induced noisy waveforms with voltage fluctuations as high as 85% of the supply voltage. It is important to capture the noise effects at the output of the latch in order to determine whether noise can flip the state of the latch through the feedback loop(s). Experiments were performed by using latches and flip-flops of different types (to be described later), comparing the Q and $\overline{Q}$ output waveforms with those of HSPICE. An example is depicted in Fig. 20(a), where input D is noisy; however, this does not result in a change of state for the latch. The output waveforms generated by our model closely match the HSPICE waveforms. Fig. 20(b) shows another setup, in which the noisy latch input results in an illegal change of state. The signal remains at the high level and causes a functional error. Fig. 20(b) shows how closely the latch output voltage computed by our CSM matches that of HSPICE.
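Whether a noisy input has flipped the latch state can be checked directly on a computed output waveform. The sketch below is one simple way to do so and is not the procedure used in the paper: the half-supply logic threshold, the averaging over a few settled samples at each end of the waveform, and the names `logic_level`, `state_flipped`, and `settle_samples` are assumptions of this example.

```python
import numpy as np

def logic_level(v, vdd):
    """Map a settled voltage to a logic value using a half-supply threshold."""
    return 1 if v > 0.5 * vdd else 0

def state_flipped(q_waveform, vdd, settle_samples=10):
    """Return True if the settled final logic value of Q differs from the initial one.

    The first and last `settle_samples` samples are averaged so that a single
    noisy sample does not decide the outcome (an assumption of this sketch).
    """
    q = np.asarray(q_waveform, dtype=float)
    initial = float(np.mean(q[:settle_samples]))
    final = float(np.mean(q[-settle_samples:]))
    return logic_level(initial, vdd) != logic_level(final, vdd)

# Hypothetical waveforms: a glitch that recovers, and a glitch that latches.
q_recovers = [0.0] * 10 + [0.5] * 5 + [0.0] * 10
q_latches = [0.0] * 10 + [1.2] * 10
recovered = state_flipped(q_recovers, vdd=1.2)
latched = state_flipped(q_latches, vdd=1.2)
```

The first waveform corresponds to a noisy input that does not change the stored state, and the second to an illegal change of state that persists after the noise has passed.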
Similarly to the combinational cell models, we calculate the RMSE for the latch model to measure its waveform similarity to HSPICE. Equation (14) is used to calculate the RMSE; in this case, however, the compared quantities are the voltage values of the latch output (Q) at a given time. For each experiment, the comparison window starts at the time at which the noisy input D starts to change and ends when both Q and $\overline{Q}$ reach their stable final values (either high or low). We finally normalize the RMSE to the supply voltage to take out the effect of scaling. To generate different noisy waveforms for D, the noise injection time (which is defined as the arrival time of the aggressor that is attacking the D signal) is swept from 100 to 600 ps with a step size of 5 ps. Slew values for the signal transition at the input of the victim and aggressor drivers are in the 100 to 500 ps range. The CLK signal was kept fixed at 1.6 ns. Note that in some of the cases, an unwanted change of the latch state may occur. Table II shows the normalized RMSE for some of these cases in our 130 nm library for the output. Latch1 and Latch2 are TG-based latches (Fig. 4) of different sizes (in Latch1, all elements are minimum size, and in Latch2, they are all 10x), whereas Latch3 is a minimum-sized SR-type latch (Fig. 14). FF1 is a minimum-sized MS flip-flop (cf. Fig. 13). As reported in Table II, the RMSE is around 1% of the supply voltage, which confirms that our voltage waveform closely matches that produced by HSPICE.
TABLE II. WAVEFORM SIMILARITY (NORMALIZED RMSE) COMPARISON WITH HSPICE FOR SEQUENTIAL CELLS
TABLE III. NORMALIZED RMSE AND RUN-TIME COMPARISON WITH HSPICE FOR SEQUENTIAL CELLS
In addition to sweeping the noise injection time from 100 to 600 ps, the CLK signal was also swept from 1 to 1.9 ns with a step size of 5 ps. This resulted in 9000 different configurations for each cell under evaluation. It is seen from Table III that the CSM-based calculator is on average 1200 times faster than HSPICE while producing results with nearly the same accuracy.
V. CONCLUSION
An accurate CSM for combinational cells was presented. Furthermore, CSMs for sequential cells were introduced. In addition to the multistage logic nature of the sequential cells, the main challenge was the presence of feedback loops. Our proposed model addresses both by creating the necessary current source (CS) and parasitic components. Given input and clock voltage waveforms of arbitrary shapes, our model can accurately compute the output voltage waveform of a register cell and, hence, the timing and noise parameters associated with the cell. This was shown to considerably reduce the pessimism in timing and noise analysis. Experimental results for our CS sequential cell model demonstrate close-to-HSPICE waveforms with a significant run-time speedup.
