A Dissertation<br>by<br>XIANG LU

Submitted to the Office of Graduate Studies of
Texas A\&M University
in partial fulfillment of the requirements for the degree of

## DOCTOR OF PHILOSOPHY

December 2005

Major Subject: Computer Engineering


| Approved by: |  |
| :--- | :--- |
| Chair of Committee, | Weiping Shi |
| Committee Members, | Duncan M. Walker <br> Gwan Choi <br> Andreas Klappenecker |
| Head of Department, | Costas Georghiades |


ABSTRACT<br>Fault Modeling, Delay Evaluation and Path Selection for Delay Test Under Process<br>Variation in Nano-scale VLSI Circuits. (December 2005)<br>Xiang Lu, B.S., Xi'an Jiaotong University;<br>M.S., Xi'an Jiaotong University<br>Chair of Advisory Committee: Dr. Weiping Shi

Delay test in nano-scale VLSI circuits becomes more difficult with shrinking technology feature sizes and rising clock frequencies. In this dissertation, we study three challenging issues in delay test: fault modeling, variational delay evaluation and path selection under process variation. Previous research on resistive spot defects, such as resistive opens and bridges in the interconnect and resistive shorts in devices, lacked an accurate fault model. As a result, it was difficult to perform fault simulation and select the best vectors. Conventional methods to compute variational delay under process variation are either slow or inaccurate. On the problem of path selection under process variation, previous approaches either choose too many paths or miss paths that must be tested.

We present new solutions in this dissertation. We propose a new fault model that clearly and comprehensively expresses the relationship between electrical behaviors and resistive spot defects. We then model the effect of process variations on path delays with a linear function, and derive a fast method to compute the coefficients of that function. Finally, we present new path pruning algorithms that efficiently prune unimportant paths, so that as few paths as possible are selected for test while the required fault coverage is still met. The experimental results show that the new solutions are efficient and accurate.

## DEDICATION

To my parents

## ACKNOWLEDGMENTS

It has been a great pleasure working with the faculty, staff, and students at Texas A\&M University, College Station, during my Ph.D. program starting from the fall of 2000. The doctoral study has been possible with the help of a lot of people.

This dissertation would never have been completed if I were not given plenty of freedom to pursue my research interests, thanks in large part to the kindness and advising provided by Dr. Weiping Shi, my long-time advisor and the committee chair. He patiently provided the vision, encouragement and advice necessary for me to proceed through the doctoral program and complete my dissertation. I have learned from him to proceed with my research interests in the future. I really appreciate his kindness and help, and consider myself fortunate to have been one of his students.

I would like to thank Dr. Duncan M. (Hank) Walker for his great help and advice on my research work. His intuitive ideas and industrial concerns guided me in the correct research directions and focus. It was a great time working with him in the Semiconductor Research Corporation (SRC) project under contract 2000-TJ-844.

Special thanks to my committee, Dr. Weiping Shi, Dr. Duncan M. (Hank) Walker, Dr. Gwan Choi, and Dr. Andreas Klappenecker, for their support, guidance and helpful suggestions.

I also benefited much from the faculty in Computer Engineering, Dr. Jiang Hu, Dr. Sunil Khatri, and Dr. Peng Li. I have learned a lot from their research and ideas presented in the weekly seminar, and the discussions with them were very helpful to my research work.

I am grateful to Zhuo Li and Wangqi Qiu for their great help in my research work. Zhuo Li helped me solve difficult research problems and provided a lot of experimental data to verify research results. He also did a good job in fault modeling of resistive bridges to make the fault modeling work complete. Wangqi Qiu was a great partner in the SRC project under contract 2000-TJ-844. We had a good time working together and my research benefited much from him. I would also like to thank Jing Wang and Senthikumar Veluswami. Their research work, timing analysis under power noise and statistical timing analysis, provided helpful attempts to extend parts of my research work.

Also, thanks to Mrs. Carolyn A. Warzon for her help in the group of Computer Engineering. I thank Mrs. Tammy Carda and the Department of Electrical Engineering for providing me with a Teaching Assistant position in the spring of 2005.

My research was funded in part by the SRC grant 2000-TJ-844, NSF grants CCR-0098329, CCR-0113668, EIA-0223785, and ATP grant 512-0266-2001. I thank these organizations for providing financial support.

I had a great team of officemates: Zhijun Cai, Xiaonan Ma, Zhiquan Qiu, Dongming Peng, Cheng-Ta Chiang, Shu Yan, Zhuo Li, C. N. Sze, Yong Liu, Haiyun You, Chuan He, Le Zou, Wentao Zhao, etc. I thank them for making me very comfortable in the group of Computer Engineering and for making the study in Werc111 a memorable one.

Finally, I want to thank my wife Ziding Yue and my parents for their constant support during my doctoral program over the past 5 years. Their support and belief in my abilities were indispensable for me to complete the dissertation. I would also like to thank my brother Tao Lu, my sister-in-law Chun Yue and my parents-in-law for their support and patience.

## TABLE OF CONTENTS

ABSTRACT
DEDICATION
ACKNOWLEDGMENTS
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES

I. INTRODUCTION

1. Resistive Spot Defect
2. Process Variation
3. Path Selection
4. Solutions
5. Organization

II. FAULT MODELING OF INTERCONNECT RESISTIVE SHORTS

1. Resistive Open in Interconnect
2. Resistive Bridge in Interconnect

III. FAULT MODELING OF DEVICE RESISTIVE SHORTS

1. Linear Resistance Based Approach
2. Current Table Based Approach
3. Fault Behavior and Vector Selection
4. Test Performance Improvement

IV. PARAMETRIC DELAY EVALUATION

1. ISCAS85/89 Benchmark Circuits
2. Linear Delay Modeling
3. Experimental Results

V. LONGEST PATH SELECTION

1. Delay Test Using Longest Paths
2. Path Pruning Algorithms
3. Longest Path Generation
4. Experimental Results

VI. SUMMARY AND CONCLUSIONS
REFERENCES
VITA

## LIST OF FIGURES

Figure 1. Delay of four longest paths under process variation.
Figure 2. Resistive open fault model.
Figure 3. Delay increases linearly with open resistance.
Figure 4. The resistive bridge circuit model.
Figure 5. The simplified resistive bridge circuit model.
Figure 6. The circuit model when In1 is low and In2 is high.
Figure 7. The relationship between $R_{b}$ and Out1.
Figure 8. Four basic resistive bridge fault models for static analysis.
Figure 9. The circuit model when In1 is rising and In2 is low.
Figure 10. The approximation circuit model for Out1 when In1 is rising and In2 is low.
Figure 11. (a) The equivalent circuit of Figure 9 when there is no bridge. (b) The equivalent circuit of Figure 10.
Figure 12. (a) A bridge causes an increased delay at Out1 when input pattern is (falling, high). (b) The bridge causes a decreased delay at Out1 when input pattern is (rising, high).
Figure 13. Four basic resistive bridge fault models.
Figure 14. Example circuit for simulation.
Figure 15. An example relationship between $R_{b}$ and the increased delay at Out1 with the rising step input.
Figure 16. An example relationship between $R_{b}$ and the increased delay at Out1 with the falling step input.
Figure 17. An example relationship between $R_{b}$ and the increased delay at Out1 with the rising ramp input.
Figure 18. Gate-to-source and gate-to-drain shorts in NMOS and PMOS transistors.
Figure 19. The circuit with an NMOS gate-to-drain short.
Figure 20. The circuit for static analysis.
Figure 21. A current path connects $A$ and $B$ through $R_{b}$.
Figure 22. $R_{A, \text{eff}}$ is computed by equalizing the average current in the two circuits.
Figure 23. $R_{B, \text{eff}}$ is computed by equalizing the average current in the two circuits.
Figure 24. Decreasing delay estimated and computed by SPICE vs. $R_{b}$.
Figure 25. $C_{2}$ is discharging through the pull-down path in the faulty cell.
Figure 26. Increasing delay estimated and computed by SPICE vs. $R_{b}$.
Figure 27. The NMOS gate-to-source short may connect $A$ to GND or connect to $B$ under different input signals.
Figure 28. The circuit with an NMOS gate-to-source short in static analysis.
Figure 29. Static voltage approximation compared with SPICE simulation.
Figure 30. Threshold short resistance $R_{A, \text{low}}$.
Figure 31. The circuit with an NMOS gate-to-source short in transition.
Figure 32. Delay decreases with smaller $R_{b}$, under a rising input signal.
Figure 33. Delay increases with smaller $R_{b}$, under a falling input signal.
Figure 34. Simplified circuit for RC delay approximation.
Figure 35. Circuit with an NMOS gate-to-drain short under a rising input.
Figure 36. A current path connects $A$ and $B$ through $R_{b}$.
Figure 37. Approximated circuit delay compared with SPICE simulation for NMOS gate-to-source short, under a rising input.
Figure 38. Approximated circuit delay compared with SPICE simulation for NMOS gate-to-source short, under a falling input.
Figure 39. Approximated circuit delay compared with SPICE simulation for NMOS gate-to-drain short, under a rising input.
Figure 40. Approximated circuit delay compared with SPICE simulation for NMOS gate-to-drain short, under a falling input.
Figure 41. The circuit failure distribution vs. the short resistance for ISCAS89 circuit s1488.
Figure 42. The circuit failure distribution vs. the short resistance for ISCAS89 circuit s38417.
Figure 43. Delay variations due to process variation are linear in SPICE simulation. The x-axis indicates process variation and the y-axis indicates the percentage deviation from the nominal delay.
Figure 44. The delay effect of process variation is additive, which is demonstrated by the error distribution of approximating $\Delta d_{w1+w2}$ with $\Delta d_{w1}+\Delta d_{w2}$ over 160 buffer-to-buffer segments in circuit c432.
Figure 45. The distribution of $\left(\Delta C_{jk} / \Delta x_{i}\right) / C_{jk}$ due to metal 2 width variation on 406 samples in ISCAS85 circuit c432. The x-axis indicates values of $\left(\Delta C_{jk} / \Delta x_{i}\right) / C_{jk}$.
Figure 46. Test on the longer path is more likely to capture delay defect.
Figure 47. Four path delays under one process parameter $x$.
Figure 48. Flowchart of longest path generation.
Figure 49. Distribution of longest paths versus path indexes.
Figure 50. Path distribution vs. path indexes in s38417 using the new method.
Figure 51. Path distribution vs. path indexes in c432 using the min-max method.
Figure 52. Path distribution vs. path indexes in c432 using the new method.

## LIST OF TABLES

TABLE 1. The bridge fault model for Out1.
TABLE 2. Increased Delay (ID) or Decreased Delay (DD) at Out1 for Figure 10(d), $r$ means rising, $f$ means falling, 1 means high, 0 means low, and other variables are defined in equation series.
TABLE 3. Delay fault simulation results for 10000 random vectors using our bridge model.
TABLE 4. Relationship between delay changes and output signal waveform.
TABLE 5. Approximated delay and SPICE simulation results in Figure 37 - Figure 40.
TABLE 6. Behaviors for each type of short under different input signals.
TABLE 7. Vectors to cause the greatest increasing delay change.
TABLE 8. Cell list in standard cell library.
TABLE 9. Running time comparison between the traditional RSM and new methods for ISCAS85 circuits.
TABLE 10. Accuracy comparison between the traditional RSM and new methods for ISCAS85 circuits.
TABLE 11. Parameters of $f(k)$ and the estimated and the actual maximum percentage of longest paths.
TABLE 12. Performance comparison between the min-max method and the new method.
TABLE 13. Path set size distribution for all fault sites in three largest circuits.

## I. INTRODUCTION

The 2002 International Technology Roadmap for Semiconductors (ITRS) [1] projects at-speed test as an increasingly difficult problem. With the shrinking feature size in VLSI technology and rising clock frequencies, traditional functional and delay test approaches are becoming either infeasible or inefficient. More challenging issues, such as fault modeling for resistive spot defects and the effect of process variations, must be considered in at-speed test.

## 1. Resistive Spot Defect

There are two types of spot defects in the interconnect: opens and bridges. An open is an unintended impedance in a wire, and is considered a wire cut if the impedance becomes infinite. A bridge is an unintended electrical connection between interconnects. The spot defects inside MOS transistors, gate oxide shorts, are unintended electrical connections through the gate oxide between the gate and the source, drain, or channel of a MOS transistor. Gate oxide shorts have been identified as a major type of fabrication defect and, in some CMOS processes, the dominant defect [2].

In traditional logic tests, spot defects are considered to have zero resistance and are modeled as stuck-at faults, as in [3][4]. However, spot defects have a wide range of resistance values, which plays an important role in both logic and delay behaviors. For example, Hawkins and Soden [2] showed that the experimentally measured gate oxide short resistance ranges from 0.8 k$\Omega$ to 4.7 k$\Omega$. They also found that gate oxide shorts cause degraded voltage levels and increased propagation delays. Hao and McCluskey [5] studied the circuit behaviors of resistive shorts and their dependence on the input signals of the faulty gate and the driving gate. Renovell et al. [6] showed that the voltage behavior of gate oxide shorts is related to the short resistance, short location and short size. Degraeve et al. [7] introduced a method to determine the breakdown position in short-channel NMOSFETs.

This dissertation follows the style of IEEE Transactions on Computer-aided Design of Integrated Circuits and Systems.

To detect resistive spot defects in delay test, an accurate and realistic fault model that considers the short resistance is necessary. Gaitonde and Walker [8] studied the problems faced in mapping spot defects to changes of circuit topology, and found that it is more difficult to map gate oxide shorts than to map interconnect shorts. Segura et al. [9] developed electrical models of gate oxide shorts, and showed that, depending upon location and transistor type, the short can behave as a resistor, a diode, a parasitic MOSFET or a parasitic BJT. Hao and McCluskey [10] modeled the gate oxide short as a resistor connecting the gate and the source or the drain of a transistor. Based on the model, they showed the change of the logic and delay under different signal patterns. However, the model does not give a quantitative relationship between the delay change and the short resistance, and cannot be used to measure the quality of the test pattern in delay test. Sar-Dessai and Walker [11] presented several logic fault models for resistive bridge faults. Moore et al. [12] presented comprehensive delay fault analysis for resistive bridge models and coupling effects, but the delay calculation was not given. Other techniques, such as the mixed-mode simulation method by Chuang and Hajj [13] and neural network techniques by Shaw et al. [14], give more accurate bridge fault models, but these methods are not efficient for large circuits due to their high time complexity. In [15], we proposed a circuit level fault model for resistive opens and bridges in the interconnect. However, resistive shorts in MOS transistors are more complex and the approach in [15] is insufficient.

## 2. Process Variation

Process variations are unintended deviations from the requested manufacturing parameters. There are two distinct sources of variation: environmental factors and physical factors [16]. Environmental factors include variations in the power supply voltage and the temperature. These factors are highly design dependent and exhibit time constants similar in scale to the clock frequency. Physical factors include variations in the electrical and physical parameters characterizing the behaviors of wires and devices. These variations are caused by processing and mask imperfections and various wear-out mechanisms. Physical factors exhibit long time constants, typically measured in years, and can be further divided into inter-die variations and intra-die variations. Inter-die variations are independent of the design implementation, and are considered globally at the die-to-die, wafer-to-wafer and lot-to-lot levels. Intra-die variations depend on the design implementation and are considered locally.

With the decreasing feature size of VLSI technology and the rising clock frequencies, the impact of process variations is increasingly felt. A great amount of research has been done recently on process variations, such as the clock skew analysis under process variation [17][18][19], statistical performance analysis [20][21][22], worst case performance analysis [23][24], parametric yield estimation [25], impact analysis on micro architecture [26] and delay fault test under process variation [27][28][29][30].

In previous research, the variational path delay under process variation is modeled either as a function of process variables [17][18][20][24][25][30], or as a random variable with a certain distribution [19][22][28][31]. However, the conventional methods to compute path delay are either slow or inaccurate. The response surface method (RSM), which performs multiple simulations and curve fittings, is used in [20][24][25][31]. To achieve high accuracy, the RSM must perform multiple parasitic extractions and delay evaluations under different process conditions. Due to the large number of metal layers in modern technologies, there are many interconnect process variables. For example, for a $k$-layer technology, there are $3k$ process variables related to interconnect, corresponding to the metal width, metal thickness and inter-layer dielectric thickness of each layer. As a result, the traditional RSM has a prohibitive running cost for large circuits. Orshansky et al. [23] derived the delay sensitivity to gate length variation based on a simple model, and expressed delay as a function of gate length. Their method does not automatically apply to interconnect process variation due to the lack of a similar model for the interconnect. Methods that treat the path delay as a random variable must also extract parasitic RCs and compute the path delay under different process conditions [22][28], resulting in excessive running time.

## 3. Path Selection

Delay test of digital integrated circuits ensures that the signal from any primary input to any primary output propagates in less time than the specification allows. A circuit is considered faulty if the delay of any path exceeds the specification. A delay increase due to a local defect, such as a resistive bridge or a resistive open, may cause a timing violation on the path through the defect, which is modeled as a delay fault [15][32]. Such a delay increase is localized to a gate input, a gate output or an interconnect wire in the circuit; this position is called a local fault site. Testing the longest path through the local fault site will capture the delay increase due to the fault. When process variation is not considered, the problem of finding the longest testable paths that cover all local fault sites has been extensively studied [33][34][35]. In these methods, only the one path with the maximum delay is tested for each local fault site.

When process variation is considered, the path delay becomes a function of process variables. Among all paths through a fault site, there are often multiple paths whose delay can be the maximum under different process conditions [36]. For each fault site $s$, we call a path longest for $s$ if the path has the maximum delay among all paths through $s$ under some process conditions. On the other hand, we call a path redundant for $s$ if the path can never be longest for $s$ under any process condition. For example in Figure 1, there are four longest paths, $P_{1}, P_{2}, P_{3}$ and $P_{4}$, through a fault site in ISCAS85 circuit c432 using TSMC 180 nm technology. Two process variables, $x_{2}$ and $x_{3}$, represent metal thickness variations of Metal 2 and Metal 3 respectively. In this example, four paths form the upper bound of the delay for all paths through the fault site within the range of process variation. Any path whose delay is below this bound is redundant for the fault site.


Figure 1. Delay of four longest paths under process variation.

Traditionally, tests are only performed on the longest paths under the nominal or worst-case process condition. However, this might be insufficient. In Figure 1, $P_{1}$ is the longest under the nominal process condition $\left(x_{2}=0, x_{3}=0\right)$ and also under the worst-case process condition $\left(x_{2}=-20 \%, x_{3}=20 \%\right)$. However, under process condition $x_{2}=20 \%$ and $x_{3}=-20 \%$, $P_{1}$ is much shorter than $P_{4}$. Since it is hard to know the actual process variation for a chip under test, we must test all paths that could be longest under any process condition. In this dissertation, we propose techniques to select all longest paths through each fault site to maximize the fault coverage.
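To illustrate the definition of longest and redundant paths, the sketch below assumes that each path delay is a linear function of the process variables and samples the variation range on a grid, recording which path is longest at each sample. The coefficients are made-up values chosen to mimic the situation described for Figure 1 and are not taken from it, and the function names are hypothetical. Grid sampling is only a brute-force approximation, not the pruning algorithms proposed in this dissertation, and it can miss a path that is longest only between grid points.

```python
import itertools


def path_delay(d0, coeffs, x):
    """Linear delay model (an assumption): d0 + sum_i coeffs[i] * x[i]."""
    return d0 + sum(a * xi for a, xi in zip(coeffs, x))


def sampled_longest_paths(paths, bounds, steps=21):
    """Return the names of paths that are longest at some grid sample.

    paths:  dict name -> (nominal delay, per-variable coefficients)
    bounds: list of (low, high) ranges, one per process variable
    """
    axes = [
        [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
        for lo, hi in bounds
    ]
    longest = set()
    for x in itertools.product(*axes):
        best = max(paths, key=lambda name: path_delay(*paths[name], x))
        longest.add(best)
    return longest


# Hypothetical delays in ns; x2 and x3 each vary within +/-20%.
paths = {
    "P1": (10.0, (-2.0, 2.0)),   # longest at nominal and at (-20%, +20%)
    "P4": (9.8, (2.0, -2.0)),    # longest at (+20%, -20%)
    "P5": (9.0, (0.1, 0.1)),     # always below the upper bound
}
bounds = [(-0.2, 0.2), (-0.2, 0.2)]

longest = sampled_longest_paths(paths, bounds)
redundant = set(paths) - longest
print("longest at some sample:", sorted(longest))
print("never longest on the grid:", sorted(redundant))
```

With these made-up numbers, testing only the nominal-longest path would select P1 alone, while P4 also has to be tested because it dominates in another corner of the variation range.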

Modern delay optimization tools tend to make many paths critical or near critical [37], resulting in too many paths for test. Pruning some of the paths based on structural correlation and process variation correlation is an effective approach to reduce the number of paths. If two paths share some nets or gates, there is a structural correlation between them. Similarly, if two nets run on the same metal layer, there is a process correlation between them. Luong and Walker [27] proposed a pruning technique using both the structural correlation and the process correlation. As a result, they significantly reduced the number of paths. However, they only considered the longest paths for the entire circuit, instead of the longest paths through every local fault site. Furthermore, they did not consider interconnect delay. Tani et al. considered the longest paths through every local fault site [38]. They used a min-max comparison method, with the help of the structural correlation but not the process correlation. As a result, their approach is overly pessimistic and produces too many paths. Liou et al. [28] used Monte Carlo simulation to select a set of critical paths that maximizes the probability of covering all critical paths under all process conditions. However, Monte Carlo simulation is very slow for large circuits and no running time is given for their method.

## 4. Solutions

In this dissertation, we focus on the impact of these three issues on delay test, and present our new solutions.

On resistive spot defects, we propose an accurate and realistic fault model for delay test in CMOS circuits. The resistive spot defect is modeled as a logic fault or a delay fault depending on the resistance and input patterns. The delay change is approximated as a function of the short resistance. Based on the new fault model, we present a method to select the best test vectors to cause the worst-case effect of the short on the circuit. As an application, we quantitatively evaluate the performance of the delay test and the logic test, and show the improvement using the new fault model.

On the computation of variational delay due to process variation, we present a new method, PARADE, for fast PARAmetric Delay Evaluation using analytical formulae and pre-characterized lookup tables. We first model variational path delays as linear functions of process variables. By analyzing a small sample of parasitic capacitances extracted with any commercial parasitic extraction tool, we derive an efficient method to compute the effect of process variation on parasitic capacitance. The variational path delay is then evaluated efficiently, based on either the lumped C delay model or the effective capacitance delay model. Neither method requires multiple parasitic extractions or multiple delay evaluations, resulting in a significant speedup over the traditional RSM. The efficiency of our methods makes it possible to comprehensively analyze circuit performance over all interconnect and device process variables for large circuits. We perform experiments on ISCAS85 circuits under TSMC 180 nm 1.8 V 6-metal-layer technology. Experiments show that our methods achieve high accuracy and efficiency. Compared to the traditional RSM, the delay error is within $5 \%$ using analytical methods, and is within $3 \%$ using the table lookup method.

On longest path selection, we present a new method to select longest paths for each local fault site in the circuit. To maximize fault coverage, we want to find as many longest paths as possible. On the other hand, to minimize test costs, we want to find as few paths as possible. Given a set of testable paths, our method first models the path delay as a linear function of process variation variables, then uses two pruning algorithms to remove paths that are redundant or almost redundant. We repeat the process for each fault site in the circuit, and the remaining paths are the longest paths for delay test. Experiments on the ISCAS circuits show that the new method is efficient and significantly reduces the number of paths for test, compared to the previous best method proposed in [38]. We consider process variations of devices and interconnect in our work, and the method can also be applied to path selection under operating variations of supply voltage and temperature [39].

## 5. Organization

The dissertation is organized as follows. In Sections II and III, we analyze the electrical behaviors of resistive opens and bridges in the interconnect and of resistive shorts in devices, and present the new fault model for each type of spot defect. In Section IV, we present the new method to compute variational delay under process variation. In Section V, we propose the path pruning algorithms to select longest paths for test, and show experimental results obtained using the new algorithms. We conclude our work in Section VI.

## II. FAULT MODELING OF INTERCONNECT RESISTIVE SHORTS

## 1. Resistive Open in Interconnect

Resistive opens can be classified as strong opens ($>10\ \mathrm{M}\Omega$) and weak opens ($\leq 10\ \mathrm{M}\Omega$) [32]. Strong opens cause stuck-at faults and can be detected by regular stuck-at patterns. Weak opens cause delay changes in the interconnect, that is, delay faults, and thus may not be detected by regular stuck-at patterns [2][12]. Montanes and Gyvez [32] showed that in modern nano-scale circuits, the percentage of weak opens is high and delay test for such defects is necessary.

The resistive open fault model is shown in Figure 2. In this model, a resistive open is represented by a resistor $r_{o}$ in a net at the location where the open defect may occur. The input buffer $\mathrm{B}_{1}$ and output buffer $\mathrm{B}_{2}$ represent arbitrary CMOS gates.


Figure 2. Resistive open fault model.

From extensive SPICE simulation, we found the delay increases almost linearly with the open resistance. Figure 3 shows the SPICE simulation result of a typical net using TSMC 250nm technology.

The open resistance is modeled as an increased delay $d^{\prime}$ in the net. The increased delay $d^{\prime}$ is approximated by the linear function $d^{\prime}=\left(r_{o} / R_{\text{nominal}}\right) \cdot d$, where $r_{o}$ is the open resistance, $R_{\text{nominal}}$ is the interconnect resistance without the open and $d$ is the nominal delay. Above a certain resistance value, which depends on the clock frequency of the circuit, the open becomes a stuck-open fault.
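As a small numerical illustration of the linear model above, the following sketch evaluates the increased delay for a few open resistances. The function name and all numeric values are hypothetical and are not taken from the simulations in this dissertation.

```python
def open_delay_increase(r_open, r_nominal, d_nominal):
    """Increased delay d' = (r_o / R_nominal) * d from the linear open model."""
    if r_nominal <= 0:
        raise ValueError("r_nominal must be positive")
    return (r_open / r_nominal) * d_nominal


# Hypothetical net: 200 ohm nominal wire resistance, 50 ps nominal delay.
R_NOMINAL = 200.0      # ohms (assumed)
D_NOMINAL = 50e-12     # seconds (assumed)

for r_open in (1e3, 10e3, 100e3):
    extra = open_delay_increase(r_open, R_NOMINAL, D_NOMINAL)
    print(f"r_o = {r_open:8.0f} ohm -> increased delay = {extra * 1e12:.1f} ps")
```

Because the model is linear in $r_{o}$, doubling the open resistance doubles the increased delay, which is why a sufficiently large open eventually exceeds any fixed clock period.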


Figure 3. Delay increases linearly with open resistance.

## 2. Resistive Bridge in Interconnect

The resistive bridge circuit model is shown in Figure 4. Each gate $B_{i}$ is an arbitrary CMOS gate. To simplify the analysis, CMOS devices in $B_{1}$ and $B_{2}$ are replaced by switches and linear resistors in Figure 5, and $B_{3}$ and $B_{4}$ are replaced by buffers. We use a simple RC interconnect model that lumps interconnect parasitic capacitance with the load capacitance.


Figure 4. The resistive bridge circuit model.


Figure 5. The simplified resistive bridge circuit model.

Circuit parameters in Figure 5 include the pull-up and pull-down resistances $R_{1,\text{up}}$, $R_{1,\text{down}}$, $R_{2,\text{up}}$ and $R_{2,\text{down}}$ of $B_{1}$ and $B_{2}$, the interconnect parasitic resistances $R_{1}, R_{2}, R_{3}$ and $R_{4}$, the bridge resistance $R_{b}$, the capacitances $C_{1}$ and $C_{2}$, which include interconnect and sink capacitances, and the logic interpretation voltages $V_{3t}$ and $V_{4t}$ of $B_{3}$ and $B_{4}$. The logic interpretation voltage $V_{t}$ of a buffer is defined as follows. If the input of the buffer is below $V_{t}$, the output is considered logic low. If the input of the buffer is above $V_{t}$, the output is considered logic high. For inverters or other gate types, the definition is similar with some "high" and "low" exchanged. The inputs of $B_{3}$ and $B_{4}$ are denoted as $x$ and $y$, respectively. For simplicity, the delay of Out1 (Out2) means the delay at $x$ ($y$).

## A. Static Analysis

In static analysis, it is assumed that input signals remain constant and output signals are stable. Therefore, all interconnect parasitic capacitances and sink capacitances are ignored. There are four possible cases of input patterns in the static analysis. When In1 and In2 are both high, or both low, the bridge has no impact on the circuit. When In1 is low and In2 is high, the circuit is shown in Figure 6.


Figure 6. The circuit model when In1 is low and In2 is high.

Define the Bridge Threshold Resistance (BTR) for Out1 as

$$
\begin{equation*}
R_{1, V \text { ss }}=\frac{V d d\left(R_{1}+R_{1, \text { down }}\right)}{V_{3 t}}-\left(R_{1}+R_{3}+R_{1, \text { down }}+R_{2, \text { up }}\right) \tag{2.1}
\end{equation*}
$$

When $R_{b}<R_{1, V s s}$, the voltage at $x$ is greater than $V_{3 t}$ and Out1 is high, which is a logic fault. When $R_{b}>R_{1, V s s}$, there is no logic fault, but there might be an increased delay, which is discussed in Section 2.2. The relationship between Out1 and $R_{b}$ is illustrated in Figure 7.


Figure 7. The relationship between $\boldsymbol{R}_{b}$ and Out1.

Similarly for Out2, the BTR is

$$
\begin{equation*}
R_{2, V d d}=\frac{V_{4 t}\left(R_{3}+R_{2, u p}\right)}{V d d-V_{4 t}}-\left(R_{1}+R_{1, \text { down }}\right) . \tag{2.2}
\end{equation*}
$$

The case when In1 is high and In2 is low is symmetric. The corresponding BTRs are given as follows:

$$
\begin{gather*}
R_{1, V d d}=\frac{V_{3 t}\left(R_{1}+R_{1, \text { up }}\right)}{V d d-V_{3 t}}-\left(R_{3}+R_{2, \text { down }}\right),  \tag{2.3}\\
R_{2, V \text { ss }}=\frac{V d d\left(R_{3}+R_{2, \text { down }}\right)}{V_{4 t}}-\left(R_{1}+R_{3}+R_{2, \text { down }}+R_{1, \text { up }}\right) . \tag{2.4}
\end{gather*}
$$
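The four threshold formulas can be checked numerically. The following is a minimal Python sketch, not part of the original work; the names `BridgeParams` and `bridge_threshold_resistances` are hypothetical. It assumes that $R_{2,Vdd}$ is the mirror image of $R_{1,Vdd}$, that is, with the pull-up resistance of $B_{2}$ ($R_{2,up}$) in its numerator.

```python
from dataclasses import dataclass


@dataclass
class BridgeParams:
    """Parameters of the simplified bridge circuit (ohms and volts)."""
    r1_up: float    # pull-up resistance of B1
    r1_down: float  # pull-down resistance of B1
    r2_up: float    # pull-up resistance of B2
    r2_down: float  # pull-down resistance of B2
    r1: float       # interconnect resistance in series with the B1 driver
    r3: float       # interconnect resistance in series with the B2 driver
    v3t: float      # logic interpretation voltage of B3
    v4t: float      # logic interpretation voltage of B4
    vdd: float      # supply voltage


def bridge_threshold_resistances(p):
    """Return the four BTRs of Eqs. (2.1) to (2.4) as a dict."""
    low1 = p.r1 + p.r1_down   # net 1 pulled to ground
    high1 = p.r1 + p.r1_up    # net 1 pulled to Vdd
    low2 = p.r3 + p.r2_down   # net 2 pulled to ground
    high2 = p.r3 + p.r2_up    # net 2 pulled to Vdd
    return {
        "R1_Vss": p.vdd * low1 / p.v3t - (low1 + high2),    # Eq. (2.1)
        "R2_Vdd": p.v4t * high2 / (p.vdd - p.v4t) - low1,   # Eq. (2.2), assumed form
        "R1_Vdd": p.v3t * high1 / (p.vdd - p.v3t) - low2,   # Eq. (2.3)
        "R2_Vss": p.vdd * low2 / p.v4t - (low2 + high1),    # Eq. (2.4)
    }
```

A negative BTR means that no non-negative bridge resistance can cause the corresponding logic fault.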

It is known that only four Boolean functions of two inputs are monotone and non-constant. Therefore, the behavior of Out1 in Figure 5 can only be one of the four in Figure 8. Table 1 summarizes the above analysis and shows which model the circuit follows. The concept of Bridge Threshold Resistance is similar to the concept of critical (limit, detectable) resistance studied in previous work [6][9][10]. In this dissertation, simple formulas to compute BTRs are presented.


Figure 8. Four basic resistive bridge fault models for static analysis.

TABLE 1. The bridge fault model for Out1.

| $R_{b}$ range | Out1 Model |
| :--- | :--- |
| $R_{b} \leq \min \left(R_{1, V d d}, R_{1, V s s}\right)$ | (c) |
| $R_{1, V s s}<R_{b}<R_{1, V d d}$ (if $R_{1, V s s}<R_{1, V d d}$) | (a) |
| $R_{1, V d d}<R_{b}<R_{1, V s s}$ (if $R_{1, V d d}<R_{1, V s s}$) | (b) |
| $R_{b} \geq \max \left(R_{1, V d d}, R_{1, V s s}\right)$ | (d) |
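The lookup in Table 1 can be written as a small function. This is an illustrative Python sketch only; the function name `out1_fault_model` is hypothetical, and the returned letters refer to the panels of Figure 8.

```python
def out1_fault_model(r_b, r1_vdd, r1_vss):
    """Return 'a', 'b', 'c' or 'd' for Out1 according to Table 1."""
    lo = min(r1_vdd, r1_vss)
    hi = max(r1_vdd, r1_vss)
    if r_b <= lo:
        return "c"   # R_b at or below both thresholds
    if r_b >= hi:
        return "d"   # R_b at or above both thresholds
    # R_b lies strictly between the two thresholds.
    return "a" if r1_vss < r1_vdd else "b"
```

For instance, with $R_{1,Vss}>0$ and $R_{1,Vdd}<0$, any non-negative $R_{b}$ maps to model (b) or (d), which agrees with the example discussed later in this chapter.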

Some useful properties can be derived directly from the models. For example, it is impossible for all BTRs to be greater than zero. Here is a simple proof. If all BTRs are greater than zero, then from Eq. (2.1) and (2.2) we can derive that $V_{4 t}>V_{3 t}$. Similarly, from Eq. (2.3) and (2.4) we can derive that $V_{3 t}>V_{4 t}$, which is a contradiction. Therefore, the BTRs cannot all be greater than zero (or, by a similar proof, all less than zero) at the same time. Thus, Out1 and Out2 cannot both behave as the model shown in Figure 8(c); that is, some input vectors do not swap the logic values of the two outputs. When $R_{b}<\max \left(R_{1, V d d}, R_{1, V s s}, R_{2, V d d}, R_{2, V s s}\right)$, Out1 and Out2 cannot both behave as the model shown in Figure 8(d); that is, there must be a logic fault at either Out1 or Out2.

## B. Dynamic Analysis

In the dynamic analysis, there are four types of input signals: high, low, rising (from low to high), and falling (from high to low). According to the static analysis, the output behavior eventually settles to one of the four fault models in Figure 8, as determined by the BTR values. There are 16 input type combinations in total for In1 and In2. The analysis for all cases is similar to the following case.

Consider the case when In1 is rising and In2 is low. The circuit in Figure 5 can be simplified to the circuit in Figure 9. If $R_{b} \leq R_{1, V d d}$, the static analysis shows that there is a logic fault for Out1, which can be detected by logic tests. If $R_{b}>R_{1, V d d}$, there is no logic fault.


Figure 9. The circuit model when In1 is rising and In2 is low.

From the circuit analysis by matching the second moment of transfer function [40], we found that when there is a rising input on In1, the behavior of $x$ in Figure 9 can be approximated by the behavior of $x$ in Figure 10, where

$$
C_{e}=C_{1}+\frac{\left(R_{2, \text { down }}+R_{3}\right)^{2}}{\left(R_{2, \text { down }}+R_{3}+R_{b}\right)^{2}} \cdot \frac{C_{2}}{1+\left|R_{4}-R_{2}\right| /\left(R_{1, \text { up }}+R_{1}+R_{2}\right)}
$$



Figure 10. The approximation circuit model for Out1 when In1 is rising and In2 is low.

The coefficient $\frac{1}{1+\left|R_{4}-R_{2}\right| /\left(R_{1, u p}+R_{1}+R_{2}\right)}$ was determined experimentally to balance the interconnect resistances $R_{4}$ and $R_{2}$. When $R_{4}=R_{2}$, the first two moments of the driving admittance [41] in Figures 9 and 10 are the same, namely

$$
\frac{1}{R_{2, \text { down }}+R_{3}+R_{b}+R_{1, \text { up }}+R_{1}}+\frac{\left(R_{2, \text { down }}+R_{3}\right)^{2} C_{2}+\left(R_{2, \text { down }}+R_{3}+R_{b}\right)^{2} C_{1}}{\left(R_{2, \text { down }}+R_{3}+R_{b}+R_{1, \text { up }}+R_{1}\right)^{2}} s+\mathrm{O}\left(s^{2}\right)
$$

In the approximation, $B_{3}$ is only regarded as a sink capacitance that is included in $C_{1}$ in Figure 9 and $C_{e}$ in Figure 10. In Figure 9, if the bridge does not exist in the circuit, then the equivalent circuit is shown in Figure 11(a), where $R_{\text {line }}=R_{2}+R_{1}+R_{1, u p}$.


Figure 11. (a) The equivalent circuit of Figure 9 when there is no bridge. (b) The equivalent circuit of Figure 10.

Define the delay at $x$ in Figure 11(a) as $d_{1}$, then

$$
\begin{equation*}
d_{1}=-R_{\text {line }} \cdot C_{1} \cdot \ln (0.5) \tag{2.5}
\end{equation*}
$$

Similarly, the equivalent circuit of Figure 10 is shown in Figure 11(b), where

$$
\begin{gathered}
R_{e}=R_{2}+\left(R_{1}+R_{1, \text { up }}\right) / /\left(R_{2, \text { down }}+R_{3}+R_{b}\right), \\
m=\frac{R_{2, \text { down }}+R_{3}+R_{b}}{R_{1}+R_{1, \text { up }}+R_{2, \text { down }}+R_{3}+R_{b}}
\end{gathered}
$$

Here, the symbol "//" represents the parallel combination of two resistances. Define the delay at $x$ in Figure 11(b) as $d_{2}$, then we get

$$
\begin{equation*}
d_{2}=-R_{e} \cdot C_{e} \cdot \ln \left(1-\frac{0.5}{m}\right) . \tag{2.6}
\end{equation*}
$$

Since the peak voltage at x in Figure 11(b) is only a fraction of Vdd, there is an increased delay at Out1. Intuitively, m can be seen as the voltage division ratio, and $R_{e}$ can be seen as the effective resistance from upstream to $x$. When $R_{b} \rightarrow \infty, C_{e}=C_{1}, R_{e}=R_{\text {line }}$, $m=1$ and $d_{2}=d_{1}$.
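As a numerical illustration of Eq. (2.5) and (2.6), the following Python sketch evaluates $d_{1}$ and $d_{2}$ for the (rising, low) case. It is not taken from the original work: the function names are hypothetical, and the guard on $m$ assumes the 50% crossing point used in Eq. (2.6).

```python
import math


def parallel(x, y):
    """Parallel combination of two resistances."""
    return x * y / (x + y)


def step_delays(r1_up, r1, r2, r2_down, r3, r4, r_b, c1, c2):
    """Return (d1, d2): delay at x without and with the bridge."""
    r_line = r2 + r1 + r1_up
    d1 = -r_line * c1 * math.log(0.5)                      # Eq. (2.5)

    branch = r2_down + r3                                  # path to ground on net 2
    balance = 1.0 + abs(r4 - r2) / r_line                  # experimental coefficient
    c_e = c1 + (branch / (branch + r_b)) ** 2 * c2 / balance
    r_e = r2 + parallel(r1 + r1_up, branch + r_b)
    m = (branch + r_b) / (r1 + r1_up + branch + r_b)       # voltage division ratio
    if m <= 0.5:
        # x never reaches 50% of Vdd: a logic fault, not a delay fault.
        raise ValueError("bridge resistance too small for a delay fault")
    d2 = -r_e * c_e * math.log(1.0 - 0.5 / m)              # Eq. (2.6)
    return d1, d2
```

Calling `step_delays` with a very large `r_b` returns two nearly equal values, which matches the limit $d_{2}=d_{1}$ stated above.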

From Eq. (2.5) and (2.6), the delay with the bridge can be written in terms of $d_{1}$ as

$$
\begin{equation*}
d_{2}=-a \cdot b \cdot \log _{2}(1-0.5 / m) \cdot d_{1}, \tag{2.7}
\end{equation*}
$$

where $a=C_{e} / C_{1}$ and $b=R_{e} / R_{\text {line }}$, and the increased delay is $d^{\prime}=d_{2}-d_{1}$. In Eq. (2.7), $a$ can be seen as the ratio between the effective capacitance with and without the bridge, and $b$ as the ratio between the effective resistance with and without the bridge. When the input pattern is (falling, low), $d_{2}=-a \cdot b \cdot \log _{2}(0.5 / m) \cdot d_{1}$, where all parameters have meanings similar to those in the (rising, low) case but take different values. In general, if the initial voltage of a rising input in Figure 11(b) is $g$ and the static value after the dynamic process is $h$ (both normalized to $V_{dd}$), then

$$
\begin{equation*}
d_{2}=-a \cdot b \cdot \log _{2}\left(1-\frac{0.5-g}{h-g}\right) \cdot d_{1}=-a \cdot b \cdot \log _{2}\left(\frac{0.5-h}{g-h}\right) \cdot d_{1} . \tag{2.8}
\end{equation*}
$$

If the input is falling, with initial value $g$ and static value $h$ after the dynamic process, $d_{2}$ takes the same form as in the rising case, only with different parameter values.

In Eq. (2.8), we write $d_{2}$ as a function of $d_{1}$ because $d_{1}$ can then be taken from a more accurate source, such as SPICE simulation or delay tables that include the cell delay. The equation is also easily modified for a ramp input in Figure 9. We abstract the ramp input by a step input applied at the instant when the ramp crosses the $50 \%$ point, plus an extra delay $\tau / 2$, where $\tau$ is the slope of the ramp input [42]. Eq. (2.8) then becomes

$$
\begin{equation*}
d_{2}=\frac{-a \cdot b \cdot \log _{2}((0.5-h) /(g-h))}{1+\dfrac{\tau / 2}{-R_{\text {line }} \cdot C_{1} \cdot \ln (0.5)}} \cdot d_{1} \tag{2.9}
\end{equation*}
$$

SPICE simulation shows that the bridge resistance can increase or decrease the delay ($d^{\prime}$ can be greater or less than zero) depending on the input pattern. This was also mentioned in previous work [12]. Figure 12 shows the SPICE simulation of two interconnect segments from the layout of the ISCAS85 circuit c432. The bridge resistance is $1 \mathrm{~K} \Omega$. There is an increased delay at Out1 when input pattern (In1, In2) is (falling, high), and a decreased delay at Out1 when the input pattern is (rising, high). The decreased delay may cause a hold time violation or a race at Out1. This type of fault cannot be detected by current delay tests.


Figure 12. (a) A bridge causes an increased delay at Out1 when input pattern is (falling, high). (b) The bridge causes a decreased delay at Out1 when input pattern is (rising, high).

## C. Modeling Procedure

Based on the above analysis, we derive the bridge fault model as follows. All logic and delay faults are included in the model. Previous fault models such as the aggressor-victim model [12] are special cases of this model.
(1) Compute $R_{1, \text { up }}, R_{1, \text { down }}, R_{2, \text { up }}, R_{2, \text { down }}, V_{3 t}$ and $V_{4 t}$ from the cell library and, for cells other than inverters/buffers, the input pattern. Compute $R_{1}, R_{2}, R_{3}, R_{4}, C_{1}$ and $C_{2}$ from the interconnect parasitics.
(2) Compute BTR values $R_{1, V d d}, R_{1, V s s}, R_{2, V d d}$ and $R_{2, V s s}$ according to equations (2.1) to (2.4).
(3) For the fault simulation, $R_{b}$ is given. Use $R_{b}$ to choose a fault model from Figure 13 according to Table 1. When there is a delay fault, compute

$$
\begin{equation*}
d^{\prime}=\left(-l \cdot \log _{2}((0.5-h) /(g-h))-1\right) \cdot d_{1} \tag{2.10}
\end{equation*}
$$

where $d_{1}$ is the nominal delay of Out1, and $l$, $g$ and $h$ are chosen according to Table 2.
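The delay formulas of Eq. (2.8) to (2.10) can be sketched in Python as follows. This is an illustrative sketch, not the original implementation, and all function names are hypothetical. It relies on the observation that the logarithmic expression reduces to $d_{1}$ for $l=1$, $g=0$, $h=1$, so the expression is treated as the delay with the bridge and $d_{1}$ is subtracted to obtain $d^{\prime}$ as in Eq. (2.10). Here $g$ and $h$ are assumed to be fractions of $V_{dd}$.

```python
import math


def delay_with_bridge(d1, l, g, h):
    """Delay at Out1 with the bridge, from the nominal delay d1 (Eq. 2.8)."""
    ratio = (0.5 - h) / (g - h)
    if ratio <= 0.0:
        # The waveform never crosses 50% of Vdd: logic fault, no finite delay.
        raise ValueError("output does not cross the 50% point")
    return -l * math.log2(ratio) * d1


def delay_change(d1, l, g, h):
    """Increased (positive) or decreased (negative) delay d' (Eq. 2.10)."""
    return delay_with_bridge(d1, l, g, h) - d1


def delay_with_bridge_ramp(d1, l, g, h, tau, r_line, c1):
    """Ramp input variant (Eq. 2.9); tau is the slope of the ramp."""
    step_delay = -r_line * c1 * math.log(0.5)
    return delay_with_bridge(d1, l, g, h) / (1.0 + (tau / 2.0) / step_delay)
```

With `tau = 0` the ramp variant reduces to the step formula, so a single routine with a slope coefficient would suffice in practice.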


Figure 13. Four basic resistive bridge fault models.

Table 2 presents the behavior at Out1 for Figure 13(d) with all input patterns. If both In1 and In2 change, the two inputs are assumed to change simultaneously. If they do not change simultaneously, we treat the case as a combination of two cases happening sequentially. For example, if both inputs are rising and In1 is faster, the case is equivalent to the combination of $(r, 0)$ and $(1, r)$.

TABLE 2. Increased Delay (ID) or Decreased Delay (DD) at Out1 for Figure 13(d). Here $r$ means rising, $f$ means falling, 1 means high and 0 means low; the other variables are defined in the equations that follow the table.

| Input Pattern (In1, In2) | Effect at Out1 |
| :--- | :--- |
| Both static $(0,0),(0,1),(1,0),(1,1)$ | No ID nor DD |
| Same direction $(r, r),(f, f)$ | No ID nor DD |
| In1 static $(0, r),(0, f),(1, r),(1, f)$ | No ID nor DD |
| $(r, 0)$ | ID, $g=0, h=m_{1}, l=a_{1} \cdot b_{1}$ |
| $(f, 0)$ | DD, $g=m_{1}, h=0, l=a_{1} \cdot b_{1}$ |
| $(r, f)$ | ID or DD, $g=m_{2}, h=m_{1}, l=a_{1} \cdot b_{1}$ |
| $(f, r)$ | ID or DD, $g=m_{1}, h=m_{2}, l=a_{2} \cdot b_{2}$ |
| $(r, 1)$ | DD, $g=m_{2}, h=1, l=a_{2} \cdot b_{2}$ |
| $(f, 1)$ | ID, $g=1, h=m_{2}, l=a_{2} \cdot b_{2}$ |

Some constants in Table 2 are given as follows. Constants $a_{i}$, $b_{i}$ and $m_{i}$ have meanings similar to those explained above.

$$
\begin{gathered}
a_{1}=1+\frac{\left(R_{2, \text { down }}+R_{3}\right)^{2}}{\left(R_{2, \text { down }}+R_{3}+R_{b}\right)^{2}} \cdot \frac{C_{2} / C_{1}}{1+\left|R_{4}-R_{2}\right| /\left(R_{1, \text { up }}+R_{1}+R_{2}\right)} \\
a_{2}=1+\frac{\left(R_{2, \text { up }}+R_{3}\right)^{2}}{\left(R_{2, \text { up }}+R_{3}+R_{b}\right)^{2}} \cdot \frac{C_{2} / C_{1}}{1+\left|R_{4}-R_{2}\right| /\left(R_{1, \text { down }}+R_{1}+R_{2}\right)} \\
b_{1}=\frac{R_{2}+\left(R_{1}+R_{1, \text { up }}\right) / /\left(R_{2, \text { down }}+R_{3}+R_{b}\right)}{R_{1, \text { up }}+R_{1}+R_{2}} \\
b_{2}=\frac{R_{2}+\left(R_{1}+R_{1, \text { down }}\right) / /\left(R_{2, \text { up }}+R_{3}+R_{b}\right)}{R_{1, \text { down }}+R_{1}+R_{2}} \\
m_{1}=\frac{R_{2, \text { down }}+R_{3}+R_{b}}{R_{1, \text { up }}+R_{1}+R_{2, \text { down }}+R_{3}+R_{b}} \\
m_{2}=\frac{R_{1, \text { down }}+R_{1}}{R_{2, \text { up }}+R_{1}+R_{1, \text { down }}+R_{3}+R_{b}}
\end{gathered}
$$
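The six constants can be evaluated with a short routine. The Python sketch below is illustrative only; the function name `table2_constants` and the dictionary keys are hypothetical, and "//" is implemented as the usual parallel combination.

```python
def parallel(x, y):
    """Parallel combination of two resistances."""
    return x * y / (x + y)


def table2_constants(r1_up, r1_down, r2_up, r2_down,
                     r1, r2, r3, r4, r_b, c1, c2):
    """Return a1, a2, b1, b2, m1 and m2 for Out1 as a dict."""

    def a(r_other, r_own):
        # r_other: active driver resistance on net 2; r_own: on net 1.
        k = (r_other + r3) / (r_other + r3 + r_b)
        balance = 1.0 + abs(r4 - r2) / (r_own + r1 + r2)
        return 1.0 + k ** 2 * (c2 / c1) / balance

    def b(r_own, r_other):
        r_e = r2 + parallel(r1 + r_own, r_other + r3 + r_b)
        return r_e / (r_own + r1 + r2)

    return {
        "a1": a(r2_down, r1_up),
        "a2": a(r2_up, r1_down),
        "b1": b(r1_up, r2_down),
        "b2": b(r1_down, r2_up),
        "m1": (r2_down + r3 + r_b) / (r1_up + r1 + r2_down + r3 + r_b),
        "m2": (r1_down + r1) / (r2_up + r1 + r1_down + r3 + r_b),
    }
```

As $R_{b}$ grows, $a_{i}$, $b_{i}$ and $m_{1}$ approach 1 and $m_{2}$ approaches 0, so the bridge effect vanishes as expected.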

For the other models in Figure 13, the same delay formula can be used to compute the increased or decreased delay, except for input vectors that cause logic faults at Out1. Similar results for Out2 can be derived from Figure 13 and Table 2 by replacing Out1 with Out2, input pattern (In1, In2) with (In2, In1), $R_{1, V d d}$ with $R_{2, V d d}$ and $R_{1, V s s}$ with $R_{2, V s s}$. All the equations in (2.10) and (2.11) need to be recomputed by exchanging the driver indices 1 and 2 (as in $R_{1, \text { up }}$ and $R_{2, \text { down }}$), $R_{3}$ with $R_{1}$, $R_{4}$ with $R_{2}$ and $C_{1}$ with $C_{2}$ on the right-hand side. For example, when input pattern (In2, In1) is $(r, 0)$,

$$
a_{1}=1+\frac{\left(R_{1, \text { down }}+R_{1}\right)^{2}}{\left(R_{1, \text { down }}+R_{1}+R_{b}\right)^{2}} \cdot \frac{C_{1} / C_{2}}{1+\left|R_{4}-R_{2}\right| /\left(R_{2, \text { up }}+R_{3}+R_{4}\right)}
$$

When the input is a ramp signal, the modeling procedure is the same, except that the delay formula (2.10) needs to be modified in the way shown in Section 2.2. We use "Step input delay model" and "Ramp input delay model" to distinguish the two formulas, though in real applications only one formula is needed and the effect of the ramp input is captured by one coefficient.

Since driving resistances depend on input patterns for cells other than inverters/buffers, in static fault simulation, where input patterns are unknown, resistances are chosen to maximize (minimize) the delay effect, which gives an optimistic (pessimistic) estimate.

In Table 2, there are two input patterns, $(r, f)$ and $(f, r)$, which may cause an increased delay or a decreased delay at Out1 and Out2 simultaneously. However, it can be easily derived from equation series (2.10) and (2.11) that the delay of Out1 with input $(r, 0)$ is greater than the delay with input $(r, f)$, and the delay with input $(f, 1)$ is greater than the delay with input $(f, r)$. Therefore, to maximize the delay of Out1, the best input patterns are $(r, 0)$ and $(f, 1)$. Which of the two is better depends on the parameters. To maximize the delay of Out2, the best input patterns are $(0, r)$ and $(1, f)$. Similarly, to minimize the delay at Out1, the best input patterns are $(r, 1)$ and $(f, 0)$, and to minimize the delay at Out2, $(1, r)$ and $(0, f)$. To maximize or minimize the delay at both outputs simultaneously, the best input patterns may be $(r, f)$ and $(f, r)$. The delay formula we derived also helps to choose input patterns at the previous stage. As shown in Figure 11, when the output of $B_{1}$ is rising and the output of $B_{2}$ stays low, there will be an increased delay. There are three input patterns, $(1, f),(f, 1)$ and $(f, f)$, which set the output of $B_{1}$ to a rising signal. However, $(f, f)$ will produce less delay than the other two patterns since it decreases the pull-up resistance. Therefore, we should choose the best patterns at the previous stage to maximize or minimize the driving resistance.

For cases $(r, 0)$ and $(f, 1)$, simulation results of SPICE and our delay model on the example circuit in Figure 14 with the step input are shown in Figure 15 and Figure 16. Figure 14 shows the input vector assignment for $(r, 0)$. The technology is TSMC 180 nm, 1.8 V. The PMOS size (width/length) is $540 \mathrm{~nm} / 180 \mathrm{~nm}$, and the NMOS size is $270 \mathrm{~nm} / 180 \mathrm{~nm}$. Pull-up/down resistances range from $1.5 \mathrm{~K} \Omega$ to $3.2 \mathrm{~K} \Omega$, computed from the linear region of the CMOS U-I curve [43]. Sink capacitances are 2.2 fF and 3.1 fF for the inverter and the NAND gate, respectively. The logic interpretation voltage is 0.9 V. There are 8 RC segments in each interconnect (the same for the two lines), with a total resistance of $17.4 \Omega$ and a total capacitance of 4 fF. The bridge is located in the middle of the two nets. The BTR $R_{1, V s s}$ for Out1 is greater than zero and $R_{1, V d d}$ is less than zero, so Out1 can only behave as the model in Figure 13(d) or (b) according to Table 1.

When $R_{b}>R_{1, V_{s s}}$, in both Figure 15 and Figure 16, Out1 behaves as Figure 13(d) and an increased delay exists. Bridge faults falling in this range may be detected by delay tests with our fault model, but may not be detected by traditional logic tests with infinitely slow speed.

When $R_{b} \leq R_{1, V s s}$, Out1 behaves as Figure 13(b) in both figures, but appears as an increased delay in Figure 15 and a logic fault in Figure 16. In this case, even though there is a delay fault for Out1 with some input patterns, bridge faults in this range can still be detected by traditional logic tests with other input patterns. Both delay and logic tests may detect these bridge faults.

For the case $(r, 0)$, simulation results on the example circuit with the ramp input are shown in Figure 17. The slope of the ramp input is 0.01 ns, which is almost half of the nominal delay of Out1. Results of SPICE, the step input delay model and the ramp input delay model are compared, and we can see that the ramp input delay model, which considers the slope effect, gives a more accurate result as $R_{b}$ increases.


Figure 14. Example circuit for simulation.


Figure 15. An example relationship between $R_{b}$ and the increased delay at Out1 with the rising step input.


Figure 16. An example relationship between $R_{b}$ and the increased delay at Out1 with the falling step input.


Figure 17. An example relationship between $R_{b}$ and the increased delay at Out1 with the rising ramp input.

In all figures, our model shows a good match with SPICE simulation results. However, there are still some errors, which come from the following sources: cell delay errors due to the linear resistance model and lumped capacitance model for driving gates, and interconnect errors due to the simple RC interconnect model and the approximation from Figure 9 to Figure 10.

In our experiment, when the bridge is not located in the middle of the two nets, the average delay varies by $0.1 \%$. In general, if we do not know the exact location of a bridge, we assume it is located in the middle of the two nets. This is a good approximation in practice. Some previous work also showed that the delay effect of a bridge fault has little relation to its location [42].

## D. Application

The resistive bridge model has been implemented in the CodSim delay fault simulator [44]. In the experiments, the bridge sites are assumed to be between two nets where large coupling capacitances exist. Such net pairs can be extracted using commercial capacitance extraction tools. The ISCAS85 benchmark circuits are used and the circuit layout is done with Cadence Silicon Ensemble in TSMC 250 nm 3 V 3-metal technology. Commercial parasitic extraction tools are used to extract parasitics and compute net delays. The logic interpretation voltages are from 1.4 V to 1.5 V, and pull-up/down resistances of all gates are from $1 \mathrm{~K} \Omega$ to $4 \mathrm{~K} \Omega$. For multi-input gates, pull-up/down resistances are computed assuming only one input changes at any time. The clock period is set to be $5 \%$ longer than the delay of the longest structural path.

Table 3 shows the simulation results for the ISCAS85 circuits, using 10,000 random vectors. Circuit c2670 is not included due to a parasitic extraction tool problem. The bridge resistance is approximately uniformly distributed from $0 \Omega$ to $40 \mathrm{~K} \Omega$ [45]. Columns 3 and 4 show the fault coverage using full-speed and half-speed tests, respectively. The fault coverage is computed by averaging, over all bridge sites, the ratio of the detected bridge resistance range to the potentially detectable resistance range. The half-speed tests can be considered fast logic tests and the full-speed tests can be considered at-speed built-in self-tests (BIST), whose fault coverage in Table 3 is 0.1 to 4.4 percentage points higher than that of the half-speed tests. Using our model, for the first time it becomes possible to estimate the benefit of at-speed tests for resistive bridge faults. Our model is independent of the bridge resistance distribution, and therefore more accurate fault coverage can be computed if a more accurate distribution is known. Column 5 shows the simulation time and indicates that our bridge model is computationally efficient.
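One possible reading of this fault coverage metric is sketched below in Python. It is an assumption-laden illustration, not the CodSim implementation: the function names and the input layout are hypothetical, the bridge resistance is taken as uniformly distributed, and the resistance intervals detected by different vectors at one site are merged before their total length is divided by the length of the detectable range.

```python
def merged_length(intervals):
    """Total length covered by a list of (low, high) intervals."""
    total = 0.0
    end = float("-inf")
    for lo, hi in sorted(intervals):
        if hi <= end:
            continue                 # fully inside what is already covered
        total += hi - max(lo, end)   # add only the part not yet covered
        end = hi
    return total


def fault_coverage(sites):
    """Average detected fraction over all bridge sites, in percent.

    sites: list of (detected_intervals, (lo, hi)) where (lo, hi) is the
    potentially detectable resistance range of that site in ohms.
    """
    ratios = []
    for detected, (lo, hi) in sites:
        clipped = [(max(a, lo), min(b, hi))
                   for a, b in detected if b > lo and a < hi]
        ratios.append(merged_length(clipped) / (hi - lo))
    return 100.0 * sum(ratios) / len(ratios)
```

A non-uniform resistance distribution could be handled by replacing interval lengths with probability mass, which is why the model itself does not depend on the distribution.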

TABLE 3. Delay fault simulation results for 10000 random vectors using our bridge model.

| Circuit | Total Bridges | FC, Full-Speed (\%) | FC, Half-Speed (\%) | Time (s) |
| :--- | :--- | :--- | :--- | :--- |
| c432 | 821 | 88.1 | 84.4 | 1.4 |
| c499 | 1,102 | 93.5 | 89.4 | 2.2 |
| c880 | 1,412 | 90.0 | 86.2 | 2.4 |
| c1355 | 2,488 | 88.6 | 84.2 | 7.0 |
| c1908 | 4,007 | 92.0 | 91.9 | 5.1 |
| c3540 | 8,919 | 87.0 | 86.7 | 17.9 |
| c5315 | 12,168 | 94.3 | 94.0 | 18.6 |
| c6288 | 14,17 | 91.6 | 91.4 | 22.5 |
| c7552 | 12,156 | 87.2 | 86.6 | 25.7 |

## III. FAULT MODELING OF DEVICE RESISTIVE SHORTS

The objective of the fault model is to transform the effect of resistive shorts in the circuit into a delay fault or a logic fault, and to compute the delay change due to the short resistance $R_{b}$. We consider four types of gate oxide shorts (NMOS gate-to-drain short, NMOS gate-to-source short, PMOS gate-to-drain short and PMOS gate-to-source short), which are shown in Figure 18. This abstraction was first proposed by Hao and McCluskey [10]. Note that there is a diode in series with the short resistor for PMOS transistors, since the polysilicon gate is doped with N-type dopant for long channel transistors. We ignore the gate-to-channel short following [5], since it has little effect on the majority of transistors in logic gates.


Figure 18. Gate-to-source and gate-to-drain shorts in NMOS and PMOS transistors.

To evaluate the short effect, we consider the CMOS circuit in Figure 19. There is a driving cell with output node $A$ and a faulty cell with output node $B$. The potential defect is a gate-to-drain short inside an NMOS transistor of the faulty cell. Circuits for other defect types are similar, except that the faulty transistor may be PMOS or the short may be an NMOS gate-to-source short, or the faulty transistor may be directly connected to the gate
output or the supply.


Figure 19. The circuit with an NMOS gate-to-drain short.

We assume that there is only one short in the faulty cell. We also assume shorts are only on input transistors, i.e. transistors driven by external signals. For shorts on other transistors, we can divide the faulty cell so that the faulty transistor is driven by an external signal, and then perform a similar analysis. For example, an AND gate can be partitioned into a NAND gate driving an inverter, in order to analyze a short at one of the inverter transistors.

When a signal transition at $A$ causes a transition at $B$, we consider the sum of the driving cell delay and the faulty cell delay. On the other hand, when the signal at $A$ is static, and a signal at another input of the faulty cell causes a transition at $B$, we only consider the faulty cell delay. Here the delay is computed at $50 \%$ of $V_{d d}$ on the signal waveform, and denoted as $d_{R b}$. When there is no short, the driving cell delay is denoted as $D_{A}$, and the faulty cell delay is denoted as $D_{B}$. Both $D_{A}$ and $D_{B}$ are pre-computed using any commercial tool.

We present two methods to analyze the circuit behavior with gate-to-source/drain shorts. One is based on the analysis of resistive circuits, and the other is based on the analysis of the properties of the output waveforms. The former method uses the linear resistance of a transistor to simplify the transistor-level circuit into a resistive circuit, then derives output voltages and delay changes. The method is simple and fast, but the error is large for small short resistances due to the inaccurate circuit modeling. The latter method uses a table-based current model for the U-I property of a transistor in the circuit, and achieves satisfactory accuracy. It is slower than the former method, but still faster than methods using complex current models.

## 1. Linear Resistance Based Approach

The linear resistance based approach uses the Shichman-Hodges (SPICE Level-1) MOSFET model to compute the linear resistance of a transistor as $R_{\text {linear }}=1 /\left(\beta\left(V_{g s}-V_{t}\right)\right)$, where $\beta$ is the transistor gain factor, $V_{g s}$ is the voltage between the gate and the source of the transistor, and $V_{t}$ is the threshold voltage. $R_{\text {linear }}$ is then used to compute the pull-up/down resistance in the static analysis and the dynamic analysis.

A. NMOS Gate-to-drain Short

i. Static Analysis

The circuit with static signals is simplified in Figure 20. The gate-to-drain short of the faulty NMOS transistor connects the driving cell and the faulty cell, and also breaks the pull-down path of the faulty cell into two parts. We use linear resistances $R_{u p 1}$ and $R_{\text {down } 1}$ to represent the pull-up and pull-down resistances of the driving cell. The pull-up path of the faulty cell is represented by $R_{u p 2}$. The pull-down path of the faulty cell is divided by $R_{b}$ into resistors $R_{\text {down } 2}$ and $R_{\text {down } 3}$. The linear resistance of the faulty transistor is included in $R_{\text {down } 2}$. Switches $s_{1}, s_{2}, s_{3}$ and $s_{4}$ indicate whether the pull-up path or the pull-down path conducts. $R_{1}$ and $R_{2}$ are the lumped resistances of the interconnect RC networks.


Figure 20. The circuit for static analysis.

In the case of $A=0$ and $B=1$, the pull-down path of the driving cell conducts, so $s_{1}$ is off and $s_{2}$ is on. The low voltage at $A$ makes the pull-up path of the faulty cell conduct, so $s_{3}$ is on and $s_{4}$ is off. The voltage at $A$ can be derived as:

$$
\begin{equation*}
V_{A, \text { low }}=\frac{R_{1}+R_{\text {down } 1}}{R_{d o w n 1}+R_{1}+R_{b}+R_{d o w n 3}+R_{u p 2}} V_{d d} \tag{3.1}
\end{equation*}
$$

Define Short Threshold Resistance (STR) $R_{A, \text { low }}$ for the logic value of $A$, such that when $R_{b}<R_{A, \text { low }}, V_{A, l o w}$ is greater than the logic low threshold voltage $V_{L L}$. Then according to (3.1),

$$
\begin{equation*}
R_{A, l o w}=\frac{\left(V_{d d}-V_{L L}\right)\left(R_{1}+R_{d o w n 1}\right)}{V_{L L}}-R_{\text {down } 3}-R_{u p 2} . \tag{3.2}
\end{equation*}
$$
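Equations (3.1) and (3.2) can be checked against each other numerically. The Python sketch below is illustrative only and the function names are hypothetical; it evaluates the static voltage at $A$ and the corresponding STR, and at $R_{b}=R_{A, low}$ the voltage should equal $V_{LL}$.

```python
def v_a_low(r_b, r1, r_down1, r_down3, r_up2, vdd):
    """Static voltage at A when A is driven low (Eq. 3.1)."""
    lower = r1 + r_down1                      # from A down to ground
    total = lower + r_b + r_down3 + r_up2     # whole path from Vdd to ground
    return lower / total * vdd


def str_a_low(r1, r_down1, r_down3, r_up2, vdd, v_ll):
    """Short Threshold Resistance R_A,low (Eq. 3.2)."""
    return (vdd - v_ll) * (r1 + r_down1) / v_ll - r_down3 - r_up2
```

A short resistance below the returned threshold pushes the voltage at $A$ above $V_{LL}$.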

Therefore, when $R_{b}<R_{A, l o w}$, $V_{A, \text { low }}>V_{L L}$ and there is a logic fault at $A$. When $R_{b}>R_{A, l o w}$, there is no logic fault at $A$, but there might be a delay fault, which is analyzed later. Similarly, we can derive the voltage of $B$, $V_{B, h i g h}$, which is logic high in a fault-free circuit, and then derive the corresponding STR $R_{B, h i g h}$ for $B$. When a pull-up path in the driving cell is formed, similar voltage expressions $V_{A, \text { high }}$ at $A$, $V_{B, \text { low }}$ at $B$ and the corresponding STRs can be derived.

ii. Dynamic Analysis

In dynamic analysis, there are four types of input signal transitions: high, low, rising and falling. Transitions may occur in both the driving cell and the faulty cell, or occur in the faulty cell only. We analyze circuit behaviors through two typical transition patterns. Analysis for other patterns can be derived similarly. In each pattern, $R_{b}$ is assumed to be large enough such that no logic fault exists. To simplify the computation, we use $C_{1}$ and $C_{2}$ to represent the load capacitance of the driving cell and the faulty cell respectively. Both $C_{1}$ and $C_{2}$ include the lumped interconnect parasitic capacitance and the input gate capacitance.

a. Pattern 1: Falling at A, Rising at B

The circuit in dynamic analysis is illustrated in Figure 21. When the signal at $A$ is falling and the signal at $B$ is rising, the initially high voltage of $A$ decreases to $V_{A, l o w}$ and the initially low voltage of $B$ increases to $V_{B, h i g h}$. The faulty NMOS transistor is then cut off by the low voltage at $A$, and the pull-up path of the faulty cell conducts. Without $R_{b}$, $C_{1}$ is discharged only through $R_{\text {down } 1}$, and $C_{2}$ is charged only through $R_{u p 2}$. In the presence of $R_{b}$, however, a path through $R_{b}$ connects $A$ and $B$. $C_{1}$ is then partially discharged into $C_{2}$ because the initial voltage of $A$ is higher than that of $B$. At the same time, $C_{1}$ is charged through the pull-up path of the faulty cell and $R_{b}$.


Figure 21. A current path connects $A$ and $B$ through $\boldsymbol{R}_{b}$.

The delay computation works as follows. We first derive the formulas of driving cell delay $d_{A, R b}$ and faulty cell delay $d_{B, R b}$ according to the linear circuit analysis. Then considering both delays are functions of $R_{b}$, we compute the ratio between the delay with a fixed value of $R_{b}$ and the delay with $R_{b}=\infty$, and approximate $d_{R b}$ as

$$
\begin{equation*}
d_{R_{b}}=\frac{d_{A, R_{b}}}{d_{A, R_{b}=\infty}} D_{A}+\frac{d_{B, R_{b}}}{d_{B, R_{b}=\infty}} D_{B} . \tag{3.3}
\end{equation*}
$$

The value of $d_{A, R b}$ in the formula is computed by the following procedure.
In order to transform the circuit in Figure 21 into a first-order linear system and derive the delay formula directly, we introduce an effective resistance $R_{A, \text { eff }}$ to replace the discharging path $R_{b}-R_{\text {down } 3}-R_{2}-C_{2}$. The value of $R_{A, e f f}$ is computed by equalizing the average current of the two circuits in Figure 22 during the time interval $T_{d}$, which is the time for the voltage of $C_{1}$ in the circuit of Figure 22(a) to drop from its initial voltage $V_{0}$ to $50 \%$ of the total swing. The equivalent current method is also used to calculate effective capacitance in [46]. In this work we use it to compute $R_{A, e f f}$.


Figure 22. $\boldsymbol{R}_{A, e f f}$ is computed by equalizing the average current in the two circuits.

Then we get

$$
\begin{equation*}
R_{A, \text { eff }}=\frac{C_{2}}{C_{1}+C_{2}} \cdot \frac{\ln (2)}{\ln \left(\frac{2\left(C_{1}+C_{2}\right)}{2 C_{1}+C_{2}}\right)} \cdot\left(R_{b}+R_{d o w n 3}+R_{2}\right) \tag{3.4}
\end{equation*}
$$

Since the circuit becomes a first order linear system, the driving cell delay can be derived directly as

$$
\begin{equation*}
d_{A, R_{b}}=d_{f 1}+\ln \left(\frac{V_{A, l o w}-V_{A, \text { high }}}{V_{A, \text { low }}-0.5 V_{d d}}\right) \cdot R_{A} \cdot C_{1} \tag{3.5}
\end{equation*}
$$

where $d_{f 1}$ is the intrinsic falling delay of the driving cell and $R_{A}=\left(R_{\text {down } 1}+R_{1}\right) / /\left(R_{b}+R_{\text {down } 3}+R_{\text {up } 2}\right) / / R_{A, \text { eff }}$. Note that when $R_{b}$ is infinite, $d_{A, R b=\infty}$ is the sum of the intrinsic gate delay and the gate load delay under the lumped C delay model:

$$
\begin{equation*}
d_{A, R_{b}=\infty}=d_{f 1}+\ln 2 \cdot\left(R_{\text {down } 1}+R_{1}\right) \cdot C_{1} . \tag{3.6}
\end{equation*}
$$

The computation of $d_{B, R_{b}}$ is similar. For the rising signal transition at $B$, the path $C_{1}-R_{b}-R_{\text{down}3}-R_{2}$ is a charging path. We insert a resistance $R_{B, \text{eff}}$ between $A$ and $B$ to represent the charging effect of the path. The value of $R_{B, \text{eff}}$ is computed similarly. Here $T_{d}$ is the time for the voltage of $C_{2}$ in the circuit of Figure 23(a) to rise to $50 \%$ of the total swing.


Figure 23. $\boldsymbol{R}_{B, e f f}$ is computed by equalizing the average current in the two circuits.

Then we get

$$
\begin{equation*}
R_{B, \text{eff}}=\frac{C_{1}}{C_{1}+C_{2}} \cdot \frac{\ln (2)}{\ln \left(\frac{2\left(C_{1}+C_{2}\right)}{C_{1}+2 C_{2}}\right)} \cdot\left(R_{b}+R_{\text{down}3}+R_{2}\right) \tag{3.7}
\end{equation*}
$$
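As a minimal sketch, Eq. (3.4) and Eq. (3.7) can be evaluated with one helper, since the two formulas differ only in which capacitor belongs to the node under analysis. The function names and the numerical values in the usage note are illustrative assumptions, not taken from the experiments.

```python
import math


def effective_resistance(c_node, c_other, r_path):
    """Shared form of Eq. (3.4) and Eq. (3.7).

    c_node  : load capacitance at the node being analysed
    c_other : load capacitance at the other end of the short
    r_path  : series resistance R_b + R_down3 + R_2 of the bridging path
    """
    total = c_node + c_other
    scale = (c_other / total) * math.log(2) / math.log(2 * total / (2 * c_node + c_other))
    return scale * r_path


def r_a_eff(c1, c2, r_b, r_down3, r_2):
    """Eq. (3.4): effective resistance seen from node A."""
    return effective_resistance(c1, c2, r_b + r_down3 + r_2)


def r_b_eff(c1, c2, r_b, r_down3, r_2):
    """Eq. (3.7): effective resistance seen from node B."""
    return effective_resistance(c2, c1, r_b + r_down3 + r_2)
```

For equal loads the scale factor reduces to $0.5 \ln 2 / \ln (4/3) \approx 1.2$, so, for example, `r_a_eff(1e-15, 1e-15, 1000.0, 0.0, 0.0)` is about 1.2 times the path resistance.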

The faulty cell delay is derived as

$$
\begin{equation*}
d_{B, R_{b}}=\alpha \cdot d_{r 2}+\ln \left(\frac{V_{B, \text { down }}-V_{B, u p}}{V_{B, \text { down }}-0.5 V_{d d}}\right) \cdot R_{B} \cdot C_{2}, \tag{3.8}
\end{equation*}
$$

where $d_{r 2}$ is the intrinsic rising delay of the faulty cell, $\alpha=\left(V_{d d}-V_{B, l o w}\right) / V_{d d}$ is introduced to scale $d_{r 2}$ because the initial voltage for the rising transition at $B$ is not zero, and $R_{B}=\left(R_{\text {down } 1}+R_{1}+R_{B, \text { eff }}+R_{b}+R_{\text {down } 3}\right) / / R_{\text {up } 2}+R_{2}$.

The approximation is verified on an inverter driving a NAND gate. The resistive gate-to-drain short of an NMOS transistor connects the inverter and the NAND gate. We use Cadence Spectre to run SPICE simulation under TSMC 180 nm 1.8 V technology. Results of delay vs. $R_{b}$ computed by our model and by SPICE simulation are shown in Figure 24. We assume $V_{L L}=0.9 \mathrm{~V}$, then compute the Bridge Threshold Resistance (BTR) $R_{A, \text{low}}=1284\ \Omega$. When $R_{b}<R_{A, \text{low}}$, a logic fault occurs on $A$. Otherwise, a decreasing delay change occurs, which means the delay in the presence of $R_{b}$ is less than the delay in a short-free circuit. Our model provides a reasonably accurate approximation of the delay change. In the figure, the SPICE curve and our approximation are close for larger $R_{b}$. The inaccuracy for smaller $R_{b}$ is due to the non-linear properties of MOS transistors. Similar results are found for other gates, as long as the circuit is a CMOS circuit.


Figure 24. Decreasing delay estimated and computed by SPICE vs. $\boldsymbol{R}_{b}$.

## b. Pattern 2: Static at A, Falling at B

Consider an NMOS transistor $T_{g}$, which is in series with the faulty NMOS transistor $T_{f}$. Assume there is a rising signal driving $T_{g}$, resulting in a falling transition at $B$, while the signal at $A$ is static.

During the transition, a pull-down path through $T_{f}$ and $T_{g}$ is formed. The circuit in transition is shown in Figure 25, where $R_{\text {down } 3}$ represents the linear resistance of $T_{f}$ and the NMOS block connecting the output of the faulty cell, and $R_{\text {down } 2}$ represents the linear resistance of the NMOS block connecting $T_{f}$ and GND.


Figure 25. $C_{2}$ is discharging through the pull-down path in the faulty cell.

The final voltage at $B$ is dependent on the value of $R_{b}$, and can be derived as

$$
\begin{equation*}
V_{B, \text { low }}=\frac{R_{\text {down } 2}}{R_{u p 1}+R_{1}+R_{b}+R_{\text {down } 2}} V_{d d} . \tag{3.9}
\end{equation*}
$$

Then the faulty cell delay is computed as

$$
\begin{equation*}
d_{B, R_{b}}=d_{f 2}+\ln \left(\frac{V_{d d}-V_{B, \text { low }}}{0.5 V_{d d}-V_{B, l o w}}\right) \cdot\left(R_{d o w n 2} / /\left(R_{u p 1}+R_{1}+R_{b}\right)+R_{d o w n 3}+R_{2}\right) \cdot C_{2}, \tag{3.10}
\end{equation*}
$$

where $d_{f 2}$ is the intrinsic falling delay of the faulty cell without the short.
Similarly, we approximate $d_{R_{b}}$ to be

$$
\begin{equation*}
d_{R_{b}}=\frac{d_{B, R_{b}}}{d_{B, R_{b}=\infty}} \cdot D_{B} . \tag{3.11}
\end{equation*}
$$
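The chain from Eq. (3.9) to Eq. (3.11) can be sketched as follows. This is only an illustration under the first-order model above: the resistance and capacitance values used in the usage note are hypothetical, and returning infinity when $V_{B,\text{low}}$ does not fall below $0.5 V_{dd}$ is a convention of this sketch for the case where the output never crosses the 50% point.

```python
import math


def parallel(*resistances):
    """Parallel combination; an infinite resistance contributes nothing."""
    return 1.0 / sum(1.0 / r for r in resistances)


def v_b_low(r_b, r_up1, r_1, r_down2, vdd):
    """Eq. (3.9): final voltage at B set by the resistive divider."""
    return r_down2 / (r_up1 + r_1 + r_b + r_down2) * vdd


def faulty_cell_delay(r_b, d_f2, r_up1, r_1, r_2, r_down2, r_down3, c2, vdd):
    """Eq. (3.10): falling delay at B in the presence of the short."""
    v_low = v_b_low(r_b, r_up1, r_1, r_down2, vdd)
    if v_low >= 0.5 * vdd:
        return math.inf  # sketch convention: the 50% point is never reached
    r_path = parallel(r_down2, r_up1 + r_1 + r_b) + r_down3 + r_2
    return d_f2 + math.log((vdd - v_low) / (0.5 * vdd - v_low)) * r_path * c2


def scaled_delay(d_b_rb, d_b_inf, nominal_delay):
    """Eq. (3.11): scale the nominal cell delay D_B by the model ratio."""
    return d_b_rb / d_b_inf * nominal_delay
```

Calling `faulty_cell_delay` with `r_b=math.inf` gives the short-free reference $d_{B, R_{b}=\infty}$, because the divider voltage becomes 0 and the parallel term reduces to $R_{\text{down}2}$.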

Experimental results on the same circuit are shown in Figure 26. The BTR for $V_{B, \text{low}}$ is $R_{B, \text{low}}=1798\ \Omega$. The figure shows that when $R_{b}<R_{B, \text{low}}$, there is a logic fault at $B$, and when $R_{b} \geq R_{B, \text{low}}$, there is an increasing delay change in the faulty cell in the presence of $R_{b}$.


Figure 26. Increasing delay estimated and computed by SPICE vs. $\boldsymbol{R}_{\boldsymbol{b}}$.

## B. NMOS Gate-to-source Short

An NMOS gate-to-source short may affect circuit behaviors in two ways.

1) The short acts as a resistor connecting $A$ to GND. For example, given the signals shown in Figure 27(a), the current flows through $R_{b}$ to GND, while the faulty NMOS transistor is cut off. The resistor can be represented by $R_{b}+R_{\text {down }}$, where $R_{\text {down }}$ represents the linear resistance of the NMOS block between the faulty transistor and GND.
2) The short connects $A$ and $B$ through the faulty NMOS transistor, and current flows from $A$ to the faulty cell output, then through some NMOS transistors in parallel with the faulty transistor to GND, as in the example shown in Figure 27(b). Therefore the voltage of $B$ is pulled up incorrectly, and may cause a logic fault or a delay fault. The corresponding STR values and the delay change can be computed similarly.


Figure 27. The NMOS gate-to-source short may connect $\boldsymbol{A}$ to GND or connect $\boldsymbol{A}$ to $\boldsymbol{B}$ under different input signals.

## C. PMOS Gate-to-source/drain Short

The gate-to-source short in a PMOS transistor can be modeled as a bridge resistor in series with a diode between the interconnect and VDD. The diode is always forward-biased and is represented by a fixed voltage drop across the bridge. The behavior of the diode can be analyzed similarly to NMOS gate-to-source shorts, except that current can never flow from $A$ to $B$. PMOS gate-to-drain shorts can be analyzed in the same way as NMOS gate-to-drain shorts.

## 2. Current Table Based Approach

## A. Static Analysis

In static analysis, the input and output signal voltages of the circuit remain static. In the short-free circuit, we assume that the output voltages of the driving cell $V_{a}$ and the faulty cell $V_{b}$ are VSS or VDD. In the presence of the short, those voltages may not be exactly VSS or VDD, but depend on the short resistance and the input signals. It is essential to calculate the static voltages quickly and accurately. The voltage indicates whether a logic fault exists, i.e., whether it is outside the range for a correct digital signal. Moreover, it is an important component of the delay approximation, which will be discussed in dynamic analysis.

## i. Static Voltage Approximation

Consider an NMOS gate-to-source short in a 2-input NAND gate, as shown in Figure 28. We use an inverter as the driving cell. Transistors in the circuit are labeled $\mathrm{M}_{1}, \mathrm{M}_{2}, \ldots, \mathrm{M}_{6}$. The voltage of the inverter input is $V_{g}$, node $A$ voltage is $V_{a}$, node $B$ voltage is $V_{b}$, and the voltage at the other terminal of the short is $V_{c}$. Signals on transistors $\mathrm{M}_{4}$ and $\mathrm{M}_{6}$ are non-controlling signals, held statically at VDD. The task of the static analysis is to approximate $V_{a}$ and $V_{b}$, given the short resistance $R_{b}$ and the static input voltage $V_{g}$.


Figure 28. The circuit with an NMOS gate-to-source short in static analysis.

Assume that the current through each transistor is $I_{d s 1}, I_{d s 2}, I_{d s 3}, I_{d s 5}$, and $I_{d s 6}$, and the current through $R_{b}$ is $I_{b}$, as shown in the figure. Note that we assume there is no current through transistor $\mathrm{M}_{4}$ because it is in cut-off state. Then we have:

$$
\begin{equation*}
I_{d s 1}=I_{d s 2}+I_{b} \tag{3.12}
\end{equation*}
$$

$$
\begin{align*}
& V_{a}-V_{c}=R_{b} \cdot I_{b},  \tag{3.13}\\
& I_{b}+I_{d s 5}=I_{d s 6},  \tag{3.14}\\
& I_{d s 3}=I_{d s 5} . \tag{3.15}
\end{align*}
$$

On the other hand, we can express the drain-to-source current of a MOS transistor $I_{d s}$ as a function of voltage $V_{g s}$ between the transistor gate and source, and voltage $V_{d s}$ between the transistor drain and source, i.e. $I_{d s}=F\left(V_{g s}, V_{d s}\right)$. Therefore we have:

$$
\begin{align*}
& I_{d s 1}=F_{1}\left(\mathrm{VDD}-V_{g}, \mathrm{VDD}-V_{a}\right),  \tag{3.16}\\
& I_{d s 2}=F_{2}\left(V_{g}, V_{a}\right),  \tag{3.17}\\
& I_{d s 3}=F_{3}\left(\mathrm{VDD}-V_{a}, \mathrm{VDD}-\mathrm{V}_{\mathrm{b}}\right),  \tag{3.18}\\
& I_{d s 5}=F_{5}\left(V_{a}-V_{c}, V_{b}-V_{c}\right),  \tag{3.19}\\
& I_{d s 6}=F_{6}\left(\mathrm{VDD}, V_{c}\right) \tag{3.20}
\end{align*}
$$

The values of $V_{a}$ and $V_{b}$ can be derived from Eq. (3.12) to Eq. (3.20). The speed and accuracy depend on the current model in use. For example, the Shichman-Hodges (SPICE Level-1) MOSFET model provides simple piecewise equations, which express $I_{d s}$ as a polynomial of $V_{g s}$ and $V_{d s}$ under different conditions. Then we can use non-linear regression approaches to derive $V_{a}$ and $V_{b}$. However, experiments show that using the Shichman-Hodges model introduces errors of more than $10 \%$. More accurate models, such as the Alpha-Power Law MOSFET model [47], are too complicated to be applied. Instead, in this work, we use a two-dimensional table to describe $I_{d s}=F\left(V_{g s}, V_{d s}\right)$. Each entry of the table is the value of $I_{d s}$ at a particular $V_{g s}$ and $V_{d s}$, pre-computed by SPICE simulation. Accurate current models can also be used to set up the table. We only need to compute the table for the smallest size of PMOS and NMOS transistors respectively. For other transistor sizes, the table is scaled proportionally to the transistor size.
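A minimal sketch of such a current table is given below. It assumes a uniform 19-point grid, bilinear interpolation between entries, and a single multiplicative factor for transistor width. The table here is filled from a toy Level-1 style formula with hypothetical parameters `k` and `vt` purely so the example is self-contained; in the approach described above the entries come from SPICE simulation.

```python
from bisect import bisect_right


def level1_nmos(vgs, vds, k=1e-4, vt=0.5):
    """Toy stand-in for SPICE data (hypothetical k and vt)."""
    vov = vgs - vt
    if vov <= 0.0:
        return 0.0                                  # cut-off
    if vds < vov:
        return k * (vov * vds - 0.5 * vds * vds)    # non-saturation
    return 0.5 * k * vov * vov                      # saturation


def build_table(model, axis):
    """table[i][j] holds I_ds at V_gs = axis[i] and V_ds = axis[j]."""
    return [[model(vgs, vds) for vds in axis] for vgs in axis]


def lookup(table, axis, vgs, vds, width_scale=1.0):
    """Bilinear interpolation of I_ds, scaled for a wider transistor."""
    def bracket(v):
        v = min(max(v, axis[0]), axis[-1])          # clamp to the grid
        i = min(max(bisect_right(axis, v) - 1, 0), len(axis) - 2)
        return i, (v - axis[i]) / (axis[i + 1] - axis[i])

    i, tg = bracket(vgs)
    j, td = bracket(vds)
    low = table[i][j] * (1.0 - td) + table[i][j + 1] * td
    high = table[i + 1][j] * (1.0 - td) + table[i + 1][j + 1] * td
    return width_scale * (low * (1.0 - tg) + high * tg)


AXIS = [0.1 * i for i in range(19)]                 # 0 V to 1.8 V, 19 samples
TABLE = build_table(level1_nmos, AXIS)
```

A transistor twice as wide as the reference device would be queried as `lookup(TABLE, AXIS, vgs, vds, width_scale=2.0)`.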
The table-based approximation on gate-to-source shorts works as follows. First we initialize a vector of $V_{a}$ values, sampled from 0 to VDD. Then we derive a vector of $I_{b}$ corresponding to $V_{a}$ based on Eq. (3.12), and then derive a vector of $V_{c}$ according to Eq. (3.13). Corresponding values of $I_{d s 6}$ are computed based on Eq. (3.20). Then we use Eq. (3.14) to obtain values of $I_{d s 5}$, and derive $V_{b}$ using Eq. (3.19). Next, values of $I_{d s 3}$ on transistor $\mathrm{M}_{3}$ are computed according to Eq. (3.18). Since $I_{d s 3}=I_{d s 5}$ by Eq. (3.15), the curves of $I_{d s 3}$ and $I_{d s 5}$ over the sampled $V_{a}$ must cross, and the crossing point corresponds to the solution for $V_{a}$. If the number of sample points of $V_{a}$ is $n$, the complexity of the approximation is $\mathrm{O}\left(n^{2}\right)$, since each sample point requires $\mathrm{O}(n)$ work for interpolation in the current table.

As an example, we show the approximated $V_{a}$ and $V_{b}$ curves over a series of $R_{b}$ values, and compare them with SPICE simulation results in Figure 29. Experiments were performed on the circuit with PMOS transistors of size $\mathrm{W}=0.7\ \mu\mathrm{m}$ and $\mathrm{L}=0.2\ \mu\mathrm{m}$, and NMOS transistors of size $\mathrm{W}=0.7\ \mu\mathrm{m}$ and $\mathrm{L}=0.2\ \mu\mathrm{m}$ under TSMC 180 nm technology. The input signal $V_{g}$ is 0. The current table size is $19 \times 19$, and the size of the initial $V_{a}$ vector is 19. Results show that the approximated voltage matches that of SPICE simulation well.


Figure 29. Static voltage approximation compared with SPICE simulation.

## ii. Short Threshold Resistance

Assume the logic low threshold voltage is $V_{L L}$ and the logic high threshold voltage is $V_{H H}$. Then define the threshold short resistance $R_{A, \text{low}}$ for the logic value of A, such that when $R_{b}<R_{A, \text{low}}$, $V_{A, \text{low}}$ is greater than $V_{L L}$. Therefore, when $R_{b}<R_{A, \text{low}}$, we consider there to be a logic fault on A; otherwise we consider a delay fault. The relationship is shown in Figure 30.


Figure 30. Threshold short resistance $\boldsymbol{R}_{A, l o w}$.

The threshold short resistance can be obtained using the proposed voltage approximation approach. That is, once the curve of $V_{a}$ over $R_{b}$ is computed, $R_{A, \text{low}}$ can be read off as the value of $R_{b}$ at which $V_{a}$ equals $V_{L L}$.
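Reading the threshold off a sampled curve can be sketched as a linear interpolation between the two samples that bracket the threshold voltage. The function name is an assumption of this sketch, and the sample values in the usage note are made up for illustration; they are not the data of Figure 29.

```python
def threshold_resistance(r_values, v_values, v_threshold):
    """Return the R_b at which the sampled static voltage crosses v_threshold.

    r_values : short resistances in increasing order
    v_values : approximated static voltage at each resistance
    Returns None when the curve never crosses the threshold.
    """
    points = list(zip(r_values, v_values))
    for (r0, v0), (r1, v1) in zip(points, points[1:]):
        crosses = (v0 - v_threshold) * (v1 - v_threshold) <= 0.0
        if crosses and v0 != v1:
            # Linear interpolation inside the bracketing interval.
            return r0 + (v_threshold - v0) * (r1 - r0) / (v1 - v0)
    return None
```

For the hypothetical samples `threshold_resistance([500, 1000, 1500, 2000], [1.2, 1.0, 0.8, 0.6], 0.9)` the crossing lies halfway between the second and third points, at about $1250\ \Omega$. The same routine applies to $V_{H H}$ and to the voltages at $B$.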

Similarly, we can compute threshold resistance $R_{A, h i g h}$ corresponding to $V_{H H}$, and the threshold resistance for signal at $B$. These resistance values are to be used to indicate if a logic fault happens.

## B. Dynamic Analysis

In dynamic analysis, we consider circuit behaviors during signal transitions, and explore the relationship between delay changes and the short resistance. Transitions may occur in both the driving cell and the faulty cell, or occur in the faulty cell only, depending on input signal patterns. In this paper, we focus on the transition on both cells, since it is more complex due to the interaction of the two cells. Similar analysis and results can be simply applied to the transition that only occurs on the faulty cell.

We first analyze the transition procedure in the presence of $R_{b}$, and show how the delay changes under different input signals. Then we present a new delay model, and use it in our approximation approach. Finally, we compare our approximation result with SPICE simulation. In dynamic analysis, $R_{b}$ is assumed to be larger than the threshold short resistance, so that no logic fault exists. To simplify the computation, we use $C_{1}$ and $C_{2}$ to represent the load capacitance of the driving cell and the faulty cell respectively. Both $C_{1}$ and $C_{2}$ include the lumped interconnect parasitic capacitance and the input gate capacitance. Because the analysis of NMOS gate oxide shorts can be applied dually to PMOS gate oxide shorts, in this section we only study NMOS gate oxide shorts. For a PMOS gate oxide short, the diode in the short model allows current to flow in only one direction. If the diode conducts, it is treated as a fixed voltage drop on the current path. If not, the current path is cut off.

## i. Transition Procedure

The presence of the short resistor provides an extra current channel for the load capacitor to be charged and discharged. Consider the circuit with an NMOS gate-to-source short, and assume a rising signal from 0 to VDD on the input of the driving cell. The circuit in transition is shown in Figure 31.


Figure 31. The circuit with an NMOS gate-to-source short in transition.

Initially, transistor $\mathrm{M}_{1}$ is conducting and $\mathrm{M}_{2}$ is cut off due to the low voltage on the input of the inverter. Transistor $\mathrm{M}_{6}$, which is connected to node $A$ by $R_{b}$, is conducting due to the high voltage on its gate. Therefore, a current path $\left(\mathrm{M}_{1}-R_{b}-\mathrm{M}_{6}\right)$ is formed, and $V_{a}$ is pulled down from VDD. Accordingly, $V_{b}$ may not remain at 0, because $M_{3}$ can be in the non-saturation state for $V_{a}>V_{p t}$, where $V_{p t}$ is the threshold voltage of the PMOS transistor $\mathrm{M}_{3}$.

When the transition begins, transistor $\mathrm{M}_{2}$ starts to conduct. Capacitor $C_{1}$ is discharged through $\mathrm{M}_{2}$, and through $R_{b}$ and $\mathrm{M}_{6}$ in a parallel path. Here the presence of $R_{b}$ provides an extra discharging path for $C_{1}$. For smaller $R_{b}$, the time constant of the extra discharging path is smaller, which means $C_{1}$ is discharged even faster. In that scenario, the driving cell delay decreases. At the end of the transition, $C_{1}$ is completely discharged and $V_{a}=0$. On the faulty cell, capacitor $C_{2}$ is charged through $\mathrm{M}_{3}$. Although there is no extra current path for $C_{2}$ to be charged faster, the initially lowered voltage on the input of the faulty cell accelerates the process. Finally $V_{b}$ is pulled up to VDD.

In Figure 32, we show the waveforms of $V_{a}$ under three different values of $R_{b}$ in SPICE simulation, where $\mathrm{VDD}=1.8 \mathrm{~V}$ and the input signal is a rising ramp with an edge rate of 30 ps. The initial voltage of $V_{a}$ becomes smaller with smaller $R_{b}$, and the delay measured at $50 \%$ of VDD decreases as well.


Figure 32. Delay decreases with smaller $\boldsymbol{R}_{b}$, under a rising input signal.

Now consider the transition triggered by a falling signal at the driving cell input.

Initially, $V_{a}=0$ and $V_{b}=\mathrm{VDD}$. Then capacitor $C_{1}$ is charged through transistor $\mathrm{M}_{1}$ and $V_{a}$ increases. On the other hand, the discharging path for capacitor $C_{1}$ through $R_{b}$ and transistor $\mathrm{M}_{6}$ exists during both the static and the transition period. That slows down the charging of $C_{1}$, and lowers the final value of $V_{a}$ below VDD. Accordingly, because of the slow increase of $V_{a}$, the discharging of $C_{2}$ becomes slow as well.

In Figure 33, we show the waveforms of $V_{a}$ under the falling input signals. The delay increases with smaller $R_{b}$.


Figure 33. Delay increases with smaller $\boldsymbol{R}_{b}$, under a falling input signal.

In conclusion, the short resistance $R_{b}$ affects delay changes in two ways. One is that $R_{b}$ provides an additional current path in the pull-up/down path. Therefore the load capacitor is charged or discharged faster when the path helps the process, and slower when the path deters it. The other is that the initial voltage may change. That makes the transistor convert from the cut-off state to the non-saturation state faster.

We summarize the relationship between delay changes and the output waveform shape in Table 4. The initial and final voltages of the signal are $V_{0}$ and $V_{1}$ respectively. In the table, "off VDD" means that the voltage is lowered from VDD due to the short, and "off VSS" means that the voltage is pulled up from VSS. An offset initial voltage speeds up the transition, while an offset final voltage indicates that an extra current path slows it down. When both $V_{0}$ and $V_{1}$ are offset from VSS or VDD, the delay generally decreases. However, for $R_{b}$ close to the threshold short resistance, the delay may be greater than the delay in the fault-free circuit. Therefore we mark that waveform shape as "undetermined".

TABLE 4. Relationship between delay changes and output signal waveform.

| Type | V0 | V1 | Delay change |
| :---: | :---: | :---: | :---: |
| Rising | VSS | Off VDD | increasing |
| Falling | Off VDD | VSS | decreasing |
| Rising | Off VSS | VDD | decreasing |
| Falling | VDD | Off VSS | increasing |
| Rising | Off VSS | Off VDD | undetermined |
| Falling | Off VDD | Off VSS | undetermined |
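
The classification in Table 4 can be expressed as a small lookup on the endpoints of the waveform. The sketch below is illustrative only: the tolerance `tol`, the default rail values, and the "unchanged" result for a waveform that starts and ends exactly on the rails are assumptions added here and are not part of the table.

```python
def delay_change(v0, v1, vss=0.0, vdd=1.8, tol=1e-9):
    """Classify the delay change from the initial (v0) and final (v1) voltage."""
    rising = v1 > v0
    start_rail, end_rail = (vss, vdd) if rising else (vdd, vss)
    v0_offset = abs(v0 - start_rail) > tol   # initial voltage moved off its rail
    v1_offset = abs(v1 - end_rail) > tol     # final voltage stops short of its rail
    if v0_offset and v1_offset:
        return "undetermined"
    if v0_offset:
        return "decreasing"                  # transition starts closer to 50%
    if v1_offset:
        return "increasing"                  # an extra current path opposes it
    return "unchanged"                       # assumption: fault-free waveform
```

For instance, a rising waveform from VSS that settles at 1.5 V with VDD at 1.8 V is classified as "increasing", matching the first row of the table.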

## ii. Delay Approximation

The delay approximation approach is based on the following new delay model:

$$
\begin{equation*}
D=D_{\text {sleep }}+D_{\text {intrinsic }}+D_{R C}, \tag{3.21}
\end{equation*}
$$

where $D$ is the cell delay, $D_{\text {sleep }}$ is the time for the transistor to convert from the cut-off status to the non-saturation status, $D_{\text {intrinsic }}$ is the delay with zero output load, and $D_{R C}$ is the extra delay caused by the output load.

## a. $D_{\text {sleep }}$ Approximation

$D_{\text {sleep }}$ is determined by the input signal waveform. It accounts for the time before a transistor starts to conduct, and is computed as the time during which $V_{g s}<V_{t}$, where $V_{t}$ is the threshold voltage of the transistor that is turning on. For a slow input signal, $D_{\text {sleep }}$ is more significant. For fast input signals, $D_{\text {sleep }}$ is small; in particular, $D_{\text {sleep }}=0$ under a step input.

## b. $D_{\text {intrinsic }}$ Approximation

$D_{\text {intrinsic }}$ is measured from the time that the transistor starts to conduct to the time that the output waveform crosses $50 \%$ of VDD, for the cell with zero load. Note that it differs from the traditional delay model in that the starting point is not the time at which the input waveform crosses $50 \%$ of VDD.

Assuming $R_{b}$ is infinite, i.e., in a short-free circuit, we can compute the intrinsic delay using any commercial tool, and denote it by $D_{\text {int }}$. Then, considering the presence of the short, we treat the transistor in transition as a first-order linear system under a step stimulus, and write the output voltage as:

$$
\begin{equation*}
V(t)=V_{1}+\left(V_{0}-V_{1}\right) e^{-t / \tau} \tag{3.22}
\end{equation*}
$$

where $V_{0}$ is the initial voltage, $V_{1}$ is the final voltage, and $\tau$ is the time constant of the system. Then the delay for the waveform to cross $50 \%$ of VDD can be expressed as:

$$
\begin{equation*}
d_{50 \% \mathrm{VDD}}=\ln \left(\frac{V_{0}-V_{1}}{50 \% \mathrm{VDD}-V_{1}}\right) \cdot \tau \tag{3.23}
\end{equation*}
$$
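
For completeness, the intermediate steps from Eq. (3.22) to Eq. (3.23) are sketched here. The only assumption is that $50 \%$ of VDD lies strictly between $V_{0}$ and $V_{1}$, so that the argument of the logarithm is greater than 1 and the crossing time $t$ is positive.

$$
\begin{align*}
50 \% \mathrm{VDD} &= V_{1}+\left(V_{0}-V_{1}\right) e^{-t / \tau}, \\
e^{-t / \tau} &= \frac{50 \% \mathrm{VDD}-V_{1}}{V_{0}-V_{1}}, \\
t &= \tau \cdot \ln \left(\frac{V_{0}-V_{1}}{50 \% \mathrm{VDD}-V_{1}}\right).
\end{align*}
$$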

Note that in the short-free circuit, $d_{50 \% \mathrm{VDD}}=D_{\text {int }}=\ln 2 \cdot \tau$. Therefore, we approximate the intrinsic delay in the presence of $R_{b}$ as a scaled $D_{\text {int }}$ :

$$
\begin{equation*}
D_{\text {intrinsic }}=\ln \left(\frac{V_{0}-V_{1}}{50 \% \mathrm{VDD}-V_{1}}\right) / \ln 2 \cdot D_{\mathrm{int}} . \tag{3.24}
\end{equation*}
$$

Note that in the above formula, we assume that the time constant $\tau$ does not change with $R_{b}$. That is reasonable since the transistor is in non-saturation status during most of the transition time, and is not affected by the change of $V_{0}$ and $V_{1}$.

## c. $D_{R C}$ Approximation

$D_{R C}$ accounts for the delay due to the output load $C_{l}$, and can be expressed as

$$
\begin{equation*}
D_{R C}=\ln \left(\frac{V_{0}-V_{1}}{50 \% \mathrm{VDD}-V_{1}}\right) \cdot R_{d} C_{l}, \tag{3.25}
\end{equation*}
$$

where $R_{d}$ is the equivalent driving resistance of the pull-up/down path.
In the short-free circuit, we denote it by $D_{R C 0}=\ln 2 \cdot R_{d} C_{l}$; it can be computed by subtracting $D_{\text {sleep }}$ and $D_{\text {intrinsic }}$ from the delay with output load. With the output load $C_{l}$ known, we can compute the driving resistance $R_{d}=D_{R C 0} /\left(\ln 2 \cdot C_{l}\right)$. This value characterizes the transistor's driving ability and is kept unchanged in the faulty circuit.
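The extraction of $R_{d}$ from a fault-free measurement and its reuse in Eq. (3.25) can be sketched as below. The function names are assumptions of this sketch, and the 10 ps load delay and 1 fF load in the usage note are hypothetical numbers chosen only to make the arithmetic easy to follow.

```python
import math


def driving_resistance(d_rc0, c_load):
    """R_d recovered from the fault-free load delay D_RC0 = ln2 * R_d * C_l."""
    return d_rc0 / (math.log(2) * c_load)


def rc_delay(v0, v1, vdd, r_d, c_load):
    """Eq. (3.25): load delay for a transition from v0 towards v1."""
    return math.log((v0 - v1) / (0.5 * vdd - v1)) * r_d * c_load
```

With `r_d = driving_resistance(10e-12, 1e-15)` (about $14.4\ \mathrm{k\Omega}$), `rc_delay(1.8, 0.0, 1.8, r_d, 1e-15)` returns the original 10 ps, while a lowered initial voltage such as `rc_delay(1.5, 0.0, 1.8, r_d, 1e-15)` gives a smaller value because the logarithmic factor drops below $\ln 2$.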

In the presence of $R_{b}$, the voltages $V_{0}$ and $V_{1}$ change, as does $R_{d}$. The value of $R_{d}$ depends on the type of the gate-oxide short and on the input signals. In the following, we derive $R_{d}$ in two typical cases. For other cases, the value of $R_{d}$ can be derived similarly. We assume the driving resistance of transistor $\mathrm{M}_{\mathrm{i}}$ is $R_{i}$, which is pre-computed in the fault-free circuit.

## Case 1. NMOS gate-to-source short under rising input

Consider the circuit in Figure 31. The rising input of the driving cell causes a falling transition at $A$ and a rising transition at $B$. The simplified circuit for the computation of $R_{d}$ is shown in Figure 34.


Figure 34. Simplified circuit for RC delay approximation.

Then for the falling transition at $A$, the equivalent driving resistance is $R_{d, A}=R_{2} / /\left(R_{b}+R_{6}\right)$, where "//" means two resistors in parallel, and for the rising transition at $B$, the equivalent driving resistance is $R_{d, B}=R_{3}$.
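
As a small illustration of the "//" operator in Case 1, the sketch below evaluates $R_{d, A}$ and $R_{d, B}$ for a given $R_{b}$. The resistance values in the usage note are hypothetical; in the approach above, $R_{2}$, $R_{3}$ and $R_{6}$ come from the fault-free characterization.

```python
import math


def parallel(*resistances):
    """Parallel combination; an infinite resistance drops out."""
    return 1.0 / sum(1.0 / r for r in resistances)


def case1_driving_resistances(r_b, r2, r3, r6):
    """Return (R_dA, R_dB) for the NMOS gate-to-source short under rising input."""
    return parallel(r2, r_b + r6), r3
```

With `r2 = 5000.0` and `r6 = 5000.0`, a short of `r_b = 5000.0` gives $R_{d, A} \approx 3333\ \Omega$, and `r_b = math.inf` recovers $R_{d, A}=R_{2}$, which is consistent with a smaller $R_{b}$ speeding up the transition at $A$.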

Such approximation also can be applied to NMOS gate-to-source short under falling input, and PMOS gate-to-source short under rising and falling input, except that the conducted transistor is different.

## Case 2. NMOS gate-to-drain short under rising input

Consider the NMOS gate-to-drain short of an inverter in the circuit of Figure 35.


Figure 35. Circuit with an NMOS gate-to-drain short under a rising input.

Initially, $\mathrm{M}_{1}$ is conducting and $\mathrm{M}_{2}$ is cut off. Depending on the initial voltage $V_{a}$, $\mathrm{M}_{3}$ and $\mathrm{M}_{4}$ may be cut off or conducting. Under the rising input, $V_{a}$ falls and $V_{b}$ rises. Then the faulty NMOS transistor becomes cut off due to the low $V_{a}$, and $\mathrm{M}_{3}$ starts to conduct. Without $R_{b}$, $C_{1}$ is discharged only through $\mathrm{M}_{2}$, and $C_{2}$ is charged only through $\mathrm{M}_{3}$. However, in the presence of $R_{b}$, a path through $R_{b}$ connects $A$ and $B$. Then $C_{1}$ is partially discharged to $C_{2}$ because the initial voltage of $A$ is higher than that of $B$. At the same time, $C_{1}$ is charged through the pull-up path of the faulty cell and $R_{b}$. Replacing the transistors with their driving resistances, we obtain the circuit in transition shown in Figure 36.


Figure 36. A current path connects $\boldsymbol{A}$ and $\boldsymbol{B}$ through $\boldsymbol{R}_{\boldsymbol{b}}$.

The values of $R_{d}$ for $V_{a}$ falling and for $V_{b}$ rising are computed similarly to the analysis of the linear-resistance-based method in Eq. (3.4) and Eq. (3.7) respectively.

Then we get

$$
\begin{equation*}
R_{A, e f f}=\frac{C_{2}}{C_{1}+C_{2}} \cdot \frac{\ln (2)}{\ln \left(\frac{2\left(C_{1}+C_{2}\right)}{2 C_{1}+C_{2}}\right)} \cdot\left(R_{b}+R_{4}\right) \tag{3.26}
\end{equation*}
$$

Therefore $R_{d, A}=R_{2} / /\left(R_{b}+R_{4}\right) / / R_{A, \text{eff}}$, which is the parallel resistance of $R_{2}$, $R_{b}+R_{4}$, and $R_{A, \text{eff}}$.

The computation of $R_{d, B}$ is similar.

$$
\begin{equation*}
R_{B, e f f}=\frac{C_{1}}{C_{1}+C_{2}} \cdot \frac{\ln (2)}{\ln \left(\frac{2\left(C_{1}+C_{2}\right)}{C_{1}+2 C_{2}}\right)} \cdot\left(R_{b}+R_{4}\right) . \tag{3.27}
\end{equation*}
$$

The driving resistance of the faulty cell is $R_{d, B}=\left(R_{2}+R_{B, e f f}+R_{b}+R_{4}\right) / / R_{3}$.

## d. Approximation Procedure and Experimental Results

We summarize the procedure of the delay approximation approach as follows.

Step 1. Compute the cell delays in the fault-free circuit, and derive the driving resistance of each transistor and the intrinsic delay.

Step 2. Using static analysis, compute the initial voltages $V_{a 0}$ and $V_{b 0}$ and the final voltages $V_{a 1}$ and $V_{b 1}$.

Step 3. Approximate the delay at A, using the delay model in Eq. (3.21).

Step 4. Using $V_{a}$ as the input signal of the faulty cell, determine $D_{\text {sleep }}$ for the delay approximation of the faulty cell. This step is similar to Step 3, except that the delay is computed at the time when $V_{g s}=V_{t}$.

Step 5. Approximate the delay at B, using the delay model in Eq. (3.21).
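
The way the three terms of the delay model combine in Steps 3 and 5 can be sketched as follows. This is a simplified illustration, not the simulator used in the experiments: the linear-ramp formula for $D_{\text {sleep }}$ is an assumption of this sketch, and all numerical values in the usage note are hypothetical.

```python
import math


def log_factor(v0, v1, vdd):
    """Shared factor ln((V0 - V1) / (50% VDD - V1)) of Eq. (3.24) and Eq. (3.25)."""
    return math.log((v0 - v1) / (0.5 * vdd - v1))


def ramp_sleep_time(v_t, v_start, v_end, t_ramp):
    """Assumed D_sleep for a linear gate ramp: time until V_gs reaches V_t."""
    if v_start >= v_t or t_ramp == 0.0:
        return 0.0                      # already conducting, or a step input
    return t_ramp * (v_t - v_start) / (v_end - v_start)


def cell_delay(d_sleep, d_int, r_d, c_load, v0, v1, vdd):
    """D = D_sleep + D_intrinsic + D_RC for one cell."""
    scale = log_factor(v0, v1, vdd)
    d_intrinsic = scale / math.log(2) * d_int   # scaled fault-free intrinsic delay
    d_rc = scale * r_d * c_load                 # load-dependent term
    return d_sleep + d_intrinsic + d_rc
```

For a full-swing transition (`v0=1.8`, `v1=0.0`, `vdd=1.8`) the scale factor is $\ln 2$, so `cell_delay` reduces to the fault-free sum $D_{\text {sleep }}+D_{\text {int }}+\ln 2 \cdot R_{d} C_{l}$. Passing the static voltages from Step 2 as `v0` and `v1` gives the faulty-circuit estimate.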

To verify our delay approximation approach, we estimate the delay of a faulty circuit consisting of two inverters in series. TSMC 180 nm technology is used. PMOS transistor size is $\mathrm{Wp}=1.4 \mathrm{um}$ and $\mathrm{Lp}=0.2 \mathrm{um}$. NMOS transistor size is $\mathrm{Wn}=0.7 \mathrm{um}$ and $\mathrm{Ln}=0.2 \mathrm{um}$. Load capacitance $C_{1}=C_{2}=1 \mathrm{fF}$. We show the approximated delay on the output of the circuit and SPICE simulation result in the following figures. Each figure represents a different case of the transition and is summarized in Table 5.

TABLE 5. Approximated delay and SPICE simulation results in Figure 37 - Figure 40.

| Short type | Input | Normal delay | Worst delay | Threshold short resistance | Delay change |
| :---: | :---: | :---: | :---: | :---: | :---: |
| NMOS gate-to-source | rising | 19.7 ps | 0.6 ps | $R_{A, \text{high}}=853\ \Omega$ | decreasing |
| NMOS gate-to-source | falling | 20.9 ps | 48.0 ps | $R_{A, \text{high}}=853\ \Omega$ | increasing |
| NMOS gate-to-drain | rising | 19.7 ps | 12.3 ps | $R_{B, \text{high}}=775\ \Omega$ | decreasing |
| NMOS gate-to-drain | falling | 20.9 ps | 5.8 ps | $R_{B, \text{high}}=775\ \Omega$ | decreasing |

In Figure 37, we consider the delay of the circuit with an NMOS gate-to-source short and a rising input. The threshold short resistance on A is $R_{A, \text{high}}=853\ \Omega$. When $R_{b}<R_{A, \text{high}}$, a logic fault occurs on A; otherwise, the circuit delay is less than the normal circuit delay. As $R_{b}$ decreases, the delay change becomes more severe.


Figure 37. Approximated circuit delay compared with SPICE simulation for NMOS gate-to-source short, under a rising input.

When we use a falling input signal, the delay curves are shown in Figure 38, where an increasing delay change happens.


Figure 38. Approximated circuit delay compared with SPICE simulation for NMOS gate-to-source short, under a falling input.

In Figure 39 and Figure 40, we show the results of an NMOS gate-to-drain short under a rising input and a falling input respectively. The SPICE simulation shows that, with $R_{b}$ close to the threshold short resistance, the delay starts to increase. That is because the slope of the output waveform becomes closer to 0, resulting in a significant increase of the delay measured at $50 \%$ of VDD.


Figure 39. Approximated circuit delay compared with SPICE simulation for NMOS gate-to-drain short, under a rising input.


Figure 40. Approximated circuit delay compared with SPICE simulation for NMOS gate-to-drain short, under a falling input.

## 3. Fault Behavior and Vector Selection

We summarize the circuit behaviors of resistive gate oxide shorts in Table 6. For each type of short, we list the delay changes for different input signal patterns. In the table, $T_{f}$ is the faulty transistor, and $T_{g}$ is another transistor in the same NMOS/PMOS block as $T_{f}$. We assume that the rising/falling transition at the output of the faulty cell is provoked only by the signals on $T_{g}$ and $T_{f}$, while signals on other transistors in the same NMOS/PMOS block remain static. " $T_{f} \mid T_{g}$ " means the two transistors are in parallel, " $T_{f} \sim T_{g}$ " means the two transistors are in series, "decrease/increase" means the delay decreases/increases compared to the short-free circuit, and " 0 " means the delay does not change even in the presence of the short. The "static" status means the signal stays at logic high or logic low, to guarantee that a pull-up/down path conducts when $T_{g}$ and $T_{f}$ are in series, or to keep the transistor cut off when $T_{g}$ and $T_{f}$ are in parallel. The table enumerates all input signal patterns that cause an increasing/decreasing delay change at the output of the faulty cell. The amount of delay change is given by formulae such as (3.3) and (3.11).

## TABLE 6. Behaviors for each type of short under different input signals.

| Short type | Input on $T_{f}$ | Input on $T_{g}$ | Delay change |
| :--- | :--- | :--- | :--- |
| NMOS gate-to-source | Rising | Static | increase |
|  | Falling | Static | decrease |
|  | Static | Rising | $0\left(T_{f} \mid T_{g}\right)$, increase $\left(T_{f} \sim T_{g}\right)$ |
|  | Static | Falling | $0\left(T_{f} \mid T_{g}\right)$, decrease $\left(T_{f} \sim T_{g}\right)$ |
| NMOS gate-to-drain | Rising | Static | decrease |
|  | Falling | Static | decrease |
|  | Static | Rising | decrease $\left(T_{f} \mid T_{g}\right)$, increase $\left(T_{f} \sim T_{g}\right)$ |
|  | Static | Falling | increase $\left(T_{f} \mid T_{g}\right)$, decrease $\left(T_{f} \sim T_{g}\right)$ |
| PMOS gate-to-drain | Rising | Static | decrease |
|  | Falling | Static | increase |
|  | Static | Rising | $0\left(T_{f} \mid T_{g}\right)$, decrease $\left(T_{f} \sim T_{g}\right)$ |
|  | Static | Falling | $0\left(T_{f} \mid T_{g}\right)$, increase $\left(T_{f} \sim T_{g}\right)$ |
| PMOS gate-to-source | Rising | Static | decrease |
|  | Falling | Static | increase |
|  | Static | Rising | $0\left(T_{f} \mid T_{g}\right), 0\left(T_{f} \sim T_{g}\right)$ |
|  | Static | Falling | $0\left(T_{f} \mid T_{g}\right)$, increase $\left(T_{f} \sim T_{g}\right)$ |

Our fault model can be used to select the vector that causes the greatest delay change. Note that in this dissertation we only consider delay faults due to an increasing delay change, based on the path delay fault model [48]. Analysis of the decreasing delay change is similar. The input signals on $T_{f}$ and $T_{g}$ needed to cause an increasing delay change can be found in Table 6. To further refine the vector selection, the input signals of the driving cell must be considered, because they affect the pull-up/down resistance of the driving cell and hence the amount of delay change. Such dependence is also observed and called "pattern dependence" in Hao and McCluskey [5], but it is used for a logic fault model in their work and is insufficient for a delay fault model. In our fault model, the dependence is explicitly expressed in formulas such as (3.1) and (3.8).
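
Selecting the candidate patterns from Table 6 amounts to a table lookup, sketched below. The dictionary is a transcription of Table 6; storing each entry as a (parallel, series) pair, and repeating the single value for rows where $T_{f}$ itself switches, is a representation chosen for this sketch rather than something stated in the table.

```python
# (short type, input on T_f, input on T_g) -> (change if parallel, change if series)
TABLE_6 = {
    ("NMOS gate-to-source", "rising", "static"): ("increase", "increase"),
    ("NMOS gate-to-source", "falling", "static"): ("decrease", "decrease"),
    ("NMOS gate-to-source", "static", "rising"): ("0", "increase"),
    ("NMOS gate-to-source", "static", "falling"): ("0", "decrease"),
    ("NMOS gate-to-drain", "rising", "static"): ("decrease", "decrease"),
    ("NMOS gate-to-drain", "falling", "static"): ("decrease", "decrease"),
    ("NMOS gate-to-drain", "static", "rising"): ("decrease", "increase"),
    ("NMOS gate-to-drain", "static", "falling"): ("increase", "decrease"),
    ("PMOS gate-to-drain", "rising", "static"): ("decrease", "decrease"),
    ("PMOS gate-to-drain", "falling", "static"): ("increase", "increase"),
    ("PMOS gate-to-drain", "static", "rising"): ("0", "decrease"),
    ("PMOS gate-to-drain", "static", "falling"): ("0", "increase"),
    ("PMOS gate-to-source", "rising", "static"): ("decrease", "decrease"),
    ("PMOS gate-to-source", "falling", "static"): ("increase", "increase"),
    ("PMOS gate-to-source", "static", "rising"): ("0", "0"),
    ("PMOS gate-to-source", "static", "falling"): ("0", "increase"),
}


def increasing_patterns(short_type):
    """List (input on T_f, input on T_g, topology) that increase the delay."""
    hits = []
    for (kind, tf, tg), (par, ser) in TABLE_6.items():
        if kind != short_type:
            continue
        if par == "increase" and ser == "increase":
            hits.append((tf, tg, "any"))
        elif par == "increase":
            hits.append((tf, tg, "parallel"))
        elif ser == "increase":
            hits.append((tf, tg, "series"))
    return hits
```

For example, `increasing_patterns("NMOS gate-to-source")` yields the pattern with a rising input on $T_{f}$ and the pattern with a rising input on $T_{g}$ when the two transistors are in series.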

We list the best test vectors in Table 7. In the table we use "strong" and "weak" to indicate the preferred driving strength, which is set by the input signals of the driving cell. For example, a strong pull-up driving strength for a NAND gate requires that all inputs be logic 0, while a weak pull-up driving strength requires that only one input be logic 0. Whether the strong or the weak driving strength is preferred depends on the input signals of the faulty cell, as well as on the structure of its transistor network. For example, for an NMOS gate-to-source short, we choose the weak driving strength when the input of $T_{f}$ is rising and the input of $T_{g}$ is static. However, when the input of $T_{f}$ is static, the input of $T_{g}$ is rising, and $T_{f}$ and $T_{g}$ are in parallel, we choose the strong driving strength.

TABLE 7. Vectors to cause the greatest increasing delay change.

| Short type | Input of $T_{f}$ | Input of $T_{g}$ | Driving strength |
| :--- | :--- | :--- | :--- |
| NMOS <br> gate-to- <br> source | Rising | Static | Weak |
|  | Static | Rising | $\operatorname{strong}\left(T_{f} \sim T_{g}\right)$ |
| NMOS <br> gate-to- <br> drain | Static | Rising | $\operatorname{strong}\left(T_{f} \sim T_{g}\right)$ |
| PMOS <br> gate-to- <br> drain | Static | Falling | Falling |
| Static static weak $\left(T_{f} \mid T_{g}\right)$ <br> PMOS <br> gate-to- <br> source Falling Falling <br>  Static Static <br> strong $\left(T_{f} \sim T_{g}\right)$   | Falling | Weak |  |

## 4. Test Performance Improvement

To evaluate the benefit of the proposed fault model in delay test, we designed a circuit-level delay fault simulator for both delay test and logic test. We ran experiments on ISCAS85 and ISCAS89 benchmark circuits using TSMC 180 nm 1.8 V technology. Shorts are assigned between gate and source or between gate and drain for each transistor in the circuit. For each run of the simulation, we assumed that only one short is present, and evaluated the short resistance that makes the circuit fail its logic or timing requirements. We generated the input signal pattern that causes the worst effect of the resistive short according to Table 7. Then we used the ATPG tool proposed in [49] and [50] to generate the critical path satisfying the input signal pattern.

We calculated the delay of the longest path through the faulty transistor in the short-free circuit, and denoted it as $D_{R b=\infty}$. A delay fault is considered to have occurred if the short causes a delay increase of more than $10 \%$ of $D_{R b=\infty}$. Therefore, the resistance value that causes a delay increase of $10 \%$ of $D_{R b=\infty}$ is the maximum resistance that a delay test is able to detect. For logic test, on the other hand, we only need to calculate the BTR value for each short, which is the maximum resistance at which the short causes a logic fault. Normally the BTR value is less than the resistance that causes a delay fault, which means the delay test detects more potential defects than the logic test. Using our fault model and a given resistance distribution, we can not only illustrate this fact, but also numerically evaluate how much the delay test coverage exceeds the logic test coverage, which cannot be done using previous fault models.

As an example, in Figure 41 we show the circuit failure distribution vs. the short resistance for the ISCAS89 benchmark circuit s1488. There are 3829 potential fault sites for gate oxide shorts in the circuit. A circuit is considered faulty if the test detects a fault on one fault site. For a given value of the short resistance, we perform delay test and logic test, and compute the percentage of faulty circuits, assuming all shorts are equally likely. In the figure, the X-axis indicates the short resistance being evaluated, and the Y-axis indicates the percentage of faulty circuits for that value of the short resistance. As expected, when the short resistance is large, most circuits are fault free, so there is little difference between the two tests. Similarly, when the resistance is low, most circuits have a functional fault, and so are detected by both tests. For any given value of the short resistance, we always found more faulty circuits in delay test. For example, for a short resistance of $1159 \Omega$, $44.7 \%$ of circuits are found faulty in delay test, while only $22.9 \%$ are found faulty in logic test. The difference, $21.8 \%$, is also the largest margin by which the delay test exceeds the logic test for this circuit. In Figure 42, we show similar simulation results for another ISCAS89 circuit, s38417, which has 101,489 potential fault sites.
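
The comparison between the two tests can be sketched in a few lines of Python. The sketch assumes that each fault site is summarized by two thresholds: the largest short resistance that still produces a delay fault, and the BTR value for the logic test. The function names and the sample thresholds are hypothetical and are not taken from the simulator or the experiments.

```python
def faulty_fraction(thresholds, r_short):
    """Fraction of fault sites at which a short of resistance r_short is detected.

    thresholds[i] is the largest resistance (ohms) still detected at site i.
    """
    detected = sum(1 for r_max in thresholds if r_short <= r_max)
    return detected / len(thresholds)


def coverage_gain(delay_thresholds, logic_thresholds, r_short):
    """Amount by which the delay test exceeds the logic test at r_short."""
    return (faulty_fraction(delay_thresholds, r_short)
            - faulty_fraction(logic_thresholds, r_short))


# Hypothetical thresholds for four fault sites (ohms); not experimental data.
delay_thresholds = [2000.0, 1500.0, 900.0, 400.0]
logic_thresholds = [1200.0, 800.0, 500.0, 300.0]
gain = coverage_gain(delay_thresholds, logic_thresholds, 1000.0)
```

Sweeping `r_short` over a range of resistances and plotting both fractions would give curves of the same kind as those in Figures 41 and 42.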


Figure 41. The circuit failure distribution vs. the short resistance for ISCAS89 circuit s1488.


Figure 42. The circuit failure distribution vs. the short resistance for ISCAS89 circuit s38417.

## IV. PARAMETRIC DELAY EVALUATION

## 1. ISCAS85/89 Benchmark Circuits

The ISCAS85/89 benchmark circuits were proposed in Brglez et al. [51] (10 combinational circuits) and in Brglez et al. [52] (26 sequential circuits). They were originally designed to evaluate the performance of logic test, and only structural-level netlists are provided. However, for delay test and timing analysis in nano-scale technologies, more realistic timing information is needed. In particular, to study the impact of process variation on circuit delays, layout and parasitic information must be provided.

In this work, we developed a standard cell library using TSMC 180nm technology, and generated layout and parasitic information for ISCAS85/89 circuits.

## A. Standard Cell Library

The standard cell library is developed using the TSMC 180nm technology. The MOSIS DEEP rule SCN6M_SUBM (6 Metal, 1 Poly, $1.8 \mathrm{~V} / 3.3 \mathrm{~V}$, and $\lambda=100 \mathrm{~nm}$) is used. The library consists of 28 standard cells and contains all the cells used by the ISCAS benchmark circuits. The cells are listed in Table 8.

TABLE 8. Cell list in standard cell library.

| Cell Name | Function |
| :--- | :--- |
| buf_1 | Buffer, drive strength 1 |
| inv_1 | Inverter, drive strength 1 |
| and2_1 | 2-input AND gate, drive strength 1 |
| and3_1 | 3-input AND gate, drive strength 1 |
| and4_1 | 4-input AND gate, drive strength 1 |
| and5_1 | 5-input AND gate, drive strength 1 |
| and8_1 | 8-input AND gate, drive strength 1 |
| and9_1 | 9-input AND gate, drive strength 1 |
| nand2_1 | 2-input NAND gate, drive strength 1 |
| nand3_1 | 3-input NAND gate, drive strength 1 |
| nand4_1 | 4-input NAND gate, drive strength 1 |
| nand5_1 | 5-input NAND gate, drive strength 1 |
| nand8_1 | 8-input NAND gate, drive strength 1 |
| nand9_1 | 9-input NAND gate, drive strength 1 |
| or2_1 | 2-input OR gate, drive strength 1 |
| or3_1 | 3-input OR gate, drive strength 1 |
| or4_1 | 4-input OR gate, drive strength 1 |
| or5_1 | 5-input OR gate, drive strength 1 |
| or8_1 | 8-input OR gate, drive strength 1 |
| or9_1 | 9-input OR gate, drive strength 1 |
| nor2_1 | 2-input NOR gate, drive strength 1 |
| nor3_1 | 3-input NOR gate, drive strength 1 |
| nor4_1 | 4-input NOR gate, drive strength 1 |
| nor5_1 | 5-input NOR gate, drive strength 1 |
| nor8_1 | 8-input NOR gate, drive strength 1 |
| nor9_1 | 9-input NOR gate, drive strength 1 |
| xor2_1 | 2-input XOR gate, drive strength 1 |
| Dff | D flip-flop, drive strength 1 |

For each cell, the transistor gate length is $2 \lambda$. The transistor width varies according to drive strength. For the lowest drive strength, the widths are $7 \lambda$ (NMOS) and $14 \lambda$ (PMOS); a higher drive strength may use $14 \lambda$ (NMOS) and $28 \lambda$ (PMOS). The ratio $W_{p} / W_{n}=2$ holds for every primitive gate. The layout of each cell is provided in GDSII format and LEF format. The current version of the standard cell library contains only drive strength 1 cells. It can be expanded to contain more standard cells with different drive strengths.

## B. Delay Table

For each standard cell, we build a two-dimensional delay table using SPICE simulation. One dimension is the input signal slew, sampled at 9 unevenly spaced points in the range $20 \mathrm{ps}$ to $1000 \mathrm{ps}$. The other is the output load, sampled at 9 unevenly spaced points in the range $5 \mathrm{fF}$ to $500 \mathrm{fF}$. The ranges of the input slew and the output load are chosen according to the parasitic information from the layout of the ISCAS circuits.
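
As an illustration of how such a table might be queried between sample points, the following sketch uses bilinear interpolation over unevenly spaced axes. The interpolation scheme, the function names and the small 3x3 table are assumptions made for illustration; the text does not specify how intermediate values are obtained.

```python
from bisect import bisect_right


def _cell(axis, v):
    """Index i of the table cell [axis[i], axis[i+1]] that contains v."""
    i = bisect_right(axis, v) - 1
    return max(0, min(i, len(axis) - 2))


def lookup_delay(slews, loads, table, s, c):
    """Bilinear interpolation of table[i][j] = delay(slews[i], loads[j])."""
    # Clamp the query to the characterized range instead of extrapolating.
    s = min(max(s, slews[0]), slews[-1])
    c = min(max(c, loads[0]), loads[-1])
    i, j = _cell(slews, s), _cell(loads, c)
    ts = (s - slews[i]) / (slews[i + 1] - slews[i])
    tc = (c - loads[j]) / (loads[j + 1] - loads[j])
    low = table[i][j] * (1 - tc) + table[i][j + 1] * tc
    high = table[i + 1][j] * (1 - tc) + table[i + 1][j + 1] * tc
    return low * (1 - ts) + high * ts


# Hypothetical 3x3 table (ps); the tables in the text use 9 samples per axis.
slews = [20.0, 100.0, 1000.0]   # ps
loads = [5.0, 50.0, 500.0]      # fF
table = [[30.0, 60.0, 300.0],
         [40.0, 70.0, 320.0],
         [90.0, 130.0, 400.0]]
d = lookup_delay(slews, loads, table, 60.0, 27.5)
```

Clamping out-of-range queries is a conservative design choice for this sketch; a production timing tool may extrapolate or report an error instead.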

The table is provided both in the ".lib" format used by the Synopsys standard timing format and in the ".tlf" format used by the Cadence timing library format.

## C. Layout and Parasitic Information

Based on the standard cell library, we generate the layout of each circuit using Cadence Silicon Ensemble. The Cadence Hyper-extractor, a 2.5D parasitic extraction tool, is used to generate distributed interconnect parasitic information. The coupling capacitance between interconnects is generated using another parasitic extraction tool, Synopsys Arcadia. The input capacitance of each standard cell is pre-characterized in the library.

## 2. Linear Delay Modeling

There are many forms of process variation; see, for example, Nassif [16] and Stine et al. [39]. In this dissertation, we consider systematic process variation, such as variation in gate length and variation in the metal width, metal thickness, and inter-layer-dielectric (ILD) thickness of each interconnect layer. Our methods can be extended to include other sources of variation, such as the threshold voltage, the supply voltage and the temperature, as long as the delay can be approximated as a linear function of the process variables within their variation ranges.

In order to calculate the path delay under process variation, we first compute the buffer-to-buffer delay. The buffer-to-buffer delay is defined as the delay from the input pin of a cell to the input pin of a downstream cell. After each buffer-to-buffer delay in the circuit is computed, the delay of any path can be easily obtained by adding up buffer-to-buffer delays along the path.

We approximate the buffer-to-buffer delay as a linear function of process variables:

$$
\begin{equation*}
d(\mathbf{x}, s) \approx d_{0}(s)+b_{1}(s) x_{1}+b_{2}(s) x_{2}+\ldots+b_{p}(s) x_{p} \tag{4.1}
\end{equation*}
$$

where $d_{0}(s)$ is the nominal delay, $\mathbf{x}=\left(x_{1}, x_{2}, \ldots, x_{p}\right)$ is the vector of process variables, each representing the deviation from its nominal value, $s$ is the input signal slew, and $b_{i}(s)=\partial d / \partial x_{i}$ is the delay sensitivity to process variable $x_{i}$. We assume both the nominal delay and the delay sensitivities are functions of the input signal slew $s$.
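
A minimal sketch of how the linear model can be evaluated and accumulated along a path is given below. It assumes the slew of each segment has already been fixed, so that $d_{0}$ and the $b_{i}$ are plain numbers per segment; the function names and the sample coefficients are hypothetical.

```python
def segment_delay(d0, b, x):
    """Evaluate d0 + b1*x1 + ... + bp*xp for one buffer-to-buffer segment."""
    return d0 + sum(bi * xi for bi, xi in zip(b, x))


def path_model(segments):
    """Sum per-segment models (d0, b) into one path-level model (D0, B)."""
    p = len(segments[0][1])
    d0_path = sum(d0 for d0, _ in segments)
    b_path = [sum(b[i] for _, b in segments) for i in range(p)]
    return d0_path, b_path


# Hypothetical segments with p = 2 process variables (delays in ps).
segments = [(100.0, [2.0, -1.0]), (150.0, [0.5, 3.0])]
d0_path, b_path = path_model(segments)
x = [0.5, -0.2]
path_delay = segment_delay(d0_path, b_path, x)
```

Because the model is linear, summing the coefficients first and evaluating once gives the same result as evaluating every segment and adding the delays.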

The validity of the linear model is supported by extensive simulation. We performed multiple parasitic extractions and SPICE simulations under different process conditions. We found that, for any single process variable, its effect on delay is approximately linear within its small variation range. In Figure 43 we show the SPICE simulation results on a buffer-to-buffer segment in the circuit for several typical process variables. Each variable changes within its typical range (metal width $\pm 5 \%$, metal thickness $\pm 20 \%$, ILD thickness $\pm 40 \%$, and gate length $\pm 5 \%$). In addition, since the systematic process variables under consideration are determined at different stages of the manufacturing process, we can assume they are independent of each other. Furthermore, within the small variation range of each variable, the effects of the variables are additive. For example, consider width variation on metal 2 and metal 3. We denote the delay variation under metal 2 width variation alone and under metal 3 width variation alone as $\Delta d_{w 1}$ and $\Delta d_{w 2}$ respectively, and denote the delay variation when both variations occur together as $\Delta d_{w 1+w 2}$. The width changes on both metal 2 and metal 3 are $5 \%$ of the nominal metal width. We then use $\Delta d_{w 1}+\Delta d_{w 2}$ to approximate $\Delta d_{w 1+w 2}$. In Figure 44 we show the error distribution over 160 buffer-to-buffer delays in circuit c432. From the figure, for most buffer-to-buffer segments the error is small. Because of layer overlapping between metal 2 and metal 3 in the layout, the error exceeds $20 \%$ for a few buffer-to-buffer segments. It would be interesting to study more complex models that compensate for such segments while keeping the additive property. Nevertheless, for most buffer-to-buffer segments the effects of metal 2 and metal 3 width variation can be considered additive. Similar results are found for other process variables.


Figure 43. Delay variations due to process variation are linear in SPICE simulation. The x-axis indicates process variation and the y-axis indicates the percentage deviation from the nominal delay.


Figure 44. The delay effect of process variation is additive, which is demonstrated by the error distribution of approximating $\Delta d_{w 1+w 2}$ with $\Delta d_{w 1}+\Delta d_{w 2}$ over 160 buffer-to-buffer segments in circuit c432.

The effect of signal slew has been studied in previous research, for example, in variational delay evaluation [16] and in static timing analysis [53]. We assume the effect of process variation on output signal slew is small and propagate signal slews under the nominal process condition. The computation of the nominal delay and signal slew can be done by any commercial tool, and is not the focus of this work. The key issue is to efficiently compute the delay sensitivities $b_{1}, b_{2}, \ldots, b_{p}$.

A buffer-to-buffer segment in a circuit is represented by a cell driving an RC circuit, which consists of distributed $R_{1}, \ldots, R_{n}$ and distributed $C_{1}, \ldots, C_{n}$ on interconnect and sink capacitance $C_{s}$ for each downstream cell. The RC circuit can be a tree-like structure or a path-like structure. Parasitic RCs are generated by commercial parasitic extraction tools, and each pair of parasitic $\left(R_{i}, C_{i}\right)$ is related to one metal segment or a contact/via on interconnect.

## A. Computation on RC Variations

The sink capacitance $C_{s}$ is only related to device parameters of the downstream cell. In this work, we ignore the variation of $C_{s}$. Thus $\partial C_{s} / \partial x_{i}$ is zero for all process variables.

Parasitic RCs on interconnect vary under different process conditions. The value of $\partial R_{j} / \partial x_{i}$ can be easily derived from the basic resistance formula $R=\rho L /(W T)$, where $\rho$ is the resistivity, and $L$, $W$ and $T$ are the length, width and thickness of the metal segment respectively.
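
For instance, assuming the process variables of interest are the absolute wire width $W$ and thickness $T$ of the layer that carries the segment (an illustrative choice of variables), differentiating the formula gives:

$$
\begin{equation*}
\frac{\partial R}{\partial W}=-\frac{\rho L}{W^{2} T}=-\frac{R}{W}, \qquad \frac{\partial R}{\partial T}=-\frac{\rho L}{W T^{2}}=-\frac{R}{T}
\end{equation*}
$$

so, to first order, a small relative increase in width or thickness lowers the segment resistance by the same relative amount. Variables that do not appear in the formula, such as the ILD thickness, contribute zero sensitivity to the resistance under this model.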

However, it is more difficult to compute $\partial C_{j} / \partial x_{i}$, because the parasitic capacitance of a metal wire depends not only on the wire itself but also on its neighboring wires. Formula-based methods for parasitic extraction are no longer used and have been replaced by more accurate $2.5 \mathrm{D} / 3 \mathrm{D}$ tools. For these tools, there is no explicit capacitance formula we can use. To make our method widely applicable to different design flows, the computation of $\partial C_{j} / \partial x_{i}$ must be independent of any particular parasitic extraction tool. At the same time, we need to avoid repeating the extraction on the whole circuit for each process variable.

To get $\partial C_{j} / \partial x_{i}$ for any process variable $x_{i}$ efficiently and accurately under any complex neighboring condition, we introduce the concept of unit capacitance variation $u_{i k}$, which is an estimate of the relative variation of parasitic capacitance on metal $k$ with respect to process variable $x_{i}$. In practice, we randomly choose $n$ parasitic capacitances on metal $k$ in a circuit, and calculate $u_{i k}$ by:

$$
\begin{equation*}
u_{i k}=\frac{1}{n} \sum_{j} \frac{\Delta C_{j k} / \Delta x_{i}}{C_{j k}} \tag{4.2}
\end{equation*}
$$

where $\Delta x_{i}$ is a small change of process variable $x_{i}, C_{j k}$ indicates a parasitic capacitance on metal $k$ under the nominal condition, and $\Delta C_{j k}$ is the variation of $C_{j k}$ due to $\Delta x_{i}$.

For a given process technology, the value of $\left(\Delta C_{j k} / \Delta x_{i}\right) / C_{j k}$ falls within a considerably small range. In Figure 45 we show the distribution of $\left(\Delta C_{j k} / \Delta x_{i}\right) / C_{j k}$ due to the wire width variation on metal 2 in ISCAS85 circuit c432 for 406 sample capacitances. From the figure, we can see that for most parasitic capacitances $C_{j k}$, the value of $\left(\Delta C_{j k} / \Delta x_{i}\right) / C_{j k}$ is around 0.61 with small deviations.


Figure 45. The distribution of $\left(\Delta C_{j k} / \Delta x_{i}\right) / C_{j k}$ due to metal 2 width variation on 406 samples in ISCAS85 circuit c432. The x-axis indicates values of $\left(\Delta C_{j k} / \Delta x_{i}\right) / C_{j k}$.

The unit capacitance variation $u_{i k}$ is pre-computed for each metal layer with respect to each interconnect process variable, and is used to estimate the variation of any parasitic capacitance on metal $k$ under a small change of $x_{i}$. For any $C_{j}$ on metal $k$, we have:

$$
\begin{equation*}
\partial C_{j} / \partial x_{i}=u_{i k} C_{j} . \tag{4.3}
\end{equation*}
$$
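
Formulas (4.2) and (4.3) can be sketched as follows. The sketch assumes that each sampled capacitance has been extracted twice, once under the nominal condition and once after the small change $\Delta x_{i}$; the function names and the sample values are hypothetical.

```python
def unit_cap_variation(samples, dx):
    """Average relative capacitance change per unit of process variable, eq. (4.2).

    samples: list of (c_nominal, c_varied) pairs for capacitances on one metal layer.
    dx: the small change applied to the process variable.
    """
    return sum((c_var - c_nom) / dx / c_nom for c_nom, c_var in samples) / len(samples)


def cap_sensitivity(u_ik, c_j):
    """Estimate dC_j/dx_i for a capacitance on the same layer, eq. (4.3)."""
    return u_ik * c_j


# Hypothetical sample: three capacitances (fF) re-extracted after dx = 0.05.
samples = [(10.0, 10.3), (20.0, 20.62), (5.0, 5.155)]
u = unit_cap_variation(samples, dx=0.05)
dC = cap_sensitivity(u, 8.0)
```

Only the sampled capacitances need a second extraction; every other capacitance on the layer reuses the pre-computed `u`.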

## B. Computation on Delay Sensitivity

We assume a $k$-factor table of delay with respect to input slew $s$ and load $C_{L}$ is given. If such a table is not available, we construct one using existing technology. The delay table under the nominal gate length is called the nominal table. We then build another $k$-factor table with the same indices as the first table, where each entry is the delay under a small change of gate length. The change of gate length $\Delta L_{g}$ is $3 \%$ of the nominal gate length in our experiments. This table is called the variational table.

We apply two delay models in the delay sensitivity computation. One is the lumped C delay model, and the other is the effective capacitance delay model.

## i. Lumped C Delay Model

In the lumped C delay model, all parasitic resistances on the interconnect are removed, and all parasitic capacitances and the sink capacitances are lumped into one single load capacitance $C_{L}=\sum C_{j}$. We then look up the nominal table and obtain the delay $d$ according to $s$ and $C_{L}$.

For the delay sensitivity to gate length variation, we look up the variational table according to $s$ and $C_{L}$, and obtain the delay $d^{\prime}$. Then we calculate $\partial d / \partial x_{i}=\partial d / \partial L_{g}=\left(d^{\prime}-d\right) / \Delta L_{g}$.

For the delay sensitivity to an interconnect process variable $x_{i}$, we first calculate the variation of $C_{L}$ under a small change of $x_{i}$ as $\Delta C_{L}=\Delta x_{i} \cdot \sum\left(u_{i k} \cdot C_{j}\right)$, where $\Delta x_{i}$ is $5 \%$ of the nominal value of $x_{i}$. Since the gate length is unchanged, we then look up the nominal table and obtain the delay $d^{\prime}$ according to $s$ and $C_{L}+\Delta C_{L}$. Finally, we calculate $\partial d / \partial x_{i}=\left(d^{\prime}-d\right) / \Delta x_{i}$.
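
The two lumped C sensitivities can be sketched as below. The sketch uses the forward difference $\left(d^{\prime}-d\right) / \Delta x$, so that a positive sensitivity means the delay grows with the variable, and it assumes the interconnect case keeps the nominal gate length. The two tables are stood in for by hypothetical callables `nominal` and `variational`; all names and numbers are illustrative.

```python
def lumped_c_sensitivities(nominal, variational, s, caps, u, dLg, dx):
    """Finite-difference delay sensitivities under the lumped C model.

    nominal, variational: callables (slew, load) -> delay, standing in for
        the two k-factor tables.
    caps: list of (C_j, k) pairs, k being the metal layer (or sink) of C_j.
    u: dict mapping k -> unit capacitance variation u_ik for variable x_i.
    Returns (dd/dLg, dd/dx_i).
    """
    c_load = sum(c for c, _ in caps)
    d = nominal(s, c_load)
    # Gate length: same load, delay read from the variational table.
    sens_lg = (variational(s, c_load) - d) / dLg
    # Interconnect variable: perturbed load, gate length unchanged.
    delta_c = dx * sum(u[k] * c for c, k in caps)
    sens_x = (nominal(s, c_load + delta_c) - d) / dx
    return sens_lg, sens_x


def nominal(s, c):
    """Toy stand-in for the nominal table (ps); not characterized data."""
    return 20.0 + 0.1 * s + 2.0 * c


def variational(s, c):
    """Toy stand-in for the table under the gate length change."""
    return 1.03 * nominal(s, c)


caps = [(10.0, "M2"), (15.0, "M3"), (5.0, "sink")]
u = {"M2": 0.6, "M3": 0.4, "sink": 0.0}   # sink variation is ignored
sens_lg, sens_x = lumped_c_sensitivities(nominal, variational, 50.0, caps, u,
                                         dLg=0.03, dx=0.05)
```
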

## ii. Effective Capacitance Delay Model

For each buffer-to-buffer segment in the circuit, the effective capacitance $C_{\text{eff}}$ rather than the lumped capacitance $C_{L}$ is used to look up the table, so that the resistive shielding effect of the interconnect is modeled more accurately. Several effective capacitance methods can be used, such as the iterative method [46] and the non-iterative methods [54][55]. For speed, we use a non-iterative method here. The method proposed in [55] is intended for RC interconnect and is difficult to apply to a buffer-to-buffer segment. Thus we apply the method proposed in [54], which evaluates the effective capacitance by matching the delay of a cell driving a $\Pi$ load with the delay of a cell driving a single effective capacitance load. The delay under $s$ and $C_{\text{eff}}$ is denoted as $d$.

For the delay sensitivity to gate length variation, we first compute the effective capacitance under the varied gate length. Under the gate length change $\Delta L_{g}$, the effective capacitance $C_{\text{eff}}^{\prime}$ is recalculated using the method proposed in [54]. Note that the $\Pi$ load does not change with $\Delta L_{g}$. Then we use $C_{\text{eff}}^{\prime}$ and $s$ to look up the variational table, and obtain the delay $d^{\prime}$. The delay sensitivity to gate length variation is then calculated by $\partial d / \partial x_{i}=\partial d / \partial L_{g}=\left(d^{\prime}-d\right) / \Delta L_{g}$.

For the delay sensitivity to interconnect process variables, we need the variational effective capacitance to look up the nominal table. Thus we have to compute the new effective capacitance $C_{\text{eff}}^{\prime}$ under process variation $\Delta x_{i}$. However, it is too costly to derive a new $\Pi$ load and compute $C_{\text{eff}}^{\prime}$ from it. Instead we use $\Delta C_{\text{eff}}=C_{\text{eff}} \cdot \Delta C_{L} / C_{L}$ to approximate the change of $C_{\text{eff}}$ due to $\Delta x_{i}$, where $C_{L}$ is the lumped capacitance and $\Delta C_{L}$ is the variation of $C_{L}$, calculated by $\Delta C_{L}=\Delta x_{i} \cdot \sum\left(u_{i k} \cdot C_{j}\right)$. Therefore $C_{\text{eff}}^{\prime}=C_{\text{eff}}+\Delta C_{\text{eff}}$ is used to look up the nominal table and obtain the delay $d^{\prime}$, and the delay sensitivity to $x_{i}$ is calculated by $\partial d / \partial x_{i}=\left(d^{\prime}-d\right) / \Delta x_{i}$.
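
The proportional scaling of the effective capacitance can be sketched as follows. The sketch uses the forward difference $\left(d^{\prime}-d\right) / \Delta x_{i}$ and takes $C_{\text{eff}}$, $C_{L}$ and $\Delta C_{L}$ as already computed inputs; the callable `nominal` is a hypothetical stand-in for the nominal table, and all numbers are illustrative.

```python
def ceff_sensitivity(nominal, s, c_eff, c_load, delta_c_load, dx):
    """Sensitivity to an interconnect variable with a proportionally scaled C_eff."""
    d = nominal(s, c_eff)
    # Approximate the change of C_eff by the relative change of the lumped load.
    delta_c_eff = c_eff * delta_c_load / c_load
    d_var = nominal(s, c_eff + delta_c_eff)
    return (d_var - d) / dx


def nominal(s, c):
    """Toy stand-in for the nominal table (ps); not characterized data."""
    return 20.0 + 0.1 * s + 2.0 * c


sens = ceff_sensitivity(nominal, s=50.0, c_eff=24.0, c_load=30.0,
                        delta_c_load=0.6, dx=0.05)
```

The scaling avoids deriving a new $\Pi$ load for every process variable, at the cost of assuming that shielding changes in proportion to the total load.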

## 3. Experimental Results

We apply our methods to ISCAS85 circuits on a UNIX server running Solaris 2.7. The systematic process variables considered are the transistor gate length, the widths of 5 metal layers, the thicknesses of 5 metal layers and the thicknesses of 5 inter-layer dielectrics (ILD). We apply the following manufacturing ranges to these variables: gate length $\pm 6 \%$, metal width $\pm 5 \%$, metal thickness $\pm 20 \%$, and ILD thickness $\pm 40 \%$. The resulting range of delay variation is about $\pm 10 \%$ of the nominal delay.

We first show the running time comparison between the traditional RSM and the new methods in Table 9. For each circuit we run RSM and our new methods to generate the parametric delay model for all buffer-to-buffer segments in the circuit. RSM is implemented by SPICE simulation, with its running time listed in the third column. The path delay is computed by summing buffer-to-buffer delays. The running times of our methods are listed in the following columns. Compared to RSM, our methods achieve a significant speedup, reducing the running time from tens of minutes or hours to under two seconds. The method based on the lumped C delay model is roughly 1.4 to 5 times faster than the method based on the effective capacitance delay model, because the latter spends additional time computing $C_{\text{eff}}$.

## TABLE 9. Running time comparison between the traditional RSM and new methods for ISCAS85 circuits.

| Circuit | \# of buffer-to-buffer delays | Running time |  |  |
| :---: | :---: | :---: | :---: | :---: |
|  |  | RSM <br> (hh:mm) | New Methods (s) |  |
|  |  |  | Lumped C | Effective C |
| c432 | 343 | 0:41 | 0.014 | 0.020 |
| c499 | 440 | 1:03 | 0.017 | 0.026 |
| c880 | 755 | 1:30 | 0.014 | 0.053 |
| c1355 | 1096 | 2:13 | 0.044 | 0.084 |
| c1908 | 1523 | 2:48 | 0.075 | 0.304 |
| c2670 | 2292 | 4:19 | 0.108 | 0.456 |
| c3540 | 2961 | 5:39 | 0.143 | 0.466 |
| c5315 | 4509 | $>8 \mathrm{hr}$ | 0.196 | 0.785 |
| c6288 | 4832 | $>9 \mathrm{hr}$ | 0.200 | 0.846 |
| c7552 | 6253 | $>10 \mathrm{hr}$ | 0.308 | 1.600 |

To evaluate the accuracy of our methods, we apply RSM and our methods to the longest path of each circuit. Results are compared under the corner condition. In our experiments, the path delay under the nominal process condition, $d_{0}$, is computed by SPICE simulation. Under the corner condition, the parametric variational delay computed by the traditional RSM is denoted as $d^{\prime}$, and the parametric variational delay calculated by our method using function (4.1) is denoted as $d^{\prime \prime}$. Then the delay error under the corner condition is computed by $\left(d^{\prime \prime}-d^{\prime}\right) /\left(d_{0}+d^{\prime}\right)$. This value indicates how close the result of our method is to the result of RSM.
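
As a small worked example of the error metric, the following sketch evaluates it for made-up delays; the function name and the numbers are hypothetical and are not taken from Table 10.

```python
def corner_delay_error(d0, d_rsm, d_new):
    """Relative error (d'' - d') / (d0 + d') of the new method against RSM.

    d0: nominal path delay; d_rsm: variational delay from RSM (d');
    d_new: variational delay from the linear model (d'').
    """
    return (d_new - d_rsm) / (d0 + d_rsm)


# Hypothetical values in ps: the linear model underestimates the variation by 10 ps.
err = corner_delay_error(d0=900.0, d_rsm=100.0, d_new=90.0)
```

A negative value, as in this example, means the linear model predicts a smaller corner delay than RSM does.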

The results are shown in Table 10, where the number of cells in the longest path is listed in the second column, the path delay computed by RSM is listed in the third column and the delay variation under the worst-case corner condition is listed in the fourth column. From the table, we can conclude that the method based on the effective capacitance delay model is more accurate. Its delay error is less than $3 \%$, and for most circuits the error is around $1 \%$ of the path delay, whereas the delay error of the method based on the lumped C model is less than $5 \%$.

TABLE 10. Accuracy comparison between the traditional RSM and new methods for ISCAS85 circuits.

| Circuit | \# of cells in path | Worst case delay computed by RSM (ps) | Delay Var. (%) | Delay error under worst case corner (%) |  |
| :--- | ---: | ---: | ---: | ---: | ---: |
|  |  |  |  | Lumped C | Effective C |
| c432 | 17 | 698.5 | 9.83 | -4.38 | -0.01 |
| c499 | 11 | 464.6 | 9.54 | -2.09 | -0.60 |
| c880 | 24 | 530.3 | 10.37 | -2.00 | -0.36 |
| c1355 | 24 | 609.1 | 11.02 | -3.64 | -2.98 |
| c1908 | 40 | 724.5 | 10.83 | -2.46 | -1.55 |
| c2670 | 32 | 947.6 | 9.97 | -2.84 | -0.65 |
| c3540 | 47 | 1103.1 | 10.02 | -0.33 | -1.24 |
| c5315 | 49 | 994.5 | 9.53 | -2.50 | -1.15 |
| c6288 | 124 | 2853.4 | 10.28 | -0.11 | -1.71 |
| c7552 | 41 | 690.9 |  | -2.86 | -1.32 |

## V. LONGEST PATH SELECTION

## 1. Delay Test Using Longest Paths

## A. Delay Test Basics

Delay test of combinational circuits ensures that the signal from any primary input to any primary output propagates in less time than the system clock cycle time. A circuit is considered faulty if the delay of any path exceeds the specification. The delay increase due to a local defect, such as a resistive bridge or a resistive open, may cause a timing violation on a path through the defect, and can be modeled as a delay fault [15][32]. Such a delay increase is localized to a gate output or an interconnect wire in the circuit, and this location is called a local fault site in this work. Generally, the local delay fault is modeled as an additional delay $\Delta$ along the paths through the fault site.

Testing the longest path through the local fault site is the most likely way to capture the delay increase due to the fault. For example, in the combinational circuit of Figure 46, there are two paths $P_{1}$ and $P_{2}$ through a common local defect. Without the defect, the delay of $P_{2}$ is larger than that of $P_{1}$. With the additional delay caused by the local defect, the delay of $P_{2}$ is therefore more likely to exceed the timing specification $T_{\text{spec}}$. Hence a test on the longest path is the most likely to capture the delay increase due to the fault.


Figure 46. Test on the longer path is more likely to capture delay defect.
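
The argument can be made concrete with a small sketch: the smallest extra delay that a path can expose is its slack against the timing specification, so the longer path exposes smaller defects. The function names and the delays below are hypothetical.

```python
def min_detectable_delay(path_delay, t_spec):
    """Smallest extra delay at the fault site that makes this path violate t_spec."""
    return max(t_spec - path_delay, 0.0)


def detects(path_delay, t_spec, extra_delay):
    """True if the path fails the timing specification with the extra delay."""
    return path_delay + extra_delay > t_spec


# Hypothetical delays in ps: P2 is the longer path through the defect.
t_spec = 1000.0
p1_delay, p2_delay = 800.0, 950.0
extra = 100.0
p1_fails = detects(p1_delay, t_spec, extra)
p2_fails = detects(p2_delay, t_spec, extra)
```

With these numbers only the test on the longer path fails, because its slack (50 ps) is smaller than the extra delay while the slack of the shorter path (200 ps) is not.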

Modern delay optimization tools tend to make many paths critical or near critical [37], resulting in too many paths for test. Pruning some of the paths based on structural correlation and process variation correlation is an effective approach to reduce the number of paths. If two paths share some nets or gates, there is a structural correlation between them. Similarly, if two nets run on the same metal layer, there is a process correlation between them. Luong and Walker [27] proposed a pruning technique using both the structural correlation and the process correlation. As a result, they significantly reduced the number of paths. However, they only considered the longest paths for the entire circuit, instead of the longest paths through every local fault site. Furthermore, they did not consider interconnect delay. Tani et al. [38] considered the longest paths through every local fault site. They used a min-max comparison method that exploits the structural correlation but not the process correlation. As a result, their approach is overly pessimistic and produces too many paths. Liou et al. [28] used Monte Carlo simulation to select a set of critical paths that maximizes the probability of covering all critical paths under all process conditions. However, Monte Carlo simulation is very slow for large circuits and no running time is given for their method.

## B. Longest Path Redefined

When variations are not considered, there is only one path whose delay is the maximum in the combinational circuit, and the problem of finding the longest and testable paths that cover all local fault sites has been extensively studied [33][34][35].

When process variation is considered, the path delay becomes a function of process variables. Among all paths through a fault site, there are often multiple paths whose delay can be the maximum under different process conditions [36]. For each fault site $s$, we call a path longest for $s$ if the path has the maximum delay among all paths through $s$ under some process conditions. On the other hand, we call a path redundant for $s$ if the path can never be longest for $s$ under any process condition.

## C. Test All Longest Paths

Traditionally, tests are only performed on the longest paths under the nominal or worst-case process condition. However, this might be insufficient. As an example, in Figure 47 we show the delay of four paths through one common local fault site as a function of one process parameter $x$. Under the nominal process condition, the delay of path $P_{2}$ is the maximum. However, under the worst-case corners (min and max), the delays of $P_{3}$ and $P_{1}$ are the maximum respectively. Obviously, testing only the longest path under one particular process corner cannot maximize the fault coverage, because we do not know what the actual process condition is for a chip under test. On the other hand, it is inefficient to test all of these paths. For instance, a test on $P_{4}$ is wasteful because $P_{4}$ cannot be a longest path under any process condition. Therefore, testing all paths that are longest under some process condition, and only those paths, is the only way to achieve the fault coverage while minimizing the test cost.


Figure 47. Four path delays under one process parameter $x$.
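
For a single process parameter, the set of longest paths can be found exactly by checking the range endpoints and the pairwise crossing points of the delay lines, since the region where one line is the maximum is an interval bounded by such points. The sketch below illustrates this; the four lines are hypothetical coefficients chosen to mimic the situation described for Figure 47, not the data behind the figure.

```python
from itertools import combinations


def longest_paths_1d(paths, lo, hi, eps=1e-9):
    """Indices of paths that attain the maximum delay somewhere in [lo, hi].

    paths: list of (d0, d1) with delay d0 + d1 * x.
    """
    candidates = {lo, hi}
    for (a0, a1), (b0, b1) in combinations(paths, 2):
        if a1 != b1:
            x = (b0 - a0) / (a1 - b1)   # crossing point of the two lines
            if lo <= x <= hi:
                candidates.add(x)
    longest = set()
    for x in candidates:
        delays = [d0 + d1 * x for d0, d1 in paths]
        top = max(delays)
        longest.update(i for i, d in enumerate(delays) if d >= top - eps)
    return sorted(longest)


# Hypothetical lines for P1..P4 over x in [-1, 1] (arbitrary delay units).
paths = [(10.0, 3.0), (11.0, 0.0), (10.0, -3.0), (9.0, 0.5)]
result = longest_paths_1d(paths, -1.0, 1.0)
```

Here `result` contains the indices of $P_{1}$, $P_{2}$ and $P_{3}$, while $P_{4}$ is never the maximum and is redundant. This enumeration does not scale to many process variables, which is one motivation for the pruning heuristics that follow.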

In the following we present a new method to select longest paths for each local fault site in the circuit. To maximize fault coverage, we want to find as many longest paths as possible. On the other hand, to minimize test costs, we want to find as few paths as possible. Given a set of testable paths, our method first models the path delay as a linear function of process variation variables, then uses two pruning algorithms to remove paths that are redundant or almost redundant. We repeat the process for each fault site in the circuit, and the remaining paths are longest paths for delay test. Experiments on the ISCAS circuits show that the new method is efficient and significantly reduces the number of paths for test, compared to the previous best method. We consider process variations of devices and interconnect in our work, and the method can also be applied in path selection under operating variations of supply voltages and temperature [39].

## 2. Path Pruning Algorithms

Based on the linear delay model presented by PARADE, the delay of a path can be derived as a linear function by accumulating all buffer-to-buffer delays defined in (4.1) along the path:

$$
\begin{equation*}
D(\mathbf{x})=d_{0}+d_{1} x_{1}+d_{2} x_{2}+\cdots+d_{p} x_{p} \tag{5.1}
\end{equation*}
$$

where $d_{0}$ is the nominal path delay, $d_{1}, d_{2}, \ldots, d_{p}$ are the coefficients of the process variables, and $\mathbf{x}=\left(x_{1}, x_{2}, \ldots, x_{p}\right)$ is the vector of process variables, each representing the deviation from its nominal value.

Let $\boldsymbol{P}=\left\{P_{1}, P_{2}, \ldots, P_{n}\right\}$ be a set of testable paths through a local fault site in the circuit, and let the delay of each path $P_{i}$ be $D_{i}(\mathbf{x})=d_{i 0}+d_{i 1} x_{1}+\cdots+d_{i p} x_{p}$. The range of all process variation variables is defined as $\boldsymbol{G} \subset \mathfrak{R}^{p}$, where $\boldsymbol{G}=\left\{\left(x_{1}, \ldots, x_{p}\right) \mid l_{j} \leq x_{j} \leq h_{j}, j=1, \ldots, p\right\}$, and $l_{j}$ and $h_{j}$ are the lower and upper bounds of $x_{j}$ respectively. Then, any path $P_{q}$ is a longest path in $\boldsymbol{P}$ if and only if there exists $\mathbf{x}^{\prime} \in \boldsymbol{G}$ such that:

$$
\begin{equation*}
D_{q}\left(\mathbf{x}^{\prime}\right) \geq D_{1}\left(\mathbf{x}^{\prime}\right), D_{2}\left(\mathbf{x}^{\prime}\right), \ldots, D_{n}\left(\mathbf{x}^{\prime}\right) . \tag{5.2}
\end{equation*}
$$

Verifying whether the set of inequalities (5.2) can be satisfied is known as the feasibility problem of linear programming (LP). When the dimension $p$ is fixed, LP can be solved in $O(n)$ time [56]. However, the constant factor in the time complexity is exponential in the dimension $p$, resulting in high costs for large $p$. To reduce the running time for large $p$, we replace LP with two heuristics. Heuristic $H_{1}$ prunes redundant paths using less strict constraints, while Heuristic $H_{2}$ refines the output of Heuristic $H_{1}$ to further reduce the number of paths for delay test.

## A. Heuristic $\mathrm{H}_{1}$

To determine if path $P_{q}$ is longest, we define its rough domain of process variable $x_{k}$, with respect to path $P_{i}$ as:

$$
\begin{aligned}
& R_{k i}=\left[l_{k i}, h_{k i}\right], \text { where } \\
& l_{k i}=\min _{\mathbf{x} \in \mathbf{G}}\left\{x_{k} \mid D_{q}(\mathbf{x}) \geq D_{i}(\mathbf{x})\right\}, \\
& h_{k i}=\max _{\mathbf{x} \in \mathbf{G}}\left\{x_{k} \mid D_{q}(\mathbf{x}) \geq D_{i}(\mathbf{x})\right\} .
\end{aligned}
$$

Intuitively, $R_{k i}$ specifies the possible values of $x_{k}$ such that $P_{q}$ is at least as long as $P_{i}$. The computation of $l_{k i}$ and $h_{k i}$ is straightforward.

The heuristic is as follows:

$H_{1}$ : Prune redundant paths
Input: path set $\boldsymbol{P}$, process range $\boldsymbol{G}$
1 For each path $P_{q} \in \boldsymbol{P}$, do
2 $\quad$ For each process variable $x_{k}$, do
3 $\quad\quad$ Initialize rough domain $R_{k}=\left[l_{k}, h_{k}\right]$.
4 $\quad\quad$ For each path $P_{i}, i=1, \ldots, n, i \neq q$, do
5 $\quad\quad\quad$ Compute the rough domain $R_{k i}$.
6 $\quad\quad\quad$ Update $R_{k}=R_{k} \cap R_{k i}$.
7 $\quad\quad$ End
8 $\quad\quad$ If $R_{k}=\varnothing$, $P_{q}$ is "redundant" and pruned.
9 $\quad$ End
10 End

Heuristic $H_{1}$ prunes path $P_{q}$ if the intersection of the rough domains of any process variable for $P_{q}$ with respect to the other paths is empty. If the intersection is empty, there does not exist any $\mathbf{x} \in \boldsymbol{G}$ such that $P_{q}$ is the longest under process condition $\mathbf{x}$, so by definition $P_{q}$ is redundant. The worst-case time complexity of the heuristic is $O\left(n^{2} p^{2}\right)$, since there are $O\left(n^{2} p\right)$ rough domains and each takes $O(p)$ time to compute. Note that although Heuristic $H_{1}$ prunes a large number of redundant paths, some redundant paths may escape when they are shorter than a combination of other paths rather than any single path.
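The text states only that computing $l_{ki}$ and $h_{ki}$ is straightforward. The sketch below shows one possible closed form and should be read as an assumption rather than the dissertation's C implementation. Writing $D_{q}-D_{i}=c_{0}+\sum_{j} c_{j} x_{j}$, the other variables are set to the corner of $\boldsymbol{G}$ that is most favorable to $P_{q}$, and the remaining single-variable inequality is solved for $x_{k}$ in $O(p)$ time. All function names and the example paths are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Interval = Tuple[float, float]


def rough_domain(dq: Sequence[float], di: Sequence[float],
                 bounds: Sequence[Interval], k: int) -> Optional[Interval]:
    """Rough domain R_ki of variable x_k (0-based k), or None if it is empty.

    dq and di are [d0, d1, ..., dp]; bounds[j] is (l, h) for variable j.
    """
    c = [a - b for a, b in zip(dq, di)]          # coefficients of D_q - D_i
    # Best case for the remaining variables: each term at its maximum.
    m = c[0]
    for j, (lo, hi) in enumerate(bounds):
        if j != k:
            m += max(c[j + 1] * lo, c[j + 1] * hi)
    lo_k, hi_k = bounds[k]
    ck = c[k + 1]
    if ck > 0:                                   # need x_k >= -m / ck
        lo_k = max(lo_k, -m / ck)
    elif ck < 0:                                 # need x_k <= -m / ck
        hi_k = min(hi_k, -m / ck)
    elif m < 0:                                  # x_k has no influence
        return None
    return (lo_k, hi_k) if lo_k <= hi_k else None


def is_redundant(q: int, paths: Sequence[Sequence[float]],
                 bounds: Sequence[Interval]) -> bool:
    """H1 test: some variable has an empty intersection of rough domains."""
    for k, (lo, hi) in enumerate(bounds):
        for i, di in enumerate(paths):
            if i == q:
                continue
            r = rough_domain(paths[q], di, bounds, k)
            if r is None:
                return True
            lo, hi = max(lo, r[0]), min(hi, r[1])
            if lo > hi:
                return True
    return False


def prune_redundant(paths: Sequence[Sequence[float]],
                    bounds: Sequence[Interval]) -> List[int]:
    """Return the indexes of the paths that survive H1."""
    return [q for q in range(len(paths)) if not is_redundant(q, paths, bounds)]


# One process variable in [-1, 1]: D0 = 10 + x, D1 = 10 - x, D2 = 9.
paths = [[10, 1], [10, -1], [9, 0]]
kept = prune_redundant(paths, [(-1, 1)])         # [0, 1]
```

In this example the constant path can match the first path only at $x=-1$ and the second only at $x=1$, so its intersection is empty and it is pruned.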

## B. Heuristic $\mathrm{H}_{2}$

Among the longest paths, some paths are only slightly longer than others under every process condition. A path $P_{q}$ is called insignificant if there is a longest path $P_{i}$ such that their maximum delay difference is small:

$$
\max _{\mathbf{x} \in G}\left\{D_{q}(\mathbf{x})-D_{i}(\mathbf{x})\right\} \leq \varepsilon,
$$

where $\varepsilon$ is a user-specified threshold. If $\varepsilon$ is small, say $1 \%$ of the maximum nominal path delay, then testing $P_{q}$ after testing $P_{i}$ achieves little delay test coverage improvement. Therefore $P_{q}$ should be pruned. The following heuristic prunes insignificant paths.
$H_{2}$ : Prune insignificant paths
Input: path set $\boldsymbol{P}$, pre-specified threshold $\varepsilon$
1 For each path $P_{q} \in \boldsymbol{P}$, do
2 $\quad$ For any other path $P_{i} \neq P_{q}$, do
3 $\quad\quad$ If $\max _{\mathbf{x} \in \boldsymbol{G}}\left\{D_{q}(\mathbf{x})-D_{i}(\mathbf{x})\right\}<\varepsilon D_{i}(\mathbf{0})$
4 $\quad\quad\quad$ Prune $P_{q}$ as insignificant.
5 $\quad$ End
6 End

Heuristic $H_{2}$ compares the delay difference between each pair of paths under the worst-case process corners. The time complexity is $O\left(n^{2} p\right)$. We perform $H_{2}$ on the output of $H_{1}$, and the remaining paths are kept for delay test.
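The worst-case corner comparison can be sketched as follows. Because $D_{q}-D_{i}$ is linear and $\boldsymbol{G}$ is a box, its maximum is attained at a corner and can be computed one coefficient at a time in $O(p)$. One detail is an assumption of this sketch and is not stated in the listing: a path is compared only against paths that have not already been pruned, so that two nearly identical paths do not eliminate each other. The function names and example data are hypothetical.

```python
from typing import List, Sequence, Tuple

Interval = Tuple[float, float]


def max_delay_gap(dq: Sequence[float], di: Sequence[float],
                  bounds: Sequence[Interval]) -> float:
    """Maximum of D_q(x) - D_i(x) over the box G, taken corner-wise."""
    c = [a - b for a, b in zip(dq, di)]
    return c[0] + sum(max(cj * lo, cj * hi)
                      for cj, (lo, hi) in zip(c[1:], bounds))


def prune_insignificant(paths: Sequence[Sequence[float]],
                        bounds: Sequence[Interval],
                        eps: float) -> List[int]:
    """Return the indexes of the paths that survive H2.

    P_q is pruned if, for some surviving P_i, the largest possible
    advantage of P_q over P_i is below eps times the nominal delay of P_i.
    """
    alive = [True] * len(paths)
    for q, dq in enumerate(paths):
        for i, di in enumerate(paths):
            if i == q or not alive[i]:
                continue                         # assumption: skip pruned paths
            if max_delay_gap(dq, di, bounds) < eps * di[0]:
                alive[q] = False
                break
    return [q for q, ok in enumerate(alive) if ok]


# Paths 0 and 2 are nearly identical; path 1 depends on a different variable.
paths = [[1000, 50, 0], [1000, 0, 50], [999, 51, 0]]
box = [(-1, 1), (-1, 1)]
kept = prune_insignificant(paths, box, eps=0.01)  # [1, 2]
```

With this visiting order the first path of a near-identical pair is the one pruned. Iterating in the opposite order would keep the nominally longer path instead; the listing does not specify which is intended.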

## 3. Longest Path Generation

Given a set of testable paths, we generate the set of longest paths by pruning redundant paths from it. Testable paths are generated by the algorithm in [49] and [50] and are ranked in the order of non-increasing nominal delays. The path with the largest nominal delay has index 0, the path with the second largest nominal delay has index 1, and so on. For each fault site, we first request a batch of $K$ longest paths, indexed from 0 to $K-1$. Then the path pruning algorithms are applied to prune all redundant paths. Finally, the probability that a path in the next batch could be longest is estimated. If the probability is less than a specified threshold, for example $0.1 \%$, the procedure stops. Otherwise, we request the next batch of paths from the path generator, indexed from $K$ to $2 K-1$. The above procedure is repeated until the stop criterion is satisfied. The flowchart of longest path generation is shown in Figure 48.


Figure 48. Flowchart of longest path generation.

To estimate the probability that longest paths could exist in the next batch of paths, we consider the distribution of the already generated longest paths versus path indexes for all fault sites. Let $f(k)$ be the percentage of fault sites where the path with index $k$ is a longest path. Because of the non-increasing order of the nominal path delay and path delay correlations, paths with greater indexes are less likely to be longest paths. Thus, the value of $f(k)$ decreases as the index $k$ increases, and can be modeled by a rational function:

$$
\begin{equation*}
f(k)=\frac{1}{1+a k+b k^{2}} \tag{5.3}
\end{equation*}
$$

where parameters $a$ and $b$ can be computed by performing curve fitting on the distribution of already generated longest paths. Let $l$ be the index of the current path batch. We estimate the maximum percentage in the next batch of paths as $f(l \cdot K)$. If the value of $f(l \cdot K)$ is greater than $0.1 \%$, we consider the next batch of paths and recalculate $f(k)$. Otherwise, the procedure stops. Although the parameters of $f(k)$ change when a new batch is considered, experiments show that they vary little when a proper $K$ is used. As an example, we show the actual longest path distribution versus path indexes in Figure 49 for an ISCAS85 circuit, where the x-axis indicates the path index and the y-axis indicates the percentage of longest paths. The batch size $K$ is 10 and four batches are used in the whole procedure. In Table 11 we show the parameters of $f(k)$ each time a new path batch is considered, together with the estimated and the actual maximum percentage of longest paths in the next path batch.


Figure 49. Distribution of longest paths versus path indexes.

TABLE 11. Parameters of $f(k)$ and the estimated and the actual maximum percentage of longest paths.

| Index of <br> batch $(l)$ | $a$ | $b$ | $f(k)$ at $k=l \cdot K$ | Maximum percentage <br> in batch $l+1$ |
| :--- | :--- | :--- | :--- | :--- |
| 1 | -0.1959 | 0.7446 | $1.36 \%$ | $2.69 \%$ |
| 2 | -0.2060 | 0.7534 | $0.34 \%$ | $0.67 \%$ |
| 3 | -0.2063 | 0.7536 | $0.15 \%$ | $0.12 \%$ |
| 4 | -0.2067 | 0.7540 | $0.08 \%$ | $0.01 \%$ |
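As a quick check of (5.3) against Table 11, the following sketch evaluates $f(l \cdot K)$ with $K=10$ using the fitted parameters listed in the table and applies the $0.1 \%$ stop criterion. The curve fitting itself is not reproduced here, and the function names are hypothetical.

```python
def longest_path_fraction(k: float, a: float, b: float) -> float:
    """Model (5.3): f(k) = 1 / (1 + a*k + b*k^2), returned as a fraction."""
    return 1.0 / (1.0 + a * k + b * k * k)


def should_stop(batch_index: int, batch_size: int, a: float, b: float,
                threshold: float = 0.001) -> bool:
    """Stop unless the estimate f(l*K) is greater than the threshold."""
    return longest_path_fraction(batch_index * batch_size, a, b) <= threshold


# (l, a, b) taken from Table 11; batch size K = 10.
TABLE_11 = [
    (1, -0.1959, 0.7446),
    (2, -0.2060, 0.7534),
    (3, -0.2063, 0.7536),
    (4, -0.2067, 0.7540),
]

for l, a, b in TABLE_11:
    pct = 100.0 * longest_path_fraction(l * 10, a, b)
    print(f"l={l}: f(l*K) = {pct:.2f}%  stop={should_stop(l, 10, a, b)}")
# l=1: f(l*K) = 1.36%  stop=False
# l=2: f(l*K) = 0.34%  stop=False
# l=3: f(l*K) = 0.15%  stop=False
# l=4: f(l*K) = 0.08%  stop=True
```

The computed values reproduce the fourth column of Table 11, and the estimate first falls to $0.1 \%$ or below at $l=4$, which is consistent with four batches being used.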

Because a longest path through a fault site is very likely to be a longest path through another fault site in the circuit, a path collapsing procedure is performed to discard the shared paths among all fault sites in the circuit, after the longest path set of each fault site is generated. The procedure can be implemented in linear time in terms of the number of paths. The collapsed path set is the longest path set that covers all delay fault sites in the circuit and must be tested.
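A minimal sketch of the collapsing step is given below. It assumes each path is represented as a sequence of gate names, and it uses a hash set so that each path is examined once. The representation and the function name are hypothetical.

```python
from typing import Hashable, Iterable, List, Sequence, Tuple


def collapse_paths(
    per_site_paths: Iterable[Iterable[Sequence[Hashable]]]
) -> List[Tuple[Hashable, ...]]:
    """Merge the longest-path sets of all fault sites, dropping duplicates.

    The first occurrence of each path is kept, in the order encountered.
    """
    seen = set()
    collapsed = []
    for site_paths in per_site_paths:
        for path in site_paths:
            key = tuple(path)            # hashable form of the gate sequence
            if key not in seen:
                seen.add(key)
                collapsed.append(key)
    return collapsed


# Two fault sites sharing one longest path.
site_a = [("g1", "g4", "g7")]
site_b = [("g1", "g4", "g7"), ("g2", "g4", "g8")]
test_set = collapse_paths([site_a, site_b])
# [('g1', 'g4', 'g7'), ('g2', 'g4', 'g8')]
```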

## 4. Experimental Results

The experiments were performed on all ISCAS85 combinational circuits and the three largest ISCAS89 sequential circuits. We used Cadence Silicon Ensemble ${ }^{\mathrm{TM}}$ for circuit layout generation and parasitic extraction under TSMC 180 nm 1.8 V 5-metal technology. We implemented the heuristics in C on a 2.8 GHz Pentium 4 with 1 GB memory running Windows XP. The process variations considered are variations in transistor gate length, metal width, metal thickness and inter-layer-dielectric (ILD) thickness. There are a total of 16 variables for the 5-metal layer technology. The ranges of the process variation variables are as follows: gate length $\pm 5 \%$, metal width $\pm 5 \%$, metal thickness $\pm 20 \%$, and ILD thickness $\pm 40 \%$. Under such variation ranges, path delays vary within $\pm 10 \%$ in our experimental circuits.

We first compare the performance of our new method with the min-max method, which is the previous best method for the problem [38]. Considering path structural correlation, the min-max method first identifies shared gates between different paths and eliminates the delay of shared gates from path delays. Then min-max comparison is performed on remaining delays. In experiments of the min-max method, we used parameters $\alpha=1.0$ and $\beta=10 \%$ to achieve a $\pm 10 \%$ min-max delay range. In our new method, path structural correlation is implicitly considered in the formation of delay inequalities, and process correlation is handled by using the same set of variables in each delay function. Therefore, the new method is able to identify and prune more redundant paths than the min-max method does.

We assumed the output of each cell in ISCAS85 circuits to be a possible fault site. For sequential ISCAS89 circuits, we considered the combinational circuit between any pair of flip-flops and assumed there can be a delay fault at the output of the driving flip-flop, as well as at gate outputs. A path generator [49] [50] was used to provide critical and testable paths in batches of 50, where paths are indexed from 0 and sorted in the order of non-increasing nominal delay. For each batch we applied the path pruning heuristics and collected the remaining paths into the path set for test. Because of the non-increasing order of the nominal path delay, paths with greater indexes are less likely to be longest paths. Therefore, if most paths in a batch are pruned, e.g. 45 out of 50, the probability that the next batch contains a longest path is small and the procedure stops. The stopping threshold is user-defined, and we used $90 \%$ in experiments. Although it is possible that a longest path exists in the following batch, the number of "escaped" paths is very small if we stop when $90 \%$ or more are pruned in a batch of 50 paths. In the experiments, the number of paths in the second batch is at most $1.79 \%$ of all longest paths we selected, and no longest path is found in the third batch. The reason for this behavior is that path delay correlation is high enough that two paths of very different index are unlikely to both be longest and still pass heuristic $H_{2}$. Delay test coverage will not be significantly degraded if only a small number of longest paths escape, since these paths will be only slightly longer than the tested paths. Luong and Walker [27] used a similar batch-based method to decide when to stop global longest path generation.

The comparison between our method and the min-max method is shown in Table 12. In the table, column "\# of critical paths" indicates the total number of paths through each fault site within $20 \%$ of the nominally longest path delay. In column "paths for test" we list the total and the average number of longest paths for all fault sites in the circuit. The percentage of longest paths in the critical paths is shown in column "percentage". The running time is shown in column "time (s)", where the time for the path generator is not included. We do not list the result of the min-max method for circuit c6288 and larger circuits because the running time is more than several hours. As shown in the table, the number of longest paths selected by the new method is only $1 \%-6 \%$ of that selected by the min-max method. This indicates that only a small percentage of paths are actually longest when structural and process correlations are used, compared to just using structural correlation in the min-max approach. The maximum average number of paths to be tested by the new method is 4.4. That means only a few paths need to be tested for each fault site. In addition, the new method is 300-3000 times faster than the min-max method. This is because the min-max method takes too much time identifying shared gates among paths. We used LP to verify the results of the new method and found that only $5 \%$ of the selected paths are pruned. That indicates that our method achieves close to the minimal test set at much lower cost.

TABLE 12. Performance comparison between the min-max method and the new method.

| Circuit | \# of fault sites | \# of critical paths (top $20 \%$) | Min-max: paths for test (total) | Min-max: paths for test (avg.) | Min-max: percentage (\%) | Min-max: time (s) | New: paths for test (total) | New: paths for test (avg.) | New: percentage (\%) | New: time (s) |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| c432 | 140 | 9263 | 5946 | 42.5 | 64.2 | 21 | 249 | 1.8 | 2.7 | 0.06 |
| c499 | 202 | 5962 | 5384 | 26.7 | 90.3 | 47 | 204 | 1.0 | 3.4 | 0.09 |
| c880 | 383 | 19641 | 9340 | 24.4 | 47.6 | 237 | 399 | 1.0 | 2.0 | 0.16 |
| c1355 | 546 | 236771 | 185803 | 340.3 | 78.5 | 1070 | 598 | 1.1 | 0.2 | 0.30 |
| c1908 | 845 | 136530 | 93061 | 110.1 | 68.2 | 1414 | 868 | 1.0 | 0.6 | 0.49 |
| c2670 | 1246 | 80407 | 26095 | 20.9 | 32.5 | 1377 | 1253 | 1.0 | 1.6 | 0.70 |
| c3540 | 1629 | 92617 | 30785 | 18.9 | 33.2 | 3431 | 1636 | 1.0 | 1.8 | 1.05 |
| c5315 | 2278 | 129560 | 70394 | 30.9 | 54.3 | 2449 | 2312 | 1.0 | 1.8 | 1.19 |
| c7552 | 3434 | 180045 | 87822 | 25.6 | 48.8 | 1872 | 3483 | 1.0 | 1.9 | 3.14 |
| c6288 | 2384 | 760550 | N/A | N/A | N/A | $>3 \mathrm{hr}$ | 2384 | 1.0 | 0.3 | 6.35 |
| s35932 | 7491 | 151204 | N/A | N/A | N/A | N/A | 8364 | 1.1 | 5.5 | 2.40 |
| s38417 | 33418 | 987587 | N/A | N/A | N/A | N/A | 93593 | 2.8 | 9.5 | 23.91 |
| s38584 | 18664 | 147952 | N/A | N/A | N/A | N/A | 30238 | 1.6 | 20.4 | 3.71 |

In Table 13, we show the distribution of the path set size for all fault sites in the three largest circuits. The distribution shows that, for all fault sites in circuit s35932, no more than 3 paths must be tested. For circuit s38584, no more than 5 paths must be tested for $90 \%$ of the fault sites. In the worst case, circuit s38417, no more than 8 paths must be tested for $90 \%$ of the fault sites.

TABLE 13. Path set size distribution for all fault sites in three largest circuits.

| Circuit \ Set size | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | $>8$ |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| s35932 | 95.6 | 2.5 | 1.8 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| s38417 | 36.0 | 10.9 | 13.0 | 6.2 | 7.0 | 5.5 | 6.9 | 4.5 | 10.0 |
| s38584 | 44.1 | 26.6 | 14.1 | 5.1 | 1.8 | 0.4 | 1.8 | 0.6 | 5.5 |

The path selection distribution in circuit s38417 is shown in Figure 50, where the x-axis indicates the path index and the y-axis indicates the percentage of local fault sites where the indexed path is selected. The path with index 0 is the longest under the nominal process condition and is selected for all fault sites. With increasing path index, the percentage selected decreases, as a path with lower nominal delay is less likely to be longest. The distribution also shows that most of the longest paths are selected within the first batch of 50 paths, while no paths are selected in the second batch. For most fault sites, the path selection procedure stops at the second path batch.


Figure 50. Path distribution vs. path indexes in s38417 using the new method.

To compare the efficiency of the two methods, in Figure 51 and Figure 52 we show the path selection distribution in circuit c432 using the min-max method and the new method respectively. As shown in Figure 51, using the min-max method, the distribution goes to 0 after more than 150 paths, while for the new method the distribution goes to 0 after only 15 paths in Figure 52.


Figure 51. Path distribution vs. path indexes in c432 using the min-max method.


Figure 52. Path distribution vs. path indexes in c432 using the new method.

## VI. SUMMARY AND CONCLUSIONS

In this dissertation, we study three challenging issues for delay test in nano-scale VLSI circuits: fault modeling of resistive spot defects, variational delay evaluation, and path selection under process variation. We present our new solutions and show the improvement in experimental results.

The electrical behaviors of resistive spot defects are comprehensively analyzed. The defect is modeled as a functional fault or a delay fault according to the input signal patterns and the resistance. We derived closed-form expressions for the relationship between the delay change and the resistance. Based on the fault model, we are able to numerically compare the performance of different input vectors and choose the one that improves the fault coverage. The fault model is integrated into a circuit-level fault simulator, and results show the benefits of the delay test over the logic test.

To quickly compute the effects of process variation on circuit delays, we propose a linear delay model that incorporates the effect of process variations into a linear function. A fast parametric delay evaluation method, PARADE, is presented to compute the coefficients of the linear function. Our method avoids the multiple parasitic extractions and multiple delay evaluations required by the traditional RSM, and results in a significant speedup. The method based on the effective capacitance delay model achieves higher accuracy. Experiments on ISCAS85 circuits show that our methods are effective and accurate for parametric delay evaluation under process variation. In addition, our new estimation method for capacitance sensitivity computation is applicable to any commercial parasitic extraction tool.

On path selection for delay test under process variation, we present a novel and efficient method to find the set of longest paths. For the first time, we consider both path structural correlation and process correlation, and consider process variation in both devices and interconnect. Two heuristics are proposed to prune redundant paths and insignificant paths. Experimental results show the heuristics are very efficient and effective. Our method can significantly reduce the number of paths and test patterns for delay test, compared with the previous best method. Experiments on ISCAS circuits show that the new method reduces the number of paths for test to $1 \%-6 \%$ of the results using the min-max method [38], without decreasing the fault coverage in delay test. The significant reduction indicates that considering both structural correlation and process correlation is much more effective than considering path structural correlation alone. In addition, the new method runs 300-3000 times faster than the min-max method, mainly because the min-max method examines far more paths.

The work described above considers only die-to-die process variation. Systematic within-die variation, such as that computed by lithography simulation tools, can be incorporated into the delay model, as it only affects the delay equation coefficients. Random within-die variation will be incorporated into the model in the future. This requires the addition of more process variables and a spatial correlation structure. The lower path correlation will result in more paths selected for testing. But these test sets will still be significantly smaller than the min-max test sets, which assume only structural correlation between path delays.

## REFERENCES

[1] "2002 Update Tables", International Technology Roadmap for Semiconductors, 2002 Update, Semiconductor Industries Association: San Jose, CA, 2002, pp. 16150.
[2] C. F. Hawkins, J. M. Soden, A. W. Righter, F. J. Ferguson, "Defect Classes - An Overdue Paradigm for CMOS IC Testing", in Proc. IEEE International Test Conference, Washington DC, Oct. 1994, pp. 413-425.
[3] P. Banerjee and J. A. Abraham, "Fault Characterization of VLSI MOS Circuits", in Proc. IEEE International Conference on Circuits and Computers, New York, Sept. 1982, pp. 564-568.
[4] P. Banerjee and J. Abraham, "Generating Tests for Physical Failures in MOS Logic Circuits", in Proc. IEEE International Test Conference, Philadelphia, Oct. 1983, pp. 554-559.
[5] H. Hao and E. J. McCluskey, "Resistive Shorts Within CMOS Gates", in Proc. IEEE International Test Conference, Nashville, Oct. 1991, pp. 292-301.
[6] M. Renovell, P. Huc, and Y. Bertrand, "The Concept of Resistance Interval: A New Parametric Model for Realistic Resistive Bridging Fault", in Proc. VLSI Test Symposium, Princeton, NJ, Apr. 1995, pp. 184-189.
[7] R. Degraeve, B. Kaczer, A. de Keersgieter, G. Groeseneken, "Relation Between Breakdown Mode and Location in Short-channel NMOSFETs and Its Impact on Reliability Specifications", IEEE Trans. Dev. Mat. Rel., vol. 1, no. 3, pp. 163-169, Sept. 2001.
[8] D. Gaitonde and D. M. H. Walker, "Circuit-level Modeling of Spot Defects", in Proc. IEEE International Workshop on Defect and Fault Tolerance in VLSI Systems, Pittsburgh, Nov. 1991, pp. 63-66.
[9] J. Segura, C. D. Benito, A. Rubio and C. F. Hawkins, "A Detailed Analysis of GOS Defects in MOS Transistors: Testing Implications at Circuit Level", in Proc. IEEE International Test Conference, Washington DC, Oct. 1995, pp. 544-551.
[10] H. Hao and E. J. McCluskey, "On the Modeling and Testing of Gate Oxide Shorts in CMOS Logic Gates", in Proc. IEEE International Workshop on Defect and Fault Tolerance in VLSI Systems, Pittsburgh, Nov. 1991, pp. 161-174.
[11] V. R. Sar-Dessai and D. M. H. Walker, "Resistive Bridge Fault Modeling, Simulation and Test Generation", in Proc. IEEE International Test Conference, Atlantic City, NJ, Sept. 1999, pp. 596-605.
[12] W. Moore, G. Gronthoud, K. Baker and M. Lousberg, "Delay-fault Testing and Defects in Sub-micron ICs - Does Critical Resistance Really Mean Anything?", in Proc. IEEE International Test Conference, Atlantic City, NJ, Oct. 2000, pp. 95-104.
[13] W. Chuang and I. N. Hajj, "Fast Mixed-mode Simulation for Accurate MOS Bridging Fault Detection", in Proc. IEEE International Symposium on Circuits and Systems, Chicago, May 1993, pp. 1503-1506.
[14] D. Shaw, D. Al-Khalili and C. Rozon, "Accurate CMOS Bridge Fault Modeling with Neural Network-based VHDL Saboteurs", in Proc. International Conference on Computer-Aided Design, San Jose, CA, Nov. 2001, pp. 531-536.
[15] Z. Li, X. Lu, W. Qiu, W. Shi and D. M. H. Walker, "A Circuit Level Fault Model for Resistive Opens and Bridges", ACM Trans. on Design Automation of Electronic Systems, vol. 8, no. 4, pp. 546-559, 2003.
[16] S. R. Nassif, "Modeling and Analysis of Manufacturing Variations", in Proc. IEEE Custom Integrated Circuits Conference, San Diego, CA, May 2001, pp. 223-228.
[17] Y. Liu, S. R. Nassif, L. T. Pileggi and A. J. Strojwas, "Impact of Interconnect Variations on the Clock Skew of A Gigahertz Microprocessor", in Proc. ACM/IEEE Design Automation Conference, Los Angeles, Jun. 2000, pp. 168-171.
[18] V. Mehrotra, S. L. Sam, D. Boning, A. Chandrakasan, R. Vallishayee and S. Nassif, "A Methodology for Modeling the Effects of Systematic Within-die Interconnect and Device Variation on Circuit Performance", in Proc. ACM/IEEE Design Automation Conference, Los Angeles, Jun. 2000, pp. 172-175.
[19] E. Malavasi, S. Zanella, C. Min, J. Uschersohn, M. Misheloff and C. Guardiani, "Impact Analysis of Process Variability on Clock Skew", in Proc. International Symposium on Quality Electronic Design, San Jose, CA, Mar. 2002, pp. 129-132.
[20] R. B. Brashear, N. Menezes, C. Oh, L. T. Pillage and M. R. Mercer, "Predicting Circuit Performance Using Circuit-level Statistical Timing Analysis", in Proc. European Conference on Design Automation, Paris, Mar. 1994, pp. 332-337.
[21] H. Chang and S. S. Sapatnekar, "Statistical Timing Analysis Considering Spatial Correlations Using a Single PERT-like Traversal", in Proc. International Conference on Computer-Aided Design, San Jose, CA, Nov. 2003, pp. 621-625.
[22] A. Agarwal, D. Blaauw and V. Zolotov, "Statistical Timing Analysis for Intra-die Process Variations with Spatial Correlations", in Proc. International Conference on Computer-Aided Design, San Jose, CA, Nov. 2003, pp. 271-276.
[23] M. Orshansky, L. Milor, P. Chen, K. Keutzer and C. Hu, "Impact of Systematic Spatial Intra-chip Gate Length Variability on Performance of High-speed Digital Circuits", in Proc. International Conference on Computer-Aided Design, San Jose, CA, Nov. 2000, pp. 62-67.
[24] E. Acar, S. N. Nassif, L. Ying and L. T. Pileggi, "Assessment of True Worst Case Circuit Performance Under Interconnect Parameter Variations", in Proc. International Symposium on Quality Electronic Design, San Jose, CA, Mar. 2001, pp. 431-436.
[25] A. Gattiker, S. Nassif, R. Dinakar and C. Long, "Timing Yield Estimation from Static Timing Analysis", in Proc. International Symposium on Quality Electronic Design, San Jose, CA, Mar. 2001, pp. 437-442.
[26] S. Borkar, T. Karnik, S. Narendra, J. Tschanz, A. Keshavarzi and V. De, "Parameter Variations and Impact on Circuits and Microarchitecture", in Proc. ACM/IEEE Design Automation Conference, Chicago, Jun. 2003, pp. 338-342.
[27] G. M. Luong and D. M. H. Walker, "Test Generation for Global Delay Faults", in Proc. International Test Conference, Washington DC, Oct. 1996, pp. 433-442.
[28] J. J. Liou, A. Krstic, L. C. Wang and K. T. Cheng, "False-path-aware Statistical Timing Analysis and Efficient Path Selection for Delay Testing and Timing Validation", in Proc. ACM/IEEE Design Automation Conference, New Orleans, Jun. 2002, pp. 566-569.
[29] A. Krstic, L. C. Wang, K. T. Cheng and J. J. Liou, "Diagnosis of Delay Defects Using Statistical Timing Models", in Proc. IEEE VLSI Test Symposium, Napa, CA, Apr. 2003, pp. 339-344.
[30] X. Lu, Z. Li, W. Qiu, D. M. H. Walker and W. Shi, "Longest Path Selection for Delay Test Under Process Variation", in Proc. IEEE Asia South Pacific Design Automation Conference, Yokohama, Japan, Jan. 2004, pp. 98-103.
[31] A. D. Fabbro, B. Franzini, L. Croce and C. Guardiani, "An Assigned Probability Technique to Derive Realistic Worst-case Timing Models of Digital Standard Cells", in Proc. ACM/IEEE Design Automation Conference, San Francisco, Jun. 1995, pp. 702-706.
[32] R. R. Montanes, J. P. de Gyvez and P. Volf, "Resistance Characterization for Weak Open Defects", IEEE Design and Test of Computers, vol. 19, no. 5, pp. 18-25, Sept. 2002.
[33] W. N. Li, S. M. Reddy and S. K. Sahni, "On Path Selection in Combinational Logic Circuits", IEEE Trans. Computer-Aided Design, vol. 8, no. 1, pp. 56-63, Jan. 1989.
[34] Y. Shao, S. M. Reddy, I. Pomeranz and S. Kajihara, "On Selecting Testable Paths in Scan Designs", in Proc. IEEE European Test Workshop, Corfu, Greece, May 2002, pp. 53-58.
[35] M. Sharma and J. H. Patel, "Finding a Small Set of Longest Testable Paths that Cover Every Gate", in Proc. IEEE International Test Conference, Baltimore, Oct. 2002, pp. 974-982.
[36] M. Sivaraman and A. J. Strojwas, "Path Delay Fault Diagnosis and Coverage - A Metric and an Estimation Technique", IEEE Trans. Computer-Aided Design, vol. 20, no. 3, pp. 440-457, Mar. 2001.
[37] T. W. Williams, B. Underwood and M. R. Mercer, "The Interdependence Between Delay Optimization of Synthesized Networks and Testing", in Proc. ACM/IEEE Design Automation Conference, San Francisco, Jun. 1991, pp. 87-92.
[38] S. Tani, M. Teramoto, T. Fukazawa and K. Matsuhiro, "Efficient Path Selection for Delay Testing Based on Partial Path Evaluation", in Proc. IEEE VLSI Test Symposium, Princeton, NJ, Apr. 1998, pp. 188-193.
[39] B. Stine, D. Boning and J. Chung, "Analysis and Decomposition of Spatial Variation in Integrated Circuit Process and Devices", IEEE Trans. on Semiconductor Manufacturing, vol. 10, no. 1, pp. 24-41, 1997.
[40] L. Pillage and R. Rohrer, "Asymptotic Waveform Evaluation for Timing Analysis", IEEE Trans. Computer-Aided Design, vol. 9, no. 4, pp. 352-366, Apr. 1990.
[41] P. R. O'Brien and T. L. Savarino, "Modeling the Driving-point Characteristic of Resistive Interconnect for Accurate Delay Estimation", in Proc. International Conference on Computer-Aided Design, Santa Clara, CA, Nov. 1989, pp. 512-515.
[42] S. Irajpour, S. Nazarian, L. Wang, S. K. Gupta and M. A. Breuer, "Analyzing Crosstalk in the Presence of Weak Bridge Defects", in Proc. VLSI Test Symposium, Napa, CA, Apr. 2003, pp. 385-393.
[43] N. H. E. Weste and K. Eshraghian, "MOS Transistor Theory", in Principles of CMOS VLSI Design - A Systems Perspective, 2nd edition, Addison-Wesley Publishing Company: Boston, 1992, pp. 61-63.
[44] W. Qiu, X. Lu, Z. Li, D. M. H. Walker and W. Shi, "CodSim: A Combined Delay Fault Simulator", in Proc. International Symposium on Defect and Fault Tolerance in VLSI Systems, Boston, Nov. 2003, pp. 79-88.
[45] M. Spica, M. Tripp and R. Roeder, "A New Understanding of Bridge Defect Resistances and Process Interactions from Correlating Inductive Fault Analysis Predictions to Empirical Test Results", in Proc. International Workshop on Defect Based Testing, Los Angeles, Apr. 2001, pp. 11-16.
[46] J. Qian, S. Pullela and L. Pillage, "Modeling the 'Effective Capacitance' for the RC Interconnect of CMOS Gates", IEEE Trans. on Computer-Aided Design, vol. 13, no. 12, pp. 1526-1535, 1994.
[47] K. A. Bowman, B. L. Austin, J. C. Eble, X. Tang, and J. D. Meindl, "A Physical Alpha-Power Law MOSFET Model", IEEE Journal of Solid-State Circuits, vol. 34, no. 10, pp. 1410-1414, 1999.
[48] G. L. Smith, "Model the Delay Faults Based Upon Path", in Proc. IEEE International Test Conference, Philadelphia, Nov. 1985, pp. 342-349.
[49] W. Qiu and D. M. H. Walker, "An Efficient Algorithm for Finding the K Longest Testable Paths Through Each Gate in a Combinational Circuit", in Proc. IEEE International Test Conference, Charlotte, NC, Oct. 2003, pp. 592-601.
[50] W. Qiu, J. Wang, D. M. H. Walker, D. Reddy, X. Lu, Z. Li, W. Shi and H. Balachandran, "K Longest Paths per Gate (KLPG) Test Generation for Scan-Based Sequential Circuits", in Proc. IEEE International Test Conference, Charlotte, NC, Oct. 2004, pp. 223-231.
[51] F. Brglez and H. Fujiwara, "A Neutral Netlist of 10 Combinational Benchmark Circuits and a Target Translator in Fortran", in Proc. IEEE International Symposium on Circuits and Systems, Newport Beach, CA, Jun. 1985, pp. 663-698.
[52] F. Brglez, D. Bryan, and K. Kozminski, "Combinational Profiles of Sequential Benchmark Circuits", in Proc. IEEE International Symposium on Circuits and Systems, Portland, OR, May 1989, pp. 1929-1934.
[53] D. Blaauw, V. Zolotov and S. Sundareswaran, "Slope Propagation in Static Timing Analysis", IEEE Trans. on Computer-Aided Design, vol. 21, no. 10, pp. 1180-1195, 2002.
[54] A. B. Kahng and S. Muddu, "Improved Effective Capacitance Computations for Use in Logic and Layout Optimizations", in Proc. International Conference on VLSI Design, Goa, India, Jan. 1999, pp. 578-582.
[55] C. V. Kashyap, C. J. Alpert and A. Devgan, "An 'Effective' Capacitance Based Delay Metric for RC Interconnect", in Proc. International Conference on Computer-Aided Design, San Jose, CA, Nov. 2000, pp. 229-235.
[56] N. Megiddo, "Linear Programming in Linear Time When the Dimension Is Fixed", J. ACM, vol. 31, no. 1, pp. 114-127, Jan. 1984.

## VITA

Xiang Lu was born in Shi Yan City, Hu Bei, China. He completed his Bachelor's and Master's degrees at Xi'an Jiaotong University, Xi'an, China in July 1997 and June 2000, respectively. He then attended Texas A\&M University, College Station, TX as a graduate student in Computer Engineering and graduated with a Ph.D. degree in December 2005. His research interests are process variation effects on nano-scale VLSI circuits, delay test under process variation, and static/statistical timing analysis. He is now working at P.A. Semi, Inc., a chip design company in Santa Clara, CA. He can be reached by email at lu.shawn@gmail.com, or by mail care of Dr. Weiping Shi, Dept. of Electrical Engineering, Texas A\&M University, College Station, TX 77843.

