Enhancement and validation of a test technique for integrated circuits by El-Kafrouni, Roger
ECOLE DE TECHNOLOGIE SUPERIEURE 
UNIVERSITE DU QUEBEC 
THESIS PRESENTED TO 
ECOLE DE TECHNOLOGIE SUPERIEURE 
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR 
A MASTER'S DEGREE IN ELECTRICAL ENGINEERING 
M.Eng. 
BY 
Roger EL-KAFROUNI 
ENHANCEMENT AND VALIDATION OF A TEST TECHNIQUE FOR INTEGRATED 
CIRCUITS 
MONTREAL, MAY 10, 2010 
1 Copyright 2010 reserved by Roger El-Kafrouni 
BOARD O F EXAMINER S 
THIS THESIS HAS BEEN EVALUATED 
BY THE FOLLOWING BOARD OF EXAMINERS: 
M. Claude Thibeault, Thesis Supervisor 
Departement de genie electrique a I'Ecole de technologic superieure 
M. Ghyslain Gagnon, President of the Board of Examiners 
Departement de genie electrique a I'Ecole de technologic superieure 
M. Christopher Fuhrman, Member of the Board of Examiners 
Departement de genie logiciel et des TI a I'Ecole de technologic superieure 
THIS THESIS HAS BEEN PRESENTED AND DEFENDED 
BEFORE A BOARD OF EXAMINERS AND PUBLIC 
April 9, 2010 
AT ECOLE DE TECHNOLOGIE SUPERIEURE 
REMERCIEMENTS 
Je tiens tout d'abord a exprimer ma reconnaissance envers mon directeur de maitrise et 
professeur titulaire, Dr. Claude Thibeault, pour son appui et sa confiance envers moi durant 
mon projet de recherche. Sa methode pedagogique etait enrichissante et a rendu mon projet 
de plus en plus interessant. 
Je remercie aussi les etudiants du LACIME dont I'aide, le dynamisme et le travail 
remarquable ont contribue a creer une ambiance de travail cooperative et chaleureuse. 
Je remercie mes parents et mes deux freres Michel et Paul, qui m'ont donne un support infmi, 
beaucoup de passion et d'espoir pour arriver a completer ma maitrise. 
ACKNOWLEDGEMENTS 
First and foremost, I wish to express my sincere appreciation and gratitude to my supervisor. 
Dr. Claude Thibeault, for his guidance and encouragement during my master's work. 
I would like to thank the graduate students of LACIME; they have created a great 
cooperative and pleasant working environment. 
I would like to dedicate this work to my parents and my brothers Michel and Paul. Their 
support, encouragement and understanding have been monumental during the course of my 
education. Without them, I doubt I would have been successful in my academic work. 
AMELIORATION E T VALIDATION D'UN E TECHNIQUE D E TEST POU R 
CIRCUITS INTEGRE S 
Roger EL-KAFROUNI 
RESUME 
Ce memoire s'interesse a une approche de test recemment developpee a I'ETS. Cette 
approche, appelee methode de test dc delai sans capture (Capture-less  Delay  Testing,  CDT), 
a ete proposee comme technique complementaire aux approches plus traditionnelles de test 
visant a s'assurer que les circuits integres fonctionnent a la frequence prevue, afm d'ameliorer 
la couverture de test de ce type de test. CDT utilise entre autres des capteurs permettant de 
detecter la presence de transitions a des endroits strategiques. 
L'objectif de ce projet est d'ameliorer certains aspects de cette nouvelle approche. Dans un 
premier temps, nous allons analyser la distribution de delai des noeuds non couverts par les 
methodes traditionnelles de test, afin de developper la meilleure maniere de deployer les 
capteurs CDT. Nous presentons I'ensemble d'outils, utilisant le langage Perl, developpe a 
cette fin. Les resultats obtenus confirment que les chemins passant par les noeuds non 
couverts sont plus longs que ceux qui passent par les noeuds couverts. La difference entre les 
deux types de chemins represente plus de 20% de la periode d'horloge si Ton considere les 
delais des chemins les plus courts. 
Dans un deuxieme temps, nous proposons un algorithme entierement automatise qui permet, 
pendant les premieres etapes du processus de generation automatise des vecteurs de test: 1) 
d'identifier les noeuds non couverts, 2) d'identifier les emplacements des senseurs CDT sur 
les entrees des bascules afin d'ameliorer la couverture de test, et 3) de minimiser le nombre 
de senseurs selon le besoin. Nos resultats indiquent que lorsque nous appliquons CDT en 
complement aux methodes transitionnelles basees sur le modele de pannes de type transition 
nous pouvons augmenter la couverture de test de pres de 5%. De plus, ralgorithme de 
minimisation du nombre de senseurs de CDT permet de reduire dc plus de 85% le nombre de 
ces senseurs avec une perte de couverture minimale, en moyenne de 1.6%. 
Mots cles : circuits integres analogiques, generateur algorithmique de sequence de test, 
methode de test de delai sans capture, methode de test pour circuits integres. 
ENHANCEMENT AN D VALIDATION O F A TEST TECHNIQUE FO R 
INTEGRATED CIRCUIT S 
Roger EL-KAFROUNI 
ABSTRACT 
This thesis focuses on a scan-based delay testing technique that was recently developed at 
ETS. This new approach, called Captureless Delay Testing (CDT), has been proposed as a 
technique that complements traditional methods of test, ensuring the integrated circuits will 
function at their proposed clock speed, further improving the test coverage of the particular 
type of test. Furthermore, CDT incorporates the use of sensors enabling the detection of the 
presence of transitions at strategic locations. 
The purpose of this project is to improve on certain aspects of this novel technique. At first, 
we analyze the delay distribution of the non-covered nodes by traditional methods of test, in 
order to develop the best way possible of placement of the CDT sensors. We present, using 
Perl Language, the ensemble of tools developed for this purpose. The end results obtained 
confirm that the paths that pass through the non-covered nodes are longer than those that 
traverse the covered ones. The difference between the two types of paths exceeds 20%) of the 
clock period when considering the shorter path delay values. 
Secondly, we propose a fially automated algorithm that enables, at the earliest stages of the 
test vectors generation process: 1) the identification of the non-covered nodes, 2) the 
identification of the placements of the CDT sensors at the inputs of the flip-flops for further 
improvement of the test coverage, and 3) the minimization of the number of sensors with 
regards to requirements. Our results indicate that when we apply CDT on top of transition-
based fault model we can improve the test coverage by 5%. Moreover, the algorithm of CDT 
sensors minimization allows a reduction of more than 85% the number of those sensors with 
a minimal test coverage loss, on average of 1.6%. 
Keywords: analogue circuits, automatic test pattern generation, captureless delay testing, 
integrated circuit testing, low cost testing, scan-based test technique. 
TABLE O F CONTENT S 
Page 
CHAPTER 1 INTRODUCTION 1 
1.1 Motivations 1 
1.2 Thesis Outline 3 
1.3 Contribution 4 
CHAPTER 2 MANUFACTURING DEFECTS AND DETECTION 
MECHANISM 5 
2.1 Introduction 5 
2.2 IC catastrophic defects 6 
2.2.1 IC catastrophic defects detection 7 
2.3 IC parametric defects 8 
2.3.1 Resistive vias 8 
2.3.2 Metal mousebites 9 
2.3.3 Metal Slivers 11 
2.4 Parametric failures due to defects 11 
2.5 Parametric timing failure due to process variation 12 
2.6 IC delay defects detection 12 
2.7 Summary 13 
CHAPTER 3 EXISTING DELAY TESTING TECHNIQUES 14 
3.1 Introduction 14 
3.2 Functional testing 14 
3.3 LBIST 15 
3.4 Scan-based testing 15 
3.4.1 Launch on shift (LOS) 17 
3.4.2 Launch on capture (LOC) 18 
3.5 Delay fault models 19 
3.5.1 Path delay model 19 
3.5.2 Transient fault model 20 
3.6 Current DFT techniques limitations 20 
3.6.1 Small delay defect 21 
3.6.2 Testers limitafions 21 
3.7 Conclusion 21 
CHAPTER 4 CDT (CAPTURE-LESS DELAY TESTING) 23 
4.1 Introduction 23 
4.2 CDT Overview 23 
4.3 CDT ftinctionahty 24 
4.3.1 Implementation of CDT sensor 25 
4.3.2 CTVC operation 27 
4.3.3 Dynamic compensafion 28 
VII 
4.3.4 CTVC 1st stage: Low Gain AmpHfication 29 
4.3.5 CTVC 2nd stage: Differential Amplification 30 
4.3.6 CTVC Buffering Stage 30 
4.3.7 Delay measurement stage 31 
4.4 Advantages of CDTP (Capture-less Delay Testing Patterns) 32 
4.5 Conclusion 33 
CHAPTER 5 TIMING BASED DELAY DISTRIBUTIONS OF TRANSITION 
UNDETECTED FAULTS MODEL 34 
5.1 Introduction 34 
5.2 Scan based structural test techniques 34 
5.3 ATPG methodology 35 
5.4 Simulated implementation steps 37 
5.5 Simulated results 37 
5.5.1 Minimum path delay distribution of non-covered faults 37 
5.5.2 Maximum path delay distribution of non-covered faults 38 
5.5.3 Minimum path delay distribution of covered faults 39 
5.5.4 Maximum path delay distribufion of covered faults 40 
5.5.5 Comparing delay distribution of non-covered & covered faults 41 
5.6 Conclusion 42 
CHAPTER 6 ANALYSIS OF DELAY TEST EFFECTIVENESS WITH CDT 
ON TOP OF LOC 43 
6.1 Introduction 43 
6.2 Capture-less Delay Testing CDT 44 
6.3 Experiments CDT on Top-off LOC technique 44 
6.3.1 Evaluation Framework 45 
6.4 Contribution 46 
6.4.1 Algorithm general steps 46 
6.5 Algorithm implementafion 48 
6.5.1 Implementafion steps 48 
6.6 Running CDT on top of LOC patterns on muUiple ITC99 benchmarks 50 
6.6.1 Proposed complementary ATPG process 50 
6.6.2 Applying CDT random patterns on Un-detected faults 50 
6.6.3 Un-optimized CDT sensors count coverage 51 
6.6.4 Optimized CDT sensors count coverage 51 
6.6.5 Summary of obtained test coverage results on selected ITC 99 
benchmarks 52 
6.6.6 Validating the obtained test coverage with the optimized list of sensors 54 
6.7 Conclusion 54 
CONCLUSION 56 
ANNEX I LOGIC BUILT-IN SELF TEST BIST 59 
ANNEX II TYPES OF FAULT CLASSES 62 
VIII 
ANNEX III PATH DELAY DISTRIBUTION PERL SCRIPTS 67 
ANNEX IV CDT SENSOR PLACEMENT AND OPTIMIZATION PERL 
SCRIPTS 79 
BIBLIOGRAPHY 90 
LIST O F TABLE S 
Page 
Table 2.1 
Table 5.1 
Table 6.1 
Table 6.2 
Table 6.3 
Table 6.4 
Table 6.5 
Stuck-at truth table of a 2 input AND gate 8 
Minimum, mean & maximum values of umin-pd(i), cmin-pd(j), umax-
pd(i), cmax-pd(j) expressed as a percentage of the clock period (T) 41 
LOC test coverage results of ITC benchmarks 50 
Un-optimized SS- random patterns test coverage results 51 
Optimized SS-random patterns fault coverage results 52 
Summary of simulated ITC 99 benchmark test coverage 53 
A compromise of Test coverage with minimal CDT sensors use 54 
LISTE DES FIGURE S 
Page 
Figure 2.1 Global and local manufacturing defects 6 
Figure 2.2 Logic AND gate 7 
Figure 2.3 Resistive vias 9 
Figure 2.4 Defect-free and a defective interconnect 9 
Figure 2.5 Zoom-in defective interconnect 9 
Figure 2.6 Normal Metal line and one with mousebite 10 
Figure 2.7 Voided metal resistance (mQ) versus percent 10 
Figure 3.1 Shifting patterns in scan chains 16 
Figure 3.2 Capturing the response of the combinatorial logic 16 
Figure 3.3 Launch on shift transition delay fault pattern generation 17 
Figure 3.4 Launch on capture transition delay fault pattern generation 18 
Figure 4.1 Scan based CDT architecture 25 
Figure 4.2 CDT sensor implementation: intrinsic (blue), extrinsic (burgundy), 26 
Figure 4.3 Current to voltage conversion CTVC block 27 
Figure 4.4 Dynamic compensation on Vddl3c 28 
Figure 4.5 (a)Low gain amphfier, (b) and a differential amplifier 29 
Figure 4.6 CDT Timing Diagram 31 
Figure 5.1 Path delay distribution extraction flow 36 
Figure 5.2 Minimum path delay distribufion of non-covered faults 38 
Figure 5.3 Maximum path delay distribufion of non-covered faults 39 
Figure 5.4 Minimum path delay distribution of covered faults 40 
Figure 5.5 Maximum path delay distribufion of covered faults 41 
XI 
Figure 6.1 Benchmark test coverage evaluation 45 
Figure 6.2 CDT sensor aflocation, placement and optimization flow 47 
Figure 6.3 CDT sensor allocation, placement and optimization implementation 
steps 49 
LIST OF ABREVIATION S 
ASIC Application-specific integrated circuit 
ATPG Automatic test pattern generation 
ATE Automatic test equipment 
CDT Capture-less delay testing 
CDTP Capture-less delay tesfing pattern 
CMP Chemical mechanical polishing 
CTVC Current to voltage conversion block 
CUT Circuit under test 
CVP Current voltage pulse 
DC Dynamic compensation block 
DFT Design for testability 
DM Delay measurement block 
DVP Data voltage pulse 
LB 1ST Logic built-in self test 
LFSR Linear feedback shift register 
LOC Launch from capture 
LOS Launch from last shift 
SOC System on chip 
SSS Sensors switching at the same time 
STA Static timing analyzer 
VLSI Very Large Scale Integrated Circuit 
CHAPTER 1 
INTRODUCTION 
1.1 Motivation s 
"The success of the semiconductor industry has been due in large part to its ability to 
continuously increase the complexity, and therefore the processing power, of integrated 
circuits" [Nanowerk Spotlight]. Moore's law predicts that the number of transistors in a 
computer chip doubles every two years, due to miniaturization of the components. However, 
as device and interconnect dimensions continue to scale down from sub-micron to nanometer 
towards thousand-pico dimensions, IC designers and test engineers have to deal with an 
increase in process variation and the manifestation of new defect mechanisms. 
Integrated circuits fabricated using older technologies, based on larger feature size, were 
relatively insensitive to process variation. As the feature size has approached the 32 nm 
dimensions and the wafer size has grown to 450 mm (Samsung-TSMC, Intel Fabs), process 
variation impact on the operation of a chip has become non-deterministic. This is mainly 
attributed to a decrease of feature dimensions without a corresponding increase in 
manufacturing machine precision. As technology has been scaling down to nanometer and 
feature sizes shrink accordingly, photolithography became a concern. The wavelength of 
light used for geometry imaging is longer than the one desired for printing [Mak 2004]. For 
example, a 248 nm light source is used for a 130 nm to 180 nm gate length. This issue 
required using the light diffraction method causing the printed image to be different than the 
intended shape. To solve this issue, lithography engineers generate shaping rules in order to 
add or subtract geometries to the mask. This method is successful to a large degree, but can 
still create variations on the width and uniformity of the metal lines, and the shape of vias. 
Furthermore it might affect the poly-silicon layer that defines the gate length of a transistor. 
The polishing process in Chemical Mechanical Polishing (CMP) technology that is used to 
help planarize th e metal layers or the interlayer dielectrics for successive layer deposition 
depends on the geometries underneath it. A dishing phenomenon occurs when there are less 
dense materials underneath, thus increasing the interlayer capacitance. Due to CMP process, 
copper wires that are widely used nowadays to decrease wire resistance, tends to wear down 
much faster than the neighboring dielectrics, hence creating erosion and dishing effect that 
might affect the copper interconnects resistance [Mak 2004]. All these phenomena may lead 
to faults, including the so-called timing related failures that need to be detected, as affected 
ICs do not meet the frequency specifications. In other words a chip might work at a particular 
speed but fails at the desired clock frequency. 
IC manufacturing defects can also cause faults, including the timing related ones. Defects 
might occur randomly during fabrication process and are related to photolithography, CMP 
mentioned above, and some other fabrication processes that are beyond the scope of this 
work. In the so-called nanometer designs, new types of manufacturing defects have been 
introduced with the ever increasing number of interconnects, namely timing induced delay 
defects [Lin 2003]. As a consequence, more attention nowadays is being given to the test of 
these delay defects, this kind of test being known as delay testing. 
Most of the techniques for delay testing used in the industry inject transitions through 
patterns to the device under test on some dedicated input ports and check its response on the 
output. Those kinds of techniques can be categorized as slack based delay testing. Scan-
based delay testing is the dominant delay testing technique applied today as it generally 
provides fair coverage results and that it is fially automated. However, the quality of this kind 
of test is often limited by the tester memory which is not large enough to store all the 
required test patterns [Saxena 2002]. CPU time required by the automated test pattern 
generation (ATPG) tools is also a limiting factor. Consequently, transition test coverage of 
80% is typical in the industry [Mentor Graphics website]. Moreover, conventional ATPG 
tools do not use timing information, and tend to select the shortest paths to propagate 
transitions, leaving undetected most of the faults that lie on the longest most critical paths 
[Lin 2006]. 
A new technique, called Capture less Delay Testing (CDT), has been recently developed to 
increase the delay test coverage [Thibeault 2006]. With this technique, coverage is improved 
by special sensors. An outstanding advantage of CDT is that it does not require any 
additional test patterns. In this thesis, we present a robust set of tools to automate the 
selection of test points where CDT sensors are required. The newly introduced procedure 
uses CDT on top of conventional delay testing and works in harmony with current industry 
used ATPG tools. With this new procedure, test engineers can: 1) pin-point the left non-
covered nodes by the tools during ATPG flow and automatically select the appropriate CDT 
sensor locations, 2) identify the potential percentage increase of test coverage with each 
selection of CDT sensors, and 3) optimize the number of needed sensors to achieve a 
reasonable test coverage increase with reduced area overhead, in a timely manner. 
1.2 Thesi s Outlin e 
In Chapter 2, we review the types of delay defects that are rendering manufactured ICs with 
sub nanometer technologies more prone to defect and harder to spot. We further analyze the 
delay fault model and how it is used in conventional ATPG tools. The discussion 
encompasses the concept of transition delay fault model as well as shed light on the IC speed 
failure due in large to manufacturing defects. 
In Chapter 3, we investigate the current delay testing techniques as well as unravel the 
shortcomings of each method and show the aspects and challenges that limit current timing 
insensitive ATPG tools from achieving higher test coverage. 
In Chapter 4, we propose a methodology that allows the DFT engineer to better understand 
the timing delay distribution of transition model left undetected faults. A set of tools was 
implemented to allow the user to pin point those remaining non-covered nodes in any 
particular design, identify all those combinatorial paths and capture all the appropriate 
transition delay estimations in order to better analyze the switching activity of a circuit as 
well as the maximum achievable frequency it can run at. 
In Chapter 5 we present the CDT technique and explain in details all the aspects of its 
implementation stage by stage as well as analyze its functionality and potential in the real 
world of DFT design. 
In Chapter 6, we present our proposed procedure to automate CDT application. This 
procedure is implemented through a set of tools that enables the test engineer to achieve 
during the ATPG process, a proper robust placement of CDT sensors along specific non-
covered paths, as well as optimize the number of needed sensors to achieve an optimal 
coverage in terms of area overhead and the highest possible test coverage. 
In conclusion. Chapter 7 reviews the objectives of this thesis and summarizes the 
contributions made in the field of scan-based delay testing. Possible future work is also 
discussed in this chapter. 
1,3 Contributio n 
Significant contributions of this thesis include: 
• The development of an algorithm that enables the test engineers to pinpoint the remaining 
non-covered nodes by the conventional ATPG tools as well as placing the sensors at the 
appropriate end flip flops to ensure optimal test coverage. 
• An optimized algorithm that minimizes the number of needed CDT sensors to achieve a 
rather similar final test coverage with less area overhead and higher achievable at speed 
tester frequency. 
• An investigation of the shortcomings of current ATPG tools from both Mentor Graphics 
Fastscan and Synopsys Tetramax timing insensitive tools that might leave thousands of 
non-covered combinatorial paths along the way and lead to potential IC test escapes. 
CHAPTER 2 
MANUFACTURING DEFECT S AN D DETECTION MECHANIS M 
2.1 Introductio n 
Manufacturing defects have a direct impact on VLSI circuit behavior and can drastically alter 
its functionality. Those undesired phenomena in the silicon structure of an IC range from 
mild to catastrophic defects. They can take different forms from missing pieces of 
manufacturing materials to having extra added materials at the wrong spot inside a die. The 
latest ICs designed with over 2 billion transistors on a die represent a serious challenge in 
terms of manufacturing process precision, i.e., photolithography, as well as the detection 
process of potential manufacturing defects [Groeneveld 2002]. Heat and voltage drop are 
also critical factors to be considered, but they are beyond the scope of this work. 
According to [Sachdev 2007], defects range from global defects such as mask misalignments, 
non-uniformity of critical dimensions, shifting of dopants under etching, to more localized 
spot defects of the silicon layer structure caused by dust, process variations, etc. Any process 
error during manufacturing process might have a tremendous impact on the chip by 
introducing a defect. Such a defect that alters circuit behavior is rendered as a fault. Faults in 
turn can be classified as catastrophic, or parametric. A fault is catastrophic when the 
functional behavior of the IC is incorrect. "On the other hand, according to [Sachdev 2007], 
parametric faults are those faults for which the IC is functional but it fails to meet its 
specificafions, e.g. timing, power budget, leakage, etc". In today's sub-micron very large 
scale integration (VLSI) manufacturing demands, the soft parametric faults can drastically 
limit the maximum frequency the IC can run at, and might develop with time into critical 
catastrophic faults due to fault site being more susceptible and vulnerable to excess of heat, 
resistance and electromigration. 
2.2 IC catastrophic defects 
Catastrophic defects occur during IC manufacturing process and have direct impact on the 
fianctionality of the chip. For example, these IC deformations are due in part to wafer 
contamination as dust particle that can break a metal line, or flakes due to fabrication 
machinery errors. Figure 2.1 shows some types of global and local spot defects occurring 
during IC manufacturing process. 
Irregular Shapes Smal l Particles Open Lines 
Figure 2.1 Global and local manufacturing defects . 
Extracted from Sachdev (2007, p. 25) 
2.2.1 I C catastrophic defect s detectio n 
These types of catastrophic defects on a chip can be detected using the traditional "stuck-at" 
fault model. Over the past decade, the "single-stuck-at" fault model was the most widely 
used in the industry on digital circuits to detect manufacturing defects. The "single-stuck-at" 
is a static approximation of a physical defect, in other words, it models all the failures as if all 
gate level pins or nets connected to the gate as they were stuck or connected or shorted to 
power or ground. Figures 2.2 and table 2.1 respectively illustrate an AND gate and its single 
stuck-at truth table. In Fig.2.3, the shaded cells represent the faults that are detected bay 
applying the corresponding AB combination. As an example, when AB = 11,3 single stuck-
at faults are detected: A saO, B saO and Z saO. 
This type of fault model is insensitive to clock frequency the device operates on and it 
assumes that one fault exists at a time during test mode. This makes it applicable under any 
circumstance, regardless of frequency and time domain. Its simplicity allows a fast 
computation during Automated Test Pattern Generation and time spent on tester during 
diagnosis. 
A ^ , . . n ^ 
Figure 2.2 Logic AND gate . 
Higher operational frequency, higher complexity, smaller area, and lower power 
consumption usually are the design objectives. Unfortunately, all of these criteria have 
caused ICs to become susceptible to various yield loss mechanisms which are parametric in 
their nature and that are not necessarily covered by single stuck-at based test patterns 
[Sachdev 2007]. 
The following section 2.3 is a brief summary of a study done by [Hawkings 2003] that sheds 
light on certain types of IC parametric defects. 
Table 2.1 Stiick-at truth table of a 2 input AND gate 
AND 
AB 
00 
01 
10 
11 
z 
0 
0 
0 
1 
A 
SaO 
0 
0 
0 
0 
Sa1 
0 
1 
0 
1 
B 
SaO 
0 
0 
0 
0 
Sa1 
0 
0 
1 
1 
z 
SaO 
0 
0 
0 
0 
Sal 
1 
1 
1 
1 
2.3 IC parametric defect s 
Parametric failures have been there since the beginning of CMOS technology, but their 
significance is now more serious and growing. According to [Hawkins 2003], inaccuracies of 
lithography with CMOS IC nanometer technologies and increasing lack of manufacturing 
control of circuit parameter variance have shown that allow transistor and interconnect 
variations. Temperature variation across the circuit as well as power supply levels within the 
die, and during switching activities may result in inaccuracies that impact circuit quality, and 
can provoke erroneous functional behaviors and might lead to chip failure. 
Interconnect properties include crosstalk errors arise from poor design rule implementation 
or fluctuations in metal line spacing and width. In the following a description of three 
different parametric failures is provided: resistive vias, metal mousebites, and metal slivers in 
ultrathin technologies. 
2.3.1 Resistive via s 
Nowadays ICs might contain billions of transistors and approximately ten times that number 
of metal vias. Contacts and vias at the lowest metal level are close to minimum technology 
feature size. With nanometer technology it is not surprising to see defective vias with 
elevated resistance. 
Figure 2.3 Resistive vias . 
Extracted from Cook (2003) 
Crack in metal lines show the same characteristic of a resistive via, even though it is less 
common to induce failure mechanism on a chip. 
2.3.2 Meta l mousebite s 
As mentioned above, missing parts of interconnect metal are called mousebites. They can 
happen during IC manufacturing process due to particles defects, or electro-migration. 
Figures below show a defect-free and a defective (mousebite) section of interconnect. 
Figure 2.4 Defect-free an d a  defective interconnect . 
Extracted from Cook (2003) 
Figure 2.5 Zoom-in defectiv e interconnect . 
Extracted from Cook (2003) 
10 
Mousebites might have a minor effect on the overall delay on the metal line, but if we divide 
a healthy metal line to squares of 0.5 micron each, then if 90% of the middle square as shown 
in figure 2.8, is missing, then the new ratio becomes 0.5 |im /0.05|am. 
Figure 2.6 Normal Metal line and one with mousebite. 
Extracted from Segura (2003) 
Figure 2.7 Voided metal resistance (mil) versus percent 
metal voiding using Rs = TOmli/sq. 
Extracted from Segura (2003) 
Assume that sheet resistance is Rj = 70mf^/sq, the resistance of the square with the 
Mousebite defect will be equal to Rn = (70mQ/sq)(0.5 |j,m /0.05|am) = 700m^. Therefore, 
the original segment of one square was 140mQ and now it becomes 840mQ, yielding an 
increase in resistance by a factor of 6. 
11 
2.3.3 Meta l Sliver s 
Metal sliver defect is due to a metal particle that falls between two metal conductors and 
slightly contacts the signal line. It can be formed from any of the metal layers used in the fab. 
With temperature change, this metal can expand and touches or connect the two interconnect 
lines. This bridge resistance might be permanent and might cause noise on the two signal 
lines or even cause a fatal functional failure. 
2.4 Parametri c failure s du e to defect s 
IC parametric defects can lead to parametric failures. Here, we focus on timing failures. Any 
device with logic network is considered faulty if it does operate correctly at a slower clock 
speed but fails at the targeted or desired clock frequency. 
The cause of failure in a synchronous sequential logic might be extra induced propagation 
delay on combinational data path reaching a storage element such as a flip-flop or a latch. 
Each flip-flop has a setup time and a hold time. If the signal propagating through the data 
path doesn't arrive or be stable before the flip-flop setup time, it is called to be violating the 
long path timing constraints (setup time violation). In other hands, if the signal is not stable 
long enough to be captured by the flip-flop it is called to be violating the short path timing 
constraints (hold timing violation). 
According to [Kim 2003], delay defect testing is critical to insure fault free integrated circuits 
in the overall test strategy. A demonstration of these types of IC defects has been established 
by Stanford University's Murphy and ELF35 experiments (0.7-and 0.35-micron technology, 
respectively) on logic circuits designed using standard cells showed that 3 out of 116 
defective parts were not detected when tested at lower speed than the expected functional 
operating speed. 
12 
2.5 Parametri c timing failure du e to process variatio n 
Timing failures can also be caused by regular process variations. During IC manufacturing 
process, a small natural variation in physical parameters can alter the operating frequency 
(fmax) and varies in severity from one unit to another. The die location on a wafer, differences 
in materials and equipments can cause such a variation according to [Hawkins 2003]. It can 
affect the entire die or can be localized in a part or a block within the die. They might 
introduce a delay changing without killing the entire die by decreasing the desired frequency 
the device should operate on. 
According to [Chandrakasan 2000], parametric variations might be due from optical effects 
during lithography processes, resulting in wafer images different from the original layout. It 
might degrade transistor parameters and might lead to catastrophic manufacturing defects 
occurring in the poly-silicon layer. Metal interconnect lines can also suffer from variation 
due to chemical-mechanical polishing (CMP). A chip might function at certain power supply 
voltages, but not on all its specified VDD range. The die might pass at high speed with high 
temperature and might fail with colder temperature. It might have windows of pass and fail. 
To overcome this issue, engineers should design the chip at a higher frequency targeting all 
longest path delays called critical path with the worst case conditions. This approach might 
be very expensive in terms of die size and packaging and difficult to implement and might 
require extraordinary engineering efforts and time. 
2.6 I C delay defects detectio n 
"From the SOC testing point of view, test solutions must address new fault models and 
failure mechanisms caused by manufacturing defects at the 65-nanometer  (nm) process node 
and below" [Kaufman 2008]. In practice, delay faults can be a combination of direct 
manufacturing defects caused by lithography as resistive shorts and opens and capacitive 
crosstalk that can impact a local island or functional circuit path within the die, whereas 
13 
power supply noise, intra-die temperature distribution, and process variation might cause a 
global defects that might affect a large part or even the entire chip. 
As above underlined, in the real silicon design not all faults can be simply described by the 
single-stuck-at model that does not include any timing effects. As discussed in the next 
chapter, this led to the development of delay models that are very similar to the single stuck-
at one but that also take into consideration the timing relationship. The application of these 
models implies the use of transitions. In the test terminology, AC scan refers to using a scan 
chain to launch transitions through a combinatorial circuit and capture the response to those 
transitions within the period of the system clock. However, testing delay faults in a sequential 
circuit using standard scan (scan based delay testing), has its own limitations, and structural 
transitional fault model tests might not cover all delays defects leading to a yield escape. 
Chapter 3 discusses in details the pros and cons of each delay testing approach, be it 
functional testing, logic built-in self test, as well as scan-based delay testing and the 
limitations of nowadays ATPG tools. 
2.7 Summar y 
As shown in this chapter, with nanometer technology, new types of manufacturing defects 
have risen to the surface, mainly due to the decrease in feature size and probable lack of 
manufacturing precision of lithography mask. With that in mind, IC manufacturing faults can 
be classified as catastrophic or parametric. The latter type of faults might occur during 
manufacturing process due to IC machinery fabrication flaws. These types of faults, that can 
also be caused by process variafions, might alter the desired fianctional speed on a given 
ASIC, triggering the necessity of what is called delay testing. These delay defects are not 
covered by the traditional stuck-at fault model, which is timing insensitive. Delay testing 
techniques are a must in today ASIC manufacturing process to insure fair test coverage. In 
the next chapter we discuss in details these delay test techniques and their limitations, and the 
possibility of enhancing such a test scheme. 
CHAPTER 3 
EXISTING DELA Y TESTIN G TECHNIQUE S 
3.1 Introductio n 
In this chapter we present and discuss the three main approaches used to detect delay failures, 
namely: Functional testing. Logical Built In Self Test (LBIST), and scan based delay testing. 
It shows the pros and the cons, and opens up the discussion about a new testing technique 
called CDT that complement the actual ones and might boost up the testing coverage to an 
acceptable level without the need of much engineering effort and time. 
3.2 Functiona l testin g 
Functional testing consists on applying test patterns derived based upon the functionality of a 
chip. According to [Bareisa 2008], when used for delay testing, functional test patterns are 
specifically derived to detect delay failures (caused by defects or by process variations), 
which most likely affect the longest combinational paths on a chip, or what it is called "the 
critical paths". According to [Ahmed 2006], the identification of these critical paths, usually 
performed by using a tool called Static Timing Analyzer (STA), is part of the design flow to 
guarantee that the chip will work at the desired speed. To derive these functional test patterns 
for delay testing, test engineers must manually generate these patterns such that they exercise 
those critical paths (namely, sending a transition along those paths) when applied later on the 
tester. In order to detect a delay failure, those patterns should be applied at-speed, hence 
exercising all relevant combinations on the targeted functional blocks using the operation 
speed or the desired clock frequency. The disadvantages of such approach are the following: 
Functional Patter n Developmen t effort : According to [Thibeault 2006], developing 
suitable efficient functional test patterns requires long and difficult engineering work. 
Functional at-speed test task is very expensive and requires lots of manual engineering works 
targeting on some large chips thousands of critical paths rendering it extremely difficult and 
15 
somehow impossible to implement. Furthermore, importing such patterns to the tester 
requires also an extra effort of manual debugging, changing the timing sequences from 
simulation to tester environment. 
High teste r costs : According to [Bareisa 2008], applying functional test patterns to a tester 
at the desired product speed, using device primary inputs and checking the response at the 
device primary outputs require a high-end tester that can operate at a very high frequency 
along with a very high pin counts. 
3.3 LBIS T 
LBIST is a test approach where most (if not all) test patterns are pseudo-random ones 
generated on chip, using a linear-feedback shift registers (LFSR), and where the response of 
the injected patterns are verified on-chip by a signature analyzer (or more information about 
LBIST structure and design flow, please refer to Annex I). Therefore BIST data exchange 
with the tester is minimal and drastically reduced. Test costs are generally reduced due to 
reduced test time, tester memory requirements, or tester investment costs, as most of the 
tester functions reside on-chip itself Another positive aspect of BIST is that the test can be 
performed at-speed. 
According to [Thibeault 2006], the main disadvantages of LBIST are the area/performance 
penalty and the extra design effort to deal with: 1) the propagation of the necessary at-speed 
scan-enable signal (discussed later, section 3.3.1, as this issue is shared by other approaches), 
2) the elimination of don't care conditions and multi-cycle paths (when the circuit under test 
(CUT) requires two or more clock cycles to settle), and 3) the issues related to multiple clock 
domains and test clock skews peculiar to this test mode. 
3.4 Scan-base d testin g 
This technique is well known and a common design for testability (DFT) application through 
the entire semiconductor industry and has been used for decades. In this approach, automated 
16 
scan insertion tools, such as Synopsys TetraMax and Mentor DFTAdvisor, arrange part or all 
of the internal flip-flops of a particular device in scan chains. With this architecture in mind, 
test patterns generated by ATPG tools are applied to the device under test using the sequence 
of events depicted in figures 3.1 and 3.2. 
Figure 3.1 Shifting patterns in scan chains. 
Figure 3.2 Capturing the response of the combinatorial logic. 
The tester puts the chip in test mode by setting the scan enable signal to I on each scan 
converted flip-flop. It then shifts each scan pattern serially on the scan primary input. In the 
17 
second phase, tester de-assert scan enable for one or two clock cycles (depending on the 
selected transition launch strategy), bringing back the chip to its normal functionality, 
allowing the capture of data and the circuit response is stored in the device storage elements 
i.e. flip-flops. The third phase starts by asserting again the scan enable allowing the stored 
values in scan chain flip-flops to be shifted out, while shifting in a new pattern. As for the 
other delay testing techniques, the detection of delay defects requires that transitions are 
launched and propagated along combinational paths, i.e. there is no dependence between test 
vectors. With scan-based test techniques, there are two main transition test strategies: the 
launch on shift (LOS) and the launch on capture (LOC). 
3.4.1 Launc h on shift (LOS ) 
With LOS, the logic value launching the transition is initiated during the last scan shift cycle, 
when the scan enable signal (scan-en) is still active (figure 3.3). The fault is exercised at this 
period and the new logic value is captured by the first active clock edge in the capture phase, 
when the scan-en has been de-asserted (low). 
sea n - e n-
ac t i ve 
clocks 
I 
l a u n c h 
i n a c t i v e 
C a p t u r e 
Figure 3.3 Launch o n shift transition dela y fault pattern generation . 
LOS takes advantage of a single capture clock pulse to catch the result of the launched logic 
value; moreover, as the fault is sensitized during the chains load/unload, any of the available 
clocks can be used. For this reason, a basic scan combinational engine can be used for the 
test pattern generation, which brings to a much compact set of vectors in a reasonable amount 
of time. This is the mosfly comparable technique with the single stuck- at ATPG. According 
to [Benayahu 2007], the significant drawback is the scan enable signal management. In delay 
test, the scan enable signal must switch between the launch and capture clock; in design, it 
fans out to every flip-flop. Therefore, the skew effect plays a relevant role and, if the 
developed design is not very robust in timing, the balancing of this signal may be required 
with evident criticalities in routing. Additionally, if the scan enable is slower on the 
automatic test equipment (ATE) than predicted (different load, more delay induced on the 
ATE board, induced skew between Scan Enable and CLK, etc.) the device can easily fail also 
if fault-free. To avoid ATE induced failures, when applying LOS it is recommended to 
implement the pipelined scan enable technique [Synopsys DFT Compiler Manual]. 
Otherwise, the scan-enable signal must be routed like a clock signal. Moreover, according to 
[Benayahu 2007], a very accurate pin-to-pin timing between the scan-enable, scan-in, scan-
out, and clock pins must be provided by the tester. 
3.4.2 Launc h o n capture (LOC ) 
With LOC, both launch and capture operations occur when the scan enable signal is inactive, 
meaning that the chip is in its normal operation mode (figure 3.4) Therefore, the logic value 
launching the transition comes from the combinational paths, sampled at the regular flip-flop 
inputs. It exercises the target delay fault at the first active edge of the clock after the scan-
enable signal is disabled, and then it captures the corresponding effect at the next clock edge. 
scan-Gn 
c l o c k 
1 i n a c t i v e 
1 
l a u r 
 
^ch 
1 
Caf 
[ 
3 t u r 
-n 
Figure 3.4 Launch o n capture transition dela y faul t pattern generation . 
19 
The big advantage of LOC is that it relaxes the timing constraints on the scan enable signal, 
which becomes a regular combinational one. The disadvantage is that a multiple clock 
capture (sequential) procedure is requested, making the ATPG more compute-intensive and 
time-consuming. This brings to the generation of more vectors, which may arise potential test 
data volume issues. The usage of scan compression techniques is strongly recommended to 
reduce the impact of the pattern count. However, such techniques are often considered too 
costly in terms of area penalty. In spite of its disadvantages, LOC is often the preferred 
launch strategy. Once the transition launching strategy is selected, one must also choose the 
delay fault model on which the test patterns generation will be based. 
3.5 Dela y fault model s 
There are 2 main fault models that can be used in order to generate the scan-based delay test 
patterns: the path delay fault and the transient delay fault models. 
3.5.1 Pat h delay mode l 
The path-delay model is used for testing delay failures on selected paths. According to [Qiu 
2004], a circuit is considered faulty if the delay of any path exceeds the specification. As it 
assumes that the delay fault may be distributed all over a path, this model is most suitable 
(and used) to detect delay failure caused by process variations. The path-delay fault model 
requires that the transition traverse a specific path previously defined. Since the number of 
paths in a real device grows exponentially with the number of nodes in the circuit, it will be 
impossible to target all the possible combinations of paths because this number may become 
really huge. Therefore, only the critical paths are addressed when the path-delay model is 
applied. 
20 
3.5.2 Transien t faul t mode l 
The transient delay fault assumes that any delay defect is significant enough to cause a delay 
failure and that it is associated with (located at) the output of the gate driving the selected 
node. With this fault model, the single "gate" delay fault represents itself as a pin value of a 
gate component that works as if it has a "Slow-To-Rise" or "Slow-To-Fall" logic transition, 
and test patterns are created that passes the transition throughout a single gate only, no matter 
which path it follows. Test patterns generation in a transition fault model can use the same 
techniques as stuck-at-faults and can cover theoretically 100% of fault coverage. It requires 
minor modification for existing stuck-at-fault test patterns generation and simulation tools, 
and doesn't need any timing analysis. Definitely, transition fault patterns should be injected 
in the design under test using the highest desired frequency. It can then detect a delay fault 
on a particular data path, if the data arrival time to the end flip-flop is violating the setup or 
hold time. 
Unfortunately, it must be underlined that the extra-step of applying both 0 and 1 to the 
identified fault renders the delay test more difficult to compute and more time-consuming. 
For this reason, the transition delay test coverage of large complex SOCs is typically lower 
covering, typically 80% of all faults. It is rather more difficult to propagate a transition along 
the longest paths, as it becomes more difficult to control or observe a particular fault on a 
given combinatorial site. Nevertheless, the transient delay fault model is the preferred model 
to generate scan-based delay testing test patterns. 
3.6 Curren t DFT techniques limitation s 
There are some limitations to the currently used DFT techniques. Some of the limitations are 
discussed in the following sections. 
21 
3.6.1 Smal l delay defec t 
Unfortunately it has been shown that timing related defects often introduce a delay which 
size is less than the at-speed cycle time. According to [Kim 2003], this makes traditional 
transition fault testing less accurate for this class of faults, as transition ATPG tools attempt 
sensitizing a fault on the shortest (minimum slack) paths. Transition fault test vectors are 
therefore unable to individuate a defect which manifests itself on a long circuit path. 
3.6.2 Tester s limitation s 
It is a well known observation in the semiconductor industry, that even by using state of the 
art ATPG tools, several gigabits of test data may be required to exercise transition, stuck-at, 
and path delay faults for a multi-million gate SOC. According to [Pateras 2003], in many 
cases the testers used in the industry don't have enough memory to store all patterns, forcing 
test engineers to load and reload test patterns, or use a subsets of the test patterns at a time, 
hence increasing the test time and cost. Typically every second of test time might cost 
between 25 to 50 cents, moreover reports from high-end testers used in large 
Microprocessors, that amortization time for such testers is around $6000 per hour, 
[Hetherington 1999]. To summarize, large volume of test patterns needed to detect 
manufacturing defects on a particular ASIC creates a bottleneck for testers in terms of 
capacity and diagnosis time. As mentioned before, scan compression and LBIST are often 
considered too complex or costly. 
3.7 Conclusio n 
The scan-based LOC approach is the dominant delay testing technique, at it eases the scan 
insertion and design and as it minimizes the area/performance overhead penalty, at the 
expense of extra test patterns and ATPG CPU time. In this project, our goal is to improve the 
test coverage of the LOC patterns along the longest critical paths, using DFT techniques, 
mainly a Capture-less Delay Testing technique that incorporates analog circuitry into the 
22 
early stages of the design, widening the area of coverage of traditional delay test patterns. 
Next in chapter 4, we discuss CDT structure and shed light on the major components that 
make up this novel complementary DFT technique. 
CHAPTER 4 
CDT (CAPTURELESS DELA Y TESTING ) 
4.1 Introductio n 
As mentioned in Chapter 3, one of the most difficult test challenges is the ever-growing 
number of test vectors that need to be applied on the tester to increase the required fault 
coverage. This is a big challenge in terms of cost and time required for device under test. 
Multiple types of test pattems are required in order to get high coverage as possible for all 
sorts of manufacturing defects. The conventional approach is to apply all test pattems until 
the tester memory is full, LBIST and test compression being often seen as too complex and 
costly. And as it was also discussed in Chapter 3, LOC transient delay testing is the dominant 
way to detect faults in nanometer technologies that take into consideration timing related 
defects. 
In this chapter, we present CDT, a recent type of scan based delay testing that requires no 
additional test pattems, and increase the potential of detecting such delay faults. We discuss 
the major components that make up this novel complementary delay test technique. It is 
worth mentioning that CDT was proposed before this master project started. Therefore, CDT 
is not a contribution of this thesis, which rather covers how to integrate CDT in a traditional 
design flow. 
4.2 CD T Overvie w 
CDT (Capture-less Delay Testing) [Thibeault 2006], is a scan-based delay tesfing technique 
that increases the potential of detecting imperfections that might lead to less bad chips that 
are tested "Good". This novel approach uses analog test circuitry in the digital world. A 
redundant circuitry is added on chip, mainly sensors, with a main purpose to test and measure 
delay affected by defects occurring during manufacturing process. The term captureless 
means that no logical value is captured during CDT application. As explained later, the most 
24 
outstanding CDT aspect is that it does not require any addifional test pattems to be loaded in 
the tester. The overall area and speed penalty of redundant analog testing blocks is minimal. 
Finally, the CDT potential for automation is partly demonstrated later in this thesis, which 
should keep low the required additional design effort. 
4.3 CD T functionaUt y 
CDT requires adding sensors at selected scan flip-flop data inputs and scan flip-flop clock 
inputs. While the device is under test during transition fault pattern shift mode, all data 
transitions are captured by the sensors located at the input of the modified-to-be CDT scan 
flip flops. 
According to [Thibeault 2006], as seen in figure 4.1, the transition capture is performed by 
the sensors (in this example: SDO and SDI for the data, Sco and Sci for the clock) that 
transform the voltage transitions into current pulses collected by the parallel power rails 
(SVD and SGD for the data, SVC and SGC for the clock). These current pulses are then 
converted back to voltage pulses by the CTVC (Current-To-Voltage Converter) blocks. 
Finally, the DM (Delay Measurement) block estimates the delay by comparing the clock and 
the data voltage pulses. 
25 
L'l 
SDO 
D C 
T 
•s "•; 'Q 
SVD 
SGD 
/ 
SVC 
SGC 
S.V CTVC 
1 
D Q ~ 
—Mk .  1 
^ 
-Sr; 
' 
DM 
1 . 
CTVC 
Del34 
Figure 4.1 Scan based CDT architecture. 
4.3.1 Implementatio n of CDT sensor 
Each sensor consists of two small inverters twice the minimum size for the target technology 
and one small capacitor in series according to [Thibeault 2006]. The dual inverters sensor 
structure in figure 4.2 ensures a much balanced and stable voltage to current conversion. The 
transient behavior of the cascaded inverter pair is influenced by the intrinsic and extrinsic 
related parasitic capacitances. Cgdl and Cgd2 are the gate drain capacitances due to overlap 
in Ml and M2. This sensor parasitic model assumes Ml and M2 are either cut-off or in 
saturation, which means the transistors are fimctioning in steady state. 
26 
V i . 
-" 
M 
II 3£' l 
11 ^ 
* ivu 
^VtM 
C t t . 
SVD 
c«i_. 
( ' • 
1 
C.:i.i 
SGD 
L 
^ o Ms 
i. 
C'L 
Md 
Figure 4.2 CDT sensor implementation: intrinsic (blue), extrinsic (burgundy), 
and load (red) capacitances. 
Cdbl and Cdb2 are the diffusion capacitances due to the reverse-biased pn-junction. Cw is 
the wiring capacitance that depends on the length and width of the connecting wire as it is a 
fianction of the fan-out of the gate and the distance to reach those gates. Cg3 and Cg4 are the 
gate capacitances of the fan-out gate that depends primarily on the width of M3 and M4 
which includes both linear overlap and nonlinear gate capacitances. 
The CDT sensor uses small equally sized transistors, with a small capacitance in series that 
matches the input capacitance of the inverter driving it. The intrinsic and extrinsic parasitic 
capacitances might limit the number of sensors that can be connected on one CTVC block. 
The cumulative parasitic capacitances may be a limiting factor when multiple sensors in 
chain are switching at the same time, and might require a dynamic compensation to take care 
for voltage attenuations on the power rail. 
27 
4.3.2 CTV C operatio n 
When the CTVC analog block receives the collected current pulses generated by the sensors 
during switching activity, as transitions happen at the input of the related flip flops chain, it 
converts the current pulses into voltage pulses (SVD^DVP, SVC^CVP). 
The CTVC block as seen figure 4.3, consists of multiple stages, starting from the dynamic 
compensation on the SVD and SVC current-collecting power rails (DCl, DC2), the pre-
amplification stage using a low gain amplifier (LA), calibration circuitry (DAI), a current to 
voltage conversion stage formed with a current mirror load differential amplifier (DA2), and 
a final buffering stage consisting of two inverters (I) in series. The low level transistor 
schematics for each sub-block are fiarther analyzed and explained in the upcoming sections. 
vdd13c 
DCl 
vdd 2 3 
v b l 
/ 
fr 
v d d l S -
v d d l 3 d 
|_£e2_J 
vdd 2 3 
1 A  > . 
vcWJg 
o u t m 
|"'^ -^ .s.^ outm2 
vb2ci 
J\ 
o u t m S l 
a' " " \ outmS 2 i.. A ^ 
P A i I > o 
o u t m 7 
I po-
Figure 4.3 Current to voltage conversion CTVC block . 
28 
4.3.3 Dynami c compensatio n 
The dynamic compensation block works like a bleeder that compensates for any potential 
attenuation on the power rails voltages. It forms an internal on the fly compensation during 
switching activity of the sensors. The implementation of the DC block is seen in figure 4.4. 
vr2— 
vr3 
VdctlSc™, 
vr4— 
VddlSc-
vr8— 
V>i 
''*' ^^^^^ 
iH, 
m 
Vddl3c 
•1^ i m200 
mBOO 
m400 
mSOO 
Vdd23 
Figure 4.4 Dynamic compensatio n o n Vddl3c . 
As the voltage decrease on the vddl3c node, when transitions occur at the input of multiple 
sensors, the dynamic compensation structure rectifies vddl3c pulling it up to vdd23. When 
both differential amplifier inputs are at the same potential, the comparison between vddI3c 
with one of the different internally generated voltage sources (vr2-vr8) leads to logic 0 that 
tums the related PMOS transistor on as it enters the resistive region of operation. Hence, the 
current flows through the active transistor and pulls up the depleted vdd 13c. This helps to 
reduce the impact of the number of simultaneously switching sensors (SSS) on the measured 
delay at the final DM stage. 
29 
4.3.4 CTV C 1s t stage: Low Gain Amplificatio n 
The low gain ampHfier L, as seen in figure 4.5(a), is biased by the differential amplifier DAI 
which is twice the transistor sizing of DCl. One way to control the quiescent current is to 
sense and feedback a copy of the input current. The opted way limits the variation in the 
quiescent current by designing the amplifiers to have low gain. The quiescent current is 
controlled by the gate-source voltages on the M3-M4 NMOS transistors of LA, which in turn 
is confroUed by the output of the differential amplifier DAI. Therefore, reducing the preamp 
gain reduces the variation of gate-source voltages and the quiescent current for a given 
variation in the offset voltages. Also a second LA amplifier is added as a load on the outm51 
node, for offset calibration purposes. As Ml is connected in diode mode, it mirrors the 
current from M3 and flows into transistor M2 through M4. M5 is always branched always in 
saturation as an output PMOS resistor bleeder. 
Figure 4.5 (a)Lo w gai n amplifier, (b ) an d a differential amplifier . 
30 
4.3.5 CTVC 2n d stage: Differentia l Amplificatio n 
A current mirror load differential amplifier, as seen in figure 4.5(b), is used in the CTVC 
block to produce an output voltage proportional to the input current. The amplifier has a 
current mirror load so any imbalance in the drain currents of Ml and M2 causes the output of 
the differential amplifier to swing either towards Vdd23 or Vss. 
When Vin- is larger than Vjn+, the current in M2 is larger than the current in Ml as VGS2 > 
VGSI. The current in Ml flows through M3 and is mirrored by M4. This causes DA2 output 
to go towards Vss until the current in M2 equals the current in M4. The internally generated 
reference voltage outm51 is deliberately connected on the inverting node of DA2 in order to 
achieve maximum gain. Since M3 is connected in a diode configuration it has a lower 
resistance than M4 hence the gain from Vjn- to the output out is larger. 
The amplifier operates as a current to voltage converter due to its near zero input and output 
impedance. The self-biased differential amplifier DA2 receives the voltage outm,  as 
previously seen in figure 4.3, on the inverting input while its non-inverting input receives the 
reference voltage outm51 which is always grounded. 
4.3.6 CTVC Bufferin g Stage 
As the CMOS inverter can be modeled as a dynamic equivalent output resistance ro, there are 
more aspects that need to be taken care of, such as the intrinsic and extrinsic parasitic 
capacitances that play a major role during the switching activity of the cascaded inverters as 
seen in figure 4.3. 
The resulting current to voltage conversion at the output of DA2 gets buffered through a 
cascade of two inverters in series generating the final DVP (data voltage pulse) and CVP 
(clock voltage pulse) output voltages to be later processed by the delay measurement DM 
block. DA2 and the cascaded inverters at the final stage form a high speed input buffer which 
31 
transforms the input signal that might have uneven slow-to-rise and slow-to-fall transitions 
into a clean digital signal with correct pulse width and level. 
4.3.7 Dela y measurement stage 
The delay between the incoming voltage pulses from the data path is then measured against 
the voltage pulse received from the clock network. As seen in figure 4.6, the delay 
measurement is taking place between the falling edges of the CVP and DVP voltage pulses 
by the delay measurement (DM) block. 
CLK 
A 
B 
SVC 
SVD 
CVP 
DVP 
: -'^ 
Delay 
; 1 
X ' 
'1 
i—1 . 
Delay 
'1 
Figure 4.6 CDT Timing Diagram. 
The delay measurement can be done on chip or off-chip during wafer probing. With this 
approach we can estimate the propagation delay of the combinatorial clouds and therefore the 
ICs maximum achievable clock frequency. When the CUT's frequency is exceedingly lower 
than specified frequency, the circuit under test is declared faulty. It is worth mentioning that 
the delay measurement differential approach should mostly compensate for any offset error 
along the path. CDT limitations are discussed in [Thibeault 2006]. 
32 
4.4 Advantage s o f CDTP (Capture-les s Dela y Testing Patterns ) 
The majority of DFT engineers are using a combination of stuck-at and LOC transition fault 
model pattems to achieve an acceptable level of fault coverage. Traditional ATPG tools fail 
to cover those hard to control/observe faults that lie on the longest critical paths. It's much 
more convenient to integrate CDT in the architecture along those critical paths then spending 
time writing functional path delay pattems to exercise the faults. CDT might detect potential 
faults and allows for seamless integration with current techniques, as DFT engineer can aim 
for higher if not near perfect test coverage with a very reasonable amount of area/speed 
overhead. 
As memory tester is often limiting the number of test pattems that can be applied. CDT then 
becomes a very efficient way to boost the delay fault coverage without requiring any 
additional tester memory. CDT takes advantage of the fact that no data is captured to 
transform the intermediate values contained in the scan chains during the shifting (in and out) 
into CDT test pattems. 
The CDT pattems can be characterized as: 
C^'rPatterns —  (Sc — 1) X PattemSfraditional (4-1) 
Where Sc in the total number of scan flip flops in one scan chain. 
As an example, if there are 5000 scan test pattems, 1 scan chain of 1000 scan cells; it creates 
around 5 million additional (CDT) test pattems. This means that the number of delay test 
pattems can be increased by order of magnitude. 
It is important to disable all sensors during normal functional mode in order to eliminate any 
addifional dynamic or static power consumption. The CDT architecture allows such 
important low power feature, as it consists of multiple separate power domains for the CTVC 
33 
and DM blocks as well as the data current pulses (SVD, SGD) and the clock related current 
pulses (SVC, SGC). 
4.5 Conclusio n 
CDT is a Top-Off technique that uses the already generated transition faults model pattems. 
This complementary technique doesn't require any additional pattems to be generated nor 
stored, hence no additional tester memory load and no restrictions on the CPU computation 
time. CDT is an at-speed testing technique. It captures transitions at the input pin of a CDT 
Scan Flip Flop during LOC shift mode, and allows the test engineer to measure the delay 
differences between data paths and clock network, hence giving an accurate estimation of the 
highest frequency the circuit under test can operate on. It doesn't require any post layout 
information and it can be inserted during early stages of DFT / ATPG flow. Next in chapter 
5, we discuss the timing based delay distribution of the left undetected faults by traditional 
mainstream ATPG tools, the ones we will target with CDT. 
CHAPTER 5 
Timing Based Dela y Distributions o f Transition Undetecte d Fault s Mode l 
5.1 Introductio n 
Any testing that is not aware of the delay that might occur due to process variation or any 
other defects in nanometer technology is not complete. A small delay on a critical path might 
cause a timing failure that can render the chip unusable or operate at a lower frequency. 
Commercial ATPG tools usually exercise transition fault test pattems on the shortest path, 
and are timing unaware, as transition ATPG technique attempts sensitizing a fault on the 
shortest (minimum slack) paths without incorporating the SDF annotation timing. 
Furthermore, since the ATPG tool doesn't consider the timing constraints and the actual 
delays of the devices and interconnects in a given design, a transition test pattern that detects 
a fault on a long path might fail to detect the same fault if that path was critically timed. 
In the industry, according to [Davidson 2007], test coverage of 75-85% for transition fault 
pattems is considered acceptable when factoring in time, cost and reliability of the test. 
Hence, CDT implementation might bridge the gap by complementing the LOC transition 
fault model and further improving the test coverage with minimum time and engineering 
effort, at no extra tester cost. This chapter discusses this issue in details by analyzing the 
delay distribution of the left non-covered faults by conventional ATPG tools such as Mentor 
Fastscan. The ATPG process is targeting low cost testers and meant to respect reasonable 
industry standard test abort limit. 
5.2 Sca n based structural tes t technique s 
Scan-based structural tests are widely used in the industry for their cost/coverage test 
effectiveness versus at-speed functional test. Transition fault models targets for slow-to-rise 
and slow-to-fall the output on each gate in the design. Path Delay targets the full path from 
the start point (output of a flip-flop) to the end point (input of a flip-flop) of the total delay 
35 
(gates and interconnects) on a specific circuit path. Detecting a delay induced defect on a 
chip, transition faults and path delay faults models are so far effective in producing good 
fault coverage. Unfortunately they have some limitations that are discussed in chapter 3. 
Transition fault test are applicable on one clock domain, assuming a fixed cycle time. When 
a transition occurs and reaches the end point and being captured and observed, a defect might 
be detected if it doesn't meet the timing slack of the exercised and observed circuit path. 
Relatively to the clock domain, if the slack of this particular circuit path is big, the delay 
defect might not be detected. This defect escape might introduce a failure later in the life 
cycle of the chip. The quality of the chip is then reduced, and might trigger a costly recall. 
A small delay defect that might occur on a short path might have subsequent aging failure on 
a chip, while a defect occurring on the longest path might have a catastrophic immediate 
effect on the correct operation of an integrated circuit. By using current commercial ATPG 
tools, such a defect on the longest path might be left undetected with transition fault models. 
In this chapter we will focus on the longest path fault detection and its impact on the overall 
test coverage. 
5.3 ATPG methodology 
In this study we determine the ATPG undetected faults and observe where they lie in the 
timing domain of each related path. For simplicity we used a single clock domain design, 
especially that transition fault model requires one clock domain in each scan chain to be 
exercised and observed one at a time. A delay distribution of the undetected faults paths 
using Launch-On-Capture (LOC) transition fault model is evaluated. In the following figure, 
the implementation flow is described. A set of tools using Perl scripting is implemented at 
each stage of the flow to extract and process data. 
In this study, we target some of the Politecnico di Torino circuits belonging to the 
Intemational Test Conference (ITC99) benchmark suite that were meant to be used for 
36 
experimentation on DFT and ATPG. Those benchmarks correspond to synthesizable RT-
level descriptions of different size, complexity, and type [ITC 99 Benchmarks]. 
,/*! 
V ^ V 
T&tVaiDr5 
X 
Uncovered(i) 
• Coveredy) 
>"ODES 
(3 KLOC} ; 3  ! {LOS})~CDT 
3{LOC}; 3f{LOS > 
Paths (J m n-pdf ir 
Um3>.-pdf i}''Cmin -
Figure 5.1 Path delay distribution extraction flow. 
Perl scripts were created to mn DFT /ATPG processes that are applied on the ITC99 
benchmark b05 and to analyze results. It determines the min and max path delays of those 
nodes that are non-covered by LOC. It can further analyze and determine the final Delay 
distribution of any selection of covered nodes in relation to non-covered ones. This 
methodology allows for proper investigation of the placement of the undetected faults along 
the respective paths as well as it shows the occurrences of the suitable path transition delays 
which is crucial to understanding the switching activity of the CDT sensors that are going to 
be inserted at a later stage. 
37 
5.4 Simulate d implementatio n step s 
Scan chains insertions are implemented by DFTAdvisor in the design and Mentor Graphics 
Fastscan ATPG tools using transition fault with Launch-On-Capture and Launch-On-Shift. 
Fastscan is then used to report all Fan-in and Fan-out of all non-covered nodes on both LOC 
and LOS. A set of Perl scripting tools compare the LOC undetected faults and the LOS 
undetected ones to create the following lists: 1) the list of nodes non-covered by LOC but 
covered by LOS (those are the nodes we want to cover with CDT and from which the 
destination FFs, where sensors are added, are determined), 2) the list of nodes non-covered 
by LOS but covered by LOC, and 3) the list of nodes that are non-covered by both. 
After generating the destination FF file, we use the paths links and mn it in Prime-Time 
Static Timing Analyzer, with type 'Max" meaning we are taking into account the maximum 
delay. The worst case process comer is used, for slack timing calculation. Parsing through 
the STA report, the arrival time for each path is calculated. The minimum path delay umin-
pd(i), and the maximum path delay umax-pd(i), of all the paths passing through the same 
node i are measured. The number of occurrences of all minimum and maximum path delays, 
are then computed on all undetected nodes. 
As can be seen in figure 5.1, the previous procedure is repeated for all the covered nodes to 
calculate their path delay occurrences. For each covered node j , we respectively define cmin-
pd(j) and cmax-pd(j) as the minimum and maximum path delay of all the paths passing 
through this node. 
5.5 Simulatio n result s 
5.5.1 Minimu m path delay distribution o f non-covered fault s 
The following chart shows the delay distribution of all the shortest paths, namely all Umin-
pd(i) values, along all non-covered faults. The fastest path of all shortest paths namely 
38 
min(umin-pd(i),) is taking 18.7% of the cycle period to arrive at the (to be inserted) CDT 
sensor. Whereas the slowest path max(umin-pd(i), V i) is taking 77.1% of the clock cycle 
period to arrive at the to be inserted CDT sensor that lies at the input of the destinafion FF. 
25 -1 
20 -
5« '1 C 
s 
1 1  1 0 -
O 
5 -
0 - 1 
Delay Distribution o f Min . Paths of I'ncov faults, uniin-])<l(i ) 
• 
/ \ 
•f \ 
5 20 40 GO 80 100 
Path Delay (%T ) 
120 
Figure 5.2 Minimum path delay distribution of non-covered faults. 
The mean delay of shortest paths passing by all undetected faults, mean(umin-pd(i), V i), is 
approximately equal to 45% of the cycle time. 
5.5.2 Maximu m path delay distribution of non-covered faults 
The following chart shows the delay distribution of all the longest paths, namely all the 
umax-pd(j) values, along all non-covered faults. Among these paths, the one with the most 
slack (the lowest max-pd(j) value), is taking 30.3%) of the cycle period to arrive at the sensor. 
Whereas, the one with the least slack (the highest umax-pd(j) value), is taking 94.1%) of the 
clock cycle period to arrive to the sensor at the input of each of their destination FFs. 
39 
/ i ) • 
18 • 
10 -
14 
'fi 1 ' -
S 1 0 
Urn 
S 
O 
4 -
2 -
0 * 
( 
Delay Di.stiibutioii of Max. Patljs of I'licoveied faults, uniax-|>4l(i) 
• 
[ \ 
• 
• • 
•• • • 
) 2 0 4 0 GO 8 0 1 0 0 
Piith Delay (%T) 
1 2 0 
Figure 5.3 Maximum path delay distribution o f non-covered faults . 
The mean delay of longest paths passing by all undetected faults, mean(umax-pd(i), V i), is 
approximately equal to 72%o of the cycle time. 
In order to evaluate ATPG tools fault detection scheme, and in order to verify and compare 
delay distribution along undetected faults; in the following we evaluate delay distribution 
along all ATPG detected faults. 
5.5.3 Minimu m path delay distribution o f covered fault s 
The following chart shows the delay distribution of all the shortest paths along all covered 
faults, namely all the cmin-pd(j) values. The fastest path of all minimum delay paths along 
covered faults, min(cmin-pd(j)), is taking 6.1%) of the cycle period to arrive at their 
destination flip-flops. The slowest path of all minimum paths, max(cmin-pd(j)), takes 77.1%o 
of the clock period to arrive to its path destination FF. 
40 
140 
120 
100 
•i- s o 
h 
t 6 0 
40 
20 
Delay Dis'lribution o f IMin . Paths o f Covere d faults , cniin-i><l(j ) 
A rr 
-I 
A 
20 40 6 0 
Path Dela y (%T ) 
SO 100 120 
Figure 5.4 Minimum path delay distribution of covered faults. 
The mean delay of shortest paths passing by all detected faults, mean(cmin-pd(j), V j), is 
equal to the total number of detected faults is approximately equal to 24% of the cycle time. 
5.5.4 Maximu m path delay distribution of covered faults 
The delay distribution of all longest paths along all covered faults, namely all the cmax-pd(j) 
values, is shown in chart below. The fastest path of all longest paths passing through covered 
faults, min(cmax-pd(j), V j) is taking 6.1%o of the cycle period to arrive at their destination 
flip flops. The slowest path, max(cmax-pd(j), V j) takes 95.8% of the clock period to arrive to 
its path destination flip flops. 
41 
30 
70 
60 
50 
S 4 0 
30 
10 
Dfckiy DLstribution of Max. Path.v of Covered faults, cniax-i)<l(J} 
7r 
MX 
f I 
I ^i 
-* / \ 
\ / 
X 
20 40 60 SO 
Path Delay (%T) 
100 120 
Figure 5.5 Maximum path delay distribution o f covered faults . 
The mean delay of longest paths passing by all detected faults, mean(cmax-pd(j), V j) is 
equal to the total number of detected faults is approximately equal to 64% of the cycle time. 
5.5.5 Comparin g dela y distribution o f non-covered & covered fault s 
Table 5.1 sums up the previous results, expressed as a percentage of the clock period. 
Table 5.1 Minimum, mean & maximum values of umin-pd(i), cmin-pd(j), umax-pd(i), 
cmax-pd(j) expressed as a percentage of the clock period (T) 
Node Path Delay 
umin-pd(i) 
cmin-pd(j) 
umax-pd(i) 
cmax-pd(j) 
Min 
18.7 
6.1 
30.3 
6.1 
Mean 
45 
24 
72 
64 
Max 
77.1 
77.1 
94.1 
95.8 
The delay distribution of the non-covered faults along the shortest paths has a mean value of 
45%) of the cycle period, while the same distribution of the covered faults has a mean value 
42 
of 24%o. Thus it means a shift of 21% along the delay axis. Furthermore, the delay 
distribution of the non-covered faults on the longest paths has a mean value of 72%, whereas 
the corresponding covered faults delay distribution have a mean value of 64% of the clock 
cycle, leading to a shift of 8% along the delay axis. These results validate the assumption 
according to which non-covered nodes are located along longer paths. 
5.6 Conclusio n 
On average, the paths passing through the non-covered faults are longer than the ones 
passing through the covered faults. The difference between the two types of paths exceeds 
20% of T when considering the shorter path delay values (umin-pd(i), cmin-pd(j)). As longer 
paths usually means more gates or longer interconnects along these paths, these non-covered 
paths should have a higher probability of being affected by (delay) defects. Therefore, it is 
important to cover as much as possible these paths. 
The CDT tackles this issue, and might significantly improve the overall test coverage. In the 
next chapter, we will implement a complementary fially automated ATPG method that would 
identify the left non-covered faults, determine the related combinational paths, and allow the 
test engineer to place the CDT sensors along selected paths achieving higher test coverage, 
with minimal effort and area overhead. 
CHAPTER 6 
ANALYSIS O F DELAY TES T EFFECTIVENESS WIT H CD T ON TOP OF LO C 
6.1 Introductio n 
As mentioned earlier, delay testing is widely used by the ASIC industry. Two types of fault 
models, transition and path delay, are the most considered in this testing category. Transition 
fault model is typically the same as stuck-at-fault model. It doesn't require a timing aware 
ATPG tools and is used for large size of delay manufacturing defects. It is widely used in the 
industry as at-speed testing, to verify the timing stmcture of a circuit and to detect delay 
manufacturing defects. In transition fault, the single "gate" delay fault represents itself as a 
pin value of a gate and toggles as if it has a "Slow-To-Rise" or "Slow-To-Fall" logic 
transition. As an example, ATPG tools need to exercise at least 4 vectors to mimic a 
transition from "Slow-To-Rise" and "SIow-To-Fall" on a two input identified fault logic 
gate, rendering the delay test more time-consuming and difficult to compute, with a very 
large number of test pattems. In spite of this large number of pattems, the transition delay 
test coverage is typically lower (around 8 0%) than the single-stuck-at coverage (> 99%). 
Furthermore, as underlined before, the analysis of the defect is usually performed along a 
very short path, leaving some faults that lie on a critical path left undetected. 
The reasoning behind accepting test coverage of 80% using transition fault model from 
industry nowadays is based on some statistical data that ASIC vendors have evaluated, and 
the lack of a better approach that is feasible in terms of time and cost. Running transition 
faults pattems at speed theoretically can detect higher than 90% of the delay induced defects, 
with stuck-at-fault on top, leading to a decent test and fault coverage. DFT engineers rely 
also on path delay testing that cover around 5%  of the overall long paths that are left non-
covered by transition faults. Even though this approach is recommended by the industry, it is 
not flawless and the probability of not detecting a fault is not zero. Moreover, it requires 
extensive engineering work, and a thorough investigation and analysis about the overall 
circuit timing using STA tools. 
44 
In chapter 4 we evaluated the Capture-less Delay Testing "CDT" which is a Top-off 
technique that complements traditional delay testing scan techniques. It doesn't require an 
increase on the pattern count to be stored in the tester memory, and it is suitable for low cost 
testers. In chapter 5 we showed that most of the ATPG undetected faults using transition fault 
model lie on the longest circuit paths. In this chapter we discuss the potential on increasing 
the overall delay test coverage by introducing CDT DFT circuitry. Multiple benchmarks are 
processed and the results are thoroughly discussed. 
6.2 Capture-les s Dela y Testing CD T 
As described in chapter 4, the approach consists of adding sensors at selected scan flip-flop 
data inputs and scan flip-flop clocks. The scan flip-flops do not capture data when CDT is 
applied. The capture is rather performed by the sensors that transform the voltage transitions 
into current pulses collected by the parallel power rails. This approach helps to accurately 
estimate the propagation delay of the IC combinatorial parts and therefore the maximum 
achievable clock frequency. When this frequency is too low, the circuit under test is declared 
faulty [Thibeault 2006]. This test technique doesn't impose any restrictions on scan signals, 
nor does it involve into generating more test pattems as it doesn't require any additional 
tester memory. It works in harmony with other conventional techniques such as LOC 
transition fault model, where it automatically generates its pattems during the shifting of test 
pattems into scan chains. 
6.3 Experiment s CD T on Top-off LO C techniqu e 
This section will deal with the basics of how Capture-less Delay testing can improve test 
coverage, and discuss in part the potential of applying CDT partially on a given SOC. 
45 
6.3.1 Evaluation Framewor k 
After applying scan insertion successfully on a given benchmark, ATPG tools are used to 
generate transition fault pattems based on Launch-On-Capture technique. Deterministic test 
pattems generated by ATPG tools are used for transition fault LOC model as well as random 
pattems in the experiments. As shown in figure 6.1, an evaluation of the test coverage of 
each benchmark is performed. 
Figure 6.1 Benchmark tes t coverage evaluation . 
Test coverage, which is a measure of test quality, consists of the percentage of all testable 
faults that the test pattern set tests. Typically, this is the number of most concern when the 
testability of design is considered. 
TestCovera ge = 
FaultsDete cted 
Detectable Faults 
xlOO 
Here we assume that low cost testers are used, meaning that primary input values are frozen 
and that the primary outputs are not observed during the ATPG process. Based on this 
assumption, the test coverage becomes: 
46 
^ ^ #DT  +  (#PDxPosDetCredit) ,^^ (6.1) 
TestCovera ge =  x 100 
#Tes table 
Where DT are the detected faults. Please refer to Annex II for the complete description of the 
fault classes. 
6.4 Contributio n 
Our work is to observe ATPG identified faults that lie in the timing domain of each related 
path as well as determine the related destination flip flops where the CDT sensors are going 
to be placed in order to achieve the extended test coverage. Furthermore, the goal is to find 
the optimal list of destination flip flops that ensures the highest possible test coverage with 
minimal area overhead and engineering effort. Figure 6.2 shows the implemented algorithm 
flow. 
6.4.1 Algorith m general step s 
1) Create the list of non-covered nodes. 
2) For each non-covered node, determine its destination FFs, and create a stmcture that 
allows to pinpoint from each node all the relating destination FFs and from each FF all 
the non-covered nodes that can reach to it. 
3) Create the list of identified FF as node destinations. This is the unoptimized list of FFs 
representing the entries at which we are going to insert CDT sensors in order to maximize 
the fault coverage. 
4) Count the total number of non-covered nodes for each related destination FF prior to 
starting the sensor minimization process. 
5) Create a list of the identified destination FFs classified in a descending order with respect 
to the total number of related non-covered nodes. 
6) Choose the FF with the highest number of related non-covered nodes and include it in the 
optimized list of destination FFs where the CDT sensors are going to be inserted to 
maximize the test coverage with minimal area overhead. 
47 
7) Remove the chosen FF out of the optimal list of FF destinations and update the list of 
non-covered nodes by eliminating all the nodes that are related to the chosen FF. 
8) Repeat steps 4 to 7 until there are no more nodes left in the pool to cover. 
f Nodf s 
Flip Flop 
Destinations 
I Structur e i f 
ItiltliigAlt : ' 
I^'JO'.'-s FF 
Count # Nodes/FF 
-^ 
List .All FFs 
Starting ivit h , 
the Highes t ' I 
Node count 
Insert FF With 
Highest Mode 
Count In The 
Optimized Lis t 
Remove Selecte d 
FF from lis t of 
Destination FFs 
Eliminate Selecte d FF 
related nodes from lis t 
of uncovered node s 
NO Node 
Poo 
Empty 
Figure 6.2 CDT sensor allocation, placement and optimization flow. 
48 
6.5 Algorithm implementation 
Perl scripts were created to mn DFT /ATPG processes that are applied on the ITC99 
benchmarks bl4, b l5 , and bl8 (2 copies of bl4 and bl7), and several other Perl scripts are 
used to implement the CDT sensor placement and optimization algorithm. Those developed 
tools can be used during the ATPG process to analyze the list of non-covered nodes and 
determine the final optimized list of destination FFs where the CDT sensors are going to be 
placed in order to generate the optimal increase in test coverage with minimal area overhead 
and engineering effort. The major implementation steps are presented in Figure 6.3, and 
discussed next. 
6.5.1 Implementatio n step s 
Scan chains insertions are implemented by DFTAdvisor in the design and Mentor Graphics 
Fastscan ATPG tools using transition fault with Launch-On-Capture (step 1). Fastscan is then 
used to report all Fan-in and Fan-out of all non-covered nodes using transition fault LOC 
pattems (step 2). A Perl script uses the gate report file to generate a stmcture that shows all 
the non-covered nodes and their related destination FFs. For each non-covered node, the 
script determines its destination FFs, and creates a stmcture that allows pinpointing from 
each node all related destination FFs and from each FF all related non-covered nodes (step 
3). 
Another script is used to filter out repetitive lines in the generated stmcture that shows each 
non-covered node and its related destination FFs, prior to creating the new stmcture where 
we show each destination FF with its related non-covered nodes (step 4). A Perl script is then 
used to create the stmcture that shows for each destination ff all related non-covered nodes 
(step 5). Next script represents the first step in the FF list optimization iteration; it generates 
the report of node count for each destination FF (step 6). Another script processes the 
selection list of the FF with the highest node count and purges only the nodes in a separate 
file for later use to eliminate any same instances of non-covered nodes from the big list. 
49 
further reducing the number of needed destination FFs (step7). Another script is created to 
scan through the Nodes/FF stmcture and removes all FF instances with nodes similar to that 
of the FF destination with the highest node count (step 8). Next step requires repeating steps 
6-8 until no non-covered nodes are left in the pool (step 9). This algorithm gives us two lists, 
one that determines the destination FFs representing where we are to put the CDT sensors 
that is needed for optimal fault coverage, as well as the optimized list of destination FFs that 
gives the highest possible test coverage with minimal area overhead. 
DFTAdvisor 
I Sca n Chain 
Insertion 
Fastscan 
LOC ATPG 
Process 
oat t 
Report 
On 
Uncovered 
'•i 
I L 
Endpoints t o 
Uncovered F-4odes 
Systematic 
Stmctures 
Redundant Node 
Instances Filterin q 
Uncovered 
Node/FF 
Destination 
Structure 
Endpoint Node 
Count Repor t 
Gene rati en 
Highest 
Coverage FF 
Node fetchin g 
J I 
Elimination o f 
Selected FF s 
related nodes ft'om 
the poo l 
^^•,A;K''~/'-~><^^~^4- -  '"  , i '.^-.^^fcHrW-a^* ^ 
Figure 6.3 CDT sensor allocation, placement and optimization implementation steps. 
50 
6.6 Runnmg CDT on top of LOC patterns o n multiple ITC99 benchmark s 
The ATPG process consists of three steps. First, we mn LOC pattems on the selected 
benchmark and determine the test coverage. Second step consists of mnning the random 
pattems that emulates the CDTP "CDT pattems" on the full list of destination FFs and 
determine the test coverage increase of CDT over that of LOC. Third step we only target the 
optimized selection of flip flop destinations using them as observe endpoints and masking all 
the rest, then we mn CDT on top LOC pattems and determine the final test coverage. The 
test engineer can easily repeat the process on a new optimized selection of endpoints using 
the set of implemented scripts as needed until achieving more pleasing results. 
6.6.1 Propose d complementary ATP G proces s 
The first step we launch the transition fault LOC pattems on each benchmark. One scan chain 
is applied and LOC is used to generate deterministic test pattems. Table 6.1 summarizes the 
LOC fault coverage statistics. 
Table 6.1 LOC test coverage results of ITC benchmarks 
ITC Benche s 
B14 
B15 
B18 
Total # of Faults 
26236 
31704 
276838 
LOC 
91.63% 
78.97 % 
77.89 % 
6.6.2 Applyin g CD T random patterns on Un-detected fault s 
In an effort to emulate the CDT pattems that are automatically triggered during shifting of 
the LOC pattems into the scan chain, we chose to use Random pattems and mn them on the 
left non-covered UD undetected faults. Because we are targeting low cost testers, CDT will 
be mnning in real testing environment using the same pattem of LOC during LOC shift time. 
51 
Therefore, if the number of simulated LOC pattems is equal x, the CDT sensors will be 
triggered y-1 times (where y is number of flops in the scan chain) on each pattem input data 
exercised at the scan input of the chip, leading us to the formula (2) shown below. To 
emulate these pattems, we will be mnning the Random ATPG, on top-off LOC, and we will 
be using a set of random pattems of size equal to x(y-l). Thus, the number of simulated 
random pattems can be characterized by the following Formula: 
SimulatedRitternSj. ,  =  (#FFs-1) x SimulatedRitterns^ 
zoc 
(6.2) 
6.6.3 Un-optimize d CD T sensors coun t coverag e 
In here we assume that sensors are applied on all endpoint destination flip flops inputs. As 
predicted, the application of random pattems minimized the number of un-detected faults and 
increased the overall test coverage. Table 6.2 shows the selected benchmarks test coverage 
results that might mirror the optimal test coverage of CDT on top of LOC ATPG process. 
Table 6.2 Un-optimized SS- random pattems test coverage results 
ITC99 
Test Benches 
B14 
B15 
B18 
CDT on Top of LOC 
Coverage 
94.71 % 
83.22 % 
85.37 % 
6.6.4 Optimize d CDT sensors count coverag e 
We want to minimize the number of needed sensors by which we should be able to cover 
most of the undetected faults and propagate the transition along their paths. Thus, we should 
be able to minimize area overhead by reducing sensor count and obtain reasonably optimal 
fault coverage. The algorithm was applied using Perl scripts. It forms a set of tools that 
52 
determines from the left non-covered faults of a particular design, all the destination FFs. We 
assume that the reduced optimized list of sensors will almost give us an equal coverage, 
further minimizing undetected node counts with minimal effect on the test coverage. 
From here arise the two following scenarios: the first ones, described in the previous section 
"Un-optimized Sensors count", with its results derived from the un-optimized list of 
destination FFs. The second scenario consists of analyzing the list of optimized sensor count. 
This list results from mnning all the scripts of the algorithm and hence determining the 
minimal number of destination FFs that have the highest undetected node counts. The 
following table 6.3 shows the fault coverage results that might reflect the optimized sensor 
test coverage of CDT. 
Table 6.3 Optimized SS- random pattems fault coverage results 
ITC99 
Test Benche s 
B14 
B15 
B18 
B18(n-detect) 
Optimized CD T Senso r 
List Coverag e 
94.03 % 
82.37 % 
82.20 % 
80.20 % 
6.6.5 Summar y o f obtained test coverage result s on selected IT C 99 benchmark s 
We exploit the benchmarks bl4, bl5 and bl8 in order to analyze the effectiveness of the 
proposed ATPG process. The following table summarizes the simulation results for the rest 
of the benchmarks. The results gathered on the benchmarks show that our proposed ATPG 
process that incorporates CDT on top of LOC vectors was able to generate higher test 
coverage with results comparable and exceeding those generated by path delay testing. 
53 
Table 6.4 Summary of simulated ITC 99 benchmark test coverage 
ITC 
Benches 
B14 
B15 
B18 
Detected 
Faults 
23788 
24727 
211962 
Testable 
Faults 
25961 
31311 
257861 
CDT 
Sensors 
187 
398 
2785 
Optimized 
CDT Sensors 
22 
62 
330 
LOC 
Test 
Coverage 
9 1 . 6 3 % 
78.97 % 
77.89 % 
Optimal 
CDT 
Coverage 
94.71 % 
83.22 % 
85.37 % 
Optimized 
Sensor 
Coverage 
94.03 % 
82.37 % 
82.20 % 
Thus as seen in Table 6.4, our proposed complementary ATPG process shows the large 
benefits of incorporating CDT in any given design with greater improvements that might 
broaden the test coverage. The proposed algorithm identifies those endpoints for sensory 
placement and serves the test engineer early on to incorporate the CDT structure in the 
design netlist in the most optimal way possible. What sets our proposed approach apart from 
the completion is that it will make it feasible for designers to evaluate the testability of their 
circuits that incorporate CDT prior to layout while tremendously reducing the test cost and 
engineering effort. 
From the acquired results, we can assume higher test coverage by using the optimized list of 
sensors, with a substantial decrease in the total number of needed CDT sensors. In the 
following table 6.5, we summarize the improvements that were achieved as we can deduce 
that a test engineer could on average minimize the number of CDT sensors by -87%) and still 
achieve similar test coverage results with a minimal loss of ~1.57%. 
54 
Table 6.5 A compromise of Test coverage with minimal CDT sensors use 
ITC99 
Benchmarks 
B14 
B15 
B18 
CDT Senso r 
Minimization 
88.24% 
84.40% 
88.20% 
Overall Tes t 
Coverage Los s 
0.68% 
0.85% 
3.17% 
6.6.6 Validatin g the obtained test coverage with the optimized lis t of sensor s 
We choose a different selection of equal number of destination ffs, mainly those flops with 
highest node count that result from the first iteration of the proposed algorithm. We mn the 
simulafion to check the test coverage of CDT on the opfimized list of sensor endpoints 
targeting benchmark B18. This method emulates the N-detect algorithm ensuring that each 
targeted node is covered N-times. The following table 6.5 shows the resulting fault coverage 
of the new selection of CDT sensors. If we analyze the ATPG untestable faults, the test 
generator was able to find more pattems to create a test, and yet cannot prove the rest of the 
faults redundant. These faults might be caused by the constraints, or limitations, purposely 
placed on the ATPG tool such as scan cell constraints that masks the excluded scan flops 
during simulation. Thus, from the remaining detected unobserved faults, we got a number of 
ATPG untestable fautts that might become possibly detected or detected if we remove some 
constraints or in our case trying a slightly modified selection of sensors could further enhance 
the overall test coverage. 
6.7 Conclusion 
The proposed algorithm has tremendously reduced the number of needed CDT sensors. The 
results showed that using CDT on top of LOC can produce higher test coverage results. One 
can always enhance the selection of sensory endpoints, doing a couple of minor trial and 
error steps to further optimize the final selection. Further improvements on the sensor 
55 
minimization algorithm might be needed to achieve higher test coverage, close to the ideal 
unrestrained CDT coverage. Some longer combinatorial paths are harder to propagate a 
transition along to an observable node. Some circuits might have paths with multiple 
reconvergent fan-outs, others have multi-cycle paths and can be hard to observe on a given 
endpoint further reducing the effective coverage or observability of certain selected 
endpoints. Therefore some endpoints that share coverage of an equally high number of 
targeted nodes might have a better probability of achieving higher test coverage. CDT as 
shown, not only improves the overall test coverage by ~ 5% but also eliminates the over 
testing issue since there is no need to mn more LOC transition vectors to improve the 
coverage, or manually generate time consuming fiinctional pattems on top of transition 
pattems targeting millions of critical paths in today's multi-billion transistor SOCs. 
CONCLUSION 
This thesis has attempted to address the need for Capture-less Delay Testing technique 
(CDT) as a complementary technique on top of LOC transition fault model. In this project, 
our goal was to improve the test coverage of the LOC pattems along the longest critical 
paths, using DFT techniques, mainly CDT that incorporates analog circuitry into the early 
stages of the design, widening the area of coverage of traditional delay test pattems. 
It further emphasized on the area of test coverage that current conventional techniques could 
lack to resolve, and fail to ensure a feasible test coverage especially in sub-nanometer high 
density modem SOCs that are potentially defect-prone. We propose an algorithm that 
implements a fully automatable process and allows a test engineer to locate and insert CDT 
sensors. These sensors are inserted on specific flip flop endpoints that can be triggered by the 
LOC vector's transitions through the left non-covered nodes and serve to estimate, at the 
final stage, the total combinatorial propagation delay of an IC determining whether it meets 
or violates the maximum operating clock frequency. 
As discussed in this thesis, with nanometer technology, new types of manufacturing defects 
has emerged to the surface mainly due to the decrease in feature size and probable critical 
lack of lithography mask manufacturing precision. The high tech industry is mnning 
nowadays into unanticipated problems with clock speeds due to on-chip variations with the 
evolving huge die size, high current leakage and elevated power consumption. New types of 
faults are emerging that can be caused by process variations, might alter the desired 
fiinctional speed on a given ASIC, triggering the necessity of what is called delay testing. 
These delay defects are not covered by the traditional stuck-at fault model, which is timing 
insensitive. Delay testing techniques are a must in today ASIC manufacturing process to 
insure fair test coverage that protects the end user from potentially defective and low 
performance SOCs. 
57 
The scan-based LOC approach is the dominant delay testing technique, as it eases the scan 
insertion and design and as it minimizes the area/performance overhead penalty, at the 
expense of extra test pattems and ATPG CPU time. CDT is an at-speed testing technique. It 
captures transitions at the input pin of a CDT Scan Flip Flop during LOC shift mode, and 
allows the test engineer to measure the delay differences between data paths and clock 
network, hence giving an accurate estimation of the highest frequency the circuit under test 
can operate on. It doesn't require any post layout information and it can be inserted during 
early stages of DFT / ATPG flow. 
In this project, we conducted a study that seeks to understand the delay distribution of ATPG 
undetected faults and observe where they lie in the timing domain of each related path. We 
chose and followed a specific ATPG process as previously discussed in chapter 5. For 
simplicity we used a single clock domain design, especially that transition fault model 
requires one clock domain in each scan chain to be exercised and observed one at a time. A 
delay distribution of the undetected faults paths using Launch-On-Capture (LOC) transition 
fault model is evaluated. A set of tools using Perl scripting was implemented at each stage of 
the flow to extract and process data as we proceed from RTL synthesized netlist to 
determining the path delay distribution of all nodes that are covered by LOS but non-covered 
by LOC. The purpose of our non-covered node selection is to ensure that we are only 
choosing the functional paths and are capable of propagating a transition along any given 
path. 
We found through simulation results that on average, the paths passing through the non-
covered faults are longer than the ones passing through the covered faults. The difference 
between the two types of paths exceeds 20%) of clock period when considering the shorter 
path delay values. 
We have further implemented a fully automatable algorithm that allows the test engineer 
during the eariy stages of the ATPG process to seek those non-covered nodes, place CDT 
sensors on related flip flop endpoints further increasing the area of test coverage, as well as 
58 
the ability to minimize on need the sensors count. Our results have showed that when we 
apply CDT on top LOC transition pattems, we can achieve higher test coverage results with 
an estimated 5% increase. Our CDT sensor optimization algorithm also produces equally 
high test coverage with minimal loss on average of 1.57% with an outstanding reduction of 
sensor count by around 87%. 
ANNEX I 
LOGIC BUILT-IN SEL F TEST BIS T 
BIST architectur e 
A well known logic BIST architecture using a single clock based on STUMPS technique can 
be shown in figure A-I-1. The core logic, which is the combinational and sequential main 
functional logic of the chip, is the circuit under test. It is surrounded by the BIST components 
that include test pattem generation block using Linear Feedback Shift Register LFSR, a 
phase shifter circuit, the output response analysis block composed of multiple input signature 
register MISR, and a space compactor. Two counters are used; one is the test pattem counter 
and the shift counter or bit counter that keeps track of how many cycles are needed to fill in 
the scan chains. A test control block controls all test points in the design, and the BIST 
control block that depicts the BIST diagnosis steps. In some cases a normal ATPG mode 
might be used on the same chip requires a multiplexing scheme between the BIST phase 
shifter and scan inputs to convert some or all BIST controlled scan chains in deeper ATPG 
mode scan chains. 
Figure-A I- l Built-i n sel f test architecture . 
60 
Logic BIST architecture 
The BIST doesn't require any primary inputs or outputs to the extemal world, unless it is 
used as stand-alone mode logic BIST. Through Boundary scan TAP controller with few test 
control pattem the BIST can be initiated. In order to minimize the time of the device under 
test, the core logic can be divided into many shallow scan chains. During test shift time, new 
test pattems generated by LFSR are loaded into the scan chains while simultaneously 
unloading and checking the previous pattems in the MISR block. Whenever all pattems are 
fully loaded into the scan chains, all scan flops and test points are put in normal system mode 
for one at speed clock cycle allowing capturing the circuit's under test response. As shown in 
figure A-I-2, once all test pattems are applied, the output of MISR signature that depicts a 
BIST failure or success can be captured and scanned out using TAP controller output. 
BIST Start 
Te.-^ t C^oiitiollei 
Crenel atorl 
BIST Done 
a tnre^ tP '*^"^F«4 
System Input 
Sy.stem Outputs 
Figure-A 1-2 Logic BIST functional architecture . 
61 
Logic BIST design flow 
BIST timing should be analyzed using Static Timing Analyzers tools, and treating it the same 
as any functional logic within the ASIC is a must, in the whole ASIC flow from synthesis, 
formal verification (equivalence checking), timing, layout, place and route and back 
annotation as seen in figure A-I-3. 
y es'tiiiiati| r 
; timing ani 
LBIST controller, LFSR 
L d e s a - i p t i ^ ^ . MISR,Phase shifter. 
Compactor 
•t (co)sJmulati<a 
> sr \ 'Utl l^^^^Constramts 
jf__ Gate level netlist 
3 can testable Gate le'«l neilst 
pojttt generat 
teeig«t«l'-a»^ 
Figure-A 1-3 Logic BIST design flow. 
62 
ANNEX I I 
TYPES OF FAULT CLASSE S 
FastScan categorizes faults into fault classes, based on how the faults were detected or why 
they could not be detected. Each fault class has a unique name and two character class code. 
The following, presents some excerpts of the [Mentor Graphics ATPG Guide] showing 
different types of identified Fault Classes. 
UNTESTABLE 
Untestable (UT) faults are faults for which no pattem can exist to either detect or possibly 
detect them. Untestable faults cannot cause fianctional failures, so the tools exclude them 
when calculating test coverage. 
Unused (UU) 
The unused fault class includes all faults on circuitry unconnected to any circuit observation 
point as shown in figure A-II-1. 
D 
CLH^, 
Master 
Latch 
Q 
fflL 
s-a-1/s-a-O ^^ 
Figure-A II- l Exampl e o f unused fault . 
63 
Tied (TI) 
The tied fault class includes faults on gates where the point of the fault is tied to a value 
identical to the fault stuck value. 
The tied circuitry could be due to: 
• Tied signals 
• AND and OR gates with complementary inputs 
• Exclusive-OR gates with common inputs 
• Line holds due to primary input pins held at a constant logic value during test by CTO or 
CTl pin constraints you applied with the the FastScan or FlexTest. 
Because tied values propagate, in figure A-II-2 below, the tied circuitry at A causes the tied 
faults at A, B, C, and D. 
s-a-O 
Ar-^ B  C , \ D • ^ 
^ , j « . 
7 GN D 
Figure-A II-2 Example of tied fault. 
Blocked (BL) 
The blocked fault class includes faults on circuitry for which tied logic blocks all paths to an 
observable point. 
The tied circuitry could be due to: 
• Tied signals 
• AND and OR gates with complementary inputs 
• Exclusive-OR gates with common inputs 
64 
Tied faults and blocked faults can be equivalent faults. Figure A-II-3 shows the site of a 
blocked fault. 
Figure-A II-3 Example of blocked fault. 
Redundant (RE) 
The redundant fault class includes faults the test generator considers undetectable. After the 
test pattem generator exhausts all pattems, it performs a special analysis to verify GND. 
The fault is undetectable under any conditions. Figure A-II-4 shows the site of a redundant 
fault. 
Figure-A II-4 Example of redundant fault. 
In this circuit, signal G always has the value of 1, no matter what the values of A, B, and C. 
If D is stuck at 1, this fault is undetectable because the value of G can never change, 
regardless of the value at D. 
65 
TESTABLE 
Testable (TE) faults are all those faults that cannot be proven untestable. The testable fault 
classes include: 
Detected (DT ) 
The detected fault class includes all faults that the ATPG process identifies as detected. 
The detected fault class contains two subclasses: 
• Detsimulation (DS) - faults detected when the tool performs fault simulation. 
• Detimplication (DI) - faults detected when the tool performs learning analysis. 
The detimplication subclass normally includes faults in the scan path circuitry, as well as 
faults that propagate ungated to the shift clock input of scan cells. The scan chain functional 
test, which detects a binary difference at an observation point guarantees detection of these 
faults. 
Posdet (PD ) 
The posdet, or possible-detected, fault class includes all faults that fault simulation identifies 
as possible-detected but not hard detected. A possible-detected fault results from a 0-X or 1-
X difference at an observation point. The posdet class contains two subclasses: 
posdet_testable (PT) - potentially detectable posdet faults. PT faults result when the tool 
cannot prove the 0-X or 1-X difference is the only possible outcome. A higher abort limit 
may reduce the number of these faults, 
posdetuntestable (PU) - proven ATPGuntestable and hard undetectable posdet faults. 
66 
Uninitialized (UI ) 
The uninitialized fault class includes faults for which the test generator is unable to: 
• find an initialization pattem that creates the opposite value of the faulty value at the fault 
pin. 
• prove the fault is tied. 
In sequential circuits, these faults indicate that the tool cannot initialize portions of the 
circuit. 
ATPG untestabl e (AU ) 
The ATPG_untestable fault class includes all faults for which the test generator is unable to 
find a pattem to create a test, and yet cannot prove the fault redundant. Testable faults 
become ATPGuntestable faults because of constraints, or limitations, placed on the ATPG 
tool (such as a pin constraint or an insufficient sequential depth). These faults may be 
possible-detectable, or detectable, if you remove some constraint, or change some limitation, 
on the test generator (such as removing a pin constraint or changing the sequential depth). 
Undetected (UD) 
The undetected fault class includes undetected faults that cannot be proven untestable or 
ATPGuntestable. The undetected class contains two subclasses: 
• uncontrolled (UC) - undetected faults, which during pattem simulation, never achieve the 
value at the point of the fault required for fault detection—that is, they are uncontrollable. 
• unobserved (UO) - faults whose effects do not propagate to an observable point. 
All testable faults prior to ATPG are put in the UC category. Faults that remain UC or UO 
after ATPG are aborted, which means that a higher abort limit may reduce the number of UC 
or UO faults. Uncontrolled and unobserved faults can be equivalent faults. If a fault is both 
uncontrolled and unobserved, it is categorized as UC. 
ANNEX II I 
PATH DELAY DISTRIBUTIO N PER L SCRIPT S 
############Dofiles used to automate the process##################### 
## Author : Roger EL-KAFROUNI ## 
## beginning : 17 June 2008. ## 
## modified : 30 October 2008. ## 
## revised : 18 Febmary 2009. ## 
#######################LOS pattem Generation##################### 
//fastscan -verilog Top_ELABMEM_syn_fs.v -lib atpglib_atsisanl8.3.0.atpg -dofile 
LOSbatch.dofile 
//fastscan -verilog top_ELABMEM.v -lib atpglib_atsisanl8.3.0.atpg -dofile 
LOS_batch.dofile 
set system mode setup 
add clocks 0 /t_CLOCK /t_RESET 
add scan groups grpl 
/users/kafrouni/TDF/SCANPERL/FINALWORK/Tweaks/Top_ELABMEM_syn_fs.testproc 
add scan chains chain 1 grpl /scan_inl /scan_outl 
set output masks on 
set transition holdpi on 
// Create Launch off last shift transition fault pattems 
set system mode atpg 
set fault type transition 
set pattem type -sequential 0 
add faults -all 
set Abort Limit 100 
create pattems 
write faults LOS 1 OOFaultsRPT -replace -class FULL 
save pattems LOSlOOPatemsRPT.ascii -ascii -parallel -replace 
report aborted faults all >! 
/users/kafrouni/TDF/SCANPERL/FINALWORK/Tweaks/LOS 1 OOAbortFaultsRPT 
###########################LOC Pattem Generation######################### 
//fastscan -verilog Top_ELABMEM_syn_fs.v -lib atpglib_atsisanl8.3.0.atpg -dofile 
LOCbatch.dofile 
//fastscan -verilog top_ELABMEM.v -lib atpglib_atsisanl8.3.0.atpg -dofile 
LOCbatch.dofile 
set system mode setup 
add clocks 0 /t CLOCK /t RESET 
68 
add scan groups grpl 
/users/kafrouni/TDF/SCANPERL/FINALWORK/Tweaks/Top_ELABMEM_syn_fs.testproc 
add scan chains chain 1 grpl /scan_inl /scanoutl 
add pin constraint scan en 1 cO 
set output masks on 
set fransition holdpi on 
// Create broadside transition fault pattems 
set system mode atpg 
set fault type transition 
set pattem type -sequential 2 
add faults -all 
set fault type transition -noshiftlaunch 
set Abort Limit 100 
create pattems 
write faults LOCI OOFaultsRPT -replace -class FULL 
save pattems LOClOOPatemsRPT.ascn -ascii -parallel -replace 
report aborted faults all >! 
/users/kafrouni/TDF/SCANPERL/FINALWORK/Tweaks/LOC 1 OOAbortFaultsRPT 
#PrimeTime SCRIPTS Used to determine the post layout static timing analysis Process. 
#This script is an example of the main script, where I am not including all the paths used 
#in the process. 
###Roger EL-KAFROUNI, Masters degree candidate, ETS 2008##### 
#PrimeTime script 
#Author: Roger EL-KAFROUNI, ETS, Lacime 
#Junel7 2008 
#Start of scripts: 
echo "mnning Report Timing script" 
set scenario slow 
#set scenario typical 
#—— set these for setup 
set type setup 
set fhame elabmem.$scenario.$type.timing 
set dtype max 
# 
# set these for hold -
#set type hold 
#set fiiame elabmem.$scenario.$type.timing 
69 
#set dtype min 
# 
source .s)mopsys_pt.setup 
echo "read in 3 essential files: netlist, constraints and spef file + link the design" 
read_verilog top_ELABMEM.v 
link_design top_ELABMEM 
echo "reading spef files..." 
read_parasitics -quiet -increment topELABMEM.spef 
echo " create a virtual clock on the output of the clock pad" 
create_clock -name CLK -period 10 -waveform [list 0 5] [getjpins clk_pad/C] 
#create_clock -name RST -period 10 -waveform [list 0 5] [get_pins res_pad/C] 
#create_clock -name CLK -period 10 -waveform [list 0 5] [get_pins s_CLOCK/C] 
#create_clock -name CLK -period 10 -waveform [list 0 5] [get_pins clk_pad/PAD] 
set_dont_touchnetwork [get_clocks CLK] 
set_propagated_clock [get_clocks CLK] 
#set_dont_touch_network [get_clocks RST] 
#set_propagated_clock [get_clocks RST] 
echo "Propagating the transition along the paths" 
#echo "set case analysis to enable caputre mode" 
set_case_analysis 0 [get_ports scan_enl] 
#echo "set case analysis to enable shift mode" 
#set_case_analysis 1 [get_ports scan_enl] 
#=============A sample from a nodes Gate report======================== 
echo "report timing on all non-covered nodes from output of Inflops to input of outflops" 
echo "Non-covered Node U328/Y " » $fname 
reportjiming -from link_ELABMEM_MAX_reg_8/Q -through U328/Y -to 
link_ELABMEM_MAX_reg_8/D -nets -nosplit -delay_type max » $fname 
echo "Non-covered Node U328/Y " » Sfiiame 
reportjiming -from link_ELABMEM_MAX_reg_8/Q -through U328/Y -to 
link_ELABMEM_MAX_reg_7/D -nets -nosplit -delay_type max » $fhame 
echo "Non-covered Node U328/Y " » $fiiame 
reportjiming -from link_ELABMEM_MAXjeg_8/Q -through U328/Y -to 
liiik_ELABMEM_MAX_reg_5/D -nets -nosplit -delay type max » $fhame 
70 
echo "Non-covered Node U328/Y " » $fname 
reportjiming -from link ELABMEM_MAX_reg_8/Q -through U328/Y -to 
link_ELABMEM_MAX_reg_6/D -nets -nosplit -delayJype max » Sfname 
echo "Non-covered Node U328/Y " » Sfiiame 
reportjiming -from link_ELABMEM_MAX_reg_8/Q -through U328/Y -to 
link_ELABMEM_MAX_reg_3/D -nets -nosplit -delayJype max » Sfname 
echo "Non-covered Node U328/Y " » $fname 
reportjiming -from link_ELABMEM_MAX_reg_8/Q -through U328/Y -to 
link_ELABMEM_MAXreg_l/D -nets -nosplit -delayJype max » Sfname 
echo "Non-covered Node U328/Y " » Sfname 
reportjiming -from link ELABMEM_MAX_reg_8/Q -through U328/Y -to 
link_ELABMEM_MAX_reg_2/D -nets -nosplit -delayJype max » Sfname 
echo "Non-covered Node U328/Y " » Sfname 
reportjiming -from link_ELABMEM_MAX_reg_8/Q -through U328/Y -to 
link_ELABMEM_MAX_reg_4/D -nets -nosplit -delayJype max » Sfname 
echo "Non-covered Node U328/Y " » Sfiiame 
reportjiming -from link_ELABMEM_MAX_reg_8/Q -through U328/Y -to 
link_ELABMEM_MAX_reg_0/D -nets -nosplit -delayJype max » Sfname 
echo "Non-covered Node link_ELABMEM_sub_80_U2_7/B " » Sfname 
reportjiming -from link_ELABMEM_MAX_reg_7/Q -through 
link_ELABMEM_sub_80_U2_7/B -to link_ELABMEM_MAX_reg_8/D -nets -nosplit -
delay J y p e max » Sfname 
echo "Non-covered Node link_ELABMEM_sub_80_U2_7/B " » Sfname 
reportjiming -from link_ELABMEM_MAX_reg_7/Q -through 
link_ELABMEM_sub 80_U2_7/B -to link_ELABMEM_MAX_reg_7/D -nets -nosplit -
delay J y p e max » Sfname 
echo "Non-covered Node link_ELABMEM_sub_80_U2_7/B " » Sfname 
reportjiming -from link_ELABMEM_MAX_reg_7/Q -through 
link_ELABMEM_sub_80_U2_7/B -to link_ELABMEM_MAX_reg_5/D -nets -nosplit • 
delay type max » Sfname 
echo "Non-covered Node link_ELABMEM_sub_80_U2_7/B " » Sfname 
reportjiming -from link_ELABMEM_MAX_rcg_7/Q -through 
link_ELABMEM_sub_80_U2_7/B -to link_ELABMEM_MAX_reg_6/D -nets -nosplit • 
delay type max » Sfname 
echo "Non-covered Node link_ELABMEMjub_80_U2_7/B " » Sfname 
reportjiming -from link_ELABMEM_MAX_reg_7/Q -through 
link_ELABMEM_sub_80_U2_7/B -to link_ELABMEM_MAX_reg_3/D -nets -nosplit • 
delay J y p e max » Sfname 
echo "Non-covered Node link_ELABMEM_sub_80_U2_7/B " » Sfname 
reportjiming -from link_ELABMEM_MAXjeg_7/Q -through 
link_ELABMEM_sub_80_U2_7/B -to link_ELABMEM_MAX_reg_l/D -nets -nosplit 
delay type max » Sfname 
echo "Non-covered Node link_ELABMEM_sub 80 U2 7/B " » Sfname 
71 
reportjiming -from link_ELABMEM_MAX_reg_7/Q -through 
link_ELABMEM_sub_80_U2 7/B -to link ELABMEM_MAX reg_2/D -nets -nosplit 
delay J y p e max » Sfname 
echo "Non-covered Node link ELABMEM_sub_80_U2_7/B " » Sfname 
reportjiming -from link_ELABMEM_MAX_reg_7/Q -through 
link_ELABMEM_sub_80_U2_7/B -to link_ELABMEM MAX_reg_4/D -nets -nosplit 
delay type max » Sfname 
echo "Non-covered Node link_ELABMEM_sub 80_U2_7/B " » Sfname 
reportjiming -from linkELABMEM MAX_reg_7/Q -through 
link_ELABMEM_sub_80_U2_7/B -to link_ELABMEM_MAX_reg_0/D -nets -nosplit 
delay J y p e max » Sfname 
#ETS 2008-2009, Roger EL-KAFROUNI 
# The results are LosFaults, LocFaults, and commonLosLoc faults 
# The non-covered faults by LOC but covered by LOS are found in 
# the file "LocFaults". From it we'll find the FF destinations 
# leading to finding the appropriate number of sensors needed to 
# cover all faults. 
#!/usr/bin/perl -w 
open ( I N F I L E L O S , "<./LOS-UO-FAULTS.rpt") or die; 
open (IN_FILE_LOC, "<./LOC-UO-FAULTS.rpt") or die; 
#open (IN_FILE_LOS, "<SARG[0]") or die("can't find los fault coverage file\n"); 
#open (IN_FILE_LOC, "<$ARG[1]") or die("can't find loc fault coverage file\n"); 
open (OUT_FILE_COM, ">commonLosLoc.log"); 
open (OUTFILELOS, ">LosFault.log"); 
open (OUT FILELOC, ">LocFault.log"); 
open (OUTFILEReportGatesFS, ">ReportGates_FS.log"); 
#open (OUT_FILE_TEST, ">TestLine.log"); 
(S)los_data = <IN_FILE_LOS>; 
@loc_data = <IN_FILE_LOC>; 
SFAULT_LOS = 0; 
SFAULT_LOC = 0; 
$FOUND_LOS_LOC_COM = 0; 
SFOUNDLOS = 0; 
SFOUNDLOC = 0; 
# Find LOS only faults and common faults 
SLineNum los = 0; 
72 
SLineNum loc = 0; 
foreach Sline (@los_data) 
{ 
#remove leading and trailing spaces 
#Sline =~ s/^\s*(\S*(?:\s+\S+)*)\s*S/Sl/; 
#remove trailing white space 
#$line =~ sAs+$//; 
#remove leading white sapce 
#$line =~ s/^\s+//; 
if(Sline=~/'^(.*)(.UO)(.*)$/) 
{ 
STargetNodeLOS = S2."\n"; 
STargetNodeLOS =~ s/^\s+//; 
chomp($TargetNodeLOS); 
# print OUT_FILE_TEST "STargetNodeLOS \n"; 
} 
SFAULT_LOS[SLineNumJos]=$TargetNodeLOS; 
foreach Sline ((gloc_data) 
{ 
if($line=~/^(.*)(.UO)(.*)S/) 
STargetNodeLOC = S2."\n"; 
STargetNodeLOC =~ s/'^\s+//; 
chomp(STargetNodeLOC); 
# print OUT_FILE_TEST "STargetNodeLOC \n"; 
} 
$F AULT_LOC [S LineNum Joe]=$TargetNodeLOC; 
if ($FAULT_LOS[SLineNum los] eq SFAULT LOC [SLineNum loc]) 
{ 
$FOUND_LOS_LOC_COM[$LineNum los] = $FAULT_LOS[SLineNumJos]; 
} 
elsif(SFAULT_LOS[SLineNum los] eq $FAULT_LOC[$LineNumJoc]) 
{ 
$FOUND_LOS [SLineNum J o s ] = SFAULT_LOS[SLineNumJos]; 
} #elsif 
SLineNum loc = SLineNumJos + 1; 
73 
}# foreach (loc) 
print OUT FILE_COM "SFOUNDLOSLOC COM[SLineNumJos]\n"; 
print OUT_FILE_LOS "$FOUND_LOS[SLineNumJos]\n"; 
SLineNum loc = 0; 
# LOC faults only 
SLineNumJos = 0; 
SLineNum J o e = 0; 
foreach Sline ((^loc_data) 
{ 
if($line=~/'^(.*)(.UO)(.*)$/) 
{ 
STargetNodeLOC = $3."\n"; 
chomp(STargetNodeLOC); 
} 
$FAULT_LOC[$LineNumJoc]=STargetNodeLOC; 
foreach Sline (@los_data) 
{ 
STargetNodeLOS = S3."\n"; 
STargetNodeLOS =~ s/'^\s+//; 
chomp(STargetNodeLOS); 
$FAULT_LOS [SLineNum Jos]=STargetNodeLOS; 
if ($FAULT_LOC[SLineNumJoc] ne SFAULT_LOS[$LineNumJos]) 
{ 
$FOUND_LOC[$LineNum loc] = $FAULT_LOC[SLineNumJoc]; 
} 
} #foreach (los) 
if(SLineNum l o O O ) { 
if ($FOUND_LOC[SLineNum loc] ne SFOUND LOC[(^LineNum loc - 1]){ 
print OUT_FILE_LOC "$FOUND_LOC[SLineNum loc]\n"; 
74 
print OUTFILE ReportGatesFS qq{echo "CMD> report gates -endpoints -
backward SFOUND LOC[SLineNumJoc]" » gates_report.rpt\n}; 
print OUTFILEReportGatesFS "report gates -endpoints -backward 
SFOUND_LOC[SLineNum loc] » gates_report.rpt\n"; 
print OUTFILEReportGatesFS qq{echo "CMD> report gates -endpoints • 
forward SFOUND_LOC[SLineNumJoc]" » gates_report.rpt\n}; 
print OUT FILE_ReportGates_FS "report gates -endpoints -forward 
$FOUND_LOC[SLineNumJoe] » gates_report.rpt\n"; 
SLineNumJoc = SLineNum J o e + 1; 
} else {SLineNumJoc = SLineNumJoc + 1; } 
} else { # first line in file 
print O U T F I L E L O C "SFOUND_LOC[SLineNum loc]\n"; 
print OUTFILEReportGatesFS qq{echo "CMD> report gates -endpoints • 
backward SFOUND_LOC[SLineNumJoc]" >! gates_report.rpt\n}; 
print OUTFILEReportGatesFS "report gates -endpoints -backward 
$FOUND_LOC[$LineNumJoc] » gates_report.rpt\n"; 
print OUTFILEReportGates FS qqjecho "CMD> report gates -endpoints 
forward SFOUND_LOC[$LineNumJoc]" » gates_report.rpt\n}; 
print OUT_FILE ReportGatesFS "report gates -endpoints -forward 
$FOUND_LOC[$LineNumJoc] » gates_report.rpt\n"; 
SLineNumJoc = SLineNumJoc + 1; 
SLineNumJos = 0; 
} 
} 
close I N F I L E L O S ; 
close I N F I L E L O C ; 
close O U T F I L E C O M ; 
close O U T F I L E L O S ; 
close OUT FILELOC; 
close OUTFILEReportGatesFS; 
#ETS 2009, Roger EL-KAFROUNI 
#The results are, the arrival times of the paths. I am rendering the timing file that I generated 
#using sysnopsys PrimeTime, in order to process the data, scan through and extrapolate the 
#needed information. 
75 
#!/usr/binyperl -w 
open (IN_FILE_PathDlysData, "<./elabmem.slow.setup.fiming") or die; 
open (OUTFILETEST, ">ArrivalTime.rpt"); 
@node_data = <IN_FILE_PathDlysData>; 
Sflag = 0; 
foreach Sline (@node data) 
{ 
if ((Sline =~ /Non-covered Node/) && (Sflag eq 1)) 
{ 
chomp(Sline); 
print OUT_FILE TEST "Sline: "; 
#SuncovNode = S2."\n"; 
Sflag = 1 ; 
#print OUT_FILE_TEST "SuncovNode: "; 
} 
# get slack data 
#if ((Sline =~ /^( *)(.slack)(.*)S/) && (Sflag eq 0)) 
if ((Sline =~ /data arrival time/) && (Sflag eq 1)) 
{ 
s/data arrival time//g foreach (Sline); 
Sline =~ s/^\s+//; 
chomp(Sline); 
print O U T F I L E T E S T "Sline ns\n"; 
Sflag = 0; 
} 
}#foreach 
close INFILEPathDlysData; 
close OUT FILE TEST; 
#ETS 2009, Roger EL-KAFROUNI 
#The results are, MinPathDelay.rpt, MinPathDelayNode.rpt, and 
#MinPathDelay_NodeValue.rpt. 
#!/usr/bin/perl -w 
open (INFILL Jestdelay, "<./ArrivalTime.rpt") or die; 
open (OUTFILEMIN, ">MinPathDelay.rpt"); 
76 
open (OUT_FILE_MINl, ">MinPathDelay_Node.rpt"); 
open (OUT_FILE_MIN2, ">MinPathDelay_NodeV.rpt"); 
@nodedata = <rN_FILEjestdelay>; 
Sflag = 0; 
$line_count = 0; 
foreach Sline (@node_data) 
{ 
#find Node 
if ((Sline =~ /Non-covered Node/) && (Sflag eq 1)) 
{ 
SNode = Sline; 
s/Non-covered Node //g foreach (SNode); 
sA :\ [0-9]+/ /g foreach(SNode); 
SNode =~ s/\s+$//; 
chomp(SNode); 
SFound_Node = SNode; 
Sflag = 0; 
if ($line_count eq 0) { SFound_Node_prev = SFound_Node; } 
#print OUT_FILE_MIN "SFound_Node \n"; 
} 
# find Node Value 
if ((Sline =~ /Non-covered Node/) && (Sflag eq 2)) 
{ 
SNodeValue = Sline; 
s/Non-covered Node //g foreach (SNodeValue); 
s/SFound_Node \://g foreach (SNodeValue); 
#remove leading white sapce 
SNodeValue =~ s/'^\s+//; 
$Found_NodeValue = SNodeValue; 
Sflag = 2; 
if (Sline_count eq 0) { $Found_NodeValue_prev = SFound_NodeValue; } 
$line_count = Slinecount + 1; 
#print OUT_FILE_MIN "SFoundNodeValue \n"; 
} 
if ((SFound_Node eq $Found_Node_prev) && (Sflag eq 1)) 
{ 
if ((SFoundNodeValue eq $Found_NodeValue_prev) or (SFoundNodeValue > 
$Found_NodeValue_prev)) 
77 
{ 
Sflag = 0; 
} 
elsif ($Found_NodeValue < SFound_NodeValue_prev) 
SFoundNodejprev = $Found_Node; 
$Found_NodeValue_prev = SFound_NodeValue; 
} 
elsif ((SFoundNode ne SFound_Node_prev) && (Sflag eq 2)) 
{ 
chomp(SFound_Node_prev); 
chomp(SFound_NodeValue_prev); 
print OUT_FILE_MIN "SFound_Node_prev: $Found_NodeValue_prev ns\n"; 
print OUT_FILE_MINl "SFound_Nodej)rev\n"; 
print 0UT_FILE_MIN2 "SFound_NodeValue_prev\n"; 
Sflag = 0; 
SFound_NodeValue_prev = $Found_NodeValue; 
} 
}#foreach 
close IN_FILEjestdelay; 
close OUT_FILE_MIN; 
close 0UT_FILE_MIN1; 
close OUT_FILE_MIN2; 
#ETS 2008, Roger EL-KAFROUNI 
#The results are, MinDlyDistUncov.rpt, MinPathDlyData.rpt, and #MinPathDlyFreq.rpt. 
#This script determines the Paths delays and the equivalent frequency of occurrences. 
#!/usr/bin/perl -w 
#output file: MinNodeOcc.rpt, MinOccurence.rpt, MinNodeValue.rpt 
#perl find_min_Occ.pl 
open (IN_FILE_testdelay, "<./MinPathDelay_NodeV.rpt") or die; 
open (OUTFILEMIN, ">MinDlyDistUncov.rpt"); 
open (0UT_FILE_MIN1, ">MinPathDlyData.rpt"); 
open (0UT_FILE_MIN2, ">MinPathDlyFreq.rpt"); 
@node_data = <IN_FILEjestdelay>; 
$ counter = 0; 
78 
Sindex = 0; 
SLoadedValues=0; 
SValue=0; 
SData=0; 
print OUT FILEMIN "Value \t OccurenceVn"; 
print OUT FILE MIN "\****** * * * * * * * * * \ j ^ " . 
foreach Sline (@node_data) 
{ 
SData=$line; 
chomp(SData); 
SLoadedValues[$index] = SData; 
#print OUT_FILE_MnM "$LoadedValues[$index]\n"; 
Sindex = ++Sindex; 
}#foreach 
foreach Sline ((^node_data) 
{ 
SValue=Sline; 
chomp(S Value); 
for(Si=l; Si<=$#LoadedValues; $i++) 
{ 
if(SValue ne $LoadedValues[$i]) 
{ 
$ counter = $ counter + 1; 
}#if 
}#for 
chomp(Sline); 
chomp(Scounter); 
#print OUT_FILE_MIN "Value \ OccurenceVn"; 
if (Scounter ne I) 
{ 
print OUT_FILE_MIN "Sline \ Scounter\n"; 
print OUT_FILE_MINl "Sline\n"; 
print OUT_FILE_MIN2 "$counter\n"; 
} 
}#foreach 
close IN_FILE_testdelay; 
close OUT_FILE_MIN; 
ANNEX I V 
CDT SENSOR PLACEMEN T AN D OPTIMIZATION PER L SCRIPT S 
### Author : Roger EL-KAFROUNI ### 
### beginning : 22 October 2008. ### 
### modified : 25 Febmary 2009. ### 
### revised : 18 January 2010. ### 
rmduplicate.pl 
##We filter out the lines from repetitive lines in the Destination file prior to creating the new 
##structure where we show each destination ff with its related non-covered nodes. This list 
##corresponds at the maximum coverage list of ff destinations at which we are going to input 
##sensors. 
open (IN_FILE_NodeData, "<./DESTINATION.rpt") or die; 
open (OUTFILERMD, ">RMDUP.rpt"); 
@node_data = <IN_FILE_NodeData>; 
foreach Sline (@node_data) 
{ 
if (Sline =~ /'^(Destination:)(.*)$/) 
{ 
SDestPoint = S2."\n"; 
SDestPoint =~ s/V\D//; 
SDestPoint =~ s/^\s*(\S*(?:\s+\S+)*)\s*S/Sl/; 
@a[Smyline] = SDestPoint; 
Smyline = Smyline + 1; 
} 
} 
close n^_FILE_NodeData; 
sub remove_duplicates(\@) 
{ 
80 
my $ar = shift; 
my %seen; 
for ( my Si = 0; $i <= S#{$ar} ; ) 
splice (^Sar, —Si, 1 
ifSseen{$ar->[Si++]}++; 
} 
} 
remove_duplicates( @a); 
for ($i=0; Si<= S#a; Si++) 
{ 
pnnt OUT_FILE_RMD "$a[Si]\n"; 
} 
close O U T F I L E R M D ; 
DestFFallNodes.pl 
#Create the stmcture that shows for each destination ff its relating non-covered nodes. 
#!/usr/bin/perl -w 
open (IN_FILE_FFDest, "<./RMDUP.rpt") or die; 
open (IN_FILE_FFDestNode, "<./DESTINATION.rpt") or die; 
open (OUT_FILE_DESTINATION, ">ListFFDest.rpt"); 
@node_FFDest = <IN_FILE_FFDest>; 
@node_FFDestNode = <IN_FILE_FFDestNode>; 
SDestPoint = 0; 
SNodePoint = 0; 
$flag=0; 
$DestFF=0; 
81 
Scounter = 0; 
foreach Sline (@node_FFDest) 
{ 
SDestFF=Sline; 
SDestFF =~ s/^\s*(\S*(?:\s+\S+)*)\s*$/Sl/; 
Scounter = 0; 
foreach Sline (@node_FFDestNode) 
{ 
if (Sline =~ /^(Destination:)(.*)S/) 
{ 
SDestPoint = $2."\n"; 
SDestPoint =~ sAAD//; 
SDestPoint =~ s/^\s*(\S*(?:\s+\S+)*)\s*S/Sl/; 
if (SDestFF eq SDestPoint) 
{ 
Sflag=l; 
} 
}#if 
if ((Sline =~ /'^(Node:)(.*)S/) && (Sflag eq 1)) 
{ 
SNodePoint = S2."\n"; 
SNodePoint =~ s/^\s*(\S*(?:\s+\S+)*)\s*S/Sl/; 
print OUT_FILE_DESTINATION "Destination: SDestPoint Node: SNodePoint\n"; 
Scounter = Scounter + 1; 
$flag=0; 
} 
} 
82 
} 
close IN FILE_FFDest; 
close IN_FILE_FFDestNode; 
close OUT_FILE_DESTINATION; 
RMDUPLICATE2.pl 
#We filter out the lines from repetitive lines in the Destination file prior to creating the new 
#stmcture where we show each destination ff with its related non-covered nodes. This list 
#corresponds at the maximum coverage list of ff destinations at which we are going to input 
#sensors. 
open (IN_FILE_NodeData, "<./ListFFDest.rpt") or die; 
open (OUT FILERMD, ">RMDUP2.rpt"); 
(gnodedata = <IN_FILE_NodeData>; 
foreach Sline ((a^node_data) 
{ 
if (Sline =~ /^(Destination:)(.*)$/) 
{ 
SDestPoint = $2."\n"; 
SDestPoint =~ sA/\D//; 
SDestPoint =~ s/^\s*(\S*(?:\s+\S+)*)\s*S/Sl/; 
(ga[Smyline] = SDestPoint; 
Smyline = Smyline + 1; 
} 
} 
close IN_FILE_NodeData; 
sub remove_duplicates(\(§) 
83 
{ 
my Sar = shift; 
my %seen; 
for ( my Si = 0; Si <= S#{Sar} ; ) 
{ 
splice @Sar, —Si, 1 
ifSseen{Sar->[Si++]}++; 
} 
} 
#my @a = qw( a a b c c c d e f e f e a f g h h h ) ; 
remove_duplicates( @a ); 
for (Si=0; Si<= S#a; Si++) 
f 
I 
print OUT_FILE_RMD "Sa[Si]\n"; 
} 
close O U T F I L E R M D ; 
FF2NodeCount.pl 
#Here we start STEP 1 of the iteration. 
#we are generating the report of node count for each destination FF. 
#!/usr/bin/perl -w 
#open (IN_FILE_FFDest, "<./RMDUPEX2.rpt") or die; 
#open (OUTFILEDESTINATION, ">FF2NodeCnt.rpt"); 
open (IN_FILE_FFDest, "<./ExcludedListl.rpt") or die; 
open (OUT_FILE_DESTINATION, ">FF2NodeCnt2.rpt"); 
@node_FFDest = <IN_FILE_FFDest>; 
84 
SDestPoint = 0; 
Sflag=0; 
Scounter = 1; 
SSavedDestPoint = 0; 
foreach Sline (@node_FFDest) 
{ 
if ((Sline =~ /^(Destination:)(\ )(.*)(\ )(Node:)(\ )(.*)S/) && (Sflag eq 0)) 
{ 
SDestPoint = S3. "V; 
SDestPoint =~ s/^\s*(\S*(?:\s+\S+)*)\s*S/$l/; 
SSavedDestPoint = SDestPoint; 
Sflag =1 ; 
} 
if ((Sline =~ /^(Destination:)(\ )(.*)(\ )(Node:)(\ )(.*)$/) && (Sflag eq 1)) 
{ 
SDestPoint = S3."\n"; 
SDestPoint =~ s/^\s*(\S*(?:\s+\S+)*)\s*$/$l/; 
if (SDestPoint eq SSavedDestPoint) 
{ 
Scounter = Scounter + 1; 
} 
else { 
print OUTFILEDESTINATION "Destination: SSavedDestPoint \t $counter\n"; 
Scounter = 1; 
SSavedDestPoint = SDestPoint; 
} 
}#if 
}# foreach 
85 
close INFILL FFDest; 
close OUT_FILE_DESTINATION; 
SamplelNodes.pl 
#Here we start STEP 2 of the iteration. 
#This script takes the sample data with ff destination and nodes and output the nodes only for 
#further use to eliminate any same instances of non-covered nodes from the big list further 
#reducing the number of needed Destination FFs. 
#!/usr/bin/perl -w 
open (IN_FILE_FFDestSmpll, "<./samplel.rpt") or die; 
open (OUT FILE_RMD, ">NodeSamplel.rpt"); 
@node FFDestSample = <rN_FILE_FFDestSmpll>; 
SSampleNodePoint = 0; 
Spointer = 0; 
foreach Sline (@node_FFDestSample) 
{ 
if (Sline =~ /^(Destination:)(\ )(.*)(\ )(Node:)(\ )(.*)S/) 
{ 
SSampleNodePoint = S7."\n"; 
SSampleNodePoint =~ s/^\s*(\S*(?:\s+\S+)*)\s*$/$l/; 
$SampleNodes[Spointer] = SSampleNodePoint; 
Spointer = Spointer + 1; 
} 
} 
86 
for ($i=0; $i<=S#SampleNodes; $i++) 
{ 
pnnt OUT_FILE_RMD "SSampleNodes[Si]\n"; 
} 
close INFILEFFDestSmpl 1; 
close O U T F I L E R M D ; 
PreIteration.pl 
#Here we start STEP 3 of the iteration. 
#This script scans through the file and removes all the FFs whose nodes are similar to that of 
#the highest FF destination nodes. 
#!/usr/bin/perl -w 
#use strict; 
#use wamings; 
open (IN_FILE_FFDestSmpl, "<./NodeSamplel.rpt") or die; 
open ( INFILEFFDest , "<./RMDUP2.rpt") or die; 
open (OUTFILEDESTINATION, ">ExcludedListl.rpt"); 
@node_FFDestSmpl = <IN_FILE_FFDestSmpl>; 
(gnodcFFDest = <IN_FILE_FFDest>; 
$SampleNodePoint=0; 
SSampledLine=0; 
$NodePoint=0; 
#Scount=0; 
#SSavedLines[$count]=0; 
$flag=0; 
87 
Sline_cntr = 0; 
$StoreLine[Sline_cntr] = 0; 
SSampleLine = 0; 
foreach Sline (@node_FFDestSmpl) 
{ 
if(Sline=~/'^(.*)S/) 
{ 
SSampleNodePoint = Sl."\n"; 
SSampleNodePoint =~ s/''\s*(\S*(?:\s+\S+)*)\s*S/Sl/; 
} 
#We are reading the file and storing the unmatched first iteration 
#lines in an array. 
if(Sflag eq 0) 
{ 
foreach Sline (@node_FFDest) 
{ 
#Store the line first than see if you want to keep it. 
if (Sline =~ /^(Destination:)(\ )(.*)(\ )(Node:)(\ )(.*)$/) 
{ 
SSampledLine = Sline; 
SNodePoint = $7."\n"; 
SNodePoint =~ s/^\s*(\S*(?:\s+\S+)*)\s*S/$l/; 
SSampledLine =~ s/^\s*(\S*(?:\s+\S+)*)\s*S/Sl/; 
$StoreLine[$line_cntr] = SSampledLine; 
#print OUT_FILE_DESTINATION "$SampledLine\n"; 
}#if 
if (SSampleNodePoint ne SNodePoint) 
{ 
88 
#Update the line_cntr to store the current line. Otherwise, 
#next line will overwrite the current one in the array 
Sline_cntr = Sline_cntr + 1; 
} 
}# foreach OF nodeFFDest 
Sflag = 1 ; 
} 
if(SflagneO) 
{ 
for (Sindex = 0; Sindex <= S#StoreLine; Sindex++) 
{ 
SSampleLine = SStoreLine[$index]; 
if (SSampleLine =~ /'"(Destination:)(\ )(.*)(\ )(Node:)(\ )(.*)S/) 
{ 
SNodePoint = S7."\n"; 
SNodePoint =~ s/^\s*(\S*(?:\s+\S+)*)\s*$/$l/; 
#Update counter to delete the current line. Otherwise, leave it in the array, 
if (SSampleNodePoint eq SNodePoint) 
{ 
$StoreLine[Sindex] =""; 
} 
} 
}# for loop 
}# flag not equal to 0 
}#foreach Sline (@node_FFDestSmpl) 
89 
for (Si=0; $i<= S#StoreLine; Si++) 
{ 
if(SStoreLine[$i]ne"") 
{ 
print OUT_FILE_DESTINATION "$StoreLine[Si]\n"; 
} 
} 
close INFILEFFDest ; 
close INFILEFFDestSmpl; 
close OUT_FILE_DESTINATION; 
BIBLIOGRAPHY 
Ahmed N., Tehranipoor M., Jayaram V. 2006. "Timing-based delay test for screening 
small delay defects". Design automation conference, 43rd ACM/IEEE, pp. 320-325. 
Bareisa E., Jusas V., Motiejunas K., Seinauskas R. 2008. "Functional delay clock fault 
models". 124X, Information technology and control. Volume 37, No. 1. 
Benayahu N., A. Chechik, Ron Press. 2007. "Launch-off-shift At-Speed Test". Test and 
Measurement World, pp.37-42. 
Chadrakasan A., Robert W. B. 2000. Design of High-Performance Micro-processor Circuits. 
Wiley-IEEE press, pp. 98-116. 
Crouch A., G. Eide. 2006. Advances In Electronic Testing: Challenges and Methodologies: 
DFT-Oriented, Low-Cost Testers, edited by Dimitris Gizopoulos, Springer 2006, 
page 179. 
Davidson S., Helen D. 2007. "The phsychology of electronic tesf'. IEEE design and test of 
computers, vol. 24, n°5, p.495-501. 
Datta R., Sebastine A., Gupta R., J. Townsend W., A. Abraham J. 2004. "Test and debug in 
deep-submicron technologies". ACAS 2004, Computer Eng. Research Center, 
University of Texas at Austin. 
Groeneveld P. R. 2002, "Physical Design Challenges for Billion Transistor Chips". IEEE 
Intemational Conference on Computer Design: VLSI in Computers and Processors, 
pp.78-83, 2002. 
Hawkings C , Keshavarzi A., Segura J. 2003. "A view from the bottom: Nanometer 
technology AC parametric failures - Why, where, and how to detect. IEEE 
intemational symposium on defect and fault tolerence in VLSI systems, pp. 267-276. 
Hetherington G., Fryars T., Tamarapalli N., Kassab M., Hassan A., Rajski J. 1999. "Logic 
BIST for large industrial designs: real issues and case studies". Test conference 
proceedings intemational, vol. issue, pp. 358-367, 1999. 
ITC Benchmark. 2010. The Intemational Test Conference, ITC99 Benchmark Home Page. 
On line, <www.cerc.utexas.edu/itc99-benchmarks/bench.html>. Consulted October 
2008. 
Kaufman M. 2008. "System-on-Chip Test Architectures: Nanometer Design For Testability", 
edited by Laung-Temg Wang, Charles E. Stroud, Nur A. Touba, Elsevier Inc. 2008. 
91 
Kim et al.K. S. 2003. "Delay Defect Characteristics and Testing Strategies," IEEE Design & 
Test, vol. 20, no.5, Sept. 2003, pp. 8-16. 
Krsfic A., K. Cheng, J. Liou, M. Abadir. 2003. "Delay Defect Diagnosis Based Upon 
Statistical Timing Models - The First Step". Computers and Digital Techniques, IEEE 
Proceedings, Stevenage, United Kingdom, pp. 346-54. 
Liou J.-J., Wang, L.-C, Cheng K.-T., Dworak J., Mercer M.R., Kapur R.,Williams T.W. 
2002. "Analysis of delay test effectiveness with a multiple-clock scheme". ITC 
intemational test conference. (Dec. 10 2002), pp. 407-416. 
Lin X., Kun-Han T., Chen W., Mark K., Janusz R., Takeo K., Randy K., Yasuo S., Shuji H., 
Takashi A. 2006. "Timing-Aware ATPG for High Quality At-speed Testing of Small 
Delay Defects". Asian Test Symposium, ats 2006, pp. 139-146. 
Lin X., Press R., Rajski J., Renter P., Rinderknecht T., Swanson B., Tamarapalli N. (2003). 
"High-frequency, at-speed scan testing". Volume 20, Issue 5, pp. 17 - 25. 
Madge R., G. Oregon, B. Benware, R. Turakhia, J. Ruffler. 2004. "In search of the optimum 
test set - adaptive test methods for maximum defect coverage and lowest test cosf. 
Test conference, 2004, proceedings, ITC 2004, intemational, pp.203-212. 
Mak T.M., Krstic A., Cheng K.-T., Wang Li.-C. 2004. "New challenges in delay testing of 
nanometer, mulfigigahertz designs". Vol. 21, Issue 3, May-June 2004, pp. 241 - 248. 
Mentor Graphics ATPG Guide. 2006. Scan and ATPG Process Guide (DFTAdvisorT^, 
FastScan™ and FlexTest™), Software Version 8.2006_3. August 2006. 
Nanowerk Spotlight. FuUerene resist materials for the 32nm node and beyond. 2008. 
Nanowerk Nanotechnology Spotlight posts. On ligne. 
< http://www.nanowerk.com/spotlight/spotid=6625.php>. Consulted August 5 2008. 
Pateras S. 2003. "Achieving at-speed stmctural test". Design & test of computers: IEEE, pp. 
26-33. 
Qiu W., Wang J., Lu X., Li Z., D.M.H. W., Shi W. 2004. "At-speed test for path delay faults 
using practical techniques". Current and defect based testing: DBT 2004. 
Proceedings. IEEE intemational workshop. Volume, issue, pp. 61-66. 
Sachdev M., J. P. D. Gyves. Eide. 2007. "Defect Oriented Tesfing for Nano-Metric CMOS 
VLSI Circuitsnd Methodologies", 2"'* ed. Springer 2007, page 25. 
Saxena, J. Butter, K.M. Gatt, J. Raghuraman, R. Kumar, S.P. Basu, S. Campbell, D.J. 
Berech, J. 2002. "Scan-based transition fault testing - implementation and low cost 
test challenges". Test Conference, 2002. Proceedings, Intemational, pp. 1120-1129. 
92 
Sengupta S., S. Kundu, S. Chakravarty, P. Parvathala, R. Galivanche, G. Kosonocky, M. 
Rodgers, and T. M. Mak, 1999. Defect-based test: A key enabler for successful 
migration to stmctural test, Intel Technology  J.,  pp. 1-14, Ql 1999. 
Synopsys DFT Compiler, "User Manual for SYNOPSYS Toolset Version 2007.12," 
Synopsys, Inc., 2007. 
Thibeault C. 2006. "Improving Digital IC testing with Analog Circuits". Circuits and 
Systems, IEEE North-East Workshop, Gatineau, Quebec, pp. 285-288, 30 Nov. 2006. 
Vorisek v. , Koch T., Fischer H. "At-speed testing of SOC ICs". 2004. Design, Automation 
and Test in Europe Conference and Exhibition. Volume 3, pp.120- 125. 
